Learn Something Old Every Day, Part XIV: read() Return Values May Be Surprising

  • Porting Source Code: Last week, the author ported some source code from Watcom C to Microsoft C. Watcom C was intended to be highly compatible with Microsoft's C dialect, so in general, it wasn't difficult.
  • One Small Program Crash: However, one small program kept crashing when built with Microsoft C. It didn't do anything suspicious and didn't produce notable warnings with either compiler.
  • Difference in read() Behavior: After debugging, the difference was traced to code involving read(). When a file is opened with the O_TEXT flag, the return value of read() differs between the two compilers' run-time libraries.
  • Watcom Documentation: Watcom's documentation states that read() returns the number of bytes of data transmitted from the file to the buffer (excluding carriage-return characters removed during transmission). In practice, Watcom's read() attempts to fill the entire buffer.
  • Microsoft Documentation: Microsoft's documentation says that read() returns the number of bytes actually read, which may be less than the requested count if there are fewer bytes left in the file or if opened in text mode. In text mode, each CR-LF pair is replaced with a single LF, and only the single LF is counted in the return value.
  • Discrepancy Reason: For a file opened in text mode, the number of bytes on disk is usually higher than the number that ends up in the application's buffer, because each CR/LF sequence shrinks to a single LF. Microsoft's library treats the length argument as the number of bytes to read from disk, while Watcom's treats it as the number of bytes to place in the buffer.
  • Drawbacks of Microsoft Approach: Microsoft's approach has two drawbacks. First, a return value smaller than the requested count no longer implies end-of-file or an error, so those conditions cannot be detected by simply checking the number of bytes read. Second, it is inconsistent with the behavior of fread() for text files.
  • fread() Behavior: In all tested run-time libraries, fread() on text files behaves like Watcom's read(), attempting to fill the destination buffer with the specified number of bytes.
  • No Standard Specification: The behavior of read() for text files is not specified by any standard, because POSIX and its successors have no concept of text files.
  • Runtime Library Differences: There are also differences in how different run-time libraries handle CR characters in text files, which is a topic for a different blog post.
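
The two interpretations of the length argument can be sketched with a small in-memory model. This is not either vendor's implementation: `sim_file`, `sim_read`, and the `fill_buffer` switch are hypothetical names introduced only for illustration, and the model assumes the simplest possible translation rule (a CR directly followed by an LF is dropped).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory "file": the raw bytes as stored on disk,
   plus the current read position. */
typedef struct {
    const char *data;
    size_t      size;
    size_t      pos;
} sim_file;

/* Model of a text-mode read().
 *
 * fill_buffer == 0: 'count' limits the raw bytes consumed from the file
 *                   (the behavior the post describes for Microsoft C).
 * fill_buffer != 0: 'count' limits the bytes stored in 'buf'
 *                   (the behavior the post describes for Watcom C).
 *
 * Returns the number of bytes stored in 'buf' after CR removal.
 */
static size_t sim_read(sim_file *f, char *buf, size_t count, int fill_buffer)
{
    size_t consumed = 0;   /* raw bytes taken from the file      */
    size_t stored   = 0;   /* translated bytes placed in the buffer */

    assert(f != NULL && buf != NULL);

    while (f->pos < f->size && (fill_buffer ? stored : consumed) < count) {
        char c = f->data[f->pos++];
        consumed++;

        /* Drop a CR that is immediately followed by an LF. */
        if (c == '\r' && f->pos < f->size && f->data[f->pos] == '\n')
            continue;

        buf[stored++] = c;
    }
    return stored;
}
```

With the 10 raw bytes `"ab\r\ncd\r\nef"` and a request for 8 bytes, the model returns 6 in the first mode (8 raw bytes consumed, 2 CRs dropped) even though the file is not exhausted, and 8 in the second mode. In both modes, a return value of 0 is what marks end-of-file, so a loop that stops on 0 rather than on a short count works under either interpretation.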