istream eof discrepancy between libc++ and libstdc++


The following (toy) program prints different results when linked against libstdc++ and libc++. Is this a bug in libc++, or do I misunderstand how istream::eof() works? I have tried it with g++ on Linux and Mac OS X and with clang on Mac OS X, with and without -std=c++0x. It was my impression that eof() does not return true until an attempt to read (by get() or something else) actually fails. This is how libstdc++ behaves, but not how libc++ behaves.

#include <iostream>
#include <sstream>

int main() {
    std::stringstream s;

    s << "a";

    std::cout << "EOF? " << (s.eof() ? "T" : "F") << std::endl;
    std::cout << "get: " << s.get() << std::endl;
    std::cout << "EOF? " << (s.eof() ? "T" : "F") << std::endl;

    return 0;
}

Thor:~$ g++ test.cpp
Thor:~$ ./a.out
EOF? F
get: 97
EOF? F
Thor:~$ clang++ -std=c++0x -stdlib=libstdc++ test.cpp 
Thor:~$ ./a.out
EOF? F
get: 97
EOF? F
Thor:~$ clang++ -std=c++0x -stdlib=libc++ test.cpp 
Thor:~$ ./a.out
EOF? F
get: 97
EOF? T
Thor:~$ clang++ -stdlib=libc++ test.cpp 
Thor:~$ ./a.out
EOF? F
get: 97
EOF? T

1
On BEST ANSWER

This was a libc++ bug and has been fixed as Cubbi noted. My bad. Details are here:

http://lwg.github.io/issues/lwg-closed.html#2036

0
On

eofbit is set when an operation tries to read past the end of the file; the operation itself need not fail. For example, if you read an integer and nothing follows it (not even a newline), I expect eofbit to be set but the read of the integer to succeed. That is, I get and expect FT for

#include <iostream>
#include <sstream>

int main() {
    std::stringstream s("12");
    int i;
    s >> i;

    std::cout << (s.fail() ? "T" : "F") << (s.eof() ? "T" : "F") << std::endl;

    return 0;
}

Here I don't expect istream::get to try to read past the character it returns (i.e. I don't expect it to hang until I enter the next line if I read a \n with it), so libstdc++ does seem right, at least from a quality-of-implementation (QoI) point of view.

The standard's description of istream::get just says "extracts a character c, if one is available" without describing how, so it doesn't seem to rule out libc++'s behavior.
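To make the look-ahead point concrete, here is a minimal sketch that contrasts an integer at the very end of the input with one followed by a newline. The helper name flags_after_int is hypothetical; it reports fail() and then eof(), in the same order as the example above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not part of the original answer): extracts one int
// from `text` and returns fail() followed by eof() as "T"/"F" letters.
std::string flags_after_int(const std::string& text) {
    std::istringstream s(text);
    int i = 0;
    s >> i;  // formatted extraction has to look at the next character
    return std::string(s.fail() ? "T" : "F") + (s.eof() ? "T" : "F");
}
```

With "12" the parser runs into the end of the buffer while looking for more digits, so the result should be "FT". With "12\n" it stops at the newline without reading past it, so the result should be "FF".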

3
On

The value of s.eof() is unspecified in the second call: it may be true or false, and it might not even be consistent. All you can say is that if s.eof() returns true, all future input will fail (but if it returns false, there's no guarantee that future input will succeed). After failure (s.fail()), if s.eof() returns true, it's likely (but not 100% certain) that the failure was due to end of file. It's worth considering the following scenario, however:

#include <iostream>
#include <sstream>

int main() {
    double test;
    std::istringstream s1("");      // no data at all
    s1 >> test;
    std::cout << (s1.fail() ? "T" : "F") << (s1.eof() ? "T" : "F") << std::endl;
    std::istringstream s2("1.e-");  // malformed floating-point value
    s2 >> test;
    std::cout << (s2.fail() ? "T" : "F") << (s2.eof() ? "T" : "F") << std::endl;
}

On my machine, both lines are "TT", even though the first extraction failed because there was no data (end of file) and the second because the floating-point value was incorrectly formatted.
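One practical consequence of the points above is to test the extraction itself rather than eof(). The following is only a sketch under that assumption; read_ints is a hypothetical helper, not something from the question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every int that can be extracted from `text`.
// The loop condition is the stream state after each extraction, so the loop
// ends on the first failure, whether caused by end of input or by bad data.
std::vector<int> read_ints(const std::string& text) {
    std::istringstream in(text);
    std::vector<int> values;
    int v = 0;
    while (in >> v)
        values.push_back(v);
    return values;
}
```

Note that read_ints("12") still yields the value 12 even though eofbit is set by that same extraction, and read_ints("1 x 2") stops after the first value without eof() ever being involved in the decision.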

12
On

EDIT: This was due to the way older versions of libc++ interpreted the C++ standard. That interpretation was discussed in LWG issue 2036, ruled incorrect, and libc++ was changed.

Current libc++ gives the same results on your test as libstdc++.

old answer:

Your understanding is correct.

istream::get() does the following:

  1. Calls good() and sets failbit if it returns false, which adds failbit to a stream that already had some other bit set (§27.7.2.1.2[istream::sentry]/2)
  2. Flushes whatever's tie()'d if necessary
  3. If good() is false at this point, returns eof and does nothing else.
  4. Extracts a character as if by calling rdbuf()->sbumpc() or rdbuf()->sgetc() (§27.7.2.1[istream]/2)
  5. If sbumpc() or sgetc() returned eof, sets eofbit (§27.7.2.1[istream]/3) and failbit (§27.7.2.2.3[istream.unformatted]/4).
  6. If an exception was thrown, sets badbit (§27.7.2.2.3[istream.unformatted]/1) and rethrows if allowed.
  7. Updates gcount and returns the character (or eof if it couldn't be obtained).

(section numbers are from C++11, but C++03 has all the same rules under §27.6.*)

Now let's take a look at the implementations:

libc++ (current svn version) defines the relevant part of get() as

sentry __s(*this, true);
if (__s)
{
    __r = this->rdbuf()->sbumpc();
    if (traits_type::eq_int_type(__r, traits_type::eof()))
        this->setstate(ios_base::failbit | ios_base::eofbit);
    else
        __gc_ = 1;
}

libstdc++ (as shipped with gcc 4.6.2) defines the same part as

sentry __cerb(*this, true);
if (__cerb)
  {
    __try
      {
        __c = this->rdbuf()->sbumpc();
        // 27.6.1.1 paragraph 3
        if (!traits_type::eq_int_type(__c, __eof))
          _M_gcount = 1;
        else
          __err |= ios_base::eofbit;
      }
[...]
if (!_M_gcount)
  __err |= ios_base::failbit;

As you can see, both libraries call sbumpc() and set eofbit if and only if sbumpc() returned eof.

Your testcase produces the same output for me using recent versions of both libraries.
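To check the "if and only if" behaviour against whatever library is installed, a small helper along these lines may be enough. It is a sketch: flags_after_gets is a hypothetical name, and it returns eof() followed by fail() after a given number of get() calls.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes `text` into a stringstream as the question
// does, calls get() `reads` times, and returns eof() then fail() as letters.
std::string flags_after_gets(const std::string& text, int reads) {
    std::stringstream s;
    s << text;
    for (int n = 0; n < reads; ++n)
        s.get();
    return std::string(s.eof() ? "T" : "F") + (s.fail() ? "T" : "F");
}
```

With a library that follows the resolution of LWG issue 2036, flags_after_gets("a", 1) should be "FF" and flags_after_gets("a", 2) should be "TT"; the old libc++ behaviour from the question would report "T" for eof() after the first get().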