Why is dereference of past-the-end iterator of std::basic_string still UB after C++11?


As we all know, C++11 requires std::basic_string to store a null terminator (which most member functions, such as size(), do not count). But when I read cppreference, I found that dereferencing end() is still UB (the paragraph is almost identical to the one for std::vector). Why is that? Or is this an error on cppreference (if so, please point to the relevant document so I can verify it)?

I tried GNU C++, but unfortunately __gnu_debug does not seem to contain a checker for std::string iterators. Neither do Clang++'s sanitizers.

ecatmur On

Correct; the end() iterator cannot be indirected, even though [data(), data() + size()] is a closed range.
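To make the distinction concrete, here is a minimal sketch of the routes to the terminator that are well-defined since C++11. The helper name terminator_is_readable is made up for illustration; only operator[], data(), c_str() and size() are real std::string members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of any library): reads the terminator of s
// through the index/pointer routes that are well-defined since C++11.
inline bool terminator_is_readable(const std::string& s) {
    const std::size_t n = s.size();
    return s[n] == '\0'            // operator[] with pos == size() yields a null character
        && s.data()[n] == '\0'     // [data(), data() + size()] is a valid range
        && s.c_str()[n] == '\0';   // c_str() gives the same guarantee as data()
    // *s.end() is deliberately absent: indirecting through end() is undefined.
}
```

Calling terminator_is_readable(std::string("abc")) should return true on a conforming implementation; none of the three reads goes through an iterator.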

The only major compiler that I know to enforce this in debug mode is Microsoft Visual Studio:

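#include <string>
int main(int argc, char* argv[]) {
    // Deliberately undefined behaviour: indirects through the end() iterator
    // of a temporary string built from the last command-line argument.
    return *std::string(argv[argc - 1]).end();
}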

The above program, compiled with cl.exe a.cpp /EHsc /Zi /MDd /std:c++20 /D_ITERATOR_DEBUG_LEVEL=2, gives the following debug assertion:

Expression: cannot dereference string iterator because it is out of range (e.g. an end iterator)

(libstdc++ does not perform iterator debugging for std::string by design, to allow ABI compatibility between debug and release mode; libc++ claims to perform iterator debugging for std::string but does not appear to catch this error.)

The reason for this seeming inconsistency in the standard is that the null terminator is provided as a convenience for C-style APIs that expect null-terminated strings; but these are accessing the string via a raw character pointer, not an iterator. So making the past-the-end iterator dereferenceable would do nothing to help such code and could hide bugs in the use of C++-style iterators.
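As a rough illustration of the two access styles, the sketch below contrasts a C-style consumer, which reaches the terminator through a raw pointer, with an iterator-style consumer, which stops at end() and never indirects through it. Both function names are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// C-style consumer (hypothetical name): relies on the null terminator,
// which it reaches through the raw pointer returned by c_str().
inline std::size_t c_style_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Iterator-style consumer (hypothetical name): end() is only compared
// against, never indirected through, so the terminator is never needed.
inline std::size_t iterator_length(const std::string& s) {
    std::size_t n = 0;
    for (auto it = s.begin(); it != s.end(); ++it) {
        ++n;
    }
    return n;
}
```

For a string without embedded nulls, such as std::string("hello"), both functions return the same length. The iterator loop gets there without ever touching the terminator, which is why a dereferenceable end() would add nothing for such code.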