I'm working on some C++ code that also eventually needs to call OS-level C code, for example scandir.
I'd like to use C++ for most of the codebase, which (to me) means that I'm mostly working with std::string instead of char pointers.
I have a method that accepts a string_view, so that I can pass in both std::string and char*, depending on whether the C++ or the "C interop" code needs to call it:
std::optional<std::vector<FileInfo>> scan_directory(const std::string_view directory)
{
    if (directory.empty()) {
        return {};
    }
    struct dirent **namelist;
    int num = scandir(directory.data(), &namelist, nullptr, nullptr);
    [...]
}
Note the call to data() here, since scandir takes a const char *. Now, I saw this note:
Unlike std::basic_string::data() and string literals, data() may return a pointer to a buffer that is not null-terminated. Therefore it is typically a mistake to pass data() to a routine that takes just a const CharT* and expects a null-terminated string.
That got me thinking: Is there a better/safer way? I know that in practice, the callers will be null-terminated strings, but I don't want to create a hard-to-diagnose bug later on when I'm already aware there's a potential issue here. Though I guess that there's already no guarantee that a char* is null-terminated, so I'm not making the situation any worse.
Still, curious if there is a better option.
- Should I check the string_view for a null-terminator, and if none exists, create a char[directory.size() + 1]{0} and copy the characters myself?
- Or create two overloads, one that takes a std::string and one that takes a const char*?
I'm on g++ (GCC) 10.2.1 20201016 (Red Hat 10.2.1-6) in C++20 mode via CMake's set(CMAKE_CXX_STANDARD 20).
When all you have is a std::string_view, you have no guarantee whatsoever that accessing beyond its size() is not undefined behavior. It's not guaranteed to be undefined behavior, but neither do you have any guarantee that it isn't. If the string_view was constructed from a pointer to a \0-terminated character string, with the terminator simply not counted in the string_view's size, then you're arguably safe (I note that recent versions of libstdc++ have an option to do some boundary checking on vector and string accesses, and if at some point in the future boundary checking gets introduced for string_views, it will trip you up, and you'll be out of luck). But it is certainly possible to construct a string_view from a pointer to a string that's not \0-terminated, with an exact character count. In that case looking beyond its size() will result in nasal demons. And you have no way to determine whether this is the case.

The simplest solution for you here is to declare the parameter as const char *, and create an overload that takes a const std::string & parameter instead and calls this function using c_str().
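
Here is a rough sketch of what that pair of overloads might look like. The FileInfo struct is a hypothetical stand-in, since the question doesn't show the real one. The loop that fills the result is only my guess at what the elided part of the function does, and returning an empty optional when scandir fails is likewise an assumption.

```cpp
#include <dirent.h>   // scandir, struct dirent (POSIX)
#include <cstdlib>    // std::free, std::size_t
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's FileInfo type.
struct FileInfo {
    std::string name;
};

// Core implementation. Taking const char * makes the requirement for a
// null-terminated string part of the signature.
std::optional<std::vector<FileInfo>> scan_directory(const char *directory)
{
    if (directory == nullptr || *directory == '\0') {
        return std::nullopt;
    }

    struct dirent **namelist = nullptr;
    const int num = scandir(directory, &namelist, nullptr, nullptr);
    if (num < 0) {
        return std::nullopt;  // assumption: report scandir failure as "no result"
    }

    std::vector<FileInfo> result;
    result.reserve(static_cast<std::size_t>(num));
    for (int i = 0; i < num; ++i) {
        result.push_back(FileInfo{namelist[i]->d_name});
        std::free(namelist[i]);  // scandir allocates each entry with malloc
    }
    std::free(namelist);         // ...and the array itself
    return result;
}

// Convenience overload for C++ callers: c_str() is always null-terminated.
std::optional<std::vector<FileInfo>> scan_directory(const std::string &directory)
{
    return scan_directory(directory.c_str());
}
```

A caller that only has a string_view can still use this by writing scan_directory(std::string(view)). That makes a null-terminated copy, and the cost of the copy is visible at the call site instead of hidden inside the function.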