Using gsl::zstring_span with C APIs


I'm trying to use modern string-handling approaches (like std::string_view or GSL's string_span) to interact with a C API (DBus) that takes strings as null-terminated const char*s, e.g.

DBusMessage* dbus_message_new_method_call(
    const char* destination,
    const char* path,
    const char* iface,
    const char* method
    );

string_view and string_span don't guarantee that their contents are null-terminated. Since spans are (char* start, ptrdiff_t length) pairs, that's largely the point. But GSL also provides a zstring_span, which is guaranteed to be null-terminated. The comments around zstring_span suggest that it's designed exactly for working with legacy and C APIs, but I ran into several sticking points as soon as I started using it:

  1. Representing a string literal as a string_span is trivial:

    cstring_span<> bar = "easy peasy";
    

    but representing one as a zstring_span requires you to wrap the literal in a helper function:

    czstring_span<> foo = ensure_z("odd");
    

    This makes declarations noisier, and it also seems odd that a literal (which is guaranteed to be null-terminated) isn't implicitly convertible to a zstring_span. ensure_z() also isn't constexpr, unlike constructors and conversions for string_span.

  2. There's a similar oddity with std::string, which is implicitly convertible to string_span, but not zstring_span, even though std::string::data() has been guaranteed to return a null-terminated sequence since C++11. Again, you have to call ensure_z():

    zstring_span<> to_zspan(std::string& s) { return ensure_z(s); }
    
  3. There seem to be some const-correctness issues. The above works, but

    czstring_span<> to_czspan(const std::string& s) { return ensure_z(s); }
    

    fails to compile, with errors about being unable to convert from span<char, ...> to span<const char, ...>.

  4. This is a smaller point than the others, but the member function that returns a char* (which you would feed to a C API like DBus) is called assume_z(). What's being assumed when the constructor of zstring_span expects a null-terminated range?

If zstring_span is designed "to convert zero-terminated spans to legacy strings", why does its use here seem so cumbersome? Am I misusing it? Is there something I'm overlooking?


There are 2 best solutions below

4
On

It's "cumbersome" in part because it's intended to be.

This:

zstring_span<> to_zspan(std::string& s) { return ensure_z(s); }

Is not a safe operation. Why? Because while it is true that s is NUL-terminated, the contents of s may also include embedded NUL characters. That's a legitimate thing to do with std::string, but zstring_span and whoever consumes it can't handle that. They'll treat the first NUL as the end of the string and truncate it.

By contrast, string_span/view conversions are safe from this perspective. Consumers of such strings take a sized string and thus can handle embedded NULs.

Because the zstring_span conversion is unsafe, there should be some explicit notation that something potentially unsafe is being done. ensure_z represents that explicit notation.
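
To make the truncation concrete, here is a minimal sketch that uses only the standard library. The helper names (length_seen_by_c_api, has_embedded_nul, make_string_with_embedded_nul) are hypothetical, and std::strlen stands in for any C function that takes a const char*:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical helper: the length a C API would see, because it stops
// reading at the first NUL it encounters.
inline std::size_t length_seen_by_c_api(const std::string& s) {
    return std::strlen(s.c_str());
}

// Hypothetical helper: true if the sized string holds a NUL among its characters.
inline bool has_embedded_nul(std::string_view s) {
    return s.find('\0') != std::string_view::npos;
}

// Builds a std::string holding "abc", a NUL, then "def" (7 characters in total).
inline std::string make_string_with_embedded_nul() {
    return std::string("abc\0def", 7);
}
```

For the string built above, size() reports 7 while the C side sees only 3 characters. A check like has_embedded_nul is one way a caller could decide whether the conversion is safe before crossing into C.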

Another problem is that C++ has no mechanism to tell the difference between a literal string argument and any old const char* or const char[] parameter. Since a bare const char* may or may not be a string literal, you have to assume that it isn't and therefore use a more verbose conversion.

Also, C++ string literals can contain embedded NUL characters, so the above reasoning applies.
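
The same effect can be seen with a literal. This is a small sketch, assuming C++17 for the inline variable; kLiteral and length_up_to_first_nul are names made up for illustration:

```cpp
#include <cstddef>
#include <cstring>

// A literal with an embedded NUL. The array has 8 elements:
// 'a','b','c','\0','d','e','f' plus the implicit terminating NUL.
inline constexpr char kLiteral[] = "abc\0def";

// Hypothetical helper: the length a C-style consumer would compute.
inline std::size_t length_up_to_first_nul(const char* s) {
    return std::strlen(s);
}
```

sizeof(kLiteral) is 8, but a consumer that scans for the terminator stops after 3 characters, so being a literal does not by itself make the conversion lossless.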

The const issue seems like a code bug, and you should probably file it as such.

0
On
  1. it also seems odd that a literal (which is guaranteed to be null-terminated) isn't implicitly convertible to a zstring_span

A string literal is of type const char[...]. There is no information in the type that this const char array is a null terminated string. Here is some other code with the same types, but without null termination where ensure_z will fail fast.

const char foo_arr[4]{ 'o', 'd', 'd', '-' };
ensure_z(foo_arr);

Both "odd" and foo_arr are of type const char[4], but only the string literal is null-terminated; foo_arr is not.
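
The point about identical types can be checked with the standard library alone. In this sketch, contains_nul is a hypothetical runtime check (it is not GSL's implementation) and odd_arr is an illustrative array with no terminator:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical check: does the array contain a NUL anywhere?
// The type const char[N] alone cannot answer this, so the elements are inspected.
template <std::size_t N>
constexpr bool contains_nul(const char (&arr)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        if (arr[i] == '\0') return true;
    }
    return false;
}

// Four characters and no terminator.
inline constexpr char odd_arr[4] = {'o', 'd', 'd', '-'};

// The literal and the array are both lvalues of type const char[4],
// so a function parameter cannot distinguish them by type.
static_assert(std::is_same_v<decltype("odd"), const char (&)[4]>);
static_assert(std::is_same_v<decltype((odd_arr)), const char (&)[4]>);
```

contains_nul("odd") is true and contains_nul(odd_arr) is false, even though both arguments bind to the same parameter type.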

Please note that your combination of ensure_z and czstring_span<> compiles, but it does not work. ensure_z returns only the string without the terminating null byte. When you pass that to the czstring_span<> constructor, then the constructor will fail searching for the null byte (which was cut off by ensure_z).
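
The mismatch described above can be modelled without GSL. The functions below are stand-ins written for illustration and only mirror the behaviour as described in this answer; they are not GSL's code:

```cpp
#include <cstddef>
#include <string_view>

// Stand-in for the described ensure_z behaviour: a view that stops
// before the first NUL, so the terminator is not part of the result.
template <std::size_t N>
constexpr std::string_view view_without_terminator(const char (&arr)[N]) {
    std::size_t len = 0;
    while (len < N && arr[len] != '\0') ++len;
    return std::string_view(arr, len);
}

// A view over the whole array, terminator included.
template <std::size_t N>
constexpr std::string_view view_with_terminator(const char (&arr)[N]) {
    return std::string_view(arr, N);
}

// Stand-in for the described constructor check: the range it receives
// must contain a terminating NUL.
constexpr bool contains_terminator(std::string_view range) {
    return range.find('\0') != std::string_view::npos;
}
```

For "odd", the first view has 3 elements and fails the check, while the second has 4 elements and passes. That is why the range handed to the constructor has to include the terminator.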

You need to convert the string literal to a span and pass that to the constructor:

czstring_span<> foo = ensure_span("odd");

  2. There's a similar oddity with std::string, which is implicitly convertible to string_span, but not zstring_span

Good point. There is a constructor for string_span that takes a std::string, but for zstring_span there is only a constructor taking the internal implementation type, a span<char>. For span there is a constructor taking a "container" having .data() and .size() - which std::string implements. Even worse: the following code compiles but will not work:

zstring_span<> to_zspan(std::string& s) { return zstring_span<>{s}; }

You should consider filing an issue in the GSL repo to get the classes aligned. I am not sure if the implicit conversions are a good idea, so I prefer how it is done in zstring_span over how string_span does it.

  3. There seem to be some const-correctness issues.

Here too, my first idea, czstring_span<> to_czspan(const std::string& s) { return czstring_span<>{s}; }, compiles but does not work. Another solution would be a new function ensure_cz that returns a span<const char, ...>. You should consider filing an issue.

  4. assume_z()

The existence of empty() and the code in as_string_span() suggest that the class was meant to be able to handle empty string spans. In that case as_string_span would always return the string without the terminating null byte, ensure_z would return the string with the terminating null byte (failing if empty), and assume_z would assume that !empty() and return the string with the terminating null byte.

But the one and only constructor takes a non-empty span of characters, so empty() can never be true. I just created a PR to address these inconsistencies. Please consider filing an issue if you think that more should be changed.

If zstring_span is designed "to convert zero-terminated spans to legacy strings", why does its use here seem so cumbersome? Am I misusing it? Is there something I'm overlooking?

In pure C++ code I prefer std::string_view; zstring_span is only for C interop, which limits its use. And of course you must know the guidelines and the guideline support library. Given that, I bet that zstring_span is rarely used and that you are one of the very few people taking a deep look into it.
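
As a rough sketch of keeping std::string_view inside C++ code and terminating only at the C boundary, one option is to copy into a std::string right before the call. Here fake_c_api is a hypothetical stand-in for a function such as dbus_message_new_method_call, and call_with_view is an illustrative wrapper, not part of any library:

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C API that expects a null-terminated string.
inline std::size_t fake_c_api(const char* name) {
    return std::strlen(name);
}

// Copies the view into a std::string so that c_str() is guaranteed to be
// terminated exactly at the end of the viewed characters.
inline std::size_t call_with_view(std::string_view name) {
    const std::string terminated(name);  // the copy supplies the terminator
    return fake_c_api(terminated.c_str());
}
```

The copy costs an allocation for longer strings, but it avoids passing view.data() directly, which for a substring view would let the C side read past the end of the view. Note that embedded NULs in the view would still truncate on the C side.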