Is implicit construction of `const std::string` from `const char *` efficient?

Like many people, I'm in the habit of writing new string functions as functions that take a const std::string &. The advantages are efficiency (you can pass existing std::string objects without incurring overhead for copying/moving) and flexibility/readability (if all you have is a const char * you can just pass it and have the construction done implicitly, without cluttering up your code with an explicit std::string construction):

#include <string>
#include <iostream>
unsigned int LengthOfStringlikeObject(const std::string & s)
{
    return s.length();
}
int main(int argc, const char * argv[])
{
    unsigned int n = LengthOfStringlikeObject(argv[0]);
    std::cout << "'" << argv[0] << "' has " << n << " characters\n";
}

My aim is to write cross-platform code that can handle long strings efficiently. My question is: what happens during the implicit construction? Are there any guarantees that the string will not be copied? It strikes me that, because everything is const, copying is not necessary (a thin STL wrapper around the existing pointer is all that's needed), but I'm not sure how compiler- and platform-dependent I should expect that behavior to be. Would it be safer to always explicitly write two versions of the function, one for const std::string & and one for const char *?
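For concreteness, the two-version alternative mentioned above might look roughly like the sketch below. The std::size_t return type and the use of std::strlen are assumptions made for illustration; they are not part of the code shown earlier.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Version for callers that already hold a std::string:
// the argument binds by reference, so nothing is copied.
std::size_t LengthOfStringlikeObject(const std::string & s)
{
    return s.length();
}

// Version for raw C strings: works on the pointer directly, so no
// temporary std::string is constructed.
// Assumes s is non-null and null-terminated.
std::size_t LengthOfStringlikeObject(const char * s)
{
    return std::strlen(s);
}
```

With both versions visible, a call such as LengthOfStringlikeObject(argv[0]) selects the const char * version, because that match needs no user-defined conversion.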

There are 3 best solutions below

1
On BEST ANSWER

"It strikes me that, because everything is const, copying is not necessary (a thin STL wrapper around the existing pointer is all that's needed)"

I don't think this assumption is correct. Just because you have a pointer to const, it does not imply that the underlying value cannot change. It only implies that the value cannot be changed through that pointer. The pointer could be pointing to non-const storage which can change at any time.
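A small sketch of that point, with a made-up function name: the pointer-to-const is only a read-only view of writable storage, and the std::string built from it keeps the characters it copied at construction time.

```cpp
#include <cassert>
#include <string>

// Hypothetical demo: returns true if a std::string built from a
// pointer-to-const keeps its original contents after the underlying
// buffer is modified through another name.
bool CopyIsUnaffectedByLaterWrites()
{
    char buffer[] = "hello";       // writable storage
    const char * view = buffer;    // read-only view of that storage

    std::string copy(view);        // characters are copied here

    buffer[0] = 'j';               // legal: buffer itself is not const

    // The pointer-to-const observes the change...
    assert(view[0] == 'j');
    // ...but the string still holds the characters it copied.
    return copy == "hello";
}
```

If std::string merely wrapped the pointer, the final comparison would see "jello" instead.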

Because of this, the library must make its own copy (to preserve the observable behavior expected of a string). A quick review of libstdc++ shows that it always makes a copy. The constructor taking const char* is not inline, so the copy cannot be optimized away without static linking and LTO.

While extremely trivial statically linked programs might have the copy optimized away with LTO (I wasn't able to reproduce this), I think in general it would be unlikely this optimization could be performed (especially considering the aliasing rules for char*). g++ doesn't even perform this optimization for a string literal.
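One way to observe the copy at run time is to compare addresses. This is only a sketch, and the helper name is invented for illustration: it checks whether the string's character buffer is the same memory as the source it was built from.

```cpp
#include <string>

// Hypothetical helper: returns true if the std::string constructed from
// `source` stores its characters somewhere other than `source` itself.
bool StringOwnsItsOwnBuffer(const char * source)
{
    const std::string s(source);
    return s.data() != source;
}
```

On the mainstream implementations I'm aware of this returns true even for a string literal, since a std::string owns its characters rather than borrowing them.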

0
On

If you pass a const char* to a function that takes a std::string, by reference or not, a std::string will be constructed. Some compilers or analysis tools may even warn that an implicit temporary object is being created when the parameter is a reference.

Now, this may be optimized by the compiler, and some implementations do not allocate memory for small strings. In principle the compiler might also treat the argument internally much like a C++17 string_view. It essentially depends on what your code does with the string: if you only call const member functions, a clever compiler might optimize the copy out.

But that is up to the implementation and outside your control. If you want to take control, use std::string_view explicitly.

0
On

If you don't want copying, then string_view is what you want.

However, with this benefit come problems. Specifically, you have to ensure that the storage you pass lasts "long enough".

For string literals, that's no problem. For argv[0], that's almost certainly not a problem. For arbitrary sequences of characters, you'll need to think about how long their storage lives.
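To make the lifetime concern concrete, here is a minimal sketch. FirstWord and SafeUse are hypothetical names, not anything from a library. The view returned by FirstWord points into the caller's storage, so it is only usable while that storage is alive.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: returns a view of the first space-delimited word.
// The result refers to the caller's characters; nothing is copied.
std::string_view FirstWord(std::string_view text)
{
    // find() returns npos when there is no space, and substr(0, npos)
    // then yields the whole view.
    return text.substr(0, text.find(' '));
}

std::size_t SafeUse()
{
    std::string owner = "hello world";         // owner outlives the view
    std::string_view word = FirstWord(owner);  // points into owner
    return word.size();
}

// Risky pattern, left commented out on purpose: the temporary std::string
// is destroyed at the end of the full expression, leaving `word` dangling.
// std::string_view word = FirstWord(std::string("hello world"));
```

The safe version works because owner stays in scope for as long as word is used. The commented-out line compiles but reads freed memory if word is used afterwards.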

But you can write:

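#include <string_view>  // requires C++17

unsigned int LengthOfStringlikeObject(std::string_view sv)
{
    // The view refers to the caller's characters; no copy is made.
    return sv.length();
}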

and call it with a string, or a const char *, and it will be fine.
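As a rough check that no copy happens in either case, the sketch below uses an invented helper, StorageOf, that reports where a view's characters live. For both a std::string and a const char * argument, the address should be that of the original storage.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: returns the address of the characters the view
// refers to, so callers can compare it with their own storage.
const char * StorageOf(std::string_view sv)
{
    return sv.data();
}
```

Given std::string s = "hello world"; and const char * c = "hello";, StorageOf(s) == s.data() and StorageOf(c) == c both hold. Note that building a string_view from a const char * still scans for the terminating null to find the length, so it is linear in the string length, but it does not allocate or copy.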