What is more efficient in this case: const char* or std::string?


I am using a combination of C and C++ code in my application.

I want to print out if a boolean flag is true or false as below, by using a ternary operator to determine the string to print.

If I use a const char*, won't the compiler most likely store the string literals "Yes" and "No" in read-only memory before the program starts?

If I use std::string, will the string be destroyed when it goes out of scope? I guess the compiler still needs to store the string literals "Yes" and "No" somewhere anyway, but I'm not sure.

bool isSet = false;

// More code

//std::string isSetStr = isSet ? "Yes" : "No";
const char* isSetStr  =  isSet ? "Yes" : "No";

//printf ( "Flag is set ? : %s\n", isSetStr.c_str());
printf ( "Flag is set ? : %s\n", isSetStr);

There are 6 best solutions below

4
On

You can test it with godbolt. The former (using const char*) gives this:

.LC0:
        .string "No"
.LC1:
        .string "Yes"
.LC2:
        .string "Flag is set ? : %s\n"
a(bool):
        test    dil, dil
        mov     eax, OFFSET FLAT:.LC0
        mov     esi, OFFSET FLAT:.LC1
        cmove   rsi, rax
        mov     edi, OFFSET FLAT:.LC2
        xor     eax, eax
        jmp     printf

The latter (using std::string) gives this:

.LC0:
        .string "Yes"
.LC1:
        .string "No"
.LC2:
        .string "Flag is set ? : %s\n"
a(bool):
        push    r12
        push    rbp
        mov     r12d, OFFSET FLAT:.LC1
        push    rbx
        mov     esi, OFFSET FLAT:.LC0
        sub     rsp, 32
        test    dil, dil
        lea     rax, [rsp+16]
        cmovne  r12, rsi
        or      rcx, -1
        mov     rdi, r12
        mov     QWORD PTR [rsp], rax
        xor     eax, eax
        repnz scasb
        not     rcx
        lea     rbx, [rcx-1]
        mov     rbp, rcx
        cmp     rbx, 15
        jbe     .L3
        mov     rdi, rcx
        call    operator new(unsigned long)
        mov     QWORD PTR [rsp+16], rbx
        mov     QWORD PTR [rsp], rax
.L3:
        cmp     rbx, 1
        mov     rax, QWORD PTR [rsp]
        jne     .L4
        mov     dl, BYTE PTR [r12]
        mov     BYTE PTR [rax], dl
        jmp     .L5
.L4:
        test    rbx, rbx
        je      .L5
        mov     rdi, rax
        mov     rsi, r12
        mov     rcx, rbx
        rep movsb
.L5:
        mov     rax, QWORD PTR [rsp]
        mov     QWORD PTR [rsp+8], rbx
        mov     edi, OFFSET FLAT:.LC2
        mov     BYTE PTR [rax-1+rbp], 0
        mov     rsi, QWORD PTR [rsp]
        xor     eax, eax
        call    printf
        mov     rdi, QWORD PTR [rsp]
        lea     rax, [rsp+16]
        cmp     rdi, rax
        je      .L6
        call    operator delete(void*)
        jmp     .L6
        mov     rdi, QWORD PTR [rsp]
        lea     rdx, [rsp+16]
        mov     rbx, rax
        cmp     rdi, rdx
        je      .L8
        call    operator delete(void*)
.L8:
        mov     rdi, rbx
        call    _Unwind_Resume
.L6:
        add     rsp, 32
        xor     eax, eax
        pop     rbx
        pop     rbp
        pop     r12
        ret

Using std::string_view such as:

#include <stdio.h>
#include <string_view>

void a(bool isSet) {
    // More code

    std::string_view isSetStr = isSet ? "Yes" : "No";
    //const char* isSetStr  =  isSet ? "Yes" : "No";

    // data() is NUL-terminated here only because the view wraps a string literal
    printf ( "Flag is set ? : %s\n", isSetStr.data());
    //printf ( "Flag is set ? : %s\n", isSetStr);
}

gives:

.LC0:
        .string "No"
.LC1:
        .string "Yes"
.LC2:
        .string "Flag is set ? : %s\n"
a(bool):
        test    dil, dil
        mov     eax, OFFSET FLAT:.LC0
        mov     esi, OFFSET FLAT:.LC1
        cmove   rsi, rax
        mov     edi, OFFSET FLAT:.LC2
        xor     eax, eax
        jmp     printf

So to sum up, both const char* and string_view give optimal code. string_view is a bit more to type than const char*. std::string is made for manipulating string content, so it's overkill here and leads to less efficient code.

Another remark on string_view: it does not guarantee that the string is NUL-terminated. In this case it is, since the view is built from a NUL-terminated string literal. For general string_view usage with printf, pass the length explicitly and cast it to int, which is what the * precision expects: printf("%.*s", static_cast<int>(str.length()), str.data());
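To illustrate the explicit-length form, here is a minimal sketch. The helper name formatFlag and the 64-byte buffer are made up for this example, and snprintf is used instead of printf only so that the result can be inspected. The views passed in are substrings, so they are not NUL-terminated at their end.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical helper (not from the original answer): formats a string_view
// using an explicit precision, so no NUL terminator is required at the end
// of the view. "%.*s" expects an int, hence the cast from size_t.
std::string formatFlag(std::string_view value) {
    char buffer[64];  // assumed large enough for this illustration
    std::snprintf(buffer, sizeof buffer, "Flag is set ? : %.*s",
                  static_cast<int>(value.length()), value.data());
    return buffer;
}
```

Calling formatFlag(text.substr(0, 3)) on a view of "Yes or No" only reads the first three characters, whereas passing data() to a plain %s would read on to the end of the underlying literal.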

EDIT: By disabling exception handling, you can reduce the std::string version to:

.LC0:
        .string "Yes"
.LC1:
        .string "No"
.LC2:
        .string "Flag is set ? : %s\n"
a(bool):
        push    r12
        mov     eax, OFFSET FLAT:.LC1
        push    rbp
        mov     ebp, OFFSET FLAT:.LC0
        push    rbx
        sub     rsp, 32
        test    dil, dil
        cmove   rbp, rax
        lea     r12, [rsp+16]
        mov     QWORD PTR [rsp], r12
        mov     rdi, rbp
        call    strlen
        mov     rsi, rbp
        mov     rdi, r12
        lea     rdx, [rbp+0+rax]
        mov     rbx, rax
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy_chars(char*, char const*, char const*)
        mov     rax, QWORD PTR [rsp]
        mov     QWORD PTR [rsp+8], rbx
        mov     edi, OFFSET FLAT:.LC2
        mov     BYTE PTR [rax+rbx], 0
        mov     rsi, QWORD PTR [rsp]
        xor     eax, eax
        call    printf
        mov     rdi, rsp
        call    std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_dispose()
        add     rsp, 32
        pop     rbx
        pop     rbp
        pop     r12
        ret

which is still a lot more than the string_view version. Note that the compiler was smart enough to avoid the heap allocation here, but it is still forced to compute the string's length (even though printf will compute it again itself).

4
On

Either version will allocate the string literals themselves in read-only memory. Either version uses a local variable that goes out of scope, but the string literals remain since they aren't stored locally.
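A small sketch of what that means in practice (yesNo is a hypothetical helper name, not part of the question): the local pointer goes away when the function returns, but the literal it points to does not, so returning the pointer is safe.

```cpp
// Hypothetical helper: the local variable `text` has automatic storage and
// disappears at the closing brace, but the string literals it can point to
// have static storage duration, so the returned pointer stays valid.
const char* yesNo(bool isSet) {
    const char* text = isSet ? "Yes" : "No";  // local pointer, static data
    return text;
}
```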

Regarding performance, C++ container classes are almost always going to be less efficient than "raw" C. Testing your code with g++ -O3, I get this:

void test_cstr (bool isSet)
{
  const char* isSetStr  =  isSet ? "Yes" : "No";
  printf ( "Flag is set ? : %s\n", isSetStr);
}

Disassembly (x86-64):

.LC0:
        .string "Yes"
.LC1:
        .string "No"
.LC2:
        .string "Flag is set ? : %s\n"
test_cstr(bool):
        test    dil, dil
        mov     eax, OFFSET FLAT:.LC1
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:.LC2
        cmove   rsi, rax
        xor     eax, eax
        jmp     printf

The string literals are loaded into read-only locations and the isSetStr variable is simply optimized away.

Now try this using the same compiler and options (-O3):

void test_cppstr (bool isSet)
{
  std::string isSetStr = isSet ? "Yes" : "No";
  printf ( "Flag is set ? : %s\n", isSetStr.c_str());
}

Disassembly (x86-64):

.LC0:
        .string "Yes"
.LC1:
        .string "No"
.LC2:
        .string "Flag is set ? : %s\n"
test_cppstr(bool):
        push    r12
        mov     eax, OFFSET FLAT:.LC1
        push    rbp
        push    rbx
        mov     ebx, OFFSET FLAT:.LC0
        sub     rsp, 32
        test    dil, dil
        cmove   rbx, rax
        lea     rbp, [rsp+16]
        mov     QWORD PTR [rsp], rbp
        mov     rdi, rbx
        call    strlen
        xor     edx, edx
        mov     esi, eax
        test    eax, eax
        je      .L7
.L6:
        mov     ecx, edx
        add     edx, 1
        movzx   edi, BYTE PTR [rbx+rcx]
        mov     BYTE PTR [rbp+0+rcx], dil
        cmp     edx, esi
        jb      .L6
.L7:
        mov     QWORD PTR [rsp+8], rax
        mov     edi, OFFSET FLAT:.LC2
        mov     BYTE PTR [rsp+16+rax], 0
        mov     rsi, QWORD PTR [rsp]
        xor     eax, eax
        call    printf
        mov     rdi, QWORD PTR [rsp]
        cmp     rdi, rbp
        je      .L1
        call    operator delete(void*)
.L1:
        add     rsp, 32
        pop     rbx
        pop     rbp
        pop     r12
        ret
        mov     r12, rax
        jmp     .L4
test_cppstr(bool) [clone .cold]:
.L4:
        mov     rdi, QWORD PTR [rsp]
        cmp     rdi, rbp
        je      .L5
        call    operator delete(void*)
.L5:
        mov     rdi, r12
        call    _Unwind_Resume

The string literals are still stored in read-only memory, so that part is the same. But we got a massive chunk of extra overhead code.

But on the other hand, the biggest bottleneck by far in this case is the console I/O so the performance of the rest of the code isn't even relevant. Strive to write the most readable code possible and only optimize when you actually need it. Manual string handling in C is fast, but it's also very error-prone and cumbersome.

1
On

String literals have static storage duration. They stay alive until the program ends.

Note that if the same string literal appears more than once in a program, the compiler is not required to store it as a single object.

That is, this expression

"Yes" == "Yes"

can yield either true or false depending on compiler options. By default, though, identical string literals are usually stored as a single object.
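Since comparing the literals with == compares addresses, a check that should not depend on literal merging has to compare the characters instead. A minimal sketch, with sameText and sameTextView as made-up helper names:

```cpp
#include <cstring>
#include <string_view>

// Hypothetical helpers: both compare contents, not addresses, so the result
// does not depend on whether the compiler merged identical literals.
bool sameText(const char* lhs, const char* rhs) {
    return std::strcmp(lhs, rhs) == 0;
}

bool sameTextView(std::string_view lhs, std::string_view rhs) {
    return lhs == rhs;  // string_view's operator== compares the characters
}
```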

Objects of type std::string that are declared neither at namespace scope nor with the keyword static have automatic storage duration. This means that such an object is created anew each time control enters its block and destroyed each time control leaves the block.
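One practical consequence, sketched with a hypothetical flagText function: a local std::string can be returned by value, but a pointer obtained from its c_str() must not outlive the block.

```cpp
#include <string>

// Hypothetical helper: `text` has automatic storage duration and is
// destroyed when the function returns. Returning it by value is safe,
// because the caller receives its own std::string object.
std::string flagText(bool isSet) {
    std::string text = isSet ? "Yes" : "No";
    return text;
    // Returning text.c_str() as a const char* instead would hand the caller
    // a dangling pointer into the destroyed object.
}
```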

3
On

Equivalent C++ code:

#include <cstdio>
#include <string>

using namespace std::string_literals;

void test_cppstr (bool isSet)
{
  const std::string& isSetStr = isSet ? "Yes"s : "No"s;
  printf ( "Flag is set ? : %s\n", isSetStr.c_str());
}

Almost as efficient as the C version.

Edit: This version has a small overhead at program startup and exit, but is as efficient as the C code when calling printf.

#include <cstdio>
#include <string>

using namespace std::string_literals;

const std::string yes("Yes");
const std::string no("No");

void test_cppstr (bool isSet)
{
  const std::string& isSetStr = isSet ? yes : no;
  printf ( "Flag is set ? : %s\n", isSetStr.c_str());
}

https://godbolt.org/z/v3ebcsrYE

3
On

Chill out!

The printf will be orders of magnitude slower than any construction of a std::string from const char[] data embedded in the program source code.

Always use a profiler when examining code performance. Writing a small program in an attempt to test a hypothesis will often fail to tell you anything about what is happening in your big program. In the case you present, a good compiler will optimise to

int main(){printf ( "Flag is set ? : No\n");}
0
On

The type of isSet ? "Yes" : "No" is const char*, regardless of whether you store it in a std::string or a const char* (or std::string_view, or ...), so the compiler treats the string literals the same way in every case.
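This can be checked at compile time. The sketch below assumes a C++11 or later compiler; asPointer is a made-up name used only to show the result being stored without any conversion.

```cpp
#include <type_traits>

// Each literal on its own is an lvalue array of const char...
static_assert(std::is_same<decltype("Yes"), const char (&)[4]>::value,
              "\"Yes\" is an array of 4 const char");
static_assert(std::is_same<decltype("No"), const char (&)[3]>::value,
              "\"No\" is an array of 3 const char");

// ...but the two arrays have different sizes, so inside the conditional
// expression both decay to pointers and the result is a const char*.
static_assert(std::is_same<decltype(true ? "Yes" : "No"), const char*>::value,
              "the conditional expression has type const char*");

// Hypothetical helper: stores the result as-is, with no conversion needed.
const char* asPointer(bool isSet) { return isSet ? "Yes" : "No"; }
```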

According to quick-bench.com,

the std::string version is ~6 times slower, which is understandable as it has to construct and destroy a std::string object (and, for strings too long for the small-string buffer, allocate dynamically).

Unless you need the extra features of std::string, you may as well stay with const char*.