Does the sizeof result depend on the declaration of the string?


I get the size of str character array with the following test code:

#include <stdio.h>
#include <string.h>

int main()
{
    unsigned char str[] = "abcde";
    for(int j = 0; j <= 6; j++) {
        printf("str[%u]=%c;  %i\n", j, *(str+j), *(str+j));
    }
    printf("sizeof(str)=%lu\nstrlen(str)=%lu\n", sizeof(str), strlen(str));
    return 0;
}

The result is 6, as expected, as the output below shows:

str[0]=a;  97
str[1]=b;  98
str[2]=c;  99
str[3]=d;  100
str[4]=e;  101
str[5]=;  0
str[6]=;  0
sizeof(str)=6    //here it is!
strlen(str)=5

However, if I explicitly include the string dimension (5) in its declaration, like this:

unsigned char str[5] = "abcde";

Now the result of sizeof is 5 rather than the 6 I expected, as the output shows:

str[0]=a;  97
str[1]=b;  98
str[2]=c;  99
str[3]=d;  100
str[4]=e;  101
str[5]=;  0
str[6];  8
sizeof(str)=5    // why 5 and not 6???
strlen(str)=5

My question: what is the reason for that different result? Note the termination character is correctly placed after last string character, as can be seen from the examples above. Thanks for the attention.

2

There are 2 best solutions below

6
Vlad from Moscow On BEST ANSWER

The string literal "abcde" used as an initializer of the array str has 6 characters including the terminating zero character '\0'.

But you explicitly declared the array with only 5 elements:

unsigned char str[5] = "abcde";
              ^^^^^^  

So how you initialize the array does not even matter: you explicitly specified its size as 5 elements, and sizeof( unsigned char ) is always equal to 1. So sizeof( str ) is necessarily equal to 5.

Note that in this case your array does not contain a string, because it cannot accommodate the terminating zero character '\0' of the string literal. So, for example, calling the function strlen on the array invokes undefined behavior.
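As a minimal sketch of how such an array can be measured without relying on a terminator, the search for '\0' can be limited to the array bounds. The helper name bounded_len and the two wrapper functions are hypothetical, introduced only for illustration; memchr is the standard function from <string.h>.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the text in buf, never reading past cap bytes.
   Returns cap if no '\0' is found inside the buffer. */
size_t bounded_len(const unsigned char *buf, size_t cap)
{
    const unsigned char *nul = memchr(buf, '\0', cap);
    return nul ? (size_t)(nul - buf) : cap;
}

/* The array holds no '\0', so the search stops at the end of the array. */
size_t len_without_terminator(void)
{
    unsigned char str[5] = "abcde";   /* valid C, but not a string */
    return bounded_len(str, sizeof str);
}

/* Here the '\0' is stored at index 5 and is found there. */
size_t len_with_terminator(void)
{
    unsigned char str[6] = "abcde";
    return bounded_len(str, sizeof str);
}
```

Both functions return 5, but only the second array could be passed to strlen safely.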

Unlike in C, such a declaration is invalid in C++. In C++ you must write at least

unsigned char str[6] = "abcde";

or, as in your first declaration of the array,

unsigned char str[] = "abcde";

In the last case the number of elements in the array is equal to the number of characters in the string literal, including the terminating zero character, that is 6.
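The relation between the three declarations and sizeof can be checked at compile time. This is only a sketch, assuming a C11 compiler (for _Static_assert); the array names and the function deduced_count are made up for illustration.

```c
#include <stddef.h>

unsigned char deduced[]    = "abcde";  /* size taken from the literal: 5 chars + '\0' */
unsigned char exact[6]     = "abcde";  /* same size, written explicitly */
unsigned char truncated[5] = "abcde";  /* C only: the '\0' does not fit and is dropped */

/* sizeof reports the declared number of bytes, not the length of the text. */
_Static_assert(sizeof deduced == 6, "deduced size includes the terminator");
_Static_assert(sizeof exact == 6, "explicit size 6");
_Static_assert(sizeof truncated == 5, "explicit size 5");
_Static_assert(sizeof "abcde" == 6, "the literal itself occupies 6 bytes");

/* Number of elements in the array whose size was deduced from the literal. */
size_t deduced_count(void)
{
    return sizeof deduced / sizeof deduced[0];
}
```

If any of these assumptions were wrong, the translation unit would fail to compile instead of misbehaving at run time.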

Also, to output values of the type size_t you should use the conversion specifier zu instead of lu, because size_t is not necessarily an alias for the type unsigned long. On some systems it can be an alias for the type unsigned long long.

From the C Standard (7.19 Common definitions <stddef.h>)

Recommended practice

4 The types used for size_t and ptrdiff_t should not have an integer conversion rank greater than that of signed long int unless the implementation supports objects large enough to make this necessary.

So you need to write

printf("sizeof(str)=%zu\nstrlen(str)=%zu\n", sizeof(str), strlen(str));
2
Sergej Christoforov On

Calling strlen(str) in the second case is undefined behavior.

Note the termination character is correctly placed after last string character, as can be seen from the examples above.

No, it's not. As per C11 6.7.9, paragraph 14:

An array of character type may be initialized by a character string literal or UTF-8 string literal, optionally enclosed in braces. Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

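There's no room for the terminating null character, so it's not added: the array has exactly the 5 elements you declared, which is why sizeof yields 5. The zero you printed at str[5] lies outside the array, and reading it is itself undefined behavior.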