Taking this simple C program
const char s1[] = "hello",
s2[] = "there";
and compiling it using gcc -c a.c -O0 -o a.o yields a .rodata section containing the following:
'hello\x00there\x00'
which is what I expect. Each string occupies 6 bytes, for 12 bytes in total.
However, if I change the second string to "there s", like so:
const char s1[] = "hello",
s2[] = "there s";
.rodata then contains the following:
'hello\x00\x00\x00there s\x00'
Two extra null padding bytes were added after the end of s1.
I am assuming that they were added in order to pad the first string to an 8-byte boundary (seeing as I'm on a 64-bit platform), though I may be wrong.
My question then is: why wasn't that done in the first example? Why weren't 2 extra padding bytes added to the end of each string to bring them to an 8-byte boundary?
All examples were run on an amd64/Linux/gcc machine.
Initially, both strings have 1-byte alignment (8 bits in GCC's internal units). You can check the GIMPLE dump to be sure.
If you take a look at the i386 port, you will see that DATA_ALIGNMENT is defined as ix86_data_alignment. This function is used by align_variable (in varasm.c) to align strings of 8 bytes or more to something between 8 and 32 bytes depending on their size (between 64 and 256 bits internally in GCC).
After that, you can see in assemble_variable (varasm.c) that ASM_OUTPUT_ALIGN, which prints the .align directive, is only called if the alignment is greater than BITS_PER_UNIT, which is 1 byte by default (8 bits internally in GCC).
You can find the DATA_ALIGNMENT definition in https://github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/i386.h
You can find ix86_data_alignment in https://github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/i386.c
You can find assemble_variable and align_variable in https://github.com/gcc-mirror/gcc/blob/master/gcc/varasm.c
So if you declare a string whose size is equal to or greater than 8 bytes, it will be aligned. You will see a .align x directive, with x between 8 and 32 bytes depending on the size of the string. As @Peter Cordes said, this is easier to see in the assembly output.
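To make the padding arithmetic concrete, here is a small sketch in C of how rounding an offset up to an alignment gives the bytes seen in .rodata. The helper names align_up and padding_before are made up for illustration and are not GCC functions. The sketch assumes the alignment is a power of two.

```c
#include <stddef.h>

/* Round `offset` up to the next multiple of `align`.
   Assumes `align` is a nonzero power of two. */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

/* Number of padding bytes needed after an object that ends at offset `end`
   so that the next object starts on an `align`-byte boundary. */
size_t padding_before(size_t end, size_t align)
{
    return align_up(end, align) - end;
}
```

"hello" plus its terminator ends at offset 6. With 1-byte alignment, padding_before(6, 1) is 0, which matches the first example, where "there" follows immediately. With 8-byte alignment, padding_before(6, 8) is 2, which matches the two extra \x00 bytes in front of "there s".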