Trailing padding in C/C++ in nested structures - is it necessary?


This is more of a theoretical question. I'm familiar with how padding and trailing padding work.

struct myStruct{
    uint32_t x;
    char*    p;
    char     c;
};

// myStruct layout will compile to
// x:       4 Bytes
// padding: 4 Bytes
// *p:      8 Bytes
// c:       1 Byte
// padding: 7 Bytes
// Total:   24 Bytes

There needs to be padding after x, so that *p is aligned, and there needs to be trailing padding after c so that the whole struct size is divisible by 8 (in order to get the right stride length). But consider this example:

struct A{
    uint64_t x;
    uint8_t  y;
};

struct B{
    struct A myStruct;
    uint32_t c;
};

// Based on all information I read on internet, and based on my tinkering
// with both GCC and Clang, the layout of struct B will look like:
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// myStruct.padding: 7 Bytes
// c:                4 Bytes
// padding:          4 Bytes
// total size:       24 Bytes
// total padding:    11 Bytes
// padding overhead: 45%

// my question is, why struct A does not get "inlined" into struct B,
// and therefore why the final layout of struct B does not look like this:
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// padding           3 Bytes
// c:                4 Bytes
// total size:       16 Bytes
// total padding:    3 Bytes
// padding overhead: 19%

Both layouts satisfy the alignment requirements of all variables. Both layouts keep the variables in the same order. In both layouts struct B has the correct stride length (divisible by 8 Bytes). The only thing that differs (besides the 33% smaller size) is that struct A does not have the correct stride length in layout 2, but that should not matter, since clearly there is no array of struct A here.

I checked this layout in GCC with -O3 and -g, struct B has 24 Bytes.

My question is - is there some reason why this optimization is not applied? Is there some layout requirement in C/C++ that forbids this? Or is there some compilation flag I'm missing? Or is this an ABI thing?

EDIT: Answered.

  1. See the answer from @dbush on why the compiler cannot emit this layout on its own.
  2. The following code example uses the GCC attributes packed and aligned (as suggested by @jaskij) to manually enforce the more compact layout. Struct B_packed takes only 16 Bytes instead of 24 Bytes. Note that this code might cause issues or run slowly when there is an array of B_packed structs, so be aware of that and don't blindly copy it:
struct __attribute__ ((__packed__)) A_packed{
    uint64_t x;
    uint8_t  y;
};

struct __attribute__ ((__packed__)) B_packed{
    struct A_packed myStruct;
    uint32_t c __attribute__ ((aligned(4)));
};

// Layout of B_packed will be
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// padding for c:    3 Bytes
// c:                4 Bytes
// total size:       16 Bytes
// total padding:    3 Bytes
// padding overhead: 19%

5
dbush On BEST ANSWER

is there some reason why this optimization is not applied

If this were allowed, the value of sizeof(struct B) would be ambiguous.

Suppose you did this:

struct B b;
struct A a = { 1, 2 };
b.c = 0x12345678;
memcpy(&b.myStruct, &a, sizeof(struct A));

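You'd be overwriting the value of b.c: sizeof(struct A) is 16, so memcpy writes 16 bytes starting at b.myStruct, and in the proposed layout b.c would occupy bytes 12 to 15 of that range.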

1
gnasher729 On

Padding is used to force alignment. Now if you have an array of struct myStruct, then there is a rule that array elements follow each other without any padding. In your case, without padding inside myStruct after the last field, the second myStruct in an array wouldn't be properly aligned. Therefore it is necessary that sizeof(myStruct) is a multiple of the alignment of myStruct, and for that you may need enough padding at the end.