In my day job, I have encountered a lot of C code resembling the following pattern. I am worried about whether this pattern is safe.
typedef struct
{
    unsigned char someField : 4;
    unsigned char someOtherField : 4;
    unsigned char body[1];
} __attribute__((__packed__, aligned(1))) SomeStruct;

int main(void)
{
    unsigned char pack[16] = {0};
    SomeStruct* structPack = (SomeStruct*)pack;
    structPack->someField = 0xC;
    structPack->body[4] = 0x5;
    return 0;
}
What makes me worry is that the program uses structPack->body[4], which is still a part of the 16-byte array, but is out of bounds if we look at the definition of SomeStruct. So there are two ways to look at it:
- It is referring to a valid memory location. No danger.
- It is out of bounds, therefore undefined behaviour.
So, my questions are:
- According to the C standard (more specifically, C89), is this pattern safe or undefined behaviour?
- Also, for some specific compilers (esp. GCC) or platforms, is it safe?
- Are there better alternatives?
Note that this kind of code mostly runs on micro-controllers, and sometimes runs as an application on a Linux desktop.
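Accessing an object through an lvalue of an incompatible type is undefined behaviour. Alignment may be solved by your attribute line, but using the pointer to access the object still violates the strict aliasing rule: an object's stored value may be accessed only through an lvalue whose type is compatible with the object's effective type (plus a few listed exceptions, such as character types). The effective type of an object that has a declared type is that declared type, which here is an array of unsigned char.
SomeStruct is not compatible with a char array. The correct way of allocating a SomeStruct is to use a memory allocator such as malloc, or alloca (which allocates on the stack, if heap use is a concern) where that function is supported.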
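As a minimal sketch of the allocator approach, assuming GCC or Clang (for the attribute syntax) and a platform where malloc is available: allocated storage has no declared type, so writing to it through a SomeStruct lvalue does not run into the aliasing problem. The helper name someStructAlloc is hypothetical and only there for illustration. Note that this only addresses aliasing; body still has exactly one element.

```c
#include <stdlib.h>

typedef struct
{
    unsigned char someField : 4;
    unsigned char someOtherField : 4;
    unsigned char body[1];
} __attribute__((__packed__, aligned(1))) SomeStruct;

/* Hypothetical helper: returns a zero-initialised SomeStruct, or NULL on failure. */
static SomeStruct *someStructAlloc(void)
{
    /* malloc'd storage has no declared type, so storing through
       a SomeStruct lvalue is not an aliasing violation. */
    SomeStruct *p = malloc(sizeof *p);
    if (p != NULL) {
        p->someField = 0;
        p->someOtherField = 0;
        p->body[0] = 0;   /* index 0 is the only valid index here */
    }
    return p;
}
```

The caller owns the returned pointer and releases it with free.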
Still, there is the problem of the body member, which is an array of size one; the Standard does not permit accessing it out of bounds (i.e. body[1] and beyond). C99 introduced a solution, the flexible array member: the last member is declared as body[] with no size. When you compute the size to allocate for this struct, you add extra bytes depending on how large the body[] member needs to be.
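A rough sketch of the flexible array member approach, assuming a C99 compiler with GCC-style attributes. The helper name someStructNew and the choice of calloc are assumptions made for illustration, not something taken from the question:

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    unsigned char someField : 4;
    unsigned char someOtherField : 4;
    unsigned char body[];   /* flexible array member: must be the last member */
} __attribute__((__packed__, aligned(1))) SomeStruct;

/* Hypothetical helper: allocates a SomeStruct with room for bodyLen
   bytes in body[]. Returns NULL on failure; the caller frees the result. */
static SomeStruct *someStructNew(size_t bodyLen)
{
    /* sizeof(SomeStruct) does not count body[], so its size is added explicitly.
       calloc zero-fills the whole allocation. */
    return calloc(1, sizeof(SomeStruct) + bodyLen);
}
```

With someStructNew(15), the indices body[0] through body[14] are valid, so the body[4] store from the question stays in bounds.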