I have the following code:
#include <stddef.h>

int main() {
    struct X {
        int a;
        int b;
    } x = {0, 0};
    void *ptr = (char*)&x + offsetof(struct X, b);
    *(int*)ptr = 42;
    return 0;
}
The last line performs indirect access to x.b.
Is this code well defined according to any of the C standards?
I know that:
- *(char*)ptr = 42; is defined, though only implementation-defined.
- ptr == (void*)&x.b
I guess that accessing the data pointed to by ptr via int* does not violate the strict aliasing rule, but I'm not fully sure that the standard guarantees that.
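Yes, this is perfectly well defined, and is exactly how offsetof is intended to be used. You do the pointer arithmetic on a pointer to character type, so that it is done in bytes, and then cast back to the actual type of the member.
You can see for instance 6.3.2.3 p7 (all references are to C17 draft N2176):
"When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object."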
So (char *)&x is a pointer to x converted to a pointer to char, therefore it points to the lowest addressed byte of x. When we add offsetof(struct X, b) (say it's 4) then we have a pointer to byte 4 of x. Now offsetof(struct X, b) is defined (7.19 p3) to return
"the offset in bytes, to the structure member (designated by member-designator), from the beginning of its structure (designated by type)"
so 4 is in fact the offset from the beginning of x to x.b. Hence byte 4 of x is the lowest byte of x.b, and that's what ptr points to; in other words, ptr is a pointer to x.b, but of type char *. (Strictly, ptr is declared as void * in the question, but a pointer converted to void * and back compares equal to the original, 6.3.2.3 p1, so the argument is unaffected.) When we cast it back to int *, we have a pointer to x.b which is of the type int *, exactly the same as we would get from the expression &x.b. So dereferencing this pointer accesses x.b.
A question arose in the comments about this last step: when ptr is cast back to int *, how do we know we indeed have a pointer to the int x.b? This is a bit less explicit in the standard but I think it is the obvious intent.
However, I think we can also derive it indirectly. Hopefully we agree that ptr above is a pointer to the lowest addressed byte of x.b. Now by the same passage of 6.3.2.3 p7 quoted above, taking a pointer to x.b and converting it to char *, as in (char *)&x.b, would also yield a pointer to the lowest addressed byte of x.b. As they are pointers of the same type which point to the same byte, they are the same pointer: ptr == (char *)&x.b.
Then we look at the preceding sentences of 6.3.2.3 p7:
"A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer."
There are no problems with alignment here, because char has the weakest alignment requirement (6.2.8 p6). So converting (char *)&x.b back to int * must recover a pointer to x.b, i.e. (int *)(char *)&x.b == &x.b.
But ptr is the same pointer as (char *)&x.b, so we may substitute them in this equality: (int *)ptr == &x.b.
Obviously *&x.b produces an lvalue designating x.b (6.5.3.2 p4), hence so does *(int *)ptr.
There is no problem with strict aliasing (6.5p7). First, determine the effective type of x.b using 6.5p6:
"The effective type of an object for an access to its stored value is the declared type of the object, if any. [...]"
Well, x.b does have a declared type, which is int. So its effective type is int.
Now to see if the access is legal under strict aliasing, see 6.5p7:
"An object shall have its stored value accessed only by an lvalue expression that has one of the following types: a type compatible with the effective type of the object, [...]"
We are accessing x.b through the lvalue expression *(int *)ptr, which has type int. And int is compatible with int per 6.2.7p1:
"Two types have compatible type if their types are the same. [...]"
An example of this same technique that maybe is more familiar is indexing into an array by bytes. If we have an array arr of int (with more than 17 elements), compute (char *)arr + 17 * sizeof(int), convert the result back to int *, and assign 42 through it, then this is equivalent to arr[17] = 42;.
This is how generic routines like qsort and bsearch are implemented. If we try to qsort an array of int, then within qsort all the pointer arithmetic is done in bytes, on pointers to character type, with the offsets manually scaled by the object size passed as an argument (which here would be sizeof(int)). When qsort needs to compare two objects, it converts pointers to them to const void * and passes them as arguments to the comparator function, which casts them back to const int * to do the comparison.
This all works fine and is clearly an intended feature of the language. So I think we needn't doubt that the use of offsetof in the current question is similarly an intended feature.
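To make the two patterns above concrete, here is a minimal sketch of both: member access through a byte offset, and byte-scaled element addressing in the style of qsort. The helper names set_int_at, get_int_at and elem_at are hypothetical, made up for illustration only; they are not part of any library. The helpers assume the caller passes an offset at which an int really lives.

```c
#include <assert.h>
#include <stddef.h>

struct X {
    int a;
    int b;
};

/* Hypothetical helper: store `value` into the int located `offset` bytes
   from the start of the object at `base`. The caller is responsible for
   passing an offset at which an int member actually exists. */
static void set_int_at(void *base, size_t offset, int value)
{
    char *bytes = (char *)base;        /* do the arithmetic in bytes */
    *(int *)(bytes + offset) = value;  /* cast back to the member's type */
}

/* Hypothetical helper: read the int located `offset` bytes into `base`. */
static int get_int_at(const void *base, size_t offset)
{
    const char *bytes = (const char *)base;
    return *(const int *)(bytes + offset);
}

/* Hypothetical helper: address of element `index` in an array whose
   elements are `size` bytes each. The offset is scaled by hand, much as
   a generic routine like qsort has to do. */
static void *elem_at(void *base, size_t index, size_t size)
{
    return (char *)base + index * size;
}
```

With struct X x = {0, 0};, the call set_int_at(&x, offsetof(struct X, b), 42) stores 42 into x.b and leaves x.a untouched. Likewise, for an int array arr with at least 18 elements, *(int *)elem_at(arr, 17, sizeof arr[0]) = 42; has the same effect as arr[17] = 42;.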