Can a designated initializer legally refer to the variable it's initializing in C99?


GCC and Clang both allow a designated initializer to refer to a member of the struct or array being initialized, but is this legal and well defined behaviour?

The following code example compiles and runs for both GCC and Clang and outputs { .a = 3, .b = 6, } in both cases:

#include <stdio.h>

typedef struct
{
    int a;
    int b;
} foo;

int main()
{
    foo bar = {
        .a = 3,
        .b = bar.a + 3,
    };
    printf("{ .a = %d, .b = %d, }\n", bar.a, bar.b);

    return 0;
}

GCC generates the following output (Compiler Explorer link) for the designated initialization, which shows that .a is stored before it is read in this example:

mov     dword ptr [rbp - 4], 0
mov     dword ptr [rbp - 16], 3
mov     eax, dword ptr [rbp - 16]
add     eax, 3
mov     dword ptr [rbp - 12], eax

Section 6.7.8 of the draft C99 spec discusses this, but I don't see how it defines this behaviour one way or another.

In particular, point 19 suggests that initialization happens in the specified order, but point 23 mentions side effects having an unspecified order. I'm unsure if the data being written to the struct is considered a side effect.

  19. The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.
  23. The order in which any side effects occur among the initialization list expressions is unspecified.

There are 3 best solutions below

0
On BEST ANSWER

This footnote

  152) In particular, the evaluation order need not be the same as the order of subobject initialization.

for this quote

23 The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.152)

means that an initialization like this

foo bar = {
        .a = 3,
        .b = bar.a + 3,
    };

invokes undefined behavior because the expression bar.a + 3 can be evaluated before the initialization of the data member a.

Undefined behavior, in particular, is described like this:

Possible undefined behavior ranges from ignoring the situation completely with unpredictable results
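
One way to sidestep the question is to make sure no initializer reads the object being initialized. The following is only a minimal sketch under that assumption; make_foo is a hypothetical helper name, not something from the question, and the foo type is repeated so the snippet stands alone.

```c
typedef struct
{
    int a;
    int b;
} foo;

/* Hypothetical helper: builds the same value as the question's example
   without reading bar during its own initialization. */
foo make_foo(int a)
{
    /* Both initializers read only the parameter a, which already has its
       value before the initializer list is evaluated, so the order in
       which the two expressions are evaluated does not matter. */
    foo bar = {
        .a = a,
        .b = a + 3,
    };
    return bar;
}
```

Calling make_foo(3) yields a struct whose members are 3 and 6, matching what GCC and Clang happened to print for the original code.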

0
On

Given something like:

struct foo {int arr[10], *p;};

there would be no problem with a definition like:

struct foo x = {.p = x.arr, .arr={1,2,3,4,5}};

making reference to the object under construction, since such a reference would not be used to actually access the object until after construction is complete. There are probably some corner cases where attempts to access the object would yield defined results (e.g. if an initialization expression of one member, as a side effect, stores to another member the value with which it has been or will be initialized), but I don't think the authors of the Standard made any effort to consider such cases in any substantive detail and determine whether or not they should be defined. I think it's clear that the actions involved with determining and writing the initial values of structure members are indeterminately sequenced with respect to each other, implying that if a struct member is accessed by some action other than the initialization itself, such action might occur before or after that member gets initialized, with whatever consequences could result from such sequencing.
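
The pointer case above can be sketched as a small self-contained function. This is an illustration rather than a definitive reading of the Standard; self_pointer_ok is a hypothetical name, and the checks run only after the declaration of x is complete.

```c
struct foo {int arr[10], *p;};

/* Hypothetical check: returns 1 if x.p ends up pointing at x.arr and the
   array holds the listed values followed by zeros. */
int self_pointer_ok(void)
{
    /* x.arr in the initializer decays to the address of x.arr[0]; only
       the address is used, no member of x is read while x is being
       initialized. */
    struct foo x = {.p = x.arr, .arr = {1, 2, 3, 4, 5}};

    /* Reads through p happen here, after initialization has finished.
       Elements 5..9 were initialized implicitly to 0. */
    return x.p == x.arr && x.p[0] == 1 && x.p[4] == 5 && x.p[9] == 0;
}
```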

0
On

You're quoting an old version of the C standard. Current drafts (since C11) have, for point 23:

The evaluations of the initialization list expressions are indeterminately sequenced with respect to one another and thus the order in which any side effects occur is unspecified.

I take that to mean that the compiler can choose to evaluate a particular initialization expression at any time prior to the moment at which that expression is used, which means that it might happen before or after the element it refers to has been initialised.

That being the case, using a (possibly) uninitialised element of the same aggregate object in an initialisation expression must result in an indeterminate value.
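
To see which order a given compiler actually picks, the initializer expressions can be given a visible side effect. The sketch below is an assumption-laden illustration: init_a, init_b, make_traced and the rank counters are hypothetical names, and whichever order a compiler shows is not guaranteed by the Standard.

```c
typedef struct
{
    int a;
    int b;
} foo;

static int call_count = 0; /* initializer expressions evaluated so far */
static int a_rank = 0;     /* position at which init_a ran (1 or 2) */
static int b_rank = 0;     /* position at which init_b ran (1 or 2) */

static int init_a(void) { a_rank = ++call_count; return 3; }
static int init_b(void) { b_rank = ++call_count; return 6; }

foo make_traced(void)
{
    call_count = a_rank = b_rank = 0;

    /* The two calls are indeterminately sequenced: one runs completely
       before the other, but the Standard does not say which. */
    foo bar = { .a = init_a(), .b = init_b() };
    return bar;
}
```

After a call to make_traced(), a_rank and b_rank record the order that was used. The stored values are 3 and 6 either way, because neither expression reads the struct being initialized.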