What are the [[reproducible]] and [[unsequenced]] attributes in C23, and when should I use them?


C23 introduced the attributes [[reproducible]] and [[unsequenced]].

  • What is the motivation behind them?
  • How are they defined, and what effect do they have on a function?
  • What kind of functions should I apply them to?

Best Answer

The motivating problem is that the compiler has no insight into a function when its definition is not available in the current translation unit. This prevents almost all optimizations across the call (unless link-time optimization, LTO, is used).

Consider the following example:

// note: this is redundant, because [[unsequenced]] implies [[reproducible]]
int square(int x) [[reproducible]] [[unsequenced]];

// (at block scope, since these initializers are not constant expressions)
int arr[] = {
    square(2),
    square(2),
    square(3),
    square(3),
};

Even though the compiler has no definition of square, it is allowed to perform two optimizations:

  • Because the function is [[reproducible]], square(2) yields the same result when called twice in a row, so the compiler can decide to call square(2) only once and reuse the result.
  • Because the function is [[unsequenced]], calls to square can be made in any order, and the compiler could even decide to evaluate square(2) just once at program startup. It can also decide to evaluate square(3) before square(2), if this is somehow more efficient.

Such optimizations can also be enabled by defining functions inline in headers and letting the compiler infer these properties on its own. However, for complicated functions, making everything inline isn't feasible because of the cost in compile time.
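As a rough sketch of the setup described above, a header can carry only the attributed declaration while the definition lives elsewhere. The file names in the comments and the ATTR_UNSEQUENCED macro are hypothetical helpers, not part of the standard; the macro expands to nothing on compilers that do not report support for the attribute through the C23 feature test __has_c_attribute. All three parts are shown in one block only so that the example compiles on its own.

```c
/* square.h (hypothetical header) */
#if defined(__has_c_attribute)
#  if __has_c_attribute(unsequenced)
#    define ATTR_UNSEQUENCED [[unsequenced]]
#  endif
#endif
#ifndef ATTR_UNSEQUENCED
#  define ATTR_UNSEQUENCED /* attribute unavailable: expand to nothing */
#endif

/* Declaration only: callers normally never see the body. */
int square(int x) ATTR_UNSEQUENCED;

/* square.c (normally a separate translation unit) */
int square(int x) ATTR_UNSEQUENCED
{
    return x * x;
}

/* caller.c: with the attribute, the compiler may evaluate square(2)
   and square(3) once each, in either order, and reuse the results. */
int sum_of_squares(void)
{
    int arr[] = { square(2), square(2), square(3), square(3) };
    return arr[0] + arr[1] + arr[2] + arr[3];
}
```

Whether a given compiler actually merges the calls is up to its optimizer; the attribute only grants permission.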

Semi-Formal Definitions

For a more rigorous explanation, see the C23 standard working draft N3096 §6.7.12.7 Standard attributes for function types.

[[reproducible]]

This attribute asserts that a function is a reproducible function, which means that

  • it is effectless, and
  • it is idempotent

Effectless restricts what state a function can modify. If any non-local state is modified, this can only happen through pointers passed to it. For example, a void to_upper_case(char *str) function is effectless if it only modifies local variables and the contents of str. (Intuitively, the function has no observable side effects other than through its pointer parameters.)

Idempotent means that calling the function multiple times has the same effect as calling it once. For example, we can call to_upper_case(s); to_upper_case(s);, and it would have the same effect as calling it just once.
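The two properties can be illustrated with a small sketch. The body of to_upper_case below is an assumed implementation (the answer only names the function), it assumes an ASCII-compatible character set, and bump is a hypothetical counterexample added for contrast.

```c
/* Effectless: writes only to locals and to memory reached through `str`.
   Idempotent: upper-casing an already upper-cased string changes nothing.
   Assumes an ASCII-compatible execution character set. */
void to_upper_case(char *str)
{
    for (; *str != '\0'; ++str) {
        if (*str >= 'a' && *str <= 'z')
            *str = (char)(*str - 'a' + 'A');
    }
}

/* Counterexample (hypothetical): effectless, because it only writes
   through its pointer parameter, but NOT idempotent, because a second
   call changes *p again. It must not be declared [[reproducible]]. */
void bump(int *p)
{
    *p += 1;
}
```

Calling to_upper_case twice on the same buffer leaves it exactly as one call would, while calling bump twice does not.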

[[unsequenced]]

This attribute asserts that a function is an unsequenced function, which means that

  • it is effectless and idempotent (which also makes it reproducible)
  • it is stateless
  • it is independent

Stateless means that any local variable with static or thread storage duration (static or thread_local) must be const and must not be volatile.

Independent means that all calls of the function will see the same values for global variables, won't change global state, and won't change any state through pointer parameters. to_upper_case is not independent, but a function like strlen can be.
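A few hypothetical functions may help to tell these properties apart. None of the names below come from the standard; my_strlen stands in for strlen so that the example is self-contained, and the classification in the comments follows the informal definitions given here rather than the exact wording of the standard.

```c
#include <stddef.h>

/* Not stateless (and not effectless): keeps a mutable static local. */
int next_id(void)
{
    static int counter = 0;
    return ++counter;
}

/* Stateless: its only static local is const and not volatile. */
int digit_value(char c)
{
    static const char digits[] = "0123456789";
    for (int i = 0; digits[i] != '\0'; ++i) {
        if (digits[i] == c)
            return i;
    }
    return -1;
}

int scale = 1; /* mutable global, hypothetical */

/* Not independent once `scale` changes between calls: different calls
   would then observe different values of a global object. */
int scaled(int x)
{
    return x * scale;
}

/* Reads memory only through its pointer parameter and writes nothing
   non-local, so it is a candidate for [[unsequenced]]. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}
```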

Intuitively, an unsequenced function can be arbitrarily sequenced, and even run in parallel, between changes to its observed state (see also footnote 196 in the standard):

char *str = /* ... */;  // A
  strlen(str);
  global = 123;
  strlen(str);
strcpy(str, /* ... */); // B

In this example, there can be one, two, or arbitrarily many calls to strlen between points A and B. These can happen sequentially, or in parallel. No matter what, the outcome must be the same for an unsequenced function. The mutation of global is not allowed to change the result of strlen.
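To make this concrete, here is a sketch of the rewrite that such a guarantee would permit, done by hand. The names my_strlen, length_twice and length_once are hypothetical, and the sketch assumes that str does not point into global.

```c
#include <stddef.h>

int global = 0; /* unrelated global, mirroring `global` above */

/* Stand-in for strlen: reads only through its pointer parameter. */
size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}

/* As written: two calls with an unrelated store in between. */
size_t length_twice(const char *str)
{
    size_t a = my_strlen(str);
    global = 123;
    size_t b = my_strlen(str);
    return a + b;
}

/* What a compiler would be allowed to produce if my_strlen were
   declared [[unsequenced]]: a single call whose result is reused.
   Assumes `str` does not alias `global`. */
size_t length_once(const char *str)
{
    size_t a = my_strlen(str);
    global = 123;
    return a + a;
}
```

Both versions return the same value and leave global in the same state, which is exactly what makes the rewrite legal.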

Note on GCC attributes

The GCC attributes pure and const are the inspiration for these standard attributes, and behave similarly. See N2956 5.8 Some differences with GCC const and pure for a comparison. In short:

  • pure is more relaxed than [[reproducible]]
  • const is more strict than [[unsequenced]]
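For comparison, this is roughly how the GCC spellings look. GNU_PURE and GNU_CONST are hypothetical wrapper macros so that the block still compiles elsewhere, and my_strlen is a stand-in for strlen. The example shows one documented difference: GCC does not allow a const function to examine data through its pointer arguments, so a strlen-like function can only be pure there, even though it can be [[unsequenced]].

```c
#include <stddef.h>

#if defined(__GNUC__)
#  define GNU_PURE  __attribute__((pure))
#  define GNU_CONST __attribute__((const))
#else
#  define GNU_PURE  /* not a GNU-compatible compiler: expand to nothing */
#  define GNU_CONST
#endif

/* const: the result depends only on the argument values; no memory is
   read through pointers and there are no side effects. */
GNU_CONST int square(int x)
{
    return x * x;
}

/* pure: may read memory (here through `s`) but has no side effects.
   It must not be marked const, because it examines pointed-to data. */
GNU_PURE size_t my_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        ++n;
    return n;
}
```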

When To Use These Attributes

These attributes are meant for advanced users who want to take advantage of compiler optimizations.

In general, you have to be quite careful when applying them. The behavior is undefined if you apply them to a function that doesn't have the asserted properties. Compilers are encouraged to detect such misuse of these attributes, but this isn't required.

Common Examples (and surprises)

  • printf is obviously neither
  • strlen and memcmp can be [[unsequenced]] (see also the related question "Can strlen be [[unsequenced]]?")
  • memcpy can be [[reproducible]]
  • memmove can't be either, because it isn't idempotent for overlapping memory regions
  • fabs can be [[unsequenced]]
  • sqrt can't be either, because it modifies the floating point environment and may set errno
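The memmove case is easy to check by hand. In this sketch, shift_right and copy_prefix are hypothetical helpers; the first performs an overlapping move on a buffer of at least six characters, the second a non-overlapping copy of four characters.

```c
#include <string.h>

/* Overlapping move: shifts the first four characters one position to
   the right. A second call shifts the already shifted data again, so
   the operation is not idempotent. */
void shift_right(char *buf)
{
    memmove(buf + 1, buf, 4);
}

/* Non-overlapping copy: repeating it leaves `dst` unchanged. */
void copy_prefix(char *dst, const char *src)
{
    memcpy(dst, src, 4);
}
```

Starting from "abcdef", one call to shift_right gives "aabcdf" and a second call gives "aaabcf", whereas repeating copy_prefix with the same arguments has no further effect.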
