Formal definition of restrict fails to account for valid cases


The following reasonable program seems to have undefined behavior according to the formal definition of restrict in the C standard:

#include <stddef.h> /* size_t */
#include <stdio.h>  /* fprintf */

void positive_intcpy(int * restrict q, const int * restrict p, size_t n) {
  int *qBgn = q;
  const int *pEnd = p + n;
  // sequence point S
  while (p != pEnd && *p>0) *q++ = *p++;
  if (q != qBgn) fprintf(stderr,"Debug: %d.\n",*(q-1)); // undefined behavior!?
}
int main(void) {
  int a[6] = {4,3,2,1,0,-1};
  int b[3];
  positive_intcpy(b,a,3);
  return 0;
}

The function copies integers from one array to another, as long as the integers are positive. The fprintf call displays the last positive integer that was copied (if any). There is never any aliasing between p and q.

Is this really UB, or is my reasoning wrong?

This question concerns section 6.7.3.1 of the C99 standard. The relevant text is unchanged in the latest draft for C23.

We are talking about the pointer expression q-1 marked above. Is it based on the restricted pointer object designated by p?

The standard says:

In what follows, a pointer expression E is said to be based on object P [where P is a restrict-qualified pointer object] if (at some sequence point in the execution of B [where B is the block associated with the declaration of P] prior to the evaluation of E) modifying P to point to a copy of the array object into which it formerly pointed would change the value of E [see footnote]. Note that "based" is defined only for expressions with pointer types.

[footnote] In other words, E depends on the value of P itself rather than on the value of an object referenced indirectly through P. For example, if identifier p has type (int **restrict), then the pointer expressions p and p+1 are based on the restricted pointer object designated by p, but the pointer expressions *p and p[1] are not.
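The footnote's distinction can be made concrete with a small sketch (the function name and values here are illustrative, not from the standard):

```c
/* A minimal sketch of the footnote's examples. For pp of type
   int **restrict, the expressions pp and pp + 1 depend on the value of
   pp itself (they are "based on" pp), while *pp and pp[1] depend only
   on memory reached through pp (they are not). */
int footnote_demo(void) {
    int a = 1, b = 2;
    int *row[2] = { &a, &b };
    int **restrict pp = row;

    int **e1 = pp;      /* based on pp: its value is pp's own value   */
    int **e2 = pp + 1;  /* based on pp: an offset from pp's own value */
    int *e3  = *pp;     /* not based on pp: obtained through pp       */
    int *e4  = pp[1];   /* not based on pp: also obtained through pp  */

    return (e1 == row) + (e2 == row + 1) + *e3 + *e4;  /* 1+1+1+2 */
}
```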

In our program, at the sequence point S marked above, modifying p to point to a copy of the a array would cause p != pEnd to always be true (because pEnd is not modified together with p), thus the loop would execute until *p>0 becomes false, and thus the value of q at the end of the loop would change (it would point one element further). Therefore we conclude that our expression q-1 is based on p.

Now the standard says:

During each execution of B, let L be any lvalue that has &L based on P. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T [where T is the type to which P is declared to point] shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P. Every access that modifies X shall be considered also to modify P, for the purposes of this subclause. If P is assigned the value of a pointer expression E that is based on another restricted pointer object P2, associated with block B2, then either the execution of B2 shall begin before the execution of B, or the execution of B2 shall end prior to the assignment. If these requirements are not met, then the behavior is undefined.

In our case, L is *(q-1). X is the object at &b[2], which has been modified in B (it was set to 2). However, T is const int, which violates the requirement that T shall not be const-qualified, so the program has UB.

You might say that this is harmless, because the compiler will probably not take advantage of such a hard-to-prove UB case and so won't mis-optimize it.

But the same logic can be applied in reverse, where it becomes truly dangerous:

#include <stdio.h> /* printf */

int f(int *restrict p, int *restrict q) {
  int *p0=p, *q0=q;
  // sequence point S
  if (p != p0) q++;
  if (q != q0) p++;
  *p = 1;
  *q = 2;
  return *p + *q;
}
int main(void) {
  int x;
  printf("%d\n", f(&x,&x));
  return 0;
}

GCC does optimize here. GCC -O0 prints 4, while GCC -O3 prints 3. Unfortunately, reading the standard literally, GCC must be wrong.

Here's why:

  1. In the expression *q = 2, q is "based on" p (because if p was modified at sequence point S to point to a copy of x, the conditional p != p0 would become true, changing the value of q).
  2. In the expression *p = 1, p is "based on" q (because if q was modified at sequence point S to point to a copy of x, the conditional q != q0 would become true, changing the value of p).
  3. In the body of f, any access to any object via a pointer expression is always both "based on" p and "based on" q. So there is no restrict violation, neither of restrict p nor of restrict q. Note that no restrict-qualified pointer is ever assigned a pointer expression inside f (because p++ and q++ never actually execute).
  4. Since there is no restrict violation, the program doesn't have UB and the compiler is not allowed to optimize.

It seems to me that the standard failed to define "based on" in the intended way. What is the intended way, then? I mean, clearly there is an intended meaning, which is allowing GCC to optimize, and I believe GCC is morally right in this case. I want to avoid writing programs that might be mis-optimized.

3 Answers

Answer by John Bollinger

Is this really UB, or is my reasoning wrong?

Your reasoning is inconsistent with the intent of the specification.

In our program, at the sequence point S marked above, modifying p to point to a copy of the a array would cause p != pEnd to always be true (because pEnd is not modified together with p)

But this, and the reasoning that follows from it, is not what the spec is trying to describe. The spec isn't interested in differences arising from changes in control flow, such as those you present. It's actually quite hard to describe precisely, so perhaps you can forgive the committee, but here's a go:

  • consider an execution of the program, and look at all the expression evaluations that have side effects that affect the value of some evaluation of expression E.

  • imagine modifying the value of p as described, and performing all those same evaluations again, taking into account the change in p's value and the differences propagating through the results, but not choosing more, fewer, or different expressions to evaluate, nor performing them in a different order.

  • if the ultimate evaluation of E yields a different value, then E is based on p.

That's not perfect, I don't think, but I hope it is adequate for explaining the difference between what the spec means to say and what you present.

In that light, I hope you see that the fact that a change to the value of p at some carefully selected place could cause a different sequence of evaluations to be performed is not germane. In evaluating "based on", we look at effects on the evaluations that were performed, not at how that might have changed the sequence of evaluations.
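To make that criterion tangible, here is a hedged sketch applying it to the question's loop: replay exactly the same three copy evaluations, once with the original array and once with p pointing at a copy, and observe that q's final value does not change (the function names are illustrative):

```c
#include <stddef.h> /* size_t */

/* Keep the evaluation sequence fixed (the same three iterations, in the
   same order) and change only which array p points into. q's final value
   is unaffected, so q (and hence q - 1) is not "based on" p in the
   intended sense. */
static int *run_fixed_trace(int *q, const int *p) {
    for (size_t i = 0; i < 3; i++)
        *q++ = *p++;  /* the same evaluations, regardless of p's target */
    return q;
}

int criterion_demo(void) {
    int a[6]    = { 4, 3, 2, 1, 0, -1 };
    int copy[6] = { 4, 3, 2, 1, 0, -1 };  /* "a copy of the array object" */
    int b1[3], b2[3];
    int *q1 = run_fixed_trace(b1, a);
    int *q2 = run_fixed_trace(b2, copy);
    return (q1 - b1) == (q2 - b2);  /* same final offset for q */
}
```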

Since q and q - 1 are not, in fact, "based on" p in the sense intended by the spec, the program does not violate the requirements related to restrict-qualified types.


But the same logic can be applied in reverse, where it

... is irrelevant, because it is wrong in the first place.

But let's look at the program in this example:

#include <stdio.h> /* printf */

int f(int *restrict p, int *restrict q) {
  int *p0=p, *q0=q;
  // sequence point S
  if (p != p0) q++;
  if (q != q0) p++;
  *p = 1;
  *q = 2;
  return *p + *q;
}
int main(void) {
  int x;
  printf("%d\n", f(&x,&x));
  return 0;
}

p, p0, p != p0, and p++ are all based on p, but not on q.

q, q0, q != q0, and q++ are all based on q, but not on p.

Either *p or *q can fill the role of the spec's L, with the other performing the restrict-violating modification to the one underlying object. The behavior is therefore undefined, and GCC can do whatever it wants with it.

Answer by dbush

Your assessment that the wording of 6.7.3.1p3 allows for the case you mentioned is correct. This wording also goes against the intended meaning.

The intent is first mentioned in section 6.7.3p8:

An object that is accessed through a restrict-qualified pointer has a special association with that pointer. This association, defined in 6.7.3.1 below, requires that all accesses to that object use, directly or indirectly, the value of that particular pointer.

So by this description q and q-1 are based on the object q, not the object p.

An example is given in section 6.7.3.1p7:

EXAMPLE 1 The file scope declarations

int * restrict a;
int * restrict b;
extern int c[];

assert that if an object is accessed using one of a, b, or c, and that object is modified anywhere in the program, then it is never accessed using either of the other two

Which further backs up this intent.

By a strict reading of 6.7.3.1p3, the expression q-1 is based on both the objects p and q, which doesn't seem to make much sense and is counter to the above quoted description and example.

The intent of 6.7.3.1p3, I believe, was to capture transitive expressions such as the following:

int x[5];
int * restrict p = x;
int *p1 = p+2;    // p1 is based on p
int *p2 = p1-1;   // p2 is also based on p

As simply stating "an expression including p among its operands" wouldn't account for such a case.
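A runnable version of that transitive chain might look like this (a sketch; the function name and values are illustrative):

```c
/* Both p1 and p2 are "based on" p under the intended, transitive
   reading: each is computed, directly or indirectly, from p's value. */
int transitive_demo(void) {
    int x[5] = { 10, 20, 30, 40, 50 };
    int * restrict p = x;
    int *p1 = p + 2;   /* p1 is based on p      */
    int *p2 = p1 - 1;  /* p2 is also based on p */
    return *p1 + *p2;  /* x[2] + x[1] */
}
```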

A possible rewording of 6.7.3.1p3 could be the following:

A pointer expression E is said to be based on object P if any of the following holds:

  • E contains P as an operand
  • E contains an lvalue V as an operand, and V was set to the value of an expression E2 which is based on P.
Answer by log65536

I became convinced that the heart of the matter here is that the standard essentially says: if some access to X is through an lvalue L1 based on P, and simultaneously some other access to X is through an lvalue L2 not based on P, the behavior is undefined.

This places enormous strain on the definition of "based on".

If the definition of "based on" is too tight (as in: "based on" shall hold only through such-and-such constructs), L1 would be judged "based on" less often while simultaneously L2 would be judged "not based on" more often. Such an approximation to the ideal definition is not conservative: it has the potential to make valid and sensible programs UB. (Say int f(int *q, int *restrict p) { h(&p,&q); *p=1,*q=2; return *p+*q; } and "calling an external function" was not among the constructs that the compiler thought could base q on p. The compiler could return 3, while the correct answer could be 4. This was an actual GCC bug.)
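One way that external-function case can play out, sketched under the assumption that h simply copies p's value into q (the definitions of h and f_ext here are illustrative, not the actual code from the GCC bug report):

```c
/* An external function the compiler cannot see into may make q "based
   on" p; here h copies p's value into q, so afterwards q aliases p. */
static void h(int * restrict *pp, int **qp) {
    *qp = *pp;  /* q is now based on p */
}

static int f_ext(int *q, int * restrict p) {
    h(&p, &q);
    *p = 1;
    *q = 2;          /* q aliases p: both writes hit the same object */
    return *p + *q;  /* well-defined: 2 + 2 */
}

int external_call_demo(void) {
    int x = 0;
    return f_ext(&x, &x);
}
```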

On the other hand, if the definition of "based on" is too loose, L1 would be judged "based on" more often while simultaneously L2 would be judged "not based on" less often. Such an approximation to the ideal definition is again not conservative: it has the potential to make valid and sensible programs UB. (The first example in the question, regarding fprintf and q-1, demonstrates this.)

In such a setting, basically nothing less than an absolutely exact ideal definition of "based on" would suffice in order to keep valid programs from being judged as UB.

I think the spec doesn't succeed in making such an exact definition. In the question, the answers, the comments, and answers to related questions, we have seen problems that arise with the use of conditional expressions, unknown external functions, and assignment via known pointer addresses obtained from the environment (if (p==A) q=A; else if (p==B) q=B; ...); some function might even suspend the process, show a core dump to a human, and ask them what to assign to q. A definition based on a hypothetical (what would happen if p were to point to a copy of the array object) is hard to justify in many of these cases. I have yet to be convinced that it is even theoretically possible to define the concept of "based on" in a self-consistent way.
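The environment-assignment case mentioned above can be sketched like this (the names A, B, and cond_demo are illustrative): q ends up equal to p without ever being assigned any expression containing p; only comparisons transfer the information.

```c
/* q acquires p's value purely through comparisons: each branch leaks
   one bit of information about p, never p's value as an expression. */
int cond_demo(void) {
    static int A_obj, B_obj;      /* stand-ins for "known addresses"      */
    int *A = &A_obj, *B = &B_obj;
    int * restrict p = A;         /* in general, unknown to the compiler  */
    int *q = 0;
    if (p == A)      q = A;       /* is q now "based on" p?               */
    else if (p == B) q = B;
    return q == p;                /* here: 1, q ends up equal to p        */
}
```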

Sometimes just part of the information that the address in p holds is transferred to q. A conditional transfers just one bit of information; repeated conditionals transfer more bits. Saying either "yes" or "no" in such a case will give either too tight or too loose a definition, and since neither direction is conservative, an ideal definition that cannot be defeated by a contrived example might be impossible.

And practically speaking, compilers can never work with an ideal definition, because of limited reasoning abilities (complete reasoning about Turing-complete programs is theoretically impossible).

So they have to work with conservative approximations. And because neither direction is conservative, they need to work with two different approximations. A tight approximation for the L1 case and a loose approximation for the L2 case. That would allow them to judge L1 "based on" P less often, while simultaneously judging L2 "not based on" P also less often.

As if the standard had said: if a certain access of X is through an lvalue L1 surely-based on P, and simultaneously a certain other access of X is through an lvalue L2 surely-not based on P, the behavior is undefined.

That would be workable. And I guess that's what compilers actually do.

In the most conservative scenario (-O0) a compiler can assume that no L1 whatsoever is surely-based on P and no L2 whatsoever is surely-not based on P. Thus there is never UB and it will do no optimizations.

A cleverer compiler (-O3) can assume more expressions to be surely-based on P and more expressions to be surely-not based on P. With more reasoning power, it can increase both of these sets, as long as they stay disjoint and sensible. A compiler can be morally correct this way, even if it is incorrect by the broken definition of the standard, and even if a self-consistent idealized single definition is not theoretically possible at all.

So maybe the standard should adopt this approach, and define the largest sensible set of expressions that are guaranteed to be "based on", and also the largest sensible set of expressions that are guaranteed to be "not based on". Expressions not participating in either set do not contribute towards UB, because UB is only granted via the vetted sets ("based on" for L1 and "not based on" for L2).

This way, it would be easy for compilers to check that they comply, and it would be easy for programmers to check that their programs won't misbehave.

I post this as an answer in the hope that it could be useful to somebody when reasoning about restrict, but I won't mark it as accepted because, well... this is only wishful thinking and not the approach that the standard actually takes. And I have no insider information about what approach the compilers actually take.