Does dereferencing a char* inhibit strict aliasing optimizations?


Consider the following snippet as an example:

*pInt = 0xFFFF;
*pFloat = 5.0;

Since they are int and float pointers, the compiler may assume they don't alias and can, for example, swap the order of the two stores.

Now let's assume we spice it up with this:

*pInt = 0xFFFF;
*pChar = 'X';
*pFloat = 5.0;

Since char* is allowed to alias anything, pChar may point into *pInt, so the assignment to *pInt cannot be moved past the assignment to *pChar: that store may legitimately set the first byte of *pInt to 'X'. Similarly, pChar may point into *pFloat, so the assignment to *pFloat cannot be moved before the char assignment, because the code may intend to overwrite the effect of the byte store by reassigning *pFloat.

Does this mean I can write and read through char* to create barriers against reordering and other strict-aliasing-related optimizations?


1
On BEST ANSWER

Pointer aliasing mostly matters in scenarios where the compiler can't know whether one pointer variable aliases another, for example when you compile a function located in a different translation unit than its caller.

void func (char* pChar, float* pFloat)
{
  *pChar = 'X';
  *pFloat = 5.0;
}

Here the pFloat assignment indeed cannot be reordered before the pChar one, because the compiler can't deduce that pChar does not point at the same location as pFloat.

However, when facing this scenario, the compiler can (and probably will) add a run-time check to see if the addresses could be pointing at overlapping memory or not. If they do, then the code must be sequenced in the given order. If not, then the code may be re-organized and optimized.

Meaning that you would only get memory barrier-like behavior in case the pointers actually do alias/point at overlapping memory. If not, then all bets regarding instruction ordering would be off. So this is probably not a mechanism that you should rely upon.
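The run-time check described above can be written out by hand to make the idea concrete. This is only a sketch of what a compiler might conceptually generate, not what any particular compiler emits; the names byte_inside and func_versioned are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: does the byte at p lie inside the object at obj? */
static int byte_inside(const void *obj, size_t size, const char *p)
{
  uintptr_t base = (uintptr_t)obj;
  uintptr_t addr = (uintptr_t)p;
  return addr >= base && addr - base < size;
}

/* Hand-written equivalent of func() with the overlap check made explicit. */
void func_versioned(char *pChar, float *pFloat)
{
  assert(pChar != NULL && pFloat != NULL);
  if (byte_inside(pFloat, sizeof *pFloat, pChar)) {
    /* Possible overlap: the stores must stay in source order. */
    *pChar = 'X';
    *pFloat = 5.0f;
  } else {
    /* No overlap: the order is not observable, so either order is valid. */
    *pFloat = 5.0f;
    *pChar = 'X';
  }
}
```

In both branches the caller sees the same final state as the original func; only the non-overlapping branch leaves the compiler free to pick an order, which is why the char store gives no ordering guarantee in that case.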

3
On

I think in general you cannot use that as a sort of sequencing barrier. The reason is that the compiler could do some sort of versioning of your code:

if (pInt == pChar || pFloat == pChar) {
  // be careful
} else {
  // no aliasing
}

Clearly, for the simple case that you are presenting this has no advantage at all, but it could be beneficial if your pointers don't change over a large section of the code.

If you were using this only as a "barrier" with a dummy pChar, the else branch would always be taken. In that branch the compiler can assume that no aliasing occurs and is free to reorder the assignments.

The only otherwise unrelated data for which the C standard gives reordering guarantees are atomic objects that are operated on with sequential consistency.
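As a minimal sketch of that alternative, assuming a C11 compiler with <stdatomic.h>: the plain atomic_store and atomic_load calls default to memory_order_seq_cst, so the two stores below keep their order as seen by other threads. The names payload, ready, publish and try_consume are invented for this example.

```c
#include <stdatomic.h>

/* Two otherwise unrelated atomic objects (zero-initialized). */
static atomic_int payload;
static atomic_int ready;

/* Writer: both stores are sequentially consistent, so the store to
   payload cannot be reordered after the store to ready. */
void publish(int value)
{
  atomic_store(&payload, value);
  atomic_store(&ready, 1);
}

/* Reader: returns 1 and fills *out once the writer has published. */
int try_consume(int *out)
{
  if (atomic_load(&ready)) {
    *out = atomic_load(&payload);
    return 1;
  }
  return 0;
}
```

A reader that observes ready == 1 is then guaranteed to see the published payload, which is the kind of ordering a dummy char store cannot provide.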

0
On

If a program needs to use pointer-based type punning, the only reliable way to ensure it will work with gcc, and probably also with clang, is to use `-fno-strict-aliasing`. "Modern" compilers will aggressively strip out code which can't change the bits that are held by an object, and then use the resulting lack of such code to "justify" optimizations that would otherwise not be legal. For example, given

struct s1 {unsigned short x;};
struct s2 {unsigned short x;};

int test(struct s1 *p1, struct s2 *p2)
{
  if (p1->x)
  {
    p2->x = 12;
    unsigned char *cp = (unsigned char*)p1;
    unsigned char c0=cp[0] ^ 1,c1=cp[1] ^ 2;
    cp[0]=c0 ^ 1; cp[1]=c1 ^ 2;
  }
  return p1->x;
}

both clang and gcc will generate code that returns the value that p1->x had when the "if" statement was executed. I see nothing in the Standard that would justify that: if p1==p2, the entire contents of *p2 will be read via character types into discrete objects of type "unsigned char", which is defined behavior, and the contents of those discrete objects will be used to overwrite the entire contents of *p1, which is also defined behavior. Nevertheless, both gcc and clang will decide that since the values written to cp[0] and cp[1] match what is already there, they can omit those operations.