In the documentation of std::memory_order
on cppreference.com there is an example of relaxed ordering:
Relaxed ordering
Atomic operations tagged
memory_order_relaxed
are not synchronization operations; they do not impose an order among concurrent memory accesses. They only guarantee atomicity and modification order consistency.For example, with x and y initially zero,
// Thread 1: r1 = y.load(std::memory_order_relaxed); // A x.store(r1, std::memory_order_relaxed); // B // Thread 2: r2 = x.load(std::memory_order_relaxed); // C y.store(42, std::memory_order_relaxed); // D
is allowed to produce r1 == r2 == 42 because, although A is sequenced-before B within thread 1 and C is sequenced before D within thread 2, nothing prevents D from appearing before A in the modification order of y, and B from appearing before C in the modification order of x. The side-effect of D on y could be visible to the load A in thread 1 while the side effect of B on x could be visible to the load C in thread 2. In particular, this may occur if D is completed before C in thread 2, either due to compiler reordering or at runtime.
it says "C is sequenced before D within thread 2".
According to the definition of sequenced-before, which can be found in Order of evaluation, if A is sequenced before B, then evaluation of A will be completed before evaluation of B begins. Since C is sequenced before D within thread 2, C must be completed before D begins, hence the condition part of the last sentence of the snapshot will never be satisfied.
I believe cppreference is right. I think this boils down to the "as-if" rule [intro.execution]/1. Compilers are only bound to reproduce the observable behavior of the program described by your code. A sequenced-before relation is only established between evaluations from the perspective of the thread in which these evaluations are performed [intro.execution]/15. That means when two evaluations sequenced one after the other appear somewhere in some thread, the code actually running in that thread must behave as if whatever the first evaluation does did indeed affect whatever the second evaluation does. For example
must print 42. However, the compiler doesn't actually have to store the value 42 into an object
x
before reading the value back from that object to print it. It may as well remember that the last value to be stored inx
was 42 and then simply print the value 42 directly before doing an actual store of the value 42 tox
. In fact, ifx
is a local variable, it may as well just track what value that variable was last assigned at any point and never even create an object or actually store the value 42. There is no way for the thread to tell the difference. The behavior is always going to be as if there was a variable and as if the value 42 were actually stored in an objectx
before being loaded from that object. But that doesn't mean that the generated machine code has to actually store and load anything anywhere ever. All that is required is that the observable behavior of the generated machine code is indistinguishable from what the behavior would be if all these things were to actually happen.If we look at
then yes, C is sequenced before D. But when viewed from this thread in isolation, nothing that C does affects the outcome of D. And nothing that D does would change the result of C. The only way one could affect the other would be as an indirect consequence of something happening in another thread. However, by specifying
std::memory_order_relaxed
, you explicitly stated that the order in which the load and store are observed by another thread is irrelevant. Since no other thread can observe the load and store in any particular order, there is nothing another thread could do to make C and D affect each other in a consistent manner. Thus, the order in which the load and store are actually performed is irrelevant. Thus, the compiler is free to reorder them. And, as mentioned in the explanation underneath that example, if the store from D is performed before the load from C, then r1 == r2 == 42 can indeed come about…