-Thread 1-
y.store (20, memory_order_release);
x.store (10, memory_order_release);
-Thread 2-
if (x.load(memory_order_acquire) == 10) {
    assert (y.load(memory_order_acquire) == 20);
    y.store (10, memory_order_release);
}
-Thread 3-
if (y.load(memory_order_acquire) == 10) {
    assert (x.load(memory_order_acquire) == 10);
}
The “Overall Summary” paragraph of the GCC Atomic Wiki says that the assert (x.load(memory_order_acquire) == 10) in Thread 3 above can fail, but I don't understand why.
My understanding is:
- Thread 3 cannot reorder its two loads (LoadLoad) because of the acquire barrier.
- Thread 1 cannot reorder its two stores (StoreStore) because of the release barrier.
- When Thread 2 reads x == 10, Thread 1 must already have flushed x from its store buffer to cache, so every thread knows that x has changed (for example, through cache-line invalidation).
- Thread 3 uses an acquire barrier, so it can see x == 10.
This is a bad example, though it does illustrate how mind-warping relaxed atomics can be, I suppose.
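[intro.execution]p9 (paraphrased): every value computation and side effect of a full-expression is sequenced before every value computation and side effect of the next full-expression to be evaluated.
[atomics.order]p2 (paraphrased): an atomic operation A that performs a release operation on an atomic object M synchronizes with an atomic operation B that performs an acquire operation on M and takes its value from any side effect in the release sequence headed by A.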
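As a result, the evaluations shown are chained together by sequenced-before and synchronizes-with relationships: the x.store(10) in Thread 1 synchronizes with the x.load in Thread 2 that reads 10, which is sequenced before the y.store(10) in Thread 2, which synchronizes with the y.load in Thread 3 that reads 10, which is sequenced before the x.load in Thread 3's assert.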
And so each evaluation in the chain happens before the next (see [intro.races]p9-10).
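To make the chain concrete, here is a minimal sketch of the question's three threads with each link labelled in the comments. It assumes C++11 std::thread. The harness names (run_once, count_violations, violations) are hypothetical and not part of the original code, and the two asserts are replaced by a counter so the check still runs when NDEBUG is defined.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0}, y{0};
std::atomic<int> violations{0};  // hypothetical counter standing in for the asserts

// Runs the three threads from the question once.
void run_once() {
    // Reset before the threads start; thread creation orders these stores
    // before everything the new threads do.
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);

    std::thread t1([] {
        y.store(20, std::memory_order_release);
        x.store(10, std::memory_order_release);            // (1)
    });
    std::thread t2([] {
        if (x.load(std::memory_order_acquire) == 10) {     // (2): (1) synchronizes with (2)
            if (y.load(std::memory_order_acquire) != 20) ++violations;
            y.store(10, std::memory_order_release);        // (3): (2) is sequenced before (3)
        }
    });
    std::thread t3([] {
        if (y.load(std::memory_order_acquire) == 10) {     // (4): (3) synchronizes with (4)
            // (5): (4) is sequenced before (5), so (2) happens before (5)
            if (x.load(std::memory_order_acquire) != 10) ++violations;
        }
    });
    t1.join();
    t2.join();
    t3.join();
}

// Repeats the experiment and returns how many checks failed.
int count_violations(int rounds) {
    assert(rounds > 0);
    violations.store(0);
    for (int i = 0; i < rounds; ++i) run_once();
    return violations.load();
}
```

Calling count_violations(200) from main should return 0 on a conforming implementation. A passing run does not prove anything by itself; the guarantee comes from the happens-before chain, not from the test.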
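[intro.races]p15 (read-read coherence, paraphrased): if a value computation A of an atomic object M happens before a value computation B of M, and A takes its value from a side effect X on M, then B must take its value either from X or from a side effect that follows X in the modification order of M.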
Here, A is the load in Thread 2 that took the value 10, and B is the load in Thread 3 (in the assert). Since A happens before B, and there are no other side effects on x, B must also read 10.
Herb Sutter has a much simpler example on his blog:
You absolutely need sequential consistency to guarantee that at most one line is printed.
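The blog's code is not reproduced here. As an assumption about which pattern is meant, the following sketch shows the classic store-buffering test, where each thread stores to its own flag and then checks the other's. The names (a, b, printed, lines_printed_once, max_lines_printed) are hypothetical, and a counter stands in for printing a line. With memory_order_seq_cst all four operations fall into one total order, so at most one thread can still read 0. If the stores were release and the loads acquire, both threads would be allowed to read 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// One run of the store-buffering pattern; returns how many threads "printed".
int lines_printed_once() {
    std::atomic<int> a{0}, b{0};
    std::atomic<int> printed{0};  // incremented instead of printing a line

    std::thread t1([&] {
        a.store(1, std::memory_order_seq_cst);
        if (b.load(std::memory_order_seq_cst) == 0) ++printed;
    });
    std::thread t2([&] {
        b.store(1, std::memory_order_seq_cst);
        if (a.load(std::memory_order_seq_cst) == 0) ++printed;
    });
    t1.join();
    t2.join();
    return printed.load();
}

// Repeats the run and returns the largest number of "lines" seen in any run.
int max_lines_printed(int rounds) {
    assert(rounds > 0);
    int worst = 0;
    for (int i = 0; i < rounds; ++i) {
        int n = lines_printed_once();
        if (n > worst) worst = n;
    }
    return worst;
}
```

With seq_cst, max_lines_printed never exceeds 1. Whether a weaker ordering actually produces 2 on a given machine depends on the hardware and compiler, so a clean run with acquire/release would not show that the weaker version is safe.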