I don't really understand what exactly is causing the problem in this example.
Here is a snippet from my book:
Based on the discussion of the MESI protocol in the preceding section, it would seem that the problem of data sharing between L1 caches in a multicore machine has been solved in a watertight way. How, then, can the memory ordering bugs we’ve hinted at actually happen? There’s a one-word answer to that question: Optimization. On most hardware, the MESI protocol is highly optimized to minimize latency. This means that some operations aren’t actually performed immediately when messages are received over the ICB. Instead, they are deferred to save time. As with compiler optimizations and CPU out-of-order execution optimizations, MESI optimizations are carefully crafted so as to be undetectable by a single thread. But, as you might expect, concurrent programs once again get the raw end of this deal. For example, our producer (running on Core 1) writes 42 into `g_data` and then immediately writes 1 into `g_ready`. Under certain circumstances, optimizations in the MESI protocol can cause the new value of `g_ready` to become visible to other cores within the cache coherency domain before the updated value of `g_data` becomes visible. This can happen, for example, if Core 1 already has `g_ready`’s cache line in its local L1 cache, but does not have `g_data`’s line yet. This means that the consumer (on Core 2) can potentially see a value of 1 for `g_ready` before it sees a value of 42 in `g_data`, resulting in a data race bug.
Here is the code:
```cpp
int32_t g_data = 0;
int32_t g_ready = 0;

void ProducerThread() // running on Core 1
{
    g_data = 42;
    // assume no instruction reordering across this line
    g_ready = 1;
}

void ConsumerThread() // running on Core 2
{
    while (!g_ready)
        PAUSE();
    // assume no instruction reordering across this line
    ASSERT(g_data == 42);
}
```
- How can `g_data` be computed but not present in the cache? The book says: "This can happen, for example, if Core 1 already has `g_ready`'s cache line in its local L1 cache, but does not have `g_data`'s line yet."
- If `g_data` is not in the cache, then why does the previous sentence end with "yet"? Would the CPU load the cache line containing `g_data` after the value has been computed?
- If we read this sentence: "This means that some operations aren’t actually performed immediately when messages are received over the ICB. Instead, they are deferred to save time." Then what operation is deferred in our example with the producer and consumer threads?
So basically, I don't understand how, under the MESI protocol, some operations can become visible to other cores in the wrong order despite being executed in the right order by a specific core.
PS: This example is from the book "Game Engine Architecture, Third Edition" by Jason Gregory; it's on page 309. Here is the book