In a multi-producer, multi-consumer situation, if producers are writing into int a and consumers are reading from int a, do I need memory barriers around int a?
We all learned that shared resources should always be protected, and that the standard does not guarantee proper behavior otherwise.
However, on cache-coherent architectures visibility is ensured automatically, and atomicity of the MOV operation on 8-, 16-, 32- and 64-bit variables is guaranteed.
Therefore, why protect int a at all?
At least in C++11 (or later), you don't need to (explicitly) protect your variable with a mutex or memory barriers.
You can use std::atomic to create an atomic variable. Changes to that variable are guaranteed to propagate across threads.
Of course, there's a little more to it than that. For example, there's no guarantee that std::cout works atomically, so you probably will have to protect that (if you try to write from more than one thread, anyway).
It's then up to the compiler/standard library to figure out the best way to handle the atomicity requirements. On a typical architecture that ensures cache coherence, it may mean nothing more than "don't allocate this variable in a register". It could impose memory barriers, but is only likely to do so on a system that really requires them.
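As a minimal sketch of the above (not the only way to do it), the shared int a can be declared as std::atomic<int>, with a mutex guarding only std::cout. The helper name run_demo, its parameters, and cout_mutex are hypothetical names introduced for illustration.

```cpp
#include <atomic>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

std::atomic<int> a{0};   // the shared variable from the question
std::mutex cout_mutex;   // std::cout output is not atomic across threads, so guard it

// Hypothetical helper: starts the given number of producers and consumers,
// waits for all of them, and returns the final value of a.
int run_demo(int producers, int consumers, int writes_per_producer) {
    a.store(0);
    std::vector<std::thread> threads;

    for (int p = 0; p < producers; ++p) {
        threads.emplace_back([writes_per_producer] {
            for (int i = 0; i < writes_per_producer; ++i) {
                a.fetch_add(1);  // atomic read-modify-write, default seq_cst ordering
            }
        });
    }

    for (int c = 0; c < consumers; ++c) {
        threads.emplace_back([c] {
            int seen = a.load();  // atomic read; no mutex or explicit barrier needed
            std::lock_guard<std::mutex> lock(cout_mutex);
            std::cout << "consumer " << c << " saw " << seen << '\n';
        });
    }

    for (auto& t : threads) {
        t.join();
    }
    return a.load();
}
```

Calling run_demo(4, 2, 1000) should return 4000 every time, because each increment is a single atomic operation. The values the consumers print depend on timing and can be anything from 0 to 4000.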
On real-world C++ implementations where volatile worked as a pre-C++11 way to roll your own atomics (i.e. all of them), no barriers are needed for inter-thread visibility, only for ordering with respect to operations on other variables. Most ISAs do need special instructions or barriers for the default memory_order_seq_cst.
On the other hand, explicitly specifying memory ordering (especially acquire and release) for an atomic variable may allow you to optimize the code a bit. By default, an atomic uses sequentially consistent ordering, which basically acts as if there were barriers before and after each access, but in a lot of cases you only really need one or the other, not both. In those cases, explicitly specifying the memory ordering lets you relax it to the minimum you actually need, giving the compiler more room to optimize.
(Not all ISAs actually need separate barrier instructions even for seq_cst; notably, AArch64 just has a special interaction between stlr and ldar to stop seq_cst stores from reordering with later seq_cst loads, on top of acquire and release ordering. So it's as weak as the C++ memory model allows, while still complying with it. But weaker orders, like memory_order_acquire or relaxed, can avoid even that blocking of reordering when it's not needed.)
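To illustrate the acquire/release case, here is a rough sketch in which a producer publishes a plain int through an atomic flag. The names payload, ready, and publish_and_consume are hypothetical and chosen only for this example.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                  // plain, non-atomic data
std::atomic<bool> ready{false};   // flag used to publish payload

// Hypothetical helper: one thread publishes value, another waits for it
// and returns what it read from payload.
int publish_and_consume(int value) {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int received = -1;

    std::thread producer([value] {
        payload = value;                               // write the data first
        ready.store(true, std::memory_order_release);  // then publish it
    });

    std::thread consumer([&received] {
        // The acquire load pairs with the release store above: once it
        // observes true, the earlier write to payload is visible too.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        received = payload;
    });

    producer.join();
    consumer.join();
    return received;
}
```

Here the store needs only release ordering and the load only acquire ordering, so neither has to pay for full sequential consistency. Using memory_order_relaxed for both would keep the flag itself atomic but would no longer guarantee that the consumer sees the write to payload.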