I was reading about [[carries_dependency]] in this SO post.
What I could not understand are the sentences below, from the accepted answer:
"In particular, if a value read with memory_order_consume is passed in to a function, then without [[carries_dependency]], then the compiler may have to issue a memory fence instruction to guarantee that the appropriate memory ordering semantics are upheld. If the parameter is annotated with [[carries_dependency]] then the compiler can assume that the function body will correctly carry the dependency, and this fence may no longer be necessary.
Similarly, if a function returns a value loaded with memory_order_consume, or derived from such a value, then without [[carries_dependency]] the compiler may be required to insert a fence instruction to guarantee that the appropriate memory ordering semantics are upheld. With the [[carries_dependency]] annotation, this fence may no longer be necessary, as the caller is now responsible for maintaining the dependency tree."
Let's take it step by step:
"if a value read with memory_order_consume is passed in to a function, then without [[carries_dependency]], then the compiler may have to issue a memory fence instruction to guarantee that the appropriate memory ordering semantics are upheld."
So, in the release-consume memory model, when an atomic variable is passed as a parameter to a function, the compiler will insert a hardware fence instruction so that the function always gets the latest, up-to-date value of the atomic variable.
Next -
"If the parameter is annotated with [[carries_dependency]] then the compiler can assume that the function body will correctly carry the dependency, and this fence may no longer be necessary."
This is confusing me: the atomic variable's value has already been consumed, so what dependency is the function carrying?
Similarly -
"if a function returns a value loaded with memory_order_consume, or derived from such a value, then without [[carries_dependency]] the compiler may be required to insert a fence instruction to guarantee that the appropriate memory ordering semantics are upheld. With the [[carries_dependency]] annotation, this fence may no longer be necessary, as the caller is now responsible for maintaining the dependency tree."
From the example, it's not clear what point it is trying to make about carrying the dependency.
Just FYI, `memory_order_consume` (and `[[carries_dependency]]`) is essentially deprecated because it's too hard for compilers to efficiently and correctly implement the rules the way C++11 designed them. (And/or because `[[carries_dependency]]` and/or `kill_dependency` would end up being needed all over the place.) See P0371R1: Temporarily discourage memory_order_consume.

Current compilers simply treat `mo_consume` as `mo_acquire` (and thus on ISAs that need one, put a barrier right after the consume load). If you want the performance of data dependency ordering without barriers, you have to trick the compiler by using `mo_relaxed` and code carefully to avoid things that would make it likely for the compiler to create asm without an actual dependency. (e.g. Linux RCU). See C++11: the difference between memory_order_relaxed and memory_order_consume for more details and links about that, and the asm feature that `mo_consume` was designed to expose.

Also Memory order consume usage in C11.
Understanding the concept of dependency ordering (in asm) is basically essential to understanding how this C++ feature is designed.
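To make "dependency ordering" concrete, here is a minimal sketch of the classic pointer-publishing pattern. The names `Payload`, `published`, `producer`, and `consumer` are made up for illustration. The point is that the address used by `p->value` is computed from the result of the consume load, which is the kind of data dependency that weakly-ordered hardware can order without a barrier. (As noted above, current compilers just strengthen the consume load to acquire.)

```cpp
#include <atomic>
#include <cassert>

// Hypothetical payload type, for illustration only.
struct Payload { int value; };

// Hypothetical shared pointer used to publish a Payload to other threads.
std::atomic<Payload*> published{nullptr};

// Writer side: fill in the payload, then publish the pointer with release.
void producer(Payload* p) {
    p->value = 42;
    published.store(p, std::memory_order_release);
}

// Reader side: the load of p->value uses an address derived from the
// consume load, so it carries a data dependency on that load.
int consumer() {
    Payload* p = published.load(std::memory_order_consume);
    if (p == nullptr) {
        return -1;  // nothing published yet
    }
    return p->value;
}
```

A reader thread that sees the non-null pointer is then meant to see the `value` written before the release store, because the second load depends on the first.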
You don't "pass an atomic variable" to a function in the first place; what would that even mean? If you were passing a pointer or reference to an atomic object, the function would be doing its own load from it, and the source code for that function would use `memory_order_consume` or not.

The relevant thing is passing a value loaded from an atomic variable with `mo_consume`. Like this:
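A minimal sketch of that call pattern, assuming a global `shared_var` and a function `func` that takes the loaded value by value. The `table` array and the function bodies are assumptions added so the snippet is complete.

```cpp
#include <atomic>
#include <cassert>

// Assumed shared state. table has static storage, so it starts out all zero.
std::atomic<int> shared_var{0};
std::atomic<int> table[4];

// The attribute on the parameter says that func's body will keep the
// dependency on idx alive, so the caller need not add a fence before the call.
int func([[carries_dependency]] int idx) {
    return table[idx].load(std::memory_order_relaxed);
}

int caller() {
    // The value (not the atomic object itself) is what gets passed along.
    int tmp = shared_var.load(std::memory_order_consume);
    return func(tmp);
}
```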
`func` may use that arg as an index into an array of `atomic<int>` to do an `mo_relaxed` load. For that load to be dependency-ordered after the `shared_var.load` even without a memory barrier, code-gen for `func` has to make sure that load has an asm data dependency on the arg, even if the C++ code does something like `tmp -= tmp;` that compilers would normally just treat the same as `tmp = 0;` (killing the previous value).

But `[[carries_dependency]]` would make the compiler still reference that zeroed value with a data dependency in implementing something like `array[idx+tmp]`.

"Already consumed" is not a valid concept. The whole point of `consume` instead of `acquire` is that later loads are ordered correctly because they have a data dependency on the `mo_consume` load result, letting you avoid barriers. Every later load needs such a dependency if you want it ordered after the original load; there is no sense in which you can say a value is "already consumed".

If you do end up inserting a barrier to promote consume to acquire because of a missing carries_dependency on one function, later functions wouldn't need another barrier because you could say the value was "already acquired". (Although that's not standard terminology. You'd instead say code after the first barrier was ordered after the load.)
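The `tmp -= tmp;` case, and the matching case of a function that returns a consumed value, might look roughly like the sketch below. The names `table` (standing in for the array), `load_token`, and `reader` are hypothetical, and the two-argument shape of `func` is an assumption made only so that `idx + tmp` has something to index with.

```cpp
#include <atomic>
#include <cassert>

// Assumed shared state. table has static storage, so it starts out all zero.
std::atomic<int> shared_var{0};
std::atomic<int> table[8];

// Attribute on the function: the returned value carries a dependency out to
// the caller, so no fence is needed before returning.
[[carries_dependency]] int load_token() {
    return shared_var.load(std::memory_order_consume);
}

// Attribute on the parameter: the body is expected to keep the dependency on tmp.
int func([[carries_dependency]] int tmp, int idx) {
    tmp -= tmp;  // always 0, yet under the C++11 rules it still carries the dependency
    return table[idx + tmp].load(std::memory_order_relaxed);
}

int reader(int idx) {
    return func(load_token(), idx);
}
```

The arithmetic result is the same as indexing with `idx` alone; what differs, at least on paper, is that the load's address is still formally derived from the consume load.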
It might be useful to understand how the Linux kernel handles this, with their hand-rolled atomics and limited set of compilers they support. Search for "dependency" in https://github.com/torvalds/linux/blob/master/Documentation/memory-barriers.txt, and note the difference between a "control dependency" like `if(flag) data.load()` vs. a data dependency like `data[idx].load`.

IIRC, even C++ doesn't guarantee `mo_consume` dependency ordering when the dependency is a conditional like `if(x.load(consume)) tmp=y.load();`.

Note that compilers will sometimes turn a data dependency into a control dependency if there's only 2 possible values, for example. This would break `mo_consume`, and be an optimization that wouldn't be allowed if the value came from a `mo_consume` load or a `[[carries_dependency]]` function arg. This is part of why it's hard to implement; it would require teaching lots of optimization passes about data dependency ordering instead of just expecting users to write code that doesn't do things which will normally optimize away. (Like `tmp -= tmp;`)
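A rough side-by-side of the two kinds of dependency, with hypothetical names (`flag`, `idx_var`, `slots`). The third function is a hand-written guess at the shape an optimizer could produce when the index can only be 0 or 1; it is not the output of any particular compiler.

```cpp
#include <atomic>
#include <cassert>

// Assumed shared state, for illustration only.
std::atomic<int> flag{0};
std::atomic<int> idx_var{0};
std::atomic<int> slots[2];

// Data dependency: the address of the second load is computed from the first load.
int data_dep() {
    int i = idx_var.load(std::memory_order_consume);
    return slots[i].load(std::memory_order_relaxed);
}

// Control dependency: the first load only decides whether the second load runs.
int control_dep() {
    if (flag.load(std::memory_order_consume)) {
        return slots[0].load(std::memory_order_relaxed);
    }
    return -1;
}

// Same result as data_dep when i is 0 or 1, but each load now uses a constant
// address, so only a branch (a control dependency) connects the two loads.
int data_dep_as_branch() {
    int i = idx_var.load(std::memory_order_relaxed);
    if (i == 0) {
        return slots[0].load(std::memory_order_relaxed);
    }
    return slots[1].load(std::memory_order_relaxed);
}
```

Single-threaded, `data_dep` and `data_dep_as_branch` are indistinguishable, which is why an optimizer that knows nothing about dependency ordering would consider the rewrite legal.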