Understanding of C++'s std::atomic<T> and compare-and-swap


My understanding is that compare-and-swap is supported directly by hardware, e.g., the CMPXCHG instruction on x86. I have two questions:

  • Does C++'s std::atomic not "implement" atomicity itself, but instead rely on the CPU's atomic instructions?
  • What if an architecture has no compare-and-swap instruction? Must a standard-compliant compiler on that platform then find some other (probably much more expensive) way to implement std::atomic without compare-and-swap?
user17732522 On

Specializations of std::atomic are not generally required to be lock-free.

On a platform that doesn't support the required operations for a type X atomically, the C++ implementation can still implement std::atomic<X> with the help of a mutex. While the lock on the mutex is held, the comparison and the swap can simply be done as several ordinary instructions that need no atomicity or ordering guarantees of their own.
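The mutex-based fallback described above could look roughly like the sketch below. LockedAtomic is a hypothetical name used only for illustration; it is not a standard type, and real standard libraries differ in detail. One simplification to be aware of: this sketch compares with operator==, whereas std::atomic compares object representations.

```cpp
#include <mutex>

// Hypothetical illustration only: a lock-based stand-in for std::atomic<T>.
template <typename T>
class LockedAtomic {
public:
    explicit LockedAtomic(T initial) : value_(initial) {}

    T load() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return value_;
    }

    void store(T desired) {
        std::lock_guard<std::mutex> guard(mutex_);
        value_ = desired;
    }

    // Follows the contract of compare_exchange_strong: on success the value
    // is replaced and true is returned; on failure `expected` is updated to
    // the current value and false is returned.
    bool compare_exchange_strong(T& expected, T desired) {
        std::lock_guard<std::mutex> guard(mutex_);
        // Plain, non-atomic compare and assignment: the mutex makes the
        // whole sequence appear atomic to other threads.
        if (value_ == expected) {  // assumption: T supports operator==
            value_ = desired;
            return true;
        }
        expected = value_;
        return false;
    }

private:
    mutable std::mutex mutex_;  // mutable so load() can stay const
    T value_;
};
```

No compare-and-swap instruction is needed here; the only requirement on the platform is whatever the mutex itself is built from.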

To test whether a specialization of std::atomic is lock-free, use the compile-time constant std::atomic<X>::is_always_lock_free (available since C++17) or the weaker runtime check std::atomic<X>::is_lock_free().
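A minimal sketch of both checks, assuming a C++17 compiler. The helper names always_lock_free and lock_free_at_runtime and the Big struct are made up for this example; the results depend on the platform, so no particular output is claimed.

```cpp
#include <atomic>

// Hypothetical 64-byte payload; it is trivially copyable, so std::atomic<Big>
// is a valid specialization.
struct Big {
    char bytes[64];
};

// Compile-time check (C++17): true only if every object of std::atomic<T>
// is guaranteed to be lock-free on this implementation.
template <typename T>
constexpr bool always_lock_free() {
    return std::atomic<T>::is_always_lock_free;
}

// Runtime check: asks one concrete object whether its operations are
// lock-free. This may be true even where the compile-time constant is false.
template <typename T>
bool lock_free_at_runtime() {
    std::atomic<T> probe{T{}};
    return probe.is_lock_free();
}
```

Calling always_lock_free<int>() and always_lock_free<Big>() shows the difference in practice: on mainstream desktop platforms the first is likely to be true and the second false, but neither result is guaranteed by the standard.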

The only type that is required to provide lock-free atomic operations on a conforming C++ implementation is std::atomic_flag. It has only two states and provides fewer operations than std::atomic, so it can be fully implemented with an atomic exchange of a byte, which the platform must provide (plus a pure load in C++20 and later).

A std::atomic_flag is sufficient to implement a lock, and therefore sufficient to implement every std::atomic specialization, just not in a lock-free way.
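To illustrate how a lock can be built from std::atomic_flag alone, here is a minimal spinlock sketch. SpinLock and guarded_increment are hypothetical names chosen for this example; a production lock would normally block or back off rather than spin.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical spinlock built only on std::atomic_flag.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its old value;
        // an old value of true means another thread holds the lock.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // let other threads run while waiting
        }
    }

    bool try_lock() {
        return !flag_.test_and_set(std::memory_order_acquire);
    }

    void unlock() {
        flag_.clear(std::memory_order_release);
    }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;  // starts in the clear state
};

// SpinLock has lock()/unlock(), so it works with std::lock_guard.
void guarded_increment(SpinLock& lock, long& counter) {
    std::lock_guard<SpinLock> guard(lock);
    ++counter;  // plain increment, protected by the spinlock
}
```

Such a lock could take the place of a mutex inside a lock-based std::atomic implementation, which is why std::atomic_flag is enough in principle.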

The above requirements can even be satisfied with the help of the OS scheduler alone, by letting only one C++ thread run at a time, effectively using a single hardware thread. For truly concurrent multi-threading, however, the hardware needs to provide the above-mentioned mechanisms. Depending on which atomic operations the CPU/instruction set supports, and for which operand sizes, each std::atomic specialization will be implemented either lock-free using these operations or with a locking mechanism.
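From the user's side, the code looks the same in either case. As a sketch, a typical compare-and-swap retry loop is shown below; fetch_multiply is a hypothetical helper, not part of the standard library. Where the hardware has a suitable instruction (such as CMPXCHG on x86), compare_exchange_weak can map to it; elsewhere the implementation may fall back to a lock without any change to this code.

```cpp
#include <atomic>

// Hypothetical helper: atomically multiplies `value` by `factor` and returns
// the value that was stored before the multiplication.
int fetch_multiply(std::atomic<int>& value, int factor) {
    int observed = value.load(std::memory_order_relaxed);
    // If `value` still equals `observed`, store the product and stop.
    // Otherwise compare_exchange_weak writes the current value into
    // `observed` and the loop retries with the fresh value.
    while (!value.compare_exchange_weak(observed, observed * factor)) {
        // nothing to do: `observed` was refreshed by the failed exchange
    }
    return observed;
}
```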