I'm working on a multiple-producer, single-consumer Event for coroutines
(here it is, for context). Simplified:
    class WaitList {
    public:
        void Append() { coro_.store(GetCurrentCoro()); }
        void Remove() { coro_.store({}); }
        void WakeUp() {
            auto p = coro_.exchange({});
            if (p) p->WakeUpIfSleeping();
        }

    private:
        std::atomic<Coro*> coro_{nullptr};
    };

    class Event {
    public:
        void Wait() {
            waiters_.Append();
            if (IsReady()) waiters_.WakeUp();
            // fall asleep
            // ...
            // woken up
            waiters_.Remove();
        }
        void Send() {
            SetReady();
            waiters_.WakeUp();
        }

    private:
        bool IsReady() { return signal_.load(); }
        void SetReady() { signal_.store(true); }
        WaitList waiters_;
        std::atomic<bool> signal_{false};
    };
Please help me set the minimum required std::memory_orders for:
1. Append
2. IsReady
3. SetReady
4. WakeUp (talking about the exchange)
5. Remove
The primary platform is x86_64, but being optimal for other platforms is welcome. If there are multiple local minima, you can list them all (there is a finite number of theoretically possible combinations: at most 6^5, one of the six std::memory_order values for each of the five operations).
It seems to me that 1-4 require std::memory_order_seq_cst, because:
- If Append and IsReady are reordered from the POV of the Send thread, then after the sequence IsReady-SetReady-WakeUp-Append the coroutine will fall asleep forever (right?)
- If SetReady and WakeUp are reordered from the POV of the Wait thread, then after the sequence WakeUp-Append-IsReady-SetReady the coroutine will fall asleep forever (right?)
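To make the two sequences above concrete, here is a minimal single-threaded model of them. It is only a sketch, not the real Event: Op, Order, SleepsForever, RespectsProgramOrder, CountLostWakeUps and CheckQuestionSequences are hypothetical names, each operation is treated as one indivisible step in a single global order, and a wake-up delivered to a registered coroutine is assumed never to be lost.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical model: the four operations, applied in one global order.
enum class Op { Append, IsReady, SetReady, WakeUp };
using Order = std::array<Op, 4>;

// True if the waiter goes to sleep and nobody is left to wake it.
// Assumption: a wake-up delivered to a registered coroutine is never lost.
bool SleepsForever(const Order& order) {
    bool registered = false;  // models coro_ != nullptr
    bool signal = false;      // models signal_
    bool saw_ready = false;   // what the waiter's IsReady() returned
    bool woken = false;       // Send's WakeUp found the waiter
    for (Op op : order) {
        switch (op) {
            case Op::Append:   registered = true; break;
            case Op::IsReady:  saw_ready = signal; break;
            case Op::SetReady: signal = true; break;
            case Op::WakeUp:
                if (registered) { registered = false; woken = true; }
                break;
        }
    }
    // If IsReady() returned true, Wait() wakes itself and does not hang.
    return !saw_ready && !woken;
}

// True if Append precedes IsReady and SetReady precedes WakeUp.
bool RespectsProgramOrder(const Order& order) {
    auto pos = [&](Op op) {
        return std::find(order.begin(), order.end(), op) - order.begin();
    };
    return pos(Op::Append) < pos(Op::IsReady) &&
           pos(Op::SetReady) < pos(Op::WakeUp);
}

// Counts the orders that lose the wake-up, optionally restricted to those
// that keep each thread's program order.
int CountLostWakeUps(bool program_order_only) {
    // Starts sorted so that next_permutation visits all 24 orders.
    Order order = {Op::Append, Op::IsReady, Op::SetReady, Op::WakeUp};
    int lost = 0;
    do {
        if (program_order_only && !RespectsProgramOrder(order)) continue;
        if (SleepsForever(order)) ++lost;
    } while (std::next_permutation(order.begin(), order.end()));
    return lost;
}

// The two sequences from the question, checked against the model.
void CheckQuestionSequences() {
    assert(SleepsForever(Order{Op::IsReady, Op::SetReady, Op::WakeUp, Op::Append}));
    assert(SleepsForever(Order{Op::WakeUp, Op::Append, Op::IsReady, Op::SetReady}));
    assert(CountLostWakeUps(/*program_order_only=*/true) == 0);
}
```

In this model the wake-up is lost only when IsReady precedes SetReady and WakeUp precedes Append, which cannot happen if both threads' program orders are part of one total order. The model says nothing about which memory orders are needed to guarantee such a total order; that is still the question.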
Given that we are dealing with two atomics (coro_ and signal_), I'd expect acquire-release not to be enough.
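For reference, this is roughly where the five memory orders would go, with my conjecture (seq_cst for Append, IsReady, SetReady and the exchange in WakeUp) written out explicitly. It is a sketch of the starting point, not a verified minimum: Coro and GetCurrentCoro here are hypothetical stand-ins so that the snippet compiles on its own, the actual sleeping is left out, DemoSendThenWait is a made-up helper, and Remove simply keeps the default seq_cst because I have no argument for it either way.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the real coroutine type.
struct Coro {
    bool woken = false;
    void WakeUpIfSleeping() { woken = true; }
};

// Hypothetical stand-in: one "current coroutine" per thread.
Coro* GetCurrentCoro() {
    thread_local Coro current;
    return &current;
}

class WaitList {
public:
    // Append: seq_cst under the conjecture.
    void Append() { coro_.store(GetCurrentCoro(), std::memory_order_seq_cst); }
    // Remove: not covered by the conjecture; the default (seq_cst) is kept.
    void Remove() { coro_.store(nullptr, std::memory_order_seq_cst); }
    void WakeUp() {
        // WakeUp's exchange: seq_cst under the conjecture.
        Coro* p = coro_.exchange(nullptr, std::memory_order_seq_cst);
        if (p) p->WakeUpIfSleeping();
    }

private:
    std::atomic<Coro*> coro_{nullptr};
};

class Event {
public:
    void Wait() {
        waiters_.Append();
        if (IsReady()) waiters_.WakeUp();
        // The real code sleeps here; this sketch has no scheduler and returns.
        waiters_.Remove();
    }
    void Send() {
        SetReady();
        waiters_.WakeUp();
    }

private:
    // IsReady: seq_cst under the conjecture.
    bool IsReady() { return signal_.load(std::memory_order_seq_cst); }
    // SetReady: seq_cst under the conjecture.
    void SetReady() { signal_.store(true, std::memory_order_seq_cst); }
    WaitList waiters_;
    std::atomic<bool> signal_{false};
};

// Single-threaded usage: Send() before Wait() makes Wait() wake itself.
void DemoSendThenWait() {
    Event e;
    e.Send();
    e.Wait();
    assert(GetCurrentCoro()->woken);
}
```

Any proposed relaxation would replace one of the explicit std::memory_order_seq_cst arguments above.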
I know it is possible to squash coro_ and signal_ into a single std::uintptr_t or something, but let's leave them as they are for the purpose of this exercise. It's not as simple in reality: there are multiple synchronization primitives all based around the same ideas, and the current layout generalizes to them well.