I'm interested in the sequentially consistent load operation on x86.
As far as I can see from the assembly listing generated by the compiler, it is implemented as a plain load on x86. However, as far as I know, plain loads are guaranteed to have acquire semantics, while plain stores are guaranteed to have release semantics.
A sequentially consistent store is implemented as a locked xchg, while a load is implemented as a plain load. That seems strange to me; could you please explain this in detail?
Added:
I just found on the internet that a sequentially consistent atomic load can be done as a simple mov as long as the store is done with a locked xchg, but there was no proof and no links to documentation.
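For reference, here is a minimal C++ sketch of the two operations being asked about, assuming C++11 `std::atomic`. The names `shared_value`, `publish` and `observe` are made up for illustration. The comments about generated instructions describe what mainstream x86 compilers commonly emit; the language itself does not guarantee any particular instruction.

```cpp
#include <atomic>

// Hypothetical shared variable, used only for illustration.
std::atomic<int> shared_value{0};

// Sequentially consistent store. On x86, compilers commonly emit XCHG
// (implicitly locked with a memory operand) or MOV followed by MFENCE.
void publish(int v) {
    shared_value.store(v, std::memory_order_seq_cst);
}

// Sequentially consistent load. On x86, compilers commonly emit a plain MOV.
int observe() {
    return shared_value.load(std::memory_order_seq_cst);
}
```

Compiling this with optimizations and inspecting the assembly output is one way to reproduce the listing described above.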
Register-to-memory transfers, and vice versa, are not necessarily atomic in a multiprocessor environment.
READING
The first instruction zeroes the EAX register. The second instruction exchanges the contents of EAX and [address] and stores the sum of both back in [address] (this describes a locked exchange-and-add, i.e. LOCK XADD). Since EAX was zero beforehand, the value in memory is unchanged, and EAX ends up holding the value that was read.
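The same idea, reading through an atomic read-modify-write that adds zero, can be sketched in C++ as follows. This assumes C++11 `std::atomic`; the function name `read_via_rmw` is hypothetical. On x86 such an operation can be compiled to LOCK XADD, though the compiler is free to choose a different instruction sequence.

```cpp
#include <atomic>

// Reads `cell` with an atomic read-modify-write that adds zero.
// The stored value is left unchanged and the previous value is returned.
int read_via_rmw(std::atomic<int>& cell) {
    return cell.fetch_add(0, std::memory_order_seq_cst);
}
```

For example, calling `read_via_rmw` on an atomic holding 7 returns 7 and leaves 7 in place.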
WRITING
The EAX register is first loaded with the value to be stored at the specified address.
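A rough C++ counterpart of writing through an atomic exchange, again assuming C++11 `std::atomic` and a hypothetical function name `write_via_exchange`. On x86 an exchange like this is typically compiled to XCHG with a memory operand, which is implicitly locked, but that is a property of common compilers and not a guarantee.

```cpp
#include <atomic>

// Stores `value` into `cell` with an atomic exchange and returns the
// value that was there before.
int write_via_exchange(std::atomic<int>& cell, int value) {
    return cell.exchange(value, std::memory_order_seq_cst);
}
```

If only the store matters, the returned previous value can simply be ignored.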
EDIT: LOCK ADD EAX, [address] will cause an "Invalid Opcode Exception" because the destination operand is not a memory address.
Edit 2: Summary of information from the comments.
While
There are restrictions to this