Fastest way to Compare And Swap (CAS) on Intel x86 CPU?

I need to swap two 8-byte regions of memory, most likely using CMPXCHG8B. However, I want to do this as fast as possible, because other threads will be waiting until the operation finishes. I have a few questions about this:

- Is the LOCK prefix required only if I am using multiple processors or multiple cores? I really want to avoid it if possible.

- Would I be able to "lock" based on the MESI protocol without using the LOCK prefix, if the memory the waiting threads wish to access is in a different cache line?

I am running on a single processor (multiple cores), but answers explaining how a multiprocessor system differs are most welcome.

There is 1 best solution below

If you have multiple processors or multiple cores, and you want safe, synchronized access to shared variables, you can't avoid the LOCK prefix. (Using XCHG doesn't avoid the lock; XCHG with a memory operand asserts the lock implicitly, so it is just hidden in the instruction.)
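To make the point about the unavoidable lock concrete, here is a minimal C11 sketch of a compare-and-swap on one 8-byte value. The names `shared_slot` and `cas_slot` are made up for illustration, and the note about the generated instruction describes what x86 compilers typically do, not a guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Fail the build if 8-byte atomics would fall back to a hidden mutex. */
static_assert(ATOMIC_LLONG_LOCK_FREE == 2, "8-byte atomics must be lock-free");

/* Hypothetical shared 8-byte slot; the name is illustrative only. */
static _Atomic uint64_t shared_slot;

/* Store `desired` only if the slot still holds `expected`.
 * Returns true on success, false if another thread got there first.
 * On x86 this is typically compiled to a LOCK-prefixed CMPXCHG
 * (CMPXCHG8B in 32-bit code), so the lock is still there. */
static bool cas_slot(uint64_t expected, uint64_t desired)
{
    return atomic_compare_exchange_strong(&shared_slot, &expected, desired);
}
```

Calling `cas_slot(old, new)` in a retry loop is the usual pattern. Note that this updates a single 8-byte location; it does not by itself exchange two separate regions.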

Following Jester's hint, I'd be tempted to name your two chunks of memory "Left" and "Right", and use a FLAG to dynamically rename them, e.g.

    GetLeft:  if LSB(FLAG)   ; least significant bit
              Read Left
              else Read Right

and

    GetRight: if LSB(FLAG)
              Read Right
              else Read Left

Then the following code will "interchange" them about as quickly as can be done:

    SwapLeftAndRight: 
               LOCK INC FLAG   ; flips LSB of flag

This eliminates any need for a mutex as long as the threads only read the regions. (If your threads are trying to update these regions, you'll need the mutex no matter what you do.)
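The pseudocode above could be sketched in C11 roughly as follows. The variable and function names are hypothetical, and starting the flag at 1 is my assumption so that "Left" initially means the left region. As noted, this only covers threads that read the regions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static_assert(ATOMIC_INT_LOCK_FREE == 2, "flag updates must be lock-free");

/* The two 8-byte regions; read-only for the threads in this sketch. */
static uint64_t left_region;
static uint64_t right_region;

/* Only the least significant bit matters.
 * Assumption: start at 1 so the names match the regions initially. */
static atomic_uint flag = 1;

/* GetLeft: LSB set -> read Left, otherwise read Right. */
static uint64_t get_left(void)
{
    return (atomic_load(&flag) & 1u) ? left_region : right_region;
}

/* GetRight: LSB set -> read Right, otherwise read Left. */
static uint64_t get_right(void)
{
    return (atomic_load(&flag) & 1u) ? right_region : left_region;
}

/* SwapLeftAndRight: one atomic increment flips the LSB.
 * On x86 this is typically a LOCK-prefixed add or xadd. */
static void swap_left_and_right(void)
{
    atomic_fetch_add(&flag, 1u);
}
```

No data is copied by the swap; only the meaning of the names changes, which is why it is about as cheap as a swap can get.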

If access speed is actually critical, then his hint about swapping two consecutive pointers to LEFT and RIGHT is pretty good.
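One possible reading of that hint, sketched in C11: keep the two references next to each other in a single 8-byte word and exchange them with one CAS. I am assuming 32-bit handles (for example, 32-bit pointers or table indices) so the pair fits in 8 bytes; all names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static_assert(ATOMIC_LLONG_LOCK_FREE == 2, "8-byte atomics must be lock-free");

/* Low 32 bits: handle for "left". High 32 bits: handle for "right". */
static _Atomic uint64_t handle_pair;

static uint64_t pack_handles(uint32_t left, uint32_t right)
{
    return ((uint64_t)right << 32) | left;
}

static uint32_t left_handle(void)
{
    return (uint32_t)atomic_load(&handle_pair);
}

static uint32_t right_handle(void)
{
    return (uint32_t)(atomic_load(&handle_pair) >> 32);
}

/* Exchange the two halves atomically with a CAS retry loop.
 * On failure, `old` is refreshed with the current value, so the
 * swapped value is recomputed before the next attempt. */
static void swap_handles(void)
{
    uint64_t old = atomic_load(&handle_pair);
    uint64_t swapped;
    do {
        swapped = (old >> 32) | (old << 32);
    } while (!atomic_compare_exchange_weak(&handle_pair, &old, swapped));
}
```

Readers then pay one atomic load plus one dereference, with no test of a flag on every access. With 64-bit pointers the pair would be 16 bytes and need a 16-byte CAS instead.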