I was writing my own AtomicLong class and found that my implementation is much slower than the one provided by the Unsafe class. Why is that?
Here is the code I have:
public interface Counter {
    void increment();
    long get();
}
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class PrimitiveUnsafeSupportCounter implements Counter {
    private volatile long count = 0;
    private Unsafe unsafe;
    private long offset;

    public PrimitiveUnsafeSupportCounter() throws IllegalAccessException, NoSuchFieldException {
        // Obtain the Unsafe singleton reflectively; its constructor is not public.
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        this.unsafe = (Unsafe) f.get(null);
        this.offset = this.unsafe.objectFieldOffset(PrimitiveUnsafeSupportCounter.class.getDeclaredField("count"));
    }

    @Override
    public void increment() {
        this.unsafe.getAndAddLong(this, this.offset, 1);
    }

    @Override
    public long get() {
        return this.count;
    }
}
public class CounterThread implements Runnable {
    private Counter counter;

    public CounterThread(Counter counter) {
        this.counter = counter;
    }

    @Override
    public void run() {
        for (int i = 0; i < 100000; i++) {
            this.counter.increment();
        }
    }
}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class Test {
    public static void test(Counter counter) throws NoSuchFieldException, IllegalAccessException, InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(1000);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1000; i++) {
            executor.submit(new CounterThread(counter));
        }
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        long stop = System.currentTimeMillis();
        System.out.println(counter.get());
        System.out.println(stop - start);
    }
}
public class Main {
    public static void main(String[] args) throws NoSuchFieldException, IllegalAccessException, InterruptedException {
        Counter primitiveUnsafeSupportCounter = new PrimitiveUnsafeSupportCounter();
        Test.test(primitiveUnsafeSupportCounter);
    }
}
The code above takes about 3000 ms to finish.
However, it takes about 7000 ms if I replace this.unsafe.getAndAddLong(this, this.offset, 1); with the following loop:
long before;
do {
    before = this.unsafe.getLongVolatile(this, this.offset);
} while (!this.unsafe.compareAndSwapLong(this, this.offset, before, before + 1));
I went through the source code of getAndAddLong and found that it does nearly the same thing as the loop above, so what am I missing?
That's because getAndAddLong is a JVM intrinsic, while the hand-written loop compiles to much less efficient code for this purpose. On x86, read-modify-write operations like this can be made atomic with the lock prefix; see Intel Manual 8.1.2.2, Software Controlled Bus Locking. In particular, you can have something like lock add op1 op2. In your example, you test the result of cmpxchg and do a jump, which is obviously slower. Also, as far as I remember, on x86 a volatile write requires some sort of mfence or lock-prefixed instruction to ensure memory ordering.
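To make the comparison easier to reproduce, here is a minimal sketch that substitutes the public AtomicLong API for sun.misc.Unsafe (an assumption on my part, chosen so it compiles on current JDKs without reflection). The names CasVsGetAndAdd, GetAndAddCounter, CasLoopCounter and run are hypothetical. One counter uses getAndAdd, which HotSpot can typically turn into a single lock-prefixed instruction on x86; the other uses the hand-written compareAndSet loop, which can also fail and retry under contention. Treat the timings as rough, since this is not a rigorous benchmark.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class CasVsGetAndAdd {
    public interface Counter {
        void increment();
        long get();
    }

    // Single atomic add; eligible for the JVM's intrinsic on x86.
    public static final class GetAndAddCounter implements Counter {
        private final AtomicLong count = new AtomicLong();
        @Override public void increment() { count.getAndAdd(1); }
        @Override public long get() { return count.get(); }
    }

    // Hand-written read / compare-and-set loop; retries whenever another
    // thread changed the value between the read and the CAS.
    public static final class CasLoopCounter implements Counter {
        private final AtomicLong count = new AtomicLong();
        @Override public void increment() {
            long before;
            do {
                before = count.get();
            } while (!count.compareAndSet(before, before + 1));
        }
        @Override public long get() { return count.get(); }
    }

    // Runs `threads` workers, each incrementing `perThread` times,
    // and returns the elapsed wall-clock time in milliseconds.
    public static long run(Counter counter, int threads, int perThread) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            executor.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
        }
        executor.shutdown();
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        int threads = 8;          // arbitrary values for illustration
        int perThread = 1_000_000;
        for (Counter c : new Counter[] { new GetAndAddCounter(), new CasLoopCounter() }) {
            long ms = run(c, threads, perThread);
            System.out.println(c.getClass().getSimpleName() + ": count=" + c.get() + ", " + ms + " ms");
        }
    }
}
```

Both counters should end at the same total; only the elapsed time is expected to differ, and by how much depends on the hardware, the JDK version, and the amount of contention.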