I am trying to benchmark the effects false sharing has on the performance of a program.
In this example: https://github.com/lexburner/JMH-samples/blob/master/src/main/java/org/openjdk/jmh/samples/JMHSample_22_FalseSharing.java , cache line padding results in an order-of-magnitude performance improvement.
However, when I use Project Panama's Foreign Memory Access API, the cache line padding actually makes the performance slightly worse. Do MemorySegments implicitly use padding? What else could be causing this behaviour?
I have already tried running the benchmarks on different hardware and turning off hyper-threading, with the same outcome.
Benchmark details:
import java.lang.invoke.VarHandle;
import java.util.concurrent.TimeUnit;

import jdk.incubator.foreign.GroupLayout;
import jdk.incubator.foreign.MemoryLayout;
import jdk.incubator.foreign.MemoryLayouts;
import jdk.incubator.foreign.MemorySegment;
import jdk.incubator.foreign.ResourceScope;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(5)
public class JMH_FalseSharing {

    @State(Scope.Group)
    public static class StateBaseline {

        @TearDown(Level.Trial)
        public void tearDown() {
            rs.close();
        }

        ResourceScope rs = ResourceScope.newSharedScope();

        static final VarHandle VH;

        final MemorySegment readSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);
        final MemorySegment writeSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);

        static {
            VH = MemoryLayouts.JAVA_LONG.varHandle(long.class);
        }
    }

    @State(Scope.Group)
    public static class StatePadded {

        @TearDown(Level.Trial)
        public void tearDown() {
            rs.close();
        }

        ResourceScope rs = ResourceScope.newSharedScope();

        static final VarHandle VH;

        private static final GroupLayout gl = MemoryLayout.structLayout(
                MemoryLayout.paddingLayout(448L),
                MemoryLayouts.JAVA_LONG.withName("val"),
                MemoryLayout.paddingLayout(448L)
        );

        final MemorySegment readSegment = MemorySegment.allocateNative(gl, rs);
        final MemorySegment writeSegment = MemorySegment.allocateNative(gl, rs);

        static {
            VH = gl.varHandle(long.class, MemoryLayout.PathElement.groupElement("val"));
        }
    }

    @Group("baseline")
    @Benchmark
    public void baselineWrite(StateBaseline baselineState) {
        StateBaseline.VH.setRelease(baselineState.writeSegment,
                (long) StateBaseline.VH.getAcquire(baselineState.writeSegment) + 1);
    }

    @Group("baseline")
    @Benchmark
    public void baselineRead(Blackhole blackhole, StateBaseline baselineState) {
        blackhole.consume((long) StateBaseline.VH.getAcquire(baselineState.readSegment));
    }

    @Group("padded")
    @Benchmark
    public void paddedWrite(StatePadded paddedState) {
        StatePadded.VH.setRelease(paddedState.writeSegment,
                (long) StatePadded.VH.getAcquire(paddedState.writeSegment) + 1);
    }

    @Group("padded")
    @Benchmark
    public void paddedRead(Blackhole blackhole, StatePadded paddedState) {
        blackhole.consume((long) StatePadded.VH.getAcquire(paddedState.readSegment));
    }
}
Belated answer, now that the FFM API is quite different.
Take a look at the following JShell session.
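In essence it allocates two segments the way StateBaseline does and compares their raw addresses. Here is a minimal sketch of that check, using the jdk.incubator.foreign API from the question (paste it into jshell --add-modules jdk.incubator.foreign):

import jdk.incubator.foreign.*;

// Allocate two JAVA_LONG segments exactly like StateBaseline and print how far
// apart they land. A distance of 64 bytes or more means the two longs can never
// end up on the same cache line on typical hardware.
ResourceScope rs = ResourceScope.newSharedScope();
MemorySegment readSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);
MemorySegment writeSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);
System.out.println("distance in bytes: "
        + Math.abs(readSegment.address().toRawLongValue()
                 - writeSegment.address().toRawLongValue()));
rs.close();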
The two MemorySegments I allocated, like how you do in StateBaseline, are far away from each other. So there's no false sharing.

Contrast that with the readOnly and writeOnly fields in JMHSample_22_FalseSharing.StateBaseline. They are next to each other in the same object, so the various tricks to separate them improve the performance by preventing false sharing.

Now as to why the two MemorySegments are so far away, this answer might be relevant.

However, they can be close to each other. In fact, when I do try (ResourceScope rs = ResourceScope.newSharedScope()) { ... }, or, in Java 21, try (Arena arena = Arena.ofConfined()) { ... }, they are usually close, or sometimes next to each other.
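The same check against the current API (a sketch; in Java 21 java.lang.foreign is still a preview feature, so jshell needs --enable-preview):

import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Two longs allocated from the same confined arena. Printing the address
// difference shows how close together they land, often within one cache line.
try (Arena arena = Arena.ofConfined()) {
    MemorySegment readSegment = arena.allocate(ValueLayout.JAVA_LONG);
    MemorySegment writeSegment = arena.allocate(ValueLayout.JAVA_LONG);
    System.out.println("distance in bytes: "
            + Math.abs(readSegment.address() - writeSegment.address()));
}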
To test false sharing with MemorySegment, you can try the following (in Java 21):
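For example, something along these lines. It is only a sketch of one possible setup (the class, state, and offset choices are mine): both counters live in a single 64-byte-aligned shared segment, 8 bytes apart in the "shared" case and 64 bytes (a typical cache line) apart in the "padded" case, and a MethodHandles.memorySegmentViewVarHandle provides the acquire/release accesses. FFM is a preview feature in Java 21 (--enable-preview) and final from Java 22 onwards.

import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(5)
public class JMH_FalseSharingFFM {

    // View var handle with (MemorySegment, long offset) coordinates.
    static final VarHandle VH = MethodHandles.memorySegmentViewVarHandle(ValueLayout.JAVA_LONG);

    @State(Scope.Group)
    public static class StateShared {
        // One 64-byte-aligned segment; the counters sit at offsets 0 and 8,
        // so both longs are guaranteed to share a cache line.
        final Arena arena = Arena.ofShared();
        final MemorySegment segment = arena.allocate(64, 64);
        final long readOffset = 0, writeOffset = 8;

        @TearDown(Level.Trial)
        public void tearDown() {
            arena.close();
        }
    }

    @State(Scope.Group)
    public static class StatePadded {
        // Same idea, but the counters are 64 bytes apart, i.e. on different cache lines.
        final Arena arena = Arena.ofShared();
        final MemorySegment segment = arena.allocate(128, 64);
        final long readOffset = 0, writeOffset = 64;

        @TearDown(Level.Trial)
        public void tearDown() {
            arena.close();
        }
    }

    @Group("shared")
    @Benchmark
    public void sharedWrite(StateShared state) {
        VH.setRelease(state.segment, state.writeOffset,
                (long) VH.getAcquire(state.segment, state.writeOffset) + 1);
    }

    @Group("shared")
    @Benchmark
    public void sharedRead(Blackhole blackhole, StateShared state) {
        blackhole.consume((long) VH.getAcquire(state.segment, state.readOffset));
    }

    @Group("padded")
    @Benchmark
    public void paddedWrite(StatePadded state) {
        VH.setRelease(state.segment, state.writeOffset,
                (long) VH.getAcquire(state.segment, state.writeOffset) + 1);
    }

    @Group("padded")
    @Benchmark
    public void paddedRead(Blackhole blackhole, StatePadded state) {
        blackhole.consume((long) VH.getAcquire(state.segment, state.readOffset));
    }
}

Because the two longs in the "shared" group are provably on one cache line, this setup should reproduce the kind of slowdown JMHSample_22_FalseSharing shows for its baseline, rather than the flat result you got with two independently allocated segments.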