Panama: Padding doesn't prevent false sharing

I am trying to benchmark the effects false sharing has on the performance of a program.

In this example: https://github.com/lexburner/JMH-samples/blob/master/src/main/java/org/openjdk/jmh/samples/JMHSample_22_FalseSharing.java, cache line padding yields an order-of-magnitude performance improvement.

However, when I use Project Panama's Foreign Memory Access API, the cache line padding actually makes the performance slightly worse. Do MemorySegments implicitly use padding? What else could be causing this behaviour?

I have already tried running the benchmarks on different hardware and turning off hyperthreading, with the same outcome.

Benchmark details:

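import java.lang.invoke.VarHandle;
import java.util.concurrent.TimeUnit;

import jdk.incubator.foreign.*;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.Throughput)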
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(5)
public class JMH_FalseSharing {

    @State(Scope.Group)
    public static class StateBaseline {
        @TearDown(Level.Trial)
        public void tearDown(){
            rs.close();
        }

        ResourceScope rs = ResourceScope.newSharedScope();
        static final VarHandle VH;
        final MemorySegment readSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);
        final MemorySegment writeSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);

        static{
            VH = MemoryLayouts.JAVA_LONG.varHandle(long.class);
        }
    }
    
    @State(Scope.Group)
    public static class StatePadded {
        @TearDown(Level.Trial)
        public void tearDown(){
            rs.close();
        }
    
        ResourceScope rs = ResourceScope.newSharedScope();
        static final VarHandle VH;
        private static final GroupLayout gl = MemoryLayout.structLayout(
            MemoryLayout.paddingLayout(448L),
            MemoryLayouts.JAVA_LONG.withName("val"),
            MemoryLayout.paddingLayout(448L)
            );
            
        final MemorySegment readSegment  = MemorySegment.allocateNative(gl, rs);
        final MemorySegment writeSegment = MemorySegment.allocateNative(gl, rs);

        static{
            VH = gl.varHandle(long.class, MemoryLayout.PathElement.groupElement("val"));
        }
    }
    
    @Group("baseline")
    @Benchmark
    public void baselineWrite(StateBaseline baselineState){
        StateBaseline.VH.setRelease(baselineState.writeSegment, (long)StateBaseline.VH.getAcquire(baselineState.writeSegment) + 1);
    }

    @Group("baseline")
    @Benchmark
    public void baselineRead(Blackhole blackhole, StateBaseline baselineState){
        blackhole.consume((long)StateBaseline.VH.getAcquire(baselineState.readSegment));
    }
    
    @Group("padded")
    @Benchmark
    public void paddedWrite(StatePadded paddedState){
        StatePadded.VH.setRelease(paddedState.writeSegment, (long)StatePadded.VH.getAcquire(paddedState.writeSegment) + 1);
    }

    @Group("padded")
    @Benchmark
    public void paddedRead(Blackhole blackhole, StatePadded paddedState){
        blackhole.consume((long)StatePadded.VH.getAcquire(paddedState.readSegment));
    }
}

Belated answer, now that the FFM API is quite different.


Take a look at the following JShell session.

jshell --add-modules jdk.incubator.foreign
|  Welcome to JShell -- Version 17.0.2
|  For an introduction type: /help intro

jshell> import jdk.incubator.foreign.*;
   ...> ResourceScope rs = ResourceScope.newSharedScope();
   ...>
rs ==> jdk.internal.foreign.SharedScope@238e0d81

jshell> final MemorySegment readSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);
   ...> final MemorySegment writeSegment = MemorySegment.allocateNative(MemoryLayouts.JAVA_LONG, rs);
   ...>
readSegment ==> MemorySegment{ id=0x28758e86 limit: 8 }
writeSegment ==> MemorySegment{ id=0x286203e6 limit: 8 }

jshell> writeSegment.address().toRawLongValue() - readSegment.address().toRawLongValue()
$5 ==> 24928

jshell>

The two MemorySegments I allocated, the same way you do in StateBaseline, are 24928 bytes apart, far more than the size of a cache line. So there's no false sharing to begin with.
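
As a rough way to check this for any pair of addresses, here is a minimal sketch. It assumes 64-byte cache lines (common on x86-64, but not guaranteed on every CPU), and the CacheLines class and the base address are hypothetical names and values chosen only for illustration; only the 24928-byte distance comes from the session above.

```java
final class CacheLines {
    // Assumption: 64-byte cache lines. Check your CPU before relying on this.
    static final long LINE_SIZE = 64;

    // Two addresses are on the same cache line when they have the same line index.
    // Unsigned division is used because raw addresses are unsigned quantities.
    public static boolean sameCacheLine(long addrA, long addrB) {
        return Long.divideUnsigned(addrA, LINE_SIZE) == Long.divideUnsigned(addrB, LINE_SIZE);
    }

    public static void main(String[] args) {
        long read = 0x7f0000001000L;   // hypothetical, line-aligned base address
        long write = read + 24928;     // distance observed in the JShell session

        System.out.println(sameCacheLine(read, write));    // false
        System.out.println(sameCacheLine(read, read + 8)); // true: adjacent longs
    }
}
```

In a real session you would pass in the values returned by address().toRawLongValue() instead of the made-up base address.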

Contrast that with the readOnly and writeOnly fields in JMHSample_22_FalseSharing.StateBaseline. They are next to each other in the same object, so the various tricks to separate them improve the performance by preventing false sharing.


Now as to why the two MemorySegments are so far away, this answer might be relevant.

However, they can end up close to each other. In fact, when I allocate them inside
try (ResourceScope rs = ResourceScope.newSharedScope()) {
or, in Java 21,
try (Arena arena = Arena.ofConfined()) {
they are usually close together, and sometimes adjacent.


To test false sharing with MemorySegment you can try the following (in Java 21):

MemorySegment longArray = Arena.global().allocateArray(ValueLayout.JAVA_LONG, 0, 0);
MemorySegment readSegment = longArray.asSlice(0, ValueLayout.JAVA_LONG);
MemorySegment writeSegment = longArray.asSlice(ValueLayout.JAVA_LONG.byteSize(), ValueLayout.JAVA_LONG);

MemorySegment structArray = Arena.global().allocateArray(gl, 2);
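
To see why the struct array keeps the two values apart while the plain long array does not, the offsets can be worked out by hand. This is only a sketch under stated assumptions: gl is taken to be the layout from the question (448 bits = 56 bytes of padding, an 8-byte long, then another 56 bytes of padding), cache lines are assumed to be 64 bytes, and the PaddedOffsets class and its methods are hypothetical helpers, not part of the FFM API. Also note that, as far as I know, paddingLayout in Java 21 takes a size in bytes rather than bits, so the layout would be declared with 56 there.

```java
final class PaddedOffsets {
    static final long CACHE_LINE_BYTES = 64;                          // assumed line size
    static final long PAD_BYTES = 448 / 8;                            // 448 bits -> 56 bytes
    static final long STRUCT_BYTES = PAD_BYTES + Long.BYTES + PAD_BYTES; // 56 + 8 + 56 = 120

    // Byte offset of the "val" field of the index-th struct in the array.
    public static long valOffset(long index) {
        return index * STRUCT_BYTES + PAD_BYTES;
    }

    // Two values whose start addresses are at least one cache line apart
    // can never land on the same line, whatever the base address is.
    public static boolean neverShareLine(long distanceBytes) {
        return distanceBytes >= CACHE_LINE_BYTES;
    }

    public static void main(String[] args) {
        long distance = valOffset(1) - valOffset(0);
        System.out.println(valOffset(0));              // 56
        System.out.println(valOffset(1));              // 176
        System.out.println(neverShareLine(distance));  // true: 120 bytes apart
        System.out.println(neverShareLine(Long.BYTES)); // false: adjacent longs, 8 bytes apart
    }
}
```

The two slices of the plain long array are only 8 bytes apart, so they share a cache line unless the array happens to start 8 bytes before a line boundary. That makes it the false sharing baseline, with the struct array as the padded variant.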