Weird performance behavior involving VarHandles

Remi Forax forax at univ-mlv.fr
Wed Apr 24 06:37:42 UTC 2024


Hello,
I'm trying to build an API on top of the foreign memory API and I've found a performance difference I'm not able to explain.

I'm using guardWithTest to provide a simple way to look up the VarHandle for a MemoryLayout field by name, without having to declare each VarHandle by hand,
so instead of

  private static final StructLayout LAYOUT = MemoryLayout.structLayout(
      ValueLayout.JAVA_INT.withName("x"),
      ValueLayout.JAVA_INT.withName("y")
  );

  private static final VarHandle HANDLE_X =
      LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("x"));
  private static final VarHandle HANDLE_Y =
      LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("y"));

I want something like

  private static final MethodHandle MH = guardWithTest(
      TEST.bindTo("x"),
      dropArguments(constant(VarHandle.class, HANDLE_X), 0, String.class),
      guardWithTest(
          TEST.bindTo("y"),
          dropArguments(constant(VarHandle.class, HANDLE_Y), 0, String.class),
          BOOM
      ));

  (TEST compares its two String arguments with ==, and BOOM throws an exception.)

which, if called with "x", returns the VarHandle for "x" and, if called with "y", returns the VarHandle for "y".
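
For readers who want to try the dispatch chain in isolation, here is a minimal sketch of how TEST and BOOM could be defined. The names GwtSketch, Point, test, boom and handleFor are made up for illustration and are not taken from the benchmark; to keep the sketch runnable on older JDKs, it also assumes plain field VarHandles on a small class instead of the MemoryLayout-based handles above.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.VarHandle;

class GwtSketch {
  // Assumption: a plain class stands in for the struct layout of the post.
  static final class Point { int x; int y; }

  static final VarHandle HANDLE_X;
  static final VarHandle HANDLE_Y;
  static final MethodHandle MH;

  // Hypothetical TEST: identity comparison of the two strings.
  static boolean test(String expected, String actual) {
    return expected == actual;
  }

  // Hypothetical BOOM: same type as the guarded targets, always throws.
  static VarHandle boom(String name) {
    throw new IllegalArgumentException("unknown field: " + name);
  }

  static {
    try {
      MethodHandles.Lookup lookup = MethodHandles.lookup();
      HANDLE_X = lookup.findVarHandle(Point.class, "x", int.class);
      HANDLE_Y = lookup.findVarHandle(Point.class, "y", int.class);
      MethodHandle testMH = lookup.findStatic(GwtSketch.class, "test",
          MethodType.methodType(boolean.class, String.class, String.class));
      MethodHandle boomMH = lookup.findStatic(GwtSketch.class, "boom",
          MethodType.methodType(VarHandle.class, String.class));
      // Every branch has type (String)VarHandle; each test has type (String)boolean.
      MH = MethodHandles.guardWithTest(
          testMH.bindTo("x"),
          MethodHandles.dropArguments(
              MethodHandles.constant(VarHandle.class, HANDLE_X), 0, String.class),
          MethodHandles.guardWithTest(
              testMH.bindTo("y"),
              MethodHandles.dropArguments(
                  MethodHandles.constant(VarHandle.class, HANDLE_Y), 0, String.class),
              boomMH));
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  // Convenience wrapper so callers do not have to declare Throwable.
  static VarHandle handleFor(String name) {
    try {
      return (VarHandle) MH.invokeExact(name);
    } catch (RuntimeException | Error e) {
      throw e;
    } catch (Throwable t) {
      throw new AssertionError(t);
    }
  }

  public static void main(String[] args) {
    Point p = new Point();
    p.x = 1;
    p.y = 2;
    int x = (int) handleFor("x").get(p);
    int y = (int) handleFor("y").get(p);
    System.out.println(x + y);  // prints 3
  }
}
```

Because the test uses ==, this only works when the caller passes the same String instance that was bound into TEST, which holds for string literals since they are interned.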


Now if I try to benchmark the performance with JMH,

  private final MemorySegment segment = Arena.ofAuto().allocate(LAYOUT);

  @Benchmark
  public int control() {
    var x = (int) HANDLE_X.get(segment, 0L);
    var y = (int) HANDLE_Y.get(segment, 0L);
    return x + y;
  }

  @Benchmark
  public int gwt2_methodhandle() throws Throwable {
    var x = (int) ((VarHandle) MH.invokeExact("x")).get(segment, 0L);
    var y = (int) ((VarHandle) MH.invokeExact("y")).get(segment, 0L);
    return x + y;
  }

I get

Benchmark                               Mode  Cnt  Score   Error  Units
ReproducerBenchmarks.control            avgt    5  1.250 ± 0.024  ns/op
ReproducerBenchmarks.gwt2_methodhandle  avgt    5  1.852 ± 0.024  ns/op

and I don't understand why there is a difference in performance: for C2, the strings "x" and "y" are constants, so the corresponding VarHandles should be constants too and thus be optimized the same way.

The full benchmark is available here:
  https://raw.githubusercontent.com/forax/memory-mapper/master/src/main/java/com/github/forax/memorymapper/bench/ReproducerBenchmarks.java

regards,
Rémi



More information about the hotspot-compiler-dev mailing list