Weird performance behavior involving VarHandles

Andrew Haley aph-open at littlepinkcloud.com
Fri Apr 26 12:57:55 UTC 2024


On 4/24/24 23:28, Maurizio Cimadamore wrote:
> I seem to recall that the lambda forms for guards-with-test are rather complex, as they need to profile the various branches. I wonder if some "leftover" from the profiling code stays there and pollutes the benchmark?

It's definitely different inlining. On AArch64 I see

ReproducerBenchmarks.control                avgt    5  1.438 ± 0.005  ns/op
ReproducerBenchmarks.gwt2_methodhandle      avgt    5  2.112 ± 0.076  ns/op
ReproducerBenchmarks.gwt_methodhandle       avgt    5  1.440 ± 0.074  ns/op

and the important difference is here; see the "dmb ish", which is pinned:


            │  0x0000fffefcb54a70:   tbnz                w14, #0x1f, #0xfffefcb54cd8
            │                                                            ;*invokevirtual invokeBasic {reexecute=0 rethrow=0 return_oop=0}
            │                                                            ; - java.lang.invoke.VarHandleGuards::guard_LJ_I at 80 (line 1002)
            │                                                            ; - org.openjdk.bench.vm.lang.ReproducerBenchmarks::gwt2_methodhandle at 30 (line 107)
            │                                                            ; - org.openjdk.bench.vm.lang.jmh_generated.ReproducerBenchmarks_gwt2_methodhandle_jmh
            │ ;; B29: #  out( B42 B30 ) <- in( B28 )  Freq: 91235.6
            │  0x0000fffefcb54a74:   ldr         w11, [x11]          ;*invokevirtual getIntUnaligned {reexecute=0 rethrow=0 return_oop=0}
            │                                                            ; - jdk.internal.misc.Unsafe::getIntUnaligned at 5 (line 3576)
            │                                                            ; - jdk.internal.misc.ScopedMemoryAccess::getIntUnalignedInternal at 15 (line 1893)
            │                                                            ; - jdk.internal.misc.ScopedMemoryAccess::getIntUnaligned at 6 (line 1881)
            │                                                            ; - java.lang.invoke.VarHandleSegmentAsInts::get at 48 (line 108)
            │                                                            ; - java.lang.invoke.LambdaForm$DMH/0x00000000231d1c00::invokeStatic at 14
            │                                                            ; - java.lang.invoke.LambdaForm$MH/0x00000000231d3800::invoke at 53
            │                                                            ; - java.lang.invoke.VarHandleGuards::guard_LJ_I at 80 (line 1002)
            │                                                            ; - org.openjdk.bench.vm.lang.ReproducerBenchmarks::gwt2_methodhandle at 13 (line 106)
            │                                                            ; - org.openjdk.bench.vm.lang.jmh_generated.ReproducerBenchmarks_gwt2_methodhandle_jmh
            │ ;; membar_release
            │  0x0000fffefcb54a78:   dmb         ish                 ;*synchronization entry
            │                                                            ; - java.lang.invoke.VarHandle::getMethodHandle at -1 (line 2203)
            │                                                            ; - java.lang.invoke.VarHandleGuards::guard_LJ_I at 59 (line 1001)
            │                                                            ; - org.openjdk.bench.vm.lang.ReproducerBenchmarks::gwt2_methodhandle at 30 (line 107)
            │                                                            ; - org.openjdk.bench.vm.lang.jmh_generated.ReproducerBenchmarks_gwt2_methodhandle_jmh
            │  0x0000fffefcb54a7c:   ldr         w14, [x13, #0x18]   ;*getfield scope {reexecute=0 rethrow=0 return_oop=0}
            │                                                            ; - jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl at 1 (line 430)
            │                                                            ; - java.lang.invoke.VarHandleSegmentAsInts::get at 24 (line 108)
            │                                                            ; - java.lang.invoke.LambdaForm$DMH/0x00000000231d1c00::invokeStatic at 14
            │                                                            ; - java.lang.invoke.LambdaForm$MH/0x00000000231d3800::invoke at 53
            │                                                            ; - java.lang.invoke.VarHandleGuards::guard_LJ_I at 80 (line 1002)
            │                                                            ; - org.openjdk.bench.vm.lang.ReproducerBenchmarks::gwt2_methodhandle at 30 (line 107)
            │                                                            ; - org.openjdk.bench.vm.lang.jmh_generated.ReproducerBenchmarks_gwt2_methodhandle_jmh

This is a release fence. It could be from a constructor with a final field.
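
To illustrate the kind of source that can lead to such a fence, here is a minimal sketch. Box and FinalFieldDemo are hypothetical names, not code from the JDK or from the benchmark. The Java memory model freezes a final field when the constructor exits, and a JIT typically implements that with a release-style barrier at the end of the constructor. Whether the barrier survives when the allocation itself is eliminated depends on the compiler version.

```java
// Hypothetical illustration; not code from the JDK or from the benchmark.
final class Box {
    // Final field: the memory model "freezes" it when the constructor exits.
    private final int value;

    Box(int value) {
        this.value = value;
        // A JIT typically places a release-style barrier at this exit so the
        // store above cannot be reordered after the publication of `this`.
    }

    int value() {
        return value;
    }
}

final class FinalFieldDemo {
    // The Box never escapes this method, so escape analysis may scalar-replace
    // the allocation. The question raised in this thread is whether the
    // constructor's barrier is left behind when that happens.
    static int sum(int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += new Box(i).value();
        }
        return total;
    }
}
```

The Java-level result is the same with or without the barrier; any difference would only be visible in the generated code, for example in perfasm output like the listing above.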

I think it's this:

     MethodHandle getMethodHandle(int mode) {
         MethodHandle[] mhTable = methodHandleTable;
         if (mhTable == null) {
             mhTable = methodHandleTable = new MethodHandle[AccessMode.COUNT];
         }
         MethodHandle mh = mhTable[mode];
         if (mh == null) {
             mh = mhTable[mode] = getMethodHandleUncached(mode);
         }
         return mh;
     }

If I had to guess, a constructor here is being scalar-replaced but its fence remains,
and that fence prevents code motion, so the fields scope and min are reloaded rather
than hoisted. A release barrier generates no code on x86, because x86 is TSO, but it
still prevents code motion in the compiler.
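
A minimal sketch of the effect I mean, using an explicit fence in place of the leftover constructor barrier. FenceInLoop and its field limit are hypothetical; VarHandle.releaseFence() is the standard API. Both methods return the same value. The assumption is that the JIT treats the fence as a barrier to code motion and so reloads limit on every iteration of the fenced loop, where the plain loop lets it hoist the load. That can only be confirmed by looking at the generated code.

```java
import java.lang.invoke.VarHandle;

// Hypothetical illustration; not code from the JDK or from the benchmark.
final class FenceInLoop {
    // Loop-invariant field, standing in for fields such as scope and min.
    private int limit = 10;

    // No fence: the JIT is free to load `limit` once, before the loop.
    int countPlain(int n) {
        int hits = 0;
        for (int i = 0; i < n; i++) {
            if (i < limit) {
                hits++;
            }
        }
        return hits;
    }

    // Explicit release fence in the loop body. It may emit no instruction on
    // a TSO machine, but the compiler can still refuse to move the load of
    // `limit` across it, so the field may be reloaded on every iteration.
    int countFenced(int n) {
        int hits = 0;
        for (int i = 0; i < n; i++) {
            VarHandle.releaseFence();
            if (i < limit) {
                hits++;
            }
        }
        return hits;
    }
}
```

Comparing the disassembly of the two methods on the same JVM build would show whether the load of limit is hoisted in one and not in the other.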

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



More information about the hotspot-compiler-dev mailing list