Apparent regression in memory segment access in JDK 23?

Chris Hegarty chegar999 at gmail.com
Mon Sep 30 13:44:27 UTC 2024


Hi Maurizio,

I did try with the previous suggestion:

   -XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType

.. and things certainly improved a lot. The compilation output looked 
much better. That said, I'm not sure it recovered all of the perf loss 
we're seeing with 23, though I'm also not sure that all of the loss is 
down to this particular issue. Most of it is.
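
For readers following along, here is a minimal sketch of the two paths through MethodHandle::asType that this thread keeps referring to. The class name and the `twice` target are hypothetical and exist only for illustration; this is not the Lucene or JDK code. When the requested type equals the handle's own type, asType is documented to return the handle itself (the cheap path). When the types differ, an adapter has to be produced, which is the slow path the later suggestion keeps out of line.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class AsTypeSketch {
    // Hypothetical target method, used only for illustration.
    static long twice(long v) {
        return v * 2;
    }

    // Handle of type (long)long pointing at twice().
    static final MethodHandle TWICE;
    static {
        try {
            TWICE = MethodHandles.lookup().findStatic(
                    AsTypeSketch.class, "twice",
                    MethodType.methodType(long.class, long.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Fast path: the requested type equals the handle's type, so asType
    // returns the very same handle and no adapter is created.
    static boolean sameTypeReturnsSameHandle() {
        return TWICE.asType(TWICE.type()) == TWICE;
    }

    // Slow path: the requested type (int)long differs from (long)long,
    // so asType has to hand back an adapter that widens the argument.
    static long callThroughAdapter(int v) {
        MethodHandle adapted = TWICE.asType(
                MethodType.methodType(long.class, int.class));
        try {
            return (long) adapted.invokeExact(v);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameTypeReturnsSameHandle());
        System.out.println(callThroughAdapter(21));
    }
}
```

In the JDK 23 trace quoted below, it is the call to asType at bci 71 that fails to inline, after which invokeBasic sees a non-constant receiver.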

I'll try with the new suggestion and get back to you later today.
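
When testing flags like these in a larger application, it can be worth confirming that the options actually reached the JVM. The following is a rough sketch of one way to do that from inside the process, assuming the flags are passed on the command line; the class and method names are made up for illustration, and only the standard `RuntimeMXBean.getInputArguments()` call is relied upon.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

public class CompileCommandCheck {
    // Keep only the arguments that are -XX:CompileCommand=... options.
    static List<String> compileCommands(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(a -> a.startsWith("-XX:CompileCommand="))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Input arguments as reported by the running JVM.
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        // Print each CompileCommand option the JVM was started with.
        compileCommands(jvmArgs).forEach(System.out::println);
    }
}
```

This only shows which options were supplied; whether the compiler honoured them still has to be read from the inlining output.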

Thanks,
-Chris.

On 30/09/2024 11:12, Maurizio Cimadamore wrote:
> Hi Chris,
> It would be very helpful if you could run again with these flags:
> 
> -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.setAsTypeCache -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.asTypeUncached
> 
> As these are more similar to the fix we're likely going to apply (i.e. 
> disable inlining for the slow path of MethodHandle::asType). Our 
> benchmarks respond very well to this, and original performance is fully 
> restored. It would be helpful to know how these synthetic results 
> translate to the "real world" :-)
> 
> Cheers
> Maurizio
> 
> On 27/09/2024 23:10, Chris Hegarty wrote:
>> Ha!!! You tracked it down. Thank you.
>>
>> -Chris
>>
>>> On 27 Sep 2024, at 22:47, Maurizio Cimadamore 
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>> No, the defaults have not changed.
>>>
>>> I think we managed to isolate the issue. More details here 
>>> https://bugs.openjdk.org/browse/JDK-8341127
>>>
>>> In the meantime, I believe that using either of the commands I 
>>> provided in the last email should work around the issue.
>>>
>>> We will try to get this sorted quickly.
>>>
>>> Thanks
>>> Maurizio
>>>
>>> On 27/09/2024 22:41, Chris Hegarty wrote:
>>>>>> On 27 Sep 2024, at 18:36, Maurizio Cimadamore 
>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>> Control question: is your command line affecting the default 
>>>>> values of the inlining parameters? E.g. "InlineSmallCode" ?
>>>>>
>>>> No. We don’t touch any of these JVM flags. So they’re just the 
>>>> defaults.  And I don’t think that any of these defaults changed 
>>>> between 22 and 23?
>>>>
>>>> -Chris
>>>>
>>>>
>>>>> Maurizio
>>>>>
>>>>>> On 27/09/2024 17:17, Chris Hegarty wrote:
>>>>>> Thanks for your quick reply Maurizio.
>>>>>>
>>>>>> Lemme try out those and I'll report back.
>>>>>>
>>>>>> -Chris.
>>>>>>
>>>>>>> On 27/09/2024 16:19, Maurizio Cimadamore wrote:
>>>>>>> So,
>>>>>>> some concrete things for you to try out, which might help us 
>>>>>>> diagnose this further (we think we have some idea of what's going 
>>>>>>> on).
>>>>>>>
>>>>>>> One thing to try is to disable var handle guards. You can do 
>>>>>>> that by passing the following option to the launcher:
>>>>>>>
>>>>>>> ```
>>>>>>> -Djava.lang.invoke.VarHandle.VAR_HANDLE_GUARDS=false
>>>>>>> ```
>>>>>>>
>>>>>>> Another is to force inlining for MethodHandle::asType. This can 
>>>>>>> be done with:
>>>>>>>
>>>>>>> ```
>>>>>>> -XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType
>>>>>>> ```
>>>>>>>
>>>>>>> (and you can repeat that command if you see that other methods 
>>>>>>> get in the way).
>>>>>>>
>>>>>>> Hopefully one of these should work.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 27/09/2024 15:43, Maurizio Cimadamore wrote:
>>>>>>>> Hi Chris,
>>>>>>>> the only thing I can think of is:
>>>>>>>>
>>>>>>>> https://git.openjdk.org/jdk/pull/19251
>>>>>>>>
>>>>>>>> Which I believe you already zeroed in on.
>>>>>>>>
>>>>>>>> We ran all our benchmarks before/after the fix:
>>>>>>>>
>>>>>>>> https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json
>>>>>>>>
>>>>>>>> And the results were largely neutral. The patch should not add 
>>>>>>>> more checks, but mostly reshuffle the existing checks.
>>>>>>>>
>>>>>>>> But from your inlining traces, it does seem that something fails 
>>>>>>>> to inline - which then means bound check elimination will not be 
>>>>>>>> applied.
>>>>>>>>
>>>>>>>> Now, it's hard to say: maybe this code was already 95% near some 
>>>>>>>> threshold, and the new code shape (after our fix) pushes it over 
>>>>>>>> the fence. Or maybe there's something suboptimal in our fix (but 
>>>>>>>> our JMH doesn't seem to indicate that).
>>>>>>>>
>>>>>>>> The var handle caching you mention should affect the number of 
>>>>>>>> var handles being created, but not peak performance. E.g. if you 
>>>>>>>> see more heap being used, that might be a sign that we're 
>>>>>>>> creating multiple copies for the same VH. But creating redundant 
>>>>>>>> copies should not prevent inlining... so something else is up here.
>>>>>>>>
>>>>>>>> Can you try against the latest 24 ea build? We made other 
>>>>>>>> changes in the area (mainly to improve startup of memory segment 
>>>>>>>> var handles). It would be interesting to know if they help bring 
>>>>>>>> things back into shape (in which case we might consider 
>>>>>>>> backporting the startup fixes).
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27/09/2024 15:21, Chris Hegarty wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm trying to track down what appears as a regression when 
>>>>>>>>> accessing long values from a memory segment, when moving from 
>>>>>>>>> JDK 22 to JDK 23. Approx 100-150% slower.
>>>>>>>>>
>>>>>>>>> There are some details in this GH issue [1], but not a lot more 
>>>>>>>>> than what is in this email.
>>>>>>>>>
>>>>>>>>> I'm still debugging, but git bisect on JDK 23 builds was not 
>>>>>>>>> all that helpful. I did see some changes in b25 and further in 
>>>>>>>>> b26 (to restore a varhandle cache).
>>>>>>>>>
>>>>>>>>> I'm still trying to get a basic jmh benchmark, but so far I've 
>>>>>>>>> been unable to reproduce this. I'll keep trying.
>>>>>>>>>
>>>>>>>>> Any thoughts or comments would be gratefully appreciated.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> JDK 23:
>>>>>>>>>    @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>>>>>>>     \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>>>>>>     @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>>>>>>>       \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>>>>>>        @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>>>>>>>         \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>>>>>>          @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes)   force inline by annotation
>>>>>>>>>          @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>>>>>>>            @ 3 java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes)   force inline by annotation
>>>>>>>>>              @ 2 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>>>>>>>            @ 59 java.lang.invoke.VarHandle::getMethodHandle (41 bytes)   force inline by annotation
>>>>>>>>>            @ 71 java.lang.invoke.MethodHandle::asType (32 bytes)   failed to inline: already compiled into a big method
>>>>>>>>>            @ 75 java.lang.invoke.IndirectVarHandle::asDirect (5 bytes)   accessor
>>>>>>>>>            @ 80 java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes)   failed to inline: receiver not constant
>>>>>>>>>
>>>>>>>>> JDK 22:
>>>>>>>>>     @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>>>>>>>      \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>>>>>>       @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>>>>>>>        \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>>>>>>         @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>>>>>>>          \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>>>>>>           @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes)   force inline by annotation
>>>>>>>>>           @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>>>>>>>             @ 3 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>>>>>>>             @ 46 java.lang.invoke.VarForm::getMemberName (38 bytes)   force inline by annotation
>>>>>>>>>             @ 49 java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes)   force inline by annotation
>>>>>>>>>               @ 14 java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes)   force inline by annotation
>>>>>>>>>                 @ 1 java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
>>>>>>>>>                 @ 15 jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes)   force inline by annotation
>>>>>>>>>                   @ 26 jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes)   force inline by annotation
>>>>>>>>>                     @ 16 jdk.internal.util.Preconditions::checkIndex (22 bytes)   (intrinsic)
>>>>>>>>>               @ 24 jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes)   accessor
>>>>>>>>>               @ 29 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes)   inline (hot)
>>>>>>>>>               @ 40 java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes)   force inline by annotation
>>>>>>>>>                 @ 1 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes)   accessor
>>>>>>>>>                 @ 13 jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes)   inline (hot)
>>>>>>>>>               @ 48 jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes)   force inline by annotation
>>>>>>>>>                 @ 6 jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes)   force inline by annotation
>>>>>>>>>                   @ 5 jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes)   force inline by annotation
>>>>>>>>>                   @ 15 jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes)   inline (hot)
>>>>>>>>>                     @ 5 jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes)   (intrinsic)
>>>>>>>>>                     @ 8 jdk.internal.misc.Unsafe::convEndian (16 bytes)   inline (hot)
>>>>>>>>>                   @ 21 java.lang.ref.Reference::reachabilityFence (1 bytes)   force inline by annotation
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>> [1] 
>>>>>>>>> https://github.com/elastic/elasticsearch/issues/113030


More information about the panama-dev mailing list