Apparent regression in memory segment access in JDK 23?

Chris Hegarty chegar999 at gmail.com
Mon Sep 30 19:21:39 UTC 2024


Hi Maurizio,

I can confirm that performance has been restored with the compiler 
directives that you suggested.
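
For anyone following the thread, the access pattern in question is 
roughly the one sketched below. This is a minimal illustration only, not 
the Lucene code: the class and method names are hypothetical, and the use 
of JAVA_LONG_UNALIGNED on a confined native segment is an assumption 
(the real code reads from a mapped segment).

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical sketch: reads long values from a memory segment in a loop.
public class SegmentReadSketch {

    // Allocates a native segment holding the values 0..n-1 and sums them.
    static long sumFirstN(int n) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate((long) n * Long.BYTES, Long.BYTES);
            for (int i = 0; i < n; i++) {
                segment.set(ValueLayout.JAVA_LONG_UNALIGNED, (long) i * Long.BYTES, i);
            }
            return sum(segment);
        }
    }

    // Each MemorySegment::get call goes through the layout's var handle,
    // which is the call chain shown in the inlining traces further down.
    static long sum(MemorySegment segment) {
        long total = 0;
        for (long offset = 0; offset < segment.byteSize(); offset += Long.BYTES) {
            total += segment.get(ValueLayout.JAVA_LONG_UNALIGNED, offset);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumFirstN(1_000));
    }
}
```

A loop this small may well not show the regression on its own (a basic 
JMH benchmark did not reproduce it either); it is only meant to show 
which API calls are involved.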

There is some other variance I see in the benchmark that I'm running, 
but that exists in JDK 22 also. I'll look into that separately. (I have 
some ideas.)

Thanks for jumping on this so fast. And the workaround will allow us to 
upgrade to JDK 23.

I assume that a fix will be prepared for JDK 24, and backported to 23u. 
If there is anything that I can do to help, please let me know.

Thanks,
-Chris.

On 30/09/2024 11:12, Maurizio Cimadamore wrote:
> Hi Chris,
> It would be very helpful if you could run again with these flags:
> 
> -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.setAsTypeCache -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.asTypeUncached
> 
> As these are more similar to the fix we're likely going to apply (i.e. 
> disable inlining for the slow path of MethodHandle::asType). Our 
> benchmarks respond very well to this, and original performance is fully 
> restored. It would be helpful to know how these synthetic results 
> translate to the "real world" :-)
> 
> Cheers
> Maurizio
> 
> On 27/09/2024 23:10, Chris Hegarty wrote:
>> Ha!!! You tracked it down. Thank you.
>>
>> -Chris
>>
>>> On 27 Sep 2024, at 22:47, Maurizio Cimadamore 
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>> No, the defaults have not changed.
>>>
>>> I think we managed to isolate the issue. More details here 
>>> https://bugs.openjdk.org/browse/JDK-8341127
>>>
>>> In the meantime, I believe that using either of the commands I 
>>> provided in the last email should work around the issue.
>>>
>>> We will try to get this sorted quickly.
>>>
>>> Thanks
>>> Maurizio
>>>
>>> On 27/09/2024 22:41, Chris Hegarty wrote:
>>>>>> On 27 Sep 2024, at 18:36, Maurizio Cimadamore 
>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>> Control question: is your command line affecting the default 
>>>>> values of the inlining parameters? E.g. "InlineSmallCode" ?
>>>>>
>>>> No. We don’t touch any of these JVM flags. So they’re just the 
>>>> defaults.  And I don’t think that any of these defaults changed 
>>>> between 22 and 23?
>>>>
>>>> -Chris
>>>>
>>>>
>>>>> Maurizio
>>>>>
>>>>>> On 27/09/2024 17:17, Chris Hegarty wrote:
>>>>>> Thanks for your quick reply Maurizio.
>>>>>>
>>>>>> Lemme try out those and I'll report back.
>>>>>>
>>>>>> -Chris.
>>>>>>
>>>>>>> On 27/09/2024 16:19, Maurizio Cimadamore wrote:
>>>>>>> So,
>>>>>>> some concrete things for you to try out, which might help us 
>>>>>>> diagnose this further (we think we have some idea of what's going 
>>>>>>> on).
>>>>>>>
>>>>>>> One thing to try, is to disable var handle guards. You can do 
>>>>>>> that by passing the following option to the launcher:
>>>>>>>
>>>>>>> ```
>>>>>>> -Djava.lang.invoke.VarHandle.VAR_HANDLE_GUARDS=false
>>>>>>> ```
>>>>>>>
>>>>>>> Another is to force inlining for MethodHandle::asType. This can 
>>>>>>> be done with:
>>>>>>>
>>>>>>> ```
>>>>>>> -XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType
>>>>>>> ```
>>>>>>>
>>>>>>> (and you can repeat that command if you see that other methods 
>>>>>>> get in the way).
>>>>>>>
>>>>>>> Hopefully one of these should work.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 27/09/2024 15:43, Maurizio Cimadamore wrote:
>>>>>>>> Hi Chris,
>>>>>>>> the only thing I can think of is:
>>>>>>>>
>>>>>>>> https://git.openjdk.org/jdk/pull/19251
>>>>>>>>
>>>>>>>> Which I believe you already zeroed in on.
>>>>>>>>
>>>>>>>> We ran all our benchmarks before/after the fix:
>>>>>>>>
>>>>>>>> https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json
>>>>>>>> And the results were largely neutral. The patch should not add 
>>>>>>>> more checks, but mostly reshuffle the existing checks.
>>>>>>>>
>>>>>>>> But from your inlining traces, it does seem that something fails 
>>>>>>>> to inline - which then means bounds check elimination will not be 
>>>>>>>> applied.
>>>>>>>>
>>>>>>>> Now, it's hard to say: maybe this code was already 95% near some 
>>>>>>>> threshold, and the new code shape (after our fix) pushes it over 
>>>>>>>> the fence. Or maybe there's something suboptimal in our fix (but 
>>>>>>>> our JMH doesn't seem to indicate that).
>>>>>>>>
>>>>>>>> The var handle caching you mention should affect the number of 
>>>>>>>> var handles being created, but not peak performance. E.g. if you 
>>>>>>>> see more heap being used, that might be a sign that we're 
>>>>>>>> creating multiple copies for the same VH. But creating redundant 
>>>>>>>> copies should not prevent inlining... so something else is up here.
>>>>>>>>
>>>>>>>> Can you try against the latest 24 ea build? We made other 
>>>>>>>> changes in the area (mainly to improve startup of memory segment 
>>>>>>>> var handles). It would be interesting to know if they help bring 
>>>>>>>> things back into shape (in which case we might consider 
>>>>>>>> backporting the startup fixes).
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27/09/2024 15:21, Chris Hegarty wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm trying to track down what appears as a regression when 
>>>>>>>>> accessing long values from a memory segment, when moving from 
>>>>>>>>> JDK 22 to JDK 23. Approx 100-150% slower.
>>>>>>>>>
>>>>>>>>> There are some details in this GH issue [1], but not a lot more 
>>>>>>>>> than what is in this email.
>>>>>>>>>
>>>>>>>>> I'm still debugging, but git bisect on JDK 23 builds was not 
>>>>>>>>> all that helpful. I did see some changes in b25 and further in 
>>>>>>>>> b26 (to restore a varhandle cache).
>>>>>>>>>
>>>>>>>>> I'm still trying to get a basic JMH benchmark, but so far I've 
>>>>>>>>> been unable to reproduce this. I'll keep trying.
>>>>>>>>>
>>>>>>>>> Any thoughts or comments would be greatly appreciated.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> JDK 23:
>>>>>>>>>    @ 16   org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>>>>>>>     \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>>>>>>     @ 14   org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>>>>>>>       \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>>>>>>        @ 8   jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>>>>>>>         \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>>>>>>          @ 1   jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes)   force inline by annotation
>>>>>>>>>          @ 8   java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>>>>>>>            @ 3   java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes)   force inline by annotation
>>>>>>>>>              @ 2   java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>>>>>>>            @ 59   java.lang.invoke.VarHandle::getMethodHandle (41 bytes)   force inline by annotation
>>>>>>>>>            @ 71   java.lang.invoke.MethodHandle::asType (32 bytes)   failed to inline: already compiled into a big method
>>>>>>>>>            @ 75   java.lang.invoke.IndirectVarHandle::asDirect (5 bytes)   accessor
>>>>>>>>>            @ 80   java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes)   failed to inline: receiver not constant
>>>>>>>>>
>>>>>>>>> JDK 22:
>>>>>>>>>     @ 16   org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>>>>>>>      \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>>>>>>       @ 14   org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>>>>>>>        \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>>>>>>         @ 8   jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>>>>>>>          \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>>>>>>           @ 1   jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes)   force inline by annotation
>>>>>>>>>           @ 8   java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>>>>>>>             @ 3   java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>>>>>>>             @ 46   java.lang.invoke.VarForm::getMemberName (38 bytes)   force inline by annotation
>>>>>>>>>             @ 49   java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes)   force inline by annotation
>>>>>>>>>               @ 14   java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes)   force inline by annotation
>>>>>>>>>                 @ 1   java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
>>>>>>>>>                 @ 15   jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes)   force inline by annotation
>>>>>>>>>                   @ 26   jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes)   force inline by annotation
>>>>>>>>>                     @ 16   jdk.internal.util.Preconditions::checkIndex (22 bytes)   (intrinsic)
>>>>>>>>>               @ 24   jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes)   accessor
>>>>>>>>>               @ 29   jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes)   inline (hot)
>>>>>>>>>               @ 40   java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes)   force inline by annotation
>>>>>>>>>                 @ 1   jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes)   accessor
>>>>>>>>>                 @ 13   jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes)   inline (hot)
>>>>>>>>>               @ 48   jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes)   force inline by annotation
>>>>>>>>>                 @ 6   jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes)   force inline by annotation
>>>>>>>>>                   @ 5   jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes)   force inline by annotation
>>>>>>>>>                   @ 15   jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes)   inline (hot)
>>>>>>>>>                     @ 5   jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes)   (intrinsic)
>>>>>>>>>                     @ 8   jdk.internal.misc.Unsafe::convEndian (16 bytes)   inline (hot)
>>>>>>>>>                   @ 21   java.lang.ref.Reference::reachabilityFence (1 bytes)   force inline by annotation
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>> [1] 
>>>>>>>>> https://github.com/elastic/elasticsearch/issues/113030


More information about the panama-dev mailing list