Apparent regression in memory segment access in JDK 23?

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Fri Sep 27 17:36:19 UTC 2024


Control question: does your command line change the default values of 
any of the inlining parameters, e.g. "InlineSmallCode"?
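
One way to check this from inside the running application is to ask the 
HotSpot diagnostic MXBean where each flag value came from. The following 
is only a minimal sketch: the class name and the list of flags are 
illustrative choices, and the flags other than InlineSmallCode are just 
other inlining-related HotSpot options that may be worth a look.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;

public class InlineFlagCheck {

    // Returns "name=value (origin)" for a HotSpot flag, or a short note
    // if this VM does not know the flag.
    static String describe(String flagName) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            VMOption option = bean.getVMOption(flagName);
            return option.getName() + "=" + option.getValue()
                    + " (" + option.getOrigin() + ")";
        } catch (IllegalArgumentException e) {
            return flagName + " is not available in this VM";
        }
    }

    public static void main(String[] args) {
        // Illustrative selection of inlining-related flags (an assumption,
        // not an exhaustive list).
        String[] flags = {
            "InlineSmallCode", "MaxInlineSize", "FreqInlineSize", "MaxInlineLevel"
        };
        for (String flag : flags) {
            System.out.println(describe(flag));
        }
    }
}
```

An origin of VM_CREATION would suggest the flag was set when the VM was 
launched (typically on the command line), whereas DEFAULT or ERGONOMIC 
would suggest the VM picked the value itself.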

Maurizio

On 27/09/2024 17:17, Chris Hegarty wrote:
> Thanks for your quick reply Maurizio.
>
> Lemme try out those and I'll report back.
>
> -Chris.
>
> On 27/09/2024 16:19, Maurizio Cimadamore wrote:
>> So,
>> some concrete things for you to try out, which might help us diagnose 
>> this further (we think we have some idea of what's going on).
>>
>> One thing to try is to disable var handle guards. You can do that by 
>> passing the following option to the launcher:
>>
>> ```
>> -Djava.lang.invoke.VarHandle.VAR_HANDLE_GUARDS=false
>> ```
>>
>> Another is to force inlining for MethodHandle::asType. This can be 
>> done with:
>>
>> ```
>> -XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType
>> ```
>>
>> (and you can repeat that command if you see that other methods get in 
>> the way).
>>
>> Hopefully one of these should work.
>>
>> Cheers
>> Maurizio
>>
>> On 27/09/2024 15:43, Maurizio Cimadamore wrote:
>>> Hi Chris,
>>> the only thing I can think of is:
>>>
>>> https://git.openjdk.org/jdk/pull/19251
>>>
>>> Which I believe you already zeroed in on.
>>>
>>> We ran all our benchmarks before/after the fix:
>>>
>>> https://urldefense.com/v3/__https://jmh.morethan.io/?sources=https:**Acorsproxy.io**Ahttps:**Acr.openjdk.org**Amcimadamore*jdk*8331865*loop_over_00_baseline.json,https:**Acorsproxy.io**Ahttps:**Acr.openjdk.org**Amcimadamore*jdk*8331865*loop_over_01_8331865.json__;Ly8vPy8vL34vLy8vLy8_Ly8vfi8vLw!!ACWV5N9M2RV99hQ!K5xyY9T-QUt6IIEJBSaRGjb3rs9Zg6OY94yX5_dWCXjf95L7pGe3-GCNhMrGdjU_E8ndoqxykvssGNXLqIVuc9IwuA$ 
>>>
>>> And the results were largely neutral. The patch should not add more 
>>> checks, but mostly reshuffle the existing checks.
>>>
>>> But from your inlining traces, it does seem that something fails to 
>>> inline, which in turn means that bounds check elimination will not 
>>> be applied.
>>>
>>> Now, it's hard to say: maybe this code was already 95% of the way to 
>>> some inlining threshold, and the new code shape (after our fix) 
>>> pushes it over the limit. Or maybe there's something suboptimal in 
>>> our fix (but our JMH results don't seem to indicate that).
>>>
>>> The var handle caching you mention should affect the number of var 
>>> handles being created, but not peak performance. E.g. if you see 
>>> more heap being used, that might be a sign that we're creating 
>>> multiple copies for the same VH. But creating redundant copies 
>>> should not prevent inlining... so something else is up here.
>>>
>>> Can you try against the latest 24 ea build? We made other changes in 
>>> the area (mainly to improve startup of memory segment var handles). 
>>> It would be interesting to know if they help bring things back into 
>>> shape (in which case we might consider backporting the startup fixes).
>>>
>>> Cheers
>>> Maurizio
>>>
>>>
>>> On 27/09/2024 15:21, Chris Hegarty wrote:
>>>> Hi,
>>>>
>>>> I'm trying to track down what appears to be a regression when 
>>>> accessing long values from a memory segment, when moving from JDK 
>>>> 22 to JDK 23. Approx 100-150% slower.
>>>>
>>>> There are some details in this GH issue [1], but not a lot more 
>>>> than what is in this email.
>>>>
>>>> I'm still debugging, but git bisect on JDK 23 builds was not all 
>>>> that helpful. I did see some changes in b25 and further in b26 (to 
>>>> restore a varhandle cache).
>>>>
>>>> I'm still trying to put together a basic JMH benchmark, but so far 
>>>> I've been unable to reproduce this. I'll keep trying.
>>>>
>>>> Any thoughts or comments would be greatly appreciated.
>>>>
>>>>
>>>> JDK 23:
>>>>   @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>>    \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>    @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>>      \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>       @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>>        \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>         @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes)   force inline by annotation
>>>>         @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>>           @ 3 java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes)   force inline by annotation
>>>>             @ 2 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>>           @ 59 java.lang.invoke.VarHandle::getMethodHandle (41 bytes)   force inline by annotation
>>>>           @ 71 java.lang.invoke.MethodHandle::asType (32 bytes)   failed to inline: already compiled into a big method
>>>>           @ 75 java.lang.invoke.IndirectVarHandle::asDirect (5 bytes)   accessor
>>>>           @ 80 java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes)   failed to inline: receiver not constant
>>>>
>>>> JDK 22:
>>>>    @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>>     \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>      @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>>       \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>        @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>>         \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>          @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes)   force inline by annotation
>>>>          @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>>            @ 3 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>>            @ 46 java.lang.invoke.VarForm::getMemberName (38 bytes)   force inline by annotation
>>>>            @ 49 java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes)   force inline by annotation
>>>>              @ 14 java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes)   force inline by annotation
>>>>                @ 1 java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
>>>>                @ 15 jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes)   force inline by annotation
>>>>                  @ 26 jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes)   force inline by annotation
>>>>                    @ 16 jdk.internal.util.Preconditions::checkIndex (22 bytes)   (intrinsic)
>>>>              @ 24 jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes)   accessor
>>>>              @ 29 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes)   inline (hot)
>>>>              @ 40 java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes)   force inline by annotation
>>>>                @ 1 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes)   accessor
>>>>                @ 13 jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes)   inline (hot)
>>>>              @ 48 jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes)   force inline by annotation
>>>>                @ 6 jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes)   force inline by annotation
>>>>                  @ 5 jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes)   force inline by annotation
>>>>                  @ 15 jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes)   inline (hot)
>>>>                    @ 5 jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes)   (intrinsic)
>>>>                    @ 8 jdk.internal.misc.Unsafe::convEndian (16 bytes)   inline (hot)
>>>>                  @ 21 java.lang.ref.Reference::reachabilityFence (1 bytes)   force inline by annotation
>>>>
>>>> -Chris
>>>>
>>>> [1] 
>>>> https://github.com/elastic/elasticsearch/issues/113030
>>>>
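
For reference, the access pattern in the quoted traces (a readLong that 
bottoms out in a memory segment get of an unaligned long) can be sketched 
in plain Java roughly as below. This is only an illustration under stated 
assumptions: the class name, the layout constant, the segment size and the 
iteration counts are all made up here, it uses a native arena rather than 
the mapped segment seen in the traces, it needs JDK 22 or later, and it is 
not claimed to reproduce the slowdown.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;

public class SegmentReadLongSketch {

    // Assumption: an unaligned little-endian long layout, chosen to match
    // the getLongUnaligned path visible in the JDK 22 trace.
    static final ValueLayout.OfLong LAYOUT_LE_LONG =
            ValueLayout.JAVA_LONG_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    // Sums `count` longs starting at byte offset 1, so that every access
    // is deliberately unaligned.
    static long sumLongs(MemorySegment segment, int count) {
        long sum = 0;
        for (int i = 0; i < count; i++) {
            sum += segment.get(LAYOUT_LE_LONG, 1L + (long) i * Long.BYTES);
        }
        return sum;
    }

    public static void main(String[] args) {
        int count = 1024;
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(1L + (long) count * Long.BYTES);
            for (int i = 0; i < count; i++) {
                segment.set(LAYOUT_LE_LONG, 1L + (long) i * Long.BYTES, i);
            }
            // Repeat enough times for the JIT to compile sumLongs.
            long total = 0;
            for (int iter = 0; iter < 20_000; iter++) {
                total += sumLongs(segment, count);
            }
            System.out.println(total);
        }
    }
}
```

Running it with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining on both 
JDK versions should produce inlining traces that can be compared with the 
ones quoted above.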


More information about the panama-dev mailing list