Apparent regression in memory segment access in JDK 23?

Chris Hegarty chegar999 at gmail.com
Fri Sep 27 16:17:41 UTC 2024


Thanks for your quick reply, Maurizio.

Let me try those out and I'll report back.

-Chris.

On 27/09/2024 16:19, Maurizio Cimadamore wrote:
> So,
> some concrete things for you to try out, which might help us diagnose 
> this further (we think we have some idea of what's going on).
> 
> One thing to try, is to disable var handle guards. You can do that by 
> passing the following option to the launcher:
> 
> ```
> -Djava.lang.invoke.VarHandle.VAR_HANDLE_GUARDS=false
> ```
> 
> Another is to force inlining for MethodHandle::asType. This can be done 
> with:
> 
> ```
> -XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType
> ```
> 
> (and you can repeat that command if you see that other methods get in 
> the way).
> 
> Hopefully one of these should work.
> 
> Cheers
> Maurizio
> 
> On 27/09/2024 15:43, Maurizio Cimadamore wrote:
>> Hi Chris,
>> the only thing I can think of is:
>>
>> https://git.openjdk.org/jdk/pull/19251
>>
>> Which I believe you already zeroed in on.
>>
>> We ran all our benchmarks before/after the fix:
>>
>> https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json
>>
>> And the results were largely neutral. The patch should not add more 
>> checks, but mostly reshuffle the existing checks.
>>
>> But from your inlining traces, it does seem that something fails to
>> inline, which then means bounds check elimination will not be applied.
>>
>> Now, it's hard to say: maybe this code was already 95% of the way to
>> some threshold, and the new code shape (after our fix) pushes it over
>> the edge. Or maybe there's something suboptimal in our fix (but our
>> JMH results don't seem to indicate that).
>>
>> The var handle caching you mention should affect the number of var 
>> handles being created, but not peak performance. E.g. if you see more 
>> heap being used, that might be a sign that we're creating multiple 
>> copies for the same VH. But creating redundant copies should not 
>> prevent inlining... so something else is up here.
>>
>> Can you try against the latest 24 ea build? We made other changes in
>> the area (mainly to improve startup of memory segment var handles). It
>> would be interesting to know if they help bring things back into shape
>> (in which case we might consider backporting the startup fixes).
>>
>> Cheers
>> Maurizio
>>
>>
>> On 27/09/2024 15:21, Chris Hegarty wrote:
>>> Hi,
>>>
>>> I'm trying to track down what appears to be a regression when
>>> accessing long values from a memory segment, when moving from JDK 22
>>> to JDK 23. It is approximately 100-150% slower.
>>>
>>> There are some details in this GH issue [1], but not a lot more than 
>>> what is in this email.
>>>
>>> I'm still debugging, but git bisect on JDK 23 builds was not all that 
>>> helpful. I did see some changes in b25 and further in b26 (to restore 
>>> a varhandle cache).
>>>
>>> I'm still trying to get a basic JMH benchmark, but so far I've been
>>> unable to reproduce this. I'll keep trying.
>>>
>>> Any thoughts or comments would be gratefully appreciated.
>>>
>>>
>>> JDK 23:
>>>   @ 16   org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>    \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>    @ 14   org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>      \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>       @ 8   jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>        \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>         @ 1   jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes)   force inline by annotation
>>>         @ 8   java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>           @ 3   java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes)   force inline by annotation
>>>             @ 2   java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>           @ 59   java.lang.invoke.VarHandle::getMethodHandle (41 bytes)   force inline by annotation
>>>           @ 71   java.lang.invoke.MethodHandle::asType (32 bytes)   failed to inline: already compiled into a big method
>>>           @ 75   java.lang.invoke.IndirectVarHandle::asDirect (5 bytes)   accessor
>>>           @ 80   java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes)   failed to inline: receiver not constant
>>>
>>> JDK 22:
>>>    @ 16   org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes)   inline (hot)
>>>     \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>      @ 14   org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes)   inline (hot)
>>>       \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>        @ 8   jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes)   force inline by annotation
>>>         \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>          @ 1   jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes)   force inline by annotation
>>>          @ 8   java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes)   force inline by annotation
>>>            @ 3   java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes)   force inline by annotation
>>>            @ 46   java.lang.invoke.VarForm::getMemberName (38 bytes)   force inline by annotation
>>>            @ 49   java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes)   force inline by annotation
>>>              @ 14   java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes)   force inline by annotation
>>>                @ 1   java.util.Objects::requireNonNull (14 bytes)   force inline by annotation
>>>                @ 15   jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes)   force inline by annotation
>>>                  @ 26   jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes)   force inline by annotation
>>>                    @ 16   jdk.internal.util.Preconditions::checkIndex (22 bytes)   (intrinsic)
>>>              @ 24   jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes)   accessor
>>>              @ 29   jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes)   inline (hot)
>>>              @ 40   java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes)   force inline by annotation
>>>                @ 1   jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes)   accessor
>>>                @ 13   jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes)   inline (hot)
>>>              @ 48   jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes)   force inline by annotation
>>>                @ 6   jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes)   force inline by annotation
>>>                  @ 5   jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes)   force inline by annotation
>>>                  @ 15   jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes)   inline (hot)
>>>                    @ 5   jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes)   (intrinsic)
>>>                    @ 8   jdk.internal.misc.Unsafe::convEndian (16 bytes)   inline (hot)
>>>                  @ 21   java.lang.ref.Reference::reachabilityFence (1 bytes)   force inline by annotation
>>>
>>> -Chris
>>>
>>> [1] https://github.com/elastic/elasticsearch/issues/113030
>>>
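
For anyone following along without a Lucene checkout, the access pattern in the quoted traces (a long read through MemorySegment::get on a mapped segment, ending in getLongUnaligned) can be sketched in plain Java. This is only a minimal sketch under assumptions: the class name, temp file name, loop counts, and the little-endian unaligned layout are made up for illustration and are not Lucene's code. Since the regression has not yet been reproduced in a basic JMH benchmark, a loop this simple may well not show it either.

```java
import java.io.IOException;
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical stand-in for the hot read path; not Lucene's implementation.
public class SegmentLongReadSketch {

    // Assumption: an unaligned little-endian long layout, chosen to match
    // the getLongUnaligned frames in the quoted JDK 22 trace.
    static final ValueLayout.OfLong LAYOUT_LE_LONG =
            ValueLayout.JAVA_LONG_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    // Reads `count` consecutive longs from the segment and sums them.
    static long sumLongs(MemorySegment segment, long count) {
        long sum = 0;
        for (long i = 0; i < count; i++) {
            sum += segment.get(LAYOUT_LE_LONG, i * Long.BYTES);
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        long count = 1L << 20;   // illustrative size: 8 MiB of longs
        int iterations = 200;    // illustrative repeat count
        Path file = Files.createTempFile("segment-sketch", ".bin");
        try (Arena arena = Arena.ofConfined();
             FileChannel channel = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Map the file so the segment is a mapped one, as in the trace.
            MemorySegment segment = channel.map(
                    FileChannel.MapMode.READ_WRITE, 0, count * Long.BYTES, arena);
            for (long i = 0; i < count; i++) {
                segment.set(LAYOUT_LE_LONG, i * Long.BYTES, i);
            }
            long checksum = 0;
            // Warm-up pass so the JIT compiles sumLongs before timing.
            for (int i = 0; i < iterations; i++) {
                checksum += sumLongs(segment, count);
            }
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                checksum += sumLongs(segment, count);
            }
            long elapsed = System.nanoTime() - start;
            System.out.printf("%.3f ns/read (checksum %d)%n",
                    (double) elapsed / ((double) iterations * count), checksum);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Running this on JDK 22 and JDK 23 with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining, with and without the two options Maurizio suggests above, should show whether MethodHandle::asType gets inlined under guard_LJ_J. A hand-rolled timing loop like this is much cruder than JMH, so treat the numbers as a rough indication only.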

