Apparent regression in memory segment access in JDK 23?
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Sep 27 15:19:05 UTC 2024
So,
here are some concrete things to try, which might help us diagnose
this further (we think we have some idea of what's going on).
One thing to try is to disable var handle guards. You can do that by
passing the following option to the launcher:
```
-Djava.lang.invoke.VarHandle.VAR_HANDLE_GUARDS=false
```
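If the JVM is started indirectly (a wrapper script, an options file), it may be worth confirming that the option really reached the process before comparing numbers. A minimal sketch, assuming only that a `-D` option shows up as a regular system property; the class name `GuardsFlagCheck` and its methods are made up for illustration:

```java
public class GuardsFlagCheck {
    // Property name copied from the launcher option above.
    static final String GUARDS_PROPERTY =
            "java.lang.invoke.VarHandle.VAR_HANDLE_GUARDS";

    // Returns true only when the given property value is an explicit "false".
    // A missing value (null) means the option was not passed at all.
    static boolean guardsDisabled(String propertyValue) {
        return "false".equalsIgnoreCase(propertyValue);
    }

    public static void main(String[] args) {
        String value = System.getProperty(GUARDS_PROPERTY);
        System.out.println(GUARDS_PROPERTY + " = " + value);
        System.out.println("guards disabled: " + guardsDisabled(value));
    }
}
```

Running this with and without the option should make it obvious whether the setting is being picked up.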
Another is to force inlining for MethodHandle::asType, which your JDK 23
trace shows as "failed to inline". This can be done with:
```
-XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType
```
(you can repeat that option for any other methods that you see getting
in the way).
Hopefully one of these will work.
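For comparing the two options outside of the full application, a small stand-alone loop can be handy. The sketch below is only an assumption about the access pattern (an unaligned, little-endian long read through MemorySegment::get, on a native segment rather than a mapped one); it is not the Lucene code, the class and method names are invented, and since a plain JMH benchmark has not reproduced the problem so far, a loop like this may well not show it either. It needs JDK 22 or later.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;

public class SegmentReadLoop {
    // Assumed layout: unaligned, little-endian long.
    static final ValueLayout.OfLong LE_LONG =
            ValueLayout.JAVA_LONG_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    // Writes the values 0..count-1 back to back, starting at offset 0.
    static void fill(MemorySegment segment, int count) {
        for (int i = 0; i < count; i++) {
            segment.set(LE_LONG, (long) i * Long.BYTES, i);
        }
    }

    // Reads `count` longs back and sums them; this is the loop of interest.
    static long sum(MemorySegment segment, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += segment.get(LE_LONG, (long) i * Long.BYTES);
        }
        return total;
    }

    // Allocates one native segment and sums it `rounds` times.
    static long run(int count, int rounds) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate((long) count * Long.BYTES);
            fill(segment, count);
            long result = 0;
            for (int r = 0; r < rounds; r++) {
                result += sum(segment, count);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        // Round count is arbitrary; it only needs to be large enough for
        // the JIT to compile sum().
        System.out.println(run(1_024, 50_000));
    }
}
```

Running it with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining`, first on its own and then with each of the options above, should show whether MethodHandle::asType and the rest of the var handle chain get inlined.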
Cheers
Maurizio
On 27/09/2024 15:43, Maurizio Cimadamore wrote:
> Hi Chris,
> the only thing I can think of is:
>
> https://git.openjdk.org/jdk/pull/19251
>
> Which I believe you already zeroed in on.
>
> We ran all our benchmarks before/after the fix:
>
> https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json
>
>
> And the results were largely neutral. The patch should not add more
> checks, but mostly reshuffle the existing checks.
>
> But from your inlining traces, it does seem that something fails to
> inline, which then means bounds check elimination will not be applied.
>
> Now, it's hard to say: maybe this code was already very close to some
> threshold, and the new code shape (after our fix) pushes it over the
> edge. Or maybe there's something suboptimal in our fix (but our JMH
> results don't seem to indicate that).
>
> The var handle caching you mention should affect the number of var
> handles being created, but not peak performance. E.g. if you see more
> heap being used, that might be a sign that we're creating multiple
> copies for the same VH. But creating redundant copies should not
> prevent inlining... so something else is up here.
>
> Can you try against the latest 24 ea build? We made other changes in
> the area (mainly to improve startup of memory segment var handles). It
> would be interesting to know if they help bring things back into shape
> (in which case we might consider backporting the startup fixes).
>
> Cheers
> Maurizio
>
>
> On 27/09/2024 15:21, Chris Hegarty wrote:
>> Hi,
>>
>> I'm trying to track down what appears to be a regression when accessing
>> long values from a memory segment, when moving from JDK 22 to JDK 23.
>> Access is approx. 100-150% slower.
>>
>> There are some details in this GH issue [1], but not a lot more than
>> what is in this email.
>>
>> I'm still debugging, but git bisect on JDK 23 builds was not all that
>> helpful. I did see some changes in b25 and further in b26 (to restore
>> a varhandle cache).
>>
>> I'm still trying to get a basic JMH benchmark, but so far I've been
>> unable to reproduce this. I'll keep trying.
>>
>> Any thoughts or comments would be greatly appreciated.
>>
>>
>> JDK 23:
>> @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes) inline (hot)
>> \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>> @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes) inline (hot)
>> \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>> @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes) force inline by annotation
>> \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>> @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes) force inline by annotation
>> @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes) force inline by annotation
>> @ 3 java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes) force inline by annotation
>> @ 2 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes) force inline by annotation
>> @ 59 java.lang.invoke.VarHandle::getMethodHandle (41 bytes) force inline by annotation
>> @ 71 java.lang.invoke.MethodHandle::asType (32 bytes) failed to inline: already compiled into a big method
>> @ 75 java.lang.invoke.IndirectVarHandle::asDirect (5 bytes) accessor
>> @ 80 java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes) failed to inline: receiver not constant
>>
>> JDK 22:
>> @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes) inline (hot)
>> \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>> @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes) inline (hot)
>> \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>> @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes) force inline by annotation
>> \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>> @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes) force inline by annotation
>> @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes) force inline by annotation
>> @ 3 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes) force inline by annotation
>> @ 46 java.lang.invoke.VarForm::getMemberName (38 bytes) force inline by annotation
>> @ 49 java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes) force inline by annotation
>> @ 14 java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes) force inline by annotation
>> @ 1 java.util.Objects::requireNonNull (14 bytes) force inline by annotation
>> @ 15 jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes) force inline by annotation
>> @ 26 jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes) force inline by annotation
>> @ 16 jdk.internal.util.Preconditions::checkIndex (22 bytes) (intrinsic)
>> @ 24 jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes) accessor
>> @ 29 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes) inline (hot)
>> @ 40 java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes) force inline by annotation
>> @ 1 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes) accessor
>> @ 13 jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes) inline (hot)
>> @ 48 jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes) force inline by annotation
>> @ 6 jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes) force inline by annotation
>> @ 5 jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes) force inline by annotation
>> @ 15 jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes) inline (hot)
>> @ 5 jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes) (intrinsic)
>> @ 8 jdk.internal.misc.Unsafe::convEndian (16 bytes) inline (hot)
>> @ 21 java.lang.ref.Reference::reachabilityFence (1 bytes) force inline by annotation
>>
>> -Chris
>>
>> [1] https://github.com/elastic/elasticsearch/issues/113030
>>