Apparent regression in memory segment access in JDK 23?
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Fri Sep 27 14:43:11 UTC 2024
Hi Chris,
the only thing I can think of is:
https://git.openjdk.org/jdk/pull/19251
Which I believe you already zeroed in on.
We ran all our benchmarks before/after the fix:
https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json
And the results were largely neutral. The patch should not add any
checks; it mostly reshuffles the existing ones.
But from your inlining traces, it does seem that something fails to
inline, which in turn means bounds check elimination will not be applied.
Now, it's hard to say: maybe this code was already very close to some
inlining threshold, and the new code shape (after our fix) pushes it over
the limit. Or maybe there's something suboptimal in our fix (but our JMH
results don't seem to indicate that).
The var handle caching you mention should affect the number of var
handles being created, but not peak performance. E.g. if you see more
heap being used, that might be a sign that we're creating multiple
copies of the same var handle. But creating redundant copies should not
prevent inlining... so something else is up here.
Can you try against the latest 24 ea build? We made other changes in the
area (mainly to improve startup of memory segment var handles). It would
be interesting to know if they help bring things back into shape (in
which case we might consider backporting the startup fixes).
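For comparing builds, a standalone reproducer along the following lines
may help. This is only a sketch under assumptions: the class and method
names (SegmentReadLongSketch, fill, sumLongs) are hypothetical, it needs
JDK 22 or later, and it reads from a confined native segment rather than
the mapped segment Lucene uses, so it may well not hit the same inlining
failure.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical reproducer sketch; not taken from Lucene or the JDK tests.
// To get inlining traces, run with the HotSpot diagnostic flags:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining SegmentReadLongSketch.java
public class SegmentReadLongSketch {

    // Constant layout, so the var handle behind get/set can be constant-folded.
    static final ValueLayout.OfLong LAYOUT_LONG = ValueLayout.JAVA_LONG_UNALIGNED;

    // Allocates a segment holding `count` longs with values 0, 1, ..., count - 1.
    static MemorySegment fill(Arena arena, int count) {
        MemorySegment segment = arena.allocate((long) count * Long.BYTES);
        for (int i = 0; i < count; i++) {
            segment.set(LAYOUT_LONG, (long) i * Long.BYTES, i);
        }
        return segment;
    }

    // Hot loop: one bounds-checked long read per iteration.
    static long sumLongs(MemorySegment segment) {
        long sum = 0;
        long limit = segment.byteSize() - Long.BYTES;
        for (long offset = 0; offset <= limit; offset += Long.BYTES) {
            sum += segment.get(LAYOUT_LONG, offset);
        }
        return sum;
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = fill(arena, 1 << 16);
            long total = 0;
            // Repeat enough times for C2 to compile sumLongs.
            for (int iter = 0; iter < 2_000; iter++) {
                total += sumLongs(segment);
            }
            System.out.println(total);
        }
    }
}
```

Running the same file on 22, 23 and the 24 ea build and diffing the
PrintInlining output for sumLongs would show whether the access still
inlines all the way down to Unsafe::getLongUnaligned.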
Cheers
Maurizio
On 27/09/2024 15:21, Chris Hegarty wrote:
> Hi,
>
> I'm trying to track down what appears to be a regression when accessing
> long values from a memory segment, when moving from JDK 22 to JDK 23.
> Approx 100-150% slower.
>
> There are some details in this GH issue [1], but not a lot more than
> what is in this email.
>
> I'm still debugging, but git bisect on JDK 23 builds was not all that
> helpful. I did see some changes in b25 and further in b26 (to restore
> a varhandle cache).
>
> I'm still trying to put together a basic JMH benchmark, but so far I've
> been unable to reproduce this. I'll keep trying.
>
> Any thoughts or comments would be greatly appreciated.
>
>
> JDK 23:
> @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes) inline (hot)
> \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
> @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes) inline (hot)
> \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
> @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes) force inline by annotation
> \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
> @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes) force inline by annotation
> @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes) force inline by annotation
> @ 3 java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes) force inline by annotation
> @ 2 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes) force inline by annotation
> @ 59 java.lang.invoke.VarHandle::getMethodHandle (41 bytes) force inline by annotation
> @ 71 java.lang.invoke.MethodHandle::asType (32 bytes) failed to inline: already compiled into a big method
> @ 75 java.lang.invoke.IndirectVarHandle::asDirect (5 bytes) accessor
> @ 80 java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes) failed to inline: receiver not constant
>
> JDK 22:
> @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes) inline (hot)
> \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
> @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes) inline (hot)
> \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
> @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes) force inline by annotation
> \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
> @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes) force inline by annotation
> @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes) force inline by annotation
> @ 3 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes) force inline by annotation
> @ 46 java.lang.invoke.VarForm::getMemberName (38 bytes) force inline by annotation
> @ 49 java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes) force inline by annotation
> @ 14 java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes) force inline by annotation
> @ 1 java.util.Objects::requireNonNull (14 bytes) force inline by annotation
> @ 15 jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes) force inline by annotation
> @ 26 jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes) force inline by annotation
> @ 16 jdk.internal.util.Preconditions::checkIndex (22 bytes) (intrinsic)
> @ 24 jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes) accessor
> @ 29 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes) inline (hot)
> @ 40 java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes) force inline by annotation
> @ 1 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes) accessor
> @ 13 jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes) inline (hot)
> @ 48 jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes) force inline by annotation
> @ 6 jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes) force inline by annotation
> @ 5 jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes) force inline by annotation
> @ 15 jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes) inline (hot)
> @ 5 jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes) (intrinsic)
> @ 8 jdk.internal.misc.Unsafe::convEndian (16 bytes) inline (hot)
> @ 21 java.lang.ref.Reference::reachabilityFence (1 bytes) force inline by annotation
>
> -Chris
>
> [1] https://github.com/elastic/elasticsearch/issues/113030
>