Apparent regression in memory segment access in JDK 23?
Chris Hegarty
chegar999 at gmail.com
Mon Sep 30 19:21:39 UTC 2024
Hi Maurizio,
I can confirm that performance has been restored with the compiler
directives that you suggested.
There is some other variance in the benchmark that I'm running, but
it exists in JDK 22 also. I'll look into that separately. (I have
some ideas.)
Thanks for jumping on this so fast. And the workaround will allow us to
upgrade to JDK 23.
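For anyone following along, here is a minimal sketch of the kind of access involved: reading long values from a memory segment in a loop. It is not the Lucene code. The class name, the little-endian unaligned layout, and the segment size are assumptions made for illustration. The flags in the comment are the ones quoted below.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;

// Hypothetical stand-in for the real reader; run on JDK 23 with, for example:
//   -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.setAsTypeCache
//   -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.asTypeUncached
public class SegmentReadSketch {
    // Assumed layout: unaligned, little-endian long.
    static final ValueLayout.OfLong LE_LONG =
            ValueLayout.JAVA_LONG_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    // Sums `count` consecutive longs; each get() goes through the
    // segment var handle path discussed in this thread.
    static long sum(MemorySegment segment, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += segment.get(LE_LONG, (long) i * Long.BYTES);
        }
        return total;
    }

    public static void main(String[] args) {
        int count = 1024;
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate((long) count * Long.BYTES);
            for (int i = 0; i < count; i++) {
                segment.set(LE_LONG, (long) i * Long.BYTES, i);
            }
            System.out.println(sum(segment, count));
        }
    }
}
```

A loop this small will not necessarily reproduce the slowdown (I could not reproduce it in a basic JMH benchmark either). It only shows the shape of the call.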
I assume that a fix will be prepared for JDK 24, and backported to 23u.
If there is anything that I can do to help, please let me know.
Thanks,
-Chris.
On 30/09/2024 11:12, Maurizio Cimadamore wrote:
> Hi Chris,
> It would be very helpful if you could run again with these flags:
>
> -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.setAsTypeCache -XX:CompileCommand=dontinline,java/lang/invoke/MethodHandle.asTypeUncached
>
> These are closer to the fix we're likely going to apply (i.e.
> disable inlining for the slow path of MethodHandle::asType). Our
> benchmarks respond very well to this, and original performance is fully
> restored. It would be helpful to know how these synthetic results
> translate to the "real world" :-)
>
> Cheers
> Maurizio
>
> On 27/09/2024 23:10, Chris Hegarty wrote:
>> Ha!!! You tracked it down. Thank you.
>>
>> -Chris
>>
>>> On 27 Sep 2024, at 22:47, Maurizio Cimadamore
>>> <maurizio.cimadamore at oracle.com> wrote:
>>>
>>> No, the defaults have not changed.
>>>
>>> I think we managed to isolate the issue. More details here
>>> https://bugs.openjdk.org/browse/JDK-8341127
>>>
>>> In the meantime, I believe that using either of the commands I
>>> provided in the last email should work around the issue.
>>>
>>> We will try to get this sorted quickly.
>>>
>>> Thanks
>>> Maurizio
>>>
>>> On 27/09/2024 22:41, Chris Hegarty wrote:
>>>>>> On 27 Sep 2024, at 18:36, Maurizio Cimadamore
>>>>>> <maurizio.cimadamore at oracle.com> wrote:
>>>>> Control question: is your command line affecting the default
>>>>> values of the inlining parameters? E.g. "InlineSmallCode" ?
>>>>>
>>>> No. We don’t touch any of these JVM flags. So they’re just the
>>>> defaults. And I don’t think that any of these defaults changed
>>>> between 22 and 23?
>>>>
>>>> -Chris
>>>>
>>>>
>>>>> Maurizio
>>>>>
>>>>>> On 27/09/2024 17:17, Chris Hegarty wrote:
>>>>>> Thanks for your quick reply Maurizio.
>>>>>>
>>>>>> Lemme try out those and I'll report back.
>>>>>>
>>>>>> -Chris.
>>>>>>
>>>>>>> On 27/09/2024 16:19, Maurizio Cimadamore wrote:
>>>>>>> So,
>>>>>>> some concrete things for you to try out, which might help us
>>>>>>> diagnose this further (we think we have some idea of what's going
>>>>>>> on).
>>>>>>>
>>>>>>> One thing to try is to disable var handle guards. You can do
>>>>>>> that by passing the following option to the launcher:
>>>>>>>
>>>>>>> ```
>>>>>>> -Djava.lang.invoke.VarHandle.VAR_HANDLE_GUARDS=false
>>>>>>> ```
>>>>>>>
>>>>>>> Another is to force inlining for MethodHandle::asType. This can
>>>>>>> be done with:
>>>>>>>
>>>>>>> ```
>>>>>>> -XX:CompileCommand=inline,java.lang.invoke.MethodHandle::asType
>>>>>>> ```
>>>>>>>
>>>>>>> (and you can repeat that command if you see that other methods
>>>>>>> get in the way).
>>>>>>>
>>>>>>> Hopefully one of these should work.
>>>>>>>
>>>>>>> Cheers
>>>>>>> Maurizio
>>>>>>>
>>>>>>> On 27/09/2024 15:43, Maurizio Cimadamore wrote:
>>>>>>>> Hi Chris,
>>>>>>>> the only thing I can think of is:
>>>>>>>>
>>>>>>>> https://git.openjdk.org/jdk/pull/19251
>>>>>>>>
>>>>>>>> Which I believe you already zeroed in on.
>>>>>>>>
>>>>>>>> We ran all our benchmarks before/after the fix:
>>>>>>>>
>>>>>>>> https://jmh.morethan.io/?sources=https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_00_baseline.json,https://corsproxy.io/?https://cr.openjdk.org/~mcimadamore/jdk/8331865/loop_over_01_8331865.json
>>>>>>>>
>>>>>>>> And the results were largely neutral. The patch should not add
>>>>>>>> more checks, but mostly reshuffle the existing checks.
>>>>>>>>
>>>>>>>> But from your inlining traces, it does seem that something fails
>>>>>>>> to inline - which then means bound check elimination will not be
>>>>>>>> applied.
>>>>>>>>
>>>>>>>> Now, it's hard to say: maybe this code was already 95% near some
>>>>>>>> threshold, and the new code shape (after our fix) pushes it over
>>>>>>>> the fence. Or maybe there's something suboptimal in our fix (but
>>>>>>>> our JMH doesn't seem to indicate that).
>>>>>>>>
>>>>>>>> The var handle caching you mention should affect the number of
>>>>>>>> var handles being created, but not peak performance. E.g. if you
>>>>>>>> see more heap being used, that might be a sign that we're
>>>>>>>> creating multiple copies for the same VH. But creating redundant
>>>>>>>> copies should not prevent inlining... so something else is up here.
>>>>>>>>
>>>>>>>> Can you try against the latest 24 ea build? We made other
>>>>>>>> changes in the area (mainly to improve startup of memory segment
>>>>>>>> var handles). It would be interesting to know if they help bring
>>>>>>>> things back into shape (in which case we might consider
>>>>>>>> backporting the startup fixes).
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Maurizio
>>>>>>>>
>>>>>>>>
>>>>>>>> On 27/09/2024 15:21, Chris Hegarty wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm trying to track down what appears to be a regression when
>>>>>>>>> accessing long values from a memory segment, when moving from
>>>>>>>>> JDK 22 to JDK 23. It is approximately 100-150% slower.
>>>>>>>>>
>>>>>>>>> There are some details in this GH issue [1], but not a lot more
>>>>>>>>> than what is in this email.
>>>>>>>>>
>>>>>>>>> I'm still debugging, but git bisect on JDK 23 builds was not
>>>>>>>>> all that helpful. I did see some changes in b25 and further in
>>>>>>>>> b26 (to restore a varhandle cache).
>>>>>>>>>
>>>>>>>>> I'm still trying to get a basic jmh benchmark, but so far I've
>>>>>>>>> been unable to reproduce this. I'll keep trying.
>>>>>>>>>
>>>>>>>>> Any thoughts or comments would be greatly appreciated.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> JDK 23:
>>>>>>>>> @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes) inline (hot)
>>>>>>>>> \-> TypeProfile (11846/11846 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>>>>>> @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes) inline (hot)
>>>>>>>>> \-> TypeProfile (11494/11494 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>>>>>> @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes) force inline by annotation
>>>>>>>>> \-> TypeProfile (11022/11022 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>>>>>> @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (43 bytes) force inline by annotation
>>>>>>>>> @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes) force inline by annotation
>>>>>>>>> @ 3 java.lang.invoke.IndirectVarHandle::checkAccessModeThenIsDirect (8 bytes) force inline by annotation
>>>>>>>>> @ 2 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes) force inline by annotation
>>>>>>>>> @ 59 java.lang.invoke.VarHandle::getMethodHandle (41 bytes) force inline by annotation
>>>>>>>>> @ 71 java.lang.invoke.MethodHandle::asType (32 bytes) failed to inline: already compiled into a big method
>>>>>>>>> @ 75 java.lang.invoke.IndirectVarHandle::asDirect (5 bytes) accessor
>>>>>>>>> @ 80 java.lang.invoke.MethodHandle::invokeBasic(LLJ)J (0 bytes) failed to inline: receiver not constant
>>>>>>>>>
>>>>>>>>> JDK 22:
>>>>>>>>> @ 16 org.apache.lucene.util.packed.DirectReader$DirectPackedReader40::get (34 bytes) inline (hot)
>>>>>>>>> \-> TypeProfile (6998/6998 counts) = org/apache/lucene/util/packed/DirectReader$DirectPackedReader40
>>>>>>>>> @ 14 org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl::readLong (31 bytes) inline (hot)
>>>>>>>>> \-> TypeProfile (7475/7475 counts) = org/apache/lucene/store/MemorySegmentIndexInput$SingleSegmentImpl
>>>>>>>>> @ 8 jdk.internal.foreign.AbstractMemorySegmentImpl::get (12 bytes) force inline by annotation
>>>>>>>>> \-> TypeProfile (10270/10270 counts) = jdk/internal/foreign/MappedMemorySegmentImpl
>>>>>>>>> @ 1 jdk.internal.foreign.layout.ValueLayouts$AbstractValueLayout::varHandle (26 bytes) force inline by annotation
>>>>>>>>> @ 8 java.lang.invoke.VarHandleGuards::guard_LJ_J (84 bytes) force inline by annotation
>>>>>>>>> @ 3 java.lang.invoke.VarHandle::checkAccessModeThenIsDirect (29 bytes) force inline by annotation
>>>>>>>>> @ 46 java.lang.invoke.VarForm::getMemberName (38 bytes) force inline by annotation
>>>>>>>>> @ 49 java.lang.invoke.VarHandleSegmentAsLongs::get (52 bytes) force inline by annotation
>>>>>>>>> @ 14 java.lang.invoke.VarHandleSegmentAsLongs::checkAddress (21 bytes) force inline by annotation
>>>>>>>>> @ 1 java.util.Objects::requireNonNull (14 bytes) force inline by annotation
>>>>>>>>> @ 15 jdk.internal.foreign.AbstractMemorySegmentImpl::checkAccess (30 bytes) force inline by annotation
>>>>>>>>> @ 26 jdk.internal.foreign.AbstractMemorySegmentImpl::checkBounds (54 bytes) force inline by annotation
>>>>>>>>> @ 16 jdk.internal.util.Preconditions::checkIndex (22 bytes) (intrinsic)
>>>>>>>>> @ 24 jdk.internal.foreign.AbstractMemorySegmentImpl::sessionImpl (5 bytes) accessor
>>>>>>>>> @ 29 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetBase (2 bytes) inline (hot)
>>>>>>>>> @ 40 java.lang.invoke.VarHandleSegmentAsLongs::offsetPlain (39 bytes) force inline by annotation
>>>>>>>>> @ 1 jdk.internal.foreign.NativeMemorySegmentImpl::unsafeGetOffset (5 bytes) accessor
>>>>>>>>> @ 13 jdk.internal.foreign.NativeMemorySegmentImpl::maxAlignMask (2 bytes) inline (hot)
>>>>>>>>> @ 48 jdk.internal.misc.ScopedMemoryAccess::getLongUnaligned (18 bytes) force inline by annotation
>>>>>>>>> @ 6 jdk.internal.misc.ScopedMemoryAccess::getLongUnalignedInternal (36 bytes) force inline by annotation
>>>>>>>>> @ 5 jdk.internal.foreign.MemorySessionImpl::checkValidStateRaw (33 bytes) force inline by annotation
>>>>>>>>> @ 15 jdk.internal.misc.Unsafe::getLongUnaligned (12 bytes) inline (hot)
>>>>>>>>> @ 5 jdk.internal.misc.Unsafe::getLongUnaligned (173 bytes) (intrinsic)
>>>>>>>>> @ 8 jdk.internal.misc.Unsafe::convEndian (16 bytes) inline (hot)
>>>>>>>>> @ 21 java.lang.ref.Reference::reachabilityFence (1 bytes) force inline by annotation
>>>>>>>>>
>>>>>>>>> -Chris
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://github.com/elastic/elasticsearch/issues/113030