[External] : Re: Missing entry in SCC due to decomp count mismatch
Ashutosh Mehra
asmehra at redhat.com
Fri Jul 26 18:53:13 UTC 2024
Yes, this seems to work. I don't see the "Missing entry" message
for StackMapGenerator::processBlock any more,
and it is now able to locate the c2 compiled version in the code cache and
use it.
[0.911s][info ][scc,nmethod ] 2662 (L4): Reading nmethod
'jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z'
(md: 0x7f99f00c5f68, decomp: 0, hash: 0x493f24e2)
Thanks,
- Ashutosh Mehra
On Fri, Jul 26, 2024 at 12:28 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:
> Ashutosh, please try this patch:
>
> diff --git a/src/hotspot/share/code/SCCache.cpp
> b/src/hotspot/share/code/SCCache.cpp
> index 551f80e1411..87f2d718320 100644
> --- a/src/hotspot/share/code/SCCache.cpp
> +++ b/src/hotspot/share/code/SCCache.cpp
> @@ -934,8 +934,7 @@ static bool check_entry(SCCEntry::Kind kind, uint id,
> uint comp_level, uint deco
> if (entry->kind() == kind) {
> assert(entry->id() == id, "sanity");
> if (kind != SCCEntry::Code || (!entry->not_entrant() &&
> !entry->has_clinit_barriers() &&
> - entry->comp_level() == comp_level &&
> - (comp_level ==
> CompLevel_limited_profile || entry->decompile() == decomp))) {
> + (entry->comp_level() == comp_level))) {
> return true; // Found
> }
> }
>
>
> Thanks,
> Vladimir K
>
> On 7/26/24 8:17 AM, Ashutosh Mehra wrote:
> > The counter's value in "Preloading" and "Missing" messages is taking
> from current MDO [1] and not from the entry
> > describing AOT code.
> >
> >
> > That's right.
> >
> >
> > I remember (Igor may confirm it) that CompilerCounters data is
> nulled for TD including _nof_decompiles value. That is
> > why we see 0 when loading code.
> >
> >
> > For "Missing" message, the counter is obtained from
> Method->method_data() which would return null if the MethodData
> > recorded in the training data is not yet installed in the Method. If
> Method->method_data() is null, then it prints 0 as
> > the counter value.
> > This is the case I am seeing if I print the method data in the "Missing"
> message:
> >
> > [0.878s][info ][scc,nmethod ] Missing entry for
> >
> 'jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z'
> (md:
> > (nil), comp_level 4, decomp: 0, hash: 0x493f24e2)
> >
> > Notice that md is null.
> >
> > With one step workflow we should ignore decomp count because code
> is generated not
> > during execution but based on training data in forked VM - no
> deoptimization happens there.
> >
> >
> > hmm, I see some entries marked as "made not entrant" in the output of
> PrintCompilation during the assembly phase of
> > 1-step workflow.
> >
> > S131 432 % 2
> jdk.internal.util.ArraysSupport::unsignedHashCode @ 8 (36 bytes) made not
> entrant
> > S133 433 2
> jdk.internal.util.ArraysSupport::unsignedHashCode (36 bytes) made not
> entrant
> > S158 506 2 java.util.HashMap::putVal (300 bytes)
> made not entrant
> >
> > Thanks,
> > - Ashutosh Mehra
> >
> >
> > On Thu, Jul 25, 2024 at 8:18 PM Vladimir Kozlov <
> vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
> >
> > I looked on code and for "Preloading" nmethod we ignore decompile
> counter. We check it only when looking for "normal"
> > AOT code.
> >
> > The counter's value in "Preloading" and "Missing" messages is taking
> from current MDO [1] and not from the entry
> > describing AOT code.
> >
> > I remember (Igor may confirm it) that CompilerCounters data is
> nulled for TD including _nof_decompiles value. That is
> > why we see 0 when loading code.
> >
> > It seems we are "lucky" in most cases because we deoptimize (and
> update counter) may be level 2 (C1) AOT code or
> > "Preloaded" code. That is what I see in my runs which has opposite
> issue: we record AOT code with `decomp == 0` and
> > when
> > we load it "preload" AOT code was deoptimized and counter updated in
> MDO and check failed when we tried to load
> > "normal"
> > AOT code.
> >
> > Anyway. As I said, for one-step workflow we should not use decompile
> counter. I may actually save only latest
> > version of
> > "normal" nmethod even for 5-step workflow. The question is which
> nmethod version corresponds to saved MDO? For on-step
> > workflow the answer is simple and we can ignore the counter. For 5
> steps I need to check corresponding MDO to save
> > correct version.
> >
> > Thanks,
> > Vladimir K
> >
> > [1]
> https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/code/SCCache.cpp#L2868
> > <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/code/SCCache.cpp*L2868__;Iw!!ACWV5N9M2RV99hQ!MUQ7NwYRlq4ggiM1yNmZO3xJb1OV-cpfGFcPgPZ9zA7vbDKlPtPvsVtsm-gca03z_WTzgi8AUzOWOYcRh3ntYA$
> >
> > [2]
> https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/code/SCCache.cpp#L317
> > <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/premain/src/hotspot/share/code/SCCache.cpp*L317__;Iw!!ACWV5N9M2RV99hQ!MUQ7NwYRlq4ggiM1yNmZO3xJb1OV-cpfGFcPgPZ9zA7vbDKlPtPvsVtsm-gca03z_WTzgi8AUzOWOYe9CuvRDQ$
> >
> >
> > On 7/24/24 3:46 PM, Ashutosh Mehra wrote:
> > > Hi Vladimir,
> > >
> > > Is this with one step workflow? With one step workflow we
> should ignore decomp count because code is
> > generated not
> > > during execution but based on training data in forked VM - no
> deoptimization happens there.
> > >
> > >
> > > Yes, this is with the 1-step workflow.
> > >
> > > - Ashutosh Mehra
> > >
> > >
> > > On Wed, Jul 24, 2024 at 3:56 PM Vladimir Kozlov <
> vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>
> > <mailto:vladimir.kozlov at oracle.com <mailto:
> vladimir.kozlov at oracle.com>>> wrote:
> > >
> > > Thank you for report, Ashutosh
> > >
> > > Is this with one step workflow? With one step workflow we
> should ignore decomp count because code is
> > generated not
> > > during execution but based on training data in forked VM - no
> deoptimization happens there.
> > >
> > > `decomp count` was introduced for 5 steps workflow when we
> generate aot code as we execute application with
> > idea that
> > > production run will follow the same
> compilation/deoptimization steps.
> > >
> > > Actually I implemented it before we start using TD to trigger
> compilation. May be this is the reason that 5 steps
> > > workflow is slow now when we use TD. I need to check.
> > >
> > > Thanks,
> > > Vladimir K
> > >
> > > On 7/24/24 7:54 AM, Ashutosh Mehra wrote:
> > > > During the startup of a quarkus app, I see a particular
> method that gets C2 compiled almost every time in the
> > > production
> > > > run with the premain branch . I don't see this happening
> with the mainline.
> > > > The reason this method caught my attention is the
> significant amount of memory its C2 compilation consumes
> > (between
> > > > 25-40 mb) compared to the other compilations.
> > > > The method in question is
> > > >
> jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z
> > > >
> > > > The assembly phase added two entries for this method in
> the code cache:
> > > >
> > > > [3.391s][info ][scc,nmethod ] 2631 (L4): Writing nmethod
> > > >
> >
> 'jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z'
> > > (comp
> > > > level: 4, decomp: 1, has clinit barriers) to Startup Code
> Cache 'quarkus-getting-started.cds.code'
> > > > ...
> > > > [7.215s][info ][scc,nmethod ] 4354 (L4): Writing nmethod
> > > >
> >
> 'jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z'
> > > (comp
> > > > level: 4, decomp: 1) to Startup Code Cache
> 'quarkus-getting-started.cds.code'
> > > >
> > > > In the production run the "preload" version was
> successfully loaded:
> > > >
> > > > [0.695s][info ][scc,nmethod ] 727 (L4): Preloading nmethod
> > > >
> >
> 'jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z'
> > > (decomp:
> > > > 0, hash: 0x493f24e2, has clinit barriers)
> > > >
> > > > The PrintTieredEventslogs indicate this method was also
> sent for compilation during replay training:
> > > >
> > > > 0.877593: [force-compile level=4
> > > >
> >
> [jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z]
> @-1
> > > > queues=0,0 rate=0.000000 load=0.007812 k=1.00,1.00
> total=56,0 mdo=0(0),0(0) max levels=4,0
> > > > compilable=c1,c1-osr,c2,c2-osr status=idle mtd:
> mdo=18830(8306), 0(0), deps=0]
> > > >
> > > > Ideally this request should have been fulfilled by the
> second entry in the code cache. But instead I see this
> > > message:
> > > >
> > > > [0.878s][info ][scc,nmethod] Missing entry for
> > > >
> >
> 'jdk.internal.classfile.impl.StackMapGenerator::processBlock(Ljdk/internal/classfile/impl/RawBytecodeHelper;)Z'
> > > > (comp_level 4, decomp: 0, hash: 0x493f24e2)
> > > >
> > > > This is followed by the C2 compilation of the method.
> > > >
> > > > It looks like the failure to find the second entry is due
> to a mismatch of the decomp count [0]. The
> > decomp count is
> > > > stored in the MethodData.
> > > > Is it possible that the method data is not yet installed
> when replay training is done? If so, is that by
> > design
> > > or a bug?
> > > >
> > > > [0]
> > > >
> > >
> >
> https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp#L938
> <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp*L938__;Iw!!ACWV5N9M2RV99hQ!MUQ7NwYRlq4ggiM1yNmZO3xJb1OV-cpfGFcPgPZ9zA7vbDKlPtPvsVtsm-gca03z_WTzgi8AUzOWOYcE3EES5w$>
> <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp*L938__;Iw!!ACWV5N9M2RV99hQ!MGy9Dev-630RGry8or7lUEBQ4OYoS8cYwif2Z56RKLsqk7kx5YSD65AzSK9yYycpOWGubLGwFIZSImuh3ykRKw$
> <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp*L938__;Iw!!ACWV5N9M2RV99hQ!MGy9Dev-630RGry8or7lUEBQ4OYoS8cYwif2Z56RKLsqk7kx5YSD65AzSK9yYycpOWGubLGwFIZSImuh3ykRKw$
> >>
> > > >
> > >
> > <
> https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp#L938
> <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp*L938__;Iw!!ACWV5N9M2RV99hQ!MUQ7NwYRlq4ggiM1yNmZO3xJb1OV-cpfGFcPgPZ9zA7vbDKlPtPvsVtsm-gca03z_WTzgi8AUzOWOYcE3EES5w$>
> <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp*L938__;Iw!!ACWV5N9M2RV99hQ!MGy9Dev-630RGry8or7lUEBQ4OYoS8cYwif2Z56RKLsqk7kx5YSD65AzSK9yYycpOWGubLGwFIZSImuh3ykRKw$
> <
> https://urldefense.com/v3/__https://github.com/openjdk/leyden/blob/ec5eb99653624d02a923a314ce40086753b240fc/src/hotspot/share/code/SCCache.cpp*L938__;Iw!!ACWV5N9M2RV99hQ!MGy9Dev-630RGry8or7lUEBQ4OYoS8cYwif2Z56RKLsqk7kx5YSD65AzSK9yYycpOWGubLGwFIZSImuh3ykRKw$
> >>>
> > > >
> > > > Thanks,
> > > > - Ashutosh Mehra
> > >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/leyden-dev/attachments/20240726/c696a82e/attachment.htm>
More information about the leyden-dev
mailing list