Initialization code that never got trained
María Arias de Reyna Dominguez
mariasde at redhat.com
Wed Feb 4 09:14:14 UTC 2026
Hi!
On Tue, Feb 3, 2026 at 8:40 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
wrote:
> Thank you, María, for your report.
>
> > (for some reason, the log didn't say anything about any nmethod in
> > the codecache)
>
> I just checked the latest premain build and it shows nmethods.
> What command lines did you use?
>
Training: java -XX:+PrintCompilation
-agentpath:/..../libasyncProfiler.so=start,event=cpu,file=.....profile.html
-XX:AOTCacheOutput=...../sqpc-quarkus-uberjar-app.aot -Xcomp
-Xlog:aot+map=trace,aot+map+oops=trace:file=......-aot.map:none:filesize=0
-Xlog:class+load=info,aot+resolve*=trace,aot+codecache+exit=debug,aot=warning:file=......training.log:level,tags
-jar file.jar
Production: java -XX:+PrintCompilation
-agentpath:/..../libasyncProfiler.so=start,event=cpu,file=......profile.html
-XX:AOTCache=...../sqpc-quarkus-uberjar-app.aot
-Xlog:class+load=info,aot+resolve*=trace,aot+codecache+exit=debug,aot=warning:file=./.....aot.log:level,tags
-jar file.jar
(removed the paths)
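For readability, the same flow condensed (file names below are placeholders,
and the profiler agent and extra -Xlog options are dropped):
Training with -Xcomp:     java -Xcomp -XX:AOTCacheOutput=app.aot -jar file.jar
Training without -Xcomp:  java -XX:AOTCacheOutput=app.aot -jar file.jar
Production (both cases):  java -XX:AOTCache=app.aot -jar file.jar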
> JDK26 said that "[warning][aot] The AOT cache was created by a
> different version or build of HotSpot" so I couldn't even use it on my
> experiment.
>
> What command lines did you use for the JDK 26 experiment?
>
Same.
>
> Thanks,
> Vladimir K
>
> On 2/3/26 3:14 AM, María Arias de Reyna Dominguez wrote:
> > Hi again!
> >
> > Comparing native and Java was not as straightforward as I thought...
> > but I decided to just do an experiment: What would happen if I train
> > with "-Xcomp" and force compilation of everything? Would I get some
> > advantage?
> >
> > My hypothesis said yes. Reality had other ideas.
> >
> > This is a simple REST API over Quarkus that calls a database and returns
> > the result of a select. I trained with "-Xcomp" and then ran production
> > without that option, and compared those production runs with what happens
> > if I don't train with -Xcomp.
> > This was done on 2 dedicated CPU cores.
> > But on my laptop, so other things running at the same time may have
> > interfered (like IO or memory usage; who knows, Slack is a beast). But I
> > ran it four times and the results were always similar.
> >
> > JDK26 said that "[warning][aot] The AOT cache was created by a different
> > version or build of HotSpot" so I couldn't even use it on my experiment.
> > Premain (results from a build of
> > 127bfc9b0dd122c78e702867a88e0847ec362e68) didn't throw that error.
> > Probably this is a bug, not a feature, but let's use it!
> >
> > Do we store more stuff in the cache with that option enabled? Yes, we
> > definitely do.
> >
> > [image.png: AOT cache contents, with vs. without -Xcomp]
> >
> > Do we have a faster start-up time with -Xcomp enabled? No, we even get a
> > worse start-up time:
> >
> > [image.png: production start-up time comparison]
> > I decided to take a look at the cache statistics in both Premain runs:
> >
> > With -Xcomp on:
> > [debug][aot,codecache,exit] Adapters: total=725
> > [debug][aot,codecache,exit] Shared Blobs: total=0
> > [debug][aot,codecache,exit] C1 Blobs: total=0
> > [debug][aot,codecache,exit] C2 Blobs: total=0
> > [debug][aot,codecache,exit] AOT code cache size: 894352 bytes, max
> > entry's size: 2208 bytes
> > [info ][aot,codecache,exit] Wrote 725 AOT code entries to AOT Code Cache
> > Classes in AOT Cache: 12,603
> > -> KlassTrainingData: 7,101 (56.34%)
> > Objects in AOT Cache: 149,684
> > -> AOT-inited: 1,261 (0.84%)
> > -> java.lang.Class instances: 12,361 (8.26%)
> > -> java.lang.String instances: 46,320 (30.95%)
> > Methods in AOT Cache: 158,664
> > -> MethodCounters: 38,424 (24.22%)
> > -> MethodData: 33,347 (21.02%)
> > -> MethodTrainingData: 37,619 (23.71%)
> > -> CompileTrainingData:
> > -> Level 1: 552 (0.35%)
> > -> Level 2: 36 (0.02%)
> > -> Level 3: 24,737 (15.59%)
> > -> Level 4: 23,761 (14.98%)
> >
> >
> > Without -Xcomp:
> > [debug][aot,codecache,exit] Adapters: total=724
> > [debug][aot,codecache,exit] Shared Blobs: total=0
> > [debug][aot,codecache,exit] C1 Blobs: total=0
> > [debug][aot,codecache,exit] C2 Blobs: total=0
> > [debug][aot,codecache,exit] AOT code cache size: 893136 bytes, max
> > entry's size: 2208 bytes
> > [info ][aot,codecache,exit] Wrote 724 AOT code entries to AOT Code Cache
> > Classes in AOT Cache: 12,465
> > -> KlassTrainingData: 2,693 (21.60%)
> > Objects in AOT Cache: 149,416
> > -> AOT-inited: 1,250 (0.84%)
> > -> java.lang.Class instances: 12,208 (8.17%)
> > -> java.lang.String instances: 46,458 (31.09%)
> > Methods in AOT Cache: 157,933
> > -> MethodCounters: 11,004 (6.97%)
> > -> MethodData: 7,311 (4.63%)
> > -> MethodTrainingData: 8,794 (5.57%)
> > -> CompileTrainingData:
> > -> Level 1: 1,249 (0.79%)
> > -> Level 2: 947 (0.60%)
> > -> Level 3: 4,784 (3.03%)
> > -> Level 4: 1,154 (0.73%)
> >
> > (for some reason, the log didn't say anything about any nmethod in the
> > codecache)
> >
> > Whatever that argument is doing, it is not helping as I expected.
> >
> > We get many more TrainingData objects, and the CompileTrainingData is
> > done at a higher level. But it doesn't seem to speed up the application,
> > probably because we are busy loading things we are not really going to
> > use?
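> >
> > (If I want to test that hypothesis, a rough way would be to diff the
> > classes loaded in training against the ones loaded in production,
> > assuming the class+load=info lines keep their usual "SomeClass
> > source: ..." shape; the log file names here are made up:
> >
> > grep ' source: ' training.log | awk '{for(i=2;i<=NF;i++) if($i=="source:"){print $(i-1);break}}' | sort -u > trained.txt
> > grep ' source: ' production.log | awk '{for(i=2;i<=NF;i++) if($i=="source:"){print $(i-1);break}}' | sort -u > prod.txt
> > comm -23 trained.txt prod.txt | wc -l
> >
> > The last number would be classes that were loaded only while training.)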
> >
> > So, the conclusion is: don't bother. This looks like a dead end. María,
> > you should have trusted the process: the JVM knows better than you.
> >
> >
> > On Wed, Jan 7, 2026 at 9:18 AM María Arias de Reyna Dominguez
> > <mariasde at redhat.com <mailto:mariasde at redhat.com>> wrote:
> >
> > Hi!
> >
> > Thanks! I will try to take a closer look and see exactly what is
> > happening.
> >
> > Right now, a comparison of Quarkus native vs Quarkus Leyden (JDK 26
> > main, latest) is close to six or seven times faster on the tests I
> > have done. But that may be test-dependent, so I have to dig further.
> >
> > On Sun, Jan 4, 2026 at 6:23 PM Dan Heidinga <dan.heidinga at oracle.com
> > <mailto:dan.heidinga at oracle.com>> wrote:
> >
> > Happy new year!
> >
> > > For example: a REST API. It has some initialization, port
> > > opening, reading configurations, etc... that run only once. So the
> > > code will never be trained. But it always runs at startup,
> > > impacting the time to first response.
> >
> > Historically, JVMs have looked at run-once code - like the body
> > of <clinit> - as not being worth compiling as the return on the
> > investment in compile time is too low. There have always been
> > exceptions, but even template-style JITs have avoided run-once
> > code.
> >
> > Can you quantify how much of the application's startup is spent
> > in these run-once methods?
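> >
> > For a rough split, a flight recording of the first seconds plus a
> > look at the execution samples might already be enough (file names
> > here are only examples):
> >
> > java -XX:StartFlightRecording=duration=60s,filename=startup.jfr -jar app.jar
> > jfr print --events jdk.ExecutionSample startup.jfr
> >
> > or the async-profiler agent you already attach, switched to
> > wall-clock mode (event=wall), so time spent blocked during startup
> > shows up too.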
> >
> > So, how can I tell Leyden to please compile and cache those
> > functions, even if they are going to be run just once, even if
> > they are not optimized at all, even if those compilations can
> > get discarded after a couple of seconds?
> >
> > Compiling the code isn’t enough. There’s a lot of work with
> > careful timing required to get the code ready for use before the
> > first invocation. If we miss that window, then the compiled
> > code is just overhead.
> >
> > For “expensive” or long running single use code, we may be able
> > to precompile with C1 and get out of the interpreter earlier at
> > the cost of some coordination overhead to ensure the methods are
> > installed immediately.
> >
> > I think we’d need to understand better where the time is being
> > spent to see why this run-once code is slowing down startup.
> >
> > —Dan
> >
> > *From: *leyden-dev <leyden-dev-retn at openjdk.org <mailto:leyden-
> > dev-retn at openjdk.org>> on behalf of María Arias de Reyna
> > Dominguez <mariasde at redhat.com <mailto:mariasde at redhat.com>>
> > *Date: *Tuesday, December 30, 2025 at 4:13 AM
> > *To: *leyden-dev <leyden-dev at openjdk.org <mailto:leyden-
> > dev at openjdk.org>>
> > *Subject: *Initialization code that never got trained
> >
> > Happy New Year!
> >
> > I have been doing some experiments with Leyden and realized
> > something: there is some code at startup/initialization that
> > never gets optimized but is impacting startup and warmup time.
> >
> > This realization came while doing comparisons with native/
> > GraalVM images of the same code.
> >
> > For example: a REST API. It has some initialization, port
> > opening, reading configurations, etc... that run only once. So
> > the code will never be trained. But it always runs at startup,
> > impacting the time to first response.
> >
> > In a native image, that code may not be optimized, but at least it
> > is already compiled, not interpreted. Therefore, the native image
> > starts faster.
> >
> > So, how can I tell Leyden to please compile and cache those
> > functions, even if they are going to be run just once, even if
> > they are not optimized at all, even if those compilations can
> > get discarded after a couple of seconds?
> >
> > Or are we just going to assume that that code, which is
> > impacting startup time, doesn't need to be pre-compiled because
> > we are focusing only on optimizations made by the JVM at runtime?
> >
> > Kind regards,
> > María Arias de Reyna Domínguez
> > Senior Software Engineer
> > She / Her / Hers
> > ariasdereyna at redhat.com <mailto:ariasdereyna at redhat.com>
> >
>
>