Initialization code that never got trained

María Arias de Reyna Dominguez mariasde at redhat.com
Tue Feb 3 11:14:09 UTC 2026


Hi again!

Comparing native and Java was not as straightforward as I thought... but I
decided to just do an experiment: what would happen if I trained with
"-Xcomp" and forced compilation of everything? Would I get some advantage?

My hypothesis said yes. Reality had other ideas.

This is a simple REST API over Quarkus that queries a database and returns
the result of a select. I trained with "-Xcomp" and then ran production
without that option, and compared those production runs with what happens
if I don't train with -Xcomp.
This was done on 2 dedicated CPU cores, but on my laptop, so other things
running at the same time may have interfered (using IO or memory; who
knows, Slack is a beast). Still, I ran it four times and the results were
always similar.
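
For reference, this is roughly the workflow I mean, sketched with the
mainline two-step AOT flags from JEP 483 (the premain build uses different
flags, and app.jar / my.MainClass are just placeholders standing in for the
Quarkus launcher, so treat the exact invocations as assumptions rather than
what I literally ran):

  # Training run: record an AOT configuration, forcing compilation with -Xcomp
  java -Xcomp -XX:AOTMode=record -XX:AOTConfiguration=app.aotconf -cp app.jar my.MainClass

  # Assembly run: build the AOT cache from the recorded configuration
  java -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf -XX:AOTCache=app.aot -cp app.jar

  # Production run: use the cache, this time without -Xcomp
  java -XX:AOTCache=app.aot -cp app.jar my.MainClass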

JDK26 complained that "[warning][aot] The AOT cache was created by a
different version or build of HotSpot", so I couldn't even use it in my
experiment. Premain (results from a build of
127bfc9b0dd122c78e702867a88e0847ec362e68) didn't throw that error. Probably
this is a bug, not a feature, but let's use it!

Do we store more stuff in the cache with that option enabled? Yes, we
definitely do.

[image: image.png]

Do we get a faster start-up time with -Xcomp enabled? No, start-up is
actually worse:

[image: image.png]
I decided to take a look at the cache statistics in both Premain runs:

With -Xcomp on:
[debug][aot,codecache,exit]   Adapters:  total=725
[debug][aot,codecache,exit]   Shared Blobs:  total=0
[debug][aot,codecache,exit]   C1 Blobs:      total=0
[debug][aot,codecache,exit]   C2 Blobs:      total=0
[debug][aot,codecache,exit]   AOT code cache size: 894352 bytes, max entry's size: 2208 bytes
[info ][aot,codecache,exit] Wrote 725 AOT code entries to AOT Code Cache
Classes in AOT Cache: 12,603
  -> KlassTrainingData: 7,101 (56.34%)
Objects in AOT Cache: 149,684
  -> AOT-inited: 1,261 (0.84%)
  -> java.lang.Class instances: 12,361 (8.26%)
  -> java.lang.String instances: 46,320 (30.95%)
Methods in AOT Cache: 158,664
  -> MethodCounters: 38,424 (24.22%)
  -> MethodData: 33,347 (21.02%)
  -> MethodTrainingData: 37,619 (23.71%)
  -> CompileTrainingData:
      -> Level 1: 552 (0.35%)
      -> Level 2: 36 (0.02%)
      -> Level 3: 24,737 (15.59%)
      -> Level 4: 23,761 (14.98%)


Without -Xcomp:
[debug][aot,codecache,exit]   Adapters:  total=724
[debug][aot,codecache,exit]   Shared Blobs:  total=0
[debug][aot,codecache,exit]   C1 Blobs:      total=0
[debug][aot,codecache,exit]   C2 Blobs:      total=0
[debug][aot,codecache,exit]   AOT code cache size: 893136 bytes, max entry's size: 2208 bytes
[info ][aot,codecache,exit] Wrote 724 AOT code entries to AOT Code Cache
Classes in AOT Cache: 12,465
  -> KlassTrainingData: 2,693 (21.60%)
Objects in AOT Cache: 149,416
  -> AOT-inited: 1,250 (0.84%)
  -> java.lang.Class instances: 12,208 (8.17%)
  -> java.lang.String instances: 46,458 (31.09%)
Methods in AOT Cache: 157,933
  -> MethodCounters: 11,004 (6.97%)
  -> MethodData: 7,311 (4.63%)
  -> MethodTrainingData: 8,794 (5.57%)
  -> CompileTrainingData:
      -> Level 1: 1,249 (0.79%)
      -> Level 2: 947 (0.60%)
      -> Level 3: 4,784 (3.03%)
      -> Level 4: 1,154 (0.73%)

(for some reason, the log didn't say anything about nmethods in the code
cache)
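
In case someone wants to pull out the same statistics: the
[aot,codecache,exit] lines are unified-logging output printed when the
cache is written, so adding a selector along these lines to the
cache-creation step should reproduce them. I'm inferring the selector from
the tags in the output, and I'm not sure which tag prints the
"Classes/Objects/Methods in AOT Cache" summary, so treat this as an
assumption, not a recipe:

  # enable AOT code cache statistics at cache-creation time (selector inferred from the log tags above)
  java -Xlog:aot+codecache+exit=debug -Xlog:aot=info \
       -XX:AOTMode=create -XX:AOTConfiguration=app.aotconf -XX:AOTCache=app.aot -cp app.jar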

Whatever that flag is doing, it is not helping the way I expected.

We get many more TrainingData objects, and the CompileTrainingData skews
towards the higher compilation tiers. But it doesn't seem to speed up the
application, probably because we are busy loading things we are not really
going to use?

So, the conclusion is: don't bother. This looks like a dead end. María, you
should have trusted the process: the JVM knows better than you.


On Wed, Jan 7, 2026 at 9:18 AM María Arias de Reyna Dominguez <
mariasde at redhat.com> wrote:

> Hi!
>
> Thanks! I will try to take a closer look and see what exactly is
> happening.
>
> Right now, a comparison of Quarkus native vs Quarkus Leyden (JDK 26 main,
> latest) comes out close to six or seven times faster on the tests I have
> done. But that may be test-dependent, so I have to dig further.
>
> On Sun, Jan 4, 2026 at 6:23 PM Dan Heidinga <dan.heidinga at oracle.com>
> wrote:
>
>> Happy new year!
>>
>> > For example: a REST API. It has some initialization, port opening,
>> reading configurations,
>> > etc... that run only once. So the code will never be trained. But it
>> always runs at startup,
>> > impacting the time to first response.
>>
>> Historically, JVMs have treated run-once code - like the body of
>> <clinit> - as not being worth compiling, since the return on the
>> investment in compile time is too low.  There have always been
>> exceptions, but even template-style JITs have avoided run-once code.
>>
>> Can you quantify how much of the application's startup is spent in these
>> run-once methods?
>>
>> > So, how can I tell Leyden to please compile and cache those functions,
>> even if they are
>> > going to be run just once, even if they are not optimized at all, even
>> if those compilations
>> > can get discarded after a couple of seconds?
>>
>> Compiling the code isn’t enough.  There’s a lot of work with careful
>> timing required to get the code ready for use before the first invocation.
>> If we miss that window, then the compiled code is just overhead.
>>
>> For “expensive” or long running single use code, we may be able to
>> precompile with C1 and get out of the interpreter earlier at the cost of
>> some coordination overhead to ensure the methods are installed immediately.
>>
>> I think we’d need to understand better where the time is being spent to
>> see why this run once code is slowing down startup.
>>
>> —Dan
>>
>> *From: *leyden-dev <leyden-dev-retn at openjdk.org> on behalf of María
>> Arias de Reyna Dominguez <mariasde at redhat.com>
>> *Date: *Tuesday, December 30, 2025 at 4:13 AM
>> *To: *leyden-dev <leyden-dev at openjdk.org>
>> *Subject: *Initialization code that never got trained
>>
>> Happy New Year!
>>
>> I have been doing some experiments with Leyden and realized something:
>> there is some code at startup/initialization that never gets optimized
>> but impacts startup and warmup time.
>>
>> I realized this while doing comparisons with native/GraalVM images of
>> the same code.
>>
>> For example: a REST API. It has some initialization, port opening,
>> reading configurations, etc... that run only once. So the code will never
>> be trained. But it always runs at startup, impacting the time to first
>> response.
>>
>> In a native image that code may not be optimized either, but at least it
>> is already compiled, not interpreted. Therefore, the native image starts
>> faster.
>>
>> So, how can I tell Leyden to please compile and cache those functions,
>> even if they are going to be run just once, even if they are not optimized
>> at all, even if those compilations can get discarded after a couple of
>> seconds?
>>
>> Or are we just going to assume that that code, which is impacting
>> startup time, doesn't need to be pre-compiled because we are focusing
>> only on optimizations made by the JVM at runtime?
>>
>> Kind regards,
>> María Arias de Reyna Domínguez
>> Senior Software Engineer
>> She / Her / Hers
>> ariasdereyna at redhat.com
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 54724 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/leyden-dev/attachments/20260203/93c74804/image.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 33413 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/leyden-dev/attachments/20260203/93c74804/image-0001.png>

