<div dir="ltr"><div dir="ltr">Hi!<div><div dir="ltr" class="gmail_signature"><div dir="ltr"><br></div></div></div></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Tue, Feb 3, 2026 at 8:40 PM Vladimir Kozlov <<a href="mailto:vladimir.kozlov@oracle.com">vladimir.kozlov@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Thank you, María, for your report.<br><br>
> (for some reason, the log didn't say anything about any nmethod in <br>
the codecache)<br>
<br>
I just checked the latest premain build and it shows nmethods.<br>
What command lines did you use?<br></blockquote><div><br></div><div>Training: java -XX:+PrintCompilation -agentpath:/..../libasyncProfiler.so=start,event=cpu,file=.....profile.html -XX:AOTCacheOutput=...../sqpc-quarkus-uberjar-app.aot -Xcomp -Xlog:aot+map=trace,aot+map+oops=trace:file=......-aot.map:none:filesize=0 -Xlog:class+load=info,aot+resolve*=trace,aot+codecache+exit=debug,aot=warning:file=......training.log:level,tags -jar file.jar</div><div><br></div><div>Production: java -XX:+PrintCompilation -agentpath:/..../libasyncProfiler.so=start,event=cpu,file=......profile.html -XX:AOTCache=...../sqpc-quarkus-uberjar-app.aot -Xlog:class+load=info,aot+resolve*=trace,aot+codecache+exit=debug,aot=warning:file=./.....aot.log:level,tags -jar file.jar</div><div><br></div><div>(removed the paths)</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
> JDK26 said that "[warning][aot] The AOT cache was created by a <br>
different version or build of HotSpot" so I couldn't even use it on my <br>
experiment.<br>
<br>
What command lines did you use for the JDK 26 experiment?<br></blockquote><div><br></div><div><br></div><div>Same.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
Thanks,<br>
Vladimir K<br>
<br>
On 2/3/26 3:14 AM, María Arias de Reyna Dominguez wrote:<br>
> Hi again!<br>
> <br>
> Comparing native and Java was not as straightforward as I thought... <br>
> but I decided to just do an experiment: What would happen if I train <br>
> with "-Xcomp" and force compilation of everything? Would I get some <br>
> advantage?<br>
> <br>
> My hypothesis said yes. Reality had other ideas.<br>
> <br>
> This is a simple REST API over Quarkus that calls a database and returns <br>
> a select. I trained with "-Xcomp" and then ran production without that <br>
> option. Then I compared those production runs with what happens if I <br>
> don't train with -Xcomp.<br>
> This was done on 2 dedicated CPU cores. But on my laptop, so other <br>
> things running at the same time may have interfered (IO or memory <br>
> pressure; who knows, Slack is a beast). But I ran it four times and <br>
> the results were always similar.<br>
> <br>
> JDK26 said that "[warning][aot] The AOT cache was created by a different <br>
> version or build of HotSpot" so I couldn't even use it on my experiment.<br>
> Premain (results from a build at commit <br>
> 127bfc9b0dd122c78e702867a88e0847ec362e68) didn't throw that error. <br>
> Probably this is a bug, not a feature, but let's use it!<br>
> <br>
> Do we store more stuff on the cache with that option enabled? Yes, we <br>
> definitely do.<br>
> <br>
> image.png<br>
> <br>
> Do we have a faster start-up time with -Xcomp enabled? No, we even have a <br>
> worse start-up time:<br>
> <br>
> image.png<br>
> I decided to take a look at the cache statistics in both Premain runs:<br>
> <br>
> With -Xcomp on:<br>
> [debug][aot,codecache,exit] Adapters: total=725<br>
> [debug][aot,codecache,exit] Shared Blobs: total=0<br>
> [debug][aot,codecache,exit] C1 Blobs: total=0<br>
> [debug][aot,codecache,exit] C2 Blobs: total=0<br>
> [debug][aot,codecache,exit] AOT code cache size: 894352 bytes, max <br>
> entry's size: 2208 bytes<br>
> [info ][aot,codecache,exit] Wrote 725 AOT code entries to AOT Code Cache<br>
> Classes in AOT Cache: 12,603<br>
> -> KlassTrainingData: 7,101 (56.34%)<br>
> Objects in AOT Cache: 149,684<br>
> -> AOT-inited: 1,261 (0.84%)<br>
> -> java.lang.Class instances: 12,361 (8.26%)<br>
> -> java.lang.String instances: 46,320 (30.95%)<br>
> Methods in AOT Cache: 158,664<br>
> -> MethodCounters: 38,424 (24.22%)<br>
> -> MethodData: 33,347 (21.02%)<br>
> -> MethodTrainingData: 37,619 (23.71%)<br>
> -> CompileTrainingData:<br>
> -> Level 1: 552 (0.35%)<br>
> -> Level 2: 36 (0.02%)<br>
> -> Level 3: 24,737 (15.59%)<br>
> -> Level 4: 23,761 (14.98%)<br>
> <br>
> <br>
> Without -Xcomp:<br>
> [debug][aot,codecache,exit] Adapters: total=724<br>
> [debug][aot,codecache,exit] Shared Blobs: total=0<br>
> [debug][aot,codecache,exit] C1 Blobs: total=0<br>
> [debug][aot,codecache,exit] C2 Blobs: total=0<br>
> [debug][aot,codecache,exit] AOT code cache size: 893136 bytes, max <br>
> entry's size: 2208 bytes<br>
> [info ][aot,codecache,exit] Wrote 724 AOT code entries to AOT Code Cache<br>
> Classes in AOT Cache: 12,465<br>
> -> KlassTrainingData: 2,693 (21.60%)<br>
> Objects in AOT Cache: 149,416<br>
> -> AOT-inited: 1,250 (0.84%)<br>
> -> java.lang.Class instances: 12,208 (8.17%)<br>
> -> java.lang.String instances: 46,458 (31.09%)<br>
> Methods in AOT Cache: 157,933<br>
> -> MethodCounters: 11,004 (6.97%)<br>
> -> MethodData: 7,311 (4.63%)<br>
> -> MethodTrainingData: 8,794 (5.57%)<br>
> -> CompileTrainingData:<br>
> -> Level 1: 1,249 (0.79%)<br>
> -> Level 2: 947 (0.60%)<br>
> -> Level 3: 4,784 (3.03%)<br>
> -> Level 4: 1,154 (0.73%)<br>
> <br>
> (for some reason, the log didn't say anything about any nmethod in the <br>
> codecache)<br>
> <br>
> Whatever that option is doing, it is not helping as I expected.<br>
> <br>
> We get many more TrainingData objects, and the CompileTrainingData is <br>
> concentrated at the higher compilation levels. But it doesn't seem to <br>
> speed up the application, probably because we are busy loading things <br>
> we are not really going to use?<br>
> <br>
> So, the conclusion is: don't bother. This looks like a dead end. María, <br>
> you should have trusted the process: the JVM knows better than you.<br>
> <br>
> <br>
> On Wed, Jan 7, 2026 at 9:18 AM María Arias de Reyna Dominguez <br>
> <<a href="mailto:mariasde@redhat.com" target="_blank">mariasde@redhat.com</a> <mailto:<a href="mailto:mariasde@redhat.com" target="_blank">mariasde@redhat.com</a>>> wrote:<br>
> <br>
> Hi!<br>
> <br>
> Thanks! I will try to take a closer look and see exactly what<br>
> is happening.<br>
> <br>
> Right now, in a comparison of Quarkus native vs Quarkus Leyden (latest<br>
> JDK 26 mainline), native is close to six or seven times faster on the<br>
> tests I have done. But that may be test-dependent, so I have to dig further.<br>
> <br>
> On Sun, Jan 4, 2026 at 6:23 PM Dan Heidinga <<a href="mailto:dan.heidinga@oracle.com" target="_blank">dan.heidinga@oracle.com</a>> wrote:<br>
> <br>
> Happy new year!<br>
> <br>
> > For example: a REST API. It has some initialization, port<br>
> opening, reading configurations,<br>
> > etc... that run only once. So the code will never be trained. But it always runs at startup,<br>
> > impacting the time to first response.<br>
> <br>
> Historically, JVMs have looked at run-once code - like the body<br>
> of <clinit> - as not being worth compiling as the return on the<br>
> investment in compile time is too low. There have always been<br>
> exceptions, but even template-style JITs have avoided run-once code.<br>
> <br>
> Can you quantify how much of the application's startup is spent<br>
> in these run-once methods?<br>
> <br>
> > So, how can I tell Leyden to please compile and cache those functions, even if they are<br>
> > going to be run just once, even if they are not optimized at all, even if those compilations<br>
> > can get discarded after a couple of seconds?<br>
> <br>
> Compiling the code isn’t enough. There’s a lot of work with<br>
> careful timing required to get the code ready for use before the<br>
> first invocation. If we miss that window, then the compiled<br>
> code is just overhead.<br>
> <br>
> For “expensive” or long-running single-use code, we may be able<br>
> to precompile with C1 and get out of the interpreter earlier at<br>
> the cost of some coordination overhead to ensure the methods are<br>
> installed immediately.<br>
> <br>
> I think we’d need to understand better where the time is being<br>
> spent to see why this run once code is slowing down startup.<br>
> <br>
> —Dan<br>
> <br>
> *From: *leyden-dev <<a href="mailto:leyden-dev-retn@openjdk.org" target="_blank">leyden-dev-retn@openjdk.org</a>> on behalf of María Arias de Reyna<br>
> Dominguez <<a href="mailto:mariasde@redhat.com" target="_blank">mariasde@redhat.com</a>><br>
> *Date: *Tuesday, December 30, 2025 at 4:13 AM<br>
> *To: *leyden-dev <<a href="mailto:leyden-dev@openjdk.org" target="_blank">leyden-dev@openjdk.org</a>><br>
> *Subject: *Initialization code that never got trained<br>
> <br>
> Happy New Year!<br>
> <br>
> I have been doing some experiments with Leyden and realized<br>
> something: there is some code at startup/initialization that<br>
> never gets optimized but is impacting on startup and warmup time.<br>
> <br>
> This was a realization while doing comparisons with native/<br>
> graalvm images of the same code.<br>
> <br>
> For example: a REST API. It has some initialization, port<br>
> opening, reading configurations, etc... that run only once. So<br>
> the code will never be trained. But it always runs at startup,<br>
> impacting the time to first response.<br>
> <br>
> Compared to a native image, the native image may not have it<br>
> optimized, but at least it is already compiled, not interpreted.<br>
> Therefore, the native image starts faster.<br>
> <br>
> So, how can I tell Leyden to please compile and cache those<br>
> functions, even if they are going to be run just once, even if<br>
> they are not optimized at all, even if those compilations can<br>
> get discarded after a couple of seconds?<br>
> <br>
> Or are we just going to assume that that code, which is<br>
> impacting startup time, doesn't need to be pre-compiled because<br>
> we are focusing only on optimizations made by the JVM at runtime?<br>
> <br>
> Kind regards,<br>
> María Arias de Reyna Domínguez<br>
> Senior Software Engineer<br>
> She / Her / Hers<br>
> <a href="mailto:ariasdereyna@redhat.com" target="_blank">ariasdereyna@redhat.com</a> <mailto:<a href="mailto:ariasdereyna@redhat.com" target="_blank">ariasdereyna@redhat.com</a>><br>
> <br>
<br>
</blockquote></div></div>