Alternative ways to mark the end of a training run
ioi.lam at oracle.com
Thu Mar 13 05:43:00 UTC 2025
Hi Sebastien,
Thank you for sharing your insights! Please see my responses inline:
On 3/12/25 7:47 AM, Sebastien Deleuze wrote:
> Those 3 alternatives make sense from my POV and look complementary:
> - JCMD could be used by sysops and for testing.
> - API could potentially be used by a Spring Boot actuator [1] that
> exposes a secured endpoint to trigger the end of a training run.
> - Commandline could be useful for platforms wanting to provide AOT
> cache support with the knowledge of the frameworks used. For example,
> it could be used to specify
> `-XX:CDSEndTrainingOnEntry=org.springframework.web.servlet.DispatcherServlet#doService`
> after 10000 invocations.
>
> If that's a new capability, would it make sense to only expose it via
> `-XX:AOTEndTrainingOnEntry` instead of `-XX:CDSEndTrainingOnEntry` to
> avoid too many options and some confusion?
>
Sorry, I copied the text from the bug report without updating it
to use the new "AOT" terminology. In the Leyden repo, this option has
already been renamed to -XX:AOTEndTrainingOnMethodEntry.
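
To make the "counted entry" idea concrete (end training on the Nth entry into a method, as in the 10000-invocation example above), here is a minimal plain-Java sketch. It only models the semantics; it is not how HotSpot implements the option, and the class and method names (CountedEntryTrigger, onMethodEntry) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Illustrative model of "end training on the Nth entry to a method".
 * All names here are hypothetical; this is not the JVM's implementation.
 */
final class CountedEntryTrigger {
    private final long threshold;          // entry count at which to fire
    private final Runnable onTrainingEnd;  // action to run exactly once
    private final AtomicLong entries = new AtomicLong();
    private final AtomicBoolean fired = new AtomicBoolean();

    CountedEntryTrigger(long threshold, Runnable onTrainingEnd) {
        if (threshold < 1) {
            throw new IllegalArgumentException("threshold must be >= 1");
        }
        this.threshold = threshold;
        this.onTrainingEnd = onTrainingEnd;
    }

    /** Call on entry to the watched method; returns true only on the call that fires. */
    boolean onMethodEntry() {
        long n = entries.incrementAndGet();
        // compareAndSet guarantees a single firing even with concurrent callers.
        if (n >= threshold && fired.compareAndSet(false, true)) {
            onTrainingEnd.run();
            return true;
        }
        return false;
    }

    long entryCount() {
        return entries.get();
    }
}
```

With a threshold of 1 this degenerates to "end training on first entry"; later entries keep counting but never fire again.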
> Providing those capabilities will be very important once AOT profiles
> and AOT compiled methods are available, as they could enable use
> cases where the production environment is used to collect profiling
> data (this will be the most popular use case IMO), and you would
> typically want to dump the profiling data without killing the instance. That
> said, it could maybe ease integrators' work to support those
> capabilities earlier than AOT profiles and AOT compiled methods, if
> that makes sense with the current feature set. For example, it could
> allow platforms and frameworks to be "Leyden-ready" with a Java 25
> minimal requirement if we consider those JCMD/API/Commandline as a
> subset of "Leyden public API". AOT profiles and AOT compiled methods
> support could come later and be almost an implementation detail from
> platforms and frameworks POV.
>
> Since creating an AOT cache with Spring Petclinic was pretty
> slow and resource-intensive the last time I tried, I am wondering
> whether it would be possible to provide an option to control what kind
> of output is produced at the end of the training run. Generating the
> CDS/AOT archive directly makes sense, but in production it could be
> too resource-intensive and produce too many side effects. Just as
> JEP 483 distinguishes the recorded AOT configuration
> (app.aotconf) from the cache (app.aot), would it be possible with AOT
> profiles and AOT compiled methods to optionally generate an
> intermediate format that would be fast to generate at the end of
> training, and defer the more involved creation of the archive to a
> later point (keeping the constraints of same OS, Java version and
> classpath)?
>
We plan to support the intermediate form (app.aotconf) going forward, so
you will be able to record that on a small host. This file can be
transferred to a bigger development host to create the final AOT cache
(with metadata, profiles and compiled methods).
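
For reference, the two steps as JEP 483 defines them today can be written out as command lines. The sketch below just assembles them as argument lists in Java; the flags are the ones from JEP 483, while the helper class AotWorkflow and the example values (app.jar, com.example.App) are placeholders.

```java
import java.util.List;

/** Hypothetical helper that spells out the two JEP 483 steps as argument lists. */
final class AotWorkflow {
    // Step 1 (small/training host): run the app and record the AOT configuration.
    static List<String> recordCommand(String conf, String classpath, String mainClass) {
        return List.of("java", "-XX:AOTMode=record",
                       "-XX:AOTConfiguration=" + conf,
                       "-cp", classpath, mainClass);
    }

    // Step 2 (bigger host): create the cache from the recorded configuration.
    // The application itself is not run in this step, so no main class is given.
    static List<String> createCommand(String conf, String cache, String classpath) {
        return List.of("java", "-XX:AOTMode=create",
                       "-XX:AOTConfiguration=" + conf,
                       "-XX:AOTCache=" + cache,
                       "-cp", classpath);
    }
}
```

For example, AotWorkflow.recordCommand("app.aotconf", "app.jar", "com.example.App") yields the record step, and only the resulting app.aotconf needs to be copied to the host that runs the create step.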
Thanks
- Ioi
> On Thu, Mar 6, 2025 at 9:23 PM <ioi.lam at oracle.com> wrote:
>
> With JEP 483 [1], the profiling data (AOT config file) are
> captured when the training run exits. In the Leyden repo, we have
> implemented a mechanism [2] to capture the profiler data at an
> earlier point. Excerpt from [2]:
>
>
> ===
>
> It may be difficult for users to run to normal completion for all
> training runs - some may prefer to only record data until the
> application framework has started or prior to some method being
> invoked.
>
> This RFE is to track other possible triggers for CDS data to be
> collected and / or for the _assembly phase_ to begin.
>
> (a) JCMD: a new jcmd can be developed to attach to a running JVM
> and signal the training run has ended. For classic CDS, this may
> be the point at which the classlist is dumped to the file. For
> premain CDS, this may trigger the start of the assembly phase and
> the creation of the CDS archive.
>
> (b) API: a new Leyden-specific API may be created that allows
> developers to indicate programmatically the point at which the
> training run should end. This could be as simple as a static
> method `Leyden.endTraining()` or something that exposes more state
> such as the name of the CDS file. Details TBD based on need.
>
> (c) Commandline: a new option could specify when to trigger the
> end of the training run, e.g.
> `-XX:CDSEndTrainingOnEntry=org.foo.bar.someMethod`. This can be
> extended beyond single entry to also include a counted entry, i.e.
> the 1000th time this method is entered.
>
> ===
>
>
> As of today, (a) and (c) have been implemented in the Leyden repo.
> We have received positive feedback from developers who found this
> mechanism useful and are requesting similar features in
> the JDK mainline.
>
> I think now will be a good time to have a wider discussion with
> the community about:
>
> - The use case and requirements for such a mechanism
>
> - The solution space -- besides the 3 options listed above, are
> there other approaches? Pros & cons?
>
> For example, an API might be more precise. However, many apps are
> built with third-party libraries that cannot be modified easily, so an
> external mechanism would be preferable. JCMD might be the least
> intrusive, but it is timing-dependent and may not be available (in
> containers, etc.).
>
>
> --------------------------------------
>
> [1] https://openjdk.org/jeps/483
>
> [2] https://bugs.openjdk.org/browse/JDK-8335358
>
>