AOT cache distribution with my application
Joffrey Bion
joffrey.bion at jetbrains.com
Tue Nov 11 18:58:34 UTC 2025
Hi Ioi, and thanks for getting back to me.
> The Leyden design philosophy is that the training/assembly should be done
in an environment as close to the production run as possible
While I understand this reasoning, there are cases where this is not really
convenient or possible. Overall, I believe it would be great to see a
general solution to the distribution of CLI/desktop programs (meant to be
installed on user machines) from the Leyden team.
> In your scenario, since you are creating the AOT cache on first
execution, would it be possible to do a training run on first execution as
well?
Also, if we implement it during the user's first run, the training would
heavily depend on their command choices. What if they just ran "./amper
--help"? Adding logic to detect whether the command they run is suitable
for training complicates matters significantly, as CLI argument parsing
currently happens within the JVM app, not the wrapper script (which
determines the JVM arguments). This doesn't look like a good general
solution for CLI applications.
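
To make the concern concrete, here is a rough sketch of what such a check
could look like in a POSIX shell wrapper. The function names, the list of
"unsuitable" arguments, and the cache path are hypothetical and not taken from
Amper; only the -XX:AOTCacheOutput and -XX:AOTCache flags are real JVM options
(JDK 25 and JDK 24+ respectively).

```shell
# Hypothetical helper: guess, from the raw arguments alone, whether this
# invocation is worth using as a training run. The wrapper cannot know the
# real command set, because argument parsing happens inside the JVM app.
is_training_candidate() {
  # No arguments at all: nothing representative to train on.
  [ "$#" -gt 0 ] || return 1
  for arg in "$@"; do
    case "$arg" in
      -h|--help|-v|--version) return 1 ;;
    esac
  done
  return 0
}

# Hypothetical helper: print the AOT-related JVM flag for this run.
# $1 is the cache file path; the remaining arguments are the CLI arguments.
aot_jvm_flag() {
  cache="$1"
  shift
  if [ -f "$cache" ]; then
    # Cache already exists: use it.
    printf '%s\n' "-XX:AOTCache=$cache"
  elif is_training_candidate "$@"; then
    # First suitable run: train and write the cache when the JVM exits.
    printf '%s\n' "-XX:AOTCacheOutput=$cache"
  fi
  # Otherwise print nothing and run without AOT flags.
}
```

The wrapper would then pass the output of aot_jvm_flag to the java command
before the classpath and main class. Even this small heuristic duplicates
knowledge of the CLI in the wrapper, and it still cannot tell a representative
build from a trivial one.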
> For short running programs such as command-line tools, I think the
training run can be relatively short, and should take less time than it
takes to create the AOT cache
Doing a training run for our application means running a build (because
Amper is a build tool). This is most likely not short, especially on a
fresh user machine, and particularly if it involves building multiplatform
stuff. It could take 5 minutes for all I know.
For the record, I tried to explore alternatives here for Amper
specifically:
https://youtrack.jetbrains.com/issue/AMPER-4825/AOT-caching-for-Amper-CLI#focus=Comments-27-12950737.0-0
1. We could move the training run to the `./amper update` command (which
might also be run as part of Amper's installation). This involves
generating a training project, running a build as a training run, and
generating the cache. This would likely make the update longer than
acceptable. The AOT cache generation is already probably too much, but we
don't really have a choice there if we don't want a complicated CI setup.
2. We could make the training run explicit and project-specific, by asking
the user to run a specific command. This is not ideal because users must
know it exists, and most users will not benefit from it.
3. We could create a whole CI infrastructure with different types of
machines, each building the proper AOT cache for various OS and
architecture combinations. This is a CI hassle, and kinda goes against the
"write once, run anywhere" experience I would expect from the JVM world.
4. Use a GraalVM native image instead, but this also might require some CI
hassle.
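
For alternative 1, the update step would roughly boil down to the two JVM
invocations below, both run on the user's machine with the same JDK. This is
only a dry-run sketch: the helper name, file names, and training main class are
hypothetical, while the -XX:AOTMode, -XX:AOTConfiguration, and -XX:AOTCache
flags are the ones from the JDK 24 AOT cache workflow.

```shell
# Hypothetical helper: print the two JVM invocations that an update step
# would run. $1 = classpath, $2 = training main class, $3 = output directory.
aot_update_commands() {
  classpath="$1"
  main="$2"
  out="$3"
  # Step 1: training run that records the AOT configuration.
  printf '%s\n' "java -XX:AOTMode=record -XX:AOTConfiguration=$out/app.aotconf -cp $classpath $main"
  # Step 2: assembly, which creates the cache without running the application.
  printf '%s\n' "java -XX:AOTMode=create -XX:AOTConfiguration=$out/app.aotconf -XX:AOTCache=$out/app.aot -cp $classpath"
}
```

Calling `aot_update_commands "$JARS" myapp.Training "$HOME/.cache/myapp"`
prints the record command followed by the create command. In Amper's case the
first step is the expensive one, since the training run is an actual build.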
> Technically it's not impossible to support an alternative format for the
AOT configuration file to be portable. We probably need a way to serialize
the existing contents into a text file, and then read it back in the AOT
assembly phase
Thanks, that gives me some hope! By the way, I have seen text versus binary
formats mentioned a couple of times, and I have to say I'm a bit confused.
This seems orthogonal to portability. We could have a non-portable text
format (with platform-specific classes) or a portable binary format (any
custom binary format that is the same on all machines). But I'll assume
that by "text" you mean "portable" and by "binary" you mean "non-portable".
> One disadvantage is it will not cover platform-dependent classes, so this
could be sub-optimal (e.g., for programs that make a lot of file
operations).
That is a fair point I hadn't considered. Thanks for sharing. This could
perhaps be mitigated if the JDK provided a way to perform training runs
with emulation (but I might be dreaming here :D).
More seriously, this is a trade-off that some applications might be willing
to make, especially if the AOT cache generation could be done in different
variants from a single host, like some form of cross-compilation (which
seems more realistic to my ignorant brain).
Perhaps building a native image is our best option right now.
Thanks,
Joffrey
On Fri, Nov 7, 2025 at 5:39 PM <ioi.lam at oracle.com> wrote:
> Hi Joffrey,
>
> Thanks for your feedback.
>
> You're correct that we have changed the AOT configuration file to a
> binary format that's tied to the same JDK executable that generated it.
> It cannot be used on a different OS, or CPU, or even a different version
> of the JDK on the same OS/CPU.
>
> The Leyden design philosophy is that the training/assembly should be
> done in an environment as close to the production run as possible. In
> your scenario, since you are creating the AOT cache on first execution,
> would it be possible to do a training run on first execution as well?
> E.g, from something like:
>
> java -XX:AOTMode=create -XX:AOTConfiguration=pregenerated.config
> -XX:AOTCache=app.aot
>
> to
>
> java -XX:AOTCacheOutput=app.aot -cp $JARS myapp.Training
>
> For short running programs such as command-line tools, I think the
> training run can be relatively short, and should take less time than it
> takes to create the AOT cache (especially when AOT code compilation is
> supported in the future). Therefore, I think this will not take
> significantly longer than your proposed approach.
>
> Technically it's not impossible to support an alternative format for the
> AOT configuration file to be portable. We probably need a way to
> serialize the existing contents into a text file, and then read it back
> in the AOT assembly phase. One disadvantage is it will not cover
> platform-dependent classes, so this could be sub-optimal (e.g., for
> programs that make a lot of file operations).
>
> $ cd openjdk/src/java.base
> $ find windows -name \*.java | wc
> 70 70 3745
> $ find linux -name \*.java | wc
> 32 32 1804
> $ find macosx -name \*.java | wc
> 36 36 2043
>
> Therefore, we are a bit hesitant to go back to the text-based config
> file due to development cost and performance implication.
>
> Thanks
>
> - Ioi
>
>
> On 11/5/25 3:21 AM, Joffrey Bion wrote:
>
> > Hi,
> >
> > At JetBrains we're working on a JVM-based command-line tool called
> > Amper. We're trying to optimize startup time using AOT features, but
> > we're in a bit of a pickle regarding the AOT cache portability.
> >
> > The way our application is set up is the following:
> > * we build our project, and package our runtime classpath jars into a
> > .tgz, which we call our "distribution". This is done from a single
> > (Linux) host on our CI.
> > * we provide a wrapper script to users, which they should check into
> > their VCS repo (akin to gradlew). This wrapper script downloads the
> > proper JRE for Amper and the distribution tgz (if they are not already
> > present on the machine), and then runs the application.
> >
> > Our plan was the following:
> > * perform an AOT training run on a single CI host (the one that
> > publishes our application), record the amper.aotconf once, and package
> > it within our distribution tgz
> > * then, have our wrapper script generate the AOT cache from the
> > aotconf on the end user machine during the first run.
> >
> > This way, we remove the training run hassle (and time overhead) from
> > the users, but still generate the OS/arch/environment-specific cache
> > on the end user machine.
> >
> > However, it seems that the AOT config (output of the training run)
> > will no longer be portable:
> > https://bugs.openjdk.org/browse/JDK-8348426
> >
> > And the response here seems to confirm this:
> > https://mail.openjdk.org/pipermail/leyden-dev/2025-March/001781.html
> >
> > > In JDK 25 and going forward, we are collecting execution profile during
> > > AOT training. As a result, we have changed the AOT configuration file to
> > > a binary file format that's tied to the execution platform of the JVM.
> > > You can see more information from
> > > https://bugs.openjdk.org/browse/JDK-8348426
> > >
> > > The profile data is difficult to be represented in a cross-platform
> > > format (e.g., a text file). The need for "cross platform builds" has
> > > come up before in our design discussion. We have decided to defer it and
> > > focus on delivering optimizations for the most common use cases first.
> > > We might re-evaluate this decision in the future when we have more user
> > > feedback (and more time :-)
> >
> > So my question is: what is the plan of the project Leyden team
> > regarding our use case (non-server applications that run on different
> > types of user machines)? Are there any plans to allow app authors to
> > somehow bundle AOT data from a training run in a portable format
> > together with the jars of the application?
> > We're using the JVM for the "write once, run anywhere" benefit, so it
> > feels a bit awkward for us to create individual distributions for our
> > users (and it's a CI hassle). On the other hand, moving the training
> > run to the user machine means that we might have to expose part of
> > this to the users, or make them wait for a long time in some sort of
> > installation/optimization phase. Neither of these options is ideal,
> > hence why we're hoping for a solution right from the AOT feature.
> >
> > Thanks a lot in advance,
> > Joffrey
>
>