Spring Boot BackgroundPreinitializer

Mon Oct 7 22:29:12 UTC 2024

On 4 Oct 2024, at 3:29, Sebastien Deleuze wrote:

> I like the idea of a property, but it would be really nice if it could be
> set automatically by the AOT support without requiring an additional action
> from the users because they typically don't add such property when
> documented and we just want to detect the actual AOT caching mode in a
> reliable and consistent way.
>
> What about having an aot.mode property reflecting the actual mode:
>  - "record" when "-XX:AOTMode=record" is specified or a potential future
> "-XX:AOTMode=autocreate" (replacement of "-XX:CacheDataStore=app.cds") is
> specified and no cache files exist
>  - "on" when "-XX:AOTMode=on" is specified or a potential future
> "-XX:AOTMode=autocreate" (replacement of "-XX:CacheDataStore=app.cds") is
> specified and cache files exist
>  - "off" when "-XX:AOTMode=off" is specified

That makes sense, partly.  There is a risk with code that runs
only during training, or only during production.  In the first
case you risk creating resources which are useless during
production (since you trained non-production code), and in
the second you lack resources that would probably be useful
(since you didn’t train production code).

I think one way to reduce those risks is to use a real Java
API instead of a string-based property.  My thought here is
that it will always be useful to survey who calls that API.
You can’t do this when the API is hidden under the generic
property access API.  For example, you can’t put a breakpoint
on a query to a particular property, and so on.  Also,
you can have javadoc on an API, not on a property.
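
A minimal sketch of what such an API might look like, mapping
the modes from the quoted proposal onto an enum.  Everything
here is hypothetical: the names AotModeSketch, Mode, resolve
and isTraining are made up for illustration and are not an
existing JDK or Spring API, and treating a missing or unknown
flag value as OFF is an assumption of the sketch.

```java
import java.util.Locale;

/** Hypothetical sketch only; not an existing JDK or Spring API. */
final class AotModeSketch {

    /** The three modes named in the quoted proposal. */
    enum Mode { RECORD, ON, OFF }

    private AotModeSketch() {}

    /**
     * Maps an -XX:AOTMode value to a Mode.
     *
     * @param aotModeFlag the flag value, e.g. "record", "on", "off",
     *                    or the potential future "autocreate"
     * @param cacheExists whether cache files already exist; only
     *                    consulted for "autocreate"
     * @return the resolved mode; missing or unknown values yield OFF
     *         (an assumption of this sketch)
     */
    static Mode resolve(String aotModeFlag, boolean cacheExists) {
        String flag = aotModeFlag == null
                ? ""
                : aotModeFlag.toLowerCase(Locale.ROOT);
        switch (flag) {
            case "record":
                return Mode.RECORD;
            case "on":
                return Mode.ON;
            case "autocreate":
                // No cache yet means this run creates one.
                return cacheExists ? Mode.ON : Mode.RECORD;
            default:
                return Mode.OFF;
        }
    }

    /** A single, documented query point that callers can be surveyed on. */
    static boolean isTraining(Mode mode) {
        return mode == Mode.RECORD;
    }

    public static void main(String[] args) {
        Mode mode = resolve("autocreate", false);
        System.out.println(mode + " training=" + isTraining(mode));
    }
}
```

Because every caller goes through isTraining, a breakpoint or
a "find usages" search on that one method shows who branches
on the mode, which a generic property lookup cannot do.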

The principle underlying the “unreasonable effectiveness”
of training runs is that a training run is a faithful
predictor of production behavior.  Taking different
paths in training and production is an experts-only
move.  I know you know this, but we will have to make
this contract very, very clear to everyone.

> I can see multiple use cases for that with Spring (reporting, disabling
> BackgroundPreinitializer, refining Spring behavior during the training run,
> etc.) that will likely be useful for others as well.

Yes.  Proper use of the flag might let you cause training
runs to be MORE similar to production runs, as long as you
are successful in manually forcing the execution of truly
representative code paths in your special training logic.

There are many ways to go wrong also; eventually we want
tools for auditing the success of an AOT cache.  For example,
it should be possible to detect when assets in an AOT
cache have gone unused in a production run.  When that
happens, something is probably wrong, especially if there
are many assets (or they are by some measure “big” in
the AOT cache).  A report of such assets would provide
good feedback into the engineering cycle, and might
even be useful to an automatic policy, which would
note events during a (future) training run that match
unused assets in a (past) production run; the policy
would refrain from taking those events into account
when choosing assets for the AOT cache.
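
The audit and the policy described above can be sketched as
two set operations.  This is only an illustration under strong
assumptions: assets and training events are identified by
plain string names, and the class and method names below are
hypothetical.  No such tool or report format exists today.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical sketch only; asset names are plain strings here. */
final class AotCacheAuditSketch {

    private AotCacheAuditSketch() {}

    /** Assets stored in the cache that a production run never touched. */
    static Set<String> unusedAssets(Set<String> cached,
                                    Set<String> usedInProduction) {
        Set<String> unused = new LinkedHashSet<>(cached);
        unused.removeAll(usedInProduction);
        return unused;
    }

    /**
     * Policy step: drop events from a later training run that match
     * assets reported as unused by an earlier production run.
     */
    static List<String> filterTrainingEvents(List<String> trainingEvents,
                                             Set<String> previouslyUnused) {
        return trainingEvents.stream()
                .filter(event -> !previouslyUnused.contains(event))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<String> unused = unusedAssets(
                Set.of("A", "B", "C"), Set.of("A"));
        List<String> kept = filterTrainingEvents(
                List.of("A", "B", "D"), unused);
        System.out.println("unused=" + unused + " kept=" + kept);
    }
}
```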

There should be a tracking RFE filed in bugs.openjdk.org
for the above idea, so here it is:

https://bugs.openjdk.org/browse/JDK-8341676

(…There’s a lot of work here, and it will be an
enjoyable adventure building out all the pieces and
parts and tools and tricks; it will take years…)

> Side note: we really hope that a "-XX:CacheDataStore=app.cds" successor
> will be introduced because it really helps to not have to change the java
> command line between training and deployment runs for some deployment
> scenarios.

We should have a tracking RFE for this; I think Ioi
might have one.

— John