Encapsulating JVM Options

Tue Aug 27 16:49:34 UTC 2024

[CC-ing leyden-dev for this reply.  We are talking about ways to wire
up dependencies on JVM options in globals.hpp and elsewhere to the
AOT cache as a whole, and/or to individually compiled nmethods.]

On 23 Aug 2024, at 10:21, Cesar Soares Lucas wrote:

> Hi, Igor, John.
>
> Some questions:
>
>> From John Rose
>> This should probably turn into a
>> dependency record (of some sort) on the particular AOT nmethod, so that
>> when we load AOT code in production, we can ensure we are not loading
>> a method that “doesn’t fit” into the current JVM.
>
> John, you mentioned above adding the "dependency record" to a particular _nmethod_. I was under the impression that the record would be the same for all HotSpot components (or a single record for the whole CDS archive). Do you think we might need (or want) different records (or parts of it) for different HotSpot components, compilation tasks, or nmethods?

I have been using the term “dependency record” loosely, because such a
thing can be appropriate on the AOT cache as a whole, or on specific “assets”
that are optionally loadable within the AOT cache, and specifically nmethods
or other code blobs.  (As you know, code blobs reside in the “code cache”,
aka the code heap.  An nmethod is a code blob that compiles the entry point
of some Java/JVM method.)  Nmethods in the AOT code cache are by nature
optionally adoptable, since you can always ignore one and use some other
mode for executing the corresponding Java/JVM method.

(Note:  AOT cache is the new name for the CDS archive, as expanded by
the new features we are working on.  In our EA, the AOT cache contains
AOT klasses and methods, AOT method profiles, AOT Java objects like
interned strings, and AOT nmethods.  All of these things have been
generated in the past mainly on a just-in-time basis.)

Some nmethods have dependencies which are more demanding than the
dependencies on the overall AOT cache.  That is, even if the AOT cache
is “cleared for takeoff” (matching flags like narrow oops, etc.), an
individual nmethod within its AOT code cache might be impossible to
adopt as the compilation of its method.  In our premain EA this can
happen because an nmethod X demands that some klass Foo be initialized,
and Foo has not yet been initialized; only when Foo is fully initialized
is the nmethod X (in the code cache) a candidate for connecting to
its Java/JVM method.  Also in this case, there is often another nmethod
X0 (also in the AOT code cache) for the same Java/JVM method, which
does NOT require Foo to be initialized; this nmethod is adopted ASAP
during startup, and only gets replaced when Foo (and every other klass
that X “cares about”) comes to be initialized.
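A minimal sketch of the X/X0 selection described above.  The types and
function names here (RuntimeState, AotNmethod, is_adoptable,
select_nmethod) are hypothetical and are not HotSpot code; the sketch
also assumes that the candidate with more satisfied dependencies is
the more specialized and therefore preferred one.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the JVM's knowledge of initialized klasses.
struct RuntimeState {
  std::set<std::string> initialized_klasses;
};

// Hypothetical model of one cached AOT nmethod for a given method.
struct AotNmethod {
  std::string label;                              // e.g. "X0" or "X"
  std::vector<std::string> requires_initialized;  // klasses it "cares about"
};

// True when every klass the nmethod depends on is already initialized.
bool is_adoptable(const AotNmethod& nm, const RuntimeState& rt) {
  for (const std::string& k : nm.requires_initialized) {
    if (rt.initialized_klasses.count(k) == 0) return false;
  }
  return true;
}

// Among the adoptable candidates, pick the one with the most
// dependencies (assumed here to be the most specialized version).
// Returns nullptr if nothing is adoptable, in which case the method
// would run in some other execution mode.
const AotNmethod* select_nmethod(const std::vector<AotNmethod>& candidates,
                                 const RuntimeState& rt) {
  const AotNmethod* best = nullptr;
  for (const AotNmethod& nm : candidates) {
    if (!is_adoptable(nm, rt)) continue;
    if (best == nullptr ||
        nm.requires_initialized.size() > best->requires_initialized.size()) {
      best = &nm;
    }
  }
  return best;
}
```

With candidates X0 (no requirements) and X (requires Foo), the sketch
selects X0 until Foo is recorded as initialized, and X afterwards.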

I think if we make other kinds of individual dependencies on AOT nmethods,
such as “platform must support AVX-512”, we might also think about generating
two versions, one with the demanding dependency, and one without.  A
dependency, if asserted, creates a sort of “debt” to say what should happen
when the dependency fails (or is not yet true, as in the case of class
initialization).  That “debt” can be satisfied by having two versions of
some individual AOT cache asset, or of the AOT cache as a whole.

There could possibly be a use case for storing dependencies on other
kinds of assets in the AOT Cache.  For example, suppose a klass K is
somehow optionally loadable.  (Old CDS does just-in-time loading, so
this is physically possible in the AOT cache as well.)  We are
emphasizing AOT loading and linking, in the premain phase, but
perhaps klass K must be loaded “the old fashioned way” by processing
its pre-parsed representation.  I can imagine there could be some
dependency-like information attached to klass K that says “do not
adopt this asset unless XYZ is true”, and XYZ is something more
specific than the overall dependencies written into the AOT cache.
But here I am just speculating about something that might be useful
in the future.  Again, the current work focuses on profiting from
the decision to load and link as many classes as possible in the
premain phase, the startup moments before application main is invoked.
And all classes loaded in the premain phase are loaded as a block, so
they either stand together or fall together, based on whatever
dependencies are written into the AOT cache as a whole.
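The all-or-nothing check on the cache as a whole could look roughly
like the following sketch.  It assumes the cache stores a record of
flag names and values captured when it was created; the FlagValues
type and cache_is_usable function are hypothetical, and real flags
would be typed rather than stored as strings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical record: flag name -> value, as text for simplicity.
using FlagValues = std::map<std::string, std::string>;

// The cache stands or falls as a block: it is usable only if every
// flag recorded at creation time has the same value in this JVM.
// Flags that the record does not mention are ignored.
bool cache_is_usable(const FlagValues& recorded, const FlagValues& current) {
  for (const auto& entry : recorded) {
    auto it = current.find(entry.first);
    if (it == current.end() || it->second != entry.second) {
      return false;  // one mismatch rejects the whole cache
    }
  }
  return true;
}
```

A single mismatched entry rejects the whole block, which matches the
"stand together or fall together" behavior described above.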

> I'll for sure have more questions and things to discuss once I start working on this, which I'm planning to do next Monday.
> My first step will be trying to get an overview of all sorts of "flags" that I'll have to add to the record.
>
>
> I'm glad to start working on this!
> Cesar

Thanks!

— John

P.S. I’d like to invite anyone working on “dependencies” (in the broad
sense) to think about how there might be a “dependency API” that
applies in multiple places, and perhaps could be encoded (in multiple
places) in ways that are somehow uniform in our C++ code.  Currently,
dependencies.hpp is only for nmethods, since those are the key
optionally usable items in the JVM.

(You can always discard an nmethod, but you can't discard most other
items in the JVM, until the GC gives you permission.  Something which
is optionally discardable is also optionally usable, and can be
coupled to side conditions that are unpredictable.  So an nmethod can
depend on the fact that some particular method has no overrides, because it
can be transparently discarded when a new class with an override is
suddenly loaded.  In Leyden, a cached AOT nmethod can also depend on
some granular platform feature like AVX-512 that the rest of the AOT
cache ignores.  The common thread is an optionally adoptable and/or
optionally discardable asset of some sort.)

In the AOT cache, there are potentially many kinds of optionally
adoptable assets, since the JVM in many cases can decide to regenerate
an item it needs from scratch, just in time, rather than adopting a
cached AOT asset. I can’t imagine a dependency on an AOT klass asset
(other than a useless one that duplicates a dependency on the whole AOT
cache) but maybe there is one.  (Note: the term “klass” is the
JVM-internal name for a metadata object that represents a loaded
classfile.  When I say “class” I might be referring to many things in
source code or elsewhere, and I don’t always mean “or interface”.  But
“klass” means “metadata for a loaded classfile”.  It also sometimes
means metadata for something else like /new int[0].getClass()/.)

OK, here is one very simple dependency, which probably does not need
an individual call-out: Some optionally adoptable klass asset K might
“bake in” stabilized pointers into the block of premain klasses.  In
that case, it would be a mistake to load and link K before Object,
String, and all the premain classes were loaded and linked.  I suppose
if we (in the far future) ever organized multiple blocks of klass
assets in layers, the layers themselves could have load-sequence
dependencies.  A layer would be like a shared library that requires
resolved references to previously loaded shared libraries; the OS must
load those previous ones first and then load the shared library that
depends on them.  I say “far future” because this idea (klass layers) is
the sort of thing that comes up routinely, and is routinely put on the
list of things to examine later.

Maybe there are somehow concepts in dependencies.hpp that could be
teased out for use in other places.  Or perhaps there is no
cross-application, and logical dependencies look completely different
when applied to the AOT cache as a whole, and again applied to
nmethods, and perhaps later applied to some other optionally adoptable
asset within the AOT cache.  Still, it would be nice to have a common
API, especially if there is parallel growth of complexity at the level
of nmethods and on the whole cache (e.g., of dependencies on flags).  Or
if there is a third location for dependencies (optional klass assets?)
it might be nice to make a new use of an existing API instead of
building out new dependency logic (in the new place) from scratch.
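
To make the idea of a common API a little more concrete, here is one
possible shape, offered only as a sketch.  None of these names
(JvmState, Dependency, DependencySet) exist in dependencies.hpp; they
are invented for illustration, and the two boolean fields stand in for
whatever flags and platform features a real record would cover.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical snapshot of what the running JVM can be asked about.
struct JvmState {
  bool compressed_oops;
  bool has_avx512;
};

// One named condition that must hold before an asset is adopted.
struct Dependency {
  std::string description;
  std::function<bool(const JvmState&)> holds;
};

// The same container could be attached to the whole AOT cache, to an
// individual nmethod, or to some other optionally adoptable asset.
class DependencySet {
 public:
  void add(std::string description,
           std::function<bool(const JvmState&)> holds) {
    assert(holds && "a dependency needs a predicate");
    deps_.push_back(Dependency{std::move(description), std::move(holds)});
  }

  // Returns the first dependency that fails, or nullptr if all hold.
  const Dependency* first_failure(const JvmState& state) const {
    for (const Dependency& d : deps_) {
      if (!d.holds(state)) return &d;
    }
    return nullptr;
  }

  bool all_hold(const JvmState& state) const {
    return first_failure(state) == nullptr;
  }

 private:
  std::vector<Dependency> deps_;
};
```

In this sketch a cache-wide set might hold a compressed-oops check
while a per-nmethod set holds an AVX-512 check; both are queried the
same way, and first_failure gives a place to report why an asset was
not adopted.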