Classifying statics (the 3 bucket model)

Dan Heidinga heidinga at redhat.com
Wed Jan 25 18:43:33 UTC 2023


On Mon, Jan 23, 2023 at 3:22 PM Brian Goetz <brian.goetz at oracle.com> wrote:

> Do you have any data on the relative distribution of the three buckets?
>

That's a great question but one I don't have a clear answer for.  Here's
what I can provide though.

Our investigations have primarily focused on shifting class initialization
to occur at build time with a goal of shifting all of it there.  With the
qbicc efforts, it's still early but we've only had to add ~15 classes to
the "reinitialize" bucket so far when bringing up a "hello world" style
Quarkus application.  Many of the cases that need to be adapted are fairly
straightforward: they capture things like Runtime::availableProcessors,
environment variables, file descriptors for std{out/err/in}, page sizes,
etc.  Others are more complicated but involve a similar capture of build
time vs runtime environment state, in classes like
sun.security.provider.SeedGenerator.

That implies a fairly small number of classes in the reinitialize bucket
compared to the large number in the build time bucket.  We don't really
have a runtime bucket in qbicc by design.

Graal's JDKInitializationFeature [0] is a great place to look for the
divide between build / runtime / reinit in JDK classes as a baseline with
the Substitutions providing fixups / workarounds.  Most JDK classes are
build time as set by the package, a smaller number of classes (a handful?)
are runtime, and 20ish classes are set to be reinitialized in the Feature.
The caveat is that there is a large set of Substitutions, each created
for various reasons.  Finding the reasons for most substitutions is a
time-consuming effort given the little detail in the GitHub history.  Some
are essential, others are performance or footprint optimizations, others
adapt to the class loading model in Graal.  Again, this is not a perfect
proxy for Leyden.

My impression is most classes have very simple static initializers that are
either empty or define some primitive (or String) constants.  They can be
freely shifted around.  It's the smaller set that creates Threads, caches
environment variables / system defines, or allocates Objects that depend on
a class that does one of these things, that becomes problematic.
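
As a rough illustration of that split, here is a minimal sketch.  All class
and field names are hypothetical and not taken from the JDK or qbicc; they
only show the three shapes described above: a constants-only class, a class
that caches environment state, and a class that depends on the latter.

```java
public class StaticInitShapes {

    // Constants only: nothing here depends on the machine the code runs on,
    // so initialization could happen at build time or at runtime.
    public static final class Limits {
        public static final int MAX_RETRIES = 3;
        public static final String PREFIX = "app.";
    }

    // Caches environment state in its static initializer. If this runs at
    // build time, the values describe the build machine, not the target.
    public static final class HostInfo {
        public static final int CPUS = Runtime.getRuntime().availableProcessors();
        public static final String HOME = System.getenv("HOME"); // null if unset
    }

    // Reads nothing from the environment itself, but allocates an object
    // sized from HostInfo, so it inherits HostInfo's problem.
    public static final class WorkerPlan {
        public static final int[] SLOTS = new int[HostInfo.CPUS];
    }
}
```

In this sketch Limits can be shifted freely, while HostInfo and, through
it, WorkerPlan are the kind of classes that would need special handling.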

--Dan

[0]
https://github.com/oracle/graal/blob/master/substratevm/src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/jdk/JDKInitializationFeature.java



>
>
> On 1/23/2023 1:50 PM, Dan Heidinga wrote:
>
> Picking up on the Class Init discussion from before the holidays, here's
> the next piece looking at how static fields can be characterized:
>
> The next major step in our investigation was to try to replace the GraalVM
> Substitutions with the IODH pattern.  Our assumption at the time was that
> we could group class initializations into two buckets - one for build time
> init and one for runtime init.
>
> Unfortunately, as we worked on adapting the Substitutions, we found that a
> third bucket needed to be introduced.  This is the "reinitialize bucket"
> for classes that needed to be initialized at build time and then
> reinitialized at runtime as they hold values that are specific to the
> current system.
>
> Classes also unfortunately don't neatly stay in one bucket.  Many of the
> classes in the "runtime init" bucket can be dragged into the "reinitialize"
> bucket if their initialization is triggered by a use in the "build time
> init" bucket.
>
> The characterization of a given class is often application-specific - in
> *this* app, class Foo can be runtime initialized but in *that* app Foo
> needs to be reinitialized.
>
> Here's an example that shows how the runtime bucket can easily be moved to
> the reinitialization bucket:
>
> java.nio.Bits has fields, like PAGE_SIZE, which can be converted to use
> the IODH pattern provided the IODH helper class is only initialized at
> runtime.  If it is initialized earlier (ie: build time), then it must be in
> the reinitialization bucket.  Any value that is calculated from PAGE_SIZE
> may also need to be reinitialized.
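
A minimal sketch of the IODH (initialization-on-demand holder) shape
described above.  BitsLike, PageSizeHolder, queryPageSize and holderInits
are hypothetical names for illustration; this is not the actual
java.nio.Bits code, and the fixed 4096 is a placeholder for a real page
size query.

```java
public class BitsLike {

    // Demo-only counter so the moment of holder initialization is observable.
    static int holderInits = 0;

    // Placeholder for a real page size query (assumption: fixed value).
    static int queryPageSize() {
        return 4096;
    }

    // The holder is initialized on the first read of PAGE_SIZE, not when
    // BitsLike itself is initialized.
    private static final class PageSizeHolder {
        static final int PAGE_SIZE;
        static {
            holderInits++;
            PAGE_SIZE = queryPageSize();
        }
    }

    // First call triggers PageSizeHolder's static initializer.
    static int pageSize() {
        return PageSizeHolder.PAGE_SIZE;
    }
}
```

Any caller of pageSize() during the build phase would initialize the holder
early, which is the "dragged into the reinitialization bucket" case.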
>
> Alternatively, some kind of constraint that prevents the app from running
> could be registered that ensures the page size is consistent between build
> and runtime envs.
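
One possible shape for such a constraint, as a sketch only.
PageSizeConstraint is a hypothetical class; no such API exists in the JDK,
GraalVM or qbicc as far as this example assumes, and how the build time
value would be recorded is left open.

```java
public class PageSizeConstraint {

    // Page size observed when the build time initialization ran.
    private final int buildTimePageSize;

    public PageSizeConstraint(int buildTimePageSize) {
        this.buildTimePageSize = buildTimePageSize;
    }

    // Called at startup with the page size of the runtime environment;
    // refuses to continue if it differs from the recorded build time value.
    public void check(int runtimePageSize) {
        if (runtimePageSize != buildTimePageSize) {
            throw new IllegalStateException(
                "page size mismatch: built with " + buildTimePageSize
                + ", running with " + runtimePageSize);
        }
    }
}
```

The trade-off is that the app fails fast on a mismatched machine instead of
reinitializing PAGE_SIZE and everything derived from it.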
>
> Bits also has fields like COUNT which represents the number of reserved
> direct memory allocations.  This field (and related fields) could be
> grouped into an IODH helper class but has the same characteristics as
> PAGE_SIZE in that it belongs in the "runtime" bucket but may be dragged
> into the "reinitialization" bucket by build time use. (ie: a library used
> at build time that temporarily creates a DirectByteBuffer to speed up
> processing of some json file and then discards the DBB.  Initialization of
> Bits' IODH helper will still have occurred).
>
> Similarly, we may need constraints (or raise errors?) if COUNT is non-zero
> when transitioning between phases (build -> runtime).
>
> See appendix B [0] for more examples.
>
> The use of the IODH pattern to group static fields provides two benefits:
> 1) It groups fields having similar lifetimes together and allows them to
> benefit from delayed initialization (ie: init groups rather than all fields)
>
> 2) It makes it easier to reset the fields as smaller groupings are being
> reset (smaller blast radius).
>
> The IODH pattern doesn't help to define the constraints under which a
> class may be reused as we still need developer intent to be expressed to
> know what the constraints are.
>
> It also doesn't help to track the uses of the fields in the IODH helper
> class as once they've been read and used, the dependency information is
> lost.
>
> It's still a useful tool, but doesn't solve all use cases.
>
> --Dan
>
>
> [0] http://cr.openjdk.java.net/~heidinga/leyden/ClassInitPlan.pdf
>
>
>


More information about the leyden-dev mailing list