Project Leyden: Beginnings

Tue May 31 09:32:59 UTC 2022

Hi Anton,

On 30/05/2022 18:24, Anton Kozlov wrote:
> . . .  But why not a closed world start
> image that is restored into an open world Java instance, to achieve a
> great start-up, warm-up, and footprint of Leyden, without sacrificing
> the dynamic Java features in the runtime.
I'm not sure what you mean by 'a closed world start image' but the
picture you are painting here sounds a little bit too rosy. There are
some complexities here that are perhaps worth articulating, even if I am
misreading your comment and you are already aware of them.

One has to bear in mind that a closed world as defined by full program 
analysis (possibly supplemented with user directives to embrace things 
like reflective targets) can exclude everything that is not marked as 
reachable during the analysis from its generated image, maybe whole 
classes in some cases, or maybe just static/instance fields and methods 
of some classes.

By contrast, a bootstrap image that needs to allow for operation of code
loaded in an open world must still include exported, public (or, in
some cases, protected) elements of the code base even if they are not
reachable during the analysis. That is needed because yet-to-be-loaded
code may end up referencing those public (or protected) elements.
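
As a rough illustration of that point, here is a minimal Java sketch.
All class and method names in it are hypothetical, and reflection is
used only to stand in for code that gets loaded after the image was
built. Nothing present at build time calls unused(), so a closed world
analysis could drop it, yet an open world runtime must still be able to
resolve it.

```java
import java.lang.reflect.Method;

public class OpenWorldLookup {

    /** Hypothetical library class that ships in the initial image. */
    public static class Library {
        // Called directly from main(), so a static analysis sees it as reachable.
        public static String used() {
            return "used";
        }

        // Never called by code present at build time. A closed world
        // analysis could exclude it; an open world image has to keep it.
        public static String unused() {
            return "unused";
        }
    }

    /** Stands in for later-loaded code that picks its target at run time. */
    public static String callByName(String methodName) {
        try {
            Method m = Library.class.getMethod(methodName);
            return (String) m.invoke(null); // static method, so no receiver
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot call " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Library.used());
        // The method name is only known at run time.
        String name = args.length > 0 ? args[0] : "unused";
        System.out.println(callByName(name));
    }
}
```

On a regular JVM the lookup succeeds for either name. In an image built
under a closed world assumption, the second call could only work if
unused() had been registered as a reflective target up front.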

This is especially relevant to build time initialization as the closed
world assumption means that

   great swathes of public static state and its initialization code can
be squeezed out of the final image

   great swathes of the static state that is included can be treated as
effectively final constants

   multiple occurrences of those final constants which represent
independent, effectively static fields can be coalesced to a single,
read-only datum in the initial heap

GraalVM performs exactly those optimizations as part of its build time 
initialization and this provides a significant reduction in (initial) 
heap size and application startup.

Note that it is not just JDK data that this applies to either. The 
closed world analysis enables this optimization to be performed across 
the whole application code base.
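
To make the three cases listed above a little more concrete, here is a
small Java sketch. It is purely illustrative: every name in it is made
up, and it does not claim to show how GraalVM actually classifies or
lays out any particular field.

```java
import java.util.Arrays;

public class StaticStateSketch {

    /** Hypothetical application class holding static state. */
    public static class Settings {
        // Case 1: public static state that no code present at build time
        // reads. Under a closed world assumption both the field and the
        // loop that fills it could be left out of the image.
        public static final int[] DEBUG_TABLE = buildTable(1024);

        // Case 2: not declared final, but only ever assigned in the
        // static initializer, so it could be treated as a constant.
        public static int retries;
        static {
            retries = 3;
        }

        private static int[] buildTable(int n) {
            int[] table = new int[n];
            for (int i = 0; i < n; i++) {
                table[i] = i * i;
            }
            return table;
        }
    }

    // Case 3: two independent fields that hold equal, never-modified data.
    public static class ReaderDefaults {
        public static final int[] EMPTY = new int[0];
    }

    public static class WriterDefaults {
        public static final int[] EMPTY = new int[0];
    }

    public static void main(String[] args) {
        System.out.println("retries = " + Settings.retries);
        System.out.println("same contents: "
                + Arrays.equals(ReaderDefaults.EMPTY, WriterDefaults.EMPTY));
        System.out.println("same object: "
                + (ReaderDefaults.EMPTY == WriterDefaults.EMPTY));
    }
}
```

On a regular JVM the two EMPTY arrays are distinct objects with equal
contents. Coalescing them into one read-only datum is presumably only
safe when the analysis can show that no code relies on their identity,
which is exactly the kind of fact an open world cannot guarantee.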

Now that does not mean an initial image for use in an open world cannot 
perform the optimization described above on a partial basis. It simply 
has to limit itself to parts of the code base that are not accessible 
to/extensible by later loaded code.

So, for example, an analysis of an individual JDK module might be able 
to identify some non-exported subset of the module's code elements which 
effectively constitute a closed world and optimize an initial image 
accordingly. A global analysis of multiple modules that considered all 
possible client uses of exported code might be able to take these 
optimizations further.
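
As a starting point for seeing how large such a non-exported subset
might be, the following sketch uses the standard java.lang.Module API
(Java 9 or later) to list the packages of a module that are not
exported unconditionally. The class and method names are my own, and
this is only a first approximation of what a real analysis would need:
it ignores qualified exports, opened packages, and command line
overrides such as --add-exports.

```java
import java.util.Set;
import java.util.TreeSet;

public class ExportedPackages {

    /** Returns the packages of the module that are not exported to everyone. */
    public static Set<String> nonExported(Module module) {
        Set<String> result = new TreeSet<>();
        for (String pkg : module.getPackages()) {
            // isExported(String) is true only for unconditional exports.
            if (!module.isExported(pkg)) {
                result.add(pkg);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Module base = Object.class.getModule(); // the java.base module
        Set<String> hidden = nonExported(base);
        System.out.println(base.getName() + ": " + hidden.size() + " of "
                + base.getPackages().size()
                + " packages are not exported unconditionally");
    }
}
```

Packages in the resulting set are candidates for closed world treatment
within the module. Public types in exported packages such as java.lang
have to be assumed reachable by later-loaded code.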

Clearly, this approach does not immediately stretch to 'whole program', 
global optimization, as performed by Graal. It would only do so for a 
fully modularized application. However, it may still give significant 
improvements to JDK runtime startup and footprint -- hence benefit all 
apps running on the JVM.

Note also that I said 'immediately' because I am hopeful that this
approach might be combined with some change to the programming model in
order to allow a move towards a broader analysis that includes library
and application code, essentially moving nearer toward the closed world
analysis and attendant optimizations employed by GraalVM.

regards,

Andrew Dinn
-----------