Call for Discussion: New Project: Leyden

Thu Jun 4 09:17:38 UTC 2020

I would like to throw in a general point here based on my experience of
working with Graal and the SubstrateVM for some time, perhaps at the
risk of hijacking this current direction of travel (for which, apologies).

In my experience the notion of a 'static' image which provides the sort
of footprint and startup improvements that Graal has shown to be
possible while still retaining the dynamic load/link capabilities that
the JVM provides (i.e. discarding the closed world assumption) appears
to reside in the same ontological category as the King of France or the
unicorn horn. Indeed, I suspect it may even be a square, round cupola.

Graal derives almost all of its space/time reductions relative to the
dynamic JVM from being able to discard great swathes of class metadata
and class initialization. A closed world assumption is critical to the
validity and comprehensive applicability of both those dietary regimes.

Under that assumption almost all of the details of the loaded class base
are simply not needed at runtime. All decisions which depend on what
classes are present and how they are structured are baked in at compile
time. A great deal of initialization code and resulting static data can
be compressed out of VM startup by performing the necessary init at
build-time and embedding only the (runtime-) referenced static field
values and objects in a pre-populated heap, some of that data read-only,
some of it read-write. N.B. that's just as much an option for app data
as it is for JDK data. With a dynamic runtime it is much harder to
determine 1) whether or not a static field might be referenced and
(worse) 2) whether its value might get computed differently from the
value derivable at build-time, due to intermediate loading of dynamic code.
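
To make point 2) concrete, here is a minimal sketch. All the class
names are hypothetical and nothing here is taken from Graal or the JDK;
it only illustrates how a static field's value can depend on what code
happens to run before the class is first initialized.

```java
import java.util.ArrayList;
import java.util.List;

final class BuildTimeInitDemo {
    public static void main(String[] args) {
        // A build-time initializer would run Config's static init here,
        // before any registration, and embed the value 0 in the image.

        // In an open world, code loaded at runtime can register first...
        PluginRegistry.register("loaded-at-runtime");

        // ...so lazy initialization on first use sees a different value.
        System.out.println(Config.PLUGIN_COUNT); // prints 1
    }
}

// Hypothetical registry that dynamically loaded code may add to.
final class PluginRegistry {
    private static final List<String> PLUGINS = new ArrayList<>();

    static void register(String name) { PLUGINS.add(name); }

    static int count() { return PLUGINS.size(); }
}

// Hypothetical application class whose static field is computed in its
// initializer. The value depends on when that initializer runs.
final class Config {
    static final int PLUGIN_COUNT = PluginRegistry.count();
}
```

Under a closed world assumption nothing else can register, so the
build-time value is also the runtime value and can safely be embedded.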

The Graal 'static' image compiler also contributes significantly to
improvement of image code size and speed when compared to the OpenJDK
AOT compiler (also, at present, the Graal compiler in a very different
configuration). The former can almost always employ direct linkage, the
latter has to use copious amounts of indirection to allow for later
update to the class base.
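
As a rough analogy only, written in Java rather than in terms of
generated machine code, the difference looks something like the sketch
below. The names are hypothetical; the 'slot' field merely stands in
for whatever patchable stub or table entry an AOT compiler would emit.

```java
import java.util.function.IntUnaryOperator;

final class LinkageDemo {
    static int twice(int x) { return 2 * x; }

    // Direct linkage: the call target is fixed when the code is compiled.
    static int callDirect(int x) { return twice(x); }

    // Indirect linkage: the call goes through a slot that can be
    // re-pointed later, at the cost of an extra load on every call.
    static volatile IntUnaryOperator slot = LinkageDemo::twice;

    static int callIndirect(int x) { return slot.applyAsInt(x); }

    public static void main(String[] args) {
        System.out.println(callDirect(21));   // prints 42
        System.out.println(callIndirect(21)); // prints 42

        // Simulate a later update to the class base.
        slot = x -> 3 * x;
        System.out.println(callDirect(21));   // still prints 42
        System.out.println(callIndirect(21)); // prints 63
    }
}
```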

The Graal 'static' image compiler obviously cannot rely on runtime
feedback to improve code performance but neither can an OpenJDK AOT
compiler when generating the initial AOT code. However, the former
compiler can profit from the closed world assumption to perform a few
AOT optimizations that can only be performed speculatively on OpenJDK if
dynamic loading is to be allowed. Not all such speculative opportunities
are currently taken, mainly because not all application classes are
normally included in the compiled AppCDS suite.
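
One example of the kind of optimization I have in mind is binding a
virtual call to its only known target. The sketch below is hand-written
Java with hypothetical names, not compiler output. Circle plays the
part of a class that only shows up later via dynamic loading.

```java
final class DevirtualizationDemo {
    // Closed world: Square is known to be the only Shape, so the call
    // can be bound and inlined with no check at all. This would fail
    // with a ClassCastException if another Shape could ever appear.
    static double areaClosedWorld(Shape s) {
        Square sq = (Square) s;
        return sq.side * sq.side;
    }

    // Open world: the same fast path has to be guarded, with a fallback
    // to virtual dispatch (a real VM would deoptimize instead).
    static double areaOpenWorld(Shape s) {
        if (s.getClass() == Square.class) {
            Square sq = (Square) s;
            return sq.side * sq.side;
        }
        return s.area();
    }

    public static void main(String[] args) {
        System.out.println(areaClosedWorld(new Square(3.0))); // prints 9.0
        System.out.println(areaOpenWorld(new Square(2.0)));   // prints 4.0
        System.out.println(areaOpenWorld(new Circle(1.0)));   // slow path
    }
}

interface Shape { double area(); }

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

// Stands in for a class that arrives after compilation.
final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}
```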

Of course, if OpenJDK AOT implemented a more comprehensive reachability
analysis then it might be able to embrace many of the same speculative
optimization opportunities over a fuller, known application class
suite. However, it would still need to employ more metadata and a more
indirect linkage model in order to allow for dynamic loading.

On 04/06/2020 07:10, Ioi Lam wrote:
> 
> 
> On 6/3/20 7:22 AM, Mike Hearn wrote:
>> Thanks.
>>
>> I didn't quite follow the question about seconds or minutes. When
>> iterating on code that's easily unit tested the JVM manages to provide
>> a turnaround time of a few seconds, at least it does for me when
>> incremental compilation is available. It gets more painful with app
>> servers of course, hence the popularity of JRebel.
>>
>> Yes partitioning the app sounds like a great balance. It's the only
>> way I can see for AppCDS/AOT type optimisations to be applicable
>> during development, where otherwise JITC will continue to dominate
>> despite that developers probably start their app more than anyone
>> else! Usually only a tiny part of the app is changing and all the
>> libraries are static, so build systems could figure out a semi-static
>> set of modules pretty easily. Just assume anything pulled from a
>> repository is static and eligible for optimisations.
>>
>> Heap archiving can be seen as a serialisation scheme, just one with an
>> unusually unstable data format. I wonder if it helps to look at it
>> like that - perhaps a lot of the logic can be moved into Java? For
>> instance, do the AppCDS archives /have/ to be generated by C++ looking
>> for magic annotations/iterating the heap, or could some privileged
>> Java code just walk the graph, doing whatever checks are considered
>> useful and building a simple Object[] which is then passed into the VM
>> for the GC to format a heap region and return it as a byte buffer?
>> Then build system authors, app containers etc can take over management
>> of AppCDS files, like deciding when to [re]generate them and for which
>> modules it's worth doing.
> 
> The current way for AppCDS for archiving heap objects is certainly too
> restricted. I think we'll probably allow a subset of classes to be
> initialized ahead of time, and the static fields of those classes will
> be archived into the heap. There are challenges -- picking the
> correct/optimal subset may require substantial work; we also need to
> make sure we don't archive the wrong things (ephemeral resources,
> environment dependencies, etc).
> 
> Thanks
> - Ioi
> 
>>
>> I've tried using the DCEVM in the past, which has better edit and
>> continue. It sounds good but I found that I very rarely edited /only/
>> post-initialisation code and so editing classes in place just ended up
>> being kind of useless, as there was no way to force re-construction
>> and no obvious semantics for how it could work. Especially problematic
>> for GUI apps, which is the kind of app where you really benefit from
>> fast iteration times but most of the iteration is in the code that
>> constructs the GUI.
>>
>> I looked a bit at how to resolve that. I toyed a bit with adding a
>> notification queue to classloaders, that lets app code learn when a
>> class has been redefined, and some APIs on top that would
>> automatically track dependencies and trigger re-construction of object
>> graphs at various user-defined points. Because DCEVM uses a rather
>> large and complex patch I ended up looking more into whether it'd be
>> possible to do this with just classloaders, perhaps in combination
>> with a much smaller JVM patch, as if you're rebuilding object graphs
>> anyway then most of the magic class redefinition does isn't actually
>> useful.
>>
>>
> 

-- 
regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill