Call for Discussion: New Project: Leyden

Wed Jun 3 05:40:23 UTC 2020

Hi Mike,

Thanks for raising the start-up needs of developers. My
personal feeling is that Leyden will initially focus on production
environments. One developer-experience problem we do need to solve is
the build time of the optimized image. Currently AOT needs lots of time
and memory to compile the classes. CDS is also fairly slow (tens of
seconds for 20,000 classes).

In your edit/compile/run example, how much time are we talking about? 
Seconds or minutes?

I guess one thing that we can consider in Leyden is the ability to
partition the application into a "static" part and a "dynamic" part. If
you only edit code in the dynamic part, you can restart the static
part quickly, and then only the dynamic part has to be loaded slowly.

In my experience working on cached heap objects, I've found that
they are extremely tangled. So we will need a tool similar to jdeps, but
one that can analyze the dependencies of the heap objects so that they
can be properly partitioned. (Or, we can have design patterns that enforce
partitioning -- e.g., cached objects cannot span across modules, or must
be organized in a DAG, etc).
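
A minimal sketch of the DAG rule, assuming each partition (say, a
module) lists the partitions its cached objects point to. The class and
method names are hypothetical; nothing here is an existing JDK or Leyden
API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class PartitionGraph {
    private enum State { VISITING, DONE }

    /**
     * Returns true if the "references" relation between partitions is
     * acyclic. refs maps a partition name to the partitions whose cached
     * objects it points to.
     */
    static boolean isDag(Map<String, Set<String>> refs) {
        Map<String, State> state = new HashMap<>();
        for (String p : refs.keySet()) {
            if (hasCycle(p, refs, state)) {
                return false;
            }
        }
        return true;
    }

    // Depth-first search: reaching a VISITING node again is a back edge.
    private static boolean hasCycle(String p,
                                    Map<String, Set<String>> refs,
                                    Map<String, State> state) {
        State s = state.get(p);
        if (s == State.DONE) {
            return false;
        }
        if (s == State.VISITING) {
            return true;
        }
        state.put(p, State.VISITING);
        for (String q : refs.getOrDefault(p, Set.of())) {
            if (hasCycle(q, refs, state)) {
                return true;
            }
        }
        state.put(p, State.DONE);
        return false;
    }
}
```

A real tool would have to derive the refs map by walking the archived
heap; this only shows the check that would run once that map exists.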

For example, an app's cached objects can be partitioned into a series of 
checkpoints. If you have modified a module, we can restore to the last 
checkpoint that doesn't involve that module, and then resume dynamic 
class loading.
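
As a rough illustration of that lookup, assuming checkpoints are kept in
creation order, each one builds on the previous one, and each records
the set of modules it involves (all names below are made up for the
sketch):

```java
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
final class Checkpoints {
    /**
     * Returns the index of the last checkpoint that can still be restored
     * after the given modules were modified, or -1 if none can.
     * checkpoints.get(i) is the set of modules involved in checkpoint i.
     * Because each checkpoint builds on the previous one, the first
     * checkpoint that involves a modified module invalidates all later ones.
     */
    static int lastUsable(List<Set<String>> checkpoints, Set<String> modified) {
        int last = -1;
        for (int i = 0; i < checkpoints.size(); i++) {
            if (!Collections.disjoint(checkpoints.get(i), modified)) {
                break;
            }
            last = i;
        }
        return last;
    }
}
```

After restoring to the returned checkpoint, everything past it would be
loaded dynamically as usual.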

Of course, a completely orthogonal way to solve this is to use 
edit-and-continue in various IDE offerings :-)

Thanks
- Ioi

On 5/22/20 3:11 AM, Mike Hearn wrote:
> Thanks Ioi, and thanks to Jiangli for the interesting presentation.
>
> Yes, class definition changes are certainly a problem. I spent a bit 
> of time looking at the AppCDS code a few months ago, back when I 
> thought the feature seemed not to be receiving much attention (glad to 
> see now that this was wrong!). I was curious if a customised JDK could 
> be used to make desktop apps start faster.
>
> My investigation went down the path of neither an annotation nor 
> language change but rather a "box api" like this:
>
> private static final Archive<Foo> FOO = new Archive<>(() ->
>     new Foo("some-expensive-to-load-thing.xml")
> );
>
> For example, caching JavaFX scene graphs loaded from FXML. The Archive 
> class is defined to optionally run the lambda, but it may choose not 
> to if a prior object graph can be loaded.
>
> For the classloader situation I was digging into whether AppCDS
> archives could be scoped per module. At the moment OpenJDK seems to be 
> going down the path of shipping-oriented optimisations e.g. jlink, 
> "App"CDS. I wanted to see how re-usable the infrastructure was for use 
> during app development itself, because one area where startup time 
> really matters, and I don't see it being mentioned as part of Leyden, 
> is the edit/compile/run cycle. Obviously with an app-centric system 
> the answer is it can't help because the app is always changing. With a 
> more module-centric view, if the developer uses JPMS enthusiastically 
> e.g. one module per screen of the app, then other modules could be 
> seen as unchanging and have optimisations applied to them, i.e. AppCDS 
> archives dumped during a prior shutdown could be reused.
>
> For this to work you have to know if a module has changed, so you can 
> try to load or discard cached data. In JMOD/JAR form there's a 
> checksumming mechanism added to provide some basic enforcement of 
> "exports … to …" clauses. So I started looking at whether that could 
> be used, but during development modules are often in 
> exploded/directory form. So you'd need to be able to rapidly determine 
> if an entire directory hierarchy had changed, /or/ monitor it using 
> something like Facebook Watchman. This took me down the road of 
> thinking you'd need build system integration, which OpenJDK cannot 
> have in its current scope. My conclusion was that the Java world would 
> benefit from some kind of über-project on top of OpenJDK and other 
> tools that integrated them more tightly together … and that's where it 
> ended.
>
> No specific implementation conclusions, but I hope a more 
> module-centric vs app-centric set of features is explored - one of the 
> big wins of the Java platform is the combination of static type 
> checking /and/ script-like edit/run cycles. It'd be nice if some of 
> the Leyden work finds ways to be applicable before a final
> jpackage-for-ship pass is done.
>
>
>
>
> On Thu, May 21, 2020 at 20:59:09, Ioi Lam <ioi.lam at oracle.com> wrote:
>
>     On 5/19/20 7:07 AM, Mike Hearn wrote:
>
>         Although this may be veering into an unwanted "what will it
>         be" discussion, I'm curious how fundamental it is that an
>         image must be either static or dynamic.
>
>         HotSpot already does some speculative optimisations on the
>         assumption that new code isn't loaded e.g. the class hierarchy
>         analysis. Speculative optimisations do a great job of removing
>         costs of dynamic language features like class redefinition.
>         Leyden seems to be proposed as a new 'mode' of the Java
>         language, but could it also be envisioned as a new set of
>         optimisations that optimistically assume no dynamic code
>         loading? For instance, could a module be marked as "available
>         to dynamically loaded code" and then the points-to analysis /
>         dead code elimination would pin the public API of the module
>         before running, and if it's not marked as such, then DCE runs
>         across the boundaries too?
>
>         A big part of the SubstrateVM startup time win appears to come
>         from a more aggressive version of the AppCDS heap
>         serialisation feature, and a general focus on startup time to
>         the exclusion of other factors (e.g. there's no notion of
>         module layers or module boundary enforcement). To what extent
>         is the closed-world assumption contributing to the
>         footprint/startup time wins vs other spec changes - is that
>         known?
>
>         These questions aren't rhetorical, I don't have any view on
>         the answers. After playing around with native-image and seeing
>         mixed results (e.g. smaller wins than I thought for GUI JavaFX
>         apps), and looking at the work being done on AppCDS, I started
>         to wonder if HotSpot can eventually match the startup time and
>         footprint wins of native-image without the compatibility
>         breaking changes SVM makes to get there, just through doubling
>         down on current optimisation techniques like jlink and
>         speculation.
>
>         Thanks for any insight offered!
>
>     Hi Mike,
>
>     Thanks for mentioning AppCDS. As someone working primarily on
>     AppCDS, I am happy to hear that you have a good impression of its
>     current state and potential. Over the past few years, we have
>     done many start-up optimizations, in AppCDS and many other parts
>     of the JDK (core libs, jlink, hotspot, etc). You can see Claes
>     Redestad's excellent presentation here:
>
>     https://cl4es.github.io/2019/11/20/OpenJDK-Startup-Update.html
>
>     We have cut the HelloWorld start-up time from about 120ms in
>     JDK 9 to less than 40ms in JDK 14. We will be eking out a few
>     more ms in JDK 15.
>
>     A large part of the improvement is due to archiving heap objects
>     into CDS -- mostly related to module information. In JDK 16, I hope
>     to finish JDK-8244778 (Archive complete module graph in CDS). My
>     prototype shows about 7~10ms improvement in start-up time.
>
>     The problem with archiving heap objects, though, is that it currently
>     requires a lot of manual work and knowledge about the code. I am
>     spending a lot of effort to take out 7~10ms for JDK-8244778. To be
>     useful in general applications, we need to make it much easier to
>     use.
>
>     A key assumption of archiving heap objects is -- part of my
>     program will always produce the same set of objects, so I can
>     execute it ahead of time and save the results for later use. Maybe
>     "closed world" sounds too strong, but at least I need to have a
>     "stable world", where I am guaranteed to always see the exact same
>     bytecodes in the exact set of classes, so I can analyze them to
>     see if they will indeed always produce the same results.
>
>     The problem is Java classes are loosely coupled, and the linking
>     happens only at execution time. A reference to "Foo" can give you
>     a completely different class (an app can even define classes
>     dynamically). I hope we can introduce a new "linking" concept in
>     the JLS, so Java classes can be more strongly combined together
>     into a bigger execution unit.
>
>     For example, a "linked" app will already have Foo in the VM's
>     dictionary before any bytecodes are executed. So the app won't be
>     able to substitute a different Foo. I think this will help not
>     only AppCDS but also AOT compilation.
>
>     I don't think speculation can help here -- if we have archived a
>     set of objects that involves Foo, and have used these objects part
>     way through execution, and (assuming we have not yet touched any
>     archived Foo objects) the app defines a new Foo class, we are
>     stuck. It would take tremendous effort to look at the archived
>     objects to patch the new version of the Foo objects, and this
>     won't be possible if you add or remove fields in Foo. We would
>     also need to add a lot of run time checks at every step to
>     guarantee that we indeed load the same classes. All these checks
>     will be counterproductive to start-up.
>
>     Thanks
>     - Ioi
>
>