Proposal for improving CDS archive creation

Tue Jul 10 17:16:10 UTC 2018

I have a proposal for improving the process of creating the CDS archive(s),
so we can make CDS easier to use and support more use cases:

    - better support for custom loaders
    - remove explicit training run
    - support 2 levels of shared archives

I think the proposal is relatively straightforward to implement, as we already
have most of the required infrastructure:

    + the ability to use Java class loaders at archive creation time
    + the ability to relocate MetaspaceObjects

Parts of this proposal will also simplify the CDS code and make it more
maintainable.

Current process of creating the base archive - [C]
==================================================

Currently each JVM process can map at most one CDS archive. Let's call this
the "base archive". It is created by [ref1]:

  C1. Reserve a region R of 3GB at 0x800000000.
  C2. Load all classes specified in the class list. All data for these classes
      lives outside of R.
      (E.g., the Klass objects are loaded into tmp_class_space, which is
       adjacent to R).
  C3. Copy the metadata of all archivable classes (e.g., excluding generated
      Lambda classes) into R. At this step, R is divided into several
      sections (RO, RW, etc).

   //  +-- SharedBaseAddress   (default = 0x800000000)
   //  +-- _narrow_klass._base
   //  |
   //  |                               +-tmp_class_space.base
   //  v                               V
   //  +----+----+----+----+----+-....-+-------------------+
   //  |<-           R               ->|
   //  | MC | RW | RO | MD | OD |unused| tmp_class_space   |
   //  +----+----+----+----+----+------+-------------------+
   //  |<--  3GB        -------------->|
   //  |<-- UnscaledClassSpaceMax = 4GB ------------------>|
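The addresses in the diagram above can be worked out with a few lines of
arithmetic. This is only a sketch: the constants come from the diagram, the
function name is made up for illustration, and the assumption that
tmp_class_space fills the rest of the 4GB range is read off the picture
rather than taken from the HotSpot sources.

```python
GB = 1 << 30

SHARED_BASE_ADDRESS = 0x800000000      # default SharedBaseAddress from the diagram
R_SIZE = 3 * GB                        # region R reserved in step C1
UNSCALED_CLASS_SPACE_MAX = 4 * GB      # UnscaledClassSpaceMax from the diagram

def dump_time_layout(base=SHARED_BASE_ADDRESS):
    """Return (start, end) address pairs for R and tmp_class_space."""
    r_start, r_end = base, base + R_SIZE
    # Assumption: tmp_class_space is adjacent to R and fills the rest of
    # the 4GB range, as the diagram suggests.
    tmp_start, tmp_end = r_end, base + UNSCALED_CLASS_SPACE_MAX
    return {"R": (r_start, r_end), "tmp_class_space": (tmp_start, tmp_end)}

layout = dump_time_layout()
for name, (start, end) in layout.items():
    print(f"{name:16s} {start:#x} .. {end:#x} ({(end - start) // GB}GB)")
```

Under these assumptions tmp_class_space.base ends up at 0x8c0000000 and the
whole range ends at 0x900000000.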

New process for creating the base archive - [N]
===============================================

Currently we have a lot of "if (DumpSharedSpaces)" code for special-case
handling of the above scheme. We can improve it by:

  N1. Remove all code for special memory layout initialization for
      -Xshare:dump. As a result, we will reserve a region R of 1GB at
      0x800000000, which is used by Klass objects (this is the same as if
      -Xshare:off were specified).
  N2. Load all classes in the class list.
  N3. Now R contains the Klass objects of all loaded classes.
      Allocate a temporary space T, and copy all contents of R into T.
  N4. Now R is empty. Copy the metadata of all archivable classes into R.
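
Steps N3 and N4 can be modelled with a toy example. This is a rough sketch
only, not how the VM would do it: the region is a plain list of class names,
rebuild_region and the "$$Lambda$" name check are hypothetical stand-ins, and
the relocation table just records where each archived class moved to.

```python
def rebuild_region(region, is_archivable):
    """Model steps N3 and N4 on a region represented as a list of class names.

    Returns (new_region, temp_space, relocation), where relocation maps each
    archived class's slot in T to its new slot in R.
    """
    # N3: copy everything out of R into a temporary space T.
    temp_space = list(region)
    # N4: R is now considered empty; copy back only the archivable classes.
    new_region = []
    relocation = {}
    for old_slot, klass in enumerate(temp_space):
        if is_archivable(klass):
            relocation[old_slot] = len(new_region)
            new_region.append(klass)
    return new_region, temp_space, relocation

loaded = ["java/lang/Object", "App", "App$$Lambda$1", "java/lang/String"]
# Hypothetical filter: treat generated Lambda classes as non-archivable.
archived, temp, moved = rebuild_region(loaded, lambda k: "$$Lambda$" not in k)
print(archived)
print(moved)
```

The point of the sketch is that R is rebuilt in place from T, so nothing
about the initial memory layout has to be special for dumping.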

Dump-as-you-go for the base archive - [G]
=========================================

Note that the [N] scheme will work even if you're running an app with
-Xshare:off. At some point (e.g., when the VM is about to exit), you
can:

  G1. Enter a safepoint.
  G2. Go to step [N3].

The benefit of [G] is you don't need a separate run to dump the archive, and
there's no need to use the class list. Instead, we can have an option like:

    java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App

If foo.jsa is not available, we run in [G] mode. At VM exit, we dump into
foo.jsa.

This way, we don't need to have an explicit training run with
-XX:DumpLoadedClassList. Instead, the first run without foo.jsa effectively
serves as the training run.

This also makes it easy to support the classes from custom loaders. There's no
need for special tooling to convert -Xlog:class+load=debug output into a
classlist. [ref2]
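
The decision made by the proposed -Xshare:autocreate option could look
roughly like the following. This is a sketch under assumptions: choose_mode
and run are made-up names, the mode strings are illustrative, and writing a
text file stands in for the real dump performed in steps G1/G2.

```python
import os
import tempfile

def choose_mode(archive_path):
    """Return the mode a hypothetical -Xshare:autocreate run would pick."""
    if os.path.exists(archive_path):
        return "use-archive"      # foo.jsa exists: map it as usual
    return "dump-at-exit"         # foo.jsa missing: run in [G] mode

def run(archive_path, loaded_classes):
    mode = choose_mode(archive_path)
    if mode == "dump-at-exit":
        # Stand-in for steps G1/G2: write out what was loaded during the run.
        with open(archive_path, "w") as f:
            f.write("\n".join(loaded_classes))
    return mode

with tempfile.TemporaryDirectory() as d:
    jsa = os.path.join(d, "foo.jsa")
    first = run(jsa, ["App", "java/lang/Object"])
    second = run(jsa, ["App", "java/lang/Object"])
    print(first, second)
```

The first run finds no archive and dumps at exit; the second run finds the
archive and uses it, with the same command line both times.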

Dumping for second-level archive - [S]
======================================

  S1. Load the base archive
  S2. Run the app as normal
  S3. All Klass objects of the dynamically loaded classes will be loaded in
      the region R, which immediately follows the end of the base archive.

   //  +-- SharedBaseAddress
   //  |                          +--- dynamically loaded Klasses
   //  |                          |    start from here.
   //  v                          v
   //  +--------------------------+---------...-----------------|
   //  | base archive             | region R                    |
   //  +--------------------------+---------...-----------------|
   //  |<- size of base archive ->|
   //  |<--                        1GB                       -->|

  S4. At some point (possibly when the VM is about to exit) we start
      dumping the second-level archive.
  S5. Enter a safepoint.
  S6. Now R contains the Klass objects of all dynamically loaded classes.
      Allocate a temporary space T, and copy all contents of R into T.
  S7. Now R is empty. Copy the metadata of all archivable, dynamically loaded
      classes into R.
  S8. Create a new shared_dictionary (and shared_symbol_table) that contains
      all the Klasses (Symbols) from both the base and second-level archives.
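
Steps S6 to S8 can be modelled in the same toy fashion. Again this is only a
sketch: dump_second_level is a hypothetical name, dictionaries map a class
name to an (archive, slot) pair purely for illustration, and the "$$Lambda$"
check stands in for whatever test decides that a class is archivable.

```python
def dump_second_level(base_dictionary, region_r, is_archivable):
    """Model steps S6 to S8 with dictionaries mapping class name -> location."""
    # S6: copy the dynamically loaded classes out of R into a temporary space T.
    temp_space = list(region_r)
    # S7: R is now considered empty; copy back only the archivable classes.
    second_level = [k for k in temp_space if is_archivable(k)]
    # S8: build one dictionary covering both archives.
    combined = dict(base_dictionary)
    for slot, klass in enumerate(second_level):
        combined[klass] = ("second-level", slot)
    return second_level, combined

base = {"java/lang/Object": ("base", 0), "java/lang/String": ("base", 1)}
dynamic = ["App", "App$$Lambda$1", "AppHelper"]
second, combined = dump_second_level(base, dynamic, lambda k: "$$Lambda$" not in k)
print(second)
print(sorted(combined))
```

The combined dictionary lets a lookup find a class without caring which of
the two archives it lives in.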

References
==========

[ref1] Current initialization of memory space layout during -Xshare:dump
        http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250
[ref2] Volker Simonis's tool for supporting custom class loaders in CDS
        https://github.com/simonis/cl4cds
----------------------------------------------------------------------

Any thoughts?

Thanks
- Ioi