Proposal for improving CDS archive creation
    Ioi Lam 
    ioi.lam at oracle.com
       
    Wed Jul 11 23:13:59 UTC 2018
    
    
  
I had an off-line discussion with Jiangli, and she has an alternative
proposal:

When -Xshare:autocreate is specified, but the CDS archive is not available,

1. Load classes as normal. After each InstanceKlass is loaded, but before
   it's used, make a deep copy of this class into an internal cache.
2. The deep copy includes all methods, etc, for this class. However, if a
   Method is inherited from a super class, then only a reference to this
   Method is copied.
3. At a certain point (probably at VM exit), copy all the (suitable)
   classes from the cache and write them into the CDS archive.

The advantage of this approach is we will be able to archive classes that
were loaded by custom loaders, but have been freed at VM exit time because
the class loaders were GC'ed.

Note: When a class X is loaded, if its supertype(s) have already been
redefined, we probably should not copy X into the buffer. That's because
the vtable of X may point to some redefined methods from a supertype, which
do not match the bytecodes of these methods in the supertype's original
class file, so it's a messy situation.

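
A rough sketch of the cache described in steps 1-3, including the
redefined-supertype check from the note. This is only a toy model in
Python, not HotSpot code: Klass, Method, ArchiveCache and every field name
here are made up for illustration, and "write into the CDS archive" is
reduced to returning the names of the cached classes.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Method:
    name: str
    bytecodes: bytes


@dataclass
class Klass:
    name: str
    super_klass: Optional["Klass"] = None
    declared: List[Method] = field(default_factory=list)   # defined by this class
    inherited: List[Method] = field(default_factory=list)  # owned by a supertype
    redefined: bool = False


def has_redefined_supertype(k: Klass) -> bool:
    """Walk the superclass chain and report whether any supertype was redefined."""
    s = k.super_klass
    while s is not None:
        if s.redefined:
            return True
        s = s.super_klass
    return False


class ArchiveCache:
    """Hypothetical internal cache that snapshots classes right after loading."""

    def __init__(self) -> None:
        self.entries: Dict[str, Klass] = {}

    def on_class_loaded(self, k: Klass) -> bool:
        # Per the note: skip X if a supertype has already been redefined.
        if has_redefined_supertype(k):
            return False
        self.entries[k.name] = Klass(
            name=k.name,
            super_klass=k.super_klass,            # reference to the supertype
            declared=copy.deepcopy(k.declared),   # step 2: deep copy own methods
            inherited=list(k.inherited),          # step 2: references only
        )
        return True

    def dump(self) -> List[str]:
        # Step 3 stand-in: the cached classes survive even if the originals
        # (and their class loaders) have been freed by now.
        return sorted(self.entries)
```

The point of the snapshot is that later changes to the live class (or the
class being unloaded) do not affect what gets written at dump time.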
Thanks
- Ioi
On 7/10/18 12:50 PM, Ioi Lam wrote:
> Fixing some sloppy text below ....
>
>
> On 7/10/18 10:16 AM, Ioi Lam wrote:
>> I have a proposal for improving the process of creating the CDS
>> archive(s), so we can make CDS easier to use and support more use cases.
>>
>>    - better support for custom loaders
>>    - remove explicit training run
>>    - support 2 levels of shared archives
>>
>> I think the proposal is relatively straightforward to implement, as we
>> already have most of the required infrastructure:
>>
>>    + the ability to use Java class loaders at archive creation time
>>    + the ability to relocate MetaspaceObjects
>>
>> Parts of this proposal will also simplify the CDS code and make it more
>> maintainable.
>>
>> Current process of creating the base archive - [C]
>> ==================================================
>>
>> Currently each JVM process can map at most one CDS archive. Let's call
>> this the "base archive". It is created by [ref1]:
>>
>>  C1. Reserve a region R of 3GB at 0x800000000.
>>  C2. Load all classes specified in the class list. All data for these
>>      classes lives outside of R.
>>      (E.g., the Klass objects are loaded into tmp_class_space, which is
>>       adjacent to R).
>>  C3. Copy the metadata of all archivable classes (e.g., excluding
>>      generated Lambda classes) into R. At this step, R is divided into
>>      several sections (RO, RW, etc).
>>
>>
>>   //  +-- SharedBaseAddress   (default = 0x800000000)
>>   //  +-- _narrow_klass._base
>>   //  |
>>   //  |                               +-tmp_class_space.base
>>   //  v                               V
>>   //  +----+----+----+----+----+-....-+-------------------+
>>   //  |<-           R               ->|
>>   //  | MC | RW | RO | MD | OD |unused| tmp_class_space   |
>>   //  +----+----+----+----+----+------+-------------------+
>>   //  |<--  3GB        -------------->|
>>   //  |<-- UnscaledClassSpaceMax = 4GB ------------------>|
>>
>>
>> New process for creating the base archive - [N]
>> ===============================================
>>
>> Currently we have a lot of "if (DumpSharedSpaces)" code for special-case
>> handling of the above scheme. We can improve it by:
>>
>>  N1. Remove all code for special memory layout initialization for
>>      -Xshare:dump. As a result, we will reserve a region R of 1GB at
>>      0x800000000, which is used by Klass objects (this is the same as if
>>      -Xshare:off were specified).
>>  N2. Load all classes in the class list.
>>  N3. Now R contains the Klass objects of all loaded classes.
>>      Allocate a temporary space T, and copy all contents of R into T.
>>  N4. Now R is empty. Copy the metadata of all archivable classes into R.
>>
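
Steps N3 and N4 can be pictured with a small toy model. This is a sketch
only: KlassRecord and repack_region are invented names, R is modeled as a
plain list rather than a reserved address range, and the explicit clearing
of R after the copy into T is an assumption about what "now R is empty"
means in practice.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class KlassRecord:
    name: str
    size: int          # bytes this record occupies in the region
    archivable: bool   # e.g. False for generated Lambda classes


def repack_region(region: List[KlassRecord]) -> Tuple[List[KlassRecord], Dict[str, int]]:
    """Move everything from R to T, then copy only archivable records back.

    Returns T and the new offset of each archived record inside R.
    """
    temp = list(region)   # N3: copy all contents of R into T
    region.clear()        # R is now empty (assumed to be an explicit reset)

    offsets: Dict[str, int] = {}
    cursor = 0
    for rec in temp:      # N4: copy archivable metadata back into R
        if not rec.archivable:
            continue
        offsets[rec.name] = cursor
        region.append(rec)
        cursor += rec.size
    return temp, offsets
```

For example, with records for java/lang/Object (64 bytes), a generated
Lambda class (32 bytes, not archivable) and App (48 bytes), R ends up
holding only the first and the last, packed at offsets 0 and 64.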
>>
>> Dump-as-you-go for the base archive - [G]
>> =========================================
>>
>> Note that the [N] scheme will work even if you're running an app with
>> -Xshare:off. At some point (e.g., when the VM is about to exit), you
>> can:
>>
>>  G1. Enter a safepoint
>>  G2. Go to step [N3].
>>
>> The benefit of [G] is you don't need a separate run to dump the archive,
>> and there's no need to use the class list. Instead, we can have an
>> option like:
>>
>>    java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App
>>
>> If foo.jsa is not available, we run in [G] mode. At VM exit, we dump
>> into foo.jsa.
>>
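
The intended behavior of the proposed -Xshare:autocreate option could be
modeled like this. It is a sketch of the decision only: run_app is a
hypothetical stand-in for a JVM launch, and writing class names to a text
file stands in for dumping a real archive.

```python
import os
import tempfile
from typing import List


def run_app(archive_path: str, loaded_classes: List[str]) -> str:
    """Model one launch with autocreate: use the archive if present, else create it."""
    if os.path.isfile(archive_path):
        # Archive is available: map it and run normally.
        return "mapped"
    # Archive is missing: run in [G] mode and dump at (simulated) VM exit.
    with open(archive_path, "w", encoding="utf-8") as f:
        f.write("\n".join(loaded_classes))
    return "dumped"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        jsa = os.path.join(d, "foo.jsa")
        print(run_app(jsa, ["App"]))  # first launch creates the archive
        print(run_app(jsa, ["App"]))  # later launches reuse it
```

The same command line is used for every launch; only the presence of the
file decides which path is taken.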
>> This way, we don't need to have an explicit training run with
>> -XX:DumpLoadedClassList. Instead, the training run is
>>
> I meant, "Instead, your first run, when the archive is not yet available,
> becomes the training run".
>
> Thanks to Calvin and Dan for spotting this :-)
> - Ioi
>
>> This also makes it easy to support the classes from custom loaders.
>> There's no need for special tooling to convert -Xlog:class+load=debug
>> output into a classlist. [ref2]
>>
>>
>> Dumping for second-level archive - [S]
>> ======================================
>>
>>  S1. Load the base archive
>>  S2. Run the app as normal
>>  S3. All Klass objects of the dynamically loaded classes will be loaded
>>      in the region R, which immediately follows the end of the base
>>      archive.
>>
>>   //  +-- SharedBaseAddress
>>   //  |                          +--- dynamically loaded Klasses
>>   //  |                          |    start from here.
>>   //  v                          v
>>   //  +--------------------------+---------...-----------------|
>>   //  | base archive             | region R                    |
>>   //  +--------------------------+---------...-----------------|
>>   //  |<- size of base archive ->|
>>   //  |<--            1GB -->|
>>
>>
>>   S4. At some point (possibly when the VM is about to exit) we start
>>       dumping the second-level archive.
>>   S5. Enter a safepoint.
>>   S6. Now R contains the Klass objects of all dynamically loaded classes.
>>       Allocate a temporary space T, and copy all contents of R into T.
>>   S7. Now R is empty. Copy the metadata of all archivable, dynamically
>>       loaded classes into R.
>>   S8. Create a new shared_dictionary (and shared_symbol_table) that
>>       contains all the Klasses (Symbols) from both the base and
>>       second-level archives.
>>
>> References
>> ==========
>>
>> [ref1] Current initialization of memory space layout during -Xshare:dump
>> http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 
>>
>> [ref2] Volker Simonis's tool for supporting custom class loaders in CDS
>>        https://github.com/simonis/cl4cds
>> ----------------------------------------------------------------------
>>
>>
>>
>> Any thoughts?
>>
>> Thanks
>> - Ioi
>