Proposal for improving CDS archive creation

Jiangli Zhou jiangli.zhou at oracle.com
Thu Jul 12 00:46:22 UTC 2018


Volker originally suggested the idea in the email thread "Improving 
AppCDS for Custom Loaders". I think this is a cleaner approach.

Thanks,

Jiangli


On 7/11/18 4:13 PM, Ioi Lam wrote:
> I had an off-line discussion with Jiangli, and she has an alternative 
> proposal:
>
> When -Xshare:autocreate is specified, but the CDS archive is not available,
>
> 1. Load classes as normal. After each InstanceKlass is loaded, but before
>    it's used, make a deep copy of this class into an internal cache.
>
> 2. The deep copy includes all methods, etc., for this class. However, if a
>    Method is inherited from a super class, then only a reference to this
>    Method is copied.
>
> 3. At a certain point (probably at VM exit), copy all the (suitable)
>    classes from the cache and write them into the CDS archive.
>
> The advantage of this approach is we will be able to archive classes that
> were loaded by custom loaders, but have been freed at VM exit time because
> the class loaders were GC'ed.
>
>
> Note: When a class X is loaded, if its supertype(s) have already been
> redefined, we probably should not copy X into the buffer. That's because
> the vtable of X may point to some redefined methods from a supertype,
> which do not match the bytecodes of these methods in the supertype's
> original class file, so it's a messy situation.
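
Steps 1-3 and the note above can be sketched as a toy Python model. This is
not HotSpot code: `Klass`, `ClassCache`, `on_class_loaded` and `dump` are
hypothetical names, and the "deep copy" is just a copy of a dict of methods.
It only illustrates why the cache can still be dumped after a class has been
freed.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Klass:
    """Toy stand-in for an InstanceKlass (hypothetical, not the HotSpot type)."""
    name: str
    super_klass: "Klass | None" = None
    own_methods: dict = field(default_factory=dict)  # method name -> bytecodes
    redefined: bool = False

    def has_redefined_supertype(self):
        k = self.super_klass
        while k is not None:
            if k.redefined:
                return True
            k = k.super_klass
        return False


class ClassCache:
    def __init__(self):
        self._copies = {}  # class name -> snapshot

    def on_class_loaded(self, klass):
        """Step 1: snapshot the class after loading, before first use."""
        if klass.has_redefined_supertype():
            return False  # per the note: the vtable may point at redefined methods
        self._copies[klass.name] = {
            # Step 2: the class's own methods are deep-copied ...
            "methods": copy.deepcopy(klass.own_methods),
            # ... inherited methods are only referenced, here via the super's name.
            "super": klass.super_klass.name if klass.super_klass else None,
        }
        return True

    def dump(self):
        """Step 3: at VM exit, hand back everything still in the cache."""
        return dict(self._copies)


cache = ClassCache()
base = Klass("Base", own_methods={"run": b"\x2a\xb1"})
child = Klass("Child", super_klass=base, own_methods={"extra": b"\xb1"})
cache.on_class_loaded(base)
cache.on_class_loaded(child)
del child  # simulates the custom loader and its classes being freed
archive = cache.dump()
```

In this model "Child" is still present in `archive` after `del child`, and a
class loaded after `base.redefined = True` would be skipped.
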
>
> Thanks
> - Ioi
>
>
>
> On 7/10/18 12:50 PM, Ioi Lam wrote:
>> Fixing some sloppy text below ....
>>
>>
>> On 7/10/18 10:16 AM, Ioi Lam wrote:
>>> I have a proposal for improving the process of creating the CDS
>>> archive(s), so we can make CDS easier to use and support more use cases.
>>>
>>>    - better support for custom loaders
>>>    - remove explicit training run
>>>    - support 2 levels of shared archives
>>>
>>> I think the proposal is relatively straightforward to implement, as we
>>> already have most of the required infrastructure:
>>>
>>>    + the ability to use Java class loaders at archive creation time
>>>    + the ability to relocate MetaspaceObjects
>>>
>>> Parts of this proposal will also simplify the CDS code and make it more
>>> maintainable.
>>>
>>> Current process of creating the base archive - [C]
>>> ==================================================
>>>
>>> Currently each JVM process can map at most one CDS archive. Let's call
>>> this the "base archive". It is created as follows [ref1]:
>>>
>>>  C1. Reserve a region R of 3GB at 0x800000000.
>>>  C2. Load all classes specified in the class list. All data for these
>>>      classes lives outside of R.
>>>      (E.g., the Klass objects are loaded into tmp_class_space, which is
>>>       adjacent to R).
>>>  C3. Copy the metadata of all archivable classes (e.g., excluding
>>>      generated Lambda classes) into R. At this step, R is divided into
>>>      several sections (RO, RW, etc.).
>>>
>>>
>>>   //  +-- SharedBaseAddress   (default = 0x800000000)
>>>   //  +-- _narrow_klass._base
>>>   //  |
>>>   //  |                               +-tmp_class_space.base
>>>   //  v                               V
>>>   //  +----+----+----+----+----+-....-+-------------------+
>>>   //  |<-           R               ->|
>>>   //  | MC | RW | RO | MD | OD |unused| tmp_class_space   |
>>>   //  +----+----+----+----+----+------+-------------------+
>>>   //  |<--  3GB        -------------->|
>>>   //  |<-- UnscaledClassSpaceMax = 4GB ------------------>|
>>>
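
The addresses implied by the diagram above can be checked with a few lines of
arithmetic. This is only a sketch: the constant names are made up for
illustration, and it assumes tmp_class_space fills the rest of the 4GB
window, as the diagram suggests.

```python
GB = 1 << 30

SHARED_BASE_ADDRESS = 0x800000000    # default, from the diagram
R_SIZE = 3 * GB                      # step C1
UNSCALED_CLASS_SPACE_MAX = 4 * GB    # from the diagram

# tmp_class_space starts where R ends.
tmp_class_space_base = SHARED_BASE_ADDRESS + R_SIZE
# Assumption: tmp_class_space takes whatever is left of the 4GB window.
tmp_class_space_size = UNSCALED_CLASS_SPACE_MAX - R_SIZE
end_of_window = SHARED_BASE_ADDRESS + UNSCALED_CLASS_SPACE_MAX

print(hex(tmp_class_space_base))     # 0x8c0000000
print(tmp_class_space_size // GB)    # 1
print(hex(end_of_window))            # 0x900000000
```
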
>>>
>>> New process for creating the base archive - [N]
>>> ===============================================
>>>
>>> Currently we have a lot of "if (DumpSharedSpaces)" code for special-case
>>> handling of the above scheme. We can improve it as follows:
>>>
>>>  N1. Remove all code for special memory layout initialization for
>>>      -Xshare:dump. As a result, we will reserve a region R of 1GB at
>>>      0x800000000, which is used by Klass objects (this is the same as if
>>>      -Xshare:off were specified).
>>>  N2. Load all classes in the class list.
>>>  N3. Now R contains the Klass objects of all loaded classes.
>>>      Allocate a temporary space T, and copy all contents of R into T.
>>>  N4. Now R is empty. Copy the metadata of all archivable classes into R.
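
Steps N3 and N4 amount to "move everything aside, then copy back only what
should be archived". A minimal Python sketch of that idea follows. The region
is modelled as a plain list, `dump_region` is a hypothetical helper, and the
lambda-class naming used in the filter is illustrative only.

```python
def dump_region(region, is_archivable):
    """Toy model of N3/N4. `region` is a list of (class_name, metadata)
    pairs standing in for R. Returns the temporary space T."""
    # N3: copy all contents of R into a temporary space T.
    temp = list(region)
    # R is now treated as empty and can be refilled from its start.
    region.clear()
    # N4: copy only the archivable classes from T back into R.
    for name, metadata in temp:
        if is_archivable(name):
            region.append((name, metadata))
    return temp


region = [
    ("java/lang/Object", "meta0"),
    ("App$$Lambda$1", "meta1"),   # generated lambda class, not archivable
    ("App", "meta2"),
]
temp = dump_region(region, lambda name: "$$Lambda$" not in name)
```

After the call, `temp` still holds all three entries, while `region` holds
only the two archivable ones, in their original order.
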
>>>
>>>
>>> Dump-as-you-go for the base archive - [G]
>>> =========================================
>>>
>>> Note that the [N] scheme will work even if you're running an app with
>>> -Xshare:off. At some point (e.g., when the VM is about to exit), you
>>> can:
>>>
>>>  G1. Enter a safepoint
>>>  G2. Go to step [N3].
>>>
>>> The benefit of [G] is you don't need a separate run to dump the archive,
>>> and there's no need to use the class list. Instead, we can have an
>>> option like:
>>>
>>>    java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App
>>>
>>> If foo.jsa is not available, we run in [G] mode. At VM exit, we dump
>>> into foo.jsa.
>>>
>>> This way, we don't need to have an explicit training run with
>>> -XX:DumpLoadedClassList. Instead, the training run is
>>>
>> I meant, "Instead, your first run, when the archive is not yet 
>> available, becomes the
>> training run".
>>
>> Thanks to Calvin and Dan for spotting this :-)
>> - Ioi
>>
>>> This also makes it easy to support the classes from custom loaders.
>>> There's no need for special tooling to convert -Xlog:class+load=debug
>>> output into a classlist. [ref2]
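
The control flow of the proposed -Xshare:autocreate option can be sketched as
follows. This is a toy model under stated assumptions: `run_with_autocreate`
is a hypothetical helper, the "archive" is a text file of class names rather
than a real .jsa file, and `run_app` stands in for running the application.

```python
import os
import tempfile


def run_with_autocreate(archive_path, run_app):
    """Toy model of the proposed -Xshare:autocreate behaviour.
    run_app() returns the list of class names the app loaded."""
    if os.path.exists(archive_path):
        # Archive available: use it, nothing to dump.
        with open(archive_path) as f:
            shared = f.read().split()
        run_app()
        return "used-archive", shared
    # Archive missing: this run doubles as the training run ([G] mode).
    loaded = run_app()
    with open(archive_path, "w") as f:  # "At VM exit, we dump into foo.jsa"
        f.write("\n".join(loaded))
    return "created-archive", loaded


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "foo.jsa")
    app = lambda: ["App", "Helper"]
    first = run_with_autocreate(path, app)
    second = run_with_autocreate(path, app)
```

The first call creates the archive and the second call finds and uses it,
which is the "first run becomes the training run" behaviour described above.
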
>>>
>>>
>>> Dumping for second-level archive - [S]
>>> ======================================
>>>
>>>  S1. Load the base archive
>>>  S2. Run the app as normal
>>>  S3. All Klass objects of the dynamically loaded classes will be loaded
>>>      in the region R, which immediately follows the end of the base
>>>      archive.
>>>
>>>   //  +-- SharedBaseAddress
>>>   //  |                          +--- dynamically loaded Klasses
>>>   //  |                          |    start from here.
>>>   //  v                          v
>>>   //  +--------------------------+---------...-----------------|
>>>   //  | base archive             | region R                    |
>>>   //  +--------------------------+---------...-----------------|
>>>   //  |<- size of base archive ->|
>>>   //  |<--                                              1GB -->|
>>>
>>>
>>>  S4. At some point (possibly when the VM is about to exit) we start
>>>      dumping the second-level archive.
>>>  S5. Enter a safepoint.
>>>  S6. Now R contains the Klass objects of all dynamically loaded classes.
>>>      Allocate a temporary space T, and copy all contents of R into T.
>>>  S7. Now R is empty. Copy the metadata of all archivable, dynamically
>>>      loaded classes into R.
>>>  S8. Create a new shared_dictionary (and shared_symbol_table) that
>>>      contains all the Klasses (Symbols) from both the base and
>>>      second-level archives.
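
Step S8 can be pictured as merging two lookup tables. The sketch below is a
toy model: `merge_dictionary` and the (archive, offset) entry shape are
hypothetical, and it assumes the second-level archive only contains classes
that are absent from the base archive.

```python
def merge_dictionary(base, second_level):
    """Toy model of S8: build one lookup table covering both archives.
    Each argument maps a class name to an (archive, offset) pair."""
    merged = dict(base)
    for name, entry in second_level.items():
        # Assumption: dynamically loaded classes never duplicate base classes.
        if name in merged:
            raise ValueError(f"{name} is already in the base archive")
        merged[name] = entry
    return merged


base_dict = {"java/lang/Object": ("base", 0)}
second_dict = {"App": ("second", 0)}
merged = merge_dictionary(base_dict, second_dict)
```

The same shape would apply to the shared_symbol_table, with Symbols in place
of Klasses.
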
>>>
>>> References
>>> ==========
>>>
>>> [ref1] Current initialization of memory space layout during -Xshare:dump
>>>        http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250
>>>
>>> [ref2] Volker Simonis's tool for supporting custom class loaders in CDS
>>>        https://github.com/simonis/cl4cds
>>> ----------------------------------------------------------------------
>>>
>>>
>>>
>>> Any thoughts?
>>>
>>> Thanks
>>> - Ioi
>>
>



More information about the hotspot-runtime-dev mailing list