Proposal for improving CDS archive creation

Ioi Lam ioi.lam at oracle.com
Fri Jul 13 18:50:13 UTC 2018


When writing into the buffer, the algorithm works like this:


     MetaspaceObj* get_buffered(MetaspaceObj *p) {
         // Reuse the existing copy if p has already been written.
         MetaspaceObj* saved = buffer_find(p);
         if (saved == NULL) {
             saved = buffer_write(p);
         }
         return saved;
     }

So when you're writing a vtable into the buffer:

     Method** vtable = ...; // points to the "real" class X
     Method** vtable_buffered = ...; // points to the "buffered" class X

     for (int i=0; i<vtable_length; i++) {
         Method* m = vtable[i];
         Method* buffered_m = get_buffered(m);
         vtable_buffered[i] = buffered_m;
     }

buffer_write(m) will not happen if m is a method defined by a super 
class of X, because m was already written when that super class was 
buffered.
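
To make the sharing concrete, here is a minimal, self-contained sketch 
(not HotSpot code). The names Method, Buffer, buffer_vtable and 
demo_inherited_methods are hypothetical stand-ins, and the hash table 
keyed by the "real" address is an assumption about how buffer_find() 
could work:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a Method; only its identity matters here.
struct Method { int id; };

// Hypothetical buffer keyed by the address of the "real" object.
class Buffer {
 public:
  // Returns the buffered copy of p, writing one only on first sight.
  Method* get_buffered(const Method* p) {
    auto it = table_.find(p);
    if (it != table_.end()) return it->second;  // plays the role of buffer_find()
    copies_.push_back(*p);                      // plays the role of buffer_write()
    Method* saved = &copies_.back();
    table_.emplace(p, saved);
    return saved;
  }
  std::size_t writes() const { return copies_.size(); }

 private:
  std::unordered_map<const Method*, Method*> table_;
  std::deque<Method> copies_;  // deque keeps element addresses stable on push_back
};

// Buffers one vtable and returns the buffered entries.
std::vector<Method*> buffer_vtable(Buffer& buf,
                                   const std::vector<Method*>& vtable) {
  std::vector<Method*> out;
  for (Method* m : vtable) out.push_back(buf.get_buffered(m));
  return out;
}

// The super class defines two methods; X inherits both and adds one.
// Returns the number of buffer writes after both vtables are buffered.
std::size_t demo_inherited_methods() {
  Method a{1}, b{2}, c{3};
  Buffer buf;
  std::vector<Method*> super_vt = buffer_vtable(buf, {&a, &b});
  std::vector<Method*> x_vt = buffer_vtable(buf, {&a, &b, &c});
  // The inherited entries of X point at the copies made for the super class.
  assert(super_vt[0] == x_vt[0] && super_vt[1] == x_vt[1]);
  return buf.writes();
}
```

In this sketch, buffering the vtable of X after the vtable of its super 
class writes only the one method that X itself defines.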

However, when some classes are unloaded and their metaspace blocks are 
reused, a new MetaspaceObject may happen to occupy the exact same 
address as an old MetaspaceObject from an unloaded class. In that case 
buffer_find() would return the copy of the old object, which makes the 
buffering operation more complicated.

We have 2 choices:

[1] Disable the deallocation of MetaspaceObjects when -Xshare:autocreate 
is specified.
[2] When a MetaspaceObject is deallocated, remove it from the hash table 
used by buffer_find().

We can start with [1], as it is less likely to work incorrectly 
(although it might run out of metaspace memory in some pathological cases).
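
For comparison, choice [2] could look roughly like the sketch below. 
This is only an illustration under assumptions: Obj, Buffer, 
on_deallocate and payload_after_address_reuse are hypothetical names, 
and address reuse is simulated by overwriting a single slot rather than 
by a real metaspace allocator.

```cpp
#include <cassert>
#include <deque>
#include <unordered_map>

// Hypothetical stand-in for a MetaspaceObject.
struct Obj { int payload; };

// Hypothetical buffer whose lookup table is keyed by the "real" address.
class Buffer {
 public:
  Obj* get_buffered(const Obj* p) {
    auto it = table_.find(p);
    if (it != table_.end()) return it->second;  // existing copy found
    copies_.push_back(*p);                      // write a new copy
    Obj* saved = &copies_.back();
    table_.emplace(p, saved);
    return saved;
  }
  // Choice [2]: forget the address when the real object is deallocated.
  // The buffered copy itself stays in copies_.
  void on_deallocate(const Obj* p) { table_.erase(p); }

 private:
  std::unordered_map<const Obj*, Obj*> table_;
  std::deque<Obj> copies_;  // deque keeps element addresses stable on push_back
};

// Returns the payload of the buffered copy seen by a new object that
// reuses the address of an old, deallocated one.
int payload_after_address_reuse(bool remove_on_deallocate) {
  Buffer buf;
  Obj slot{1};              // old object from a class that gets unloaded
  buf.get_buffered(&slot);  // its buffered copy holds payload 1
  if (remove_on_deallocate) buf.on_deallocate(&slot);
  slot = Obj{2};            // a new object now occupies the same address
  return buf.get_buffered(&slot)->payload;
}
```

Without the removal, the new object is mistaken for the old one and 
gets the stale copy (payload 1). With the removal, it gets its own copy 
(payload 2).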


- Ioi

On 7/11/18 5:46 PM, Jiangli Zhou wrote:
> Volker originally suggested the idea in the email thread "Improving 
> AppCDS for Custom Loaders". I think this is a cleaner approach.
>
> Thanks,
>
> Jiangli
>
>
> On 7/11/18 4:13 PM, Ioi Lam wrote:
>> I had an off-line discussion with Jiangli, and she has an alternative 
>> proposal:
>>
>> When -Xshare:autocreate is specified, but the CDS archive is not 
>> available,
>>
>> 1. Load classes as normal. After each InstanceKlass is loaded, but 
>> before it's used,
>>    make a deep copy of this class into an internal cache.
>>
>> 2. The deep copy includes all methods, etc, for this class. However, 
>> if a Method is
>>    inherited from a super class, then only a reference to this Method 
>> is copied.
>>
>> 3. At a certain point (probably at VM exit), copy all the (suitable) 
>> classes from the
>>    cache and write them into the CDS archive.
>>
>> The advantage of this approach is we will be able to archive classes 
>> that were
>> loaded by custom loaders, but have been freed at VM exit time because 
>> the class
>> loaders were GC'ed.
>>
>>
>> Note: When a class X is loaded, if its supertype(s) have already been 
>> redefined,
>> we probably should not copy X into the buffer. That's because the 
>> vtable of X may
>> point to some redefined methods from a supertype, which do not match 
>> the bytecodes
>> of these methods in the supertype's original class file, so it's a 
>> messy situation.
>>
>> Thanks
>> - Ioi
>>
>>
>>
>> On 7/10/18 12:50 PM, Ioi Lam wrote:
>>> Fixing some sloppy text below ....
>>>
>>>
>>> On 7/10/18 10:16 AM, Ioi Lam wrote:
>>>> I have a proposal for improving the process of creating the CDS 
>>>> archive(s),
>>>> so we can make CDS easier to use and support more use cases.
>>>>
>>>>    - better support for custom loaders
>>>>    - remove explicit training run
>>>>    - support 2 levels of shared archives
>>>>
>>>> I think the proposal is relatively straightforward to implement, 
>>>> as we already
>>>> have most of the required infrastructure:
>>>>
>>>>    + the ability to use Java class loaders at archive creation time
>>>>    + the ability to relocate MetaspaceObjects
>>>>
>>>> Parts of this proposal will also simplify the CDS code and make it 
>>>> more
>>>> maintainable.
>>>>
>>>> Current process of creating the base archive - [C]
>>>> ==================================================
>>>>
>>>> Currently each JVM process can map at most one CDS archive. Let's 
>>>> call this
>>>> the "base archive". It is created by [ref1]:
>>>>
>>>>  C1. Reserve a region R of 3GB at 0x800000000.
>>>>  C2. Load all classes specified in the class list. All data for 
>>>> these classes
>>>>      live outside of R.
>>>>      (E.g., the Klass objects are loaded into tmp_class_space, 
>>>> which is
>>>>       adjacent to R).
>>>>  C3. Copy the metadata of all archivable classes (e.g., excluding 
>>>> generated
>>>>      Lambda classes) into R. At this step, R is divided into several
>>>>      sections (RO, RW, etc).
>>>>
>>>>
>>>>   //  +-- SharedBaseAddress   (default = 0x800000000)
>>>>   //  +-- _narrow_klass._base
>>>>   //  |
>>>>   //  |                               +-tmp_class_space.base
>>>>   //  v                               V
>>>>   //  +----+----+----+----+----+-....-+-------------------+
>>>>   //  |<-           R               ->|
>>>>   //  | MC | RW | RO | MD | OD |unused| tmp_class_space   |
>>>>   //  +----+----+----+----+----+------+-------------------+
>>>>   //  |<--  3GB        -------------->|
>>>>   //  |<-- UnscaledClassSpaceMax = 4GB ------------------>|
>>>>
>>>>
>>>> New process for creating the base archive - [N]
>>>> ===============================================
>>>>
>>>> Currently we have a lot of "if (DumpSharedSpaces)" code for 
>>>> special-case
>>>> handling of the above scheme. We can improve it as follows:
>>>>
>>>>  N1. Remove all code for special memory layout initialization for 
>>>> -Xshare:dump.
>>>>      As a result, we will reserve a region R of 1GB at 0x800000000, 
>>>> which
>>>>      is used by Klass objects (this is the same as if -Xshare:off were
>>>>      specified.)
>>>>  N2. Load all classes in the class list.
>>>>  N3. Now R contains the Klass objects of all loaded classes.
>>>>      Allocate a temporary space T, and copy all contents of R into T.
>>>>  N4. Now R is empty. Copy the metadata of all archivable classes 
>>>> into R.
>>>>
>>>>
>>>> Dump-as-you-go for the base archive - [G]
>>>> =========================================
>>>>
>>>> Note that the [N] scheme will work even if you're running an app with
>>>> -Xshare:off. At some point (e.g., when the VM is about to exit), you
>>>> can:
>>>>
>>>>  G1. Enter a safe point
>>>>  G2. Go to step [N3].
>>>>
>>>> The benefit of [G] is you don't need a separate run to dump the 
>>>> archive, and
>>>> there's no need to use the class list. Instead, we can have an 
>>>> option like:
>>>>
>>>>    java -Xshare:autocreate -cp app.jar -XX:SharedArchiveFile=foo.jsa App
>>>>
>>>> If foo.jsa is not available, we run in [G] mode. At VM exit, we 
>>>> dump into
>>>> foo.jsa.
>>>>
>>>> This way, we don't need to have an explicit training run with
>>>> -XX:DumpLoadedClassList. Instead, the training run is
>>>>
>>> I meant, "Instead, your first run, when the archive is not yet 
>>> available, becomes the
>>> training run".
>>>
>>> Thanks to Calvin and Dan for spotting this :-)
>>> - Ioi
>>>
>>>> This also makes it easy to support the classes from custom loaders. 
>>>> There's no
>>>> need for special tooling to convert -Xlog:class+load=debug output 
>>>> into a
>>>> classlist. [ref2]
>>>>
>>>>
>>>> Dumping for second-level archive - [S]
>>>> ======================================
>>>>
>>>>  S1. Load the base archive
>>>>  S2. Run the app as normal
>>>>  S3. All Klass objects of the dynamically loaded classes will be 
>>>> loaded in
>>>>      the region R, which immediately follows the end of the base 
>>>> archive.
>>>>
>>>>   //  +-- SharedBaseAddress
>>>>   //  |                          +--- dynamically loaded Klasses
>>>>   //  |                          |    start from here.
>>>>   //  v                          v
>>>>   //  +--------------------------+---------...-----------------+
>>>>   //  | base archive             | region R                    |
>>>>   //  +--------------------------+---------...-----------------+
>>>>   //  |<- size of base archive ->|
>>>>   //  |<--            1GB -->|
>>>>
>>>>
>>>>   S4. At some point (possibly when the VM is about to exit) we start
>>>>       dumping the second-level archive
>>>>   S5. Enter a safepoint
>>>>   S6. Now R contains the Klass objects of all dynamically loaded 
>>>> classes.
>>>>       Allocate a temporary space T, and copy all contents of R into T.
>>>>   S7. Now R is empty. Copy the metadata of all archivable, 
>>>> dynamically loaded
>>>>       classes into R.
>>>>   S8. Create a new shared_dictionary (and shared_symbol_table) that 
>>>> contains
>>>>       all the Klasses (Symbols) from both the base and second-level 
>>>> archives.
>>>>
>>>> References
>>>> ==========
>>>>
>>>> [ref1] Current initialization of memory space layout during 
>>>> -Xshare:dump
>>>> http://hg.openjdk.java.net/jdk/jdk/file/e0028bb6dd3d/src/hotspot/share/memory/metaspaceShared.cpp#l250 
>>>>
>>>> [ref2] Volker Simonis's tool for supporting custom class loaders in CDS
>>>>        https://github.com/simonis/cl4cds
>>>> ----------------------------------------------------------------------
>>>>
>>>>
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks
>>>> - Ioi
>>>
>>
>



More information about the hotspot-runtime-dev mailing list