RFR: 8179302: Pre-resolve constant pool string entries and cache resolved_reference arrays in CDS archive

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Fri Aug 11 18:22:29 UTC 2017


These incremental changes look good to me.
Thanks,
Coleen

On 8/10/17 7:51 PM, Jiangli Zhou wrote:
> Thanks, Ioi!
>
> Jiangli
>
>> On Aug 10, 2017, at 2:15 PM, Ioi Lam <ioi.lam at oracle.com 
>> <mailto:ioi.lam at oracle.com>> wrote:
>>
>> Hi Jiangli,
>>
>>
>> The changes look good to me. Thanks for considering my suggestions.
>>
>>
>> - Ioi
>>
>>
>> On 8/8/17 5:33 PM, Jiangli Zhou wrote:
>>> Here is the incremental webrev that has all the changes incorporated 
>>> with suggestions from Coleen, Ioi and Thomas:
>>>
>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03.inc/ 
>>> <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.03.inc/>
>>>
>>> Updated full webrev: 
>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.03/ 
>>> <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.03/>
>>>
>>> Thanks again for Coleen's, Ioi's and Thomas’ review!
>>> Jiangli
>>>
>>>> On Aug 7, 2017, at 7:57 PM, Jiangli Zhou <jiangli.zhou at Oracle.COM 
>>>> <mailto:jiangli.zhou at Oracle.COM>> wrote:
>>>>
>>>> Hi Ioi,
>>>>
>>>> Thanks for getting back to me.
>>>>
>>>>> On Aug 7, 2017, at 5:45 PM, Ioi Lam <ioi.lam at oracle.com 
>>>>> <mailto:ioi.lam at oracle.com>> wrote:
>>>>>
>>>>> On 8/4/17 10:19 PM, Jiangli Zhou wrote:
>>>>>
>>>>>> Hi Ioi,
>>>>>>
>>>>>> Thanks for looking again.
>>>>>>
>>>>>>> On Aug 4, 2017, at 2:22 PM, Ioi Lam <ioi.lam at oracle.com 
>>>>>>> <mailto:ioi.lam at oracle.com>> wrote:
>>>>>>>
>>>>>>> Hi Jiangli,
>>>>>>>
>>>>>>> The code looks good in general. I just have a few pet peeves for 
>>>>>>> readability:
>>>>>>>
>>>>>>>
>>>>>>> (1) stringTable.cpp and metaspaceShared.cpp have the same asserts
>>>>>>>
>>>>>>>  704 assert(UseG1GC, "Only support G1 GC");
>>>>>>>  705 assert(UseCompressedOops && UseCompressedClassPointers,
>>>>>>>  706 "Only support UseCompressedOops and 
>>>>>>> UseCompressedClassPointers enabled");
>>>>>>>
>>>>>>> 1615 assert(UseG1GC, "Only support G1 GC");
>>>>>>> 1616 assert(UseCompressedOops && UseCompressedClassPointers,
>>>>>>> 1617 "Only support UseCompressedOops and 
>>>>>>> UseCompressedClassPointers enabled");
>>>>>>>
>>>>>>> Maybe it's better to combine them into a single function like 
>>>>>>> MetaspaceShared::assert_vm_flags() so they don't get out of sync?
>>>>>>
>>>>>> There is a MetaspaceShared::allow_archive_heap_object(), which 
>>>>>> checks for UseG1GC, UseCompressedOops and 
>>>>>> UseCompressedClassPointers combined. It does not seem to worth 
>>>>>> add another separate API for asserting the required flags. I’ll 
>>>>>> use that in the assert.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (2) FileMapInfo::write_archive_heap_regions()
>>>>>>>
>>>>>>> I still find this code very hard to read, especially due to the 
>>>>>>> loop.
>>>>>>>
>>>>>>> First, the comments are not consistent with the code:
>>>>>>>
>>>>>>> 498 assert(arr_len <= max_num_regions, "number of memory regions 
>>>>>>> exceeds maximum");
>>>>>>>
>>>>>>> but the comments says: "The rest are consecutive full GC 
>>>>>>> regions" which means there's a chance for max_num_regions to be 
>>>>>>> more than 2 (which will be the case with Calvin's java-loader 
>>>>>>> dumping changes using very small heap size).So the code is 
>>>>>>> actually wrong.
>>>>>>
>>>>>> The max_num_regions is the maximum number of region for each 
>>>>>> archived heap space (the string space, or open archive space). We 
>>>>>> only run into the case where the MemRegion array size is larger 
>>>>>> than max_num_regions with Calvin’s pending change. As part of 
>>>>>> Calvin’s change, he will change the assert into a check and bail 
>>>>>> out if the number of MemRegions are larger than max_num_regions 
>>>>>> due to heap fragmentation.
>>>>>>
>>>>>>
>>>>> Your latest patch assumes that arr_len <= 2, but the 
>>>>> implementation of 
>>>>> G1CollectedHeap::heap()->begin_archive_alloc_range() / 
>>>>> G1CollectedHeap::heap()->end_archive_alloc_range() actually allows 
>>>>> more than 2 regions to returned. So simply putting an assert there 
>>>>> seems risky (unless you have analyzed all possible scenarios to 
>>>>> prove that's impossible).
>>>>>
>>>>> Instead of trying to come up with a complicated proof, I think 
>>>>> it's much safer to disable the archived string region if the 
>>>>> arr_len > 2. Also, if the string region is disabled, you should 
>>>>> also disable the open_archive_heap_region
>>>>>
>>>>> I think this is a general issue with the mapped heap regions, and 
>>>>> it just happens to be revealed by Calvin's patch. So we should fix 
>>>>> it now and not wait for Calvin's patch.
>>>>
>>>> Ok. I’ll change the assert to be a check.
>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>> The word "region" is used in these parameters, but they don't 
>>>>>>> mean the same thing.
>>>>>>>
>>>>>>> GrowableArray<MemRegion> *regions
>>>>>>> int first_region, int max_num_regions,
>>>>>>>
>>>>>>>
>>>>>>> How about regions      -> g1_regions_list
>>>>>>> first_region -> first_region_in_archive
>>>>>>
>>>>>> The GrowableArray above is the MemRegions that GC code gives back 
>>>>>> to us. The GC code combines multiple G1 regions. The comments 
>>>>>> probably are over-explaining the details, which are hidden in the 
>>>>>> GC code. Probably that’s the confusing source. I’ll make the 
>>>>>> comment more clear.
>>>>>>
>>>>>> Using g1_regions_list would also be confusing, since 
>>>>>> write_archive_heap_regions does not handle G1 regions directly. 
>>>>>> It processes the MemRegion array that GC code returns. How about 
>>>>>> changing ‘regions’ to ‘mem_regions’ or ‘archive_regions'?
>>>>>>
>>>>> How about heap_regions? These are regions in the active Java heap, 
>>>>> which current has not mapped anything from the CDS archive.
>>>>
>>>> Ok.
>>>>
>>>> I’m updating my changes and will send out a consolidated webrev.
>>>>
>>>> Thanks!
>>>> Jiangli
>>>>
>>>>>
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> In the comments, I find the phrase 'the current archive heap 
>>>>>>> region' ambiguous. It could be (erroneously) interpreted as "a 
>>>>>>> region from the currently mapped archive”
>>>>>>>
>>>>>>> To make it unambiguous, how about changing
>>>>>>>
>>>>>>>
>>>>>>>  464 // Write the current archive heap region, which contains 
>>>>>>> one or multiple GC(G1) regions.
>>>>>>>
>>>>>>>
>>>>>>> to
>>>>>>>
>>>>>>>     // Write the given list of G1 memory regions into the 
>>>>>>> archive, starting at
>>>>>>>     // first_region_in_archive.
>>>>>>
>>>>>>
>>>>>> Ok. How about the following:
>>>>>>
>>>>>> // Write the given list of java heap memory regions into the 
>>>>>> archive, starting at
>>>>>> // first_region_in_archive.
>>>>>>
>>>>> Sounds good.
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Also, for the explanation of how the G1 regions are written into 
>>>>>>> the archive, how about:
>>>>>>>
>>>>>>>    // The G1 regions in the list are sorted in ascending address 
>>>>>>> order. When there are more objects
>>>>>>>    // than the capacity of a single G1 region, the bottom-most 
>>>>>>> G1 region may be partially filled, and the
>>>>>>>    // remaining G1 region(s) are consecutively allocated and 
>>>>>>> fully filled.
>>>>>>>    //
>>>>>>>    // Thus, the bottom-most G1 region (if not empty) is written 
>>>>>>> into first_region_in_archive.
>>>>>>>    // The remaining G1 regions (if exist) are coalesced and 
>>>>>>> written as a single block
>>>>>>>    // into (first_region_in_archive + 1)
>>>>>>>
>>>>>>>    // Here's the mapping from (g1 regions) -> (archive regions).
>>>>>>>
>>>>>>>
>>>>>>> All this function needs to do is to decide the values for
>>>>>>>
>>>>>>> r0_start, r0_top
>>>>>>> r1_start, r1_top
>>>>>>>
>>>>>>> I think it would be much better to not use the loop, and not use 
>>>>>>> the max_num_regions parameter (it's always 2 anyway).
>>>>>>>
>>>>>>>      *r0_start = *r0_top = NULL;
>>>>>>>      *r1_start = *r1_top = NULL;
>>>>>>>
>>>>>>> if (arr_len >= 1) {
>>>>>>>       *r0_start = regions->at(0).start();
>>>>>>>       *r0_end = *r0_start + regions->at(0).byte_size();
>>>>>>> }
>>>>>>> if (arr_len >= 2) {
>>>>>>>     int last = arr_len - 1;
>>>>>>>          *r1_start = regions->at(1).start();
>>>>>>>          *r1_end = regions->at(last).start() + 
>>>>>>> regions->at(last).byte_size();
>>>>>>> }
>>>>>>>
>>>>>>> what do you think?
>>>>>>
>>>>>> We need to write out all archive regions including the empty 
>>>>>> ones. The loop using max_num_regions is the easiest way. I’d like 
>>>>>> to remove the code that deals with r0_* and r1_ explicitly. Let 
>>>>>> me try that.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (3) metaspace.cpp
>>>>>>>
>>>>>>> 3350         // Map the archived heap regions after compressed 
>>>>>>> pointers
>>>>>>> 3351         // because it relies on compressed class pointers 
>>>>>>> setting to work
>>>>>>>
>>>>>>> do you mean this?
>>>>>>>
>>>>>>>     // Archived heap regions depend on the parameters of 
>>>>>>> compressed class pointers, so
>>>>>>>     // they must be mapped after such parameters have been 
>>>>>>> decided in the above call.
>>>>>>
>>>>>> Hmmm, maybe use ‘arguments’ instead of ‘parameters’?
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> (4) I found this name not strictly grammatical. How about this:
>>>>>>>
>>>>>>> allow_archive_heap_object -> is_heap_object_archiving_allowed
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>>>
>>>>>>> (5) in most of your code, 'archive' is used as a noun, except in 
>>>>>>> StringTable::archive_string() where it's used as a verb.
>>>>>>>
>>>>>>> archive_string could also be interpreted erroneously as "return 
>>>>>>> a string that's already in the archive". So to be consistent and 
>>>>>>> unambiguous, I think it's better to rename it to 
>>>>>>> StringTable::create_archived_string()
>>>>>>
>>>>>> Ok.
>>>>>>
>>>>>> Thanks,
>>>>>> Jiangli
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>>
>>>>>>> On 8/3/17 5:15 PM, Jiangli Zhou wrote:
>>>>>>>> Here are the updated webrevs.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.02/ 
>>>>>>>> <http://cr.openjdk.java.net/%7Ejiangli/8179302/webrev.hotspot.02/>
>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.02/
>>>>>>>>
>>>>>>>> Changes in the updated webrevs include:
>>>>>>>>
>>>>>>>>   * Merge with Ioi’s recent shared space auto-sizing change
>>>>>>>>     (8072061)
>>>>>>>>   * Addressed all feedbacks from Ioi and Coleen (Thanks for
>>>>>>>>     detailed review!)
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>>>
>>>>>>>>> On Aug 1, 2017, at 5:29 PM, Jiangli Zhou 
>>>>>>>>> <jiangli.zhou at Oracle.COM> wrote:
>>>>>>>>>
>>>>>>>>> Hi Ioi,
>>>>>>>>>
>>>>>>>>> Thank you so much for reviewing this. I’ve addressed all your 
>>>>>>>>> feedbacks. Please see details below. I’ll updated the webrev 
>>>>>>>>> after addressing Coleen’s comments.
>>>>>>>>>
>>>>>>>>>> On Jul 30, 2017, at 9:07 PM, Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>
>>>>>>>>>> Here are my comments. I've not reviewed the GC code and I'll 
>>>>>>>>>> leave that to the GC experts :-)
>>>>>>>>>>
>>>>>>>>>> stringTable.cpp: StringTable::archive_string
>>>>>>>>>>
>>>>>>>>>>    add assert for DumpSharedSpaces only
>>>>>>>>>
>>>>>>>>> Ok.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> filemap.cpp
>>>>>>>>>>
>>>>>>>>>> 525 void 
>>>>>>>>>> FileMapInfo::write_archive_heap_regions(GrowableArray<MemRegion> 
>>>>>>>>>> *regions,
>>>>>>>>>> 526        int first_region, int num_regions) {
>>>>>>>>>>
>>>>>>>>>> When I first read this function, I found it hard to follow, 
>>>>>>>>>> especially this part that coalesces the trailing regions:
>>>>>>>>>>
>>>>>>>>>> 537 int len = regions->length();
>>>>>>>>>> 538         if (len > 1) {
>>>>>>>>>> 539 start = (char*)regions->at(1).start();
>>>>>>>>>> 540 size = (char*)regions->at(len - 1).end() - start;
>>>>>>>>>> 541         }
>>>>>>>>>> 542       }
>>>>>>>>>>
>>>>>>>>>> The rest of filemap.cpp always perform identical operations 
>>>>>>>>>> on MemRegion arrays, which are either 1 or 2 in size. 
>>>>>>>>>> However, this function doesn't follow that pattern; it also 
>>>>>>>>>> has a very different notion of "region", and the confusing 
>>>>>>>>>> part is regions->size() is not the same as num_regions.
>>>>>>>>>>
>>>>>>>>>> How about we change the API to something like the following? 
>>>>>>>>>> Before calling this API, the caller needs to coalesce the 
>>>>>>>>>> trailing G1 regions into a single MemRegion.
>>>>>>>>>>
>>>>>>>>>> FileMapInfo::write_archive_heap_regions(MemRegion *regions, 
>>>>>>>>>> int first, int num_regions) {
>>>>>>>>>>        if (first == MetaspaceShared::first_string) {
>>>>>>>>>> assert(num_regons <=  MetaspaceShared::max_strings, "...");
>>>>>>>>>>        } else {
>>>>>>>>>> assert(first == 
>>>>>>>>>> MetaspaceShared::first_open_archive_heap_region, "...");
>>>>>>>>>> assert(num_regons <= 
>>>>>>>>>> MetaspaceShared::max_open_archive_heap_region, "...");
>>>>>>>>>> }
>>>>>>>>>>        ....
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I’ve reworked the function and simplified the code.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 756   if (!string_data_mapped) {
>>>>>>>>>> 757 StringTable::ignore_shared_strings(true);
>>>>>>>>>> 758 assert(string_ranges == NULL && num_string_ranges == 0, 
>>>>>>>>>> "sanity");
>>>>>>>>>> 759   }
>>>>>>>>>> 760
>>>>>>>>>> 761   if (open_archive_heap_data_mapped) {
>>>>>>>>>> 762 MetaspaceShared::set_open_archive_heap_region_mapped();
>>>>>>>>>> 763   } else {
>>>>>>>>>> 764 assert(open_archive_heap_ranges == NULL && 
>>>>>>>>>> num_open_archive_heap_ranges == 0, "sanity");
>>>>>>>>>> 765   }
>>>>>>>>>>
>>>>>>>>>> Maybe the two "if" statements should be more consistent? 
>>>>>>>>>> Instead of StringTable::ignore_shared_strings, how 
>>>>>>>>>> about StringTable::set_shared_strings_region_mapped()?
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> FileMapInfo::map_heap_data() --
>>>>>>>>>>
>>>>>>>>>> 818     char* addr = (char*)regions[i].start();
>>>>>>>>>> 819     char* base = os::map_memory(_fd, _full_path, 
>>>>>>>>>> si->_file_offset,
>>>>>>>>>> 820         addr, regions[i].byte_size(), si->_read_only,
>>>>>>>>>> 821 si->_allow_exec);
>>>>>>>>>>
>>>>>>>>>> What happens when the first region succeeds to map but the 
>>>>>>>>>> second region fails to map? Will both regions be unmapped? I 
>>>>>>>>>> don't see where you store the return value (base) from 
>>>>>>>>>> os::map_memory(). Does it mean the code assumes that (addr == 
>>>>>>>>>> base). If so, we need an assert here.
>>>>>>>>>
>>>>>>>>> If any of the region fails to map, we bail out and call 
>>>>>>>>> dealloc_archive_heap_regions(), which handles the deallocation 
>>>>>>>>> of any regions specified. If second region fails to map, all 
>>>>>>>>> memory ranges specified by ‘regions’ array are deallocated. We 
>>>>>>>>> don’t unmap the memory here since it is part of the java heap. 
>>>>>>>>> Unmapping of heap memory are handled by GC code. The ‘if’ 
>>>>>>>>> check below makes sure base == addr.
>>>>>>>>>
>>>>>>>>>     if (base == NULL || base != addr) {
>>>>>>>>>       // dealloc the regions from java heap
>>>>>>>>> dealloc_archive_heap_regions(regions, region_num);
>>>>>>>>>   if (log_is_enabled(Info, cds)) {
>>>>>>>>> log_info(cds)("UseSharedSpaces: Unable to map at required 
>>>>>>>>> address in java heap.");
>>>>>>>>>       }
>>>>>>>>>   return false;
>>>>>>>>>     }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> constantPool.cpp
>>>>>>>>>>
>>>>>>>>>>     Handle refs_handle;
>>>>>>>>>>     ...
>>>>>>>>>> refs_handle = Handle(THREAD, (oop)archived);
>>>>>>>>>>
>>>>>>>>>> This will first create a NULL handle, then construct a 
>>>>>>>>>> temporary handle, and then assign the temp handle back to the 
>>>>>>>>>> null handle. This means two handles will be pushed onto 
>>>>>>>>>> THREAD->metadata_handles()
>>>>>>>>>>
>>>>>>>>>> I think it's more efficient if you merge these into a single 
>>>>>>>>>> statement
>>>>>>>>>>
>>>>>>>>>>     Handle refs_handle(THREAD, (oop)archived);
>>>>>>>>>
>>>>>>>>> Fixed.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Is this experimental code? Maybe it should be removed?
>>>>>>>>>>
>>>>>>>>>> 664     if (tag_at(index).is_unresolved_klass()) {
>>>>>>>>>> 665 #if 0
>>>>>>>>>> 666 CPSlot entry = cp->slot_at(index);
>>>>>>>>>> 667 Symbol* name = entry.get_symbol();
>>>>>>>>>> 668 Klass* k = SystemDictionary::find(name, NULL, NULL, THREAD);
>>>>>>>>>> 669       if (k != NULL) {
>>>>>>>>>> 670 klass_at_put(index, k);
>>>>>>>>>> 671       }
>>>>>>>>>> 672 #endif
>>>>>>>>>> 673     } else
>>>>>>>>>
>>>>>>>>> Removed.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> cpCache.hpp:
>>>>>>>>>>
>>>>>>>>>>     u8 _archived_references
>>>>>>>>>>
>>>>>>>>>> shouldn't this be declared as an narrowOop to avoid the type 
>>>>>>>>>> casts when it's used?
>>>>>>>>>
>>>>>>>>> Ok.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> cpCache.cpp:
>>>>>>>>>>
>>>>>>>>>>    add assert so that one of these is used only at dump time 
>>>>>>>>>> and the other only at run time?
>>>>>>>>>>
>>>>>>>>>> 610 oop ConstantPoolCache::archived_references() {
>>>>>>>>>> 611   return 
>>>>>>>>>> oopDesc::decode_heap_oop((narrowOop)_archived_references);
>>>>>>>>>> 612 }
>>>>>>>>>> 613
>>>>>>>>>> 614 void ConstantPoolCache::set_archived_references(oop o) {
>>>>>>>>>> 615 _archived_references = (u8)oopDesc::encode_heap_oop(o);
>>>>>>>>>> 616 }
>>>>>>>>>
>>>>>>>>> Ok.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Jiangli
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> - Ioi
>>>>>>>>>>
>>>>>>>>>> On 7/27/17 1:37 PM, Jiangli Zhou wrote:
>>>>>>>>>>> Sorry, the mail didn’t handle the rich text well. I fixed 
>>>>>>>>>>> the format below.
>>>>>>>>>>>
>>>>>>>>>>> Please help review the changes for JDK-8179302 (Pre-resolve 
>>>>>>>>>>> constant pool string entries and cache resolved_reference 
>>>>>>>>>>> arrays in CDS archive). Currently, the CDS archive can 
>>>>>>>>>>> contain cached class metadata, interned java.lang.String 
>>>>>>>>>>> objects. This RFE adds the constant pool 
>>>>>>>>>>> ‘resolved_references’ arrays (hotspot specific) to the 
>>>>>>>>>>> archive for startup/runtime performance enhancement. 
>>>>>>>>>>> The ‘resolved_references' arrays are used to hold references 
>>>>>>>>>>> of resolved constant pool entries including Strings, 
>>>>>>>>>>> mirrors, etc. With the 'resolved_references’ being cached, 
>>>>>>>>>>> string constants in shared classes can now be resolved to 
>>>>>>>>>>> existing interned java.lang.Strings at CDS dump time. G1 and 
>>>>>>>>>>> 64-bit platforms are required.
>>>>>>>>>>>
>>>>>>>>>>> The GC changes in the RFE were discussed and guided by 
>>>>>>>>>>> Thomas Schatzl and GC team. Part of the changes were 
>>>>>>>>>>> contributed by Thomas himself.
>>>>>>>>>>> RFE: https://bugs.openjdk.java.net/browse/JDK-8179302
>>>>>>>>>>> hotspot: 
>>>>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.hotspot.01/
>>>>>>>>>>> whitebox: 
>>>>>>>>>>> http://cr.openjdk.java.net/~jiangli/8179302/webrev.whitebox.01/
>>>>>>>>>>>
>>>>>>>>>>> Please see below for details of supporting cached 
>>>>>>>>>>> ‘resolved_references’ and pre-resolving string constants.
>>>>>>>>>>>
>>>>>>>>>>> Types of Pinned G1 Heap Regions
>>>>>>>>>>>
>>>>>>>>>>> The pinned region type is a super type of all archive region 
>>>>>>>>>>> types, which include the open archive type and the closed 
>>>>>>>>>>> archive type.
>>>>>>>>>>>
>>>>>>>>>>> 00100 0 [ 8] Pinned Mask
>>>>>>>>>>> 01000 0 [16] Old Mask
>>>>>>>>>>> 10000 0 [32] Archive Mask
>>>>>>>>>>> 11100 0 [56] Open Archive:   ArchiveMask | PinnedMask | OldMask
>>>>>>>>>>> 11100 1 [57] Closed Archive: ArchiveMask | PinnedMask | 
>>>>>>>>>>> OldMask + 1
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Pinned Regions
>>>>>>>>>>>
>>>>>>>>>>> Objects within the region are 'pinned', which means GC does 
>>>>>>>>>>> not move any live objects. GC scans and marks objects in the 
>>>>>>>>>>> pinned region as normal, but skips forwarding live objects. 
>>>>>>>>>>> Pointers in live objects are updated. Dead objects 
>>>>>>>>>>> (unreachable) can be collected and freed.
>>>>>>>>>>>
>>>>>>>>>>> Archive Regions
>>>>>>>>>>>
>>>>>>>>>>> The archive types are sub-types of 'pinned'. There are two 
>>>>>>>>>>> types of archive region currently, open archive and closed 
>>>>>>>>>>> archive. Both can support caching java heap objects via the 
>>>>>>>>>>> CDS archive.
>>>>>>>>>>>
>>>>>>>>>>> An archive region is also an old region by design.
>>>>>>>>>>>
>>>>>>>>>>> Open Archive (GC-RW) Regions
>>>>>>>>>>>
>>>>>>>>>>> Open archive region is GC writable. GC scans & marks objects 
>>>>>>>>>>> within the region and adjusts (updates) pointers in live 
>>>>>>>>>>> objects the same way as a pinned region. Live objects 
>>>>>>>>>>> (reachable) are pinned and not forwarded by GC.
>>>>>>>>>>> Open archive region does not have 'dead' objects. 
>>>>>>>>>>> Unreachable objects are 'dormant' objects. Dormant objects 
>>>>>>>>>>> are not collected and freed by GC.
>>>>>>>>>>>
>>>>>>>>>>> Adjustable Outgoing Pointers
>>>>>>>>>>>
>>>>>>>>>>> As GC can adjust pointers within the live objects in open 
>>>>>>>>>>> archive heap region, objects can have outgoing pointers to 
>>>>>>>>>>> another java heap region, including closed archive region, 
>>>>>>>>>>> open archive region, pinned (or humongous) region, and 
>>>>>>>>>>> normal generational region. When a referenced object is 
>>>>>>>>>>> moved by GC, the pointer within the open archive region is 
>>>>>>>>>>> updated accordingly.
>>>>>>>>>>>
>>>>>>>>>>> Closed Archive (GC-RO) Regions
>>>>>>>>>>>
>>>>>>>>>>> The closed archive region is GC read-only region. GC cannot 
>>>>>>>>>>> write into the region. Objects are not scanned and marked by 
>>>>>>>>>>> GC. Objects are pinned and not forwarded. Pointers are not 
>>>>>>>>>>> updated by GC either. Hence, objects within the archive 
>>>>>>>>>>> region cannot have any outgoing pointers to another java 
>>>>>>>>>>> heap region. Objects however can still have pointers to 
>>>>>>>>>>> other objects within the closed archive regions (we might 
>>>>>>>>>>> allow pointers to open archive regions in the future). That 
>>>>>>>>>>> restricts the type of java objects that can be supported by 
>>>>>>>>>>> the archive region.
>>>>>>>>>>> In JDK 9 we support archive Strings with the archive regions.
>>>>>>>>>>>
>>>>>>>>>>> The GC-readonly archive region makes java heap memory 
>>>>>>>>>>> sharable among different JVM processes. NOTE: 
>>>>>>>>>>> synchronization on the objects within the archive heap 
>>>>>>>>>>> region can still cause writes to the memory page.
>>>>>>>>>>>
>>>>>>>>>>> Dormant Objects
>>>>>>>>>>>
>>>>>>>>>>> Dormant objects are unreachable java objects within the open 
>>>>>>>>>>> archive heap region.
>>>>>>>>>>> A java object in the open archive heap region is a live 
>>>>>>>>>>> object if it can be reached during scanning. Some of the 
>>>>>>>>>>> java objects in the region may not be reachable during 
>>>>>>>>>>> scanning. Those objects are considered as dormant, but not 
>>>>>>>>>>> dead. For example, a constant pool 'resolved_references' 
>>>>>>>>>>> array is reachable via the klass root if its container klass 
>>>>>>>>>>> (shared) is already loaded at the time during GC scanning. 
>>>>>>>>>>> If a shared klass is not yet loaded, the klass root is not 
>>>>>>>>>>> scanned and it's constant pool 'resolved_reference' array 
>>>>>>>>>>> (A) in the open archive region is not reachable. Then A is a 
>>>>>>>>>>> dormant object.
>>>>>>>>>>>
>>>>>>>>>>> Object State Transition
>>>>>>>>>>>
>>>>>>>>>>> All java objects are initially dormant objects when open 
>>>>>>>>>>> archive heap regions are mapped to the runtime java heap. A 
>>>>>>>>>>> dormant object becomes live object when the associated 
>>>>>>>>>>> shared class is loaded at runtime. Explicit call 
>>>>>>>>>>> to G1SATBCardTableModRefBS::enqueue() needs to be made when 
>>>>>>>>>>> a dormant object becomes live. That should be the case 
>>>>>>>>>>> for cached objects with strong roots as well, since strong 
>>>>>>>>>>> roots are only scanned at the start of GC marking (the 
>>>>>>>>>>> initial marking) but not during Remarking/Final marking. If 
>>>>>>>>>>> a cached object becomes live during concurrent marking 
>>>>>>>>>>> phase, G1 may not find it and mark it live unless a call to 
>>>>>>>>>>> G1SATBCardTableModRefBS::enqueue() is made for the object.
>>>>>>>>>>>
>>>>>>>>>>> Currently, a live object in the open archive heap region 
>>>>>>>>>>> cannot become dormant again. This restriction simplifies GC 
>>>>>>>>>>> requirement and guarantees all outgoing pointers are updated 
>>>>>>>>>>> by GC correctly. Only objects for shared classes from the 
>>>>>>>>>>> builtin class loaders (boot, PlatformClassLoaders, and 
>>>>>>>>>>> AppClassLoaders) are supported for caching.
>>>>>>>>>>>
>>>>>>>>>>> Caching Java Objects at Archive Dump Time
>>>>>>>>>>>
>>>>>>>>>>> The closed archive and open archive regions are allocated 
>>>>>>>>>>> near the top of the dump time java heap. Archived java 
>>>>>>>>>>> objects are copied into the designated archive heap regions. 
>>>>>>>>>>> For example, String objects and the underlying 'value' 
>>>>>>>>>>> arrays are copied into the closed archive regions. All 
>>>>>>>>>>> references to the archived objects (from shared class 
>>>>>>>>>>> metadata, string table, etc) are set to the new heap 
>>>>>>>>>>> locations. A hash table is used to keep track of all 
>>>>>>>>>>> archived java objects during the copying process to make 
>>>>>>>>>>> sure java object is not archived more than once if reached 
>>>>>>>>>>> from different roots. It also makes sure references to the 
>>>>>>>>>>> same archived object are updated using the same new address 
>>>>>>>>>>> location.
>>>>>>>>>>>
>>>>>>>>>>> Caching Constant Pool resolved_references Array
>>>>>>>>>>>
>>>>>>>>>>> The 'resolved_references' is an array that holds references 
>>>>>>>>>>> of resolved constant pool entries including Strings, mirrors 
>>>>>>>>>>> and methodTypes, etc. Each loaded class has one 
>>>>>>>>>>> 'resolved_references' array (in ConstantPoolCache). The 
>>>>>>>>>>> 'resolved_references' arrays are copied into the open 
>>>>>>>>>>> archive regions during dump process. Prior to copying the 
>>>>>>>>>>> 'resolved_references' arrays, JVM iterates through constant 
>>>>>>>>>>> pool entries and resolves all JVM_CONSTANT_String entries to 
>>>>>>>>>>> existing interned Strings for all archived classes. When 
>>>>>>>>>>> resolving, JVM only looks up the string table and finds 
>>>>>>>>>>> existing interned Strings without inserting new ones. If 
>>>>>>>>>>> a string entry cannot be resolved to an existing interned 
>>>>>>>>>>> String, the constant pool entry remain as unresolved. That 
>>>>>>>>>>> prevents memory waste if a constant pool string entry is 
>>>>>>>>>>> never used at runtime.
>>>>>>>>>>>
>>>>>>>>>>> All String objects referenced by the string table are copied 
>>>>>>>>>>> first into the closed archive regions. The string table 
>>>>>>>>>>> entry is updated with the new location when each String 
>>>>>>>>>>> object is archived. The JVM updates the resolved constant 
>>>>>>>>>>> pool string entries with the new object locations when 
>>>>>>>>>>> copying the 'resolved_references' arrays to the open archive 
>>>>>>>>>>> regions. References to the 'resolved_references' arrays in 
>>>>>>>>>>> the ConstantPoolCache are also updated.
>>>>>>>>>>> At runtime as part of 
>>>>>>>>>>> ConstantPool::restore_unshareable_info() work, call 
>>>>>>>>>>> G1SATBCardTableModRefBS::enqueue() to let GC know the 
>>>>>>>>>>> 'resolved_references' is becoming live. A handle is created 
>>>>>>>>>>> for the cached object and added to the loader_data's handles.
>>>>>>>>>>>
>>>>>>>>>>> Runtime Java Heap With Cached Java Objects
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The closed archive regions (the string regions) and open 
>>>>>>>>>>> archive regions are mapped to the runtime java heap at the 
>>>>>>>>>>> same offsets as the dump time offsets from the runtime java 
>>>>>>>>>>> heap base.
>>>>>>>>>>>
>>>>>>>>>>> Preliminary test execution and status:
>>>>>>>>>>>
>>>>>>>>>>> JPRT: passed
>>>>>>>>>>> Tier2-rt: passed
>>>>>>>>>>> Tier2-gc: passed
>>>>>>>>>>> Tier2-comp: passed
>>>>>>>>>>> Tier3-rt: passed
>>>>>>>>>>> Tier3-gc: passed
>>>>>>>>>>> Tier3-comp: passed
>>>>>>>>>>> Tier4-rt: passed
>>>>>>>>>>> Tier4-gc: passed
>>>>>>>>>>> Tier4-comp:6 jobs timed out, all other tests passed
>>>>>>>>>>> Tier5-rt: one test failed but passed when running locally, 
>>>>>>>>>>> all other tests passed
>>>>>>>>>>> Tier5-gc: passed
>>>>>>>>>>> Tier5-comp: running
>>>>>>>>>>> hotspot_gc: two jobs timed out, all other tests passed
>>>>>>>>>>> hotspot_gc in CDS mode: two jobs timed out, all other tests 
>>>>>>>>>>> passed
>>>>>>>>>>> vm.gc: passed
>>>>>>>>>>> vm.gc in CDS mode: passed
>>>>>>>>>>> Kichensink: passed
>>>>>>>>>>> Kichensink in CDS mode: passed
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jiangli
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20170811/1861a345/attachment.htm>


More information about the hotspot-gc-dev mailing list