RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is incomplete

Tue Sep 11 17:45:21 UTC 2018


On 9/11/18 10:28 AM, Calvin Cheung wrote:
> Hi Ioi,
>
> Just couple of minor comments.
>
> 271 void HeapShared::initialize_from_archived_subgraph(Klass* k) {
>
> I think the above function is being called during system module 
> initialization?
> If so, maybe you can add an assert like the following?
>     assert(!Universe::is_module_initialized(), "should not be called 
> after system module initialization");
>
Actually there's no such requirement. With Lambda CDS support 
(JDK-8198698) this function will be called when a class is loaded at run 
time, after system module initialization.

> Could you also breakup line #679 in the same file?
>
Fixed.
> I don't need to see another webrev for the above changes.
>

Thanks
- Ioi

> thanks,
> Calvin
>
> On 9/10/18, 8:17 PM, Ioi Lam wrote:
>> Hi Jiangli,
>>
>> Thank you for being thorough with the review. I think it really makes 
>> the code much better than I originally had :-)
>>
>> - Ioi
>>
>> On 9/10/18 6:35 PM, Jiangli Zhou wrote:
>>>
>>> On 9/10/18 4:10 PM, Ioi Lam wrote:
>>>
>>>> On 9/10/2018 1:00 PM, Jiangli Zhou wrote:
>>>>
>>>>> Hi Ioi,
>>>>>
>>>>> Thanks for another updated webrev. Please see comments (minor) below.
>>>>>
>>>>>>> *From:* Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>>
>>>>>>> *Date:* September 9, 2018 at 12:11:42 PM PDT
>>>>>>> *To:* Jiangli Zhou <jiangli.zhou at oracle.com 
>>>>>>> <mailto:jiangli.zhou at oracle.com>>, 
>>>>>>> hotspot-runtime-dev at openjdk.java.net 
>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>>> *Subject:* *Re: RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord 
>>>>>>> is incomplete*
>>>>>>>
>>>>>>> Hi Jiangli,
>>>>>>>
>>>>>>> Thanks again for the review. I've updated the patch:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05/ 
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05-delta/ 
>>>>>>>
>>>>>>>
>>>>>>> More comments inline
>>>>>>>
>>>>>>> On 9/8/18 2:37 PM, Jiangli Zhou wrote:
>>>>>>>> Hi Ioi,
>>>>>>>>
>>>>>>>> The updated webrev looks great. Here are some additional comments.
>>>>>>>>
>>>>>>>> - Thanks for answering my question about the two loops. If we 
>>>>>>>> want to solve the klass list issue as a general problem, we 
>>>>>>>> also need to allow the cases where the entry fields from the 
>>>>>>>> same class are not specified consecutively. The 'break' in the 
>>>>>>>> inner loop should be a 'continue'. Currently, all the entry 
>>>>>>>> fields from a single class are specified together as we are 
>>>>>>>> doing it manually. When we enhance the mechanism by adding the 
>>>>>>>> other half API that enables registering the entry fields from 
>>>>>>>> java code automatically, we might see fields from different 
>>>>>>>> classes are in mixed orders.
>>>>>>>>
>>>>>>>>  657     while (j < num_archivable_static_fields) {
>>>>>>>>  658       ArchivableStaticFieldInfo* f = 
>>>>>>>> &archivable_static_fields[j];
>>>>>>>>  659       if (f->klass_name == klass_name) {
>>>>>>>>  660 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>>> f->klass_name,
>>>>>>>>  661                                                     
>>>>>>>> f->offset, f->field_name, CHECK);
>>>>>>>>  662         j++;
>>>>>>>>  663       } else {
>>>>>>>>  664         break; <<<<<<<
>>>>>>>>  665       }
>>>>>>>>  666     }
>>>>>>>>
>>>>>>>
>>>>>>> It's actually OK now to not order the archivable_static_fields[] 
>>>>>>> array by class name. The inner loop is just for optimization. 
>>>>>>> I've updated the code and added comments to make this clear.
>>>>>>>
>>>>>>> I tested by reordering the archivable_static_fields[] array 
>>>>>>> (move the "archivedModuleFinder" line to the end of the table). 
>>>>>>> It shows:
>>>>>>>
>>>>>>>     Performed subgraph records = 7 times
>>>>>>>     Walked 3610 objects
>>>>>>>     Archived 2127 objects
>>>>>>>     Recorded 33 klasses
>>>>>>>
>>>>>>> With the original (sorted) table, it's faster with fewer walks, 
>>>>>>> but that doesn't change the number of archived objects or 
>>>>>>> recorded klasses:
>>>>>>>
>>>>>>>     Performed subgraph records = 6 times
>>>>>>>     Walked 2141 objects
>>>>>>>     Archived 2127 objects
>>>>>>>     Recorded 33 klasses
>>>>>>>
>>>>> The seen_objects_table resets when you see a klass_name that's 
>>>>> different. When the entry fields from different classes are mixed, 
>>>>> you might record an object klass that's already seen early for the 
>>>>> same containing class. That is not incorrect, but adds overhead 
>>>>> (although very small) to runtime. For following entry fields 
>>>>> sequence:
>>>>>
>>>>> A.f1
>>>>> A.f2
>>>>> B.f1
>>>>> A.f3
>>>>>
>>>>> For example, K1 is recorded for A.f1, A.f2 subgraphs. When 
>>>>> archiving A.f3 subgraph, K1 is encountered, K1 would be recorded 
>>>>> again for A.f3 because a different seen_objects_table is used. We 
>>>>> can fix the logic here or add a post process to eliminate any 
>>>>> duplication in the klass lists belonging to the same containing 
>>>>> klass. The post process seems to be an overkill.
>>>>
>>>> There's no duplication in the _subgraph_object_klasses list because 
>>>> we add it to only when needed:
>>>>
>>>> _subgraph_object_klasses->append_if_missing(relocated_k);
>>>>
>>>> So there's no runtime impact. (You can also see that with the same 
>>>> "Recorded 33 klasses" from the log above.)
>>> This indeed has no issue. I forgot that I built a single 
>>> subgraph_info_record for each containing klass, but not for each 
>>> entry field. :(
>>>
>>> One containing klass may have multiple subgraphs referenced from 
>>> multiple entry points. Since there is no separate info_record for 
>>> each entry field from the same containing klass, then there is no 
>>> duplicates for the recorded object_klass. Sorry for the confusion! 
>>> Thank you being patient with me.
>>>
>>> Thanks,
>>> Jiangli
>>>>
>>>>
>>>>>>>
>>>>>>>> - In order to make sure subgraphs are archived correctly, we 
>>>>>>>> need to do proper state propagation and checks/asserts, 
>>>>>>>> including the record_klasses_only state. When we see 
>>>>>>>> record_klasses_only becomes true, we need to verify all 
>>>>>>>> reachable object from the current one must have existing 
>>>>>>>> archived copies already. That helps to confirm that an earlier 
>>>>>>>> archived subgraph is complete, especially if the current walk 
>>>>>>>> enters an sub-subgraph from a different node. For your current 
>>>>>>>> scope of changes, resetting the record_klasses_only state based 
>>>>>>>> on if there is an existing archived copy is ok. We can enhance 
>>>>>>>> it separately.
>>>>>>>>
>>>>>>>
>>>>>>> I added a separate pass to verify these -- for each root 
>>>>>>> (orig_obj) of an archived subgraph:
>>>>>>>
>>>>>>>   // (1) Verify that all objects reachable from orig_obj are 
>>>>>>> archived.
>>>>>>>   // (2) Verify that all objects reachable from the archived 
>>>>>>> copy of orig_obj
>>>>>>>   //     can only point to archived objects.
>>>>>>>
>>>>>>> I think that's sufficient to ensure that you have correctly 
>>>>>>> archived everything you need.
>>>>>
>>>>> The verify code is good to have. (2) is not necessary as that's 
>>>>> already done by the archiving verification code in GC. There is 
>>>>> more fine grained region verification implemented for archived 
>>>>> objects in G1 and (2) is already done for all objects in open 
>>>>> archive heap regions at dump time.
>>>>>
>>>>
>>>> Thanks for letting me know. I've replaced (2) with this comment.
>>>>
>>>>   // Note: we could also verify that all objects reachable from the 
>>>> archived
>>>>   // copy of orig_obj can only point to archived objects, with:
>>>>   //      init_seen_objects_table();
>>>>   //      verify_reachable_objects_from(archived_obj, true);
>>>>   //      init_seen_objects_table();
>>>>   // but that's already done in 
>>>> G1HeapVerifier::verify_archive_regions so we
>>>>   // won't do it here.
>>>>
>>>>
>>>>
>>>>>>>
>>>>>>>> Please also test all tier1-tier6 for your changes with the 
>>>>>>>> default CDS archive patch.
>>>>>>>>
>>>>>>> I just submitted a job for hs-tier1~6 using your latest patch
>>>>>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-September/023146.html 
>>>>>>>
>>>>>
>>>>
>>>> My test jobs passed with no new failures, so I'll push now.
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>>> Thanks,
>>>>> Jiangli
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>>> On 9/7/18 5:58 PM, Ioi Lam wrote:
>>>>>>>>>
>>>>>>>>> Hi Jiangli,
>>>>>>>>>
>>>>>>>>> Thanks for the review!
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/ 
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04-delta/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/6/2018 4:28 PM, Jiangli Zhou wrote:
>>>>>>>>>> Hi Ioi,
>>>>>>>>>>
>>>>>>>>>> Sorry for the delay. It took me a while to get back to this. 
>>>>>>>>>> Thanks for folding the closures together. It cuts the cost by 
>>>>>>>>>> almost half. Here are my detailed comments.
>>>>>>>>>>
>>>>>>>>>> - _subgraph_object_klasses renaming
>>>>>>>>>>
>>>>>>>>>> The reason that I choose _subgraph_object_klasses was to 
>>>>>>>>>> disambiguate between the klass containing a subgraph and the 
>>>>>>>>>> klass of an object included in a subgraph. Is there a 
>>>>>>>>>> specific reason to rename the term to _subgraph_klasses?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I just wanted to shorten the names to be the same as 
>>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses, but I agree 
>>>>>>>>> with you for the naming.
>>>>>>>>>
>>>>>>>>> So instead, I've renamed 
>>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses -> 
>>>>>>>>> subgraph_object_klasses to make the code consistent. What do 
>>>>>>>>> you think?
>>>>>>>>>
>>>>>>>>>> - Could you please align the lines at 358 - 360.
>>>>>>>>>>
>>>>>>>>>>  355   WalkOopAndArchiveClosure(int level, bool 
>>>>>>>>>> record_klasses_only,
>>>>>>>>>>  356 KlassSubGraphInfo* subgraph_info,
>>>>>>>>>>  357                            oop orig, oop archived, TRAPS) :
>>>>>>>>>>  358     _level(level), 
>>>>>>>>>> _record_klasses_only(record_klasses_only),
>>>>>>>>>>  359     _subgraph_info(subgraph_info),
>>>>>>>>>>  360     _orig_referencing_obj(orig), 
>>>>>>>>>> _archived_referencing_obj(archived),
>>>>>>>>>>  361     _thread(THREAD) {}
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> The 2 space indent is for the C++ field initializer list. 
>>>>>>>>> HotSpot style guide doesn't mention this 
>>>>>>>>> (https://wiki.openjdk.java.net/display/HotSpot/StyleGuide) so 
>>>>>>>>> I just used emacs's default indentation.
>>>>>>>>>
>>>>>>>>>> - (2) should be only applied to object that's not seen within 
>>>>>>>>>> the current subgraph. Please clarify it in the comment.
>>>>>>>>>>
>>>>>>>>>>  401 // (2) If orig_obj has not been seen yet, trace all
>>>>>>>>>>  402 //     objects that are reachable from it, and make sure 
>>>>>>>>>> these objects are archived.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I changed it to
>>>>>>>>>
>>>>>>>>> // (2) If orig_obj has not been seen yet (since 
>>>>>>>>> start_recording_subgraph() was called),
>>>>>>>>> //     trace all  objects that are reachable from it, and make 
>>>>>>>>> sure these objects are archived.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> - Some Strings (such as "") are not archived in the closed 
>>>>>>>>>> archive heap region. In that case, the String object is 
>>>>>>>>>> automatically archived in the open archive heap region. The 
>>>>>>>>>> following comment should be clarified and fixed.
>>>>>>>>>>
>>>>>>>>>>  417     // Strings are already archived in closed archive. 
>>>>>>>>>> No need to walk it
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I changed to
>>>>>>>>>
>>>>>>>>>   if (java_lang_String::is_instance(orig_obj) && archived_obj 
>>>>>>>>> != NULL) {
>>>>>>>>>     // To save time, don't walk strings that are already 
>>>>>>>>> archived. They just contain
>>>>>>>>>     // pointers to a type array, whose klass doesn't need to 
>>>>>>>>> be recorded.
>>>>>>>>>     return archived_obj;
>>>>>>>>>   }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> - Could you please explain the reason for using two loops 
>>>>>>>>>> here? Shouldn't one loop be sufficient?
>>>>>>>>>>
>>>>>>>>>>  646   for (int i = 0; i < num_archivable_static_fields; ) {
>>>>>>>>>>
>>>>>>>>>>  650
>>>>>>>>>>  651     int j = i;
>>>>>>>>>>  652     while (j < num_archivable_static_fields) {
>>>>>>>>>>  653       ArchivableStaticFieldInfo* f = 
>>>>>>>>>> &archivable_static_fields[j];
>>>>>>>>>>  654       if (f->klass_name == klass_name) {
>>>>>>>>>>  655 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>>>>> f->klass_name,
>>>>>>>>>>  656 f->offset, f->field_name, CHECK);
>>>>>>>>>>  657         j++;
>>>>>>>>>>  658       } else {
>>>>>>>>>>  659         break;
>>>>>>>>>>  660       }
>>>>>>>>>>  661     }
>>>>>>>>>> ..
>>>>>>>>>>  663     i = j;
>>>>>>>>>>  664   }
>>>>>>>>>>  665
>>>>>>>>>
>>>>>>>>> That's because you can archive multiple fields in the same klass:
>>>>>>>>>
>>>>>>>>> // If you add new entries to this table, you should know what 
>>>>>>>>> you're doing!
>>>>>>>>> static ArchivableStaticFieldInfo archivable_static_fields[] = {
>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>> "archivedSystemModules"},
>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>> "archivedModuleFinder"},
>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>> "archivedMainModule"},
>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>> "archivedConfiguration"},
>>>>>>>>>   {"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
>>>>>>>>>
>>>>>>>>> Note that start/done_recording_subgraph are called in the i 
>>>>>>>>> loop and not the j loop.
>>>>>>>>>
>>>>>>>>>> - HeapShared::archive_reachable_objects_from() should take a 
>>>>>>>>>> '_record_klasses_only' argument from caller, which decides if 
>>>>>>>>>> it only needs to record the klass list.
>>>>>>>>>>
>>>>>>>>>> As we walk all reachable objects from an entry point (a 
>>>>>>>>>> static field) during walk&archive process, at the end of it 
>>>>>>>>>> we obtain a complete and closed archived subgraph. Every 
>>>>>>>>>> object is also the root of a sub-subgraph that's reachable 
>>>>>>>>>> from it. In a subsequent subgraph archiving process from a 
>>>>>>>>>> different entry point, if we encounter an object that is 
>>>>>>>>>> already archived, then every objects contained within the 
>>>>>>>>>> sub-subgraph reachable from that root are already archived 
>>>>>>>>>> also. So, '_record_klasses_only' should be a inherited flag 
>>>>>>>>>> from the root and passed to all 
>>>>>>>>>> HeapShared::archive_reachable_objects_from() calls for 
>>>>>>>>>> processing objects within a sub-subgraph.
>>>>>>>>>>
>>>>>>>>>> During the walk, '_record_klasses_only' should be false 
>>>>>>>>>> initially. Once it sees an object that already has an 
>>>>>>>>>> archived copy (but not seen in the current subgraph), the 
>>>>>>>>>> '_record_klasses_only' needs to become true and be propagated 
>>>>>>>>>> to all sub-walking processes. '_record_klasses_only' should 
>>>>>>>>>> never go back to false state.
>>>>>>>>>>
>>>>>>>>>> With that, we can guarantee to build a correct archived 
>>>>>>>>>> subgraph without the risk of infinite cycles. It also reduces 
>>>>>>>>>> overhead for the walk&archive process.
>>>>>>>>>
>>>>>>>>> I actually started to write to code exactly as you suggested, 
>>>>>>>>> by passing record_klasses_only down, and then I found out that 
>>>>>>>>> it's unnecessary, and having it will require me to add 
>>>>>>>>> complicated asserts that was beyond my ability to get right.
>>>>>>>>>
>>>>>>>>> As I understand, you worry about about correctness 
>>>>>>>>> (termination of the algorithm), and performance.
>>>>>>>>>
>>>>>>>>> I believe the logic of the current code is correct, and the 
>>>>>>>>> algorithm will terminate:
>>>>>>>>>
>>>>>>>>>       oop archived_obj =
>>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>>>       ....
>>>>>>>>>       if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>>>         // orig_obj has already been archived and traced.
>>>>>>>>>     Nothing more to do.
>>>>>>>>>         return archived_obj;
>>>>>>>>>       }
>>>>>>>>>       ...
>>>>>>>>>      WalkOopAndArchiveClosure walker(level, record_klasses_only,
>>>>>>>>>     subgraph_info, orig_obj, archived_obj, THREAD);
>>>>>>>>>      orig_obj->oop_iterate(&walker); // recurse
>>>>>>>>>
>>>>>>>>> HeapShared::archive_reachable_objects_from will recurse ONLY 
>>>>>>>>> if you call this function with an object that has not need 
>>>>>>>>> seen, so eventually the recursion will stop because all 
>>>>>>>>> objects in the subgraph has been seen.
>>>>>>>>>
>>>>>>>>> To make the code easier to understand, I rewrote it like this:
>>>>>>>>>
>>>>>>>>>   if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>>>     // orig_obj has already been archived and traced. Nothing 
>>>>>>>>> more to do.
>>>>>>>>>     return archived_obj;
>>>>>>>>>   } else {
>>>>>>>>> set_has_been_seen_during_subgraph_recording(orig_obj);
>>>>>>>>>    }
>>>>>>>>>
>>>>>>>>> For performance, the logic for deciding record_klasses_only is 
>>>>>>>>> this:
>>>>>>>>>
>>>>>>>>>   oop archived_obj = 
>>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>>>    ...
>>>>>>>>>   bool record_klasses_only = (archived_obj != NULL);
>>>>>>>>>
>>>>>>>>> You have to call find_archived_heap_object anyway, so that's 
>>>>>>>>> unavoidable. The cost of computing record_klasses_only is very 
>>>>>>>>> simple. For performance, there's no need for the it to be 
>>>>>>>>> tracked in the recursive calls. That makes the code simpler, 
>>>>>>>>> and easier to maintain.
>>>>>>>>>
>>>>>>>>> Otherwise we may need to think about whether the caller and 
>>>>>>>>> the callee have different opinion on record_klasses_only:
>>>>>>>>>
>>>>>>>>>   - if this could happen, what would cause this to happen, and 
>>>>>>>>> how should we handle it?
>>>>>>>>>   - if this could never happen, how would you write an 
>>>>>>>>> assertion to validate that?
>>>>>>>>>
>>>>>>>>> Writing those asserts turned out to be too complicated.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> - Could add explanation in the comment?
>>>>>>>>>>
>>>>>>>>>> 438         // See 
>>>>>>>>>> runtime/appcds/cacheObject/ArchivedIntegerCacheTest.java
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I changed it to
>>>>>>>>>
>>>>>>>>>     if (level == 1) {
>>>>>>>>>         // Don't archive a subgraph root that's too big. For 
>>>>>>>>> archives static fields, that's OK
>>>>>>>>>         // as the Java code will take care of initializing 
>>>>>>>>> this field dynamically.
>>>>>>>>>         return NULL;
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>>> - Please add the step 5) comment back to the walk&archive 
>>>>>>>>>> description.
>>>>>>>>>>
>>>>>>>>>>  466 // 5) The Klass of the current java object is added to 
>>>>>>>>>> the list of Klasses
>>>>>>>>>>  467 // for loading and initialzing before any object in the 
>>>>>>>>>> archived graph can
>>>>>>>>>>  468 // be accessed at runtime.
>>>>>>>>>>  469 //
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Done. Sorry I removed it by mistake.
>>>>>>>>>
>>>>>>>>>> - The HeapShared::archive_module_graph_objects() can take a 
>>>>>>>>>> more general name now, if you want to do it as part of your 
>>>>>>>>>> change.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I renamed it to archive_static_fields(). I also removed/fixed 
>>>>>>>>> a few comments about "Currently there is only one class mirror 
>>>>>>>>> (ArchivedModuleGraph) with archived sub-graphs."
>>>>>>>>>
>>>>>>>>>> Could you please also send a sample of the changed cds+heap 
>>>>>>>>>> debug log output?
>>>>>>>>>>
>>>>>>>>> See 
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/logs/
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> - Ioi
>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>>
>>>>>>>>>> Jiangli
>>>>>>>>>>
>>>>>>>>>> On 9/5/18 3:56 PM, Ioi Lam wrote:
>>>>>>>>>>> I updated the patch:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03-delta/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 1. No need to walk String objects during subgraph recording.
>>>>>>>>>>>
>>>>>>>>>>> 2. Print put statistics:
>>>>>>>>>>>
>>>>>>>>>>> =========================
>>>>>>>>>>>
>>>>>>>>>>> jdk/internal/module/ArchivedModuleGraph: walked 1871 objs, 
>>>>>>>>>>> archived 1871 new objs, recorded 28 classes
>>>>>>>>>>> java/util/ImmutableCollections$ListN: walked 2 objs, 
>>>>>>>>>>> archived 0 new objs, recorded 0 classes
>>>>>>>>>>> java/util/ImmutableCollections$MapN: walked 2 objs, archived 
>>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>>> java/util/ImmutableCollections$SetN: walked 2 objs, archived 
>>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>>> java/lang/Integer$IntegerCache: walked 257 objs, archived 
>>>>>>>>>>> 256 new objs, recorded 2 classes
>>>>>>>>>>> java/lang/module/Configuration: walked 7 objs, archived 0 
>>>>>>>>>>> new objs, recorded 3 classes
>>>>>>>>>>>
>>>>>>>>>>> Performed subgraph records = 6 times
>>>>>>>>>>> Walked 2141 objects
>>>>>>>>>>> Archived 2127 objects
>>>>>>>>>>> Recorded 33 klasses
>>>>>>>>>>>
>>>>>>>>>>> ========
>>>>>>>>>>>
>>>>>>>>>>> So, at least for now, we don't need to worry about the 
>>>>>>>>>>> performance of WalkOopAndArchiveClosure.
>>>>>>>>>>>
>>>>>>>>>>> Once I get some statistics of Lambda archiving, I'll post 
>>>>>>>>>>> them as well.
>>>>>>>>>>>
>>>>>>>>>>> So, with the bug fix, the subgraph recording should be both 
>>>>>>>>>>> correct and faster than before (no more String walking).
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> - Ioi
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9/3/2018 1:02 AM, Ioi Lam wrote:
>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>
>>>>>>>>>>>> I think your suggestion of combining of combining 
>>>>>>>>>>>> RecordKlassesClosure and WalkOopAndArchiveClosure is a good 
>>>>>>>>>>>> idea. I have updated the webrev:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v02/ 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This also supersedes my other patch for JDK-8210295 
>>>>>>>>>>>> Refactor 
>>>>>>>>>>>> HeapShared::archive_reachable_objects_from_static_field
>>>>>>>>>>>>
>>>>>>>>>>>> More comments below on future optimizations.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/2/18 6:07 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 9/1/18 8:58 PM, Ioi Lam wrote:
>>>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for looking at the optimization of this problem. I 
>>>>>>>>>>>>>> think your algorithm will run into problems if (a) there 
>>>>>>>>>>>>>> is a cycle, and (b) two sub-graphs point to different 
>>>>>>>>>>>>>> members in the cycle.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think such issue can be solved by tracking the original 
>>>>>>>>>>>>> K (in the K-list, short for the class-initialization-list) 
>>>>>>>>>>>>> for each archived object. If an encountered object is 
>>>>>>>>>>>>> already archived (within the current sub-graph) and is 
>>>>>>>>>>>>> associated with K1 within the current K-list (must be in 
>>>>>>>>>>>>> this case). We can update the sub-K-list between K1 and 
>>>>>>>>>>>>> the K associated with the referencing object of the 
>>>>>>>>>>>>> current object, so their level are the same. We can also 
>>>>>>>>>>>>> add a flag to the K-list elements to tag each K within the 
>>>>>>>>>>>>> cycle. It may not be what we need to do currently.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you please describe the specific class that you ran 
>>>>>>>>>>>>> into with the bug in your sub-graph?
>>>>>>>>>>>>
>>>>>>>>>>>> I found the bug by code examination. It's also during this 
>>>>>>>>>>>> time that I realized that it's not a simple problem to 
>>>>>>>>>>>> optimize, because of cycles in the graph. There's been tons 
>>>>>>>>>>>> of established research in this area (as in "Tarjan's 
>>>>>>>>>>>> algorithm is one of Donald Knuth's favorite 
>>>>>>>>>>>> implementations"),  so I don't think we should reinvent the 
>>>>>>>>>>>> wheel. For future optimization, let's just apply an 
>>>>>>>>>>>> existing algorithm.
>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe someone familiar with graph theory can suggest a 
>>>>>>>>>>>>>> solution?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My current thinking is to first find all the cycles in 
>>>>>>>>>>>>>> the archived heap. The reachability of every member in 
>>>>>>>>>>>>>> the cycle is the same, so we can collapse each cycle into 
>>>>>>>>>>>>>> a single node. After that we have a DAG. Then, instead of 
>>>>>>>>>>>>>> encoding the class initialization order as multiple 
>>>>>>>>>>>>>> lists, we can encode it as a DAG to save space.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> See 
>>>>>>>>>>>>>> https://stackoverflow.com/questions/261573/best-algorithm-for-detecting-cycles-in-a-directed-graph
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In any case, I think this is a sufficiently difficult 
>>>>>>>>>>>>>> problem, so we should separate the dumping of the 
>>>>>>>>>>>>>> sub-graphs (the first part of 
>>>>>>>>>>>>>> HeapShared::archive_module_graph_objects() in my patch) 
>>>>>>>>>>>>>> and the computation of the initialization order 
>>>>>>>>>>>>>> (HeapShared::record_subgraph_klasses()).
>>>>>>>>>>>>> Our current K-lists are quite small. If the number of 
>>>>>>>>>>>>> sub-graphs grows in the future and there are many 
>>>>>>>>>>>>> duplicates in the K-lists, we can explore using a single 
>>>>>>>>>>>>> K-list, with each sub-graph-info recording the 
>>>>>>>>>>>>> [start-index, end-index] pairs of the sub-K-lists.
>>>>>>>>>>>>>
>>>>>>>>>>>>> To avoid any confusion to others, I want to point out that 
>>>>>>>>>>>>> the order of the elements in the K-list is not the 
>>>>>>>>>>>>> initialization order of the classes that happens at dump 
>>>>>>>>>>>>> time.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for the confusion. I wanted to avoid using the word 
>>>>>>>>>>>> "list" -- I think the most space efficient representation 
>>>>>>>>>>>> will be a DAG instead of a list -- I should have used "set" 
>>>>>>>>>>>> instead.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks again for spending time on this topic.
>>>>>>>>>>>>
>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>> The order of the K-list elements is determined by how we 
>>>>>>>>>>>>> walk the graph. It works for our purpose because we are 
>>>>>>>>>>>>> not doing general-purpose object graph archiving (at least 
>>>>>>>>>>>>> currently), and all classes in the targeted sub-graph for 
>>>>>>>>>>>>> archiving must not have any dependencies on runtime 
>>>>>>>>>>>>> context. When we add more sub-graphs to the archive in the 
>>>>>>>>>>>>> future, we can examine our existing assumptions and lift 
>>>>>>>>>>>>> limitations case by case with enhancements.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, I think we should defer the optimization of 
>>>>>>>>>>>>>> record_subgraph_klasses in a future RFE. That way, we can 
>>>>>>>>>>>>>> have correctness now, and performance later when we 
>>>>>>>>>>>>>> actually need it. (The current implementation of 
>>>>>>>>>>>>>> record_subgraph_klasses() has not caused any perceivable 
>>>>>>>>>>>>>> delay for the current set of objects that we are archiving).
>>>>>>>>>>>>> We probably don't need to do what I described in the above 
>>>>>>>>>>>>> for now. I've looked at your changes, I think we can 
>>>>>>>>>>>>> choose a middle-ground approach that's more efficient than 
>>>>>>>>>>>>> your current change, but could still be O(N*M) in the 
>>>>>>>>>>>>> worst case (N is the number of nodes, M is the number of 
>>>>>>>>>>>>> sub-graphs). It also avoid duplicating the logic between 
>>>>>>>>>>>>> the archiving_walk closure and the new RecordKlassesClosure.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the middle-ground approach, WalkOopAndArchiveClosure 
>>>>>>>>>>>>> can take a new boolean flag, 'record_klass_only'. We can 
>>>>>>>>>>>>> use a separate table to track the objects we have seen in 
>>>>>>>>>>>>> the current sub-graph as you are doing in the webrev. If 
>>>>>>>>>>>>> the there is an existing archived copy of the current 
>>>>>>>>>>>>> object, we check the other table to determine if it's 
>>>>>>>>>>>>> already in the current sub-graph. If yes, no additional 
>>>>>>>>>>>>> work is needed. If no, recursively enter the closure but 
>>>>>>>>>>>>> do 'record_klass_only'.
>>>>>>>>>>>>>
>>>>>>>>>>>>>       oop archived = 
>>>>>>>>>>>>> MetaspaceShared::find_archived_heap_object(obj);
>>>>>>>>>>>>>       if (archived != NULL) {
>>>>>>>>>>>>>         if (has_seen_in_the_current_graph) {
>>>>>>>>>>>>>             // no work
>>>>>>>>>>>>>         } else {
>>>>>>>>>>>>>             // walk the object but record klass only
>>>>>>>>>>>>>         }
>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>       }
>>>>>>>>>>>>>
>>>>>>>>>>>>> With that, we can avoid duplicating the logic in 
>>>>>>>>>>>>> RecordKlassesClosure and also avoid walking the sub-graph 
>>>>>>>>>>>>> objects twice in most cases. What do you think?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9/1/18 7:55 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sep 1, 2018, at 7:25 PM, Jiangli Zhou 
>>>>>>>>>>>>>>>> <jiangli.zhou at oracle.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for finding the bug. To address the incomplete 
>>>>>>>>>>>>>>>> class list issue, the original algorithm can be 
>>>>>>>>>>>>>>>> augmented while remaining as an O(N) solution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> As each node (object) in an archived subgraph is a root 
>>>>>>>>>>>>>>>> of a sub-subgraph, the class-initialization-list 
>>>>>>>>>>>>>>>> associated with a specific node (within an existing 
>>>>>>>>>>>>>>>> archived sub-graph) is also a sub-list of the enclosing 
>>>>>>>>>>>>>>>> sub-graph's class-initialization-list. For example,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sub-graph1:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>              O1(k1)
>>>>>>>>>>>>>>>>            /         \
>>>>>>>>>>>>>>>>      O2(k2)       O5(k5)
>>>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Klass-init-list: K1, K2, K3, K4, K5
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sub-graph2:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>      O2(k2)
>>>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Klass-init-list: K2, K3, K4
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> During the sub-graph walking and archiving process for 
>>>>>>>>>>>>>>>> sub-graph2, if O2 has been previously archived in 
>>>>>>>>>>>>>>>> another sub-graph, we can find the other sub-graph's 
>>>>>>>>>>>>>>>> class list and take the sub-list starting from the 
>>>>>>>>>>>>>>>> klass (k2), without re-walking each nodes again. To do 
>>>>>>>>>>>>>>>> that, we only need to walk the existing recorded 
>>>>>>>>>>>>>>>> class-initialization-list. If K2 is found in any of the 
>>>>>>>>>>>>>>>> existing list, we can take the sub-list starting from 
>>>>>>>>>>>>>>>> K2 and append the list to the current one.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> With the approach, K5 would also be included if O5 is 
>>>>>>>>>>>>>>>> walked after O2 in sub-graph1. However, O5 is not part 
>>>>>>>>>>>>>>>> of O2's sub-graph. So that would introduce overhead at 
>>>>>>>>>>>>>>>> runtime. To avoid such problem, we can remember the 
>>>>>>>>>>>>>>>> sub-graph level (which is already built as part of the 
>>>>>>>>>>>>>>>> existing graph walking algorithm) for each K in the 
>>>>>>>>>>>>>>>> class-initialization-list. The ending indication of the 
>>>>>>>>>>>>>>>> sub-list would be the first K with the same level as 
>>>>>>>>>>>>>>>> the starting K. So we would have:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     K1(1), K2(2), K3(3), K4(4), K5(2)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K2 
>>>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>>> Typo. The second K2 above should be k5:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K5, 
>>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This part is not currently implemented yet but is not 
>>>>>>>>>>>>>>>> difficult to add.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 9/1/18 6:08 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210289
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v01/ 
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Description
>>>>>>>>>>>>>>>>> ===========
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I found this bug while trying to merge my code for CDS 
>>>>>>>>>>>>>>>>> support of Lambda
>>>>>>>>>>>>>>>>> classes (JDK-8198698).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When heapShared.cpp dumps a sub-graph of heap objects, 
>>>>>>>>>>>>>>>>> it attempts to
>>>>>>>>>>>>>>>>> record all the classes of all the objects that are 
>>>>>>>>>>>>>>>>> referenced by
>>>>>>>>>>>>>>>>> this sub-graph. However, if one of these objects have 
>>>>>>>>>>>>>>>>> already been visited
>>>>>>>>>>>>>>>>> while a previous sub-graph was dumped, then this 
>>>>>>>>>>>>>>>>> object's class is not
>>>>>>>>>>>>>>>>> recorded in the current sub-graph.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At runtime, if the current sub-graph is restored 
>>>>>>>>>>>>>>>>> before any other
>>>>>>>>>>>>>>>>> sub-graphs, we will end up with a live object in the 
>>>>>>>>>>>>>>>>> Java heap with
>>>>>>>>>>>>>>>>> an uninitialized class.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fix
>>>>>>>>>>>>>>>>> ===
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Now I create the sub-graph's klasses list after all 
>>>>>>>>>>>>>>>>> sub-graphs have dumped.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For each class that has archived sub-graph(s), I do a 
>>>>>>>>>>>>>>>>> heap walk to discover
>>>>>>>>>>>>>>>>> all klasses that are reachable from this archived 
>>>>>>>>>>>>>>>>> fields of this class.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is sub-optimal but suffice for now because we 
>>>>>>>>>>>>>>>>> just have a very small
>>>>>>>>>>>>>>>>> number of sub-graphs. The worst case its O(N^2) where 
>>>>>>>>>>>>>>>>> N is the number of
>>>>>>>>>>>>>>>>> objects in the archived heap.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In the future, we might need to implement a more 
>>>>>>>>>>>>>>>>> efficient algorithm that
>>>>>>>>>>>>>>>>> walks the heap once and computes all the klasses lists 
>>>>>>>>>>>>>>>>> of all the
>>>>>>>>>>>>>>>>> sub-graphs at the same time.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>>>>> =======
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> hs-tier[1,2,3]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>