RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is incomplete

Tue Sep 11 17:58:41 UTC 2018

On 9/11/18 10:45 AM, Ioi Lam wrote:
>
>
> On 9/11/18 10:28 AM, Calvin Cheung wrote:
>> Hi Ioi,
>>
>> Just couple of minor comments.
>>
>> 271 void HeapShared::initialize_from_archived_subgraph(Klass* k) {
>>
>> I think the above function is being called during system module 
>> initialization?
>> If so, maybe you can add an assert like the following?
>>     assert(!Universe::is_module_initialized(), "should not be called 
>> after system module initialization");
>>
> Actually there's no such requirement. With Lambda CDS support 
> (JDK-8198698) this function will be called when a class is loaded at 
> run time, after system module initialization.
I agree. HeapShared::initialize_from_archived_subgraph(Klass* k) was 
designed to be generic and not specific for archived system module objects.

The API is used to initialize/install archived subgraph(s) associated 
with the requesting klass at runtime. Currently, it is already used 
outside of the archived system modules. The Integer cache is an example.

Thanks,
Jiangli
>
>> Could you also breakup line #679 in the same file?
>>
> Fixed.
>> I don't need to see another webrev for the above changes.
>>
>
> Thanks
> - Ioi
>
>> thanks,
>> Calvin
>>
>> On 9/10/18, 8:17 PM, Ioi Lam wrote:
>>> Hi Jiangli,
>>>
>>> Thank you for being thorough with the review. I think it really 
>>> makes the code much better than I originally had :-)
>>>
>>> - Ioi
>>>
>>> On 9/10/18 6:35 PM, Jiangli Zhou wrote:
>>>>
>>>> On 9/10/18 4:10 PM, Ioi Lam wrote:
>>>>
>>>>> On 9/10/2018 1:00 PM, Jiangli Zhou wrote:
>>>>>
>>>>>> Hi Ioi,
>>>>>>
>>>>>> Thanks for another updated webrev. Please see comments (minor) 
>>>>>> below.
>>>>>>
>>>>>>>> *From:* Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>>
>>>>>>>> *Date:* September 9, 2018 at 12:11:42 PM PDT
>>>>>>>> *To:* Jiangli Zhou <jiangli.zhou at oracle.com 
>>>>>>>> <mailto:jiangli.zhou at oracle.com>>, 
>>>>>>>> hotspot-runtime-dev at openjdk.java.net 
>>>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>>>> *Subject:* *Re: RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord 
>>>>>>>> is incomplete*
>>>>>>>>
>>>>>>>> Hi Jiangli,
>>>>>>>>
>>>>>>>> Thanks again for the review. I've updated the patch:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05/ 
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05-delta/ 
>>>>>>>>
>>>>>>>>
>>>>>>>> More comments inline
>>>>>>>>
>>>>>>>> On 9/8/18 2:37 PM, Jiangli Zhou wrote:
>>>>>>>>> Hi Ioi,
>>>>>>>>>
>>>>>>>>> The updated webrev looks great. Here are some additional 
>>>>>>>>> comments.
>>>>>>>>>
>>>>>>>>> - Thanks for answering my question about the two loops. If we 
>>>>>>>>> want to solve the klass list issue as a general problem, we 
>>>>>>>>> also need to allow the cases where the entry fields from the 
>>>>>>>>> same class are not specified consecutively. The 'break' in the 
>>>>>>>>> inner loop should be a 'continue'. Currently, all the entry 
>>>>>>>>> fields from a single class are specified together as we are 
>>>>>>>>> doing it manually. When we enhance the mechanism by adding the 
>>>>>>>>> other half API that enables registering the entry fields from 
>>>>>>>>> java code automatically, we might see fields from different 
>>>>>>>>> classes are in mixed orders.
>>>>>>>>>
>>>>>>>>>  657     while (j < num_archivable_static_fields) {
>>>>>>>>>  658       ArchivableStaticFieldInfo* f = 
>>>>>>>>> &archivable_static_fields[j];
>>>>>>>>>  659       if (f->klass_name == klass_name) {
>>>>>>>>>  660 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>>>> f->klass_name,
>>>>>>>>>  661                                                     
>>>>>>>>> f->offset, f->field_name, CHECK);
>>>>>>>>>  662         j++;
>>>>>>>>>  663       } else {
>>>>>>>>>  664         break; <<<<<<<
>>>>>>>>>  665       }
>>>>>>>>>  666     }
>>>>>>>>>
>>>>>>>>
>>>>>>>> It's actually OK now to not order the 
>>>>>>>> archivable_static_fields[] array by class name. The inner loop 
>>>>>>>> is just for optimization. I've updated the code and added 
>>>>>>>> comments to make this clear.
>>>>>>>>
>>>>>>>> I tested by reordering the archivable_static_fields[] array 
>>>>>>>> (move the "archivedModuleFinder" line to the end of the table). 
>>>>>>>> It shows:
>>>>>>>>
>>>>>>>>     Performed subgraph records = 7 times
>>>>>>>>     Walked 3610 objects
>>>>>>>>     Archived 2127 objects
>>>>>>>>     Recorded 33 klasses
>>>>>>>>
>>>>>>>> With the original (sorted) table, it's faster with fewer walks, 
>>>>>>>> but that doesn't change the number of archived objects or 
>>>>>>>> recorded klasses:
>>>>>>>>
>>>>>>>>     Performed subgraph records = 6 times
>>>>>>>>     Walked 2141 objects
>>>>>>>>     Archived 2127 objects
>>>>>>>>     Recorded 33 klasses
>>>>>>>>
>>>>>> The seen_objects_table resets when you see a klass_name that's 
>>>>>> different. When the entry fields from different classes are 
>>>>>> mixed, you might record an object klass that's already seen early 
>>>>>> for the same containing class. That is not incorrect, but adds 
>>>>>> overhead (although very small) to runtime. For following entry 
>>>>>> fields sequence:
>>>>>>
>>>>>> A.f1
>>>>>> A.f2
>>>>>> B.f1
>>>>>> A.f3
>>>>>>
>>>>>> For example, K1 is recorded for A.f1, A.f2 subgraphs. When 
>>>>>> archiving A.f3 subgraph, K1 is encountered, K1 would be recorded 
>>>>>> again for A.f3 because a different seen_objects_table is used. We 
>>>>>> can fix the logic here or add a post process to eliminate any 
>>>>>> duplication in the klass lists belonging to the same containing 
>>>>>> klass. The post process seems to be an overkill.
>>>>>
>>>>> There's no duplication in the _subgraph_object_klasses list 
>>>>> because we add it to only when needed:
>>>>>
>>>>> _subgraph_object_klasses->append_if_missing(relocated_k);
>>>>>
>>>>> So there's no runtime impact. (You can also see that with the same 
>>>>> "Recorded 33 klasses" from the log above.)
>>>> This indeed has no issue. I forgot that I built a single 
>>>> subgraph_info_record for each containing klass, but not for each 
>>>> entry field. :(
>>>>
>>>> One containing klass may have multiple subgraphs referenced from 
>>>> multiple entry points. Since there is no separate info_record for 
>>>> each entry field from the same containing klass, then there is no 
>>>> duplicates for the recorded object_klass. Sorry for the confusion! 
>>>> Thank you being patient with me.
>>>>
>>>> Thanks,
>>>> Jiangli
>>>>>
>>>>>
>>>>>>>>
>>>>>>>>> - In order to make sure subgraphs are archived correctly, we 
>>>>>>>>> need to do proper state propagation and checks/asserts, 
>>>>>>>>> including the record_klasses_only state. When we see 
>>>>>>>>> record_klasses_only becomes true, we need to verify all 
>>>>>>>>> reachable object from the current one must have existing 
>>>>>>>>> archived copies already. That helps to confirm that an earlier 
>>>>>>>>> archived subgraph is complete, especially if the current walk 
>>>>>>>>> enters an sub-subgraph from a different node. For your current 
>>>>>>>>> scope of changes, resetting the record_klasses_only state 
>>>>>>>>> based on if there is an existing archived copy is ok. We can 
>>>>>>>>> enhance it separately.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I added a separate pass to verify these -- for each root 
>>>>>>>> (orig_obj) of an archived subgraph:
>>>>>>>>
>>>>>>>>   // (1) Verify that all objects reachable from orig_obj are 
>>>>>>>> archived.
>>>>>>>>   // (2) Verify that all objects reachable from the archived 
>>>>>>>> copy of orig_obj
>>>>>>>>   //     can only point to archived objects.
>>>>>>>>
>>>>>>>> I think that's sufficient to ensure that you have correctly 
>>>>>>>> archived everything you need.
>>>>>>
>>>>>> The verify code is good to have. (2) is not necessary as that's 
>>>>>> already done by the archiving verification code in GC. There is 
>>>>>> more fine grained region verification implemented for archived 
>>>>>> objects in G1 and (2) is already done for all objects in open 
>>>>>> archive heap regions at dump time.
>>>>>>
>>>>>
>>>>> Thanks for letting me know. I've replaced (2) with this comment.
>>>>>
>>>>>   // Note: we could also verify that all objects reachable from 
>>>>> the archived
>>>>>   // copy of orig_obj can only point to archived objects, with:
>>>>>   //      init_seen_objects_table();
>>>>>   //      verify_reachable_objects_from(archived_obj, true);
>>>>>   //      init_seen_objects_table();
>>>>>   // but that's already done in 
>>>>> G1HeapVerifier::verify_archive_regions so we
>>>>>   // won't do it here.
>>>>>
>>>>>
>>>>>
>>>>>>>>
>>>>>>>>> Please also test all tier1-tier6 for your changes with the 
>>>>>>>>> default CDS archive patch.
>>>>>>>>>
>>>>>>>> I just submitted a job for hs-tier1~6 using your latest patch
>>>>>>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-September/023146.html 
>>>>>>>>
>>>>>>
>>>>>
>>>>> My test jobs passed with no new failures, so I'll push now.
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Jiangli
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> - Ioi
>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jiangli
>>>>>>>>>
>>>>>>>>> On 9/7/18 5:58 PM, Ioi Lam wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>
>>>>>>>>>> Thanks for the review!
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/ 
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04-delta/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/6/2018 4:28 PM, Jiangli Zhou wrote:
>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>
>>>>>>>>>>> Sorry for the delay. It took me a while to get back to this. 
>>>>>>>>>>> Thanks for folding the closures together. It cuts the cost 
>>>>>>>>>>> by almost half. Here are my detailed comments.
>>>>>>>>>>>
>>>>>>>>>>> - _subgraph_object_klasses renaming
>>>>>>>>>>>
>>>>>>>>>>> The reason that I choose _subgraph_object_klasses was to 
>>>>>>>>>>> disambiguate between the klass containing a subgraph and the 
>>>>>>>>>>> klass of an object included in a subgraph. Is there a 
>>>>>>>>>>> specific reason to rename the term to _subgraph_klasses?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I just wanted to shorten the names to be the same as 
>>>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses, but I 
>>>>>>>>>> agree with you for the naming.
>>>>>>>>>>
>>>>>>>>>> So instead, I've renamed 
>>>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses -> 
>>>>>>>>>> subgraph_object_klasses to make the code consistent. What do 
>>>>>>>>>> you think?
>>>>>>>>>>
>>>>>>>>>>> - Could you please align the lines at 358 - 360.
>>>>>>>>>>>
>>>>>>>>>>>  355   WalkOopAndArchiveClosure(int level, bool 
>>>>>>>>>>> record_klasses_only,
>>>>>>>>>>>  356 KlassSubGraphInfo* subgraph_info,
>>>>>>>>>>>  357                            oop orig, oop archived, 
>>>>>>>>>>> TRAPS) :
>>>>>>>>>>>  358     _level(level), 
>>>>>>>>>>> _record_klasses_only(record_klasses_only),
>>>>>>>>>>>  359     _subgraph_info(subgraph_info),
>>>>>>>>>>>  360     _orig_referencing_obj(orig), 
>>>>>>>>>>> _archived_referencing_obj(archived),
>>>>>>>>>>>  361     _thread(THREAD) {}
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The 2 space indent is for the C++ field initializer list. 
>>>>>>>>>> HotSpot style guide doesn't mention this 
>>>>>>>>>> (https://wiki.openjdk.java.net/display/HotSpot/StyleGuide) so 
>>>>>>>>>> I just used emacs's default indentation.
>>>>>>>>>>
>>>>>>>>>>> - (2) should be only applied to object that's not seen 
>>>>>>>>>>> within the current subgraph. Please clarify it in the comment.
>>>>>>>>>>>
>>>>>>>>>>>  401 // (2) If orig_obj has not been seen yet, trace all
>>>>>>>>>>>  402 //     objects that are reachable from it, and make 
>>>>>>>>>>> sure these objects are archived.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I changed it to
>>>>>>>>>>
>>>>>>>>>> // (2) If orig_obj has not been seen yet (since 
>>>>>>>>>> start_recording_subgraph() was called),
>>>>>>>>>> //     trace all  objects that are reachable from it, and 
>>>>>>>>>> make sure these objects are archived.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - Some Strings (such as "") are not archived in the closed 
>>>>>>>>>>> archive heap region. In that case, the String object is 
>>>>>>>>>>> automatically archived in the open archive heap region. The 
>>>>>>>>>>> following comment should be clarified and fixed.
>>>>>>>>>>>
>>>>>>>>>>>  417     // Strings are already archived in closed archive. 
>>>>>>>>>>> No need to walk it
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I changed to
>>>>>>>>>>
>>>>>>>>>>   if (java_lang_String::is_instance(orig_obj) && archived_obj 
>>>>>>>>>> != NULL) {
>>>>>>>>>>     // To save time, don't walk strings that are already 
>>>>>>>>>> archived. They just contain
>>>>>>>>>>     // pointers to a type array, whose klass doesn't need to 
>>>>>>>>>> be recorded.
>>>>>>>>>>     return archived_obj;
>>>>>>>>>>   }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> - Could you please explain the reason for using two loops 
>>>>>>>>>>> here? Shouldn't one loop be sufficient?
>>>>>>>>>>>
>>>>>>>>>>>  646   for (int i = 0; i < num_archivable_static_fields; ) {
>>>>>>>>>>>
>>>>>>>>>>>  650
>>>>>>>>>>>  651     int j = i;
>>>>>>>>>>>  652     while (j < num_archivable_static_fields) {
>>>>>>>>>>>  653       ArchivableStaticFieldInfo* f = 
>>>>>>>>>>> &archivable_static_fields[j];
>>>>>>>>>>>  654       if (f->klass_name == klass_name) {
>>>>>>>>>>>  655 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>>>>>> f->klass_name,
>>>>>>>>>>>  656 f->offset, f->field_name, CHECK);
>>>>>>>>>>>  657         j++;
>>>>>>>>>>>  658       } else {
>>>>>>>>>>>  659         break;
>>>>>>>>>>>  660       }
>>>>>>>>>>>  661     }
>>>>>>>>>>> ..
>>>>>>>>>>>  663     i = j;
>>>>>>>>>>>  664   }
>>>>>>>>>>>  665
>>>>>>>>>>
>>>>>>>>>> That's because you can archive multiple fields in the same 
>>>>>>>>>> klass:
>>>>>>>>>>
>>>>>>>>>> // If you add new entries to this table, you should know what 
>>>>>>>>>> you're doing!
>>>>>>>>>> static ArchivableStaticFieldInfo archivable_static_fields[] = {
>>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>>> "archivedSystemModules"},
>>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>>> "archivedModuleFinder"},
>>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>>> "archivedMainModule"},
>>>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>>>> "archivedConfiguration"},
>>>>>>>>>>   {"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
>>>>>>>>>>
>>>>>>>>>> Note that start/done_recording_subgraph are called in the i 
>>>>>>>>>> loop and not the j loop.
>>>>>>>>>>
>>>>>>>>>>> - HeapShared::archive_reachable_objects_from() should take a 
>>>>>>>>>>> '_record_klasses_only' argument from caller, which decides 
>>>>>>>>>>> if it only needs to record the klass list.
>>>>>>>>>>>
>>>>>>>>>>> As we walk all reachable objects from an entry point (a 
>>>>>>>>>>> static field) during walk&archive process, at the end of it 
>>>>>>>>>>> we obtain a complete and closed archived subgraph. Every 
>>>>>>>>>>> object is also the root of a sub-subgraph that's reachable 
>>>>>>>>>>> from it. In a subsequent subgraph archiving process from a 
>>>>>>>>>>> different entry point, if we encounter an object that is 
>>>>>>>>>>> already archived, then every objects contained within the 
>>>>>>>>>>> sub-subgraph reachable from that root are already archived 
>>>>>>>>>>> also. So, '_record_klasses_only' should be a inherited flag 
>>>>>>>>>>> from the root and passed to all 
>>>>>>>>>>> HeapShared::archive_reachable_objects_from() calls for 
>>>>>>>>>>> processing objects within a sub-subgraph.
>>>>>>>>>>>
>>>>>>>>>>> During the walk, '_record_klasses_only' should be false 
>>>>>>>>>>> initially. Once it sees an object that already has an 
>>>>>>>>>>> archived copy (but not seen in the current subgraph), the 
>>>>>>>>>>> '_record_klasses_only' needs to become true and be 
>>>>>>>>>>> propagated to all sub-walking processes. 
>>>>>>>>>>> '_record_klasses_only' should never go back to false state.
>>>>>>>>>>>
>>>>>>>>>>> With that, we can guarantee to build a correct archived 
>>>>>>>>>>> subgraph without the risk of infinite cycles. It also 
>>>>>>>>>>> reduces overhead for the walk&archive process.
>>>>>>>>>>
>>>>>>>>>> I actually started to write to code exactly as you suggested, 
>>>>>>>>>> by passing record_klasses_only down, and then I found out 
>>>>>>>>>> that it's unnecessary, and having it will require me to add 
>>>>>>>>>> complicated asserts that was beyond my ability to get right.
>>>>>>>>>>
>>>>>>>>>> As I understand, you worry about about correctness 
>>>>>>>>>> (termination of the algorithm), and performance.
>>>>>>>>>>
>>>>>>>>>> I believe the logic of the current code is correct, and the 
>>>>>>>>>> algorithm will terminate:
>>>>>>>>>>
>>>>>>>>>>       oop archived_obj =
>>>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>>>>       ....
>>>>>>>>>>       if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>>>>         // orig_obj has already been archived and traced.
>>>>>>>>>>     Nothing more to do.
>>>>>>>>>>         return archived_obj;
>>>>>>>>>>       }
>>>>>>>>>>       ...
>>>>>>>>>>      WalkOopAndArchiveClosure walker(level, record_klasses_only,
>>>>>>>>>>     subgraph_info, orig_obj, archived_obj, THREAD);
>>>>>>>>>>      orig_obj->oop_iterate(&walker); // recurse
>>>>>>>>>>
>>>>>>>>>> HeapShared::archive_reachable_objects_from will recurse ONLY 
>>>>>>>>>> if you call this function with an object that has not need 
>>>>>>>>>> seen, so eventually the recursion will stop because all 
>>>>>>>>>> objects in the subgraph has been seen.
>>>>>>>>>>
>>>>>>>>>> To make the code easier to understand, I rewrote it like this:
>>>>>>>>>>
>>>>>>>>>>   if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>>>>     // orig_obj has already been archived and traced. Nothing 
>>>>>>>>>> more to do.
>>>>>>>>>>     return archived_obj;
>>>>>>>>>>   } else {
>>>>>>>>>> set_has_been_seen_during_subgraph_recording(orig_obj);
>>>>>>>>>>    }
>>>>>>>>>>
>>>>>>>>>> For performance, the logic for deciding record_klasses_only 
>>>>>>>>>> is this:
>>>>>>>>>>
>>>>>>>>>>   oop archived_obj = 
>>>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>>>>    ...
>>>>>>>>>>   bool record_klasses_only = (archived_obj != NULL);
>>>>>>>>>>
>>>>>>>>>> You have to call find_archived_heap_object anyway, so that's 
>>>>>>>>>> unavoidable. The cost of computing record_klasses_only is 
>>>>>>>>>> very simple. For performance, there's no need for the it to 
>>>>>>>>>> be tracked in the recursive calls. That makes the code 
>>>>>>>>>> simpler, and easier to maintain.
>>>>>>>>>>
>>>>>>>>>> Otherwise we may need to think about whether the caller and 
>>>>>>>>>> the callee have different opinion on record_klasses_only:
>>>>>>>>>>
>>>>>>>>>>   - if this could happen, what would cause this to happen, 
>>>>>>>>>> and how should we handle it?
>>>>>>>>>>   - if this could never happen, how would you write an 
>>>>>>>>>> assertion to validate that?
>>>>>>>>>>
>>>>>>>>>> Writing those asserts turned out to be too complicated.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> - Could add explanation in the comment?
>>>>>>>>>>>
>>>>>>>>>>> 438         // See 
>>>>>>>>>>> runtime/appcds/cacheObject/ArchivedIntegerCacheTest.java
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I changed it to
>>>>>>>>>>
>>>>>>>>>>     if (level == 1) {
>>>>>>>>>>         // Don't archive a subgraph root that's too big. For 
>>>>>>>>>> archives static fields, that's OK
>>>>>>>>>>         // as the Java code will take care of initializing 
>>>>>>>>>> this field dynamically.
>>>>>>>>>>         return NULL;
>>>>>>>>>>       }
>>>>>>>>>>
>>>>>>>>>>> - Please add the step 5) comment back to the walk&archive 
>>>>>>>>>>> description.
>>>>>>>>>>>
>>>>>>>>>>>  466 // 5) The Klass of the current java object is added to 
>>>>>>>>>>> the list of Klasses
>>>>>>>>>>>  467 // for loading and initialzing before any object in the 
>>>>>>>>>>> archived graph can
>>>>>>>>>>>  468 // be accessed at runtime.
>>>>>>>>>>>  469 //
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Done. Sorry I removed it by mistake.
>>>>>>>>>>
>>>>>>>>>>> - The HeapShared::archive_module_graph_objects() can take a 
>>>>>>>>>>> more general name now, if you want to do it as part of your 
>>>>>>>>>>> change.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> I renamed it to archive_static_fields(). I also removed/fixed 
>>>>>>>>>> a few comments about "Currently there is only one class 
>>>>>>>>>> mirror (ArchivedModuleGraph) with archived sub-graphs."
>>>>>>>>>>
>>>>>>>>>>> Could you please also send a sample of the changed cds+heap 
>>>>>>>>>>> debug log output?
>>>>>>>>>>>
>>>>>>>>>> See 
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/logs/
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> - Ioi
>>>>>>>>>>
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Jiangli
>>>>>>>>>>>
>>>>>>>>>>> On 9/5/18 3:56 PM, Ioi Lam wrote:
>>>>>>>>>>>> I updated the patch:
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03/ 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03-delta/ 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 1. No need to walk String objects during subgraph recording.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Print put statistics:
>>>>>>>>>>>>
>>>>>>>>>>>> =========================
>>>>>>>>>>>>
>>>>>>>>>>>> jdk/internal/module/ArchivedModuleGraph: walked 1871 objs, 
>>>>>>>>>>>> archived 1871 new objs, recorded 28 classes
>>>>>>>>>>>> java/util/ImmutableCollections$ListN: walked 2 objs, 
>>>>>>>>>>>> archived 0 new objs, recorded 0 classes
>>>>>>>>>>>> java/util/ImmutableCollections$MapN: walked 2 objs, 
>>>>>>>>>>>> archived 0 new objs, recorded 0 classes
>>>>>>>>>>>> java/util/ImmutableCollections$SetN: walked 2 objs, 
>>>>>>>>>>>> archived 0 new objs, recorded 0 classes
>>>>>>>>>>>> java/lang/Integer$IntegerCache: walked 257 objs, archived 
>>>>>>>>>>>> 256 new objs, recorded 2 classes
>>>>>>>>>>>> java/lang/module/Configuration: walked 7 objs, archived 0 
>>>>>>>>>>>> new objs, recorded 3 classes
>>>>>>>>>>>>
>>>>>>>>>>>> Performed subgraph records = 6 times
>>>>>>>>>>>> Walked 2141 objects
>>>>>>>>>>>> Archived 2127 objects
>>>>>>>>>>>> Recorded 33 klasses
>>>>>>>>>>>>
>>>>>>>>>>>> ========
>>>>>>>>>>>>
>>>>>>>>>>>> So, at least for now, we don't need to worry about the 
>>>>>>>>>>>> performance of WalkOopAndArchiveClosure.
>>>>>>>>>>>>
>>>>>>>>>>>> Once I get some statistics of Lambda archiving, I'll post 
>>>>>>>>>>>> them as well.
>>>>>>>>>>>>
>>>>>>>>>>>> So, with the bug fix, the subgraph recording should be both 
>>>>>>>>>>>> correct and faster than before (no more String walking).
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/3/2018 1:02 AM, Ioi Lam wrote:
>>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think your suggestion of combining of combining 
>>>>>>>>>>>>> RecordKlassesClosure and WalkOopAndArchiveClosure is a 
>>>>>>>>>>>>> good idea. I have updated the webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v02/ 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This also supersedes my other patch for JDK-8210295 
>>>>>>>>>>>>> Refactor 
>>>>>>>>>>>>> HeapShared::archive_reachable_objects_from_static_field
>>>>>>>>>>>>>
>>>>>>>>>>>>> More comments below on future optimizations.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 9/2/18 6:07 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 9/1/18 8:58 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for looking at the optimization of this problem. 
>>>>>>>>>>>>>>> I think your algorithm will run into problems if (a) 
>>>>>>>>>>>>>>> there is a cycle, and (b) two sub-graphs point to 
>>>>>>>>>>>>>>> different members in the cycle.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think such issue can be solved by tracking the original 
>>>>>>>>>>>>>> K (in the K-list, short for the 
>>>>>>>>>>>>>> class-initialization-list) for each archived object. If 
>>>>>>>>>>>>>> an encountered object is already archived (within the 
>>>>>>>>>>>>>> current sub-graph) and is associated with K1 within the 
>>>>>>>>>>>>>> current K-list (must be in this case). We can update the 
>>>>>>>>>>>>>> sub-K-list between K1 and the K associated with the 
>>>>>>>>>>>>>> referencing object of the current object, so their level 
>>>>>>>>>>>>>> are the same. We can also add a flag to the K-list 
>>>>>>>>>>>>>> elements to tag each K within the cycle. It may not be 
>>>>>>>>>>>>>> what we need to do currently.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can you please describe the specific class that you ran 
>>>>>>>>>>>>>> into with the bug in your sub-graph?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I found the bug by code examination. It's also during this 
>>>>>>>>>>>>> time that I realized that it's not a simple problem to 
>>>>>>>>>>>>> optimize, because of cycles in the graph. There's been 
>>>>>>>>>>>>> tons of established research in this area (as in "Tarjan's 
>>>>>>>>>>>>> algorithm is one of Donald Knuth's favorite 
>>>>>>>>>>>>> implementations"),  so I don't think we should reinvent 
>>>>>>>>>>>>> the wheel. For future optimization, let's just apply an 
>>>>>>>>>>>>> existing algorithm.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe someone familiar with graph theory can suggest a 
>>>>>>>>>>>>>>> solution?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My current thinking is to first find all the cycles in 
>>>>>>>>>>>>>>> the archived heap. The reachability of every member in 
>>>>>>>>>>>>>>> the cycle is the same, so we can collapse each cycle 
>>>>>>>>>>>>>>> into a single node. After that we have a DAG. Then, 
>>>>>>>>>>>>>>> instead of encoding the class initialization order as 
>>>>>>>>>>>>>>> multiple lists, we can encode it as a DAG to save space.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> See 
>>>>>>>>>>>>>>> https://stackoverflow.com/questions/261573/best-algorithm-for-detecting-cycles-in-a-directed-graph
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In any case, I think this is a sufficiently difficult 
>>>>>>>>>>>>>>> problem, so we should separate the dumping of the 
>>>>>>>>>>>>>>> sub-graphs (the first part of 
>>>>>>>>>>>>>>> HeapShared::archive_module_graph_objects() in my patch) 
>>>>>>>>>>>>>>> and the computation of the initialization order 
>>>>>>>>>>>>>>> (HeapShared::record_subgraph_klasses()).
>>>>>>>>>>>>>> Our current K-lists are quite small. If the number of 
>>>>>>>>>>>>>> sub-graphs grows in the future and there are many 
>>>>>>>>>>>>>> duplicates in the K-lists, we can explore using a single 
>>>>>>>>>>>>>> K-list, with each sub-graph-info recording the 
>>>>>>>>>>>>>> [start-index, end-index] pairs of the sub-K-lists.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To avoid any confusion to others, I want to point out 
>>>>>>>>>>>>>> that the order of the elements in the K-list is not the 
>>>>>>>>>>>>>> initialization order of the classes that happens at dump 
>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sorry for the confusion. I wanted to avoid using the word 
>>>>>>>>>>>>> "list" -- I think the most space efficient representation 
>>>>>>>>>>>>> will be a DAG instead of a list -- I should have used 
>>>>>>>>>>>>> "set" instead.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again for spending time on this topic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>> The order of the K-list elements is determined by how we 
>>>>>>>>>>>>>> walk the graph. It works for our purpose because we are 
>>>>>>>>>>>>>> not doing general-purpose object graph archiving (at 
>>>>>>>>>>>>>> least currently), and all classes in the targeted 
>>>>>>>>>>>>>> sub-graph for archiving must not have any dependencies on 
>>>>>>>>>>>>>> runtime context. When we add more sub-graphs to the 
>>>>>>>>>>>>>> archive in the future, we can examine our existing 
>>>>>>>>>>>>>> assumptions and lift limitations case by case with 
>>>>>>>>>>>>>> enhancements.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, I think we should defer the optimization of 
>>>>>>>>>>>>>>> record_subgraph_klasses in a future RFE. That way, we 
>>>>>>>>>>>>>>> can have correctness now, and performance later when we 
>>>>>>>>>>>>>>> actually need it. (The current implementation of 
>>>>>>>>>>>>>>> record_subgraph_klasses() has not caused any perceivable 
>>>>>>>>>>>>>>> delay for the current set of objects that we are 
>>>>>>>>>>>>>>> archiving).
>>>>>>>>>>>>>> We probably don't need to do what I described in the 
>>>>>>>>>>>>>> above for now. I've looked at your changes, I think we 
>>>>>>>>>>>>>> can choose a middle-ground approach that's more efficient 
>>>>>>>>>>>>>> than your current change, but could still be O(N*M) in 
>>>>>>>>>>>>>> the worst case (N is the number of nodes, M is the number 
>>>>>>>>>>>>>> of sub-graphs). It also avoid duplicating the logic 
>>>>>>>>>>>>>> between the archiving_walk closure and the new 
>>>>>>>>>>>>>> RecordKlassesClosure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the middle-ground approach, WalkOopAndArchiveClosure 
>>>>>>>>>>>>>> can take a new boolean flag, 'record_klass_only'. We can 
>>>>>>>>>>>>>> use a separate table to track the objects we have seen in 
>>>>>>>>>>>>>> the current sub-graph as you are doing in the webrev. If 
>>>>>>>>>>>>>> the there is an existing archived copy of the current 
>>>>>>>>>>>>>> object, we check the other table to determine if it's 
>>>>>>>>>>>>>> already in the current sub-graph. If yes, no additional 
>>>>>>>>>>>>>> work is needed. If no, recursively enter the closure but 
>>>>>>>>>>>>>> do 'record_klass_only'.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       oop archived = 
>>>>>>>>>>>>>> MetaspaceShared::find_archived_heap_object(obj);
>>>>>>>>>>>>>>       if (archived != NULL) {
>>>>>>>>>>>>>>         if (has_seen_in_the_current_graph) {
>>>>>>>>>>>>>>             // no work
>>>>>>>>>>>>>>         } else {
>>>>>>>>>>>>>>             // walk the object but record klass only
>>>>>>>>>>>>>>         }
>>>>>>>>>>>>>>         return;
>>>>>>>>>>>>>>       }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> With that, we can avoid duplicating the logic in 
>>>>>>>>>>>>>> RecordKlassesClosure and also avoid walking the sub-graph 
>>>>>>>>>>>>>> objects twice in most cases. What do you think?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/1/18 7:55 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sep 1, 2018, at 7:25 PM, Jiangli Zhou 
>>>>>>>>>>>>>>>>> <jiangli.zhou at oracle.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for finding the bug. To address the incomplete 
>>>>>>>>>>>>>>>>> class list issue, the original algorithm can be 
>>>>>>>>>>>>>>>>> augmented while remaining as an O(N) solution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> As each node (object) in an archived subgraph is a 
>>>>>>>>>>>>>>>>> root of a sub-subgraph, the class-initialization-list 
>>>>>>>>>>>>>>>>> associated with a specific node (within an existing 
>>>>>>>>>>>>>>>>> archived sub-graph) is also a sub-list of the 
>>>>>>>>>>>>>>>>> enclosing sub-graph's class-initialization-list. For 
>>>>>>>>>>>>>>>>> example,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sub-graph1:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>              O1(k1)
>>>>>>>>>>>>>>>>>            /         \
>>>>>>>>>>>>>>>>>      O2(k2)       O5(k5)
>>>>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Klass-init-list: K1, K2, K3, K4, K5
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sub-graph2:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>      O2(k2)
>>>>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Klass-init-list: K2, K3, K4
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> During the sub-graph walking and archiving process for 
>>>>>>>>>>>>>>>>> sub-graph2, if O2 has been previously archived in 
>>>>>>>>>>>>>>>>> another sub-graph, we can find the other sub-graph's 
>>>>>>>>>>>>>>>>> class list and take the sub-list starting from the 
>>>>>>>>>>>>>>>>> klass (k2), without re-walking each nodes again. To do 
>>>>>>>>>>>>>>>>> that, we only need to walk the existing recorded 
>>>>>>>>>>>>>>>>> class-initialization-list. If K2 is found in any of 
>>>>>>>>>>>>>>>>> the existing list, we can take the sub-list starting 
>>>>>>>>>>>>>>>>> from K2 and append the list to the current one.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> With the approach, K5 would also be included if O5 is 
>>>>>>>>>>>>>>>>> walked after O2 in sub-graph1. However, O5 is not part 
>>>>>>>>>>>>>>>>> of O2's sub-graph. So that would introduce overhead at 
>>>>>>>>>>>>>>>>> runtime. To avoid such problem, we can remember the 
>>>>>>>>>>>>>>>>> sub-graph level (which is already built as part of the 
>>>>>>>>>>>>>>>>> existing graph walking algorithm) for each K in the 
>>>>>>>>>>>>>>>>> class-initialization-list. The ending indication of 
>>>>>>>>>>>>>>>>> the sub-list would be the first K with the same level 
>>>>>>>>>>>>>>>>> as the starting K. So we would have:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     K1(1), K2(2), K3(3), K4(4), K5(2)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K2 
>>>>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>>>> Typo. The second K2 above should be k5:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K5, 
>>>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This part is not currently implemented yet but is not 
>>>>>>>>>>>>>>>>> difficult to add.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 9/1/18 6:08 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210289
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v01/ 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Description
>>>>>>>>>>>>>>>>>> ===========
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I found this bug while trying to merge my code for 
>>>>>>>>>>>>>>>>>> CDS support of Lambda
>>>>>>>>>>>>>>>>>> classes (JDK-8198698).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> When heapShared.cpp dumps a sub-graph of heap 
>>>>>>>>>>>>>>>>>> objects, it attempts to
>>>>>>>>>>>>>>>>>> record all the classes of all the objects that are 
>>>>>>>>>>>>>>>>>> referenced by
>>>>>>>>>>>>>>>>>> this sub-graph. However, if one of these objects have 
>>>>>>>>>>>>>>>>>> already been visited
>>>>>>>>>>>>>>>>>> while a previous sub-graph was dumped, then this 
>>>>>>>>>>>>>>>>>> object's class is not
>>>>>>>>>>>>>>>>>> recorded in the current sub-graph.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> At runtime, if the current sub-graph is restored 
>>>>>>>>>>>>>>>>>> before any other
>>>>>>>>>>>>>>>>>> sub-graphs, we will end up with a live object in the 
>>>>>>>>>>>>>>>>>> Java heap with
>>>>>>>>>>>>>>>>>> an uninitialized class.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Fix
>>>>>>>>>>>>>>>>>> ===
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Now I create the sub-graph's klasses list after all 
>>>>>>>>>>>>>>>>>> sub-graphs have dumped.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For each class that has archived sub-graph(s), I do a 
>>>>>>>>>>>>>>>>>> heap walk to discover
>>>>>>>>>>>>>>>>>> all klasses that are reachable from this archived 
>>>>>>>>>>>>>>>>>> fields of this class.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is sub-optimal but suffice for now because we 
>>>>>>>>>>>>>>>>>> just have a very small
>>>>>>>>>>>>>>>>>> number of sub-graphs. The worst case its O(N^2) where 
>>>>>>>>>>>>>>>>>> N is the number of
>>>>>>>>>>>>>>>>>> objects in the archived heap.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In the future, we might need to implement a more 
>>>>>>>>>>>>>>>>>> efficient algorithm that
>>>>>>>>>>>>>>>>>> walks the heap once and computes all the klasses 
>>>>>>>>>>>>>>>>>> lists of all the
>>>>>>>>>>>>>>>>>> sub-graphs at the same time.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>>>>>> =======
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> hs-tier[1,2,3]
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>