RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is incomplete

Tue Sep 11 17:28:47 UTC 2018

Hi Ioi,

Just couple of minor comments.

271 void HeapShared::initialize_from_archived_subgraph(Klass* k) {

I think the above function is being called during system module 
initialization?
If so, maybe you can add an assert like the following?
     assert(!Universe::is_module_initialized(), "should not be called 
after system module initialization");

Could you also breakup line #679 in the same file?

I don't need to see another webrev for the above changes.

thanks,
Calvin

On 9/10/18, 8:17 PM, Ioi Lam wrote:
> Hi Jiangli,
>
> Thank you for being thorough with the review. I think it really makes 
> the code much better than I originally had :-)
>
> - Ioi
>
> On 9/10/18 6:35 PM, Jiangli Zhou wrote:
>>
>> On 9/10/18 4:10 PM, Ioi Lam wrote:
>>
>>> On 9/10/2018 1:00 PM, Jiangli Zhou wrote:
>>>
>>>> Hi Ioi,
>>>>
>>>> Thanks for another updated webrev. Please see comments (minor) below.
>>>>
>>>>>> *From:* Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>>
>>>>>> *Date:* September 9, 2018 at 12:11:42 PM PDT
>>>>>> *To:* Jiangli Zhou <jiangli.zhou at oracle.com 
>>>>>> <mailto:jiangli.zhou at oracle.com>>, 
>>>>>> hotspot-runtime-dev at openjdk.java.net 
>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>> *Subject:* *Re: RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is 
>>>>>> incomplete*
>>>>>>
>>>>>> Hi Jiangli,
>>>>>>
>>>>>> Thanks again for the review. I've updated the patch:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05/ 
>>>>>>
>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05-delta/ 
>>>>>>
>>>>>>
>>>>>> More comments inline
>>>>>>
>>>>>> On 9/8/18 2:37 PM, Jiangli Zhou wrote:
>>>>>>> Hi Ioi,
>>>>>>>
>>>>>>> The updated webrev looks great. Here are some additional comments.
>>>>>>>
>>>>>>> - Thanks for answering my question about the two loops. If we 
>>>>>>> want to solve the klass list issue as a general problem, we also 
>>>>>>> need to allow the cases where the entry fields from the same 
>>>>>>> class are not specified consecutively. The 'break' in the inner 
>>>>>>> loop should be a 'continue'. Currently, all the entry fields 
>>>>>>> from a single class are specified together as we are doing it 
>>>>>>> manually. When we enhance the mechanism by adding the other half 
>>>>>>> API that enables registering the entry fields from java code 
>>>>>>> automatically, we might see fields from different classes are in 
>>>>>>> mixed orders.
>>>>>>>
>>>>>>>  657     while (j < num_archivable_static_fields) {
>>>>>>>  658       ArchivableStaticFieldInfo* f = 
>>>>>>> &archivable_static_fields[j];
>>>>>>>  659       if (f->klass_name == klass_name) {
>>>>>>>  660 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>> f->klass_name,
>>>>>>>  661                                                     
>>>>>>> f->offset, f->field_name, CHECK);
>>>>>>>  662         j++;
>>>>>>>  663       } else {
>>>>>>>  664         break; <<<<<<<
>>>>>>>  665       }
>>>>>>>  666     }
>>>>>>>
>>>>>>
>>>>>> It's actually OK now to not order the archivable_static_fields[] 
>>>>>> array by class name. The inner loop is just for optimization. 
>>>>>> I've updated the code and added comments to make this clear.
>>>>>>
>>>>>> I tested by reordering the archivable_static_fields[] array (move 
>>>>>> the "archivedModuleFinder" line to the end of the table). It shows:
>>>>>>
>>>>>>     Performed subgraph records = 7 times
>>>>>>     Walked 3610 objects
>>>>>>     Archived 2127 objects
>>>>>>     Recorded 33 klasses
>>>>>>
>>>>>> With the original (sorted) table, it's faster with fewer walks, 
>>>>>> but that doesn't change the number of archived objects or 
>>>>>> recorded klasses:
>>>>>>
>>>>>>     Performed subgraph records = 6 times
>>>>>>     Walked 2141 objects
>>>>>>     Archived 2127 objects
>>>>>>     Recorded 33 klasses
>>>>>>
>>>> The seen_objects_table resets when you see a klass_name that's 
>>>> different. When the entry fields from different classes are mixed, 
>>>> you might record an object klass that's already seen early for the 
>>>> same containing class. That is not incorrect, but adds overhead 
>>>> (although very small) to runtime. For following entry fields sequence:
>>>>
>>>> A.f1
>>>> A.f2
>>>> B.f1
>>>> A.f3
>>>>
>>>> For example, K1 is recorded for A.f1, A.f2 subgraphs. When 
>>>> archiving A.f3 subgraph, K1 is encountered, K1 would be recorded 
>>>> again for A.f3 because a different seen_objects_table is used. We 
>>>> can fix the logic here or add a post process to eliminate any 
>>>> duplication in the klass lists belonging to the same containing 
>>>> klass. The post process seems to be an overkill.
>>>
>>> There's no duplication in the _subgraph_object_klasses list because 
>>> we add it to only when needed:
>>>
>>>   _subgraph_object_klasses->append_if_missing(relocated_k);
>>>
>>> So there's no runtime impact. (You can also see that with the same 
>>> "Recorded 33 klasses" from the log above.)
>> This indeed has no issue. I forgot that I built a single 
>> subgraph_info_record for each containing klass, but not for each 
>> entry field. :(
>>
>> One containing klass may have multiple subgraphs referenced from 
>> multiple entry points. Since there is no separate info_record for 
>> each entry field from the same containing klass, then there is no 
>> duplicates for the recorded object_klass. Sorry for the confusion! 
>> Thank you being patient with me.
>>
>> Thanks,
>> Jiangli
>>>
>>>
>>>>>>
>>>>>>> - In order to make sure subgraphs are archived correctly, we 
>>>>>>> need to do proper state propagation and checks/asserts, 
>>>>>>> including the record_klasses_only state. When we see 
>>>>>>> record_klasses_only becomes true, we need to verify all 
>>>>>>> reachable object from the current one must have existing 
>>>>>>> archived copies already. That helps to confirm that an earlier 
>>>>>>> archived subgraph is complete, especially if the current walk 
>>>>>>> enters an sub-subgraph from a different node. For your current 
>>>>>>> scope of changes, resetting the record_klasses_only state based 
>>>>>>> on if there is an existing archived copy is ok. We can enhance 
>>>>>>> it separately.
>>>>>>>
>>>>>>
>>>>>> I added a separate pass to verify these -- for each root 
>>>>>> (orig_obj) of an archived subgraph:
>>>>>>
>>>>>>   // (1) Verify that all objects reachable from orig_obj are 
>>>>>> archived.
>>>>>>   // (2) Verify that all objects reachable from the archived copy 
>>>>>> of orig_obj
>>>>>>   //     can only point to archived objects.
>>>>>>
>>>>>> I think that's sufficient to ensure that you have correctly 
>>>>>> archived everything you need.
>>>>
>>>> The verify code is good to have. (2) is not necessary as that's 
>>>> already done by the archiving verification code in GC. There is 
>>>> more fine grained region verification implemented for archived 
>>>> objects in G1 and (2) is already done for all objects in open 
>>>> archive heap regions at dump time.
>>>>
>>>
>>> Thanks for letting me know. I've replaced (2) with this comment.
>>>
>>>   // Note: we could also verify that all objects reachable from the 
>>> archived
>>>   // copy of orig_obj can only point to archived objects, with:
>>>   //      init_seen_objects_table();
>>>   //      verify_reachable_objects_from(archived_obj, true);
>>>   //      init_seen_objects_table();
>>>   // but that's already done in 
>>> G1HeapVerifier::verify_archive_regions so we
>>>   // won't do it here.
>>>
>>>
>>>
>>>>>>
>>>>>>> Please also test all tier1-tier6 for your changes with the 
>>>>>>> default CDS archive patch.
>>>>>>>
>>>>>> I just submitted a job for hs-tier1~6 using your latest patch
>>>>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-September/023146.html 
>>>>>>
>>>>
>>>
>>> My test jobs passed with no new failures, so I'll push now.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>> Thanks,
>>>> Jiangli
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>>>>>> Thanks,
>>>>>>> Jiangli
>>>>>>>
>>>>>>> On 9/7/18 5:58 PM, Ioi Lam wrote:
>>>>>>>>
>>>>>>>> Hi Jiangli,
>>>>>>>>
>>>>>>>> Thanks for the review!
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/ 
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04-delta/ 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 9/6/2018 4:28 PM, Jiangli Zhou wrote:
>>>>>>>>> Hi Ioi,
>>>>>>>>>
>>>>>>>>> Sorry for the delay. It took me a while to get back to this. 
>>>>>>>>> Thanks for folding the closures together. It cuts the cost by 
>>>>>>>>> almost half. Here are my detailed comments.
>>>>>>>>>
>>>>>>>>> - _subgraph_object_klasses renaming
>>>>>>>>>
>>>>>>>>> The reason that I choose _subgraph_object_klasses was to 
>>>>>>>>> disambiguate between the klass containing a subgraph and the 
>>>>>>>>> klass of an object included in a subgraph. Is there a specific 
>>>>>>>>> reason to rename the term to _subgraph_klasses?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I just wanted to shorten the names to be the same as 
>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses, but I agree 
>>>>>>>> with you for the naming.
>>>>>>>>
>>>>>>>> So instead, I've renamed 
>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses -> 
>>>>>>>> subgraph_object_klasses to make the code consistent. What do 
>>>>>>>> you think?
>>>>>>>>
>>>>>>>>> - Could you please align the lines at 358 - 360.
>>>>>>>>>
>>>>>>>>>  355   WalkOopAndArchiveClosure(int level, bool 
>>>>>>>>> record_klasses_only,
>>>>>>>>>  356                            KlassSubGraphInfo* subgraph_info,
>>>>>>>>>  357                            oop orig, oop archived, TRAPS) :
>>>>>>>>>  358     _level(level), 
>>>>>>>>> _record_klasses_only(record_klasses_only),
>>>>>>>>>  359     _subgraph_info(subgraph_info),
>>>>>>>>>  360     _orig_referencing_obj(orig), 
>>>>>>>>> _archived_referencing_obj(archived),
>>>>>>>>>  361     _thread(THREAD) {}
>>>>>>>>>
>>>>>>>>
>>>>>>>> The 2 space indent is for the C++ field initializer list. 
>>>>>>>> HotSpot style guide doesn't mention this 
>>>>>>>> (https://wiki.openjdk.java.net/display/HotSpot/StyleGuide) so I 
>>>>>>>> just used emacs's default indentation.
>>>>>>>>
>>>>>>>>> - (2) should be only applied to object that's not seen within 
>>>>>>>>> the current subgraph. Please clarify it in the comment.
>>>>>>>>>
>>>>>>>>>  401 // (2) If orig_obj has not been seen yet, trace all
>>>>>>>>>  402 //     objects that are reachable from it, and make sure 
>>>>>>>>> these objects are archived.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I changed it to
>>>>>>>>
>>>>>>>> // (2) If orig_obj has not been seen yet (since 
>>>>>>>> start_recording_subgraph() was called),
>>>>>>>> //     trace all  objects that are reachable from it, and make 
>>>>>>>> sure these objects are archived.
>>>>>>>>
>>>>>>>>
>>>>>>>>> - Some Strings (such as "") are not archived in the closed 
>>>>>>>>> archive heap region. In that case, the String object is 
>>>>>>>>> automatically archived in the open archive heap region. The 
>>>>>>>>> following comment should be clarified and fixed.
>>>>>>>>>
>>>>>>>>>  417     // Strings are already archived in closed archive. No 
>>>>>>>>> need to walk it
>>>>>>>>>
>>>>>>>>
>>>>>>>> I changed to
>>>>>>>>
>>>>>>>>   if (java_lang_String::is_instance(orig_obj) && archived_obj 
>>>>>>>> != NULL) {
>>>>>>>>     // To save time, don't walk strings that are already 
>>>>>>>> archived. They just contain
>>>>>>>>     // pointers to a type array, whose klass doesn't need to be 
>>>>>>>> recorded.
>>>>>>>>     return archived_obj;
>>>>>>>>   }
>>>>>>>>
>>>>>>>>
>>>>>>>>> - Could you please explain the reason for using two loops 
>>>>>>>>> here? Shouldn't one loop be sufficient?
>>>>>>>>>
>>>>>>>>>  646   for (int i = 0; i < num_archivable_static_fields; ) {
>>>>>>>>>
>>>>>>>>>  650
>>>>>>>>>  651     int j = i;
>>>>>>>>>  652     while (j < num_archivable_static_fields) {
>>>>>>>>>  653       ArchivableStaticFieldInfo* f = 
>>>>>>>>> &archivable_static_fields[j];
>>>>>>>>>  654       if (f->klass_name == klass_name) {
>>>>>>>>>  655 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>>>> f->klass_name,
>>>>>>>>>  656 f->offset, f->field_name, CHECK);
>>>>>>>>>  657         j++;
>>>>>>>>>  658       } else {
>>>>>>>>>  659         break;
>>>>>>>>>  660       }
>>>>>>>>>  661     }
>>>>>>>>> ..
>>>>>>>>>  663     i = j;
>>>>>>>>>  664   }
>>>>>>>>>  665
>>>>>>>>
>>>>>>>> That's because you can archive multiple fields in the same klass:
>>>>>>>>
>>>>>>>> // If you add new entries to this table, you should know what 
>>>>>>>> you're doing!
>>>>>>>> static ArchivableStaticFieldInfo archivable_static_fields[] = {
>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>> "archivedSystemModules"},
>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>> "archivedModuleFinder"},
>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>> "archivedMainModule"},
>>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>>> "archivedConfiguration"},
>>>>>>>>   {"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
>>>>>>>>
>>>>>>>> Note that start/done_recording_subgraph are called in the i 
>>>>>>>> loop and not the j loop.
>>>>>>>>
>>>>>>>>> - HeapShared::archive_reachable_objects_from() should take a 
>>>>>>>>> '_record_klasses_only' argument from caller, which decides if 
>>>>>>>>> it only needs to record the klass list.
>>>>>>>>>
>>>>>>>>> As we walk all reachable objects from an entry point (a static 
>>>>>>>>> field) during walk&archive process, at the end of it we obtain 
>>>>>>>>> a complete and closed archived subgraph. Every object is also 
>>>>>>>>> the root of a sub-subgraph that's reachable from it. In a 
>>>>>>>>> subsequent subgraph archiving process from a different entry 
>>>>>>>>> point, if we encounter an object that is already archived, 
>>>>>>>>> then every objects contained within the sub-subgraph reachable 
>>>>>>>>> from that root are already archived also. So, 
>>>>>>>>> '_record_klasses_only' should be a inherited flag from the 
>>>>>>>>> root and passed to all 
>>>>>>>>> HeapShared::archive_reachable_objects_from() calls for 
>>>>>>>>> processing objects within a sub-subgraph.
>>>>>>>>>
>>>>>>>>> During the walk, '_record_klasses_only' should be false 
>>>>>>>>> initially. Once it sees an object that already has an archived 
>>>>>>>>> copy (but not seen in the current subgraph), the 
>>>>>>>>> '_record_klasses_only' needs to become true and be propagated 
>>>>>>>>> to all sub-walking processes. '_record_klasses_only' should 
>>>>>>>>> never go back to false state.
>>>>>>>>>
>>>>>>>>> With that, we can guarantee to build a correct archived 
>>>>>>>>> subgraph without the risk of infinite cycles. It also reduces 
>>>>>>>>> overhead for the walk&archive process.
>>>>>>>>
>>>>>>>> I actually started to write to code exactly as you suggested, 
>>>>>>>> by passing record_klasses_only down, and then I found out that 
>>>>>>>> it's unnecessary, and having it will require me to add 
>>>>>>>> complicated asserts that was beyond my ability to get right.
>>>>>>>>
>>>>>>>> As I understand, you worry about about correctness (termination 
>>>>>>>> of the algorithm), and performance.
>>>>>>>>
>>>>>>>> I believe the logic of the current code is correct, and the 
>>>>>>>> algorithm will terminate:
>>>>>>>>
>>>>>>>>       oop archived_obj =
>>>>>>>>     MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>>       ....
>>>>>>>>       if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>>         // orig_obj has already been archived and traced.
>>>>>>>>     Nothing more to do.
>>>>>>>>         return archived_obj;
>>>>>>>>       }
>>>>>>>>       ...
>>>>>>>>      WalkOopAndArchiveClosure walker(level, record_klasses_only,
>>>>>>>>     subgraph_info, orig_obj, archived_obj, THREAD);
>>>>>>>>      orig_obj->oop_iterate(&walker); // recurse
>>>>>>>>
>>>>>>>> HeapShared::archive_reachable_objects_from will recurse ONLY if 
>>>>>>>> you call this function with an object that has not need seen, 
>>>>>>>> so eventually the recursion will stop because all objects in 
>>>>>>>> the subgraph has been seen.
>>>>>>>>
>>>>>>>> To make the code easier to understand, I rewrote it like this:
>>>>>>>>
>>>>>>>>   if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>>     // orig_obj has already been archived and traced. Nothing 
>>>>>>>> more to do.
>>>>>>>>     return archived_obj;
>>>>>>>>   } else {
>>>>>>>> set_has_been_seen_during_subgraph_recording(orig_obj);
>>>>>>>>    }
>>>>>>>>
>>>>>>>> For performance, the logic for deciding record_klasses_only is 
>>>>>>>> this:
>>>>>>>>
>>>>>>>>   oop archived_obj = 
>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>>    ...
>>>>>>>>   bool record_klasses_only = (archived_obj != NULL);
>>>>>>>>
>>>>>>>> You have to call find_archived_heap_object anyway, so that's 
>>>>>>>> unavoidable. The cost of computing record_klasses_only is very 
>>>>>>>> simple. For performance, there's no need for the it to be 
>>>>>>>> tracked in the recursive calls. That makes the code simpler, 
>>>>>>>> and easier to maintain.
>>>>>>>>
>>>>>>>> Otherwise we may need to think about whether the caller and the 
>>>>>>>> callee have different opinion on record_klasses_only:
>>>>>>>>
>>>>>>>>   - if this could happen, what would cause this to happen, and 
>>>>>>>> how should we handle it?
>>>>>>>>   - if this could never happen, how would you write an 
>>>>>>>> assertion to validate that?
>>>>>>>>
>>>>>>>> Writing those asserts turned out to be too complicated.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> - Could add explanation in the comment?
>>>>>>>>>
>>>>>>>>> 438         // See 
>>>>>>>>> runtime/appcds/cacheObject/ArchivedIntegerCacheTest.java
>>>>>>>>>
>>>>>>>>
>>>>>>>> I changed it to
>>>>>>>>
>>>>>>>>     if (level == 1) {
>>>>>>>>         // Don't archive a subgraph root that's too big. For 
>>>>>>>> archives static fields, that's OK
>>>>>>>>         // as the Java code will take care of initializing this 
>>>>>>>> field dynamically.
>>>>>>>>         return NULL;
>>>>>>>>       }
>>>>>>>>
>>>>>>>>> - Please add the step 5) comment back to the walk&archive 
>>>>>>>>> description.
>>>>>>>>>
>>>>>>>>>  466 // 5) The Klass of the current java object is added to 
>>>>>>>>> the list of Klasses
>>>>>>>>>  467 // for loading and initialzing before any object in the 
>>>>>>>>> archived graph can
>>>>>>>>>  468 // be accessed at runtime.
>>>>>>>>>  469 //
>>>>>>>>>
>>>>>>>>
>>>>>>>> Done. Sorry I removed it by mistake.
>>>>>>>>
>>>>>>>>> - The HeapShared::archive_module_graph_objects() can take a 
>>>>>>>>> more general name now, if you want to do it as part of your 
>>>>>>>>> change.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I renamed it to archive_static_fields(). I also removed/fixed a 
>>>>>>>> few comments about "Currently there is only one class mirror 
>>>>>>>> (ArchivedModuleGraph) with archived sub-graphs."
>>>>>>>>
>>>>>>>>> Could you please also send a sample of the changed cds+heap 
>>>>>>>>> debug log output?
>>>>>>>>>
>>>>>>>> See 
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/logs/
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> - Ioi
>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Jiangli
>>>>>>>>>
>>>>>>>>> On 9/5/18 3:56 PM, Ioi Lam wrote:
>>>>>>>>>> I updated the patch:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03-delta/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 1. No need to walk String objects during subgraph recording.
>>>>>>>>>>
>>>>>>>>>> 2. Print put statistics:
>>>>>>>>>>
>>>>>>>>>> =========================
>>>>>>>>>>
>>>>>>>>>> jdk/internal/module/ArchivedModuleGraph: walked 1871 objs, 
>>>>>>>>>> archived 1871 new objs, recorded 28 classes
>>>>>>>>>> java/util/ImmutableCollections$ListN: walked 2 objs, archived 
>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>> java/util/ImmutableCollections$MapN: walked 2 objs, archived 
>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>> java/util/ImmutableCollections$SetN: walked 2 objs, archived 
>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>> java/lang/Integer$IntegerCache: walked 257 objs, archived 256 
>>>>>>>>>> new objs, recorded 2 classes
>>>>>>>>>> java/lang/module/Configuration: walked 7 objs, archived 0 new 
>>>>>>>>>> objs, recorded 3 classes
>>>>>>>>>>
>>>>>>>>>> Performed subgraph records = 6 times
>>>>>>>>>> Walked 2141 objects
>>>>>>>>>> Archived 2127 objects
>>>>>>>>>> Recorded 33 klasses
>>>>>>>>>>
>>>>>>>>>> ========
>>>>>>>>>>
>>>>>>>>>> So, at least for now, we don't need to worry about the 
>>>>>>>>>> performance of WalkOopAndArchiveClosure.
>>>>>>>>>>
>>>>>>>>>> Once I get some statistics of Lambda archiving, I'll post 
>>>>>>>>>> them as well.
>>>>>>>>>>
>>>>>>>>>> So, with the bug fix, the subgraph recording should be both 
>>>>>>>>>> correct and faster than before (no more String walking).
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> - Ioi
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/3/2018 1:02 AM, Ioi Lam wrote:
>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>
>>>>>>>>>>> I think your suggestion of combining of combining 
>>>>>>>>>>> RecordKlassesClosure and WalkOopAndArchiveClosure is a good 
>>>>>>>>>>> idea. I have updated the webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v02/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This also supersedes my other patch for JDK-8210295 Refactor 
>>>>>>>>>>> HeapShared::archive_reachable_objects_from_static_field
>>>>>>>>>>>
>>>>>>>>>>> More comments below on future optimizations.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9/2/18 6:07 PM, Jiangli Zhou wrote:
>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/1/18 8:58 PM, Ioi Lam wrote:
>>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for looking at the optimization of this problem. I 
>>>>>>>>>>>>> think your algorithm will run into problems if (a) there 
>>>>>>>>>>>>> is a cycle, and (b) two sub-graphs point to different 
>>>>>>>>>>>>> members in the cycle.
>>>>>>>>>>>>
>>>>>>>>>>>> I think such issue can be solved by tracking the original K 
>>>>>>>>>>>> (in the K-list, short for the class-initialization-list) 
>>>>>>>>>>>> for each archived object. If an encountered object is 
>>>>>>>>>>>> already archived (within the current sub-graph) and is 
>>>>>>>>>>>> associated with K1 within the current K-list (must be in 
>>>>>>>>>>>> this case). We can update the sub-K-list between K1 and the 
>>>>>>>>>>>> K associated with the referencing object of the current 
>>>>>>>>>>>> object, so their level are the same. We can also add a flag 
>>>>>>>>>>>> to the K-list elements to tag each K within the cycle. It 
>>>>>>>>>>>> may not be what we need to do currently.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please describe the specific class that you ran 
>>>>>>>>>>>> into with the bug in your sub-graph?
>>>>>>>>>>>
>>>>>>>>>>> I found the bug by code examination. It's also during this 
>>>>>>>>>>> time that I realized that it's not a simple problem to 
>>>>>>>>>>> optimize, because of cycles in the graph. There's been tons 
>>>>>>>>>>> of established research in this area (as in "Tarjan's 
>>>>>>>>>>> algorithm is one of Donald Knuth's favorite 
>>>>>>>>>>> implementations"),  so I don't think we should reinvent the 
>>>>>>>>>>> wheel. For future optimization, let's just apply an existing 
>>>>>>>>>>> algorithm.
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe someone familiar with graph theory can suggest a 
>>>>>>>>>>>>> solution?
>>>>>>>>>>>>>
>>>>>>>>>>>>> My current thinking is to first find all the cycles in the 
>>>>>>>>>>>>> archived heap. The reachability of every member in the 
>>>>>>>>>>>>> cycle is the same, so we can collapse each cycle into a 
>>>>>>>>>>>>> single node. After that we have a DAG. Then, instead of 
>>>>>>>>>>>>> encoding the class initialization order as multiple lists, 
>>>>>>>>>>>>> we can encode it as a DAG to save space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> See 
>>>>>>>>>>>>> https://stackoverflow.com/questions/261573/best-algorithm-for-detecting-cycles-in-a-directed-graph
>>>>>>>>>>>>>
>>>>>>>>>>>>> In any case, I think this is a sufficiently difficult 
>>>>>>>>>>>>> problem, so we should separate the dumping of the 
>>>>>>>>>>>>> sub-graphs (the first part of 
>>>>>>>>>>>>> HeapShared::archive_module_graph_objects() in my patch) 
>>>>>>>>>>>>> and the computation of the initialization order 
>>>>>>>>>>>>> (HeapShared::record_subgraph_klasses()).
>>>>>>>>>>>> Our current K-lists are quite small. If the number of 
>>>>>>>>>>>> sub-graphs grows in the future and there are many 
>>>>>>>>>>>> duplicates in the K-lists, we can explore using a single 
>>>>>>>>>>>> K-list, with each sub-graph-info recording the 
>>>>>>>>>>>> [start-index, end-index] pairs of the sub-K-lists.
>>>>>>>>>>>>
>>>>>>>>>>>> To avoid any confusion to others, I want to point out that 
>>>>>>>>>>>> the order of the elements in the K-list is not the 
>>>>>>>>>>>> initialization order of the classes that happens at dump time.
>>>>>>>>>>>
>>>>>>>>>>> Sorry for the confusion. I wanted to avoid using the word 
>>>>>>>>>>> "list" -- I think the most space efficient representation 
>>>>>>>>>>> will be a DAG instead of a list -- I should have used "set" 
>>>>>>>>>>> instead.
>>>>>>>>>>>
>>>>>>>>>>> Thanks again for spending time on this topic.
>>>>>>>>>>>
>>>>>>>>>>> - Ioi
>>>>>>>>>>>> The order of the K-list elements is determined by how we 
>>>>>>>>>>>> walk the graph. It works for our purpose because we are not 
>>>>>>>>>>>> doing general-purpose object graph archiving (at least 
>>>>>>>>>>>> currently), and all classes in the targeted sub-graph for 
>>>>>>>>>>>> archiving must not have any dependencies on runtime 
>>>>>>>>>>>> context. When we add more sub-graphs to the archive in the 
>>>>>>>>>>>> future, we can examine our existing assumptions and lift 
>>>>>>>>>>>> limitations case by case with enhancements.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I think we should defer the optimization of 
>>>>>>>>>>>>> record_subgraph_klasses in a future RFE. That way, we can 
>>>>>>>>>>>>> have correctness now, and performance later when we 
>>>>>>>>>>>>> actually need it. (The current implementation of 
>>>>>>>>>>>>> record_subgraph_klasses() has not caused any perceivable 
>>>>>>>>>>>>> delay for the current set of objects that we are archiving).
>>>>>>>>>>>> We probably don't need to do what I described in the above 
>>>>>>>>>>>> for now. I've looked at your changes, I think we can choose 
>>>>>>>>>>>> a middle-ground approach that's more efficient than your 
>>>>>>>>>>>> current change, but could still be O(N*M) in the worst case 
>>>>>>>>>>>> (N is the number of nodes, M is the number of sub-graphs). 
>>>>>>>>>>>> It also avoid duplicating the logic between the 
>>>>>>>>>>>> archiving_walk closure and the new RecordKlassesClosure.
>>>>>>>>>>>>
>>>>>>>>>>>> In the middle-ground approach, WalkOopAndArchiveClosure can 
>>>>>>>>>>>> take a new boolean flag, 'record_klass_only'. We can use a 
>>>>>>>>>>>> separate table to track the objects we have seen in the 
>>>>>>>>>>>> current sub-graph as you are doing in the webrev. If the 
>>>>>>>>>>>> there is an existing archived copy of the current object, 
>>>>>>>>>>>> we check the other table to determine if it's already in 
>>>>>>>>>>>> the current sub-graph. If yes, no additional work is 
>>>>>>>>>>>> needed. If no, recursively enter the closure but do 
>>>>>>>>>>>> 'record_klass_only'.
>>>>>>>>>>>>
>>>>>>>>>>>>       oop archived = 
>>>>>>>>>>>> MetaspaceShared::find_archived_heap_object(obj);
>>>>>>>>>>>>       if (archived != NULL) {
>>>>>>>>>>>>         if (has_seen_in_the_current_graph) {
>>>>>>>>>>>>             // no work
>>>>>>>>>>>>         } else {
>>>>>>>>>>>>             // walk the object but record klass only
>>>>>>>>>>>>         }
>>>>>>>>>>>>         return;
>>>>>>>>>>>>       }
>>>>>>>>>>>>
>>>>>>>>>>>> With that, we can avoid duplicating the logic in 
>>>>>>>>>>>> RecordKlassesClosure and also avoid walking the sub-graph 
>>>>>>>>>>>> objects twice in most cases. What do you think?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 9/1/18 7:55 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sep 1, 2018, at 7:25 PM, Jiangli Zhou 
>>>>>>>>>>>>>>> <jiangli.zhou at oracle.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for finding the bug. To address the incomplete 
>>>>>>>>>>>>>>> class list issue, the original algorithm can be 
>>>>>>>>>>>>>>> augmented while remaining as an O(N) solution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As each node (object) in an archived subgraph is a root 
>>>>>>>>>>>>>>> of a sub-subgraph, the class-initialization-list 
>>>>>>>>>>>>>>> associated with a specific node (within an existing 
>>>>>>>>>>>>>>> archived sub-graph) is also a sub-list of the enclosing 
>>>>>>>>>>>>>>> sub-graph's class-initialization-list. For example,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sub-graph1:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>              O1(k1)
>>>>>>>>>>>>>>>            /         \
>>>>>>>>>>>>>>>      O2(k2)       O5(k5)
>>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Klass-init-list: K1, K2, K3, K4, K5
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sub-graph2:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>      O2(k2)
>>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Klass-init-list: K2, K3, K4
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> During the sub-graph walking and archiving process for 
>>>>>>>>>>>>>>> sub-graph2, if O2 has been previously archived in 
>>>>>>>>>>>>>>> another sub-graph, we can find the other sub-graph's 
>>>>>>>>>>>>>>> class list and take the sub-list starting from the klass 
>>>>>>>>>>>>>>> (k2), without re-walking each nodes again. To do that, 
>>>>>>>>>>>>>>> we only need to walk the existing recorded 
>>>>>>>>>>>>>>> class-initialization-list. If K2 is found in any of the 
>>>>>>>>>>>>>>> existing list, we can take the sub-list starting from K2 
>>>>>>>>>>>>>>> and append the list to the current one.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With the approach, K5 would also be included if O5 is 
>>>>>>>>>>>>>>> walked after O2 in sub-graph1. However, O5 is not part 
>>>>>>>>>>>>>>> of O2's sub-graph. So that would introduce overhead at 
>>>>>>>>>>>>>>> runtime. To avoid such problem, we can remember the 
>>>>>>>>>>>>>>> sub-graph level (which is already built as part of the 
>>>>>>>>>>>>>>> existing graph walking algorithm) for each K in the 
>>>>>>>>>>>>>>> class-initialization-list. The ending indication of the 
>>>>>>>>>>>>>>> sub-list would be the first K with the same level as the 
>>>>>>>>>>>>>>> starting K. So we would have:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     K1(1), K2(2), K3(3), K4(4), K5(2)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K2 
>>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>> Typo. The second K2 above should be k5:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K5, 
>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This part is not currently implemented yet but is not 
>>>>>>>>>>>>>>> difficult to add.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 9/1/18 6:08 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210289
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v01/ 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Description
>>>>>>>>>>>>>>>> ===========
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I found this bug while trying to merge my code for CDS 
>>>>>>>>>>>>>>>> support of Lambda
>>>>>>>>>>>>>>>> classes (JDK-8198698).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When heapShared.cpp dumps a sub-graph of heap objects, 
>>>>>>>>>>>>>>>> it attempts to
>>>>>>>>>>>>>>>> record all the classes of all the objects that are 
>>>>>>>>>>>>>>>> referenced by
>>>>>>>>>>>>>>>> this sub-graph. However, if one of these objects have 
>>>>>>>>>>>>>>>> already been visited
>>>>>>>>>>>>>>>> while a previous sub-graph was dumped, then this 
>>>>>>>>>>>>>>>> object's class is not
>>>>>>>>>>>>>>>> recorded in the current sub-graph.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> At runtime, if the current sub-graph is restored before 
>>>>>>>>>>>>>>>> any other
>>>>>>>>>>>>>>>> sub-graphs, we will end up with a live object in the 
>>>>>>>>>>>>>>>> Java heap with
>>>>>>>>>>>>>>>> an uninitialized class.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fix
>>>>>>>>>>>>>>>> ===
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Now I create the sub-graph's klasses list after all 
>>>>>>>>>>>>>>>> sub-graphs have dumped.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For each class that has archived sub-graph(s), I do a 
>>>>>>>>>>>>>>>> heap walk to discover
>>>>>>>>>>>>>>>> all klasses that are reachable from this archived 
>>>>>>>>>>>>>>>> fields of this class.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is sub-optimal but suffice for now because we just 
>>>>>>>>>>>>>>>> have a very small
>>>>>>>>>>>>>>>> number of sub-graphs. The worst case its O(N^2) where N 
>>>>>>>>>>>>>>>> is the number of
>>>>>>>>>>>>>>>> objects in the archived heap.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In the future, we might need to implement a more 
>>>>>>>>>>>>>>>> efficient algorithm that
>>>>>>>>>>>>>>>> walks the heap once and computes all the klasses lists 
>>>>>>>>>>>>>>>> of all the
>>>>>>>>>>>>>>>> sub-graphs at the same time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>>>> =======
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> hs-tier[1,2,3]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>