RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is incomplete

Ioi Lam ioi.lam at oracle.com
Tue Sep 11 03:17:23 UTC 2018


Hi Jiangli,

Thank you for being thorough with the review. I think it really makes 
the code much better than I originally had :-)

- Ioi

On 9/10/18 6:35 PM, Jiangli Zhou wrote:
>
> On 9/10/18 4:10 PM, Ioi Lam wrote:
>
>> On 9/10/2018 1:00 PM, Jiangli Zhou wrote:
>>
>>> Hi Ioi,
>>>
>>> Thanks for another updated webrev. Please see comments (minor) below.
>>>
>>>>> *From:* Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>>
>>>>> *Date:* September 9, 2018 at 12:11:42 PM PDT
>>>>> *To:* Jiangli Zhou <jiangli.zhou at oracle.com 
>>>>> <mailto:jiangli.zhou at oracle.com>>, 
>>>>> hotspot-runtime-dev at openjdk.java.net 
>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>> *Subject:* *Re: RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is 
>>>>> incomplete*
>>>>>
>>>>> Hi Jiangli,
>>>>>
>>>>> Thanks again for the review. I've updated the patch:
>>>>>
>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05/
>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05-delta/
>>>>>
>>>>> More comments inline
>>>>>
>>>>> On 9/8/18 2:37 PM, Jiangli Zhou wrote:
>>>>>> Hi Ioi,
>>>>>>
>>>>>> The updated webrev looks great. Here are some additional comments.
>>>>>>
>>>>>> - Thanks for answering my question about the two loops. If we 
>>>>>> want to solve the klass list issue as a general problem, we also 
>>>>>> need to allow the cases where the entry fields from the same 
>>>>>> class are not specified consecutively. The 'break' in the inner 
>>>>>> loop should be a 'continue'. Currently, all the entry fields from 
>>>>>> a single class are specified together as we are doing it 
>>>>>> manually. When we enhance the mechanism by adding the other half 
>>>>>> API that enables registering the entry fields from java code 
>>>>>> automatically, we might see fields from different classes are in 
>>>>>> mixed orders.
>>>>>>
>>>>>>  657     while (j < num_archivable_static_fields) {
>>>>>>  658       ArchivableStaticFieldInfo* f = 
>>>>>> &archivable_static_fields[j];
>>>>>>  659       if (f->klass_name == klass_name) {
>>>>>>  660 archive_reachable_objects_from_static_field(f->klass, 
>>>>>> f->klass_name,
>>>>>>  661                                                     
>>>>>> f->offset, f->field_name, CHECK);
>>>>>>  662         j++;
>>>>>>  663       } else {
>>>>>>  664         break; <<<<<<<
>>>>>>  665       }
>>>>>>  666     }
>>>>>>
>>>>>
>>>>> It's actually OK now to not order the archivable_static_fields[] 
>>>>> array by class name. The inner loop is just for optimization. I've 
>>>>> updated the code and added comments to make this clear.
>>>>>
>>>>> I tested by reordering the archivable_static_fields[] array (move 
>>>>> the "archivedModuleFinder" line to the end of the table). It shows:
>>>>>
>>>>>     Performed subgraph records = 7 times
>>>>>     Walked 3610 objects
>>>>>     Archived 2127 objects
>>>>>     Recorded 33 klasses
>>>>>
>>>>> With the original (sorted) table, it's faster with fewer walks, 
>>>>> but that doesn't change the number of archived objects or recorded 
>>>>> klasses:
>>>>>
>>>>>     Performed subgraph records = 6 times
>>>>>     Walked 2141 objects
>>>>>     Archived 2127 objects
>>>>>     Recorded 33 klasses
>>>>>
>>> The seen_objects_table resets when you see a klass_name that's 
>>> different. When the entry fields from different classes are mixed, 
>>> you might record an object klass that's already seen early for the 
>>> same containing class. That is not incorrect, but adds overhead 
>>> (although very small) to runtime. For following entry fields sequence:
>>>
>>> A.f1
>>> A.f2
>>> B.f1
>>> A.f3
>>>
>>> For example, K1 is recorded for A.f1, A.f2 subgraphs. When archiving 
>>> A.f3 subgraph, K1 is encountered, K1 would be recorded again for 
>>> A.f3 because a different seen_objects_table is used. We can fix the 
>>> logic here or add a post process to eliminate any duplication in the 
>>> klass lists belonging to the same containing klass. The post process 
>>> seems to be an overkill.
>>
>> There's no duplication in the _subgraph_object_klasses list because 
>> we add it to only when needed:
>>
>>   _subgraph_object_klasses->append_if_missing(relocated_k);
>>
>> So there's no runtime impact. (You can also see that with the same 
>> "Recorded 33 klasses" from the log above.)
> This indeed has no issue. I forgot that I built a single 
> subgraph_info_record for each containing klass, but not for each entry 
> field. :(
>
> One containing klass may have multiple subgraphs referenced from 
> multiple entry points. Since there is no separate info_record for each 
> entry field from the same containing klass, then there is no 
> duplicates for the recorded object_klass. Sorry for the confusion! 
> Thank you being patient with me.
>
> Thanks,
> Jiangli
>>
>>
>>>>>
>>>>>> - In order to make sure subgraphs are archived correctly, we need 
>>>>>> to do proper state propagation and checks/asserts, including the 
>>>>>> record_klasses_only state. When we see record_klasses_only 
>>>>>> becomes true, we need to verify all reachable object from the 
>>>>>> current one must have existing archived copies already. That 
>>>>>> helps to confirm that an earlier archived subgraph is complete, 
>>>>>> especially if the current walk enters an sub-subgraph from a 
>>>>>> different node. For your current scope of changes, resetting the 
>>>>>> record_klasses_only state based on if there is an existing 
>>>>>> archived copy is ok. We can enhance it separately.
>>>>>>
>>>>>
>>>>> I added a separate pass to verify these -- for each root 
>>>>> (orig_obj) of an archived subgraph:
>>>>>
>>>>>   // (1) Verify that all objects reachable from orig_obj are archived.
>>>>>   // (2) Verify that all objects reachable from the archived copy 
>>>>> of orig_obj
>>>>>   //     can only point to archived objects.
>>>>>
>>>>> I think that's sufficient to ensure that you have correctly 
>>>>> archived everything you need.
>>>
>>> The verify code is good to have. (2) is not necessary as that's 
>>> already done by the archiving verification code in GC. There is more 
>>> fine grained region verification implemented for archived objects in 
>>> G1 and (2) is already done for all objects in open archive heap 
>>> regions at dump time.
>>>
>>
>> Thanks for letting me know. I've replaced (2) with this comment.
>>
>>   // Note: we could also verify that all objects reachable from the 
>> archived
>>   // copy of orig_obj can only point to archived objects, with:
>>   //      init_seen_objects_table();
>>   //      verify_reachable_objects_from(archived_obj, true);
>>   //      init_seen_objects_table();
>>   // but that's already done in 
>> G1HeapVerifier::verify_archive_regions so we
>>   // won't do it here.
>>
>>
>>
>>>>>
>>>>>> Please also test all tier1-tier6 for your changes with the 
>>>>>> default CDS archive patch.
>>>>>>
>>>>> I just submitted a job for hs-tier1~6 using your latest patch
>>>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-September/023146.html
>>>
>>
>> My test jobs passed with no new failures, so I'll push now.
>>
>> Thanks
>> - Ioi
>>
>>
>>> Thanks,
>>> Jiangli
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>> Thanks,
>>>>>> Jiangli
>>>>>>
>>>>>> On 9/7/18 5:58 PM, Ioi Lam wrote:
>>>>>>>
>>>>>>> Hi Jiangli,
>>>>>>>
>>>>>>> Thanks for the review!
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04-delta/
>>>>>>>
>>>>>>>
>>>>>>> On 9/6/2018 4:28 PM, Jiangli Zhou wrote:
>>>>>>>> Hi Ioi,
>>>>>>>>
>>>>>>>> Sorry for the delay. It took me a while to get back to this. 
>>>>>>>> Thanks for folding the closures together. It cuts the cost by 
>>>>>>>> almost half. Here are my detailed comments.
>>>>>>>>
>>>>>>>> - _subgraph_object_klasses renaming
>>>>>>>>
>>>>>>>> The reason that I choose _subgraph_object_klasses was to 
>>>>>>>> disambiguate between the klass containing a subgraph and the 
>>>>>>>> klass of an object included in a subgraph. Is there a specific 
>>>>>>>> reason to rename the term to _subgraph_klasses?
>>>>>>>>
>>>>>>>
>>>>>>> I just wanted to shorten the names to be the same as 
>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses, but I agree 
>>>>>>> with you for the naming.
>>>>>>>
>>>>>>> So instead, I've renamed 
>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses -> 
>>>>>>> subgraph_object_klasses to make the code consistent. What do you 
>>>>>>> think?
>>>>>>>
>>>>>>>> - Could you please align the lines at 358 - 360.
>>>>>>>>
>>>>>>>>  355   WalkOopAndArchiveClosure(int level, bool 
>>>>>>>> record_klasses_only,
>>>>>>>>  356                            KlassSubGraphInfo* subgraph_info,
>>>>>>>>  357                            oop orig, oop archived, TRAPS) :
>>>>>>>>  358     _level(level), _record_klasses_only(record_klasses_only),
>>>>>>>>  359     _subgraph_info(subgraph_info),
>>>>>>>>  360     _orig_referencing_obj(orig), 
>>>>>>>> _archived_referencing_obj(archived),
>>>>>>>>  361     _thread(THREAD) {}
>>>>>>>>
>>>>>>>
>>>>>>> The 2 space indent is for the C++ field initializer list. 
>>>>>>> HotSpot style guide doesn't mention this 
>>>>>>> (https://wiki.openjdk.java.net/display/HotSpot/StyleGuide) so I 
>>>>>>> just used emacs's default indentation.
>>>>>>>
>>>>>>>> - (2) should be only applied to object that's not seen within 
>>>>>>>> the current subgraph. Please clarify it in the comment.
>>>>>>>>
>>>>>>>>  401 // (2) If orig_obj has not been seen yet, trace all
>>>>>>>>  402 //     objects that are reachable from it, and make sure 
>>>>>>>> these objects are archived.
>>>>>>>>
>>>>>>>
>>>>>>> I changed it to
>>>>>>>
>>>>>>> // (2) If orig_obj has not been seen yet (since 
>>>>>>> start_recording_subgraph() was called),
>>>>>>> //     trace all  objects that are reachable from it, and make 
>>>>>>> sure these objects are archived.
>>>>>>>
>>>>>>>
>>>>>>>> - Some Strings (such as "") are not archived in the closed 
>>>>>>>> archive heap region. In that case, the String object is 
>>>>>>>> automatically archived in the open archive heap region. The 
>>>>>>>> following comment should be clarified and fixed.
>>>>>>>>
>>>>>>>>  417     // Strings are already archived in closed archive. No 
>>>>>>>> need to walk it
>>>>>>>>
>>>>>>>
>>>>>>> I changed to
>>>>>>>
>>>>>>>   if (java_lang_String::is_instance(orig_obj) && archived_obj != 
>>>>>>> NULL) {
>>>>>>>     // To save time, don't walk strings that are already 
>>>>>>> archived. They just contain
>>>>>>>     // pointers to a type array, whose klass doesn't need to be 
>>>>>>> recorded.
>>>>>>>     return archived_obj;
>>>>>>>   }
>>>>>>>
>>>>>>>
>>>>>>>> - Could you please explain the reason for using two loops here? 
>>>>>>>> Shouldn't one loop be sufficient?
>>>>>>>>
>>>>>>>>  646   for (int i = 0; i < num_archivable_static_fields; ) {
>>>>>>>>
>>>>>>>>  650
>>>>>>>>  651     int j = i;
>>>>>>>>  652     while (j < num_archivable_static_fields) {
>>>>>>>>  653       ArchivableStaticFieldInfo* f = 
>>>>>>>> &archivable_static_fields[j];
>>>>>>>>  654       if (f->klass_name == klass_name) {
>>>>>>>>  655 archive_reachable_objects_from_static_field(f->klass, 
>>>>>>>> f->klass_name,
>>>>>>>>  656 f->offset, f->field_name, CHECK);
>>>>>>>>  657         j++;
>>>>>>>>  658       } else {
>>>>>>>>  659         break;
>>>>>>>>  660       }
>>>>>>>>  661     }
>>>>>>>> ..
>>>>>>>>  663     i = j;
>>>>>>>>  664   }
>>>>>>>>  665
>>>>>>>
>>>>>>> That's because you can archive multiple fields in the same klass:
>>>>>>>
>>>>>>> // If you add new entries to this table, you should know what 
>>>>>>> you're doing!
>>>>>>> static ArchivableStaticFieldInfo archivable_static_fields[] = {
>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>> "archivedSystemModules"},
>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>> "archivedModuleFinder"},
>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", "archivedMainModule"},
>>>>>>>   {"jdk/internal/module/ArchivedModuleGraph", 
>>>>>>> "archivedConfiguration"},
>>>>>>>   {"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
>>>>>>>
>>>>>>> Note that start/done_recording_subgraph are called in the i loop 
>>>>>>> and not the j loop.
>>>>>>>
>>>>>>>> - HeapShared::archive_reachable_objects_from() should take a 
>>>>>>>> '_record_klasses_only' argument from caller, which decides if 
>>>>>>>> it only needs to record the klass list.
>>>>>>>>
>>>>>>>> As we walk all reachable objects from an entry point (a static 
>>>>>>>> field) during walk&archive process, at the end of it we obtain 
>>>>>>>> a complete and closed archived subgraph. Every object is also 
>>>>>>>> the root of a sub-subgraph that's reachable from it. In a 
>>>>>>>> subsequent subgraph archiving process from a different entry 
>>>>>>>> point, if we encounter an object that is already archived, then 
>>>>>>>> every objects contained within the sub-subgraph reachable from 
>>>>>>>> that root are already archived also. So, '_record_klasses_only' 
>>>>>>>> should be a inherited flag from the root and passed to all 
>>>>>>>> HeapShared::archive_reachable_objects_from() calls for 
>>>>>>>> processing objects within a sub-subgraph.
>>>>>>>>
>>>>>>>> During the walk, '_record_klasses_only' should be false 
>>>>>>>> initially. Once it sees an object that already has an archived 
>>>>>>>> copy (but not seen in the current subgraph), the 
>>>>>>>> '_record_klasses_only' needs to become true and be propagated 
>>>>>>>> to all sub-walking processes. '_record_klasses_only' should 
>>>>>>>> never go back to false state.
>>>>>>>>
>>>>>>>> With that, we can guarantee to build a correct archived 
>>>>>>>> subgraph without the risk of infinite cycles. It also reduces 
>>>>>>>> overhead for the walk&archive process.
>>>>>>>
>>>>>>> I actually started to write to code exactly as you suggested, by 
>>>>>>> passing record_klasses_only down, and then I found out that it's 
>>>>>>> unnecessary, and having it will require me to add complicated 
>>>>>>> asserts that was beyond my ability to get right.
>>>>>>>
>>>>>>> As I understand, you worry about about correctness (termination 
>>>>>>> of the algorithm), and performance.
>>>>>>>
>>>>>>> I believe the logic of the current code is correct, and the 
>>>>>>> algorithm will terminate:
>>>>>>>
>>>>>>>       oop archived_obj =
>>>>>>>     MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>       ....
>>>>>>>       if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>         // orig_obj has already been archived and traced.
>>>>>>>     Nothing more to do.
>>>>>>>         return archived_obj;
>>>>>>>       }
>>>>>>>       ...
>>>>>>>      WalkOopAndArchiveClosure walker(level, record_klasses_only,
>>>>>>>     subgraph_info, orig_obj, archived_obj, THREAD);
>>>>>>>      orig_obj->oop_iterate(&walker); // recurse
>>>>>>>
>>>>>>> HeapShared::archive_reachable_objects_from will recurse ONLY if 
>>>>>>> you call this function with an object that has not need seen, so 
>>>>>>> eventually the recursion will stop because all objects in the 
>>>>>>> subgraph has been seen.
>>>>>>>
>>>>>>> To make the code easier to understand, I rewrote it like this:
>>>>>>>
>>>>>>>   if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>     // orig_obj has already been archived and traced. Nothing 
>>>>>>> more to do.
>>>>>>>     return archived_obj;
>>>>>>>   } else {
>>>>>>> set_has_been_seen_during_subgraph_recording(orig_obj);
>>>>>>>    }
>>>>>>>
>>>>>>> For performance, the logic for deciding record_klasses_only is this:
>>>>>>>
>>>>>>>   oop archived_obj = 
>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>    ...
>>>>>>>   bool record_klasses_only = (archived_obj != NULL);
>>>>>>>
>>>>>>> You have to call find_archived_heap_object anyway, so that's 
>>>>>>> unavoidable. The cost of computing record_klasses_only is very 
>>>>>>> simple. For performance, there's no need for the it to be 
>>>>>>> tracked in the recursive calls. That makes the code simpler, and 
>>>>>>> easier to maintain.
>>>>>>>
>>>>>>> Otherwise we may need to think about whether the caller and the 
>>>>>>> callee have different opinion on record_klasses_only:
>>>>>>>
>>>>>>>   - if this could happen, what would cause this to happen, and 
>>>>>>> how should we handle it?
>>>>>>>   - if this could never happen, how would you write an assertion 
>>>>>>> to validate that?
>>>>>>>
>>>>>>> Writing those asserts turned out to be too complicated.
>>>>>>>
>>>>>>>>
>>>>>>>> - Could add explanation in the comment?
>>>>>>>>
>>>>>>>> 438         // See 
>>>>>>>> runtime/appcds/cacheObject/ArchivedIntegerCacheTest.java
>>>>>>>>
>>>>>>>
>>>>>>> I changed it to
>>>>>>>
>>>>>>>     if (level == 1) {
>>>>>>>         // Don't archive a subgraph root that's too big. For 
>>>>>>> archives static fields, that's OK
>>>>>>>         // as the Java code will take care of initializing this 
>>>>>>> field dynamically.
>>>>>>>         return NULL;
>>>>>>>       }
>>>>>>>
>>>>>>>> - Please add the step 5) comment back to the walk&archive 
>>>>>>>> description.
>>>>>>>>
>>>>>>>>  466 // 5) The Klass of the current java object is added to the 
>>>>>>>> list of Klasses
>>>>>>>>  467 // for loading and initialzing before any object in the 
>>>>>>>> archived graph can
>>>>>>>>  468 // be accessed at runtime.
>>>>>>>>  469 //
>>>>>>>>
>>>>>>>
>>>>>>> Done. Sorry I removed it by mistake.
>>>>>>>
>>>>>>>> - The HeapShared::archive_module_graph_objects() can take a 
>>>>>>>> more general name now, if you want to do it as part of your 
>>>>>>>> change.
>>>>>>>>
>>>>>>>>
>>>>>>> I renamed it to archive_static_fields(). I also removed/fixed a 
>>>>>>> few comments about "Currently there is only one class mirror 
>>>>>>> (ArchivedModuleGraph) with archived sub-graphs."
>>>>>>>
>>>>>>>> Could you please also send a sample of the changed cds+heap 
>>>>>>>> debug log output?
>>>>>>>>
>>>>>>> See 
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/logs/
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>>> On 9/5/18 3:56 PM, Ioi Lam wrote:
>>>>>>>>> I updated the patch:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03-delta/ 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1. No need to walk String objects during subgraph recording.
>>>>>>>>>
>>>>>>>>> 2. Print put statistics:
>>>>>>>>>
>>>>>>>>> =========================
>>>>>>>>>
>>>>>>>>> jdk/internal/module/ArchivedModuleGraph: walked 1871 objs, 
>>>>>>>>> archived 1871 new objs, recorded 28 classes
>>>>>>>>> java/util/ImmutableCollections$ListN: walked 2 objs, archived 
>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>> java/util/ImmutableCollections$MapN: walked 2 objs, archived 0 
>>>>>>>>> new objs, recorded 0 classes
>>>>>>>>> java/util/ImmutableCollections$SetN: walked 2 objs, archived 0 
>>>>>>>>> new objs, recorded 0 classes
>>>>>>>>> java/lang/Integer$IntegerCache: walked 257 objs, archived 256 
>>>>>>>>> new objs, recorded 2 classes
>>>>>>>>> java/lang/module/Configuration: walked 7 objs, archived 0 new 
>>>>>>>>> objs, recorded 3 classes
>>>>>>>>>
>>>>>>>>> Performed subgraph records = 6 times
>>>>>>>>> Walked 2141 objects
>>>>>>>>> Archived 2127 objects
>>>>>>>>> Recorded 33 klasses
>>>>>>>>>
>>>>>>>>> ========
>>>>>>>>>
>>>>>>>>> So, at least for now, we don't need to worry about the 
>>>>>>>>> performance of WalkOopAndArchiveClosure.
>>>>>>>>>
>>>>>>>>> Once I get some statistics of Lambda archiving, I'll post them 
>>>>>>>>> as well.
>>>>>>>>>
>>>>>>>>> So, with the bug fix, the subgraph recording should be both 
>>>>>>>>> correct and faster than before (no more String walking).
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> - Ioi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/3/2018 1:02 AM, Ioi Lam wrote:
>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>
>>>>>>>>>> I think your suggestion of combining of combining 
>>>>>>>>>> RecordKlassesClosure and WalkOopAndArchiveClosure is a good 
>>>>>>>>>> idea. I have updated the webrev:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v02/ 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This also supersedes my other patch for JDK-8210295 Refactor 
>>>>>>>>>> HeapShared::archive_reachable_objects_from_static_field
>>>>>>>>>>
>>>>>>>>>> More comments below on future optimizations.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/2/18 6:07 PM, Jiangli Zhou wrote:
>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>
>>>>>>>>>>> On 9/1/18 8:58 PM, Ioi Lam wrote:
>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for looking at the optimization of this problem. I 
>>>>>>>>>>>> think your algorithm will run into problems if (a) there is 
>>>>>>>>>>>> a cycle, and (b) two sub-graphs point to different members 
>>>>>>>>>>>> in the cycle.
>>>>>>>>>>>
>>>>>>>>>>> I think such issue can be solved by tracking the original K 
>>>>>>>>>>> (in the K-list, short for the class-initialization-list) for 
>>>>>>>>>>> each archived object. If an encountered object is already 
>>>>>>>>>>> archived (within the current sub-graph) and is associated 
>>>>>>>>>>> with K1 within the current K-list (must be in this case). We 
>>>>>>>>>>> can update the sub-K-list between K1 and the K associated 
>>>>>>>>>>> with the referencing object of the current object, so their 
>>>>>>>>>>> level are the same. We can also add a flag to the K-list 
>>>>>>>>>>> elements to tag each K within the cycle. It may not be what 
>>>>>>>>>>> we need to do currently.
>>>>>>>>>>>
>>>>>>>>>>> Can you please describe the specific class that you ran into 
>>>>>>>>>>> with the bug in your sub-graph?
>>>>>>>>>>
>>>>>>>>>> I found the bug by code examination. It's also during this 
>>>>>>>>>> time that I realized that it's not a simple problem to 
>>>>>>>>>> optimize, because of cycles in the graph. There's been tons 
>>>>>>>>>> of established research in this area (as in "Tarjan's 
>>>>>>>>>> algorithm is one of Donald Knuth's favorite 
>>>>>>>>>> implementations"),  so I don't think we should reinvent the 
>>>>>>>>>> wheel. For future optimization, let's just apply an existing 
>>>>>>>>>> algorithm.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe someone familiar with graph theory can suggest a 
>>>>>>>>>>>> solution?
>>>>>>>>>>>>
>>>>>>>>>>>> My current thinking is to first find all the cycles in the 
>>>>>>>>>>>> archived heap. The reachability of every member in the 
>>>>>>>>>>>> cycle is the same, so we can collapse each cycle into a 
>>>>>>>>>>>> single node. After that we have a DAG. Then, instead of 
>>>>>>>>>>>> encoding the class initialization order as multiple lists, 
>>>>>>>>>>>> we can encode it as a DAG to save space.
>>>>>>>>>>>>
>>>>>>>>>>>> See 
>>>>>>>>>>>> https://stackoverflow.com/questions/261573/best-algorithm-for-detecting-cycles-in-a-directed-graph
>>>>>>>>>>>>
>>>>>>>>>>>> In any case, I think this is a sufficiently difficult 
>>>>>>>>>>>> problem, so we should separate the dumping of the 
>>>>>>>>>>>> sub-graphs (the first part of 
>>>>>>>>>>>> HeapShared::archive_module_graph_objects() in my patch) and 
>>>>>>>>>>>> the computation of the initialization order 
>>>>>>>>>>>> (HeapShared::record_subgraph_klasses()).
>>>>>>>>>>> Our current K-lists are quite small. If the number of 
>>>>>>>>>>> sub-graphs grows in the future and there are many duplicates 
>>>>>>>>>>> in the K-lists, we can explore using a single K-list, with 
>>>>>>>>>>> each sub-graph-info recording the [start-index, end-index] 
>>>>>>>>>>> pairs of the sub-K-lists.
>>>>>>>>>>>
>>>>>>>>>>> To avoid any confusion to others, I want to point out that 
>>>>>>>>>>> the order of the elements in the K-list is not the 
>>>>>>>>>>> initialization order of the classes that happens at dump time.
>>>>>>>>>>
>>>>>>>>>> Sorry for the confusion. I wanted to avoid using the word 
>>>>>>>>>> "list" -- I think the most space efficient representation 
>>>>>>>>>> will be a DAG instead of a list -- I should have used "set" 
>>>>>>>>>> instead.
>>>>>>>>>>
>>>>>>>>>> Thanks again for spending time on this topic.
>>>>>>>>>>
>>>>>>>>>> - Ioi
>>>>>>>>>>> The order of the K-list elements is determined by how we 
>>>>>>>>>>> walk the graph. It works for our purpose because we are not 
>>>>>>>>>>> doing general-purpose object graph archiving (at least 
>>>>>>>>>>> currently), and all classes in the targeted sub-graph for 
>>>>>>>>>>> archiving must not have any dependencies on runtime context. 
>>>>>>>>>>> When we add more sub-graphs to the archive in the future, we 
>>>>>>>>>>> can examine our existing assumptions and lift limitations 
>>>>>>>>>>> case by case with enhancements.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, I think we should defer the optimization of 
>>>>>>>>>>>> record_subgraph_klasses in a future RFE. That way, we can 
>>>>>>>>>>>> have correctness now, and performance later when we 
>>>>>>>>>>>> actually need it. (The current implementation of 
>>>>>>>>>>>> record_subgraph_klasses() has not caused any perceivable 
>>>>>>>>>>>> delay for the current set of objects that we are archiving).
>>>>>>>>>>> We probably don't need to do what I described in the above 
>>>>>>>>>>> for now. I've looked at your changes, I think we can choose 
>>>>>>>>>>> a middle-ground approach that's more efficient than your 
>>>>>>>>>>> current change, but could still be O(N*M) in the worst case 
>>>>>>>>>>> (N is the number of nodes, M is the number of sub-graphs). 
>>>>>>>>>>> It also avoid duplicating the logic between the 
>>>>>>>>>>> archiving_walk closure and the new RecordKlassesClosure.
>>>>>>>>>>>
>>>>>>>>>>> In the middle-ground approach, WalkOopAndArchiveClosure can 
>>>>>>>>>>> take a new boolean flag, 'record_klass_only'. We can use a 
>>>>>>>>>>> separate table to track the objects we have seen in the 
>>>>>>>>>>> current sub-graph as you are doing in the webrev. If the 
>>>>>>>>>>> there is an existing archived copy of the current object, we 
>>>>>>>>>>> check the other table to determine if it's already in the 
>>>>>>>>>>> current sub-graph. If yes, no additional work is needed. If 
>>>>>>>>>>> no, recursively enter the closure but do 'record_klass_only'.
>>>>>>>>>>>
>>>>>>>>>>>       oop archived = 
>>>>>>>>>>> MetaspaceShared::find_archived_heap_object(obj);
>>>>>>>>>>>       if (archived != NULL) {
>>>>>>>>>>>         if (has_seen_in_the_current_graph) {
>>>>>>>>>>>             // no work
>>>>>>>>>>>         } else {
>>>>>>>>>>>             // walk the object but record klass only
>>>>>>>>>>>         }
>>>>>>>>>>>         return;
>>>>>>>>>>>       }
>>>>>>>>>>>
>>>>>>>>>>> With that, we can avoid duplicating the logic in 
>>>>>>>>>>> RecordKlassesClosure and also avoid walking the sub-graph 
>>>>>>>>>>> objects twice in most cases. What do you think?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jiangli
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/1/18 7:55 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sep 1, 2018, at 7:25 PM, Jiangli Zhou 
>>>>>>>>>>>>>> <jiangli.zhou at oracle.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for finding the bug. To address the incomplete 
>>>>>>>>>>>>>> class list issue, the original algorithm can be augmented 
>>>>>>>>>>>>>> while remaining as an O(N) solution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As each node (object) in an archived subgraph is a root 
>>>>>>>>>>>>>> of a sub-subgraph, the class-initialization-list 
>>>>>>>>>>>>>> associated with a specific node (within an existing 
>>>>>>>>>>>>>> archived sub-graph) is also a sub-list of the enclosing 
>>>>>>>>>>>>>> sub-graph's class-initialization-list. For example,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sub-graph1:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>              O1(k1)
>>>>>>>>>>>>>>            /         \
>>>>>>>>>>>>>>      O2(k2)       O5(k5)
>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Klass-init-list: K1, K2, K3, K4, K5
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sub-graph2:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>      O2(k2)
>>>>>>>>>>>>>>      /       \
>>>>>>>>>>>>>> O3(k3)   O4(k4)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Klass-init-list: K2, K3, K4
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> During the sub-graph walking and archiving process for 
>>>>>>>>>>>>>> sub-graph2, if O2 has been previously archived in another 
>>>>>>>>>>>>>> sub-graph, we can find the other sub-graph's class list 
>>>>>>>>>>>>>> and take the sub-list starting from the klass (k2), 
>>>>>>>>>>>>>> without re-walking each nodes again. To do that, we only 
>>>>>>>>>>>>>> need to walk the existing recorded 
>>>>>>>>>>>>>> class-initialization-list. If K2 is found in any of the 
>>>>>>>>>>>>>> existing list, we can take the sub-list starting from K2 
>>>>>>>>>>>>>> and append the list to the current one.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> With the approach, K5 would also be included if O5 is 
>>>>>>>>>>>>>> walked after O2 in sub-graph1. However, O5 is not part of 
>>>>>>>>>>>>>> O2's sub-graph. So that would introduce overhead at 
>>>>>>>>>>>>>> runtime. To avoid such problem, we can remember the 
>>>>>>>>>>>>>> sub-graph level (which is already built as part of the 
>>>>>>>>>>>>>> existing graph walking algorithm) for each K in the 
>>>>>>>>>>>>>> class-initialization-list. The ending indication of the 
>>>>>>>>>>>>>> sub-list would be the first K with the same level as the 
>>>>>>>>>>>>>> starting K. So we would have:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     K1(1), K2(2), K3(3), K4(4), K5(2)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K2 who's 
>>>>>>>>>>>>>> level is 2.
>>>>>>>>>>>>> Typo. The second K2 above should be k5:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K5, who's 
>>>>>>>>>>>>> level is 2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This part is not currently implemented yet but is not 
>>>>>>>>>>>>>> difficult to add.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/1/18 6:08 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210289
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v01/ 
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Description
>>>>>>>>>>>>>>> ===========
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I found this bug while trying to merge my code for CDS 
>>>>>>>>>>>>>>> support of Lambda
>>>>>>>>>>>>>>> classes (JDK-8198698).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When heapShared.cpp dumps a sub-graph of heap objects, 
>>>>>>>>>>>>>>> it attempts to
>>>>>>>>>>>>>>> record all the classes of all the objects that are 
>>>>>>>>>>>>>>> referenced by
>>>>>>>>>>>>>>> this sub-graph. However, if one of these objects have 
>>>>>>>>>>>>>>> already been visited
>>>>>>>>>>>>>>> while a previous sub-graph was dumped, then this 
>>>>>>>>>>>>>>> object's class is not
>>>>>>>>>>>>>>> recorded in the current sub-graph.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At runtime, if the current sub-graph is restored before 
>>>>>>>>>>>>>>> any other
>>>>>>>>>>>>>>> sub-graphs, we will end up with a live object in the 
>>>>>>>>>>>>>>> Java heap with
>>>>>>>>>>>>>>> an uninitialized class.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fix
>>>>>>>>>>>>>>> ===
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now I create the sub-graph's klasses list after all 
>>>>>>>>>>>>>>> sub-graphs have dumped.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For each class that has archived sub-graph(s), I do a 
>>>>>>>>>>>>>>> heap walk to discover
>>>>>>>>>>>>>>> all klasses that are reachable from this archived fields 
>>>>>>>>>>>>>>> of this class.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is sub-optimal but suffice for now because we just 
>>>>>>>>>>>>>>> have a very small
>>>>>>>>>>>>>>> number of sub-graphs. The worst case its O(N^2) where N 
>>>>>>>>>>>>>>> is the number of
>>>>>>>>>>>>>>> objects in the archived heap.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the future, we might need to implement a more 
>>>>>>>>>>>>>>> efficient algorithm that
>>>>>>>>>>>>>>> walks the heap once and computes all the klasses lists 
>>>>>>>>>>>>>>> of all the
>>>>>>>>>>>>>>> sub-graphs at the same time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>>> =======
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hs-tier[1,2,3]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list