RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is incomplete
Calvin Cheung
calvin.cheung at oracle.com
Tue Sep 11 17:28:47 UTC 2018
Hi Ioi,
Just couple of minor comments.
271 void HeapShared::initialize_from_archived_subgraph(Klass* k) {
I think the above function is being called during system module
initialization?
If so, maybe you can add an assert like the following?
assert(!Universe::is_module_initialized(), "should not be called
after system module initialization");
Could you also breakup line #679 in the same file?
I don't need to see another webrev for the above changes.
thanks,
Calvin
On 9/10/18, 8:17 PM, Ioi Lam wrote:
> Hi Jiangli,
>
> Thank you for being thorough with the review. I think it really makes
> the code much better than I originally had :-)
>
> - Ioi
>
> On 9/10/18 6:35 PM, Jiangli Zhou wrote:
>>
>> On 9/10/18 4:10 PM, Ioi Lam wrote:
>>
>>> On 9/10/2018 1:00 PM, Jiangli Zhou wrote:
>>>
>>>> Hi Ioi,
>>>>
>>>> Thanks for another updated webrev. Please see comments (minor) below.
>>>>
>>>>>> *From:* Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>>
>>>>>> *Date:* September 9, 2018 at 12:11:42 PM PDT
>>>>>> *To:* Jiangli Zhou <jiangli.zhou at oracle.com
>>>>>> <mailto:jiangli.zhou at oracle.com>>,
>>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>>> *Subject:* *Re: RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is
>>>>>> incomplete*
>>>>>>
>>>>>> Hi Jiangli,
>>>>>>
>>>>>> Thanks again for the review. I've updated the patch:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05/
>>>>>>
>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05-delta/
>>>>>>
>>>>>>
>>>>>> More comments inline
>>>>>>
>>>>>> On 9/8/18 2:37 PM, Jiangli Zhou wrote:
>>>>>>> Hi Ioi,
>>>>>>>
>>>>>>> The updated webrev looks great. Here are some additional comments.
>>>>>>>
>>>>>>> - Thanks for answering my question about the two loops. If we
>>>>>>> want to solve the klass list issue as a general problem, we also
>>>>>>> need to allow the cases where the entry fields from the same
>>>>>>> class are not specified consecutively. The 'break' in the inner
>>>>>>> loop should be a 'continue'. Currently, all the entry fields
>>>>>>> from a single class are specified together as we are doing it
>>>>>>> manually. When we enhance the mechanism by adding the other half
>>>>>>> API that enables registering the entry fields from java code
>>>>>>> automatically, we might see fields from different classes are in
>>>>>>> mixed orders.
>>>>>>>
>>>>>>> 657 while (j < num_archivable_static_fields) {
>>>>>>> 658 ArchivableStaticFieldInfo* f =
>>>>>>> &archivable_static_fields[j];
>>>>>>> 659 if (f->klass_name == klass_name) {
>>>>>>> 660 archive_reachable_objects_from_static_field(f->klass,
>>>>>>> f->klass_name,
>>>>>>> 661
>>>>>>> f->offset, f->field_name, CHECK);
>>>>>>> 662 j++;
>>>>>>> 663 } else {
>>>>>>> 664 break; <<<<<<<
>>>>>>> 665 }
>>>>>>> 666 }
>>>>>>>
>>>>>>
>>>>>> It's actually OK now to not order the archivable_static_fields[]
>>>>>> array by class name. The inner loop is just for optimization.
>>>>>> I've updated the code and added comments to make this clear.
>>>>>>
>>>>>> I tested by reordering the archivable_static_fields[] array (move
>>>>>> the "archivedModuleFinder" line to the end of the table). It shows:
>>>>>>
>>>>>> Performed subgraph records = 7 times
>>>>>> Walked 3610 objects
>>>>>> Archived 2127 objects
>>>>>> Recorded 33 klasses
>>>>>>
>>>>>> With the original (sorted) table, it's faster with fewer walks,
>>>>>> but that doesn't change the number of archived objects or
>>>>>> recorded klasses:
>>>>>>
>>>>>> Performed subgraph records = 6 times
>>>>>> Walked 2141 objects
>>>>>> Archived 2127 objects
>>>>>> Recorded 33 klasses
>>>>>>
>>>> The seen_objects_table resets when you see a klass_name that's
>>>> different. When the entry fields from different classes are mixed,
>>>> you might record an object klass that's already seen early for the
>>>> same containing class. That is not incorrect, but adds overhead
>>>> (although very small) to runtime. For following entry fields sequence:
>>>>
>>>> A.f1
>>>> A.f2
>>>> B.f1
>>>> A.f3
>>>>
>>>> For example, K1 is recorded for A.f1, A.f2 subgraphs. When
>>>> archiving A.f3 subgraph, K1 is encountered, K1 would be recorded
>>>> again for A.f3 because a different seen_objects_table is used. We
>>>> can fix the logic here or add a post process to eliminate any
>>>> duplication in the klass lists belonging to the same containing
>>>> klass. The post process seems to be an overkill.
>>>
>>> There's no duplication in the _subgraph_object_klasses list because
>>> we add it to only when needed:
>>>
>>> _subgraph_object_klasses->append_if_missing(relocated_k);
>>>
>>> So there's no runtime impact. (You can also see that with the same
>>> "Recorded 33 klasses" from the log above.)
>> This indeed has no issue. I forgot that I built a single
>> subgraph_info_record for each containing klass, but not for each
>> entry field. :(
>>
>> One containing klass may have multiple subgraphs referenced from
>> multiple entry points. Since there is no separate info_record for
>> each entry field from the same containing klass, then there is no
>> duplicates for the recorded object_klass. Sorry for the confusion!
>> Thank you being patient with me.
>>
>> Thanks,
>> Jiangli
>>>
>>>
>>>>>>
>>>>>>> - In order to make sure subgraphs are archived correctly, we
>>>>>>> need to do proper state propagation and checks/asserts,
>>>>>>> including the record_klasses_only state. When we see
>>>>>>> record_klasses_only becomes true, we need to verify all
>>>>>>> reachable object from the current one must have existing
>>>>>>> archived copies already. That helps to confirm that an earlier
>>>>>>> archived subgraph is complete, especially if the current walk
>>>>>>> enters an sub-subgraph from a different node. For your current
>>>>>>> scope of changes, resetting the record_klasses_only state based
>>>>>>> on if there is an existing archived copy is ok. We can enhance
>>>>>>> it separately.
>>>>>>>
>>>>>>
>>>>>> I added a separate pass to verify these -- for each root
>>>>>> (orig_obj) of an archived subgraph:
>>>>>>
>>>>>> // (1) Verify that all objects reachable from orig_obj are
>>>>>> archived.
>>>>>> // (2) Verify that all objects reachable from the archived copy
>>>>>> of orig_obj
>>>>>> // can only point to archived objects.
>>>>>>
>>>>>> I think that's sufficient to ensure that you have correctly
>>>>>> archived everything you need.
>>>>
>>>> The verify code is good to have. (2) is not necessary as that's
>>>> already done by the archiving verification code in GC. There is
>>>> more fine grained region verification implemented for archived
>>>> objects in G1 and (2) is already done for all objects in open
>>>> archive heap regions at dump time.
>>>>
>>>
>>> Thanks for letting me know. I've replaced (2) with this comment.
>>>
>>> // Note: we could also verify that all objects reachable from the
>>> archived
>>> // copy of orig_obj can only point to archived objects, with:
>>> // init_seen_objects_table();
>>> // verify_reachable_objects_from(archived_obj, true);
>>> // init_seen_objects_table();
>>> // but that's already done in
>>> G1HeapVerifier::verify_archive_regions so we
>>> // won't do it here.
>>>
>>>
>>>
>>>>>>
>>>>>>> Please also test all tier1-tier6 for your changes with the
>>>>>>> default CDS archive patch.
>>>>>>>
>>>>>> I just submitted a job for hs-tier1~6 using your latest patch
>>>>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-September/023146.html
>>>>>>
>>>>
>>>
>>> My test jobs passed with no new failures, so I'll push now.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>> Thanks,
>>>> Jiangli
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>>>
>>>>>>> Thanks,
>>>>>>> Jiangli
>>>>>>>
>>>>>>> On 9/7/18 5:58 PM, Ioi Lam wrote:
>>>>>>>>
>>>>>>>> Hi Jiangli,
>>>>>>>>
>>>>>>>> Thanks for the review!
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04-delta/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 9/6/2018 4:28 PM, Jiangli Zhou wrote:
>>>>>>>>> Hi Ioi,
>>>>>>>>>
>>>>>>>>> Sorry for the delay. It took me a while to get back to this.
>>>>>>>>> Thanks for folding the closures together. It cuts the cost by
>>>>>>>>> almost half. Here are my detailed comments.
>>>>>>>>>
>>>>>>>>> - _subgraph_object_klasses renaming
>>>>>>>>>
>>>>>>>>> The reason that I choose _subgraph_object_klasses was to
>>>>>>>>> disambiguate between the klass containing a subgraph and the
>>>>>>>>> klass of an object included in a subgraph. Is there a specific
>>>>>>>>> reason to rename the term to _subgraph_klasses?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I just wanted to shorten the names to be the same as
>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses, but I agree
>>>>>>>> with you for the naming.
>>>>>>>>
>>>>>>>> So instead, I've renamed
>>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses ->
>>>>>>>> subgraph_object_klasses to make the code consistent. What do
>>>>>>>> you think?
>>>>>>>>
>>>>>>>>> - Could you please align the lines at 358 - 360.
>>>>>>>>>
>>>>>>>>> 355 WalkOopAndArchiveClosure(int level, bool
>>>>>>>>> record_klasses_only,
>>>>>>>>> 356 KlassSubGraphInfo* subgraph_info,
>>>>>>>>> 357 oop orig, oop archived, TRAPS) :
>>>>>>>>> 358 _level(level),
>>>>>>>>> _record_klasses_only(record_klasses_only),
>>>>>>>>> 359 _subgraph_info(subgraph_info),
>>>>>>>>> 360 _orig_referencing_obj(orig),
>>>>>>>>> _archived_referencing_obj(archived),
>>>>>>>>> 361 _thread(THREAD) {}
>>>>>>>>>
>>>>>>>>
>>>>>>>> The 2 space indent is for the C++ field initializer list.
>>>>>>>> HotSpot style guide doesn't mention this
>>>>>>>> (https://wiki.openjdk.java.net/display/HotSpot/StyleGuide) so I
>>>>>>>> just used emacs's default indentation.
>>>>>>>>
>>>>>>>>> - (2) should be only applied to object that's not seen within
>>>>>>>>> the current subgraph. Please clarify it in the comment.
>>>>>>>>>
>>>>>>>>> 401 // (2) If orig_obj has not been seen yet, trace all
>>>>>>>>> 402 // objects that are reachable from it, and make sure
>>>>>>>>> these objects are archived.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I changed it to
>>>>>>>>
>>>>>>>> // (2) If orig_obj has not been seen yet (since
>>>>>>>> start_recording_subgraph() was called),
>>>>>>>> // trace all objects that are reachable from it, and make
>>>>>>>> sure these objects are archived.
>>>>>>>>
>>>>>>>>
>>>>>>>>> - Some Strings (such as "") are not archived in the closed
>>>>>>>>> archive heap region. In that case, the String object is
>>>>>>>>> automatically archived in the open archive heap region. The
>>>>>>>>> following comment should be clarified and fixed.
>>>>>>>>>
>>>>>>>>> 417 // Strings are already archived in closed archive. No
>>>>>>>>> need to walk it
>>>>>>>>>
>>>>>>>>
>>>>>>>> I changed to
>>>>>>>>
>>>>>>>> if (java_lang_String::is_instance(orig_obj) && archived_obj
>>>>>>>> != NULL) {
>>>>>>>> // To save time, don't walk strings that are already
>>>>>>>> archived. They just contain
>>>>>>>> // pointers to a type array, whose klass doesn't need to be
>>>>>>>> recorded.
>>>>>>>> return archived_obj;
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>>> - Could you please explain the reason for using two loops
>>>>>>>>> here? Shouldn't one loop be sufficient?
>>>>>>>>>
>>>>>>>>> 646 for (int i = 0; i < num_archivable_static_fields; ) {
>>>>>>>>>
>>>>>>>>> 650
>>>>>>>>> 651 int j = i;
>>>>>>>>> 652 while (j < num_archivable_static_fields) {
>>>>>>>>> 653 ArchivableStaticFieldInfo* f =
>>>>>>>>> &archivable_static_fields[j];
>>>>>>>>> 654 if (f->klass_name == klass_name) {
>>>>>>>>> 655 archive_reachable_objects_from_static_field(f->klass,
>>>>>>>>> f->klass_name,
>>>>>>>>> 656 f->offset, f->field_name, CHECK);
>>>>>>>>> 657 j++;
>>>>>>>>> 658 } else {
>>>>>>>>> 659 break;
>>>>>>>>> 660 }
>>>>>>>>> 661 }
>>>>>>>>> ..
>>>>>>>>> 663 i = j;
>>>>>>>>> 664 }
>>>>>>>>> 665
>>>>>>>>
>>>>>>>> That's because you can archive multiple fields in the same klass:
>>>>>>>>
>>>>>>>> // If you add new entries to this table, you should know what
>>>>>>>> you're doing!
>>>>>>>> static ArchivableStaticFieldInfo archivable_static_fields[] = {
>>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>>> "archivedSystemModules"},
>>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>>> "archivedModuleFinder"},
>>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>>> "archivedMainModule"},
>>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>>> "archivedConfiguration"},
>>>>>>>> {"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
>>>>>>>>
>>>>>>>> Note that start/done_recording_subgraph are called in the i
>>>>>>>> loop and not the j loop.
>>>>>>>>
>>>>>>>>> - HeapShared::archive_reachable_objects_from() should take a
>>>>>>>>> '_record_klasses_only' argument from caller, which decides if
>>>>>>>>> it only needs to record the klass list.
>>>>>>>>>
>>>>>>>>> As we walk all reachable objects from an entry point (a static
>>>>>>>>> field) during walk&archive process, at the end of it we obtain
>>>>>>>>> a complete and closed archived subgraph. Every object is also
>>>>>>>>> the root of a sub-subgraph that's reachable from it. In a
>>>>>>>>> subsequent subgraph archiving process from a different entry
>>>>>>>>> point, if we encounter an object that is already archived,
>>>>>>>>> then every objects contained within the sub-subgraph reachable
>>>>>>>>> from that root are already archived also. So,
>>>>>>>>> '_record_klasses_only' should be a inherited flag from the
>>>>>>>>> root and passed to all
>>>>>>>>> HeapShared::archive_reachable_objects_from() calls for
>>>>>>>>> processing objects within a sub-subgraph.
>>>>>>>>>
>>>>>>>>> During the walk, '_record_klasses_only' should be false
>>>>>>>>> initially. Once it sees an object that already has an archived
>>>>>>>>> copy (but not seen in the current subgraph), the
>>>>>>>>> '_record_klasses_only' needs to become true and be propagated
>>>>>>>>> to all sub-walking processes. '_record_klasses_only' should
>>>>>>>>> never go back to false state.
>>>>>>>>>
>>>>>>>>> With that, we can guarantee to build a correct archived
>>>>>>>>> subgraph without the risk of infinite cycles. It also reduces
>>>>>>>>> overhead for the walk&archive process.
>>>>>>>>
>>>>>>>> I actually started to write to code exactly as you suggested,
>>>>>>>> by passing record_klasses_only down, and then I found out that
>>>>>>>> it's unnecessary, and having it will require me to add
>>>>>>>> complicated asserts that was beyond my ability to get right.
>>>>>>>>
>>>>>>>> As I understand, you worry about about correctness (termination
>>>>>>>> of the algorithm), and performance.
>>>>>>>>
>>>>>>>> I believe the logic of the current code is correct, and the
>>>>>>>> algorithm will terminate:
>>>>>>>>
>>>>>>>> oop archived_obj =
>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>> ....
>>>>>>>> if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>> // orig_obj has already been archived and traced.
>>>>>>>> Nothing more to do.
>>>>>>>> return archived_obj;
>>>>>>>> }
>>>>>>>> ...
>>>>>>>> WalkOopAndArchiveClosure walker(level, record_klasses_only,
>>>>>>>> subgraph_info, orig_obj, archived_obj, THREAD);
>>>>>>>> orig_obj->oop_iterate(&walker); // recurse
>>>>>>>>
>>>>>>>> HeapShared::archive_reachable_objects_from will recurse ONLY if
>>>>>>>> you call this function with an object that has not need seen,
>>>>>>>> so eventually the recursion will stop because all objects in
>>>>>>>> the subgraph has been seen.
>>>>>>>>
>>>>>>>> To make the code easier to understand, I rewrote it like this:
>>>>>>>>
>>>>>>>> if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>>> // orig_obj has already been archived and traced. Nothing
>>>>>>>> more to do.
>>>>>>>> return archived_obj;
>>>>>>>> } else {
>>>>>>>> set_has_been_seen_during_subgraph_recording(orig_obj);
>>>>>>>> }
>>>>>>>>
>>>>>>>> For performance, the logic for deciding record_klasses_only is
>>>>>>>> this:
>>>>>>>>
>>>>>>>> oop archived_obj =
>>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>>> ...
>>>>>>>> bool record_klasses_only = (archived_obj != NULL);
>>>>>>>>
>>>>>>>> You have to call find_archived_heap_object anyway, so that's
>>>>>>>> unavoidable. The cost of computing record_klasses_only is very
>>>>>>>> simple. For performance, there's no need for the it to be
>>>>>>>> tracked in the recursive calls. That makes the code simpler,
>>>>>>>> and easier to maintain.
>>>>>>>>
>>>>>>>> Otherwise we may need to think about whether the caller and the
>>>>>>>> callee have different opinion on record_klasses_only:
>>>>>>>>
>>>>>>>> - if this could happen, what would cause this to happen, and
>>>>>>>> how should we handle it?
>>>>>>>> - if this could never happen, how would you write an
>>>>>>>> assertion to validate that?
>>>>>>>>
>>>>>>>> Writing those asserts turned out to be too complicated.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> - Could add explanation in the comment?
>>>>>>>>>
>>>>>>>>> 438 // See
>>>>>>>>> runtime/appcds/cacheObject/ArchivedIntegerCacheTest.java
>>>>>>>>>
>>>>>>>>
>>>>>>>> I changed it to
>>>>>>>>
>>>>>>>> if (level == 1) {
>>>>>>>> // Don't archive a subgraph root that's too big. For
>>>>>>>> archives static fields, that's OK
>>>>>>>> // as the Java code will take care of initializing this
>>>>>>>> field dynamically.
>>>>>>>> return NULL;
>>>>>>>> }
>>>>>>>>
>>>>>>>>> - Please add the step 5) comment back to the walk&archive
>>>>>>>>> description.
>>>>>>>>>
>>>>>>>>> 466 // 5) The Klass of the current java object is added to
>>>>>>>>> the list of Klasses
>>>>>>>>> 467 // for loading and initialzing before any object in the
>>>>>>>>> archived graph can
>>>>>>>>> 468 // be accessed at runtime.
>>>>>>>>> 469 //
>>>>>>>>>
>>>>>>>>
>>>>>>>> Done. Sorry I removed it by mistake.
>>>>>>>>
>>>>>>>>> - The HeapShared::archive_module_graph_objects() can take a
>>>>>>>>> more general name now, if you want to do it as part of your
>>>>>>>>> change.
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I renamed it to archive_static_fields(). I also removed/fixed a
>>>>>>>> few comments about "Currently there is only one class mirror
>>>>>>>> (ArchivedModuleGraph) with archived sub-graphs."
>>>>>>>>
>>>>>>>>> Could you please also send a sample of the changed cds+heap
>>>>>>>>> debug log output?
>>>>>>>>>
>>>>>>>> See
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/logs/
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> - Ioi
>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Jiangli
>>>>>>>>>
>>>>>>>>> On 9/5/18 3:56 PM, Ioi Lam wrote:
>>>>>>>>>> I updated the patch:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03-delta/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 1. No need to walk String objects during subgraph recording.
>>>>>>>>>>
>>>>>>>>>> 2. Print put statistics:
>>>>>>>>>>
>>>>>>>>>> =========================
>>>>>>>>>>
>>>>>>>>>> jdk/internal/module/ArchivedModuleGraph: walked 1871 objs,
>>>>>>>>>> archived 1871 new objs, recorded 28 classes
>>>>>>>>>> java/util/ImmutableCollections$ListN: walked 2 objs, archived
>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>> java/util/ImmutableCollections$MapN: walked 2 objs, archived
>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>> java/util/ImmutableCollections$SetN: walked 2 objs, archived
>>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>>> java/lang/Integer$IntegerCache: walked 257 objs, archived 256
>>>>>>>>>> new objs, recorded 2 classes
>>>>>>>>>> java/lang/module/Configuration: walked 7 objs, archived 0 new
>>>>>>>>>> objs, recorded 3 classes
>>>>>>>>>>
>>>>>>>>>> Performed subgraph records = 6 times
>>>>>>>>>> Walked 2141 objects
>>>>>>>>>> Archived 2127 objects
>>>>>>>>>> Recorded 33 klasses
>>>>>>>>>>
>>>>>>>>>> ========
>>>>>>>>>>
>>>>>>>>>> So, at least for now, we don't need to worry about the
>>>>>>>>>> performance of WalkOopAndArchiveClosure.
>>>>>>>>>>
>>>>>>>>>> Once I get some statistics of Lambda archiving, I'll post
>>>>>>>>>> them as well.
>>>>>>>>>>
>>>>>>>>>> So, with the bug fix, the subgraph recording should be both
>>>>>>>>>> correct and faster than before (no more String walking).
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> - Ioi
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/3/2018 1:02 AM, Ioi Lam wrote:
>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>
>>>>>>>>>>> I think your suggestion of combining of combining
>>>>>>>>>>> RecordKlassesClosure and WalkOopAndArchiveClosure is a good
>>>>>>>>>>> idea. I have updated the webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v02/
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> This also supersedes my other patch for JDK-8210295 Refactor
>>>>>>>>>>> HeapShared::archive_reachable_objects_from_static_field
>>>>>>>>>>>
>>>>>>>>>>> More comments below on future optimizations.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 9/2/18 6:07 PM, Jiangli Zhou wrote:
>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/1/18 8:58 PM, Ioi Lam wrote:
>>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for looking at the optimization of this problem. I
>>>>>>>>>>>>> think your algorithm will run into problems if (a) there
>>>>>>>>>>>>> is a cycle, and (b) two sub-graphs point to different
>>>>>>>>>>>>> members in the cycle.
>>>>>>>>>>>>
>>>>>>>>>>>> I think such issue can be solved by tracking the original K
>>>>>>>>>>>> (in the K-list, short for the class-initialization-list)
>>>>>>>>>>>> for each archived object. If an encountered object is
>>>>>>>>>>>> already archived (within the current sub-graph) and is
>>>>>>>>>>>> associated with K1 within the current K-list (must be in
>>>>>>>>>>>> this case). We can update the sub-K-list between K1 and the
>>>>>>>>>>>> K associated with the referencing object of the current
>>>>>>>>>>>> object, so their level are the same. We can also add a flag
>>>>>>>>>>>> to the K-list elements to tag each K within the cycle. It
>>>>>>>>>>>> may not be what we need to do currently.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you please describe the specific class that you ran
>>>>>>>>>>>> into with the bug in your sub-graph?
>>>>>>>>>>>
>>>>>>>>>>> I found the bug by code examination. It's also during this
>>>>>>>>>>> time that I realized that it's not a simple problem to
>>>>>>>>>>> optimize, because of cycles in the graph. There's been tons
>>>>>>>>>>> of established research in this area (as in "Tarjan's
>>>>>>>>>>> algorithm is one of Donald Knuth's favorite
>>>>>>>>>>> implementations"), so I don't think we should reinvent the
>>>>>>>>>>> wheel. For future optimization, let's just apply an existing
>>>>>>>>>>> algorithm.
>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe someone familiar with graph theory can suggest a
>>>>>>>>>>>>> solution?
>>>>>>>>>>>>>
>>>>>>>>>>>>> My current thinking is to first find all the cycles in the
>>>>>>>>>>>>> archived heap. The reachability of every member in the
>>>>>>>>>>>>> cycle is the same, so we can collapse each cycle into a
>>>>>>>>>>>>> single node. After that we have a DAG. Then, instead of
>>>>>>>>>>>>> encoding the class initialization order as multiple lists,
>>>>>>>>>>>>> we can encode it as a DAG to save space.
>>>>>>>>>>>>>
>>>>>>>>>>>>> See
>>>>>>>>>>>>> https://stackoverflow.com/questions/261573/best-algorithm-for-detecting-cycles-in-a-directed-graph
>>>>>>>>>>>>>
>>>>>>>>>>>>> In any case, I think this is a sufficiently difficult
>>>>>>>>>>>>> problem, so we should separate the dumping of the
>>>>>>>>>>>>> sub-graphs (the first part of
>>>>>>>>>>>>> HeapShared::archive_module_graph_objects() in my patch)
>>>>>>>>>>>>> and the computation of the initialization order
>>>>>>>>>>>>> (HeapShared::record_subgraph_klasses()).
>>>>>>>>>>>> Our current K-lists are quite small. If the number of
>>>>>>>>>>>> sub-graphs grows in the future and there are many
>>>>>>>>>>>> duplicates in the K-lists, we can explore using a single
>>>>>>>>>>>> K-list, with each sub-graph-info recording the
>>>>>>>>>>>> [start-index, end-index] pairs of the sub-K-lists.
>>>>>>>>>>>>
>>>>>>>>>>>> To avoid any confusion to others, I want to point out that
>>>>>>>>>>>> the order of the elements in the K-list is not the
>>>>>>>>>>>> initialization order of the classes that happens at dump time.
>>>>>>>>>>>
>>>>>>>>>>> Sorry for the confusion. I wanted to avoid using the word
>>>>>>>>>>> "list" -- I think the most space efficient representation
>>>>>>>>>>> will be a DAG instead of a list -- I should have used "set"
>>>>>>>>>>> instead.
>>>>>>>>>>>
>>>>>>>>>>> Thanks again for spending time on this topic.
>>>>>>>>>>>
>>>>>>>>>>> - Ioi
>>>>>>>>>>>> The order of the K-list elements is determined by how we
>>>>>>>>>>>> walk the graph. It works for our purpose because we are not
>>>>>>>>>>>> doing general-purpose object graph archiving (at least
>>>>>>>>>>>> currently), and all classes in the targeted sub-graph for
>>>>>>>>>>>> archiving must not have any dependencies on runtime
>>>>>>>>>>>> context. When we add more sub-graphs to the archive in the
>>>>>>>>>>>> future, we can examine our existing assumptions and lift
>>>>>>>>>>>> limitations case by case with enhancements.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, I think we should defer the optimization of
>>>>>>>>>>>>> record_subgraph_klasses in a future RFE. That way, we can
>>>>>>>>>>>>> have correctness now, and performance later when we
>>>>>>>>>>>>> actually need it. (The current implementation of
>>>>>>>>>>>>> record_subgraph_klasses() has not caused any perceivable
>>>>>>>>>>>>> delay for the current set of objects that we are archiving).
>>>>>>>>>>>> We probably don't need to do what I described in the above
>>>>>>>>>>>> for now. I've looked at your changes, I think we can choose
>>>>>>>>>>>> a middle-ground approach that's more efficient than your
>>>>>>>>>>>> current change, but could still be O(N*M) in the worst case
>>>>>>>>>>>> (N is the number of nodes, M is the number of sub-graphs).
>>>>>>>>>>>> It also avoid duplicating the logic between the
>>>>>>>>>>>> archiving_walk closure and the new RecordKlassesClosure.
>>>>>>>>>>>>
>>>>>>>>>>>> In the middle-ground approach, WalkOopAndArchiveClosure can
>>>>>>>>>>>> take a new boolean flag, 'record_klass_only'. We can use a
>>>>>>>>>>>> separate table to track the objects we have seen in the
>>>>>>>>>>>> current sub-graph as you are doing in the webrev. If the
>>>>>>>>>>>> there is an existing archived copy of the current object,
>>>>>>>>>>>> we check the other table to determine if it's already in
>>>>>>>>>>>> the current sub-graph. If yes, no additional work is
>>>>>>>>>>>> needed. If no, recursively enter the closure but do
>>>>>>>>>>>> 'record_klass_only'.
>>>>>>>>>>>>
>>>>>>>>>>>> oop archived =
>>>>>>>>>>>> MetaspaceShared::find_archived_heap_object(obj);
>>>>>>>>>>>> if (archived != NULL) {
>>>>>>>>>>>> if (has_seen_in_the_current_graph) {
>>>>>>>>>>>> // no work
>>>>>>>>>>>> } else {
>>>>>>>>>>>> // walk the object but record klass only
>>>>>>>>>>>> }
>>>>>>>>>>>> return;
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> With that, we can avoid duplicating the logic in
>>>>>>>>>>>> RecordKlassesClosure and also avoid walking the sub-graph
>>>>>>>>>>>> objects twice in most cases. What do you think?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 9/1/18 7:55 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sep 1, 2018, at 7:25 PM, Jiangli Zhou
>>>>>>>>>>>>>>> <jiangli.zhou at oracle.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for finding the bug. To address the incomplete
>>>>>>>>>>>>>>> class list issue, the original algorithm can be
>>>>>>>>>>>>>>> augmented while remaining as an O(N) solution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As each node (object) in an archived subgraph is a root
>>>>>>>>>>>>>>> of a sub-subgraph, the class-initialization-list
>>>>>>>>>>>>>>> associated with a specific node (within an existing
>>>>>>>>>>>>>>> archived sub-graph) is also a sub-list of the enclosing
>>>>>>>>>>>>>>> sub-graph's class-initialization-list. For example,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sub-graph1:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> O1(k1)
>>>>>>>>>>>>>>> / \
>>>>>>>>>>>>>>> O2(k2) O5(k5)
>>>>>>>>>>>>>>> / \
>>>>>>>>>>>>>>> O3(k3) O4(k4)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Klass-init-list: K1, K2, K3, K4, K5
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sub-graph2:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> O2(k2)
>>>>>>>>>>>>>>> / \
>>>>>>>>>>>>>>> O3(k3) O4(k4)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Klass-init-list: K2, K3, K4
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> During the sub-graph walking and archiving process for
>>>>>>>>>>>>>>> sub-graph2, if O2 has been previously archived in
>>>>>>>>>>>>>>> another sub-graph, we can find the other sub-graph's
>>>>>>>>>>>>>>> class list and take the sub-list starting from the klass
>>>>>>>>>>>>>>> (k2), without re-walking each nodes again. To do that,
>>>>>>>>>>>>>>> we only need to walk the existing recorded
>>>>>>>>>>>>>>> class-initialization-list. If K2 is found in any of the
>>>>>>>>>>>>>>> existing list, we can take the sub-list starting from K2
>>>>>>>>>>>>>>> and append the list to the current one.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> With the approach, K5 would also be included if O5 is
>>>>>>>>>>>>>>> walked after O2 in sub-graph1. However, O5 is not part
>>>>>>>>>>>>>>> of O2's sub-graph. So that would introduce overhead at
>>>>>>>>>>>>>>> runtime. To avoid such problem, we can remember the
>>>>>>>>>>>>>>> sub-graph level (which is already built as part of the
>>>>>>>>>>>>>>> existing graph walking algorithm) for each K in the
>>>>>>>>>>>>>>> class-initialization-list. The ending indication of the
>>>>>>>>>>>>>>> sub-list would be the first K with the same level as the
>>>>>>>>>>>>>>> starting K. So we would have:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> K1(1), K2(2), K3(3), K4(4), K5(2)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K2
>>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>> Typo. The second K2 above should be k5:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K5,
>>>>>>>>>>>>>> who's level is 2.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This part is not currently implemented yet but is not
>>>>>>>>>>>>>>> difficult to add.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 9/1/18 6:08 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210289
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v01/
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Description
>>>>>>>>>>>>>>>> ===========
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I found this bug while trying to merge my code for CDS
>>>>>>>>>>>>>>>> support of Lambda
>>>>>>>>>>>>>>>> classes (JDK-8198698).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When heapShared.cpp dumps a sub-graph of heap objects,
>>>>>>>>>>>>>>>> it attempts to
>>>>>>>>>>>>>>>> record all the classes of all the objects that are
>>>>>>>>>>>>>>>> referenced by
>>>>>>>>>>>>>>>> this sub-graph. However, if one of these objects have
>>>>>>>>>>>>>>>> already been visited
>>>>>>>>>>>>>>>> while a previous sub-graph was dumped, then this
>>>>>>>>>>>>>>>> object's class is not
>>>>>>>>>>>>>>>> recorded in the current sub-graph.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> At runtime, if the current sub-graph is restored before
>>>>>>>>>>>>>>>> any other
>>>>>>>>>>>>>>>> sub-graphs, we will end up with a live object in the
>>>>>>>>>>>>>>>> Java heap with
>>>>>>>>>>>>>>>> an uninitialized class.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fix
>>>>>>>>>>>>>>>> ===
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Now I create the sub-graph's klasses list after all
>>>>>>>>>>>>>>>> sub-graphs have dumped.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For each class that has archived sub-graph(s), I do a
>>>>>>>>>>>>>>>> heap walk to discover
>>>>>>>>>>>>>>>> all klasses that are reachable from this archived
>>>>>>>>>>>>>>>> fields of this class.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is sub-optimal but suffice for now because we just
>>>>>>>>>>>>>>>> have a very small
>>>>>>>>>>>>>>>> number of sub-graphs. The worst case its O(N^2) where N
>>>>>>>>>>>>>>>> is the number of
>>>>>>>>>>>>>>>> objects in the archived heap.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In the future, we might need to implement a more
>>>>>>>>>>>>>>>> efficient algorithm that
>>>>>>>>>>>>>>>> walks the heap once and computes all the klasses lists
>>>>>>>>>>>>>>>> of all the
>>>>>>>>>>>>>>>> sub-graphs at the same time.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>>>> =======
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> hs-tier[1,2,3]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list