RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is incomplete
Ioi Lam
ioi.lam at oracle.com
Tue Sep 11 03:17:23 UTC 2018
Hi Jiangli,
Thank you for being thorough with the review. I think it really makes
the code much better than I originally had :-)
- Ioi
On 9/10/18 6:35 PM, Jiangli Zhou wrote:
>
> On 9/10/18 4:10 PM, Ioi Lam wrote:
>
>> On 9/10/2018 1:00 PM, Jiangli Zhou wrote:
>>
>>> Hi Ioi,
>>>
>>> Thanks for another updated webrev. Please see comments (minor) below.
>>>
>>>>> *From:* Ioi Lam <ioi.lam at oracle.com <mailto:ioi.lam at oracle.com>>
>>>>> *Date:* September 9, 2018 at 12:11:42 PM PDT
>>>>> *To:* Jiangli Zhou <jiangli.zhou at oracle.com
>>>>> <mailto:jiangli.zhou at oracle.com>>,
>>>>> hotspot-runtime-dev at openjdk.java.net
>>>>> <mailto:hotspot-runtime-dev at openjdk.java.net>
>>>>> *Subject:* *Re: RFR[S] 8210289 ArchivedKlassSubGraphInfoRecord is
>>>>> incomplete*
>>>>>
>>>>> Hi Jiangli,
>>>>>
>>>>> Thanks again for the review. I've updated the patch:
>>>>>
>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05/
>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v05-delta/
>>>>>
>>>>> More comments inline
>>>>>
>>>>> On 9/8/18 2:37 PM, Jiangli Zhou wrote:
>>>>>> Hi Ioi,
>>>>>>
>>>>>> The updated webrev looks great. Here are some additional comments.
>>>>>>
>>>>>> - Thanks for answering my question about the two loops. If we
>>>>>> want to solve the klass list issue as a general problem, we also
>>>>>> need to allow the cases where the entry fields from the same
>>>>>> class are not specified consecutively. The 'break' in the inner
>>>>>> loop should be a 'continue'. Currently, all the entry fields from
>>>>>> a single class are specified together as we are doing it
>>>>>> manually. When we enhance the mechanism by adding the other half
>>>>>> API that enables registering the entry fields from java code
>>>>>> automatically, we might see fields from different classes are in
>>>>>> mixed orders.
>>>>>>
>>>>>> 657 while (j < num_archivable_static_fields) {
>>>>>> 658 ArchivableStaticFieldInfo* f =
>>>>>> &archivable_static_fields[j];
>>>>>> 659 if (f->klass_name == klass_name) {
>>>>>> 660 archive_reachable_objects_from_static_field(f->klass,
>>>>>> f->klass_name,
>>>>>> 661
>>>>>> f->offset, f->field_name, CHECK);
>>>>>> 662 j++;
>>>>>> 663 } else {
>>>>>> 664 break; <<<<<<<
>>>>>> 665 }
>>>>>> 666 }
>>>>>>
>>>>>
>>>>> It's actually OK now to not order the archivable_static_fields[]
>>>>> array by class name. The inner loop is just for optimization. I've
>>>>> updated the code and added comments to make this clear.
>>>>>
>>>>> I tested by reordering the archivable_static_fields[] array (move
>>>>> the "archivedModuleFinder" line to the end of the table). It shows:
>>>>>
>>>>> Performed subgraph records = 7 times
>>>>> Walked 3610 objects
>>>>> Archived 2127 objects
>>>>> Recorded 33 klasses
>>>>>
>>>>> With the original (sorted) table, it's faster with fewer walks,
>>>>> but that doesn't change the number of archived objects or recorded
>>>>> klasses:
>>>>>
>>>>> Performed subgraph records = 6 times
>>>>> Walked 2141 objects
>>>>> Archived 2127 objects
>>>>> Recorded 33 klasses
>>>>>
>>> The seen_objects_table resets when you see a klass_name that's
>>> different. When the entry fields from different classes are mixed,
>>> you might record an object klass that's already seen early for the
>>> same containing class. That is not incorrect, but adds overhead
>>> (although very small) to runtime. For following entry fields sequence:
>>>
>>> A.f1
>>> A.f2
>>> B.f1
>>> A.f3
>>>
>>> For example, K1 is recorded for A.f1, A.f2 subgraphs. When archiving
>>> A.f3 subgraph, K1 is encountered, K1 would be recorded again for
>>> A.f3 because a different seen_objects_table is used. We can fix the
>>> logic here or add a post process to eliminate any duplication in the
>>> klass lists belonging to the same containing klass. The post process
>>> seems to be an overkill.
>>
>> There's no duplication in the _subgraph_object_klasses list because
>> we add it to only when needed:
>>
>> _subgraph_object_klasses->append_if_missing(relocated_k);
>>
>> So there's no runtime impact. (You can also see that with the same
>> "Recorded 33 klasses" from the log above.)
> This indeed has no issue. I forgot that I built a single
> subgraph_info_record for each containing klass, but not for each entry
> field. :(
>
> One containing klass may have multiple subgraphs referenced from
> multiple entry points. Since there is no separate info_record for each
> entry field from the same containing klass, then there is no
> duplicates for the recorded object_klass. Sorry for the confusion!
> Thank you being patient with me.
>
> Thanks,
> Jiangli
>>
>>
>>>>>
>>>>>> - In order to make sure subgraphs are archived correctly, we need
>>>>>> to do proper state propagation and checks/asserts, including the
>>>>>> record_klasses_only state. When we see record_klasses_only
>>>>>> becomes true, we need to verify all reachable object from the
>>>>>> current one must have existing archived copies already. That
>>>>>> helps to confirm that an earlier archived subgraph is complete,
>>>>>> especially if the current walk enters an sub-subgraph from a
>>>>>> different node. For your current scope of changes, resetting the
>>>>>> record_klasses_only state based on if there is an existing
>>>>>> archived copy is ok. We can enhance it separately.
>>>>>>
>>>>>
>>>>> I added a separate pass to verify these -- for each root
>>>>> (orig_obj) of an archived subgraph:
>>>>>
>>>>> // (1) Verify that all objects reachable from orig_obj are archived.
>>>>> // (2) Verify that all objects reachable from the archived copy
>>>>> of orig_obj
>>>>> // can only point to archived objects.
>>>>>
>>>>> I think that's sufficient to ensure that you have correctly
>>>>> archived everything you need.
>>>
>>> The verify code is good to have. (2) is not necessary as that's
>>> already done by the archiving verification code in GC. There is more
>>> fine grained region verification implemented for archived objects in
>>> G1 and (2) is already done for all objects in open archive heap
>>> regions at dump time.
>>>
>>
>> Thanks for letting me know. I've replaced (2) with this comment.
>>
>> // Note: we could also verify that all objects reachable from the
>> archived
>> // copy of orig_obj can only point to archived objects, with:
>> // init_seen_objects_table();
>> // verify_reachable_objects_from(archived_obj, true);
>> // init_seen_objects_table();
>> // but that's already done in
>> G1HeapVerifier::verify_archive_regions so we
>> // won't do it here.
>>
>>
>>
>>>>>
>>>>>> Please also test all tier1-tier6 for your changes with the
>>>>>> default CDS archive patch.
>>>>>>
>>>>> I just submitted a job for hs-tier1~6 using your latest patch
>>>>> http://mail.openjdk.java.net/pipermail/build-dev/2018-September/023146.html
>>>
>>
>> My test jobs passed with no new failures, so I'll push now.
>>
>> Thanks
>> - Ioi
>>
>>
>>> Thanks,
>>> Jiangli
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>> Thanks,
>>>>>> Jiangli
>>>>>>
>>>>>> On 9/7/18 5:58 PM, Ioi Lam wrote:
>>>>>>>
>>>>>>> Hi Jiangli,
>>>>>>>
>>>>>>> Thanks for the review!
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04-delta/
>>>>>>>
>>>>>>>
>>>>>>> On 9/6/2018 4:28 PM, Jiangli Zhou wrote:
>>>>>>>> Hi Ioi,
>>>>>>>>
>>>>>>>> Sorry for the delay. It took me a while to get back to this.
>>>>>>>> Thanks for folding the closures together. It cuts the cost by
>>>>>>>> almost half. Here are my detailed comments.
>>>>>>>>
>>>>>>>> - _subgraph_object_klasses renaming
>>>>>>>>
>>>>>>>> The reason that I choose _subgraph_object_klasses was to
>>>>>>>> disambiguate between the klass containing a subgraph and the
>>>>>>>> klass of an object included in a subgraph. Is there a specific
>>>>>>>> reason to rename the term to _subgraph_klasses?
>>>>>>>>
>>>>>>>
>>>>>>> I just wanted to shorten the names to be the same as
>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses, but I agree
>>>>>>> with you for the naming.
>>>>>>>
>>>>>>> So instead, I've renamed
>>>>>>> ArchivedKlassSubGraphInfoRecord::subgraph_klasses ->
>>>>>>> subgraph_object_klasses to make the code consistent. What do you
>>>>>>> think?
>>>>>>>
>>>>>>>> - Could you please align the lines at 358 - 360.
>>>>>>>>
>>>>>>>> 355 WalkOopAndArchiveClosure(int level, bool
>>>>>>>> record_klasses_only,
>>>>>>>> 356 KlassSubGraphInfo* subgraph_info,
>>>>>>>> 357 oop orig, oop archived, TRAPS) :
>>>>>>>> 358 _level(level), _record_klasses_only(record_klasses_only),
>>>>>>>> 359 _subgraph_info(subgraph_info),
>>>>>>>> 360 _orig_referencing_obj(orig),
>>>>>>>> _archived_referencing_obj(archived),
>>>>>>>> 361 _thread(THREAD) {}
>>>>>>>>
>>>>>>>
>>>>>>> The 2 space indent is for the C++ field initializer list.
>>>>>>> HotSpot style guide doesn't mention this
>>>>>>> (https://wiki.openjdk.java.net/display/HotSpot/StyleGuide) so I
>>>>>>> just used emacs's default indentation.
>>>>>>>
>>>>>>>> - (2) should be only applied to object that's not seen within
>>>>>>>> the current subgraph. Please clarify it in the comment.
>>>>>>>>
>>>>>>>> 401 // (2) If orig_obj has not been seen yet, trace all
>>>>>>>> 402 // objects that are reachable from it, and make sure
>>>>>>>> these objects are archived.
>>>>>>>>
>>>>>>>
>>>>>>> I changed it to
>>>>>>>
>>>>>>> // (2) If orig_obj has not been seen yet (since
>>>>>>> start_recording_subgraph() was called),
>>>>>>> // trace all objects that are reachable from it, and make
>>>>>>> sure these objects are archived.
>>>>>>>
>>>>>>>
>>>>>>>> - Some Strings (such as "") are not archived in the closed
>>>>>>>> archive heap region. In that case, the String object is
>>>>>>>> automatically archived in the open archive heap region. The
>>>>>>>> following comment should be clarified and fixed.
>>>>>>>>
>>>>>>>> 417 // Strings are already archived in closed archive. No
>>>>>>>> need to walk it
>>>>>>>>
>>>>>>>
>>>>>>> I changed to
>>>>>>>
>>>>>>> if (java_lang_String::is_instance(orig_obj) && archived_obj !=
>>>>>>> NULL) {
>>>>>>> // To save time, don't walk strings that are already
>>>>>>> archived. They just contain
>>>>>>> // pointers to a type array, whose klass doesn't need to be
>>>>>>> recorded.
>>>>>>> return archived_obj;
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>>> - Could you please explain the reason for using two loops here?
>>>>>>>> Shouldn't one loop be sufficient?
>>>>>>>>
>>>>>>>> 646 for (int i = 0; i < num_archivable_static_fields; ) {
>>>>>>>>
>>>>>>>> 650
>>>>>>>> 651 int j = i;
>>>>>>>> 652 while (j < num_archivable_static_fields) {
>>>>>>>> 653 ArchivableStaticFieldInfo* f =
>>>>>>>> &archivable_static_fields[j];
>>>>>>>> 654 if (f->klass_name == klass_name) {
>>>>>>>> 655 archive_reachable_objects_from_static_field(f->klass,
>>>>>>>> f->klass_name,
>>>>>>>> 656 f->offset, f->field_name, CHECK);
>>>>>>>> 657 j++;
>>>>>>>> 658 } else {
>>>>>>>> 659 break;
>>>>>>>> 660 }
>>>>>>>> 661 }
>>>>>>>> ..
>>>>>>>> 663 i = j;
>>>>>>>> 664 }
>>>>>>>> 665
>>>>>>>
>>>>>>> That's because you can archive multiple fields in the same klass:
>>>>>>>
>>>>>>> // If you add new entries to this table, you should know what
>>>>>>> you're doing!
>>>>>>> static ArchivableStaticFieldInfo archivable_static_fields[] = {
>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>> "archivedSystemModules"},
>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>> "archivedModuleFinder"},
>>>>>>> {"jdk/internal/module/ArchivedModuleGraph", "archivedMainModule"},
>>>>>>> {"jdk/internal/module/ArchivedModuleGraph",
>>>>>>> "archivedConfiguration"},
>>>>>>> {"java/util/ImmutableCollections$ListN", "EMPTY_LIST"},
>>>>>>>
>>>>>>> Note that start/done_recording_subgraph are called in the i loop
>>>>>>> and not the j loop.
>>>>>>>
>>>>>>>> - HeapShared::archive_reachable_objects_from() should take a
>>>>>>>> '_record_klasses_only' argument from caller, which decides if
>>>>>>>> it only needs to record the klass list.
>>>>>>>>
>>>>>>>> As we walk all reachable objects from an entry point (a static
>>>>>>>> field) during walk&archive process, at the end of it we obtain
>>>>>>>> a complete and closed archived subgraph. Every object is also
>>>>>>>> the root of a sub-subgraph that's reachable from it. In a
>>>>>>>> subsequent subgraph archiving process from a different entry
>>>>>>>> point, if we encounter an object that is already archived, then
>>>>>>>> every objects contained within the sub-subgraph reachable from
>>>>>>>> that root are already archived also. So, '_record_klasses_only'
>>>>>>>> should be a inherited flag from the root and passed to all
>>>>>>>> HeapShared::archive_reachable_objects_from() calls for
>>>>>>>> processing objects within a sub-subgraph.
>>>>>>>>
>>>>>>>> During the walk, '_record_klasses_only' should be false
>>>>>>>> initially. Once it sees an object that already has an archived
>>>>>>>> copy (but not seen in the current subgraph), the
>>>>>>>> '_record_klasses_only' needs to become true and be propagated
>>>>>>>> to all sub-walking processes. '_record_klasses_only' should
>>>>>>>> never go back to false state.
>>>>>>>>
>>>>>>>> With that, we can guarantee to build a correct archived
>>>>>>>> subgraph without the risk of infinite cycles. It also reduces
>>>>>>>> overhead for the walk&archive process.
>>>>>>>
>>>>>>> I actually started to write to code exactly as you suggested, by
>>>>>>> passing record_klasses_only down, and then I found out that it's
>>>>>>> unnecessary, and having it will require me to add complicated
>>>>>>> asserts that was beyond my ability to get right.
>>>>>>>
>>>>>>> As I understand, you worry about about correctness (termination
>>>>>>> of the algorithm), and performance.
>>>>>>>
>>>>>>> I believe the logic of the current code is correct, and the
>>>>>>> algorithm will terminate:
>>>>>>>
>>>>>>> oop archived_obj =
>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>> ....
>>>>>>> if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>> // orig_obj has already been archived and traced.
>>>>>>> Nothing more to do.
>>>>>>> return archived_obj;
>>>>>>> }
>>>>>>> ...
>>>>>>> WalkOopAndArchiveClosure walker(level, record_klasses_only,
>>>>>>> subgraph_info, orig_obj, archived_obj, THREAD);
>>>>>>> orig_obj->oop_iterate(&walker); // recurse
>>>>>>>
>>>>>>> HeapShared::archive_reachable_objects_from will recurse ONLY if
>>>>>>> you call this function with an object that has not need seen, so
>>>>>>> eventually the recursion will stop because all objects in the
>>>>>>> subgraph has been seen.
>>>>>>>
>>>>>>> To make the code easier to understand, I rewrote it like this:
>>>>>>>
>>>>>>> if (has_been_seen_during_subgraph_recording(orig_obj)) {
>>>>>>> // orig_obj has already been archived and traced. Nothing
>>>>>>> more to do.
>>>>>>> return archived_obj;
>>>>>>> } else {
>>>>>>> set_has_been_seen_during_subgraph_recording(orig_obj);
>>>>>>> }
>>>>>>>
>>>>>>> For performance, the logic for deciding record_klasses_only is this:
>>>>>>>
>>>>>>> oop archived_obj =
>>>>>>> MetaspaceShared::find_archived_heap_object(orig_obj);
>>>>>>> ...
>>>>>>> bool record_klasses_only = (archived_obj != NULL);
>>>>>>>
>>>>>>> You have to call find_archived_heap_object anyway, so that's
>>>>>>> unavoidable. The cost of computing record_klasses_only is very
>>>>>>> simple. For performance, there's no need for the it to be
>>>>>>> tracked in the recursive calls. That makes the code simpler, and
>>>>>>> easier to maintain.
>>>>>>>
>>>>>>> Otherwise we may need to think about whether the caller and the
>>>>>>> callee have different opinion on record_klasses_only:
>>>>>>>
>>>>>>> - if this could happen, what would cause this to happen, and
>>>>>>> how should we handle it?
>>>>>>> - if this could never happen, how would you write an assertion
>>>>>>> to validate that?
>>>>>>>
>>>>>>> Writing those asserts turned out to be too complicated.
>>>>>>>
>>>>>>>>
>>>>>>>> - Could add explanation in the comment?
>>>>>>>>
>>>>>>>> 438 // See
>>>>>>>> runtime/appcds/cacheObject/ArchivedIntegerCacheTest.java
>>>>>>>>
>>>>>>>
>>>>>>> I changed it to
>>>>>>>
>>>>>>> if (level == 1) {
>>>>>>> // Don't archive a subgraph root that's too big. For
>>>>>>> archives static fields, that's OK
>>>>>>> // as the Java code will take care of initializing this
>>>>>>> field dynamically.
>>>>>>> return NULL;
>>>>>>> }
>>>>>>>
>>>>>>>> - Please add the step 5) comment back to the walk&archive
>>>>>>>> description.
>>>>>>>>
>>>>>>>> 466 // 5) The Klass of the current java object is added to the
>>>>>>>> list of Klasses
>>>>>>>> 467 // for loading and initialzing before any object in the
>>>>>>>> archived graph can
>>>>>>>> 468 // be accessed at runtime.
>>>>>>>> 469 //
>>>>>>>>
>>>>>>>
>>>>>>> Done. Sorry I removed it by mistake.
>>>>>>>
>>>>>>>> - The HeapShared::archive_module_graph_objects() can take a
>>>>>>>> more general name now, if you want to do it as part of your
>>>>>>>> change.
>>>>>>>>
>>>>>>>>
>>>>>>> I renamed it to archive_static_fields(). I also removed/fixed a
>>>>>>> few comments about "Currently there is only one class mirror
>>>>>>> (ArchivedModuleGraph) with archived sub-graphs."
>>>>>>>
>>>>>>>> Could you please also send a sample of the changed cds+heap
>>>>>>>> debug log output?
>>>>>>>>
>>>>>>> See
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v04/logs/
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>>> On 9/5/18 3:56 PM, Ioi Lam wrote:
>>>>>>>>> I updated the patch:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v03-delta/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 1. No need to walk String objects during subgraph recording.
>>>>>>>>>
>>>>>>>>> 2. Print put statistics:
>>>>>>>>>
>>>>>>>>> =========================
>>>>>>>>>
>>>>>>>>> jdk/internal/module/ArchivedModuleGraph: walked 1871 objs,
>>>>>>>>> archived 1871 new objs, recorded 28 classes
>>>>>>>>> java/util/ImmutableCollections$ListN: walked 2 objs, archived
>>>>>>>>> 0 new objs, recorded 0 classes
>>>>>>>>> java/util/ImmutableCollections$MapN: walked 2 objs, archived 0
>>>>>>>>> new objs, recorded 0 classes
>>>>>>>>> java/util/ImmutableCollections$SetN: walked 2 objs, archived 0
>>>>>>>>> new objs, recorded 0 classes
>>>>>>>>> java/lang/Integer$IntegerCache: walked 257 objs, archived 256
>>>>>>>>> new objs, recorded 2 classes
>>>>>>>>> java/lang/module/Configuration: walked 7 objs, archived 0 new
>>>>>>>>> objs, recorded 3 classes
>>>>>>>>>
>>>>>>>>> Performed subgraph records = 6 times
>>>>>>>>> Walked 2141 objects
>>>>>>>>> Archived 2127 objects
>>>>>>>>> Recorded 33 klasses
>>>>>>>>>
>>>>>>>>> ========
>>>>>>>>>
>>>>>>>>> So, at least for now, we don't need to worry about the
>>>>>>>>> performance of WalkOopAndArchiveClosure.
>>>>>>>>>
>>>>>>>>> Once I get some statistics of Lambda archiving, I'll post them
>>>>>>>>> as well.
>>>>>>>>>
>>>>>>>>> So, with the bug fix, the subgraph recording should be both
>>>>>>>>> correct and faster than before (no more String walking).
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> - Ioi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 9/3/2018 1:02 AM, Ioi Lam wrote:
>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>
>>>>>>>>>> I think your suggestion of combining of combining
>>>>>>>>>> RecordKlassesClosure and WalkOopAndArchiveClosure is a good
>>>>>>>>>> idea. I have updated the webrev:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v02/
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This also supersedes my other patch for JDK-8210295 Refactor
>>>>>>>>>> HeapShared::archive_reachable_objects_from_static_field
>>>>>>>>>>
>>>>>>>>>> More comments below on future optimizations.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 9/2/18 6:07 PM, Jiangli Zhou wrote:
>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>
>>>>>>>>>>> On 9/1/18 8:58 PM, Ioi Lam wrote:
>>>>>>>>>>>> Hi Jiangli,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for looking at the optimization of this problem. I
>>>>>>>>>>>> think your algorithm will run into problems if (a) there is
>>>>>>>>>>>> a cycle, and (b) two sub-graphs point to different members
>>>>>>>>>>>> in the cycle.
>>>>>>>>>>>
>>>>>>>>>>> I think such issue can be solved by tracking the original K
>>>>>>>>>>> (in the K-list, short for the class-initialization-list) for
>>>>>>>>>>> each archived object. If an encountered object is already
>>>>>>>>>>> archived (within the current sub-graph) and is associated
>>>>>>>>>>> with K1 within the current K-list (must be in this case). We
>>>>>>>>>>> can update the sub-K-list between K1 and the K associated
>>>>>>>>>>> with the referencing object of the current object, so their
>>>>>>>>>>> level are the same. We can also add a flag to the K-list
>>>>>>>>>>> elements to tag each K within the cycle. It may not be what
>>>>>>>>>>> we need to do currently.
>>>>>>>>>>>
>>>>>>>>>>> Can you please describe the specific class that you ran into
>>>>>>>>>>> with the bug in your sub-graph?
>>>>>>>>>>
>>>>>>>>>> I found the bug by code examination. It's also during this
>>>>>>>>>> time that I realized that it's not a simple problem to
>>>>>>>>>> optimize, because of cycles in the graph. There's been tons
>>>>>>>>>> of established research in this area (as in "Tarjan's
>>>>>>>>>> algorithm is one of Donald Knuth's favorite
>>>>>>>>>> implementations"), so I don't think we should reinvent the
>>>>>>>>>> wheel. For future optimization, let's just apply an existing
>>>>>>>>>> algorithm.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Maybe someone familiar with graph theory can suggest a
>>>>>>>>>>>> solution?
>>>>>>>>>>>>
>>>>>>>>>>>> My current thinking is to first find all the cycles in the
>>>>>>>>>>>> archived heap. The reachability of every member in the
>>>>>>>>>>>> cycle is the same, so we can collapse each cycle into a
>>>>>>>>>>>> single node. After that we have a DAG. Then, instead of
>>>>>>>>>>>> encoding the class initialization order as multiple lists,
>>>>>>>>>>>> we can encode it as a DAG to save space.
>>>>>>>>>>>>
>>>>>>>>>>>> See
>>>>>>>>>>>> https://stackoverflow.com/questions/261573/best-algorithm-for-detecting-cycles-in-a-directed-graph
>>>>>>>>>>>>
>>>>>>>>>>>> In any case, I think this is a sufficiently difficult
>>>>>>>>>>>> problem, so we should separate the dumping of the
>>>>>>>>>>>> sub-graphs (the first part of
>>>>>>>>>>>> HeapShared::archive_module_graph_objects() in my patch) and
>>>>>>>>>>>> the computation of the initialization order
>>>>>>>>>>>> (HeapShared::record_subgraph_klasses()).
>>>>>>>>>>> Our current K-lists are quite small. If the number of
>>>>>>>>>>> sub-graphs grows in the future and there are many duplicates
>>>>>>>>>>> in the K-lists, we can explore using a single K-list, with
>>>>>>>>>>> each sub-graph-info recording the [start-index, end-index]
>>>>>>>>>>> pairs of the sub-K-lists.
>>>>>>>>>>>
>>>>>>>>>>> To avoid any confusion to others, I want to point out that
>>>>>>>>>>> the order of the elements in the K-list is not the
>>>>>>>>>>> initialization order of the classes that happens at dump time.
>>>>>>>>>>
>>>>>>>>>> Sorry for the confusion. I wanted to avoid using the word
>>>>>>>>>> "list" -- I think the most space efficient representation
>>>>>>>>>> will be a DAG instead of a list -- I should have used "set"
>>>>>>>>>> instead.
>>>>>>>>>>
>>>>>>>>>> Thanks again for spending time on this topic.
>>>>>>>>>>
>>>>>>>>>> - Ioi
>>>>>>>>>>> The order of the K-list elements is determined by how we
>>>>>>>>>>> walk the graph. It works for our purpose because we are not
>>>>>>>>>>> doing general-purpose object graph archiving (at least
>>>>>>>>>>> currently), and all classes in the targeted sub-graph for
>>>>>>>>>>> archiving must not have any dependencies on runtime context.
>>>>>>>>>>> When we add more sub-graphs to the archive in the future, we
>>>>>>>>>>> can examine our existing assumptions and lift limitations
>>>>>>>>>>> case by case with enhancements.
>>>>>>>>>>>>
>>>>>>>>>>>> Also, I think we should defer the optimization of
>>>>>>>>>>>> record_subgraph_klasses in a future RFE. That way, we can
>>>>>>>>>>>> have correctness now, and performance later when we
>>>>>>>>>>>> actually need it. (The current implementation of
>>>>>>>>>>>> record_subgraph_klasses() has not caused any perceivable
>>>>>>>>>>>> delay for the current set of objects that we are archiving).
>>>>>>>>>>> We probably don't need to do what I described in the above
>>>>>>>>>>> for now. I've looked at your changes, I think we can choose
>>>>>>>>>>> a middle-ground approach that's more efficient than your
>>>>>>>>>>> current change, but could still be O(N*M) in the worst case
>>>>>>>>>>> (N is the number of nodes, M is the number of sub-graphs).
>>>>>>>>>>> It also avoid duplicating the logic between the
>>>>>>>>>>> archiving_walk closure and the new RecordKlassesClosure.
>>>>>>>>>>>
>>>>>>>>>>> In the middle-ground approach, WalkOopAndArchiveClosure can
>>>>>>>>>>> take a new boolean flag, 'record_klass_only'. We can use a
>>>>>>>>>>> separate table to track the objects we have seen in the
>>>>>>>>>>> current sub-graph as you are doing in the webrev. If the
>>>>>>>>>>> there is an existing archived copy of the current object, we
>>>>>>>>>>> check the other table to determine if it's already in the
>>>>>>>>>>> current sub-graph. If yes, no additional work is needed. If
>>>>>>>>>>> no, recursively enter the closure but do 'record_klass_only'.
>>>>>>>>>>>
>>>>>>>>>>> oop archived =
>>>>>>>>>>> MetaspaceShared::find_archived_heap_object(obj);
>>>>>>>>>>> if (archived != NULL) {
>>>>>>>>>>> if (has_seen_in_the_current_graph) {
>>>>>>>>>>> // no work
>>>>>>>>>>> } else {
>>>>>>>>>>> // walk the object but record klass only
>>>>>>>>>>> }
>>>>>>>>>>> return;
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> With that, we can avoid duplicating the logic in
>>>>>>>>>>> RecordKlassesClosure and also avoid walking the sub-graph
>>>>>>>>>>> objects twice in most cases. What do you think?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jiangli
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>
>>>>>>>>>>>> On 9/1/18 7:55 PM, Jiangli Zhou wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sep 1, 2018, at 7:25 PM, Jiangli Zhou
>>>>>>>>>>>>>> <jiangli.zhou at oracle.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Ioi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for finding the bug. To address the incomplete
>>>>>>>>>>>>>> class list issue, the original algorithm can be augmented
>>>>>>>>>>>>>> while remaining as an O(N) solution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> As each node (object) in an archived subgraph is a root
>>>>>>>>>>>>>> of a sub-subgraph, the class-initialization-list
>>>>>>>>>>>>>> associated with a specific node (within an existing
>>>>>>>>>>>>>> archived sub-graph) is also a sub-list of the enclosing
>>>>>>>>>>>>>> sub-graph's class-initialization-list. For example,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sub-graph1:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> O1(k1)
>>>>>>>>>>>>>> / \
>>>>>>>>>>>>>> O2(k2) O5(k5)
>>>>>>>>>>>>>> / \
>>>>>>>>>>>>>> O3(k3) O4(k4)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Klass-init-list: K1, K2, K3, K4, K5
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sub-graph2:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> O2(k2)
>>>>>>>>>>>>>> / \
>>>>>>>>>>>>>> O3(k3) O4(k4)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Klass-init-list: K2, K3, K4
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> During the sub-graph walking and archiving process for
>>>>>>>>>>>>>> sub-graph2, if O2 has been previously archived in another
>>>>>>>>>>>>>> sub-graph, we can find the other sub-graph's class list
>>>>>>>>>>>>>> and take the sub-list starting from the klass (k2),
>>>>>>>>>>>>>> without re-walking each nodes again. To do that, we only
>>>>>>>>>>>>>> need to walk the existing recorded
>>>>>>>>>>>>>> class-initialization-list. If K2 is found in any of the
>>>>>>>>>>>>>> existing list, we can take the sub-list starting from K2
>>>>>>>>>>>>>> and append the list to the current one.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> With the approach, K5 would also be included if O5 is
>>>>>>>>>>>>>> walked after O2 in sub-graph1. However, O5 is not part of
>>>>>>>>>>>>>> O2's sub-graph. So that would introduce overhead at
>>>>>>>>>>>>>> runtime. To avoid such problem, we can remember the
>>>>>>>>>>>>>> sub-graph level (which is already built as part of the
>>>>>>>>>>>>>> existing graph walking algorithm) for each K in the
>>>>>>>>>>>>>> class-initialization-list. The ending indication of the
>>>>>>>>>>>>>> sub-list would be the first K with the same level as the
>>>>>>>>>>>>>> starting K. So we would have:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> K1(1), K2(2), K3(3), K4(4), K5(2)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K2 who's
>>>>>>>>>>>>>> level is 2.
>>>>>>>>>>>>> Typo. The second K2 above should be k5:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The K2 level is 2, the sub-list would end before K5, who's
>>>>>>>>>>>>> level is 2.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This part is not currently implemented yet but is not
>>>>>>>>>>>>>> difficult to add.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jiangli
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 9/1/18 6:08 PM, Ioi Lam wrote:
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8210289
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk12/8210289-subgraph-record-incomplete.v01/
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Description
>>>>>>>>>>>>>>> ===========
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I found this bug while trying to merge my code for CDS
>>>>>>>>>>>>>>> support of Lambda
>>>>>>>>>>>>>>> classes (JDK-8198698).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When heapShared.cpp dumps a sub-graph of heap objects,
>>>>>>>>>>>>>>> it attempts to
>>>>>>>>>>>>>>> record all the classes of all the objects that are
>>>>>>>>>>>>>>> referenced by
>>>>>>>>>>>>>>> this sub-graph. However, if one of these objects have
>>>>>>>>>>>>>>> already been visited
>>>>>>>>>>>>>>> while a previous sub-graph was dumped, then this
>>>>>>>>>>>>>>> object's class is not
>>>>>>>>>>>>>>> recorded in the current sub-graph.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At runtime, if the current sub-graph is restored before
>>>>>>>>>>>>>>> any other
>>>>>>>>>>>>>>> sub-graphs, we will end up with a live object in the
>>>>>>>>>>>>>>> Java heap with
>>>>>>>>>>>>>>> an uninitialized class.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fix
>>>>>>>>>>>>>>> ===
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Now I create the sub-graph's klasses list after all
>>>>>>>>>>>>>>> sub-graphs have dumped.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For each class that has archived sub-graph(s), I do a
>>>>>>>>>>>>>>> heap walk to discover
>>>>>>>>>>>>>>> all klasses that are reachable from this archived fields
>>>>>>>>>>>>>>> of this class.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is sub-optimal but suffice for now because we just
>>>>>>>>>>>>>>> have a very small
>>>>>>>>>>>>>>> number of sub-graphs. The worst case its O(N^2) where N
>>>>>>>>>>>>>>> is the number of
>>>>>>>>>>>>>>> objects in the archived heap.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the future, we might need to implement a more
>>>>>>>>>>>>>>> efficient algorithm that
>>>>>>>>>>>>>>> walks the heap once and computes all the klasses lists
>>>>>>>>>>>>>>> of all the
>>>>>>>>>>>>>>> sub-graphs at the same time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Testing
>>>>>>>>>>>>>>> =======
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> hs-tier[1,2,3]
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>> - Ioi
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list