RFR(XS) 8230674 Heap dumps should exclude dormant CDS archived objects of unloaded classes

Ioi Lam ioi.lam at oracle.com
Fri Sep 6 22:17:13 UTC 2019


On 9/6/19 11:48 AM, Jiangli Zhou wrote:
> On Fri, Sep 6, 2019 at 9:43 AM Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>
>> On 9/5/19 11:11 PM, David Holmes wrote:
>>> On 6/09/2019 1:39 pm, Ioi Lam wrote:
>>>> On 9/5/19 8:18 PM, David Holmes wrote:
>>>>> Hi Ioi,
>>>>>
>>>>> On 6/09/2019 12:27 pm, Ioi Lam wrote:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230674
>>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8230674-heap-dump-exclude-dormant-oops.v01
>>>>>>
>>>>>>
>>>>>> Please review this small fix:
>>>>>>
>>>>>> When CDS is in use, archived objects are memory-mapped into the
>>>>>> heap (currently G1GC only). These objects are partitioned into
>>>>>> "subgraphs". Some of these subgraphs may not be loaded (e.g., those
>>>>>> related to jdk.internal.math.FDBigInteger) at the time a heap dump is
>>>>>> requested. >
>>>>>> When a subgraph is not loaded, some of the objects in this subgraph
>>>>>> may belong to a class that's not yet loaded.
>>>>>>
>>>>>> The bug happens when such an "dormant" object is dumped, but its class
>>>>>> is not dumped because the class is not in the system dictionary.
>>>>>>
>>>>>> There is already code in DumperSupport::dump_instance() that tries
>>>>>> to handle dormant objects, but it needs to be extended to cover
>>>>>> arrays, as well as and references from non-dormant object/arrays to
>>>>>> dormant ones.
>>>>> I have to confess I did not pay any attention to the CDS archived
>>>>> objects work, so I don't have a firm grasp of how you have
>>>>> implemented things. But I'm wondering how can you have a reference
>>>>> to a dormant object from a non-dormant one? Shouldn't the act of
>>>>> becoming non-dormant automatically cause the subgraph from that
>>>>> object to also become non-dormant? Or do you have "read barriers" to
>>>>> perform the changes on demand?
>>>>>
>> Ah -- my bug title is not correct.
>>
>> I changed the bug title (and this e-mail subject) to
>>
>> Heap dumps should exclude dormant CDS archived objects **of unloaded
>> classes**
>>
>> During the heap dump, we scan all objects in the heap, regardless of
>> reachability. There's no way to decide reachability in
>> HeapObjectDumper::do_object(), unless we perform an actual GC.
>>
>> But it's OK to include unreachable objects in the heap dump. (I guess
>> it's useful to see how much garbage you have in the heap. There's an
>> option to run a collection before dumping the heap.)
>>
>> There are 2 kinds of unreachable objects -- garbage: those that were
>> once reachable but no longer, dormant: the archived objects that have
>> never been reachable.
> Currently Java object archiving framework only supports one
> directional state change: dormant -> live. An archived object can
> become a live object from dormant state, but it cannot go back to the
> dormant state. Need to investigate thoroughly for all cases before the
> 'live -> dormant' transition can be supported. All objects in the
> 'Open' archive heap region are associated with the builtin class
> loaders and their classes are not unloaded. The existing static fields
> for archiving within the JDK classes are selected and the associated
> objects do not become garbage once 'installed'.
>
>> Anyway, it's OK to dump dormant objects as long as their class has been
>> loaded. The problem happens only when we dump a dormant object who class
>> is not yet loaded (Eclipase MAT get confused when it sees an object
>> whose class ID is invalid).
> Yes. That's a scenario needs to be handled for a tool that iterates
> the Java heap. A dormant object in the 'Open' archive heap region may
> have a 'invalid' klass since the klass may not be loaded yet at the
> moment.
>
> Your webrev looks reasonable to me on high level pending information
> for following questions. Can you please give more details on the
> dormant objects referenced from the arrays? What specific arrays are
> those?
Hi Jiangli,

Thanks for the review. I add the following code:

   // [id]* elements
   for (int index = 0; index < length; index++) {
     oop o = array->obj_at(index);
 >>
     if (o != NULL && mask_dormant_archived_object(o) == NULL) {
       ResourceMark rm;
       tty->print_cr("%s array contains %s object", 
array->klass()->external_name(), o->klass()->external_name());
     }
 >>
     o = mask_dormant_archived_object(o);
     writer->write_objectID(o);
   }

and the output is:

$ java -cp .  -XX:+HeapDumpAfterFullGC HelloGC
Dumping heap to java_pid20956.hprof ...
[Ljava.lang.Object; array contains java.util.jar.Attributes$Name object
[Ljava.lang.Object; array contains java.util.jar.Attributes$Name object
(repeated about 20 times)

It comes from java/util/jar/Attributes$Name::KNOWN_NAMES. This class is 
not loaded because my program doesn't use JAR files in the classpath:

Thanks
- Ioi


> Regards,
> Jiangli
>
>
>
>> So to answer your question, we can have a case with a dormant array
>> (that contains a dormant object) like this:
>>
>>       Object[] array = {new ClassNotYetLoaded();}
>>
>> After my fix, the array will be dumped (we have no easy way of not doing
>> that), but its contents becomes this in the .hprof file:
>>
>>       Object[] array = {null}
>>
>> Thanks
>> - Ioi
>>
>>
>>
>>>> Hi David,
>>>>
>>>> Thanks for the review.
>>>>
>>>> The dormant objects are not reachable via the GC roots. They become
>>>> non-dormant via explicit calls to JVM_InitializeFromArchive, after
>>>> which they become reachable via the static fields of loaded classes.
>>> Right, so is there a distinction between non-dormant and reachable at
>>> the time an object becomes non-dormant? I'm still unclear how a drmant
>>> array becomes non-dormant but still contains elements that refer to
>>> dormant objects.
>>>
>>>> The only issue here is heap dump is done by scanning all objects in
>>>> the heap, including unreachable ones
>>>>
>>>>     HeapObjectDumper obj_dumper(this, writer());
>>>>     Universe::heap()->safe_object_iterate(&obj_dumper);
>>>>
>>>> that's how these dormant objects are discovered during heap dump.
>>>>
>>>>> That aside the code changes seem reasonable, you moved the check out
>>>>> of DumperSupport::dump_instance and into the higher-level
>>>>> HeapObjectDumper::do_object so that it catches instances and arrays,
>>>>> plus you added a check for array elements.
>>>>>
>>>> I am debating whether I should put the masking code in here:
>>>>
>>>> void DumpWriter::write_objectID(oop o) {
>>>>     o = mask_dormant_archived_object(o);  /// <---- add
>>>>     address a = (address)o;
>>>> #ifdef _LP64
>>>>     write_u8((u8)a);
>>>> #else
>>>>     write_u4((u4)a);
>>>> #endif
>>>> }
>>>>
>>>>
>>>> That way, even if a dormant object (unintentionally) becomes
>>>> reachable via the GC roots, we won't write an invalid reference to it
>>>> (the object "body" will not be written, so the ID will not point to
>>>> anything valid).
>>>>
>>>> But this seems a little too aggressive to me. What do you think?
>>> It does seem a little aggressive as it seems to introduce the dormancy
>>> check into a lot of places that don't need it. But as I said I don't
>>> know this code so I'm really not the right person to ask.
>>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Thanks
>>>> - Ioi
>>>>



More information about the serviceability-dev mailing list