[9] RFR (M) 8054888: Runtime: Add Diagnostic Command that prints the class hierarchy

Karen Kinnear karen.kinnear at oracle.com
Fri Jan 9 17:49:56 UTC 2015


Staffan,

On Jan 9, 2015, at 7:38 AM, Staffan Larsen wrote:

> It’s getting difficult to get all the information into the same output: hierarchy, interfaces, class loaders and modules. I took a stab at it and it could look like this:
> 
> java.lang.Object
> |--java.io.Serializable (java.base, 0x00000007c00375f8, iface)
> |--java.util.RandomAccess (java.base, 0x00000007c00375f8, iface)
> |--java.lang.Iterable (java.base, 0x00000007c00375f8, iface)
> | |--java.util.Collection (java.base, 0x00000007c00375f8, iface)
> | | |--java.util.List (java.base, 0x00000007c00375f8, iface)
> |--java.util.AbstractCollection  (java.base, 0x00000007c00375f8)
> | |  implements java.util.Collection
> | |  implements java.lang.Iterable
> | |--java.util.AbstractList (java.base, 0x00000007c00375f8)
> | |    implements java.util.List
> | | |--java.util.Arrays$ArrayList (java.base, 0x00000007c00375f8)
> | | |    implements java.io.Serializable
> | | |    implements java.util.RandomAccess
>  
> 
> But this is pushing what can be visualized in one place.
> 
> We will need to add module and class loader information somehow to all the Diagnostic Commands that list classes. In addition we will need a way to see how modules relate to one another. Perhaps it isn’t possible to have all the information in one place, but instead make it possible to cross reference between different diagnostic commands. For example, GC.class_stats could have the module and class loader information, and GC.class_hierarchy would not have to include it. What is missing from making that possible is a unique way of identifying a class (since the name is not unique). All output would need to include that unique identifier and it would be possible to cross reference. The identifier has to be stable during a JVM run, but not between runs.
I like where you are going with this. Since a given module is within a class loader, putting the the loader/module/class  information in class_stats would be helpful and then having
other dcmds just need the class/class loader pair for uniqueness. 

Not sure what you are using below for unique identifier.
I would be tempted to use the class loader hex value since otherwise you are introducing hex values that have no meaning other than as
cross-references. And the class/class loader pair is guaranteed uniqueness.
For any class loader you could list its hex value thereby giving the information in a single command that gives meaning to the loader value. 

java.lang.Object/null
...
|-sun.misc.Launcher$AppClassLoader/null (0xyyy)
|-myapp/0xyyy

Just a thought. If you are not enthused - I am ok with your proposal.

thanks,
Karen

> 
> The above would then become:
> 
> java.lang.Object/0x12345600
> |--java.io.Serializable/0x12345601
> |--java.util.RandomAccess/0x12345602
> |--java.lang.Iterable/0x12345603
> | |--java.util.Collection/0x12345604
> | | |--java.util.List/0x12345605
> |--java.util.AbstractCollection/0x12345606
> | |  implements java.util.Collection/0x12345604
> | |  implements java.lang.Iterable/0x12345603
> | |--java.util.AbstractList/0x12345607
> | |    implements java.util.List/0x12345605
> | | |--java.util.Arrays$ArrayList/0x12345608
> | | |    implements java.io.Serializable/0x12345601
> | | |    implements java.util.RandomAccess/0x12345602
> 
> With additions to GC.class_stats:
> 
> Index Super ClassLoader        ClassName
>     1    -1 0x00000007c0034c48 java.lang.Object/0x12345600
>     2     1 0x00000007c0034c48 java.util.List/0x12345605
>     3    31 0x00000007c0034c48 java.util.AbstractList/0x12345607
> 
> And GC.class_histogram:
> 
>  num     #instances         #bytes  class name
> ----------------------------------------------
>    1:           945         117736  java.lang.Object/0x12345600
>    2:           442          50352  java.util.List/0x12345605
>    3:           499          25288  java.util.AbstractList/0x12345607
> 
> 
> 
> /Staffan
> 
>> On 9 jan 2015, at 09:53, Frederic Parain <Frederic.Parain at oracle.com> wrote:
>> 
>> 
>> 
>> On 01/08/2015 10:29 PM, Chris Plummer wrote:
>>> Hi Karen,
>>> 
>>> Comments inline.
>>> 
>>> On 1/8/15 8:07 AM, Karen Kinnear wrote:
>>>> Chris,
>>>> 
>>>> Thank you for doing this. I had a couple of questions/comments.
>>>> 
>>>> I like your idea of being able to start with a specific class to show
>>>> all subclasses of.
>>> Ok. I'll add that.
>>>> 
>>>> I think the way I read the code this shows the superclass hierarchy,
>>>> not the superinterfaces. With the addition
>>>> of default methods in interfaces, I think we have increased the value
>>>> in seeing superinterfaces.
>>> It does include interfaces in the output, but not as part of any class
>>> hierarchy. Interfaces are just shown as regular classes that inherit
>>> from Object. This is the case if one interface extends another, I
>>> suppose because "extends" is just interpreted as "implements" in this case.
>>>> 
>>>> So what I personally would find useful would be to be able to start
>>>> with a specific class and
>>>> find the superclasses and superinterfaces of that class - for the
>>>> debugging I do, I usually am
>>>> trying to look up and need both sets of information. Today if you run
>>>> -XX:+TraceDefaultMethods
>>>> there is one sample way to display all the supertypes of a single
>>>> type, all the way up. I don't know the
>>>> best way to make that consistent with the current output approach,
>>>> e.g. using the |- syntax.
>>>> 
>>>> e.g.
>>>> Class java.util.Arrays$ArrayList requires default method processing
>>>> java/util/Arrays$ArrayList
>>>>   java/util/AbstractList
>>>>     java/util/AbstractCollection
>>>>       java/lang/Object
>>>>       java/util/Collection
>>>>         java/lang/Object
>>>>         java/lang/Iterable
>>>>           java/lang/Object
>>>>     java/util/List
>>>>       java/lang/Object
>>>>       java/util/Collection
>>>>         java/lang/Object
>>>>         java/lang/Iterable
>>>>           java/lang/Object
>>>>   java/util/RandomAccess
>>>>     java/lang/Object
>>>>   java/io/Serializable
>>>>     java/lang/Object
>>>> 
>>>> Do you think there could be value to others in the ability to walk up
>>>> the hierarchy as well as to
>>>> see superclasses and superinterfaces at least from that perspective?
>>> This is a inverted from how my dcmd prints the hierarchy, plus also
>>> includes interfaces. Inverting the hierarchy means a class is listed
>>> with every subclass of the class, which I don't think is desirable.
>>> Including interfaces has the same issue, but introduces a new issue even
>>> when not inverting the hierarchy. The same interface can be in more than
>>> one location in the hierarchy, so there is no avoiding printing it more
>>> than once, so we need to decide how to best include interfaces in the
>>> output.
>> 
>> It seems to me that we have two very different use cases here, each one
>> best served with a different output format:
>> 
>> 1 - Listing of all classes/interfaces hierarchy when the dcmd is
>>     invoked without arguments:
>>        -> Chris' output format as described below (with interfaces)
>> 2 - Investigation on a particular class or interface when a class
>>     or interface is passed in argument to the dcmd
>>        -> Karen's output format, much easier to work with to
>>           track default methods. Because the output is limited to the
>>           hierarchy from a single class, there's no class duplication
>>           in output (single parent class inheritance) and limited
>>           interfaces duplication.
>> 
>> If the implementations of the two features are too different, we could
>> consider having two different dcmds.
>> 
>> My 2 cents,
>> 
>> Fred
>> 
>>> The could just be a simple list right after the class that
>>> implements them:
>>> 
>>> java.lang.Object
>>> | ...
>>> |--MyBaseClass
>>> | |  implements -> MyInterface1
>>> | |  implements -> MyInterface2
>>> | |--MySubClass
>>> |      implements -> MyInterface1
>>> |      implements -> MyInterface2
>>> | ...
>>> |--MyInterface1
>>> |--MyInterface2
>>> 
>>> The "implements"  lines could be optional.
>>> 
>>> I know cvm would distinguish between interfaces the Class declared it
>>> implemented, and those it inherited from the interfaces it declared it
>>> implemented. This was necessary for reflection, and I think also to
>>> properly build up interfaces tables. I assume hotspot does something
>>> similar. Is there any need for this information to be conveyed in the
>>> above output, or just list out every interface implemented, and not
>>> worry about how the class acquired it.
>>>> Is there value in printing the defining class loader for each class -
>>>> maybe optionally?
>>> This is already available with GC.class_stats, although not in the
>>> default output. I suppose the reality is that it might be better handled
>>> by this DCMD. The main puprose of GC.class_stats is to print statistics
>>> (counts and sizes), so printing the ClassLoader name there is probably
>>> not appropriate, but then it's not really appropriate for
>>> VM.class_hierarchy either. I'm not sure how best to handle this. One or
>>> both DCMDs possibly should be re-purposed and more clearly define what
>>> type of data it displays.
>>>> If so, I'm wondering if there might be value in future for the jigsaw
>>>> project adding the module for each class - maybe optionally as well?
>>> This relates to my above statement. We need to figure out what type of
>>> data is useful, and which dcmds should handle them.
>>>> Love opinions on that  - especially from the serviceability folks
>>>> 
>>>> thanks,
>>>> Karen
>>> Thanks for the input.
>>> 
>>> Chris
>>>> 
>>>> 
>>>> On Jan 7, 2015, at 6:29 PM, Chris Plummer wrote:
>>>> 
>>>>> Hi,
>>>>> 
>>>>> Please review the following changes for the addition of the
>>>>> VM.class_hierarchy DCMD. Please read the bug first for some
>>>>> background information.
>>>>> 
>>>>> Webrev: http://cr.openjdk.java.net/~cjplummer/8054888/webrev.00/
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8054888
>>>>> 
>>>>> I expect there will be further restructuring or additional feature
>>>>> work. More discussion on that below. I'm not sure if that additional
>>>>> work will be done later with a separately bug filed or with this
>>>>> initial commit. That's one thing I want to work out with this review.
>>>>> 
>>>>> Currently the bulk of the DCMD is implemented in heapInspection.cpp.
>>>>> The  main purpose of this file is to implement the GC.class_stats and
>>>>> GC.class_histogram DCMDs. Both of them require walking the java heap
>>>>> to count live objects of each type, thus the name
>>>>> "heapInspection.cpp". This new VM.class_hierarchy DCMD does not
>>>>> require walking the heap, but is implemented in this file because it
>>>>> leverages the existing KlassInfoTable and related classes
>>>>> (KlassInfoEntry, KlassInfoBucket, and KlassClosure).
>>>>> 
>>>>> KlassInfoTable makes it easy to build a database of all loaded
>>>>> classes, save additional info gathered for each class, iterate over
>>>>> them quickly, and also do quick lookups. This exactly what I needed
>>>>> for this DCMD, thus the reuse. There is some downside to this. For
>>>>> starters, heapInspection.cpp is not the proper place for a DCMD that
>>>>> has nothing to do with heap inspection. Also, KlassInfoEntry is being
>>>>> overloaded now to support 3 different DCMDs, as is KlassInfoTable. As
>>>>> a result each has a few fields and methods that are not used for all
>>>>> 3 DCMDs. Some subclassing might be in order here, but I'm not sure if
>>>>> it's worth it. Opinions welcomed. If I am going to refactor, I would
>>>>> prefer that be done as a next step so I'm not disturbing the existing
>>>>> DCMDs with this first implementation.
>>>>> 
>>>>> I added some comments to code only used for GC.class_stats and
>>>>> GC.class_histogram. I did this when trying to figure them out so I
>>>>> could better understand how to implement VM.class_hierarchy. I can
>>>>> take them out if you think they are not appropriate for this commit.
>>>>> 
>>>>> One other item I like to discuss is whether it is worth adding a
>>>>> class name argument to this DCMD. That would cause just the
>>>>> superclasses and subclasses of the named class to be printed. If you
>>>>> think that is useful, I think it can be added without too much trouble.
>>>>> 
>>>>> At the moment not much testing has been done other than running the
>>>>> DCMD and looking at the output. I'll do more once it's clear the code
>>>>> has "settled". I would like to know if there are any existing tests
>>>>> for GC.class_stats and GC.class_histogram (there are none in the
>>>>> "test" directory). If so, possibly one could serve as the basis for a
>>>>> new test for VM.class_hierarchy.
>>>>> 
>>>>> thanks,
>>>>> 
>>>>> Chris
>>> 
>> 
>> -- 
>> Frederic Parain - Oracle
>> Grenoble Engineering Center - France
>> Phone: +33 4 76 18 81 17
>> Email: Frederic.Parain at oracle.com
> 



More information about the hotspot-runtime-dev mailing list