<div class="__aliyun_email_body_block"><div style="line-height:1.7;font-family:tahoma;font-size:14.0px;color:#000000;"><div style="clear:both;"><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;color:#000000;font-family:tahoma;font-size:14.0px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:.0px;text-transform:none;white-space:normal;word-spacing:.0px;text-decoration-thickness:initial;text-decoration-style:initial;text-decoration-color:initial;clear:both;">Hi Ioi,<br ><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;color:#000000;font-family:tahoma;font-size:14.0px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:.0px;text-transform:none;white-space:normal;word-spacing:.0px;text-decoration-thickness:initial;text-decoration-style:initial;text-decoration-color:initial;clear:both;"><span style="margin:.0px;padding:.0px;border:.0px;outline:.0px;">> I think there are overlaps between your proposal and existing tools. 
For example, there are jcmd options such as VM.class_hierarchy and VM.classes, etc.</span><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">> The Serviceability Agent can also be used to analyze the contents of the class metadata.</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">Of course, we could keep adding jcmd commands such as jcmd VM.method_counter and jcmd VM.aggregate_by_class_package to help with diagnosis. A more general, once-and-for-all solution is to implement a rich and well-formed metadata dump as described in this proposal. Third-party parsers and platforms can then analyze the well-formed dump file and provide many grouping/filtering options (grouping_by_package, filter_linked, filter_force_inline; essentially, VM.class_hierarchy is an aggregation of VM.classes).</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">Here is a real use case that illustrates the benefits of a well-formed metaspace dump: on our internal DevOps platform, I observed that the Metaspace utilization of my application had been high. During this period, full GC (FGC) occurred several times. 
So I trigger a well-formed metaspace dump through the DevOps platform; the dump file is generated and uploaded automatically to another internal Java troubleshooting platform, which analyzes it further and presents it with many grouping and filtering options.</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">> I'd be interested in seeing your implementation and comparing it with the existing tools.</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">I'm starting on this, and it may take several months to implement since it looks more like a JEP-level feature. I'd like to hear some general discussion before coding, e.g.: Is it acceptable to use the JSON format? Should it be a metadata dump, or should it keep the current metaspace scope? Do you think a basic+extended output for the internal structures is acceptable?</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">> This may be quite difficult, because the metadata contains rewritten Java bytecodes. The rewriting format may be dependent on the JDK version. Also, the class linkage (the resolution of constant pool information) will be vastly different from one JDK version to another. So writing a third-party tool that can work with multiple JDK versions will be quite hard.</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">Thanks for your input! Maybe we could display the rewritten bytecodes? 
Anyway, I'll take a closer look at this and prepare a POC, along with a dump parser and a simple web UI for diagnosis, and share them once ready.</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">> Also, defining a "portable" format for the dump will be difficult, since we don't know how the internal data structure will evolve in the future.</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"><br ></div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">Yes. Since we don't know how the internal data structures will change in the future, I propose reaching a consensus that the dump should at least allow reconstructing the Java (rewritten?) source code as much as possible. For example, the dumped JSON object for InstanceKlass contains two parts: the first part contains the information necessary to reconstruct the source code as much as possible, and the second part is extended information, like this:</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">{</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> name:..,</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> super:..,</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> flags:...,</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> method:[],</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> interface:[],</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> fields:[],</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> annotation:[],</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> bytecode:[],</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> constantpool:[],</div><div 
style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> //extend</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> init_state:...,</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;"> init_thread:...,</div><div style="margin:.0px;padding:.0px;border:.0px;outline:.0px;clear:both;">}</div><span style="margin:.0px;padding:.0px;border:.0px;outline:.0px;">The first part stays basically unchanged (or only gains new fields), while the extended part is subject to change. A dump visualization client checks whether the fields of the JSON objects are defined and displays them accordingly.<br ></span></div></div><div style="clear:both;"><br /></div><blockquote style="margin-right:0;margin-top:0;margin-bottom:0;font-family:Tahoma,Arial,STHeiti,SimSun;font-size:14.0px;color:#000000;"><div style="clear:both;">------------------------------------------------------------------</div><div style="clear:both;">From:Ioi Lam <ioi.lam@oracle.com></div><div style="clear:both;">Send Time:2023 Jan. 12 (Thu.) 08:15</div><div style="clear:both;">To:hotspot-runtime-dev <hotspot-runtime-dev@openjdk.org>; serviceability-dev@openjdk.java.net <serviceability-dev@openjdk.java.net></div><div style="clear:both;">Subject:Re: RFC: regarding metaspace(metadata?) dump</div><div style="clear:both;"><br /></div><head >
</head>
CC-ing serviceability.<br >
<br >
Hi Yi,<br >
<br >
In general, I think it's good to have tools for understanding the
internal layout of the class metadata.<br >
<br >
I think there are overlaps between your proposal and existing tools.
For example, there are jcmd options such as VM.class_hierarchy and
VM.classes, etc.<br >
<br >
The Serviceability Agent can also be used to analyze the contents of
the class metadata.<br >
<br >
Did you look at the existing tools and see how they match up with
your requirements?<br >
<br >
I'd be interested in seeing your implementation and comparing it with
the existing tools.<br >
<br >
<br >
<div class="moz-cite-prefix">On 1/11/2023 4:56 AM, Yi Yang wrote:<br >
</div>
<div class=" __aliyun_node_has_color" style="line-height:1.7;font-family:tahoma;font-size:14.0px;color:#000000;">
<div style="clear:both;">Hi,<br >
</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">
<div style="clear:both;">Internally, we often receive
feedback and requests for help from users on metaspace-related
issues, for example:</div>
<div style="clear:both;">1. Users are eager to know which
GroovyClassLoader loads which classes, why they are not
unloaded,</div>
<div style="clear:both;">and why they lead to
Metaspace OOME.</div>
<div style="clear:both;">2. They want to know the class
structure of dynamically generated classes in some
scenarios such as </div>
<div style="clear:both;">deserialization.</div>
<div style="clear:both;">3. They want to find memory leaks caused by
duplicated classes.</div>
<div style="clear:both;">...</div>
<div style="clear:both;">Internally, we implemented a
metaspace dump that generates human-readable text; it
looks something like this:</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">[Basic Information]</div>
<div style="clear:both;">Dump Reason : JCMD</div>
<div style="clear:both;">MaxMetaspaceSize :
18446744073709547520 B</div>
<div style="clear:both;">CompressedClassSpaceSize :
1073741824 B</div>
<div style="clear:both;">Class Space Used : 309992 B</div>
<div style="clear:both;">Class Space Capacity : 395264 B</div>
<div style="clear:both;">...</div>
<div style="clear:both;">[Class Loader Data]</div>
<div style="clear:both;">ClassLoaderData : loader =
0x000000008024f928, loader_klass = 0x0000000800010098,
loader_klass_name = </div>
<div style="clear:both;">sun/misc/Launcher$AppClassLoader,
label = N/A</div>
<div style="clear:both;"> Class Used Chunks :</div>
<div style="clear:both;"> * Chunk : [0x0000000800060000,
0x0000000800060230, 0x0000000800060800)</div>
<div style="clear:both;"> NonClass Used Chunks :</div>
<div style="clear:both;"> * Chunk : [0x00007fd8379c1000,
0x00007fd8379c1350, 0x00007fd8379c2000)</div>
<div style="clear:both;"> Klasses :</div>
<div style="clear:both;"> Klass : 0x0000000800060028,
name = Test, size = 520 B</div>
<div style="clear:both;"> ConstantPool :
0x00007fd8379c1050, size = 296 B</div>
<div style="clear:both;">...</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">It has been working effectively for
several years and has helped many users solve
metaspace-related problems.</div>
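<div style="clear:both;"><br >
</div>
<div style="clear:both;">As a rough illustration of how text
output in this shape can be post-processed, here is a minimal Python
sketch that sums Klass sizes per class name. The regular expression is
inferred only from the excerpt above, and klass_sizes is a hypothetical
helper, not part of any existing tool.</div>

```python
import re

# Pattern inferred from the sample line
#   "Klass : 0x0000000800060028, name = Test, size = 520 B"
# The real dump may contain variants this sketch does not handle.
KLASS_LINE = re.compile(
    r"Klass\s*:\s*(0x[0-9a-fA-F]+),\s*name\s*=\s*(\S+),\s*size\s*=\s*(\d+)\s*B"
)


def klass_sizes(dump_text):
    """Return a dict mapping class name to total Klass size in bytes."""
    sizes = {}
    for _address, name, size in KLASS_LINE.findall(dump_text):
        sizes[name] = sizes.get(name, 0) + int(size)
    return sizes


sample = """
 Klasses :
   Klass : 0x0000000800060028, name = Test, size = 520 B
     ConstantPool : 0x00007fd8379c1050, size = 296 B
"""
print(klass_sizes(sample))
```

<div style="clear:both;">A class name that shows up with an
unexpectedly large total (for example, the same class loaded by many
loaders) would stand out in such a summary.</div>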
<div style="clear:both;">But it would be more user-friendly
if the JDK supported this capability natively. We hope
that the format of the metaspace</div>
<div style="clear:both;">dump file can take both flexibility
and compatibility into account, and that the content of the dump
file is detailed</div>
<div style="clear:both;">enough to meet the needs of both
application developers and lower-level developers.</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">Based on the above considerations, I
think JSON is an appropriate
file format (though XML or a binary </div>
<div style="clear:both;">format is not ruled out as
a candidate). Specifically, my initial idea is that
the format of the metaspace</div>
<div style="clear:both;">dump file could be as follows (pretty
printed):</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;"><a class="moz-txt-link-freetext" href="https://gist.github.com/y1yang0/ab3034b6381b8a9d215602c89af4e9c3" target="_blank">https://gist.github.com/y1yang0/ab3034b6381b8a9d215602c89af4e9c3</a></div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">Using the JSON format, we can
flexibly add new fields without breaking compatibility.
Which data to write is debatable.</div>
<div style="clear:both;">I suggest we agree
that third-party parsers (e.g. a Metaspace Analyzer Tool) should at
least be able to reconstruct Java</div>
<div style="clear:both;">source code from the dump file. </div>
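<div style="clear:both;"><br >
</div>
<div style="clear:both;">To illustrate the compatibility
point, here is a minimal Python sketch of a tolerant reader: it
requires a few basic fields, displays extended fields only when the
dump defines them, and skips fields it does not know. The field names
(name, super, flags, init_state, init_thread) are borrowed from
examples in this thread; the actual schema is undecided, and
describe_klass and the sample values are hypothetical.</div>

```python
import json

# Field names are taken from examples in this thread; the real schema
# is not settled, so treat all of them as assumptions.
BASIC_FIELDS = ("name", "super", "flags")        # expected to stay stable
KNOWN_EXTENDED = ("init_state", "init_thread")   # may change between JDKs


def describe_klass(obj):
    """Summarize one dumped class object, tolerating schema drift."""
    missing = [f for f in BASIC_FIELDS if f not in obj]
    if missing:
        raise ValueError("missing basic fields: " + ", ".join(missing))
    text = "{} extends {} (flags={})".format(
        obj["name"], obj["super"], obj["flags"]
    )
    # Extended fields are shown only when the dump defines them;
    # fields unknown to this reader are silently skipped.
    for field in KNOWN_EXTENDED:
        if field in obj:
            text += " {}={}".format(field, obj[field])
    return text


# Illustrative inputs: an "older" dump without extended data and a
# "newer" one carrying an extra field this reader does not know about.
older = json.loads('{"name": "Test", "super": "java/lang/Object", "flags": 1}')
newer = json.loads(
    '{"name": "Test", "super": "java/lang/Object", "flags": 1,'
    ' "init_state": "fully_initialized", "future_field": 42}'
)
print(describe_klass(older))
print(describe_klass(newer))
```

<div style="clear:both;">With this approach, the same reader
handles both dumps: adding a field does not break it, and dropping an
extended field only removes that item from the display.</div>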
</div>
</div>
<br >
This may be quite difficult, because the metadata contains rewritten
Java bytecodes. The rewriting format may be dependent on the JDK
version. Also, the class linkage (the resolution of constant pool
information) will be vastly different from one JDK version to another. So
writing a third-party tool that can work with multiple JDK
versions will be quite hard. Also, defining a "portable" format for
the dump will be difficult, since we don't know how the internal
data structure will evolve in the future.<br >
<br >
Thanks<br >
- Ioi<br >
<br >
<br >
<div class=" __aliyun_node_has_color" style="line-height:1.7;font-family:tahoma;font-size:14.0px;color:#000000;">
<div style="clear:both;">
<div style="clear:both;">Based on this, we can write more
useful information for low-level troubleshooting</div>
<div style="clear:both;">or debugging (e.g. the init_state
of InstanceKlass).</div>
<div style="clear:both;">In addition, we can even output
the native code and associated information for a
Method, so that a third-party parser</div>
<div style="clear:both;">can reconstruct the human-readable
assembly representation of the compiled method from the
dump file. To some extent,</div>
<div style="clear:both;">this also gives us a code cache dump
as a by-product. For this reason, I'm not sure whether
the RFC proposal should</div>
<div style="clear:both;">be titled metaspace dump; maybe
metadata dump? It looks more like a metadata-dump
framework.</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">Do you have any thoughts about
metaspace/metadata dump? I look forward to hearing your
feedback; any comments are valuable!</div>
<div style="clear:both;"><br >
</div>
<div style="clear:both;">Best regards,</div>
<span >Yi Yang</span></div>
</div>
<br >
</blockquote></div></div>