<div dir="ltr">Hi Yi,<br><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Jan 12, 2023 at 9:29 AM Yi Yang <<a href="mailto:qingfeng.yy@alibaba-inc.com">qingfeng.yy@alibaba-inc.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><span>Thanks Ioi and Thomas for your valuable thoughts! I list some pros about metaspace dump.</span></div><div style="clear:both"><br><div style="clear:both">1. Standardization. The format of metadata dump is standard and well-formed. It could be seamlessly integrated with DevOps/Diagnose/APM platforms, while SA is interactively and the output of jcmd is not well-formed and parser-unfriendly, and its content is subject to change. You can't expect DevOps platform to use SA/coredump/gdb conveniently and automatically in production environment.</div></div></div></div></blockquote><div><br></div><div>I'm not convinced that JSON solves the standardization problem. These tools are mainly for hotspot devs, so they expose implementation details that change frequently. That runs the risk of breakage, regardless of the format. JSON will give you better parsing (at the expense of developer experience), but the content will still change. Your renderer needs to make sense of the data to do something useful beyond printing the raw JSON soup, so the renderer can break if the implementation changes. But it is not maintained by us, in contrast to our jcmds, so we cannot easily fix it. We see these kinds of problems even with JFR, which we maintain ourselves.<br><br>If standardization is a goal we take seriously, we'd need to carefully choose the information we expose, limiting it to what is stable. That limits the usefulness of the command. It also puts the onus of backward compatibility on us. 
Regression tests and all that. If we don't take standardization seriously, why do it at all? Why worsen the experience for hotspot devs to get machine-readable output?</div><div><br></div><div>We could add json output as an option beside the normal human-readable output. I could see this to be useful. But it increases code complexity. It would also be good to have this consistently for all jcmds, or all that produce table-like output.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both"><br></div><div style="clear:both">2. Functionality. MetaspaceDumpBeforeFullGC could generate a small dump for further debugging, it works as well as heap dump.</div></div></div></div></blockquote><div><br></div><div>Sounds useful.<br></div><div></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both"><br></div><div style="clear:both">3. JVM Metadata. Codecache dump, method counter, method data, they are unexplored scopes, reconstruct human-readable representation of compiled method.</div><div style="clear:both"><br></div><div style="clear:both">4. Complexity. In my humble opinion, all of this stuff such as by-chunktype/show-loaders/VM.classloader_stats/VM.classloader_hierarchy/etc could be done in other place. VM is eligible to provides a standard and rich raw metadata output, third-party parser and UI render display them in their way instead of continuously adding new filter/grouping/hierarcy/VM.method_counter features when we do want to know them. 
Heap dump is a good example, it dumps all objects/symbols/etc to a binary file, and third-party tools orchestrate them to histogram/thread/classloader/domtree.</div></div></div></div></blockquote><div><br></div><div>Heap dump is a good example of the limits of this approach, since the sheer size of the data can make heap dumps cumbersome and expensive to get. There is a niche for tools that quickly tell you what you need, without a lot of fuss, runtime costs, and intermediate steps, and that help you along the way during analysis.<br></div></div><div class="gmail_quote"><br></div><div class="gmail_quote">One pro of doing rendering in hotspot instead of dumping the whole thing and letting tool providers figure it out is that the backend developers know their backend best. They write analysis tools for themselves (e.g. VM.metaspace was carefully designed to diagnose the common Metaspace OOMEs quickly). They can also get a quick turnaround. Of course, there is a limit somewhere, which is why we have JFR and JMC. </div><div class="gmail_quote"><br></div><div class="gmail_quote">Tool providers need to understand the backend in order to intelligently display internal status. Then, they need to agree with hotspot devs on the data to be provided, and then render it in an intelligent way. E.g. if you were to provide a "waste" section as useful as the one from VM.metaspace, we would need to expose a lot of internals to the tool provider. 
The interface would be broad and very brittle.<br></div><div class="gmail_quote"><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both"><br></div><div style="clear:both">Basically, I know the content of metaspace dump file has overlap with some existing tools such as SA/jcmd/coredump/gdb/HeapDump, as Thomas commented inlinely, I don’t think metaspace dump can troubleshoot many problems that it is the only solution. I think metadata dump is more about providing a standardized and parser-friendly framework, so that users at all levels can inspect JVM metadata information they care about. In addition, something like M(etaspace)AT could orchestrate dump content with filter/grouping/hierarchy options.</div><div style="clear:both"><br></div><div style="clear:both">> Also you mentioned that "Internally we implemented a metaspace dump that generates human-readable text". Can you share how this tool was implemented?</div><div style="clear:both"><br></div><span>That's not surprising, it iterates CLD/Classes/etc and dumps basic information about metaspace, a demo metaspace dump can be found at <a href="https://gist.github.com/y1yang0/683d8a58dd946b3e9180682863df55ea" target="_blank">https://gist.github.com/y1yang0/683d8a58dd946b3e9180682863df55ea</a><br></span></div><div style="clear:both"><br></div></div></div></blockquote><div><br></div><div>In that case, it misses an important part of the picture that VM.metaspace provides (fragmentation and dead-memory analysis).</div><div><br></div><div>---</div><div><br></div><div>I am not opposed to adding JSON-format output to jcmd, but not at the expense of the current human-readable reports. There are still a lot of unanswered questions. 
Let's see what others think.<br></div><div></div><div><br></div><div>A jcmd that dumps metadata could be useful as an addition to existing commands if we can keep the overlap small.</div><div><br></div>Cheers, Thomas<br><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"></div><blockquote style="margin-right:0px;margin-top:0px;margin-bottom:0px;font-family:Tahoma,Arial,STHeiti,SimSun;font-size:14px;color:rgb(0,0,0)"><div style="clear:both">------------------------------------------------------------------</div><div style="clear:both">From:Thomas Stüfe <<a href="mailto:thomas.stuefe@gmail.com" target="_blank">thomas.stuefe@gmail.com</a>></div><div style="clear:both">Send Time:2023 Jan. 12 (Thu.) 15:06</div><div style="clear:both">To:"YANG, Yi" <<a href="mailto:qingfeng.yy@alibaba-inc.com" target="_blank">qingfeng.yy@alibaba-inc.com</a>></div><div style="clear:both">Cc:HotSpot Open Source Developers <<a href="mailto:hotspot-dev@openjdk.java.net" target="_blank">hotspot-dev@openjdk.java.net</a>>; hotspot-runtime-dev <<a href="mailto:hotspot-runtime-dev@openjdk.java.net" target="_blank">hotspot-runtime-dev@openjdk.java.net</a>>; hotspot-dev <<a href="mailto:hotspot-dev@openjdk.org" target="_blank">hotspot-dev@openjdk.org</a>></div><div style="clear:both">Subject:Re: RFC: regarding metaspace(metadata?) dump</div><div style="clear:both"><br></div><div><div><div>Hi Yi,</div></div><div><br></div><div>A lot of what you are trying to do already exists. For example, we
also have `VM.metaspace`. This is quite a powerful command for analyzing
Metaspace-related issues, especially for fragmentation and
other kinds of waste. Generally speaking, it is the tool you use to look at the underpinnings of metaspace, the allocator, while tools like `VM.classloaders`, `VM.classloader_stats` and `VM.classes` look at things "from above", e.g. walk the CLDG. All these tools already have a bit of overlap.<br></div><div><br></div><div>For analyzing OOMEs, you need several tools, since they can be caused by multiple issues. E.g. tools that walk the CLDG don't see fragmentation, or unclaimed metaspace for dead loaders.<br></div><div><br></div><div>Please find more remarks inline.<br></div><div><br></div><div class="gmail_quote"><div class="gmail_attr">On Wed, Jan 11, 2023 at 1:56 PM Yi Yang <<a href="mailto:qingfeng.yy@alibaba-inc.com" target="_blank">qingfeng.yy@alibaba-inc.com</a>> wrote:<br></div><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both">Hi,<br></div><div style="clear:both"><br></div><div style="clear:both"><div style="clear:both">Internally, we often receive feedback from users and ask for help on metaspace-related issues, for example</div><div style="clear:both">1. Users are eager to know which GroovyClassLoader loads which classes, why they are not unloaded,</div><div style="clear:both">and why they are leading to Metaspace OOME.</div></div></div></div><div><br></div><div>There are several tools to do this, for example:</div><div><br></div><div>`VM.metaspace show-loaders show-classes` <br></div><div>`VM.classloaders show-classes` </div><div><br></div><div>both show you loaded classes by loader, and the former also shows you metaspace stats needed to understand OOMEs. None of these tools shows you why loaders are kept alive; for that you need heap and GC-root analysis. 
This quickly enters the territory of Eclipse MAT and similar tools, where having a text-based tool alone gets cumbersome.<br></div><div></div><div> </div><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both">2. They want to know the class structure of dynamically generated classes in some scenarios such as </div><div style="clear:both">deserialization</div></div></div></div><div><br></div><div>Interesting. This seems to be a very specific query; not sure how general the need for this is. `VM.classes -verbose` shows a part of the story. <br></div><div> </div><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both">3. Finding memory leaking about duplicated classes</div></div></div></div><div><br></div><div>Again, <br></div><div><br></div><div><div>`VM.metaspace show-loaders show-classes` <br></div><div>`VM.classloaders show-classes` <br></div><div><br></div><div>but also `VM.classes` and `VM.classloader_stats` are your friends here.<br></div><div><br></div></div><div> </div><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both">...</div><div style="clear:both">Internally we implemented a metaspace dump that generates human-readable text, it looks something like this:</div><div style="clear:both"><br></div><div style="clear:both">[Basic Information]</div><div style="clear:both">Dump Reason : JCMD</div><div style="clear:both">MaxMetaspaceSize : 18446744073709547520 B</div><div style="clear:both">CompressedClassSpaceSize : 1073741824 B</div><div style="clear:both">Class Space Used : 309992 B</div><div style="clear:both">Class Space Capacity : 395264 B</div><div style="clear:both">...</div><div style="clear:both">[Class Loader Data]</div><div style="clear:both">ClassLoaderData : loader = 0x000000008024f928, loader_klass = 0x0000000800010098, 
loader_klass_name = </div><div style="clear:both">sun/misc/Launcher$AppClassLoader, label = N/A</div><div style="clear:both"> Class Used Chunks :</div><div style="clear:both"> * Chunk : [0x0000000800060000, 0x0000000800060230, 0x0000000800060800)</div><div style="clear:both"> NonClass Used Chunks :</div><div style="clear:both"> * Chunk : [0x00007fd8379c1000, 0x00007fd8379c1350, 0x00007fd8379c2000)</div><div style="clear:both"> Klasses :</div><div style="clear:both"> Klass : 0x0000000800060028, name = Test, size = 520 B</div><div style="clear:both"> ConstantPool : 0x00007fd8379c1050, size = 296 B</div><div style="clear:both">...</div><div style="clear:both"><br></div></div></div></div><div><br></div><div>`VM.metaspace` shows you the chunk composition of arenas if needed.<br></div><div><br></div><div>E.g. : `VM.metaspace by-chunktype show-loaders`<br></div><div><br></div><div>```</div><div>Usage per loader: <br> <br> 1: CLD 0x00007f72fc29b820: "app" instance of jdk.internal.loader.ClassLoaders$AppClassLoader <br> Loaded classes: <br> 1: de.stuefe.repros.MiscUtils$$Lambda$1/0x0000000801001448 <br> 2: de.stuefe.repros.MiscUtils <br> 3: de.stuefe.repros.Simple2 <br> 4: de.stuefe.repros.Simple <br> 5: de.stuefe.repros.SimpleBase <br> 6: de.stuefe.repros.I2 <br> 7: de.stuefe.repros.I1<br> -total-: 7 classes<br> Non-Class: <br> Usage by chunk level:<br> 4m chunks: (none)<br> 2m chunks: (none)<br> 1m chunks: (none)<br> 512k chunks: (none)<br> 256k chunks: (none)<br> 128k chunks: (none)<br> 64k chunks: (none)<br> 32k chunks: (none)<br> 16k chunks: (none)<br> 8k chunks: 1 chunk, 8,00 KB capacity, 8,00 KB (100%) committed, 8,00 KB (100%) used, 0 bytes ( 0%) free, 0 bytes ( 0%) waste <br> 4k chunks: 1 chunk, 4,00 KB capacity, 4,00 KB (100%) committed, 256 bytes ( 6%) used, 3,75 KB ( 94%) free, 0 bytes ( 0%) waste <br> 2k chunks: (none)<br> 1k chunks: (none)<br> -total-: 2 chunks, 12,00 KB capacity, 12,00 KB (100%) committed, 8,25 KB ( 69%) used, 3,75 KB ( 31%) free, 0 bytes ( 
0%) waste <br> deallocated: 1 blocks with 24 bytes<br><br> Class: <br> Usage by chunk level:</div><div> .... and so forth<br></div><div>```</div><div><br></div><div>but for analyzing potential fragmentation issues (which have been rare since JEP 387) the "Waste" section at the end of the printout is much more helpful, e.g.:</div><div><br></div><div>```</div><div>Waste (unused committed space):(percentages refer to total committed size 384,00 KB):<br> Waste in chunks in use: 0 bytes ( 0%)<br> Free in chunks in use: 85,01 KB ( 22%)<br> In free chunks: 0 bytes ( 0%)<br>Deallocated from chunks in use: 928 bytes ( <1%) (3 blocks)<br> -total-: 85,91 KB ( 22%)<br></div><div>```<br></div><div></div><div></div><div><br></div><div><div style="line-height:1.7;font-family:tahoma;font-size:14px;color:rgb(0,0,0)"><div style="clear:both"><div style="clear:both"></div><div style="clear:both">It has been working effectively for several years and has helped many users solve metaspace-related problems.</div><div style="clear:both">But a more user-friendly way is that JDK can inherently support this capability. We hope that format of the metaspace</div><div style="clear:both">dump file can take both flexibility and compatibility into account, and the content of dump file should be detailed</div><div style="clear:both">enough to meet the needs of both application developers and lower-level developers.</div><div style="clear:both"><br></div><div style="clear:both">Based on above considerations, I think using JSON as its file format is an appropriate solution(But XML or binary </div><div style="clear:both">format are still not excluded as candidates). 
Specifically, in earlier thoughts, I thought the format of the metaspace</div><div style="clear:both">file could be as follows(pretty printed)</div><div style="clear:both"><br></div><div style="clear:both"><a href="https://gist.github.com/y1yang0/ab3034b6381b8a9d215602c89af4e9c3" target="_blank">https://gist.github.com/y1yang0/ab3034b6381b8a9d215602c89af4e9c3</a></div><div style="clear:both"><br></div><div style="clear:both">Using the JSON format, we can flexibly add new fields without breaking compatibility. It is debatable as to which data</div><div style="clear:both">to write. We can reach a consensus that third-party parsers(Metaspace Analyzer Tool) can at least reconstruct Java</div><div style="clear:both">source code from the dump file. Based on this, we can write more useful information for low-level troubleshooting</div><div style="clear:both">or debugging. (e.g. the init_state of InstanceKlass).</div><div style="clear:both"> In addition, we can even output the native code and associated information with regard to Method, third-party parser</div><div style="clear:both"> can reconstruct the human-readable assembly representation of the compiled method based on dump file. To some extent,</div><div style="clear:both">we have implemented code cache dump by the way. For this reason, I'm not sure if the title of the RFC proposal should</div><div style="clear:both">be called metaspace dump, maybe metadata dump? It looks more like a metadata-dump framework.</div><div style="clear:both"><br></div><div style="clear:both">Do you have any thoughts about metaspace/metadata dump? 
Looking forward to hearing your feedback, any comments are invaluable!</div><div style="clear:both"><br></div><div style="clear:both">Best regards,</div><span>Yi Yang</span></div></div></div><div><br></div><div><br></div><div>Analyzing the structure of generated classes sounds interesting, and could help with analyzing issues with bytecode instrumentation tools.</div><div><br></div><div>For analyzing general metaspace OOMEs we are already covered quite well. Not perfect, but your proposal does intersect with existing tools a lot. To keep code complexity down, I'd rather avoid adding duplicate features.</div><br><div>Cheers, Thomas<br></div><div><br></div><div> </div></div></div>
</blockquote></div></div></blockquote></div></div>