RFR: 8157023: Integrate NMT with JFR
Carter Kozak
duke at openjdk.org
Thu Dec 1 11:32:19 UTC 2022
On Thu, 1 Dec 2022 10:48:51 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:
> Please review this enhancement to include NMT information in JFR recordings.
>
> **Summary**
> Native Memory Tracking summary information can be obtained from a running VM using `jcmd` if started with `-XX:NativeMemoryTracking=summary/detail`. Using `jcmd` requires you to run a separate process and to parse the output to get the needed information. This change adds JFR events for NMT information to enable additional ways to consume the NMT data.
>
> There are two new events added:
> * _NativeMemoryUsage_ - The total native memory usage.
> * _NativeMemoryUsagePart_ - The native memory usage for each component.
>
> These events are sent periodically and by default the interval is 1s. This can of course be discussed, but that is the starting point. When NMT is not enabled no events will be sent.
>
> **Testing**
> * Added a simple test to verify that the events are sent as expected depending on if NMT is enabled or not.
> * Mach5 sanity testing
src/hotspot/share/jfr/metadata/metadata.xml line 709:
> 707: </Event>
> 708:
> 709: <Event name="NativeMemoryUsagePart" category="Java Virtual Machine, Memory" label="Component Native Memory Usage" description="Native memory usage for a component" stackTrace="false" thread="false"
I found it odd that the native memory tracking “parts” aren’t named consistently. “Category” might be a more obvious term. I don’t have a strong opinion, just thought I’d mention my observation.
src/hotspot/share/services/memReporter.cpp line 832:
> 830:
> 831: MemBaseline usage;
> 832: usage.baseline(true);
In an ideal world we could report totals and category information from the same snapshot, but the periodic event infrastructure doesn’t seem to provide a great way to do that. I don’t think that’s a big deal, but it may be worth documenting to avoid confusion, since totals may be reported lower than the sum of the parts due to temporal drift.
An option I had considered is to define some form of larger “NMTSummary” event with reserved+committed fields for each category as well as the totals. The reporting structure would align with the existing NMT summary tooling, but the event would need to be kept up to date with new NMT types, which may not be worthwhile (unless we can create dynamic events using the C APIs, which would make things easier). It’s unclear how this data would be visualized out of the box. I don’t have as much context as I’d like around event design considerations, so there may be reasons I’m not aware of that make this suggestion a bad idea.
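To make the idea a bit more concrete, here is a rough standalone C++ sketch of a combined summary built from one snapshot. Every name in it (`CategoryUsage`, `NmtSummary`, `make_summary`) is hypothetical and is not a HotSpot or JFR API, and deriving the totals by summing the categories is an assumption made for illustration. The only point is that totals computed from the same snapshot as the per-category values cannot drift from their sum.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-category figures taken from one snapshot (not a HotSpot type).
struct CategoryUsage {
  std::string name;
  std::size_t reserved;
  std::size_t committed;
};

// Hypothetical payload for a single combined event: per-category values
// plus totals, all derived from the same snapshot.
struct NmtSummary {
  std::vector<CategoryUsage> categories;
  std::size_t total_reserved = 0;
  std::size_t total_committed = 0;
};

// Builds the summary from one snapshot. Assumption: the totals are simply
// the sums over all categories, so they always agree with the parts.
inline NmtSummary make_summary(const std::vector<CategoryUsage>& snapshot) {
  NmtSummary summary;
  summary.categories = snapshot;
  for (const CategoryUsage& c : snapshot) {
    summary.total_reserved += c.reserved;
    summary.total_committed += c.committed;
  }
  return summary;
}
```

The trade-off mentioned above still applies: a fixed-field event would need one reserved/committed pair per NMT type, whereas this sketch sidesteps that with a vector, which may not map onto how events are declared.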
src/hotspot/share/services/memReporter.cpp line 861:
> 859:
> 860: MemBaseline usage;
> 861: usage.baseline(true);
Perhaps we could collect event timing here, rather than reading the timer for each memory category? That way it’s easier to align events from a single snapshot.
For this sort of event, I’m not sure if it’s preferable to declare only a start time, start==end, or measure start and end time around the ‘baseline(true)’ call.
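As a sketch of what I mean, assuming nothing about the real JFR event classes: the standalone C++ below reads the clock once before and once after taking the snapshot, then stamps every per-category event with that same pair. `PartUsage`, `UsageEvent`, and `emit_events` are hypothetical names, and the `take_snapshot` callable merely stands in for the `baseline(true)` call.

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical per-category figures produced by one snapshot.
struct PartUsage {
  std::string name;
  std::size_t reserved;
  std::size_t committed;
};

// Hypothetical event record; the real event type would come from JFR.
struct UsageEvent {
  Clock::time_point start;
  Clock::time_point end;
  std::string category;
  std::size_t reserved;
  std::size_t committed;
};

// Reads the clock around the snapshot only, not once per category, so all
// events produced from one snapshot carry identical timestamps.
template <typename SnapshotFn>
std::vector<UsageEvent> emit_events(SnapshotFn take_snapshot) {
  const Clock::time_point start = Clock::now();
  const std::vector<PartUsage> snapshot = take_snapshot();
  const Clock::time_point end = Clock::now();

  std::vector<UsageEvent> events;
  events.reserve(snapshot.size());
  for (const PartUsage& part : snapshot) {
    events.push_back(UsageEvent{start, end, part.name, part.reserved, part.committed});
  }
  return events;
}
```

The start==end variant would pass `start` for both fields. Either way, a consumer could group events by timestamp to recover everything that came from a single snapshot.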
-------------
PR: https://git.openjdk.org/jdk/pull/11449
More information about the hotspot-dev mailing list