RFR: 8157023: Integrate NMT with JFR [v3]

Thomas Stuefe stuefe at openjdk.org
Tue Dec 6 13:09:21 UTC 2022


On Tue, 6 Dec 2022 11:54:37 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:

>> Please review this enhancement to include NMT information in JFR recordings.
>> 
>> **Summary**
>> Native Memory Tracking summary information can be obtained from a running VM using `jcmd` if started with `-XX:NativeMemoryTracking=summary/detail`. Using `jcmd` requires you to run a separate process and to parse the output to get the needed information. This change adds JFR events for NMT information to enable additional ways to consume the NMT data.
>> 
>> There are two new events added:
>> * _NativeMemoryUsage_ - The total native memory usage.
>> * _NativeMemoryUsagePart_ - The native memory usage for each component.
>> 
>> These events are sent periodically and by default the interval is 1s. This can of course be discussed, but that is the staring point. When NMT is not enabled on events will be sent.
>> 
>> **Testing**
>> * Added a simple test to verify that the events are sent as expected depending on if NMT is enabled or not.
>> * Mach5 sanity testing
>
> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Special snapshot for NMT events in JFR

Hi @kstefanj ,

Getting closer. See remarks. I did not look closely at the JFR side of things, just the NMT code.

Cheers, Thomas

src/hotspot/share/services/memJfrReporter.hpp line 47:

> 45: class MemJFRSnapshot : public AllStatic {
> 46: private:
> 47:   // The baseline age threshold in millie seconds. If older

s/millie seconds/milliseconds?

src/hotspot/share/services/memSnapshot.cpp line 59:

> 57:   // is deducted from mtChunk in the end to give correct values.
> 58:   ThreadCritical tc;
> 59:   MallocMemorySnapshot* ms = MallocMemorySummary::as_snapshot();

make `const MallocMemorySnapshot*` - this is the one true copy of the real counters, since as_snapshot does not copy. We don't want to modify it.

src/hotspot/share/services/memSnapshot.cpp line 64:

> 62:   for (int i = 0; i < mt_number_of_types; i++) {
> 63:     MEMFLAGS flag = NMTUtil::index_to_flag(i);
> 64:     MallocMemory* mm = ms->by_type(flag);

Make const too, and could you please add a const variant for MallocMemorySnapshot::by_type?

src/hotspot/share/services/memSnapshot.cpp line 68:

> 66:     total_arena_size +=  mm->arena_size();
> 67:   }
> 68:   assert(total_arena_size == ms->total_arena(), "Mismatch in accounting");

Could fail intermittently since all counters are updated asynchronously. The snapshot can be modified below us while we assemble this MemSnapshot. I think that's fine, but I would just use the accumulated counter here.

src/hotspot/share/services/memSnapshot.cpp line 93:

> 91:   // Total virtual memory size.
> 92:   _vm_total.reserved = vms->total_reserved();
> 93:   _vm_total.committed = vms->total_committed();

Here too, this is the live snapshot which can be modified while we iterate over its counters. If you want to appear consistent, I would add up committed and reserved above and use those instead of the total counter from vms.

src/hotspot/share/services/memSnapshot.cpp line 99:

> 97:   if (_snapshot_options.update_thread_stacks) {
> 98:     walk_thread_stacks();
> 99:   }

It does not make sense to walk_thread_stacks independently from the VM snapshot, since it piggy-backs on it. I would move this into the _snapshot_options.include_vm condition.

src/hotspot/share/services/memSnapshot.hpp line 42:

> 40: };
> 41: 
> 42: class MemSnapshot : public CHeapObj<mtNMT> {

Can we name this not Snapshot but something different? We already have MallocMemorySnapshot and methods like "as_snapshot", which have nothing to do with this layer.

proposals:

`NMTData`
`NMTStatistics`
`NMTStatSet`

.... 
....

-------------

Changes requested by stuefe (Reviewer).

PR: https://git.openjdk.org/jdk/pull/11449


More information about the hotspot-dev mailing list