RFR: 8157023: Integrate NMT with JFR

Fri Dec 2 10:47:22 UTC 2022

On Fri, 2 Dec 2022 10:27:16 GMT, Stefan Johansson <sjohanss at openjdk.org> wrote:

>> Haven't looked at the details of `baseline(bool summaryOnly)` that much, but since it `summaryOnly = true` I don't think it actually walk the thread stacks, right? 
>> 
>> We don't do this at a safepoint but looking at `MallocMemorySnapshot::copy_to(...)` it uses `ThreadCritical` to avoid things being cleaned out at the same time. Not sure if there are other thread safety problems though. I would expect this to have the same problems as a summary report triggered through `jcmd` because that isn't run at a safepoint either. But I do see that when used from `jcmd` we take a lock to serialize the NMT query, so we should probably do the same here.
>> 
>>   // Query lock is used to synchronize the access to tracking data.
>>   // So far, it is only used by JCmd query, but it may be used by
>>   // other tools.
>>   static inline Mutex* query_lock() {
>>     assert(NMTQuery_lock != NULL, "not initialized!");
>>     return NMTQuery_lock;
>>   }
>> 
>> So we certainly need to look closer at this. I would like to understand why the query lock is needed.
>
> Looking a bit more at this. To me it seems like we are protecting the stored baseline with the lock is:
> 
> class MemTracker : AllStatic {
> ...
>   // Stored baseline
>   static MemBaseline      _baseline;
> ...
> };
> 
> That the `NMTDCmd` code uses through `MemTracker::get_baseline()`. So if the JFR events keep its own baseline we should be fine without the lock. Or am I missing something here?

Hi @kstefanj ,

> Haven't looked at the details of `baseline(bool summaryOnly)` that much, but since it `summaryOnly = true` I don't think it actually walk the thread stacks, right?

I think we do, since summary info contains information about committed thread stack size. I think the call chain is 

MemBaseline::baseline()->
MemBaseline::baseline_summary()->
VirtualMemorySummary::snapshot()->
VirtualMemoryTracker::snapshot_thread_stacks()->
VirtualMemoryTracker::walk_virtual_memory()->
SnapshotThreadStackWalker::do_allocation_site()->
RegionIterator::next_committed()->
os::committed_in_range()->
mincore loop

> 
> We don't do this at a safepoint but looking at `MallocMemorySnapshot::copy_to(...)` it uses `ThreadCritical` to avoid things being cleaned out at the same time. Not sure if there are other thread safety problems though. I would expect this to have the same problems as a summary report triggered through `jcmd` because that isn't run at a safepoint either.

You are right, I could have sworn we do this with a VM op. Must have confused this with metaspace printing.

> But I do see that when used from `jcmd` we take a lock to serialize the NMT query, so we should probably do the same here.
> 
> ```
>   // Query lock is used to synchronize the access to tracking data.
>   // So far, it is only used by JCmd query, but it may be used by
>   // other tools.
>   static inline Mutex* query_lock() {
>     assert(NMTQuery_lock != NULL, "not initialized!");
>     return NMTQuery_lock;
>   }
> ```
> 
> So we certainly need to look closer at this. I would like to understand why the query lock is needed.

To serialize jcmds running in parallel. Since jcmd is done by the attach listener thread, the only way I can think of would be if someone in parallel executes the command via mbean? Not sure.

Note that there is no way to get consistent counter values across all categories anyway. All the counters are updated atomically without lock, and that is all we can afford in malloc. Counters are not synchronized among themselves.

-------------

PR: https://git.openjdk.org/jdk/pull/11449