RFR: 8277486: NMT: Cleanup ThreadStackTracker
Thomas Stuefe
stuefe at openjdk.java.net
Wed Nov 24 08:54:14 UTC 2021
On Wed, 24 Nov 2021 08:45:15 GMT, Thomas Stuefe <stuefe at openjdk.org> wrote:
>> There are several issues with `ThreadStackTracker`.
>>
>> 1) Following assertion:
>> `assert(MemTracker::tracking_level() >= NMT_summary, "Must be");`
>> in `ThreadStackTracker::record/release_thread_stack()` is unreliable. The full fence after downgrading tracking level is *not* sufficient to avoid the racy.
>>
>> 2) NMT tracking level is downgraded without `ThreadCritical` lock held. But, it does require `ThreadCritical` lock to be held when it actually downgrade internal tracking data structure, so checking internal state is reliable to determine current tracking state.
>> Add assertion to ensure correct tracking state
>>
>> 3) `_thread_counter` is updated with `ThreadCritical` lock, but is read without the lock. Change to atomic update to ensure reader will not read stale value.
>>
>> 4) NMT events are relative rare. So far, I have yet seen (1) assertion failure but add new test may help to spot such problem.
>
> src/hotspot/share/services/threadStackTracker.cpp line 103:
>
>> 101: assert(MemTracker::tracking_level() < NMT_summary, "Must be");
>> 102: return false;
>> 103: }
>
> I looked at `MemTracker::transition_to`, the downgrade path:
>
>
> } else if (current_level > level) {
> // Downgrade tracking level, we want to lower the tracking level first
> _tracking_level = level;
> // Make _tracking_level visible immediately.
> OrderAccess::fence();
> VirtualMemoryTracker::transition(current_level, level);
> MallocTracker::transition(current_level, level);
> ThreadStackTracker::transition(current_level, level);
>
> Is the comment about "immediately" correct? I would have assumed OrderAccess::fence() has not have anything to do with timing, just with the order the writes are perceptible from outside?
>
> So, concurrent threads may observe tracking level == minimal but the pointers to the infrastructure are still there?
>
> I see that you do not guard `MemTracker::transition_to` with ThreadCritical. You do guard teardown inside `ThreadStackTracker::transition` and `VirtualMemoryTracker::transition` but it looks like teardown in `MallocTracker::transition` is unguarded by ThreadCritical at least.
>
> Just a proposal, would guarding upstairs in `MemTracker::transition_to` not be clearer and safer, especially if it includes changing the tracking level?
>
> Alternatively, do we even have to delete all those structures on shutdown? They are not that costly. We shut down either manually - nobody cares then - or if we encounter native OOM. In the latter case, those detail structures are arguably important since they contain data that may help analyze the OOM. The malloc site table, for instance.
To add to that, I always wondered why we even expose `shutdown` via jcmd... especially since this is a one-way operation, so its use is limited.
-------------
PR: https://git.openjdk.java.net/jdk/pull/6504
More information about the hotspot-runtime-dev
mailing list