RFR: 8346123: [REDO] NMT should not use ThreadCritical [v5]

Coleen Phillimore coleenp at openjdk.org
Wed Dec 18 23:04:38 UTC 2024


On Wed, 18 Dec 2024 15:09:56 GMT, Robert Toyonaga <duke at openjdk.org> wrote:

>> This is a redo of [JDK-8304824](https://bugs.openjdk.org/browse/JDK-8304824) which was backed out by [JDK-8343726](https://bugs.openjdk.org/browse/JDK-8343726) due to problems documented in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244).
>> 
>> The problem was that `NmtVirtualMemoryLocker` was not locking when the current thread is not attached by checking `Thread::current_or_null_safe() != nullptr`. This is necessary during VM init, but should not be allowed afterward. NMT may be used in `attach_current_thread` before the current thread is set. The lock was not being acquired in that case, which intermittently caused NMT accounting to become corrupted, triggering various assertions when future NMT operations are done.  To fix this, I've adopted [Thomas' suggestion](https://github.com/openjdk/jdk/pull/21928#issuecomment-2460238057) to reverse the order of 
>> 
>> 
>> thread->register_thread_stack_with_NMT();
>> thread->initialize_thread_current();
>> 
>> 
>> in `attach_current_thread`.  This allows `NmtVirtualMemoryLocker` to be locked after current thread is set. 
>> 
>> To allow for `NmtVirtualMemoryLocker` to be used during VM init, I've replaced the `ConditionalMutexLocker` check `Thread::current_or_null_safe() != nullptr` with a new flag: `_done_bootstrap`. This flag prevents the lock from being used during VM init, at which point we are single threaded anyway. This avoids errors due to Hotspot mutexes and current thread not yet being ready. 
>> 
>> I also added new asserts in `virtualMemoryTracker.cpp` to catch future bugs like this where the lock is not held when it should be. I updated the appropriate VMT tests to also lock (there were a few cases where locking was being bypassed) so they can pass the new asserts.
>> 
>> I also removed the unused `_query_lock` variable in `MemTracker`.
>> 
>> Testing: 
>> 
>> - On Linux amd64, I was able to consistently reproduce the errors described in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244) by increasing the number of test threads in `java/lang/Thread/jni/AttachCurrentThread/AttachTest.java`. The test consistently passes with the new changes in this PR.
>> - hotspot_nmt , gtest:VirtualSpace, tier1
>
> Robert Toyonaga has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
> 
>  - Merge master
>  - move MemTracker::set_done_bootstrap()
>  - Update src/hotspot/share/utilities/vmError.cpp
>    
>    Co-authored-by: David Holmes <62092539+dholmes-ora at users.noreply.github.com>
>  - updates tests and remove old class
>  - use ConditionalMutexLocker directly
>  - Fix concurrency issue. Add assertions. Update tests.
>  - Revert "8343726: [BACKOUT] NMT should not use ThreadCritical"
>    
>    This reverts commit c3df050b88ecef123199a4e96f6d9884d064ae45.

Sorry if I'm repeating other comments, but it looks like a good change but could be a bit less repetition.  Thanks for removing this instance of ThreadCritical.

Oh yes, now I see, sorry I did repeat Kim's suggestion.

src/hotspot/share/runtime/mutexLocker.cpp line 292:

> 290:   MUTEX_DEFN(NMTQuery_lock                   , PaddedMutex  , safepoint);
> 291:   MUTEX_DEFN(NMTCompilationCostHistory_lock  , PaddedMutex  , nosafepoint);
> 292:   MUTEX_DEFN(NmtVirtualMemory_lock           , PaddedMutex  , service-4);

Why is this service-4?  Does it depend on being a rank lower than another lock?  If so, this and the SharedDecoder_lock should be declared below as MUTEX_DEFL and call out that lock.

src/hotspot/share/runtime/os.cpp line 2313:

> 2311:   bool result;
> 2312:   if (MemTracker::enabled()) {
> 2313:     ConditionalMutexLocker cml(NmtVirtualMemory_lock, MemTracker::is_done_bootstrap(), Mutex::_no_safepoint_check_flag);

This pattern is all over.  Can you create a class in nmtSomeHeader like:

class NMTLocker {
    ConditionalMutexLocker cml;
    NMTLocker();
};

in cpp file:
NmtLocker() : _cml(NmtVirtualMemory_lock, MemTracker::is_done_bootstrap(), Mutex::_no_safepoint_check_flag) {}

Or something like that, rather than this be repeated everywhere.  I only skimmed the other comments, so maybe David and Kim said the same thing.

-------------

Changes requested by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22745#pullrequestreview-2513014280
PR Comment: https://git.openjdk.org/jdk/pull/22745#issuecomment-2552430048
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1890930186
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1890932610


More information about the serviceability-dev mailing list