RFR: 8346123: [REDO] NMT should not use ThreadCritical [v6]

Coleen Phillimore coleenp at openjdk.org
Fri Dec 20 15:40:38 UTC 2024


On Thu, 19 Dec 2024 21:04:51 GMT, Robert Toyonaga <duke at openjdk.org> wrote:

>> This is a redo of [JDK-8304824](https://bugs.openjdk.org/browse/JDK-8304824) which was backed out by [JDK-8343726](https://bugs.openjdk.org/browse/JDK-8343726) due to problems documented in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244).
>> 
>> The problem was that `NmtVirtualMemoryLocker` was not locking when the current thread is not attached by checking `Thread::current_or_null_safe() != nullptr`. This is necessary during VM init, but should not be allowed afterward. NMT may be used in `attach_current_thread` before the current thread is set. The lock was not being acquired in that case, which intermittently caused NMT accounting to become corrupted, triggering various assertions when future NMT operations are done.  To fix this, I've adopted [Thomas' suggestion](https://github.com/openjdk/jdk/pull/21928#issuecomment-2460238057) to reverse the order of 
>> 
>> 
>> thread->register_thread_stack_with_NMT();
>> thread->initialize_thread_current();
>> 
>> 
>> in `attach_current_thread`.  This allows `NmtVirtualMemoryLocker` to be locked after current thread is set. 
>> 
>> To allow for `NmtVirtualMemoryLocker` to be used during VM init, I've replaced the `ConditionalMutexLocker` check `Thread::current_or_null_safe() != nullptr` with a new flag: `_done_bootstrap`. This flag prevents the lock from being used during VM init, at which point we are single threaded anyway. This avoids errors due to Hotspot mutexes and current thread not yet being ready. 
>> 
>> I also added new asserts in `virtualMemoryTracker.cpp` to catch future bugs like this where the lock is not held when it should be. I updated the appropriate VMT tests to also lock (there were a few cases where locking was being bypassed) so they can pass the new asserts.
>> 
>> I also removed the unused `_query_lock` variable in `MemTracker`.
>> 
>> Testing: 
>> 
>> - On Linux amd64, I was able to consistently reproduce the errors described in [JDK-8343244](https://bugs.openjdk.org/browse/JDK-8343244) by increasing the number of test threads in `java/lang/Thread/jni/AttachCurrentThread/AttachTest.java`. The test consistently passes with the new changes in this PR.
>> - hotspot_nmt , gtest:VirtualSpace, tier1
>
> Robert Toyonaga has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Revert to using NmtVirtualMemoryLocker. Use defaultStream. Add comment for SharedDecoder_lock

I had a few minor comments and hopefully easy changes to make.

src/hotspot/share/nmt/virtualMemoryTracker.cpp line 421:

> 419:   assert(_reserved_regions != nullptr, "Sanity check");
> 420:   assert(!MemTracker::is_done_bootstrap() || NmtVirtualMemory_lock->owned_by_self() , "Should have acquired NmtVirtualMemory_lock");
> 421: 

You could add this to MemTracker class like:
 ```
 inline static void assert_locked();


Then put the body  

void MemTracker::assert_locked() {
  assert(!is_bootstrapping_done() || NmtVirtualMemory_lock->owned_by_self(),
            "should have acquired NmtVirtualMemory_lock");
}

Then all these calls could be MemTracker::assert_locked().

src/hotspot/share/nmt/virtualMemoryTracker.cpp line 631:

> 629: 
> 630:   bool do_allocation_site(const ReservedMemoryRegion* rgn) {
> 631:     assert_lock_strong(NmtVirtualMemory_lock);

So this is past bootstrapping?

src/hotspot/share/runtime/mutexLocker.cpp line 292:

> 290:   MUTEX_DEFN(NMTCompilationCostHistory_lock  , PaddedMutex  , nosafepoint);
> 291:   MUTEX_DEFN(NmtVirtualMemory_lock           , PaddedMutex  , service-4); // Must be lower than G1Mapper_lock used from G1RegionsSmallerThanCommitSizeMapper::commit_regions
> 292: #if INCLUDE_CDS

Yes, this looks good.  I see why you couldn't move this one to MUTEX_DEFL.

-------------

Changes requested by coleenp (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/22745#pullrequestreview-2517782112
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1894080029
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1894080796
PR Review Comment: https://git.openjdk.org/jdk/pull/22745#discussion_r1894084698


More information about the serviceability-dev mailing list