RFR: 8343244: Aarch64: Internal Error virtualMemoryTracker.cpp:444 assert(reserved_rgn != nullptr) failed: Add committed region, No reserved region found
Robert Toyonaga
duke at openjdk.org
Wed Nov 6 15:00:13 UTC 2024
On Wed, 6 Nov 2024 14:01:43 GMT, Johan Sjölen <jsjolen at openjdk.org> wrote:
> Hi,
>
> Ping @roberttoyonaga , @tstuefe , @afshin-zafari , @dholmes-ora , @gerard-ziemski
>
> We are seeing issues with the transition to one of our 'fat' mutexes. Specifically, we do have `os::commit_memory` that occurs during the running `Thread`'s initialization, see the following stack trace:
>
>
> Stack: [0x0000ffff686ae000,0x0000ffff687ae000], sp=0x0000ffff687ac5f0, free space=1017k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V [libjvm.so+0x16aa67c] VirtualMemoryTracker::add_committed_region(unsigned char*, unsigned long, NativeCallStack const&)+0x38c (virtualMemoryTracker.cpp:444)
> V [libjvm.so+0x131134c] os::commit_memory(char*, unsigned long, bool)+0x198 (memTracker.hpp:165)
> V [libjvm.so+0x14ddf54] StackOverflow::create_stack_guard_pages()+0x70 (stackOverflow.cpp:97)
> V [libjvm.so+0xe2e0dc] attach_current_thread.isra.0+0x11c (jni.cpp:3809)
> C [libExplicitAttach.so+0xa64] thread_main+0x34 (libExplicitAttach.c:45)
> C [libc.so.6+0x806b8] start_thread+0x2d8
>
>
> This, in turn, causes issues where the lock isn't actually taken, causing issues with the VMT. I'd like to suggest that we switch out the `Mutex` to a `PlatformMutex`. Platform mutexes are identical in construction to `ThreadCritical`, but are at least not shared globally and can be owned by the `VirtualMemoryTracker`.
>
> If you do not agree with this change, then I suggest that we back out [8304824](https://bugs.openjdk.org/browse/JDK-8304824) ASAP and wait for a better solution.
>
> Thanks.
@jdksjolen So just to clarify, the issue is due to `NmtVirtualMemoryLocker` not locking since `Thread::current_or_null_safe` is `nullptr` during thread initialization, and `PlatformMutex` isn't affected by this since it is lower level and performs less checks (we can just set the owner to nullptr)?
If so, I think this change makes sense. But it's a bit unfortunate since the reason for using a Hotspot mutex was for the extra safety checks/features.
Maybe it's also worth considering either removing the `ThreadCritical` related to NMT in arena.cpp or replacing them with `VirtualMemoryTracker::Locker` (even though they're related to malloc, not virtual memory), as per this comment https://github.com/openjdk/jdk/pull/20852#issuecomment-2450515494 ?
src/hotspot/share/nmt/nmtCommon.hpp line 144:
> 142: // Same as MutexLocker but can be used during VM init.
> 143: // Performs no action if given a null mutex or with detached threads.
> 144: class NmtVirtualMemoryLocker: public ConditionalMutexLocker {
Maybe you can remove `#include "runtime/mutexLocker.hpp"` as well
src/hotspot/share/runtime/mutexLocker.cpp line 296:
> 294: MUTEX_DEFN(ThreadsSMRDelete_lock , PaddedMonitor, service-2); // Holds ConcurrentHashTableResize_lock
> 295: MUTEX_DEFN(ThreadIdTableCreate_lock , PaddedMutex , safepoint);
> 296: MUTEX_DEFN(SharedDecoder_lock , PaddedMutex , service-5);
This was originally of rank `tty-1`. But it was lowered due to rank errors after replacing `ThreadCritical` with `NmtVirtualMemory_lock`. It probably makes sense to keep this lower rank now, since the actual locking locations have not changed even though `NmtVirtualMemory_lock` is removed. But maybe there should be a comment explaining that this low rank is needed for lock ordering with NMT `MemTracker::print_containing_region`
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21928#issuecomment-2459894337
PR Comment: https://git.openjdk.org/jdk/pull/21928#issuecomment-2459960163
PR Review Comment: https://git.openjdk.org/jdk/pull/21928#discussion_r1831150401
PR Review Comment: https://git.openjdk.org/jdk/pull/21928#discussion_r1831148321
More information about the hotspot-dev
mailing list