RFR: 8356173: Remove ThreadCritical

Fri May 9 06:51:51 UTC 2025

On Tue, 6 May 2025 20:32:05 GMT, Coleen Phillimore <coleenp at openjdk.org> wrote:

> Updated the description in the bug.  This removes the last use of ThreadCritical and replaces it with a global PlatformMutex lock.
> Tested with tier1-4, and tier1 on all Oracle-supported OSs.

I like that we use PlatformMutex, not Mutex; I think that is a very reasonable approach. We forego deadlock detection but gain independence wrt initialization order.

Note that if you use PlatformMutex, there is no reason to dynamically allocate the mutex; just allocate as a global variable.

There is a small danger associated with this proposal in that ThreadCritical was allowing for recursive entry, PlatformMutex does not. On Posix it uses pthread_mutex_xxx, which on some platforms has the ability to work recursively; but on Windows we use critical sections, there its not possible. (we could use Windows Mutex instead).

Why not add a debug-only information to the Locker class to hold the owning thread id, and assert on enter with my own?

But actually the usage of the Locker is rather safe; The only real concern with potential recursive entry is with guarding os::free, and with potential deadlocks should we crash inside a signal handler (error handling or AsyncGetCallTrace or that new "CPU Time Profiling for JFR" JEP). 

For these signal handler usages, I had this idea: https://bugs.openjdk.org/browse/JDK-8349578 - a double-buffering approach where we, upon entering the signal handler, open up a new ResourceArea for the current thread, and return to the original RA upon leaving signal handling.

About guarding os::free, that is needed for a dreary issue (see https://bugs.openjdk.org/browse/JDK-8325890). In short, it is needed to get stable arena memory readings. 

However, I would love to just get rid of arena memory accounting altogether; the solution is very hackish and probably not needed anymore. The only heavy user of chunks causing repeated problems that caused me to use NMT at customers was the JIT. And I added the compilation memory statistic last year, so we have not a much much better tool to investigate JIT arena usage.

We also have an issue https://bugs.openjdk.org/browse/JDK-8333151, which investigates whether we can get rid of chunkpool altogether; see the ratio given in that issue description. This gives some possible performance benefits apart from simplification, but comes at a significant risk of backward compatibility problems since we are then much more subject to memory retention policy of the glibc.

---

Bottom line, I like this PR and if possible would add some debug-only way to catch recursive entry.

src/hotspot/share/nmt/nmtUsage.cpp line 58:

> 56:   // is deducted from mtChunk in the end to give correct values.
> 57:   ChunkPoolLocker lock;
> 58:   const MallocMemorySnapshot* ms = MallocMemorySummary::as_snapshot();

I'd scope this to the as_snapshot call, the subsequent code does not need the locking (lesser chance of accidental recursive locking)

-------------

PR Review: https://git.openjdk.org/jdk/pull/25072#pullrequestreview-2827109155
PR Review Comment: https://git.openjdk.org/jdk/pull/25072#discussion_r2080999219