RFR: 8371990: Remove two second delayed OOME after GC shutdown [v3]

Stefan Karlsson stefank at openjdk.org
Tue Nov 18 12:46:28 UTC 2025


> In [JDK-8366865](https://bugs.openjdk.org/browse/JDK-8366865) the shutdown code was tweaked so that allocating code would try to block for two seconds and if the JVM didn't shut down within that time, an OOME was thrown from the allocating thread.
> 
> One of the reasons why this code was introduced was to deal with a shutdown problem where the thread that was shutting down the JVM would first initiate the shutdown of the GC and *after* that the thread would call the JVMTI shutdown events and callbacks. The JVMTI callbacks could call arbitrary Java code that could try to allocate memory, and if the heap was filled up, it would have to wait for a GC to do its thing and hand back memory. But the GC had initiated its termination protocol and could be unresponsive to that request, which in turn would lead to a hanging JVM process.
> 
> The problem described above was finally fixed with [JDK-8367902](https://bugs.openjdk.org/browse/JDK-8367902).
> 
> So, I propose that we get rid of the workaround put into place with [JDK-8366865](https://bugs.openjdk.org/browse/JDK-8366865).
> 
> The proposed patch restructures the GC shutdown a little bit. The idea is that all threads that want to schedule a GC VM Operation already take the Heap_lock, and while holding that lock they check the `_is_shutting_down` variable. If the JVM is indeed shutting down, the threads refuse to schedule the GC operation.
> 
> Depending on the type of thread that is trying to schedule the GC operation, we do one of two things:
> 
> 1) If it is a Java thread, we simply block the thread from running. Either the thread is a daemon thread, in which case blocking it will not hinder the shutdown, or it is a non-daemon thread but the Java code called System.halt, which doesn't wait for non-daemon threads.
> 
> 2) If it is a Concurrent GC thread, then we let the thread proceed but with the order to skip the GC operation. This is done because the current shutdown code calls "stop" on the Concurrent GC threads and then waits for them to signal back when they have stopped running their code. So, we need to let them run to completion.
> 
> There are some G1 specific details to look at:
> 
> 1) I've reverted the G1 `concurrent_mark_is_terminating` checks.
> 
> 2) `try_collect_concurrently` queries the `_is_shutting_down` while holding the lock, and then uses that queried value after the lock is released.
> 
> 3) I've left some breadcrumbs in `should_clear_region`. Any suggestions on what to do with the comment and assert?
> 
> This has been ...
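
The scheduling rule in the quoted description (check `_is_shutting_down` while holding the Heap_lock, then block Java threads and let Concurrent GC threads skip the operation) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: `GCScheduler`, `ThreadKind`, `Decision`, `request_gc`, and `begin_shutdown` are hypothetical names that do not exist in HotSpot, a `std::mutex` stands in for the Heap_lock, and a plain `bool` stands in for `_is_shutting_down`. See the webrev links for the real change.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for illustration; none of these are HotSpot types.
enum class ThreadKind { Java, ConcurrentGC };
enum class Decision { Scheduled, BlockThread, SkipOperation };

class GCScheduler {
  std::mutex heap_lock_;           // stands in for the Heap_lock
  bool is_shutting_down_ = false;  // stands in for _is_shutting_down
  int scheduled_ops_ = 0;          // counts GC operations that were accepted

public:
  // Called by the thread that shuts the JVM down.
  void begin_shutdown() {
    std::lock_guard<std::mutex> guard(heap_lock_);
    is_shutting_down_ = true;
  }

  // Called by any thread that wants to schedule a GC operation.
  // The flag is checked and the operation is accepted under the same lock,
  // so no operation can be accepted after shutdown has begun.
  Decision request_gc(ThreadKind kind) {
    std::lock_guard<std::mutex> guard(heap_lock_);
    if (!is_shutting_down_) {
      ++scheduled_ops_;  // stands in for handing the operation to the VM thread
      return Decision::Scheduled;
    }
    // Shutting down: refuse to schedule. A Java thread is told to block,
    // a Concurrent GC thread is told to skip the operation and run on.
    return kind == ThreadKind::Java ? Decision::BlockThread
                                    : Decision::SkipOperation;
  }

  int scheduled_ops() {
    std::lock_guard<std::mutex> guard(heap_lock_);
    return scheduled_ops_;
  }
};

// Usage example: requests succeed before shutdown and are refused after it.
void example() {
  GCScheduler scheduler;
  assert(scheduler.request_gc(ThreadKind::Java) == Decision::Scheduled);
  scheduler.begin_shutdown();
  assert(scheduler.request_gc(ThreadKind::Java) == Decision::BlockThread);
  assert(scheduler.request_gc(ThreadKind::ConcurrentGC) == Decision::SkipOperation);
  assert(scheduler.scheduled_ops() == 1);
}
```

The sketch returns a decision instead of actually blocking the caller, which keeps it testable. In this model, a caller that receives `BlockThread` would park itself and never return, while a caller that receives `SkipOperation` would continue to its normal exit so that the shutdown code waiting on it can make progress.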

Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:

  Add comment to block_if_java_thread

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28349/files
  - new: https://git.openjdk.org/jdk/pull/28349/files/b5b7c68a..c5f65378

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28349&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28349&range=01-02

  Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/28349.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28349/head:pull/28349

PR: https://git.openjdk.org/jdk/pull/28349

