RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC

Vladimir Kozlov kvn at openjdk.org
Fri Nov 21 20:47:01 UTC 2025


On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy <duke at openjdk.org> wrote:

> [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046)
> 
> This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation.
> 
> ---
> 
> #### 1. Test Bug
> 
> It’s possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn’t sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually).
> 
> The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.
> 
> This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`.
> 
> 
> After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed.
> 
> ---
> 
> #### 2. Implementation Bug
> 
> `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets.
> 
> Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed.
> 
> The fix ensures that all call sites are patched **before** the `nmethod` is registered.
> 
> In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs.
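
To illustrate the first issue in the quoted description (checking `is_unloading()` in addition to `is_in_use()` before taking the lock), here is a minimal, single-threaded sketch. `ToyNMethod` and `try_relocate` are hypothetical stand-ins, not HotSpot's `nmethod` class or the WhiteBox code, and the null `lock` pointer is only an assumed model of "may not have a lock" while unloading.

```cpp
#include <mutex>

// Hypothetical stand-in for an nmethod; NOT HotSpot's real class.
struct ToyNMethod {
    bool in_use = true;         // state was never explicitly set to not_entrant
    bool unloading = false;     // the GC has started unloading this method
    std::mutex* lock = nullptr; // assumed: no usable lock once unloading starts

    bool is_in_use() const { return in_use; }
    bool is_unloading() const { return unloading; }
};

// Returns true if relocation was attempted, false if the method was skipped.
// Both checks run BEFORE the lock is touched, mirroring the described fix.
bool try_relocate(ToyNMethod& nm) {
    if (!nm.is_in_use() || nm.is_unloading()) {
        return false; // skip: dereferencing nm.lock here could crash
    }
    std::lock_guard<std::mutex> guard(*nm.lock);
    // ... relocation work would happen here ...
    return true;
}
```

With only the `is_in_use()` check, a method that is unloading but still reports `in_use == true` would reach the `lock_guard` line with a null lock, which is the shape of the crash this sketch is meant to show.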

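The ordering fix in the second issue (patch call sites first, register afterwards) can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyCallSite`, `ToyCode`, `ToyRegistry`, and both functions are hypothetical names, and the "registry" merely stands in for whatever `post_init()` makes visible to the GC.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a PC-relative call; not a HotSpot type.
struct ToyCallSite {
    std::int64_t pc;            // absolute address of the call instruction
    std::int64_t displacement;  // PC-relative distance to the callee
    std::int64_t target() const { return pc + displacement; }
};

struct ToyCode {
    std::int64_t base;               // start address of the code blob
    std::vector<ToyCallSite> calls;
};

// Stand-in for "registered and visible to the GC".
struct ToyRegistry {
    std::vector<const ToyCode*> visible;
    void publish(const ToyCode* code) { visible.push_back(code); }
};

// Copy the code to new_base and adjust every displacement so that each
// call still reaches the same absolute target.
ToyCode relocate_copy(const ToyCode& old_code, std::int64_t new_base) {
    ToyCode copy = old_code;  // raw copy: displacements are stale here
    const std::int64_t delta = new_base - old_code.base;
    copy.base = new_base;
    for (ToyCallSite& cs : copy.calls) {
        cs.pc += delta;
        cs.displacement -= delta;
    }
    return copy;
}

// Fix up first, publish second: an observer of the registry can never
// see a call site that still carries a stale displacement.
void relocate_and_publish(const ToyCode& old_code, std::int64_t new_base,
                          ToyCode& storage, ToyRegistry& registry) {
    storage = relocate_copy(old_code, new_base);
    registry.publish(&storage);
}
```

Swapping the two statements in `relocate_and_publish` would reproduce the window described in the quoted text, in which a concurrent reader could resolve a call site before it has been fixed.
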
src/hotspot/share/code/nmethod.cpp line 1508:

> 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER
> 1507:     // Direct calls may no longer be in range and the use of a trampoline may now be required.
> 1508:     // Instead, allow trampoline relocations to update their owners and perform the necessary checks.

I think `Instead` is the wrong word here. Maybe `Otherwise`.

Also, where do you add a trampoline to the new nmethod's copy if one is needed? I don't see it in `fix_relocation_after_move()`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2550915038

