RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC
Vladimir Kozlov
kvn at openjdk.org
Fri Nov 21 20:47:01 UTC 2025
On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy <duke at openjdk.org> wrote:
> [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046)
>
> This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation.
>
> ---
>
> #### 1. Test Bug
>
> It’s possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn’t sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually).
>
> The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.
>
> This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`
>
>
> After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed.
>
> ---
>
> #### 2. Implementation Bug
>
> `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets.
>
> Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed.
>
> The fix ensures that all call sites are patched **before** the `nmethod` is registered.
>
> In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs.
src/hotspot/share/code/nmethod.cpp line 1508:
> 1506: #ifdef USE_TRAMPOLINE_STUB_FIX_OWNER
> 1507: // Direct calls may no longer be in range and the use of a trampoline may now be required.
> 1508: // Instead, allow trampoline relocations to update their owners and perform the necessary checks.
`Instead` is wrong word here I think. May be `Otherwise`.
Also where you add trampoline in new nmethod's copy if needed? I don't see it in `fix_relocation_after_move()`.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28241#discussion_r2550915038
More information about the graal-dev
mailing list