RFR: 8371046: Segfault in compiler/whitebox/StressNMethodRelocation.java with -XX:+UseZGC

Vladimir Kozlov kvn at openjdk.org
Tue Dec 2 17:34:58 UTC 2025


On Tue, 11 Nov 2025 17:32:22 GMT, Chad Rakoczy <duke at openjdk.org> wrote:

> [JDK-8371046](https://bugs.openjdk.org/browse/JDK-8371046)
> 
> This pull request fixes two crashes (see below) and adds `InvalidationReason::RELOCATED` to better describe why an nmethod is marked not entrant during relocation.
> 
> ---
> 
> #### 1. Test Bug
> 
> It’s possible for an `nmethod` to be unloaded without its `_state` being explicitly set to `not_entrant`. Checking only `is_in_use()` isn’t sufficient, since the `nmethod` may already be in the process of unloading and therefore may not have a lock (as with ZGC, where `nmethods` are locked individually).
> 
> The fix adds an additional `is_unloading()` check in WhiteBox before acquiring the lock.
> 
> This issue was reproducible fairly consistently (every few runs) by executing `compiler/whitebox/StressNMethodRelocation.java` with `-XX:+UseZGC -XX:ReservedCodeCacheSize=32m`.
> 
> 
> After applying this patch, the original crash stopped occurring, though a more infrequent crash was still observed.
> 
> ---
> 
> #### 2. Implementation Bug
> 
> `nmethod::relocate` works by copying the instructions of an `nmethod` and then adjusting the call sites to account for new PC-relative offsets.
> 
> Previously, this fix-up happened *after* calling `post_init()`, which registers the `nmethod` and makes it visible to the GC. This introduced a race condition where the GC might attempt to resolve a call site before it had been fixed.
> 
> The fix ensures that all call sites are patched **before** the `nmethod` is registered.
> 
> In testing, the crash previously occurred roughly 60 times in 5,000 runs (~1.2%). With this patch, no crashes were observed in the same number of runs.
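
The ordering described under "Implementation Bug" can be sketched with a toy model. This is only an illustration, not HotSpot code: `ToyCallSite`, `ToyNMethod`, `ToyRegistry` and `relocate_then_publish` are hypothetical names, and the registry merely stands in for whatever makes an `nmethod` visible to the GC. The point is that every PC-relative offset is adjusted before the copy is published, so a reader never sees a stale offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-ins; none of these are HotSpot types.
struct ToyCallSite {
  std::intptr_t offset;  // PC-relative displacement, measured from the code base
};

struct ToyNMethod {
  std::intptr_t base;              // address the code is assumed to live at
  std::vector<ToyCallSite> calls;  // call sites embedded in the code

  // Absolute target a reader (for example a GC thread) would compute.
  std::intptr_t resolve(std::size_t i) const { return base + calls[i].offset; }
};

// Models "registered and visible to other threads".
struct ToyRegistry {
  std::vector<std::unique_ptr<ToyNMethod>> visible;
};

// Copy, fix up every call site, and only then publish the copy.
inline const ToyNMethod& relocate_then_publish(const ToyNMethod& src,
                                               std::intptr_t new_base,
                                               ToyRegistry& reg) {
  auto copy = std::make_unique<ToyNMethod>(ToyNMethod{new_base, src.calls});

  // Moving the code by (new_base - src.base) shifts every PC-relative
  // offset by the opposite amount so the absolute target is unchanged.
  const std::intptr_t delta = src.base - new_base;
  for (ToyCallSite& c : copy->calls) {
    c.offset += delta;
  }

  // Debug-only self-check: each call site still reaches its old target.
  for (std::size_t i = 0; i < src.calls.size(); ++i) {
    assert(copy->resolve(i) == src.resolve(i));
  }

  // Publish last. A real concurrent implementation would also need the
  // appropriate memory ordering here; this sketch is single-threaded.
  reg.visible.push_back(std::move(copy));
  return *reg.visible.back();
}
```

Swapping the fix-up loop and the `push_back` would reproduce the shape of the race: the copy would be visible while its offsets still assumed the old base.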

>> Maybe we should change the assert to a guarantee in Relocation::pd_set_call_destination() to make sure we catch incorrect patching in the product VM.

> I'm not opposed to changing this. Is this the main concern?

Yes, my main concern is that the new encoding of an address in the cloned nmethod may not fit into the existing instructions.
At least with a guarantee we can catch such a case if it happens.
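
As a rough illustration of the difference (not the actual Relocation::pd_set_call_destination() code), the sketch below contrasts a debug-only `assert` with an always-on check in the spirit of HotSpot's `guarantee`. The names `TOY_GUARANTEE`, `fits_in_int32` and `toy_set_call_displacement` are hypothetical, and the 32-bit displacement field is an assumption made for the example; the real reachable range depends on the platform and instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <limits>

// Always-on check: unlike assert(), it is not compiled out by NDEBUG,
// so it would still fire in a release ("product") build.
#define TOY_GUARANTEE(cond, msg)                               \
  do {                                                         \
    if (!(cond)) {                                             \
      std::fprintf(stderr, "guarantee failed: %s\n", (msg));   \
      std::abort();                                            \
    }                                                          \
  } while (0)

// Assumed field width for this example: a signed 32-bit displacement.
inline bool fits_in_int32(std::int64_t v) {
  return v >= std::numeric_limits<std::int32_t>::min() &&
         v <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical patcher: stores (target - insn_addr) into a 32-bit field.
inline void toy_set_call_displacement(std::int32_t* field,
                                      std::int64_t insn_addr,
                                      std::int64_t target) {
  // Programming error; checked in debug builds only.
  assert(field != nullptr);

  const std::int64_t disp = target - insn_addr;

  // The case of concern: after relocation the target may be too far
  // away to encode. Checked in every build, so it cannot be silently
  // truncated in a release binary.
  TOY_GUARANTEE(fits_in_int32(disp), "call displacement out of range");

  *field = static_cast<std::int32_t>(disp);
}
```

With only an `assert` guarding the range, a release build would truncate an oversized displacement and jump to the wrong address; the always-on check turns that into an immediate, diagnosable failure.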

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28241#issuecomment-3603204728


More information about the hotspot-dev mailing list