RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3]

Martin Doerr mdoerr at openjdk.org
Fri Aug 1 21:58:56 UTC 2025


On Fri, 1 Aug 2025 20:09:11 GMT, Dean Long <dlong at openjdk.org> wrote:

>> This PR removes the recently added lock around set_guard_value, using instead Atomic::cmpxchg to atomically update bit-fields of the guard value.  Further, it takes a fast-path that uses the previous direct store when at a safepoint.  Combined, these changes should get us back to almost where we were before in terms of overhead.  If necessary, we could go even further and allow make_not_entrant() to perform a direct byte store, leaving 24 bits for the guard value.
>
> Dean Long has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix PPC64

The PPC64 code looks correct now, but I have two minor suggestions.

src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 84:

> 82:       nativeMovRegMem_at(new_mov_instr.buf)->set_offset(new_value, false /* no icache flush */);
> 83:       // Swap in the new value
> 84:       uint64_t v = Atomic::cmpxchg(instr, old_mov_instr.u64, new_mov_instr.u64, memory_order_release);

We already have `OrderAccess::release()` above, so `memory_order_release` looks redundant. Shouldn't we use `memory_order_relaxed` here?
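
To illustrate the idea, here is a minimal sketch using standard C++ atomics rather than HotSpot's `Atomic`/`OrderAccess` API. The names `instr_word` and `swap_in` are hypothetical and not from the patch. Under the C++ memory model, a standalone release fence followed by a relaxed compare-exchange already orders the earlier stores before the update, so release ordering on the exchange itself adds nothing:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the 64-bit instruction word being patched.
static std::atomic<uint64_t> instr_word{0};

// Tries to replace `expected` with `desired`; returns true on success.
bool swap_in(uint64_t expected, uint64_t desired) {
  // Plays the role that OrderAccess::release() has in the reviewed code
  // (an assumption made for this sketch, not the HotSpot implementation).
  std::atomic_thread_fence(std::memory_order_release);
  // The fence above already provides the ordering, so relaxed suffices here.
  return instr_word.compare_exchange_strong(expected, desired,
                                            std::memory_order_relaxed,
                                            std::memory_order_relaxed);
}
```

Whether this carries over exactly depends on how `OrderAccess::release()` and `Atomic::cmpxchg` are implemented on PPC64.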

src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 88:

> 86:       old_mov_instr.u64 = v;
> 87:     }
> 88:     ICache::ppc64_flush_icache_bytes(addr_at(0), NativeMovRegMem::instruction_size);

Maybe flush the icache only if the `cmpxchg` succeeded? Otherwise, we didn't modify the code.
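
Roughly what I have in mind, as a standalone sketch in standard C++ (not the actual PPC64 code). `instr_word`, `set_bits`, `flush_icache_stub` and `flush_count` are hypothetical names, and the early exit when the word already holds the desired bits is an assumption about how the loop could be shaped:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the 64-bit instruction word being patched.
static std::atomic<uint64_t> instr_word{0};
static int flush_count = 0;

// Hypothetical stand-in for ICache::ppc64_flush_icache_bytes; only counts calls.
void flush_icache_stub() { ++flush_count; }

// Sets the bits selected by `mask` to `value`, flushing only if memory changed.
void set_bits(uint64_t value, uint64_t mask) {
  uint64_t old_word = instr_word.load(std::memory_order_relaxed);
  bool modified = false;
  for (;;) {
    uint64_t new_word = (old_word & ~mask) | (value & mask);
    if (new_word == old_word) {
      break;  // already up to date: no store, so nothing to flush
    }
    if (instr_word.compare_exchange_strong(old_word, new_word,
                                           std::memory_order_relaxed)) {
      modified = true;
      break;
    }
    // On failure old_word now holds the current value; recompute and retry.
  }
  if (modified) {
    flush_icache_stub();
  }
}
```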

-------------

PR Review: https://git.openjdk.org/jdk/pull/26399#pullrequestreview-3080627303
PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2248909656
PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2248925985

