RFR: 8361376: Regressions 1-6% in several Renaissance in 26-b4 only MacOSX aarch64 [v3]

Martin Doerr mdoerr at openjdk.org
Mon Aug 4 21:50:09 UTC 2025


On Mon, 4 Aug 2025 21:22:12 GMT, Dean Long <dlong at openjdk.org> wrote:

>> src/hotspot/cpu/ppc/gc/shared/barrierSetNMethod_ppc.cpp line 84:
>> 
>>> 82:       nativeMovRegMem_at(new_mov_instr.buf)->set_offset(new_value, false /* no icache flush */);
>>> 83:       // Swap in the new value
>>> 84:       uint64_t v = Atomic::cmpxchg(instr, old_mov_instr.u64, new_mov_instr.u64, memory_order_release);
>> 
>> We have `OrderAccess::release()` above, so `memory_order_release` looks redundant. Shouldn't we use `memory_order_relaxed` here?
>
> I think you are right. But your question about release is making me wonder whether we need acquire as well. For example, if two threads are racing to disarm, is there a memory visibility problem if we do not use acquire for the CAS, or if we do the release only on a successful CAS on the other platforms?

Correct. The acquire barrier is at the end of the nmethod entry barrier: https://github.com/openjdk/jdk/blob/f96b6bcd4ddbb1d0e0a76d9f4e3b43bec20dcb7a/src/hotspot/cpu/ppc/gc/shared/barrierSetAssembler_ppc.cpp#L203
That acquire barrier is not needed if we use a GC with `stw_instruction_and_data_patch`.
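
To make the ordering argument concrete, here is a minimal sketch in portable C++ rather than HotSpot code. It assumes a standalone release fence standing in for `OrderAccess::release()`, a relaxed CAS standing in for the `Atomic::cmpxchg` on the patched instruction, and an acquire fence on the reader side standing in for the barrier at the end of the nmethod entry barrier. The names `guard_word`, `payload`, `disarm`, and `enter` are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins, not HotSpot data structures.
static std::atomic<uint64_t> guard_word{1};  // 1 = armed, 0 = disarmed
static std::atomic<int> payload{0};          // data that must be visible once disarmed

// Writer side: publish the data, issue one release fence, then swap the
// guard with a relaxed CAS. The fence already orders the earlier store
// before the CAS, so release ordering on the CAS itself adds nothing.
bool disarm(uint64_t expected_armed, uint64_t disarmed) {
  payload.store(42, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_release);
  return guard_word.compare_exchange_strong(expected_armed, disarmed,
                                            std::memory_order_relaxed,
                                            std::memory_order_relaxed);
}

// Reader side: check the guard with a relaxed load and issue the acquire
// fence only after the guard is seen as disarmed. The fence pairs with the
// writer's release fence, so the payload store is visible here.
int enter() {
  if (guard_word.load(std::memory_order_relaxed) != 0) {
    return -1;  // still armed: a real barrier would call into the slow path
  }
  std::atomic_thread_fence(std::memory_order_acquire);
  return payload.load(std::memory_order_relaxed);
}
```

In this sketch, a thread that loses the CAS race has still executed the release fence, and any reader still passes through the acquire fence after observing the disarmed value, which is the pairing discussed above.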

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/26399#discussion_r2252641681

