RFR: 8360654: AArch64: Remove redundant dmb from C1 compareAndSet
Andrew Haley
aph at openjdk.org
Tue Jul 15 11:02:47 UTC 2025
On Thu, 26 Jun 2025 12:13:19 GMT, Samuel Chee <duke at openjdk.org> wrote:
> AtomicLong.CompareAndSet has the following assembly dump snippet which gets emitted from the intermediary LIRGenerator::atomic_cmpxchg:
>
> ;; cmpxchg {
> 0x0000e708d144cf60: mov x8, x2
> 0x0000e708d144cf64: casal x8, x3, [x0]
> 0x0000e708d144cf68: cmp x8, x2
> ;; 0x1F1F1F1F1F1F1F1F
> 0x0000e708d144cf6c: mov x8, #0x1f1f1f1f1f1f1f1f
> ;; } cmpxchg
> 0x0000e708d144cf70: cset x8, ne // ne = any
> 0x0000e708d144cf74: dmb ish
>
>
> According to the Oracle Java Specification, AtomicLong.CompareAndSet [1] has the same memory effects as specified by VarHandle.compareAndSet which has the following effects: [2]
>
>> Atomically sets the value of a variable to the
>> newValue with the memory semantics of setVolatile if
>> the variable's current value, referred to as the witness
>> value, == the expectedValue, as accessed with the memory
>> semantics of getVolatile.
>
>
>
> Hence the release on the store due to setVolatile only occurs if the compare is successful. Since casal already satisfies these requirements, the dmb does not need to occur to ensure memory ordering in case the compare fails and a release does not happen.
>
> Hence we remove the dmb from both casl and casw (same logic applies to the non-long variant)
>
> This is also reflected by C2 not having a dmb for the same respective method.
>
> [1] https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/util/concurrent/atomic/AtomicLong.html#compareAndSet(long,long)
> [2] https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/lang/invoke/VarHandle.html#compareAndSet(java.lang.Object...)
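
As a reference point for the semantics quoted above, here is a minimal Java sketch of the two API entry points. The class name CompareAndSetSketch, its field, and its helper methods are hypothetical and are not part of the patch; the sketch only shows that a failed compare returns false and leaves the variable unchanged, which is the case the description argues needs no trailing barrier.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class; not taken from the JDK sources or the PR.
public class CompareAndSetSketch {
    private long value; // only accessed through the VarHandle below

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(CompareAndSetSketch.class, "value", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Returns whether AtomicLong.compareAndSet succeeded for the given inputs.
    static boolean atomicLongCas(long initial, long expected, long newValue) {
        AtomicLong a = new AtomicLong(initial);
        return a.compareAndSet(expected, newValue);
    }

    // Performs VarHandle.compareAndSet and returns the field's value afterwards.
    static long varHandleCas(long initial, long expected, long newValue) {
        CompareAndSetSketch obj = new CompareAndSetSketch();
        VALUE.setVolatile(obj, initial);
        // The store (with setVolatile semantics) happens only if the witness
        // value equals 'expected'; otherwise the field is left unchanged.
        boolean ignored = VALUE.compareAndSet(obj, expected, newValue);
        return (long) VALUE.getVolatile(obj);
    }

    public static void main(String[] args) {
        System.out.println(atomicLongCas(5L, 5L, 9L));
        System.out.println(atomicLongCas(5L, 6L, 9L));
        System.out.println(varHandleCas(5L, 5L, 9L));
        System.out.println(varHandleCas(5L, 6L, 9L));
    }
}
```

Which instructions C1 emits for these calls depends on the platform and JDK build; the sketch only exercises the Java-level contract.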
With the help of Will Deacon, one of the authors of the memory model, I've now got to the bottom of this.
It is indeed a change to the MM, dating from 2002.
I agree that the DMB isn't needed here because the CASAL has both
acquire and release semantics. However, I don't think that's related to
the snippet of the architecture you have above but rather comes from:
// DDI0487L_b
// Barrier-ordered-before (B2-255)
...
* All of the following apply:
  - E1 is an Explicit Memory Write Effect and is generated by an atomic
    instruction with both Acquire and Release semantics.
  - E1 appears in program order before E2.
  - One of the following applies:
    - E2 is an Explicit Memory Effect.
    - E2 is an Implicit Tag Memory Read Effect.
    - E2 is an MMU Fault Effect.
Which says that the release store of the CASAL is ordered before the
subsequent store to y. Note that this _wouldn't_ work if you used
CASL instead.
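
At the Java level, the ordering in question can be sketched as a message-passing test: a successful compareAndSet on x followed by a store to y, with a second thread that waits for y and then reads x. The class and method names below are hypothetical, and y is modelled as a volatile store so that the Java memory model itself guarantees the outcome. The sketch cannot demonstrate anything about the hardware; it only shows the shape of the program for which CASAL alone has to provide the ordering once the trailing DMB is gone.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical litmus-style sketch; names are illustrative only.
public class CasThenStoreSketch {

    // Runs one writer/reader pair and returns the value of x seen by the reader.
    static long runOnce() throws InterruptedException {
        AtomicLong x = new AtomicLong(0L);
        AtomicLong y = new AtomicLong(0L);
        long[] seenX = new long[1];

        Thread writer = new Thread(() -> {
            x.compareAndSet(0L, 1L); // succeeds, since x starts at 0
            y.set(1L);               // the subsequent store to y
        });
        Thread reader = new Thread(() -> {
            while (y.get() == 0L) {  // wait until the store to y is visible
                Thread.onSpinWait();
            }
            seenX[0] = x.get();      // must then observe the CAS result
        });

        reader.start();
        writer.start();
        writer.join();
        reader.join();               // join makes seenX[0] safely visible here
        return seenX[0];
    }

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 1000; i++) {
            if (runOnce() != 1L) {
                throw new AssertionError("reader saw y == 1 but x != 1");
            }
        }
        System.out.println("no reordering observed");
    }
}
```

A run that never fails is not evidence that the generated code is correct, since a weak-memory bug may simply not show up on a given machine; tools such as herd7 or jcstress are the usual way to explore such outcomes.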
The full details of the MM change are here:
http://github.com/herd/herdtools7/commit/636b7163c0679c691b8cf9a04623cd3aa1cc0ec3
So, this change looks good, and we can remove trailing DMBs from most CASALs.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26000#issuecomment-3073154641
PR Comment: https://git.openjdk.org/jdk/pull/26000#issuecomment-3073156671