RFR: 8360654: AArch64: Remove redundant dmb from C1 compareAndSet
Andrew Haley
aph at openjdk.org
Fri Jul 11 13:22:40 UTC 2025
On Thu, 26 Jun 2025 12:13:19 GMT, Samuel Chee <duke at openjdk.org> wrote:
> AtomicLong.compareAndSet has the following assembly dump snippet, which is emitted via LIRGenerator::atomic_cmpxchg:
>
> ;; cmpxchg {
> 0x0000e708d144cf60: mov x8, x2
> 0x0000e708d144cf64: casal x8, x3, [x0]
> 0x0000e708d144cf68: cmp x8, x2
> ;; 0x1F1F1F1F1F1F1F1F
> 0x0000e708d144cf6c: mov x8, #0x1f1f1f1f1f1f1f1f
> ;; } cmpxchg
> 0x0000e708d144cf70: cset x8, ne // ne = any
> 0x0000e708d144cf74: dmb ish
>
>
> According to the Oracle Java API specification, AtomicLong.compareAndSet [1] has the same memory effects as VarHandle.compareAndSet, which is specified as follows: [2]
>
>> Atomically sets the value of a variable to the
>> newValue with the memory semantics of setVolatile if
>> the variable's current value, referred to as the witness
>> value, == the expectedValue, as accessed with the memory
>> semantics of getVolatile.
>
>
>
> Hence the release on the store due to setVolatile only occurs if the compare is successful. Since casal already satisfies these requirements, the dmb does not need to occur to ensure memory ordering in case the compare fails and a release does not happen.
>
> Hence we remove the dmb from both casl and casw (same logic applies to the non-long variant)
>
> This is also reflected by C2 not having a dmb for the same respective method.
>
> [1] https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/util/concurrent/atomic/AtomicLong.html#compareAndSet(long,long)
> [2] https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/lang/invoke/VarHandle.html#compareAndSet(java.lang.Object...)
On 04/07/2025 17:28, Samuel Chee wrote:
> Hope this helps :)
Thanks, this looks convincing.
Please allow some time for me to do some more checking. This is a tricky area, and the cost if we get it wrong is high.
FYI, I'm still looking at this.
It seems that the definition of barrier-ordered-before has been strengthened since this code was written. A test that I wrote a few years ago now passes on the online Herd7 simulator, where it used to fail. Back then I commented
// At the time of writing we don't know of any AArch64 hardware that
// reorders stores in this way, but the Reference Manual permits it.
... and confirmed my interpretation with the author of the Reference Manual.
I'm guessing that older AArch64 implementations still in use never did such reorderings.
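For readers following the thread, here is a minimal Java sketch of the API-level contract quoted above: the store takes effect only when the witness value equals the expected value, and a failed compare leaves the variable untouched. The class name CasSemanticsSketch and the helpers tryUpdate and tryUpdateViaVarHandle are made up for illustration and are not part of the patch. It exercises only the return value and the resulting contents in a single thread; it cannot observe the memory-ordering behaviour under discussion.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class, not taken from the JDK or the PR.
public class CasSemanticsSketch {
    // Field accessed through a VarHandle, mirroring VarHandle.compareAndSet.
    private volatile long value;

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(CasSemanticsSketch.class, "value", long.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Returns true and stores newValue only if the current value equals expected.
    static boolean tryUpdate(AtomicLong target, long expected, long newValue) {
        return target.compareAndSet(expected, newValue);
    }

    // Same contract, expressed directly through the VarHandle API.
    boolean tryUpdateViaVarHandle(long expected, long newValue) {
        return VALUE.compareAndSet(this, expected, newValue);
    }

    long current() {
        return value;
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong(5);
        System.out.println(tryUpdate(counter, 5, 6));  // compare succeeds, store happens
        System.out.println(tryUpdate(counter, 5, 7));  // compare fails, no store
        System.out.println(counter.get());

        CasSemanticsSketch holder = new CasSemanticsSketch();
        System.out.println(holder.tryUpdateViaVarHandle(0L, 1L));
        System.out.println(holder.current());
    }
}
```

In the failing case no store is performed at all, which is the case the PR description argues needs no release ordering beyond what casal already provides.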
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26000#issuecomment-3044020068
PR Comment: https://git.openjdk.org/jdk/pull/26000#issuecomment-3062325467