RFR: 8360654: AArch64: Remove redundant dmb from C1 compareAndSet
Ruben duke at openjdk.org
Mon Nov  3 16:11:28 UTC 2025

On Wed, 16 Jul 2025 14:41:58 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> AtomicLong.compareAndSet has the following assembly dump snippet, which is emitted via the intermediary LIRGenerator::atomic_cmpxchg:
>> 
>> ;; cmpxchg {
>>   0x0000e708d144cf60:   mov	x8, x2
>>   0x0000e708d144cf64:   casal	x8, x3, [x0]
>>   0x0000e708d144cf68:   cmp	x8, x2
>>  ;; 0x1F1F1F1F1F1F1F1F
>>   0x0000e708d144cf6c:   mov	x8, #0x1f1f1f1f1f1f1f1f
>>  ;; } cmpxchg
>>   0x0000e708d144cf70:   cset	x8, ne  // ne = any
>>   0x0000e708d144cf74:   dmb	ish
>> 
>> 
>> According to the Oracle Java Specification, AtomicLong.compareAndSet [1] has the same memory effects as specified by VarHandle.compareAndSet, which has the following effects: [2]
>> 
>>> Atomically sets the value of a variable to the
>>> newValue with the memory semantics of setVolatile if
>>> the variable's current value, referred to as the witness
>>> value, == the expectedValue, as accessed with the memory
>>> semantics of getVolatile.
>> 
>> 
>> 
>> Hence the release on the store due to setVolatile only occurs if the compare is successful. Since casal already satisfies these requirements, the dmb does not need to occur to ensure memory ordering in case the compare fails and a release does not happen.
>> 
>> Hence we remove the dmb from both casl and casw (the same logic applies to the non-long variant).
>> 
>> This is also reflected by C2 not having a dmb for the same respective method.
>> 
>> [1] https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/util/concurrent/atomic/AtomicLong.html#compareAndSet(long,long)
>> [2] https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/lang/invoke/VarHandle.html#compareAndSet(java.lang.Object...)
>
> I think we still need a DMB after the non-LSE CMPXCHG; the following test gets failures without this DMB:
> 
> 
> AArch64 MP
> 
> {
> 0:X0=x; 0:X2=y;
> 1:X0=y; 1:X4=x;
> }
>  P0           | P1               ;
>  LDAR W1,[X0] | MOV W2,#1        ;
>               | L0:              ;
>  LDR W3,[X2]  | LDAXR W1,[X0]    ;
>               | STLXR W8,W2,[X0] ;
>               | CBNZ W8,L0       ;
>               | DMB ISH          ;
>               | MOV W3,#1        ;
>               | STR W3,[X4]      ;
> exists (0:X1=1 /\ 0:X3=0 /\ 1:X1=0)

Hi @theRealAph,

I've pushed changes for this PR to a new branch https://github.com/openjdk/jdk/compare/master...ruben-arm:jdk:pr-8360654 as Samuel is currently unavailable. Once he is back, he can update this PR's branch.

In the meantime, I plan to run more `jcstress` testing. I'd appreciate your feedback on the version in the new branch.
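
For readers following the quoted description, here is a minimal sketch of the Java-level behaviour being discussed: `compareAndSet` stores the new value only when the witness value equals the expected value, and otherwise leaves the variable unchanged. The class and method names (`CasSketch`, `tryUpdate`) are hypothetical and are not part of the PR. The sketch is single-threaded, so it says nothing about memory ordering, which is what `jcstress` testing is meant to probe.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class; not taken from the PR or from HotSpot.
final class CasSketch {
    // Performs a single compareAndSet attempt and returns the value
    // held by the cell afterwards.
    static long tryUpdate(AtomicLong cell, long expected, long newValue) {
        boolean swapped = cell.compareAndSet(expected, newValue);
        long now = cell.get();
        System.out.println("expected=" + expected
                + " swapped=" + swapped + " now=" + now);
        return now;
    }

    public static void main(String[] args) {
        AtomicLong cell = new AtomicLong(0L);
        tryUpdate(cell, 0L, 1L); // witness 0 equals expected 0: the store happens
        tryUpdate(cell, 0L, 2L); // witness 1 differs from expected 0: no store
    }
}
```

The second call is the failing-compare case that the quoted description refers to: no store is performed, so the cell keeps the value written by the first call.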
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26000#issuecomment-3481315382
    
    