RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code

Dong Bo dongbo at openjdk.java.net
Tue Mar 2 12:24:54 UTC 2021


On Tue, 2 Mar 2021 09:33:12 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> The aarch64 LSE atomic operations are introduced to C++ hotspot code in JDK-8261027 and optimized in JDK-8261649.
>> For memory_order_conservative, the acquire semantics in atomic instructions, i.e. ldaddal, swpal, casal, ensure that no subsequent accesses can pass the atomic operations.
>> We also emit a trailing dmb to establish the barrier-ordered-after relationship, which already provides what the acquire does. So the acquire semantics are no longer needed; {ldaddl, swpl, casl} would be enough.
>> 
>> Checked by using the herd7 consistency model simulator with the test in comments before `gen_cas_entry`:
>> AArch64 LseCasAfter
>> { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; }
>> P0           | P1                ;
>> LDR W4, [X2] | MOV W3, #0        ;
>> DMB LD       | MOV W4, #1        ;
>> LDR W3, [X1] | CASL W3, W4, [X1] ;
>>              | DMB ISH           ;
>>              | STR W4, [X2]      ;
>> exists
>>  (0:X3=0 /\ 0:X4=1)
>> No `X3 == 0 && X4 == 1` witnessed.
>> 
>> Removing the acquire semantics does not allow prior accesses to pass the atomic operations, because the release semantics are still in place.
>> Just in case, checked by herd7 with the testcase below:
>> AArch64 LseCasPrior
>> { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; }
>> P0           | P1                ;
>> LDR W3, [X1] | MOV W3, #0        ;
>> DMB LD       | MOV W4, #1        ;
>> LDR W4, [X2] | STR W4, [X2]      ;
>>              | CASL W3, W4, [X1] ;
>>              | DMB ISH           ;
>> exists
>>  (0:X3=1 /\ 0:X4=0)
>> No `X3 == 1 && X4 == 0` witnessed.
>> 
>> Similarly, the default implementations of `atomic_fetch_add` and `atomic_xchg` via `ldaxr+stlxr+dmb` can be replaced by `ldxr+stlxr+dmb`.
>
> No. Try this with and without the acquire:
> 
>  Jon1
> { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; }
>  P0 | P1 ;
>  MOV W0,#1 | LDR W0,[X1];
>  STR W0,[X1] | CASAL W5,W6,[X4];
>  MOV W2,#1 | LDR W2,[X3];
>  STLR W2,[X3] | ;
> exists
> (1:X0=1 /\ 1:X2=0)

Without the acquire, the outcome `1:X0=1 /\ 1:X2=0` is allowed, which would indeed be a problem.
But for `memory_order_conservative`, I believe the current `gen_cas_entry` code emits a trailing DMB right after the CASAL:
    __ lse_cas(prev, exchange_val, ptr, size, acquire, release, /*not_pair*/true);
    if (order == memory_order_conservative) {
      __ membar(Assembler::StoreStore|Assembler::StoreLoad);
    }
That is:
AArch64 exper
{ 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; }
P0           | P1               ;
MOV W0,#1    | LDR W0,[X1]      ;
STR W0,[X1]  | CASAL W5,W6,[X4] ;
MOV W2,#1    | DMB ISH          ;
STLR W2,[X3] | LDR W2,[X3]      ;
exists
 (1:X0=1 /\ 1:X2=0)
With the DMB, herd7 produces the same results with or without the acquire.
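
For readers who think in C++11 atomics, the shape under discussion (a CAS with release but no acquire, followed by a full barrier) can be sketched roughly as below. This is only an analogy, not the HotSpot code: `conservative_cas` is a hypothetical helper name, the C++ memory model is not the Arm one, and whether a given compiler maps this to `casl` plus `dmb ish` on an LSE target is an assumption to check in the generated assembly.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not HotSpot code: models a "conservative" CAS as a
// release-only CAS followed by a full fence, mirroring the CASL + DMB ISH
// sequence discussed above.
inline uint32_t conservative_cas(std::atomic<uint32_t>& dest,
                                 uint32_t compare_val,
                                 uint32_t exchange_val) {
  uint32_t observed = compare_val;
  // Release on success, relaxed on failure: the CAS itself has no acquire.
  // On failure, compare_exchange_strong writes the current value of dest
  // into observed; on success, observed still equals compare_val.
  dest.compare_exchange_strong(observed, exchange_val,
                               std::memory_order_release,
                               std::memory_order_relaxed);
  // Trailing full fence, standing in for the DMB emitted after the
  // atomic operation when the order is memory_order_conservative.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  // Either way, observed now holds the value dest had before the call.
  return observed;
}
```

The helper returns the previous value of `dest`, so a caller can tell success from failure by comparing the result against `compare_val`. It does not replace the herd7 runs above; it only shows where the acquire would sit and where the barrier takes over.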

Thanks.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2788

