[aarch64-port-dev ] RFR: aarch64: minor improvements of atomic operations
Yangfei (Felix)
felix.yang at huawei.com
Tue Nov 12 12:14:48 UTC 2019
> -----Original Message-----
> From: Yangfei (Felix)
> Sent: Tuesday, November 12, 2019 8:03 PM
> To: 'Andrew Dinn' <adinn at redhat.com>; Andrew Haley <aph at redhat.com>;
> aarch64-port-dev at openjdk.java.net
> Cc: hotspot-dev at openjdk.java.net
> Subject: RE: [aarch64-port-dev ] RFR: aarch64: minor improvements of atomic
> operations
>
> > -----Original Message-----
> > From: Andrew Dinn [mailto:adinn at redhat.com]
> > Sent: Tuesday, November 12, 2019 5:42 PM
> > To: Andrew Haley <aph at redhat.com>; Yangfei (Felix)
> > <felix.yang at huawei.com>; aarch64-port-dev at openjdk.java.net
> > Cc: hotspot-dev at openjdk.java.net
> > Subject: Re: [aarch64-port-dev ] RFR: aarch64: minor improvements of
> > atomic operations
> >
> > On 12/11/2019 09:25, Andrew Haley wrote:
> > > On 11/12/19 8:37 AM, Yangfei (Felix) wrote:
> > >> This has been discussed somewhere before:
> > >> https://patchwork.kernel.org/patch/3575821/
> > >> Let's keep the current status for safe.
> > >
> > > Yes.
> > >
> > > It's been interesting to see the progress of this patch. I don't
> > > think it's the first time that someone has been tempted to change
> > > this code to make it "more efficient".
> > >
> > > I wonder if we could perhaps add a comment to that code so that it
> > > doesn't happen again. I'm not sure exactly what the patch should say
> > > beyond "do not touch". Perhaps something along the lines of "Do not
> > > touch this code unless you have at least Black Belt, 4th Dan in
> > > memory ordering." :-)
> > >
> > > More seriously, maybe simply "Note that memory_order_conservative
> > > requires a full barrier after atomic stores. See
> > > https://patchwork.kernel.org/patch/3575821/"
> > Yes, that would be a help. It's particularly easy to get confused here
> > because we happily omit the ordering of an stlr store wrt subsequent
> > stores when the strl is implementing a Java volatile write or a Java cmpxchg.
> >
> > So, it might be worth adding a rider that implementing the full
> > memory_order_conservative semantics is necessary because VM code
> > relies on the strong ordering wrt writes that the cmpxchg is required to
> provide.
> >
>
> I also suggest we implement these functions with inline assembly here.
> For Atomic::PlatformXchg, we may issue two consecutive full memory barriers
> with the current status.
> I used GCC 7.3.0 to compile the following function:
>
> $ cat test.c
> long foo(long add_value, long volatile* dest, long exchange_value) {
> long val = __sync_lock_test_and_set(dest, exchange_value);
>
> __sync_synchronize();
>
> return val;
> }
>
> $ cat test.s
> .arch armv8-a
> .file "test.c"
> .text
> .align 2
> .p2align 3,,7
> .global foo
> .type foo, %function
> foo:
> .L2:
> ldxr x0, [x1]
> stxr w3, x2, [x1]
> cbnz w3, .L2
> dmb ish < ========
> dmb ish < ========
> ret
> .size foo, .-foo
> .ident "GCC: (GNU) 7.3.0"
> .section .note.GNU-stack,"", at progbits
Also this is different from the following sequence (stxr instead of stlxr).
<Access [A]>
// atomic_op (B)
1: ldxr x0, [B] // Exclusive load
<op(B)>
stlxr w1, x0, [B] // Exclusive store with release
cbnz w1, 1b
dmb ish // Full barrier
<Access [C]>
I think the two-way memory barrier may not be ensured for this case.
Felix
More information about the aarch64-port-dev
mailing list