[aarch64-port-dev ] RFR: aarch64: minor improvements of atomic operations
Yangfei (Felix)
felix.yang at huawei.com
Mon Nov 11 12:44:03 UTC 2019
> -----Original Message-----
> From: Yangfei (Felix)
> Sent: Monday, November 11, 2019 8:01 PM
> To: 'Andrew Haley' <aph at redhat.com>; aarch64-port-dev at openjdk.java.net
> Cc: 'hotspot-dev at openjdk.java.net' <hotspot-dev at openjdk.java.net>
> Subject: RE: [aarch64-port-dev ] RFR: aarch64: minor improvements of atomic
> operations
>
> > -----Original Message-----
> > From: Andrew Haley [mailto:aph at redhat.com]
> > Sent: Monday, November 11, 2019 7:17 PM
> > To: Yangfei (Felix) <felix.yang at huawei.com>;
> > aarch64-port-dev at openjdk.java.net
> > Subject: Re: [aarch64-port-dev ] RFR: aarch64: minor improvements of
> > atomic operations
> >
> > On 11/5/19 6:20 AM, Yangfei (Felix) wrote:
> > > Please review these small improvements to the aarch64 atomic operations.
> > > This eliminates the use of full memory barriers.
> > > Passed tier1-3 testing.
> >
> > No, rejected.
> >
> > Patch also must go to hotspot-dev.
>
> CCing to hotspot-dev.
>
> > Are you sure this is safe? The HotSpot internal barriers are specified
> > as being full two-way barriers, which these are not. Tier1 testing
> > really isn't going to do it. Now, you might argue that none of the
> > uses in HotSpot actually require anything stronger than acq/rel, but good
> > luck proving that.
>
> I was also curious about why a full memory barrier is used here.
> For add_and_fetch, I thought there was no functional difference between
> the following two code snippets.
> It's interesting to learn that this may make a difference. Could you please
> elaborate on that?
>
> 1) without patch
> .L2:
> ldxr x2, [x1]
> add x2, x2, x0
> stlxr w3, x2, [x1]
> cbnz w3, .L2
> dmb ish
> mov x0, x2
> ret
> -----------------------------------------------
> 2) with patch
> .L2:
> ldaxr x2, [x1]
> add x2, x2, x0
> stlxr w3, x2, [x1]
> cbnz w3, .L2
> mov x0, x2
> ret

It also looks like the aarch64 port from Oracle did the same thing:
http://hg.openjdk.java.net/jdk-updates/jdk11u-dev/file/f8b2e95a1d41/src/hotspot/os_cpu/linux_arm/atomic_linux_arm.hpp

template<size_t byte_size>
struct Atomic::PlatformAdd
  : Atomic::AddAndFetch<Atomic::PlatformAdd<byte_size> >
{
  template<typename I, typename D>
  D add_and_fetch(I add_value, D volatile* dest, atomic_memory_order order) const;
};

template<>
template<typename I, typename D>
inline D Atomic::PlatformAdd<4>::add_and_fetch(I add_value, D volatile* dest,
                                               atomic_memory_order order) const {
  STATIC_ASSERT(4 == sizeof(I));
  STATIC_ASSERT(4 == sizeof(D));
#ifdef AARCH64
  D val;
  int tmp;
  __asm__ volatile(
    "1:\n\t"
    " ldaxr %w[val], [%[dest]]\n\t"
    " add %w[val], %w[val], %w[add_val]\n\t"
    " stlxr %w[tmp], %w[val], [%[dest]]\n\t"
    " cbnz %w[tmp], 1b\n\t"
    : [val] "=&r" (val), [tmp] "=&r" (tmp)
    : [add_val] "r" (add_value), [dest] "r" (dest)
    : "memory");
  return val;
#else
  return add_using_helper<int32_t>(os::atomic_add_func, add_value, dest);
#endif
}
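
For reference, the two sequences quoted above can be approximated in portable
C++ as follows. This is only a rough sketch, assuming std::atomic rather than
the HotSpot Atomic class; the function names are hypothetical and are not part
of the patch. On AArch64 without LSE atomics, a compiler would typically emit
something close to snippet 1) for the first function and snippet 2) for the
second, though the exact output depends on the compiler and its flags.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper, not HotSpot code.
// Variant 1: release read-modify-write followed by a full fence,
// roughly ldxr/stlxr in a loop followed by "dmb ish".
inline int64_t add_and_fetch_full_fence(std::atomic<int64_t>& dest,
                                        int64_t add_value) {
  int64_t old_value = dest.fetch_add(add_value, std::memory_order_release);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return old_value + add_value;
}

// Hypothetical helper, not HotSpot code.
// Variant 2: acquire/release read-modify-write with no trailing fence,
// roughly ldaxr/stlxr in a loop.
inline int64_t add_and_fetch_acq_rel(std::atomic<int64_t>& dest,
                                     int64_t add_value) {
  int64_t old_value = dest.fetch_add(add_value, std::memory_order_acq_rel);
  return old_value + add_value;
}
```

Both variants return the same value in a single thread. In the C++ memory
model, the difference is that the acq_rel variant does not order its store
against later loads from other locations, while the trailing full fence in
the first variant does.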