Atomic operations: your thoughts are welcome
Aleksey Shipilev
shade at redhat.com
Thu Feb 11 13:13:39 UTC 2021
On 2/11/21 4:59 AM, Kim Barrett wrote:
>> On Feb 8, 2021, at 1:14 PM, Andrew Haley <aph at redhat.com> wrote:
>>
>> I've been looking at the hottest Atomic operations in HotSpot, with a view to
>> finding out if the default memory_order_conservative (which is very expensive
>> on some architectures) can be weakened to something less. It's impossible to
>> fix all of them, but perhaps we can fix some of the most frequent.
>
> Is there any information about the possible performance improvement from
> such changes? 1.5-3M occurrences doesn't mean much without context.
I am going through the exercise of relaxing some of the memory orders in Shenandoah code, and
AArch64 benefits greatly from it (two-way barriers are bad in hot code).
There are obvious things like relaxing counter updates:
JDK-8261503: Shenandoah: reconsider verifier memory ordering
JDK-8261501: Shenandoah: reconsider heap statistics memory ordering
JDK-8261500: Shenandoah: reconsider region live data memory ordering
JDK-8261496: Shenandoah: reconsider pacing updates memory ordering
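To illustrate the counter case: a minimal sketch in portable C++ using std::atomic, not the HotSpot Atomic API and not the actual Shenandoah code. The class RegionLiveData and its members are hypothetical names. The assumption is that the counter only needs atomic increments and does not publish any other memory, so relaxed ordering is sufficient.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical per-region live data table; names are illustrative only.
class RegionLiveData {
public:
    static constexpr std::size_t kRegions = 16;

    // Only the atomicity of the addition matters here: no other memory is
    // published through this counter, so no barriers are requested.
    void add_live(std::size_t region, std::size_t words) {
        assert(region < kRegions);
        live_words_[region].fetch_add(words, std::memory_order_relaxed);
    }

    // Relaxed read: the value is a statistic, not a synchronization point.
    std::size_t live(std::size_t region) const {
        assert(region < kRegions);
        return live_words_[region].load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::size_t> live_words_[kRegions] = {};
};
```

Readers that need a consistent final value would still have to synchronize by other means, for example by reading after the worker threads have been joined.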
There are more interesting things like relaxing accesses to the marking bitmap (which is a large
counter array in disguise). Marking effectively implies a CAS (and thus two FULL_MEM_BARRIER-s on
AArch64) per marked object:
JDK-8261493: Shenandoah: reconsider bitmap access memory ordering
These five relaxations above cut down marking phase time on AArch64 by about 10..15%.
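A rough sketch of what a relaxed bitmap mark could look like, again in portable C++ rather than HotSpot code. MarkBitmap, mark and is_marked are hypothetical names, and the fixed size is arbitrary. It assumes the bit only records "somebody marked this object" and that nothing else is published through it.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical mark bitmap; not HotSpot's bitmap implementation.
class MarkBitmap {
public:
    static constexpr std::size_t kWords = 64;
    static constexpr std::size_t kBits = kWords * 64;

    // Returns true if this call set the bit, false if it was already set.
    bool mark(std::size_t bit) {
        assert(bit < kBits);
        std::atomic<std::uint64_t>& word = words_[bit / 64];
        const std::uint64_t mask = std::uint64_t{1} << (bit % 64);
        std::uint64_t old = word.load(std::memory_order_relaxed);
        while ((old & mask) == 0) {
            // Relaxed CAS: atomic update of the word, no ordering requested.
            if (word.compare_exchange_weak(old, old | mask,
                                           std::memory_order_relaxed,
                                           std::memory_order_relaxed)) {
                return true;
            }
            // On failure `old` holds the current value; re-check the bit.
        }
        return false;
    }

    bool is_marked(std::size_t bit) const {
        assert(bit < kBits);
        const std::uint64_t mask = std::uint64_t{1} << (bit % 64);
        return (words_[bit / 64].load(std::memory_order_relaxed) & mask) != 0;
    }

private:
    std::atomic<std::uint64_t> words_[kWords] = {};
};
```

Exactly one caller wins the race for a given bit, which is all the marking loop needs to decide who pushes the object for further scanning.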
And there is more advanced stuff where relaxed is not enough, but conservative is too conservative.
There, acq/rel should be enough -- but we cannot yet test it, because AArch64 cmpxchg does not do
anything except relaxed/conservative (JDK-8261579):
JDK-8261492: Shenandoah: reconsider forwardee accesses memory ordering
JDK-8261495: Shenandoah: reconsider update references memory ordering
These two (along with an experimental fix for JDK-8261579) cut down evacuation and update-references
phase times by about 25..30% and 10..15%, respectively.
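For the acq/rel case, here is a sketch of the general pattern of installing a forwarding pointer, written with std::atomic. Object, try_forward and resolve are hypothetical names, and this is not how Shenandoah actually encodes forwardees. The assumption is that the copy is fully written before the CAS, so the release half publishes it and the acquire half lets other threads see its contents.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical object layout with an explicit forwarding pointer.
struct Object {
    int payload = 0;
    std::atomic<Object*> forwardee{nullptr};
};

// Tries to install `copy` as the forwardee of `from`.
// Returns the copy that won the race (ours or another thread's).
Object* try_forward(Object& from, Object* copy) {
    assert(copy != nullptr);
    Object* expected = nullptr;
    // Success: acq_rel, so writes to `*copy` made before the CAS are
    // visible to any thread that later acquires the forwardee.
    // Failure: acquire, so the winner's copy contents are visible to us.
    if (from.forwardee.compare_exchange_strong(expected, copy,
                                               std::memory_order_acq_rel,
                                               std::memory_order_acquire)) {
        return copy;
    }
    return expected;
}

// Returns the current location of the object: the forwardee if one is
// installed, otherwise the object itself.
Object* resolve(Object& obj) {
    Object* fwd = obj.forwardee.load(std::memory_order_acquire);
    return fwd != nullptr ? fwd : &obj;
}
```

Compared to a conservative (two-way barrier) CAS, this only requests the ordering needed to publish and consume the copy.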
All in all, this cuts down Shenandoah GC cycle times on AArch64 by about 15..20%! So, I believe
this shows enough benefit to invest our time. Heavy-duty GC code is where I expect the most benefit.
--
Thanks,
-Aleksey