RFR: 8261649: AArch64: Optimize LSE atomics in C++ code
Andrew Haley
aph at openjdk.java.net
Wed Feb 17 18:13:52 UTC 2021
Now that we have support for LSE atomics in C++ HotSpot source, we can generate much better code for them. In particular, the sequence we generate for CMPXCHG with a full two-way barrier using two DMBs is way suboptimal.
Barrier-ordered-before, Arm Architecture Reference Manual B2.3:
| Barrier instructions order prior Memory effects before subsequent
| Memory effects generated by the same Observer. A read or a write RW1
| is Barrier-ordered-before a read or a write RW2 from the same Observer
| if and only if RW1 appears in program order before RW2 and any of the
| following cases apply:
|
| [...]
|
| * RW1 appears in program order before an atomic instruction with both
| Acquire and Release semantics that appears in program order before RW2.
So a prior load or store cannot be reordered with the load of an atomic swap with Acquire and Release semantics. This barrier-ordered-before relation, in combination with sequential consistency, gives us everything we need for a full barrier. However, we still need a DMB after the cmpxchg to ensure that subsequent loads and stores cannot be reordered with the store in an atomic instruction.
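To make the difference concrete, here is a minimal portable sketch of the two shapes discussed above. It is not the code in the patch: the function names are hypothetical, and it uses std::atomic rather than HotSpot's own Atomic class or hand-written assembly. It assumes a compiler targeting AArch64 with LSE enabled, where an acquire-release compare-and-swap is typically emitted as a single CASAL and a sequentially consistent fence as a DMB.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: the "two DMB" shape. A relaxed CAS is bracketed by
// full fences, one before and one after.
inline uint32_t cmpxchg_two_fences(std::atomic<uint32_t>& dest,
                                   uint32_t compare_value,
                                   uint32_t exchange_value) {
  uint32_t observed = compare_value;
  std::atomic_thread_fence(std::memory_order_seq_cst);   // leading barrier
  dest.compare_exchange_strong(observed, exchange_value,
                               std::memory_order_relaxed,
                               std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);   // trailing barrier
  return observed;  // value seen in dest before the operation
}

// Hypothetical helper: the shape argued for above. The CAS itself carries
// Acquire and Release semantics, so prior accesses are ordered before it and
// no leading fence is issued. Only the trailing fence is kept, so later
// accesses cannot be reordered with the store half of the CAS.
inline uint32_t cmpxchg_acq_rel_then_fence(std::atomic<uint32_t>& dest,
                                           uint32_t compare_value,
                                           uint32_t exchange_value) {
  uint32_t observed = compare_value;
  dest.compare_exchange_strong(observed, exchange_value,
                               std::memory_order_acq_rel,
                               std::memory_order_acquire);
  std::atomic_thread_fence(std::memory_order_seq_cst);   // trailing barrier
  return observed;  // value seen in dest before the operation
}
```

Both helpers return the value observed in dest, so a caller detects success by comparing the result with compare_value. Whether the second form really compiles to one CASAL plus one DMB depends on the compiler and its target flags, so the generated assembly is worth checking.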
-------------
Commit messages:
- Everything
Changes: https://git.openjdk.java.net/jdk/pull/2612/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2612&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8261649
Stats: 280 lines in 4 files changed: 164 ins; 51 del; 65 mod
Patch: https://git.openjdk.java.net/jdk/pull/2612.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/2612/head:pull/2612
PR: https://git.openjdk.java.net/jdk/pull/2612