RFR: 8261649: AArch64: Optimize LSE atomics in C++ code

Andrew Haley aph at openjdk.java.net
Wed Feb 17 18:13:52 UTC 2021


Now that we have support for LSE atomics in C++ HotSpot source, we can generate much better code for them. In particular, the sequence we currently generate for CMPXCHG with a full two-way barrier, which uses two DMBs, is far from optimal.

Barrier-ordered-before, Arm Architecture Reference Manual B2.3:

   | Barrier instructions order prior Memory effects before subsequent
   | Memory effects generated by the same Observer. A read or a write RW1
   | is Barrier-ordered-before a read or a write RW2 from the same Observer
   | if and only if RW1 appears in program order before RW2 and any of the
   | following cases apply:
   |
   | [...]
   |
   | * RW1 appears in program order before an atomic instruction with both
   | Acquire and Release semantics that appears in program order before RW2.

So a prior load or store cannot be reordered with the load of an atomic swap that has Acquire and Release semantics. This barrier-ordered-before relation, in combination with sequential consistency, gives us everything we need for a full barrier. However, we still need a DMB after the cmpxchg to ensure that subsequent loads and stores cannot be reordered with the store of the atomic instruction.
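To make the two shapes concrete, here is a rough sketch in portable C++ rather than the hand-written code in the patch. The helper names cmpxchg_two_fences and cmpxchg_acq_rel_then_fence are hypothetical and are not part of HotSpot's Atomic API. The assumption is that, when targeting AArch64 with LSE enabled, a compiler would lower the acq_rel compare-exchange to a single CAS with Acquire and Release semantics and each seq_cst fence to a DMB; the exact instructions depend on the compiler and flags.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helpers for illustration only; not HotSpot's Atomic API.
// Both return the value observed at dest, which equals compare_value
// exactly when the exchange succeeded.

// Shape described as suboptimal: a relaxed CAS bracketed by two full
// fences (assumed to become two DMBs on AArch64).
inline uint64_t cmpxchg_two_fences(std::atomic<uint64_t>& dest,
                                   uint64_t compare_value,
                                   uint64_t exchange_value) {
  uint64_t observed = compare_value;
  std::atomic_thread_fence(std::memory_order_seq_cst);
  dest.compare_exchange_strong(observed, exchange_value,
                               std::memory_order_relaxed,
                               std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return observed;
}

// Shape argued for above: the acquire+release CAS orders prior accesses
// by itself, so only the trailing fence is kept.
inline uint64_t cmpxchg_acq_rel_then_fence(std::atomic<uint64_t>& dest,
                                           uint64_t compare_value,
                                           uint64_t exchange_value) {
  uint64_t observed = compare_value;
  dest.compare_exchange_strong(observed, exchange_value,
                               std::memory_order_acq_rel,
                               std::memory_order_acquire);
  // Keeps later loads and stores from being reordered with the CAS store.
  std::atomic_thread_fence(std::memory_order_seq_cst);
  return observed;
}
```

Both helpers give the same result to the caller; the difference is only in how many barriers surround the atomic operation.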

-------------

Commit messages:
 - Everything

Changes: https://git.openjdk.java.net/jdk/pull/2612/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2612&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8261649
  Stats: 280 lines in 4 files changed: 164 ins; 51 del; 65 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2612.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2612/head:pull/2612

PR: https://git.openjdk.java.net/jdk/pull/2612

