RFR: 8261027: AArch64: Support for LSE atomics C++ HotSpot code [v2]

Andrew Haley aph at openjdk.java.net
Mon Feb 8 18:50:09 UTC 2021


> Go back a few years, and there were simple atomic load/store exclusive
> instructions on Arm. Say you want to do an atomic increment of a
> counter. You'd do an atomic load to get the counter into your local cache
> in exclusive state, increment that counter locally, then write that
> incremented counter back to memory with an atomic store. All the time
> that cache line was in exclusive state, so you're guaranteed that
> no-one else changed anything on that cache line while you had it.
> 
> This is hard to scale on a very large system (e.g. Fugaku) because if
> many processors are incrementing that counter you get a lot of cache
> line ping-ponging between cores.
> 
> So, Arm decided to add a locked memory increment instruction that
> works without needing to load an entire line into local cache. It's a
> single instruction that loads, increments, and writes back. The secret
> is to send a cache control message to whichever processor owns the
> cache line containing the count, tell that processor to increment the
> counter and return the incremented value. That way cache coherency
> traffic is minimized. This new set of instructions is known as Large
> System Extensions, or LSE.
> 
> Unfortunately, on recent processors the "old" load/store exclusive
> instructions sometimes perform very badly. Therefore, it's now
> necessary for software to detect which version of Arm it's running
> on, and use the "new" LSE instructions if they're available. Otherwise
> performance can be very poor under heavy contention.
> 
> GCC's -moutline-atomics does this by providing library calls which use
> LSE if it's available, but this option is only provided on newer
> versions of GCC. This is particularly problematic with older versions
> of OpenJDK, which build using old GCC versions.
> 
> Also, I suspect that some other operating systems could use this.
> Perhaps not macOS, given that all Apple CPUs support LSE, but
> maybe Windows.
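
The run-time selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: every name in it (fetch_add_fn, fetch_add_retry_loop, fetch_add_single_op, cpu_has_lse, select_fetch_add, atomic_fetch_add) is hypothetical, and portable std::atomic operations stand in for the real load/store-exclusive and LSE instruction sequences. The only platform-specific assumption is that on Linux/AArch64 the kernel reports LSE support through the HWCAP_ATOMICS bit of getauxval(AT_HWCAP).

```cpp
#include <atomic>
#include <cstdint>
#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>          // getauxval, AT_HWCAP
#ifndef HWCAP_ATOMICS
#define HWCAP_ATOMICS (1 << 8) // LSE feature bit in the AArch64 Linux ABI
#endif
#endif

// All names below are hypothetical and exist only for this sketch.
using fetch_add_fn = uint64_t (*)(std::atomic<uint64_t>*, uint64_t);

// Fallback path: a compare-and-swap retry loop, standing in for the
// "old" load-exclusive / store-exclusive sequence.
static uint64_t fetch_add_retry_loop(std::atomic<uint64_t>* p, uint64_t v) {
  uint64_t old = p->load(std::memory_order_relaxed);
  while (!p->compare_exchange_weak(old, old + v)) {
    // On failure, compare_exchange_weak reloads 'old'; just retry.
  }
  return old;
}

// Stand-in for the LSE path: one read-modify-write operation. Whether a
// compiler emits a single LSE instruction here depends on its target flags.
static uint64_t fetch_add_single_op(std::atomic<uint64_t>* p, uint64_t v) {
  return p->fetch_add(v);
}

// Report whether the running CPU advertises LSE (Linux/AArch64 only).
static bool cpu_has_lse() {
#if defined(__aarch64__) && defined(__linux__)
  return (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
  return false;  // unknown platform: take the conservative fallback
#endif
}

// Pick an implementation once; callers then go through the pointer.
static fetch_add_fn select_fetch_add() {
  return cpu_has_lse() ? fetch_add_single_op : fetch_add_retry_loop;
}

static const fetch_add_fn atomic_fetch_add = select_fetch_add();
```

A caller would write something like atomic_fetch_add(&counter, 1) and get the previous value back from either path. Dispatching through a pointer that is set once at startup keeps the feature test out of the hot path.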

Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:

  Review changes

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2434/files
  - new: https://git.openjdk.java.net/jdk/pull/2434/files/4f17903b..31f9c003

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2434&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2434&range=00-01

  Stats: 16 lines in 3 files changed: 14 ins; 0 del; 2 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2434.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2434/head:pull/2434

PR: https://git.openjdk.java.net/jdk/pull/2434


More information about the hotspot-dev mailing list