RFR: 8261027: AArch64: Support for LSE atomics C++ HotSpot code [v7]
Volker Simonis
simonis at openjdk.java.net
Wed Feb 10 17:23:39 UTC 2021
On Wed, 10 Feb 2021 15:20:02 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> Go back a few years, and there were simple atomic load/store exclusive
>> instructions on Arm. Say you want to do an atomic increment of a
>> counter. You'd do an atomic load to get the counter into your local cache
>> in exclusive state, increment that counter locally, then write that
>> incremented counter back to memory with an atomic store. All the time
>> that cache line was in exclusive state, so you're guaranteed that
>> no-one else changed anything on that cache line while you had it.
>>
>> This is hard to scale on a very large system (e.g. Fugaku) because if
>> many processors are incrementing that counter you get a lot of cache
>> line ping-ponging between cores.
>>
>> So, Arm decided to add a locked memory increment instruction that
>> works without needing to load an entire line into local cache. It's a
>> single instruction that loads, increments, and writes back. The secret
>> is to send a cache control message to whichever processor owns the
>> cache line containing the count, tell that processor to increment the
>> counter and return the incremented value. That way cache coherency
>> traffic is minimized. This new set of instructions is known as Large
>> System Extensions, or LSE.
>>
>> Unfortunately, in recent processors, the "old" load/store exclusive
>> instructions sometimes perform very badly. Therefore, it's now
>> necessary for software to detect which version of Arm it's running
>> on, and use the "new" LSE instructions if they're available. Otherwise
>> performance can be very poor under heavy contention.
>>
>> GCC's -moutline-atomics does this by providing library calls which use
>> LSE if it's available, but this option is only provided on newer
>> versions of GCC. This is particularly problematic with older versions
>> of OpenJDK, which build using old GCC versions.
>>
>> Also, I suspect that some other operating systems could use this.
>> Perhaps not macOS, given that all Apple CPUs support LSE, but
>> maybe Windows.
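
As a rough illustration of the two increment styles described above, here is a minimal portable C++ sketch. It is not the code from this PR: the function names `increment_with_retry` and `increment_single_rmw` are hypothetical, and `std::atomic` stands in for the hand-written stubs. The retry loop has the same shape as a load-exclusive/store-exclusive sequence (read, compute, try to publish, retry on interference), while `fetch_add` is a single read-modify-write that a compiler targeting LSE may lower to one instruction.

```cpp
#include <atomic>
#include <cstdint>

// Retry-loop increment: read the value, compute the new one, try to publish
// it, and retry if another thread changed the counter in the meantime.
// Structurally this mirrors a load-exclusive/store-exclusive loop.
// (Hypothetical helper name, for illustration only.)
inline uint64_t increment_with_retry(std::atomic<uint64_t>& counter) {
  uint64_t old_value = counter.load(std::memory_order_relaxed);
  while (!counter.compare_exchange_weak(old_value, old_value + 1,
                                        std::memory_order_relaxed)) {
    // On failure compare_exchange_weak reloads old_value; just try again.
  }
  return old_value + 1;
}

// Single read-modify-write increment: the whole update is one atomic
// operation from the program's point of view.
// (Hypothetical helper name, for illustration only.)
inline uint64_t increment_single_rmw(std::atomic<uint64_t>& counter) {
  return counter.fetch_add(1, std::memory_order_relaxed) + 1;
}
```

Both functions return the incremented value, so they are interchangeable for callers; the difference is only in how much coherency traffic the hardware may need under contention.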
>
> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision:
>
> #ifdef LINUX for now.
Changes requested by simonis (Reviewer).
src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp line 30:
> 28:
> 29: #include "runtime/vm_version.hpp"
> 30: #include "atomic_aarch64.hpp"
This include should be sorted before `#include "runtime/vm_version.hpp"`.
src/hotspot/os_cpu/linux_aarch64/atomic_linux_aarch64.hpp line 52:
> 50: extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_4_impl;
> 51: extern aarch64_atomic_stub_t aarch64_atomic_cmpxchg_8_impl;
> 52:
I don't think you need to duplicate all these declarations here if you include `"atomic_aarch64.hpp"` which already declares all these types and variables.
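
To make the suggestion concrete, here is a minimal sketch of the pattern under discussion: a stub pointer that is declared in one place and repointed at startup once the CPU features are known. Everything here is an assumption for illustration. `atomic_stub_t`, `atomic_cmpxchg_4_impl`, `install_stubs`, and the two implementations are hypothetical stand-ins, the real `aarch64_atomic_stub_t` signature may differ, and the "LSE" variant below is a placeholder that reuses the same GCC/Clang builtin instead of real LSE assembly.

```cpp
#include <cstdint>

// Hypothetical stub type, loosely modelled on the aarch64_atomic_stub_t name
// in the review; the real signature may differ.
using atomic_stub_t = uint64_t (*)(volatile void* ptr,
                                   uint64_t arg1, uint64_t arg2);

// Default 4-byte compare-and-exchange. Returns the value that was in memory
// before the operation, whether or not the exchange happened.
static uint64_t default_cmpxchg_4(volatile void* ptr,
                                  uint64_t compare_value,
                                  uint64_t exchange_value) {
  uint32_t expected = static_cast<uint32_t>(compare_value);
  __atomic_compare_exchange_n(static_cast<volatile uint32_t*>(ptr), &expected,
                              static_cast<uint32_t>(exchange_value),
                              /*weak=*/false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;  // unchanged on success, refreshed on failure
}

// Placeholder for an LSE-based implementation. A real one would use LSE
// instructions; this stand-in only demonstrates the pointer swap.
static uint64_t lse_cmpxchg_4_stand_in(volatile void* ptr,
                                       uint64_t compare_value,
                                       uint64_t exchange_value) {
  return default_cmpxchg_4(ptr, compare_value, exchange_value);
}

// Single definition of the stub pointer. A shared header would carry only
// the matching `extern atomic_stub_t atomic_cmpxchg_4_impl;` declaration,
// so no other file needs to repeat it.
atomic_stub_t atomic_cmpxchg_4_impl = default_cmpxchg_4;

// Called once at startup after CPU feature detection (detection not shown).
void install_stubs(bool cpu_has_lse) {
  if (cpu_has_lse) {
    atomic_cmpxchg_4_impl = lse_cmpxchg_4_stand_in;
  }
}
```

With the `extern` declarations living only in the shared header, a file that includes it picks up both the type and the variables, which is why the extra copies in the platform header look redundant.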
-------------
PR: https://git.openjdk.java.net/jdk/pull/2434