RFC: linux-aarch64 and LSE support

Wed Sep 7 04:41:37 UTC 2022

I’m puzzled by this change:

https://bugs.openjdk.org/browse/JDK-8282322
8282322: AArch64: Provide a means to eliminate all STREX family of instructions
(2022-07-08, jdk20, no backports)

It’s a followup to these changes:

https://bugs.openjdk.org/browse/JDK-8261027
8261027: AArch64: Support for LSE atomics C++ HotSpot code
(2021-02-12, jdk17, backported to jdk11, but not jdk8)

8261649: AArch64: Optimize LSE atomics in C++ code
(2021-02-19, jdk17, backported to jdk11, but not jdk8)

[Also related is this
https://bugs.openjdk.org/browse/JDK-8261660
AArch64: Race condition in stub code generation for LSE Atomics
(2021-02-12, jdk17, superseded by 8261649)]

Together, these changes essentially reimplement gcc’s -moutline-atomics
option. The point of doing so is to get LSE atomics when building the jdk with
gcc versions that don't support -moutline-atomics, esp. for purposes of
backports.
(That option arrived with gcc8.5/gcc9.4/gcc10, enabled by default in gcc10. I
guess some non-Oracle folks might still be using something earlier, esp. in
the jdk17 timeframe when LSE support was being added.)

8282322 builds on the earlier two, adding more complexity. Its purpose is to
support a development activity (the use of rr for debugging), and requires
building the jdk with additional configure options (specifying an `-march=` or
`-mcpu` option that supports LSE, so the result of the build will *only* run
on such hardware).  As noted in its review, it doesn't fit the normal criteria
for backporting.

For jdk20+ (where 8282322 landed), I question whether the approach being taken
here really makes sense. If one is willing to assume a relatively recent
version of gcc is being used, then I think there is no reason to reimplement
the effect of -moutline-atomics. We could undo all three of those changes,
reverting back to using gcc __atomic intrinsics, and rely on -moutline-atomics
(explicitly requested for gcc8/9). In that case, nothing like 8282322 is
needed; just specify armv8.1-a or later when configuring the build (which is
already needed to activate the current 8282322 behavior) and LSE will be used
whether or not -moutline-atomics is enabled, or even supported (which matters
if you want to use rr with an old gcc version).
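
To make the proposal concrete, here is a minimal sketch of what operations
written directly on the gcc __atomic builtins could look like. The function
names are hypothetical, not HotSpot's actual Atomic API, and the sketch uses
__ATOMIC_SEQ_CST throughout, glossing over whatever stronger ordering HotSpot
requires for its conservative operations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper names for illustration; not HotSpot's Atomic API.

// Atomically adds add_value to *dest and returns the value held before.
inline uint64_t sketch_fetch_add(volatile uint64_t* dest, uint64_t add_value) {
  return __atomic_fetch_add(dest, add_value, __ATOMIC_SEQ_CST);
}

// Atomically replaces *dest with new_value if it equals expected.
// Returns the value observed at *dest, which equals expected exactly
// when the exchange happened.
inline uint64_t sketch_cmpxchg(volatile uint64_t* dest,
                               uint64_t expected,
                               uint64_t new_value) {
  // On failure the builtin writes the observed value into 'expected';
  // on success 'expected' already holds the old value.
  __atomic_compare_exchange_n(dest, &expected, new_value, /*weak=*/false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  return expected;
}

// Small usage example.
inline void sketch_usage() {
  volatile uint64_t v = 5;
  assert(sketch_fetch_add(&v, 3) == 5);    // v is now 8
  assert(sketch_cmpxchg(&v, 8, 10) == 8);  // succeeds, v is now 10
  assert(sketch_cmpxchg(&v, 8, 11) == 10); // fails, v stays 10
}
```

The source is the same in every configuration. Which instructions get used is
decided by the compiler options (-moutline-atomics, or an `-march=` / `-mcpu`
setting that includes LSE), not by hand-written stubs.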

That would be a lot simpler.  It also makes it easier to make further changes.

(Reading through the PR comments for 8282322, it looks like Andrew Haley
almost suggested doing this. But instead of really buying into it, he seemed
to be suggesting having a parallel implementation and some mechanism for
selecting which one to use. That's definitely not what I'm suggesting.)

In particular, I noticed all this because I'm working on 8293117: “Add atomic
bitset functions”, and I’m trying to figure out how I should implement them
for linux-aarch64. I’d rather not jump through the existing hoops when it
would be really easy to just use the appropriate gcc __atomic intrinsics, as
we used to do for other operations.
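
As a rough illustration of how easy that would be, the bitset operations could
be sketched on top of the gcc builtins as below. The names are hypothetical
and are not the API proposed in 8293117. The memory ordering is simply
__ATOMIC_SEQ_CST, which is an assumption, not a statement of what HotSpot
would need.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not the API from 8293117.

// Each function atomically updates *dest and returns the previous value.
inline uint64_t sketch_fetch_then_or(volatile uint64_t* dest, uint64_t bits) {
  return __atomic_fetch_or(dest, bits, __ATOMIC_SEQ_CST);
}

inline uint64_t sketch_fetch_then_and(volatile uint64_t* dest, uint64_t bits) {
  return __atomic_fetch_and(dest, bits, __ATOMIC_SEQ_CST);
}

inline uint64_t sketch_fetch_then_xor(volatile uint64_t* dest, uint64_t bits) {
  return __atomic_fetch_xor(dest, bits, __ATOMIC_SEQ_CST);
}

// Sets the given bit and reports whether it was already set.
inline bool sketch_test_and_set_bit(volatile uint64_t* word, unsigned bit) {
  assert(bit < 64);
  const uint64_t mask = uint64_t(1) << bit;
  return (sketch_fetch_then_or(word, mask) & mask) != 0;
}
```

When LSE is enabled at compile time, I'd expect gcc to turn these into single
LSE instructions rather than load/store-exclusive loops, but that is the
compiler's business rather than ours.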

The downside of going back to using the __atomic intrinsics rather than
continuing to roll our own is that using an old gcc to build a recent jdk
might get less than optimal performance on spiffy new hardware. I think that
cost is worth paying for a much simpler implementation.