RFC: linux-aarch64 and LSE support
Dmitry Chuyko
dmitry.chuyko at bell-sw.com
Wed Sep 7 09:30:00 UTC 2022
Hi Kim,
The implementation of the main JDK-8261027 change doesn't just try to
provide the functionality when built with an older GCC (or a non-GCC
compiler). It dynamically switches to a more advanced implementation if the
appropriate hardware capabilities are detected during VM startup. The
selected code is picked by the compiler. So there are two implementations,
and the non-LSE one is the default. This allows us to provide a single binary
for all supported ARM devices and get better performance where possible.
On the other hand, the less advanced implementation is always used initially
in the default configuration. As you noticed, JDK-8282322 just makes it
possible to create a build that uses the LSE variant from the start.
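
As a rough illustration of that scheme (not the actual HotSpot code), here is
a minimal C++ sketch. Every name in it (`fetch_add_impl`, `select_atomic_impl`,
`cpu_has_lse`, and so on) is hypothetical, and the GCC/Clang `__atomic`
builtins merely stand in for the two hand-written variants. It assumes the
compiler defines `__ARM_FEATURE_ATOMICS` when the build targets LSE-capable
hardware, and that Linux reports LSE through `HWCAP_ATOMICS`.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP
#include <asm/hwcap.h>  // HWCAP_ATOMICS
#endif

// All names below are hypothetical; they are not HotSpot's actual symbols.
using fetch_add_fn = uint64_t (*)(volatile uint64_t*, uint64_t);

// Baseline variant: a compare-and-swap retry loop, standing in for the
// non-LSE code path.
uint64_t fetch_add_baseline(volatile uint64_t* p, uint64_t v) {
  uint64_t old = __atomic_load_n(p, __ATOMIC_RELAXED);
  // On failure the builtin stores the current value into `old`, so we retry.
  while (!__atomic_compare_exchange_n(p, &old, old + v, /*weak=*/false,
                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED)) {
  }
  return old;
}

// "Advanced" variant: a single fetch-and-add, standing in for the LSE path.
uint64_t fetch_add_advanced(volatile uint64_t* p, uint64_t v) {
  return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

// Default configuration starts with the baseline variant. A build that
// targets LSE hardware (assumption: signalled by __ARM_FEATURE_ATOMICS)
// starts with the advanced variant instead.
#if defined(__ARM_FEATURE_ATOMICS)
static fetch_add_fn fetch_add_impl = fetch_add_advanced;
#else
static fetch_add_fn fetch_add_impl = fetch_add_baseline;
#endif

// Runtime capability check; reports false on anything but linux-aarch64.
bool cpu_has_lse() {
#if defined(__linux__) && defined(__aarch64__)
  return (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
  return false;
#endif
}

// Called once during startup: upgrade to the advanced variant if possible.
void select_atomic_impl(bool has_lse) {
  if (has_lse) {
    fetch_add_impl = fetch_add_advanced;
  }
  assert(fetch_add_impl != nullptr);
}

// Callers always go through the currently selected implementation.
uint64_t atomic_fetch_add(volatile uint64_t* p, uint64_t v) {
  return fetch_add_impl(p, v);
}
```

In this sketch, startup code would call `select_atomic_impl(cpu_has_lse())`
once; until then, callers of `atomic_fetch_add` get whichever variant the
build selected as the initial one.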
-Dmitry
On Wed, Sep 7, 2022 at 7:41 AM Kim Barrett <kim.barrett at oracle.com> wrote:
> I’m puzzled by this change:
>
> https://bugs.openjdk.org/browse/JDK-8282322
> 8282322: AArch64: Provide a means to eliminate all STREX family of
> instructions
> (2022-07-08, jdk20, no backports)
>
> It’s a followup to these changes:
>
> https://bugs.openjdk.org/browse/JDK-8261027
> 8261027: AArch64: Support for LSE atomics C++ HotSpot code
> (2021-02-12, jdk17, backported to jdk11, but not jdk8)
>
> 8261649: AArch64: Optimize LSE atomics in C++ code
> (2021-02-19, jdk17, backported to jdk11, but not jdk8)
>
> [Also related is this
> https://bugs.openjdk.org/browse/JDK-8261660
> AArch64: Race condition in stub code generation for LSE Atomics
> (2021-02-12, jdk17, superseded by 8261649)]
>
> which are essentially reimplementing gcc’s -moutline-atomics option. The
> point of doing this is to allow those changes to be used while building jdk
> with gcc versions that don't support -moutline-atomics, esp. for purposes
> of backports. (That option arrived with gcc8.5/gcc9.4/gcc10, enabled by
> default in gcc10. I guess some non-Oracle folks might still be using
> something earlier, esp. in the jdk17 timeframe when LSE support was being
> added.)
>
> 8282322 builds on the earlier two, adding more complexity. Its purpose is
> to support a development activity (the use of rr for debugging), and
> requires building the jdk with additional configure options (specifying an
> `-march=` or `-mcpu` option that supports LSE, so the result of the build
> will *only* run on such hardware). As noted in its review, it doesn't fit
> the normal criteria for backporting.
>
> For jdk20+ (where 8282322 landed), I question whether the approach being
> taken here really makes sense. If one is willing to assume a relatively
> recent version of gcc is being used, then I think there is no reason to
> reimplement the effect of -moutline-atomics. We could undo all three of
> those changes, reverting back to using gcc __atomic intrinsics, and rely on
> -moutline-atomics (explicitly requested for gcc8/9). In that case, nothing
> like 8282322 is needed; just specify armv8.1-a or later when configuring
> the build (which is already needed to activate the current 8282322
> behavior) and LSE will be used, regardless of -moutline-atomics (or if it
> is even supported, if you want to use rr with an old gcc version).
>
> That would be a lot simpler. It also makes it easier to make further
> changes.
>
> (Reading through the PR comments for 8282322, it looks like Andrew Haley
> almost suggested doing this. But instead of really buying into it, he
> seemed to be suggesting having a parallel implementation and some mechanism
> for selecting which one to use. That's definitely not what I'm suggesting.)
>
> In particular, I noticed all this because I'm working on 8293117: “Add
> atomic bitset functions”, and I’m trying to figure out how I should
> implement them for linux-aarch64. I’d rather not jump through the existing
> hoops when it would be really easy to just use the appropriate gcc __atomic
> intrinsics, as we used to do for other operations.
>
> The downside of going back to using the __atomic intrinsics rather than
> continuing to roll our own is that using an old gcc to build a recent jdk
> might get less than optimal performance on spiffy new hardware. I think
> that's worth the benefit of a much simpler implementation.
>
>