RFR: 8292587: AArch64: Support SVE fabd instruction
Fei Gao
fgao at openjdk.org
Wed Sep 7 07:46:39 UTC 2022
On Thu, 25 Aug 2022 01:52:41 GMT, Hao Sun <haosun at openjdk.org> wrote:
> Scalar and NEON fabd instructions were initially supported in
> JDK-8256318. In this patch, we support the SVE fabd instruction [1] and
> add one Jtreg test case as well.
>
> With this patch, the two instructions `fsub` and `fabs` are combined
> into a single `fabd` instruction.
>
>
> fsub z16.s, z16.s, z17.s
> fabs z16.s, p7/m, z16.s
>
> -->
>
> fabd z16.s, p7/m, z16.s, z17.s
>
>
> In the initial evaluation with the JMH case
> FloatingScalarVectorAbsDiff.java, we found that the performance uplift
> from this optimization was easily hidden by the heavy memory load/store
> instructions. To avoid that, we updated the JMH case slightly, adding
> one more group of subtraction and Math.abs operations to the loop body.
>
> Here is the data with the new JMH case on one 256-bit SVE machine. We
> can observe about 38% and 35% improvements for the two functions
> respectively.
>
>
> Benchmark                                            Before   After    Units
> FloatingScalarVectorAbsDiff.testVectorAbsDiffDouble  260.468  160.965  ns/op
> FloatingScalarVectorAbsDiff.testVectorAbsDiffFloat   133.963   87.292  ns/op
>
>
> Jtreg testing: tier1~3 passed on one NEON-only machine and one 256-bit SVE machine.
>
> [1] https://developer.arm.com/documentation/ddi0596/2021-12/SVE-Instructions/FABD--Floating-point-absolute-difference--predicated--
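
For reference, the source-level shape behind the quoted `fsub + fabs` pair is a subtraction followed by `Math.abs`. The sketch below is only an illustration: the class and method names are hypothetical and not taken from the patch or the JMH case, and whether C2 actually emits `fabd` for such a loop depends on the hardware and JIT decisions.

```java
import java.util.Arrays;

// Illustrative only: class and method names are hypothetical, not from the patch.
class AbsDiffKernel {
    // Element-wise |a[i] - b[i]|: a subtraction followed by Math.abs,
    // i.e. the fsub + fabs shape that the quoted description says is
    // combined into fabd.
    static void absDiff(float[] a, float[] b, float[] c) {
        for (int i = 0; i < c.length; i++) {
            c[i] = Math.abs(a[i] - b[i]);
        }
    }

    public static void main(String[] args) {
        float[] a = {1.0f, 2.5f, -3.0f, 0.0f};
        float[] b = {4.0f, 2.5f, 1.0f, -0.5f};
        float[] c = new float[a.length];
        absDiff(a, b, c);
        System.out.println(Arrays.toString(c)); // prints [3.0, 0.0, 4.0, 0.5]
    }
}
```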
test/hotspot/jtreg/compiler/vectorapi/VectorAbsDiffTest.java line 97:
> 95: public static void testFloatAbsDiff_runner() {
> 96: testFloatAbsDiff();
> 97: for (int i = 0; i < F_SPECIES.length(); i++) {
I suppose it should be `for (int i = 0; i < LENGTH; i++) {` here. You can check all similar code lines in the following functions for verification.
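
A minimal sketch of why the bound matters, assuming the arrays hold `LENGTH` elements and the kernel writes all of them. Everything here is a hypothetical stand-in for the test (the class name, `SPECIES_LENGTH`, the helper methods, and the injected error), written in plain Java rather than with the Vector API. A check bounded by the lane count only covers the first vector's worth of results, so an error further along the array goes unnoticed.

```java
// Hypothetical stand-in for the test under review; names and sizes are assumptions.
class AbsDiffCheck {
    static final int LENGTH = 1024;      // assumed array length
    static final int SPECIES_LENGTH = 8; // assumed lane count (e.g. 256-bit float vectors)

    // Builds the array {0, scale, 2 * scale, ...} with n elements.
    static float[] ramp(int n, float scale) {
        float[] r = new float[n];
        for (int i = 0; i < n; i++) {
            r[i] = i * scale;
        }
        return r;
    }

    // Computes |a[i] - b[i]| for every element; if badIndex >= 0, one wrong
    // result is injected there to model a miscompiled lane.
    static float[] compute(float[] a, float[] b, int badIndex) {
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = Math.abs(a[i] - b[i]);
        }
        if (badIndex >= 0) {
            c[badIndex] = -1.0f; // an absolute difference can never be negative
        }
        return c;
    }

    // Returns the first index in [0, bound) where c disagrees with the
    // expected value, or -1 if every checked element matches.
    static int firstMismatch(float[] a, float[] b, float[] c, int bound) {
        for (int i = 0; i < bound; i++) {
            if (c[i] != Math.abs(a[i] - b[i])) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        float[] a = ramp(LENGTH, 0.5f);
        float[] b = ramp(LENGTH, -1.0f);
        float[] c = compute(a, b, 100); // error injected beyond the first vector

        // Bounded by the lane count: the error at index 100 is never inspected.
        System.out.println(firstMismatch(a, b, c, SPECIES_LENGTH)); // prints -1
        // Bounded by the array length: the error is found.
        System.out.println(firstMismatch(a, b, c, LENGTH));         // prints 100
    }
}
```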
-------------
PR: https://git.openjdk.org/jdk/pull/10011