RFR: 8292587: AArch64: Support SVE fabd instruction

Hao Sun haosun at openjdk.org
Thu Aug 25 01:59:56 UTC 2022


Scalar and NEON fabd instructions were initially supported in
JDK-8256318. This patch adds support for the SVE fabd instruction [1],
along with one Jtreg test case.

With this patch, the two instructions `fsub + fabs` are combined into
a single `fabd` instruction.


  fsub    z16.s, z16.s, z17.s
  fabs    z16.s, p7/m, z16.s

  -->

  fabd    z16.s, p7/m, z16.s, z17.s
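For illustration, a Java loop of the following shape is the kind of source
that can yield the `fsub + fabs` pair once the loop is auto-vectorized. This
is only a minimal sketch: the class and method names are hypothetical and are
not taken from the patch or its Jtreg test.

```java
public class AbsDiffSketch {
    // Hypothetical helper: element-wise |a[i] - b[i]|.
    // Assumes a and b have the same length.
    static float[] absDiff(float[] a, float[] b) {
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            // Subtraction followed by Math.abs: the pattern that maps to
            // fsub + fabs, and to a single fabd with this patch.
            c[i] = Math.abs(a[i] - b[i]);
        }
        return c;
    }

    public static void main(String[] args) {
        float[] c = absDiff(new float[] {1.0f, 5.5f}, new float[] {3.0f, 2.0f});
        System.out.println(c[0] + " " + c[1]);
    }
}
```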


In the initial evaluation of the JMH case, i.e.
FloatingScalarVectorAbsDiff.java, we found that the performance uplift
from this optimization was easily hidden by the heavy memory load/store
instructions. To avoid that, we updated the JMH case slightly, adding one
more group of subtraction and Math.abs operations to the loop body.
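As a rough sketch of the idea (not the actual benchmark code), a loop body
with two groups of subtraction and Math.abs might look like the following.
The class name, method name, and the choice of operands for the second group
are assumptions made for illustration; the point is only that the arithmetic
per load/store pair increases.

```java
public class AbsDiffTwoGroupsSketch {
    // Hypothetical kernel: two abs-diff groups per iteration, still with
    // two loads and one store. Assumes all arrays have the same length.
    static void absDiffTwice(float[] a, float[] b, float[] c) {
        for (int i = 0; i < a.length; i++) {
            float first = Math.abs(a[i] - b[i]);   // first group
            c[i] = Math.abs(first - a[i]);         // second group (assumed form)
        }
    }

    public static void main(String[] args) {
        float[] a = {1.0f, 5.0f};
        float[] b = {3.0f, 2.0f};
        float[] c = new float[a.length];
        absDiffTwice(a, b, c);
        System.out.println(c[0] + " " + c[1]);
    }
}
```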

The data below were collected with the new JMH case on one 256-bit SVE
machine. Time per operation (lower is better) drops by about 38% and 35%
for the two functions respectively.


Benchmark                                             Before    After  Units
FloatingScalarVectorAbsDiff.testVectorAbsDiffDouble  260.468  160.965  ns/op
FloatingScalarVectorAbsDiff.testVectorAbsDiffFloat   133.963   87.292  ns/op


Jtreg testing: tier1~3 passed on one NEON-only machine and one 256-bit SVE machine.

[1] https://developer.arm.com/documentation/ddi0596/2021-12/SVE-Instructions/FABD--Floating-point-absolute-difference--predicated--

-------------

Commit messages:
 - 8292587: AArch64: Support SVE fabd instruction

Changes: https://git.openjdk.org/jdk/pull/10011/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=10011&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8292587
  Stats: 277 lines in 7 files changed: 234 ins; 0 del; 43 mod
  Patch: https://git.openjdk.org/jdk/pull/10011.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/10011/head:pull/10011

PR: https://git.openjdk.org/jdk/pull/10011

