RFR: 8282875: AArch64: [vectorapi] Optimize Vector.reduceLane for SVE 64/128 vector size

Eric Liu eliu at openjdk.java.net
Mon Mar 28 16:47:08 UTC 2022


This patch speeds up add/mul/min/max reductions on SVE for 64-bit and
128-bit vector sizes.

According to the Neoverse N2/V1 software optimization guides [1][2],
NEON instructions are preferable to SVE instructions for reduction
operations on 128-bit vectors. This patch adds rules that distinguish
the 64-bit and 128-bit vector sizes from the others, so that these two
special cases generate the same code as NEON. For example, for
ByteVector.SPECIES_128, "ByteVector.reduceLanes(VectorOperators.ADD)"
generates the code below:


        Before:
        uaddv   d17, p0, z16.b
        smov    x15, v17.b[0]
        add     w15, w14, w15, sxtb

        After:
        addv    b17, v16.16b
        smov    x12, v17.b[0]
        add     w12, w12, w16, sxtb
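
For reference, the Java-level result that both sequences must produce
can be modelled in plain scalar code. The sketch below is only an
illustration and not code from the patch: the class and method names are
hypothetical, and a real caller would use ByteVector.reduceLanes from
the incubating jdk.incubator.vector module instead.

```java
public class ByteAddReductionModel {
    // Hypothetical scalar model of an ADD reduction over the 16 byte
    // lanes of a 128-bit vector: lanes are summed with 8-bit wrap-around.
    static byte reduceAdd(byte[] lanes) {
        if (lanes.length != 16) {
            throw new IllegalArgumentException("expected 16 byte lanes");
        }
        byte sum = 0;
        for (byte lane : lanes) {
            sum = (byte) (sum + lane); // wraps modulo 2^8
        }
        return sum;
    }

    // The byte result is sign-extended before being added to an int
    // accumulator, which is the role of the smov / "sxtb" steps.
    static int accumulate(int acc, byte[] lanes) {
        return acc + reduceAdd(lanes);
    }
}
```

Both the SVE and the NEON sequence have to agree with this model; only
the instructions used to compute the lane sum differ.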

SVE has no multiply reduction instruction, so for 128-bit vector size
this patch generates code for MulReductionVL using scalar instructions.
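
As a rough sketch of the idea (hypothetical names, not code from the
patch): a 128-bit vector holds only two long lanes, so the reduction
amounts to moving each lane into a general-purpose register and
multiplying. The sketch assumes the reduction also folds in a scalar
input, as C2 reduction nodes generally do.

```java
public class MulReductionModel {
    // Hypothetical scalar model of a long multiply reduction over a
    // 128-bit vector (two 64-bit lanes) combined with a scalar input.
    static long mulReduce2L(long scalarIn, long[] lanes) {
        if (lanes.length != 2) {
            throw new IllegalArgumentException("expected 2 long lanes");
        }
        long lane0 = lanes[0]; // lane 0 extracted to a scalar register
        long lane1 = lanes[1]; // lane 1 extracted to a scalar register
        return scalarIn * lane0 * lane1; // two scalar multiplies, wrapping
    }
}
```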

With this patch, all of these reductions show a performance gain on the
corresponding vector micro-benchmarks on my SVE test system.

[1] https://developer.arm.com/documentation/pjdoc466751330-9685/latest/
[2] https://developer.arm.com/documentation/PJDOC-466751330-18256/0001

Change-Id: I4bef0b3eb6ad1bac582e4236aef19787ccbd9b1c

-------------

Commit messages:
 - 8282875: AArch64: [vectorapi] Optimize Vector.reduceLane for SVE 64/128 vector size

Changes: https://git.openjdk.java.net/jdk/pull/7999/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7999&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282875
  Stats: 1740 lines in 6 files changed: 770 ins; 671 del; 299 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7999.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7999/head:pull/7999

PR: https://git.openjdk.java.net/jdk/pull/7999

