RFR: 8282431: AArch64: Add optimized rules for masked vector multiply-add/sub for SVE

Xiaohong Gong xgong at openjdk.java.net
Tue Mar 8 07:54:44 UTC 2022


We already have optimized match rules for the vector operations `fmls`, `fnmla`, `fnmls`, `mla` and `mls` on ARM SVE. Similarly, we can add rules for the corresponding masked operations to generate optimized predicated instructions.

With this patch the following masked vector multiply-add for a byte vector:

  mul z18.b, p7/m, z18.b, z17.b
  add z17.b, p0/m, z17.b, z18.b

could be optimized to: 

  mla z19.b, p0/m, z18.b, z17.b
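
As a rough illustration of why the two sequences are interchangeable, the following scalar Java model compares an unmasked multiply followed by a masked add against a single merging multiply-accumulate. It is only a sketch of the lane-wise semantics: the class and method names are hypothetical, the register assignment of the listings above is not modeled, and it assumes that inactive lanes keep the accumulator value.

```java
import java.util.Arrays;

// Hypothetical scalar model of the lane-wise semantics; not code from the patch.
public class MaskedMlaModel {

    // Two-instruction form: multiply every lane, then add only where the mask is set.
    // Inactive lanes keep the accumulator value (merging predication is assumed).
    static byte[] mulThenMaskedAdd(byte[] acc, byte[] a, byte[] b, boolean[] mask) {
        byte[] out = new byte[acc.length];
        for (int i = 0; i < acc.length; i++) {
            byte prod = (byte) (a[i] * b[i]);                 // unmasked multiply
            out[i] = mask[i] ? (byte) (acc[i] + prod) : acc[i]; // masked add
        }
        return out;
    }

    // Single-instruction form: predicated multiply-accumulate on active lanes only.
    static byte[] maskedMla(byte[] acc, byte[] a, byte[] b, boolean[] mask) {
        byte[] out = acc.clone();
        for (int i = 0; i < out.length; i++) {
            if (mask[i]) {
                out[i] = (byte) (out[i] + a[i] * b[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] acc = {1, 2, 3, 4};
        byte[] a = {10, 20, 30, 40};
        byte[] b = {2, 3, 4, 5};
        boolean[] mask = {true, false, true, false};
        // Both forms are expected to agree lane by lane.
        System.out.println(Arrays.equals(mulThenMaskedAdd(acc, a, b, mask),
                                         maskedMla(acc, a, b, mask)));
    }
}
```

Because byte arithmetic wraps modulo 256, truncating the product before or after the addition gives the same lane value, which is why the fused form can replace the pair.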

The same applies to the multiply-sub operations. Also, the following masked fused multiply-subtract for a float vector:

  fneg    z18.s, p7/m, z18.s
  fmad    z17.s, p0/m, z18.s, z16.s

could be optimized to: 

  fmsb z17.s, p0/m, z18.s, z16.s
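
The equivalence can be sketched with a scalar Java model built on `Math.fma`. This is an illustration under stated assumptions, not code from the patch: the class and method names are hypothetical, and inactive lanes are assumed to keep the first multiplicand, which is the destination register in the listings above.

```java
import java.util.Arrays;

// Hypothetical scalar model of the lane-wise semantics; not code from the patch.
public class MaskedFmsbModel {

    // Two-instruction form: negate one multiplicand in every lane, then do a
    // masked fused multiply-add. Inactive lanes keep a[i] (assumed merging behavior).
    static float[] negThenMaskedFma(float[] a, float[] b, float[] c, boolean[] mask) {
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            float negB = -b[i];                                   // unmasked negate
            out[i] = mask[i] ? Math.fma(a[i], negB, c[i]) : a[i]; // masked fma
        }
        return out;
    }

    // Single-instruction form: c - a * b with one rounding on active lanes.
    static float[] maskedFmsb(float[] a, float[] b, float[] c, boolean[] mask) {
        float[] out = a.clone();
        for (int i = 0; i < out.length; i++) {
            if (mask[i]) {
                out[i] = Math.fma(-a[i], b[i], c[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[] a = {1.5f, 2.0f};
        float[] b = {2.0f, 3.0f};
        float[] c = {10.0f, 10.0f};
        boolean[] mask = {true, false};
        // Both forms are expected to agree lane by lane.
        System.out.println(Arrays.equals(negThenMaskedFma(a, b, c, mask),
                                         maskedFmsb(a, b, c, mask)));
    }
}
```

Floating-point negation is exact, so negating either multiplicand before the fused operation yields the same result as the fused multiply-subtract.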

The same applies to the corresponding negated fused operations.

This patch also fixes potential issues with the use of `NegVF/D` in match rules. An explicit check for a non-predicated vector must be added to the match rule predicate whenever the rule assumes that `NegVF/D` is non-masked. Otherwise, the JVM might crash if a masked `NegVF/D` with two operands is matched by a rule which assumes that the `NegVF/D` in its subtree has one operand.

This patch also adds jtreg tests for all the touched vector operations.

-------------

Commit messages:
 - 8282431: AArch64: Add optimized rules for masked vector multiply-add/sub for SVE

Changes: https://git.openjdk.java.net/jdk/pull/7737/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7737&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8282431
  Stats: 971 lines in 6 files changed: 863 ins; 0 del; 108 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7737.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7737/head:pull/7737

PR: https://git.openjdk.java.net/jdk/pull/7737


More information about the hotspot-compiler-dev mailing list