RFR: 8274179: AArch64: Support SVE operations with encodable immediates [v5]

Wed Nov 17 03:53:58 UTC 2021

> for (int i = 0; i < LENGTH; i++) {
>   c[i] = a[i] + 2;
> }
> 
> For the case shown above, after superword optimization with SVE and
> without the patch, the vector add operation always takes two z-register
> inputs, for example:
> mov     z16.s, #2
> add     z17.s, z17.s, z16.s
> 
> Since SVE supports basic binary operations with an immediate operand,
> this pattern can be further optimized to:
> add     z16.s, z16.s, #2
> 
> To implement this, we added new match rules and assembler rules to
> the AArch64 backend. We also extended the immediate types and their
> helper functions in a way that maintains backward compatibility.
> 
> With the patch, the optimization covers only the following integer
> vector binary operations with an immediate operand: + (add), - (sub),
> & (and), | (orr), and ^ (eor). Other vector operations are not yet
> supported.
> 
> Tested tier1 and test/hotspot/jtreg/compiler on an SVE-capable AArch64
> CPU, with no new failures.
> 
> There is no obvious performance uplift, but the patch can remove one
> redundant mov instruction.
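
For illustration, the five operation kinds listed above can be written as
plain Java loops of the same shape as the example. This is a minimal sketch,
not code from the patch: the class name, method names, and constants are
hypothetical, and whether a given loop is actually vectorized with an
immediate form depends on the JIT and the hardware. The helper
fitsAddSubImm is likewise an assumption: it mirrors the SVE ADD/SUB
(immediate) encoding, an unsigned 8-bit value optionally shifted left by 8,
and is not claimed to be the check the patch implements. Logical (and, orr,
eor) immediates follow different bitmask rules that are not modeled here.

```java
public class SveImmediateKernels {
    // Each loop applies one binary integer operation with a constant operand,
    // matching the shape c[i] = a[i] op imm from the description.
    static void addImm(int[] a, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + 2;
        }
    }

    static void subImm(int[] a, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] - 2;
        }
    }

    static void andImm(int[] a, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] & 7;
        }
    }

    static void orImm(int[] a, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] | 1;
        }
    }

    static void xorImm(int[] a, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] ^ 0xff;
        }
    }

    // Hypothetical helper (assumption, not taken from the patch): true when v
    // is an unsigned 8-bit value, or an unsigned 8-bit value shifted left by 8.
    static boolean fitsAddSubImm(int v) {
        if (v < 0) {
            return false;
        }
        if (v <= 0xff) {
            return true;
        }
        return (v & 0xff) == 0 && (v >>> 8) <= 0xff;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 255};
        int[] c = new int[a.length];
        addImm(a, c);
        System.out.println(java.util.Arrays.toString(c));
        System.out.println(fitsAddSubImm(2));
        System.out.println(fitsAddSubImm(257));
    }
}
```

Under the assumed rule, a constant such as 257 would not fit the immediate
form, so a loop using it would presumably keep the separate mov and the
two-register add.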

Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits:

 - Regenerate the asmtest.out.h file for aarch64 after rebasing

   Change-Id: I1292449268c73c8f84cc3ffa7a4c859cf79058eb
 - Merge branch 'master' of github.com:fg1417/jdk into fg1417-20211026

   Change-Id: I2004dc45f7f0ab44bc22b48083b185e7b3bd5eea
 - Add some assertion lines for helper functions

   Change-Id: Ic9120902bd8f8a8ead2e3740435a40f35d21757c
 - Split the original patch and leave the existing logic in Assembler entirely untouched

   Change-Id: If8ddcef07b15615d7dd0c3063c44d2b705fac6f7
 - Merge branch 'master' of github.com:fg1417/jdk into fg1417-20211026

   Change-Id: I52aa66d200b74ac312c5d40283b94854bc1142e6
 - 8274179: AArch64: Support SVE operations with encodable immediates

   Change-Id: Iaec40e362918118691083fb171cc4dff390b35a2

-------------

Changes: https://git.openjdk.java.net/jdk/pull/6115/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=6115&range=04
  Stats: 1476 lines in 12 files changed: 1329 ins; 43 del; 104 mod
  Patch: https://git.openjdk.java.net/jdk/pull/6115.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/6115/head:pull/6115

PR: https://git.openjdk.java.net/jdk/pull/6115