RFR: 8297753: AArch64: Add optimized rules for vector compare with zero on NEON [v7]

Chang Peng duke at openjdk.org
Fri Mar 3 02:24:51 UTC 2023


> We can use the compare-with-zero instructions, such as cmgt(zero)[1], directly to avoid the extra scalar2vector operations.
> 
> The following instruction sequence
> 
> movi  v16.4s, #0x0
> cmgt  v16.4s, v17.4s, v16.4s
> 
> can be optimized to:
> 
> cmgt v16.4s, v17.4s, #0x0
> 
> This patch does the following:
> 1. Add NEON floating-point compare-with-zero instructions.
> 2. Add optimized match rules to generate the compare-with-zero instructions.
> 
> [1]: https://developer.arm.com/documentation/ddi0602/2022-06/SIMD-FP-Instructions/CMGT--zero---Compare-signed-Greater-than-zero--vector--
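
As a rough illustration of why the two instruction sequences quoted above are interchangeable, the following minimal Java sketch models the lane-wise behavior of cmgt on four signed 32-bit lanes: each result lane is set to all ones (-1) when the comparison holds and to 0 otherwise. The class and method names (CmgtZeroSketch, cmgtRegister, cmgtZero) are hypothetical and not part of the patch; this is a scalar model of the instruction semantics, not HotSpot code.

```java
import java.util.Arrays;

public class CmgtZeroSketch {
    // Models "cmgt vd.4s, vn.4s, vm.4s": a lane is all ones (-1) if n > m (signed), else 0.
    public static int[] cmgtRegister(int[] n, int[] m) {
        int[] d = new int[n.length];
        for (int i = 0; i < n.length; i++) {
            d[i] = (n[i] > m[i]) ? -1 : 0;
        }
        return d;
    }

    // Models "cmgt vd.4s, vn.4s, #0x0": the same comparison against an implicit zero operand.
    public static int[] cmgtZero(int[] n) {
        int[] d = new int[n.length];
        for (int i = 0; i < n.length; i++) {
            d[i] = (n[i] > 0) ? -1 : 0;
        }
        return d;
    }

    public static void main(String[] args) {
        int[] v17 = {5, -3, 0, Integer.MIN_VALUE};
        int[] zero = new int[4];                  // stands in for "movi v16.4s, #0x0"
        int[] twoInsn = cmgtRegister(v17, zero);  // two-instruction sequence
        int[] oneInsn = cmgtZero(v17);            // single compare-with-zero instruction
        System.out.println(Arrays.toString(twoInsn));
        System.out.println(Arrays.equals(twoInsn, oneInsn));
    }
}
```

Running main prints [-1, 0, 0, 0] followed by true: both forms produce the same mask, so the zero form only saves the instruction and the register needed to materialize the zero vector.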

Chang Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits:

 - Merge branch 'openjdk:master' into add_cmp0_neon
 - Remove some switch-case stmts in c2_MacroAssembler_aarch64.cpp and avoid
   unsigned comparison.
 - Revert "Remove switch-case stmts in c2_MacroAssembler_aarch64.cpp"
   
   This reverts commit d899238d0cb98fdf375b3011670495c3bfe8bbaf.
 - Merge branch 'openjdk:master' into add_cmp0_neon
 - Remove switch-case stmts in c2_MacroAssembler_aarch64.cpp
 - Merge fcm<cc> instruction encoding functions into a single function.
 - Merge branch 'openjdk:master' into add_cmp0_neon
 - Resolving the merge conflicts caused by test/hotspot/gtest/aarch64/asmtest.out.h
   
   Change-Id: I896b879c8b7097a99e35fc1e53abab646240281a
 - 8297753: AArch64: Add optimized rules for vector compare with zero on NEON
   
   We can use the compare-with-zero instructions, such as cmgt(zero)[1],
   directly to avoid the extra scalar2vector operations.
   
   The following instruction sequence
   ```
   movi  v16.4s, #0x0
   cmgt  v16.4s, v17.4s, v16.4s
   ```
   can be optimized to:
   ```
   cmgt v16.4s, v17.4s, #0x0
   ```
   This patch does the following:
   1. Add NEON floating-point compare-with-zero instructions.
   2. Add optimized match rules to generate the compare-with-zero
   instructions.
   
   [1]: https://developer.arm.com/documentation/ddi0602/2022-06/SIMD-FP-Instructions/CMGT--zero---Compare-signed-Greater-than-zero--vector--
   
   Change-Id: If026b477a0cad809bd201feafbfc9ab301a1b569

-------------

Changes: https://git.openjdk.org/jdk/pull/11822/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11822&range=06
  Stats: 1033 lines in 10 files changed: 535 ins; 0 del; 498 mod
  Patch: https://git.openjdk.org/jdk/pull/11822.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/11822/head:pull/11822

PR: https://git.openjdk.org/jdk/pull/11822

