RFR: 8297753: AArch64: Add optimized rules for vector compare with zero on NEON [v4]

Andrew Haley aph at openjdk.org
Wed Feb 8 13:20:50 UTC 2023


On Wed, 8 Feb 2023 07:44:11 GMT, Chang Peng <duke at openjdk.org> wrote:

>> We can use compare-with-zero instructions such as cmgt(zero)[1] directly, which avoids the extra scalar2vector operation.
>> 
>> The following instruction sequence
>> 
>> movi  v16.4s, #0x0
>> cmgt  v16.4s, v17.4s, v16.4s
>> 
>> can be optimized to:
>> 
>> cmgt v16.4s, v17.4s, #0x0
>> 
>> This patch does the following:
>> 1. Add NEON floating-point compare-with-zero instructions.
>> 2. Add optimized match rules to generate the compare-with-zero instructions.
>> 
>> [1]: https://developer.arm.com/documentation/ddi0602/2022-06/SIMD-FP-Instructions/CMGT--zero---Compare-signed-Greater-than-zero--vector--
>
> Chang Peng has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Merge fcm<cc> instruction encoding functions into a single function.

src/hotspot/cpu/aarch64/c2_MacroAssembler_aarch64.cpp line 976:

> 974:       case BoolTest::gt: fcm(Assembler::GT, dst, size, src); break;
> 975:       case BoolTest::le: fcm(Assembler::LE, dst, size, src); break;
> 976:       case BoolTest::lt: fcm(Assembler::LT, dst, size, src); break;

The way out of these endless switch statements is a function that maps a `BoolTest` condition to an `Assembler::Condition`.

Such a function is `cmpOpOper(BoolTest::overflow).ccode()`.

Please use it everywhere a `BoolTest` needs to be converted to a `Condition`.
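
To illustrate the shape of this, here is a minimal standalone sketch. It does not use the real HotSpot types: the `BoolTest` and `Condition` enums, `to_condition`, `emit_fp_compare_with_zero`, and the one-argument `fcm` stub are all hypothetical stand-ins, and in the actual patch the conversion would go through `cmpOpOper` rather than a hand-written table.

```cpp
#include <cassert>

// Hypothetical stand-ins for BoolTest::mask and Assembler::Condition.
// The real HotSpot enums have different values and more members.
enum class BoolTest { eq, ne, lt, le, gt, ge };
enum class Condition { EQ, NE, LT, LE, GT, GE };

// The single place that knows how a BoolTest maps to a Condition.
inline Condition to_condition(BoolTest cond) {
  switch (cond) {
    case BoolTest::eq: return Condition::EQ;
    case BoolTest::ne: return Condition::NE;
    case BoolTest::lt: return Condition::LT;
    case BoolTest::le: return Condition::LE;
    case BoolTest::gt: return Condition::GT;
    case BoolTest::ge: return Condition::GE;
  }
  assert(false && "unexpected BoolTest value");
  return Condition::EQ;
}

// Stub standing in for an emitter such as fcm(); it only reports
// the condition it was asked to encode.
inline Condition fcm(Condition cond) { return cond; }

// Call site: one line replaces the per-case switch.
inline Condition emit_fp_compare_with_zero(BoolTest cond) {
  return fcm(to_condition(cond));
}
```

With a conversion like this in one place, each call site shrinks to a single call, and adding a condition means touching only the mapping.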

-------------

PR: https://git.openjdk.org/jdk/pull/11822


More information about the hotspot-compiler-dev mailing list