RFR: 8297753: AArch64: Add optimized rules for vector compare with zero on NEON [v2]
Chang Peng
duke at openjdk.org
Sun Jan 29 03:02:20 UTC 2023
On Sat, 28 Jan 2023 10:29:31 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> Chang Peng has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
>>
>> - Resolving the merge conflicts caused by test/hotspot/gtest/aarch64/asmtest.out.h
>>
>> Change-Id: I896b879c8b7097a99e35fc1e53abab646240281a
>> - 8297753: AArch64: Add optimized rules for vector compare with zero on NEON
>>
>> We can use the compare-with-zero instructions like cmgt(zero)[1]
>> immediately to avoid the extra scalar2vector operations.
>>
>> The following instruction sequence
>> ```
>> movi v16.4s, #0x0
>> cmgt v16.4s, v17.4s, v16.4s
>> ```
>> can be optimized to:
>> ```
>> cmgt v16.4s, v17.4s, #0x0
>> ```
>> This patch does the following:
>> 1. Add NEON floating-point compare-with-zero instructions.
>> 2. Add optimized match rules to generate the compare-with-zero
>> instructions.
>>
>> [1]: https://developer.arm.com/documentation/ddi0602/2022-06/SIMD-FP-Instructions/CMGT--zero---Compare-signed-Greater-than-zero--vector--
>>
>> Change-Id: If026b477a0cad809bd201feafbfc9ab301a1b569
>
> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 3174:
>
>> 3172: INSN(fcvtzs, 0, 0b10, 0b01, 0b11011);
>> 3173: INSN(fcvtms, 0, 0b00, 0b01, 0b11011);
>> 3174: INSN(fcmgt, 0, 0b10, 0b01, 0b01100); // Floating-point compare greater than zero (vector)
>
> if you were to make this `fcm(Condition cond, ...` rather than having separate definitions for each condition it might make the code simpler and shorter.
Thanks, I think this would make the code much simpler. But I was wondering whether the function name in assembler_aarch64.hpp should align with the ISA definition.
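
To make the trade-off concrete, here is a minimal standalone sketch of a condition-parameterized encoder that still keeps ISA-named entry points as thin wrappers. It is only an illustration, not the code in this patch: `FCmpZero`, `fcm_zero` and the wrapper signatures are hypothetical names, and nothing here uses the real `Assembler` class or its `INSN` macros. The field layout follows my reading of the A64 FCMxx (zero) vector encodings and should be checked against the ISA reference before being relied on.

```cpp
#include <cstdint>

// Hypothetical condition enum for this sketch; not HotSpot's Assembler::Condition.
enum class FCmpZero { EQ, GT, GE, LT, LE };

// Encode a NEON floating-point compare-with-zero (vector) instruction.
//   q         : false = 64-bit vector, true = 128-bit vector
//   is_double : false = single-precision lanes, true = double-precision lanes
//   rd, rn    : destination and source vector register numbers (0..31)
inline uint32_t fcm_zero(FCmpZero cond, bool q, bool is_double,
                         unsigned rd, unsigned rn) {
  uint32_t u = 0;       // U bit (bit 29)
  uint32_t opcode = 0;  // 5-bit opcode (bits 16..12)
  switch (cond) {
    case FCmpZero::GT: u = 0; opcode = 0b01100; break;
    case FCmpZero::EQ: u = 0; opcode = 0b01101; break;
    case FCmpZero::LT: u = 0; opcode = 0b01110; break;
    case FCmpZero::GE: u = 1; opcode = 0b01100; break;
    case FCmpZero::LE: u = 1; opcode = 0b01101; break;
  }
  return (uint32_t(q) << 30) | (u << 29) | (0b01110u << 24) |
         (1u << 23) | (uint32_t(is_double) << 22) | (0b10000u << 17) |
         (opcode << 12) | (0b10u << 10) | ((rn & 31u) << 5) | (rd & 31u);
}

// Thin wrappers that keep the ISA mnemonics visible at call sites.
inline uint32_t fcmgt(bool q, bool is_double, unsigned rd, unsigned rn) {
  return fcm_zero(FCmpZero::GT, q, is_double, rd, rn);
}
inline uint32_t fcmge(bool q, bool is_double, unsigned rd, unsigned rn) {
  return fcm_zero(FCmpZero::GE, q, is_double, rd, rn);
}
```

With this shape the per-condition table lives in one place, while callers that prefer the ISA spelling can still write `fcmgt(...)`.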
-------------
PR: https://git.openjdk.org/jdk/pull/11822