RFR: 8297753: AArch64: Add optimized rules for vector compare with zero on NEON [v9]
Andrew Haley
aph at openjdk.org
Fri Mar 3 09:38:06 UTC 2023
On Fri, 3 Mar 2023 07:46:57 GMT, Chang Peng <duke at openjdk.org> wrote:
>> We can use compare-with-zero instructions such as cmgt(zero)[1] directly, which avoids the extra scalar2vector operations.
>>
>> The following instruction sequence
>>
>> movi v16.4s, #0x0
>> cmgt v16.4s, v17.4s, v16.4s
>>
>> can be optimized to:
>>
>> cmgt v16.4s, v17.4s, #0x0
>>
>> This patch does the following:
>> 1. Add NEON floating-point compare-with-zero instructions.
>> 2. Add optimized match rules to generate the compare-with-zero instructions.
>>
>> [1]: https://developer.arm.com/documentation/ddi0602/2022-06/SIMD-FP-Instructions/CMGT--zero---Compare-signed-Greater-than-zero--vector--
>
> Chang Peng has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove hard-coded 0b1111 in to_assembler_condition()
Alright! That is _beautiful_.
I felt a bit bad about pushing you so hard on this, but I think the quality of the result justifies the effort. I hope you agree.
Thank you.
-------------
Marked as reviewed by aph (Reviewer).
PR: https://git.openjdk.org/jdk/pull/11822
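
For readers who want to see why the two instruction sequences quoted above are interchangeable, here is a minimal scalar sketch in Java. It is not code from the patch: the class and method names (CmgtZeroSketch, cmgt, cmgtZero) are hypothetical, and plain int arrays stand in for the four 32-bit lanes of a NEON register. It only models the lane-wise result of CMGT, where a lane becomes all ones (-1) when the signed comparison holds and 0 otherwise.

```java
import java.util.Arrays;

public class CmgtZeroSketch {
    // Hypothetical model of "cmgt vd, vn, vm": lane-wise signed greater-than.
    // A true lane is all ones (-1 as an int); a false lane is 0.
    static int[] cmgt(int[] vn, int[] vm) {
        int[] vd = new int[vn.length];
        for (int i = 0; i < vn.length; i++) {
            vd[i] = vn[i] > vm[i] ? -1 : 0;
        }
        return vd;
    }

    // Hypothetical model of "cmgt vd, vn, #0x0": compare each lane with zero
    // without first materialising a zero vector.
    static int[] cmgtZero(int[] vn) {
        int[] vd = new int[vn.length];
        for (int i = 0; i < vn.length; i++) {
            vd[i] = vn[i] > 0 ? -1 : 0;
        }
        return vd;
    }

    public static void main(String[] args) {
        int[] v17 = {5, 0, -3, Integer.MAX_VALUE};

        // Two-instruction form: movi v16.4s, #0x0 ; cmgt v16.4s, v17.4s, v16.4s
        int[] v16 = new int[v17.length];   // stands in for the zeroed register
        int[] twoStep = cmgt(v17, v16);

        // One-instruction form: cmgt v16.4s, v17.4s, #0x0
        int[] oneStep = cmgtZero(v17);

        System.out.println(Arrays.toString(twoStep)); // prints [-1, 0, 0, -1]
        System.out.println(Arrays.toString(oneStep)); // prints [-1, 0, 0, -1]
    }
}
```

Both forms produce the same mask; the compare-with-zero form simply drops the instruction that builds the zero vector.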
More information about the hotspot-compiler-dev mailing list