RFR: 8303553: AArch64: Add BCAX backend rule
Quan Anh Mai
qamai at openjdk.org
Thu Mar 30 09:02:19 UTC 2023
On Wed, 29 Mar 2023 09:12:59 GMT, changpeng1997 <duke at openjdk.org> wrote:
> We can use BCAX [1] [2] to merge a bit clear and an exclusive-OR operation. For example, on a 128-bit AArch64 machine which supports NEON and SHA3, the following instruction sequence:
>
>
> ...
> bic v16.16b, v16.16b, v17.16b
> eor v16.16b, v16.16b, v18.16b
> ...
>
>
> can be optimized to:
>
>
> ...
> bcax v16.16b, v17.16b, v16.16b, v18.16b
> ...
>
>
> This patch adds backend rules for BCAX. On a 128-bit AArch64 machine which supports NEON and SHA3, throughput improves by almost 10% (roughly 7% to 13.5% across the benchmarks below). A similar performance uplift can also be observed on SVE2.
>
> Performance_Before:
>
>
> Benchmark              Score (op/ms)    Error
> TestByte#size(2048)         9779.361   47.184
> TestInt#size(2048)          3028.617    7.292
> TestLong#size(2048)         1331.216    1.815
> TestShort#size(2048)        5828.089    8.975
>
>
> Performance_BCAX_NEON:
>
>
> Benchmark              Score (op/ms)    Error
> TestByte#size(2048)        10510.371   34.931
> TestInt#size(2048)          3437.512   81.318
> TestLong#size(2048)         1461.023    0.679
> TestShort#size(2048)        6238.210   26.452
>
>
> [1]: https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/BCAX--Bit-Clear-and-XOR-
> [2]: https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/BCAX--Bitwise-clear-and-exclusive-OR-?lang=en
`bsl(dst, src2, src3) == (dst & src3) | (src2 & ~src3)`, which is a bitwise logical operation.
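
For illustration only, here is a minimal scalar sketch in Java of the two bitwise operations discussed in this thread. It models each operation on a single `long` rather than on a vector register. The class name `BitwiseSketch` and the method names are hypothetical and are not taken from the patch. `bicThenEor` spells out the bit-clear-then-XOR pattern from the quoted description, and `bsl` follows the formula above, treating `src3` as the selection mask.

```java
public class BitwiseSketch {
    // Hypothetical helper: the two-step pattern (bit clear, then exclusive-OR)
    // that the quoted description says BCAX can merge into one instruction.
    public static long bicThenEor(long a, long b, long c) {
        long cleared = a & ~b; // bit clear: drop the bits of a that are set in b
        return cleared ^ c;    // exclusive-OR the cleared value with c
    }

    // Hypothetical helper: bitwise select as written in the formula above.
    // Where a mask bit (src3) is 1 the bit comes from dst, otherwise from src2.
    public static long bsl(long dst, long src2, long src3) {
        return (dst & src3) | (src2 & ~src3);
    }

    public static void main(String[] args) {
        // 0xFF & ~0x0F = 0xF0, then 0xF0 ^ 0x3C = 0xCC
        System.out.println(Long.toHexString(bicThenEor(0xFFL, 0x0FL, 0x3CL))); // prints "cc"
        // (0xAA & 0xF0) | (0x55 & ~0xF0) = 0xA0 | 0x05 = 0xA5
        System.out.println(Long.toHexString(bsl(0xAAL, 0x55L, 0xF0L)));        // prints "a5"
    }
}
```

Both helpers use only AND, OR, XOR and NOT, so each output bit depends solely on the input bits at the same position, which is what makes them bitwise logical operations regardless of element type.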
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13222#issuecomment-1489945729
More information about the hotspot-compiler-dev mailing list