RFR: 8303553: AArch64: Add BCAX backend rule

Quan Anh Mai qamai at openjdk.org
Thu Mar 30 09:02:19 UTC 2023


On Wed, 29 Mar 2023 09:12:59 GMT, changpeng1997 <duke at openjdk.org> wrote:

> We can use BCAX [1] [2] to merge a bit clear and an exclusive-OR operation. For example, on a 128-bit aarch64 machine which supports NEON and SHA3, the following instruction sequence:
> 
> 
> ...
> bic     v16.16b, v16.16b, v17.16b
> eor     v16.16b, v16.16b, v18.16b
> ...
> 
> 
> can be optimized to:
> 
> 
> ...
> bcax    v16.16b, v18.16b, v16.16b, v17.16b
> ...
> 
> 
> This patch adds backend rules for BCAX. On a 128-bit aarch64 machine which supports NEON and SHA3, throughput improves by about 7% to 14% depending on the element type (almost 10% on average), as the figures below show. A similar performance uplift can also be observed on SVE2.
> 
> Performance_Before:
> 
> 
> Benchmark                 Score(op/ms)     Error
> TestByte#size(2048)       9779.361         47.184
> TestInt#size(2048)        3028.617         7.292
> TestLong#size(2048)       1331.216         1.815
> TestShort#size(2048)      5828.089         8.975
> 
> 
> Performance_BCAX_NEON:
> 
> 
> Benchmark                 Score(op/ms)     Error
> TestByte#size(2048)       10510.371        34.931
> TestInt#size(2048)        3437.512         81.318
> TestLong#size(2048)       1461.023         0.679
> TestShort#size(2048)      6238.210         26.452
> 
> 
> [1]: https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/BCAX--Bit-Clear-and-XOR-
> [2]: https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/BCAX--Bitwise-clear-and-exclusive-OR-?lang=en
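
To make the merge described in the quoted text concrete, here is a minimal scalar sketch in Java. It assumes the operand roles given in the Arm documentation linked at [1]: BIC computes `n & ~m`, and BCAX computes `n ^ (m & ~a)`. The class and method names are hypothetical and model one 64-bit lane only; this is not the HotSpot backend rule itself.

```java
import java.util.Random;

// Hypothetical scalar model of one 64-bit lane; names are illustrative only.
public class BcaxModel {
    // BIC: first operand AND NOT second operand.
    static long bic(long n, long m) {
        return n & ~m;
    }

    // BCAX: n XOR (m AND NOT a).
    static long bcax(long n, long m, long a) {
        return n ^ (m & ~a);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 1000; i++) {
            long x = rnd.nextLong();
            long mask = rnd.nextLong();
            long y = rnd.nextLong();
            // Two-instruction form: bit clear, then exclusive-OR.
            long twoStep = bic(x, mask) ^ y;
            // Fused form: the XOR operand comes first, then value and mask.
            long fused = bcax(y, x, mask);
            if (twoStep != fused) {
                throw new AssertionError("mismatch at iteration " + i);
            }
        }
        System.out.println("bic+eor matches bcax");
    }
}
```

In this model the value that is XORed in becomes the first source of the fused operation, and the cleared value and its mask become the second and third.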

`bsl(dst, src2, src3) == (dst & src3) | (src2 & ~src3)`, which is a bitwise logical operation.
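
As a rough illustration of that formula, the sketch below models it on a single 64-bit lane in Java. The class and method names are hypothetical; the only thing taken from the message is the expression itself, with `src3` acting as the selection mask.

```java
// Hypothetical scalar model of the select formula; names are illustrative only.
public class BitSelectModel {
    // For each bit: take dst where src3 is 1, take src2 where src3 is 0.
    static long bsl(long dst, long src2, long src3) {
        return (dst & src3) | (src2 & ~src3);
    }

    public static void main(String[] args) {
        long dst = 0xF0F0L;
        long src2 = 0x0F0FL;

        // An all-ones mask keeps dst; an all-zeros mask yields src2.
        if (bsl(dst, src2, -1L) != dst) throw new AssertionError();
        if (bsl(dst, src2, 0L) != src2) throw new AssertionError();

        // Mixed mask: upper byte from dst, lower byte from src2.
        System.out.println(Long.toHexString(bsl(dst, src2, 0xFF00L))); // prints f00f
    }
}
```

Because every output bit depends only on the bits at the same position in the three inputs, the operation needs no lane-size information.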

-------------

PR Comment: https://git.openjdk.org/jdk/pull/13222#issuecomment-1489945729

