RFR: 8303553: AArch64: Add BCAX backend rule
changpeng1997
duke at openjdk.org
Wed Mar 29 09:20:57 UTC 2023
We can use BCAX [1][2] to merge a bit clear and an exclusive-OR operation into a single instruction. For example, on a 128-bit AArch64 machine that supports NEON and SHA3, the following instruction sequence:
...
bic v16.16b, v16.16b, v17.16b
eor v16.16b, v16.16b, v18.16b
...
can be optimized to:
...
bcax v16.16b, v17.16b, v16.16b, v18.16b
...
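To make the merged operation concrete, here is a minimal scalar sketch in Java of what the fused instruction computes per lane: one input XORed with the result of clearing, in a second input, the bits set in a third. The class name, method names, and operand names `a`, `b`, `c` are hypothetical and chosen only for illustration. They are not taken from the patch, and whether C2 matches a given loop to BCAX depends on the rules the patch actually adds.

```java
import java.util.Arrays;

public class BcaxSketch {
    // Two-step form: a bit clear (like BIC) followed by an exclusive-OR (like EOR).
    public static int[] bicThenEor(int[] a, int[] b, int[] c) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            int cleared = b[i] & ~c[i]; // clear in b the bits that are set in c
            r[i] = a[i] ^ cleared;      // then XOR with a
        }
        return r;
    }

    // Fused form: the same computation written as one expression,
    // which is the shape a single BCAX can cover.
    public static int[] bcax(int[] a, int[] b, int[] c) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = a[i] ^ (b[i] & ~c[i]);
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {0b1100, -1, 0x0F0F0F0F};
        int[] b = {0b1010, 0x12345678, -1};
        int[] c = {0b0110, 0, 0x00FF00FF};
        // Both forms should agree on every element.
        System.out.println(Arrays.equals(bicThenEor(a, b, c), bcax(a, b, c)));
    }
}
```

The two methods are intentionally redundant: the first mirrors the two-instruction sequence, the second mirrors the single merged instruction.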
This patch adds backend rules for BCAX. On a 128-bit AArch64 machine that supports NEON and SHA3, throughput improves by almost 10% on average (roughly 7% to 13% across the benchmarks below). A similar performance uplift can also be observed on SVE2.
Performance_Before:
Benchmark             Score (op/ms)    Error
TestByte#size(2048)        9779.361   47.184
TestInt#size(2048)         3028.617    7.292
TestLong#size(2048)        1331.216    1.815
TestShort#size(2048)       5828.089    8.975
Performance_BCAX_NEON:
Benchmark             Score (op/ms)    Error
TestByte#size(2048)       10510.371   34.931
TestInt#size(2048)         3437.512   81.318
TestLong#size(2048)        1461.023    0.679
TestShort#size(2048)       6238.210   26.452
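For readers who want the per-benchmark uplift, the following sketch computes it from the two sets of scores above. The class and method names are hypothetical. The arithmetic uses only the reported mean scores and ignores the Error column, so small differences should not be over-interpreted.

```java
public class BcaxUplift {
    // Relative uplift in percent: (after / before - 1) * 100.
    public static double upliftPercent(double before, double after) {
        return (after / before - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        String[] names = {"TestByte", "TestInt", "TestLong", "TestShort"};
        // Scores in op/ms, copied from Performance_Before and Performance_BCAX_NEON.
        double[] before = {9779.361, 3028.617, 1331.216, 5828.089};
        double[] after = {10510.371, 3437.512, 1461.023, 6238.210};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-10s %6.2f%%%n", names[i], upliftPercent(before[i], after[i]));
        }
    }
}
```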
[1]: https://developer.arm.com/documentation/ddi0596/2021-12/SIMD-FP-Instructions/BCAX--Bit-Clear-and-XOR-
[2]: https://developer.arm.com/documentation/ddi0602/2022-12/SVE-Instructions/BCAX--Bitwise-clear-and-exclusive-OR-?lang=en
-------------
Commit messages:
- 8303553: AArch64: Add BCAX backend rule
Changes: https://git.openjdk.org/jdk/pull/13222/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13222&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8303553
Stats: 468 lines in 9 files changed: 433 ins; 0 del; 35 mod
Patch: https://git.openjdk.org/jdk/pull/13222.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/13222/head:pull/13222
PR: https://git.openjdk.org/jdk/pull/13222