RFR: 8358032: Use crypto pmull for CRC32/CRC32C intrinsics on Ampere CPU
Liming Liu
lliu at openjdk.org
Tue Jun 3 07:19:11 UTC 2025
This PR enables crypto pmull for the CRC32/CRC32C intrinsics on Ampere CPUs. The existing option UseCryptoPmullForCRC32 can already enable crypto pmull, but simply turning it on for Ampere CPUs causes the following problems.
1. There are regressions (-14% ~ -8%) on Ampere1 when the length is 64. When the length is <= 128, both kernel_crc32_using_crc32 and kernel_crc32_using_crypto_pmull use the loop labeled CRC_by32_loop, but their implementations differ slightly, and the loop in kernel_crc32_using_crc32 is better at hiding latency on Ampere1. So this PR copies the loop from kernel_crc32_using_crc32 into kernel_crc32_using_crypto_pmull, and does the same for the CRC32C intrinsic.
2. The intrinsics only use crypto pmull when the length is greater than 383, although the loop in kernel_crc32_common_fold_using_crypto_pmull appears able to handle a length of 256. If it handles 256 on Ampere1, the improvement can be as high as 110% compared with kernel_crc32_using_crc32/kernel_crc32c_using_crc32c. However, there are regressions (~-6%) on Neoverse V1 when the length is 256. So this PR introduces a new option named CryptoPmullForCRC32LowLimit. It defaults to 256, since the code can handle 256, and is set to 384 for V1/V2 to keep the old behavior on those platforms.
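The effect of the new limit described in item 2 can be sketched in plain Java. This is only an illustration: the real selection is done in generated AArch64 stub code, and the class, method, and constant names below are hypothetical. It also assumes the pmull folding path is taken when the length is greater than or equal to the limit, which matches "higher than 383" for a limit of 384.

```java
public final class CrcPathSketch {
    // Hypothetical constants mirroring the values mentioned in this mail.
    public static final int DEFAULT_LOW_LIMIT = 256;    // proposed default for CryptoPmullForCRC32LowLimit
    public static final int NEOVERSE_V_LOW_LIMIT = 384; // value kept for V1/V2 (old behavior)

    // Assumption: the pmull folding path is selected when len >= lowLimit.
    public static boolean usesPmullFold(int len, int lowLimit) {
        return len >= lowLimit;
    }

    public static void main(String[] args) {
        int[] lengths = {64, 128, 255, 256, 383, 384};
        for (int len : lengths) {
            System.out.printf("len=%d default-limit=%b v1/v2-limit=%b%n",
                    len,
                    usesPmullFold(len, DEFAULT_LOW_LIMIT),
                    usesPmullFold(len, NEOVERSE_V_LOW_LIMIT));
        }
    }
}
```

Under these assumptions, only lengths in the range 256..383 change path between the two limits, which is the range where the Ampere1 gain and the Neoverse V1 regression were observed.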
The performance regressions and improvements were measured with the following microbenchmarks:
org.openjdk.bench.java.util.TestCRC32.testCRC32Update
org.openjdk.bench.java.util.TestCRC32C.testCRC32CUpdate
Ran the following JTReg tests on Ampere1 and did not find problems:
test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32.java
test/hotspot/jtreg/compiler/intrinsics/zip/TestCRC32C.java
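For a quick local sanity check around the lengths discussed above, something like the following could be used. It is not part of the PR or of the JTReg tests: the class name, the test data pattern, and the chosen lengths are assumptions for illustration. It compares java.util.zip.CRC32 and java.util.zip.CRC32C against a simple bitwise reference, and running it does not by itself guarantee that the intrinsic path is exercised.

```java
import java.util.zip.CRC32;
import java.util.zip.CRC32C;
import java.util.zip.Checksum;

public final class CrcBoundaryCheck {
    public static final int CRC32_POLY = 0xEDB88320;  // reflected CRC-32 polynomial
    public static final int CRC32C_POLY = 0x82F63B78; // reflected CRC-32C (Castagnoli) polynomial

    // Bit-at-a-time reference CRC (reflected form, initial value and final XOR of all ones).
    public static int bitwiseCrc(byte[] data, int len, int reflectedPoly) {
        int crc = 0xFFFFFFFF;
        for (int i = 0; i < len; i++) {
            crc ^= (data[i] & 0xFF);
            for (int k = 0; k < 8; k++) {
                // -(crc & 1) is all ones when the low bit is set, zero otherwise.
                crc = (crc >>> 1) ^ (reflectedPoly & -(crc & 1));
            }
        }
        return ~crc;
    }

    // Fills a buffer with an arbitrary pattern and compares library and reference results.
    public static boolean matches(Checksum checksum, int poly, int len) {
        byte[] data = new byte[len];
        for (int i = 0; i < len; i++) {
            data[i] = (byte) (i * 31 + 7);
        }
        checksum.reset();
        checksum.update(data, 0, len);
        return (int) checksum.getValue() == bitwiseCrc(data, len, poly);
    }

    public static void main(String[] args) {
        // Lengths on both sides of the 128, 256 and 384 boundaries mentioned in this mail.
        int[] lengths = {64, 128, 129, 255, 256, 383, 384, 1024};
        for (int len : lengths) {
            System.out.printf("len=%d crc32=%b crc32c=%b%n",
                    len,
                    matches(new CRC32(), CRC32_POLY, len),
                    matches(new CRC32C(), CRC32C_POLY, len));
        }
    }
}
```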
-------------
Commit messages:
- Use the utility functions
- Introduce CryptoPmullForCRC32LowLimit and use pmull for crc32 on Ampere CPU
Changes: https://git.openjdk.org/jdk/pull/25609/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=25609&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8358032
Stats: 28 lines in 3 files changed: 17 ins; 3 del; 8 mod
Patch: https://git.openjdk.org/jdk/pull/25609.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/25609/head:pull/25609
PR: https://git.openjdk.org/jdk/pull/25609