RFR: 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster [v2]

Boris Ulasevich bulasevich at openjdk.org
Thu Oct 30 11:29:40 UTC 2025


> This change adjusts the default selection of SHA-3 intrinsics on AArch64 based on observed performance across CPUs. In our measurements, the SHA-3 SIMD path (using SHA3 instructions) is consistently faster on Apple silicon, while on Neoverse and several older cores the GPR implementation performs better. On CPUs without SHA-3 instructions, the GPR path is the only viable option and behaves as expected.
> 
> Accordingly, `UseSIMDForSHA3Intrinsic` now defaults to false globally. The SIMD variant is auto-enabled only on Apple silicon; elsewhere the default remains the GPR path.
> 
> _The attached raw data also includes observations about `UseFPUForSpilling`. In #27350 we discussed whether that option is entirely useless. Although this is orthogonal to the change, the MessageDigests benchmark is a convenient probe of register-spilling behavior: the SHA-3 (Keccak) algorithm is highly register-hungry, so the generated assembly contains a significant number of spills. In the provided results, at least one CPU benefits from enabling `UseFPUForSpilling`, so the option seems worth keeping for now._
> 
> **Cortex-A53 (RPi3)**
> 
> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
> Benchmark          (digesterName)  (length)   Cnt    Score   Error   Units
> MessageDigests.digest    SHA3-512        64   150  345.010 ± 0.473  ops/ms
> MessageDigests.digest    SHA3-512     16384   150    1.817 ± 0.001  ops/ms
> 
> $ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
> MessageDigests.digest    SHA3-512        64   150  352.247 ± 0.279  ops/ms  +UseFPUForSpilling: +2%
> MessageDigests.digest    SHA3-512     16384   150    1.855 ± 0.001  ops/ms  +UseFPUForSpilling: +2%
> 
> $ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
> Benchmark                (digesterName)  (length)   Cnt    Score    Error   Units
> MessageDigests.digest          SHA3-512        64    15  345.552 ±  0.291  ops/ms
> MessageDigests.digest          SHA3-512     16384    15    1.818 ±  0.001  ops/ms
> MessageDigests.getAndDigest    SHA3-512        64    15  265.744 ± 56.591  ops/ms
> MessageDigests.getAndDigest    SHA3-512     16384    1...
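
The default-selection rule described in the quoted summary can be sketched roughly as follows. This is only an illustration, not the code in the patch: `CpuInfo`, `FlagSetting`, `Sha3Path` and `select_sha3_path` are hypothetical names, and the assumption that an explicit command-line setting of `UseSIMDForSHA3Intrinsic` overrides the per-CPU default is mine rather than something stated above.

```cpp
// Hypothetical description of the host CPU; not a HotSpot type.
struct CpuInfo {
    bool has_sha3_instructions;  // CPU implements the SHA3 instructions
    bool is_apple_silicon;       // CPU is Apple silicon
};

// The two intrinsic variants discussed in the summary.
enum class Sha3Path { GPR, SIMD };

// Hypothetical tri-state for a boolean VM flag:
// Default means the flag was not given on the command line.
enum class FlagSetting { Default, Off, On };

// Choose the SHA-3 intrinsic variant for a given CPU and flag state.
Sha3Path select_sha3_path(const CpuInfo& cpu, FlagSetting simd_flag) {
    // Without SHA3 instructions the GPR path is the only viable option.
    if (!cpu.has_sha3_instructions) {
        return Sha3Path::GPR;
    }
    // Assumption: an explicit user choice wins over the per-CPU default.
    switch (simd_flag) {
        case FlagSetting::On:      return Sha3Path::SIMD;
        case FlagSetting::Off:     return Sha3Path::GPR;
        case FlagSetting::Default: break;
    }
    // Default: SIMD only on Apple silicon, GPR everywhere else.
    return cpu.is_apple_silicon ? Sha3Path::SIMD : Sha3Path::GPR;
}
```

Under these assumptions, a Neoverse-like CPU (SHA3 instructions present, not Apple silicon) with the flag left at its default gets the GPR path, while the same call for Apple silicon gets the SIMD path.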

Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision:

  minor review corrections

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/27726/files
  - new: https://git.openjdk.org/jdk/pull/27726/files/a96aa022..3ed9f64b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=27726&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=27726&range=00-01

  Stats: 6 lines in 1 file changed: 0 ins; 2 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/27726.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27726/head:pull/27726

PR: https://git.openjdk.org/jdk/pull/27726

