RFR: 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster

Boris Ulasevich bulasevich at openjdk.org
Fri Oct 10 22:22:50 UTC 2025


This change adjusts the default selection of SHA-3 intrinsics on AArch64 based on observed performance across CPUs. In our measurements, the SHA-3 SIMD path (using SHA3 instructions) is consistently faster on Apple silicon, while on Neoverse and several older cores the GPR implementation performs better. On CPUs without SHA-3 instructions, the GPR path is the only viable option and behaves as expected.

Accordingly, `UseSIMDForSHA3Intrinsic` now defaults to false globally. The SIMD variant is auto-enabled only on Apple silicon; elsewhere the default remains the GPR path.
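The selection rule described above can be sketched as a small decision function. This is only an illustration of the intended defaults, not the HotSpot code: the `Cpu` type, its fields, and the way an explicit command-line setting is modelled (`user_setting`) are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cpu:
    is_apple: bool  # hypothetical stand-in for the VM's CPU-vendor check
    has_sha3: bool  # True when the CPU reports the SHA3 instructions


def use_simd_for_sha3(cpu: Cpu, user_setting: Optional[bool] = None) -> bool:
    """Return True if the SIMD SHA-3 variant would be chosen over the GPR one."""
    if not cpu.has_sha3:
        # Without SHA3 instructions the SIMD variant cannot run at all.
        return False
    if user_setting is not None:
        # Assumption: an explicit -XX:+/-UseSIMDForSHA3Intrinsic overrides the default.
        return user_setting
    # Default after this change: SIMD only on Apple silicon, GPR elsewhere.
    return cpu.is_apple


neoverse_v2 = Cpu(is_apple=False, has_sha3=True)
apple_m4 = Cpu(is_apple=True, has_sha3=True)
print(use_simd_for_sha3(neoverse_v2), use_simd_for_sha3(apple_m4))
```

With these assumptions, a Neoverse V2 falls through to the GPR path by default while an Apple core keeps the SIMD path, which matches the direction of the measurements.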

_The attached raw data also includes observations about `UseFPUForSpilling`. Back in #27350 we discussed whether the option is entirely useless. While orthogonal to this change, the MessageDigests benchmark is a convenient probe of register-spilling behavior because the SHA-3 (Keccak) algorithm is highly register-hungry, which adds a significant number of spills to the generated assembly sequence. In the provided results, at least one CPU benefits from enabling `UseFPUForSpilling`, so the option seems worth keeping for now._
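The percentage annotations in the raw data that follows appear to be the relative change in JMH Score against the matching baseline run (the same benchmark and length with the option or intrinsic turned off). A minimal sketch of that arithmetic, assuming this reading is right; the function name is made up for the example and the scores are copied from the tables below:

```python
def relative_change_percent(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline throughput, in percent."""
    return (candidate / baseline - 1.0) * 100.0


# Cortex-A53, digest, length 64: -UseFPUForSpilling vs. +UseFPUForSpilling
fpu_a53 = relative_change_percent(345.010, 352.247)

# Neoverse V1, digest, length 64: intrinsics off vs. SIMD SHA3 intrinsic
simd_v1 = relative_change_percent(2567.030, 1586.933)

print(f"{fpu_a53:+.1f}%  {simd_v1:+.1f}%")
```

The scores are throughput in ops/ms, so a positive value means the candidate configuration is faster than the baseline.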

**Cortex-A53 (RPi3)**

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
Benchmark          (digesterName)  (length)   Cnt    Score   Error   Units
MessageDigests.digest    SHA3-512        64   150  345.010 ± 0.473  ops/ms
MessageDigests.digest    SHA3-512     16384   150    1.817 ± 0.001  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
MessageDigests.digest    SHA3-512        64   150  352.247 ± 0.279  ops/ms  +UseFPUForSpilling: +2%
MessageDigests.digest    SHA3-512     16384   150    1.855 ± 0.001  ops/ms  +UseFPUForSpilling: +2%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt    Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  345.552 ±  0.291  ops/ms
MessageDigests.digest          SHA3-512     16384    15    1.818 ±  0.001  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  265.744 ± 56.591  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15    1.812 ±  0.002  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt    Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  343.047 ±  3.802  ops/ms  (UseSHA3Intrinsics is disabled due to missing sha3 capabilities)
MessageDigests.digest          SHA3-512     16384    15    1.817 ±  0.003  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  292.429 ± 20.928  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15    1.788 ±  0.020  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt    Score   Error   Units
MessageDigests.digest          SHA3-512        64    15  474.854 ± 4.275  ops/ms  GPR SHA3: +37% 🟢
MessageDigests.digest          SHA3-512     16384    15    2.940 ± 0.003  ops/ms  GPR SHA3: +61% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15  411.179 ± 3.143  ops/ms  GPR SHA3: +54% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15    2.927 ± 0.002  ops/ms  GPR SHA3: +61% 🟢

$ head /proc/cpuinfo
processor	: 0
BogoMIPS	: 38.40
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

$ ./jdk-25/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal 2>&1 | grep SHA
     bool UseSHA                           = false                     {product} {default}
     bool UseSHA1Intrinsics                = false                  {diagnostic} {default}
     bool UseSHA256Intrinsics              = false                  {diagnostic} {default}
     bool UseSHA3Intrinsics                = false                  {diagnostic} {default}
     bool UseSHA512Intrinsics              = false                  {diagnostic} {default}
     bool UseSIMDForSHA3Intrinsic          = true                 {ARCH product} {default}

**Cortex-A72 (G1)**

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  1056.831 ± 10.485 ops/ms
MessageDigests.digest    SHA3-512     16384    50     5.348 ±  0.008 ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  1050.750 ± 9.830  ops/ms  +UseFPUForSpilling:  0%
MessageDigests.digest    SHA3-512     16384    50     5.249 ± 0.003  ops/ms  +UseFPUForSpilling: -2%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1055.729 ± 22.589  ops/ms
MessageDigests.digest          SHA3-512     16384    15     5.355 ±  0.003  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15   867.129 ± 24.624  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15     5.320 ±  0.019  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1057.076 ± 22.620  ops/ms (UseSHA3Intrinsics is disabled due to missing sha3 capabilities)
MessageDigests.digest          SHA3-512     16384    15     5.354 ±  0.003  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15   866.266 ± 10.790  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15     5.330 ±  0.015  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1215.299 ± 24.954  ops/ms  GPR SHA3: +15% 🟢
MessageDigests.digest          SHA3-512     16384    15     6.582 ±  0.005  ops/ms  GPR SHA3: +23% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15   956.375 ± 36.935  ops/ms  GPR SHA3: +10% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15     6.552 ±  0.008  ops/ms  GPR SHA3: +23% 🟢

$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 166.66
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

$ ./jdk-25/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal 2>&1 | grep SHA
     bool UseSHA                           = true                      {product} {default}
     bool UseSHA1Intrinsics                = true                   {diagnostic} {default}
     bool UseSHA256Intrinsics              = true                   {diagnostic} {default}
     bool UseSHA3Intrinsics                = false                  {diagnostic} {default}
     bool UseSHA512Intrinsics              = false                  {diagnostic} {default}
     bool UseSIMDForSHA3Intrinsic          = true                 {ARCH product} {default}

**ARM Neoverse N1 (G2)**

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  1693.312 ± 6.705  ops/ms
MessageDigests.digest    SHA3-512     16384    50     8.338 ± 0.004  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  1549.136 ± 4.379  ops/ms  +UseFPUForSpilling:  -9%
MessageDigests.digest    SHA3-512     16384    50     7.464 ± 0.008  ops/ms  +UseFPUForSpilling: -10%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1699.575 ± 12.454  ops/ms
MessageDigests.digest          SHA3-512     16384    15     8.337 ±  0.008  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  1530.011 ±  3.471  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15     8.342 ±  0.006  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1690.346 ± 10.649  ops/ms (UseSHA3Intrinsics is disabled due to missing sha3 capabilities)
MessageDigests.digest          SHA3-512     16384    15     8.339 ±  0.009  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  1529.265 ±  3.778  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15     8.259 ±  0.005  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  2071.184 ±  2.020  ops/ms  GPR SHA3: +22% 🟢
MessageDigests.digest          SHA3-512     16384    15    10.550 ±  0.002  ops/ms  GPR SHA3: +26% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15  1821.436 ±  9.783  ops/ms  GPR SHA3: +19% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15    10.541 ±  0.001  ops/ms  GPR SHA3: +26% 🟢

$ cat /proc/cpuinfo
processor       : 0
BogoMIPS        : 243.75
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x3
CPU part        : 0xd0c
CPU revision    : 1

$ ./jdk-25/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal 2>&1 | grep SHA
     bool UseSHA                           = true                      {product} {default}
     bool UseSHA1Intrinsics                = true                   {diagnostic} {default}
     bool UseSHA256Intrinsics              = true                   {diagnostic} {default}
     bool UseSHA3Intrinsics                = false                  {diagnostic} {default}
     bool UseSHA512Intrinsics              = false                  {diagnostic} {default}
     bool UseSIMDForSHA3Intrinsic          = true                 {ARCH product} {default}

**ARM Neoverse V1 (G3)**

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
MessageDigests.digest    SHA3-512        64    50  2567.010 ± 6.958  ops/ms
MessageDigests.digest    SHA3-512     16384    50    12.267 ± 0.003  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
MessageDigests.digest    SHA3-512        64    50  2023.971 ± 3.043  ops/ms  +UseFPUForSpilling: -21%
MessageDigests.digest    SHA3-512     16384    50     9.531 ± 0.006  ops/ms  +UseFPUForSpilling: -22%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  2567.030 ± 11.001  ops/ms
MessageDigests.digest          SHA3-512     16384    15    12.266 ±  0.006  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  2283.653 ±  4.276  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15    12.253 ±  0.007  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1586.933 ±  5.553  ops/ms  SIMD SHA3: -38%
MessageDigests.digest          SHA3-512     16384    15     7.411 ±  0.001  ops/ms  SIMD SHA3: -39%
MessageDigests.getAndDigest    SHA3-512        64    15  1449.235 ± 31.213  ops/ms  SIMD SHA3: -36%
MessageDigests.getAndDigest    SHA3-512     16384    15     7.401 ±  0.001  ops/ms  SIMD SHA3: -39%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score   Error   Units
MessageDigests.digest          SHA3-512        64    15  2877.972 ± 1.844  ops/ms   GPR SHA3: +12% 🟢
MessageDigests.digest          SHA3-512     16384    15    14.259 ± 0.015  ops/ms   GPR SHA3: +16% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15  2494.387 ± 5.761  ops/ms   GPR SHA3:  +9% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15    14.221 ± 0.012  ops/ms   GPR SHA3: +16% 🟢

$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 2100.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs paca pacg dcpodp svei8mm svebf16 i8mm bf16 dgh rng
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x1
CPU part	: 0xd40
CPU revision	: 1

$ ./jdk-25/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal 2>&1 | grep SHA
     bool UseSHA                           = true                      {product} {default}
     bool UseSHA1Intrinsics                = true                   {diagnostic} {default}
     bool UseSHA256Intrinsics              = true                   {diagnostic} {default}
     bool UseSHA3Intrinsics                = false                  {diagnostic} {default}
     bool UseSHA512Intrinsics              = true                   {diagnostic} {default}
     bool UseSIMDForSHA3Intrinsic          = true                 {ARCH product} {default}

**ARM Neoverse V2 (G4)**

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
MessageDigests.digest    SHA3-512        64    50  3014.775 ± 1.913  ops/ms
MessageDigests.digest    SHA3-512     16384    50    14.404 ± 0.004  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest
MessageDigests.digest    SHA3-512        64    50  2793.075 ± 2.551  ops/ms  +UseFPUForSpilling: -7%
MessageDigests.digest    SHA3-512     16384    50    13.243 ± 0.006  ops/ms  +UseFPUForSpilling: -8%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics" 2>&1 | tail -n5
Benchmark                (digesterName)  (length)   Cnt     Score   Error   Units
MessageDigests.digest          SHA3-512        64    15  3015.873 ± 4.404  ops/ms
MessageDigests.digest          SHA3-512     16384    15    14.403 ± 0.008  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  2606.421 ± 6.977  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15    14.295 ± 0.051  ops/ms

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  1734.671 ±  1.107  ops/ms  SIMD SHA3: -43%
MessageDigests.digest          SHA3-512     16384    15     7.975 ±  0.001  ops/ms  SIMD SHA3: -45%
MessageDigests.getAndDigest    SHA3-512        64    15  1580.420 ± 31.229  ops/ms  SIMD SHA3: -40%
MessageDigests.getAndDigest    SHA3-512     16384    15     7.967 ±  0.001  ops/ms  SIMD SHA3: -45%

$ ./jdk-25/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score   Error   Units
MessageDigests.digest          SHA3-512        64    15  3295.626 ± 7.651  ops/ms   GPR SHA3:  +9% 🟢
MessageDigests.digest          SHA3-512     16384    15    16.291 ± 0.011  ops/ms   GPR SHA3: +13% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15  2822.072 ± 8.346  ops/ms   GPR SHA3:  +8% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15    16.241 ± 0.035  ops/ms   GPR SHA3: +13% 🟢

$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 2000.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd4f
CPU revision	: 1

$ ./jdk-25/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal 2>&1 | grep SHA
     bool UseSHA                           = true                      {product} {default}
     bool UseSHA1Intrinsics                = true                   {diagnostic} {default}
     bool UseSHA256Intrinsics              = true                   {diagnostic} {default}
     bool UseSHA3Intrinsics                = false                  {diagnostic} {default}
     bool UseSHA512Intrinsics              = true                   {diagnostic} {default}
     bool UseSIMDForSHA3Intrinsic          = true                 {ARCH product} {default}

**Apple M1**

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  4139.231 ± 18.412  ops/ms
MessageDigests.digest    SHA3-512     16384    50    19.897 ±  0.025  ops/ms

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  3706.428 ± 18.688  ops/ms  +UseFPUForSpilling: -10%
MessageDigests.digest    SHA3-512     16384    50    17.692 ±  0.010  ops/ms  +UseFPUForSpilling: -11%

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  4126.912 ± 34.411  ops/ms
MessageDigests.digest          SHA3-512     16384    15    19.895 ±  0.036  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  3661.652 ± 17.358  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15    19.470 ±  0.171  ops/ms

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  5575.331 ± 53.743  ops/ms  SIMD SHA3: +35%
MessageDigests.digest          SHA3-512     16384    15    28.138 ±  0.016  ops/ms  SIMD SHA3: +41%
MessageDigests.getAndDigest    SHA3-512        64    15  4908.696 ±  3.546  ops/ms  SIMD SHA3: +34%
MessageDigests.getAndDigest    SHA3-512     16384    15    28.008 ±  0.029  ops/ms  SIMD SHA3: +44%

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  4436.765 ± 16.368  ops/ms   GPR SHA3:  +8% 🟢
MessageDigests.digest          SHA3-512     16384    15    21.573 ±  0.011  ops/ms   GPR SHA3:  +8% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15  3972.640 ±  1.727  ops/ms   GPR SHA3:  +8% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15    21.468 ±  0.016  ops/ms   GPR SHA3: +10% 🟢

**Apple M4**

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:-UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  5654.605 ± 47.051  ops/ms
MessageDigests.digest    SHA3-512     16384    50    27.613 ±  0.034  ops/ms

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar -p digesterName=SHA3-512 -jvmArgs "-XX:+UseFPUForSpilling -XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics -XX:TieredStopAtLevel=4" MessageDigests.digest -f10
MessageDigests.digest    SHA3-512        64    50  5019.941 ± 45.104  ops/ms  +UseFPUForSpilling: -11%
MessageDigests.digest    SHA3-512     16384    50    23.929 ±  0.009  ops/ms  +UseFPUForSpilling: -13%

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:-UseSHA3Intrinsics"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  5701.512 ± 70.397  ops/ms
MessageDigests.digest          SHA3-512     16384    15    27.656 ±  0.013  ops/ms
MessageDigests.getAndDigest    SHA3-512        64    15  5253.424 ± 12.838  ops/ms
MessageDigests.getAndDigest    SHA3-512     16384    15    27.207 ±  0.149  ops/ms

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:+UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  7204.542 ± 51.968  ops/ms  SIMD SHA3: +26%
MessageDigests.digest          SHA3-512     16384    15    33.473 ±  0.294  ops/ms  SIMD SHA3: +21%
MessageDigests.getAndDigest    SHA3-512        64    15  6621.422 ± 24.413  ops/ms  SIMD SHA3: +26%
MessageDigests.getAndDigest    SHA3-512     16384    15    33.431 ±  0.022  ops/ms  SIMD SHA3: +23%

$ ./jdk-25.jdk/bin/java -jar benchmarks.jar MessageDigests -p digesterName=SHA3-512 -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:+UseSHA3Intrinsics -XX:-UseSIMDForSHA3Intrinsic"
Benchmark                (digesterName)  (length)   Cnt     Score    Error   Units
MessageDigests.digest          SHA3-512        64    15  7084.719 ± 94.518  ops/ms   GPR SHA3: +24% 🟢
MessageDigests.digest          SHA3-512     16384    15    35.974 ±  0.025  ops/ms   GPR SHA3: +30% 🟢
MessageDigests.getAndDigest    SHA3-512        64    15  6475.984 ± 18.642  ops/ms   GPR SHA3: +23% 🟢
MessageDigests.getAndDigest    SHA3-512     16384    15    35.839 ±  0.012  ops/ms   GPR SHA3: +32% 🟢

$ cat /proc/cpuinfo | head
     No such file or directory

$ ./jdk-25.jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal 2>&1 | grep SHA
     bool UseSHA                           = true                      {product} {default}
     bool UseSHA1Intrinsics                = true                   {diagnostic} {default}
     bool UseSHA256Intrinsics              = true                   {diagnostic} {default}
     bool UseSHA3Intrinsics                = true                   {diagnostic} {default}
     bool UseSHA512Intrinsics              = true                   {diagnostic} {default}
     bool UseSIMDForSHA3Intrinsic          = true                 {ARCH product} {default}

-------------

Commit messages:
 - 8359256: AArch64: Use SHA3 GPR intrinsic where it's faster

Changes: https://git.openjdk.org/jdk/pull/27726/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=27726&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8359256
  Stats: 20 lines in 2 files changed: 8 ins; 0 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/27726.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/27726/head:pull/27726

PR: https://git.openjdk.org/jdk/pull/27726

