RFR: 8341013: Optimize x86/aarch64 MD5 intrinsics by reducing data dependency
Oli Gillespie
ogillespie at openjdk.org
Thu Sep 26 11:37:53 UTC 2024
As suggested in https://github.com/animetosho/md5-optimisation?tab=readme-ov-file#dependency-shortcut-in-g-function, we can delay the dependency on 'b' in the G function. Because (d & b) and (~d & c) can never have the same bit set, ((d & b) | (~d & c)) is equivalent to ((d & b) + (~d & c)) in this scenario. That lets us perform the two additions into the accumulator independently, leaving our dependency on b to the final addition. This improves throughput by around 5% in the benchmark.
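To make the equivalence concrete, here is a minimal Java sketch of one G-round step written both ways. It is only an illustration of the arithmetic, not the intrinsic code in this patch (which is assembly emitted by the x86 and aarch64 stub generators); the class and method names are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration class; not part of the JDK patch.
final class Md5GShortcut {
    // Conventional G function: both terms are combined with OR.
    static int gOr(int b, int c, int d) {
        return (d & b) | (~d & c);
    }

    // One G-round step in the conventional order. Nothing useful involving
    // G can start until b (the result of the previous step) is available.
    static int stepBefore(int a, int b, int c, int d, int x, int k, int s) {
        int t = a + gOr(b, c, d) + x + k;
        return b + Integer.rotateLeft(t, s);
    }

    // Same step with the shortcut. (d & b) and (~d & c) share no set bits,
    // so OR can be replaced by ADD and the additions reordered.
    static int stepAfter(int a, int b, int c, int d, int x, int k, int s) {
        int t = a + x + k;   // does not need b
        t += (~d & c);       // does not need b
        t += (d & b);        // the only pre-rotate operation that needs b
        return b + Integer.rotateLeft(t, s);
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 1_000_000; i++) {
            int a = rnd.nextInt(), b = rnd.nextInt(), c = rnd.nextInt(), d = rnd.nextInt();
            int x = rnd.nextInt(), k = rnd.nextInt(), s = 1 + rnd.nextInt(31);
            if (stepBefore(a, b, c, d, x, k, s) != stepAfter(a, b, c, d, x, k, s)) {
                throw new AssertionError("mismatch at iteration " + i);
            }
        }
        System.out.println("all steps match");
    }
}
```

In stepAfter only the (d & b) term and the last addition wait on b, so the rest of the sum can be computed while the previous step is still finishing.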
Benchmark results on my two hosts:
Benchmark                  (algorithm)  (dataSize)  (provider)   Mode  Cnt    Score   Error  Units

x86 Before:
MessageDigestBench.digest          MD5     1048576              thrpt   10  636.389 ± 0.240  ops/s
x86 After:
MessageDigestBench.digest          MD5     1048576              thrpt   10  671.611 ± 0.226  ops/s  (+5.5%)

aarch64 Before:
MessageDigestBench.digest          MD5     1048576              thrpt   10  498.613 ± 0.359  ops/s
aarch64 After:
MessageDigestBench.digest          MD5     1048576              thrpt   10  526.008 ± 0.491  ops/s  (+5.5%)
-------------
Commit messages:
- Optimize md5 intrinsic
Changes: https://git.openjdk.org/jdk/pull/21203/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21203&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8341013
Stats: 6 lines in 2 files changed: 3 ins; 3 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/21203.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/21203/head:pull/21203
PR: https://git.openjdk.org/jdk/pull/21203