RFR: JDK-8299158: Improve MD5 intrinsic on AArch64
Yi-Fan Tsai
duke at openjdk.org
Wed Dec 21 02:00:35 UTC 2022
This change applies two optimizations that shorten the data path of the MD5 intrinsic.
1) Replace
__ eorw(rscratch3, rscratch3, r4);
__ addw(rscratch3, rscratch3, rscratch1);
__ addw(rscratch3, rscratch3, rscratch4);
with
__ eorw(rscratch3, rscratch3, r4);
__ addw(rscratch4, rscratch4, rscratch1);
__ addw(rscratch3, rscratch3, rscratch4);
The first addw no longer depends on the result of the eorw, so the two can be computed in parallel.
2) Replace
__ eorw(rscratch2, r2, r3);
__ andw(rscratch3, rscratch2, r4);
__ eorw(rscratch3, rscratch3, r3);
with
__ andw(rscratch3, r2, r4);
__ bicw(rscratch4, r3, r4);
__ orrw(rscratch3, rscratch3, rscratch4);
The transformation is based on the identity `((r2 ^ r3) & r4) ^ r3 == (r2 & r4) | (r3 & ~r4)`, where `r3 & ~r4` is what bicw computes.
The two subexpressions on RHS can be computed in parallel.
Correctness proof
r2  r3  r4  (r2 ^ r3)  ((r2 ^ r3) & r4)  LHS  (r2 & r4)  (r3 & ~r4)  RHS
0   0   0   0          0                 0    0          0           0
0   0   1   0          0                 0    0          0           0
0   1   0   1          0                 1    0          1           1
0   1   1   1          1                 0    0          0           0
1   0   0   1          0                 0    0          0           0
1   0   1   1          1                 1    1          0           1
1   1   0   0          0                 1    0          1           1
1   1   1   0          0                 1    1          0           1
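As a cross-check of both rewrites, here is a minimal C++ sketch that models the instruction sequences on 32-bit values and compares them on random inputs. It is not part of the patch. The function names and the parameters x, k and m (standing in for the initial contents of rscratch3, rscratch1 and rscratch4) are hypothetical, and unsigned wraparound is assumed to model addw.

```cpp
#include <cstdint>
#include <random>

// Optimization 1, original order: the first add waits for the xor.
static uint32_t sum_before(uint32_t x, uint32_t r4, uint32_t k, uint32_t m) {
    uint32_t t = x ^ r4;   // eorw rscratch3, rscratch3, r4
    t = t + k;             // addw rscratch3, rscratch3, rscratch1
    return t + m;          // addw rscratch3, rscratch3, rscratch4
}

// Optimization 1, new order: the first add is independent of the xor.
static uint32_t sum_after(uint32_t x, uint32_t r4, uint32_t k, uint32_t m) {
    uint32_t t = x ^ r4;   // eorw rscratch3, rscratch3, r4
    uint32_t u = m + k;    // addw rscratch4, rscratch4, rscratch1
    return t + u;          // addw rscratch3, rscratch3, rscratch4
}

// Optimization 2, original form: three dependent operations.
static uint32_t select_before(uint32_t r2, uint32_t r3, uint32_t r4) {
    return ((r2 ^ r3) & r4) ^ r3;   // eorw, andw, eorw
}

// Optimization 2, new form: the and and the bic are independent.
static uint32_t select_after(uint32_t r2, uint32_t r3, uint32_t r4) {
    return (r2 & r4) | (r3 & ~r4);  // andw, bicw, orrw
}

// Returns true if both pairs agree on `trials` random inputs.
// The seed is arbitrary; it only makes the run reproducible.
static bool sequences_agree(int trials) {
    std::mt19937 rng(8299158u);
    for (int i = 0; i < trials; ++i) {
        uint32_t a = static_cast<uint32_t>(rng());
        uint32_t b = static_cast<uint32_t>(rng());
        uint32_t c = static_cast<uint32_t>(rng());
        uint32_t d = static_cast<uint32_t>(rng());
        if (sum_before(a, b, c, d) != sum_after(a, b, c, d)) return false;
        if (select_before(a, b, c) != select_after(a, b, c)) return false;
    }
    return true;
}
```

Calling `sequences_agree(100000)` from a test harness should return true. The random test is redundant for the second identity, since it is bitwise and the eight rows of the truth table already cover every case. The first rewrite relies only on 32-bit addition being associative and commutative.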
The change has been tested by TestMD5Intrinsics and TestMD5MultiBlockIntrinsics.
Performance was measured on an EC2 m6g instance (Graviton2) and shows an 18-25% throughput improvement.
Baseline
Benchmark                    (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error  Units
MessageDigests.digest                   md5        64     DEFAULT  thrpt   15  2989.149 ± 54.895  ops/ms
MessageDigests.digest                   md5     16384     DEFAULT  thrpt   15    24.927 ±  0.002  ops/ms
MessageDigests.getAndDigest             md5        64     DEFAULT  thrpt   15  2433.184 ± 74.616  ops/ms
MessageDigests.getAndDigest             md5     16384     DEFAULT  thrpt   15    24.736 ±  0.002  ops/ms
Optimized
Benchmark                    (digesterName)  (length)  (provider)   Mode  Cnt     Score    Error  Units
MessageDigests.digest                   md5        64     DEFAULT  thrpt   15  3719.214 ± 23.087  ops/ms
MessageDigests.digest                   md5     16384     DEFAULT  thrpt   15    31.280 ±  0.003  ops/ms
MessageDigests.getAndDigest             md5        64     DEFAULT  thrpt   15  2874.308 ± 88.455  ops/ms
MessageDigests.getAndDigest             md5     16384     DEFAULT  thrpt   15    31.014 ±  0.060  ops/ms
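The per-benchmark gain can be recomputed from the two tables. The sketch below is illustrative only: the struct, the array and the function names are hypothetical, the scores are copied from the Baseline and Optimized results above, and the error columns are ignored.

```cpp
#include <cstddef>

// One benchmark row: throughput in ops/ms before and after the change.
struct Md5Result {
    const char* name;
    int length;
    double baseline;
    double optimized;
};

// Scores copied from the Baseline and Optimized tables.
static const Md5Result kResults[] = {
    {"MessageDigests.digest",       64,    2989.149, 3719.214},
    {"MessageDigests.digest",       16384, 24.927,   31.280},
    {"MessageDigests.getAndDigest", 64,    2433.184, 2874.308},
    {"MessageDigests.getAndDigest", 16384, 24.736,   31.014},
};

static const std::size_t kResultCount = sizeof(kResults) / sizeof(kResults[0]);

// Throughput gain of `optimized` over `baseline`, in percent.
static double gain_percent(double baseline, double optimized) {
    return (optimized / baseline - 1.0) * 100.0;
}

// Smallest gain across all rows, in percent.
static double min_gain_percent() {
    double best = gain_percent(kResults[0].baseline, kResults[0].optimized);
    for (std::size_t i = 1; i < kResultCount; ++i) {
        double g = gain_percent(kResults[i].baseline, kResults[i].optimized);
        if (g < best) best = g;
    }
    return best;
}
```

The smallest gain is the 64-byte getAndDigest case at roughly 18%. The 16384-byte cases come out at about 25%.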
-------------
Commit messages:
- transform GG
- Reduce the length of data path
Changes: https://git.openjdk.org/jdk/pull/11748/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11748&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8299158
Stats: 8 lines in 1 file changed: 1 ins; 1 del; 6 mod
Patch: https://git.openjdk.org/jdk/pull/11748.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/11748/head:pull/11748
PR: https://git.openjdk.org/jdk/pull/11748