RFR: 8333583: Crypto-XDH.generateSecret regression after JDK-8329538
Volodymyr Paprotski
duke at openjdk.org
Fri Jun 14 20:31:44 UTC 2024
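This fix recovers XDH performance but gives back some of the P256 gains (roughly 8-14% lower throughput than master with the mult() intrinsic). P256 is still faster than before the Montgomery PR, just not by as much.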
The fix is to undo the 'int' return type on mult()/square(), which allowed them to return a partially reduced result (i.e. it avoided extra reductions when the mult() result is fed into an addition). This restores the behaviour from before the Montgomery ECC PR.
I have a slightly better mult() intrinsic that does the reduction at the end, but decided to use a more conservative fix and keep the reduction in Java (i.e. the original mult() is refactored into multImpl() and reducePositive()). I will commit the optimizations I discovered while working on this in the next release.
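To illustrate the split described above, here is a minimal toy sketch, not the actual JDK code. It assumes a single-word field modulo the Mersenne prime 2^31 - 1 instead of the multi-limb P256 representation; the class name and the modulus are hypothetical, and only the method names multImpl() and reducePositive() are borrowed from the description. The multiply step alone leaves a partially reduced value in [0, 2P), and the separate reduction step brings it into [0, P).

```java
public final class PartialReductionSketch {
    // Hypothetical toy modulus: the Mersenne prime 2^31 - 1.
    // The real code operates on multi-limb field elements.
    static final long P = (1L << 31) - 1;

    // Multiply two fully reduced inputs (0 <= a, b < P).
    // The result is congruent to a*b mod P but only guaranteed
    // to lie in [0, 2P), i.e. it is partially reduced.
    static long multImpl(long a, long b) {
        long t = a * b;            // < 2^62, fits in a long
        long low = t & P;          // low 31 bits
        long high = t >>> 31;      // remaining high bits
        return low + high;         // valid because 2^31 == 1 (mod P)
    }

    // Finish the reduction: map a value in [0, 2P) into [0, P).
    static long reducePositive(long r) {
        return r >= P ? r - P : r;
    }

    // Always returns a fully reduced product.
    static long mult(long a, long b) {
        return reducePositive(multImpl(a, b));
    }

    public static void main(String[] args) {
        long a = P - 1, b = P - 1;
        System.out.println("partially reduced: " + multImpl(a, b));
        System.out.println("fully reduced:     " + mult(a, b));
    }
}
```

In this toy, callers that need a canonical value use mult(); skipping reducePositive() is only safe when the consumer tolerates values in [0, 2P), which is the kind of assumption the conservative fix avoids relying on.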
---
Performance before Montgomery PR:
Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6398.727 ± 7.400 ops/s
SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6129.739 ± 5.995 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1889.928 ± 54.660 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1866.339 ± 42.438 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1350.745 ± 28.514 ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1349.393 ± 32.050 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
KeyAgreementBench.XDH.generateSecret XDH 255 XDH thrpt 3 8435.277 ± 27.230 ops/s
Performance in master without mult() intrinsic:
Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6539.589 ± 132.844 ops/s
SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6202.530 ± 124.496 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1967.038 ± 15.819 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1931.667 ± 22.901 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1354.143 ± 24.861 ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1354.139 ± 21.904 ops/s
Performance in master with mult() intrinsic:
Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 10534.707 ± 20.690 ops/s
SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 9729.246 ± 102.803 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 3549.011 ± 77.343 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 3458.107 ± 14.622 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 2563.566 ± 94.381 ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 2569.143 ± 53.337 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
KeyAgreementBench.XDH.generateSecret XDH 255 XDH thrpt 3 8309.028 ± 22.071 ops/s
This PR without mult() intrinsic:
Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6225.541 ± 111.874 ops/s
SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 5913.876 ± 121.556 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1837.740 ± 42.881 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1815.064 ± 72.015 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1271.716 ± 17.119 ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1265.405 ± 19.382 ops/s
This PR with mult() intrinsic:
Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 9560.700 ± 232.557 ops/s
SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 8916.806 ± 164.756 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 3064.470 ± 72.166 ops/s
SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 2991.568 ± 75.720 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 2200.308 ± 13.744 ops/s
o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 2203.028 ± 1.948 ops/s
Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
KeyAgreementBench.XDH.generateSecret XDH 255 XDH thrpt 3 8514.924 ± 59.022 ops/s
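For readers who want to check the relative changes themselves, the following small sketch computes the percentage difference between the "master with mult() intrinsic" scores and the "this PR with mult intrinsic" scores listed above. The class and method names are hypothetical; the numbers are copied from the tables and the error margins are ignored.

```java
public final class RelativeChange {
    // Percentage change from a baseline score to a new score.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores in ops/s: master with intrinsic -> this PR with intrinsic.
        String[] names = {
            "ECDSA sign 1024", "ECDSA sign 16384",
            "ECDSA verify 1024", "ECDSA verify 16384",
            "ECDH generateSecret (full)", "XDH generateSecret"
        };
        double[] master = {10534.707, 9729.246, 3549.011, 3458.107, 2563.566, 8309.028};
        double[] thisPr = {9560.700, 8916.806, 3064.470, 2991.568, 2200.308, 8514.924};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-28s %+.1f%%%n", names[i],
                    percentChange(master[i], thisPr[i]));
        }
    }
}
```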
-------------
Commit messages:
- whitespace
- better reduction refactoring
- Undo incomplete p256 mult reduction optimization
Changes: https://git.openjdk.org/jdk/pull/19728/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=19728&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8333583
Stats: 130 lines in 9 files changed: 53 ins; 37 del; 40 mod
Patch: https://git.openjdk.org/jdk/pull/19728.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/19728/head:pull/19728
PR: https://git.openjdk.org/jdk/pull/19728