RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3]
Anthony Scarpino
ascarpino at openjdk.org
Tue Apr 23 20:09:30 UTC 2024
On Mon, 15 Apr 2024 22:12:30 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:
>> Performance. Before:
>>
>> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
>> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6443.934 ± 6.491 ops/s
>> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6152.979 ± 4.954 ops/s
>> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1895.410 ± 36.979 ops/s
>> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1878.955 ± 45.487 ops/s
>> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1357.810 ± 26.584 ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1352.119 ± 23.547 ops/s
>> Benchmark (isMontBench) Mode Cnt Score Error Units
>> PolynomialP256Bench.benchMultiply false thrpt 3 1746.126 ± 10.970 ops/s
>>
>> Performance, no intrinsic:
>>
>> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
>> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6529.839 ± 42.420 ops/s
>> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6199.747 ± 133.566 ops/s
>> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1973.676 ± 54.071 ops/s
>> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1932.127 ± 35.920 ops/s
>> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1355.788 ± 29.858 ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1346.523 ± 28.722 ops/s
>> Benchmark (isMontBench) Mode Cnt Score Error Units
>> PolynomialP256Bench.benchMultiply true thrpt 3 1919.57...
>
> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
>
> Comments from Jatin and Tony
src/java.base/share/classes/sun/security/ec/ECOperations.java line 204:
> 202: * @return the product
> 203: */
> 204: public MutablePoint multiply(AffinePoint affineP, byte[] s) {
It seems like there could be some combining of both `multiply()`. If `multiply(AffinePoint, ...)` is called, it can call `DefaultMultiplier` with the `affineP`, but internally call the other `multiply(ECPoint, ...)` for the other situations. I'd rather not have two methods doing most of the same code, but different methods.
src/java.base/share/classes/sun/security/ec/ECOperations.java line 467:
> 465: sealed static abstract class SmallWindowMultiplier implements PointMultiplier
> 466: permits DefaultMultiplier, DefaultMontgomeryMultiplier {
> 467: private final AffinePoint affineP;
I don't think `affineP` needs to be a class variable anymore. It's only used in the constructor
src/java.base/share/classes/sun/security/ec/ECOperations.java line 592:
> 590: }
> 591:
> 592: private final ProjectivePoint.Immutable[][] points;
Can you define this at the top please.
src/java.base/share/classes/sun/security/ec/ECOperations.java line 668:
> 666: }
> 667:
> 668: private final BigInteger[] base;
Can you define this at the top. You use it in the constructor but it's defined later on.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1576821201
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575499019
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575495263
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575491814
More information about the core-libs-dev
mailing list