RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v3]

Anthony Scarpino ascarpino at openjdk.org
Tue Apr 23 20:09:30 UTC 2024


On Mon, 15 Apr 2024 22:12:30 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:

>> Performance. Before:
>> 
>> Benchmark                        (algorithm)  (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score    Error  Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6443.934 ±  6.491  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6152.979 ±  4.954  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1895.410 ± 36.979  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3  1878.955 ± 45.487  ops/s
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC              thrpt    3  1357.810 ± 26.584  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC              thrpt    3  1352.119 ± 23.547  ops/s
>> Benchmark                          (isMontBench)   Mode  Cnt     Score    Error  Units
>> PolynomialP256Bench.benchMultiply          false  thrpt    3  1746.126 ± 10.970  ops/s
>> 
>> Performance, no intrinsic:
>> 
>> Benchmark                        (algorithm)  (dataSize)  (keyLength)  (provider)   Mode  Cnt     Score     Error  Units
>> SignatureBench.ECDSA.sign    SHA256withECDSA        1024          256              thrpt    3  6529.839 ±  42.420  ops/s
>> SignatureBench.ECDSA.sign    SHA256withECDSA       16384          256              thrpt    3  6199.747 ± 133.566  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA        1024          256              thrpt    3  1973.676 ±  54.071  ops/s
>> SignatureBench.ECDSA.verify  SHA256withECDSA       16384          256              thrpt    3  1932.127 ±  35.920  ops/s
>> Benchmark                                            (algorithm)  (keyLength)  (kpgAlgorithm)  (provider)   Mode  Cnt     Score    Error  Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret          ECDH          256              EC              thrpt    3  1355.788 ± 29.858  ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret         ECDH          256              EC              thrpt    3  1346.523 ± 28.722  ops/s
>> Benchmark                          (isMontBench)   Mode  Cnt     Score    Error  Units
>> PolynomialP256Bench.benchMultiply           true  thrpt    3  1919.57...
>
> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Comments from Jatin and Tony

src/java.base/share/classes/sun/security/ec/ECOperations.java line 204:

> 202:      * @return the product
> 203:      */
> 204:     public MutablePoint multiply(AffinePoint affineP, byte[] s) {

It seems like the two `multiply()` methods could be combined.  If `multiply(AffinePoint, ...)` is called, it can call `DefaultMultiplier` with the `affineP`, and internally call the other `multiply(ECPoint, ...)` for the other situations.  I'd rather not have two separate methods that do mostly the same thing.
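
As a rough illustration of the shape I have in mind (a toy sketch only, not the real `ECOperations` code): integers mod a small prime stand in for curve points, and `ToyAffinePoint`, `ToyEcPoint`, and `MODULUS` are hypothetical names made up for this example.  Which overload delegates to which is a design choice the sketch does not settle; the point is only that the loop lives in one place.

```java
// Toy sketch: integers mod a small prime stand in for curve points.
// None of these types are the real sun.security.ec classes.
final class MultiplySketch {
    static final int MODULUS = 97; // hypothetical toy group order

    // Hypothetical stand-in for AffinePoint.
    static final class ToyAffinePoint {
        final int value;
        ToyAffinePoint(int value) { this.value = value; }
    }

    // Hypothetical stand-in for ECPoint.
    static final class ToyEcPoint {
        final int value;
        ToyEcPoint(int value) { this.value = value; }
    }

    // Overload 1: normalizes its input and delegates; no arithmetic here.
    static int multiply(ToyEcPoint p, byte[] s) {
        int reduced = Math.floorMod(p.value, MODULUS);
        return multiply(new ToyAffinePoint(reduced), s);
    }

    // Overload 2: the only place the double-and-add loop is written.
    // The scalar s is read as a big-endian byte array.
    static int multiply(ToyAffinePoint p, byte[] s) {
        int acc = 0;
        for (byte b : s) {
            for (int bit = 7; bit >= 0; bit--) {
                acc = (acc + acc) % MODULUS;         // "double"
                if (((b >> bit) & 1) == 1) {
                    acc = (acc + p.value) % MODULUS; // "add"
                }
            }
        }
        return acc;
    }
}
```

Both overloads stay public, but only one of them carries the arithmetic, so a later fix to the loop cannot be applied to one path and missed in the other.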

src/java.base/share/classes/sun/security/ec/ECOperations.java line 467:

> 465:     sealed static abstract class SmallWindowMultiplier implements PointMultiplier
> 466:         permits DefaultMultiplier, DefaultMontgomeryMultiplier {
> 467:         private final AffinePoint affineP;

I don't think `affineP` needs to be an instance field anymore.  It's only used in the constructor.
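
A minimal sketch of what I mean, using a hypothetical toy class rather than the real `SmallWindowMultiplier`: the constructor argument is consumed while building the table and is never stored.

```java
// Hypothetical toy class; not the real SmallWindowMultiplier.
final class WindowTableSketch {
    // The only state that is needed after construction.
    private final int[] table;

    WindowTableSketch(int base, int windowSize) {
        // 'base' is used here to fill the table and is not kept as a field.
        table = new int[windowSize];
        for (int i = 0; i < windowSize; i++) {
            table[i] = base * i;
        }
    }

    int lookup(int i) {
        return table[i];
    }
}
```

With the field gone, the object holds only the precomputed table, and a reader does not have to check whether the input point is read again later.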

src/java.base/share/classes/sun/security/ec/ECOperations.java line 592:

> 590:         }
> 591: 
> 592:         private final ProjectivePoint.Immutable[][] points;

Can you define this at the top, please?

src/java.base/share/classes/sun/security/ec/ECOperations.java line 668:

> 666:         }
> 667: 
> 668:         private final BigInteger[] base;

Can you define this at the top?  You use it in the constructor, but it's defined later on.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1576821201
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575499019
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575495263
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1575491814

