RFR: 8333583: Crypto-XDH.generateSecret regression after JDK-8329538 [v3]
Ferenc Rakoczi
duke at openjdk.org
Tue Jun 25 17:33:22 UTC 2024
On Mon, 17 Jun 2024 16:38:55 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:
>> This fix recovers XDH performance but removes some of the P256 gains (~-8-14%). Still faster, but not as much.
>>
>> The fix is to undo 'int' return type on mult()/square(), which allowed to return partially reduced result (e.g. this avoids extra reductions when mult() result is fed into addition). This is the behaviour before the Montgomery ECC PR.
>>
>> ---
>> XDH.generateSecret performance
>> before Montgomery PR:
>>
>> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
>> KeyAgreementBench.XDH.generateSecret XDH 255 XDH thrpt 3 8435.277 ± 27.230 ops/s
>>
>> after Montgomery PR:
>>
>> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
>> KeyAgreementBench.XDH.generateSecret XDH 255 XDH thrpt 3 8309.028 ± 22.071 ops/s
>>
>> with this PR:
>>
>> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
>> KeyAgreementBench.XDH.generateSecret XDH 255 XDH thrpt 3 8491.268 ± 32.858 ops/s
>>
>> ---
>>
>> P256 performance with/without mult intrinsic:
>>
>> Performance before Montgomery PR:
>>
>> Benchmark (algorithm) (dataSize) (keyLength) (provider) Mode Cnt Score Error Units
>> SignatureBench.ECDSA.sign SHA256withECDSA 1024 256 thrpt 3 6398.727 ± 7.400 ops/s
>> SignatureBench.ECDSA.sign SHA256withECDSA 16384 256 thrpt 3 6129.739 ± 5.995 ops/s
>> SignatureBench.ECDSA.verify SHA256withECDSA 1024 256 thrpt 3 1889.928 ± 54.660 ops/s
>> SignatureBench.ECDSA.verify SHA256withECDSA 16384 256 thrpt 3 1866.339 ± 42.438 ops/s
>> Benchmark (algorithm) (keyLength) (kpgAlgorithm) (provider) Mode Cnt Score Error Units
>> o.o.b.j.c.full.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1350.745 ± 28.514 ops/s
>> o.o.b.j.c.small.KeyAgreementBench.EC.generateSecret ECDH 256 EC thrpt 3 1349.393 ± 32.050 ops/s
>>
>> Performance in master without mult() intrinsic
>>
>> Benchmark ...
>
> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
>
> comment from Sandhya
Looks good to me. It would be good, though, to figure out what else could be done to regain the P256 performance with keeping the speed of this code path.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/19728#issuecomment-2189545307
More information about the security-dev
mailing list