RFR: 8295010: Reduce if required in EC limbs operations [v6]
John Jiang
jjiang at openjdk.org
Mon Nov 28 01:57:19 UTC 2022
On Wed, 23 Nov 2022 06:21:05 GMT, Xue-Lei Andrew Fan <xuelei at openjdk.org> wrote:
>> Hi,
>>
>> May I have this update reviewed? With this update, the result will be reduced if required in EC limbs operations in the JDK implementation.
>>
>> In the current implementation, the EC limbs addition and subtraction result is not reduced before the value is returned. This behavior is different from multiplication and square. To get it right, a maximum addition limit is introduced for EC limbs addition and subtraction. If the limit is reached, an exception will be thrown so as to indicate an EC implementation problem. The exception is not recoverable from the viewpoint of an application.
>>
>> The design is error prone as the EC implementation code must be carefully checked so that the limit cannot reach. If a reach is noticed, a reduce operation would have to be hard-coded additionally. In the FieldGen.java implementation, the limit can only be 1 or 2 as the implementation code is only able to handle 2. But the FieldGen.java does not really check if 1 or 2 really works for the specific filed generation parameters.
>>
>> The design impact the performance as well. Because of this limit, the maximum limb size is 28 bits for 2 max adding limit. Otherwise there are integer (long) overflow issues. For example for 256 bits curves, 10 limbs are required for 28-bit limb size; and 9 limbs for 29-bit size. By reducing 1 limbs from 10 to 9, the Signature performance could get improved by 20%.
>>
>> In the IntegerPolynomial class description, it is said "All IntegerPolynomial implementations allow at most one addition before multiplication. Additions after that will result in an ArithmeticException." It's too strict to follow without exam the code very carefully. Indeed, the implementation does not really follow the spec, and 2 addition may be allowed.
>>
>> It would be nice if there is no addition limit, and then we are free from these issues. It is doable by reducing the limbs if required for EC limbs operations.
>>
>> The benchmark for ECDSA Signature is checked in order to see how much the performance could be impacted. Here are the benchmark numbers before the patch applied:
>>
>> Benchmark (curveName) (messageLength) Mode Cnt Score Error Units
>> Signatures.sign secp256r1 64 thrpt 15 1417.507 ± 5.809 ops/s
>> Signatures.sign secp256r1 512 thrpt 15 1412.620 ± 7.040 ops/s
>> Signatures.sign secp256r1 2048 thrpt 15 1417.364 ± 6.643 ops/s
>> Signatures.sign secp256r1 16384 thrpt 15 1364.720 ± 44.354 ops/s
>> Signatures.sign secp384r1 64 thrpt 15 609.924 ± 9.858 ops/s
>> Signatures.sign secp384r1 512 thrpt 15 610.873 ± 5.519 ops/s
>> Signatures.sign secp384r1 2048 thrpt 15 608.573 ± 13.324 ops/s
>> Signatures.sign secp384r1 16384 thrpt 15 597.802 ± 7.074 ops/s
>> Signatures.sign secp521r1 64 thrpt 15 301.954 ± 6.449 ops/s
>> Signatures.sign secp521r1 512 thrpt 15 300.774 ± 5.849 ops/s
>> Signatures.sign secp521r1 2048 thrpt 15 295.831 ± 8.423 ops/s
>> Signatures.sign secp521r1 16384 thrpt 15 276.258 ± 43.146 ops/s
>> Signatures.sign Ed25519 64 thrpt 15 1171.878 ± 4.187 ops/s
>> Signatures.sign Ed25519 512 thrpt 15 1175.175 ± 4.611 ops/s
>> Signatures.sign Ed25519 2048 thrpt 15 1168.459 ± 5.750 ops/s
>> Signatures.sign Ed25519 16384 thrpt 15 1086.906 ± 6.514 ops/s
>> Signatures.sign Ed448 64 thrpt 15 326.475 ± 3.274 ops/s
>> Signatures.sign Ed448 512 thrpt 15 328.709 ± 3.867 ops/s
>> Signatures.sign Ed448 2048 thrpt 15 315.480 ± 15.377 ops/s
>> Signatures.sign Ed448 16384 thrpt 15 312.388 ± 1.067 ops/s
>>
>>
>> and here are the numbers with this patch applied:
>>
>> Benchmark (curveName) (messageLength) Mode Cnt Score Error Units
>> Signatures.sign secp256r1 64 thrpt 15 1377.447 ± 38.889 ops/s
>> Signatures.sign secp256r1 512 thrpt 15 1403.149 ± 5.151 ops/s
>> Signatures.sign secp256r1 2048 thrpt 15 1401.101 ± 6.937 ops/s
>> Signatures.sign secp256r1 16384 thrpt 15 1381.855 ± 10.606 ops/s
>> Signatures.sign secp384r1 64 thrpt 15 595.979 ± 1.967 ops/s
>> Signatures.sign secp384r1 512 thrpt 15 592.419 ± 6.087 ops/s
>> Signatures.sign secp384r1 2048 thrpt 15 581.800 ± 18.815 ops/s
>> Signatures.sign secp384r1 16384 thrpt 15 583.224 ± 9.856 ops/s
>> Signatures.sign secp521r1 64 thrpt 15 264.772 ± 36.883 ops/s
>> Signatures.sign secp521r1 512 thrpt 15 302.084 ± 1.074 ops/s
>> Signatures.sign secp521r1 2048 thrpt 15 296.619 ± 3.272 ops/s
>> Signatures.sign secp521r1 16384 thrpt 15 290.112 ± 5.085 ops/s
>> Signatures.sign Ed25519 64 thrpt 15 1163.293 ± 4.266 ops/s
>> Signatures.sign Ed25519 512 thrpt 15 1157.093 ± 8.145 ops/s
>> Signatures.sign Ed25519 2048 thrpt 15 1140.900 ± 8.660 ops/s
>> Signatures.sign Ed25519 16384 thrpt 15 1053.233 ± 40.591 ops/s
>> Signatures.sign Ed448 64 thrpt 15 317.509 ± 6.870 ops/s
>> Signatures.sign Ed448 512 thrpt 15 320.975 ± 1.534 ops/s
>> Signatures.sign Ed448 2048 thrpt 15 322.157 ± 1.914 ops/s
>> Signatures.sign Ed448 16384 thrpt 15 303.532 ± 6.419 ops/s
>>
>>
>> From the numbers, it looks like the performance impact is limited. Based on this update, the performance could get rewarded by using less limbs. For examples if using less limbs for secp256r1/secp384r1/secp521r1/curve25519, [see PR for PR for JDK-8294248](https://github.com/openjdk/jdk/pull/10398), and run the benchmark together with this patch, I could see the performance improvement as follow (note: No update for Ed448 as using less limbs does not improve the performance for Ed448):
>>
>> Benchmark (curveName) (messageLength) Mode Cnt Score Error Units
>> Signatures.sign secp256r1 64 thrpt 15 1769.262 ± 4.251 ops/s
>> Signatures.sign secp256r1 512 thrpt 15 1764.754 ± 12.185 ops/s
>> Signatures.sign secp256r1 2048 thrpt 15 1737.499 ± 43.937 ops/s
>> Signatures.sign secp256r1 16384 thrpt 15 1733.932 ± 7.938 ops/s
>> Signatures.sign secp384r1 64 thrpt 15 636.614 ± 6.752 ops/s
>> Signatures.sign secp384r1 512 thrpt 15 629.449 ± 8.000 ops/s
>> Signatures.sign secp384r1 2048 thrpt 15 645.898 ± 5.655 ops/s
>> Signatures.sign secp384r1 16384 thrpt 15 634.530 ± 11.846 ops/s
>> Signatures.sign secp521r1 64 thrpt 15 334.413 ± 6.710 ops/s
>> Signatures.sign secp521r1 512 thrpt 15 335.088 ± 6.630 ops/s
>> Signatures.sign secp521r1 2048 thrpt 15 339.729 ± 1.450 ops/s
>> Signatures.sign secp521r1 16384 thrpt 15 327.597 ± 2.341 ops/s
>> Signatures.sign Ed25519 64 thrpt 15 1361.028 ± 2.338 ops/s
>> Signatures.sign Ed25519 512 thrpt 15 1359.072 ± 3.748 ops/s
>> Signatures.sign Ed25519 2048 thrpt 15 1347.409 ± 9.172 ops/s
>> Signatures.sign Ed25519 16384 thrpt 15 1250.794 ± 11.750 ops/s
>> Signatures.sign Ed448 64 thrpt 15 321.312 ± 5.826 ops/s
>> Signatures.sign Ed448 512 thrpt 15 326.592 ± 4.213 ops/s
>> Signatures.sign Ed448 2048 thrpt 15 322.183 ± 7.005 ops/s
>> Signatures.sign Ed448 16384 thrpt 15 299.869 ± 16.594 ops/s
>> ```
>>
>> If considering this PR together with [PR for JDK-8294248](https://github.com/openjdk/jdk/pull/10398), performance improvement could be expected for EC crypto primitives.
>>
>> Thanks,
>> Xuelei
>
> Xue-Lei Andrew Fan has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits:
>
> - Merge
> - remove duplicated bench test
> - Merge master
> - missed reduce
> - reduce if needed
> - add the key pair generation bench code
> - remove tabs
> - 8295010: EC limbs addition and subtraction limit
Marked as reviewed by jjiang (Reviewer).
-------------
PR: https://git.openjdk.org/jdk/pull/10624
More information about the security-dev
mailing list