RFR: 8295010: Reduce if required in EC limbs operations [v5]

Daniel Jeliński djelinski at openjdk.org
Mon Nov 21 07:46:29 UTC 2022


On Sat, 19 Nov 2022 07:17:19 GMT, Xue-Lei Andrew Fan <xuelei at openjdk.org> wrote:

>> Hi,
>> 
>> May I have this update reviewed?  With this update, the result will be reduced if required in EC limbs operations in the JDK implementation.
>> 
>> In the current implementation, the EC limbs addition and subtraction result is not reduced before the value is returned.  This behavior is different from multiplication and square.  To get it right, a maximum addition limit is introduced for EC limbs addition and subtraction.  If the limit is reached, an exception will be thrown so as to indicate an EC implementation problem.  The exception is not recoverable from the viewpoint of an application.
>> 
>> The design is error prone as the EC implementation code must be carefully checked so that the limit cannot reach. If a reach is noticed, a reduce operation would have to be hard-coded additionally. In the FieldGen.java implementation, the limit can only be 1 or 2 as the implementation code is only able to handle 2.  But the FieldGen.java does not really check if 1 or 2 really works for the specific filed generation parameters.
>> 
>> The design impact the performance as well. Because of this limit, the maximum limb size is 28 bits for 2 max adding limit. Otherwise there are integer (long) overflow issues. For example for 256 bits curves, 10 limbs are required for 28-bit limb size; and 9 limbs for 29-bit size. By reducing 1 limbs from 10 to 9, the Signature performance could get improved by 20%.
>> 
>> In the IntegerPolynomial class description, it is said "All IntegerPolynomial implementations allow at most one addition before multiplication. Additions after that will result in an ArithmeticException." It's too strict to follow without exam the code very carefully. Indeed, the implementation does not really follow the spec, and 2 addition may be allowed.
>> 
>> It would be nice if there is no addition limit, and then we are free from these issues. It is doable by reducing the limbs if required for EC limbs operations.
>> 
>> The benchmark for ECDSA Signature is checked in order to see how much the performance could be impacted. Here are the benchmark numbers before the patch applied:
>> 
>> Benchmark        (curveName)  (messageLength)   Mode  Cnt     Score    Error  Units
>> Signatures.sign    secp256r1               64  thrpt   15  1417.507 ±  5.809  ops/s
>> Signatures.sign    secp256r1              512  thrpt   15  1412.620 ±  7.040  ops/s
>> Signatures.sign    secp256r1             2048  thrpt   15  1417.364 ±  6.643  ops/s
>> Signatures.sign    secp256r1            16384  thrpt   15  1364.720 ± 44.354  ops/s
>> Signatures.sign    secp384r1               64  thrpt   15   609.924 ±  9.858  ops/s
>> Signatures.sign    secp384r1              512  thrpt   15   610.873 ±  5.519  ops/s
>> Signatures.sign    secp384r1             2048  thrpt   15   608.573 ± 13.324  ops/s
>> Signatures.sign    secp384r1            16384  thrpt   15   597.802 ±  7.074  ops/s
>> Signatures.sign    secp521r1               64  thrpt   15   301.954 ±  6.449  ops/s
>> Signatures.sign    secp521r1              512  thrpt   15   300.774 ±  5.849  ops/s
>> Signatures.sign    secp521r1             2048  thrpt   15   295.831 ±  8.423  ops/s
>> Signatures.sign    secp521r1            16384  thrpt   15   276.258 ± 43.146  ops/s
>> Signatures.sign      Ed25519               64  thrpt   15  1171.878 ±  4.187  ops/s
>> Signatures.sign      Ed25519              512  thrpt   15  1175.175 ±  4.611  ops/s
>> Signatures.sign      Ed25519             2048  thrpt   15  1168.459 ±  5.750  ops/s
>> Signatures.sign      Ed25519            16384  thrpt   15  1086.906 ±  6.514  ops/s
>> Signatures.sign        Ed448               64  thrpt   15   326.475 ±  3.274  ops/s
>> Signatures.sign        Ed448              512  thrpt   15   328.709 ±  3.867  ops/s
>> Signatures.sign        Ed448             2048  thrpt   15   315.480 ± 15.377  ops/s
>> Signatures.sign        Ed448            16384  thrpt   15   312.388 ±  1.067  ops/s
>> 
>> 
>> and here are the numbers with this patch applied:
>> 
>> Benchmark        (curveName)  (messageLength)   Mode  Cnt     Score    Error  Units
>> Signatures.sign    secp256r1               64  thrpt   15  1377.447 ± 38.889  ops/s
>> Signatures.sign    secp256r1              512  thrpt   15  1403.149 ±  5.151  ops/s
>> Signatures.sign    secp256r1             2048  thrpt   15  1401.101 ±  6.937  ops/s
>> Signatures.sign    secp256r1            16384  thrpt   15  1381.855 ± 10.606  ops/s
>> Signatures.sign    secp384r1               64  thrpt   15   595.979 ±  1.967  ops/s
>> Signatures.sign    secp384r1              512  thrpt   15   592.419 ±  6.087  ops/s
>> Signatures.sign    secp384r1             2048  thrpt   15   581.800 ± 18.815  ops/s
>> Signatures.sign    secp384r1            16384  thrpt   15   583.224 ±  9.856  ops/s
>> Signatures.sign    secp521r1               64  thrpt   15   264.772 ± 36.883  ops/s
>> Signatures.sign    secp521r1              512  thrpt   15   302.084 ±  1.074  ops/s
>> Signatures.sign    secp521r1             2048  thrpt   15   296.619 ±  3.272  ops/s
>> Signatures.sign    secp521r1            16384  thrpt   15   290.112 ±  5.085  ops/s
>> Signatures.sign      Ed25519               64  thrpt   15  1163.293 ±  4.266  ops/s
>> Signatures.sign      Ed25519              512  thrpt   15  1157.093 ±  8.145  ops/s
>> Signatures.sign      Ed25519             2048  thrpt   15  1140.900 ±  8.660  ops/s
>> Signatures.sign      Ed25519            16384  thrpt   15  1053.233 ± 40.591  ops/s
>> Signatures.sign        Ed448               64  thrpt   15   317.509 ±  6.870  ops/s
>> Signatures.sign        Ed448              512  thrpt   15   320.975 ±  1.534  ops/s
>> Signatures.sign        Ed448             2048  thrpt   15   322.157 ±  1.914  ops/s
>> Signatures.sign        Ed448            16384  thrpt   15   303.532 ±  6.419  ops/s
>> 
>> 
>> From the numbers, it looks like the performance impact is limited.  Based on this update, the performance could get rewarded by using less limbs.  For examples if using less limbs for secp256r1/secp384r1/secp521r1/curve25519, [see PR for PR for JDK-8294248](https://github.com/openjdk/jdk/pull/10398), and run the benchmark together with this patch, I could see the performance improvement as follow (note: No update for Ed448 as using less limbs does not improve the performance for Ed448):
>> 
>> Benchmark        (curveName)  (messageLength)   Mode  Cnt     Score    Error  Units
>> Signatures.sign    secp256r1               64  thrpt   15  1769.262 ±  4.251  ops/s
>> Signatures.sign    secp256r1              512  thrpt   15  1764.754 ± 12.185  ops/s
>> Signatures.sign    secp256r1             2048  thrpt   15  1737.499 ± 43.937  ops/s
>> Signatures.sign    secp256r1            16384  thrpt   15  1733.932 ±  7.938  ops/s
>> Signatures.sign    secp384r1               64  thrpt   15   636.614 ±  6.752  ops/s
>> Signatures.sign    secp384r1              512  thrpt   15   629.449 ±  8.000  ops/s
>> Signatures.sign    secp384r1             2048  thrpt   15   645.898 ±  5.655  ops/s
>> Signatures.sign    secp384r1            16384  thrpt   15   634.530 ± 11.846  ops/s
>> Signatures.sign    secp521r1               64  thrpt   15   334.413 ±  6.710  ops/s
>> Signatures.sign    secp521r1              512  thrpt   15   335.088 ±  6.630  ops/s
>> Signatures.sign    secp521r1             2048  thrpt   15   339.729 ±  1.450  ops/s
>> Signatures.sign    secp521r1            16384  thrpt   15   327.597 ±  2.341  ops/s
>> Signatures.sign      Ed25519               64  thrpt   15  1361.028 ±  2.338  ops/s
>> Signatures.sign      Ed25519              512  thrpt   15  1359.072 ±  3.748  ops/s
>> Signatures.sign      Ed25519             2048  thrpt   15  1347.409 ±  9.172  ops/s
>> Signatures.sign      Ed25519            16384  thrpt   15  1250.794 ± 11.750  ops/s
>> Signatures.sign        Ed448               64  thrpt   15   321.312 ±  5.826  ops/s
>> Signatures.sign        Ed448              512  thrpt   15   326.592 ±  4.213  ops/s
>> Signatures.sign        Ed448             2048  thrpt   15   322.183 ±  7.005  ops/s
>> Signatures.sign        Ed448            16384  thrpt   15   299.869 ± 16.594  ops/s
>> ``` 
>> 
>> If considering this PR together with [PR for JDK-8294248](https://github.com/openjdk/jdk/pull/10398), performance improvement could be expected for EC crypto primitives.
>> 
>> Thanks,
>> Xuelei
>
> Xue-Lei Andrew Fan has updated the pull request incrementally with one additional commit since the last revision:
> 
>   remove duplicated bench test

Can you share the updated benchmarks?

-------------

PR: https://git.openjdk.org/jdk/pull/10624


More information about the security-dev mailing list