RFR: 8322767: TLS full handshake is slow with PKCS12KeyStore and X509KeyManagerImpl [v3]

Hai-May Chao hchao at openjdk.org
Wed Mar 27 08:17:22 UTC 2024


On Tue, 26 Mar 2024 06:00:33 GMT, Hai-May Chao <hchao at openjdk.org> wrote:

>> For the PKIX KeyManager and PKCS12 Keystore, when the TLS server sends the ServerHello message and ultimately calls the X509KeyManagerImpl.chooseEngineServerAlias() method, it retrieves the private key from the keystore, decrypts it, and caches both the key and its certificate. This caching currently occurs only during a single handshake. Since decryption can be time-consuming, a modification has been implemented to cache the keystore entries at initialization time. This way, it won't be necessary to retrieve and decrypt the keys for multiple handshakes, which could lead to performance drawbacks.
>> 
>> A change was made to also update/refresh the cached entry as the certificates in the PKCS12 keystore may change, for scenarios like when the certificate expires and a new one is added under a different alias, and the certificate chain returned by the PKCS12 keystore is not the same as the one in the cache. While attempting to handle when to refresh a cached entry to accommodate keystore changes, we would like to know if you agree that this improvement is worth the risk. We would also like to know if you have a preference for other options:
>> 
>> 1. Accept that PKIX+PKCS12 is slow.
>> 2. Add a configuration option (system property, maybe) to decide the level of caching (1 - same as the existing one, 2 - same caching as in SunX509KeyManagerImpl, 3 - the new caching introduced in this change).
>> 
>> Additionally, the benchmark test SSLHandshake.java is modified to include a @Param annotation, allowing it to pass different KeyManagerFactory values (SunX509 and PKIX) to the benchmark method.
>> 
>> Running modified SSLHandshake.java test prior to the change that caches the PKCS12 keystore entries for PKIX:
>> Benchmark                 (keyMgr)  (resume)  (tlsVersion)   Mode  Cnt     Score     Error  Units
>> SSLHandshake.doHandshake   SunX509      true       TLSv1.2  thrpt   15  9346.292 ? 379.023  ops/s
>> SSLHandshake.doHandshake   SunX509      true           TLS  thrpt   15   940.175 ?  21.215  ops/s
>> SSLHandshake.doHandshake   SunX509     false       TLSv1.2  thrpt   15   594.418 ?  23.374  ops/s
>> SSLHandshake.doHandshake   SunX509     false           TLS  thrpt   15   534.030 ?  16.709  ops/s
>> SSLHandshake.doHandshake      PKIX      true       TLSv1.2  thrpt   15  9359.086 ? 246.257  ops/s
>> SSLHandshake.doHandshake      PKIX      true           TLS  thrpt   15   933.835 ?  81.388  ops/s
>> SSLHandshake.doHandshake      PKIX     false       TLSv1.2  thrpt   15   ...
>
> Hai-May Chao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Updated with John's comments

Yes, it’d be nice to make sure no bugs were introduced. It is not desirable to introduce bugs when working on this issue to improve performance.

As each run may have slightly different scores, I ran the benchmark again. This time, it shows no regression as below. It seems the small score degradation may not be flagged as regression; hence, I should run multiple times to try to get the average results.

Benchmark                                       (keyMgr) (resume) (tlsVersion)   Mode   Cnt            Score         Error    Units   
SSLHandshake.doHandshake        SunX509         true       TLSv1.2     thrpt     15      9381.086 ?  171.396   ops/s
SSLHandshake.doHandshake        SunX509         true              TLS     thrpt     15        953.039 ?   19.431   ops/s
SSLHandshake.doHandshake        SunX509        false        TLSv1.2    thrpt     15         616.918 ?     7.652   ops/s
SSLHandshake.doHandshake        SunX509        false               TLS    thrpt     15        530.865 ?     7.109   ops/s
SSLHandshake.doHandshake               PKIX          true        TLSv1.2    thrpt     15      9653.252 ?   92.254  ops/s
SSLHandshake.doHandshake               PKIX          true              TLS     thrpt     15         951.771 ?   28.665  ops/s
SSLHandshake.doHandshake               PKIX         false        TLSv1.2    thrpt     15         617.288 ?      7.197   ops/s
SSLHandshake.doHandshake               PKIX         false              TLS     thrpt     15        521.930 ?    19.292  ops/s

I ran the benchmark to measure the time needed to build a TLS context using PKIX KeyManager (with protocols "TLSv1.2" and "TLS”) before and after the changes to X509KeyManagerImpl.java. Here are the results:

Without changes:
Benchmark                                (tlsVersion)    Mode    Cnt                  Score               Error     Units
SSLHandshake.doHandshake        TLSv1.2     thrpt      15      1073890.853 ?   18249.629     ops/s
SSLHandshake.doHandshake              TLS      thrpt      15      1139049.624 ?     5507.867     ops/s

With changes:
Benchmark                                (tlsVersion)    Mode    Cnt         Score           Error   Units
SSLHandshake.doHandshake        TLSv1.2     thrpt      15      130.039 ?       4.372   ops/s
SSLHandshake.doHandshake              TLS      thrpt      15      134.157 ?       1.903    ops/s

For the case without changes, it has higher Error values and looks like there is more variability in the measurements. I ran it again, and got similar results with no background processes running. I can’t explain if there may be environmental variables affecting the Error values. The changes that arise time to set up cached map at initialization time will be paid off for multiple handshaking.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17956#issuecomment-2022178215



More information about the security-dev mailing list