RFR: 8322767: TLS full handshake is slow with PKCS12KeyStore and X509KeyManagerImpl [v3]

Daniel Jeliński djelinski at openjdk.org
Wed Mar 27 09:20:29 UTC 2024


On Tue, 26 Mar 2024 06:00:33 GMT, Hai-May Chao <hchao at openjdk.org> wrote:

>> For the PKIX KeyManager and PKCS12 Keystore, when the TLS server sends the ServerHello message and ultimately calls the X509KeyManagerImpl.chooseEngineServerAlias() method, it retrieves the private key from the keystore, decrypts it, and caches both the key and its certificate. This caching currently occurs only during a single handshake. Since decryption can be time-consuming, a modification has been implemented to cache the keystore entries at initialization time. This way, it won't be necessary to retrieve and decrypt the keys for multiple handshakes, which could lead to performance drawbacks.
>> 
>> A change was made to also update/refresh the cached entry as the certificates in the PKCS12 keystore may change, for scenarios like when the certificate expires and a new one is added under a different alias, and the certificate chain returned by the PKCS12 keystore is not the same as the one in the cache. While attempting to handle when to refresh a cached entry to accommodate keystore changes, we would like to know if you agree that this improvement is worth the risk. We would also like to know if you have a preference for other options:
>> 
>> 1. Accept that PKIX+PKCS12 is slow.
>> 2. Add a configuration option (system property, maybe) to decide the level of caching (1 - same as the existing one, 2 - same caching as in SunX509KeyManagerImpl, 3 - the new caching introduced in this change).
>> 
>> Additionally, the benchmark test SSLHandshake.java is modified to include a @Param annotation, allowing it to pass different KeyManagerFactory values (SunX509 and PKIX) to the benchmark method.
>> 
>> Running modified SSLHandshake.java test prior to the change that caches the PKCS12 keystore entries for PKIX:
>> Benchmark                 (keyMgr)  (resume)  (tlsVersion)   Mode  Cnt     Score     Error  Units
>> SSLHandshake.doHandshake   SunX509      true       TLSv1.2  thrpt   15  9346.292 ? 379.023  ops/s
>> SSLHandshake.doHandshake   SunX509      true           TLS  thrpt   15   940.175 ?  21.215  ops/s
>> SSLHandshake.doHandshake   SunX509     false       TLSv1.2  thrpt   15   594.418 ?  23.374  ops/s
>> SSLHandshake.doHandshake   SunX509     false           TLS  thrpt   15   534.030 ?  16.709  ops/s
>> SSLHandshake.doHandshake      PKIX      true       TLSv1.2  thrpt   15  9359.086 ? 246.257  ops/s
>> SSLHandshake.doHandshake      PKIX      true           TLS  thrpt   15   933.835 ?  81.388  ops/s
>> SSLHandshake.doHandshake      PKIX     false       TLSv1.2  thrpt   15   ...
>
> Hai-May Chao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Updated with John's comments

Well this PR doesn't introduce new bugs, but it exacerbates a preexisting one. The key manager does not try to validate the private/public key pair in any way, and will use whatever the key store provides, even if not valid. With the current key manager implementation, as soon as a key pair is changed, the key manager will start using the new pair. With the proposed PR, the private key is cached, and the certificate chain is used as the cache key. If the key store entry is corrected by changing the private key only, the key manager cache will not be updated, and the bad private key will be served until the certificate chain changes.

The default key store getEntry method implementation is not thread-safe, and during concurrent modification it can return a PrivateKeyEntry mixing the old private key and the new certificate chain. This was recently fixed for PKCS12 key stores (#18156), but other key store types might still be affected. This is why the PR is only caching PKCS12 key stores. This should really be explained in a code comment by the way.

With PKCS12 key store, we can still cache an invalid entry if the invalid entry is actually present in the key store. This could happen if a user tried to update the key/cert pair, but used the wrong key in the process. If the user then tried to fix the mistake by updating the key without changing the certificate chain, the change would not be picked up by the key manager, which would keep serving the wrong key.

The above scenario, as unlikely as it sounds, i why I'm not sure about this change.

What alternatives do we have?

- We could keep the existing code, and accept the fact that it's 400% slower than SunX509. However, given that we want to steer users away from SunX509 and change the default to PKIX ([JDK-8272875](https://bugs.openjdk.org/browse/JDK-8272875)), the performance penalty might be a bitter pill to swallow.
- We could use a faster KDF for private key encryption in PKCS12; JKS is also encrypting private keys, but the encryption overhead is negligible (SunX509+JKS had similar performance to PKIX+JKS last time I measured)
- We could implement a new caching key manager, that would cache key store entries during creation (like SunX509), and then behave identically to PKIX.
- We could add a new system property to control caching in PKIX key manager, then switch to PKIX, then point the users who need every last bit of performance to that system property.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17956#issuecomment-2022282831



More information about the security-dev mailing list