RFR: 8259886 : Improve SSL session cache performance and scalability [v2]
Xue-Lei Andrew Fan
xuelei at openjdk.java.net
Tue Feb 2 02:40:44 UTC 2021
On Mon, 1 Feb 2021 18:49:04 GMT, djelinski <github.com+30433125+djelinski at openjdk.org> wrote:
>> Under certain load, MemoryCache operations take a substantial fraction of the time needed to complete SSL handshakes. This series of patches improves performance characteristics of MemoryCache, at the cost of a functional change: expired entries are no longer guaranteed to be removed before live ones. Unused entries are still removed before used ones, and cache performance no longer depends on its capacity.
>>
>> First patch in the series contains a benchmark that can be run with `make test TEST="micro:CacheBench"`.
>> Baseline results before any MemoryCache changes:
>> Benchmark (size) (timeout) Mode Cnt Score Error Units
>> CacheBench.put 20480 86400 avgt 25 83.653 ? 6.269 us/op
>> CacheBench.put 20480 0 avgt 25 0.107 ? 0.001 us/op
>> CacheBench.put 204800 86400 avgt 25 2057.781 ? 35.942 us/op
>> CacheBench.put 204800 0 avgt 25 0.108 ? 0.001 us/op
>> there's a nonlinear performance drop between 20480 and 204800 entries, probably attributable to CPU cache thrashing. Beyond 204800 entries the cache scales more linearly.
>>
>> Benchmark results after the 2nd and 3rd patches are pretty similar, so I'll only copy one:
>> Benchmark (size) (timeout) Mode Cnt Score Error Units
>> CacheBench.put 20480 86400 avgt 25 0.146 ? 0.002 us/op
>> CacheBench.put 20480 0 avgt 25 0.108 ? 0.002 us/op
>> CacheBench.put 204800 86400 avgt 25 0.150 ? 0.001 us/op
>> CacheBench.put 204800 0 avgt 25 0.106 ? 0.001 us/op
>> The third patch improves worst-case times on a mostly idle cache by scattering removal of expired entries over multiple `put` calls. It does not affect performance of an overloaded cache.
>>
>> The 4th patch removes all code that clears cached values before handing them over to the GC. [This comment](https://github.com/openjdk/jdk/commit/5859a0320334bfb6b46b62eb16b4c387641f4a2a#diff-c6bd583a97fbc4f471621fee7eab37c63718cdb6932ce357fa403cfda4b32b6fL346) stated that clearing values was supposed to be a GC performance optimization. It wasn't. Benchmark results after that commit:
>> Benchmark (size) (timeout) Mode Cnt Score Error Units
>> CacheBench.put 20480 86400 avgt 25 0.113 ? 0.001 us/op
>> CacheBench.put 20480 0 avgt 25 0.075 ? 0.002 us/op
>> CacheBench.put 204800 86400 avgt 25 0.116 ? 0.001 us/op
>> CacheBench.put 204800 0 avgt 25 0.072 ? 0.001 us/op
>> I wasn't expecting that much of an improvement, and don't know how to explain it.
>>
>> The 40ns difference between cache with and without a timeout can be attributed to 2 `System.currentTimeMillis()` calls; they were pretty slow on my VM.
>
> djelinski has updated the pull request incrementally with one additional commit since the last revision:
>
> Simplify makefile
If I get the patch right, the benchmark performance improvement is a trade-off between CPU and memory, by keeping expired entries while putting a new entry in the cache. I'm not very sure of the performance impact on memory and GC collections. Would you mind add two more types of benchmark: get the entries and remove the entries, for cases that there are 1/10, 1/4, 1/3 and 1/2 expired entries in the cache? And increase the size to some big scales, like 2M and 20M.
It looks like a spec update as it may change the behavior of a few JDK components (TLS session cache, OCSP stapling response cache, cert store cache, certificate factory, etc), because of "expired entries are no longer guaranteed to be removed before live ones". I'm not very sure of the impact. I may suggest to file a CSR and have more eyes to check the compatibility impact before moving forward.
-------------
Changes requested by xuelei (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2255
More information about the security-dev
mailing list