RFR: 8259886 : Improve SSL session cache performance and scalability [v2]

djelinski github.com+30433125+djelinski at openjdk.java.net
Thu Feb 4 20:48:42 UTC 2021


On Thu, 4 Feb 2021 19:36:24 GMT, Xue-Lei Andrew Fan <xuelei at openjdk.org> wrote:

>> Added benchmarks for get & remove. Added tests for 5M cache size. Switched time units to nanoseconds. Results:
>> Benchmark           (size)  (timeout)  Mode  Cnt    Score   Error  Units
>> CacheBench.get       20480      86400  avgt   25   62.999 ? 2.017  ns/op
>> CacheBench.get       20480          0  avgt   25   41.519 ? 1.113  ns/op
>> CacheBench.get      204800      86400  avgt   25   67.995 ? 4.530  ns/op
>> CacheBench.get      204800          0  avgt   25   46.439 ? 2.222  ns/op
>> CacheBench.get     5120000      86400  avgt   25   72.516 ? 0.759  ns/op
>> CacheBench.get     5120000          0  avgt   25   53.471 ? 0.491  ns/op
>> CacheBench.put       20480      86400  avgt   25  117.117 ? 3.424  ns/op
>> CacheBench.put       20480          0  avgt   25   73.582 ? 1.484  ns/op
>> CacheBench.put      204800      86400  avgt   25  116.983 ? 0.743  ns/op
>> CacheBench.put      204800          0  avgt   25   73.945 ? 0.515  ns/op
>> CacheBench.put     5120000      86400  avgt   25  230.878 ? 7.582  ns/op
>> CacheBench.put     5120000          0  avgt   25  192.526 ? 7.048  ns/op
>> CacheBench.remove    20480      86400  avgt   25   39.048 ? 2.036  ns/op
>> CacheBench.remove    20480          0  avgt   25   36.293 ? 0.281  ns/op
>> CacheBench.remove   204800      86400  avgt   25   43.899 ? 0.895  ns/op
>> CacheBench.remove   204800          0  avgt   25   43.046 ? 0.759  ns/op
>> CacheBench.remove  5120000      86400  avgt   25   51.896 ? 0.640  ns/op
>> CacheBench.remove  5120000          0  avgt   25   51.537 ? 0.536  ns/op
>
> Thank you for the comment. The big picture is more clear to me now.
> 
>> Example 2:
>> Old implementation will get:
>> |K=3, exp=10|K=5, exp=12|K=7, exp=14|K=9, exp=16|
>>
>> New implementation will get:
>> |K=5, exp=12|K=7, exp=14|K=1, exp=8(expired)|K=9, exp=16|
> 
> K=3 is not expired yet, but get removed, while K=1 is kept.  This behavior change may cause more overall performance hurt than improving the cache put/get performance.  For example, it need to grab the value remotely.   A full handshake or OCSP status grabbing could counteract all the performance gain with the cache update.
> 
>> All calls to put() remove expired items from the front of the queue, and never perform a full scan. get() calls shuffle the queue, moving the accessed item to the back. Compare this to original code where put() only removed expired items when the cache overflowed, and scanned the entire cache.
> 
> I think the idea that put() remove expired items from the front of the queue is good.  I was wondering if it is an option to have the get() method that removed expired items until the 1st un-expired item, without scan the full queue and change the order of the queue.  But there is still an issue that the SoftReference may have clear an item, which may be still valid.
> 
> In general, I think the get() performance is more important than put() method, as get() is called more frequently.  So we should try to keep the cache small if possible.
> 
>>> increase the size to some big scales, like 2M and 20M
>>
>> Can do. Do you think it makes sense to also benchmark the scenario where GC kicks in and collects soft references?
> 
> In the update, the SoftReference.clear() get removed.  I'm not sure of the impact of the enqueued objects any longer.  In theory, it could improve the memory use, which could counteract the performance gain in some situation.
> 
>> Also, what do you think about the changes done in Do not invalidate objects before GC 5859a03 commit? 
> See above, it is a concern to me that the soft reference cannot be cleared with this update.
> 
>> How do I file a CSR?
> Could you edit the bug: https://bugs.openjdk.java.net/browse/JDK-8259886?  In the more drop down menu, there is a "Create CSR" option.  You can do it if we have an agreement about the solution and impact.

Thanks for your review! Some comments below.
> A full handshake or OCSP status grabbing could counteract all the performance gain with the cache update.

Yes, but that's unlikely. Note that K=3 is before K=1 in the queue only because 3 wasn't used since 1 was last used. This means that either K=3 is used less frequently than K=1, or that all cached items are in active use. In the former case we don't lose much by dropping K=3, and in the latter we are dealing with full cache at all times, which means that most `put()`s remove un-expired items anyway.
> get() [..] without [..] change the order of the queue

If we do that, frequently used entries will be evicted at the same age as never used ones. This means we will have to recompute (full handshake/fresh OCSP) both the frequently used and the infrequently used entries. It's better to recompute only the infrequently used ones, and reuse the frequently used as long as possible - we will do less work that way.
> get() performance is more important [..] so we should try to keep the cache small if possible

I don't see the link; could you explain?
> In the update, the SoftReference.clear() get removed. I'm not sure of the impact of the enqueued objects any longer. In theory, it could improve the memory use, which could counteract the performance gain in some situation.

That's the best part: no objects ever get enqueued! We only called `clear()` right before losing the last reference to `SoftCacheEntry` (which is the `SoftReference`). When GC collects the `SoftReference`, it does not enqueue anything. GC only enqueues the `SoftReference` when it collects the referenced object (session / OCSP response) without collecting the `SoftReference` (cache entry) itself.
I wasn't sure about this at first, so I added code to crash the benchmark if `emptyQueue` ever finds anything. It didn't crash.
> Could you edit the bug

I'd need an account on the bug tracker first.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2255


More information about the security-dev mailing list