8259886 : Improve SSL session cache performance and scalability

Wed Jan 27 21:39:50 UTC 2021

Hi Xuelei,
Thank you, that would be great.
Yes, I created a microbenchmark and got some numbers. The benchmark just
calls MemoryCache.put() in a loop; baseline results for different size &
timeout values:

Benchmark       (size)  (timeout)  Mode  Cnt     Score    Error  Units
CacheBench.put   20480      86400  avgt   25    83.653 ?  6.269  us/op
CacheBench.put   20480          0  avgt   25     0.107 ?  0.001  us/op
CacheBench.put  204800      86400  avgt   25  2057.781 ? 35.942  us/op
CacheBench.put  204800          0  avgt   25     0.108 ?  0.001  us/op

And after applying my proposed changes:

Benchmark       (size)  (timeout)  Mode  Cnt  Score   Error  Units
CacheBench.put   20480      86400  avgt   25  0.146 ? 0.002  us/op
CacheBench.put   20480          0  avgt   25  0.108 ? 0.002  us/op
CacheBench.put  204800      86400  avgt   25  0.150 ? 0.001  us/op
CacheBench.put  204800          0  avgt   25  0.106 ? 0.001  us/op

Given that the server-side handshake takes about 2 milliseconds (same
machine, TLS 1.2/ECDHE/ECDSA/secp256r1), I don't expect 0.1 microsecond
cache access to be a point of contention, but I haven't field-tested it yet.

I posted my code and benchmark results here:
https://github.com/openjdk/jdk/pull/2255. I'm still struggling with the
company's OCA (signed a few years and a few corporate changes ago, and not
used in a long time), I hope to get this sorted soon. I went with minimal
changes to the existing implementation, hoping that smaller changes will be
more likely to be approved for backporting than complete rewrites.

I had a quick look at your SessionCache. It looks like your put() operation
is constant-time already, so not much else for me to improve. However,
please correct me if I'm wrong, but I think SoftCacheEntry.unlink() should
be called with a lock held at all times, shouldn't it? SessionCache.get()
may call remove() without lock. Also there's no protection against calling
unlink() twice on the same entry (for example from concurrent get() calls),
which could wreck the linked list; setting prev=next=null would have
avoided this.

Thanks,
Daniel

śr., 27 sty 2021 o 20:36 Xue-Lei Fan <xuelei.fan at oracle.com> napisał(a):

> Hi Daniel,
>
> I would like to help you out.  Did you have some numbers about the
> performance improvement of your evaluation?  I had a draft re-write cache
> <https://github.com/XueleiFan/jdk/blob/jdk-8245576/src/java.base/share/classes/sun/security/ssl/SSLSessionContextImpl.java> by
> using ConcurrentHashMap, if you have a chance, would you like to evaluate
> if it could be improved further with your ideas?
>
> Thanks,
> Xuelei
>
> On Jan 27, 2021, at 1:28 AM, Daniel Jeliński <djelinski1 at gmail.com> wrote:
>
> Hi all,
> I'd like to modify the MemoryCache class that is used for caching SSL
> sessions in Java 11; when the cache is overloaded (full cache with no
> expired entries), the computational complexity of put operation is linear
> in the cache size.
>
> I am aware that the current Java versions are using stateless resumption;
> my company's policy is to use Java LTS, so stateless is not an option to us
> at the moment.
>
> The change I propose would alter the behavior of MemoryCache; currently
> the cache guarantees that expired entries are removed before live entries
> when the cache capacity is reached. I'd like to remove that guarantee, and
> instead always remove the least recently used cache entry. Additionally, to
> keep the memory use in check, I'd like to check the least recently used
> entries for expiration, and remove them as needed. This operation would be
> run on every put().
>
> Evaluated alternatives:
> - keep MemoryCache as is, increase cache size to avoid overloading. This
> would reduce the frequency of check for expired entries, but if the larger
> cache eventually gets overloaded, the put() operation would take even
> longer to complete.
> - keep MemoryCache behavior as is, but improve performance of removing
> expired entries. This would require a new data structure - we would need to
> have cache entries sorted both by access time and creation time.
> - always remove entries in insertion order, regardless if they are used or
> not.
> - always remove the least recently used entries, even if they are still
> valid and there are recently used expired entries somewhere in the cache.
> This is my preferred option.
>
> Additionally, we can choose how to remove stale entries:
> - all reachable stale entries on every put (that is, iterate over the
> internal data structure removing stale entries until a non-stale one is
> found); keeps memory use low, but can occasionally scan the entire cache
> (computational complexity: linear worst case, amortized constant);
> - fixed number of reachable stale entries on every put (same as above, but
> stop iterating after a fixed number of removed entries even if more are
> available); execution time is bounded by a constant, but some entries may
> stay in cache a bit longer than they need to.
> - all stale entries on put that would otherwise exceed cache capacity
> (current implementation)
> - never; once cache reaches full capacity, it stays full until the
> application is restarted. Fastest implementation, but at a cost of
> increased memory use.
>
> My preference would be to remove either all or a fixed number of reachable
> stale entries.
>
> I'm willing to prepare a patch for 17 and 11u, if I can find a sponsor.
> Thanks,
> Daniel
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/security-dev/attachments/20210127/03a09dcb/attachment.htm>