[10] RFR : 8186628 : SSL session cache can cause a scalability bottleneck

Ivan Gerasimov ivan.gerasimov at oracle.com
Sat Nov 11 06:48:14 UTC 2017


Thank you Roger for the suggestion!

Indeed, an interesting idea to consider for optimization!  I guess an 
inventory of how Cache is used needs to be made to see whether the 
maximum size requirement can be relaxed.
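
To make the fan-out idea concrete, here is a minimal sketch of what I 
understand by it.  It is not the code in the webrev: StripedCache and 
its methods are hypothetical names, expiration is left out, and the 
maximum size is only enforced per stripe, which is exactly the weakened 
requirement mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch: a cache split into independently locked stripes.
 * Each stripe is a bounded, insertion-ordered (FIFO) map.
 */
final class StripedCache<K, V> {
    private final Map<K, V>[] stripes;

    @SuppressWarnings("unchecked")
    StripedCache(int stripeCount, int maxSizePerStripe) {
        stripes = new Map[stripeCount];
        for (int i = 0; i < stripeCount; i++) {
            // Evict the oldest entry of this stripe once it grows past its limit.
            stripes[i] = new LinkedHashMap<K, V>() {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxSizePerStripe;
                }
            };
        }
    }

    /** Picks the stripe from the key's hash, so a key always maps to the same stripe. */
    private Map<K, V> stripeFor(Object key) {
        int h = key.hashCode();
        h ^= (h >>> 16); // mix the high bits into the low bits
        return stripes[Math.floorMod(h, stripes.length)];
    }

    V get(K key) {
        Map<K, V> s = stripeFor(key);
        synchronized (s) { // only threads hitting the same stripe contend
            return s.get(key);
        }
    }

    void put(K key, V value) {
        Map<K, V> s = stripeFor(key);
        synchronized (s) {
            s.put(key, value);
        }
    }

    /** Sums the stripe sizes; the result is a snapshot, not an atomic count. */
    int totalSize() {
        int n = 0;
        for (Map<K, V> s : stripes) {
            synchronized (s) {
                n += s.size();
            }
        }
        return n;
    }
}
```

With this layout the total capacity is at most stripeCount * 
maxSizePerStripe, and FIFO order holds only within a stripe, not 
globally.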

At this point, however, I'd like to have a simple solution with a choice 
between MemoryCache and NullCache, which will allow better performance 
in certain situations.  Benchmarking with a small cache size and many 
threads accessing the cache showed that the cache was still a hot spot 
and threads were spending a lot of time blocked.
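
Roughly, the choice looks like the sketch below.  This is a simplified 
illustration, not the actual patch: SessionCache, NoOpCache, 
SynchronizedFifoCache and SessionCaches are hypothetical stand-ins for 
the internal classes, and the way the proposed 
javax.net.ssl.needCacheSessions property is read here is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical minimal cache interface; the internal Cache class has a larger API. */
interface SessionCache<K, V> {
    V get(K key);
    void put(K key, V value);
    int size();
}

/** Stores nothing, so no thread ever waits on a lock. */
final class NoOpCache<K, V> implements SessionCache<K, V> {
    public V get(K key) { return null; }
    public void put(K key, V value) { /* intentionally empty */ }
    public int size() { return 0; }
}

/** Bounded FIFO cache; every call takes the same monitor, which is the hot spot. */
final class SynchronizedFifoCache<K, V> implements SessionCache<K, V> {
    private final Map<K, V> map;

    SynchronizedFifoCache(int maxSize) {
        // Insertion-ordered map that drops its oldest entry when over capacity.
        map = new LinkedHashMap<K, V>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    public synchronized V get(K key) { return map.get(key); }
    public synchronized void put(K key, V value) { map.put(key, value); }
    public synchronized int size() { return map.size(); }
}

final class SessionCaches {
    private SessionCaches() { }

    /** Assumed lookup: caching stays enabled unless the property is set to "false". */
    static boolean cachingEnabled() {
        String v = System.getProperty("javax.net.ssl.needCacheSessions", "true");
        return !"false".equalsIgnoreCase(v);
    }

    /** Returns a real cache or a no-op one, depending on the flag. */
    static <K, V> SessionCache<K, V> create(boolean cacheSessions, int maxSize) {
        if (cacheSessions) {
            return new SynchronizedFifoCache<>(maxSize);
        }
        return new NoOpCache<>();
    }
}
```

A caller would do something like SessionCaches.create(
SessionCaches.cachingEnabled(), size); with the flag left at its 
default, the behavior is the same as today.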

With kind regards,
Ivan


On 11/8/17 12:18 PM, Roger Riggs wrote:
> Hi Ivan,
>
> One idea to consider is an indirection that spreads the work over 
> multiple Cache implementations.
> Similar to what ConcurrentHashMap does, doing an early fan out to 
> multiple Caches based on the key.
> If it was keyed to the same key as the cache, it would be able to take 
> advantage of re-using the contexts.
> Though I'm not sure how to size or re-size the index based on load or ...
>
> I would think that using a 'small' cache size would bound the expunge 
> time and still allow some reuse.
>
> $.02, Roger
>
>
> On 11/8/2017 2:09 PM, Ivan Gerasimov wrote:
>>
>> Thank you Bernd for looking into this!
>>
>>
>> On 11/7/17 11:42 PM, Bernd Eckenfels wrote:
>>> Hello,
>>>
>>> There is already a property to set the cache size, would it be 
>>> enough to re-purpose a cache size of 0 to turn it off?
>>>
>> Currently, setting the cache size to 0 means that it is unbounded, so 
>> entries are removed from the cache only when they expire.
>>
>>> Are there numbers to show when this is actually a problem? Is this 
>>> only for 100% Cache misses?
>> We've seen dumps with lots of threads blocked waiting on the 
>> Cache.get()/put()/remove().
>> This is primarily due to the time spent in the cache cleaning 
>> routines (see 
>> sun.security.util.MemoryCache.emptyQueue()/expungeExpiredEntries()), 
>> which are executed inside the synchronization block.
>> This time is linear in the size of the cache, but limiting the cache 
>> size doesn't always help either, as the amount of cleanup work also 
>> increases with a bounded cache.
>>
>> Bypassing the cache removed this bottleneck, and under certain 
>> conditions the throughput increased from 35 to 120 sessions per 
>> second.
>>
>> Please note that the proposed option javax.net.ssl.needCacheSessions 
>> will be true by default, so the default behavior will not change.
>> Only in specific situations, where turning off the cache is shown to 
>> improve performance, will setting this option to false be 
>> recommended.
>>
>>> Maybe the cache itself needs some optimizations?
>> Certainly, it would be very good to optimize the cache implementation!
>> I've made a few attempts, but failed to achieve a significant 
>> improvement in different scenarios.
>> The complication is due to the two requirements: maintaining fixed 
>> cache capacity and maintaining FIFO order when removing the entries.  
>> This makes it hard to use the concurrent data structures as is.
>>
>> Still, I'm totally for the cache optimization in JDK 10, if it is 
>> possible.
>> However, if it is done, it would probably not be backported to the 
>> earlier releases.
>>
>> And I'm going to propose to backport the proposed fix with the option 
>> to turn off the cache, as it will be useful for some currently 
>> running applications.
>>
>> With kind regards,
>> Ivan
>>
>>> (It is hard to imagine that a saved handshake does not compensate 
>>> for hundreds of gets - especially if the current version still would 
>>> generate a cache key)
>>>
>>> Regards
>>> Bernd
>>>
>>> -- 
>>> http://bernd.eckenfels.net
>>> ------------------------------------------------------------------------ 
>>>
>>> *From:* security-dev <security-dev-bounces at openjdk.java.net> on 
>>> behalf of Ivan Gerasimov <ivan.gerasimov at oracle.com>
>>> *Sent:* Wednesday, November 8, 2017 3:24:54 AM
>>> *To:* security-dev at openjdk.java.net
>>> *Subject:* [10] RFR : 8186628 : SSL session cache can cause a 
>>> scalability bottleneck
>>> Hello everybody!
>>>
>>> The class sun.security.ssl.SSLSessionContextImpl maintains caches for
>>> session reuse.
>>> Access to the cache from threads is serialized.
>>> It was reported that under heavy load the time spent waiting to enter
>>> the synchronized methods outweighs the time needed to create a new
>>> session.
>>>
>>> It is proposed to introduce a flag that allows the cache to be
>>> bypassed altogether.
>>> Would you please help review the proposed fix?
>>>
>>> BUGURL: https://bugs.openjdk.java.net/browse/JDK-8186628
>>> WEBREV: http://cr.openjdk.java.net/~igerasim/8186628/00/webrev/ 
>>>
>>> -- 
>>> With kind regards,
>>> Ivan Gerasimov
>>>
>>
>> -- 
>> With kind regards,
>> Ivan Gerasimov
>
>

-- 
With kind regards,
Ivan Gerasimov



