7u Code review request for 7033170, 7092821, 7092825
David Schlosnagle
schlosna at gmail.com
Fri Aug 7 12:44:50 UTC 2015
I forgot to mention one other workaround for this bug, as used in Guava is
cloning the instance if clone() is supported, see
https://github.com/google/guava/issues/1197
On Friday, August 7, 2015, David Schlosnagle <schlosna at gmail.com> wrote:
> // reviving this ghost
>
> Hi Valerie, Sean, and sec-dev,
>
> I am curious if there has been any movement on incorporating a fix for
> https://bugs.openjdk.java.net/browse/JDK-7092821? I have encountered
> several systems where this is a significant contention and
> scale bottleneck. While there are some workarounds such as pooling and
> reusing Provider instances, that seems like a band-aid, and fixing the JDK
> is a better path.
>
> If I were to put a webrev together, would someone be kind enough to sponsor
> it for me? Are there any existing open source load tests to verify these
> changes? Is there a good mechanism in OpenJDK now to run JMH across all of
> the supported platforms (I only have access to a small subset of the
> permutation of platforms).
>
> Thanks!
>
> - Dave
>
> On Thursday, January 12, 2012, Valerie (Yu-Ching) Peng <
> valerie.peng at oracle.com
> <javascript:_e(%7B%7D,'cvml','valerie.peng at oracle.com');>> wrote:
>
>> Dave,
>> Thanks for the comments.
>> Let me think about it some more and see how to better address this kind
>> of racing issue.
>>
>> If you have concerns on fixes for 7033170 and 7092825, please let me know.
>>
>> Sean,
>> Can you please ignore the review request for 7092821 for now. I'll send
>> out an updated version later.
>> If you can still review the remaining two, that'd be great.
>>
>> Thanks,
>> Valerie
>>
>> On 01/11/12 21:48, David Schlosnagle wrote:
>>
>>> On Wed, Jan 11, 2012 at 8:04 PM, Valerie (Yu-Ching) Peng
>>> <valerie.peng at oracle.com> wrote:
>>>
>>>> 7092821: java.security.Provider.getService() is synchronized and became
>>>> scalability bottleneck.
>>>> jdk7u Webrev: http://cr.openjdk.java.net/~valeriep/7092821_7u/
>>>> jdk8 Webrev: http://cr.openjdk.java.net/~valeriep/7092821/
>>>>
>>> Valerie,
>>>
>>> You might already be aware of this, but there is a data race on lines
>>> 685 - 686 of the Provider's getService(String, String) method. If
>>> there are concurrent callers to getService while lookupCache == null,
>>> the lookupCache may be overwritten by a new ConcurrentHashMap after
>>> another thread has just instantiated and populated an entry in the
>>> cache leading to thrashing on lookupCache. It might be worthwhile to
>>> either use a double checked lock to avoid the race at the expense of
>>> an additional lock and volatile read in the case lookupCache == null
>>> or add a comment indicating that this is an accepted data race.
>>>
>>> There is also now the possibility that if one thread is executing
>>> getService while another thread invokes one of the methods that sets
>>> lookupCache = null, there could then be a NullPointerException on line
>>> 703 as the volatile read would now see a null and fail. You could
>>> prevent that by either moving the putIfAbsent under the lock (not
>>> ideal for performance if you're already seeing contention), or just
>>> maintain a local reference to the cache map and use it throughout the
>>> getService method. This would mean that if the lookupCache reference
>>> was concurrently mutated, the putIfAbsent would basically be a write
>>> to the local map reference which is now garbage, but shouldn't really
>>> hurt anything.
>>>
>>> I'd propose something along the lines of the following to address this:
>>>
>>> public Service getService(String type, String algorithm) {
>>> ServiceKey key = new ServiceKey(type, algorithm);
>>> ConcurrentMap<ServiceKey,Service> localLookupCache =
>>> getLookupCache();
>>> Service result = localLookupCache.get(key);
>>> if (result != null) {
>>> return (result == NULL_MARK? null : result);
>>> }
>>> synchronized (this) {
>>> checkInitialized();
>>> if (serviceMap != null) {
>>> result = serviceMap.get(key);
>>> }
>>> if (result == null) {
>>> ensureLegacyParsed();
>>> result = (legacyMap != null) ? legacyMap.get(key) :
>>> null;
>>> }
>>> }
>>> // under concurrent mutation of lookupCache, this will write
>>> to map that is no
>>> // longer the active cache
>>> localLookupCache.putIfAbsent(key, (result == null? NULL_MARK :
>>> result));
>>> return result;
>>> }
>>>
>>> private ConcurrentMap<ServiceKey,Service> getLookupCache() {
>>> if (lookupCache == null) {
>>> synchronized (this) {
>>> // must fall back on double checked lock
>>> if (lookupCache == null) {
>>> lookupCache = new ConcurrentHashMap<>();
>>> }
>>> }
>>> }
>>> return lookupCache;
>>> }
>>>
>>>
>>> - Dave
>>>
>>>
>>> For reference, here were the original changes:
>>> 429 // Cache for service lookups. Discard whenever services are
>>> changed.
>>> 430 private transient volatile ConcurrentMap<ServiceKey,Service>
>>> lookupCache;
>>> ...snip...
>>> 682 public Service getService(String type, String algorithm) {
>>> 683 ServiceKey key = new ServiceKey(type, algorithm);
>>> 684 Service result = null;
>>> 685 if (lookupCache == null) {
>>> 686 lookupCache = new ConcurrentHashMap<>();
>>> 687 } else {
>>> 688 result = lookupCache.get(key);
>>> 689 if (result != null) {
>>> 690 return (result == NULL_MARK? null : result);
>>> 691 }
>>> 692 }
>>> 693 synchronized (this) {
>>> 694 checkInitialized();
>>> 695 if (serviceMap != null) {
>>> 696 result = serviceMap.get(key);
>>> 697 }
>>> 698 if (result == null) {
>>> 699 ensureLegacyParsed();
>>> 700 result = (legacyMap != null) ? legacyMap.get(key)
>>> : null;
>>> 701 }
>>> 702 }
>>> 703 lookupCache.putIfAbsent(key, (result == null? NULL_MARK :
>>> result));
>>> 704 return result;
>>> 705 }
>>>
>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/security-dev/attachments/20150807/ec2ee92a/attachment.htm>
More information about the security-dev
mailing list