// reviving this ghost<div><br></div><div>Hi Valerie, Sean, and sec-dev,</div><div><br></div><div>I am curious if there has been any movement on incorporating a fix for <a href="https://bugs.openjdk.java.net/browse/JDK-7092821">https://bugs.openjdk.java.net/browse/JDK-7092821</a>? I have encountered several systems where this is a significant contention and scale bottleneck. While there are some workarounds such as pooling and reusing Provider instances, that seems like a band-aid, and fixing the JDK is a better path.</div><div><br></div><div>If I were to put a webrev together, would someone be kind enough to<span></span> sponsor it for me? Are there any existing open source load tests to verify these changes? Is there a good mechanism in OpenJDK now to run JMH across all of the supported platforms (I only have access to a small subset of the permutation of platforms).</div><div><br></div><div>Thanks!</div><div><br></div><div>- Dave</div><div><br>On Thursday, January 12, 2012, Valerie (Yu-Ching) Peng <<a href="mailto:valerie.peng@oracle.com">valerie.peng@oracle.com</a>> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dave,<br>
Thanks for the comments.<br>
Let me think about it some more and see how to better address this kind of racing issue.<br>
<br>
If you have concerns on fixes for 7033170 and 7092825, please let me know.<br>
<br>
Sean,<br>
Can you please ignore the review request for 7092821 for now. I'll send out an updated version later.<br>
If you can still review the remaining two, that'd be great.<br>
<br>
Thanks,<br>
Valerie<br>
<br>
On 01/11/12 21:48, David Schlosnagle wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Wed, Jan 11, 2012 at 8:04 PM, Valerie (Yu-Ching) Peng<br>
<<a>valerie.peng@oracle.com</a>> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
7092821: java.security.Provider.getService() is synchronized and became scalability bottleneck.<br>
jdk7u Webrev: <a href="http://cr.openjdk.java.net/~valeriep/7092821_7u/" target="_blank">http://cr.openjdk.java.net/~valeriep/7092821_7u/</a><br>
jdk8 Webrev: <a href="http://cr.openjdk.java.net/~valeriep/7092821/" target="_blank">http://cr.openjdk.java.net/~valeriep/7092821/</a><br>
</blockquote>
Valerie,<br>
<br>
You might already be aware of this, but there is a data race on lines<br>
685 - 686 of the Provider's getService(String, String) method. If<br>
there are concurrent callers to getService while lookupCache == null,<br>
the lookupCache may be overwritten by a new ConcurrentHashMap after<br>
another thread has just instantiated and populated an entry in the<br>
cache leading to thrashing on lookupCache. It might be worthwhile to<br>
either use a double checked lock to avoid the race at the expense of<br>
an additional lock and volatile read in the case lookupCache == null<br>
or add a comment indicating that this is an accepted data race.<br>
<br>
There is also now the possibility that if one thread is executing<br>
getService while another thread invokes one of the methods that sets<br>
lookupCache = null, there could then be a NullPointerException on line<br>
703 as the volatile read would now see a null and fail. You could<br>
prevent that by either moving the putIfAbsent under the lock (not<br>
ideal for performance if you're already seeing contention), or just<br>
maintain a local reference to the cache map and use it throughout the<br>
getService method. This would mean that if the lookupCache reference<br>
was concurrently mutated, the putIfAbsent would basically be a write<br>
to the local map reference which is now garbage, but shouldn't really<br>
hurt anything.<br>
<br>
I'd propose something along the lines of the following to address this:<br>
<br>
public Service getService(String type, String algorithm) {<br>
ServiceKey key = new ServiceKey(type, algorithm);<br>
ConcurrentMap<ServiceKey,Service> localLookupCache = getLookupCache();<br>
Service result = localLookupCache.get(key);<br>
if (result != null) {<br>
return (result == NULL_MARK? null : result);<br>
}<br>
synchronized (this) {<br>
checkInitialized();<br>
if (serviceMap != null) {<br>
result = serviceMap.get(key);<br>
}<br>
if (result == null) {<br>
ensureLegacyParsed();<br>
result = (legacyMap != null) ? legacyMap.get(key) : null;<br>
}<br>
}<br>
// under concurrent mutation of lookupCache, this will write<br>
to map that is no<br>
// longer the active cache<br>
localLookupCache.putIfAbsent(key, (result == null? NULL_MARK : result));<br>
return result;<br>
}<br>
<br>
private ConcurrentMap<ServiceKey,Service> getLookupCache() {<br>
if (lookupCache == null) {<br>
synchronized (this) {<br>
// must fall back on double checked lock<br>
if (lookupCache == null) {<br>
lookupCache = new ConcurrentHashMap<>();<br>
}<br>
}<br>
}<br>
return lookupCache;<br>
}<br>
<br>
<br>
- Dave<br>
<br>
<br>
For reference, here were the original changes:<br>
429 // Cache for service lookups. Discard whenever services are changed.<br>
430 private transient volatile ConcurrentMap<ServiceKey,Service><br>
lookupCache;<br>
...snip...<br>
682 public Service getService(String type, String algorithm) {<br>
683 ServiceKey key = new ServiceKey(type, algorithm);<br>
684 Service result = null;<br>
685 if (lookupCache == null) {<br>
686 lookupCache = new ConcurrentHashMap<>();<br>
687 } else {<br>
688 result = lookupCache.get(key);<br>
689 if (result != null) {<br>
690 return (result == NULL_MARK? null : result);<br>
691 }<br>
692 }<br>
693 synchronized (this) {<br>
694 checkInitialized();<br>
695 if (serviceMap != null) {<br>
696 result = serviceMap.get(key);<br>
697 }<br>
698 if (result == null) {<br>
699 ensureLegacyParsed();<br>
700 result = (legacyMap != null) ? legacyMap.get(key) : null;<br>
701 }<br>
702 }<br>
703 lookupCache.putIfAbsent(key, (result == null? NULL_MARK : result));<br>
704 return result;<br>
705 }<br>
</blockquote>
<br>
</blockquote></div>