Request for review: 6922246: Deadlock during intensive class loading in parallel

Wed Oct 20 01:31:48 PDT 2010

On 10/20/10 12:16 AM, David Holmes wrote:
> HI Igor,
>
> Igor Veresov said the following on 10/20/10 16:52:
>> There's no question a store-store barrier is needed after 291 for
>> non-TSO if we were to make it lock-free (with the current code monitor
>> lock should contain such barrier anyway).
>
> There's no requirement for a storestore barrier when acquiring a lock.
> You will happen to get that on sparc and x86 if the lock acquisition
> involves a CAS, but that is not true on other archs. In general terms it
> is considered safe to move stores into a critical section at which time
> it can be reordered with other stores inside the critical section.

Yes, that's what I meant. Basically a lock should guarantee that all 
stores before the lock precede stores in the critical section, which is 
essentially a storestore barrier.

>
>> But why do you think it needs barrier before 286?
>
> I was thinking in very general terms from age old discussions on DCL
> whereby from a theoretical perspective you need a read-barrier to make
> sure that the load through the pointer must read from main memory if the
> pointer itself was freshly read from main memory. In our terms we would
> need a loadload barrier between the load of method->method_data and any
> subsequent loads via that pointer. Again this would be a noop on x86 and
> sparc, but not other archs.
>
>> There seems to be nothing wrong in allocating an extra MDO...
>> Moreover, the code is written under this assumption.
>
> Right. If it were not safe/correct for more than one thread to create a
> MDO, and have all but one thrown away, then we'd have a different
> problem to solve. Which reminds we, if another thread does beat us to
> storing the MDO then we need to delete the one we created.

Yes, perhaps we can try to deallocate. Although I suppose the race 
should occur infrequently and GC will take care of the unused copy anyway.

igor

>
> Cheers,
> David
>
>>
>> igor
>>
>> On 10/19/10 3:59 PM, David Holmes wrote:
>>> Hi Coleen,
>>>
>>> coleen phillimore said the following on 10/20/10 08:11:
>>>> Summary: allocate methodData before taking out the MethodData_lock
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/6922246/
>>>> bug link at http://bugs.sun.com/view_bug.do?bug_id=6922246
>>>>
>>>> Thanks to Ramki for the diagnosis and suggestion for the fix.
>>>
>>> This is a variation on the unsafe double-checked-locking idiom.
>>>
>>> 286 if (method->method_data() == NULL) {
>>> 291 methodDataOop method_data = oopFactory::new_methodData(method,
>>> CHECK);
>>> 292 MutexLocker ml(MethodData_lock, THREAD);
>>> 295 if (method->method_data() == NULL) {
>>> 296 method->set_method_data(method_data);
>>>
>>> In general terms any stores associated with the method_data construction
>>> are not guaranteed to be visible by the time set_method_data is called.
>>> A second thread could then see a non-null method->method_data() but
>>> access uninitialized method_data values.
>>>
>>> A general solution would need memory barriers before 286 and after 291
>>> to ensure correctness on all platforms. A simpler, though perhaps
>>> heavier solution would be to use a MutexUnlocker around the allocation:
>>>
>>> MutexLocker ml(MethodData_lock, THREAD);
>>> if (method->method_data() == NULL) {
>>> methodDataOop method_data = NULL;
>>> { // need new block for scope
>>> MutexUnLocker ml(MethodData_lock, THREAD);
>>> method_data = oopFactory::new_methodData(method, CHECK);
>>> }
>>> // someone might have beaten us to it
>>> if (method->method_data() == NULL) {
>>> method->set_method_data(method_data);
>>> }
>>> }
>>>
>>> A lock-less solution as Tom proposed would also require suitable
>>> memory-barriers.
>>>
>>> David
>>