RFR(S): 7037756 Deadlock in compiler thread similiar to 6789220
Bengt Rutisson
bengt.rutisson at oracle.com
Tue Apr 26 06:19:32 UTC 2011
Hi John,
I like this much better than my filtering approach.
I think it looks good. Just one nit: I think it might be worth adding to
the comment for your code that the deadlock that might occur is with the
GC. That might make it easier for someone reading the code to figure out
why this is dangerous.
So, maybe instead of:
// If the requesting thread is holding the pending list lock
// then we just return. We can't risk blocking while holding
// the pending list lock or a deadlock may occur.
Something like:
// If the requesting thread is holding the pending list lock
// then we just return. We can't risk blocking while holding
// the pending list lock or a deadlock with the GC (that needs
// to take the pending list lock) may occur.
Bengt
On 2011-04-26 01:37, John Cuthbertson wrote:
> Hi Everyone,
>
> A new webrev that is essentially Tom's suggestion can be found at:
> http://cr.openjdk.java.net/~johnc/7037756/webrev.2/. I also reverted
> Bengt's fix for 6789220 as, with Tom's suggested fix, a thread that
> owns the pending list will no longer be blocked in
> CompileBroker::compile_method_base.
>
> Testing: Ran over the weekend with the test case for 7037756; the test
> case for 6789220 (which fails 50% of the time with Bengt's fix
> removed); nsk tests; jprt.
>
> Thanks,
>
> JohnC
>
>
> On 04/22/11 18:48, Tom Rodriguez wrote:
>> Instead of enshrining the reference handler thread itself, could you make it work by checking whether the requesting thread owns the reference handler lock instead? That seems more robust and targeted. Something like:
>>
>> if (instanceRefKlass::owns_pending_list_lock(JavaThread::current())) {
>> return false;
>> }
>>
>> replacing the fix in CompileBroker::is_compile_blocking seems like it should work.
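
A minimal sketch of the ownership check suggested above, using only the C++ standard library. `OwnedLock`, the free function `is_compile_blocking` and its `background_compilation` parameter are hypothetical stand-ins chosen for illustration; they are not the HotSpot classes or signatures, and the real method may consult other state.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical mutex wrapper that remembers which thread owns it.
// It stands in for the pending list lock; it is not a HotSpot type.
class OwnedLock {
public:
    void lock() {
        mutex_.lock();
        owner_.store(std::this_thread::get_id());
    }
    void unlock() {
        assert(owned_by_current_thread()); // only the owner may unlock
        owner_.store(std::thread::id());   // default id means "no owner"
        mutex_.unlock();
    }
    bool owned_by_current_thread() const {
        return owner_.load() == std::this_thread::get_id();
    }
private:
    std::mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
};

// Assumed shape of the check: a requester that owns the pending list lock
// is never given a blocking compile; everyone else follows the normal policy,
// modelled here by a single flag.
bool is_compile_blocking(const OwnedLock& pending_list_lock,
                         bool background_compilation) {
    if (pending_list_lock.owned_by_current_thread()) {
        return false;
    }
    return !background_compilation;
}
```

The point of keying on lock ownership rather than on thread identity is that any thread holding the lock gets the non-blocking treatment, not only one specially named thread.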
>>
>> tom
>>
>> On Apr 22, 2011, at 5:38 PM, John Cuthbertson wrote:
>>
>>
>>> Hi Everyone,
>>>
>>> Typo....
>>>
>>> On 04/22/11 17:28, John Cuthbertson wrote:
>>>
>>>> Hi Everyone,
>>>>
>>>> Can I have a couple of volunteers to look over these changes? The webrev can be found at: http://cr.openjdk.java.net/~johnc/7037756/webrev.1
>>>>
>>>> The issue here was very similar to the issue that caused 6789220 - the difference here was that the reference handler was blocked while waiting for the MethodCompileQueue_lock rather than waiting on a blocking compilation. To summarize:
>>>>
>>>> Thread 6 (reference handler thread), while owning the pending list lock, requested a compilation and was blocked waiting on the MethodCompileQueue_lock.
>>>>
>>>> Thread 11 (compiler thread 1), while owning the Compile_lock, attempted to allocate a Class mirror which triggered GC. In the GC it was blocked attempting to lock the pending list lock.
>>>>
>>>> Thread 12 (compiler thread 2) was registering a compiled method and, while owning the MethodCompileQueue_lock, was blocked waiting on the Compile_lock.
>>>>
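
The three-thread summary above forms a cycle in the "waits for the lock held by" relation. As an illustration only, the sketch below models each thread as one held lock plus one awaited lock and follows that chain. The type `ThreadState` and the functions `reported_scenario` and `in_deadlock_cycle` are hypothetical names, and the one-lock-per-thread model is a simplifying assumption.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Simplified model: each thread owns at most one lock and waits on at most one.
struct ThreadState {
    std::string holds;     // lock the thread owns ("" if none)
    std::string waits_for; // lock the thread is blocked on ("" if not blocked)
};

// The situation described in the summary above.
std::map<std::string, ThreadState> reported_scenario() {
    return {
        {"Thread 6",  {"pending list lock", "MethodCompileQueue_lock"}},
        {"Thread 11", {"Compile_lock", "pending list lock"}},
        {"Thread 12", {"MethodCompileQueue_lock", "Compile_lock"}},
    };
}

// Follows "thread -> owner of the lock it waits for" from `start` and
// returns true if the chain leads back to `start`.
bool in_deadlock_cycle(const std::map<std::string, ThreadState>& threads,
                       const std::string& start) {
    assert(threads.count(start) == 1);
    std::map<std::string, std::string> owner_of; // lock name -> owning thread
    for (const auto& entry : threads) {
        if (!entry.second.holds.empty()) {
            owner_of[entry.second.holds] = entry.first;
        }
    }
    std::set<std::string> visited;
    std::string current = start;
    while (visited.insert(current).second) {
        const ThreadState& state = threads.find(current)->second;
        if (state.waits_for.empty()) {
            return false; // this thread can make progress, so the chain ends
        }
        auto owner = owner_of.find(state.waits_for);
        if (owner == owner_of.end()) {
            return false; // the awaited lock is free
        }
        current = owner->second;
    }
    return current == start;
}
```

In this model, clearing `waits_for` for Thread 6 (it no longer blocks while owning the pending list lock) breaks the chain for all three threads.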
>>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return with enqueueing the compile task. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking.
>>>>
>>> The above paragraph should read:
>>>
>>> The solution is to make the reference handler thread not block while holding the pending list lock. If the requesting thread is the reference handler thread, then an attempt is made to lock the MethodCompileQueue_lock in CompileBroker::compile_method_base and, if that is unsuccessful, we just return _without_ enqueueing the compilation request. Otherwise a regular blocking lock attempt is made. I also tweaked the fix made by Bengt for 6789220 to make all compilation requests by the reference handler thread non-blocking.
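
The try-lock behaviour described in the corrected paragraph can be sketched with the C++ standard library as follows. This is not the webrev code: `CompileQueue`, `request_compile`, the `must_not_block` flag and the demo helper are hypothetical names, and `std::mutex` stands in for the MethodCompileQueue_lock.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical stand-in for the compile queue and its lock.
struct CompileQueue {
    std::mutex lock;               // plays the role of MethodCompileQueue_lock
    std::deque<std::string> tasks; // pending compile requests (method names)
};

// Returns true if the request was enqueued, false if it was dropped.
// `must_not_block` models a requester that may not wait for the queue lock.
bool request_compile(CompileQueue& q, const std::string& method,
                     bool must_not_block) {
    assert(!method.empty());
    std::unique_lock<std::mutex> guard(q.lock, std::defer_lock);
    if (must_not_block) {
        if (!guard.try_lock()) {
            return false; // lock is busy: return without enqueueing
        }
    } else {
        guard.lock(); // ordinary requester: regular blocking acquire
    }
    q.tasks.push_back(method);
    return true;
}

// Usage sketch: while this thread owns the queue lock, a non-blocking
// requester on another thread gets `false` back instead of waiting.
bool demo_dropped_under_contention() {
    CompileQueue q;
    std::lock_guard<std::mutex> held(q.lock);
    bool enqueued = true;
    std::thread requester([&] {
        enqueued = request_compile(q, "Foo::bar", true);
    });
    requester.join();
    return !enqueued && q.tasks.empty();
}
```

Dropping the request is tolerable in this model because a compilation request is an optimization hint; the caller can keep running and ask again later.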
>>>
>>>> Testing: the failing test case has been running successfully on the VMSQE machine for 2 days (normally I see the deadlock after 20 minutes or so); the nsk tests; and a jprt job is in the queue.
>>>>
>>>> Thanks,
>>>>
>>>> JohnC
>>>>
>>>>
>>
>>
>