Deadlock with JNI NewDirectByteBuffer called from multiple threads, introduced in JDK 1.6.0_04

Thu Jan 8 19:09:16 PST 2009

Bug 6791815 has been filed.

David

David Holmes - Sun Microsystems said the following on 01/09/09 09:43:
> Keith,
> 
> I can see the problem - in fact I think I was involved in the code 
> change that has triggered the deadlock. :(
> 
> Two threads are concurrently trying to use direct buffers and they are 
> racing to initialize direct buffer support. The first thread has got the 
> job of doing the initialization and is looking up classes which leads to 
> one thing then another and ultimately a safepoint is requested. 
> Meanwhile the other thread that lost the initialization race enters a 
> polling loop waiting for the first thread to complete the initialization.
> 
> The problem is that while in this polling loop (a sched_yield() call) 
> the thread is marked as ThreadInVM, which means that the VMThread will 
> wait for it to reach the safepoint. Because it never does anything that 
> would bring it to the safepoint, the VMThread keeps waiting; so the 
> initialization thread keeps waiting, and so the second thread keeps 
> waiting - deadlock.
> 
> The code change that caused this was the change of the thread's state to 
> ThreadInVM. This was done because on Solaris the os::yield_all call can 
> turn into an os_sleep call and that requires the thread to be 
> ThreadInVM. On Linux the os::yield_all call is just sched_yield and so 
> the state change is not only unnecessary but dangerous.
> 
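The race described above can be modelled in a few lines. This is a simplified sketch, not the HotSpot source: `OneTimeInit` and `ensureInitialized` are hypothetical names, and the model has no safepoints or thread states, so it cannot itself deadlock. The comments mark where, according to the explanation above, the real code gets into trouble.

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for the one-time "direct buffer support" setup state.
struct OneTimeInit {
    std::atomic<bool> started{false};
    std::atomic<bool> ended{false};
};

// Returns true if this caller won the race and ran `init`,
// false if it lost the race and only waited for the winner.
template <typename InitFn>
bool ensureInitialized(OneTimeInit& state, InitFn init) {
    bool expected = false;
    if (state.started.compare_exchange_strong(expected, true)) {
        // Winner: performs the initialization. In the JVM this is the class
        // lookup work that can end up requesting a safepoint.
        init();
        state.ended.store(true, std::memory_order_release);
        return true;
    }
    // Loser: polls until the winner is done. In the JVM case described above,
    // this loop ran with the thread marked ThreadInVM, so the VMThread waited
    // for a thread that never reached the safepoint.
    while (!state.ended.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return false;
}
```

If only one thread ever reaches the first call, the polling branch is never taken during initialization, which is the idea behind initializing early from a single thread.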
> I will file a bug for this immediately.
> 
> There should be a workaround however: don't have a race to initialize 
> the direct buffers. If you can insert a call to NewDirectByteBuffer 
> early in the app's lifetime, from one thread, then initialization will be 
> able to occur with no race and this problem won't occur.
> 
> David Holmes
> 
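A minimal sketch of the workaround, assuming it is called once on a thread already attached to the JVM (for example the thread that created it) before any worker threads exist. `warmUpDirectBufferSupport` is a hypothetical helper name. It is templated on the environment type only so that the sketch compiles without a JDK; in real code `Env` would be `JNIEnv` from `<jni.h>`. The member functions it uses (`NewDirectByteBuffer`, `ExceptionCheck`, `ExceptionClear`, `DeleteLocalRef`) are standard JNI functions.

```cpp
#include <cassert>

// Creates and immediately releases one tiny direct buffer so that the JVM's
// one-time direct buffer initialization happens here, on a single thread.
// Returns false if the JVM could not create the buffer.
template <typename Env>
bool warmUpDirectBufferSupport(Env* env) {
    assert(env != nullptr);

    // Backing storage for the throwaway buffer; it is never read or written.
    static char scratch[1];

    auto buffer = env->NewDirectByteBuffer(scratch, 1);
    if (buffer == nullptr) {
        // A null result may come with a pending Java exception; clear it so
        // that later JNI calls on this thread are not affected.
        if (env->ExceptionCheck()) {
            env->ExceptionClear();
        }
        return false;
    }

    // The buffer object itself is not needed, only the side effect.
    env->DeleteLocalRef(buffer);
    return true;
}
```

One place to call `warmUpDirectBufferSupport(env)` would be right after `JNI_CreateJavaVM` returns, on the same thread. Whether that fully avoids the problem in a given application is untested here.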
> Keith McNeill said the following on 01/09/09 09:15:
>> Here is a gdb stack dump from linux64.  Look for NewDirectByteBuffer 
>> to find the calls.
>>
>> David Holmes - Sun Microsystems wrote:
>>> Hi Keith,
>>>
>>> What platform are you on? Can you see where threads block inside 
>>> NewDirectByteBuffer?
>>>
>>> On Solaris pstack would show you what state the process is in. I think 
>>> Linux has similar functionality, but don't know about Windows.
>>>
>>> David Holmes
>>>
>>> Keith McNeill said the following on 01/09/09 06:22:
>>>> Our software has a C++ network layer using a large Java runtime via 
>>>> JNI. When new clients connect to our server we make some 
>>>> NewDirectByteBuffer calls so that we can pass data from the C++ 
>>>> network layer to the Java runtime system. We use the JVM 
>>>> invocation JNI interface (i.e. we start up with our own exe rather 
>>>> than java.exe). This same basic setup has been running for several 
>>>> years.
>>>>
>>>> We have recently found that we can get what appears to be a deadlock 
>>>> within calls to NewDirectByteBuffer. While debugging, we can see 
>>>> multiple threads blocked down in the guts of NewDirectByteBuffer. Once 
>>>> the deadlock occurs the JVM is hosed. We can't get stack dumps from it, 
>>>> can't do anything with it. This problem is complicated to reproduce 
>>>> but we can do it reliably.
>>>> We have been able to reproduce this with JDK 1.6.0_04 through JDK 
>>>> 1.6.0_11.  We haven't been able to reproduce with JDK 1.6.0_03 down 
>>>> through JDK 1.5.
>>>>
>>>> Any suggestions on the best way to debug this JDK problem?
>>>>
>>>> Keith
>>>>
>>>>