JVM hangs beyond recovery

David Holmes David.Holmes at oracle.com
Tue Jun 8 15:39:53 PDT 2010


Stas Oskin said the following on 06/09/10 04:35:
> I encountered the hang in latest jdk1.6.0_20 version.
> 
> I tried to get the threads dump, but both jstack and kill -3 didn't 
> produce any results.
> Below is the GDB dump:
> http://pastie.org/996680
> 
> Earlier in this thread I was told that a possible cause is the dlsym 
> function in the libdl library. I googled a bit for similar issues, and 
> found only the following potentially similar case:
> http://www-01.ibm.com/support/docview.wss?uid=isg1IZ07615

Yes that is basically the problem.

The hang is occurring when attempting to reach a safepoint for a
biased-locking revocation operation, but any safepoint operation will 
trigger the final hang under these conditions.

Once the VM is trying to enter the safepoint state you won't get any
response from any of the serviceability tools, including ctrl-\ for a
thread dump. You need a native facility like pstack or, as in this
case, gdb.

The underlying problem looks familiar to me, though I can't find the 
culprit thread in the gdb thread dump. This problem occurs when a 
library uses a native dynamic loader hook (I'll call it dlsym) to make a 
call into the JVM. The call is made while holding an internal dlsym 
mutex, and the end result is a deadlock if any other thread in the Java 
application needs to perform a dlsym-related operation. The thread dump 
shows that they do, and that they are all blocked on that internal 
mutex.

The VM is trying to reach a safepoint, so all threads will block once 
they detect that. The VMThread is waiting for _threads_in_vm to signal 
that they have stopped, but some of those threads are blocked on the 
dlsym mutex and so can't signal. The holder of the dlsym mutex, in turn, 
has blocked at the safepoint, and hence the system grinds to a halt.
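
To make the cycle concrete, here is a toy Java model of it. This is only
a sketch under stated assumptions, not HotSpot code: the class name, the
method safepointReached, the ReentrantLock standing in for the internal
dlsym mutex, and the latches standing in for the safepoint protocol are
all hypothetical. The "VMThread" waits with a timeout so that the hang
can be observed instead of blocking forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Toy model only: every name here is hypothetical, nothing is HotSpot API.
class SafepointDeadlockSketch {

    // Returns true if both simulated threads report "stopped" in time.
    static boolean safepointReached(boolean holderCallsIntoVm, long timeoutMs)
            throws InterruptedException {
        ReentrantLock loaderLock = new ReentrantLock();       // stands in for the dlsym mutex
        CountDownLatch holderHasLock = new CountDownLatch(1);
        CountDownLatch stopped = new CountDownLatch(2);       // both threads must report in
        CountDownLatch resume = new CountDownLatch(1);        // end of the safepoint

        // Thread running the loader hook.
        Thread holder = new Thread(() -> {
            loaderLock.lock();
            try {
                holderHasLock.countDown();
                if (holderCallsIntoVm) {
                    // Calls into the "VM" from inside the hook: it stops at
                    // the safepoint while still holding the loader lock.
                    stopped.countDown();
                    awaitQuietly(resume);
                }
            } finally {
                loaderLock.unlock();
            }
            if (!holderCallsIntoVm) {
                // Well-behaved hook: stops only after releasing the lock.
                stopped.countDown();
                awaitQuietly(resume);
            }
        });

        // Thread that needs the loader lock before it can stop.
        Thread waiter = new Thread(() -> {
            awaitQuietly(holderHasLock);
            try {
                loaderLock.lockInterruptibly();
            } catch (InterruptedException e) {
                return; // released by the cleanup below
            }
            loaderLock.unlock();
            stopped.countDown();
        });

        holder.start();
        waiter.start();

        // The "VMThread": wait for everyone to stop, but give up after a timeout.
        boolean reached = stopped.await(timeoutMs, TimeUnit.MILLISECONDS);

        // Cleanup so the sketch terminates even in the deadlocked case.
        resume.countDown();
        waiter.interrupt();
        holder.join();
        waiter.join();
        return reached;
    }

    static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("hook calls into VM:  reached = " + safepointReached(true, 500));
        System.out.println("hook stays out of VM: reached = " + safepointReached(false, 2000));
    }
}
```

In the first call the holder never releases the lock before stopping, so
the waiter can never report in and the wait times out. In the second the
lock is released first and both threads stop normally. In the real VM
there is no timeout, which is why the process never recovers.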

The culprit is whoever calls into the VM from inside the dlsym code. As 
I said, I can't see that culprit in the stack traces, but typically it 
happens when someone installs an "on-load" hook or something of that nature.

David Holmes



> I was also advised to try:
> 
> * ReduceInitialCardMarks
> and/or
> * UseBiasedLocking
> 
> Which one of them might solve the issue, and what would the possible 
> results be?
> 
> Thanks in advance.


More information about the hotspot-runtime-dev mailing list