JVM hangs beyond recovery

David Holmes David.Holmes at oracle.com
Wed Jun 9 06:00:01 PDT 2010


Stas Oskin said the following on 06/09/10 21:41:
> Hi.
> 
>     I just realized this is the issue you posted back in April - that
>     was the one I was remembering, that and Xuggle. Based on this:
> 
>     http://lists.apple.com/archives/java-dev/2009/Jun/msg00395.html
> 
>     I'd say Xuggle is the prime suspect here.
> 
> Just to clarify, you believe that main issue behind deadlock and even 
> crash (as in this case), is the way Xuggler loads native libraries?

Yes (the crash was the VM detecting the deadlock itself). Here's the 
stack as reported above:

1 libclient.dylib 4716923 SafepointSynchronize::block (JavaThread*) + 619
2 libclient.dylib 6027802 jni_NewStringUTF + 394
3 libxuggle-ferry.3.dylib 0x1aceb8c3 com::xuggle::ferry::Logger::getLogger(char const*) + 147
4 libxuggle-ferry.3.dylib 0x1acebedd com::xuggle::ferry::Logger::getStaticLogger(char const*) + 29
5 libxuggle-xuggler-io.3.dylib 0x171e6fbc __static_initialization_and_destruction_0(int, int) + 44
6 dyld 0x8fe12f36 ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 246
7 dyld 0x8fe0e7e3 ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 307
8 dyld 0x8fe0e8c9 ImageLoader::runInitializers(ImageLoader::LinkContext const&) + 57
9 dyld 0x8fe02202 dyld::runInitializers (ImageLoader*) + 34
10 dyld 0x8fe0bbdd dlopen + 605
11 libSystem.B.dylib 0x916ac2c2 dlopen + 66
12 libclient.dylib 0x0060d6f1 JVM_LoadLibrary + 193
13 libjava.jnilib 0x00061d74 Java_java_lang_ClassLoader_00024NativeLibrary_load + 87

We start in Java, try to load a native library and enter the dynamic 
linker. That in turn runs the static initializers in 
libxuggle-xuggler-io.3, which try to call back into the VM through JNI: 
com::xuggle::ferry::Logger::getStaticLogger ends up in jni_NewStringUTF.

If this is done while holding a lock in the dynamic linker then we will 
deadlock if any other thread tries to acquire that lock and a safepoint 
is triggered, which in the above case is done by the thread holding 
that lock.
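
To make the pattern in frame 5 concrete, here is a minimal sketch of a 
static initializer that would run inside dlopen, next to a deferred 
variant that waits until first use. It is not Xuggle's code: 
vm_make_logger, get_static_logger and vm_call_count are hypothetical 
names, and vm_make_logger only stands in for a JNI call so the sketch 
runs without a JVM.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a call that enters the JVM (for example a JNI
// call such as NewStringUTF). It only counts calls, so no JVM is needed.
static int g_vm_calls = 0;

static std::string vm_make_logger(const char* name) {
    ++g_vm_calls;
    return std::string("logger:") + name;
}

// Risky pattern (left commented out): a namespace-scope object whose
// initializer runs while the dynamic linker is still inside dlopen() and
// running this library's initializers.
// static std::string g_logger = vm_make_logger("io");

// Deferred pattern: nothing calls into the "VM" at load time. The logger is
// created on the first call, using a function-local static, which C++11
// initializes exactly once even with concurrent callers.
const std::string& get_static_logger(const char* name) {
    assert(name != nullptr);
    static const std::string logger = vm_make_logger(name);
    return logger;
}

// Number of simulated VM entries so far.
int vm_call_count() { return g_vm_calls; }
```

With the deferred variant, loading the library makes no VM calls at all; 
the first get_static_logger call makes exactly one, and later calls 
reuse the cached value. Whether a real library can defer its 
initialization this way depends on how it is structured.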

Your example is not so obvious as we can't see the attempted call back 
into the VM from the linker.

Even if there were no lock involved, the VM is not reentrant and can't 
be used as depicted. All calls out to dlopen, etc. would have to be 
handled as JNI calls (I'm not even sure that would be enough), but as 
dlopen might be called through some other native library call, all calls 
to the OS or libc would have to be treated as JNI calls, and that would 
be a huge overhead on the VM.

There is a similar issue on Windows where people try to call into Java 
from DllMain; it just can't work.

David Holmes


More information about the hotspot-runtime-dev mailing list