JVM hangs beyond recovery
David Holmes
David.Holmes at oracle.com
Wed Jun 9 06:00:01 PDT 2010
Stas Oskin said the following on 06/09/10 21:41:
> Hi.
>
> I just realized this is the issue you posted back in April - that
> was the one I was remembering, that and Xuggle. Based on this:
>
> http://lists.apple.com/archives/java-dev/2009/Jun/msg00395.html
>
> I'd say Xuggle is the prime suspect here.
>
> Just to clarify, you believe that main issue behind deadlock and even
> crash (as in this case), is the way Xuggler loads native libraries?
Yes (the crash was the VM detecting the deadlock itself). Here's the
stack as reported above:
1 libclient.dylib 4716923 SafepointSynchronize::block (JavaThread*) + 619
2 libclient.dylib 6027802 jni_NewStringUTF + 394
3 libxuggle-ferry.3.dylib 0x1aceb8c3 com::xuggle::ferry::Logger::getLogger(char const*) + 147
4 libxuggle-ferry.3.dylib 0x1acebedd com::xuggle::ferry::Logger::getStaticLogger(char const*) + 29
5 libxuggle-xuggler-io.3.dylib 0x171e6fbc __static_initialization_and_destruction_0(int, int) + 44
6 dyld 0x8fe12f36 ImageLoaderMachO::doModInitFunctions(ImageLoader::LinkContext const&) + 246
7 dyld 0x8fe0e7e3 ImageLoader::recursiveInitialization(ImageLoader::LinkContext const&, unsigned int) + 307
8 dyld 0x8fe0e8c9 ImageLoader::runInitializers(ImageLoader::LinkContext const&) + 57
9 dyld 0x8fe02202 dyld::runInitializers (ImageLoader*) + 34
10 dyld 0x8fe0bbdd dlopen + 605
11 libSystem.B.dylib 0x916ac2c2 dlopen + 66
12 libclient.dylib 0x0060d6f1 JVM_LoadLibrary + 193
13 libjava.jnilib 0x00061d74 Java_java_lang_ClassLoader_00024NativeLibrary_load + 87
We start in Java, try to load a native library and enter the dynamic
linker. That in turn causes a static initializer in libxuggle-xuggler-io.3
to run, which tries to invoke Java code via
com::xuggle::ferry::Logger::getStaticLogger.
If this is done while holding a lock in the dynamic linker then we will
deadlock if any other thread tries to acquire that lock and a safepoint
is triggered - which in the above case is done by the thread holding
that lock.
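
To make the pattern concrete, here is a minimal sketch of the two
styles of initialization. It is not Xuggle's actual code: fake_vm_call,
EagerLogger and lazy_logger_handle are hypothetical names, and
fake_vm_call only records that a call happened instead of really
entering the VM. The first style runs from a static initializer, which
is what frame 5 of the stack shows happening inside dlopen. The second
defers the call until first use, by which time dlopen has returned.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a call into the VM (for example a JNI call
// such as NewStringUTF). It only records who called, in order.
static std::vector<std::string>& call_log() {
    static std::vector<std::string> log;  // constructed on first use
    return log;
}

static int fake_vm_call(const char* who) {
    assert(who != nullptr);
    call_log().push_back(who);
    return static_cast<int>(call_log().size());
}

// Problematic style: the constructor of a file-scope object runs as a
// static initializer. In a shared library that means it runs inside
// dlopen, while the dynamic linker is still initializing the image.
struct EagerLogger {
    int handle;
    EagerLogger() : handle(fake_vm_call("eager")) {}
};
static EagerLogger g_eager;

// Deferred style: nothing happens at load time. The "VM call" is made
// the first time a caller asks for the handle; C++11 guarantees the
// function-local static is initialized exactly once.
static int lazy_logger_handle() {
    static int handle = fake_vm_call("lazy");
    return handle;
}
```

With the deferred style the library's initializers make no call into
the VM, so the load no longer depends on the VM being enterable from
inside the linker. Whether that is sufficient for a given library
depends on what else its initializers do.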
Your example is not so obvious as we can't see the attempted call back
into the VM from the linker.
Even if there were no lock involved, the VM is not reentrant and can't
be used as depicted. All calls out to dlopen etc. would have to be
handled as JNI calls (I'm not even sure that would be enough), but as
dlopen might be called through some other native library call, all calls
to the OS or libc would have to be treated as JNI calls - and that would
be a huge overhead on the VM.
There is a similar issue on Windows where people try to call into Java
from DllMain - it just can't work.
David Holmes
More information about the hotspot-runtime-dev mailing list