deadlock in initialization of instanceKlass

Krystal Mok rednaxelafx at gmail.com
Thu Mar 22 16:44:29 PDT 2012


Hi,

I've run into a corner case where deadlock happens during instanceKlass
linking, which I think could have been avoided by patching the VM. It's
mostly a JDK6-only issue, as I haven't been able to reproduce it on JDK7 or
JDK8.

In the original problem, the scenario was to start a Java process with a
BTrace script tracing from the beginning [1]. As the system started up, two
threads were trying to load two different classes with different class
loaders at about the same time [2]. Both of these class load operations
were delegated to BTrace's agent for instrumentation, where both threads
tried to create a new instance of a class called
"com.sun.btrace.runtime.Instrumentor". This class was actually loaded
already, but not linked/initialized yet, so this was the time to do it.
And then a deadlock happened: one of the threads had locked the system
class loader before trying to link the class
"com.sun.btrace.runtime.Instrumentor", whereas the other thread had gone
ahead and started linking that class first. For verification it needed to
load new classes, which in turn needed to lock the system class loader.
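
To make the lock-order inversion concrete, here is a minimal Java-level
sketch. It is only a model: the two ReentrantLocks are stand-ins I made up
for the class loader lock and the instanceKlass monitor (the latter can't
be locked from Java code), and the class and method names are hypothetical.
It uses tryLock() with a timeout so that the program terminates instead of
hanging the way the real VM does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Models the inversion with two plain Java locks (stand-ins, not VM internals). */
final class LockOrderInversionSketch {

    /**
     * Runs two threads that take the same two locks in opposite order.
     * Returns how many of them could not get their second lock.
     */
    static int runInvertedOrder() {
        ReentrantLock loaderLock = new ReentrantLock(); // stand-in: class loader lock
        ReentrantLock klassLock = new ReentrantLock();  // stand-in: instanceKlass monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicInteger blocked = new AtomicInteger();

        // Thread A: loader first, then klass. Thread B: klass first, then loader.
        Thread a = new Thread(() ->
                attempt(loaderLock, klassLock, bothHoldFirst, bothTried, blocked));
        Thread b = new Thread(() ->
                attempt(klassLock, loaderLock, bothHoldFirst, bothTried, blocked));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocked.get();
    }

    private static void attempt(ReentrantLock first, ReentrantLock second,
                                CountDownLatch bothHoldFirst, CountDownLatch bothTried,
                                AtomicInteger blocked) {
        first.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            // A plain lock() here would wait forever; the timeout lets the sketch end.
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                blocked.incrementAndGet();
            }
            // Keep holding the first lock until the other thread has also tried.
            bothTried.countDown();
            bothTried.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("threads blocked on their second lock: " + runInvertedOrder());
    }
}
```

With untimed lock acquisition, as in the VM, neither thread would ever make
progress once both hold their first lock.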

A simplified minimal repro of the original problem is available [3].
Caveat: it may take a few runs to actually see the deadlock. It reproduces
more readily on JDK6/HS20 and is harder to hit on JDK6/HS24, where
everything seems to be faster. I haven't been able to reproduce it with
JDK7/HS23 or JDK8/HS24.

The race in this case involves locking both an instanceKlass and the lock
object of its defining class loader. Since an instanceKlass isn't a
Java-level object, Java code couldn't have locked it on its own.
So I think there's a chance of getting around this deadlock by locking the
loader before locking the instanceKlass, in places where class loading may
happen while the instanceKlass is locked. Another option might be the
complete_exit()/reenter() trick.
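
The "loader before instanceKlass" idea is ordinary lock ordering. As a
rough Java-level illustration (again only a model: the two monitor objects
and all names are hypothetical stand-ins, and this is not the VM patch),
both threads below take the locks in the same order, so neither can end up
holding one lock while waiting for the other.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Models the proposed ordering: always loader lock first, then klass lock. */
final class ConsistentLockOrderSketch {

    /** Runs two "linking" threads and returns how many completed. */
    static int runConsistentOrder() {
        Object loaderLock = new Object(); // stand-in: class loader lock
        Object klassLock = new Object();  // stand-in: instanceKlass monitor
        AtomicInteger linked = new AtomicInteger();

        Runnable link = () -> {
            synchronized (loaderLock) {      // outer lock is always the loader
                synchronized (klassLock) {   // inner lock is always the klass
                    linked.incrementAndGet(); // "linking" happens here
                }
            }
        };

        Thread a = new Thread(link);
        Thread b = new Thread(link);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return linked.get();
    }

    public static void main(String[] args) {
        System.out.println("threads that finished linking: " + runConsistentOrder());
    }
}
```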

A pragmatic way of solving the problem is to modify the application code to
get rid of the race in the first place. In this particular scenario, that'd
be modifying BTrace, forcing the Instrumentor class to link early with a
Class.forName() call. But as in the repro case, the problem seems pretty
generic, and could hit innocent-looking Java code.
(Of course, upgrading to JDK7 will get rid of the problem, too. Another
good case to push for an upgrade :-)
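
A sketch of the Class.forName() workaround, for illustration only. The
nested Instrumentor class here is a hypothetical stand-in (not BTrace's
real class), and where exactly BTrace would make such a call is an
assumption on my part. The single-argument Class.forName() initializes the
class, and initialization requires the class to be linked first, so calling
it early, before any threads can race, moves linking out of the contended
path.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Shows Class.forName() forcing a loaded-but-uninitialized class to initialize. */
final class EagerLinkSketch {

    private static final AtomicBoolean INITIALIZED = new AtomicBoolean(false);

    /** Hypothetical stand-in for com.sun.btrace.runtime.Instrumentor. */
    static final class Instrumentor {
        static {
            // Runs during class initialization, which happens after linking.
            INITIALIZED.set(true);
        }
    }

    /** Forces initialization of the stand-in class and reports whether it ran. */
    static boolean forceInitialization() {
        try {
            // Using the class literal to get the name does not initialize the class;
            // the single-argument Class.forName() does.
            Class.forName(Instrumentor.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return INITIALIZED.get();
    }

    public static void main(String[] args) {
        System.out.println("initialized after forName: " + forceInitialization());
    }
}
```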

I haven't made a patch to fix this just yet. Any suggestions on whether
it'd be worthwhile to fix it in the VM or not?

P.S. There's a second bug. When using "jstack -l" to diagnose the problem,
a fastdebug build of the VM hit an assertion [4]. The code in DeadlockCycle
assumed that all ObjectMonitors correspond to instanceOops, which is not
(yet) the case. An instanceKlass or constantPool could also be directly
locked with an ObjectMonitor, and that's not an instanceOop.
I've made a patch against the current jdk8/jdk8/hotspot to fix this second
bug [5].

When the Permanent Generation removal work completes, this would no longer
be a problem, because the klass hierarchy won't be oops anymore, and the
klassKlass's will be gone. I can see that in Jon's latest patch [6]
already, but there should still be some time before the work is complete.
Meanwhile people may still run into this issue.

Regards,
Kris Mok

[1]: https://gist.github.com/2000950#file_trace_system_gc_call.java
[2]: https://gist.github.com/2158975#file_deadlock_stack_trace.log
[3]: https://gist.github.com/2163070#file_test_deadlock.java
[4]: https://gist.github.com/2163070#file_hit_assertion_in_fastdebug
[5]: https://gist.github.com/2163070#file_fix_assertion_jdk8.patch
[6]: http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-March/005441.html