deadlock in initialization of instanceKlass
David Holmes
david.holmes at oracle.com
Thu Mar 22 18:19:52 PDT 2012
On 23/03/2012 11:13 AM, Krystal Mok wrote:
> Thanks for the quick reply. Comments inline:
>
> On Fri, Mar 23, 2012 at 8:09 AM, David Holmes <david.holmes at oracle.com> wrote:
>
> Hi Kris,
>
> The parallel-classloading changes likely remove this problem from 7+
>
> I did check the value of the flag AllowParallelDefineClass, which
> defaulted to false when I ran on JDK7/HS23 and JDK8/HS24.
I think by 7 GA parallel class loaders were enabled by default - no need
for a VM flag. But my memory is rusty on this too.
> My min repro didn't use an agent. But it's easier to fall into this trap
> with an agent.
Sorry, I didn't look at the min repro.
> So I'm not going to stick my head in and try to "fix" this bug, since
> the problem doesn't show up in 7+ anyway.
Ok. Thanks for reporting it.
> On the int[0] stuff: yep, I saw it. The locker object for bootstrap
> class loader is also an array, a "fake" int[0][]:
>
> _system_loader_lock_obj = oopFactory::new_system_objArray(0, CHECK);
>
>
> It's been like that since duke@0... from what I can tell. Wonder why an
> array was favored over something else, say, a java.lang.Byte instance,
> which carries pretty much the same weight (in terms of object size)?
A primitive array can be created without having any classes loaded.
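
As a Java-level analogy only (this is not the HotSpot code quoted above, and the class name ArrayLockSketch is made up for illustration), the sketch below uses a zero-length primitive array as a lock object. It also shows that the array's class has no class loader, which is consistent with the point that no class needs to be loaded to create it.

```java
// Hypothetical illustration; not taken from the HotSpot sources.
final class ArrayLockSketch {
    // A zero-length primitive array used purely as a monitor.
    private static final Object LOCK = new int[0];
    private static int counter = 0;

    // Increments the counter while holding the array's monitor.
    static int increment() {
        synchronized (LOCK) {
            return ++counter;
        }
    }

    // Array classes are created by the VM rather than loaded from class
    // files; for a primitive element type, getClassLoader() returns null.
    static boolean lockClassHasNoLoader() {
        return LOCK.getClass().getClassLoader() == null;
    }

    public static void main(String[] args) {
        System.out.println("counter = " + increment());
        System.out.println("no class loader: " + lockClassHasNoLoader());
    }
}
```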
David
-----
>
> Thanks,
> Kris Mok
>
> Thanks,
> David
>
>
> On 23/03/2012 9:44 AM, Krystal Mok wrote:
>
> Hi,
>
> I've run into a corner case where deadlock happens during instanceKlass
> linking, which I think could have been avoided by patching the VM. It's
> mostly a JDK6-only issue, as I haven't been able to reproduce it on JDK7
> or JDK8.
>
> In the original problem, the scenario was to start a Java process with a
> BTrace script tracing from the beginning [1]. As the system started up,
> two threads were trying to load two different classes with different
> class loaders at about the same time [2]. Both of these class load
> operations were delegated to BTrace's agent for instrumentation, where
> both threads tried to create a new instance of a class called
> "com.sun.btrace.runtime.Instrumentor". This class was actually loaded
> already, but not linked/initialized yet, so this was the time to do it.
> And then a deadlock happened: one of the threads had locked the system
> class loader before trying to link the class of
> "com.sun.btrace.runtime.Instrumentor", whereas the other thread went
> ahead and started linking the class of
> "com.sun.btrace.runtime.Instrumentor" first, and for verification it
> needed to load new classes, which in turn needed to lock the system
> class loader.
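
The shape of that deadlock is an ordinary lock-order inversion. The sketch below is not the repro from [3]. It uses two hypothetical ReentrantLock objects as stand-ins for the system class loader's lock and the lock on the class being linked, forces two threads to take them in opposite orders, and then asks ThreadMXBean whether a deadlock cycle exists. All class, method, and thread names are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration; these locks only stand in for the VM's monitors.
final class LockInversionSketch {

    // Takes `first`, waits until both threads hold their first lock,
    // then tries to take `second`, which the other thread already owns.
    static Thread worker(String name, ReentrantLock first, ReentrantLock second,
                         CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            try {
                first.lockInterruptibly();
                try {
                    bothHoldFirst.countDown();
                    bothHoldFirst.await();
                    second.lockInterruptibly(); // blocks until interrupted
                    second.unlock();
                } finally {
                    first.unlock();
                }
            } catch (InterruptedException e) {
                // Interrupted by runAndDetect() to break the deadlock.
            }
        }, name);
        t.setDaemon(true);
        return t;
    }

    // Returns the number of threads reported as deadlocked (0 if none seen).
    static int runAndDetect() {
        ReentrantLock loaderLock = new ReentrantLock(); // stand-in: class loader lock
        ReentrantLock klassLock = new ReentrantLock();  // stand-in: class being linked
        CountDownLatch latch = new CountDownLatch(2);
        Thread a = worker("loader-then-klass", loaderLock, klassLock, latch);
        Thread b = worker("klass-then-loader", klassLock, loaderLock, latch);
        a.start();
        b.start();
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = null;
        try {
            // Poll for up to about five seconds until the cycle is reported.
            for (int i = 0; i < 200 && ids == null; i++) {
                Thread.sleep(25);
                ids = mx.findDeadlockedThreads();
            }
            a.interrupt();
            b.interrupt();
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + runAndDetect());
    }
}
```

Unlike the real case, the latch makes the inversion deterministic here, and the locks are interruptible so the sketch can clean up after itself.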
>
> A simplified minimal repro of the original problem is available [3].
> Caveat: it may take a few runs to actually see the deadlock. I've had a
> better chance of reproducing it on JDK6/HS20; it's harder on JDK6/HS24,
> where everything seems to be faster. Haven't been able to reproduce it
> with JDK7/HS23 or JDK8/HS24.
>
> The race in this case involves locking both an instanceKlass and the
> lock object of its defining class loader. Since an instanceKlass isn't a
> Java-level object, Java code couldn't have locked it on its own.
> So I think there's a chance of getting around this deadlock by locking
> the loader before locking the instanceKlass, for places where class
> loading may happen during the period the instanceKlass is locked. Or
> perhaps the complete_exit()/reenter() dance trick.
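
The lock-ordering idea can be sketched at the Java level as follows. This is only a model under stated assumptions, not a VM patch: loaderLock and klassLock are hypothetical stand-ins for the class loader's lock object and the instanceKlass, and loadDependency() stands in for class loading triggered during verification. Because every path takes the loader lock first, the nested acquisition inside link() is a reentrant one and cannot form a cycle.

```java
// Hypothetical illustration; names and structure are assumptions.
final class LockOrderSketch {
    private final Object loaderLock = new Object(); // stand-in: loader lock object
    private final Object klassLock = new Object();  // stand-in: the instanceKlass
    private int linkCount = 0;

    // Stand-in for loading a class during verification: needs the loader lock.
    private void loadDependency() {
        synchronized (loaderLock) {
            // Nothing to do in this sketch.
        }
    }

    // Linking path: loader lock first, then the klass lock.
    void link() {
        synchronized (loaderLock) {
            synchronized (klassLock) {
                loadDependency(); // reentrant: this thread already owns loaderLock
                linkCount++;
            }
        }
    }

    // Loading path that ends up linking while already holding the loader lock.
    void loadThenLink() {
        synchronized (loaderLock) {
            link();
        }
    }

    // Runs both paths concurrently and returns the total number of link() calls.
    static int runTwoThreads(int iterations) {
        LockOrderSketch s = new LockOrderSketch();
        Thread a = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.loadThenLink();
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < iterations; i++) s.link();
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.linkCount;
    }

    public static void main(String[] args) {
        System.out.println("links completed: " + runTwoThreads(10000));
    }
}
```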
>
> A pragmatic way of solving the problem is to modify the application code
> to get rid of the race in the first place. In this particular scenario,
> that'd be modifying BTrace, forcing the Instrumentor class to link early
> with a Class.forName() call. But as in the repro case, the problem seems
> pretty generic, and could hit innocent-looking Java code.
> (Of course, upgrading to JDK7 will get rid of the problem, too. Another
> good case to push for an upgrade :-)
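
A minimal sketch of that application-level workaround, using a made-up Helper class rather than BTrace's real one: Class.forName() with initialize set to true forces the class to be initialized (and therefore linked) at a point the application chooses, for example at startup before any worker threads exist.

```java
// Hypothetical illustration; Helper stands in for a lazily initialized class.
final class EagerInitSketch {
    static volatile boolean helperInitialized = false;

    static final class Helper {
        static {
            // Runs when Helper is initialized, not when it is merely loaded.
            helperInitialized = true;
        }
    }

    // Forces initialization of Helper and reports whether it has happened.
    static boolean forceInit() {
        try {
            // Helper.class.getName() does not itself initialize Helper.
            Class.forName(Helper.class.getName(), true,
                          EagerInitSketch.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return helperInitialized;
    }

    public static void main(String[] args) {
        System.out.println("before: " + helperInitialized);
        System.out.println("after:  " + forceInit());
    }
}
```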
>
> I haven't made a patch to fix this just yet. Any suggestions on whether
> it'd be worthwhile to fix it in the VM or not?
>
> P.S. There's a second bug. When using "jstack -l" to diagnose the
> problem, a fastdebug build of the VM hit an assertion [4]. The code in
> DeadlockCycle assumed that all ObjectMonitors correspond to
> instanceOops, which is not (yet) the case. An instanceKlass or
> constantPool could also be directly locked with an ObjectMonitor, and
> that's not an instanceOop.
> I've made a patch against the current jdk8/jdk8/hotspot to fix this
> second bug [5].
>
> When the Permanent Generation removal work completes, this would no
> longer be a problem, because the klass hierarchy won't be oops anymore,
> and the klassKlass's will be gone. I can see that in Jon's latest patch
> [6] already, but there should still be some time before the work is
> complete. Meanwhile people may still run into this issue.
>
> Regards,
> Kris Mok
>
> [1]: https://gist.github.com/2000950#file_trace_system_gc_call.java
> [2]: https://gist.github.com/2158975#file_deadlock_stack_trace.log
> [3]: https://gist.github.com/2163070#file_test_deadlock.java
> [4]: https://gist.github.com/2163070#file_hit_assertion_in_fastdebug
> [5]: https://gist.github.com/2163070#file_fix_assertion_jdk8.patch
> [6]: http://mail.openjdk.java.net/pipermail/hotspot-dev/2012-March/005441.html
>
>
More information about the hotspot-dev mailing list