RFR (S/M) JDK-8010196 NPG: Internal Error: Metaspace allocation lock -- possible deadlock
Coleen Phillimore
coleen.phillimore at oracle.com
Mon Apr 8 20:58:11 PDT 2013
Minor correction below.
On 4/8/2013 11:20 PM, Coleen Phillimore wrote:
> Mikael,
> This looks good including the refactoring.
>
> On 4/8/2013 8:17 PM, David Holmes wrote:
>> Hi Mikael,
>>
>> This approach sounds (and looks) reasonable to me, but I'll leav eit
>> to others to officially Review.
>>
>> One concern I did have was whether the initialization of the array
>> could fail and what that would mean for how this particular cld was
>> used. That led me to the following code which looks suspect to me:
>>
>> // ClassLoaderData into the java/lang/ClassLoader object as a hidden
>> field
>> ClassLoaderData* ClassLoaderDataGraph::add(ClassLoaderData**
>> cld_addr, Handle loader, TRAPS) {
>> ...
>> ClassLoaderData* cld = new ClassLoaderData(loader, is_anonymous);
>>
>> if (cld_addr != NULL) {
>> // First, Atomically set it
>> ClassLoaderData* old = (ClassLoaderData*)
>> Atomic::cmpxchg_ptr(cld, cld_addr, NULL);
>> if (old != NULL) {
>> delete cld;
>> // Returns the data.
>> return old;
>> }
>> }
>>
>> // We won the race, and therefore the task of adding the data to
>> the list of
>> // class loader data
>> do {
>> cld->set_next(next);
>> ClassLoaderData* exchanged =
>> (ClassLoaderData*)Atomic::cmpxchg_ptr(cld, list_head, next);
>> if (exchanged == next) {
>> ...
>> // Create dependencies after the CLD is added to the list.
>> Otherwise,
>> // the GC GC will not find the CLD and the _class_loader field
>> will
>> // not be updated.
>> cld->init_dependencies(CHECK_NULL);
>> return cld;
>> }
>> next = exchanged;
>> } while (true);
>> }
>>
>> So it explicitly states that we create dependencies after adding to
>> the list, but if creating the dependencies fails we return null, but
>> leave the new cld in the list. Is that a valid thing to do ??
>>
>
> Not really. If we hit OOM here there isn't a good recovery. We
> shouldn't return NULL in any case. I will file a bug. In the
> meantime, Mikael can you change it to CATCH_NULL and fix the comment
> above (remove "the GC" before "GC")?
David is right, it can return null with the pending exception but should
be removed from the list and from the class_loader hidden field in this
case. This function is dangerous because it has unhandled oops (the
address passed in). For now, I think you can leave it Mikael. It'll
throw the OOM anyway.
Coleen
>
> Coleen
>
>
>
>
>> David
>> -----
>>
>> On 8/04/2013 11:49 PM, Mikael Gerdin wrote:
>>> Hi
>>>
>>> The problem is that when running the G1 garbage collector a call to
>>> objArrayOopDesc::obj_at_put can in rare cases cause the G1 dirty card
>>> queue buffer to fill up and this will cause G1 to take
>>> DirtyCardQ_CBL_mon/16 to return the full buffer and get a new one.
>>>
>>> Adding dependencies to a ClassLoaderData is currently protected by the
>>> per-metaspace "Metaspace allocation lock"/5 (which protects more than
>>> just allocations).
>>>
>>> Because I want to avoid messing around with the lock ordering I suggest
>>> that we use an ObjectLocker on the _dependencies oop in
>>> ClassLoaderData.
>>> _dependencies is a 2-element objArrayOop, in effect it's an ad-hoc
>>> linked list of ClassLoaders/Classes which must be kept alive by this
>>> CLD.
>>>
>>> Using an ObjectLocker on the head element of the linked list should be
>>> at least as good as using the metaspace_lock(). There should not be any
>>> new deadlock issues with VM mutexes since any application thread could
>>> use a similar synchronized linked list construct.
>>>
>>> As a bonus I suggest that we factor out the dependency handling from
>>> ClassLoaderData into a inner class: ClassLoaderData::Dependencies and
>>> let Dependencies manage the head of the linked list by itself. This
>>> should make it more clear that the dependency list is not guarded by
>>> the
>>> metaspace lock anymore.
>>>
>>> Buglinks:
>>> https://jbs.oracle.com/bugs/browse/JDK-8010196
>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8010196
>>>
>>> Testing:
>>> default jprt run
>>> jdk/test/java/lang/invoke/BigArityTest.java (which could reproduce the
>>> issue with -XX:+UseG1GC -XX:G1UpdateBufferSize=1)
>>>
>>> Webrevs:
>>> Incremental:
>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0-fix/webrev
>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0-factor-out/webrev
>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0-rename/webrev
>>> Everything:
>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0/webrev
>>>
>>> Thanks
>>> /Mikael
>
More information about the hotspot-runtime-dev
mailing list