RFR (S/M) JDK-8010196 NPG: Internal Error: Metaspace allocation lock -- possible deadlock

David Holmes david.holmes at oracle.com
Mon Apr 8 21:13:02 PDT 2013


On 9/04/2013 1:58 PM, Coleen Phillimore wrote:
>
> Minor correction below.
>
> On 4/8/2013 11:20 PM, Coleen Phillimore wrote:
>> Mikael,
>> This looks good including the refactoring.
>>
>> On 4/8/2013 8:17 PM, David Holmes wrote:
>>> Hi Mikael,
>>>
>>> This approach sounds (and looks) reasonable to me, but I'll leav eit
>>> to others to officially Review.
>>>
>>> One concern I did have was whether the initialization of the array
>>> could fail and what that would mean for how this particular cld was
>>> used. That led me to the following code which looks suspect to me:
>>>
>>> // ClassLoaderData into the java/lang/ClassLoader object as a hidden
>>> field
>>> ClassLoaderData* ClassLoaderDataGraph::add(ClassLoaderData**
>>> cld_addr, Handle loader, TRAPS) {
>>>   ...
>>>   ClassLoaderData* cld = new ClassLoaderData(loader, is_anonymous);
>>>
>>>   if (cld_addr != NULL) {
>>>     // First, Atomically set it
>>>     ClassLoaderData* old = (ClassLoaderData*)
>>> Atomic::cmpxchg_ptr(cld, cld_addr, NULL);
>>>     if (old != NULL) {
>>>       delete cld;
>>>       // Returns the data.
>>>       return old;
>>>     }
>>>   }
>>>
>>>   // We won the race, and therefore the task of adding the data to
>>> the list of
>>>   // class loader data
>>>   do {
>>>     cld->set_next(next);
>>>     ClassLoaderData* exchanged =
>>> (ClassLoaderData*)Atomic::cmpxchg_ptr(cld, list_head, next);
>>>     if (exchanged == next) {
>>>       ...
>>>       // Create dependencies after the CLD is added to the list.
>>> Otherwise,
>>>       // the GC GC will not find the CLD and the _class_loader field
>>> will
>>>       // not be updated.
>>>       cld->init_dependencies(CHECK_NULL);
>>>       return cld;
>>>     }
>>>     next = exchanged;
>>>   } while (true);
>>> }
>>>
>>> So it explicitly states that we create dependencies after adding to
>>> the list, but if creating the dependencies fails we return null, but
>>> leave the new cld in the list. Is that a valid thing to do ??
>>>
>>
>> Not really.  If we hit OOM here there isn't a good recovery. We
>> shouldn't return NULL in any case.   I will file a bug.  In the
>> meantime, Mikael can you change it to CATCH_NULL and fix the comment
>> above (remove "the GC" before "GC")?
>
> David is right, it can return null with the pending exception but should
> be removed from the list and from the class_loader hidden field in this
> case.   This function is dangerous because it has unhandled oops (the
> address passed in).  For now, I think you can leave it Mikael.  It'll
> throw the OOM anyway.

So getting back to the original fix my concern is whether after 
init_dependencies fails, can we ever call add_dependency for that cld 
and so attempt to lock on a NULL object?

David

> Coleen
>
>>
>> Coleen
>>
>>
>>
>>
>>> David
>>> -----
>>>
>>> On 8/04/2013 11:49 PM, Mikael Gerdin wrote:
>>>> Hi
>>>>
>>>> The problem is that when running the G1 garbage collector a call to
>>>> objArrayOopDesc::obj_at_put can in rare cases cause the G1 dirty card
>>>> queue buffer to fill up and this will cause G1 to take
>>>> DirtyCardQ_CBL_mon/16 to return the full buffer and get a new one.
>>>>
>>>> Adding dependencies to a ClassLoaderData is currently protected by the
>>>> per-metaspace "Metaspace allocation lock"/5 (which protects more than
>>>> just allocations).
>>>>
>>>> Because I want to avoid messing around with the lock ordering I suggest
>>>> that we use an ObjectLocker on the _dependencies oop in
>>>> ClassLoaderData.
>>>> _dependencies is a 2-element objArrayOop, in effect it's an ad-hoc
>>>> linked list of ClassLoaders/Classes which must be kept alive by this
>>>> CLD.
>>>>
>>>> Using an ObjectLocker on the head element of the linked list should be
>>>> at least as good as using the metaspace_lock(). There should not be any
>>>> new deadlock issues with VM mutexes since any application thread could
>>>> use a similar synchronized linked list construct.
>>>>
>>>> As a bonus I suggest that we factor out the dependency handling from
>>>> ClassLoaderData into a inner class: ClassLoaderData::Dependencies and
>>>> let Dependencies manage the head of the linked list by itself. This
>>>> should make it more clear that the dependency list is not guarded by
>>>> the
>>>> metaspace lock anymore.
>>>>
>>>> Buglinks:
>>>> https://jbs.oracle.com/bugs/browse/JDK-8010196
>>>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=8010196
>>>>
>>>> Testing:
>>>> default jprt run
>>>> jdk/test/java/lang/invoke/BigArityTest.java (which could reproduce the
>>>> issue with -XX:+UseG1GC -XX:G1UpdateBufferSize=1)
>>>>
>>>> Webrevs:
>>>> Incremental:
>>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0-fix/webrev
>>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0-factor-out/webrev
>>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0-rename/webrev
>>>> Everything:
>>>> http://cr.openjdk.java.net/~mgerdin/8010196/webrev.0/webrev
>>>>
>>>> Thanks
>>>> /Mikael
>>
>


More information about the hotspot-runtime-dev mailing list