RFR 8028497: SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)

Erik Helin erik.helin at oracle.com
Tue Apr 1 08:32:37 UTC 2014


On 2014-04-01 05:30, Coleen Phillimore wrote:
>
> Erik,
>
> You are right.  The old code would use the Klass->mirror link to clear
> out the mirror->Klass link so that we didn't have a pointer to a
> deallocated Klass in a mirror that might still be live for CMS marking.
> The new code doesn't set the Klass->mirror link unless the mirror fields
> are completely allocated, so we could crash if we deallocated a Klass
> before the mirror because the mirror would still be pointing to the class.
>
> I've caught failed allocation in mirror fields so that the link is reset
> in this change.  The only change is to the function create_mirror calls
> initialize_mirror_fields.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_3/

In javaClasses.cpp:
   void java_lang_Class::create_mirror(KlassHandle k, Handle, TRAPS) {
     ...
     initialize_mirror_fields(k, mirror, protection_domain, THREAD);
     if (HAS_PENDING_EXCEPTION) {
       // If any of the fields throws an exception like OOM remove the
       // klass field from the mirror so GC doesn't follow it after the
       // klass has been deallocated.
       java_lang_Class::set_klass(mirror(), k());
       return;                              ^
     }                                      |
                                            |
+------------------------------------------+
|
+-- Do you really want to use k() here? I thought the idea was to do:
     java_lang_Class::set_klass(mirror(), NULL);

Setting the Klass that the mirror is mirroring will hit the following 
check in instanceMirrorKlass.cpp, oop_oop_iterate macro:

int InstanceMirrorKlass::oop_oop_iterate##nv_suffix(...) {             \
   ...                                                                  \
   if_do_metadata_checked(closure, nv_suffix) {                         \
     Klass* klass = java_lang_Class::as_Klass(obj);                     \
     /* We'll get NULL for primitive mirrors. */                        \
     if (klass != NULL) {                                               \
        closure->do_klass##nv_suffix(klass);                            \
   }                                                                    \
}

Setting the Klass that the mirror is mirroring to NULL will make the 
mirror look like a primitive mirror (this is the way we encode primitive 
mirrors). Will that be a problem?

Thanks,
Erik

> Tested with vm.parallel_class_loading.testlist which do hit this case.
>
> Thanks!
> Coleen
>
>
> On 3/27/14 1:11 PM, Erik Helin wrote:
>> Hi Coleen,
>>
>> thanks for taking on this tricky bug! However, I think there is an
>> issue with your fix :(
>>
>> This case is for "normal" (non-CDS) class loading:
>> 1. We create the mirror
>> 2. We set up the link from the mirror to the Klass
>> 3. We fail somewhere before setting the link from the Klass to the
>>    mirror, for example when allocating the Java object for the init
>>    lock.
>> 4. The destructor for ClassFileParser is run and we add the Klass to the
>>    deallocate list.
>> 5. A CMS concurrent cycle is started, it will treat all objects in eden
>>    + the survivor area as roots (including the mirror).
>> 6. The above concurrent cycle enters the "final remark" phase and does
>>    class unloading which will call free_deallocate_list() which will
>>    return the memory for the Klass added to the deallocate list in step
>>    4 to Metaspace.
>> 7. The concurrent CMS cycle ends (the mirror is still in eden or young)
>> 8. A new concurrent CMS cycle is started, the mirror will once again be
>>    treated as a root since it is still in eden or young (no YC has been
>>    done).
>> 9. During the initial marking, CMS will now follow the pointer to the
>>    Klass that the mirror is mirroring, but we returned this memory to
>>    Metaspace in step 6.
>> 10. We now crash :(
>>
>> Unfortunately, I don't have any suggestion on how to solve the above
>> scenario at the moment.
>>
>> Thanks,
>> Erik
>>
>> On 2014-03-27 16:04, Coleen Phillimore wrote:
>>>
>>> On 3/27/14 8:39 AM, Stefan Karlsson wrote:
>>>>
>>>> On 2014-03-27 03:00, Coleen Phillimore wrote:
>>>>>
>>>>> I have a new version of this change based on code that Ioi noticed
>>>>> that I was missing for making restoring the method's adapters
>>>>> restartable if something fails during restore_unshareable_info, and
>>>>> the constant pool resolved references array.
>>>>> Please review this new code (only method.* and instanceKlass.cpp and
>>>>> constantPool.cpp have differences).  Reran tests with CDS and CMS.
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/
>>>>
>>>> This seems good. I'd prefer if someone with more experience with CDS
>>>> also Reviews the change.
>>>
>>> Thanks, Ioi said he'd look at it.
>>>
>>>>
>>>> Small comment:
>>>>
>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/src/share/vm/oops/instanceKlass.cpp.udiff.html
>>>>
>>>>
>>>>
>>>> -    methodHandle m(THREAD, methods->at(index2));
>>>> -    m()->link_method(m, CHECK);
>>>> -    // restore method's vtable by calling a virtual function
>>>> -    m->restore_vtable();
>>>> +    Method *m = methods->at(index2);
>>>> +    m->restore_unshareable_info(CHECK);
>>>>
>>>>
>>>> Do we really want to use a raw Method * here. I know that we wrap the
>>>> method in a handle inside  m->restore_unshareable_info, but it's not
>>>> clear from the this context. I'm thinking about this issue:
>>>> https://bugs.openjdk.java.net/browse/JDK-8004815
>>>>
>>>> On the other hand, I guess we can't redefine a class before it has
>>>> been defined. But maybe it would be good to not promote the usage of
>>>> Method* over calls that might hit a Safepoint.
>>>>
>>>
>>> I think you're right, we don't use 'm' after the potential safepoint and
>>> we can't really redefine this class yet anyway but I'll restore the code
>>> because we don't want to encourage uses of unhandled Methods unless
>>> proven to be harmless.
>>>
>>> Thanks!
>>> Coleen
>>>> thanks,
>>>> StefanK
>>>>
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>> On 3/26/14 5:15 PM, Coleen Phillimore wrote:
>>>>>>
>>>>>> Hi Jon,
>>>>>> Thank you for looking at the code changes.
>>>>>>
>>>>>> On 3/26/14 12:24 PM, Jon Masamitsu wrote:
>>>>>>> Coleen,
>>>>>>>
>>>>>>> Did you remove Class_obj_allocate() because it was only
>>>>>>> called in that one spot and  use obj_allocate() did most
>>>>>>> of the work?
>>>>>>
>>>>>> I deleted it because GC doesn't need to know how these javaClasses
>>>>>> are set up and I spent way to long looking for this code and didn't
>>>>>> find it where it belongs, so I removed this. This Class_obj_allocate
>>>>>> is an artifact of the permgen code.
>>>>>>
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/instanceMirrorKlass.cpp.frames.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  373   // the size in the mirror size itself.
>>>>>>>
>>>>>>> Drop the second "size".  You know how we GC guys are
>>>>>>> getting with our comments :-).
>>>>>>
>>>>>> Yes, I noticed that.  Thanks - fixed.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/klass.cpp.frames.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>  498 void Klass::restore_unshareable_info(TRAPS) {
>>>>>>>  499   // If an exception happened during CDS restore, some of
>>>>>>> these fields may already be
>>>>>>>  500   // set.  We leave the class on the CLD list, even if
>>>>>>> incomplete so that we don't
>>>>>>>  501   // modify the CLD list outside a safepoint.
>>>>>>>  502   if (class_loader_data() == NULL) {
>>>>>>>  503     ClassLoaderData* loader_data =
>>>>>>> ClassLoaderData::the_null_class_loader_data();
>>>>>>>
>>>>>>> I can see that this works with the current CDS (sharing only
>>>>>>> classes loader
>>>>>>> with the NULL class loader).  Will it work if the shared classes
>>>>>>> are loaded
>>>>>>> with an application class loader?
>>>>>>
>>>>>> I hope Ioi can answer that.  I assume so because the classes don't
>>>>>> change class loaders if restoring shareable info fails.
>>>>>>
>>>>>> Ioi pointed out to me (not on open list) that I missed making
>>>>>> link_method() and creating the constant pool resolved references
>>>>>> array restartable, so I'm testing a fix for that also and I'll post
>>>>>> an update later.   But you've seen most of it.
>>>>>>
>>>>>> Thanks!!
>>>>>> Coleen
>>>>>>>
>>>>>>> Rest looks good.
>>>>>>>
>>>>>>> Jon
>>>>>>>
>>>>>>> On 3/25/2014 1:42 PM, Coleen Phillimore wrote:
>>>>>>>> Summary: Keep class in CLD::_klasses list and mirror created for
>>>>>>>> CDS classes if OOM during restore_shareable_info(). This keeps
>>>>>>>> pointers consistent for CMS.
>>>>>>>>
>>>>>>>> This change makes restoring the mirror for shared classes
>>>>>>>> restartable if there is an OOM in the middle of
>>>>>>>> restore_unshareable_info.   The mirror field in the classes and
>>>>>>>> CLD link is not cleaned up anymore.  The class isn't marked loaded
>>>>>>>> though so must be loaded again.
>>>>>>>>
>>>>>>>> Tested with Stefan's test case with hard coded OOM in jvm, so
>>>>>>>> can't add the test.   Also ran all the tests against -client and
>>>>>>>> server with -Xshare:auto and with -XX:+UseConcMarkSweepGC.
>>>>>>>>
>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497/
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8028497
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>


More information about the hotspot-dev mailing list