RFR 8028497: SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)

Coleen Phillimore coleen.phillimore at oracle.com
Tue Apr 8 17:58:10 UTC 2014


Thank you, Erik!
Coleen

On 4/8/14, 11:43 AM, Erik Helin wrote:
> Coleen,
>
> the change http://cr.openjdk.java.net/~coleenp/8028497_4/ looks good.
>
> Thanks for taking care of this patch!
> Erik
>
> On 2014-04-04 17:28, Coleen Phillimore wrote:
>>
>> Erik,
>> Thank you for the discussion and thorough code reviews.   See below.
>> (other reviewers: there's a new version of webrev, read down a little).
>>
>> On 4/1/14 4:32 AM, Erik Helin wrote:
>>> On 2014-04-01 05:30, Coleen Phillimore wrote:
>>>>
>>>> Erik,
>>>>
>>>> You are right.  The old code would use the Klass->mirror link to clear
>>>> out the mirror->Klass link so that we didn't have a pointer to a
>>>> deallocated Klass in a mirror that might still be live for CMS 
>>>> marking.
>>>> The new code doesn't set the Klass->mirror link unless the mirror 
>>>> fields
>>>> are completely allocated, so we could crash if we deallocated a Klass
>>>> before the mirror because the mirror would still be pointing to the
>>>> class.
>>>>
>>>> I've caught failed allocation in mirror fields so that the link is 
>>>> reset
>>>> in this change.  The only change is to the function create_mirror 
>>>> calls
>>>> initialize_mirror_fields.
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_3/
>>>
>>> In javaClasses.cpp:
>>>   void java_lang_Class::create_mirror(KlassHandle k, Handle, TRAPS) {
>>>     ...
>>>     initialize_mirror_fields(k, mirror, protection_domain, THREAD);
>>>     if (HAS_PENDING_EXCEPTION) {
>>>       // If any of the fields throws an exception like OOM remove the
>>>       // klass field from the mirror so GC doesn't follow it after the
>>>       // klass has been deallocated.
>>>       java_lang_Class::set_klass(mirror(), k());
>>>       return;                              ^
>>>     }                                      |
>>>                                            |
>>> +------------------------------------------+
>>> |
>>> +-- Do you really want to use k() here? I thought the idea was to do:
>>>     java_lang_Class::set_klass(mirror(), NULL);
>>>
>>
>> I had meant to change this to NULL.
>>> Setting the Klass that the mirror is mirroring will hit the following
>>> check in instanceMirrorKlass.cpp, oop_oop_iterate macro:
>>>
>>> int InstanceMirrorKlass::oop_oop_iterate##nv_suffix(...) 
>>> {             \
>>> ... \
>>>   if_do_metadata_checked(closure, nv_suffix) 
>>> {                         \
>>>     Klass* klass = 
>>> java_lang_Class::as_Klass(obj);                     \
>>>     /* We'll get NULL for primitive mirrors. 
>>> */                        \
>>>     if (klass != NULL) 
>>> {                                               \
>>> closure->do_klass##nv_suffix(klass); \
>>> } \
>>> }
>>>
>>> Setting the Klass that the mirror is mirroring to NULL will make the
>>> mirror look like a primitive mirror (this is the way we encode
>>> primitive mirrors). Will that be a problem?
>>>
>>
>> No, this is fine.  I added a comment that this mirror isn't a primitive
>> type but it's also not representative of a class.  I had internal code
>> to generate lots of these mirrors and CMS and G1 work fine because they
>> don't follow the klass.   We don't want them to. We also clean up these
>> orphaned mirrors, which I also verified manually.
>>
>> I also tested heap walking with temporary code to create lots of
>> orphaned mirrors.  I tried to create an internal test or a white box
>> test but wasn't able to generate the right set of circumstances to
>> exercise this code.   If necessary, we can open another bug to add a
>> test but I found it noreg-hard.
>>
>> This is also tested with Stefan's TreeMapTest.java test case that also
>> required temporary code in the jvm.
>>
>> So yes, I think we want to be able to set the Klass field in the mirror
>> to null.  If other bugs fall out because of this in the future, I think
>> we should fix the other bugs separately.
>>
>> New webrev with only this classfile/javaClasses.cpp change is here.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_4/
>>
>> Thanks!
>> Coleen
>>> Thanks,
>>> Erik
>>>
>>>> Tested with vm.parallel_class_loading.testlist which do hit this case.
>>>>
>>>> Thanks!
>>>> Coleen
>>>>
>>>>
>>>> On 3/27/14 1:11 PM, Erik Helin wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> thanks for taking on this tricky bug! However, I think there is an
>>>>> issue with your fix :(
>>>>>
>>>>> This case is for "normal" (non-CDS) class loading:
>>>>> 1. We create the mirror
>>>>> 2. We set up the link from the mirror to the Klass
>>>>> 3. We fail somewhere before setting the link from the Klass to the
>>>>>    mirror, for example when allocating the Java object for the init
>>>>>    lock.
>>>>> 4. The destructor for ClassFileParser is run and we add the Klass to
>>>>> the
>>>>>    deallocate list.
>>>>> 5. A CMS concurrent cycle is started, it will treat all objects in 
>>>>> eden
>>>>>    + the survivor area as roots (including the mirror).
>>>>> 6. The above concurrent cycle enters the "final remark" phase and 
>>>>> does
>>>>>    class unloading which will call free_deallocate_list() which will
>>>>>    return the memory for the Klass added to the deallocate list in 
>>>>> step
>>>>>    4 to Metaspace.
>>>>> 7. The concurrent CMS cycle ends (the mirror is still in eden or 
>>>>> young)
>>>>> 8. A new concurrent CMS cycle is started, the mirror will once 
>>>>> again be
>>>>>    treated as a root since it is still in eden or young (no YC has 
>>>>> been
>>>>>    done).
>>>>> 9. During the initial marking, CMS will now follow the pointer to the
>>>>>    Klass that the mirror is mirroring, but we returned this memory to
>>>>>    Metaspace in step 6.
>>>>> 10. We now crash :(
>>>>>
>>>>> Unfortunately, I don't have any suggestion on how to solve the above
>>>>> scenario at the moment.
>>>>>
>>>>> Thanks,
>>>>> Erik
>>>>>
>>>>> On 2014-03-27 16:04, Coleen Phillimore wrote:
>>>>>>
>>>>>> On 3/27/14 8:39 AM, Stefan Karlsson wrote:
>>>>>>>
>>>>>>> On 2014-03-27 03:00, Coleen Phillimore wrote:
>>>>>>>>
>>>>>>>> I have a new version of this change based on code that Ioi noticed
>>>>>>>> that I was missing for making restoring the method's adapters
>>>>>>>> restartable if something fails during restore_unshareable_info, 
>>>>>>>> and
>>>>>>>> the constant pool resolved references array.
>>>>>>>> Please review this new code (only method.* and 
>>>>>>>> instanceKlass.cpp and
>>>>>>>> constantPool.cpp have differences).  Reran tests with CDS and CMS.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/
>>>>>>>
>>>>>>> This seems good. I'd prefer if someone with more experience with 
>>>>>>> CDS
>>>>>>> also Reviews the change.
>>>>>>
>>>>>> Thanks, Ioi said he'd look at it.
>>>>>>
>>>>>>>
>>>>>>> Small comment:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/src/share/vm/oops/instanceKlass.cpp.udiff.html 
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -    methodHandle m(THREAD, methods->at(index2));
>>>>>>> -    m()->link_method(m, CHECK);
>>>>>>> -    // restore method's vtable by calling a virtual function
>>>>>>> -    m->restore_vtable();
>>>>>>> +    Method *m = methods->at(index2);
>>>>>>> +    m->restore_unshareable_info(CHECK);
>>>>>>>
>>>>>>>
>>>>>>> Do we really want to use a raw Method * here. I know that we 
>>>>>>> wrap the
>>>>>>> method in a handle inside m->restore_unshareable_info, but it's not
>>>>>>> clear from the this context. I'm thinking about this issue:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8004815
>>>>>>>
>>>>>>> On the other hand, I guess we can't redefine a class before it has
>>>>>>> been defined. But maybe it would be good to not promote the 
>>>>>>> usage of
>>>>>>> Method* over calls that might hit a Safepoint.
>>>>>>>
>>>>>>
>>>>>> I think you're right, we don't use 'm' after the potential
>>>>>> safepoint and
>>>>>> we can't really redefine this class yet anyway but I'll restore the
>>>>>> code
>>>>>> because we don't want to encourage uses of unhandled Methods unless
>>>>>> proven to be harmless.
>>>>>>
>>>>>> Thanks!
>>>>>> Coleen
>>>>>>> thanks,
>>>>>>> StefanK
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>> On 3/26/14 5:15 PM, Coleen Phillimore wrote:
>>>>>>>>>
>>>>>>>>> Hi Jon,
>>>>>>>>> Thank you for looking at the code changes.
>>>>>>>>>
>>>>>>>>> On 3/26/14 12:24 PM, Jon Masamitsu wrote:
>>>>>>>>>> Coleen,
>>>>>>>>>>
>>>>>>>>>> Did you remove Class_obj_allocate() because it was only
>>>>>>>>>> called in that one spot and  use obj_allocate() did most
>>>>>>>>>> of the work?
>>>>>>>>>
>>>>>>>>> I deleted it because GC doesn't need to know how these 
>>>>>>>>> javaClasses
>>>>>>>>> are set up and I spent way to long looking for this code and 
>>>>>>>>> didn't
>>>>>>>>> find it where it belongs, so I removed this. This
>>>>>>>>> Class_obj_allocate
>>>>>>>>> is an artifact of the permgen code.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/instanceMirrorKlass.cpp.frames.html 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  373   // the size in the mirror size itself.
>>>>>>>>>>
>>>>>>>>>> Drop the second "size".  You know how we GC guys are
>>>>>>>>>> getting with our comments :-).
>>>>>>>>>
>>>>>>>>> Yes, I noticed that.  Thanks - fixed.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/klass.cpp.frames.html 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  498 void Klass::restore_unshareable_info(TRAPS) {
>>>>>>>>>>  499   // If an exception happened during CDS restore, some of
>>>>>>>>>> these fields may already be
>>>>>>>>>>  500   // set.  We leave the class on the CLD list, even if
>>>>>>>>>> incomplete so that we don't
>>>>>>>>>>  501   // modify the CLD list outside a safepoint.
>>>>>>>>>>  502   if (class_loader_data() == NULL) {
>>>>>>>>>>  503     ClassLoaderData* loader_data =
>>>>>>>>>> ClassLoaderData::the_null_class_loader_data();
>>>>>>>>>>
>>>>>>>>>> I can see that this works with the current CDS (sharing only
>>>>>>>>>> classes loader
>>>>>>>>>> with the NULL class loader).  Will it work if the shared classes
>>>>>>>>>> are loaded
>>>>>>>>>> with an application class loader?
>>>>>>>>>
>>>>>>>>> I hope Ioi can answer that.  I assume so because the classes 
>>>>>>>>> don't
>>>>>>>>> change class loaders if restoring shareable info fails.
>>>>>>>>>
>>>>>>>>> Ioi pointed out to me (not on open list) that I missed making
>>>>>>>>> link_method() and creating the constant pool resolved references
>>>>>>>>> array restartable, so I'm testing a fix for that also and I'll 
>>>>>>>>> post
>>>>>>>>> an update later.   But you've seen most of it.
>>>>>>>>>
>>>>>>>>> Thanks!!
>>>>>>>>> Coleen
>>>>>>>>>>
>>>>>>>>>> Rest looks good.
>>>>>>>>>>
>>>>>>>>>> Jon
>>>>>>>>>>
>>>>>>>>>> On 3/25/2014 1:42 PM, Coleen Phillimore wrote:
>>>>>>>>>>> Summary: Keep class in CLD::_klasses list and mirror created 
>>>>>>>>>>> for
>>>>>>>>>>> CDS classes if OOM during restore_shareable_info(). This keeps
>>>>>>>>>>> pointers consistent for CMS.
>>>>>>>>>>>
>>>>>>>>>>> This change makes restoring the mirror for shared classes
>>>>>>>>>>> restartable if there is an OOM in the middle of
>>>>>>>>>>> restore_unshareable_info.   The mirror field in the classes and
>>>>>>>>>>> CLD link is not cleaned up anymore.  The class isn't marked
>>>>>>>>>>> loaded
>>>>>>>>>>> though so must be loaded again.
>>>>>>>>>>>
>>>>>>>>>>> Tested with Stefan's test case with hard coded OOM in jvm, so
>>>>>>>>>>> can't add the test.   Also ran all the tests against -client 
>>>>>>>>>>> and
>>>>>>>>>>> server with -Xshare:auto and with -XX:+UseConcMarkSweepGC.
>>>>>>>>>>>
>>>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497/
>>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8028497
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Coleen
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>



More information about the hotspot-dev mailing list