RFR 8028497: SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)

Coleen Phillimore coleen.phillimore at oracle.com
Fri Apr 4 15:28:07 UTC 2014


Erik,
Thank you for the discussion and thorough code reviews.   See below.
(other reviewers: there's a new version of webrev, read down a little).

On 4/1/14 4:32 AM, Erik Helin wrote:
> On 2014-04-01 05:30, Coleen Phillimore wrote:
>>
>> Erik,
>>
>> You are right.  The old code would use the Klass->mirror link to clear
>> out the mirror->Klass link so that we didn't have a pointer to a
>> deallocated Klass in a mirror that might still be live for CMS marking.
>> The new code doesn't set the Klass->mirror link unless the mirror fields
>> are completely allocated, so we could crash if we deallocated a Klass
>> before the mirror because the mirror would still be pointing to the 
>> class.
>>
>> I've caught failed allocation in mirror fields so that the link is reset
>> in this change.  The only change is to the function create_mirror calls
>> initialize_mirror_fields.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_3/
>
> In javaClasses.cpp:
>   void java_lang_Class::create_mirror(KlassHandle k, Handle, TRAPS) {
>     ...
>     initialize_mirror_fields(k, mirror, protection_domain, THREAD);
>     if (HAS_PENDING_EXCEPTION) {
>       // If any of the fields throws an exception like OOM remove the
>       // klass field from the mirror so GC doesn't follow it after the
>       // klass has been deallocated.
>       java_lang_Class::set_klass(mirror(), k());
>       return;                              ^
>     }                                      |
>                                            |
> +------------------------------------------+
> |
> +-- Do you really want to use k() here? I thought the idea was to do:
>     java_lang_Class::set_klass(mirror(), NULL);
>

I had meant to change this to NULL.
> Setting the Klass that the mirror is mirroring will hit the following 
> check in instanceMirrorKlass.cpp, oop_oop_iterate macro:
>
> int InstanceMirrorKlass::oop_oop_iterate##nv_suffix(...) {             \
> ... \
>   if_do_metadata_checked(closure, nv_suffix) {                         \
>     Klass* klass = java_lang_Class::as_Klass(obj);                     \
>     /* We'll get NULL for primitive mirrors. */                        \
>     if (klass != NULL) {                                               \
> closure->do_klass##nv_suffix(klass); \
> } \
> }
>
> Setting the Klass that the mirror is mirroring to NULL will make the 
> mirror look like a primitive mirror (this is the way we encode 
> primitive mirrors). Will that be a problem?
>

No, this is fine.  I added a comment that this mirror isn't a primitive 
type but it's also not representative of a class.  I had internal code 
to generate lots of these mirrors and CMS and G1 work fine because they 
don't follow the klass.   We don't want them to. We also clean up these 
orphaned mirrors, which I also verified manually.

I also tested heap walking with temporary code to create lots of 
orphaned mirrors.  I tried to create an internal test or a white box 
test but wasn't able to generate the right set of circumstances to 
exercise this code.   If necessary, we can open another bug to add a 
test but I found it noreg-hard.

This is also tested with Stefan's TreeMapTest.java test case that also 
required temporary code in the jvm.

So yes, I think we want to be able to set the Klass field in the mirror 
to null.  If other bugs fall out because of this in the future, I think 
we should fix the other bugs separately.

New webrev with only this classfile/javaClasses.cpp change is here.

open webrev at http://cr.openjdk.java.net/~coleenp/8028497_4/

Thanks!
Coleen
> Thanks,
> Erik
>
>> Tested with vm.parallel_class_loading.testlist which do hit this case.
>>
>> Thanks!
>> Coleen
>>
>>
>> On 3/27/14 1:11 PM, Erik Helin wrote:
>>> Hi Coleen,
>>>
>>> thanks for taking on this tricky bug! However, I think there is an
>>> issue with your fix :(
>>>
>>> This case is for "normal" (non-CDS) class loading:
>>> 1. We create the mirror
>>> 2. We set up the link from the mirror to the Klass
>>> 3. We fail somewhere before setting the link from the Klass to the
>>>    mirror, for example when allocating the Java object for the init
>>>    lock.
>>> 4. The destructor for ClassFileParser is run and we add the Klass to 
>>> the
>>>    deallocate list.
>>> 5. A CMS concurrent cycle is started, it will treat all objects in eden
>>>    + the survivor area as roots (including the mirror).
>>> 6. The above concurrent cycle enters the "final remark" phase and does
>>>    class unloading which will call free_deallocate_list() which will
>>>    return the memory for the Klass added to the deallocate list in step
>>>    4 to Metaspace.
>>> 7. The concurrent CMS cycle ends (the mirror is still in eden or young)
>>> 8. A new concurrent CMS cycle is started, the mirror will once again be
>>>    treated as a root since it is still in eden or young (no YC has been
>>>    done).
>>> 9. During the initial marking, CMS will now follow the pointer to the
>>>    Klass that the mirror is mirroring, but we returned this memory to
>>>    Metaspace in step 6.
>>> 10. We now crash :(
>>>
>>> Unfortunately, I don't have any suggestion on how to solve the above
>>> scenario at the moment.
>>>
>>> Thanks,
>>> Erik
>>>
>>> On 2014-03-27 16:04, Coleen Phillimore wrote:
>>>>
>>>> On 3/27/14 8:39 AM, Stefan Karlsson wrote:
>>>>>
>>>>> On 2014-03-27 03:00, Coleen Phillimore wrote:
>>>>>>
>>>>>> I have a new version of this change based on code that Ioi noticed
>>>>>> that I was missing for making restoring the method's adapters
>>>>>> restartable if something fails during restore_unshareable_info, and
>>>>>> the constant pool resolved references array.
>>>>>> Please review this new code (only method.* and instanceKlass.cpp and
>>>>>> constantPool.cpp have differences).  Reran tests with CDS and CMS.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/
>>>>>
>>>>> This seems good. I'd prefer if someone with more experience with CDS
>>>>> also Reviews the change.
>>>>
>>>> Thanks, Ioi said he'd look at it.
>>>>
>>>>>
>>>>> Small comment:
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/src/share/vm/oops/instanceKlass.cpp.udiff.html 
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -    methodHandle m(THREAD, methods->at(index2));
>>>>> -    m()->link_method(m, CHECK);
>>>>> -    // restore method's vtable by calling a virtual function
>>>>> -    m->restore_vtable();
>>>>> +    Method *m = methods->at(index2);
>>>>> +    m->restore_unshareable_info(CHECK);
>>>>>
>>>>>
>>>>> Do we really want to use a raw Method * here. I know that we wrap the
>>>>> method in a handle inside  m->restore_unshareable_info, but it's not
>>>>> clear from the this context. I'm thinking about this issue:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8004815
>>>>>
>>>>> On the other hand, I guess we can't redefine a class before it has
>>>>> been defined. But maybe it would be good to not promote the usage of
>>>>> Method* over calls that might hit a Safepoint.
>>>>>
>>>>
>>>> I think you're right, we don't use 'm' after the potential 
>>>> safepoint and
>>>> we can't really redefine this class yet anyway but I'll restore the 
>>>> code
>>>> because we don't want to encourage uses of unhandled Methods unless
>>>> proven to be harmless.
>>>>
>>>> Thanks!
>>>> Coleen
>>>>> thanks,
>>>>> StefanK
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 3/26/14 5:15 PM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> Hi Jon,
>>>>>>> Thank you for looking at the code changes.
>>>>>>>
>>>>>>> On 3/26/14 12:24 PM, Jon Masamitsu wrote:
>>>>>>>> Coleen,
>>>>>>>>
>>>>>>>> Did you remove Class_obj_allocate() because it was only
>>>>>>>> called in that one spot and  use obj_allocate() did most
>>>>>>>> of the work?
>>>>>>>
>>>>>>> I deleted it because GC doesn't need to know how these javaClasses
>>>>>>> are set up and I spent way to long looking for this code and didn't
>>>>>>> find it where it belongs, so I removed this. This 
>>>>>>> Class_obj_allocate
>>>>>>> is an artifact of the permgen code.
>>>>>>>
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/instanceMirrorKlass.cpp.frames.html 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  373   // the size in the mirror size itself.
>>>>>>>>
>>>>>>>> Drop the second "size".  You know how we GC guys are
>>>>>>>> getting with our comments :-).
>>>>>>>
>>>>>>> Yes, I noticed that.  Thanks - fixed.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/klass.cpp.frames.html 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>  498 void Klass::restore_unshareable_info(TRAPS) {
>>>>>>>>  499   // If an exception happened during CDS restore, some of
>>>>>>>> these fields may already be
>>>>>>>>  500   // set.  We leave the class on the CLD list, even if
>>>>>>>> incomplete so that we don't
>>>>>>>>  501   // modify the CLD list outside a safepoint.
>>>>>>>>  502   if (class_loader_data() == NULL) {
>>>>>>>>  503     ClassLoaderData* loader_data =
>>>>>>>> ClassLoaderData::the_null_class_loader_data();
>>>>>>>>
>>>>>>>> I can see that this works with the current CDS (sharing only
>>>>>>>> classes loader
>>>>>>>> with the NULL class loader).  Will it work if the shared classes
>>>>>>>> are loaded
>>>>>>>> with an application class loader?
>>>>>>>
>>>>>>> I hope Ioi can answer that.  I assume so because the classes don't
>>>>>>> change class loaders if restoring shareable info fails.
>>>>>>>
>>>>>>> Ioi pointed out to me (not on open list) that I missed making
>>>>>>> link_method() and creating the constant pool resolved references
>>>>>>> array restartable, so I'm testing a fix for that also and I'll post
>>>>>>> an update later.   But you've seen most of it.
>>>>>>>
>>>>>>> Thanks!!
>>>>>>> Coleen
>>>>>>>>
>>>>>>>> Rest looks good.
>>>>>>>>
>>>>>>>> Jon
>>>>>>>>
>>>>>>>> On 3/25/2014 1:42 PM, Coleen Phillimore wrote:
>>>>>>>>> Summary: Keep class in CLD::_klasses list and mirror created for
>>>>>>>>> CDS classes if OOM during restore_shareable_info(). This keeps
>>>>>>>>> pointers consistent for CMS.
>>>>>>>>>
>>>>>>>>> This change makes restoring the mirror for shared classes
>>>>>>>>> restartable if there is an OOM in the middle of
>>>>>>>>> restore_unshareable_info.   The mirror field in the classes and
>>>>>>>>> CLD link is not cleaned up anymore.  The class isn't marked 
>>>>>>>>> loaded
>>>>>>>>> though so must be loaded again.
>>>>>>>>>
>>>>>>>>> Tested with Stefan's test case with hard coded OOM in jvm, so
>>>>>>>>> can't add the test.   Also ran all the tests against -client and
>>>>>>>>> server with -Xshare:auto and with -XX:+UseConcMarkSweepGC.
>>>>>>>>>
>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497/
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8028497
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>



More information about the hotspot-dev mailing list