RFR 8028497: SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)
Erik Helin
erik.helin at oracle.com
Tue Apr 8 15:43:42 UTC 2014
Coleen,
the change http://cr.openjdk.java.net/~coleenp/8028497_4/ looks good.
Thanks for taking care of this patch!
Erik
On 2014-04-04 17:28, Coleen Phillimore wrote:
>
> Erik,
> Thank you for the discussion and thorough code reviews. See below.
> (other reviewers: there's a new version of webrev, read down a little).
>
> On 4/1/14 4:32 AM, Erik Helin wrote:
>> On 2014-04-01 05:30, Coleen Phillimore wrote:
>>>
>>> Erik,
>>>
>>> You are right. The old code would use the Klass->mirror link to clear
>>> out the mirror->Klass link so that we didn't have a pointer to a
>>> deallocated Klass in a mirror that might still be live for CMS marking.
>>> The new code doesn't set the Klass->mirror link unless the mirror fields
>>> are completely allocated, so we could crash if we deallocated a Klass
>>> before the mirror because the mirror would still be pointing to the
>>> class.
>>>
>>> I've caught failed allocation in mirror fields so that the link is reset
>>> in this change. The only change is to the function create_mirror calls
>>> initialize_mirror_fields.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_3/
>>
>> In javaClasses.cpp:
>> void java_lang_Class::create_mirror(KlassHandle k, Handle, TRAPS) {
>> ...
>> initialize_mirror_fields(k, mirror, protection_domain, THREAD);
>> if (HAS_PENDING_EXCEPTION) {
>> // If any of the fields throws an exception like OOM remove the
>> // klass field from the mirror so GC doesn't follow it after the
>> // klass has been deallocated.
>> java_lang_Class::set_klass(mirror(), k());
>> return; ^
>> } |
>> |
>> +------------------------------------------+
>> |
>> +-- Do you really want to use k() here? I thought the idea was to do:
>> java_lang_Class::set_klass(mirror(), NULL);
>>
>
> I had meant to change this to NULL.
>> Setting the Klass that the mirror is mirroring will hit the following
>> check in instanceMirrorKlass.cpp, oop_oop_iterate macro:
>>
>> int InstanceMirrorKlass::oop_oop_iterate##nv_suffix(...) { \
>> ... \
>> if_do_metadata_checked(closure, nv_suffix) { \
>> Klass* klass = java_lang_Class::as_Klass(obj); \
>> /* We'll get NULL for primitive mirrors. */ \
>> if (klass != NULL) { \
>> closure->do_klass##nv_suffix(klass); \
>> } \
>> }
>>
>> Setting the Klass that the mirror is mirroring to NULL will make the
>> mirror look like a primitive mirror (this is the way we encode
>> primitive mirrors). Will that be a problem?
>>
>
> No, this is fine. I added a comment that this mirror isn't a primitive
> type but it's also not representative of a class. I had internal code
> to generate lots of these mirrors and CMS and G1 work fine because they
> don't follow the klass. We don't want them to. We also clean up these
> orphaned mirrors, which I also verified manually.
>
> I also tested heap walking with temporary code to create lots of
> orphaned mirrors. I tried to create an internal test or a white box
> test but wasn't able to generate the right set of circumstances to
> exercise this code. If necessary, we can open another bug to add a
> test but I found it noreg-hard.
>
> This is also tested with Stefan's TreeMapTest.java test case that also
> required temporary code in the jvm.
>
> So yes, I think we want to be able to set the Klass field in the mirror
> to null. If other bugs fall out because of this in the future, I think
> we should fix the other bugs separately.
>
> New webrev with only this classfile/javaClasses.cpp change is here.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_4/
>
> Thanks!
> Coleen
>> Thanks,
>> Erik
>>
>>> Tested with vm.parallel_class_loading.testlist which do hit this case.
>>>
>>> Thanks!
>>> Coleen
>>>
>>>
>>> On 3/27/14 1:11 PM, Erik Helin wrote:
>>>> Hi Coleen,
>>>>
>>>> thanks for taking on this tricky bug! However, I think there is an
>>>> issue with your fix :(
>>>>
>>>> This case is for "normal" (non-CDS) class loading:
>>>> 1. We create the mirror
>>>> 2. We set up the link from the mirror to the Klass
>>>> 3. We fail somewhere before setting the link from the Klass to the
>>>> mirror, for example when allocating the Java object for the init
>>>> lock.
>>>> 4. The destructor for ClassFileParser is run and we add the Klass to
>>>> the
>>>> deallocate list.
>>>> 5. A CMS concurrent cycle is started, it will treat all objects in eden
>>>> + the survivor area as roots (including the mirror).
>>>> 6. The above concurrent cycle enters the "final remark" phase and does
>>>> class unloading which will call free_deallocate_list() which will
>>>> return the memory for the Klass added to the deallocate list in step
>>>> 4 to Metaspace.
>>>> 7. The concurrent CMS cycle ends (the mirror is still in eden or young)
>>>> 8. A new concurrent CMS cycle is started, the mirror will once again be
>>>> treated as a root since it is still in eden or young (no YC has been
>>>> done).
>>>> 9. During the initial marking, CMS will now follow the pointer to the
>>>> Klass that the mirror is mirroring, but we returned this memory to
>>>> Metaspace in step 6.
>>>> 10. We now crash :(
>>>>
>>>> Unfortunately, I don't have any suggestion on how to solve the above
>>>> scenario at the moment.
>>>>
>>>> Thanks,
>>>> Erik
>>>>
>>>> On 2014-03-27 16:04, Coleen Phillimore wrote:
>>>>>
>>>>> On 3/27/14 8:39 AM, Stefan Karlsson wrote:
>>>>>>
>>>>>> On 2014-03-27 03:00, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> I have a new version of this change based on code that Ioi noticed
>>>>>>> that I was missing for making restoring the method's adapters
>>>>>>> restartable if something fails during restore_unshareable_info, and
>>>>>>> the constant pool resolved references array.
>>>>>>> Please review this new code (only method.* and instanceKlass.cpp and
>>>>>>> constantPool.cpp have differences). Reran tests with CDS and CMS.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/
>>>>>>
>>>>>> This seems good. I'd prefer if someone with more experience with CDS
>>>>>> also Reviews the change.
>>>>>
>>>>> Thanks, Ioi said he'd look at it.
>>>>>
>>>>>>
>>>>>> Small comment:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/src/share/vm/oops/instanceKlass.cpp.udiff.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> - methodHandle m(THREAD, methods->at(index2));
>>>>>> - m()->link_method(m, CHECK);
>>>>>> - // restore method's vtable by calling a virtual function
>>>>>> - m->restore_vtable();
>>>>>> + Method *m = methods->at(index2);
>>>>>> + m->restore_unshareable_info(CHECK);
>>>>>>
>>>>>>
>>>>>> Do we really want to use a raw Method * here. I know that we wrap the
>>>>>> method in a handle inside m->restore_unshareable_info, but it's not
>>>>>> clear from the this context. I'm thinking about this issue:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8004815
>>>>>>
>>>>>> On the other hand, I guess we can't redefine a class before it has
>>>>>> been defined. But maybe it would be good to not promote the usage of
>>>>>> Method* over calls that might hit a Safepoint.
>>>>>>
>>>>>
>>>>> I think you're right, we don't use 'm' after the potential
>>>>> safepoint and
>>>>> we can't really redefine this class yet anyway but I'll restore the
>>>>> code
>>>>> because we don't want to encourage uses of unhandled Methods unless
>>>>> proven to be harmless.
>>>>>
>>>>> Thanks!
>>>>> Coleen
>>>>>> thanks,
>>>>>> StefanK
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>> On 3/26/14 5:15 PM, Coleen Phillimore wrote:
>>>>>>>>
>>>>>>>> Hi Jon,
>>>>>>>> Thank you for looking at the code changes.
>>>>>>>>
>>>>>>>> On 3/26/14 12:24 PM, Jon Masamitsu wrote:
>>>>>>>>> Coleen,
>>>>>>>>>
>>>>>>>>> Did you remove Class_obj_allocate() because it was only
>>>>>>>>> called in that one spot and use obj_allocate() did most
>>>>>>>>> of the work?
>>>>>>>>
>>>>>>>> I deleted it because GC doesn't need to know how these javaClasses
>>>>>>>> are set up and I spent way to long looking for this code and didn't
>>>>>>>> find it where it belongs, so I removed this. This
>>>>>>>> Class_obj_allocate
>>>>>>>> is an artifact of the permgen code.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/instanceMirrorKlass.cpp.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 373 // the size in the mirror size itself.
>>>>>>>>>
>>>>>>>>> Drop the second "size". You know how we GC guys are
>>>>>>>>> getting with our comments :-).
>>>>>>>>
>>>>>>>> Yes, I noticed that. Thanks - fixed.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/klass.cpp.frames.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 498 void Klass::restore_unshareable_info(TRAPS) {
>>>>>>>>> 499 // If an exception happened during CDS restore, some of
>>>>>>>>> these fields may already be
>>>>>>>>> 500 // set. We leave the class on the CLD list, even if
>>>>>>>>> incomplete so that we don't
>>>>>>>>> 501 // modify the CLD list outside a safepoint.
>>>>>>>>> 502 if (class_loader_data() == NULL) {
>>>>>>>>> 503 ClassLoaderData* loader_data =
>>>>>>>>> ClassLoaderData::the_null_class_loader_data();
>>>>>>>>>
>>>>>>>>> I can see that this works with the current CDS (sharing only
>>>>>>>>> classes loader
>>>>>>>>> with the NULL class loader). Will it work if the shared classes
>>>>>>>>> are loaded
>>>>>>>>> with an application class loader?
>>>>>>>>
>>>>>>>> I hope Ioi can answer that. I assume so because the classes don't
>>>>>>>> change class loaders if restoring shareable info fails.
>>>>>>>>
>>>>>>>> Ioi pointed out to me (not on open list) that I missed making
>>>>>>>> link_method() and creating the constant pool resolved references
>>>>>>>> array restartable, so I'm testing a fix for that also and I'll post
>>>>>>>> an update later. But you've seen most of it.
>>>>>>>>
>>>>>>>> Thanks!!
>>>>>>>> Coleen
>>>>>>>>>
>>>>>>>>> Rest looks good.
>>>>>>>>>
>>>>>>>>> Jon
>>>>>>>>>
>>>>>>>>> On 3/25/2014 1:42 PM, Coleen Phillimore wrote:
>>>>>>>>>> Summary: Keep class in CLD::_klasses list and mirror created for
>>>>>>>>>> CDS classes if OOM during restore_shareable_info(). This keeps
>>>>>>>>>> pointers consistent for CMS.
>>>>>>>>>>
>>>>>>>>>> This change makes restoring the mirror for shared classes
>>>>>>>>>> restartable if there is an OOM in the middle of
>>>>>>>>>> restore_unshareable_info. The mirror field in the classes and
>>>>>>>>>> CLD link is not cleaned up anymore. The class isn't marked
>>>>>>>>>> loaded
>>>>>>>>>> though so must be loaded again.
>>>>>>>>>>
>>>>>>>>>> Tested with Stefan's test case with hard coded OOM in jvm, so
>>>>>>>>>> can't add the test. Also ran all the tests against -client and
>>>>>>>>>> server with -Xshare:auto and with -XX:+UseConcMarkSweepGC.
>>>>>>>>>>
>>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497/
>>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8028497
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>
More information about the hotspot-dev
mailing list