RFR 8028497: SIGSEGV at ClassLoaderData::oops_do(OopClosure*, KlassClosure*, bool)
Coleen Phillimore
coleen.phillimore at oracle.com
Tue Apr 1 11:39:27 UTC 2014
Oh, that's why testing went so well.
Coleen
On 4/1/14 4:32 AM, Erik Helin wrote:
> On 2014-04-01 05:30, Coleen Phillimore wrote:
>>
>> Erik,
>>
>> You are right. The old code would use the Klass->mirror link to clear
>> out the mirror->Klass link so that we didn't have a pointer to a
>> deallocated Klass in a mirror that might still be live for CMS marking.
>> The new code doesn't set the Klass->mirror link unless the mirror fields
>> are completely allocated, so we could crash if we deallocated a Klass
>> before the mirror because the mirror would still be pointing to the
>> class.
>>
>> I've caught failed allocation in mirror fields so that the link is reset
>> in this change. The only change is to the function create_mirror calls
>> initialize_mirror_fields.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497_3/
>
> In javaClasses.cpp:
> void java_lang_Class::create_mirror(KlassHandle k, Handle, TRAPS) {
> ...
> initialize_mirror_fields(k, mirror, protection_domain, THREAD);
> if (HAS_PENDING_EXCEPTION) {
> // If any of the fields throws an exception like OOM remove the
> // klass field from the mirror so GC doesn't follow it after the
> // klass has been deallocated.
> java_lang_Class::set_klass(mirror(), k());
> return; ^
> } |
> |
> +------------------------------------------+
> |
> +-- Do you really want to use k() here? I thought the idea was to do:
> java_lang_Class::set_klass(mirror(), NULL);
>
> Setting the Klass that the mirror is mirroring will hit the following
> check in instanceMirrorKlass.cpp, oop_oop_iterate macro:
>
> int InstanceMirrorKlass::oop_oop_iterate##nv_suffix(...) { \
> ... \
> if_do_metadata_checked(closure, nv_suffix) { \
> Klass* klass = java_lang_Class::as_Klass(obj); \
> /* We'll get NULL for primitive mirrors. */ \
> if (klass != NULL) { \
> closure->do_klass##nv_suffix(klass); \
> } \
> }
>
> Setting the Klass that the mirror is mirroring to NULL will make the
> mirror look like a primitive mirror (this is the way we encode
> primitive mirrors). Will that be a problem?
>
> Thanks,
> Erik
>
>> Tested with vm.parallel_class_loading.testlist which do hit this case.
>>
>> Thanks!
>> Coleen
>>
>>
>> On 3/27/14 1:11 PM, Erik Helin wrote:
>>> Hi Coleen,
>>>
>>> thanks for taking on this tricky bug! However, I think there is an
>>> issue with your fix :(
>>>
>>> This case is for "normal" (non-CDS) class loading:
>>> 1. We create the mirror
>>> 2. We set up the link from the mirror to the Klass
>>> 3. We fail somewhere before setting the link from the Klass to the
>>> mirror, for example when allocating the Java object for the init
>>> lock.
>>> 4. The destructor for ClassFileParser is run and we add the Klass to
>>> the
>>> deallocate list.
>>> 5. A CMS concurrent cycle is started, it will treat all objects in eden
>>> + the survivor area as roots (including the mirror).
>>> 6. The above concurrent cycle enters the "final remark" phase and does
>>> class unloading which will call free_deallocate_list() which will
>>> return the memory for the Klass added to the deallocate list in step
>>> 4 to Metaspace.
>>> 7. The concurrent CMS cycle ends (the mirror is still in eden or young)
>>> 8. A new concurrent CMS cycle is started, the mirror will once again be
>>> treated as a root since it is still in eden or young (no YC has been
>>> done).
>>> 9. During the initial marking, CMS will now follow the pointer to the
>>> Klass that the mirror is mirroring, but we returned this memory to
>>> Metaspace in step 6.
>>> 10. We now crash :(
>>>
>>> Unfortunately, I don't have any suggestion on how to solve the above
>>> scenario at the moment.
>>>
>>> Thanks,
>>> Erik
>>>
>>> On 2014-03-27 16:04, Coleen Phillimore wrote:
>>>>
>>>> On 3/27/14 8:39 AM, Stefan Karlsson wrote:
>>>>>
>>>>> On 2014-03-27 03:00, Coleen Phillimore wrote:
>>>>>>
>>>>>> I have a new version of this change based on code that Ioi noticed
>>>>>> that I was missing for making restoring the method's adapters
>>>>>> restartable if something fails during restore_unshareable_info, and
>>>>>> the constant pool resolved references array.
>>>>>> Please review this new code (only method.* and instanceKlass.cpp and
>>>>>> constantPool.cpp have differences). Reran tests with CDS and CMS.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/
>>>>>
>>>>> This seems good. I'd prefer if someone with more experience with CDS
>>>>> also Reviews the change.
>>>>
>>>> Thanks, Ioi said he'd look at it.
>>>>
>>>>>
>>>>> Small comment:
>>>>>
>>>>> http://cr.openjdk.java.net/~coleenp/8028497_2/src/share/vm/oops/instanceKlass.cpp.udiff.html
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> - methodHandle m(THREAD, methods->at(index2));
>>>>> - m()->link_method(m, CHECK);
>>>>> - // restore method's vtable by calling a virtual function
>>>>> - m->restore_vtable();
>>>>> + Method *m = methods->at(index2);
>>>>> + m->restore_unshareable_info(CHECK);
>>>>>
>>>>>
>>>>> Do we really want to use a raw Method * here. I know that we wrap the
>>>>> method in a handle inside m->restore_unshareable_info, but it's not
>>>>> clear from the this context. I'm thinking about this issue:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8004815
>>>>>
>>>>> On the other hand, I guess we can't redefine a class before it has
>>>>> been defined. But maybe it would be good to not promote the usage of
>>>>> Method* over calls that might hit a Safepoint.
>>>>>
>>>>
>>>> I think you're right, we don't use 'm' after the potential
>>>> safepoint and
>>>> we can't really redefine this class yet anyway but I'll restore the
>>>> code
>>>> because we don't want to encourage uses of unhandled Methods unless
>>>> proven to be harmless.
>>>>
>>>> Thanks!
>>>> Coleen
>>>>> thanks,
>>>>> StefanK
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 3/26/14 5:15 PM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> Hi Jon,
>>>>>>> Thank you for looking at the code changes.
>>>>>>>
>>>>>>> On 3/26/14 12:24 PM, Jon Masamitsu wrote:
>>>>>>>> Coleen,
>>>>>>>>
>>>>>>>> Did you remove Class_obj_allocate() because it was only
>>>>>>>> called in that one spot and use obj_allocate() did most
>>>>>>>> of the work?
>>>>>>>
>>>>>>> I deleted it because GC doesn't need to know how these javaClasses
>>>>>>> are set up and I spent way to long looking for this code and didn't
>>>>>>> find it where it belongs, so I removed this. This
>>>>>>> Class_obj_allocate
>>>>>>> is an artifact of the permgen code.
>>>>>>>
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/instanceMirrorKlass.cpp.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 373 // the size in the mirror size itself.
>>>>>>>>
>>>>>>>> Drop the second "size". You know how we GC guys are
>>>>>>>> getting with our comments :-).
>>>>>>>
>>>>>>> Yes, I noticed that. Thanks - fixed.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8028497/src/share/vm/oops/klass.cpp.frames.html
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 498 void Klass::restore_unshareable_info(TRAPS) {
>>>>>>>> 499 // If an exception happened during CDS restore, some of
>>>>>>>> these fields may already be
>>>>>>>> 500 // set. We leave the class on the CLD list, even if
>>>>>>>> incomplete so that we don't
>>>>>>>> 501 // modify the CLD list outside a safepoint.
>>>>>>>> 502 if (class_loader_data() == NULL) {
>>>>>>>> 503 ClassLoaderData* loader_data =
>>>>>>>> ClassLoaderData::the_null_class_loader_data();
>>>>>>>>
>>>>>>>> I can see that this works with the current CDS (sharing only
>>>>>>>> classes loader
>>>>>>>> with the NULL class loader). Will it work if the shared classes
>>>>>>>> are loaded
>>>>>>>> with an application class loader?
>>>>>>>
>>>>>>> I hope Ioi can answer that. I assume so because the classes don't
>>>>>>> change class loaders if restoring shareable info fails.
>>>>>>>
>>>>>>> Ioi pointed out to me (not on open list) that I missed making
>>>>>>> link_method() and creating the constant pool resolved references
>>>>>>> array restartable, so I'm testing a fix for that also and I'll post
>>>>>>> an update later. But you've seen most of it.
>>>>>>>
>>>>>>> Thanks!!
>>>>>>> Coleen
>>>>>>>>
>>>>>>>> Rest looks good.
>>>>>>>>
>>>>>>>> Jon
>>>>>>>>
>>>>>>>> On 3/25/2014 1:42 PM, Coleen Phillimore wrote:
>>>>>>>>> Summary: Keep class in CLD::_klasses list and mirror created for
>>>>>>>>> CDS classes if OOM during restore_shareable_info(). This keeps
>>>>>>>>> pointers consistent for CMS.
>>>>>>>>>
>>>>>>>>> This change makes restoring the mirror for shared classes
>>>>>>>>> restartable if there is an OOM in the middle of
>>>>>>>>> restore_unshareable_info. The mirror field in the classes and
>>>>>>>>> CLD link is not cleaned up anymore. The class isn't marked
>>>>>>>>> loaded
>>>>>>>>> though so must be loaded again.
>>>>>>>>>
>>>>>>>>> Tested with Stefan's test case with hard coded OOM in jvm, so
>>>>>>>>> can't add the test. Also ran all the tests against -client and
>>>>>>>>> server with -Xshare:auto and with -XX:+UseConcMarkSweepGC.
>>>>>>>>>
>>>>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8028497/
>>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8028497
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
More information about the hotspot-dev
mailing list