8076110: VM crash when class is redefined with Instrumentation.redefineClasses

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Fri Jun 12 17:29:18 UTC 2015


On 6/12/15 4:49 AM, Andreas Eriksson wrote:
>
>
> On 2015-06-12 00:36, serguei.spitsyn at oracle.com wrote:
>> On 6/11/15 7:32 AM, Coleen Phillimore wrote:
>>>
>>>
>>> On 6/11/15 7:20 AM, Andreas Eriksson wrote:
>>>> Hi,
>>>>
>>>> Deleting the resolution errors from InstanceKlass::deallocate 
>>>> contents works, but I'm unsure if the deletion code should be 
>>>> placed inside the is_shared() if-block or not.
>>>> I don't think that CDS should affect our decision to delete the 
>>>> cached resolution errors, but I might be missing something.
>>>> Coleen, do you know if it affects anything?
>>>
>>> The resolution errors aren't shared in the CDS archive so it's fine 
>>> to delete them outside the is_shared block.  The other things are in 
>>> the archive so have to be excluded.
>>>
>>> I'm still trying to figure out why deleting them inside 
>>> ConstantPool::deallocate_contents() deleted them outside a 
>>> safepoint.  I didn't find that code path.
>>
>> The ConstantPool::deallocate_contents() is called only from the 
>> free_metadata(ClassLoaderData* loader_data, T md).
>>
>> There are just two free_metadata() calls that are related to the 
>> ConstantPool::deallocate_contents():
>>     classfile/classLoaderData.cpp: 
>> MetadataFactory::free_metadata(this, (ConstantPool*)m);
>>     oops/instanceKlass.cpp: 
>> MetadataFactory::free_metadata(loader_data, constants());
>>
>
> There is at least one more place, in the ClassFileParser destructor, 
> at classfile/classFileParser.cpp:4271.
> // Destructor to clean up if there's an error
> ClassFileParser::~ClassFileParser() {
>    MetadataFactory::free_metadata(_loader_data, _cp);
> It seems to be called when the parsing has gone wrong, which can 
> happen when not at safepoint.
> Attached a hs_err where this has happened.

Good finding!

The ConstantPool::deallocate_contents() does not have a call to the 
SystemDictionary::delete_resolution_error() anymore.
So, it looks like the crash in the attached hs_err has been already fixed.

Thanks,
Serguei

>
> - Andreas
>
>> The first one is in the ClassLoaderData::free_deallocate_list() which 
>> has an assert:
>>   assert(SafepointSynchronize::is_at_safepoint(), "only called at 
>> safepoint");
>>
>> The second call is in the InstanceKlass::deallocate_contents() that 
>> is also called only
>> from the ClassLoaderData::free_deallocate_list().
>>
>> So that even if there is such a code path then it is by a mistake 
>> that has to be fixed.
>>
>> The issue with having the call to delete_resolution_error() in the 
>> ConstantPool::deallocate_contents() is that
>> it's call in the InstanceKlass::deallocate_contents( ) is conditional:
>>   379   if (constants() != NULL) {
>>   380     assert (!constants()->on_stack(), "shouldn't be called if anything is onstack");
>>   381     if (!constants()->is_shared()) {
>>   382       MetadataFactory::free_metadata(loader_data, constants());
>>   383     }
>> so that the delete_resolution_error() will not be called for 
>> constants()->is_shared() == true.
>>
>> Thanks,
>> Serguei
>>
>>
>>> You wouldn't happen to still have an hs_err file from that experiment?
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> New webrev:
>>>> http://cr.openjdk.java.net/~aeriksso/8076110/webrev.01/
>>>>
>>>> Regards,
>>>> Andreas
>>>>
>>>> On 2015-06-09 01:49, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Andreas,
>>>>>
>>>>> I agree, it is a good idea to delete the resolution errors from 
>>>>> ConstantPool::deallocate_contents.
>>>>> So, I'm waiting for next webrev.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>> On 6/8/15 9:53 AM, Andreas Eriksson wrote:
>>>>>> Hi Coleen, thanks for looking at this.
>>>>>>
>>>>>> That might work, I'll make a build to test it.
>>>>>>
>>>>>> Just FYI:
>>>>>> My first attempt to fix this was actually to delete the 
>>>>>> resolution errors from ConstantPool::deallocate_contents, but 
>>>>>> that was sometimes called when not at safepoint, which made me do 
>>>>>> this change instead.
>>>>>>
>>>>>> Regards,
>>>>>> Andreas
>>>>>>
>>>>>>
>>>>>> On 2015-06-06 00:07, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> Thank you so much for fixing this, Andreas!
>>>>>>>
>>>>>>> I was wondering if you could delete the resolution errors in 
>>>>>>> InstanceKlass::deallocate_contents instead at the point where 
>>>>>>> you free the constant pool?   This code is at a safepoint at 
>>>>>>> this time, even with the CMS collector.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>> On 6/4/15 5:23 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Added the SVC list as it impacts the Serviceability as well.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>> On 6/4/15 4:56 AM, Andreas Eriksson wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Please review this fix of a crash because of a class 
>>>>>>>>> redefinition race.
>>>>>>>>>
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8076110
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~aeriksso/8076110/webrev.00/
>>>>>>>>>
>>>>>>>>> The crash happens when the redefine occurs while the redefined 
>>>>>>>>> method is still running.
>>>>>>>>> This is because VM_RedefineClasses::redefine_single_class 
>>>>>>>>> deletes the resolution errors at redefine time, but the 
>>>>>>>>> running method might still need them.
>>>>>>>>>
>>>>>>>>> This is fixed by moving the deletion of the cached resolution 
>>>>>>>>> errors to when the {add,purge}_previous_version functions have 
>>>>>>>>> come to the conclusion that no running methods need the old 
>>>>>>>>> class any more (and therefore no one needs the constant pool 
>>>>>>>>> and its associated resolution errors either).
>>>>>>>>>
>>>>>>>>> A jtreg regression test that reproduces the problem using 
>>>>>>>>> redefineclasses and objectweb.asm is included in the change.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Andreas
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list