RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE method(s) found

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Wed Jun 3 15:49:38 UTC 2020



On 5/28/20 5:44 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> Thank you a lot for reviewing this!
>
>
> On 5/28/20 12:48, coleen.phillimore at oracle.com wrote:
>> Hi Serguei,
>> Sorry for the delay reviewing this again.
>>
>> On 5/18/20 3:30 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Coleen and potential reviewers,
>>>
>>> Now, the webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/
>>>
>>> has a complete fix for all three failure modes related to the 
>>> guarantee about OLD and OBSOLETE methods.
>>>
>>> The root cause are the following optimizations:
>>>
>>>  1) Optimization based on the flag ik->is_being_redefined():
>>>     The problem is that the cpcache method entries of such classes 
>>> are not being adjusted.
>>>     It is explained below in the initial RFR summary.
>>>     The fix is to get rid of this optimization.
>>
>> This seems like a good thing to do even though (actually especially 
>> because) I can't re-imagine the logic that went into this optimization.
>
> Probably, I've not explained it well enough.
> The logic was that the class marked as is_being_redefined was 
> considered as being redefined in the current redefinition operation.
> For classes redefined in current redefinition the cpcache is empty, so 
> there is  nothing to adjust.
> The problem is that classes can be marked as is_being_redefined by 
> doit_prologue of one of the following redefinition operations.
> In such a case, the VM_RedefineClasses::CheckClass::do_klass fails 
> with this guarantee.
> It is because the VM_RedefineClasses::CheckClass::do_klass does not 
> have this optimization
> and does not skip such classes as the 
> VM_RedefineClasses::AdjustAndCleanMetadata::do_class.
> Without this catch this issue could have unknown consequences in the 
> future execution far away from the root cause.

Yes this makes sense.  Two threads are redefining a set of classes in 
parallel,  not at a safepoint:

t1: class A, B, C => marks them all as is_being_redefined
t2: class D, E, F => marks these as is_being_redefined
safepoint
classes A, B, C are finishing redefinition in doit() so have their 
Methods replaced, and with is_being_redefine set for D, E, F the 
optimization was skipping replacing their Methods.  One of these classes 
D could have had a B::foo() in the vtable or cpCache.
crash in the check_classes!
>
>>>
>>>  2) Optimization for array classes based on the flag 
>>> _has_redefined_Object.
>>>     The problem is that the vtable method entries are not adjusted 
>>> for array classes.
>>>     The array classes have to be adjusted even if the 
>>> java.lang.Object was redefined
>>>     by one of previous VM_RedefineClasses operation, not only if it 
>>> was redefined in
>>>     the current VM_RedefineClasses operation. The fix is is follow 
>>> this requirement.
>>
>> This I can't understand.  The redefinitions are serialized in 
>> safepoints, so why would you need to replace vtable entries for 
>> arrays if java.lang.Object isn't redefined in this safepoint?
> The VM_RedefineClasses::CheckClass::do_klass fails with the same 
> guarantee because of this.
> It never fails this way with this optimization relaxed.
> I've already broke my head trying to understand it.
> It can be because of another bug we don't know yet.

Me neither but that's fine.  Remove the optimization!
Coleen

>
>>>
>>>  3) Optimization based on the flag _has_null_class_loader which 
>>> assumes that the Hotspot
>>>     does not support delegation from the bootstrap class loader to 
>>> auser-defined class
>>>     loader.The assumption is that if the current class being 
>>> redefined has a user-defined
>>>     classloader as its defining class loader, then allclasses loaded 
>>> by the bootstrap
>>>     class loader can be skipped for vtable/itable method entries 
>>> adjustment.
>>>     The problem is that this assumption is not really correct. There 
>>> are classes that
>>>     still need the adjustment. For instance, the class 
>>> java.util.IdentityHashMap$KeyIterator
>>>     loaded by the bootstrap class loader has the vtable/itable 
>>> references to the method:
>>> java.util.Iterator.forEachRemaining(java.util.function.Consumer)
>>>     The class java.util.Iterator is defined by a user-defined class 
>>> loader.
>>>     The fix is to get rid of this optimization.
>>
>> Also with this optimization, I'm not sure what the logic was that 
>> determined that this was safe, so it's best to remove it. Above makes 
>> sense.
>
> I don't know the full theory behind this optimization. We only have a 
> comment.
>
>
>>> All three failure modes are observed with the -Xcomp flag.
>>> With all three fixes above in place, the Kitchensink does not fail 
>>> with this guarantee anymore.
>>
>>
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/cpCache.cpp.udiff.html
>>
>> For logging, the log_trace function will also repeat the 'if' 
>> statement and not allocate the external_name() if logging isn't 
>> specified, so you don't need the 'if' statement above.
>>
>> + if (log_is_enabled(Trace, redefine, class, update)) {
>> + log_trace(redefine, class, update, constantpool)
>> + ("cpc %s entry update: %s", entry_type, new_method->external_name());
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/klassVtable.cpp.udiff.html
>>
>> Same in two cases here, and you could move the ResourceMark outside 
>> the loop at the top.
>
> Good suggestions, taken.
>
> Thanks!
> Serguei
>
>>
>> Thanks,
>> Coleen
>>>
>>> There is still a JIT compiler relted failure:
>>> https://bugs.openjdk.java.net/browse/JDK-8245128
>>>     Kitchensink fails with: assert(destination == (address)-1 || 
>>> destination == entry) failed: b) MT-unsafe modification of inline cache
>>>
>>> I also saw this failure but just once:
>>> https://bugs.openjdk.java.net/browse/JDK-8245126
>>>     Kitchensink fails with: assert(!method->is_old()) failed: Should 
>>> not be installing old methods
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 5/15/20 15:14, serguei.spitsyn at oracle.com wrote:
>>>> Hi Coleen,
>>>>
>>>> Thanks a lot for review!
>>>> Good suggestion, will use it.
>>>>
>>>> In fact, I've found two more related problems with the same guarantee.
>>>> One is with vtable method entries adjustment and another with itable.
>>>> This webrev version includes a fix for the vtable related issue:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/
>>>>
>>>> I'm still investigating the itable related issue.
>>>>
>>>> It is interesting that the Kitchensink with Instrumentation modules 
>>>> enabled is like a Pandora box full of surprises.
>>>> New problems are getting discovered after some road blocks are removed.
>>>> I've just filed a couple of compiler bugs discovered in this mode 
>>>> of testing:
>>>> https://bugs.openjdk.java.net/browse/JDK-8245126
>>>>     Kitchensink fails with: assert(!method->is_old()) failed: 
>>>> Should not be installing old methods
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8245128
>>>>     Kitchensink fails with: assert(destination == (address)-1 || 
>>>> destination == entry) failed: b) MT-unsafe modification of inline cache
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 5/15/20 05:12, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Serguei,
>>>>>
>>>>> Good find!!  The fix looks good.  I'm sure the optimization wasn't 
>>>>> noticeable and thank you for the additional comments.
>>>>>
>>>>> There is a Method::external_name() function that I believe prints 
>>>>> all the things you want in the logging here:
>>>>>
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/src/hotspot/share/oops/cpCache.cpp.udiff.html
>>>>>
>>>>> I don't need to see another webrev if you make this change.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>> On 5/14/20 12:26 PM, serguei.spitsyn at oracle.com wrote:
>>>>>> Please, review a fix for The Kitchensink bug:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222005
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/
>>>>>>
>>>>>> Summary:
>>>>>>   The VM_RedefineClasses::doit() uses two helper classes to walk 
>>>>>> all VM classes.
>>>>>>   First is AdjustAndCleanMetadata to adjust method entries in the 
>>>>>> vtables/itables/cpcaches.
>>>>>>   Second is CheckClass to check that adjustments for all method 
>>>>>> entries are correct.
>>>>>>   The Kitchensink test is failing with two modes:
>>>>>>     - guarantee(false) failed: OLD and/or OBSOLETE method(s) 
>>>>>> found in the
>>>>>>       VM_RedefineClasses::CheckClass::do_klass()
>>>>>>     - SIGSEGV in the 
>>>>>> ConstantPoolCacheEntry::get_interesting_method_entry() in context
>>>>>>       of VM_RedefineClasses::CheckClass::do_klass() execution
>>>>>>
>>>>>>   The second failure mode is rare. In is before the first one in 
>>>>>> the code path.
>>>>>>   The root cause of both is that the 
>>>>>> VM_RedefineClasses::AdjustAndCleanMetadata::do_klass()
>>>>>>   is skipping the cpcache update for classes that are being 
>>>>>> redefined assuming they are
>>>>>>   being redefined by the current VM_RedefineClasses operation. In 
>>>>>> such cases, the adjustment
>>>>>>   is not needed as the cpcache is empty. The problem is that the 
>>>>>> assumption above is wrong.
>>>>>>   The class can also be redefined by another VM_RedefineClasses 
>>>>>> operation which has already
>>>>>>   executed its doit_prologue. The cpcache djustment for such 
>>>>>> class is necessary.
>>>>>>   The fix is to always call the cp_cache->adjust_method_entries() 
>>>>>> even if the class is
>>>>>>   being redefined by the current VM_RedefineClasses operation. It 
>>>>>> is possible to skip it
>>>>>>   but it will add extra complexity to the code.
>>>>>>   The fix also includes minor tweak in the cpCache.cpp to include 
>>>>>> method's class name to
>>>>>>   the redefinition cpcache log.
>>>>>>
>>>>>> Testing:
>>>>>>   Ran Kitchensink test locally on a Linux server with the 
>>>>>> Instrumentation module enabled.
>>>>>>   The test does not fail anymore.
>>>>>>   In progress, a mach5 tiers 1-5 and runs and separate mach5 
>>>>>> Kitchensink run.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200603/2f28976b/attachment-0001.htm>


More information about the serviceability-dev mailing list