RFR(XS): 8222005: ClassRedefinition crashes with: guarantee(false) failed: OLD and/or OBSOLETE method(s) found

Daniel D. Daugherty daniel.daugherty at oracle.com
Wed Jun 10 20:57:11 UTC 2020


Hi Serguei,

Sorry for the late review...

On 5/28/20 7:16 PM, serguei.spitsyn at oracle.com wrote:
> Hi Coleen,
>
> The updated webrev version is:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.3/

src/hotspot/share/oops/cpCache.cpp
     nit - please update copyright year before you push

     L570:   log_trace(redefine, class, update, constantpool)
     L571:         ("cpc %s entry update: %s", entry_type, 
new_method->external_name());
         nit - The continued line indent on the other line you touched 
in this
                file is two spaces. This one is six... Not your bug, but 
can you
                fix it while you are here?

     L816:         ("cpcache check found old method entry: class: %s, 
old: %d, obsolete: %d, method: %s\n",
         nit - I don't think you want the "\n" here.

src/hotspot/share/oops/klassVtable.cpp
     L1004:         ("vtable check found old method entry: class: %s 
old: %d obsolete: %d, %s\n",
     L1319:         ("itable check found old method entry: class: %s 
old: %d obsolete: %d, %s\n",
         nit - I don't think you want the "\n" here.

         In the new log_trace() call in cpCache.cpp, you include a
         label for the method output:

         L816:         ("cpcache check found old method entry: class: 
%s, old: %d, obsolete: %d, method: %s\n",

         but you don't here. I think you should.

src/hotspot/share/prims/jvmtiRedefineClasses.cpp
     L74: // this flag is global as the constructor does not reset it
         nit - Please s/this/This/ and add a ':' to the end.

     old L3586:     if (!_has_null_class_loader && ik->class_loader() == 
NULL) {
     old L3587:       return;
         This optimization has been here for a long time! Thanks for the
         explanation in "3) Optimization based on the flag 
_has_null_class_loader"
         below... I'm probably the one that got that wrong so long ago...

     L3601:     // and needs cpchache method entries adjusted. For 
simplicity, the cpcache
         typo - s/cpchache/cpcache/

     old L3616:     if (!ik->is_being_redefined()) {
         Nice explanation on L3599-3604 for why this optimization is
         not a good idea.


Thumbs up! I only have nits and typos above. I don't need to see
another webrev.

Dan



>
> It has your suggestions addressed:
>  - remove log_is_enabled conditions
>  - move ResourceMark's out of loops
>
> Thanks,
> Serguei
>
>
> On 5/28/20 14:44, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>> Thank you a lot for reviewing this!
>>
>>
>> On 5/28/20 12:48, coleen.phillimore at oracle.com wrote:
>>> Hi Serguei,
>>> Sorry for the delay reviewing this again.
>>>
>>> On 5/18/20 3:30 AM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Coleen and potential reviewers,
>>>>
>>>> Now, the webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/
>>>>
>>>> has a complete fix for all three failure modes related to the 
>>>> guarantee about OLD and OBSOLETE methods.
>>>>
>>>> The root cause are the following optimizations:
>>>>
>>>>  1) Optimization based on the flag ik->is_being_redefined():
>>>>     The problem is that the cpcache method entries of such classes 
>>>> are not being adjusted.
>>>>     It is explained below in the initial RFR summary.
>>>>     The fix is to get rid of this optimization.
>>>
>>> This seems like a good thing to do even though (actually especially 
>>> because) I can't re-imagine the logic that went into this optimization.
>>
>> Probably, I've not explained it well enough.
>> The logic was that the class marked as is_being_redefined was 
>> considered as being redefined in the current redefinition operation.
>> For classes redefined in current redefinition the cpcache is empty, 
>> so there is  nothing to adjust.
>> The problem is that classes can be marked as is_being_redefined by 
>> doit_prologue of one of the following redefinition operations.
>> In such a case, the VM_RedefineClasses::CheckClass::do_klass fails 
>> with this guarantee.
>> It is because the VM_RedefineClasses::CheckClass::do_klass does not 
>> have this optimization
>> and does not skip such classes as the 
>> VM_RedefineClasses::AdjustAndCleanMetadata::do_class.
>> Without this catch this issue could have unknown consequences in the 
>> future execution far away from the root cause.
>>
>>>>
>>>>  2) Optimization for array classes based on the flag 
>>>> _has_redefined_Object.
>>>>     The problem is that the vtable method entries are not adjusted 
>>>> for array classes.
>>>>     The array classes have to be adjusted even if the 
>>>> java.lang.Object was redefined
>>>>     by one of previous VM_RedefineClasses operation, not only if it 
>>>> was redefined in
>>>>     the current VM_RedefineClasses operation. The fix is is follow 
>>>> this requirement.
>>>
>>> This I can't understand.  The redefinitions are serialized in 
>>> safepoints, so why would you need to replace vtable entries for 
>>> arrays if java.lang.Object isn't redefined in this safepoint?
>> The VM_RedefineClasses::CheckClass::do_klass fails with the same 
>> guarantee because of this.
>> It never fails this way with this optimization relaxed.
>> I've already broke my head trying to understand it.
>> It can be because of another bug we don't know yet.
>>
>>>>
>>>>  3) Optimization based on the flag _has_null_class_loader which 
>>>> assumes that the Hotspot
>>>>     does not support delegation from the bootstrap class loader to 
>>>> auser-defined class
>>>>     loader.The assumption is that if the current class being 
>>>> redefined has a user-defined
>>>>     classloader as its defining class loader, then allclasses 
>>>> loaded by the bootstrap
>>>>     class loader can be skipped for vtable/itable method entries 
>>>> adjustment.
>>>>     The problem is that this assumption is not really correct. 
>>>> There are classes that
>>>>     still need the adjustment. For instance, the class 
>>>> java.util.IdentityHashMap$KeyIterator
>>>>     loaded by the bootstrap class loader has the vtable/itable 
>>>> references to the method:
>>>> java.util.Iterator.forEachRemaining(java.util.function.Consumer)
>>>>     The class java.util.Iterator is defined by a user-defined class 
>>>> loader.
>>>>     The fix is to get rid of this optimization.
>>>
>>> Also with this optimization, I'm not sure what the logic was that 
>>> determined that this was safe, so it's best to remove it.  Above 
>>> makes sense.
>>
>> I don't know the full theory behind this optimization. We only have a 
>> comment.
>>
>>
>>>> All three failure modes are observed with the -Xcomp flag.
>>>> With all three fixes above in place, the Kitchensink does not fail 
>>>> with this guarantee anymore.
>>>
>>>
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/cpCache.cpp.udiff.html
>>>
>>> For logging, the log_trace function will also repeat the 'if' 
>>> statement and not allocate the external_name() if logging isn't 
>>> specified, so you don't need the 'if' statement above.
>>>
>>> + if (log_is_enabled(Trace, redefine, class, update)) {
>>> + log_trace(redefine, class, update, constantpool)
>>> + ("cpc %s entry update: %s", entry_type, new_method->external_name());
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/src/hotspot/share/oops/klassVtable.cpp.udiff.html
>>>
>>> Same in two cases here, and you could move the ResourceMark outside 
>>> the loop at the top.
>>
>> Good suggestions, taken.
>>
>> Thanks!
>> Serguei
>>
>>>
>>> Thanks,
>>> Coleen
>>>>
>>>> There is still a JIT compiler relted failure:
>>>> https://bugs.openjdk.java.net/browse/JDK-8245128
>>>>     Kitchensink fails with: assert(destination == (address)-1 || 
>>>> destination == entry) failed: b) MT-unsafe modification of inline cache
>>>>
>>>> I also saw this failure but just once:
>>>> https://bugs.openjdk.java.net/browse/JDK-8245126
>>>>     Kitchensink fails with: assert(!method->is_old()) failed: 
>>>> Should not be installing old methods
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 5/15/20 15:14, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Thanks a lot for review!
>>>>> Good suggestion, will use it.
>>>>>
>>>>> In fact, I've found two more related problems with the same guarantee.
>>>>> One is with vtable method entries adjustment and another with itable.
>>>>> This webrev version includes a fix for the vtable related issue:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.2/
>>>>>
>>>>> I'm still investigating the itable related issue.
>>>>>
>>>>> It is interesting that the Kitchensink with Instrumentation 
>>>>> modules enabled is like a Pandora box full of surprises.
>>>>> New problems are getting discovered after some road blocks are 
>>>>> removed.
>>>>> I've just filed a couple of compiler bugs discovered in this mode 
>>>>> of testing:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8245126
>>>>>     Kitchensink fails with: assert(!method->is_old()) failed: 
>>>>> Should not be installing old methods
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8245128
>>>>>     Kitchensink fails with: assert(destination == (address)-1 || 
>>>>> destination == entry) failed: b) MT-unsafe modification of inline 
>>>>> cache
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 5/15/20 05:12, coleen.phillimore at oracle.com wrote:
>>>>>>
>>>>>> Serguei,
>>>>>>
>>>>>> Good find!!  The fix looks good.  I'm sure the optimization 
>>>>>> wasn't noticeable and thank you for the additional comments.
>>>>>>
>>>>>> There is a Method::external_name() function that I believe prints 
>>>>>> all the things you want in the logging here:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/src/hotspot/share/oops/cpCache.cpp.udiff.html
>>>>>>
>>>>>> I don't need to see another webrev if you make this change.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 5/14/20 12:26 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Please, review a fix for The Kitchensink bug:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8222005
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2020/jvmti-redef.1/
>>>>>>>
>>>>>>> Summary:
>>>>>>>   The VM_RedefineClasses::doit() uses two helper classes to walk 
>>>>>>> all VM classes.
>>>>>>>   First is AdjustAndCleanMetadata to adjust method entries in 
>>>>>>> the vtables/itables/cpcaches.
>>>>>>>   Second is CheckClass to check that adjustments for all method 
>>>>>>> entries are correct.
>>>>>>>   The Kitchensink test is failing with two modes:
>>>>>>>     - guarantee(false) failed: OLD and/or OBSOLETE method(s) 
>>>>>>> found in the
>>>>>>>       VM_RedefineClasses::CheckClass::do_klass()
>>>>>>>     - SIGSEGV in the 
>>>>>>> ConstantPoolCacheEntry::get_interesting_method_entry() in context
>>>>>>>       of VM_RedefineClasses::CheckClass::do_klass() execution
>>>>>>>
>>>>>>>   The second failure mode is rare. In is before the first one in 
>>>>>>> the code path.
>>>>>>>   The root cause of both is that the 
>>>>>>> VM_RedefineClasses::AdjustAndCleanMetadata::do_klass()
>>>>>>>   is skipping the cpcache update for classes that are being 
>>>>>>> redefined assuming they are
>>>>>>>   being redefined by the current VM_RedefineClasses operation. 
>>>>>>> In such cases, the adjustment
>>>>>>>   is not needed as the cpcache is empty. The problem is that the 
>>>>>>> assumption above is wrong.
>>>>>>>   The class can also be redefined by another VM_RedefineClasses 
>>>>>>> operation which has already
>>>>>>>   executed its doit_prologue. The cpcache djustment for such 
>>>>>>> class is necessary.
>>>>>>>   The fix is to always call the 
>>>>>>> cp_cache->adjust_method_entries() even if the class is
>>>>>>>   being redefined by the current VM_RedefineClasses operation. 
>>>>>>> It is possible to skip it
>>>>>>>   but it will add extra complexity to the code.
>>>>>>>   The fix also includes minor tweak in the cpCache.cpp to 
>>>>>>> include method's class name to
>>>>>>>   the redefinition cpcache log.
>>>>>>>
>>>>>>> Testing:
>>>>>>>   Ran Kitchensink test locally on a Linux server with the 
>>>>>>> Instrumentation module enabled.
>>>>>>>   The test does not fail anymore.
>>>>>>>   In progress, a mach5 tiers 1-5 and runs and separate mach5 
>>>>>>> Kitchensink run.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>
>>>>>
>>>>
>>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20200610/082caffa/attachment-0001.htm>


More information about the serviceability-dev mailing list