RFR 8055008: Clean up code that saves the previous versions of redefined classes

Tue Sep 2 17:42:50 UTC 2014

On 8/28/14 2:39 PM, Coleen Phillimore wrote:
>
> Serguei,   Thank you for the code review and discussions!
>
> This led me to find a bug and here is the new webrev, but there are 
> comments below to explain:
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8055008_3/

src/share/vm/oops/instanceKlass.hpp
     No comments.

src/share/vm/oops/instanceKlass.cpp
     No comments.

src/share/vm/oops/method.hpp
     No comments.

src/share/vm/classfile/metadataOnStackMark.hpp
     No comments.

src/share/vm/classfile/metadataOnStackMark.cpp
     No comments.

src/share/vm/classfile/classLoaderData.cpp
     No comments.

src/share/vm/prims/jvmtiRedefineClasses.hpp
     No comments.

src/share/vm/prims/jvmtiRedefineClasses.cpp
     line 138:   MetadataOnStackMark md_on_stack(true);
         So this "new has_redefined_a_class" parameter that we're
         passing "true" here allows us to call CodeCache::
         alive_nmethods_do() earlier. That call was previously keyed
         off a call to JvmtiExport::has_redefined_a_class() which
         wouldn't be true until after the first round of
         RedefineClasses() was pretty much done.

         So I assume we were not updating some things in the
         CodeCache during our first RedefineClasses() call so we
         had some incorrect (probably obsolete) methods being called?

         Update: Now that I've read the replies below the webrev
         link I see that my questions are already answered.

src/share/vm/prims/jvmtiImpl.cpp
     No comments.

src/share/vm/code/nmethod.cpp
     No comments.

src/share/vm/memory/universe.cpp
     No comments.

test/runtime/RedefineTests/RedefineFinalizer.java
     No comments.

test/runtime/RedefineTests/RedefineRunningMethods.java
     No comments.

Thumbs up!

Dan

>
>
> On 8/27/14, 8:20 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Coleen,
>>
>>
>> src/share/vm/code/nmethod.cpp
>>
>>   Nice simplification.
>>
>>
>> src/share/vm/memory/universe.cpp
>>
>>   No comments
>>
>>
>> src/share/vm/oops/instanceKlass.cpp
>>
>>   A minor question about two related fragments:
>>
>> 3505       // next previous version
>> 3506       last = pv_node;
>> 3507       pv_node = pv_node->previous_versions();
>> 3508       version++;
>>
>>   Should the version be incremented to the case 3462-3469 as at the 
>> line 3508?
>>   It is not a big issue as the version number is used at the RC_TRACE 
>> line only:
>>      3496 method->signature()->as_C_string(), j, version));
>>
>>
>
> Yes, I fixed that.
>
>>   We still have no consensus on the following question:
>>   Can a non-running EMCP method become running again after the flag 
>> was cleared?
>>
>> 3487           if (!method->on_stack()) {
>> 3488             if (method->is_running_emcp()) {
>> 3489               method->set_running_emcp(false);  // no 
>> breakpoints for non-running methods
>> 3490             }
>>
>>   Just wanted to be sure what is the current view on this. :)
>>
>
> Not unless there is a bug, such that we hit a safepoint with the EMCP 
> method not in any place that MetadataOnStackMark can get to it but we 
> have a Method* pointer to it.  This situation could result in the EMCP 
> Method* getting deallocated as well.
>
> Stefan filed a bug a while ago that Method* pointers are not safe 
> without a methodHandle but we currently don't have any verification 
> that they are properly handled, such as CheckUnhandledOops.
>
>>
>> src/share/vm/oops/instanceKlass.hpp
>>
>>   No comments
>>
>>
>> src/share/vm/oops/method.hpp
>>
>>   Just some questions.
>>   Usefulness of this new function depends on basic ability of a 
>> non-running method to become running again:
>>        is_running_emcp()
>>
>>   The questions are:
>>    - How precise is the control of this bit?
>
> This bit is set during purging previous versions when all methods have 
> been marked on_stack() if found in various places.  The bit is only 
> used for setting breakpoints.
>
>>    - Should we clear this bit after all method invocations have been 
>> finished?
>>    - Can a EMCP method become running again after the bit was cleared 
>> or not set?
>>
>>
>> src/share/vm/prims/jvmtiImpl.cpp
>>
>>   300       if (method->is_running_emcp() &&
>>
>>    Is it possible that an EMCP method becomes running after the bit 
>> is_running_emcp() is set?
>>     Do we miss breakpoints in such a case?
>>
>
> I do think this is only possible if there is a bug and the Method* 
> would probably be deallocated first and crash.  Since this code is 
> called during class unloading, a crash is something I'd expect to see.
>
> So I added an additional check to find resurrected emcp methods:
>
>           if (!method->on_stack()) {
>             if (method->is_running_emcp()) {
>               method->set_running_emcp(false); // no breakpoints for 
> non-running methods
>             }
>           } else {
>             assert (method->is_obsolete() || method->is_running_emcp(),
>                     "emcp method cannot run after emcp bit is cleared");
>             // RC_TRACE macro has an embedded ResourceMark
>             RC_TRACE(0x00000200,
>               ("purge: %s(%s): prev method @%d in version @%d is alive",
>
>
> Unfortunately, this assert did fire in NSK testing.  And I just now 
> found the bug and fixed it (see above webrev).  We weren't walking the 
> code cache for the first redefinition, and an emcp method was in the 
> code cache.  The methods in the code cache are not exactly running but 
> we don't know if they could be called directly by running compiled code.
>
> This fix is to add a boolean flag to MetadataOnStackMark.  I reran the 
> consistently failing tests, nsk.quick tests, and the jtreg tests in 
> java/lang/instrument on the fix.
>>
>>  303         RC_TRACE(0x00000800, ("%sing breakpoint in %s(%s)",
>>  304           meth_act == &Method::set_breakpoint ? "set" : "clear",
>>
>>    The change from  "sett" to "set" seems to be wrong (see the line 
>> 303):
>>
>
> Yes, I didn't realize that.  Dan pointed this out to me also.
>>
>> src/share/vm/prims/jvmtiRedefineClasses.cpp
>> src/share/vm/prims/jvmtiRedefineClasses.hpp
>>
>>   No comments
>>
>
> Thank you Serguei for the in-depth comments and pushing on these 
> issues.  I think this has improved the code tremendously.  I really 
> appreciate it!
>
> Coleen
>>
>> Thanks,
>> Serguei
>>
>>
>> On 8/22/14 7:26 AM, Coleen Phillimore wrote:
>>>
>>> Thanks Dan, Serguei and Roland for discussion on this change. The 
>>> latest version is here:
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8055008_2/
>>>
>>> Changes from the last version (don't have the setup to do a diff 
>>> webrev, sorry) are that I have a new flag to mark running emcp 
>>> methods so we can set breakpoints in only those.  Also, confirmed 
>>> that we need to clean_weak_method_links in obsolete methods too. 
>>> Made changes per review comments.  Also, added more to the test so 
>>> that all tracing in InstanceKlass comes out. Reran all tests (nsk, 
>>> jck, jtreg, java/lang/instrument).
>>>
>>> Thanks to whoever made command line processing handle hex numbers!
>>>
>>> Thanks,
>>> Coleen
>>>
>>> On 8/20/14, 9:26 PM, Coleen Phillimore wrote:
>>>>
>>>> On 8/20/14, 6:45 PM, Daniel D. Daugherty wrote:
>>>>> On 8/20/14 2:01 PM, Coleen Phillimore wrote:
>>>>>> On 8/20/14, 3:49 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>
>>>>>>>> If an EMCP method is not running, should we save it on a 
>>>>>>>> previous version list anyway so that we can make it obsolete if 
>>>>>>>> it's redefined and made obsolete?
>>>>>>>
>>>>>>> I hope, Dan will catch me if I'm wrong...
>>>>>>>
>>>>>>> I think, we should not.
>>>>>>> An EMCP method can not be made obsolete if it is not running.
>>>>>>>
>>>>>>
>>>>>>
>>>>>> It should be this way otherwise we'd have to hang onto things 
>>>>>> forever.
>>>>>
>>>>> An EMCP method should only be made obsolete if a RedefineClasses() or
>>>>> RetransformClasses() operation made it so. We should not be 
>>>>> leveraging
>>>>> off the obsolete-ness attribute to solve a life-cycle problem.
>>>>
>>>> Yes, this was my error in the change.  This is why I made things 
>>>> obsolete if they were not running.  I think I can't reuse this 
>>>> flag.  My latest changes add a new explicit flag (which we have 
>>>> space for in Method*).
>>>>>
>>>>> In the pre-PGR world, we could trust GC to make a completely unused
>>>>> EMCP method collectible and eventually our weak reference would go
>>>>> away. Just because an EMCP method is not on a stack does not mean
>>>>> that it is not used so we need a different way to determine whether
>>>>> it is OK to no longer track an EMCP method.
>>>>
>>>> Our on_stack marking is supposed to look at all the places where GC 
>>>> used to look so I think we can use on_stack to track the lifecycle 
>>>> of EMCP methods.  If the EMCP method is somewhere, we will find it!
>>>>
>>>> I'm running tests on the latest change, but am also waiting for 
>>>> confirmation from Roland because we were only cleaning out 
>>>> MethodData for EMCP methods and not for running obsolete methods 
>>>> and I think we need to do that for obsolete methods also, which my 
>>>> change does now.  I think it was a bug.
>>>>
>>>> Thanks Dan for remembering all of this for me!
>>>>
>>>> Coleen
>>>>>
>>>>>
>>>>>>
>>>>>>> BTW, I'm reviewing the webrev too, but probably it'd be better 
>>>>>>> to switch to updated webrev after it is ready.
>>>>>>
>>>>>> Yes, this is a good idea.  I figured out why I made emcp methods 
>>>>>> obsolete, and I'm fixing that as well as Dan's comments. Thanks!
>>>>>
>>>>> Cool! I'm looking forward to the next review.
>>>>>
>>>>> Dan
>>>>>
>>>>>
>>>>>>
>>>>>> Coleen
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>
>>>>>
>>>>
>>>
>>
>