RFR: 8025834: NPE in Parallel Scavenge with -XX:+CheckUnhandledOops

Mon Oct 21 07:07:19 PDT 2013

This sounds reasonable.
Coleen

On 10/21/2013 04:48 AM, Erik Helin wrote:
> All,
>
> after having discussed this issue some more, I've decided to split it into
> multiple issues:
> - The crash when using -XX:+CheckUnhandledOops
> - Using MethodHandle instead of Method*
> - Using java_mirror() for anonymous classes
>
> I will send out a new webrev with a much smaller change that only solves
> the first issue (the crash with -XX:+CheckUnhandledOops). This review request
> will be labeled (round 2) as in:
> RFR: 8025834: NPE in Parallel Scavenge with -XX:+CheckUnhandledOops (round 2).
>
> For the other two issues, I have created the following two bugs:
> - https://bugs.openjdk.java.net/browse/JDK-8026946
> - https://bugs.openjdk.java.net/browse/JDK-8026948
>
> Everyone, thanks for all your feedback on this change!
> Erik
>
> On 2013-10-18, Erik Helin wrote:
>> Hi David and Serguei,
>>
>> thanks for having a look at this change!
>>
>> On 2013-10-18, David Holmes wrote:
>>> On 18/10/2013 5:58 PM, serguei.spitsyn at oracle.com wrote:
>>>> On 10/18/13 12:24 AM, serguei.spitsyn at oracle.com wrote:
>>>>> On 10/17/13 11:03 PM, David Holmes wrote:
>>>>>> On 18/10/2013 3:49 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Erik,
>>>>>>>
>>>>>>> The fix looks good in general.
>>>>>>>
>>>>>>> But one thing is confusing in the fix.
>>>>>>> Why do you keep both _class_loader and _class_loader_handle in the
>>>>>>> JvmtiBreakpoint class?
>>>>>> Even more confusing to me is the fact the neither of these seem to
>>>>>> actually be used anywhere ???
>>>>> Nice catch, David.
>>>>> I do not see too any of them is really used.
>>>>> Is it a leftover after the permgen elimination?
>>>> Maybe this is a rush judging.
>>>> It depends on the closure->do_oop() that is used for traversing
>>>> I thought that the KeepAliveClosure is used below (basing on the comment).
>>>>
>>>> class JvmtiBreakpoint : public GrowableElement {
>>>>    . . .
>>>>    void oops_do(OopClosure* f)     {
>>>>      // Mark the method loader as live
>>>>      f->do_oop(&_class_loader);
>>>>    }
>>>>
>>>> This oops_do() is not needed if we have handle instead of oop.
>>> Ah! Maybe the only purpose of keeping the class_loader (whether oop
>>> or Handle) is that it is kept alive outside of its normal lifecycle.
>>>
>>> But still we should only need the Handle or the Oop, not both. And
>>> if there is no oop we should not need an oops_do.
>> I will try to explain the problem, the current implementation and then
>> my change.
>>
>> The problem
>> -----------
>>
>> The problem is that JvmtiEnv::SetBreakpoint and
>> JvmtiEnv::ClearBreakpoint are passed a Method*. We must ensure that the
>> class for this Method is kept alive by the GC. We do this by informing
>> the GC that it must visit the class loader for the Method's class when
>> marking. This can be done in two ways:
>> - Putting a Handle on the stack containing an oop to the class loader
>> - Have an oops_do method that we ensure will be called by the GC during
>>    marking
>>
>> A third option is to make sure (by inspecting the code) that no GC can
>> occur that might cause problems, but this is a very brittle solution
>> once a new programmer comes along and adds/removes code.
>>
>> The current implementation
>> --------------------------
>>
>> In JvmtiEnv::SetBreakpoint and JvmtiEnv::ClearBreakpoint, we create a
>> JvmtiBreakpoint on the stack. This JvmtiBreakpoint contains an
>> "unhandled" oop to the class loader of the Method's class.
>>
>> A pointer to the JvmtiBreakpoint allocated on the stack will be passed
>> to VM_ChangeBreakpoint via JvmtiBreakpoints::set/clear.
>> In the doit() method of VM_ChangeBreakpoint,
>> JvmtiBreakpoints::set_at_safepoint will be called. A new JvmtiBreakpoint
>> will be allocated on the heap by JvmtiBreakpoints::set_at_safepoint.
>>
>> The JvmtiBreakpoint on the heap will contain the same oop as the one in
>> the JvmtiBreakpoint allocated on the stack earlier. The oop in the
>> JvmtiBreakpoint allocated on the heap will be found by the GC, because
>> VM_ChangeBreakpoint has an oops_do method (see first email for how this
>> oops_do method is called).
>>
>> Once the VM_ChangeBreakpoint operation is done, then the JvmtiBreakpoint
>> allocated on the heap will either be cleared due to a clear operation
>> (and we do not care about the oop any longer) or it will be placed in
>> JvmtiBreakpoints. JvmtiBreakpoints also has an oops_do which ensures
>> that the oop now will be found by the GC.
>>
>> We will now pop the call stack and return to JvmtiEnv::SetBreakpoint or
>> JvmtiEnv::ClearBreakpoint. No code in JvmtiEnv::SetBreakpoint or
>> JvmtiEnv::ClearBreakpoint is touching the JvmtiBreakpoint after the
>> VM_ChangeBreakpoint operation has been run.
>>
>> Furthermore, no GC can today occur before the call to VMThread::execute
>> in JvmtiBreakpoints::set/clear.
>>
>> This means that the current implementation is safe, but understanding
>> this is quite tricky.
>>
>> The change
>> ----------
>>
>> My proposed solution was the following:
>> - When a JvmtiBreakpoint is allocated on the stack, place an oop to the
>>    Method's class loader in a Handle.
>> - When a JvmtiBreakpoint is allocated on the heap, just store the oop to
>>    Method's class lodear in a field. The JvmtiBreakpoint will be placed
>>    in _jvmti_breakpoints which will be visited by the GC, which makes the
>>    GC call JvmtiBreakpoint::oops_do.
>>
>> Of course, this means that that if a GC occurs before the
>> JvmtiBreakpoint in constructed in JvmtiEnv::SetBreakpoint or
>> JvmtiEnv::ClearBreakpoint, the Method's class might be garbage
>> collected. Ideally, we should wrap the Method* in jvmti_SetBreakpoint
>> and jvmti_ClearBreakpoint in jvmtiEnter.cpp (a cpp file generated from
>> an XML file via an XSL tranformation).
>>
>> As icing on the cake, if someone redefines the method that we are
>> receiving a pointer to, then this code will probably not work at all. I
>> believe (I am not sure) that this should be solved by having a
>> MethodHandle wrapping the Method*.
>>
>> This change is just a first step towards safer code.
>>
>> I've gotten feedback internally that it is hard to understand the
>> new code. I will try to see if I can update the change to make this
>> clearer.
>>
>> Thanks for getting all the way to the end of this email :)
>>
>> Erik
>>
>>> David
>>>
>>>>>> But if we have the Handle then the oop is redundant AFAICS.
>>>>> Right.
>>>>> The issue is that the oop version is used in the oops_do.
>>>>> Not sure if we can get rid of oops_do.
>>>>> It may have an empty body though.
>>>> Getting rid of the oops_do will require more cleanup that needs to be
>>>> accurate.
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>> David
>>>>>>
>>>>>>> Also, you need to run the nsk.jdi.testlist and nsk.jdwp.testlist test
>>>>>>> suites as well.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 10/17/13 2:28 PM, Erik Helin wrote:
>>>>>>>> Hi Mikael and Coleen,
>>>>>>>>
>>>>>>>> thanks for your reviews!
>>>>>>>>
>>>>>>>> On 2013-10-16, Mikael Gerdin wrote:
>>>>>>>>> jvmtiImpl.hpp:
>>>>>>>>> Since clone() uses unhandled oops, and is only supposed to be used
>>>>>>>>> by the VM operation, would it make sense to
>>>>>>>>> assert(SafepointSynchronize::is_at_safepoint())?
>>>>>>>>>
>>>>>>>>> 196   GrowableElement *clone()        {
>>>>>>>>>   197     return new JvmtiBreakpoint(_method, _bci,
>>>>>>>>> _class_loader_handle);
>>>>>>>> Agree, I've updated the patch. A new webrev is located at:
>>>>>>>> http://cr.openjdk.java.net/~ehelin/8025834/webrev.01/
>>>>>>>>
>>>>>>>> On 2013-10-16, Mikael Gerdin wrote:
>>>>>>>>> jvmtiEnv.cpp:
>>>>>>>>> Have you verified that the generated JVMTI entry point contains a
>>>>>>>>> ResourceMark or is it just not needed?
>>>>>>>>> -  ResourceMark rm;
>>>>>>>>> +  HandleMark hm;
>>>>>>>> The JVMTI entry point does not contain a ResourceMark. However, I have
>>>>>>>> verified that a ResourceMark is not needed for jvmtiEnv::SetBreakpoint
>>>>>>>> nor for jvmtiEnv::ClearBreapoint. The only codes that needs a
>>>>>>>> ResourceMark is JvmtiBreakpoints::prints, but that method already
>>>>>>>> has a
>>>>>>>> ResourceMark.
>>>>>>>>
>>>>>>>> On 2013-10-16, Coleen Phillimore wrote:
>>>>>>>>> Did you run the nsk.jvmti.testlist tests too though?
>>>>>>>> No, I had not run the nsk.jvmti.testlist test, but I have now :)
>>>>>>>>
>>>>>>>> I run both with and without the patch on the latest hsx/hotspot-gc. I
>>>>>>>> also run with and without -XX:+CheckUnhandledOops. The results were
>>>>>>>> all
>>>>>>>> the same: 598 passed an 11 failed (the same tests for all
>>>>>>>> combinations).
>>>>>>>> So, the patch does not introduce any regressions for this test suite.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Erik
>>>>>>>>
>>>>>>>> On 2013-10-16, Mikael Gerdin wrote:
>>>>>>>>> Erik,
>>>>>>>>>
>>>>>>>>> (it's not necessary to cross-post between hotspot-dev and
>>>>>>>>> hotspot-gc-dev, so I removed hotspot-gc from the CC list)
>>>>>>>>>
>>>>>>>>> On 2013-10-16 18:09, Erik Helin wrote:
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> this patch fixes an issue where an oop in JvmtiBreakpoint,
>>>>>>>>>> JvmtiBreakpoint::_class_loader, was found by the unhandled oop
>>>>>>>>>> detector.
>>>>>>>>>>
>>>>>>>>>> Instead of registering the oop as an unhandled oop, which would have
>>>>>>>>>> worked, I decided to wrap the oop in a handle as long as it is on
>>>>>>>>>> the
>>>>>>>>>> stack.
>>>>>>>>>>
>>>>>>>>>> A JvmtiBreakpoint is created on the stack by the two methods
>>>>>>>>>> JvmtiEnv::SetBreakpoint and JvmtiEnv::ClearBreakpoint. This
>>>>>>>>>> JvmtiBreakpoint is only created to carry the Method*, jlocation
>>>>>>>>>> and oop
>>>>>>>>>> to a VM operation, VM_ChangeBreakpoints. VM_ChangeBreakpoints will,
>>>>>>>>>> when
>>>>>>>>>> at a safepoint, allocate a new JvmtiBreakpoint on the native
>>>>>>>>>> heap, copy
>>>>>>>>>> the values from the stack allocated JvmtiBreakpoint and then
>>>>>>>>>> place/clear the
>>>>>>>>>> newly alloacted JvmtiBreakpoint in
>>>>>>>>>> JvmtiCurrentBreakpoints::_jvmti_breakpoints.
>>>>>>>>>>
>>>>>>>>>> I have updated to the code to check that the public constructor
>>>>>>>>>> is only
>>>>>>>>>> used to allocate JvmtiBreakpoints on the stack (to warn a future
>>>>>>>>>> programmer if he/she decides to allocate one on the heap). The
>>>>>>>>>> class_loader oop is now wrapped in a Handle for stack allocated
>>>>>>>>>> JvmtiBreakpoints.
>>>>>>>>>>
>>>>>>>>>> Due to the stack allocated JvmtiBreakpoint having the oop in a
>>>>>>>>>> handle,
>>>>>>>>>> the oops_do method of VM_ChangeBreakpoints can be removed. This also
>>>>>>>>>> makes the oop in the handle safe for use after the
>>>>>>>>>> VM_ChangeBreakpoint
>>>>>>>>>> operation is finished.
>>>>>>>>>>
>>>>>>>>>> The unhandled oop in the JvmtiBreakpoint allocated on the heap
>>>>>>>>>> will be
>>>>>>>>>> visited by the GC via jvmtiExport::oops_do ->
>>>>>>>>>> JvmtiCurrentBreakpoints::oops_do -> JvmtiBreakpoints::oops_do ->
>>>>>>>>>> GrowableCache::oops_do -> JvmtiBreakpoint::oops_do, since it is
>>>>>>>>>> being
>>>>>>>>>> added to.
>>>>>>>>>>
>>>>>>>>>> I've also removed some dead code to simplify the change:
>>>>>>>>>> - GrowableCache::insert
>>>>>>>>>> - JvmtiBreakpoint::copy
>>>>>>>>>> - JvmtiBreakpoint::lessThan
>>>>>>>>>> - GrowableElement::lessThan
>>>>>>>>>>
>>>>>>>>>> Finally, I also formatted JvmtiEnv::ClearBreakpoint and
>>>>>>>>>> Jvmti::SetBreakpoint exactly the same to highlight that they
>>>>>>>>>> share all
>>>>>>>>>> code except one line. Unfortunately, I was not able to remove
>>>>>>>>>> this code
>>>>>>>>>> duplication in a good way.
>>>>>>>>>>
>>>>>>>>>> Webrev:
>>>>>>>>>> http://cr.openjdk.java.net/~ehelin/8025834/webrev.00/
>>>>>>>>> jvmtiImpl.hpp:
>>>>>>>>> Since clone() uses unhandled oops, and is only supposed to be used
>>>>>>>>> by the VM operation, would it make sense to
>>>>>>>>> assert(SafepointSynchronize::is_at_safepoint())?
>>>>>>>>>
>>>>>>>>> 196   GrowableElement *clone()        {
>>>>>>>>>   197     return new JvmtiBreakpoint(_method, _bci,
>>>>>>>>> _class_loader_handle);
>>>>>>>>>
>>>>>>>>> jvmtiImpl.cpp:
>>>>>>>>> No comments.
>>>>>>>>>
>>>>>>>>> jvmtiEnv.cpp:
>>>>>>>>> Have you verified that the generated JVMTI entry point contains a
>>>>>>>>> ResourceMark or is it just not needed?
>>>>>>>>> -  ResourceMark rm;
>>>>>>>>> +  HandleMark hm;
>>>>>>>>>
>>>>>>>>> Otherwise the code change looks good.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> One thing that you didn't describe here, but which was related to
>>>>>>>>> the bug (which we discussed) was the fact that the old code tried to
>>>>>>>>> "do the right thing" WRT CheckUnhandledOops, but it incorrectly
>>>>>>>>> added a Method*:
>>>>>>>>>
>>>>>>>>> thread->allow_unhandled_oop((oop*)&_method);
>>>>>>>>>
>>>>>>>>> We should take care to find other such places where we try to put a
>>>>>>>>> non-oop in allow_unhandled_oop(), perhaps checking is_oop_or_null in
>>>>>>>>> the unhandled oops code.
>>>>>>>>>
>>>>>>>>> /Mikael
>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>> - JPRT
>>>>>>>>>> - The four tests that were failing are now passing
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Erik
>>>>>>>>>>