RFR(s) 8153711: [REDO] JDWP: Memory Leak: GlobalRefs never deleted when processing invokeMethod command
serguei.spitsyn at oracle.com
serguei.spitsyn at oracle.com
Tue May 3 10:21:16 UTC 2016
Hi Severin,
Please, find my comments below.
On 5/2/16 01:44, Severin Gehwolf wrote:
> On Fri, 2016-04-29 at 12:33 -0700, serguei.spitsyn at oracle.com wrote:
>> On 4/29/16 01:56, Severin Gehwolf wrote:
>>> Hi Serguei,
>>>
>>> On Fri, 2016-04-29 at 01:34 -0700, serguei.spitsyn at oracle.com wrote:
>>>> Hi Severin,
>>>>
>>>> The fix looks good in general.
>>>> I'm testing both fixes together at the moment.
>>> That is JDK-8154529 and JDK-8153711? Yes, that's what I've done too.
>>>
>>>> A couple of questions...
>>>>
>>>> It seems, there are more places where an invokerLock critical section is missed.
>>> Right.
>>>
>>>> The following functions:
>>>> - invoker_enableInvokeRequests
>>> This should be fixed with the patch for JDK-8154529
>>>
>>>> - invokeConstructor
>>>> - invokeStatic
>>>> - invokeNonvirtual
>>>> - invokeVirtual
>>>> - saveGlobalRef
>>> Correct. The intent would be to fix the callsites of saveGlobalRef. If
>>> we fix invoke* then saveGlobalRef should not be an issue. I didn't want
>>> to include this in this fix since it's pretty hairy and would like to
>>> fix this in incremental steps.
>>>
>>>> The first function is easy to fix.
>>>> The last 5 functions are called from the invoker_doInvoke() that we already had a problem with.
>>>> I'm puzzled with the question: How to synchronize and avoid deadlocks at the same time?
>>> I'm not sure yet. Perhaps locking only while saveGlobalRef is being
>>> called in invoke* functions will help.
>>>
>>>> I'm inclined to let your fix go (if the testing is Ok) and file one more bug on the remaining sync issues.
>>> Please keep me in the loop about your test results.
>>
>> Both the JTreg com/sun/jdi and the co-located nsk.jdi tests are all passed.
>>
>> I also ran the 4 previously failed tests in big loops of 1000 runs:
>> com/sun/jdi/InvokeTest.java
>> com/sun/jdi/InvokeHangTest.java
>> com/sun/jdi/InterfaceMethodsTest.java
>> com/sun/jdi/OomDebugTest.java (new test introduced in the webrev)
> I suggest to run InvokeTest, InvokeHangTest and InterfaceMethodsTest in
> a loop. Those never failed for me in such a scenario.
>
>> The OomDebugTest.java failed with a timeout (most likely, a deadlock).
>> Please, find the OomDebugTest.jtr file in attachments.
> Correct. This is what I was seeing. See the last comment of the bug:
> https://bugs.openjdk.java.net/browse/JDK-8153711
Need to check if it is the same as I'm seeing.
>
> It has the jstack output of the hung OomDebugTestTarg JVM. I'm not
> convinced this is the same failure we were seeing in JDK-4858370 since
> the stack suggests it's doing a GC upon a newInstance of a primitive
> array. It also does not seem like the same issue as the deadlocks
> exposed by locking during an invoke, because it was reproducible fairly
> consistently with InvokeHangTest.
Agreed.
>
> It looks to me like a new issue. Probably one which was there before,
> but is only exposed by the new test. The new test stress-tests the GC
> with a debugger attached. Of course, this is going to be hard to prove
> since it would just run out of memory prior the patch.
Thanks.
Your analysis seems to be reasonable.
I tend to agree with it but need more time to convince myself.
> Thoughts?
Let me double check the above.
If the analysis is correct then I'd suggest to file new bug and push
your fix as is.
Thanks,
Serguei
>
> Cheers,
> Severin
>
>
>
>> Thanks,
>> Serguei
>>
>>
>>
>>> Thanks for your help!
>>>
>>> Cheers,
>>> Severin
>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 4/28/16 02:00, Severin Gehwolf wrote:
>>>>> On Tue, 2016-04-19 at 19:32 -0700, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Severin,
>>>>>>>
>>>>>>> I postpone a push for this fix.
>>>>>>>
>>>>>>> There are two nsk.jdi test failures (they look like deadlocks):
>>>>>>> nsk/jdi/ObjectReference/invokeMethod/invokemethod012 FAIL(TIMEOUT)
>>>>>>> nsk/jdi/Scenarios/invokeMethod/popframes001 FAIL(TIMEOUT)
>>>>>>>
>>>>>>> and one jtreg sun/com/jdi failure (it looks like a deadlock too):
>>>>>>> com/sun/jdi/InvokeHangTes.java
>>>>>>>
>>>>>>>
>>>>>>> I'll double check if these failures are regressions caused by your fix
>>>>>>> or not.
>>>>>> I confirm, the failures above are new regressions introduced by the fix.
>>>>>> The tests fail consistently with the fix and do not fail without the fix.
>>>>> OK this was caused by the locking done in invoker_doInvoke(). Note that
>>>>> holding either of them invoker lock or event handler lock causes this.
>>>>>
>>>>> Here is a new webrev with those hunks removed. It's sufficient to grab
>>>>> the locks again in invoke_completeInvokeRequest() when clearing the
>>>>> global references in order to not get those failures we've seen when
>>>>> the fix for JDK-4858370 was in place.
>>>>>
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8153711/webrev.02/
>>>>>
>>>>> Testing done:
>>>>> - com/sun/jdi/InvokeTest.java com/sun/jdi/InvokeHangTest.java and
>>>>> sun/jdi/InterfaceMethodsTest.java does not fail in 1500 runs.
>>>>> - regular com/sun/jdi test set: no regressions
>>>>>
>>>>> Note that there are some rare cases where OomDebugTest times out which
>>>>> seems to be caused by the GC, though. See the bug for details. Since
>>>>> this problem is rare, it's still worthwhile having that test included.
>>>>> If it turns out a problem in practice we could consider disabling the
>>>>> test.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> Cheers,
>>>>> Severin
>>>>
>>
More information about the serviceability-dev
mailing list