RFR(S): 8134103: JVMTI_ERROR_WRONG_PHASE(112): on checking for an interface

Thu Mar 2 23:07:56 UTC 2017

On 3/2/17 14:55, Daniel D. Daugherty wrote:
> On 3/2/17 3:19 PM, serguei.spitsyn at oracle.com wrote:
>> On 3/2/17 14:06, Daniel D. Daugherty wrote:
>>> On 3/2/17 2:24 PM, serguei.spitsyn at oracle.com wrote:
>>>> The updated webrev:
>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8134103-jdi-wrong-phase.2/ 
>>>>
>>>>
>>>> The change at L152-L153 has been reverted.
>>>> Just one sanity check is needed.
>>>
>>> This comment from earlier review is still unresolved:
>>>
>>> src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>>>     L243:          * Immediately close out any commands enqueued from a
>>>     L244:          * previously attached debugger.
>>>
>>>         Perhaps L244 can be change to:
>>>
>>>                    * dead VM or a previously attached debugger.
>>>
>>>         to match the changed line of code:
>>>
>>>             L246         if (gdata->vmDead || command->sessionID != 
>>> currentSessionID) {
>>>
>>> Sorry I didn't notice that you didn't reply to it earlier...
>>
>> Sorry, I missed it - fixed now.
>> I guess, it can be pushed now.
>>
>>
>>>
>>> I presume you are retesting... How often does
>>> hotspot/test/closed/compiler/c1/6507107/HeapwalkingTest.java reproduce
>>> the problem?
>>
>> The repro-rate is pretty high it is about 1/3 - 1/2.
>> It is non-reproducible with the fix.
>
> So just to be clear, with just these code changes in place:
>
> - grab the vmDeathLock for all JDWP command sets instead of just
>   the VirtualMachine command set in the debugLoop thread
> - ignore/close out cmds when gdata->vmDead is true in addition
>   to the existing old session check in the CommandLoop thread

Exactly.

>
> the HeapwalkingTest goes from failing 1/3 -> 1/2 the time to
> not failing in 200 runs so far...
>
> I'd say you nailed this one nicely!

Thanks.
The fix should cover all the symptoms described in the bug dups.
At least, I tried to analyze and cover all theoretically possible cases. :)
My initial fix had more guards.
But then I proved to myself some of the guards are not necessary.

Thanks,
Serguei

>
> Dan
>
>
>> The test is pretty slow.
>> At this point I've got about 200 clean runs.
>> All the jdi/jdwp tests are clean too.
>>
>> Thanks!
>> Serguei
>>
>>>
>>> Dan
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 3/2/17 11:21, serguei.spitsyn at oracle.com wrote:
>>>>> Dan,
>>>>>
>>>>> Thank you for reviewing!
>>>>> I was waiting for your comments.
>>>>>
>>>>>
>>>>> On 3/2/17 06:59, Daniel D. Daugherty wrote:
>>>>>> On 3/1/17 8:49 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Please, review the JDK 10 fix for:
>>>>>>>   https://bugs.openjdk.java.net/browse/JDK-8134103
>>>>>>>
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2017/hotspot/8134103-jdi-wrong-phase.1/ 
>>>>>>>
>>>>>>
>>>>>> src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>>>>>>     old L152:             } else if (gdata->vmDead &&
>>>>>>     old L153:              ((cmd->cmdSet) != 
>>>>>> JDWP_COMMAND_SET(VirtualMachine))) {
>>>>>>
>>>>>>         The old code used to set the error condition when the VM
>>>>>>         is dead and the command was not in the VirtualMachine
>>>>>>         command set. With the new code:
>>>>>>
>>>>>>             L150:             } else if (gdata->vmDead) {
>>>>>>
>>>>>>         The error condition is now set for all command sets
>>>>>>         including the VirtualMachine command set. Minimally
>>>>>>         that means that this comment needs work:
>>>>>>
>>>>>>             L152:                  * VirtualMachine cmdSet 
>>>>>> quietly ignores some cmds
>>>>>>             L153:                  * after VM death, so, it sends 
>>>>>> it's own errors.
>>>>>>
>>>>>>         since you are no longer letting the VirtualMachine cmdSet 
>>>>>> send
>>>>>>         its own errors.
>>>>>
>>>>> Agreed, good catch, thanks!
>>>>>
>>>>>
>>>>>>
>>>>>>         It's not clear to me why you are now setting the error
>>>>>>         condition for VirtualMachine cmds instead of letting those
>>>>>>         cmds send their own errors.
>>>>>>
>>>>>>         See more comments below for your summary.
>>>>>
>>>>> Most likely, you are right here.
>>>>> I specifically looked at the VM commands but overlooked the ones 
>>>>> that are silently ignored.
>>>>> For instance: Resume (9) or exit (10).
>>>>>
>>>>> So, I will need to restore the L152+L153.
>>>>> Please, let me re-test this update and then I'll send another webrev.
>>>>> Interesting that no test has caught this as there is a very tiny 
>>>>> gap for such an intermittent issue to appear.
>>>>>
>>>>>
>>>>>>
>>>>>> src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>>>>>>     L243:          * Immediately close out any commands enqueued 
>>>>>> from a
>>>>>>     L244:          * previously attached debugger.
>>>>>>
>>>>>>         Perhaps L244 can be change to:
>>>>>>
>>>>>>                    * dead VM or a previously attached debugger.
>>>>>>
>>>>>>         to match the changed line of code:
>>>>>>
>>>>>>             L246         if (gdata->vmDead || command->sessionID 
>>>>>> != currentSessionID) {
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Summary:
>>>>>>>   This is an intermittent issue in the debugger back-end (JDWP 
>>>>>>> agent)
>>>>>>>   that impacts the nightly testing stability.
>>>>>>
>>>>>> Congrats on tracking down this elusive bug!
>>>>>
>>>>> Thanks!
>>>>>
>>>>>>
>>>>>>
>>>>>>>   The fix adds check guards of gdata->vmDead condition to the:
>>>>>>>    - debugLoop (JDWP Event Helper thread) and
>>>>>>>    - commandLoop (JDWP Transport Listener thread)
>>>>>>>   The commands are ignored in the DEAD phase.
>>>>>>
>>>>>> In the debugLoop, you aren't ignoring the commands, you
>>>>>> are setting an error condition.
>>>>>
>>>>> Right, I meant: ignored == not executed. :)
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>>   The check guard in the debugLoop already existed but only for 
>>>>>>> VirtualMachine
>>>>>>>   command set, so it has been extended to commands from all JDWP 
>>>>>>> command sets.
>>>>>>
>>>>>> I'm reading the existing code exactly opposite of what you
>>>>>> say here. The gdata->vmDead check applied to all command
>>>>>> sets except for the VirtualMachine command set. I agree
>>>>>> that you've extended it to all command sets, but we're
>>>>>> not ignoring the commands. We are now setting the error
>>>>>> condition for all command sets.
>>>>>
>>>>> Agreed.
>>>>>
>>>>>
>>>>>>
>>>>>> Maybe I'm missing something here. Perhaps I've been away
>>>>>> from this code for too long... :-)
>>>>>
>>>>> No, it is that I'm still learning this code with your help. :-)
>>>>>
>>>>>
>>>>> Thanks a lot!
>>>>> Serguei
>>>>>
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>   I suspect, this bug could also cause some of the timeout and 
>>>>>>> socket related issues.
>>>>>>>
>>>>>>>
>>>>>>> Testing:
>>>>>>>   The fix was tested with the nsk.jdi, jtreg com/sun/jdi and 100 
>>>>>>> runs
>>>>>>>   of the 
>>>>>>> hotspot/test/closed/compiler/c1/6507107/HeapwalkingTest.java.
>>>>>>>   The test results are very clean.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>
>>>>>
>>>>
>>>
>>
>