3-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)
Dmitry Samersoff
dmitry.samersoff at oracle.com
Sat Nov 1 10:13:44 UTC 2014
Serguei,
Thank you for good finding. This approach looks much better for me.
The fix looks good.
Is it necessary to release vmDeathLock locks at
eventHandler.c:1244 before call
EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?
-Dmitry
On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote:
>
> It is 3-rd round of review for:
> https://bugs.openjdk.java.net/browse/JDK-6988950
>
> New webrev:
>
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/
>
>
> Summary
>
> For failing scenario, please, refer to the 1-st round RFR below.
>
> I've found what is missed in the jdwp agent shutdown and decided to
> switch from a workaround to a real fix.
>
> The agent VM_DEATH callback sets the gdata field: gdata->vmDead = 1.
> The agent debugLoop_run() has a guard against the VM shutdown:
>
> 165 } else if (gdata->vmDead &&
> 166 ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
> 167 /* Protect the VM from calls while dead.
> 168 * VirtualMachine cmdSet quietly ignores some cmds
> 169 * after VM death, so, it sends it's own errors.
> 170 */
> 171 outStream_setError(&out, JDWP_ERROR(VM_DEAD));
>
>
> However, the guard above does not help much if the VM_DEATH event
> happens in the middle of a command execution.
> There is a lack of synchronization here.
>
> The fix introduces new lock (vmDeathLock) which does not allow to
> execute the commands
> and the VM_DEATH event callback concurrently.
> It should work well for any function that is used in implementation of
> the JDWP_COMMAND_SET(VirtualMachine) .
>
>
> Testing:
> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>
>
> Thanks,
> Serguei
>
>
> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote:
>> The updated webrev:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/
>>
>>
>> The changes are:
>> - added a comment recommended by Staffan
>> - removed the ignore_wrong_phase() call from function classSignature()
>>
>> The classSignature() function is called in 16 places.
>> Most of them do not tolerate the NULL in place of returned signature
>> and will crash.
>> I'm not comfortable to fix all the occurrences now and suggest to
>> return to this
>> issue after gaining experience with more failure cases that are still
>> expected.
>> The failure with the classSignature() involved was observed only once
>> in the nightly
>> and should be extremely rare reproducible.
>> I'll file a placeholder bug if necessary.
>>
>> Thanks,
>> Serguei
>>
>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for:
>>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>>
>>>
>>> Open webrev:
>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/
>>>
>>>
>>>
>>> Summary:
>>>
>>> The failing scenario:
>>> The debugger and the debuggee are well aware a VM shutdown has
>>> been started in the target process.
>>> The debugger at this point is not expected to send any commands
>>> to the JDWP agent.
>>> However, the JDI layer (debugger side) and the jdwp agent
>>> (debuggee side)
>>> are not in sync with the consumer layers.
>>>
>>> One reason is because the test debugger does not invoke the JDI
>>> method VirtualMachine.dispose().
>>> Another reason is that the Debugger and the debuggee processes
>>> are uneasy to sync in general.
>>>
>>> As a result the following steps are possible:
>>> - The test debugger sends a 'quit' command to the test debuggee
>>> - The debuggee is normally exiting
>>> - The jdwp backend reports (over the jdwp protocol) an
>>> anonymous class unload event
>>> - The JDI InternalEventHandler thread handles the
>>> ClassUnloadEvent event
>>> - The InternalEventHandler wants to uncache the matching
>>> reference type.
>>> If there is more than one class with the same host class
>>> signature, it can't distinguish them,
>>> and so, deletes all references and re-retrieves them again
>>> (see tracing below):
>>> MY_TRACE: JDI:
>>> VirtualMachineImpl.retrieveClassesBySignature:
>>> sig=Ljava/lang/invoke/LambdaForm$DMH;
>>> - The jdwp backend debugLoop_run() gets the command from JDI
>>> and calls the functions
>>> classesForSignature() and classStatus() recursively.
>>> - The classStatus() makes a call to the JVMTI GetClassStatus()
>>> and gets the JVMTI_ERROR_WRONG_PHASE
>>> - As a result the jdwp backend reports the JVMTI error to the
>>> JDI, and so, the test fails
>>>
>>> For details, see the analysis in bug report closed as a dup of
>>> the bug 6988950:
>>> https://bugs.openjdk.java.net/browse/JDK-8024865
>>>
>>> Some similar cases can be found in the two bug reports (6988950
>>> and 8024865) describing this issue.
>>>
>>> The fix is to skip reporting the JVMTI_ERROR_WRONG_PHASE error
>>> as it is normal at the VM shutdown.
>>> The original jdwp backend implementation had a similar approach
>>> for the raw monitor functions.
>>> Threy use the ignore_vm_death() to workaround the
>>> JVMTI_ERROR_WRONG_PHASE errors.
>>> For reference, please, see the file: src/share/back/util.c
>>>
>>>
>>> Testing:
>>> Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG com/sun/jdi tests
>>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>
>
--
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.
More information about the serviceability-dev
mailing list