4-nd round RFR (XS) 6988950: JDWP exit error JVMTI_ERROR_WRONG_PHASE(112)

serguei.spitsyn at oracle.com serguei.spitsyn at oracle.com
Fri Nov 7 19:41:01 UTC 2014


On 11/7/14 2:13 AM, serguei.spitsyn at oracle.com wrote:
> On 11/7/14 1:59 AM, David Holmes wrote:
>> On 7/11/2014 7:48 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi David,
>>>
>>>
>>> On 11/6/14 9:18 PM, David Holmes wrote:
>>>> Hi Serguei,
>>>>
>>>> I think I get the gist of this approach but I'm not an expert on the
>>>> JVM TI or JDWP event model. My main concern would be how the delay to
>>>> the completion of cbVMDeath() might impact things - specifically if it
>>>> might be a lengthy delay?
>>>
>>> 1. At the beginning the VirtualMachine comands check if 
>>> gdata->vmDead is
>>> true
>>>      and in such case just return with the JDWP_ERROR(VM_DEAD) error or
>>> quietly.
>>>      Normally, the cbVMDeath event callback needs to wait for just one
>>> command.
>>>
>>> Please, see the VirtualMachine.c and the following comment in
>>> debugLoop_run():
>>>
>>>              } else if (gdata->vmDead &&
>>>               ((cmd->cmdSet) != JDWP_COMMAND_SET(VirtualMachine))) {
>>>                  /* Protect the VM from calls while dead.
>>>                   * VirtualMachine cmdSet quietly ignores some cmds
>>>                   * after VM death, so, it sends it's own errors.
>>>                   */
>>>                  outStream_setError(&out, JDWP_ERROR(VM_DEAD));
>>>              } else {
>>>
>>>
>>> 2. We do not have many choices.
>>>      Without a sync on a command completeness we will continue getting
>>> WRONG_PHASE errors intermittently.
>>>      Another choice is to use already reviewed ignore_wrong_phase
>>> workaround.
>>>      Note, the workaround works Ok not for all the commands.
>>>      I understand, we need to make sure nothing is broken if we choose
>>> this approach. :)
>>>
>>> 3. What delay would you consider lengthy: 1 sec, 10 sec, 1 min.?
>>
>> Anything that causes something unexpected to happen :) I'm just 
>> looking at the code and thinking what might go wrong. Really all we 
>> can do is try this and see.
>
> 1 min sleep looks too big as it causes timeout failures of some tests.
> Launched the nsk.jdi and jtreg com/sun/jdi with 10 sec sleep.
> Will see the results tomorrow.

The nsk.jdi and the jtreg com/sun/jdi tests are passed with the 10 sec 
sleep.

Thanks,
Serguei

>
> Thanks!
> Serguei
>
>>
>> Thanks,
>> David
>>
>>>      For instance, I can add 10 sec sleep to provoke the command
>>> execution delay and see what can be broken.
>>>      With 1 min sleep I see some timeouts in the jtreg com/sun/jdi 
>>> tests
>>> though which is probably Ok.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 7/11/2014 8:27 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi reviewers,
>>>>>
>>>>> I'm suggesting to review a modified fix:
>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.4/ 
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The 3-rd round fix is not right as it caused deadlocks in several 
>>>>> tests
>>>>> (in nsk.jdi.testlist and  jtreg com/sun/jdi).
>>>>>
>>>>> Here is a deadlock example:
>>>>>
>>>>> -----------------  lwp# 2 / thread# 2 --------------------
>>>>>   ffffffff7e8dc6a4 lwp_cond_wait (100138748, 100138730, 0, 0)
>>>>>   ffffffff7dcad148 void os::PlatformEvent::park() (100138700, d4788,
>>>>> d4400, 0, ffffffff7e357440, 100138730) + 100
>>>>>   ffffffff7dc3151c int Monitor::IWait(Thread*,long) 
>>>>> (ffffffff7e3c5b98,
>>>>> 100137000, 0, 1004405d0, 6e750, 0) + a4
>>>>>   ffffffff7dc324d0 bool Monitor::wait(bool,long,bool) (1004405d0,
>>>>> 100137000, 0, 0, 1, 20000000) + 358
>>>>>   ffffffff7de6c530 int JavaThread::java_suspend_self() (1004405d0,
>>>>> 100137000, 1, deab, 60000000, 100137000) + c8
>>>>>   ffffffff7da5f478 int JvmtiRawMonitor::raw_wait(long,bool,Thread*)
>>>>> (10034bdc0, ffffffffffffffff, ffffffff7e3e6bd0, 100137000, 1, 2) + 
>>>>> 258
>>>>>   ffffffff7da2284c jvmtiError
>>>>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bdc0,
>>>>> ffffffffffffffff, 4, 9aeb0, 100137000) + 8c
>>>>>   ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00,
>>>>> ffffffff7ab3ad18, ffffffff7ab3ad18, 0) + 3c
>>>>>   ffffffff7aa1c804 enqueueCommand (10034bb90, 102c00, 
>>>>> ffffffffffefd118,
>>>>> ffffffff7ab3ad18, 102c00, ffffffff7ab3bd60) + 14c
>>>>>   ffffffff7aa1e23c eventHelper_reportEvents (d8, 100135d70, 2, 1, 
>>>>> 1, 2)
>>>>> + 10c
>>>>>   ffffffff7aa181f8 reportEvents (1001371f8, 0, 0, 14, 100135d70, 0) +
>>>>> 138
>>>>>   ffffffff7aa187b8 event_callback (1001371f8, ffffffff7b0ffa88,
>>>>> ffffffff7aa23150, ffffffff7aa376a0, ffffffff7ab3ad18, 100441ad0) + 
>>>>> 360
>>>>>   ffffffff7aa1b870 cbVMDeath (800, 1001371f8, ffffffff7aa37c48,
>>>>> ffffffff7ab3ad18, 1018, 1000) + 1d8
>>>>>   ffffffff7da3635c void JvmtiExport::post_vm_death() (1ffc, 
>>>>> 100137000,
>>>>> ffffffff7e3e8b30, ffffffff7e357440, 1, 10010cf30) + 534
>>>>>   ffffffff7d7bb104 void before_exit(JavaThread*) (100137000,
>>>>> ffffffff7e392350, ffffffff7e3fb938, 6ed99, ffffffff7e357440,
>>>>> ffffffff7e3e6b70) + 30c
>>>>>   ffffffff7de72128 bool Threads::destroy_vm() (100137000, 100110a40,
>>>>> ffffffff7e3f22f4, ffffffff7e3e6ab0, ffffffff7e357440, 30000000) + 100
>>>>>   ffffffff7d8d0664 jni_DestroyJavaVM (100137000, 1ffc, 
>>>>> ffffffff7e3e8b30,
>>>>> ffffffff7e357440, 0, 10013700) + 1bc
>>>>>   ffffffff7ee08680 JavaMain (ffffffff7e3da790, 0, ffffffff7e3da790,
>>>>> 10035de68, 0, ffffffff7e4143b0) + 860
>>>>>   ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>>
>>>>> -----------------  lwp# 12 / thread# 12 --------------------
>>>>>   ffffffff7e8dc6a4 lwp_cond_wait (100349948, 100349930, 0, 0)
>>>>>   ffffffff7dcad148 void os::PlatformEvent::park() (100349900, d4788,
>>>>> d4400, 0, ffffffff7e357440, 100349930) + 100
>>>>>   ffffffff7da5f010 int JvmtiRawMonitor::raw_enter(Thread*) 
>>>>> (10034a070,
>>>>> 100348800, a, ffffffff7e3de340, 1, ffffffff7e115ff4) + 258
>>>>>   ffffffff7da22450 jvmtiError
>>>>> JvmtiEnv::RawMonitorEnter(JvmtiRawMonitor*) (ffffffff7ea05a00,
>>>>> 10034a070, 1c7, 100348800, ffffffff7e357440, 4) + a0
>>>>>   ffffffff7aa2f288 debugMonitorEnter (10034a070, c18, c00,
>>>>> ffffffff7ab3ad28, ffffffff7ab3b940, 0) + 38
>>>>>   ffffffff7aa14134 debugLoop_run (ffffffff7ab3b940, 1000,
>>>>> ffffffff7ab3ad28, ffffffff7aa360d0, ffffffff5b2ff718, c18) + 11c
>>>>>   ffffffff7aa2a4f8 connectionInitiated (ffffffff5b504010, 1358, 1000,
>>>>> ffffffff7ab3ad28, 1, ffffffff7ab3c080) + e0
>>>>>   ffffffff7aa2a7d4 attachThread (ffffffffffefee48, 101000,
>>>>> ffffffff5b504010, ffffffff7ab3ad28, 0, 10000000) + 54
>>>>>   ffffffff7da56b18 void JvmtiAgentThread::call_start_function()
>>>>> (100348800, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034880, 1) +
>>>>> 128
>>>>>   ffffffff7de6a678 void JavaThread::thread_main_inner() 
>>>>> (100348800, 3d8,
>>>>> 1003497f8, 100349420, ffffffff5b2ff9f8, 0) + 90
>>>>>   ffffffff7de6a5b4 void JavaThread::run() (100348800, 100349442, c,
>>>>> fffffffea5f3e048, 3d8, 1003497f8) + 3ac
>>>>>   ffffffff7dc9f2e4 java_start (ca800, 100348800, ca904,
>>>>> ffffffff7e16ff31, ffffffff7e357440, 4797) + 2e4
>>>>>   ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>>
>>>>> -----------------  lwp# 13 / thread# 13 --------------------
>>>>>   ffffffff7e8dc6a4 lwp_cond_wait (10034d348, 10034d330, 0, 0)
>>>>>   ffffffff7dcad148 void os::PlatformEvent::park() (10034d300, d4788,
>>>>> d4400, 0, ffffffff7e357440, 10034d330) + 100
>>>>>   ffffffff7da5eac8 int JvmtiRawMonitor::SimpleWait(Thread*,long)
>>>>> (10034bed0, 10034c000, ffffffffffffffff, 241000, 0, 10034c000) + 100
>>>>>   ffffffff7da5f300 int JvmtiRawMonitor::raw_wait(long,bool,Thread*)
>>>>> (10034bed0, ffffffffffffffff, 1, 10034c000, ffffffff7e357440, 
>>>>> 10034c000)
>>>>> + e0
>>>>>   ffffffff7da2284c jvmtiError
>>>>> JvmtiEnv::RawMonitorWait(JvmtiRawMonitor*,long) (92800, 10034bed0,
>>>>> ffffffffffffffff, 4, 9aeb0, 10034c000) + 8c
>>>>>   ffffffff7aa2f47c debugMonitorWait (ffffffff7ab3ba10, c28, c00,
>>>>> ffffffff7ab3ad18, ffffffff7ab3b940, 0) + 3c
>>>>>   ffffffff7aa1d838 doBlockCommandLoop (800, 1038, ffffffff7ab3ad18,
>>>>> 1000, ffffffff7ab3ad18, ffffffff7ab3bd60) + 48
>>>>>   ffffffff7aa1da3c commandLoop (c28, 10034c1f8, c00, 
>>>>> ffffffff7ab3ad18,
>>>>> 0, 10000000) + ac
>>>>>   ffffffff7da56b18 void JvmtiAgentThread::call_start_function()
>>>>> (10034c000, ffffffff7e3e8b38, 916f0, ffffffff7e357440, 10034c00, 1) +
>>>>> 128
>>>>>   ffffffff7de6a678 void JavaThread::thread_main_inner() 
>>>>> (10034c000, 3d8,
>>>>> 10034cfe8, 10034cc10, ffffffff5b0ffbf8, 0) + 90
>>>>>   ffffffff7de6a5b4 void JavaThread::run() (10034c000, 10034cc28, d,
>>>>> fffffffea5f3e290, 3d8, 10034cfe8) + 3ac
>>>>>   ffffffff7dc9f2e4 java_start (ca800, 10034c000, ca904,
>>>>> ffffffff7e16ff31, ffffffff7e357440, 181a) + 2e4
>>>>>   ffffffff7e8d8558 _lwp_start (0, 0, 0, 0, 0, 0)
>>>>>
>>>>>
>>>>> The details:
>>>>>    - Thread #2:   The cbVMDeath() event handler is waiting on the
>>>>> commandCompleteLock in the enqueueCommand().
>>>>>       The call chain is:
>>>>>          cbVMDeath() -> event_callback() -> reportEvents() ->
>>>>> eventHelper_reportEvents() -> enqueueCommand().
>>>>>       The enqueueCommand() depends on the commandLoop() that has 
>>>>> to call
>>>>>       completeCommand(command) for the command being enqueued.
>>>>>       This has not been set yet:  gdata->vmDead = JNI_TRUE
>>>>>
>>>>>    - Thread #12: The debugLoop_run blocked on the vmDeathLock enter
>>>>>
>>>>>    - Thread #13: The commandLoop is waiting on the 
>>>>> blockCommandLoopLock
>>>>> in the doBlockCommandLoop().
>>>>>       It is because blockCommandLoop == JNI_TRUE which is set in the
>>>>> needBlockCommandLoop()
>>>>>       if the following condition is true:
>>>>>           (cmd->commandKind == COMMAND_REPORT_EVENT_COMPOSITE &&
>>>>>            cmd->u.reportEventComposite.suspendPolicy ==
>>>>> JDWP_SUSPEND_POLICY(ALL))
>>>>>
>>>>>
>>>>> It seems, the debugLoop_run() block on the vmDeathLock causes the
>>>>> commandLoop() to wait indefinitely.
>>>>> The cbVMDeath() can not proceed because the commandLoop() does not 
>>>>> make
>>>>> a progress.
>>>>>
>>>>> The vmDeathLock critical section in the cbVMDeath() event callback 
>>>>> seems
>>>>> to be an overkill (unnecessary).
>>>>> A less intrusive synchronization is required here which is to wait 
>>>>> until
>>>>> the current command is completed
>>>>> before returning to the JvmtiExport::post_vm_death().
>>>>>
>>>>> The new approach (see new webrev) is to extend the resumeLock
>>>>> synchronization pattern
>>>>> to all VirtualMachine set of commands, not only the resume command.
>>>>> The resumeLock name is replaced with the vmDeathLock to reflect new
>>>>> semantics.
>>>>>
>>>>> In general, we could consider to do the same for the rest of the JDWP
>>>>> command sets.
>>>>> But it is better to be careful and see how this change goes first.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/5/14 2:27 AM, serguei.spitsyn at oracle.com wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Thank you for the concerns!
>>>>>> Testing showed several tests failing with deadlocks.
>>>>>> Scenarios are similar to that you describe.
>>>>>>
>>>>>> Trying to understand the details.
>>>>>>
>>>>>> Thanks,
>>>>>> Serguei
>>>>>>
>>>>>> On 11/4/14 4:09 PM, David Holmes wrote:
>>>>>>> Hi Serguei,
>>>>>>>
>>>>>>> On 3/11/2014 5:07 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> On 11/2/14 8:58 PM, David Holmes wrote:
>>>>>>>>> On 1/11/2014 8:13 PM, Dmitry Samersoff wrote:
>>>>>>>>>> Serguei,
>>>>>>>>>>
>>>>>>>>>> Thank you for good finding. This approach looks much better 
>>>>>>>>>> for me.
>>>>>>>>>>
>>>>>>>>>> The fix looks good.
>>>>>>>>>>
>>>>>>>>>> Is it necessary to release vmDeathLock locks at
>>>>>>>>>> eventHandler.c:1244 before call
>>>>>>>>>>
>>>>>>>>>> EXIT_ERROR(error,"Can't clear event callbacks on vm death"); ?
>>>>>>>>>
>>>>>>>>> I agree this looks necessary, or at least more clean (if 
>>>>>>>>> things are
>>>>>>>>> failing we really don't know what is happening).
>>>>>>>>
>>>>>>>> Agreed (replied to Dmitry).
>>>>>>>>
>>>>>>>>>
>>>>>>>>> More generally I'm concerned about whether any of the code paths
>>>>>>>>> taken
>>>>>>>>> while holding the new lock can result in deadlock - in particular
>>>>>>>>> with
>>>>>>>>> regard to the resumeLock ?
>>>>>>>>
>>>>>>>> The cbVMDeath() function never holds both vmDeathLock and
>>>>>>>> resumeLock at
>>>>>>>> the same time,
>>>>>>>> so there is no chance for a deadlock that involves both these 
>>>>>>>> locks.
>>>>>>>>
>>>>>>>> Two more locks used in the cbVMDeath() are the callbackBlock and
>>>>>>>> callbackLock.
>>>>>>>> These two locks look completely unrelated to the debugLoop_run().
>>>>>>>>
>>>>>>>> The debugLoop_run() function also uses the cmdQueueLock.
>>>>>>>> The debugLoop_run() never holds both vmDeathLock and 
>>>>>>>> cmdQueueLock at
>>>>>>>> the
>>>>>>>> same time.
>>>>>>>>
>>>>>>>> So that I do not see any potential to introduce new deadlock 
>>>>>>>> with the
>>>>>>>> vmDeathLock.
>>>>>>>>
>>>>>>>> However, it is still easy to overlook something here.
>>>>>>>> Please, let me know if you see any danger.
>>>>>>>
>>>>>>> I was mainly concerned about what might happen in the call chain 
>>>>>>> for
>>>>>>> threadControl_resumeAll() (it certainly sounds like it might 
>>>>>>> need to
>>>>>>> use a resumeLock :) ). I see direct use of the threadLock and
>>>>>>> indirectly the eventHandler lock; but there are further call 
>>>>>>> paths I
>>>>>>> did not explore. Wish there was an easy way to determine the
>>>>>>> transitive closure of all locks used from a given call.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> -Dmitry
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2014-11-01 00:07, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>
>>>>>>>>>>> It is 3-rd round of review for:
>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>>>>>>>>>>
>>>>>>>>>>> New webrev:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.3/ 
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Summary
>>>>>>>>>>>
>>>>>>>>>>>    For failing scenario, please, refer to the 1-st round RFR
>>>>>>>>>>> below.
>>>>>>>>>>>
>>>>>>>>>>>    I've found what is missed in the jdwp agent shutdown and
>>>>>>>>>>> decided to
>>>>>>>>>>> switch from a workaround to a real fix.
>>>>>>>>>>>
>>>>>>>>>>>    The agent VM_DEATH callback sets the gdata field:
>>>>>>>>>>> gdata->vmDead = 1.
>>>>>>>>>>>    The agent debugLoop_run() has a guard against the VM 
>>>>>>>>>>> shutdown:
>>>>>>>>>>>
>>>>>>>>>>>   165             } else if (gdata->vmDead &&
>>>>>>>>>>>   166              ((cmd->cmdSet) !=
>>>>>>>>>>> JDWP_COMMAND_SET(VirtualMachine))) {
>>>>>>>>>>>   167                 /* Protect the VM from calls while dead.
>>>>>>>>>>>   168                  * VirtualMachine cmdSet quietly ignores
>>>>>>>>>>> some
>>>>>>>>>>> cmds
>>>>>>>>>>>   169                  * after VM death, so, it sends it's own
>>>>>>>>>>> errors.
>>>>>>>>>>>   170                  */
>>>>>>>>>>>   171 outStream_setError(&out,
>>>>>>>>>>> JDWP_ERROR(VM_DEAD));
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    However, the guard above does not help much if the VM_DEATH
>>>>>>>>>>> event
>>>>>>>>>>> happens in the middle of a command execution.
>>>>>>>>>>>    There is a lack of synchronization here.
>>>>>>>>>>>
>>>>>>>>>>>    The fix introduces new lock (vmDeathLock) which does not
>>>>>>>>>>> allow to
>>>>>>>>>>> execute the commands
>>>>>>>>>>>    and the VM_DEATH event callback concurrently.
>>>>>>>>>>>    It should work well for any function that is used in
>>>>>>>>>>> implementation of
>>>>>>>>>>> the JDWP_COMMAND_SET(VirtualMachine) .
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>    Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG 
>>>>>>>>>>> com/sun/jdi
>>>>>>>>>>> tests
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 10/29/14 6:05 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> The updated webrev:
>>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.2/ 
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The changes are:
>>>>>>>>>>>>    - added a comment recommended by Staffan
>>>>>>>>>>>>    - removed the ignore_wrong_phase() call from function
>>>>>>>>>>>> classSignature()
>>>>>>>>>>>>
>>>>>>>>>>>> The classSignature() function is called in 16 places.
>>>>>>>>>>>> Most of them do not tolerate the NULL in place of returned
>>>>>>>>>>>> signature
>>>>>>>>>>>> and will crash.
>>>>>>>>>>>> I'm not comfortable to fix all the occurrences now and 
>>>>>>>>>>>> suggest to
>>>>>>>>>>>> return to this
>>>>>>>>>>>> issue after gaining experience with more failure cases that 
>>>>>>>>>>>> are
>>>>>>>>>>>> still
>>>>>>>>>>>> expected.
>>>>>>>>>>>> The failure with the classSignature() involved was observed 
>>>>>>>>>>>> only
>>>>>>>>>>>> once
>>>>>>>>>>>> in the nightly
>>>>>>>>>>>> and should be extremely rare reproducible.
>>>>>>>>>>>> I'll file a placeholder bug if necessary.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>> On 10/28/14 6:11 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Please, review the fix for:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-6988950
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Open webrev:
>>>>>>>>>>>>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2014/jdk/6988950-JDWP-wrong-phase.1/ 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Summary:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     The failing scenario:
>>>>>>>>>>>>>       The debugger and the debuggee are well aware a VM
>>>>>>>>>>>>> shutdown has
>>>>>>>>>>>>> been started in the target process.
>>>>>>>>>>>>>       The debugger at this point is not expected to send any
>>>>>>>>>>>>> commands
>>>>>>>>>>>>> to the JDWP agent.
>>>>>>>>>>>>>       However, the JDI layer (debugger side) and the jdwp 
>>>>>>>>>>>>> agent
>>>>>>>>>>>>> (debuggee side)
>>>>>>>>>>>>>       are not in sync with the consumer layers.
>>>>>>>>>>>>>
>>>>>>>>>>>>>       One reason is because the test debugger does not invoke
>>>>>>>>>>>>> the JDI
>>>>>>>>>>>>> method VirtualMachine.dispose().
>>>>>>>>>>>>>       Another reason is that the Debugger and the debuggee
>>>>>>>>>>>>> processes
>>>>>>>>>>>>> are uneasy to sync in general.
>>>>>>>>>>>>>
>>>>>>>>>>>>>       As a result the following steps are possible:
>>>>>>>>>>>>>         - The test debugger sends a 'quit' command to the 
>>>>>>>>>>>>> test
>>>>>>>>>>>>> debuggee
>>>>>>>>>>>>>         - The debuggee is normally exiting
>>>>>>>>>>>>>         - The jdwp backend reports (over the jdwp 
>>>>>>>>>>>>> protocol) an
>>>>>>>>>>>>> anonymous class unload event
>>>>>>>>>>>>>         - The JDI InternalEventHandler thread handles the
>>>>>>>>>>>>> ClassUnloadEvent event
>>>>>>>>>>>>>         - The InternalEventHandler wants to uncache the 
>>>>>>>>>>>>> matching
>>>>>>>>>>>>> reference type.
>>>>>>>>>>>>>           If there is more than one class with the same host
>>>>>>>>>>>>> class
>>>>>>>>>>>>> signature, it can't distinguish them,
>>>>>>>>>>>>>           and so, deletes all references and re-retrieves 
>>>>>>>>>>>>> them
>>>>>>>>>>>>> again
>>>>>>>>>>>>> (see tracing below):
>>>>>>>>>>>>>             MY_TRACE: JDI:
>>>>>>>>>>>>> VirtualMachineImpl.retrieveClassesBySignature:
>>>>>>>>>>>>> sig=Ljava/lang/invoke/LambdaForm$DMH;
>>>>>>>>>>>>>         - The jdwp backend debugLoop_run() gets the command
>>>>>>>>>>>>> from JDI
>>>>>>>>>>>>> and calls the functions
>>>>>>>>>>>>>           classesForSignature() and classStatus() 
>>>>>>>>>>>>> recursively.
>>>>>>>>>>>>>         - The classStatus() makes a call to the JVMTI
>>>>>>>>>>>>> GetClassStatus()
>>>>>>>>>>>>> and gets the JVMTI_ERROR_WRONG_PHASE
>>>>>>>>>>>>>         - As a result the jdwp backend reports the JVMTI 
>>>>>>>>>>>>> error
>>>>>>>>>>>>> to the
>>>>>>>>>>>>> JDI, and so, the test fails
>>>>>>>>>>>>>
>>>>>>>>>>>>>       For details, see the analysis in bug report closed as a
>>>>>>>>>>>>> dup of
>>>>>>>>>>>>> the bug 6988950:
>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8024865
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Some similar cases can be found in the two bug reports
>>>>>>>>>>>>> (6988950
>>>>>>>>>>>>> and 8024865) describing this issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>>       The fix is to skip reporting the 
>>>>>>>>>>>>> JVMTI_ERROR_WRONG_PHASE
>>>>>>>>>>>>> error
>>>>>>>>>>>>> as it is normal at the VM shutdown.
>>>>>>>>>>>>>       The original jdwp backend implementation had a similar
>>>>>>>>>>>>> approach
>>>>>>>>>>>>> for the raw monitor functions.
>>>>>>>>>>>>>       Threy use the ignore_vm_death() to workaround the
>>>>>>>>>>>>> JVMTI_ERROR_WRONG_PHASE errors.
>>>>>>>>>>>>>       For reference, please, see the file: 
>>>>>>>>>>>>> src/share/back/util.c
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing:
>>>>>>>>>>>>>    Run nsk.jdi.testlist, nsk.jdwp.testlist and JTREG 
>>>>>>>>>>>>> com/sun/jdi
>>>>>>>>>>>>> tests
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>
>



More information about the serviceability-dev mailing list