Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed

shanliang shanliang.jiang at oracle.com
Thu Feb 13 09:15:38 PST 2014


Staffan,

Very nice analysis!

The fix must be very simple, just remove the line
    108   vm.resume
it is an error because here the test does not yet treat the events in 
eventSet.

the line
    136   eventSet.resume();
is the right place to resume the threads after event treatment.

Here is the new webrev:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/

Thanks,
Shanliang

Staffan Larsen wrote:
> I think I understand what happens now.
>
> The test code, simplified, looks like this (with the Thread.sleep() 
> added that causes the test to fail):
>
>   launchTarget();
>   addClassWatch();
>   vm.resume();
>   Thread.sleep(1000);
>   while(connected) {
>       eventSet = eventQueue.remove()
>       for(event : eventQueue) {
>           if (event instanceof ClassPrepareEvent) {
>               addFieldWatch();
>           }
>       }
>       eventSet.resume();
>   }
>
> By default all events that happen will cause the debuggee to suspend 
> (see EventRequest.setSuspendPolicy()). Thus when we get to 
> addFieldWatch(), the vm should be suspended and we should be able to 
> create the field watch without problem. But the VM isn’t suspended and 
> that is why the test fail. 
>
> Why isn’t the VM suspended? When we get to the “for(event : 
> eventQueue)” the first time there are *two* events already in the 
> queue: the VMStartEvent and a ClassPrepareEvent. At this point the VM 
> is suspended and everything is good. We look at the first eventSet 
> which only contains the VMStartEvent, we ignore the event, but we 
> resume the VM. We then loop and look at the ClassPrepareEvent, but by 
> now the VM is already running and has also terminated. Failure.
>
> Thus, we need to handle the VMStartEvent. I suggest a modification to 
> my previous code:
>
>   launchTarget();
>   while(connected) {
>       eventSet = eventQueue.remove()
>       for(event : eventQueue) {
>           if (event instanceof VMStartEvent) {
>               addClassWatch();
>           }
>           if (event instanceof ClassPrepareEvent) {
>               addFieldWatch();
>           }
>       }
>       eventSet.resume();
>   }
>
> This will cause us to have complete control over the state of the 
> debuggee. The first event we see will be the VMStartEvent. The VM will 
> be suspended. We can add a class watch here. Then we resume the VM. 
> The second event we see will be the ClassPrepareEvent with the VM 
> suspended. We can add the field watch. Then we resume the VM and wait 
> for the field watch events.
>
> Thanks,
> /Staffan
>
> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com 
> <mailto:shanliang.jiang at oracle.com>> wrote:
>
>> Staffan Larsen wrote:
>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>
>>>   
>>>> Hi Staffan,
>>>>
>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>     
>>>>> I’m still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can’t quite explain what goes wrong without the fix… I’d really like to understand that. I’ll try to dig deeper and see if I can understand exactly what happens.
>>>>>       
>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>
>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>     
>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>   
>> The test failed when it received ClassPrepareEvent and did 
>> addFieldWatch, that meant the test must receive already VMStartEvent, 
>> because VMStartEvent must be the first event, if it was true then the 
>> vm must be already stable when failing.
>>
>> Except that the test received ClassPrepareEvent before VMStartEvent 
>> then it was doing addFieldWatch with a possibly unstable VM. in this 
>> case we might have a serious bug in VirtualMachine implementation, 
>> and if this is true the fix proposed to check "start" may make miss 
>> ClassPrepareEvent, then the test would test nothing.
>>
>> Shanliang
>>> /S
>>>
>>>   
>>>> -JB-
>>>>
>>>>     
>>>>> /Staffan
>>>>>
>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>
>>>>>       
>>>>>> Staffan Larsen wrote:
>>>>>>         
>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>
>>>>>>>  Note that a target VM launched by a launching connector is not
>>>>>>>  guaranteed to be stable until after the VMStartEvent has been
>>>>>>>  received.
>>>>>>>
>>>>>>>           
>>>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>>>>         
>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>
>>>>>>>     VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>     EventQueue eventQueue = vm.eventQueue();
>>>>>>>
>>>>>>>     boolean started = false;
>>>>>>>     while(!started) {
>>>>>>>       EventSet eventSet = eventQueue.remove();
>>>>>>>       for (Event event : eventSet) {
>>>>>>>         if (event instanceof VMStartEvent) {
>>>>>>>           started = true;
>>>>>>>         }
>>>>>>>         if (event instanceof VMDeathEvent
>>>>>>>             || event instanceof VMDisconnectEvent) {
>>>>>>>           throw new Error("VM died before it started...:"+event);
>>>>>>>         }
>>>>>>>       }
>>>>>>>     }
>>>>>>>
>>>>>>>     System.out.println("Vm launched");
>>>>>>>
>>>>>>>           
>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>
>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>         
>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm..classesByName() will just return an empty list anyway.
>>>>>>>
>>>>>>>           
>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>         
>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>
>>>>>>>           
>>>>>> Yes it was already removed in version 01
>>>>>>
>>>>>> Here is the new webrev:
>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>
>>>>>> Thanks,
>>>>>> Shanliang
>>>>>>         
>>>>>>> Thanks,
>>>>>>> /Staffan
>>>>>>>
>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>           
>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested bu Jaroslav
>>>>>>>>   http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Shanliang
>>>>>>>>
>>>>>>>> shanliang wrote:
>>>>>>>>
>>>>>>>>             
>>>>>>>>> shanliang wrote:
>>>>>>>>>
>>>>>>>>>               
>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>
>>>>>>>>>>                 
>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I can’t quite see how the test can fail in this way. When the
>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>
>>>>>>>>>>>>>                       
>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles.
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>   /**
>>>>>>>>>>    * Continues the execution of the application running in this
>>>>>>>>>>    * virtual machine. All threads are resumed as documented in
>>>>>>>>>>    * {@link ThreadReference#resume}.
>>>>>>>>>>    *
>>>>>>>>>>    * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>    *
>>>>>>>>>>    * @see #suspend
>>>>>>>>>>    */
>>>>>>>>>>   void resume();
>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>
>>>>>>>>>>                 
>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>
>>>>>>>>> Shanliang
>>>>>>>>>
>>>>>>>>>               
>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOuptuptStream() to write and flush a message for the debugee indicating that it can exit. And in the debugee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Shanliang
>>>>>>>>>>
>>>>>>>>>>                 
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> -JB-
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>                   
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>
>>>>>>>>>>>>                     
>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>> VMStartEvent before setting up requests. I’m not sure if that could
>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>
>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>                       
>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                         
>>>>>>         
>>>>>       
>>>   
>>
>>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20140213/e7a14bc3/attachment-0001.html 


More information about the serviceability-dev mailing list