Codereview request: 8007710 runtime/7158988/FieldMonitor.java fails with com.sun.jdi.VMDisconnectedException: Connection closed
shanliang
shanliang.jiang at oracle.com
Fri Feb 14 02:00:02 PST 2014
Staffan Larsen wrote:
> This version looks good! Thanks for hanging in there.
>
> The only improvement would be to count and verify the number of
> ModificationWatchpointEvent (there should be 10).
Good idea, here is:
http://cr.openjdk.java.net/~sjiang/JDK-8007710/05/
Thanks,
Shanliang
>
> Thanks,
> /Staffan
>
> On 13 feb 2014, at 21:15, shanliang <shanliang.jiang at oracle.com
> <mailto:shanliang.jiang at oracle.com>> wrote:
>
>> Hi,
>>
>> Here is Version 4:
>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/04/
>>
>> 1) remove the line
>> 108 vm.resume()
>> 2) call addClassWatch(vm) only when receiving VMStartEvent
>> 3) make sure that the test receives ModificationWatchpointEvent
>> 4) clean
>>
>> Thanks,
>> Shanliang
>>
>> shanliang wrote:
>>> Staffan,
>>>
>>> Very nice analysis!
>>>
>>> The fix must be very simple, just remove the line
>>> 108 vm.resume
>>> it is an error because here the test does not yet treat the events
>>> in eventSet.
>>>
>>> the line
>>> 136 eventSet.resume();
>>> is the right place to resume the threads after event treatment.
>>>
>>> Here is the new webrev:
>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/03/
>>>
>>> Thanks,
>>> Shanliang
>>>
>>> Staffan Larsen wrote:
>>>> I think I understand what happens now.
>>>>
>>>> The test code, simplified, looks like this (with the Thread.sleep()
>>>> added that causes the test to fail):
>>>>
>>>> launchTarget();
>>>> addClassWatch();
>>>> vm.resume();
>>>> Thread.sleep(1000);
>>>> while(connected) {
>>>> eventSet = eventQueue.remove()
>>>> for(event : eventQueue) {
>>>> if (event instanceof ClassPrepareEvent) {
>>>> addFieldWatch();
>>>> }
>>>> }
>>>> eventSet.resume();
>>>> }
>>>>
>>>> By default all events that happen will cause the debuggee to
>>>> suspend (see EventRequest.setSuspendPolicy()). Thus when we get to
>>>> addFieldWatch(), the vm should be suspended and we should be able
>>>> to create the field watch without problem. But the VM isn’t
>>>> suspended and that is why the test fail.
>>>>
>>>> Why isn’t the VM suspended? When we get to the “for(event :
>>>> eventQueue)” the first time there are *two* events already in the
>>>> queue: the VMStartEvent and a ClassPrepareEvent. At this point the
>>>> VM is suspended and everything is good. We look at the first
>>>> eventSet which only contains the VMStartEvent, we ignore the event,
>>>> but we resume the VM. We then loop and look at the
>>>> ClassPrepareEvent, but by now the VM is already running and has
>>>> also terminated. Failure.
>>>>
>>>> Thus, we need to handle the VMStartEvent. I suggest a modification
>>>> to my previous code:
>>>>
>>>> launchTarget();
>>>> while(connected) {
>>>> eventSet = eventQueue.remove()
>>>> for(event : eventQueue) {
>>>> if (event instanceof VMStartEvent) {
>>>> addClassWatch();
>>>> }
>>>> if (event instanceof ClassPrepareEvent) {
>>>> addFieldWatch();
>>>> }
>>>> }
>>>> eventSet.resume();
>>>> }
>>>>
>>>> This will cause us to have complete control over the state of the
>>>> debuggee. The first event we see will be the VMStartEvent. The VM
>>>> will be suspended. We can add a class watch here. Then we resume
>>>> the VM. The second event we see will be the ClassPrepareEvent with
>>>> the VM suspended. We can add the field watch. Then we resume the VM
>>>> and wait for the field watch events.
>>>>
>>>> Thanks,
>>>> /Staffan
>>>>
>>>> On 13 feb 2014, at 11:36, shanliang <shanliang.jiang at oracle.com
>>>> <mailto:shanliang.jiang at oracle.com>> wrote:
>>>>
>>>>> Staffan Larsen wrote:
>>>>>> On 13 feb 2014, at 10:17, Jaroslav Bachorik <jaroslav.bachorik at oracle.com> wrote:
>>>>>>
>>>>>>
>>>>>>> Hi Staffan,
>>>>>>>
>>>>>>> On 12.2.2014 18:27, Staffan Larsen wrote:
>>>>>>>
>>>>>>>> I’m still not happy with this fix since I think the extra output stream synchronization logic is not needed - the debuggee should be suspended at all the interesting points. The fix I proposed is cleaner and (as far as I can tell) also fixes the problem. The only thing is that I can’t quite explain what goes wrong without the fix… I’d really like to understand that. I’ll try to dig deeper and see if I can understand exactly what happens.
>>>>>>>>
>>>>>>> Yes, bringing the VM to a stable state before calling other JDI functions helps to stabilize the test even without the additional synchronization via stdout/stdin.
>>>>>>>
>>>>>>> I just wonder whether this check should not be done inside com.sun.jdi.connect.LaunchingConnector#launch() implementation. Does it even make sense to hand off an unstable VM?
>>>>>>>
>>>>>> Good question, but hard to change now - all implementations depend on the current functionality. The VMStartEvent also gives you a reference to the main thread.
>>>>>>
>>>>> The test failed when it received ClassPrepareEvent and did
>>>>> addFieldWatch, that meant the test must receive already
>>>>> VMStartEvent, because VMStartEvent must be the first event, if it
>>>>> was true then the vm must be already stable when failing.
>>>>>
>>>>> Except that the test received ClassPrepareEvent before
>>>>> VMStartEvent then it was doing addFieldWatch with a possibly
>>>>> unstable VM. in this case we might have a serious bug in
>>>>> VirtualMachine implementation, and if this is true the fix
>>>>> proposed to check "start" may make miss ClassPrepareEvent, then
>>>>> the test would test nothing.
>>>>>
>>>>> Shanliang
>>>>>> /S
>>>>>>
>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>>
>>>>>>>> /Staffan
>>>>>>>>
>>>>>>>> On 12 feb 2014, at 18:04, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>
>>>>>>>>>> I think what you need to do is wait for the VMStartEvent before you add requests to the VM. Note this paragraph from the VirtualMachine doc:
>>>>>>>>>>
>>>>>>>>>> Note that a target VM launched by a launching connector is not
>>>>>>>>>> guaranteed to be stable until after the VMStartEvent has been
>>>>>>>>>> received.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I may miss something here, I believe VMStartEvent must be the first event, when the test got ClassPrepareEvent, it must already received VMStartEvent.
>>>>>>>>>
>>>>>>>>>> I think adding code that looks something like this will make the test stable:
>>>>>>>>>>
>>>>>>>>>> VirtualMachine vm = launchTarget(CLASS_NAME);
>>>>>>>>>> EventQueue eventQueue = vm.eventQueue();
>>>>>>>>>>
>>>>>>>>>> boolean started = false;
>>>>>>>>>> while(!started) {
>>>>>>>>>> EventSet eventSet = eventQueue.remove();
>>>>>>>>>> for (Event event : eventSet) {
>>>>>>>>>> if (event instanceof VMStartEvent) {
>>>>>>>>>> started = true;
>>>>>>>>>> }
>>>>>>>>>> if (event instanceof VMDeathEvent
>>>>>>>>>> || event instanceof VMDisconnectEvent) {
>>>>>>>>>> throw new Error("VM died before it started...:"+event);
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> System.out.println("Vm launched");
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> The code you proposed could improve the test, it made sure that TestPostFieldModification was started, but I am afraid that it did not address the issue causing the failure, the issue I believe was that TestPostFieldModification exited before or during FieldMonitor called addFieldWatch(), that was why addFieldWatch() received VMDisconnectedException. When the test was treating ClassPrepareEvent, even if VMDeathEvent or VMDisconnectEvent arrived, it must be still waiting in the eventQueue because it arrived after ClassPrepareEvent.
>>>>>>>>>
>>>>>>>>> My fix was to not allow TestPostFieldModification to exit before addFieldWatch() was done.
>>>>>>>>>
>>>>>>>>>> There is also no reason to call addFieldWatch() before the ClassPrepareEvent has been received. The call to vm..classesByName() will just return an empty list anyway.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I do not know why the test called addFieldWatch before ClassPrepareEvent had been received, but yes the returned list was empty, so agree to remove it.
>>>>>>>>>
>>>>>>>>>> While you are in there you can also remove the unused StringBuffer near the top of main().
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> Yes it was already removed in version 01
>>>>>>>>>
>>>>>>>>> Here is the new webrev:
>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/02/
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Shanliang
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> /Staffan
>>>>>>>>>>
>>>>>>>>>> On 11 feb 2014, at 18:30, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Here is the new fix in which FieldMonitor will write to TestPostFieldModification, to inform the latter to quit, as suggested bu Jaroslav
>>>>>>>>>>> http://cr.openjdk.java.net/~sjiang/JDK-8007710/01/
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Shanliang
>>>>>>>>>>>
>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> shanliang wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Jaroslav Bachorik wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 11.2.2014 16:31, shanliang wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Staffan Larsen wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Shanliang,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I can’t quite see how the test can fail in this way. When the
>>>>>>>>>>>>>>>> ClassPrepareEvent happens, the debuggee will be suspended. So when
>>>>>>>>>>>>>>>> addFieldWatch() is called, the debuggee should not have moved.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am not expert of jdi so I may miss something here. I checked the
>>>>>>>>>>>>>>> failure trace and saw the report exception happen when FieldMonitor
>>>>>>>>>>>>>>> received ClassPrepareEvent and was doing addFieldWatch. FieldMonitor did
>>>>>>>>>>>>>>> call "vm.resume()" before treating events.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> AFAICS, calling vm.resume() results in an almost immediate debuggee death. The gc() invoking thread "d" is flagged as a deamon and as such doesn't prevent the process from exiting. The other thread is not a daemon but will finish in only few cycles.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I looked at the class com.sun.jdi.VirtualMachine, here is the Javadoc of the method "resume":
>>>>>>>>>>>>> /**
>>>>>>>>>>>>> * Continues the execution of the application running in this
>>>>>>>>>>>>> * virtual machine. All threads are resumed as documented in
>>>>>>>>>>>>> * {@link ThreadReference#resume}.
>>>>>>>>>>>>> *
>>>>>>>>>>>>> * @throws VMCannotBeModifiedException if the VirtualMachine is read-only - see {@link VirtualMachine#canBeModified()}.
>>>>>>>>>>>>> *
>>>>>>>>>>>>> * @see #suspend
>>>>>>>>>>>>> */
>>>>>>>>>>>>> void resume();
>>>>>>>>>>>>> My understanding is that the debuggee resumes to work after this call, instead to die?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> In fact the problem is here, the vm (TestPostFieldModification) should not die before FieldMonitor finishes addFieldWatch.
>>>>>>>>>>>>
>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>>> I reproduced the bug by add sleep(1000) after vm.resume() but before
>>>>>>>>>>>>>>> calling eventQueue.remove();
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It looks like some kind of synchronization between the debugger and the debuggee is necessary. But I wonder if you should better use the process.getOuptuptStream() to write and flush a message for the debugee indicating that it can exit. And in the debugee you would just do System.in.read() as the last statement in the main() method. Seems more robust than involving files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> It could work, but creating a file in the testing directory should have no issue, but yes maybe less performance.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -JB-
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> One problem I do see with the test is that it does not wait for a
>>>>>>>>>>>>>>>> VMStartEvent before setting up requests. I’m not sure if that could
>>>>>>>>>>>>>>>> cause the failure in the bug report, though.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> /Staffan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 11 feb 2014, at 15:13, shanliang <shanliang.jiang at oracle.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi ,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The problem could be that FieldMonitor did not have enough time to
>>>>>>>>>>>>>>>>> "addFieldWatch" but the vm to monitor (TestPostFieldModification) was
>>>>>>>>>>>>>>>>> already ended.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So we should make sure that TestPostFieldModification exits after
>>>>>>>>>>>>>>>>> FieldMonitor has done necessary. The solution proposed here is that
>>>>>>>>>>>>>>>>> FieldMonitor creates a file after adding field watching, and
>>>>>>>>>>>>>>>>> TestPostFieldModification quits only after finding the file.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> web:
>>>>>>>>>>>>>>>>> http://icncweb.fr.oracle.com/~shjiang/webrev/8007710/00/
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> bug:
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8007710
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Shanliang
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-dev
mailing list