RFR: 8255987: JDI tests fail with com.sun.jdi.ObjectCollectedException
Chris Plummer
cjplummer at openjdk.java.net
Fri Dec 4 21:26:12 UTC 2020
On Fri, 4 Dec 2020 21:01:13 GMT, Chris Plummer <cjplummer at openjdk.org> wrote:
>> This PR replaces the withdrawn PR #1348. This PR tries to fix the underlying problem, rather than fix the tests.
>>
>> The problem is that a number of JDI tests create objects on the debugger side with calls to `newInstance()`. However, on the debugee side, these new instances will only be held on to by a `JNIGlobalWeakRef`, which means they could be collected at any time, even before `newInstace()` returns. A number of JDI tests get spurious `ObjectCollectedException` thrown at them, which results in test failures. To make these objects stick around, a call to `disableCollection()` is typically needed.
>>
>> However, as pointer out by @plummercj in [JDK-8255987](https://bugs.openjdk.java.net/browse/JDK-8255987):
>>
>>> Going back to the spec, ObjectReference.disableCollection() says:
>>>
>>> "By default all ObjectReference values returned by JDI may be collected at any time the target VM is running"
>>>
>>> and
>>>
>>> "Note that while the target VM is suspended, no garbage collection will occur because all threads are suspended."
>>>
>>> But no where does is say what is meant by the VM running or being suspended, or how to get it in that state. One might assume that this ties in with VirtualMachine.suspend(), but it says:
>>>
>>> "Suspends the execution of the application running in this virtual machine. All threads currently running will be suspended."
>>>
>>> No mention of suspending the VM, but that certainly seems to be what is implied by the method name and also by the loose wording in disableCollection().
>>
>> Most of our spuriously failing tests do actually make a call to `VirtualMachine.suspend()`, presumably to prevent objects from being garbage collected. However, the current implementation of `VirtualMachine.suspend()` will only suspend all Java threads. That is not enough to prevent objects from being garbage collected. The GC can basically run at any time, and there is no relation to whether all Java threads are suspended or not.
>>
>> However, as suggested by @plummercj, we could emulate the behaviour implied by the spec by letting a call to `VirtualMachine.suspend()` also convert all existing JDI objects references to be backed by a (strong) `JNIGlobalRef` rather than a (weak) `JNIGlobalWeakRef`. That will not prevent the GC from running, but it will prevent any object visible to a JDI client from being garbage collected. Of course, a call to `VirtualMachine.resume()` would convert all references back to being weak again.
>>
>> This patch introduces the needed functions in `libjdwp` to "pin" and "unpin" all objects. These new functions are then used by the underpinnings of `VirtualMachine.suspend()` and `VirtualMachine.resume()` to implement the behaviour described above.
>>
>> Note that there are still a few tests that needed adjustments to guard against `ObjectCollectionException`. These are:
>> - *vmTestbase/nsk/jdi/ArrayType/newInstance/newinstance004.java* - This test seems to have been forgotten by [JDK-8203174](https://bugs.openjdk.java.net/browse/JDK-8203174), which did a similar fix in the other `ArrayType/newinstance` tests.
>> - *vmTestbase/nsk/jdi/VMOutOfMemoryException/VMOutOfMemoryException001/VMOutOfMemoryException001.java* - We just want to allocate as much as we can, so catching an ignoring `ObjectCollectedException` seems reasonable here.
>> - *vmTestbase/nsk/share/jdi/sde/SDEDebuggee.java* - We still want to prevent `TestClassLoader` from being unloaded to avoid invalidating code locations.
>> - *vmTestbase/nsk/jdi/ReferenceType/instances/instances002/instances002.java* - This test keeps the VM suspended, and then expects objects to be garbage collected, which they now won't.
>>
>> Testing:
>> - More than 50 iterations of the `vmTestbase/nsk/jdi` and `vmTestbase/nsk/jdwp` test suites, using various GC, both in mach5 and locally.
>
> src/jdk.jdwp.agent/share/native/libjdwp/commonRef.c line 632:
>
>> 630: if (weakRef == NULL) {
>> 631: EXIT_ERROR(AGENT_ERROR_NULL_POINTER,"NewWeakGlobalRef");
>> 632: }
>
> I'm not so sure I agree that having a fatal error here is the right thing to do. The only other user of `weakenNode()` is `ObjectReference.disableCollection()`. It returns an error to the debugger if `weakenNode()` returns `NULL`. However, I'm not so sure that's a good thing to do here either, since it means the `VM.resume()` will need to fail. Possibly the error should just be ignored, and we live with the ref staying strong.
Another options is to save away the weakref in the node when strengthening. This would benefit `ObjectReference.disableCollection()` also, since it would no longer need to deal with a potential OOM. However, I'm not so sure it's actually worth doing. Trying to keep the debug session alive will having allocation errors is probably a fools errand.
-------------
PR: https://git.openjdk.java.net/jdk/pull/1595
More information about the serviceability-dev
mailing list