RFR 8203174: [Graal] JDI tests fail with Unexpected exception: com.sun.jdi.ObjectCollectedException
Chris Plummer
chris.plummer at oracle.com
Wed Nov 7 19:33:26 UTC 2018
Hi Danil,
So this is the crux of the issue:
112 debuggee.suspend();
113
114 List<ObjectReference> baseReferences = new LinkedList<>();
115 // We need to ensure that the base references count is not
changed during the test.
116 // The instances of the class could be already created by
other (e.g. JVMCI) threads
117 // and if GC was scheduled before VM was suspended some of
these instances might
118 // become collected.
119 for (ObjectReference objRef : referenceType.instances(0)) {
120 try {
121 objRef.disableCollection();
122 baseReferences.add(objRef);
123 } catch (ObjectCollectedException e) {
124 // skip this reference
125 }
126 }
127 baseInstances = baseReferences.size();
First it is possible for a GC to be triggered even if the debuggee is
not executing any code because of JVMCI threads. So this is why
debuggee.suspend() is needed. Second, even after calling
debuggee.suspend(), it is still possible for a GC to happen since it may
have been triggered before the debuggee.suspend() call, but not yet
completed, and debuggee.suspend() does not prevent GC from completing in
that case. So that is why you need to disableCollecion() on each object,
and accept that some of them may have already been collected. Therefore
baseInstances needs to be updated to only reflect the count of instances
that have not been collected, and will not be collected because
disableCollection() has been called on them. This all makes sense and
seems fine.
I do think the comments could use some updating. The debuggee.suspend()
call should be explained as being needed because there are potentially
other non-test java threads allocating objects and triggering GC's,
JVMCI being the main culprit here. The comment before the loop should
focus on dealing with a GC that was triggered before the suspend,
requiring disableCollection() be called each object returned by
referenceType.instances() since it can potentially be collected
otherwise. This code is not really related to the JVMCI threads.
I don't understand the reason for excluding java.lang.String from testing.
I don't understand the reason for the useStrictCheck() changes.
I don't understand what the following is fixing, and the impact on test
execution that the resume() might have:
153 if (i > 0) {
154 debuggee.resume();
155 }
156
157 line = pipe.readln();
158 debuggee.suspend();
thanks,
Chris
On 11/7/18 8:58 AM, Daniil Titov wrote:
> Hi Chris and Serguei,
>
> With recent builds I can no longer see any GC triggered by C1 compiler thread due to running out of metaspace and JDK-8193126 seems as solved this particular problem. Currently all observed GCs are triggered by JVMCI Compiler threads.
>
> Please review a new version of the fix that no longer keeps the main thread in the target VM running while other threads are suspended (since as Dean mentioned it might be not safe). Instead, the target VM is fully suspended when required and resumed afterwards. The new webrev also excludes java.lang.String class from the list of classes used by some of these tests.
>
> Webrev: http://cr.openjdk.java.net/~dtitov/8203174/webrev.02/index.html
> Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
>
> Thanks,
> Daniil
>
> On 11/5/18, 10:46 AM, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>
> On 11/4/18 11:45 PM, serguei.spitsyn at oracle.com wrote:
> > On 11/4/18 23:40, serguei.spitsyn at oracle.com wrote:
> >> Hi Daniil,
> >>
> >> I agree with Chris below.
> >> Will tell more on reply to your reply.
> >>
> >> Thanks,
> >> Serguei
> >>
> >> On 11/2/18 20:59, Chris Plummer wrote:
> >>> Hi Daniil,
> >>>
> >>> I thought the issue was that C2 was doing metadata allocations, and
> >>> when it ran out of metaspace it did a GC (one that was not triggered
> >>> by a failed object
> >
> > Forgot to comment on the below.
> > It is probably a typo.
> > Should it be about the Graal, not the C2?
> > As I understand we have no issue with the C2.
> Actually it was the C1 Compiler Thread that was the problem, although it
> only turned up during Graal testing.
>
> Chris
> >
> > Thanks,
> > Serguei
> >
> >
> >>> allocation). As a results we were getting objects GCs before the
> >>> test ever got the chance to disable collection on them. I thought we
> >>> also concluded other than this metaspace issue, if the test is
> >>> properly disabling collection on the objects it cared about, it
> >>> shouldn't matter if there are GC's triggered by excessive object
> >>> allocations.
> >>>
> >>> I don't think the following check will always be valid. It may be on
> >>> by default at some point:
> >>>
> >>> 651 public boolean isJVMCICompilerOn() {
> >>> 652 String opts = argumentHandler.getLaunchOptions();
> >>> 653 return opts.indexOf("-XX:+UseJVMCICompiler") >= 0;
> >>> 654 }
> >>>
> >>> thanks,
> >>>
> >>> Chris
> >>>
> >>> On 11/2/18 4:29 PM, Daniil Titov wrote:
> >>>> Please review the change that fixes several tests failing with
> >>>> com.sun.jdi.ObjectCollectedException when running with Graal.
> >>>>
> >>>> There main problem here is that with Graal on JVMCI Compiler
> >>>> threads in the target VM create a lot of objects and sporadically
> >>>> trigger GC that results in the objects created in the target VM in
> >>>> the tests being GCed before the tests complete. The other problem
> >>>> is that for some classes the tests use, e.g. "java.lang.String",
> >>>> there is already a huge number of instances created by JVMCI threads.
> >>>>
> >>>> The fix suspends target VM to prevent JVMCI compiler threads from
> >>>> creating new objects during the tests execution. In cases when
> >>>> IOPipe is used for communication between the test and the debuggee
> >>>> the fix suspends all threads but the main rather than suspending
> >>>> the VM. The fix also filters out the checks for some test classes
> >>>> (e.g. "java.lang.String") in cases when the target VM is run with
> >>>> JVMCI compiler options on.
> >>>>
> >>>> Webrev: http://cr.openjdk.java.net/~dtitov/8203174/webrev.01/
> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8203174
> >>>>
> >>>> Thanks,
> >>>> Daniil
> >>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>
>
>
>
More information about the serviceability-dev
mailing list