RFR: 8201409: JDWP debugger initialization hangs intermittently
Daniel D. Daugherty
daniel.daugherty at oracle.com
Sun Apr 15 17:01:24 UTC 2018
On 4/13/18 3:07 PM, serguei.spitsyn at oracle.com wrote:
> Andrew and reviewers,
>
> I'm re-sending this RFR with a corrected subject that includes the bug
> number.
>
> The issue is:
> https://bugs.openjdk.java.net/browse/JDK-8201409
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2018/8201409-jdwp-initsync.ibm.1/
src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
No comments.
src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
No comments.
src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
So debugLoop_run() now pauses before the loop
that reads cmds. Looks good.
src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
So the VM_INIT event handler now signals that we have
received the VM_INIT event, which allows debugLoop_run()
to proceed.
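For readers of the thread, the wait/signal handshake described above can
be sketched outside the agent roughly as follows. This is only an
illustration under stated assumptions, not the code in the webrev: a
pthread mutex and condition variable stand in for the agent's initMonitor
raw monitor, and the names vm_init_complete, suspend_count, event_helper,
listener and run_handshake_demo are hypothetical.

```c
/* Illustrative sketch only: pthreads stand in for the agent's raw monitor. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  init_cond = PTHREAD_COND_INITIALIZER;
static bool vm_init_complete;   /* guarded by init_lock */
static int  suspend_count;      /* guarded by init_lock */

/* Rough counterpart of signalVMInitComplete() in the patch. */
static void signal_vm_init_complete(void)
{
    pthread_mutex_lock(&init_lock);
    vm_init_complete = true;
    pthread_cond_broadcast(&init_cond);
    pthread_mutex_unlock(&init_lock);
}

/* Rough counterpart of debugInit_waitVMInitComplete() in the patch. */
static void wait_vm_init_complete(void)
{
    pthread_mutex_lock(&init_lock);
    while (!vm_init_complete) {   /* loop guards against spurious wakeups */
        pthread_cond_wait(&init_cond, &init_lock);
    }
    pthread_mutex_unlock(&init_lock);
}

/* Hypothetical stand-in for the event helper: a slow VM_INIT report
 * that suspends first and only then signals. */
static void *event_helper(void *arg)
{
    struct timespec delay = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    (void)arg;
    nanosleep(&delay, NULL);
    pthread_mutex_lock(&init_lock);
    suspend_count++;              /* "suspend all threads" */
    pthread_mutex_unlock(&init_lock);
    signal_vm_init_complete();
    return NULL;
}

/* Hypothetical stand-in for debugLoop_run(): wait, then handle a resume. */
static void *listener(void *arg)
{
    (void)arg;
    wait_vm_init_complete();
    pthread_mutex_lock(&init_lock);
    assert(suspend_count == 1);   /* the suspend has already happened */
    suspend_count--;              /* "resume" command */
    pthread_mutex_unlock(&init_lock);
    return NULL;
}

/* Runs both threads once and returns the final suspend count. */
int run_handshake_demo(void)
{
    pthread_t helper, loop;
    vm_init_complete = false;
    suspend_count = 0;
    pthread_create(&loop, NULL, listener, NULL);
    pthread_create(&helper, NULL, event_helper, NULL);
    pthread_join(loop, NULL);
    pthread_join(helper, NULL);
    return suspend_count;
}
```

Calling run_handshake_demo() from a small main should return 0: the
listener cannot process its stand-in resume until the helper has
recorded its suspend, whichever thread gets scheduled first.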
Serguei, this fix needs to have most of the Serviceability
stack of tests run against it (JDWP, JVM/TI, JDI and jdb tests).
Based on the email thread, I can't tell which tests have been
run with the fix in place.
Dan
>
>
> The fix looks good to me.
> Also, I've agreed to skip a unit test as creating it for this issue is
> not easy.
>
> At least one more review is needed before the fix can be pushed.
>
> Thanks,
> Serguei
>
>
> On 4/11/18 06:33, Andrew Leonard wrote:
>> Hi Serguei,
>> Thank you for raising the bug.
>> I had a chat with one of my colleagues who could recreate it, and
>> it's probably related to the handshaking that is done in the
>> particular scenario. So with the JCK harness:
>>
>> com.sun.jck.lib.ExecJCKTestOtherJVMCmd
>> LD_LIBRARY_PATH=/javatest/lib/jck/jck8b/natives/linux_x86-64
>> /projects/jck/jdwp/j2sdk-image/bin/java -Xdump:system:none
>> -Xdump:system:events=gpf+abort+traceassert+corruptcache
>> -Xdump:snap:none
>> -Xdump:snap:events=gpf+abort+traceassert+corruptcache
>> -Xdump:java:none
>> -Xdump:java:events=gpf+abort+traceassert+corruptcache
>> -Xdump:heap:none
>> -Xdump:heap:events=gpf+abort+traceassert+corruptcache
>> -Xfuture
>> -agentlib:jdwp=server=y,transport=dt_socket,address=localhost:35000,suspend=y
>> -classpath /javatest/lib/jck/JCK8b-b03/JCK-runtime-8b/classes
>> -Djava.security.policy=/javatest/lib/jck/JCK8b-b03/JCK-runtime-8b/lib/jck.policy
>> javasoft.sqe.jck.lib.jpda.jdwp.DebuggeeLoader -waittime=600
>> -msgSwitch=ub1604x64vm10:38636
>> -componentName=ArrayReference.GetValues.getvalues002
>>
>> Note that the JCK test harness starts the target process, attaches to
>> it, and sends the resume command
>> in a very short time with no handshaking.
>>
>> That may not help, but hopefully it helps explain things a bit? It's
>> the timing of the resume command during the test that is crucial:
>> resuming before the VM initialization is complete will trigger it.
>>
>> Thanks
>> Andrew
>>
>> Andrew Leonard
>> Java Runtimes Development
>> IBM Hursley
>> IBM United Kingdom Ltd
>> Phone internal: 245913, external: 01962 815913
>> internet email: andrew_m_leonard at uk.ibm.com
>>
>>
>>
>>
>> From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>> To: Andrew Leonard <andrew_m_leonard at uk.ibm.com>
>> Cc: serviceability-dev at openjdk.java.net
>> Date: 11/04/2018 09:57
>> Subject: Re: RFR: Fix race condition in jdwp
>> ------------------------------------------------------------------------
>>
>>
>>
>> Hi Andrew,
>>
>> I've filed the bug:
>> https://bugs.openjdk.java.net/browse/JDK-8201409
>>
>> Also, this is a webrev with your patch:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2018/8201409-jdwp-initsync.ibm.1/
>>
>> I agree that creating a standalone test is tricky here.
>>
>> I've added usleep(10000) into the eventHelper_reportVMInit()
>> and ran the JTreg com/sun/jdi tests with my JDK build.
>> However, none of the tests failed with the failure mode you described.
>> So I'm a little puzzled.
>> I suspect that some specific debugLoop commands were used in your
>> scenario.
>>
>> It is still possible that I've missed something here.
>> Will try to double check everything.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 4/11/18 01:29, Andrew Leonard wrote:
>> Thanks Serguei,
>> In terms of a standalone testcase, it is quite tricky: the issue,
>> which took a lot of investigation to solve, is very timing dependent
>> and will only occur randomly. It can be forced, as I indicated below,
>> by adding a "sleep" in the VMInit report code, but that's not a
>> testcase. The issue was originally found in our JCK testing for
>> IBMJava8, testcase test.jck8b.runtime.vm.jdwp, but again it only
>> happened intermittently. As with "performance" type issues, we're not
>> always going to be able to create a testcase that will always "fail"
>> if the fix is not present.
>> Your thoughts?
>> Cheers
>> Andrew
>>
>> Andrew Leonard
>> Java Runtimes Development
>> IBM Hursley
>> IBM United Kingdom Ltd
>> Phone internal: 245913, external: 01962 815913
>> internet email: andrew_m_leonard at uk.ibm.com
>>
>>
>>
>>
>> From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>> To: Andrew Leonard <andrew_m_leonard at uk.ibm.com>
>> Cc: serviceability-dev at openjdk.java.net
>> Date: 11/04/2018 01:02
>> Subject: Re: RFR: Fix race condition in jdwp
>> ------------------------------------------------------------------------
>>
>>
>>
>> Hi Andrew,
>>
>> Okay, I'll file a bug on this topic.
>> But do you have a standalone test demonstrating this issue?
>>
>> Thanks,
>> Serguei
>>
>>
>> On 4/10/18 06:23, Andrew Leonard wrote:
>> Hi Serguei,
>> I don't have access to the bug database to raise one, are you able to
>> please?
>>
>> Summary: JDWP debugger initialization hangs intermittently
>> Description: If, during the JDWP setup initialization, the VM
>> initialization takes slightly longer than the main debug
>> initialization thread, a "hang" situation can occur. This has been
>> seen in testcase test.jck8b.runtime.vm.jdwp and can also be recreated
>> easily by adding a 10 second sleep to the beginning of the
>> eventHelper_reportVMInit() function in
>> src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c.
>> First seen: JDK8
>> Recreated: JDK11
>>
>> Thanks
>> Andrew
>>
>> Andrew Leonard
>> Java Runtimes Development
>> IBM Hursley
>> IBM United Kingdom Ltd
>> Phone internal: 245913, external: 01962 815913
>> internet email: andrew_m_leonard at uk.ibm.com
>>
>>
>>
>>
>> From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>> To: Andrew Leonard <andrew_m_leonard at uk.ibm.com>,
>> serviceability-dev at openjdk.java.net
>> Date: 09/04/2018 23:03
>> Subject: Re: RFR: Fix race condition in jdwp
>> ------------------------------------------------------------------------
>>
>>
>>
>> Hi Andrew,
>>
>> The patch itself looks reasonable.
>> However, in order to proceed with it, a bug report with a standalone
>> test case demonstrating the issue is needed.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 4/9/18 09:07, Andrew Leonard wrote:
>> > Hi,
>> > We discovered in our testing with OpenJ9 that a race condition can
>> > occur in the jdwp under certain circumstances, and we were able to
>> > force the same issue with Hotspot. Normally, the event helper thread
>> > suspends all threads, then the debug loop in the listener thread
>> > receives a command to resume. The debugger may deadlock if the debug
>> > loop in the listener thread starts processing commands (e.g. resume
>> > threads) before the event helper completes the initialization (and
>> > suspends threads).
>> >
>> > This patch adds synchronization to ensure the event helper completes
>> > the initialization sequence before debugger commands are processed.
>> >
>> > Please can I find a sponsor for this contribution? Patch below..
>> >
>> > Many thanks
>> >
>> > Andrew
>> >
>> >
>> >
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -58,6 +58,7 @@
>> >  static jboolean vmInitialized;
>> >  static jrawMonitorID initMonitor;
>> >  static jboolean initComplete;
>> > +static jboolean VMInitComplete;
>> >  static jbyte currentSessionID;
>> >
>> >  /*
>> > @@ -617,6 +618,35 @@
>> >      debugMonitorExit(initMonitor);
>> >  }
>> >
>> > +/*
>> > + * Signal VM initialization is complete.
>> > + */
>> > +void
>> > +signalVMInitComplete(void)
>> > +{
>> > +    /*
>> > +     * VM Initialization is complete
>> > +     */
>> > +    LOG_MISC(("signal VM initialization complete"));
>> > +    debugMonitorEnter(initMonitor);
>> > +    VMInitComplete = JNI_TRUE;
>> > +    debugMonitorNotifyAll(initMonitor);
>> > +    debugMonitorExit(initMonitor);
>> > +}
>> > +
>> > +/*
>> > + * Wait for VM initialization to complete.
>> > + */
>> > +void
>> > +debugInit_waitVMInitComplete(void)
>> > +{
>> > +    debugMonitorEnter(initMonitor);
>> > +    while (!VMInitComplete) {
>> > +        debugMonitorWait(initMonitor);
>> > +    }
>> > +    debugMonitorExit(initMonitor);
>> > +}
>> > +
>> >  /* All process exit() calls come from here */
>> >  void
>> >  forceExit(int exit_code)
>> > @@ -672,6 +702,7 @@
>> >      LOG_MISC(("Begin initialize()"));
>> >      currentSessionID = 0;
>> >      initComplete = JNI_FALSE;
>> > +    VMInitComplete = JNI_FALSE;
>> >
>> >      if ( gdata->vmDead ) {
>> >          EXIT_ERROR(AGENT_ERROR_INTERNAL,"VM dead at initialize() time");
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -39,4 +39,7 @@
>> >  void debugInit_exit(jvmtiError, const char *);
>> >  void forceExit(int);
>> >
>> > +void debugInit_waitVMInitComplete(void);
>> > +void signalVMInitComplete(void);
>> > +
>> >  #endif
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c b/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -98,6 +98,7 @@
>> >      standardHandlers_onConnect();
>> >      threadControl_onConnect();
>> >
>> > +    debugInit_waitVMInitComplete();
>> >      /* Okay, start reading cmds! */
>> >      while (shouldListen) {
>> >          if (!dequeue(&p)) {
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -580,6 +580,7 @@
>> >          (void)threadControl_suspendThread(command->thread, JNI_FALSE);
>> >      }
>> >
>> > +    signalVMInitComplete();
>> >      outStream_initCommand(&out, uniqueID(), 0x0,
>> >                            JDWP_COMMAND_SET(Event),
>> >                            JDWP_COMMAND(Event, Composite));
>> >
>> >
>> >
>> > Andrew Leonard
>> > Java Runtimes Development
>> > IBM Hursley
>> > IBM United Kingdom Ltd
>> > Phone internal: 245913, external: 01962 815913
>> > internet email: andrew_m_leonard at uk.ibm.com
>> >
>> >
>> > Unless stated otherwise above:
>> > IBM United Kingdom Limited - Registered in England and Wales with
>> > number 741598.
>> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
>> > PO6 3AU
>