RFR: 8201409: JDWP debugger initialization hangs intermittently

Daniel D. Daugherty daniel.daugherty at oracle.com
Sun Apr 15 17:01:24 UTC 2018


On 4/13/18 3:07 PM, serguei.spitsyn at oracle.com wrote:
> Andrew and reviewers,
>
> I'm re-sending this RFR with a corrected subject that includes the bug 
> number.
>
> The issue is:
> https://bugs.openjdk.java.net/browse/JDK-8201409
>
> Webrev:
> http://cr.openjdk.java.net/~sspitsyn/webrevs/2018/8201409-jdwp-initsync.ibm.1/

src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
     No comments.

src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
     No comments.

src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
     So debugLoop_run() now pauses before the loop
     that reads cmds. Looks good.
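
As a rough illustration of why the ordering matters (a toy model, not the
libjdwp code): assume suspension is tracked with a per-thread suspend count
and that resuming a thread whose count is zero has no effect. Under that
assumption, a resume that arrives before the VM_INIT suspend is lost, and
the later suspend leaves the thread stopped with nothing left to resume it.
The names toy_thread, toy_suspend, toy_resume and left_suspended are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a debuggee thread: only a suspend count. */
typedef struct {
    int suspend_count;
} toy_thread;

/* Suspend always increments the count. */
static void toy_suspend(toy_thread *t)
{
    assert(t != NULL);
    t->suspend_count++;
}

/* Resume decrements the count, but is a no-op when nothing is suspended. */
static void toy_resume(toy_thread *t)
{
    assert(t != NULL);
    if (t->suspend_count > 0) {
        t->suspend_count--;
    }
}

/*
 * Replays one suspend and one resume in the given order and reports
 * whether the thread is still suspended afterwards (1) or running (0).
 */
static int left_suspended(int resume_first)
{
    toy_thread t = { 0 };

    if (resume_first) {
        toy_resume(&t);   /* arrives too early: count is 0, so it is lost */
        toy_suspend(&t);  /* the VM_INIT handler then suspends the thread */
    } else {
        toy_suspend(&t);  /* expected order: suspend first ... */
        toy_resume(&t);   /* ... then the debugger's resume releases it */
    }
    return t.suspend_count > 0;
}
```

In this model left_suspended(0) returns 0 (the expected order) and
left_suspended(1) returns 1 (the early resume leaves the thread suspended).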

src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
     So the VM_INIT event handler now signals that the
     VM_INIT event has been received, which allows
     debugLoop_run() to proceed.
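
The signal/wait pairing described above can be sketched outside the agent
with POSIX threads. This is only an analogy under stated assumptions: a
pthread mutex and condition variable stand in for the JVM/TI raw monitor
(initMonitor), and init_lock, init_cond, vm_init_complete, listener_thread
and run_handshake_demo are hypothetical names, not part of libjdwp.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for initMonitor and VMInitComplete. */
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t init_cond = PTHREAD_COND_INITIALIZER;
static bool vm_init_complete = false;

/* Event helper side: publish the flag and wake every waiter. */
static void signal_vm_init_complete(void)
{
    pthread_mutex_lock(&init_lock);
    vm_init_complete = true;
    pthread_cond_broadcast(&init_cond);
    pthread_mutex_unlock(&init_lock);
}

/*
 * Listener side: block until the flag is set. The loop re-checks the
 * flag so a spurious wakeup, or a signal sent before the wait started,
 * is handled correctly.
 */
static void wait_vm_init_complete(void)
{
    pthread_mutex_lock(&init_lock);
    while (!vm_init_complete) {
        pthread_cond_wait(&init_cond, &init_lock);
    }
    pthread_mutex_unlock(&init_lock);
}

/* Models the debug loop: wait first, then record what it observed. */
static void *listener_thread(void *arg)
{
    int *saw_init = arg;

    wait_vm_init_complete();
    pthread_mutex_lock(&init_lock);
    *saw_init = vm_init_complete ? 1 : 0;
    pthread_mutex_unlock(&init_lock);
    return NULL;
}

/* Returns 1 if the listener proceeded only after init was signalled. */
static int run_handshake_demo(void)
{
    pthread_t listener;
    int saw_init = 0;
    int rc = pthread_create(&listener, NULL, listener_thread, &saw_init);

    assert(rc == 0);
    signal_vm_init_complete();
    pthread_join(listener, NULL);
    return saw_init;
}
```

Because the flag is checked under the lock before waiting, the listener
cannot miss the notification regardless of which thread runs first.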

Serguei, this fix needs to have most of the Serviceability
stack of tests run against it (JDWP, JVM/TI, JDI and jdb tests).
Based on the email thread, I can't tell which tests have been
run with the fix in place.

Dan

>
>
> The fix looks good to me.
> Also, I've agreed to skip a unit test as creating it for this issue is 
> not easy.
>
> At least one more review is needed before the fix can be pushed.
>
> Thanks,
> Serguei
>
>
> On 4/11/18 06:33, Andrew Leonard wrote:
>> Hi Serguei,
>> Thank you for raising the bug.
>> I had a chat with one of my colleagues who could recreate it, and 
>> it's probably related to the handshaking that is done in the 
>> particular scenario. So with the JCK harness:
>>
>> com.sun.jck.lib.ExecJCKTestOtherJVMCmd
>> LD_LIBRARY_PATH=/javatest/lib/jck/jck8b/natives/linux_x86-64
>> /projects/jck/jdwp/j2sdk-image/bin/java -Xdump:system:none
>> -Xdump:system:events=gpf+abort+traceassert+corruptcache
>> -Xdump:snap:none
>> -Xdump:snap:events=gpf+abort+traceassert+corruptcache
>> -Xdump:java:none
>> -Xdump:java:events=gpf+abort+traceassert+corruptcache
>> -Xdump:heap:none
>> -Xdump:heap:events=gpf+abort+traceassert+corruptcache
>> -Xfuture
>> -agentlib:jdwp=server=y,transport=dt_socket,address=localhost:35000,suspend=y
>> -classpath /javatest/lib/jck/JCK8b-b03/JCK-runtime-8b/classes
>> -Djava.security.policy=/javatest/lib/jck/JCK8b-b03/JCK-runtime-8b/lib/jck.policy
>> javasoft.sqe.jck.lib.jpda.jdwp.DebuggeeLoader -waittime=600
>> -msgSwitch=ub1604x64vm10:38636
>> -componentName=*ArrayReference.GetValues.getvalues002*
>>
>> Note that the JCK test harness starts the target process, attaches to
>> it, and sends the resume command in a very short time with no
>> handshaking.
>>
>> That may not help, but hopefully it explains things a bit. It's the
>> timing of the resume command during the test that is crucial:
>> resuming before the VM initialization is complete will trigger it.
>>
>> Thanks
>> Andrew
>>
>> Andrew Leonard
>> Java Runtimes Development
>> IBM Hursley
>> IBM United Kingdom Ltd
>> Phone internal: 245913, external: 01962 815913
>> internet email: andrew_m_leonard at uk.ibm.com
>>
>>
>>
>>
>> From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>> To: Andrew Leonard <andrew_m_leonard at uk.ibm.com>
>> Cc: serviceability-dev at openjdk.java.net
>> Date: 11/04/2018 09:57
>> Subject: Re: RFR: Fix race condition in jdwp
>> ------------------------------------------------------------------------
>>
>>
>>
>> Hi Andrew,
>>
>> I've filed the bug:
>> https://bugs.openjdk.java.net/browse/JDK-8201409
>>
>> Also, this is a webrev with your patch:
>> http://cr.openjdk.java.net/~sspitsyn/webrevs/2018/8201409-jdwp-initsync.ibm.1/
>>
>> I agree that creating a standalone test is tricky here.
>>
>> I added usleep(10000) to eventHelper_reportVMInit()
>> and ran the JTreg com/sun/jdi tests with my JDK build.
>> However, none of the tests failed with the failure mode you described,
>> so I'm a little puzzled.
>> I suspect that some specific debugLoop commands were used in your 
>> scenario.
>>
>> It is still possible that I've missed something here.
>> Will try to double check everything.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 4/11/18 01:29, Andrew Leonard wrote:
>> Thanks Serguei,
>> In terms of a standalone testcase, it is quite tricky: the issue,
>> which took a lot of investigation to solve, is very timing dependent
>> and will only occur randomly. It can be forced, as I indicated below,
>> by adding a "sleep" in the VMInit report code, but that's not a
>> testcase. The issue was originally found in our JCK testing for
>> IBMJava8, testcase test.jck8b.runtime.vm.jdwp, but again it only
>> happened intermittently. As with "performance" type issues, we're not
>> always going to be able to create a testcase that will always "fail"
>> if the fix is not present.
>> Your thoughts?
>> Cheers
>> Andrew
>>
>>
>>
>>
>> From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>> To: Andrew Leonard <andrew_m_leonard at uk.ibm.com>
>> Cc: serviceability-dev at openjdk.java.net
>> Date: 11/04/2018 01:02
>> Subject: Re: RFR: Fix race condition in jdwp
>> ------------------------------------------------------------------------
>>
>>
>>
>> Hi Andrew,
>>
>> Okay, I'll file a bug on this topic.
>> But do you have a standalone test demonstrating this issue?
>>
>> Thanks,
>> Serguei
>>
>>
>> On 4/10/18 06:23, Andrew Leonard wrote:
>> Hi Serguei,
>> I don't have access to the bug database to raise one. Are you able
>> to, please?
>>
>> Summary: JDWP debugger initialization hangs intermittently
>> Description: If, during JDWP setup, the VM initialization takes
>> slightly longer than the main debug initialization thread, a "hang"
>> can occur. This has been seen in testcase test.jck8b.runtime.vm.jdwp
>> and can also be recreated easily by adding a 10 second sleep to the
>> beginning of the function eventHelper_reportVMInit() in
>> src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c.
>> First seen: JDK8
>> Recreated: JDK11
>>
>> Thanks
>> Andrew
>>
>>
>>
>>
>> From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
>> To: Andrew Leonard <andrew_m_leonard at uk.ibm.com>,
>> serviceability-dev at openjdk.java.net
>> Date: 09/04/2018 23:03
>> Subject: Re: RFR: Fix race condition in jdwp
>> ------------------------------------------------------------------------
>>
>>
>>
>> Hi Andrew,
>>
>> The patch itself looks reasonable.
>> However, in order to proceed with it, a bug report with a standalone
>> test case demonstrating the issue is needed.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 4/9/18 09:07, Andrew Leonard wrote:
>> > Hi,
>> > We discovered in our testing with OpenJ9 that a race condition can
>> > occur in the jdwp under certain circumstances, and we were able to
>> > force the same issue with Hotspot. Normally, the event helper thread
>> > suspends all threads, then the debug loop in the listener thread
>> > receives a command to resume. The debugger may deadlock if the debug
>> > loop in the listener thread starts processing commands (e.g. resume
>> > threads) before the event helper completes the initialization (and
>> > suspends threads).
>> >
>> > This patch adds synchronization to ensure the event helper completes
>> > the initialization sequence before debugger commands are processed.
>> >
>> > Please can I find a sponsor for this contribution? Patch below..
>> >
>> > Many thanks
>> >
>> > Andrew
>> >
>> >
>> >
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.c
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -58,6 +58,7 @@
>> >  static jboolean vmInitialized;
>> >  static jrawMonitorID initMonitor;
>> >  static jboolean initComplete;
>> > +static jboolean VMInitComplete;
>> >  static jbyte currentSessionID;
>> >
>> >  /*
>> > @@ -617,6 +618,35 @@
>> >  debugMonitorExit(initMonitor);
>> >  }
>> >
>> > +/*
>> > + * Signal VM initialization is complete.
>> > + */
>> > +void
>> > +signalVMInitComplete(void)
>> > +{
>> > +    /*
>> > +     * VM Initialization is complete
>> > +     */
>> > +    LOG_MISC(("signal VM initialization complete"));
>> > +    debugMonitorEnter(initMonitor);
>> > +    VMInitComplete = JNI_TRUE;
>> > +    debugMonitorNotifyAll(initMonitor);
>> > +    debugMonitorExit(initMonitor);
>> > +}
>> > +
>> > +/*
>> > + * Wait for VM initialization to complete.
>> > + */
>> > +void
>> > +debugInit_waitVMInitComplete(void)
>> > +{
>> > +    debugMonitorEnter(initMonitor);
>> > +    while (!VMInitComplete) {
>> > +        debugMonitorWait(initMonitor);
>> > +    }
>> > +    debugMonitorExit(initMonitor);
>> > +}
>> > +
>> >  /* All process exit() calls come from here */
>> >  void
>> >  forceExit(int exit_code)
>> > @@ -672,6 +702,7 @@
>> >  LOG_MISC(("Begin initialize()"));
>> >  currentSessionID = 0;
>> >  initComplete = JNI_FALSE;
>> > +    VMInitComplete = JNI_FALSE;
>> >
>> >  if ( gdata->vmDead ) {
>> >      EXIT_ERROR(AGENT_ERROR_INTERNAL,"VM dead at initialize() time");
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/debugInit.h
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2015, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -39,4 +39,7 @@
>> >  void debugInit_exit(jvmtiError, const char *);
>> >  void forceExit(int);
>> >
>> > +void debugInit_waitVMInitComplete(void);
>> > +void signalVMInitComplete(void);
>> > +
>> >  #endif
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c b/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/debugLoop.c
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -98,6 +98,7 @@
>> >  standardHandlers_onConnect();
>> >  threadControl_onConnect();
>> >
>> > +    debugInit_waitVMInitComplete();
>> >  /* Okay, start reading cmds! */
>> >  while (shouldListen) {
>> >      if (!dequeue(&p)) {
>> > diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>> > --- a/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>> > +++ b/src/jdk.jdwp.agent/share/native/libjdwp/eventHelper.c
>> > @@ -1,5 +1,5 @@
>> >  /*
>> > - * Copyright (c) 1998, 2017, Oracle and/or its affiliates. All rights reserved.
>> > + * Copyright (c) 1998, 2018, Oracle and/or its affiliates. All rights reserved.
>> >   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>> >   *
>> >   * This code is free software; you can redistribute it and/or modify it
>> > @@ -580,6 +580,7 @@
>> >  (void)threadControl_suspendThread(command->thread, JNI_FALSE);
>> >  }
>> >
>> > +    signalVMInitComplete();
>> >  outStream_initCommand(&out, uniqueID(), 0x0,
>> >  JDWP_COMMAND_SET(Event),
>> >  JDWP_COMMAND(Event, Composite));
>> >
>> >
>> >
>> >
>> >
>> > Unless stated otherwise above:
>> > IBM United Kingdom Limited - Registered in England and Wales with
>> > number 741598.
>> > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
>> > PO6 3AU
>>
>
