From david.holmes at oracle.com Sun Jul 1 12:19:50 2018
From: david.holmes at oracle.com (David Holmes)
Date: Sun, 1 Jul 2018 22:19:50 +1000
Subject: [11] RFR: 8205653: test/jdk/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java and RmiSslBootstrapTest.sh fail with handshake_failure
In-Reply-To: <687b7dd7-5be5-cfdb-c411-2cf4a7008b12@oracle.com>
References: <3e4af336-6863-4145-9dce-60b08ea64a79@default>
 <74fabdff-3523-49cc-5ac2-4b766c8bbb30@oracle.com>
 <5764f7f2-f0a7-4bf1-44dc-977953cc6cab@oracle.com>
 <06b1c50c-401c-4c50-9ddf-7876c0638e63@default>
 <687b7dd7-5be5-cfdb-c411-2cf4a7008b12@oracle.com>
Message-ID: <84c24bf6-05d6-3120-6f23-2483f51e3175@oracle.com>

On 29/06/2018 6:32 PM, Alan Bateman wrote:
> On 29/06/2018 09:22, Sibabrata Sahoo wrote:
>> May I get the approval from serviceability-dev at openjdk.java.net.
>>
> This is a test-only change to update the keystores and the list of
> ciphers/protocols that the test uses. There's nothing serviceability
> specific here so having a Reviewer from the security area should be okay
> in the event that you don't get a quick review on serviceability-dev list.

+1

I certainly can't comment on any of this keystore stuff.

David

> -Alan

From rafael.wth at gmail.com Mon Jul 2 08:41:38 2018
From: rafael.wth at gmail.com (Rafael Winterhalter)
Date: Mon, 2 Jul 2018 10:41:38 +0200
Subject: Review Request JDK-8200559: Java agents doing instrumentation need a means to define auxiliary classes
In-Reply-To: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
Message-ID:

Hi,

I was wondering if a solution for this problem is still planned for JDK 11,
given that ramp-down is beginning.

With sun.misc.Unsafe::defineClass removed, Java agents only have the
option to use jdk.internal.misc.Unsafe::defineClass for the use-cases
that I described.

I think it would be a missed opportunity not to offer an alternative as of
JDK 11, as a second migration would make it even less likely that agents
would avoid unsafe API.

Thanks for the information,
best regards, Rafael

mandy chung wrote on Sun, 15 Apr 2018, 08:23:

> Background:
>
> Java agents support both load time and dynamic instrumentation. At load
> time, the agent's ClassFileTransformer is invoked to transform class bytes.
> There are no Class objects at this time. Dynamic instrumentation is when
> redefineClasses or retransformClasses is used to redefine an existing
> loaded class. The ClassFileTransformer is invoked with class bytes where
> the Class object is present.
>
> Java agent doing instrumentation needs a means to define auxiliary classes
> that are visible and accessible to the instrumented class. Existing agents
> have been using sun.misc.Unsafe::defineClass to define aux classes directly
> or accessing protected ClassLoader::defineClass method with setAccessible
> to suppress the language access check (see [1] where this issue was
> brought up).
>
> Instrumentation::appendToBootstrapClassLoaderSearch and
> appendToSystemClassLoaderSearch
> APIs are existing means to supply additional classes. It's too limited,
> for example it can't inject a class in the same runtime package as the
> class being transformed.
>
> Proposal:
>
> This proposes to add a new ClassFileTransformer.transform method taking
> additional ClassDefiner parameter. A transformer can define additional
> classes during the transformation process, i.e.
> when ClassFileTransformer::transform is invoked. Some details:
>
> 1. ClassDefiner::defineClass defines a class in the same runtime package
>    as the class being transformed.
> 2. The class is defined in the same thread as the transformers are being
>    invoked. ClassDefiner::defineClass returns Class object directly
>    before the transformed class is defined.
> 3. No transformation is applied to classes defined by
>    ClassDefiner::defineClass.
>
> The first prototype we did is to collect the auxiliary classes and define
> them until all transformers are invoked and have these aux classes to go
> through the transformation pipeline. Several complicated issues would
> need to be resolved, for example timing: whether the auxiliary classes
> should be defined before the transformed class (otherwise a potential race
> where some other thread references the transformed class and causes the
> code to execute that in turn references the auxiliary classes). The current
> implementation has a native reentrancy check that ensures one class is
> being transformed to avoid potential circularity issues. This may need
> JVM TI support to be reliable.
>
> This proposal would allow java agents to migrate from internal API and
> ClassDefiner to be enhanced in the future.
>
> Webrev:
> http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8200559/webrev.00/
>
> Mandy
> [1] http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html
>

From christoph.langer at sap.com Mon Jul 2 09:03:58 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Mon, 2 Jul 2018 09:03:58 +0000
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
Message-ID: <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>

Hi Matthias,

forwarding to serviceability-dev, because debugging is usually discussed there.

Yes, I would think this coding should be fixed, too. Can you open a bug and
prepare a change?

Thanks
Christoph

> -----Original Message-----
> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
> Norman Maurer
> Sent: Monday, 2 July 2018 10:23
> To: Baesken, Matthias
> Cc: Stuefe, Thomas ; net-dev at openjdk.java.net
> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>
> +1 retry a close on EINTR has most likely not the outcome you expect and
> may even close a wrong FD if the same FD is reused already (as even if EINTR
> is returned it may have closed the FD)
>
> > On 02.07.2018 at 10:17, Baesken, Matthias wrote:
> >
> > Hello, there is a similar pattern (attempt to restart close in case of
> > EINTR) in the coding as well in socket_md.c:
> >
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-147-    int rv;
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-148-    do {
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-149-        rv = close(fd);
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c:150:    } while (rv == -1 && errno == EINTR);
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-151-
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-152-    return rv;
> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-153-}
> >
> > Do you think this needs adjustment (on LINUX) as well?
> >
> > Best regards, Matthias
> >
> >
> >> Message: 2
> >> Date: Thu, 28 Jun 2018 18:19:46 +0100
> >> From: Alan Bateman
> >> To: David Lloyd , ivan.gerasimov at oracle.com
> >> Cc: OpenJDK Network Dev list
> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
> >> Content-Type: text/plain; charset=utf-8; format=flowed
> >>
> >>> On 28/06/2018 17:35, David Lloyd wrote:
> >>> :
> >>> Do you (or Alan) think that this might have accounted for real-world
> >>> connection problems?
> >>>
> >> In the file I/O area, with NFS I think, we had an issue a long time ago
> >> where close was retried after EIO. That issue was fixed a long time ago
> >> but it's one that comes to mind in this general area.
> >>
> >> -Alan
> >>
> >

From Alan.Bateman at oracle.com Mon Jul 2 09:55:45 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 2 Jul 2018 10:55:45 +0100
Subject: Review Request JDK-8200559: Java agents doing instrumentation need a means to define auxiliary classes
In-Reply-To:
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
Message-ID: <813342b9-d670-e60f-4cc0-e2b5d0542b5f@oracle.com>

On 02/07/2018 09:41, Rafael Winterhalter wrote:
> Hi,
>
> I was wondering if a solution for this problem is still planned for
> JDK 11, given that ramp-down is beginning.
>
> With sun.misc.Unsafe::defineClass removed, Java agents only have the
> option to use jdk.internal.misc.Unsafe::defineClass for the use-cases
> that I described.
>
> I think it would be a missed opportunity not to offer an alternative
> as of JDK 11, as a second migration would make it even less likely that
> agents would avoid unsafe API.
>
Mandy's proposal to allow agents doing instrumentation to define auxiliary
classes in the same runtime package as the class being loaded or redefined
is a good proposal that makes complete sense and fits with the intended
use of this API. Unfortunately it didn't make JDK 11.

I read the mails and arguments for an Instrumentation.defineClass but I
don't think it's the right API to add. The Instrumentation API was
designed for tool agents, not libraries, and a lot of the discussion seems
to be trying to use the API for cases that it was never intended for. Also,
an unrestricted defineClass creates an attractive nuisance that would
likely create a lot of problems further down the road. I think it would be
better to focus on some of the use-cases to see if we can identify cases
where a standard API makes sense.

-Alan

From thomas.stuefe at gmail.com Mon Jul 2 10:08:15 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 2 Jul 2018 12:08:15 +0200
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
 <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
Message-ID:

+1. Please fix this for Linux! Thanks.

On Mon, Jul 2, 2018 at 11:03 AM, Langer, Christoph wrote:
> Hi Matthias,
>
> forwarding to serviceability-dev, because debugging is usually discussed there.
>
> Yes, I would think this coding should be fixed, too. Can you open a bug and
> prepare a change?
>
> Thanks
> Christoph
>
>> -----Original Message-----
>> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
>> Norman Maurer
>> Sent: Monday, 2 July 2018 10:23
>> To: Baesken, Matthias
>> Cc: Stuefe, Thomas ; net-dev at openjdk.java.net
>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>>
>> +1 retry a close on EINTR has most likely not the outcome you expect and
>> may even close a wrong FD if the same FD is reused already (as even if EINTR
>> is returned it may have closed the FD)
>>
>> > On 02.07.2018 at 10:17, Baesken, Matthias wrote:
>> >
>> > Hello, there is a similar pattern (attempt to restart close in case of
>> > EINTR) in the coding as well in socket_md.c:
>> >
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-147-    int rv;
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-148-    do {
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-149-        rv = close(fd);
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c:150:    } while (rv == -1 && errno == EINTR);
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-151-
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-152-    return rv;
>> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-153-}
>> >
>> > Do you think this needs adjustment (on LINUX) as well?
>> >
>> > Best regards, Matthias
>> >
>> >
>> >> Message: 2
>> >> Date: Thu, 28 Jun 2018 18:19:46 +0100
>> >> From: Alan Bateman
>> >> To: David Lloyd , ivan.gerasimov at oracle.com
>> >> Cc: OpenJDK Network Dev list
>> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>> >> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
>> >> Content-Type: text/plain; charset=utf-8; format=flowed
>> >>
>> >>> On 28/06/2018 17:35, David Lloyd wrote:
>> >>> :
>> >>> Do you (or Alan) think that this might have accounted for real-world
>> >>> connection problems?
>> >>>
>> >> In the file I/O area, with NFS I think, we had an issue a long time ago
>> >> where close was retried after EIO. That issue was fixed a long time ago
>> >> but it's one that comes to mind in this general area.
>> >>
>> >> -Alan
>> >>
>> >

From david.holmes at oracle.com Mon Jul 2 12:03:55 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 2 Jul 2018 22:03:55 +1000
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
 <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
Message-ID: <5a2ab4d2-ed90-20fd-6340-e73153d4313d@oracle.com>

In reference to 8205959, where is it stated that dup2 is any more
restartable than close ??

AFAICS both leave things undefined/unspecified if they set EINTR.

David

On 2/07/2018 7:03 PM, Langer, Christoph wrote:
> Hi Matthias,
>
> forwarding to serviceability-dev, because debugging is usually discussed there.
>
> Yes, I would think this coding should be fixed, too. Can you open a bug and
> prepare a change?
>
> Thanks
> Christoph
>
>> -----Original Message-----
>> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
>> Norman Maurer
>> Sent: Monday, 2 July 2018 10:23
>> To: Baesken, Matthias
>> Cc: Stuefe, Thomas ; net-dev at openjdk.java.net
>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>>
>> +1 retry a close on EINTR has most likely not the outcome you expect and
>> may even close a wrong FD if the same FD is reused already (as even if EINTR
>> is returned it may have closed the FD)
>>
>>> On 02.07.2018 at 10:17, Baesken, Matthias wrote:
>>>
>>> Hello, there is a similar pattern (attempt to restart close in case of
>>> EINTR) in the coding as well in socket_md.c:
>>>
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-147-    int rv;
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-148-    do {
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-149-        rv = close(fd);
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c:150:    } while (rv == -1 && errno == EINTR);
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-151-
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-152-    return rv;
>>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-153-}
>>>
>>> Do you think this needs adjustment (on LINUX) as well?
>>>
>>> Best regards, Matthias
>>>
>>>
>>>> Message: 2
>>>> Date: Thu, 28 Jun 2018 18:19:46 +0100
>>>> From: Alan Bateman
>>>> To: David Lloyd , ivan.gerasimov at oracle.com
>>>> Cc: OpenJDK Network Dev list
>>>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>>>> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
>>>> Content-Type: text/plain; charset=utf-8; format=flowed
>>>>
>>>>> On 28/06/2018 17:35, David Lloyd wrote:
>>>>> :
>>>>> Do you (or Alan) think that this might have accounted for real-world
>>>>> connection problems?
>>>>>
>>>> In the file I/O area, with NFS I think, we had an issue a long time ago
>>>> where close was retried after EIO. That issue was fixed a long time ago
>>>> but it's one that comes to mind in this general area.
>>>>
>>>> -Alan
>>>>
>>>

From matthias.baesken at sap.com Mon Jul 2 13:44:22 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Mon, 2 Jul 2018 13:44:22 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
Message-ID: <228035d2f64c494eaefe31b07ac72083@sap.com>

I created a bug and a webrev, please review.

https://bugs.openjdk.java.net/browse/JDK-8206145

http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/

(The other bug where a similar issue was addressed is
https://bugs.openjdk.java.net/browse/JDK-8205959)

Best regards, Matthias

> -----Original Message-----
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Monday, 2 July 2018 12:08
> To: Baesken, Matthias ; Langer, Christoph
> Cc: serviceability-dev (serviceability-dev at openjdk.java.net);
> Stuefe, Thomas ; net-dev at openjdk.java.net
> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>
> +1. Please fix this for Linux! Thanks.
>
> On Mon, Jul 2, 2018 at 11:03 AM, Langer, Christoph wrote:
> > Hi Matthias,
> >
> > forwarding to serviceability-dev, because debugging is usually discussed there.
> >
> > Yes, I would think this coding should be fixed, too. Can you open a bug and
> > prepare a change?
> >
> > Thanks
> > Christoph
> >
> >> -----Original Message-----
> >> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
> >> Norman Maurer
> >> Sent: Monday, 2 July 2018 10:23
> >> To: Baesken, Matthias
> >> Cc: Stuefe, Thomas ; net-dev at openjdk.java.net
> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >>
> >> +1 retry a close on EINTR has most likely not the outcome you expect and
> >> may even close a wrong FD if the same FD is reused already (as even if EINTR
> >> is returned it may have closed the FD)
> >>
> >> > On 02.07.2018 at 10:17, Baesken, Matthias wrote:
> >> >
> >> > Hello, there is a similar pattern (attempt to restart close in case of
> >> > EINTR) in the coding as well in socket_md.c:
> >> >
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-147-    int rv;
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-148-    do {
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-149-        rv = close(fd);
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c:150:    } while (rv == -1 && errno == EINTR);
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-151-
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-152-    return rv;
> >> > src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-153-}
> >> >
> >> > Do you think this needs adjustment (on LINUX) as well?
> >> >
> >> > Best regards, Matthias
> >> >
> >> >
> >> >> Message: 2
> >> >> Date: Thu, 28 Jun 2018 18:19:46 +0100
> >> >> From: Alan Bateman
> >> >> To: David Lloyd , ivan.gerasimov at oracle.com
> >> >> Cc: OpenJDK Network Dev list
> >> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >> >> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
> >> >> Content-Type: text/plain; charset=utf-8; format=flowed
> >> >>
> >> >>> On 28/06/2018 17:35, David Lloyd wrote:
> >> >>> :
> >> >>> Do you (or Alan) think that this might have accounted for real-world
> >> >>> connection problems?
> >> >>>
> >> >> In the file I/O area, with NFS I think, we had an issue a long time ago
> >> >> where close was retried after EIO. That issue was fixed a long time ago
> >> >> but it's one that comes to mind in this general area.
> >> >>
> >> >> -Alan
> >> >>
> >> >

From Alan.Bateman at oracle.com Mon Jul 2 14:09:02 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 2 Jul 2018 15:09:02 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <228035d2f64c494eaefe31b07ac72083@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
Message-ID:

On 02/07/2018 14:44, Baesken, Matthias wrote:
> I created a bug and a webrev, please review.
>
>
> https://bugs.openjdk.java.net/browse/JDK-8206145
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>
Why is this Linux only? I assume the do-while should be removed completely.

-Alan.

From thomas.stuefe at gmail.com Mon Jul 2 15:41:52 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 2 Jul 2018 17:41:52 +0200
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To:
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
Message-ID:

Hi Alan,

Whether to repeat close() in case of EINTR seems to differ between
platforms. POSIX leaves it open:

"If close() is interrupted by a signal that is to be caught, it shall
return -1 with errno set to [EINTR] and the state of fildes is
unspecified."

Linux recommends *not* repeating the call since the file descriptor is
closed already and repeating the close may close a reopened fd
belonging to someone else.

AIX, for instance, recommends repeating the call:

"EINTR The state of the FileDescriptor is undetermined. Retry the
close routine to ensure that the FileDescriptor is closed."

Best Regards, Thomas

On Mon, Jul 2, 2018 at 4:09 PM, Alan Bateman wrote:
> On 02/07/2018 14:44, Baesken, Matthias wrote:
>>
>> I created a bug and a webrev, please review.
>>
>>
>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>>
> Why is this Linux only? I assume the do-while should be removed completely.
>
> -Alan.

From Alan.Bateman at oracle.com Mon Jul 2 15:55:25 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 2 Jul 2018 16:55:25 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To:
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
Message-ID: <376d39dc-ede0-99f0-d2be-af8db618a4bd@oracle.com>

On 02/07/2018 16:41, Thomas Stüfe wrote:
> Hi Alan,
>
> Whether to repeat close() in case of EINTR seems to differ between
> platforms. POSIX leaves it open:
>
> "If close() is interrupted by a signal that is to be caught, it shall
> return -1 with errno set to [EINTR] and the state of fildes is
> unspecified."
>
> Linux recommends *not* repeating the call since the file descriptor is
> closed already and repeating the close may close a reopened fd
> belonging to someone else.
>
> AIX, for instance, recommends to repeat the call:
>
> "EINTR The state of the FileDescriptor is undetermined. Retry the
> close routine to ensure that the FileDescriptor is closed."
I think we should double check macOS and Solaris too as we've been
careful in other areas to not retry close when interrupted.

-Alan

From mandy.chung at oracle.com Mon Jul 2 17:17:31 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Mon, 2 Jul 2018 10:17:31 -0700
Subject: Review Request JDK-8200559: Java agents doing instrumentation need a means to define auxiliary classes
In-Reply-To:
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
Message-ID: <7a27c86c-6915-21e0-866f-f62c7626e330@oracle.com>

My proposal of ClassDefiner API allows the java agent to define auxiliary
classes in the same runtime package of the class being instrumented. You
raised other use cases that are not addressed by this proposal. As Alan
replied, the ability to define any arbitrary class would be an attractive
nuisance and we think Instrumentation.defineClass isn't the right API to
add.

I think the proposed ClassDefiner API is useful for the specific use case
(define auxiliary classes in the runtime package of the class being
instrumented). I held it off, so it didn't make 11.

For the other use cases, perhaps we should create JBS issues for further
investigation.

Mandy

On 7/2/18 1:41 AM, Rafael Winterhalter wrote:
> Hi,
>
> I was wondering if a solution for this problem is still planned for JDK
> 11, given that ramp-down is beginning.
>
> With sun.misc.Unsafe::defineClass removed, Java agents only have the
> option to use jdk.internal.misc.Unsafe::defineClass for the use-cases
> that I described.
>
> I think it would be a missed opportunity not to offer an alternative as
> of JDK 11, as a second migration would make it even less likely that
> agents would avoid unsafe API.
>
> Thanks for the information,
> best regards, Rafael
>
> mandy chung wrote on Sun, 15 Apr 2018, 08:23:
>
>     Background:
>
>     Java agents support both load time and dynamic instrumentation. At
>     load time, the agent's ClassFileTransformer is invoked to transform
>     class bytes. There are no Class objects at this time. Dynamic
>     instrumentation is when redefineClasses or retransformClasses is
>     used to redefine an existing loaded class. The ClassFileTransformer
>     is invoked with class bytes where the Class object is present.
>
>     Java agent doing instrumentation needs a means to define auxiliary
>     classes that are visible and accessible to the instrumented class.
>     Existing agents have been using sun.misc.Unsafe::defineClass to
>     define aux classes directly or accessing protected
>     ClassLoader::defineClass method with setAccessible to suppress the
>     language access check (see [1] where this issue was brought up).
>
>     Instrumentation::appendToBootstrapClassLoaderSearch and
>     appendToSystemClassLoaderSearch
>     APIs are existing means to supply additional classes. It's too
>     limited, for example it can't inject a class in the same runtime
>     package as the class being transformed.
>
>     Proposal:
>
>     This proposes to add a new ClassFileTransformer.transform method
>     taking additional ClassDefiner parameter. A transformer can define
>     additional classes during the transformation process, i.e.
>     when ClassFileTransformer::transform is invoked. Some details:
>
>     1. ClassDefiner::defineClass defines a class in the same runtime
>        package as the class being transformed.
>     2. The class is defined in the same thread as the transformers are
>        being invoked. ClassDefiner::defineClass returns Class object
>        directly before the transformed class is defined.
>     3. No transformation is applied to classes defined by
>        ClassDefiner::defineClass.
>
>     The first prototype we did is to collect the auxiliary classes and
>     define them until all transformers are invoked and have these aux
>     classes to go through the transformation pipeline. Several
>     complicated issues would need to be resolved, for example timing:
>     whether the auxiliary classes should be defined before the
>     transformed class (otherwise a potential race where some other
>     thread references the transformed class and causes the code to
>     execute that in turn references the auxiliary classes). The current
>     implementation has a native reentrancy check that ensures one class
>     is being transformed to avoid potential circularity issues. This
>     may need JVM TI support to be reliable.
>
>     This proposal would allow java agents to migrate from internal API
>     and ClassDefiner to be enhanced in the future.
>
>     Webrev:
>     http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8200559/webrev.00/
>
>     Mandy
>     [1] http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html
>

From david.lloyd at redhat.com Mon Jul 2 13:43:06 2018
From: david.lloyd at redhat.com (David Lloyd)
Date: Mon, 2 Jul 2018 08:43:06 -0500
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <5a2ab4d2-ed90-20fd-6340-e73153d4313d@oracle.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
 <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
 <5a2ab4d2-ed90-20fd-6340-e73153d4313d@oracle.com>
Message-ID:

I think because the only two possible outcomes are either that the FD
was not dup'd, in which case things carry on as before, or that it was
dup'd, in which case (at least in the JVM) re-dupping won't really do
anything harmful since the target FD already references the dead
socket FD. The POSIX manpage doesn't seem to include any other
possibilities.

On Mon, Jul 2, 2018 at 7:04 AM David Holmes wrote:
>
> In reference to 8205959, where is it stated that dup2 is any more
> restartable than close ??
>
> AFAICS both leave things undefined/unspecified if they set EINTR.
>
> David
>
> On 2/07/2018 7:03 PM, Langer, Christoph wrote:
> > Hi Matthias,
> >
> > forwarding to serviceability-dev, because debugging is usually discussed there.
> >
> > Yes, I would think this coding should be fixed, too. Can you open a bug and
> > prepare a change?
> >
> > Thanks
> > Christoph
> >
> >> -----Original Message-----
> >> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
> >> Norman Maurer
> >> Sent: Monday, 2 July 2018 10:23
> >> To: Baesken, Matthias
> >> Cc: Stuefe, Thomas ; net-dev at openjdk.java.net
> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >>
> >> +1 retry a close on EINTR has most likely not the outcome you expect and
> >> may even close a wrong FD if the same FD is reused already (as even if EINTR
> >> is returned it may have closed the FD)
> >>
> >>> On 02.07.2018 at 10:17, Baesken, Matthias wrote:
> >>>
> >>> Hello, there is a similar pattern (attempt to restart close in case of
> >>> EINTR) in the coding as well in socket_md.c:
> >>>
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-147-    int rv;
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-148-    do {
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-149-        rv = close(fd);
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c:150:    } while (rv == -1 && errno == EINTR);
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-151-
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-152-    return rv;
> >>> src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c-153-}
> >>>
> >>> Do you think this needs adjustment (on LINUX) as well?
> >>>
> >>> Best regards, Matthias
> >>>
> >>>
> >>>> Message: 2
> >>>> Date: Thu, 28 Jun 2018 18:19:46 +0100
> >>>> From: Alan Bateman
> >>>> To: David Lloyd , ivan.gerasimov at oracle.com
> >>>> Cc: OpenJDK Network Dev list
> >>>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >>>> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
> >>>> Content-Type: text/plain; charset=utf-8; format=flowed
> >>>>
> >>>>> On 28/06/2018 17:35, David Lloyd wrote:
> >>>>> :
> >>>>> Do you (or Alan) think that this might have accounted for real-world
> >>>>> connection problems?
> >>>>>
> >>>> In the file I/O area, with NFS I think, we had an issue a long time ago
> >>>> where close was retried after EIO. That issue was fixed a long time ago
> >>>> but it's one that comes to mind in this general area.
> >>>>
> >>>> -Alan
> >>>>
> >>>

--
- DML

From david.holmes at oracle.com Tue Jul 3 04:28:43 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Jul 2018 14:28:43 +1000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To:
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
Message-ID:

On 3/07/2018 1:41 AM, Thomas Stüfe wrote:
> Hi Alan,
>
> Whether to repeat close() in case of EINTR seems to differ between
> platforms. POSIX leaves it open:
>
> "If close() is interrupted by a signal that is to be caught, it shall
> return -1 with errno set to [EINTR] and the state of fildes is
> unspecified."
>
> Linux recommends *not* repeating the call since the file descriptor is
> closed already and repeating the close may close a reopened fd
> belonging to someone else.
>
> AIX, for instance, recommends to repeat the call:
>
> "EINTR The state of the FileDescriptor is undetermined. Retry the
> close routine to ensure that the FileDescriptor is closed."

As does HP-UX according to:

http://man7.org/linux/man-pages/man2/close.2.html

Solaris leaves things unspecified as per POSIX.

David

> Best Regards, Thomas
>
>
> On Mon, Jul 2, 2018 at 4:09 PM, Alan Bateman wrote:
>> On 02/07/2018 14:44, Baesken, Matthias wrote:
>>>
>>> I created a bug and a webrev, please review.
>>>
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>>>
>> Why is this Linux only? I assume the do-while should be removed completely.
>>
>> -Alan.

From Alan.Bateman at oracle.com Tue Jul 3 07:35:29 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 3 Jul 2018 08:35:29 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To:
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
Message-ID: <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>

On 03/07/2018 05:28, David Holmes wrote:
>
>
> Solaris leaves things unspecified as per POSIX.
We've had problems on Solaris in other areas on exactly this topic so
they have been fixed to not retry. I think we should do the same here so
that we are at least consistent.

-Alan

From matthias.baesken at sap.com Tue Jul 3 07:47:17 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 3 Jul 2018 07:47:17 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
Message-ID: <832f9ddb14f4415b98adafa004ff196f@sap.com>

Hello, so should I change my webrev for 8206145 to

- retry on AIX
- not retry on Linux + Solaris ?

Any remarks on Mac / BSD ?

Thanks, Matthias > -----Original Message----- > From: Alan Bateman [mailto:Alan.Bateman at oracle.com] > Sent: Dienstag, 3. Juli 2018 09:35 > To: David Holmes ; Thomas St?fe > > Cc: serviceability-dev (serviceability-dev at openjdk.java.net) dev at openjdk.java.net>; Baesken, Matthias > Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is > EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR > > On 03/07/2018 05:28, David Holmes wrote: > > > > > > Solaris leaves things unspecified as per POSIX. > We've had problems on Solaris in other areas on exactly this topic so > they have been fixed to not retry. I think we should do the same here so > that we are at least consistent. > > -Alan From thomas.stuefe at gmail.com Tue Jul 3 08:20:49 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 3 Jul 2018 10:20:49 +0200 Subject: RFR(xxs): 8206243: java -XshowSettings fails if memory.limit_in_bytes overflows LONG.max Message-ID: Hi all, may I please have reviews for this small fix. https://bugs.openjdk.java.net/browse/JDK-8206243 http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/ On some Linux kernels, the unlimited value of memory.limit_in_bytes is returned as ULONG_MAX, not LONG_MAX. - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes 18446744073709551615 In those cases, java -XshowSettings will fail: java -XshowSettings .... 
Operating System Metrics: Provider: cgroupv1 Effective CPU Count: 8 CPU Period: 100000us CPU Quota: -1 CPU Shares: -1 List of Processors, 8 total: 0 1 2 3 4 5 6 7 List of Effective Processors, 0 total: List of Memory Nodes, 1 total: 0 List of Available Memory Nodes, 0 total: CPUSet Memory Pressure Enabled: false Exception in thread "main" java.lang.NumberFormatException: For input string: "18446744073709551615" at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.base/java.lang.Long.parseLong(Long.java:692) at java.base/java.lang.Long.parseLong(Long.java:817) at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106) at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374) at java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385) Thank you, Thomas From david.holmes at oracle.com Tue Jul 3 08:37:46 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 3 Jul 2018 18:37:46 +1000 Subject: RFR(xxs): 8206243: java -XshowSettings fails if memory.limit_in_bytes overflows LONG.max In-Reply-To: References: Message-ID: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com> Hi Thomas, This seems okay. Minor nit: if(bigInt Please add a space after 'if' Thanks, David On 3/07/2018 6:20 PM, Thomas St?fe wrote: > Hi all, > > may I please have reviews for this small fix. > > https://bugs.openjdk.java.net/browse/JDK-8206243 > http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/ > > > On some Linux kernels, the unlimited value of memory.limit_in_bytes is > returned as ULONG_MAX, not LONG_MAX. > > - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes > 18446744073709551615 > > In those cases, java -XshowSettings will fail: > > java -XshowSettings > .... 
> Operating System Metrics: > Provider: cgroupv1 > Effective CPU Count: 8 > CPU Period: 100000us > CPU Quota: -1 > CPU Shares: -1 > List of Processors, 8 total: > 0 1 2 3 4 5 6 7 > List of Effective Processors, 0 total: > List of Memory Nodes, 1 total: > 0 > List of Available Memory Nodes, 0 total: > CPUSet Memory Pressure Enabled: false > Exception in thread "main" java.lang.NumberFormatException: For input > string: "18446744073709551615" > at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.base/java.lang.Long.parseLong(Long.java:692) > at java.base/java.lang.Long.parseLong(Long.java:817) > at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106) > at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374) > at java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385) > > > Thank you, > > Thomas > From per.liden at oracle.com Tue Jul 3 08:47:27 2018 From: per.liden at oracle.com (Per Liden) Date: Tue, 3 Jul 2018 10:47:27 +0200 Subject: HotSpot Serviceability Agent (SA) Survey In-Reply-To: References: Message-ID: Hi Stephen, On 03/21/2018 07:14 PM, Stephen Fitch wrote: > Hi, > > The HotSpot Serviceability Agent (SA) is a set of APIs and tools for > debugging HotSpot Virtual Machine and has been a part of the JVM/JDK for > a long time, however we don't have a lot of data about how it is used in > practice, especially outside of Oracle. Therefore, we have created an > initial survey to gather more information and help us evaluate and > understand how others are using it. > > If you have used, or have (support) processes that utilize the > Serviceability Agent or related APIs, then we would definitely > appreciate if you would complete this survey: > > https://www.surveymonkey.com/r/CF3MYDL > > We are specifically interested in your use-cases and how SA is effective > for you in resolving JVM issues. 
> > The survey will remain open through March 31st. The results of the > survey will be made public after the survey closes. Have the results been published yet? cheers, Per > > Regards, Stephen > > ?Java Platform Group - JVM - Sustaining Engineering From thomas.stuefe at gmail.com Tue Jul 3 09:15:56 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 3 Jul 2018 11:15:56 +0200 Subject: RFR(xxs): 8206243: java -XshowSettings fails if memory.limit_in_bytes overflows LONG.max In-Reply-To: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com> References: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com> Message-ID: Thank you David! I changed the webrev in place. Thanks, Thomas On Tue, Jul 3, 2018 at 10:37 AM, David Holmes wrote: > Hi Thomas, > > This seems okay. > > Minor nit: > > if(bigInt > > Please add a space after 'if' > > Thanks, > David > > > On 3/07/2018 6:20 PM, Thomas St?fe wrote: >> >> Hi all, >> >> may I please have reviews for this small fix. >> >> https://bugs.openjdk.java.net/browse/JDK-8206243 >> >> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/ >> >> >> On some Linux kernels, the unlimited value of memory.limit_in_bytes is >> returned as ULONG_MAX, not LONG_MAX. >> >> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes >> 18446744073709551615 >> >> In those cases, java -XshowSettings will fail: >> >> java -XshowSettings >> .... 
>> Operating System Metrics: >> Provider: cgroupv1 >> Effective CPU Count: 8 >> CPU Period: 100000us >> CPU Quota: -1 >> CPU Shares: -1 >> List of Processors, 8 total: >> 0 1 2 3 4 5 6 7 >> List of Effective Processors, 0 total: >> List of Memory Nodes, 1 total: >> 0 >> List of Available Memory Nodes, 0 total: >> CPUSet Memory Pressure Enabled: false >> Exception in thread "main" java.lang.NumberFormatException: For input >> string: "18446744073709551615" >> at >> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) >> at java.base/java.lang.Long.parseLong(Long.java:692) >> at java.base/java.lang.Long.parseLong(Long.java:817) >> at >> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106) >> at >> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374) >> at >> java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385) >> >> >> Thank you, >> >> Thomas >> > From Alan.Bateman at oracle.com Tue Jul 3 10:08:43 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 3 Jul 2018 11:08:43 +0100 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <832f9ddb14f4415b98adafa004ff196f@sap.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> Message-ID: <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> On 03/07/2018 08:47, Baesken, Matthias wrote: > Hello , so should I change my webrev for 8206145 to > - retry on AIX > - not retry on Linux + Solaris ? Yes. > Any remarks on Mac / BSD ? > I see a few issues in the FreeBSD bugzilla on this topic. I assume it would be safer to not retry if interrupted. 
-Alan

From ralf.schmelter at sap.com Tue Jul 3 10:43:46 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Tue, 3 Jul 2018 10:43:46 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
Message-ID: <709161f438f848b0af5fb079c9c0242a@sap.com>

Hi All,

Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 .
The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .

This fixes the quadratic runtime (in the number of frames) of the frames()
method, making it linear instead. It uses additional memory proportional to
the number of frames on the stack.

I've included a jtreg test, which would time out with the old implementation
(since it takes minutes to get the stack frames). I'm not sure how useful
this is.

Best regards,
Ralf Schmelter

From Stephen.Fitch at oracle.com Tue Jul 3 12:17:43 2018
From: Stephen.Fitch at oracle.com (Stephen Fitch)
Date: Tue, 3 Jul 2018 05:17:43 -0700
Subject: HotSpot Serviceability Agent (SA) Survey
In-Reply-To:
References:
Message-ID:

Hi Per,

Sadly delayed by other things; I'll put some further solid effort into a
published summary ASAP, ideally before the end of July, if not sooner.

It's not forgotten, but behind other priorities.

Regards,
Stephen

On 7/3/18 1:47 AM, Per Liden wrote:
> Hi Stephen,
>
> On 03/21/2018 07:14 PM, Stephen Fitch wrote:
>> Hi,
>>
>> The HotSpot Serviceability Agent (SA) is a set of APIs and tools for
>> debugging HotSpot Virtual Machine and has been a part of the JVM/JDK for a
>> long time, however we don't have a lot of data about how it is used in
>> practice, especially outside of Oracle. Therefore, we have created an initial
>> survey to gather more information and help us evaluate and understand how
>> others are using it.
>> >> If you have used, or have (support) processes that utilize the Serviceability >> Agent or related APIs, then we would definitely appreciate if you would >> complete this survey: >> >> https://www.surveymonkey.com/r/CF3MYDL >> >> We are specifically interested in your use-cases and how SA is effective for >> you in resolving JVM issues. >> >> The survey will remain open through March 31st. The results of the survey >> will be made public after the survey closes. > > Have the results been published yet? > > cheers, > Per > >> >> Regards, Stephen >> >> ??Java Platform Group - JVM - Sustaining Engineering From bob.vandette at oracle.com Tue Jul 3 12:59:55 2018 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 3 Jul 2018 08:59:55 -0400 Subject: RFR(xxs): 8206243: java -XshowSettings fails if memory.limit_in_bytes overflows LONG.max In-Reply-To: References: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com> Message-ID: Looks ok. Bob. > On Jul 3, 2018, at 5:15 AM, Thomas St?fe wrote: > > Thank you David! > > I changed the webrev in place. > > Thanks, Thomas > > On Tue, Jul 3, 2018 at 10:37 AM, David Holmes wrote: >> Hi Thomas, >> >> This seems okay. >> >> Minor nit: >> >> if(bigInt >> >> Please add a space after 'if' >> >> Thanks, >> David >> >> >> On 3/07/2018 6:20 PM, Thomas St?fe wrote: >>> >>> Hi all, >>> >>> may I please have reviews for this small fix. >>> >>> https://bugs.openjdk.java.net/browse/JDK-8206243 >>> >>> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/ >>> >>> >>> On some Linux kernels, the unlimited value of memory.limit_in_bytes is >>> returned as ULONG_MAX, not LONG_MAX. >>> >>> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes >>> 18446744073709551615 >>> >>> In those cases, java -XshowSettings will fail: >>> >>> java -XshowSettings >>> .... 
>>> Operating System Metrics: >>> Provider: cgroupv1 >>> Effective CPU Count: 8 >>> CPU Period: 100000us >>> CPU Quota: -1 >>> CPU Shares: -1 >>> List of Processors, 8 total: >>> 0 1 2 3 4 5 6 7 >>> List of Effective Processors, 0 total: >>> List of Memory Nodes, 1 total: >>> 0 >>> List of Available Memory Nodes, 0 total: >>> CPUSet Memory Pressure Enabled: false >>> Exception in thread "main" java.lang.NumberFormatException: For input >>> string: "18446744073709551615" >>> at >>> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) >>> at java.base/java.lang.Long.parseLong(Long.java:692) >>> at java.base/java.lang.Long.parseLong(Long.java:817) >>> at >>> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106) >>> at >>> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374) >>> at >>> java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385) >>> >>> >>> Thank you, >>> >>> Thomas >>> >> From thomas.stuefe at gmail.com Tue Jul 3 13:04:45 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 3 Jul 2018 15:04:45 +0200 Subject: RFR(xxs): 8206243: java -XshowSettings fails if memory.limit_in_bytes overflows LONG.max In-Reply-To: References: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com> Message-ID: Thank you Bob! On Tue, Jul 3, 2018 at 2:59 PM, Bob Vandette wrote: > Looks ok. > > Bob. > >> On Jul 3, 2018, at 5:15 AM, Thomas St?fe wrote: >> >> Thank you David! >> >> I changed the webrev in place. >> >> Thanks, Thomas >> >> On Tue, Jul 3, 2018 at 10:37 AM, David Holmes wrote: >>> Hi Thomas, >>> >>> This seems okay. >>> >>> Minor nit: >>> >>> if(bigInt >>> >>> Please add a space after 'if' >>> >>> Thanks, >>> David >>> >>> >>> On 3/07/2018 6:20 PM, Thomas St?fe wrote: >>>> >>>> Hi all, >>>> >>>> may I please have reviews for this small fix. 
>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8206243 >>>> >>>> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/ >>>> >>>> >>>> On some Linux kernels, the unlimited value of memory.limit_in_bytes is >>>> returned as ULONG_MAX, not LONG_MAX. >>>> >>>> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes >>>> 18446744073709551615 >>>> >>>> In those cases, java -XshowSettings will fail: >>>> >>>> java -XshowSettings >>>> .... >>>> Operating System Metrics: >>>> Provider: cgroupv1 >>>> Effective CPU Count: 8 >>>> CPU Period: 100000us >>>> CPU Quota: -1 >>>> CPU Shares: -1 >>>> List of Processors, 8 total: >>>> 0 1 2 3 4 5 6 7 >>>> List of Effective Processors, 0 total: >>>> List of Memory Nodes, 1 total: >>>> 0 >>>> List of Available Memory Nodes, 0 total: >>>> CPUSet Memory Pressure Enabled: false >>>> Exception in thread "main" java.lang.NumberFormatException: For input >>>> string: "18446744073709551615" >>>> at >>>> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) >>>> at java.base/java.lang.Long.parseLong(Long.java:692) >>>> at java.base/java.lang.Long.parseLong(Long.java:817) >>>> at >>>> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106) >>>> at >>>> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374) >>>> at >>>> java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385) >>>> >>>> >>>> Thank you, >>>> >>>> Thomas >>>> >>> > From bob.vandette at oracle.com Tue Jul 3 13:13:04 2018 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 3 Jul 2018 09:13:04 -0400 Subject: RFR: 8205928 - [TESTBUG]: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on kernel config Message-ID: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com> Please review this small fix to correct a test failure when the Linux system kernel is not 
configured with the CONFIG_MEMCG_KMEM option. The Container Metric tests are dependent on docker which allow us to assume a certain minimum Linux kernel configuration level. However, the kernel memory resource limiting feature is not a hard requirement for docker. This test will need to be updated to allow for running on kernels without this option. A 0 return from the getKernelMemoryLimit is defined to indicate that this API is not available. BUG: https://bugs.openjdk.java.net/browse/JDK-8205928 PROPOSED FIX: diff --git a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java @@ -95,10 +95,11 @@ private static void testKernelMemoryLimit(String value) { long limit = getMemoryValue(value); - if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) { + long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit(); + if (kmemlimit != 0 && limit != kmemlimit) { throw new RuntimeException("Kernel Memory limit not equal, expected : [" + limit + "]" + ", got : [" - + Metrics.systemMetrics().getKernelMemoryLimit() + "]"); + + kmemlimit + "]"); } System.out.println("TEST PASSED!!!"); } From thomas.stuefe at gmail.com Tue Jul 3 13:38:40 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Tue, 3 Jul 2018 15:38:40 +0200 Subject: RFR: 8205928 - [TESTBUG]: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on kernel config In-Reply-To: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com> References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com> Message-ID: Hi Bob, It does look fine from the outside. I did not test it though, since I have no suitable kernel. 
Best Regards, Thomas On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette wrote: > Please review this small fix to correct a test failure when the Linux system kernel is > not configured with the CONFIG_MEMCG_KMEM option. > > The Container Metric tests are dependent on docker which allow us to assume a certain minimum > Linux kernel configuration level. However, the kernel memory resource limiting feature is not a hard > requirement for docker. This test will need to be updated to allow for running on kernels without this > option. A 0 return from the getKernelMemoryLimit is defined to indicate that this API is not available. > > BUG: https://bugs.openjdk.java.net/browse/JDK-8205928 > > PROPOSED FIX: > > diff --git a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > @@ -95,10 +95,11 @@ > > private static void testKernelMemoryLimit(String value) { > long limit = getMemoryValue(value); > - if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) { > + long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit(); > + if (kmemlimit != 0 && limit != kmemlimit) { > throw new RuntimeException("Kernel Memory limit not equal, expected : [" > + limit + "]" + ", got : [" > - + Metrics.systemMetrics().getKernelMemoryLimit() + "]"); > + + kmemlimit + "]"); > } > System.out.println("TEST PASSED!!!"); > } From matthias.baesken at sap.com Tue Jul 3 13:57:50 2018 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 3 Jul 2018 13:57:50 +0000 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> 
<93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
Message-ID: <4411e9aedba54b16bc779acdf8de184d@sap.com>

>>> I created a bug and a webrev , please review .
>>>
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/

Hello, here is the second webrev, including Solaris:

http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.1/

Please review!

Thanks, Matthias

> -----Original Message-----
> From: Alan Bateman [mailto:Alan.Bateman at oracle.com]
> Sent: Tuesday, 3 July 2018 12:09
> To: Baesken, Matthias ; David Holmes
> ; Thomas Stüfe
> Cc: serviceability-dev (serviceability-dev at openjdk.java.net)
> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
>
> On 03/07/2018 08:47, Baesken, Matthias wrote:
> > Hello , so should I change my webrev for 8206145 to
> > - retry on AIX
> > - not retry on Linux + Solaris ?
> Yes.
>
> > Any remarks on Mac / BSD ?
> >
> I see a few issues in the FreeBSD bugzilla on this topic. I assume it
> would be safer to not retry if interrupted.
>
> -Alan

From bob.vandette at oracle.com Tue Jul 3 14:02:55 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 3 Jul 2018 10:02:55 -0400
Subject: RFR: 8205928 - [TESTBUG]: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on kernel config
In-Reply-To:
References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>
Message-ID: <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com>

Matthias, who reported the issue, confirmed that this patch solves the
problem.

Thanks,
Bob.

> On Jul 3, 2018, at 9:38 AM, Thomas Stüfe wrote:
>
> Hi Bob,
>
> It does look fine from the outside. I did not test it though, since I
> have no suitable kernel.
> > Best Regards, Thomas > > On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette wrote: >> Please review this small fix to correct a test failure when the Linux system kernel is >> not configured with the CONFIG_MEMCG_KMEM option. >> >> The Container Metric tests are dependent on docker which allow us to assume a certain minimum >> Linux kernel configuration level. However, the kernel memory resource limiting feature is not a hard >> requirement for docker. This test will need to be updated to allow for running on kernels without this >> option. A 0 return from the getKernelMemoryLimit is defined to indicate that this API is not available. >> >> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928 >> >> PROPOSED FIX: >> >> diff --git a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java >> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java >> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java >> @@ -95,10 +95,11 @@ >> >> private static void testKernelMemoryLimit(String value) { >> long limit = getMemoryValue(value); >> - if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) { >> + long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit(); >> + if (kmemlimit != 0 && limit != kmemlimit) { >> throw new RuntimeException("Kernel Memory limit not equal, expected : [" >> + limit + "]" + ", got : [" >> - + Metrics.systemMetrics().getKernelMemoryLimit() + "]"); >> + + kmemlimit + "]"); >> } >> System.out.println("TEST PASSED!!!"); >> } From coleen.phillimore at oracle.com Tue Jul 3 15:34:16 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 3 Jul 2018 11:34:16 -0400 Subject: [12] RFR (S) 8205534: Remove SymbolTable dependency from serviceability agent In-Reply-To: References: <88e391a8-78a2-8dbc-a489-fce9c6b922b5@oracle.com> Message-ID: <29f661c2-ce48-b174-cb2f-7300bd4aa8f2@oracle.com> Hi Jini,? 
Thank you for reviewing this.

On 6/29/18 12:02 PM, Jini George wrote:
> Hi Coleen,
>
> Apologies for the delay. Your changes look good to me overall. A few
> comments:
>
> It might make sense to also remove the corresponding lines in the
> vmStructs files. Like:
>
> File           Line
> vmStructs.cpp   170  typedef RehashableHashtable RehashableSymbolHashtable;
> vmStructs.cpp   477  static_field(RehashableSymbolHashtable, _seed, juint) \
> vmStructs.cpp  1362  declare_type(RehashableSymbolHashtable, BasicHashtable) \
> vmStructs.cpp   475  static_field(SymbolTable, _the_table, SymbolTable*) \
> vmStructs.cpp   476  static_field(SymbolTable, _shared_table, SymbolCompactHashTable) \
>

Gerard has these changes in his changeset for rewriting the SymbolTable,
so I am going to leave this part of the change to him.

> You could also remove the "friend class VMStructs" from the
> corresponding C++ data types.
>

Good point. We'll make sure it's not there in his changes.

> The test case: test/jdk/sun/tools/jhsdb/AlternateHashingTest.java with
> the file: test/jdk/sun/tools/jhsdb/LingeredAppWithAltHashing.java were
> created to test the alternate hashing mechanism of the SymbolTable in
> SA. Don't know if it makes sense to retain these.
>

Ok, I was debating with myself whether to remove these. It makes sense
not to keep a test that no longer tests what was intended. I'll remove
them.

> One nit:
>
> Line 1079 of HeapHprofBinWriter.java: Extra spaces needed.
>

Fixed. Thanks!

Coleen

> Thanks,
> Jini.
>
>
> On 6/23/2018 3:10 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Modify SA code to not use SymbolTable and remove it.
>>
>> This is to support the concurrent hashtable for SymbolTable.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8205534
>>
>> Tested with hs-tier1-5.
>> >> Thanks, >> Coleen From matthias.baesken at sap.com Tue Jul 3 15:45:23 2018 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 3 Jul 2018 15:45:23 +0000 Subject: RFR: 8205928 - [TESTBUG]: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on kernel config In-Reply-To: <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com> References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com> <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com> Message-ID: Hi Bob and Thomas , I had the patch in our internal queue and it fixed the problem . ( however I am not reviewer ) Best regards, Matthias > -----Original Message----- > From: Bob Vandette [mailto:bob.vandette at oracle.com] > Sent: Dienstag, 3. Juli 2018 16:03 > To: Thomas St?fe > Cc: serviceability-dev at openjdk.java.net serviceability- > dev at openjdk.java.net ; Baesken, > Matthias > Subject: Re: RFR: 8205928 - [TESTBUG]: > jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails > depending on kernel config > > Matthais, who reported the issue, confirmed that this patch solves the > problem. > > Thanks, > Bob. > > > On Jul 3, 2018, at 9:38 AM, Thomas St?fe > wrote: > > > > Hi Bob, > > > > It does look fine from the outside. I did not test it though, since I > > have no suitable kernel. > > > > Best Regards, Thomas > > > > On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette > wrote: > >> Please review this small fix to correct a test failure when the Linux system > kernel is > >> not configured with the CONFIG_MEMCG_KMEM option. > >> > >> The Container Metric tests are dependent on docker which allow us to > assume a certain minimum > >> Linux kernel configuration level. However, the kernel memory resource > limiting feature is not a hard > >> requirement for docker. This test will need to be updated to allow for > running on kernels without this > >> option. A 0 return from the getKernelMemoryLimit is defined to indicate > that this API is not available. 
> >> > >> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928 > >> > >> PROPOSED FIX: > >> > >> diff --git > a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > >> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > >> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > >> @@ -95,10 +95,11 @@ > >> > >> private static void testKernelMemoryLimit(String value) { > >> long limit = getMemoryValue(value); > >> - if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) { > >> + long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit(); > >> + if (kmemlimit != 0 && limit != kmemlimit) { > >> throw new RuntimeException("Kernel Memory limit not equal, > expected : [" > >> + limit + "]" + ", got : [" > >> - + Metrics.systemMetrics().getKernelMemoryLimit() + "]"); > >> + + kmemlimit + "]"); > >> } > >> System.out.println("TEST PASSED!!!"); > >> } From Alan.Bateman at oracle.com Tue Jul 3 16:49:15 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 3 Jul 2018 17:49:15 +0100 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <4411e9aedba54b16bc779acdf8de184d@sap.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> Message-ID: On 03/07/2018 14:57, Baesken, Matthias wrote: >>>> I created a bug and a webrev , please review . 
>>>>
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>
> Hello, here is the second webrev including Solaris :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.1/
>

This looks okay to me (although I think we should include macOS in the
list too).

-Alan

From thomas.stuefe at gmail.com Tue Jul 3 17:07:28 2018
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 3 Jul 2018 19:07:28 +0200
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To:
References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com>
Message-ID:

On Tue, Jul 3, 2018 at 6:49 PM, Alan Bateman wrote:
>
>
> On 03/07/2018 14:57, Baesken, Matthias wrote:
>>>>>
>>>>> I created a bug and a webrev , please review .
>>>>>
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>>>
>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>>
>>
>> Hello, here is the second webrev including Solaris :
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.1/
>>
> This looks okay to me (although I think we should include macOS in the list
> too).
>
> -Alan

+1

Actually, at this point we could just:

#if defined(__AIX)
    do {
        rv = close(fd);
    } while (rv == -1 && errno == EINTR);
#else
    rv = close(fd);
#endif

But boy this close() EINTR business is evil. Choosing between risking
file descriptor leaks or random double closes ...
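
For reference, the platform split discussed in this thread (retry close() after EINTR only on AIX, call it exactly once everywhere else) could look roughly like the following self-contained sketch. The function name close_no_restart is hypothetical and not taken from the webrev, and _AIX is assumed here to be the macro the AIX compilers predefine; treat this as an illustration of the idea, not as the actual dbgsysSocketClose() change.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper for illustration only; not the code from the webrev. */
int close_no_restart(int fd) {
    int rv;
#if defined(_AIX)   /* assumption: _AIX is predefined when building on AIX */
    /* AIX documents the descriptor state as undetermined after EINTR
       and recommends retrying the close. */
    do {
        rv = close(fd);
    } while (rv == -1 && errno == EINTR);
#else
    /* On Linux the descriptor is already released when close() reports
       EINTR, so a retry could close an unrelated descriptor that another
       thread has opened in the meantime. Call close() exactly once. */
    rv = close(fd);
#endif
    return rv;
}
```

A caller would use it like plain close(): a return value of -1 with errno set to EINTR is then reported rather than retried on the non-AIX platforms.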
..Thomas

From Alan.Bateman at oracle.com Tue Jul 3 17:14:18 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 3 Jul 2018 18:14:18 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To:
References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com>
Message-ID: <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>

On 03/07/2018 18:07, Thomas Stüfe wrote:
> :
> Actually, at this point we could just:
>
> #if defined(__AIX)
>     do {
>         rv = close(fd);
>     } while (rv == -1 && errno == EINTR);
> #else
>     rv = close(fd);
> #endif

Right, might be the simplest.

>
> But boy this close() EINTR business is evil. Choosing between risking
> file descriptor leaks or random double closes ...
>

and we aren't out of the woods yet, there are a few other places that
need similar attention.

From thomas.stuefe at gmail.com Tue Jul 3 18:32:17 2018
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 3 Jul 2018 20:32:17 +0200
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To: <709161f438f848b0af5fb079c9c0242a@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
Message-ID:

Hi Ralf,

patch looks good and makes sense.

Some remarks:

+ /* Should not usually happen. */
+ if (length != count) {
+     error = JVMTI_ERROR_INTERNAL;
+ }

Cosmetics: I would also probably explicitly return:

/* Should not usually happen. */
if (length != count) {
    jvmtiDeallocate(frames);
    outStream_setError(out, JDWP_ERROR(INTERNAL));
    return JNI_TRUE;
}

.. makes the code clearer and should someone change the loop cancel
condition it will still work.
====== + for (fnum = 0; (fnum < count) && (error == JVMTI_ERROR_NONE); ++fnum) { you could lose the inner brackets. -- Cosmetics: you changed meaning of fnum. Before it was really the frame number. Now, fnum is a zero based index into your array. So I would probably have renamed the variable too, maybe index? or somesuch. ====== Do we not have to handle opaque frames like the code before did? Or does GetStackTrace already filter out opaque frames? Would that not mean that GetStackTrace returns fewer frames than expected, and then count could be smaller than length? -- oh wait I see GetFrameLocation never really returned JVMTI_ERROR_OPAQUE_FRAME? So it is probably fine. ===== How large can the depth get? In stack overflow scenarios? To limit memory usage and to make it more predictable, I would not retrieve all frames in one go but in a loop, in bulks of n frames. E.g. 4086 frames would mean your buffer never exceeds 64K on 64bit platforms. You would sacrifice a tiny bit of performance (again needless walking up to starting position) but would not choke out when stacks are ridiculously large. ====== I cannot comment on the jtreg test. Looks fine to me, but I wonder whether there is a better way to script jdb, is this how we are supposed to do this? Maybe someone from the Oracle serviceability group can comment. Thanks & Best Regards, Thomas On Tue, Jul 3, 2018 at 12:43 PM, Schmelter, Ralf wrote: > Hi All, > > Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ . > > This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack. > > I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
> > Best regards, > Ralf Schmelter From david.holmes at oracle.com Tue Jul 3 21:26:13 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 4 Jul 2018 07:26:13 +1000 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> Message-ID: On 4/07/2018 3:14 AM, Alan Bateman wrote: > On 03/07/2018 18:07, Thomas Stüfe wrote: >> : >> Actually, at this point we could just: >> >> #if defined(__AIX) >> do { >> rv = close(fd); >> } while (rv == -1 && errno == EINTR); >> #else >> rv = close(fd); >> #endif > Right, might be the simplest. +1 with suitable comment >> >> But boy this close() EINTR business is evil. Choosing between risking >> file descriptor leaks or random double closes ... Hopefully it's somewhat academic and we don't actually take signals in arbitrary threads. Cheers, David >> > and we aren't out of the woods yet, there are a few other places that > need similar attention. From yasuenag at gmail.com Tue Jul 3 23:04:32 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 4 Jul 2018 08:04:32 +0900 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> Message-ID: <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> PING: Could you review it?
> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ Thanks, Yasumasa On 2018/06/28 22:12, Yasumasa Suenaga wrote: > Hi all, > > Please review this change. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ > > I tried to attach jhsdb to a Java process in a Docker container from the container host, but it could not attach. > jcmd supports PID namespaces since JDK-8193710, but jhsdb does not yet. > > SA gets LWP IDs via the thread stack and functions in libthread_db.so, but these return PIDs as seen inside the container, which differ from the host's PIDs. So I added code to scan /proc//task to get all LWP IDs, and they are kept in a Map in LinuxDebuggerLocal. > > Also SA_ALTROOT is set to /proc//root if SA detects that the debuggee runs in a container. It helps SA to parse binaries in the container. > > This change has been pushed to the submit repo, and it failed on OS X (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). > But I guess that failure is due to JDK-8205906. This change affects Linux only. > > Could you review it? > > > Thanks, > > Yasumasa > From thomas.stuefe at gmail.com Wed Jul 4 05:11:51 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 4 Jul 2018 07:11:51 +0200 Subject: RFR: 8205928 - [TESTBUG]: jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on kernel config In-Reply-To: References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com> <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com> Message-ID: Thanks for confirming, Matthias. On Tue, Jul 3, 2018, 17:45 Baesken, Matthias wrote: > Hi Bob and Thomas , I had the patch in our internal queue and it fixed > the problem . > ( however I am not reviewer ) > > Best regards, Matthias > > > > -----Original Message----- > > From: Bob Vandette [mailto:bob.vandette at oracle.com] > > Sent: Dienstag, 3.
Juli 2018 16:03 > > To: Thomas Stüfe > > Cc: serviceability-dev at openjdk.java.net serviceability- > > dev at openjdk.java.net ; Baesken, > > Matthias > > Subject: Re: RFR: 8205928 - [TESTBUG]: > > jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails > > depending on kernel config > > > > Matthias, who reported the issue, confirmed that this patch solves the > > problem. > > > > Thanks, > > Bob. > > > > > On Jul 3, 2018, at 9:38 AM, Thomas Stüfe > > wrote: > > > > > > Hi Bob, > > > > > > It does look fine from the outside. I did not test it though, since I > > > have no suitable kernel. > > > > > > Best Regards, Thomas > > > > > > On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette > > wrote: > > >> Please review this small fix to correct a test failure when the Linux > system > > kernel is > > >> not configured with the CONFIG_MEMCG_KMEM option. > > >> > > >> The Container Metric tests are dependent on docker which allows us to > > assume a certain minimum > > >> Linux kernel configuration level. However, the kernel memory resource > > limiting feature is not a hard > > >> requirement for docker. This test will need to be updated to allow for > > running on kernels without this > > >> option. A 0 return from the getKernelMemoryLimit is defined to > indicate > > that this API is not available.
> > >> > > >> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928 > > >> > > >> PROPOSED FIX: > > >> > > >> diff --git > > a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > > b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > > >> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > > >> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > > >> @@ -95,10 +95,11 @@ > > >> > > >> private static void testKernelMemoryLimit(String value) { > > >> long limit = getMemoryValue(value); > > >> - if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) > { > > >> + long kmemlimit = > Metrics.systemMetrics().getKernelMemoryLimit(); > > >> + if (kmemlimit != 0 && limit != kmemlimit) { > > >> throw new RuntimeException("Kernel Memory limit not equal, > > expected : [" > > >> + limit + "]" + ", got : [" > > >> - + Metrics.systemMetrics().getKernelMemoryLimit() > + "]"); > > >> + + kmemlimit + "]"); > > >> } > > >> System.out.println("TEST PASSED!!!"); > > >> } > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthias.baesken at sap.com Wed Jul 4 11:37:44 2018 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 4 Jul 2018 11:37:44 +0000 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> Message-ID: Hi all, here is another webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/ - switched to the coding proposed by Thomas - added a small comment Best regards, Matthias > -----Original Message----- > From: David Holmes [mailto:david.holmes at oracle.com] > Sent: Dienstag, 3. Juli 2018 23:26 > To: Alan Bateman ; Thomas Stüfe > > Cc: Baesken, Matthias ; serviceability-dev > (serviceability-dev at openjdk.java.net) dev at openjdk.java.net> > Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is > EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR > > On 4/07/2018 3:14 AM, Alan Bateman wrote: > > On 03/07/2018 18:07, Thomas Stüfe wrote: > >> : > >> Actually, at this point we could just: > >> > >> #if defined(__AIX) > >> do { > >> rv = close(fd); > >> } while (rv == -1 && errno == EINTR); > >> #else > >> rv = close(fd); > >> #endif > > Right, might be the simplest. > > +1 with suitable comment > > >> > >> But boy this close() EINTR business is evil. Choosing between risking > >> file descriptor leaks or random double closes ... > > Hopefully it's somewhat academic and we don't actually take signals in > arbitrary threads. > > Cheers, > David > > >> > > and we aren't out of the woods yet, there are a few other places that > > need similar attention.
From thomas.stuefe at gmail.com Wed Jul 4 11:39:05 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 4 Jul 2018 13:39:05 +0200 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> Message-ID: Looks good. Thank you Matthias! ..Thomas On Wed, Jul 4, 2018 at 1:37 PM, Baesken, Matthias wrote: > Hi all, here is another webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/ > > - switched to the coding proposed by Thomas > - added a small comment > > > > Best regards, Matthias > > >> -----Original Message----- >> From: David Holmes [mailto:david.holmes at oracle.com] >> Sent: Dienstag, 3. Juli 2018 23:26 >> To: Alan Bateman ; Thomas Stüfe >> >> Cc: Baesken, Matthias ; serviceability-dev >> (serviceability-dev at openjdk.java.net) > dev at openjdk.java.net> >> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is >> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR >> >> On 4/07/2018 3:14 AM, Alan Bateman wrote: >> > On 03/07/2018 18:07, Thomas Stüfe wrote: >> >> : >> >> Actually, at this point we could just: >> >> >> >> #if defined(__AIX) >> >> do { >> >> rv = close(fd); >> >> } while (rv == -1 && errno == EINTR); >> >> #else >> >> rv = close(fd); >> >> #endif >> > Right, might be the simplest. >> >> +1 with suitable comment >> >> >> >> >> But boy this close() EINTR business is evil. Choosing between risking >> >> file descriptor leaks or random double closes ... >> >> Hopefully it's somewhat academic and we don't actually take signals in >> arbitrary threads.
>> >> Cheers, >> David >> >> >> >> > and we aren't out of the woods yet, there are a few other places that >> > need similar attention. From Alan.Bateman at oracle.com Wed Jul 4 11:42:45 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 4 Jul 2018 12:42:45 +0100 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> Message-ID: <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> On 04/07/2018 12:37, Baesken, Matthias wrote: > Hi all, here is another webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/ > > - switched to the coding proposed by Thomas > - added a small comment > The code looks okay but the comment is a bit strange. A simple "AIX recommends to repeat the close call on EINTR" should be ignore and drop the bug reference. 
-Alan From Alan.Bateman at oracle.com Wed Jul 4 11:44:34 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 4 Jul 2018 12:44:34 +0100 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> Message-ID: On 04/07/2018 12:42, Alan Bateman wrote: > A simple "AIX recommends to repeat the close call on EINTR" should be > ignore I meant "should be okay" of course as AIX is the outlier. From matthias.baesken at sap.com Wed Jul 4 13:00:38 2018 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 4 Jul 2018 13:00:38 +0000 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> Message-ID: <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com> Ok, I change the comment to "AIX recommends to repeat the close call on EINTR" and push , is that fine with you ? Best regards, Matthias > -----Original Message----- > From: Alan Bateman [mailto:Alan.Bateman at oracle.com] > Sent: Mittwoch, 4.
Juli 2018 13:43 > To: Baesken, Matthias ; David Holmes > ; Thomas Stüfe > Cc: serviceability-dev (serviceability-dev at openjdk.java.net) dev at openjdk.java.net> > Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is > EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR > > On 04/07/2018 12:37, Baesken, Matthias wrote: > > Hi all, here is another webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/ > > > > - switched to the coding proposed by Thomas > > - added a small comment > > > The code looks okay but the comment is a bit strange. A simple "AIX > recommends to repeat the close call on EINTR" should be ignore and drop > the bug reference. > > -Alan From thomas.stuefe at gmail.com Wed Jul 4 13:02:19 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 4 Jul 2018 15:02:19 +0200 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com> Message-ID: On Wed, Jul 4, 2018 at 3:00 PM, Baesken, Matthias wrote: > Ok, I change the comment to "AIX recommends to repeat the close call on EINTR" and push , is that fine with you ? > Sure. I do not need another webrev. ..Thomas > Best regards, Matthias > >> -----Original Message----- >> From: Alan Bateman [mailto:Alan.Bateman at oracle.com] >> Sent: Mittwoch, 4.
Juli 2018 13:43 >> To: Baesken, Matthias ; David Holmes >> ; Thomas Stüfe >> Cc: serviceability-dev (serviceability-dev at openjdk.java.net) > dev at openjdk.java.net> >> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is >> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR >> >> On 04/07/2018 12:37, Baesken, Matthias wrote: >> > Hi all, here is another webrev : >> > >> > http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/ >> > >> > - switched to the coding proposed by Thomas >> > - added a small comment >> > >> The code looks okay but the comment is a bit strange. A simple "AIX >> recommends to repeat the close call on EINTR" should be ignore and drop >> the bug reference. >> >> -Alan From Alan.Bateman at oracle.com Wed Jul 4 13:03:39 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 4 Jul 2018 14:03:39 +0100 Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR In-Reply-To: <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com> References: <228035d2f64c494eaefe31b07ac72083@sap.com> <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com> <832f9ddb14f4415b98adafa004ff196f@sap.com> <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com> <4411e9aedba54b16bc779acdf8de184d@sap.com> <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com> <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com> <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com> Message-ID: <09bfcdb3-29d5-7788-c73f-bba488a8ec30@oracle.com> On 04/07/2018 14:00, Baesken, Matthias wrote: > Ok, I change the comment to "AIX recommends to repeat the close call on EINTR" and push , is that fine with you ? > Works for me, no need to refresh the webrev of course.
-Alan From ralf.schmelter at sap.com Wed Jul 4 13:47:28 2018 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 4 Jul 2018 13:47:28 +0000 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: References: <709161f438f848b0af5fb079c9c0242a@sap.com> Message-ID: <5f4b279d105e44d9a46bfe16a15bbb34@sap.com> Hi Thomas, thank you for reviewing the change. > + /* Should not usually happen. */ > + if (length != count) { > + error = JVMTI_ERROR_INTERNAL; > + } > > Cosmetics: I would also probably explicitly return: > > /* Should not usually happen. */ > if (length != count) { > jvmtiDeallocate(frames); > outStream_setError(out, JDWP_ERROR(INTERNAL)); > return JNI_TRUE; > } > > .. makes the code clearer and should someone change the loop cancel > condition it will still work. This would still rely on the error check in the loop, since the GetStackTrace JVMTI call sets the error variable too. This means it should be either explicit for both cases: error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace) (gdata->jvmti, thread, startIndex, length, frames, &count); if (error != JVMTI_ERROR_NONE) { jvmtiDeallocate(frames); outStream_setError(out, map2jdwpError(error)); return JNI_TRUE; } /* Should not happen. */ if (length != count) { jvmtiDeallocate(frames); outStream_setError(out, JDWP_ERROR(INTERNAL)); return JNI_TRUE; } or none (note that the original code could overwrite the error from the GetStackTrace call, which is fixed here): error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace) (gdata->jvmti, thread, startIndex, length, frames, &count); /* Should not happen. */ if (error == JVMTI_ERROR_NONE && length != count) { error = JVMTI_ERROR_INTERNAL; } > + for (fnum = 0; (fnum < count) && (error == JVMTI_ERROR_NONE); ++fnum) { > > you could lose the inner brackets. > > -- > > Cosmetics: you changed meaning of fnum. Before it was really the frame > number. Now, fnum is a zero based index into your array.
So I would > probably have renamed the variable too, maybe index? or somesuch. Ok, index it is. > Do we not have to handle opaque frames like the code before did? Or > does GetStackTrace already filter out opaque frames? Would that not > mean that GetStackTrace returns fewer frames than expected, and then > count could be smaller than length? > -- oh wait I see GetFrameLocation never really returned > JVMTI_ERROR_OPAQUE_FRAME? So it is probably fine. Exactly. The old code would have skipped native methods in the stack trace if JVMTI_ERROR_OPAQUE_FRAME had been returned. But since this was in fact not returned, the stacks should look the same. > How large can the depth get? In stack overflow scenarios? > > To limit memory usage and to make it more predictable, I would not > retrieve all frames in one go but in a loop, in bulks of n frames. E.g. > 4086 frames would mean your buffer never exceeds 64K on 64bit > platforms. You would sacrifice a tiny bit of performance (again > needless walking up to starting position) but would not choke out when > stacks are ridiculously large. In theory or in practice? Practically a stack overflow will have at most a few hundred thousand frames, usually far fewer (10 to 20 thousand). But one can imagine a scenario where the JIT could statically inline a lot of calls, leading to many Java frames per (small) physical frame. But you should consider that the whole stack is written 'to memory' already, since the packet output stream is backed completely by memory. So the memory requirement is already O(nrOfFrames). > I cannot comment on the jtreg test. Looks fine to me, but I wonder > whether there is a better way to script jdb, is this how we are > supposed to do this? I don't know. But the ShellScaffold.sh library is used by over 40 other JDI tests, so I used it too.
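To make the cost argument in this exchange concrete, here is a small simulation. It is a sketch under stated assumptions, not JVMTI code: `sim_stack`, `sim_get_frame` and `sim_get_frames` are hypothetical stand-ins, and the model simply assumes that reaching frame i requires walking i links from the top of the stack, which is the behavior the thread describes for per-frame lookups. It compares one lookup per frame, a single bulk request, and the chunked retrieval suggested in the review.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread stack: frame i can only be reached
 * by walking i links down from the top frame, as in a linked list. */
typedef struct {
    size_t depth;        /* number of frames on the simulated stack */
    unsigned long steps; /* links followed so far (the cost counter) */
} sim_stack;

/* Per-frame lookup: every call restarts at the top frame. */
static size_t sim_get_frame(sim_stack *s, size_t index) {
    assert(index < s->depth);
    s->steps += index + 1;
    return index; /* the "location" is just the index in this model */
}

/* Bulk lookup: walk to `start` once, then copy at most `max` frames. */
static size_t sim_get_frames(sim_stack *s, size_t start, size_t max,
                             size_t *out) {
    size_t n = 0;
    if (start >= s->depth) {
        return 0;
    }
    s->steps += start; /* repeated walk to the starting position */
    while (n < max && start + n < s->depth) {
        out[n] = start + n;
        s->steps += 1;
        n++;
    }
    return n;
}

/* Cost of fetching every frame with one lookup per frame. */
static unsigned long cost_per_frame(size_t depth) {
    sim_stack s = { depth, 0 };
    size_t i;
    for (i = 0; i < depth; i++) {
        (void)sim_get_frame(&s, i);
    }
    return s.steps;
}

/* Cost of fetching every frame in chunks of `chunk` frames; `buf` must
 * hold at least `chunk` entries. chunk >= depth is one bulk request. */
static unsigned long cost_chunked(size_t depth, size_t chunk, size_t *buf) {
    sim_stack s = { depth, 0 };
    size_t start = 0;
    size_t got;
    assert(chunk > 0);
    while ((got = sim_get_frames(&s, start, chunk, buf)) > 0) {
        start += got;
    }
    return s.steps;
}
```

In this model, 1000 frames cost 500500 steps with per-frame lookups, 1000 steps with one bulk request, and 5500 steps with chunks of 100, while the chunked variant only needs a 100-entry buffer. That matches the trade-off discussed above: chunking bounds the buffer at the price of re-walking to each starting position.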
Best regards, Ralf From thomas.stuefe at gmail.com Wed Jul 4 14:43:17 2018 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 4 Jul 2018 16:43:17 +0200 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <5f4b279d105e44d9a46bfe16a15bbb34@sap.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <5f4b279d105e44d9a46bfe16a15bbb34@sap.com> Message-ID: Hi Ralf, On Wed, Jul 4, 2018 at 3:47 PM, Schmelter, Ralf wrote: > Hi Thomas, > > thank you for reviewing the change. > > >> + /* Should not usually happen. */ >> + if (length != count) { >> + error = JVMTI_ERROR_INTERNAL; >> + } >> >> Cosmetics: I would also probably explicitly return: >> >> /* Should not usually happen. */ >> if (length != count) { >> jvmtiDeallocate(frames); >> outStream_setError(out, JDWP_ERROR(INTERNAL)); >> return JNI_TRUE; >> } >> >> .. makes the code clearer and should someone change the loop cancel >> condition it will still work. > > This would still rely on the error check in the loop, since the GetStackTrace JVMTI call sets the error variable too. > > This means it should be either explicit for both cases: > error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace) > (gdata->jvmti, thread, startIndex, length, frames, &count); > > If (error != JVMTI_ERROR_NONE) { > jvmtiDeallocate(frames); > outStream_setError(out, map2jdwpError(error)); > return JNI_TRUE; > } > > /* Should not happen. */ > if (length != count) { > jvmtiDeallocate(frames); > outStream_setError(out, JDWP_ERROR(INTERNAL)); > return JNI_TRUE; > } > > or none (note that the original code could overwrite the error from the GetStackTrace call, which is fixed here): > error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace) > (gdata->jvmti, thread, startIndex, length, frames, &count); > > /* Should not happen. */ > if (error == JVMTI_ERROR_NONE && length != count) { > error = JVMTI_ERROR_INTERNAL; > } > Okay, in that case I prefer the second variant. 
At least only one deallocate call then. > > >> + for (fnum = 0; (fnum < count) && (error == JVMTI_ERROR_NONE); ++fnum) { >> >> you could loose the inner brackets. >> >> -- >> >> Cosmetics: you changed meaning of fnum. Before it was really the frame >> number. Now, fnum is a zero based index into your array. So I would >> probably have renamed the variable too, maybe index? or somesuch. > > Ok, index it is. > > Thanks. > >> Do we not have to handle opaque frames like the code before did? Or >> does GetStackTrace already filter out opaque frames? Would that not >> mean that GetStackTrace returns fewer frames than expected, and then >> count could be smaller than length? > >> -- oh wait I see GetFrameLocation never really returned >> JVMTI_ERROR_OPAQUE_FRAME? So it is probably fine. > > Exactly. The old code would have skipped native methods in the stack trace, if JVMTI_ERROR_OPAQUE_FRAME would have been returned. But since this was in fact not returned, the stacks should look the same. > > > > >> How large can the depth get? In stack overflow scenarios? >> >> To limit memory usage and to make it more predictable, I would not >> retrieve all frames in one go but in a loop, in bulks a n frames. E.g. >> 4086 frames would mean your buffer never exceeds 64K on 64bit >> platforms. You would sacrifice a tiny bit of performance (again >> needless walking up to starting position) but would not choke out when >> stacks are ridiculously large. > > In theory or in practice? Practically a stack overflow will have at most a few 100 thousand frames, usually much less (10 to 20 thousand). But one can image a scenario where the JIT could statically inline a lot of calls, leading to many Java frames per (small) physical frame. > > But you should consider, that the whole stack is written 'to memory' already, since the packet output stream is backed completely by memory. So the memory requirement is already O(nrOfFrames). Okay. 
Just did a quick calculation, we need now 33 bytes per frame in the outputstream, and now we need 16 more. But I find it difficult to see how one would be a problem and the other would not. So okay, lets keep the code simple. > > >> I cannot comment on the jtreg test. Looks fine to me, but I wonder >> whether there is a better way to script jdb, is this how we are >> supposed to do this? > > I don't know. But the ShellScaffold.sh library is used by over 40 other JDI test, so I used it too. > > Best regards, > Ralf Okay. From my point, this is reviewed. Thanks & Best Regards, Thomas > From rafael.wth at gmail.com Wed Jul 4 19:08:04 2018 From: rafael.wth at gmail.com (Rafael Winterhalter) Date: Wed, 4 Jul 2018 21:08:04 +0200 Subject: Review Request JDK-8200559: Java agents doing instrumentation need a means to define auxiliary classes In-Reply-To: <7a27c86c-6915-21e0-866f-f62c7626e330@oracle.com> References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com> <7a27c86c-6915-21e0-866f-f62c7626e330@oracle.com> Message-ID: Hi Mandy and Alan, I very much understand your points of view that the Java instrumentation API should retain its original intended scope and that every problem should be solved at its own time. I do however claim that the proposed API does not even solve this problem of auxiliary classes for the general case. By bringing up several examples for why I want to suggest a package-independent injection mechanism, this argument was maybe lost between the lines. Allow me to repeat it in a possibly clearer manner: Consider a case where an instrumentation is not isolated to a single class but involves the classes foo.Bar and qux.Baz which are both transformed. To implement the instrumentation some code is added to a method of foo.Bar which then invokes the method qux.Baz::baz(Object) which is also transformed. To apply the instrumentation, an auxiliary class needs to be created which transports some state from foo.Bar to qux.Baz as the method argument. 
For this to work, the code that is added to qux.Baz::baz(Object) checks and casts the argument to the known auxiliary class. Using the suggested API, it is very difficult to apply a transformation as it is not clear if the auxiliary class should live in the foo or the qux package as it is not controlled by the Java agent which of the foo.Bar and qux.Baz classes is loaded first. To solve this, one would need to prepare two instrumentations with an auxiliary class in the foo or the qux package to define the class according to the load order. This problem explodes into multidimensional complexity as more classes are involved in an instrumentation. This example might seem artificial but it makes perfect sense when for example instrumenting actors in an actor framework such as Akka where a new message is added at runtime and handling is added to two actor classes. I have also encountered similar instrumentation circles in I/O processing frameworks. And while an instrumentation might start isolated to transforming a single class, there might be a requirement to evolve it later. Given the proposed API, such evolutions are now difficult to implement as defining auxiliary classes requires considering class loading order. This is my argument for this API not being suited for instrumentation purposes and why I would favor an Instrumentation::defineClass API instead. It is simply too difficult to find a good limiting factor; one could of course consider allowing class definitions in the same class loader or module but since class loaders and modules can stand in exporting relationships to one another the problem with unpredictable load order would occur again. Beyond that, I still claim that, apart from the use case of auxiliary classes, the following facts justify an introduction of Instrumentation::defineClass which all apply to "tool agents" towards which the Instrumentation API is targeted: 1.
The need to inject dispatchers into specific class loaders to allow cross-class loader communication. This is typically the bootstrap class loader which is already accessible via Instrumentation::appendToBootstrapClassLoaderSearch but this might be too specific. 2. The factual possibility of an owner of an instrumentation instance to inject classes into any package without using internal API simply by "pseudo transforming" a class that resides in the desired package. 3. The history of agents being developed using sun.misc.Unsafe::defineClass for many years, which makes migration to a much different API unlikely if this involves heavy costs. 4. Retaining some equivalence to native agents where an API for defining classes is available via JNI. It would be a shame if JVMTI is favored over the Java agent API only for this. I really hope you take this concern into consideration. To strengthen my argument, please also consider that many regard me as one of the leading experts for JVM agents (due to my library Byte Buddy that is often used for Java agents) and I have worked with a multitude of Java agent vendors whose concerns I am also voicing here. Currently, none of the vendors I regularly talk to is considering using the suggested API whereas most plan to simply migrate to jdk.internal.misc.Unsafe, which is further fostered by no alternative being offered in Java 11. I also understand that it is probably too late to make an API decision at this point. However, due to this important use case for sun.misc.Unsafe::defineClass not being currently covered, I want to suggest reintroducing the latter method to Java 11 to avoid a further spread of internal API usage by forcing people into jdk.internal.misc.Unsafe where many will grow comfortable and not even consider future APIs. The migration to jdk.internal.misc.Unsafe is also what I observe being used for EA builds at this point so this is a partial reality already.
Thank you for hearing me out, I really hope I can change your mind on this issue and add Instrumentation::defineClass. best regards, Rafael 2018-07-02 19:17 GMT+02:00 mandy chung : > My proposal of ClassDefiner API allows the java agent to define auxiliary > classes in the same runtime package of the class being instrumented. You > raised other use cases that are not addressed by this proposal. As Alan > replied, the ability to define any arbitrary class would be an attractive > nuisance and we think Instrumentation.defineClass isn't the right API to > add. > > I think the proposed ClassDefiner API is useful for the specific use case > (define auxiliary classes in the runtime package of the class being > instrumented). I hold it off and so didn't make 11. For the other use > cases, perhaps we should create JBS issues for further investigation. > > Mandy > > On 7/2/18 1:41 AM, Rafael Winterhalter wrote: > >> Hi, >> >> I was wondering if a solution for this problem is still planned for JDK >> 11 giving the beginning ramp down. >> >> With removing sun.misc.Unsafe::defineClass, Java agents only have an >> option to use jdk.internal.misc.Unsafe::defineClass for the use-cases >> that I described. >> >> I think it would be a missed opportunity not to offer an alternative as >> of JDK 11 as a second migration would make it even less likely that agents >> would avoid unsafe API. >> >> Thanks for the information, >> best regards, Rafael >> >> mandy chung > >> schrieb am So., 15. Apr. 2018, 08:23: >> >> Background: >> >> Java agents support both load time and dynamic instrumentation. At >> load time, >> the agent's ClassFileTransformer is invoked to transform class >> bytes. There is >> no Class objects at this time. Dynamic instrumentation is when >> redefineClasses >> or retransformClasses is used to redefine an existing loaded class. >> The >> ClassFileTransformer is invoked with class bytes where the Class >> object is present. 
>> >> Java agent doing instrumentation needs a means to define auxiliary >> classes >> that are visible and accessible to the instrumented class. Existing >> agents >> have been using sun.misc.Unsafe::defineClass to define aux classes >> directly >> or accessing protected ClassLoader::defineClass method with >> setAccessible to >> suppress the language access check (see [1] where this issue was >> brought up). >> >> Instrumentation::appendToBootstrapClassLoaderSearch and >> appendToSystemClassLoaderSearch >> APIs are existing means to supply additional classes. It's too >> limited >> for example it can't inject a class in the same runtime package as >> the class >> being transformed. >> >> Proposal: >> >> This proposes to add a new ClassFileTransformer.transform method >> taking additional ClassDefiner parameter. A transformer can define >> additional >> classes during the transformation process, i.e. >> when ClassFileTransformer::transform is invoked. Some details: >> >> 1. ClassDefiner::defineClass defines a class in the same runtime >> package >> as the class being transformed. >> 2. The class is defined in the same thread as the transformers are >> being >> invoked. ClassDefiner::defineClass returns Class object directly >> before the transformed class is defined. >> 3. No transformation is applied to classes defined by >> ClassDefiner::defineClass. >> >> The first prototype we did is to collect the auxiliary classes and >> define >> them until all transformers are invoked and have these aux classes >> to go >> through the transformation pipeline. Several complicated issues would >> need to be resolved for example timing whether the auxiliary classes >> should >> be defined before the transformed class (otherwise a potential race >> where >> some other thread references the transformed class and cause the code >> to >> execute that in turn reference the auxiliary classes. 
The current >> implementation has a native reentrancy check that ensure one class >> is being >> transformed to avoid potential circularity issues. This may need >> JVM TI >> support to be reliable. >> >> This proposal would allow java agents to migrate from internal API >> and ClassDefiner to be enhanced in the future. >> >> Webrev: >> http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8200559/webrev.00/ >> >> Mandy >> [1] >> http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/ >> 000405.html >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Thu Jul 5 08:19:17 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 5 Jul 2018 18:19:17 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code Message-ID: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ Problem: The tests create native threads that attach to the VM through JNI_AttachCurrentThread but which then terminate without detaching themselves. When the VM exits and we're using Flight Recorder "dumponexit" this leads to a call to VM_PrintThreads that in part wants to print the per-thread CPU usage. When we encounter the threads that have terminated already the low level pthread_getcpuclockid calls returns ESRCH but the code doesn't expect that and so fails an assert in debug mode and can SEGV in product mode. Solution: Serviceability-side: fix the tests Change the tests so that the threads detach before terminating. The two tests are (surprisingly) written in completely different styles, so the solution also takes on two different styles. 
Runtime-side: make the VM more robust in the face of JNI attached threads that terminate before detaching, and add a regression test

I took a good look at the low-level code for interacting with arbitrary threads and as far as I can see the problem only exists for this one case of pthread_getcpuclockid on Linux. Elsewhere the potential for a library call failure just reports an error value (such as -1 for the cpu time used).

So the fix is simply to allow for ESRCH when calling pthread_getcpuclockid and return -1 for the cpu usage in that case.

I created a new regression test to create a new native thread, attach it and then let it terminate while still attached. The java code then calls various Thread and ThreadMXBean functions on it to ensure there are no crashes or unexpected exceptions.

Testing:
 - old tests with fixed run-time
 - old run-time with fixed tests
 - mach tier4 (which exposed the problem - that's where we enable Flight recorder for the tests) [in progress]
 - mach5 tier 1-3 for good measure [in progress]
 - new regression test

Thanks,
David

From david.holmes at oracle.com Thu Jul 5 09:58:39 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Jul 2018 19:58:39 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
Message-ID:

Solaris compiler complains about doing a return from inside a do-while loop. I'll have to rework part of the fix tomorrow.

David

On 5/07/2018 6:19 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>
> Problem:
>
> The tests create native threads that attach to the VM through
> JNI_AttachCurrentThread but which then terminate without detaching
> themselves.
When the VM exits and we're using Flight Recorder > "dumponexit" this leads to a call to VM_PrintThreads that in part wants > to print the per-thread CPU usage. When we encounter the threads that > have terminated already the low level pthread_getcpuclockid calls > returns ESRCH but the code doesn't expect that and so fails an assert in > debug mode and can SEGV in product mode. > > Solution: > > Serviceability-side: fix the tests > > Change the tests so that the threads detach before terminating. The two > tests are (surprisingly) written in completely different styles, so the > solution also takes on two different styles. > > Runtime-side: make the VM more robust in the fact of JNI attached > threads that terminate before detaching, and add a regression test > > I took a good look at the low-level code for interacting with arbitrary > threads and as far as I can see the problem only exists for this one > case of pthread_getcpuclockid on Linux. Elsewhere the potential for a > library call failure just reports an error value (such as -1 for the cpu > time used). > > So the fix is simply to allow for ESRCH when calling > pthread_getcpuclockid and return -1 for the cpu usage in that case. > > I created a new regression test to create a new native thread, attach it > and then let it terminate while still attached. The java code then calls > various Thread and ThreadMXBean functions on it to ensure there are no > crashes or unexpected exceptions. 
> > Testing: > ?- old tests with fixed run-time > ?- old run-time with fixed tests > ?- mach tier4 (which exposed the problem - that's where we enable > Flight recorder for the tests) [in progress] > ?- mach5 tier 1-3 for good measure [in progress] > ?- new regression test > > Thanks, > David From gary.adams at oracle.com Thu Jul 5 14:48:39 2018 From: gary.adams at oracle.com (Gary Adams) Date: Thu, 05 Jul 2018 10:48:39 -0400 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds Message-ID: <5B3E2FC7.1060303@oracle.com> A simple test run using "exclude none" shows 625K methods are being observed. The bulk of those methods were due to the last class accessed in the test - VirtualMachineManager. It's not important that this particular call is used. The test is simply demonstrating that filters work for other packages than java and javax. This proposed fix uses a simpler lookup for GregorianCalendar. Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/ From chris.plummer at oracle.com Thu Jul 5 21:28:15 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 5 Jul 2018 14:28:15 -0700 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <5B3E2FC7.1060303@oracle.com> References: <5B3E2FC7.1060303@oracle.com> Message-ID: Hi Gary, The changes look good. How much is the reducing execution by? thanks, Chris On 7/5/18 7:48 AM, Gary Adams wrote: > A simple test run using "exclude none" shows 625K methods are being > observed. > The bulk of those methods were due to the last class accessed in the > test - VirtualMachineManager. > > It's not important that this particular call is used. The test is > simply demonstrating that > filters work for other packages than java and javax. > > This proposed fix uses a simpler lookup for GregorianCalendar. > > ? Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 > ? 
Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/

From chris.plummer at oracle.com Thu Jul 5 21:55:36 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 5 Jul 2018 14:55:36 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To:
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
Message-ID:

Hi David,

Solaris problems aside, overall it looks fine. Some minor things I noted:

I noticed that exitCode is never modified in agentA() or agentB(), so there isn't much point to having it. If you reach the bottom of the function, it passed, so PASSED can be returned. The code would be more clear if it did this. As-is it is implied that you can reach the bottom when it fails.

Is detaching the threads along the failure paths really needed? exit() is called, so this would seem to make it unnecessary.

I prefer assignments not to be embedded inside the "if" condition. The DetachCurrentThread code in THREAD_return() is much more readable than the similar code in agentA() and agentB().

In the test:

  54         // Generally as long as we don't crash of throw unexpected
  55         // exceptions then the test passes. In some cases we know exactly

"of" should be "or".

Shouldn't you be catching exceptions for all the Thread methods you are calling? Otherwise the test will exit if one is thrown, and the above comment indicates that you don't want this.

Don't we normally put these tests in a package?

thanks,

Chris

On 7/5/18 2:58 AM, David Holmes wrote:
> Solaris compiler complains about doing a return from inside a
> do-while loop. I'll have to rework part of the fix tomorrow.
> > David > > On 5/07/2018 6:19 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >> >> Problem: >> >> The tests create native threads that attach to the VM through >> JNI_AttachCurrentThread but which then terminate without detaching >> themselves. When the VM exits and we're using Flight Recorder >> "dumponexit" this leads to a call to VM_PrintThreads that in part >> wants to print the per-thread CPU usage. When we encounter the >> threads that have terminated already the low level >> pthread_getcpuclockid calls returns ESRCH but the code doesn't expect >> that and so fails an assert in debug mode and can SEGV in product mode. >> >> Solution: >> >> Serviceability-side: fix the tests >> >> Change the tests so that the threads detach before terminating. The >> two tests are (surprisingly) written in completely different styles, >> so the solution also takes on two different styles. >> >> Runtime-side: make the VM more robust in the fact of JNI attached >> threads that terminate before detaching, and add a regression test >> >> I took a good look at the low-level code for interacting with >> arbitrary threads and as far as I can see the problem only exists for >> this one case of pthread_getcpuclockid on Linux. Elsewhere the >> potential for a library call failure just reports an error value >> (such as -1 for the cpu time used). >> >> So the fix is simply to allow for ESRCH when calling >> pthread_getcpuclockid and return -1 for the cpu usage in that case. >> >> I created a new regression test to create a new native thread, attach >> it and then let it terminate while still attached. The java code then >> calls various Thread and ThreadMXBean functions on it to ensure there >> are no crashes or unexpected exceptions. 
>> >> Testing: >> ??- old tests with fixed run-time >> ??- old run-time with fixed tests >> ??- mach tier4 (which exposed the problem - that's where we enable >> Flight recorder for the tests) [in progress] >> ??- mach5 tier 1-3 for good measure [in progress] >> ??- new regression test >> >> Thanks, >> David From david.holmes at oracle.com Thu Jul 5 22:18:52 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 08:18:52 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> Message-ID: <5752c0cc-f2bd-ed5c-0579-51ed639ee4cb@oracle.com> On 5/07/2018 7:58 PM, David Holmes wrote: > Solaris compiler complains about doing a return from inside a > do-while loop. I'll have to rework part of the fix tomorrow. Webrev updated in-place. The only change is to the makefile to disable a warning: + ifeq ($(TOOLCHAIN_TYPE), solstudio) + BUILD_HOTSPOT_JTREG_LIBRARIES_CFLAGS_libji06t001 += -erroff=E_END_OF_LOOP_CODE_NOT_REACHED + endif + David ----- > David > > On 5/07/2018 6:19 PM, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >> >> Problem: >> >> The tests create native threads that attach to the VM through >> JNI_AttachCurrentThread but which then terminate without detaching >> themselves. When the VM exits and we're using Flight Recorder >> "dumponexit" this leads to a call to VM_PrintThreads that in part >> wants to print the per-thread CPU usage. When we encounter the threads >> that have terminated already the low level pthread_getcpuclockid calls >> returns ESRCH but the code doesn't expect that and so fails an assert >> in debug mode and can SEGV in product mode. >> >> Solution: >> >> Serviceability-side: fix the tests >> >> Change the tests so that the threads detach before terminating. 
The
>> two tests are (surprisingly) written in completely different styles,
>> so the solution also takes on two different styles.
>>
>> Runtime-side: make the VM more robust in the fact of JNI attached
>> threads that terminate before detaching, and add a regression test
>>
>> I took a good look at the low-level code for interacting with
>> arbitrary threads and as far as I can see the problem only exists for
>> this one case of pthread_getcpuclockid on Linux. Elsewhere the
>> potential for a library call failure just reports an error value (such
>> as -1 for the cpu time used).
>>
>> So the fix is simply to allow for ESRCH when calling
>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>
>> I created a new regression test to create a new native thread, attach
>> it and then let it terminate while still attached. The java code then
>> calls various Thread and ThreadMXBean functions on it to ensure there
>> are no crashes or unexpected exceptions.
>>
>> Testing:
>>  - old tests with fixed run-time
>>  - old run-time with fixed tests
>>  - mach tier4 (which exposed the problem - that's where we enable
>> Flight recorder for the tests) [in progress]
>>  - mach5 tier 1-3 for good measure [in progress]
>>  - new regression test
>>
>> Thanks,
>> David

From chris.plummer at oracle.com Thu Jul 5 22:37:21 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 5 Jul 2018 15:37:21 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To: <709161f438f848b0af5fb079c9c0242a@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
Message-ID: <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>

Hi Ralf,

Overall looks good, but I do have a few comments and questions.

Please update the copyright.

What testing have you done?

How long does this test take to run? What happens if for some reason SOE is never thrown? It's not clear to me what the script would do in this case.
In answer to the ShellScaffold.sh question, there is already work underway to convert to pure java tests. See JDK-8201652. I'm not certain if it is ok for you to just submit this new shell script, or if it should be rewritten in pure java. Most of the work to convert the scripts has already been done but was put on hold. Maybe Serguei can comment and guide you on how it would be done in java.

thanks,

Chris

On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
> Hi All,
>
> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>
> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>
> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>
> Best regards,
> Ralf Schmelter

From david.holmes at oracle.com Thu Jul 5 22:40:06 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 08:40:06 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To:
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
Message-ID: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>

Hi Chris,

Thanks for looking at this.

Updated webrev:

http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/

Only real changes in ji05t001.c. (And fixed typo in the new test)

More below ...

On 6/07/2018 7:55 AM, Chris Plummer wrote:
> Hi David,
>
> Solaris problems aside, overall it looks fine. Some minor things I noted:
>
> I noticed that exitCode is never modified in agentA() or agentB(), so
> there isn't much point to having it. If you reach the bottom of the
> function, it passed, so PASSED can be returned. The code would be more
> clear if it did this. As-is it is implied that you can reach the bottom
> when it fails.

I resisted any and all urges to do any kind of unrelated code cleanup in the tests - once you start you may end up doing a full rewrite.

> Is detaching the threads along the failure paths really needed? exit()
> is called, so this would seem to make it unnecessary.

You're right, that isn't necessary. I'll remove the changes from before the exits in ji05t001.c.

> I prefer assignments not to be embedded inside the "if" condition. The
> DetachCurrentThread code in THREAD_return() is much more readable than
> the similar code in agentA() and agentB().

It's an existing style already used in that test e.g.

 287     if ((res =
 288             JNI_ENV_PTR(vm)->AttachCurrentThread(
 289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {

and I don't mind it, so I'd prefer not to change it.

> In the test:
>
>   54         // Generally as long as we don't crash of throw unexpected
>   55         // exceptions then the test passes. In some cases we know
> exactly
>
> "of" should be "or".

Well spotted. Thanks.

> Shouldn't you be catching exceptions for all the Thread methods you are
> calling? Otherwise the test will exit if one is thrown, and the above
> comment indicates that you don't want this.

I'm not expecting there to be any exceptions from any of the called methods. That would potentially indicate a problem in handling the terminated native thread, so would indicate a test failure.

> Don't we normally put these tests in a package?

Doesn't seem to be any hard and fast rule. I only use packages when they are important for the test. In runtime we have 905 java files and only 116 have a package statement. It varies elsewhere.

Thanks,
David

> thanks,
>
> Chris
>
> On 7/5/18 2:58 AM, David Holmes wrote:
>> Solaris compiler complains about doing a return from inside a
>> do-while loop. I'll have to rework part of the fix tomorrow.
>> >> David >> >> On 5/07/2018 6:19 PM, David Holmes wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>> >>> Problem: >>> >>> The tests create native threads that attach to the VM through >>> JNI_AttachCurrentThread but which then terminate without detaching >>> themselves. When the VM exits and we're using Flight Recorder >>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>> wants to print the per-thread CPU usage. When we encounter the >>> threads that have terminated already the low level >>> pthread_getcpuclockid calls returns ESRCH but the code doesn't expect >>> that and so fails an assert in debug mode and can SEGV in product mode. >>> >>> Solution: >>> >>> Serviceability-side: fix the tests >>> >>> Change the tests so that the threads detach before terminating. The >>> two tests are (surprisingly) written in completely different styles, >>> so the solution also takes on two different styles. >>> >>> Runtime-side: make the VM more robust in the fact of JNI attached >>> threads that terminate before detaching, and add a regression test >>> >>> I took a good look at the low-level code for interacting with >>> arbitrary threads and as far as I can see the problem only exists for >>> this one case of pthread_getcpuclockid on Linux. Elsewhere the >>> potential for a library call failure just reports an error value >>> (such as -1 for the cpu time used). >>> >>> So the fix is simply to allow for ESRCH when calling >>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>> >>> I created a new regression test to create a new native thread, attach >>> it and then let it terminate while still attached. The java code then >>> calls various Thread and ThreadMXBean functions on it to ensure there >>> are no crashes or unexpected exceptions. 
>>> >>> Testing: >>> ??- old tests with fixed run-time >>> ??- old run-time with fixed tests >>> ??- mach tier4 (which exposed the problem - that's where we enable >>> Flight recorder for the tests) [in progress] >>> ??- mach5 tier 1-3 for good measure [in progress] >>> ??- new regression test >>> >>> Thanks, >>> David > > > From chris.plummer at oracle.com Thu Jul 5 23:00:39 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 5 Jul 2018 16:00:39 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Message-ID: Hi David, Looks good. Regarding the test being in a package, looks like this was the convention for the nsk tests, so that's why I noted it. thanks, Chris On 7/5/18 3:40 PM, David Holmes wrote: > Hi Chris, > > Thanks for looking at this. > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ > > Only real changes in ji05t001.c. (And fixed typo in the new test) > > More below ... > > On 6/07/2018 7:55 AM, Chris Plummer wrote: >> Hi David, >> >> Solaris problems aside, overall it looks fine. Some minor things I >> noted: >> >> I noticed that exitCode is never modified in agentA() or agentB(), so >> there isn't much point to having it. If you reach the bottom of the >> function, it passed, so PASSED can be returned. The code would be >> more clear if it did this. As-is it is implied that you can reach the >> bottom when it fails. > > I resisted any and all urges to do any kind of unrelated code cleanup > in the tests - once you start you may end up doing a full rewrite. > >> Is detaching the threads along the failure paths really needed? >> exit() is called, so this would seem to make it unnecessary. > > You're right that isn't necessary. 
I'll remove the changes from before > the exits in ji05t001.c > >> I prefer assignments not to be embedded inside the "if" condition. >> The DetachCurrentThread code in THREAD_return() is much more readable >> than the similar code in agentA() and agentB(). > > It's an existing style already used in that test e.g. > > ?287???? if ((res = > ?288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( > ?289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != > 0) { > > and I don't mind it, so I'd prefer not to change it. > >> In the test: >> >> ?? 54???????? // Generally as long as we don't crash of throw unexpected >> ?? 55???????? // exceptions then the test passes. In some cases we >> know exactly >> >> "of" should be "or". > > Well spotted. Thanks. > >> Shouldn't you be catching exceptions for all the Thread methods you >> are calling? Otherwise the test will exit if one is thrown, and the >> above comment indicates that you don't want this. > > I'm not expecting there to be any exceptions from any of the called > methods. That would potentially indicate a problem in handling the > terminated native thread, so would indicate a test failure. > >> Don't we normally put these tests in a package? > > Doesn't seem to be any hard and fast rule. I only uses packages when > they are important for the test. In runtime we have 905 java files and > only 116 have a package statement. It varies elsewhere. > > Thanks, > David > >> thanks, >> >> Chris >> >> On 7/5/18 2:58 AM, David Holmes wrote: >>> Solaris compiler complains about doing a return from inside a >>> do-while loop. I'll have to rework part of the fix tomorrow. 
>>> >>> David >>> >>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>> >>>> Problem: >>>> >>>> The tests create native threads that attach to the VM through >>>> JNI_AttachCurrentThread but which then terminate without detaching >>>> themselves. When the VM exits and we're using Flight Recorder >>>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>>> wants to print the per-thread CPU usage. When we encounter the >>>> threads that have terminated already the low level >>>> pthread_getcpuclockid calls returns ESRCH but the code doesn't >>>> expect that and so fails an assert in debug mode and can SEGV in >>>> product mode. >>>> >>>> Solution: >>>> >>>> Serviceability-side: fix the tests >>>> >>>> Change the tests so that the threads detach before terminating. The >>>> two tests are (surprisingly) written in completely different >>>> styles, so the solution also takes on two different styles. >>>> >>>> Runtime-side: make the VM more robust in the fact of JNI attached >>>> threads that terminate before detaching, and add a regression test >>>> >>>> I took a good look at the low-level code for interacting with >>>> arbitrary threads and as far as I can see the problem only exists >>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the >>>> potential for a library call failure just reports an error value >>>> (such as -1 for the cpu time used). >>>> >>>> So the fix is simply to allow for ESRCH when calling >>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>> >>>> I created a new regression test to create a new native thread, >>>> attach it and then let it terminate while still attached. The java >>>> code then calls various Thread and ThreadMXBean functions on it to >>>> ensure there are no crashes or unexpected exceptions. 
>>>> >>>> Testing: >>>> ??- old tests with fixed run-time >>>> ??- old run-time with fixed tests >>>> ??- mach tier4 (which exposed the problem - that's where we enable >>>> Flight recorder for the tests) [in progress] >>>> ??- mach5 tier 1-3 for good measure [in progress] >>>> ??- new regression test >>>> >>>> Thanks, >>>> David >> >> >> From serguei.spitsyn at oracle.com Thu Jul 5 23:26:40 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Jul 2018 16:26:40 -0700 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <5B3E2FC7.1060303@oracle.com> References: <5B3E2FC7.1060303@oracle.com> Message-ID: <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Fri Jul 6 02:17:36 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Jul 2018 19:17:36 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> Message-ID: <7b8e9622-b4a1-78c1-6228-dd9c9e845e30@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Fri Jul 6 02:35:54 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 5 Jul 2018 19:35:54 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <7b8e9622-b4a1-78c1-6228-dd9c9e845e30@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <7b8e9622-b4a1-78c1-6228-dd9c9e845e30@oracle.com> Message-ID: <332c6e87-aba9-0fb5-4b41-4e80507792bb@oracle.com> An HTML attachment was scrubbed... 
URL: From david.holmes at oracle.com Fri Jul 6 05:21:11 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 15:21:11 +1000 Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times out with Xcomp on sparc Message-ID: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966 webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/ One of the @run variants was taking around 15x longer to execute. That variant uses the InMemoryJavaCompiler which involves a lot of classes and code execution. The test was enabling method entry event generation for all of main, resulting in the massive slowdown. The fix is to add a new breakpoint() function that gets called after the in-memory compilation setup is done, and we initially run the test to that point before enabling the events. The problem @run now only takes 2x the other tests and so should avoid the timeouts. Testing: mach5 tier4 solaris-sparc mach5 tier 1-3 Thanks, David From mikael.vidstedt at oracle.com Fri Jul 6 05:29:28 2018 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 5 Jul 2018 22:29:28 -0700 Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times out with Xcomp on sparc In-Reply-To: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> Message-ID: <1EAE1E8D-6DF4-4930-9494-16E6428819C5@oracle.com> Looks good. Nice speedup! Cheers, Mikael > On Jul 5, 2018, at 10:21 PM, David Holmes wrote: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8205966 > webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/ > > One of the @run variants was taking around 15x longer to execute. That variant uses the InMemoryJavaCompiler which involves a lot of classes and code execution. The test was enabling method entry event generation for all of main, resulting in the massive slowdown. 
> > The fix is to add a new breakpoint() function that gets called after the in-memory compilation setup is done, and we initially run the test to that point before enabling the events. > > The problem @run now only takes 2x the other tests and so should avoid the timeouts. > > Testing: mach5 tier4 solaris-sparc > mach5 tier 1-3 > > Thanks, > David From david.holmes at oracle.com Fri Jul 6 05:48:54 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 15:48:54 +1000 Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times out with Xcomp on sparc In-Reply-To: <1EAE1E8D-6DF4-4930-9494-16E6428819C5@oracle.com> References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> <1EAE1E8D-6DF4-4930-9494-16E6428819C5@oracle.com> Message-ID: On 6/07/2018 3:29 PM, Mikael Vidstedt wrote: > Looks good. Nice speedup! Thanks for looking at it Mikael! Still the second longest test in com/sun/jdi at 21 minutes!!! David > Cheers, > Mikael > > >> On Jul 5, 2018, at 10:21 PM, David Holmes wrote: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966 >> webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/ >> >> One of the @run variants was taking around 15x longer to execute. That variant uses the InMemoryJavaCompiler which involves a lot of classes and code execution. The test was enabling method entry event generation for all of main, resulting in the massive slowdown. >> >> The fix is to add a new breakpoint() function that gets called after the in-memory compilation setup is done, and we initially run the test to that point before enabling the events. >> >> The problem @run now only takes 2x the other tests and so should avoid the timeouts. 
>> >> Testing: mach5 tier4 solaris-sparc >> mach5 tier 1-3 >> >> Thanks, >> David > From david.holmes at oracle.com Fri Jul 6 08:07:37 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 6 Jul 2018 18:07:37 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Message-ID: The new test is hanging on Solaris. I just discovered we don't run these tests on Solaris until tier4. David On 6/07/2018 8:40 AM, David Holmes wrote: > Hi Chris, > > Thanks for looking at this. > > Updated webrev: > > http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ > > Only real changes in ji05t001.c. (And fixed typo in the new test) > > More below ... > > On 6/07/2018 7:55 AM, Chris Plummer wrote: >> Hi David, >> >> Solaris problems aside, overall it looks fine. Some minor things I noted: >> >> I noticed that exitCode is never modified in agentA() or agentB(), so >> there isn't much point to having it. If you reach the bottom of the >> function, it passed, so PASSED can be returned. The code would be more >> clear if it did this. As-is it is implied that you can reach the >> bottom when it fails. > > I resisted any and all urges to do any kind of unrelated code cleanup in > the tests - once you start you may end up doing a full rewrite. > >> Is detaching the threads along the failure paths really needed? exit() >> is called, so this would seem to make it unnecessary. > > You're right that isn't necessary. I'll remove the changes from before > the exits in ji05t001.c > >> I prefer assignments not to be embedded inside the "if" condition. The >> DetachCurrentThread code in THREAD_return() is much more readable than >> the similar code in agentA() and agentB(). > > It's an existing style already used in that test e.g. > > ?287???? if ((res = > ?288???????????? 
JNI_ENV_PTR(vm)->AttachCurrentThread( > ?289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) { > > and I don't mind it, so I'd prefer not to change it. > >> In the test: >> >> ?? 54???????? // Generally as long as we don't crash of throw unexpected >> ?? 55???????? // exceptions then the test passes. In some cases we >> know exactly >> >> "of" should be "or". > > Well spotted. Thanks. > >> Shouldn't you be catching exceptions for all the Thread methods you >> are calling? Otherwise the test will exit if one is thrown, and the >> above comment indicates that you don't want this. > > I'm not expecting there to be any exceptions from any of the called > methods. That would potentially indicate a problem in handling the > terminated native thread, so would indicate a test failure. > >> Don't we normally put these tests in a package? > > Doesn't seem to be any hard and fast rule. I only uses packages when > they are important for the test. In runtime we have 905 java files and > only 116 have a package statement. It varies elsewhere. > > Thanks, > David > >> thanks, >> >> Chris >> >> On 7/5/18 2:58 AM, David Holmes wrote: >>> Solaris compiler complains about doing a return from inside a >>> do-while loop. I'll have to rework part of the fix tomorrow. >>> >>> David >>> >>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>> >>>> Problem: >>>> >>>> The tests create native threads that attach to the VM through >>>> JNI_AttachCurrentThread but which then terminate without detaching >>>> themselves. When the VM exits and we're using Flight Recorder >>>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>>> wants to print the per-thread CPU usage. 
When we encounter the >>>> threads that have terminated already the low-level >>>> pthread_getcpuclockid call returns ESRCH but the code doesn't >>>> expect that and so fails an assert in debug mode and can SEGV in >>>> product mode. >>>> >>>> Solution: >>>> >>>> Serviceability-side: fix the tests >>>> >>>> Change the tests so that the threads detach before terminating. The >>>> two tests are (surprisingly) written in completely different styles, >>>> so the solution also takes on two different styles. >>>> >>>> Runtime-side: make the VM more robust in the face of JNI attached >>>> threads that terminate before detaching, and add a regression test >>>> >>>> I took a good look at the low-level code for interacting with >>>> arbitrary threads and as far as I can see the problem only exists >>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the >>>> potential for a library call failure just reports an error value >>>> (such as -1 for the cpu time used). >>>> >>>> So the fix is simply to allow for ESRCH when calling >>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>> >>>> I created a new regression test to create a new native thread, >>>> attach it and then let it terminate while still attached. The Java >>>> code then calls various Thread and ThreadMXBean functions on it to >>>> ensure there are no crashes or unexpected exceptions.
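As a rough illustration of the runtime-side fix described above (tolerate a failing pthread_getcpuclockid and report -1 instead of asserting), here is a minimal sketch in C. It is not the HotSpot code: the helper name thread_cpu_time_ns is hypothetical, it assumes Linux with glibc, and for simplicity it maps every non-zero return code (not only ESRCH) to -1.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper (an assumption for illustration, not the VM's
 * function): returns the CPU time consumed by `thread` in nanoseconds,
 * or -1 if it cannot be determined. */
static int64_t thread_cpu_time_ns(pthread_t thread) {
    clockid_t clock_id;

    /* pthread_getcpuclockid returns an error number directly (it does
     * not set errno). ESRCH is the case discussed above: the thread has
     * already terminated. This sketch treats any failure the same way
     * and reports -1 rather than asserting. */
    int rc = pthread_getcpuclockid(thread, &clock_id);
    if (rc != 0) {
        return -1;
    }

    struct timespec ts;
    if (clock_gettime(clock_id, &ts) != 0) {
        return -1;
    }
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

A caller would treat a negative result as "CPU time unavailable" and carry on, which matches the error-value convention the message says is already used elsewhere in the low-level code.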
>>>> >>>> Testing: >>>> - old tests with fixed run-time >>>> - old run-time with fixed tests >>>> - mach tier4 (which exposed the problem - that's where we enable >>>> Flight recorder for the tests) [in progress] >>>> - mach5 tier 1-3 for good measure [in progress] >>>> - new regression test >>>> >>>> Thanks, >>>> David >> >> >> From gary.adams at oracle.com Fri Jul 6 12:54:58 2018 From: gary.adams at oracle.com (Gary Adams) Date: Fri, 06 Jul 2018 08:54:58 -0400 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com> References: <5B3E2FC7.1060303@oracle.com> <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com> Message-ID: <5B3F66A2.9020803@oracle.com> Yes, as part of the testing I went back to include windows-x64-debug and found the crashes are removed by this simplification that removes VirtualMachineManager. Once this fix is pushed, I'll close 8197938 as a duplicate. On 7/5/18, 7:26 PM, serguei.spitsyn at oracle.com wrote: > Hi Gary, > > One thing is not clear. > The 8206007 is linked to the 8197938 which tags this test in the > ProblemList.txt. > This line is removed: > -vmTestbase/nsk/jdb/exclude/exclude001/exclude001.java 8197938 windows-all > > > but the bug 8197938 is still open. > Is it intentional or some kind of a typo? > Or maybe we have to close the 8197938 as a dup of 8206007? > > Otherwise, the fix looks good to me. > > Thank you for the extra testing! > > Thanks, > Serguei > > > On 7/5/18 07:48, Gary Adams wrote: >> A simple test run using "exclude none" shows 625K methods are being >> observed. >> The bulk of those methods were due to the last class accessed in the >> test - VirtualMachineManager. >> >> It's not important that this particular call is used. The test is >> simply demonstrating that >> filters work for other packages than java and javax. >> >> This proposed fix uses a simpler lookup for GregorianCalendar.
>> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 >> Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gary.adams at oracle.com Fri Jul 6 12:55:19 2018 From: gary.adams at oracle.com (Gary Adams) Date: Fri, 06 Jul 2018 08:55:19 -0400 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: References: <5B3E2FC7.1060303@oracle.com> Message-ID: <5B3F66B7.4000400@oracle.com> This change reduces the test by ~180K method observations (29%). It also depends on less complicated methods. e.g. VirtualMachineManager deals with more class and service loaders On 7/5/18, 5:28 PM, Chris Plummer wrote: > Hi Gary, > > The changes look good. How much is the reducing execution by? > > thanks, > > Chris > > On 7/5/18 7:48 AM, Gary Adams wrote: >> A simple test run using "exclude none" shows 625K methods are being >> observed. >> The bulk of those methods were due to the last class accessed in the >> test - VirtualMachineManager. >> >> It's not important that this particular call is used. The test is >> simply demonstrating that >> filters work for other packages than java and javax. >> >> This proposed fix uses a simpler lookup for GregorianCalendar. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 >> Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/ > > > From serguei.spitsyn at oracle.com Fri Jul 6 16:25:58 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Jul 2018 09:25:58 -0700 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <5B3F66A2.9020803@oracle.com> References: <5B3E2FC7.1060303@oracle.com> <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com> <5B3F66A2.9020803@oracle.com> Message-ID: <40cbc6d4-4472-f500-3fb9-c70f53d496b2@oracle.com> An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Fri Jul 6 17:47:57 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 6 Jul 2018 10:47:57 -0700 Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times out with Xcomp on sparc In-Reply-To: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> Message-ID: <7c533564-8deb-189c-2065-8cc73be2a785@oracle.com> Hi David, It looks good. I agree with Mikael, it is a nice speedup! Thanks, Serguei On 7/5/18 22:21, David Holmes wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8205966 > webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/ > > One of the @run variants was taking around 15x longer to execute. That > variant uses the InMemoryJavaCompiler which involves a lot of classes > and code execution. The test was enabling method entry event > generation for all of main, resulting in the massive slowdown. > > The fix is to add a new breakpoint() function that gets called after > the in-memory compilation setup is done, and we initially run the test > to that point before enabling the events. > > The problem @run now only takes 2x the other tests and so should avoid > the timeouts. > > Testing: mach5 tier4 solaris-sparc > ???????? mach5 tier 1-3 > > Thanks, > David From chris.plummer at oracle.com Fri Jul 6 19:17:03 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 6 Jul 2018 12:17:03 -0700 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <5B3F66B7.4000400@oracle.com> References: <5B3E2FC7.1060303@oracle.com> <5B3F66B7.4000400@oracle.com> Message-ID: <308e0c82-514f-a63c-89a0-1c9e983d549a@oracle.com> Ok. Still seems like an awful lot of methods being invoked, but is nice improvement. Chris On 7/6/18 5:55 AM, Gary Adams wrote: > This change reduces the test by ~180K method observations (29%). > It also depends on less complicated methods. e.g. 
VirtualMachineManager > deals with more class and service loaders > > On 7/5/18, 5:28 PM, Chris Plummer wrote: >> Hi Gary, >> >> The changes look good. How much is the reducing execution by? >> >> thanks, >> >> Chris >> >> On 7/5/18 7:48 AM, Gary Adams wrote: >>> A simple test run using "exclude none" shows 625K methods are being >>> observed. >>> The bulk of those methods were due to the last class accessed in the >>> test - VirtualMachineManager. >>> >>> It's not important that this particular call is used. The test is >>> simply demonstrating that >>> filters work for other packages than java and javax. >>> >>> This proposed fix uses a simpler lookup for GregorianCalendar. >>> >>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 >>> ? Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/ >> >> >> > From david.holmes at oracle.com Sun Jul 8 23:58:32 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 09:58:32 +1000 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> Message-ID: <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> tl;dr skip the new regression test on Solaris New webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ This excludes the test from running on Solaris, so the makefile doesn't bother compiling this native test and the Java part of the test adds: ! * @requires os.family != "windows" & os.family != "solaris" * @summary Basic test of Thread and ThreadMXBean queries on a natively * attached thread that has failed to detach before terminating. 
+ * @comment The native code only supports POSIX so no windows testing; also + * we have to skip solaris as a terminating thread that fails to + * detach will hit an infinite loop due to TLS destructor issues - see + * comments in JDK-8156708 Note this means that Solaris is not affected by the original issue because a still-attached native thread can't actually terminate due to the TLS destructor infinite-loop issue. Thanks, David On 6/07/2018 6:07 PM, David Holmes wrote: > The new test is hanging on Solaris. I just discovered we don't > run these tests on Solaris until tier4. > > David > > On 6/07/2018 8:40 AM, David Holmes wrote: >> Hi Chris, >> >> Thanks for looking at this. >> >> Updated webrev: >> >> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >> >> Only real changes in ji05t001.c. (And fixed typo in the new test) >> >> More below ... >> >> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>> Hi David, >>> >>> Solaris problems aside, overall it looks fine. Some minor things I >>> noted: >>> >>> I noticed that exitCode is never modified in agentA() or agentB(), so >>> there isn't much point to having it. If you reach the bottom of the >>> function, it passed, so PASSED can be returned. The code would be >>> more clear if it did this. As-is it is implied that you can reach the >>> bottom when it fails. >> >> I resisted any and all urges to do any kind of unrelated code cleanup >> in the tests - once you start you may end up doing a full rewrite. >> >>> Is detaching the threads along the failure paths really needed? >>> exit() is called, so this would seem to make it unnecessary. >> >> You're right that isn't necessary. I'll remove the changes from before >> the exits in ji05t001.c >> >>> I prefer assignments not to be embedded inside the "if" condition. >>> The DetachCurrentThread code in THREAD_return() is much more readable >>> than the similar code in agentA() and agentB(). >> >> It's an existing style already used in that test e.g. >> >> ??287???? 
if ((res = >> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != >> 0) { >> >> and I don't mind it, so I'd prefer not to change it. >> >>> In the test: >>> >>> ?? 54???????? // Generally as long as we don't crash of throw unexpected >>> ?? 55???????? // exceptions then the test passes. In some cases we >>> know exactly >>> >>> "of" should be "or". >> >> Well spotted. Thanks. >> >>> Shouldn't you be catching exceptions for all the Thread methods you >>> are calling? Otherwise the test will exit if one is thrown, and the >>> above comment indicates that you don't want this. >> >> I'm not expecting there to be any exceptions from any of the called >> methods. That would potentially indicate a problem in handling the >> terminated native thread, so would indicate a test failure. >> >>> Don't we normally put these tests in a package? >> >> Doesn't seem to be any hard and fast rule. I only uses packages when >> they are important for the test. In runtime we have 905 java files and >> only 116 have a package statement. It varies elsewhere. >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>> >>> On 7/5/18 2:58 AM, David Holmes wrote: >>>> Solaris compiler complains about doing a return from inside a >>>> do-while loop. I'll have to rework part of the fix tomorrow. >>>> >>>> David >>>> >>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>> >>>>> Problem: >>>>> >>>>> The tests create native threads that attach to the VM through >>>>> JNI_AttachCurrentThread but which then terminate without detaching >>>>> themselves. When the VM exits and we're using Flight Recorder >>>>> "dumponexit" this leads to a call to VM_PrintThreads that in part >>>>> wants to print the per-thread CPU usage. 
When we encounter the >>>>> threads that have terminated already the low-level >>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't >>>>> expect that and so fails an assert in debug mode and can SEGV in >>>>> product mode. >>>>> >>>>> Solution: >>>>> >>>>> Serviceability-side: fix the tests >>>>> >>>>> Change the tests so that the threads detach before terminating. The >>>>> two tests are (surprisingly) written in completely different >>>>> styles, so the solution also takes on two different styles. >>>>> >>>>> Runtime-side: make the VM more robust in the face of JNI attached >>>>> threads that terminate before detaching, and add a regression test >>>>> >>>>> I took a good look at the low-level code for interacting with >>>>> arbitrary threads and as far as I can see the problem only exists >>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the >>>>> potential for a library call failure just reports an error value >>>>> (such as -1 for the cpu time used). >>>>> >>>>> So the fix is simply to allow for ESRCH when calling >>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>>> >>>>> I created a new regression test to create a new native thread, >>>>> attach it and then let it terminate while still attached. The Java >>>>> code then calls various Thread and ThreadMXBean functions on it to >>>>> ensure there are no crashes or unexpected exceptions.
>>>>> >>>>> Testing: >>>>> - old tests with fixed run-time >>>>> - old run-time with fixed tests >>>>> - mach tier4 (which exposed the problem - that's where we enable >>>>> Flight recorder for the tests) [in progress] >>>>> - mach5 tier 1-3 for good measure [in progress] >>>>> - new regression test >>>>> >>>>> Thanks, >>>>> David >>> >>> >>> From david.holmes at oracle.com Sun Jul 8 23:59:10 2018 From: david.holmes at oracle.com (David Holmes) Date: Mon, 9 Jul 2018 09:59:10 +1000 Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times out with Xcomp on sparc In-Reply-To: <7c533564-8deb-189c-2065-8cc73be2a785@oracle.com> References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com> <7c533564-8deb-189c-2065-8cc73be2a785@oracle.com> Message-ID: Thanks Serguei! David On 7/07/2018 3:47 AM, serguei.spitsyn at oracle.com wrote: > Hi David, > > It looks good. > I agree with Mikael, it is a nice speedup! > > Thanks, > Serguei > > > > On 7/5/18 22:21, David Holmes wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966 >> webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/ >> >> One of the @run variants was taking around 15x longer to execute. That >> variant uses the InMemoryJavaCompiler which involves a lot of classes >> and code execution. The test was enabling method entry event >> generation for all of main, resulting in the massive slowdown. >> >> The fix is to add a new breakpoint() function that gets called after >> the in-memory compilation setup is done, and we initially run the test >> to that point before enabling the events. >> >> The problem @run now only takes 2x the other tests and so should avoid >> the timeouts. >> >> Testing: mach5 tier4 solaris-sparc >>
mach5 tier 1-3 >> >> Thanks, >> David > From ralf.schmelter at sap.com Mon Jul 9 14:04:34 2018 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Mon, 9 Jul 2018 14:04:34 +0000 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> Message-ID: <21e17c666ac04930a0e4bb4869e989da@sap.com> Hi Chris, thanks for the review. > What testing have you done? I've tested the change by debugging by hand in Eclipse and jdb, running the com/sun/jdi rtreg tests and the jdwp jck tests. Analogous code has been running in the SAP JVM for many years. > How long does this test take to run? 15 s according to jtreg. > What happens if for some reason SOE is never thrown? It's not clear to > me what the script would do in this case. It is treated as passed (which is not ideal). > In answer to the ShellScaffold.sh question, there is already work > underway to convert to pure Java tests. See JDK-8201652. Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done. Best regards, Ralf -----Original Message----- From: Chris Plummer [mailto:chris.plummer at oracle.com] Sent: Friday, 6 July 2018 00:37 To: Schmelter, Ralf ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior Hi Ralf, Overall looks good, but I do have a few comments and questions. Please update the copyright. What testing have you done? How long does this test take to run? What happens if for some reason SOE is never thrown? It's not clear to me what the script would do in this case. In answer to the ShellScaffold.sh question, there is already work underway to convert to pure Java tests. See JDK-8201652.
I'm not certain if it is ok for you to just submit this new shell script, or if it should be rewritten in pure Java. Most of the work to convert the scripts has already been done but was put on hold. Maybe Serguei can comment and guide you on how it would be done in Java. thanks, Chris On 7/3/18 3:43 AM, Schmelter, Ralf wrote: > Hi All, > > Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ . > > This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack. > > I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is. > > Best regards, > Ralf Schmelter From chris.plummer at oracle.com Mon Jul 9 18:22:46 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 9 Jul 2018 11:22:46 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> Message-ID: <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> Hi David, Would it be better to problem list this test on solaris using JDK-8156708? That way when JDK-8156708 is fixed it can come off the problem list and start executing on solaris. thanks, Chris On 7/8/18 4:58 PM, David Holmes wrote: > tl;dr skip the new regression test on Solaris > > New webrev: > > http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ > > This excludes the test from running on Solaris, so the makefile > doesn't bother compiling this native test and the Java part of the > test adds: > > ! * @requires os.family != "windows" & os.family != "solaris" >
* @summary Basic test of Thread and ThreadMXBean queries on a natively > ? *????????? attached thread that has failed to detach before > terminating. > + * @comment The native code only supports POSIX so no windows > testing; also > + *????????? we have to skip solaris as a terminating thread that > fails to > + *????????? detach will hit an infinite loop due to TLS destructor > issues - see > + *????????? comments in JDK-8156708 > > Note this means that Solaris is not affected by the original issue > because a still-attached native thread can't actually terminate due to > the TLS destructor infinite-loop issue. > > Thanks, > David > > On 6/07/2018 6:07 PM, David Holmes wrote: >> The new test is hanging on Solaris. I just discovered we don't >> run these tests on Solaris until tier4. >> >> David >> >> On 6/07/2018 8:40 AM, David Holmes wrote: >>> Hi Chris, >>> >>> Thanks for looking at this. >>> >>> Updated webrev: >>> >>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>> >>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>> >>> More below ... >>> >>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>> Hi David, >>>> >>>> Solaris problems aside, overall it looks fine. Some minor things I >>>> noted: >>>> >>>> I noticed that exitCode is never modified in agentA() or agentB(), >>>> so there isn't much point to having it. If you reach the bottom of >>>> the function, it passed, so PASSED can be returned. The code would >>>> be more clear if it did this. As-is it is implied that you can >>>> reach the bottom when it fails. >>> >>> I resisted any and all urges to do any kind of unrelated code >>> cleanup in the tests - once you start you may end up doing a full >>> rewrite. >>> >>>> Is detaching the threads along the failure paths really needed? >>>> exit() is called, so this would seem to make it unnecessary. >>> >>> You're right that isn't necessary. 
I'll remove the changes from >>> before the exits in ji05t001.c >>> >>>> I prefer assignments not to be embedded inside the "if" condition. >>>> The DetachCurrentThread code in THREAD_return() is much more >>>> readable than the similar code in agentA() and agentB(). >>> >>> It's an existing style already used in that test e.g. >>> >>> ??287???? if ((res = >>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) >>> != 0) { >>> >>> and I don't mind it, so I'd prefer not to change it. >>> >>>> In the test: >>>> >>>> ?? 54???????? // Generally as long as we don't crash of throw >>>> unexpected >>>> ?? 55???????? // exceptions then the test passes. In some cases we >>>> know exactly >>>> >>>> "of" should be "or". >>> >>> Well spotted. Thanks. >>> >>>> Shouldn't you be catching exceptions for all the Thread methods you >>>> are calling? Otherwise the test will exit if one is thrown, and the >>>> above comment indicates that you don't want this. >>> >>> I'm not expecting there to be any exceptions from any of the called >>> methods. That would potentially indicate a problem in handling the >>> terminated native thread, so would indicate a test failure. >>> >>>> Don't we normally put these tests in a package? >>> >>> Doesn't seem to be any hard and fast rule. I only uses packages when >>> they are important for the test. In runtime we have 905 java files >>> and only 116 have a package statement. It varies elsewhere. >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>> Solaris compiler complains about doing a return from inside >>>>> a do-while loop. I'll have to rework part of the fix tomorrow. 
>>>>> >>>>> David >>>>> >>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>> >>>>>> Problem: >>>>>> >>>>>> The tests create native threads that attach to the VM through >>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>> detaching themselves. When the VM exits and we're using Flight >>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads >>>>>> that in part wants to print the per-thread CPU usage. When we >>>>>> encounter the threads that have terminated already the low-level >>>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't >>>>>> expect that and so fails an assert in debug mode and can SEGV in >>>>>> product mode. >>>>>> >>>>>> Solution: >>>>>> >>>>>> Serviceability-side: fix the tests >>>>>> >>>>>> Change the tests so that the threads detach before terminating. >>>>>> The two tests are (surprisingly) written in completely different >>>>>> styles, so the solution also takes on two different styles. >>>>>> >>>>>> Runtime-side: make the VM more robust in the face of JNI attached >>>>>> threads that terminate before detaching, and add a regression test >>>>>> >>>>>> I took a good look at the low-level code for interacting with >>>>>> arbitrary threads and as far as I can see the problem only exists >>>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere >>>>>> the potential for a library call failure just reports an error >>>>>> value (such as -1 for the cpu time used). >>>>>> >>>>>> So the fix is simply to allow for ESRCH when calling >>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case. >>>>>> >>>>>> I created a new regression test to create a new native thread, >>>>>> attach it and then let it terminate while still attached.
The >>>>>> Java code then calls various Thread and ThreadMXBean functions on >>>>>> it to ensure there are no crashes or unexpected exceptions. >>>>>> >>>>>> Testing: >>>>>> - old tests with fixed run-time >>>>>> - old run-time with fixed tests >>>>>> - mach tier4 (which exposed the problem - that's where we >>>>>> enable Flight recorder for the tests) [in progress] >>>>>> - mach5 tier 1-3 for good measure [in progress] >>>>>> - new regression test >>>>>> >>>>>> Thanks, >>>>>> David >>>> >>>> >>>> From jini.george at oracle.com Mon Jul 9 18:44:33 2018 From: jini.george at oracle.com (Jini George) Date: Tue, 10 Jul 2018 00:14:33 +0530 Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X Message-ID: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> Requesting reviews for enabling SA tests on OS X for Mach5. https://bugs.openjdk.java.net/browse/JDK-8199700 Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ The changes are mostly to include the addition of sudo privileges to the SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests (those using clhsdb) have been refactored to use ClhsdbLauncher for ease of maintenance. This also avoids checks for Platform.shouldSAAttach() for corefile related test cases. More details have been provided in JIRA. Thanks, Jini.
From david.holmes at oracle.com Mon Jul 9 21:41:02 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 07:41:02 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
Message-ID: <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>

Hi Chris,

On 10/07/2018 4:22 AM, Chris Plummer wrote:
> Hi David,
>
> Would it be better to problem list this test on solaris using
> JDK-8156708. That way when JDK-8156708 is fixed it can come off the
> problem list and start executing on solaris.

JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could only fix this for VM created threads. The general problem of TLS destructors looping if a thread terminates without detaching from the VM is not solvable - other than by not using TLS in the VM.

Thanks,
David

> thanks,
>
> Chris
>
> On 7/8/18 4:58 PM, David Holmes wrote:
>> tl;dr skip the new regression test on Solaris
>>
>> New webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>
>> This excludes the test from running on Solaris, so the makefile
>> doesn't bother compiling this native test and the Java part of the
>> test adds:
>>
>> ! * @requires os.family != "windows" & os.family != "solaris"
>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>   *          attached thread that has failed to detach before terminating.
>> + * @comment The native code only supports POSIX so no windows testing; also
>> + *          we have to skip solaris as a terminating thread that fails to
>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>> + *          comments in JDK-8156708
>>
>> Note this means that Solaris is not affected by the original issue
>> because a still-attached native thread can't actually terminate due to
>> the TLS destructor infinite-loop issue.
>>
>> Thanks,
>> David
>>
>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>> The new test is hanging on Solaris. I just discovered we don't
>>> run these tests on Solaris until tier4.
>>>
>>> David
>>>
>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> Updated webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>
>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>
>>>> More below ...
>>>>
>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> Solaris problems aside, overall it looks fine. Some minor things I
>>>>> noted:
>>>>>
>>>>> I noticed that exitCode is never modified in agentA() or agentB(),
>>>>> so there isn't much point to having it. If you reach the bottom of
>>>>> the function, it passed, so PASSED can be returned. The code would
>>>>> be more clear if it did this. As-is it is implied that you can
>>>>> reach the bottom when it fails.
>>>>
>>>> I resisted any and all urges to do any kind of unrelated code
>>>> cleanup in the tests - once you start you may end up doing a full
>>>> rewrite.
>>>>
>>>>> Is detaching the threads along the failure paths really needed?
>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>
>>>> You're right that isn't necessary. I'll remove the changes from
>>>> before the exits in ji05t001.c
>>>>
>>>>> I prefer assignments not to be embedded inside the "if" condition.
>>>>> The DetachCurrentThread code in THREAD_return() is much more
>>>>> readable than the similar code in agentA() and agentB().
>>>>
>>>> It's an existing style already used in that test e.g.
>>>>
>>>>   287     if ((res =
>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>
>>>> and I don't mind it, so I'd prefer not to change it.
>>>>
>>>>> In the test:
>>>>>
>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>
>>>>> "of" should be "or".
>>>>
>>>> Well spotted. Thanks.
>>>>
>>>>> Shouldn't you be catching exceptions for all the Thread methods you
>>>>> are calling? Otherwise the test will exit if one is thrown, and the
>>>>> above comment indicates that you don't want this.
>>>>
>>>> I'm not expecting there to be any exceptions from any of the called
>>>> methods. That would potentially indicate a problem in handling the
>>>> terminated native thread, so would indicate a test failure.
>>>>
>>>>> Don't we normally put these tests in a package?
>>>>
>>>> Doesn't seem to be any hard and fast rule. I only use packages when
>>>> they are important for the test. In runtime we have 905 java files
>>>> and only 116 have a package statement. It varies elsewhere.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>> Solaris compiler complains about doing a return from inside
>>>>>> a do-while loop. I'll have to rework part of the fix tomorrow.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>
>>>>>>> Problem:
>>>>>>>
>>>>>>> The tests create native threads that attach to the VM through
>>>>>>> JNI_AttachCurrentThread but which then terminate without
>>>>>>> detaching themselves. When the VM exits and we're using Flight
>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads
>>>>>>> that in part wants to print the per-thread CPU usage. When we
>>>>>>> encounter the threads that have terminated already the low level
>>>>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't
>>>>>>> expect that and so fails an assert in debug mode and can SEGV in
>>>>>>> product mode.
>>>>>>>
>>>>>>> Solution:
>>>>>>>
>>>>>>> Serviceability-side: fix the tests
>>>>>>>
>>>>>>> Change the tests so that the threads detach before terminating.
>>>>>>> The two tests are (surprisingly) written in completely different
>>>>>>> styles, so the solution also takes on two different styles.
>>>>>>>
>>>>>>> Runtime-side: make the VM more robust in the face of JNI attached
>>>>>>> threads that terminate before detaching, and add a regression test
>>>>>>>
>>>>>>> I took a good look at the low-level code for interacting with
>>>>>>> arbitrary threads and as far as I can see the problem only exists
>>>>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere
>>>>>>> the potential for a library call failure just reports an error
>>>>>>> value (such as -1 for the cpu time used).
>>>>>>>
>>>>>>> So the fix is simply to allow for ESRCH when calling
>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>>>>
>>>>>>> I created a new regression test to create a new native thread,
>>>>>>> attach it and then let it terminate while still attached. The
>>>>>>> java code then calls various Thread and ThreadMXBean functions on
>>>>>>> it to ensure there are no crashes or unexpected exceptions.
>>>>>>>
>>>>>>> Testing:
>>>>>>>   - old tests with fixed run-time
>>>>>>>   - old run-time with fixed tests
>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>   - new regression test
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>
>>>>>
>>>>>
>
>

From chris.plummer at oracle.com Mon Jul 9 21:50:09 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Jul 2018 14:50:09 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
Message-ID:

On 7/9/18 2:41 PM, David Holmes wrote:
> Hi Chris,
>
> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>> Hi David,
>>
>> Would it be better to problem list this test on solaris using
>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the
>> problem list and start executing on solaris.
>
> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could
> only fix this for VM created threads. The general problem of TLS
> destructors looping if a thread terminates without detaching from the
> VM is not solvable - other than by not using TLS in the VM.

Ok, I misunderstood your comments in the test.

Changes look fine.

Chris

>
> Thanks,
> David

From david.holmes at oracle.com Mon Jul 9 22:17:13 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 08:17:13 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To:
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
Message-ID:

Thanks Chris!

Can I please get a second review.

David

On 10/07/2018 7:50 AM, Chris Plummer wrote:
> On 7/9/18 2:41 PM, David Holmes wrote:
>> Hi Chris,
>>
>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>> Hi David,
>>>
>>> Would it be better to problem list this test on solaris using
>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the
>>> problem list and start executing on solaris.
>>
>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could
>> only fix this for VM created threads. The general problem of TLS
>> destructors looping if a thread terminates without detaching from the
>> VM is not solvable - other than by not using TLS in the VM.
> Ok, I misunderstood your comments in the test.
>
> Changes look fine.
> > Chris >> >> Thanks, >> David >> >>> thanks, >>> >>> Chris >>> >>> On 7/8/18 4:58 PM, David Holmes wrote: >>>> tl;dr skip the new regression test on Solaris >>>> >>>> New webrev: >>>> >>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>> >>>> This excludes the test from running on Solaris, so the makefile >>>> doesn't bother compiling this native test and the Java part of the >>>> test adds: >>>> >>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>> natively >>>> ? *????????? attached thread that has failed to detach before >>>> terminating. >>>> + * @comment The native code only supports POSIX so no windows >>>> testing; also >>>> + *????????? we have to skip solaris as a terminating thread that >>>> fails to >>>> + *????????? detach will hit an infinite loop due to TLS destructor >>>> issues - see >>>> + *????????? comments in JDK-8156708 >>>> >>>> Note this means that Solaris is not affected by the original issue >>>> because a still-attached native thread can't actually terminate due >>>> to the TLS destructor infinite-loop issue. >>>> >>>> Thanks, >>>> David >>>> >>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>> The new test is hanging on Solaris. I just discovered we >>>>> don't run these tests on Solaris until tier4. >>>>> >>>>> David >>>>> >>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> Thanks for looking at this. >>>>>> >>>>>> Updated webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>> >>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>> >>>>>> More below ... >>>>>> >>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Solaris problems aside, overall it looks fine. Some minor things >>>>>>> I noted: >>>>>>> >>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>> agentB(), so there isn't much point to having it. 
If you reach >>>>>>> the bottom of the function, it passed, so PASSED can be returned. >>>>>>> The code would be more clear if it did this. As-is it is implied >>>>>>> that you can reach the bottom when it fails. >>>>>> >>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>> cleanup in the tests - once you start you may end up doing a full >>>>>> rewrite. >>>>>> >>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>> >>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>> before the exits in ji05t001.c >>>>>> >>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>> >>>>>> It's an existing style already used in that test e.g. >>>>>> >>>>>> ??287???? if ((res = >>>>>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>> 0)) != 0) { >>>>>> >>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>> >>>>>>> In the test: >>>>>>> >>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>> unexpected >>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>> we know exactly >>>>>>> >>>>>>> "of" should be "or". >>>>>> >>>>>> Well spotted. Thanks. >>>>>> >>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>> and the above comment indicates that you don't want this. >>>>>> >>>>>> I'm not expecting there to be any exceptions from any of the >>>>>> called methods. That would potentially indicate a problem in >>>>>> handling the terminated native thread, so would indicate a test >>>>>> failure. >>>>>> >>>>>>> Don't we normally put these tests in a package? 
>>>>>> >>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>> when they are important for the test. In runtime we have 905 java >>>>>> files and only 116 have a package statement. It varies elsewhere. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>> Solaris compiler complains about doing a return from >>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>> tomorrow. >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>> >>>>>>>>> Problem: >>>>>>>>> >>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>> detaching themselves. When the VM exits and we're using Flight >>>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads >>>>>>>>> that in part wants to print the per-thread CPU usage. When we >>>>>>>>> encounter the threads that have terminated already the low >>>>>>>>> level pthread_getcpuclockid calls returns ESRCH but the code >>>>>>>>> doesn't expect that and so fails an assert in debug mode and >>>>>>>>> can SEGV in product mode. >>>>>>>>> >>>>>>>>> Solution: >>>>>>>>> >>>>>>>>> Serviceability-side: fix the tests >>>>>>>>> >>>>>>>>> Change the tests so that the threads detach before terminating. >>>>>>>>> The two tests are (surprisingly) written in completely >>>>>>>>> different styles, so the solution also takes on two different >>>>>>>>> styles. 
>>>>>>>>> >>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>> regression test >>>>>>>>> >>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>> Elsewhere the potential for a library call failure just reports >>>>>>>>> an error value (such as -1 for the cpu time used). >>>>>>>>> >>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>> case. >>>>>>>>> >>>>>>>>> I created a new regression test to create a new native thread, >>>>>>>>> attach it and then let it terminate while still attached. The >>>>>>>>> java code then calls various Thread and ThreadMXBean functions >>>>>>>>> on it to ensure there are no crashes or unexpected exceptions. >>>>>>>>> >>>>>>>>> Testing: >>>>>>>>> ??- old tests with fixed run-time >>>>>>>>> ??- old run-time with fixed tests >>>>>>>>> ??- mach tier4 (which exposed the problem - that's where we >>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>> ??- mach5 tier 1-3 for good measure [in progress] >>>>>>>>> ??- new regression test >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>> >>>>>>> >>>>>>> >>> >>> > > From alexey.menkov at oracle.com Mon Jul 9 22:45:41 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 9 Jul 2018 15:45:41 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> Message-ID: +1 couple minor notes (no need to resend review) 
src/hotspot/os/linux/os_linux.cpp please replace 5581 } 5582 else { with } else { test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c please fix error reporting (I suppose you mean "TEST ERROR: pthread_create failed"/"TEST ERROR: pthread_join failed"): 85 if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) { 86 fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res); 87 exit(1); 88 } 89 90 if ((res = pthread_join(thread, NULL)) != 0) { 91 fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res); 92 exit(1); 93 } --alex On 07/09/2018 15:17, David Holmes wrote: > Thanks Chris! > > Can I please get a second review. > > David > > On 10/07/2018 7:50 AM, Chris Plummer wrote: >> On 7/9/18 2:41 PM, David Holmes wrote: >>> Hi Chris, >>> >>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>> Hi David, >>>> >>>> Would it be better to problem list this test on solaris using >>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >>>> problem list and start executing on solaris. >>> >>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>> only fix this for VM created threads. The general problem of TLS >>> destructors looping if a thread terminates without detaching from the >>> VM is not solvable - other than by not using TLS in the VM. >> Ok, I misunderstood your comments in the test. >> >> Changes look fine. >> >> Chris >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>> tl;dr skip the new regression test on Solaris >>>>> >>>>> New webrev: >>>>> >>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>> >>>>> This excludes the test from running on Solaris, so the makefile >>>>> doesn't bother compiling this native test and the Java part of the >>>>> test adds: >>>>> >>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>> ? 
* @summary Basic test of Thread and ThreadMXBean queries on a >>>>> natively >>>>> ? *????????? attached thread that has failed to detach before >>>>> terminating. >>>>> + * @comment The native code only supports POSIX so no windows >>>>> testing; also >>>>> + *????????? we have to skip solaris as a terminating thread that >>>>> fails to >>>>> + *????????? detach will hit an infinite loop due to TLS destructor >>>>> issues - see >>>>> + *????????? comments in JDK-8156708 >>>>> >>>>> Note this means that Solaris is not affected by the original issue >>>>> because a still-attached native thread can't actually terminate due >>>>> to the TLS destructor infinite-loop issue. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>> The new test is hanging on Solaris. I just discovered we >>>>>> don't run these tests on Solaris until tier4. >>>>>> >>>>>> David >>>>>> >>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thanks for looking at this. >>>>>>> >>>>>>> Updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>> >>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>> >>>>>>> More below ... >>>>>>> >>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Solaris problems aside, overall it looks fine. Some minor things >>>>>>>> I noted: >>>>>>>> >>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>> is implied that you can reach the bottom when it fails. >>>>>>> >>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>> cleanup in the tests - once you start you may end up doing a full >>>>>>> rewrite. 
>>>>>>> >>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>> >>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>> before the exits in ji05t001.c >>>>>>> >>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>> >>>>>>> It's an existing style already used in that test e.g. >>>>>>> >>>>>>> ??287???? if ((res = >>>>>>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>> 0)) != 0) { >>>>>>> >>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>> >>>>>>>> In the test: >>>>>>>> >>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>> unexpected >>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>> we know exactly >>>>>>>> >>>>>>>> "of" should be "or". >>>>>>> >>>>>>> Well spotted. Thanks. >>>>>>> >>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>>> and the above comment indicates that you don't want this. >>>>>>> >>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>> called methods. That would potentially indicate a problem in >>>>>>> handling the terminated native thread, so would indicate a test >>>>>>> failure. >>>>>>> >>>>>>>> Don't we normally put these tests in a package? >>>>>>> >>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>> when they are important for the test. In runtime we have 905 java >>>>>>> files and only 116 have a package statement. It varies elsewhere. 
>>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>> tomorrow. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>> >>>>>>>>>> Problem: >>>>>>>>>> >>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>> detaching themselves. When the VM exits and we're using Flight >>>>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads >>>>>>>>>> that in part wants to print the per-thread CPU usage. When we >>>>>>>>>> encounter the threads that have terminated already the low >>>>>>>>>> level pthread_getcpuclockid calls returns ESRCH but the code >>>>>>>>>> doesn't expect that and so fails an assert in debug mode and >>>>>>>>>> can SEGV in product mode. >>>>>>>>>> >>>>>>>>>> Solution: >>>>>>>>>> >>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>> >>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>> completely different styles, so the solution also takes on two >>>>>>>>>> different styles. >>>>>>>>>> >>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>> regression test >>>>>>>>>> >>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>>>> Elsewhere the potential for a library call failure just
>>>>>>>>>> reports an error value (such as -1 for the cpu time used).
>>>>>>>>>>
>>>>>>>>>> So the fix is simply to allow for ESRCH when calling
>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that
>>>>>>>>>> case.
>>>>>>>>>>
>>>>>>>>>> I created a new regression test to create a new native thread,
>>>>>>>>>> attach it and then let it terminate while still attached. The
>>>>>>>>>> java code then calls various Thread and ThreadMXBean functions
>>>>>>>>>> on it to ensure there are no crashes or unexpected exceptions.
>>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>   - new regression test
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>>
>>
>>

From david.holmes at oracle.com Mon Jul 9 23:22:21 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 09:22:21 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To:
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
Message-ID: <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com>

Adding back runtime

On 10/07/2018 8:45 AM, Alex Menkov wrote:
> +1

Thanks for looking at this Alex!

> couple minor notes (no need to resend review)

Webrev updated in place (v3) for others to see.

> src/hotspot/os/linux/os_linux.cpp
> please replace
>
> 5581     }
> 5582     else {
>
> with
>     } else {

Done.
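
For readers following the thread, the runtime-side change under review comes down to tolerating a failure (ESRCH in particular) from pthread_getcpuclockid and reporting -1 instead of asserting. The following is only a minimal sketch of that idea in plain C, not the actual os_linux.cpp code; the helper name thread_cpu_time_ns is hypothetical and chosen purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper (an assumption for illustration, not a HotSpot
 * function): returns the CPU time consumed by `thread` in nanoseconds,
 * or -1 if it cannot be determined.
 */
int64_t thread_cpu_time_ns(pthread_t thread) {
    clockid_t clockid;
    struct timespec ts;

    /* pthread functions return the error number directly instead of
     * setting errno. */
    int rc = pthread_getcpuclockid(thread, &clockid);
    if (rc != 0) {
        /* ESRCH is the case discussed in the thread (no live thread
         * matches the handle). Any failure is reported as "unknown"
         * so that diagnostic code can carry on. */
        return -1;
    }
    if (clock_gettime(clockid, &ts) != 0) {
        return -1;
    }
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Called with pthread_self() this should yield a non-negative value on Linux; a caller printing per-thread CPU usage would treat -1 as "not available" and move on to the next thread.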

> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
> please fix error reporting (I suppose you mean "TEST ERROR:
> pthread_create failed"/"TEST ERROR: pthread_join failed"):
>
>   85   if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
>   86     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>   87     exit(1);
>   88   }
>   89
>   90   if ((res = pthread_join(thread, NULL)) != 0) {
>   91     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>   92     exit(1);
>   93   }

Fixed - well spotted!

Thanks,
David

> --alex
>
> On 07/09/2018 15:17, David Holmes wrote:
>> Thanks Chris!
>>
>> Can I please get a second review.
>>
>> David
>>
>> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>>> On 7/9/18 2:41 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> Would it be better to problem list this test on solaris using
>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the
>>>>> problem list and start executing on solaris.
>>>>
>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could
>>>> only fix this for VM created threads. The general problem of TLS
>>>> destructors looping if a thread terminates without detaching from
>>>> the VM is not solvable - other than by not using TLS in the VM.
>>> Ok, I misunderstood your comments in the test.
>>>
>>> Changes look fine.
>>>
>>> Chris
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>>> tl;dr skip the new regression test on Solaris
>>>>>>
>>>>>> New webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>>
>>>>>> This excludes the test from running on Solaris, so the makefile
>>>>>> doesn't bother compiling this native test and the Java part of the
>>>>>> test adds:
>>>>>>
>>>>>> !
* @requires os.family != "windows" & os.family != "solaris" >>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>> natively >>>>>> ? *????????? attached thread that has failed to detach before >>>>>> terminating. >>>>>> + * @comment The native code only supports POSIX so no windows >>>>>> testing; also >>>>>> + *????????? we have to skip solaris as a terminating thread that >>>>>> fails to >>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>> destructor issues - see >>>>>> + *????????? comments in JDK-8156708 >>>>>> >>>>>> Note this means that Solaris is not affected by the original issue >>>>>> because a still-attached native thread can't actually terminate >>>>>> due to the TLS destructor infinite-loop issue. >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>> don't run these tests on Solaris until tier4. >>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thanks for looking at this. >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>> >>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>> >>>>>>>> More below ... >>>>>>>> >>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>> things I noted: >>>>>>>>> >>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>>> is implied that you can reach the bottom when it fails. 
>>>>>>>> >>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>> full rewrite. >>>>>>>> >>>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>>> >>>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>>> before the exits in ji05t001.c >>>>>>>> >>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>>> >>>>>>>> It's an existing style already used in that test e.g. >>>>>>>> >>>>>>>> ??287???? if ((res = >>>>>>>> ??288???????????? JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>>> 0)) != 0) { >>>>>>>> >>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>> >>>>>>>>> In the test: >>>>>>>>> >>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>> unexpected >>>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>>> we know exactly >>>>>>>>> >>>>>>>>> "of" should be "or". >>>>>>>> >>>>>>>> Well spotted. Thanks. >>>>>>>> >>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>>>> and the above comment indicates that you don't want this. >>>>>>>> >>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>> handling the terminated native thread, so would indicate a test >>>>>>>> failure. >>>>>>>> >>>>>>>>> Don't we normally put these tests in a package? >>>>>>>> >>>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>>> when they are important for the test. 
In runtime we have 905 >>>>>>>> java files and only 116 have a package statement. It varies >>>>>>>> elsewhere. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>> tomorrow. >>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>> >>>>>>>>>>> Problem: >>>>>>>>>>> >>>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>>> detaching themselves. When the VM exits and we're using >>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>> CPU usage. When we encounter the threads that have terminated >>>>>>>>>>> already the low level pthread_getcpuclockid calls returns >>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert >>>>>>>>>>> in debug mode and can SEGV in product mode. >>>>>>>>>>> >>>>>>>>>>> Solution: >>>>>>>>>>> >>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>> >>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>> two different styles. 
>>>>>>>>>>> >>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>> regression test >>>>>>>>>>> >>>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>>>> Elsewhere the potential for a library call failure just >>>>>>>>>>> reports an error value (such as -1 for the cpu time used). >>>>>>>>>>> >>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>>>> case. >>>>>>>>>>> >>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>   - new regression test
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>>
>>>
>>>

From david.holmes at oracle.com Tue Jul 10 00:14:44 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 10:14:44 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com> <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com>
Message-ID:

Thanks for looking at this Coleen.

David

On 10/07/2018 10:11 AM, coleen.phillimore at oracle.com wrote:
>
> This looks good! Thank you for fixing these failures.
> Coleen
>
> On 7/9/18 7:22 PM, David Holmes wrote:
>> Adding back runtime
>>
>> On 10/07/2018 8:45 AM, Alex Menkov wrote:
>>> +1
>>
>> Thanks for looking at this Alex!
>>
>>> couple minor notes (no need to resend review)
>>
>> Webrev updated in place (v3) for others to see.
>>
>>> src/hotspot/os/linux/os_linux.cpp
>>> please replace
>>>
>>> 5581     }
>>> 5582     else {
>>>
>>> with
>>>      } else {
>>
>> Done.
>>
>>>
>>> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
>>> please fix error reporting (I suppose you mean "TEST ERROR:
>>> pthread_create failed"/"TEST ERROR: pthread_join failed"):
>>>
>>>    85  
if ((res = pthread_create(&thread, NULL, thread_start, NULL)) >>> != 0) { >>> ?? 86???? fprintf(stderr, "TEST ERROR: pthread_created failed: %s >>> (%d)\n", strerror(res), res); >>> ?? 87???? exit(1); >>> ?? 88?? } >>> ?? 89 >>> ?? 90?? if ((res = pthread_join(thread, NULL)) != 0) { >>> ?? 91???? fprintf(stderr, "TEST ERROR: pthread_created failed: %s >>> (%d)\n", strerror(res), res); >>> ?? 92???? exit(1); >>> ?? 93?? } >> >> Fixed - well spotted! >> >> Thanks, >> David >> >>> --alex >>> >>> On 07/09/2018 15:17, David Holmes wrote: >>>> Thanks Chris! >>>> >>>> Can I please get a second review. >>>> >>>> David >>>> >>>> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>>>> On 7/9/18 2:41 PM, David Holmes wrote: >>>>>> Hi Chris, >>>>>> >>>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>>>> Hi David, >>>>>>> >>>>>>> Would it be better to problem list this test on solaris using >>>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off >>>>>>> the problem list and start executing on solaris. >>>>>> >>>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>>>> only fix this for VM created threads. The general problem of TLS >>>>>> destructors looping if a thread terminates without detaching from >>>>>> the VM is not solvable - other than by not using TLS in the VM. >>>>> Ok, I misunderstood your comments in the test. >>>>> >>>>> Changes look fine. >>>>> >>>>> Chris >>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>>>> tl;dr skip the new regression test on Solaris >>>>>>>> >>>>>>>> New webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>>>> >>>>>>>> This excludes the test from running on Solaris, so the makefile >>>>>>>> doesn't bother compiling this native test and the Java part of >>>>>>>> the test adds: >>>>>>>> >>>>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>>>> ? 
* @summary Basic test of Thread and ThreadMXBean queries on a >>>>>>>> natively >>>>>>>> ? *????????? attached thread that has failed to detach before >>>>>>>> terminating. >>>>>>>> + * @comment The native code only supports POSIX so no windows >>>>>>>> testing; also >>>>>>>> + *????????? we have to skip solaris as a terminating thread >>>>>>>> that fails to >>>>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>>>> destructor issues - see >>>>>>>> + *????????? comments in JDK-8156708 >>>>>>>> >>>>>>>> Note this means that Solaris is not affected by the original >>>>>>>> issue because a still-attached native thread can't actually >>>>>>>> terminate due to the TLS destructor infinite-loop issue. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>>>> don't run these tests on Solaris until tier4. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> Thanks for looking at this. >>>>>>>>>> >>>>>>>>>> Updated webrev: >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>>>> >>>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>>>> >>>>>>>>>> More below ... >>>>>>>>>> >>>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>>>> Hi David, >>>>>>>>>>> >>>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>>>> things I noted: >>>>>>>>>>> >>>>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>>>> agentB(), so there isn't much point to having it. If you >>>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be >>>>>>>>>>> returned. The code would be more clear if it did this. As-is >>>>>>>>>>> it is implied that you can reach the bottom when it fails. 
>>>>>>>>>> >>>>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>>>> full rewrite. >>>>>>>>>> >>>>>>>>>>> Is detaching the threads along the failure paths really >>>>>>>>>>> needed? exit() is called, so this would seem to make it >>>>>>>>>>> unnecessary. >>>>>>>>>> >>>>>>>>>> You're right that isn't necessary. I'll remove the changes >>>>>>>>>> from before the exits in ji05t001.c >>>>>>>>>> >>>>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>>>>> much more readable than the similar code in agentA() and >>>>>>>>>>> agentB(). >>>>>>>>>> >>>>>>>>>> It's an existing style already used in that test e.g. >>>>>>>>>> >>>>>>>>>> ??287???? if ((res = >>>>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void >>>>>>>>>> *) 0)) != 0) { >>>>>>>>>> >>>>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>>>> >>>>>>>>>>> In the test: >>>>>>>>>>> >>>>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>>>> unexpected >>>>>>>>>>> ?? 55???????? // exceptions then the test passes. In some >>>>>>>>>>> cases we know exactly >>>>>>>>>>> >>>>>>>>>>> "of" should be "or". >>>>>>>>>> >>>>>>>>>> Well spotted. Thanks. >>>>>>>>>> >>>>>>>>>>> Shouldn't you be catching exceptions for all the Thread >>>>>>>>>>> methods you are calling? Otherwise the test will exit if one >>>>>>>>>>> is thrown, and the above comment indicates that you don't >>>>>>>>>>> want this. >>>>>>>>>> >>>>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>>>> handling the terminated native thread, so would indicate a >>>>>>>>>> test failure. >>>>>>>>>> >>>>>>>>>>> Don't we normally put these tests in a package? 
>>>>>>>>>> >>>>>>>>>> Doesn't seem to be any hard and fast rule. I only uses >>>>>>>>>> packages when they are important for the test. In runtime we >>>>>>>>>> have 905 java files and only 116 have a package statement. It >>>>>>>>>> varies elsewhere. >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>>>> tomorrow. >>>>>>>>>>>> >>>>>>>>>>>> David >>>>>>>>>>>> >>>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>>>> >>>>>>>>>>>>> Problem: >>>>>>>>>>>>> >>>>>>>>>>>>> The tests create native threads that attach to the VM >>>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate >>>>>>>>>>>>> without detaching themselves. When the VM exits and we're >>>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>>>> CPU usage. When we encounter the threads that have >>>>>>>>>>>>> terminated already the low level pthread_getcpuclockid >>>>>>>>>>>>> calls returns ESRCH but the code doesn't expect that and so >>>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode. >>>>>>>>>>>>> >>>>>>>>>>>>> Solution: >>>>>>>>>>>>> >>>>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>>>> >>>>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>>>> two different styles. 
>>>>>>>>>>>>> >>>>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>>>> regression test >>>>>>>>>>>>> >>>>>>>>>>>>> I took a good look at the low-level code for interacting >>>>>>>>>>>>> with arbitrary threads and as far as I can see the problem >>>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on >>>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure >>>>>>>>>>>>> just reports an error value (such as -1 for the cpu time >>>>>>>>>>>>> used). >>>>>>>>>>>>> >>>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in >>>>>>>>>>>>> that case. >>>>>>>>>>>>> >>>>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing:
>>>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>>>   - new regression test
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>

From serguei.spitsyn at oracle.com Tue Jul 10 02:07:39 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Jul 2018 19:07:39 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To:
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
Message-ID: <416fc226-e389-df65-9487-736efc9e7528@oracle.com>

Hi David,

It looks good modulo the minor comments that others have already found.

Could I ask you to fix a couple of really minor issues in new test?

Unneeded spaces are at lines 84 and 51 in .java and .c files:

  83         if (mbean.isThreadCpuTimeSupported() &&
  84             mbean.isThreadCpuTimeEnabled() ) {
  . . .

  51   class_id = (*env)->FindClass (env, "java/lang/Thread");

Thanks,
Serguei

On 7/9/18 15:17, David Holmes wrote:
> Thanks Chris!
>
> Can I please get a second review.
>
> David
>
> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>> On 7/9/18 2:41 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>> Hi David,
>>>>
>>>> Would it be better to problem list this test on solaris using
>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the
>>>> problem list and start executing on solaris.
>>>
>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715.
We could >>> only fix this for VM created threads. The general problem of TLS >>> destructors looping if a thread terminates without detaching from >>> the VM is not solvable - other than by not using TLS in the VM. >> Ok, I misunderstood your comments in the test. >> >> Changes look fine. >> >> Chris >>> >>> Thanks, >>> David >>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>> tl;dr skip the new regression test on Solaris >>>>> >>>>> New webrev: >>>>> >>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>> >>>>> This excludes the test from running on Solaris, so the makefile >>>>> doesn't bother compiling this native test and the Java part of the >>>>> test adds: >>>>> >>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>> natively >>>>> ? *????????? attached thread that has failed to detach before >>>>> terminating. >>>>> + * @comment The native code only supports POSIX so no windows >>>>> testing; also >>>>> + *????????? we have to skip solaris as a terminating thread that >>>>> fails to >>>>> + *????????? detach will hit an infinite loop due to TLS >>>>> destructor issues - see >>>>> + *????????? comments in JDK-8156708 >>>>> >>>>> Note this means that Solaris is not affected by the original issue >>>>> because a still-attached native thread can't actually terminate >>>>> due to the TLS destructor infinite-loop issue. >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>> The new test is hanging on Solaris. I just discovered we >>>>>> don't run these tests on Solaris until tier4. >>>>>> >>>>>> David >>>>>> >>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> Thanks for looking at this. >>>>>>> >>>>>>> Updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>> >>>>>>> Only real changes in ji05t001.c. 
(And fixed typo in the new test) >>>>>>> >>>>>>> More below ... >>>>>>> >>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>> Hi David, >>>>>>>> >>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>> things I noted: >>>>>>>> >>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>> is implied that you can reach the bottom when it fails. >>>>>>> >>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>> full rewrite. >>>>>>> >>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>> >>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>> before the exits in ji05t001.c >>>>>>> >>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>> >>>>>>> It's an existing style already used in that test e.g. >>>>>>> >>>>>>> ??287???? if ((res = >>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>> 0)) != 0) { >>>>>>> >>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>> >>>>>>>> In the test: >>>>>>>> >>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>> unexpected >>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>> we know exactly >>>>>>>> >>>>>>>> "of" should be "or". >>>>>>> >>>>>>> Well spotted. Thanks. >>>>>>> >>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>> you are calling? 
Otherwise the test will exit if one is thrown, >>>>>>>> and the above comment indicates that you don't want this. >>>>>>> >>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>> called methods. That would potentially indicate a problem in >>>>>>> handling the terminated native thread, so would indicate a test >>>>>>> failure. >>>>>>> >>>>>>>> Don't we normally put these tests in a package? >>>>>>> >>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>> when they are important for the test. In runtime we have 905 >>>>>>> java files and only 116 have a package statement. It varies >>>>>>> elsewhere. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>> tomorrow. >>>>>>>>> >>>>>>>>> David >>>>>>>>> >>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>> >>>>>>>>>> Problem: >>>>>>>>>> >>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>> detaching themselves. When the VM exits and we're using >>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>> CPU usage. When we encounter the threads that have terminated >>>>>>>>>> already the low level pthread_getcpuclockid calls returns >>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert >>>>>>>>>> in debug mode and can SEGV in product mode. 
>>>>>>>>>> >>>>>>>>>> Solution: >>>>>>>>>> >>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>> >>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>> two different styles. >>>>>>>>>> >>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>> regression test >>>>>>>>>> >>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>>> Elsewhere the potential for a library call failure just >>>>>>>>>> reports an error value (such as -1 for the cpu time used). >>>>>>>>>> >>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>>> case. >>>>>>>>>> >>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>   - new regression test
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>>
>>
>>

From david.holmes at oracle.com Tue Jul 10 02:35:41 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 12:35:41 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
In-Reply-To: <416fc226-e389-df65-9487-736efc9e7528@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <416fc226-e389-df65-9487-736efc9e7528@oracle.com>
Message-ID: <864787be-ca72-2422-0399-716a1abe7d27@oracle.com>

On 10/07/2018 12:07 PM, serguei.spitsyn at oracle.com wrote:
> Hi David,
>
> It looks good modulo the minor comments that others have already found.

Thanks for taking a look.

> Could I ask you to fix a couple of really minor issues in new test?
>
> Unneeded spaces are at lines 84 and 51 in .java and .c files:
>
>   83         if (mbean.isThreadCpuTimeSupported() &&
>   84             mbean.isThreadCpuTimeEnabled() ) {
>   . . .
>
>   51   class_id = (*env)->FindClass (env, "java/lang/Thread");

Sorry Serguei, too late.

David

> Thanks,
> Serguei
>
>
> On 7/9/18 15:17, David Holmes wrote:
>> Thanks Chris!
>>
>> Can I please get a second review.
>> >> David >> >> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>> On 7/9/18 2:41 PM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>> Hi David, >>>>> >>>>> Would it be better to problem list this test on solaris using >>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the >>>>> problem list and start executing on solaris. >>>> >>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>> only fix this for VM created threads. The general problem of TLS >>>> destructors looping if a thread terminates without detaching from >>>> the VM is not solvable - other than by not using TLS in the VM. >>> Ok, I misunderstood your comments in the test. >>> >>> Changes look fine. >>> >>> Chris >>>> >>>> Thanks, >>>> David >>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>> tl;dr skip the new regression test on Solaris >>>>>> >>>>>> New webrev: >>>>>> >>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>> >>>>>> This excludes the test from running on Solaris, so the makefile >>>>>> doesn't bother compiling this native test and the Java part of the >>>>>> test adds: >>>>>> >>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>> natively >>>>>> ? *????????? attached thread that has failed to detach before >>>>>> terminating. >>>>>> + * @comment The native code only supports POSIX so no windows >>>>>> testing; also >>>>>> + *????????? we have to skip solaris as a terminating thread that >>>>>> fails to >>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>> destructor issues - see >>>>>> + *????????? comments in JDK-8156708 >>>>>> >>>>>> Note this means that Solaris is not affected by the original issue >>>>>> because a still-attached native thread can't actually terminate >>>>>> due to the TLS destructor infinite-loop issue. 
>>>>>> >>>>>> Thanks, >>>>>> David >>>>>> >>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>> don't run these tests on Solaris until tier4. >>>>>>> >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> Thanks for looking at this. >>>>>>>> >>>>>>>> Updated webrev: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>> >>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>> >>>>>>>> More below ... >>>>>>>> >>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>> Hi David, >>>>>>>>> >>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>> things I noted: >>>>>>>>> >>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>> agentB(), so there isn't much point to having it. If you reach >>>>>>>>> the bottom of the function, it passed, so PASSED can be >>>>>>>>> returned. The code would be more clear if it did this. As-is it >>>>>>>>> is implied that you can reach the bottom when it fails. >>>>>>>> >>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>> full rewrite. >>>>>>>> >>>>>>>>> Is detaching the threads along the failure paths really needed? >>>>>>>>> exit() is called, so this would seem to make it unnecessary. >>>>>>>> >>>>>>>> You're right that isn't necessary. I'll remove the changes from >>>>>>>> before the exits in ji05t001.c >>>>>>>> >>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is >>>>>>>>> much more readable than the similar code in agentA() and agentB(). >>>>>>>> >>>>>>>> It's an existing style already used in that test e.g. >>>>>>>> >>>>>>>> ??287???? if ((res = >>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>> ??289???????????????? 
JNI_ENV_ARG(vm, (void **) &env), (void *) >>>>>>>> 0)) != 0) { >>>>>>>> >>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>> >>>>>>>>> In the test: >>>>>>>>> >>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>> unexpected >>>>>>>>> ?? 55???????? // exceptions then the test passes. In some cases >>>>>>>>> we know exactly >>>>>>>>> >>>>>>>>> "of" should be "or". >>>>>>>> >>>>>>>> Well spotted. Thanks. >>>>>>>> >>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods >>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, >>>>>>>>> and the above comment indicates that you don't want this. >>>>>>>> >>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>> handling the terminated native thread, so would indicate a test >>>>>>>> failure. >>>>>>>> >>>>>>>>> Don't we normally put these tests in a package? >>>>>>>> >>>>>>>> Doesn't seem to be any hard and fast rule. I only uses packages >>>>>>>> when they are important for the test. In runtime we have 905 >>>>>>>> java files and only 116 have a package statement. It varies >>>>>>>> elsewhere. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> David >>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>> tomorrow. >>>>>>>>>> >>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>> >>>>>>>>>>> Problem: >>>>>>>>>>> >>>>>>>>>>> The tests create native threads that attach to the VM through >>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without >>>>>>>>>>> detaching themselves. 
When the VM exits and we're using >>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>> CPU usage. When we encounter the threads that have terminated >>>>>>>>>>> already the low level pthread_getcpuclockid calls returns >>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert >>>>>>>>>>> in debug mode and can SEGV in product mode. >>>>>>>>>>> >>>>>>>>>>> Solution: >>>>>>>>>>> >>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>> >>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>> two different styles. >>>>>>>>>>> >>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>> regression test >>>>>>>>>>> >>>>>>>>>>> I took a good look at the low-level code for interacting with >>>>>>>>>>> arbitrary threads and as far as I can see the problem only >>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. >>>>>>>>>>> Elsewhere the potential for a library call failure just >>>>>>>>>>> reports an error value (such as -1 for the cpu time used). >>>>>>>>>>> >>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that >>>>>>>>>>> case. >>>>>>>>>>> >>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>> or unexpected exceptions. 
>>>>>>>>>>> >>>>>>>>>>> Testing: >>>>>>>>>>> ??- old tests with fixed run-time >>>>>>>>>>> ??- old run-time with fixed tests >>>>>>>>>>> ??- mach tier4 (which exposed the problem - that's where we >>>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>>> ??- mach5 tier 1-3 for good measure [in progress] >>>>>>>>>>> ??- new regression test >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> David >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>> >>>>> >>> >>> > From serguei.spitsyn at oracle.com Tue Jul 10 02:42:57 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 9 Jul 2018 19:42:57 -0700 Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code In-Reply-To: <864787be-ca72-2422-0399-716a1abe7d27@oracle.com> References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com> <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com> <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com> <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com> <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com> <416fc226-e389-df65-9487-736efc9e7528@oracle.com> <864787be-ca72-2422-0399-716a1abe7d27@oracle.com> Message-ID: <5d5822c1-e5fb-f38f-f44d-26086a9ff3b8@oracle.com> On 7/9/18 19:35, David Holmes wrote: > On 10/07/2018 12:07 PM, serguei.spitsyn at oracle.com wrote: >> Hi David, >> >> It looks good modulo the minor comments that others have already found. > > Thanks for taking a look. > >> Could I ask you to fix a couple of really minor issues in new test? >> >> Unneeded spaces are at lines 84 and 51 in .java and .c files: >> >> ?? 83???????? if (mbean.isThreadCpuTimeSupported() && >> ?? 84???????????? mbean.isThreadCpuTimeEnabled() ) { >> ?? . . . >> >> ?? 51?? class_id = (*env)->FindClass (env, "java/lang/Thread"); > > Sorry Serguei, too late. Not a problem, David. Sorry for being late. Thanks, Serguei > David > >> Thanks, >> Serguei >> >> >> On 7/9/18 15:17, David Holmes wrote: >>> Thanks Chris! >>> >>> Can I please get a second review. 
>>> >>> David >>> >>> On 10/07/2018 7:50 AM, Chris Plummer wrote: >>>> On 7/9/18 2:41 PM, David Holmes wrote: >>>>> Hi Chris, >>>>> >>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote: >>>>>> Hi David, >>>>>> >>>>>> Would it be better to problem list this test on solaris using >>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off >>>>>> the problem list and start executing on solaris. >>>>> >>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could >>>>> only fix this for VM created threads. The general problem of TLS >>>>> destructors looping if a thread terminates without detaching from >>>>> the VM is not solvable - other than by not using TLS in the VM. >>>> Ok, I misunderstood your comments in the test. >>>> >>>> Changes look fine. >>>> >>>> Chris >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/8/18 4:58 PM, David Holmes wrote: >>>>>>> tl;dr skip the new regression test on Solaris >>>>>>> >>>>>>> New webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/ >>>>>>> >>>>>>> This excludes the test from running on Solaris, so the makefile >>>>>>> doesn't bother compiling this native test and the Java part of >>>>>>> the test adds: >>>>>>> >>>>>>> ! * @requires os.family != "windows" & os.family != "solaris" >>>>>>> ? * @summary Basic test of Thread and ThreadMXBean queries on a >>>>>>> natively >>>>>>> ? *????????? attached thread that has failed to detach before >>>>>>> terminating. >>>>>>> + * @comment The native code only supports POSIX so no windows >>>>>>> testing; also >>>>>>> + *????????? we have to skip solaris as a terminating thread >>>>>>> that fails to >>>>>>> + *????????? detach will hit an infinite loop due to TLS >>>>>>> destructor issues - see >>>>>>> + *????????? 
comments in JDK-8156708 >>>>>>> >>>>>>> Note this means that Solaris is not affected by the original >>>>>>> issue because a still-attached native thread can't actually >>>>>>> terminate due to the TLS destructor infinite-loop issue. >>>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote: >>>>>>>> The new test is hanging on Solaris. I just discovered we >>>>>>>> don't run these tests on Solaris until tier4. >>>>>>>> >>>>>>>> David >>>>>>>> >>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> Thanks for looking at this. >>>>>>>>> >>>>>>>>> Updated webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/ >>>>>>>>> >>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test) >>>>>>>>> >>>>>>>>> More below ... >>>>>>>>> >>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote: >>>>>>>>>> Hi David, >>>>>>>>>> >>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor >>>>>>>>>> things I noted: >>>>>>>>>> >>>>>>>>>> I noticed that exitCode is never modified in agentA() or >>>>>>>>>> agentB(), so there isn't much point to having it. If you >>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be >>>>>>>>>> returned. The code would be more clear if it did this. As-is >>>>>>>>>> it is implied that you can reach the bottom when it fails. >>>>>>>>> >>>>>>>>> I resisted any and all urges to do any kind of unrelated code >>>>>>>>> cleanup in the tests - once you start you may end up doing a >>>>>>>>> full rewrite. >>>>>>>>> >>>>>>>>>> Is detaching the threads along the failure paths really >>>>>>>>>> needed? exit() is called, so this would seem to make it >>>>>>>>>> unnecessary. >>>>>>>>> >>>>>>>>> You're right that isn't necessary. I'll remove the changes >>>>>>>>> from before the exits in ji05t001.c >>>>>>>>> >>>>>>>>>> I prefer assignments not to be embedded inside the "if" >>>>>>>>>> condition. 
The DetachCurrentThread code in THREAD_return() is >>>>>>>>>> much more readable than the similar code in agentA() and >>>>>>>>>> agentB(). >>>>>>>>> >>>>>>>>> It's an existing style already used in that test e.g. >>>>>>>>> >>>>>>>>> ??287???? if ((res = >>>>>>>>> ??288 JNI_ENV_PTR(vm)->AttachCurrentThread( >>>>>>>>> ??289???????????????? JNI_ENV_ARG(vm, (void **) &env), (void >>>>>>>>> *) 0)) != 0) { >>>>>>>>> >>>>>>>>> and I don't mind it, so I'd prefer not to change it. >>>>>>>>> >>>>>>>>>> In the test: >>>>>>>>>> >>>>>>>>>> ?? 54???????? // Generally as long as we don't crash of throw >>>>>>>>>> unexpected >>>>>>>>>> ?? 55???????? // exceptions then the test passes. In some >>>>>>>>>> cases we know exactly >>>>>>>>>> >>>>>>>>>> "of" should be "or". >>>>>>>>> >>>>>>>>> Well spotted. Thanks. >>>>>>>>> >>>>>>>>>> Shouldn't you be catching exceptions for all the Thread >>>>>>>>>> methods you are calling? Otherwise the test will exit if one >>>>>>>>>> is thrown, and the above comment indicates that you don't >>>>>>>>>> want this. >>>>>>>>> >>>>>>>>> I'm not expecting there to be any exceptions from any of the >>>>>>>>> called methods. That would potentially indicate a problem in >>>>>>>>> handling the terminated native thread, so would indicate a >>>>>>>>> test failure. >>>>>>>>> >>>>>>>>>> Don't we normally put these tests in a package? >>>>>>>>> >>>>>>>>> Doesn't seem to be any hard and fast rule. I only uses >>>>>>>>> packages when they are important for the test. In runtime we >>>>>>>>> have 905 java files and only 116 have a package statement. It >>>>>>>>> varies elsewhere. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> David >>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote: >>>>>>>>>>> Solaris compiler complains about doing a return from >>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix >>>>>>>>>>> tomorrow. 
>>>>>>>>>>> >>>>>>>>>>> David >>>>>>>>>>> >>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote: >>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878 >>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/ >>>>>>>>>>>> >>>>>>>>>>>> Problem: >>>>>>>>>>>> >>>>>>>>>>>> The tests create native threads that attach to the VM >>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate >>>>>>>>>>>> without detaching themselves. When the VM exits and we're >>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to >>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread >>>>>>>>>>>> CPU usage. When we encounter the threads that have >>>>>>>>>>>> terminated already the low level pthread_getcpuclockid >>>>>>>>>>>> calls returns ESRCH but the code doesn't expect that and so >>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode. >>>>>>>>>>>> >>>>>>>>>>>> Solution: >>>>>>>>>>>> >>>>>>>>>>>> Serviceability-side: fix the tests >>>>>>>>>>>> >>>>>>>>>>>> Change the tests so that the threads detach before >>>>>>>>>>>> terminating. The two tests are (surprisingly) written in >>>>>>>>>>>> completely different styles, so the solution also takes on >>>>>>>>>>>> two different styles. >>>>>>>>>>>> >>>>>>>>>>>> Runtime-side: make the VM more robust in the fact of JNI >>>>>>>>>>>> attached threads that terminate before detaching, and add a >>>>>>>>>>>> regression test >>>>>>>>>>>> >>>>>>>>>>>> I took a good look at the low-level code for interacting >>>>>>>>>>>> with arbitrary threads and as far as I can see the problem >>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on >>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure >>>>>>>>>>>> just reports an error value (such as -1 for the cpu time >>>>>>>>>>>> used). 
>>>>>>>>>>>> >>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling >>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in >>>>>>>>>>>> that case. >>>>>>>>>>>> >>>>>>>>>>>> I created a new regression test to create a new native >>>>>>>>>>>> thread, attach it and then let it terminate while still >>>>>>>>>>>> attached. The java code then calls various Thread and >>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes >>>>>>>>>>>> or unexpected exceptions. >>>>>>>>>>>> >>>>>>>>>>>> Testing: >>>>>>>>>>>> ??- old tests with fixed run-time >>>>>>>>>>>> ??- old run-time with fixed tests >>>>>>>>>>>> ??- mach tier4 (which exposed the problem - that's where we >>>>>>>>>>>> enable Flight recorder for the tests) [in progress] >>>>>>>>>>>> ??- mach5 tier 1-3 for good measure [in progress] >>>>>>>>>>>> ??- new regression test >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> David >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>> >>>>>> >>>> >>>> >> From jcbeyler at google.com Tue Jul 10 18:41:49 2018 From: jcbeyler at google.com (JC Beyler) Date: Tue, 10 Jul 2018 11:41:49 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails Message-ID: Hi All, Could someone review the one liner for the bug: https://bugs.openjdk.java.net/browse/JDK-8205643 The webrev is here: http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ Basically, the test is testing CMS and Graal does not play well with CMS it seems so this removes Graal being tested with it. Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexey.menkov at oracle.com Tue Jul 10 19:26:34 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 10 Jul 2018 12:26:34 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: References: Message-ID: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> Hi JC, you need also to remove the test from ProblemList --alex On 07/10/2018 11:41, JC Beyler wrote: > Hi All, > > Could someone review the one liner for the bug: > https://bugs.openjdk.java.net/browse/JDK-8205643 > > The webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ > > Basically, the test is testing CMS and Graal does not play well with CMS > it seems so this removes Graal being tested with it. > > Thanks, > Jc From jcbeyler at google.com Tue Jul 10 20:37:37 2018 From: jcbeyler at google.com (JC Beyler) Date: Tue, 10 Jul 2018 13:37:37 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> References: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> Message-ID: Hi Alex, Done here: http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.01/ Any other issues with this fix? Thanks! Jc On Tue, Jul 10, 2018 at 12:26 PM Alex Menkov wrote: > Hi JC, > > you need also to remove the test from ProblemList > > --alex > > On 07/10/2018 11:41, JC Beyler wrote: > > Hi All, > > > > Could someone review the one liner for the bug: > > https://bugs.openjdk.java.net/browse/JDK-8205643 > > > > The webrev is here: > > http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ > > > > Basically, the test is testing CMS and Graal does not play well with CMS > > it seems so this removes Graal being tested with it. > > > > Thanks, > > Jc > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexey.menkov at oracle.com Tue Jul 10 21:42:18 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 10 Jul 2018 14:42:18 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: References: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> Message-ID: Looks good to me. --alex On 07/10/2018 13:37, JC Beyler wrote: > Hi Alex, > > Done here: > http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.01/ > > Any other issues with this fix? > > Thanks! > Jc > > On Tue, Jul 10, 2018 at 12:26 PM Alex Menkov > wrote: > > Hi JC, > > you need also to remove the test from ProblemList > > --alex > > On 07/10/2018 11:41, JC Beyler wrote: > > Hi All, > > > > Could someone review the one liner for the bug: > > https://bugs.openjdk.java.net/browse/JDK-8205643 > > > > The webrev is here: > > http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ > > > > Basically, the test is testing CMS and Graal does not play well > with CMS > > it seems so this removes Graal being tested with it. > > > > Thanks, > > Jc > > > > -- > > Thanks, > Jc From serguei.spitsyn at oracle.com Tue Jul 10 21:54:32 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Jul 2018 14:54:32 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> References: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> Message-ID: <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com> Hi Jc, The fix looks good. Alex is right. I forgot to tell you that the test has to be excluded from the file:
open/test/hotspot/jtreg/ProblemList.txt Thanks, Serguei On 7/10/18 12:26, Alex Menkov wrote: > Hi JC, > > you need also to remove the test from ProblemList > > --alex > > On 07/10/2018 11:41, JC Beyler wrote: >> Hi All, >> >> Could someone review the one liner for the bug: >> https://bugs.openjdk.java.net/browse/JDK-8205643 >> >> The webrev is here: >> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ >> >> Basically, the test is testing CMS and Graal does not play well with >> CMS it seems so this removes Graal being tested with it. >> >> Thanks, >> Jc From serguei.spitsyn at oracle.com Tue Jul 10 21:56:37 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Jul 2018 14:56:37 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com> References: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com> Message-ID: Sorry, did not see your reply to Alex. Looks good - ship it! Thanks, Serguei On 7/10/18 14:54, serguei.spitsyn at oracle.com wrote: > Hi Jc, > > The fix looks good. > Alex is right. > I forgot to tell you that the test has to be excluded from the file: > open/test/hotspot/jtreg/ProblemList.txt > > Thanks, > Serguei > > > On 7/10/18 12:26, Alex Menkov wrote: >> Hi JC, >> >> you need also to remove the test from ProblemList >> >> --alex >> >> On 07/10/2018 11:41, JC Beyler wrote: >>> Hi All, >>> >>> Could someone review the one liner for the bug: >>> https://bugs.openjdk.java.net/browse/JDK-8205643 >>> >>> The webrev is here: >>> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ >>> >>> Basically, the test is testing CMS and Graal does not play well with >>> CMS it seems so this removes Graal being tested with it.
>>> >>> Thanks, >>> Jc > From jcbeyler at google.com Tue Jul 10 22:31:08 2018 From: jcbeyler at google.com (JC Beyler) Date: Tue, 10 Jul 2018 15:31:08 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: References: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com> Message-ID: Hi Serguei, Here it is: http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.02/ Could someone test/push it please? Thanks! Jc On Tue, Jul 10, 2018 at 2:56 PM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Sorry, did not see your reply to Alex. > Looks good - ship it! > > Thanks, > Serguei > > On 7/10/18 14:54, serguei.spitsyn at oracle.com wrote: > > Hi Jc, > > > > The fix looks good. > > Alex is right. > > I forgot to tell you that the test has to be excluded from the file: > > open/test/hotspot/jtreg/ProblemList.txt > > > > Thanks, > > Serguei > > > > > > On 7/10/18 12:26, Alex Menkov wrote: > >> Hi JC, > >> > >> you need also to remove the test from ProblemList > >> > >> --alex > >> > >> On 07/10/2018 11:41, JC Beyler wrote: > >>> Hi All, > >>> > >>> Could someone review the one liner for the bug: > >>> https://bugs.openjdk.java.net/browse/JDK-8205643 > >>> > >>> The webrev is here: > >>> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/ > >>> > >>> Basically, the test is testing CMS and Graal does not play well with > >>> CMS it seems so this removes Graal being tested with it. > >>> > >>> Thanks, > >>> Jc > > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed...
URL: From serguei.spitsyn at oracle.com Tue Jul 10 22:38:08 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 10 Jul 2018 15:38:08 -0700 Subject: RFR (S) 8205643: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails In-Reply-To: References: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com> <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com> Message-ID: <484b69a2-c644-94da-6071-795b25900eb3@oracle.com> An HTML attachment was scrubbed... URL: From jini.george at oracle.com Wed Jul 11 02:38:37 2018 From: jini.george at oracle.com (Jini George) Date: Wed, 11 Jul 2018 08:08:37 +0530 Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> Message-ID: <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com> Gentle reminder! Thanks, Jini. On 7/10/2018 12:14 AM, Jini George wrote: > Requesting reviews for enabling SA tests on OS X for Mach5. > > https://bugs.openjdk.java.net/browse/JDK-8199700 > > Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ > > The changes are mostly to include the addition of sudo privileges to the > SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests > (those using clhsdb) have been refactored to use ClhsdbLauncher for ease > of maintenance. This also avoids checks for Platform.shouldSAAttach() > for corefile related test cases. More details have been provided in JIRA. > > Thanks, > Jini.
From david.holmes at oracle.com Wed Jul 11 08:24:29 2018 From: david.holmes at oracle.com (David Holmes) Date: Wed, 11 Jul 2018 18:24:29 +1000 Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com> Message-ID: Hi Jini, There are quite a few changes to digest in this - it may have been better to break them up individually: - sudo use - refactor to use ClhsdbLauncher - changes to use regex matching Focusing on the main sudo change, the assumption is that on OSX you can run sudo without needing to provide a password - correct? That may be the case in mach5 but I'm not sure how others will go running these tests either in their test farms or locally. I'm not sure about the regex changes from contains to matches - won't you need additional wildcards at the start and end of the strings to allow the string to be embedded in a longer string? Thanks, David PS. I start vacation in 48 hours :) On 11/07/2018 12:38 PM, Jini George wrote: > Gentle reminder! > > Thanks, > Jini. > > On 7/10/2018 12:14 AM, Jini George wrote: >> Requesting reviews for enabling SA tests on OS X for Mach5. >> >> https://bugs.openjdk.java.net/browse/JDK-8199700 >> >> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ >> >> The changes are mostly to include the addition of sudo privileges to >> the SA launchers for OSX if Platform.shouldSAAttach() fails. Some >> tests (those using clhsdb) have been refactored to use ClhsdbLauncher >> for ease of maintenance. This also avoids checks for >> Platform.shouldSAAttach() for corefile related test cases. More >> details have been provided in JIRA. >> >> Thanks, >> Jini.
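The contains-versus-matches question above comes down to how java.util.regex treats a pattern that has no leading or trailing wildcards. The following is a minimal sketch using only the standard Pattern and Matcher classes. The class name, the method names and the sample output string are made up for illustration and are not taken from the test library.

```java
import java.util.regex.Pattern;

// Illustrative only: plain java.util.regex, not the jtreg test library source.
class FindVsMatches {
    // Sub-sequence search: true if the pattern occurs anywhere in the text.
    static boolean foundAnywhere(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Whole-input match: true only if the pattern covers the entire text.
    static boolean matchesWhole(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        // Hypothetical tool output, invented for this example.
        String output = "Attaching to process ID 1234, please wait...";
        String regex = "process ID \\d+";

        System.out.println(foundAnywhere(regex, output));               // true
        System.out.println(matchesWhole(regex, output));                // false
        System.out.println(matchesWhole(".*" + regex + ".*", output));  // true
    }
}
```

So extra wildcards are only needed when a check is built on matches(); a check built on find() already accepts the pattern embedded in a longer string.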
From jini.george at oracle.com Wed Jul 11 10:00:06 2018 From: jini.george at oracle.com (Jini George) Date: Wed, 11 Jul 2018 15:30:06 +0530 Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com> Message-ID: <50204c5b-62c5-e427-734b-cf5de2ffbb49@oracle.com> Thank you, David. My answers inline: On 7/11/2018 1:54 PM, David Holmes wrote: > Hi Jini, > > There are quite a few changes to digest in this - it may have been > better to break them up individually: > - sudo use > - refactor to use ClhsdbLauncher > - changes to use regex matching > > Focusing on the main sudo change, the assumption is that on OSX you can > run sudo without needing to provide a password - correct? That may be > the case in mach5 but I'm not sure how others will go running these > tests either in their test farms or locally. Right -- you would need to provide the password. So it prompts for the password on OSX. (Like how it would have been needed if you had run the test itself with 'sudo'). Examining the /etc/sudoers file to check if no password is needed could have been an option, but that itself would need sudo, and probably would add unwanted complexity. > I'm not sure about the regex changes from contains to matches - won't > you need additional wildcards at the start and end of the strings to > allow the string to be embedded in a longer string? OutputAnalyzer's shouldMatch() uses the find() method of the Matcher class, which matches sub-sequences. Thanks, Jini. > > Thanks, > David > > PS. I start vacation in 48 hours :) > > On 11/07/2018 12:38 PM, Jini George wrote: >> Gentle reminder! >> >> Thanks, >> Jini. >> >> On 7/10/2018 12:14 AM, Jini George wrote: >>> Requesting reviews for enabling SA tests on OS X for Mach5.
>>> >>> https://bugs.openjdk.java.net/browse/JDK-8199700 >>> >>> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ >>> >>> The changes are mostly to include the addition of sudo privileges to >>> the SA launchers for OSX if Platform.shouldSAAttach() fails. Some >>> tests (those using clhsdb) have been refactored to use ClhsdbLauncher >>> for ease of maintenance. This also avoids checks for >>> Platform.shouldSAAttach() for corefile related test cases. More >>> details have been provided in JIRA. >>> >>> Thanks, >>> Jini. From kubota.yuji at gmail.com Wed Jul 11 13:55:02 2018 From: kubota.yuji at gmail.com (KUBOTA Yuji) Date: Wed, 11 Jul 2018 22:55:02 +0900 Subject: RFR:8207048: jhsdb debugd cannot specify a port number Message-ID: Hi all, I filed an issue for a small improvement to `jhsdb debugd`: it allows the port of the UnicastRemoteObject (sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer) to be set via `sun.jvm.hotspot.rmi.debugger.port`. Issue: https://bugs.openjdk.java.net/browse/JDK-8207048 Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/ We can set the RMI registry port of the debugd server with `sun.jvm.hotspot.rmi.port`, but we cannot set the port of the remote object, so the remote object always uses an anonymous port. For security, we should not have to open a wide range of ports to use debugd, so I would like to fix this. Could you review it?
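To illustrate the difference between an anonymous port and a configured port described above, here is a minimal sketch using only the standard java.rmi API. It is not the code from the webrev. The class names and the property name example.rmi.object.port are made up for illustration.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Illustrative only: not RemoteDebuggerServer and not the change under review.
class ConfigurablePortExport {
    // Hypothetical remote interface used only for this sketch.
    public interface Ping extends Remote {
        String ping() throws RemoteException;
    }

    public static class PingImpl implements Ping {
        @Override
        public String ping() { return "pong"; }
    }

    // Exports obj on the port named by the given system property.
    // If the property is unset the port is 0, which asks RMI for an anonymous port.
    static Remote export(Remote obj, String portProperty) {
        int port = Integer.getInteger(portProperty, 0);
        try {
            return UnicastRemoteObject.exportObject(obj, port);
        } catch (RemoteException e) {
            throw new IllegalStateException("export failed on port " + port, e);
        }
    }

    // Forcibly unexports obj; returns false if it was not exported.
    static boolean unexport(Remote obj) {
        try {
            return UnicastRemoteObject.unexportObject(obj, true);
        } catch (RemoteException e) { // NoSuchObjectException is a RemoteException
            return false;
        }
    }

    public static void main(String[] args) {
        PingImpl impl = new PingImpl();
        // Made-up property name; pass it with -D on the command line to pin the port.
        Remote stub = export(impl, "example.rmi.object.port");
        System.out.println("exported: " + (stub != null));
        System.out.println("unexported: " + unexport(impl));
    }
}
```

With the property set, only that one port (plus the registry port) would need to be reachable through a firewall; without it, the object lands on whatever port the system hands out.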
Thanks, Yuji From jcbeyler at google.com Wed Jul 11 17:04:17 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 11 Jul 2018 10:04:17 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail Message-ID: Hi all, Could someone review the small-ish webrev for the bug: https://bugs.openjdk.java.net/browse/JDK-8206960 The webrev is here: http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/ Basically, the tests were failing for two reasons: - VMEventTest was failing because Graal does not support DisableIntrinsic required by the test, I disabled testing the test with Graal in this case - The other tests were failing because the BCI <-> source code line numbers are not always correct when using Graal via uncommon traps; therefore the tests now check if Graal is being used and, if so, only checks the method names. This allows us to still have tests working with Graal, albeit a bit more coarse. This passes all the HeapMonitor tests with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal" (Except the GCCMS one which is being fixed via the one-liner for JDK-8205643). Let me know what you think, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From jini.george at oracle.com Wed Jul 11 17:32:51 2018 From: jini.george at oracle.com (Jini George) Date: Wed, 11 Jul 2018 23:02:51 +0530 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> Message-ID: <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> Hi Yasumasa, This looks good to me except for one nit. And some more comments would help. For e.g., it would help to say that NSPidMap is to map the host to container lwpids. 
The nit: * http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html Line 253: extra space after the parentheses Thanks, Jini. On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: > PING: Could you review it? > >> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ > > > Thanks, > > Yasumasa > > > On 2018/06/28 22:12, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this change. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >> >> I tried to attach jhsdb to java process in docker container from >> container host, but it couldn't. >> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet. >> >> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they >> returns PIDs in container - they are different from host's PID. So I >> added the code to scan /proc//task to get all LWP IDs and they >> are kept in a Map in LinuxDebuggerLocal. >> >> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs >> in container. It helps SA to parse binaries in container. >> >> This change has been pushed to submit repo, and it was failed on OS X >> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >> But I guess it causes JDK-8205906. This change affects to Linux only. >> >> Could you review it? >> >> >> Thanks, >> >> Yasumasa >> From alexey.menkov at oracle.com Wed Jul 11 18:39:33 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 11 Jul 2018 11:39:33 -0700 Subject: RFR: JDK-8201513: nsk/jvmti/IterateThroughHeap/filter-* are broken Message-ID: Hi all, please review a fix for https://bugs.openjdk.java.net/browse/JDK-8201513 webrev: http://cr.openjdk.java.net/~amenkov/IterateThroughHeap/webrev/ summary: The tests had a error which was fixed during open-sourcing.
After that the tests started to fail. Root cause of the failures is wrong verification (positive results are interpreted as negative) --alex From serguei.spitsyn at oracle.com Wed Jul 11 21:26:36 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Jul 2018 14:26:36 -0700 Subject: RFR: JDK-8201513: nsk/jvmti/IterateThroughHeap/filter-* are broken In-Reply-To: References: Message-ID: Hi Alex, The fix looks good. Thank you for fixing the typos! Thanks, Serguei On 7/11/18 11:39, Alex Menkov wrote: > Hi all, > > please review a fix for > https://bugs.openjdk.java.net/browse/JDK-8201513 > webrev: > http://cr.openjdk.java.net/~amenkov/IterateThroughHeap/webrev/ > > summary: > The tests had a error which was fixed during open-sourcing. > After that the tests started to fail. Root cause of the failures is > wrong verification (positive results are interpreted as negative) > > --alex From serguei.spitsyn at oracle.com Wed Jul 11 21:42:25 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Jul 2018 14:42:25 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail In-Reply-To: References: Message-ID: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Thu Jul 12 00:55:19 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Jul 2018 10:55:19 +1000 Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <50204c5b-62c5-e427-734b-cf5de2ffbb49@oracle.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com> <50204c5b-62c5-e427-734b-cf5de2ffbb49@oracle.com> Message-ID: <9644b36d-bb13-8625-a770-b11a9ee6c2eb@oracle.com> On 11/07/2018 8:00 PM, Jini George wrote: > Thank you, David. 
My answers inline: > > On 7/11/2018 1:54 PM, David Holmes wrote: >> Hi Jini, >> >> There are quite a few changes to digest in this - it may have been >> better to break them up individually: >> - sudo use >> - refactor to use ClshdbLauncher >> - changes to use regex matching >> >> Focusing on the main sudo change the assumption is that on OSX you can >> run sudo without needing to provide a password - correct? That may be >> the case in mach5 but I'm not sure how others will go running these >> tests either in their test farms or locally. > Right -- you would need to provide the password. So it prompts for the > password for OSX. (Like how it would have been needed if you had run the > test itself with 'sudo'). Examining the /etc/sudoers file to check if no > password is needed could have been an option, but that itself would need > an sudo, and probably would add unwanted complexity. So I'm not sure this change is acceptable when it may cause other testing environments to break. At a minimum I'd want to get the opinions of the SAP folk and anyone else doing regular build/test runs. >> I'm not sure about the regex changes from contains to matches - won't >> you need additional wildcards at the start and end of the strings to >> allow the string to be embedded in a longer string ?? > > OutputAnalyzer's shouldMatch() uses the find() method of the Matcher > class which matches sub-sequences. Ok. Thanks, David > Thanks, > Jini. > > >> >> Thanks, >> David >> >> PS. I start vacation in 48 hours :) >> >> On 11/07/2018 12:38 PM, Jini George wrote: >>> Gentle reminder ! >>> >>> Thanks, >>> Jini. >>> >>> On 7/10/2018 12:14 AM, Jini George wrote: >>>> Requesting reviews for enabling SA tests on OS X for Mach5. 
>>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8199700 >>>> >>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ >>>> >>>> The changes are mostly to include the addition of sudo privileges to >>>> the SA launchers for OSX if Platform.shouldSAAttach() fails. Some >>>> tests (those using clhsdb) have been refactored to use >>>> ClhsdbLauncher for ease of maintainence. This also avoids checks for >>>> Platform.shouldSAAttach() for corefile related test cases. More >>>> details have been provided in JIRA. >>>> >>>> Thanks, >>>> Jini. From david.holmes at oracle.com Thu Jul 12 01:21:39 2018 From: david.holmes at oracle.com (David Holmes) Date: Thu, 12 Jul 2018 11:21:39 +1000 Subject: RFR:8207048: jhsdb debugd cannot specify a port number In-Reply-To: References: Message-ID: <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com> Hi Yuji, I can't comment on the actual change proposed in this enhancement request, but it will need to have a CSR request created and approved due to the use of a new system property. Thanks, David On 11/07/2018 11:55 PM, KUBOTA Yuji wrote: > Hi all, > > I filed bugzilla for small fix to improvement of `jhsdb debugd` to set > a port of UnicastRemoteObject aka > sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer by > `sun.jvm.hotspot.rmi.debugger.port`. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8207048 > Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/ > > We can set an RMI registry port of debugd server by > `sun.jvm.hotspot.rmi.port`, but can not set a port of RemoteObject. So > RemoteObject always uses an anonymous port. For security, we should > not open ports widely to use debugd, so I want to fix. > > Could you review it? 
> > Thanks, > Yuji > From kubota.yuji at gmail.com Thu Jul 12 01:40:47 2018 From: kubota.yuji at gmail.com (KUBOTA Yuji) Date: Thu, 12 Jul 2018 10:40:47 +0900 Subject: RFR:8207048: jhsdb debugd cannot specify a port number In-Reply-To: <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com> References: <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com> Message-ID: Hi David, Thank you for comment and updating JBS. I'll create a CSR request after getting comments whether this change is welcomed by community. Thanks, Yuji 2018-07-12 10:21 GMT+09:00 David Holmes : > Hi Yuji, > > I can't comment on the actual change proposed in this enhancement request, > but it will need to have a CSR request created and approved due to the use > of a new system property. > > Thanks, > David > > > > > On 11/07/2018 11:55 PM, KUBOTA Yuji wrote: >> >> Hi all, >> >> I filed bugzilla for small fix to improvement of `jhsdb debugd` to set >> a port of UnicastRemoteObject aka >> sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer by >> `sun.jvm.hotspot.rmi.debugger.port`. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207048 >> Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/ >> >> We can set an RMI registry port of debugd server by >> `sun.jvm.hotspot.rmi.port`, but can not set a port of RemoteObject. So >> RemoteObject always uses an anonymous port. For security, we should >> not open ports widely to use debugd, so I want to fix. >> >> Could you review it? 
>> >> Thanks, >> Yuji >> > From yasuenag at gmail.com Thu Jul 12 04:42:10 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 12 Jul 2018 13:42:10 +0900 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> Message-ID: Thanks Jini, I uploaded new webrev. It contains some comments and removing extra space. http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ Yasumasa 2018-07-12 2:32 GMT+09:00 Jini George : > Hi Yasumasa, > > This looks good to me except for one nit. And some more comments would help. > For e.g., it would help to say that NSPidMap is to map the host to container > lwpids. > > The nit: > > * > http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html > Line 253: extra space after the parentheses > > Thanks, > Jini. > > On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: >> >> PING: Could you review it? >> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >> >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2018/06/28 22:12, Yasumasa Suenaga wrote: >>> >>> Hi all, >>> >>> Please review this change. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>> >>> I tried to attach jhsdb to java process in docker container from >>> container host, but it couldn't. >>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet. >>> >>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they >>> returns PIDs in container - they are different from host's PID. 
So I added >>> the code to scan /proc//task to get all LWP IDs and they are kept in a >>> Map in LinuxDebuggerLocal. >>> >>> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs in >>> container. It helps SA to parse binaries in container. >>> >>> This change has been pushed to submit repo, and it was failed on OS X >>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >>> But I guess it causes JDK-8205906. This change affects to Linux only. >>> >>> Could you review it? >>> >>> >>> Thanks, >>> >>> Yasumasa >>> > From jini.george at oracle.com Thu Jul 12 05:09:35 2018 From: jini.george at oracle.com (Jini George) Date: Thu, 12 Jul 2018 10:39:35 +0530 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> Message-ID: Looks good to me. Thanks! Jini (Not a Reviewer). On 7/12/2018 10:12 AM, Yasumasa Suenaga wrote: > Thanks Jini, > > I uploaded new webrev. It contains some comments and removing extra space. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ > > > Yasumasa > > > > 2018-07-12 2:32 GMT+09:00 Jini George : >> Hi Yasumasa, >> >> This looks good to me except for one nit. And some more comments would help. >> For e.g., it would help to say that NSPidMap is to map the host to container >> lwpids. >> >> The nit: >> >> * >> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html >> Line 253: extra space after the parentheses >> >> Thanks, >> Jini. >> >> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: >>> >>> PING: Could you review it? 
>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>> >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2018/06/28 22:12, Yasumasa Suenaga wrote: >>>> >>>> Hi all, >>>> >>>> Please review this change. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>> >>>> I tried to attach jhsdb to java process in docker container from >>>> container host, but it couldn't. >>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet. >>>> >>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they >>>> returns PIDs in container - they are different from host's PID. So I added >>>> the code to scan /proc//task to get all LWP IDs and they are kept in a >>>> Map in LinuxDebuggerLocal. >>>> >>>> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs in >>>> container. It helps SA to parse binaries in container. >>>> >>>> This change has been pushed to submit repo, and it was failed on OS X >>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >>>> But I guess it causes JDK-8205906. This change affects to Linux only. >>>> >>>> Could you review it? >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >> From daniil.x.titov at oracle.com Thu Jul 12 05:23:18 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Wed, 11 Jul 2018 22:23:18 -0700 Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] Message-ID: Please review the changes that fix jdb issue with evaluation of multidimensional arrays of primitives. The problem here is that for N-dimensional arrays of the primitives with N greater then 2, JDI fails to find its component type (which is an array of dimension N-1) assuming that it is a boot type. Thanks! 
Issue: https://bugs.openjdk.java.net/browse/JDK-8191948 Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01 Best regards, Daniil From serguei.spitsyn at oracle.com Thu Jul 12 05:26:46 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 11 Jul 2018 22:26:46 -0700 Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] In-Reply-To: References: Message-ID: <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com> Hi Daniil, It looks good. Thanks, Serguei On 7/11/18 22:23, Daniil Titov wrote: > Please review the changes that fix jdb issue with evaluation of multidimensional arrays of primitives. > > The problem here is that for N-dimensional arrays of the primitives with N greater then 2, JDI fails to find its component type (which is an array of dimension N-1) assuming that it is a boot type. > > Thanks! > > Issue: https://bugs.openjdk.java.net/browse/JDK-8191948 > Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01 > > Best regards, > Daniil > > > > > > From yasuenag at gmail.com Thu Jul 12 05:29:05 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 12 Jul 2018 14:29:05 +0900 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> Message-ID: Thanks Jini! I'm waiting for Reviewer. Yasumasa 2018-07-12 14:09 GMT+09:00 Jini George : > Looks good to me. > > Thanks! > Jini (Not a Reviewer). > > > On 7/12/2018 10:12 AM, Yasumasa Suenaga wrote: >> >> Thanks Jini, >> >> I uploaded new webrev. It contains some comments and removing extra space. >> >> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ >> >> >> Yasumasa >> >> >> >> 2018-07-12 2:32 GMT+09:00 Jini George : >>> >>> Hi Yasumasa, >>> >>> This looks good to me except for one nit. 
And some more comments would >>> help. >>> For e.g., it would help to say that NSPidMap is to map the host to >>> container >>> lwpids. >>> >>> The nit: >>> >>> * >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html >>> Line 253: extra space after the parentheses >>> >>> Thanks, >>> Jini. >>> >>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: >>>> >>>> >>>> PING: Could you review it? >>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>> >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote: >>>>> >>>>> >>>>> Hi all, >>>>> >>>>> Please review this change. >>>>> >>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>>> >>>>> I tried to attach jhsdb to java process in docker container from >>>>> container host, but it couldn't. >>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet. >>>>> >>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they >>>>> returns PIDs in container - they are different from host's PID. So I >>>>> added >>>>> the code to scan /proc//task to get all LWP IDs and they are kept >>>>> in a >>>>> Map in LinuxDebuggerLocal. >>>>> >>>>> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs >>>>> in >>>>> container. It helps SA to parse binaries in container. >>>>> >>>>> This change has been pushed to submit repo, and it was failed on OS X >>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >>>>> But I guess it causes JDK-8205906. This change affects to Linux only. >>>>> >>>>> Could you review it? 
>>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>> > From goetz.lindenmaier at sap.com Thu Jul 12 10:11:52 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Thu, 12 Jul 2018 10:11:52 +0000 Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> Message-ID: <5eb111d4ffd8427398a09c62a925e5d7@sap.com> Hi Jini, I had a look at your change. It makes tests fail if shouldSAAttach returns false. Now, these tests say "Error: cannot attach", while before they would terminate silently. It is not an Error if the SA can not attach. You can reproduce this by just changing Platform.shouldSAAttach() to always return false. I'll run the patch through our nightly tests to see whether they pass on Mac. Best regards, Goetz. > -----Original Message----- > From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Jini George > Sent: Monday, 9 July 2018 20:45 > To: serviceability-dev at openjdk.java.net > Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X > > Requesting reviews for enabling SA tests on OS X for Mach5. > > https://bugs.openjdk.java.net/browse/JDK-8199700 > > Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ > > The changes are mostly to include the addition of sudo privileges to the > SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests > (those using clhsdb) have been refactored to use ClhsdbLauncher for ease > of maintenance. This also avoids checks for Platform.shouldSAAttach() > for corefile related test cases. More details have been provided in JIRA. > > Thanks, > Jini.
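On the regex question raised earlier in this JDK-8199700 thread (whether moving from contains to matches needs wildcards at both ends of the pattern): the answer given was that OutputAnalyzer's shouldMatch() relies on Matcher.find(), which searches for a sub-sequence. A minimal sketch of the difference between find() and matches() follows. It uses only java.util.regex; the class FindVsMatches, its helper methods and the sample output string are made up for illustration and are not taken from the webrev or the test library.

```java
import java.util.regex.Pattern;

// Illustrative only: FindVsMatches and its methods are hypothetical names,
// not part of the JDK test library or of the webrev under review.
class FindVsMatches {
    // Sub-sequence search: true if the pattern occurs anywhere in the output.
    static boolean findsIn(String output, String regex) {
        return Pattern.compile(regex).matcher(output).find();
    }

    // Whole-input match: true only if the pattern covers the entire output.
    static boolean matchesWhole(String output, String regex) {
        return Pattern.compile(regex).matcher(output).matches();
    }

    public static void main(String[] args) {
        // Made-up sample output line, used only to exercise the two helpers.
        String output = "hsdb> Attaching to process ID 1234, please wait...";
        String regex = "process ID \\d+";

        System.out.println("find():                " + findsIn(output, regex));
        System.out.println("matches():             " + matchesWhole(output, regex));
        System.out.println("matches() with .*: " + matchesWhole(output, ".*" + regex + ".*"));
    }
}
```

With find(), the pattern succeeds wherever it occurs in the output. matches() accepts the same single-line string only once the pattern is wrapped in leading and trailing .*, which is the situation the original question was worried about.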
From daniel.mitterdorfer at gmail.com Thu Jul 12 13:35:39 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Thu, 12 Jul 2018 15:35:39 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() Message-ID: Hi, while working on a change in Elasticsearch, I discovered an interesting situation related to the implementation of jmm_getMemoryUsage (see [jdk-mem-usage]). In one of the test runs, a test failed with the following exception: java.lang.IllegalArgumentException: committed = 542113792 should be < max = 536870912 at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) at sun.management.MemoryImpl.getMemoryUsage0(Native Method) at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) [...] This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags specified were -Xms512M -Xmx512M. So far this failure has occurred only once and I have not been able to reproduce it yet. The values reported in the exception message are: * "max": 536870912 = 512MB (exactly) * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max". As the value of "max" is exactly what we have specified with -Xmx it seems to indicate that the problem is the calculation of "committed". I do not understand under which conditions this can happen, so I am posting this to the mailing list in case anybody has ideas about what might cause it. I plan to run further tests with JVM trace logging enabled (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be precise) in the hope that this problem will occur again and I can provide logs that help to debug / fix the problem.
Searching for that error message, there is [JDK-8020530] but that one is about *non-heap* memory usage and has already been resolved a while ago. Several sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate that this problem happened indeed in the wild but what I find odd is that I could not find a single ticket in the OpenJDK bug tracker or a discussion on a JDK mailing list about this problem. I'd be glad to get any pointers on what might cause this or requests for additional info that I need to provide to help analyze this problem. Thanks, Daniel [jdk-mem-usage] http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530 [apache-ignite-workaround] https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733 From jcbeyler at google.com Thu Jul 12 14:25:29 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 12 Jul 2018 07:25:29 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail In-Reply-To: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> References: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> Message-ID: Thanks Serguei! Anybody motivated to give this a review please? Thanks! Jc On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > The fix looks good. > I'll sponsor a push once it has been reviewed. 
> > Thanks, > Serguei > > > On 7/11/18 10:04, JC Beyler wrote: > > Hi all, > > Could someone review the small-ish webrev for the bug: > https://bugs.openjdk.java.net/browse/JDK-8206960 > > The webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/ > > Basically, the tests were failing for two reasons: > - VMEventTest was failing because Graal does not support > DisableIntrinsic required by the test, I disabled testing the test with > Graal in this case > - The other tests were failing because the BCI <-> source code line > numbers are not always correct when using Graal via uncommon traps; > therefore the tests now check if Graal is being used and, if so, only > checks the method names. This allows us to still have tests working with > Graal, albeit a bit more coarse. > > This passes all the HeapMonitor tests > with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI > -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal" > > (Except the GCCMS one which is being fixed via the one-liner for > JDK-8205643). > > Let me know what you think, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From gary.adams at oracle.com Thu Jul 12 14:53:38 2018 From: gary.adams at oracle.com (Gary Adams) Date: Thu, 12 Jul 2018 10:53:38 -0400 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <5B3E2FC7.1060303@oracle.com> References: <5B3E2FC7.1060303@oracle.com> Message-ID: <5B476B72.7060203@oracle.com> I've attached the patch for JDK-8206007. I'll need a sponsor to push the changes. On 7/5/18, 10:48 AM, Gary Adams wrote: > A simple test run using "exclude none" shows 625K methods are being > observed. > The bulk of those methods were due to the last class accessed in the > test - VirtualMachineManager. > > It's not important that this particular call is used. 
The test is > simply demonstrating that > filters work for other packages than java and javax. > > This proposed fix uses a simpler lookup for GregorianCalendar. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 > Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/ -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 8206007.patch URL: From jini.george at oracle.com Thu Jul 12 16:32:31 2018 From: jini.george at oracle.com (Jini George) Date: Thu, 12 Jul 2018 22:02:31 +0530 Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <5eb111d4ffd8427398a09c62a925e5d7@sap.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <5eb111d4ffd8427398a09c62a925e5d7@sap.com> Message-ID: Thanks, Goetz. The "Error: cannot attach" was put in deliberately so that we get to know that this is not getting tested. I can change this to retain the old behaviour of skipping the tests if we cannot attach. Thanks, Jini. On 7/12/2018 3:41 PM, Lindenmaier, Goetz wrote: > Hi Jini, > > I had a look at your change. > It makes tests fail if shouldSAAttach returns false. > > Now, these tests say "Errror: cannot attach", > while before they would terminate silently. > > It is not an Error if the SA can not attach. > > You can reproduce this by just changing > Platform.shouldSAAttach() to always return false. > > I'll run the patch throuqh our nightly tests to > see whether they pass mac. > > Best regards, > Goetz. > >> -----Original Message----- >> From: serviceability-dev [mailto:serviceability-dev- >> bounces at openjdk.java.net] On Behalf Of Jini George >> Sent: Montag, 9. Juli 2018 20:45 >> To: serviceability-dev at openjdk.java.net >> Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X >> >> Requesting reviews for enabling SA tests on OS X for Mach5. 
>> >> https://bugs.openjdk.java.net/browse/JDK-8199700 >> >> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/ >> >> The changes are mostly to include the addition of sudo privileges to the >> SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests >> (those using clhsdb) have been refactored to use ClhsdbLauncher for ease >> of maintainence. This also avoids checks for Platform.shouldSAAttach() >> for corefile related test cases. More details have been provided in JIRA. >> >> Thanks, >> Jini. From jini.george at oracle.com Thu Jul 12 16:43:00 2018 From: jini.george at oracle.com (Jini George) Date: Thu, 12 Jul 2018 22:13:00 +0530 Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <5eb111d4ffd8427398a09c62a925e5d7@sap.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <5eb111d4ffd8427398a09c62a925e5d7@sap.com> Message-ID: > > I'll run the patch throuqh our nightly tests to > see whether they pass mac. Thanks for this. Let me know in case there are timeouts due to there not being a no-password entry for the user in the /etc/sudoers list. Thanks, Jini. From alexey.menkov at oracle.com Thu Jul 12 18:21:55 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 12 Jul 2018 11:21:55 -0700 Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] In-Reply-To: <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com> References: <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com> Message-ID: <9923479f-eb69-2d5b-3939-453c77c81aef@oracle.com> +1 --alex On 07/11/2018 22:26, serguei.spitsyn at oracle.com wrote: > Hi Daniil, > > It looks good. > > Thanks, > Serguei > > > On 7/11/18 22:23, Daniil Titov wrote: >> Please review the changes that fix jdb issue with evaluation of >> multidimensional arrays of primitives. 
>> >> The problem here is that for N-dimensional arrays of the primitives >> with N greater then 2, JDI fails to find its component type (which is >> an array of dimension N-1) assuming that it is a boot type. >> >> Thanks! >> Issue: https://bugs.openjdk.java.net/browse/JDK-8191948 >> Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01 >> Best regards, >> Daniil >> >> >> > From alexey.menkov at oracle.com Thu Jul 12 18:30:37 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 12 Jul 2018 11:30:37 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail In-Reply-To: References: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> Message-ID: Looks good to me as well. --alex On 07/12/2018 07:25, JC Beyler wrote: > Thanks Serguei! > > Anybody motivated to give this a review please? > > Thanks! > Jc > > On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com > > wrote: > > Hi Jc, > > The fix looks good. > I'll sponsor a push once it has been reviewed. > > Thanks, > Serguei > > > On 7/11/18 10:04, JC Beyler wrote: >> Hi all, >> >> Could someone review the small-ish webrev for the bug: >> https://bugs.openjdk.java.net/browse/JDK-8206960 >> >> The webrev is here: >> http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/ >> >> >> Basically, the tests were failing for two reasons: >> - VMEventTest was failing because Graal does not support >> DisableIntrinsic required by the test, I disabled testing the test >> with Graal in this case >> - The other tests were failing because the BCI <-> source code >> line numbers are not always correct when using Graal via uncommon >> traps; therefore the tests now check if Graal is being used and, >> if so, only checks the method names. This allows us to still have >> tests working with Graal, albeit a bit more coarse.
>> >> This passes all the HeapMonitor tests >> with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI >> -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal" >> >> (Except the GCCMS one which is being fixed via the one-liner for >> JDK-8205643). >> >> Let me know what you think, >> Jc > > > > -- > > Thanks, > Jc From serguei.spitsyn at oracle.com Thu Jul 12 18:33:00 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 12 Jul 2018 11:33:00 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail In-Reply-To: References: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> Message-ID: <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com> Thanks, Alex! Jc, I'll push it if you send me a patch. Thanks, Serguei On 7/12/18 11:30, Alex Menkov wrote: > Looks good to me as well. > > --alex > > On 07/12/2018 07:25, JC Beyler wrote: >> Thanks Serguei! >> >> Anybody motivated to give this a review please? >> >> Thanks! >> Jc >> >> On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com >> > > wrote: >> >> Hi Jc, >> >> The fix looks good. >> I'll sponsor a push once it has been reviewed. >> >> Thanks, >> Serguei >> >> >> On 7/11/18 10:04, JC Beyler wrote: >>> Hi all, >>> >>> Could someone review the small-ish webrev for the bug: >>> https://bugs.openjdk.java.net/browse/JDK-8206960 >>> >>> The webrev is here: >>> http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/ >>> >>> >>> Basically, the tests were failing for two reasons: >>> - VMEventTest was failing because Graal does not support >>> DisableIntrinsic required by the test, I disabled testing the test >>> with Graal in this case >>> - The other tests were failing because the BCI <-> source code >>> line numbers are not always correct when using Graal via uncommon >>> traps; therefore the tests now check if Graal is being used and, >>>
if so, only checks the method names. This allows us to still have >>> tests working with Graal, albeit a bit more coarse. >>> >>> This passes all the HeapMonitor tests >>> with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI >>> -XX:+TieredCompilation -XX:+UseJVMCICompiler >>> -Djvmci.Compiler=graal" >>> >>> (Except the GCCMS one which is being fixed via the one-liner for >>> JDK-8205643). >>> >>> Let me know what you think, >>> Jc >> >> >> >> -- >> >> Thanks, >> Jc From jcbeyler at google.com Thu Jul 12 19:02:21 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 12 Jul 2018 12:02:21 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail In-Reply-To: <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com> References: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com> Message-ID: Hi Serguei, Here you are: http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.01/ Thanks for the push! Jc On Thu, Jul 12, 2018 at 11:33 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Thanks, Alex! > > Jc, > > I'll push it if you send me a patch. > > Thanks, > Serguei > > > On 7/12/18 11:30, Alex Menkov wrote: > > Looks good to me as well. > > > > --alex > > > > On 07/12/2018 07:25, JC Beyler wrote: > >> Thanks Serguei! > >> > >> Anybody motivated to give this a review please? > >> > >> Thanks! > >> Jc > >> > >> On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com > >> >> > wrote: > >> > >> Hi Jc, > >> > >> The fix looks good. > >> I'll sponsor a push once it has been reviewed.
> >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/11/18 10:04, JC Beyler wrote: > >>> Hi all, > >>> > >>> Could someone review the small-ish webrev for the bug: > >>> https://bugs.openjdk.java.net/browse/JDK-8206960 > >>> > >>> The webrev is here: > >>> http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/ > >>> > >>> > >>> Basically, the tests were failing for two reasons: > >>> - VMEventTest was failing because Graal does not support > >>> DisableIntrinsic required by the test, I disabled testing the test > >>> with Graal in this case > >>> - The other tests were failing because the BCI <-> source code > >>> line numbers are not always correct when using Graal via uncommon > >>> traps; therefore the tests now check if Graal is being used and, > >>> if so, only checks the method names. This allows us to still have > >>> tests working with Graal, albeit a bit more coarse. > >>> > >>> This passes all the HeapMonitor tests > >>> with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI > >>> -XX:+TieredCompilation -XX:+UseJVMCICompiler > >>> -Djvmci.Compiler=graal" > >>> > >>> (Except the GCCMS one which is being fixed via the one-liner for > >>> JDK-8205643). > >>> > >>> Let me know what you think, > >>> Jc > >> > >> > >> > >> -- > >> > >> Thanks, > >> Jc > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniil.x.titov at oracle.com Thu Jul 12 19:08:31 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 12 Jul 2018 12:08:31 -0700 Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] In-Reply-To: <9923479f-eb69-2d5b-3939-453c77c81aef@oracle.com> References: <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com> <9923479f-eb69-2d5b-3939-453c77c81aef@oracle.com> Message-ID: <309C0D5B-365F-4EAC-8D8C-A87A197640BD@oracle.com> Thank you, Alex and Serguei for reviewing this change! 
Best regards, Daniil On 7/12/18, 11:21 AM, "Alex Menkov" wrote: +1 --alex On 07/11/2018 22:26, serguei.spitsyn at oracle.com wrote: > Hi Daniil, > > It looks good. > > Thanks, > Serguei > > > On 7/11/18 22:23, Daniil Titov wrote: >> Please review the changes that fix a jdb issue with evaluation of >> multidimensional arrays of primitives. >> >> The problem here is that for N-dimensional arrays of the primitives >> with N greater than 2, JDI fails to find its component type (which is >> an array of dimension N-1) assuming that it is a boot type. >> >> Thanks! >> Issue: https://bugs.openjdk.java.net/browse/JDK-8191948 >> Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01 >> Best regards, >> Daniil >> >> >> > From serguei.spitsyn at oracle.com Thu Jul 12 19:40:10 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 12 Jul 2018 12:40:10 -0700 Subject: RFR (S) 8206960: [Graal] serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail In-Reply-To: References: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com> <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From mandy.chung at oracle.com Thu Jul 12 20:34:52 2018 From: mandy.chung at oracle.com (mandy chung) Date: Thu, 12 Jul 2018 13:34:52 -0700 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> It's indeed strange that no one reports this issue. I created: https://bugs.openjdk.java.net/browse/JDK-8207200 Mandy On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote: > Hi, > > while working on a change in Elasticsearch, I discovered an interesting > situation related to the implementation of jmm_getMemoryUsage (see > [jdk-mem-usage]).
In one of the test runs, a test failed with the following > exception: > > java.lang.IllegalArgumentException: committed = 542113792 should be < > max = 536870912 > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) > at sun.management.MemoryImpl.getMemoryUsage0(Native Method) > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) > [...] > > This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags > specified were -Xms512M -Xmx512M. So far this failure occurred only once and I > could not reproduce it yet. > > The values reported in the exception message are: > > * "max": 536870912 = 512MB (exactly) > * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max". > > As the value of "max" is exactly what we have specified with -Xmx this indicates > to me that the problem seems to be the calculation of "committed". > > As the value of "max" is exactly what we have specified with -Xmx it seems to > indicate that the problem is the calculation of "committed". I do not > understand under which conditions this can happen thus I post this to the > mailing list in case anybody has ideas what might cause this. > > I plan to run further tests with JVM trace logging enabled > (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be > precise) in the hope that this problem will occur again and I can provide logs > that help to debug / fix the problem. > > Searching for that error message, there is [JDK-8020530] but that one is about > *non-heap* memory usage and has already been resolved a while ago. Several > sources (e.g.
[apache-ignite-workaround] or [netbeans-bug]) seem to indicate > that this problem happened indeed in the wild but what I find odd is that I > could not find a single ticket in the OpenJDK bug tracker or a discussion on a > JDK mailing list about this problem. > > I'd be glad to get any pointers on what might cause this or requests for > additional info that I need to provide to help analyze this problem. > > Thanks, > Daniel > > [jdk-mem-usage] > http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 > [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530 > [apache-ignite-workaround] > https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 > [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733 > From jcbeyler at google.com Thu Jul 12 20:45:03 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 12 Jul 2018 13:45:03 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling Message-ID: Hi all, Could I get a review of an update to the JVMTI Spec for Heap Sampling: http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ The associated bug is here: https://bugs.openjdk.java.net/browse/JDK-8205725 The associated CSR is here: https://bugs.openjdk.java.net/browse/JDK-8206940 The basic reasoning of this webrev/bug/CSR is: - rate is not the right word and should be renamed to interval; this is what provokes the change in the code/tests/API naming.
- the spec does not mention that the new sampling interval will take time to be taken into account (you have to wait for a TLAB to be refilled); this adds that clarification so that the user is not surprised - the spec explicitly says that the sampling is done via a geometric variable which averages to the sampling interval; it was asked to relax this and the spec should just say that the sampling is pseudo-random and the interval will average out to what the user requested. Thanks for all your help, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Jul 12 21:27:10 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 12 Jul 2018 14:27:10 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: References: Message-ID: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Thu Jul 12 22:48:34 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 12 Jul 2018 15:48:34 -0700 Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time on some builds In-Reply-To: <5B476B72.7060203@oracle.com> References: <5B3E2FC7.1060303@oracle.com> <5B476B72.7060203@oracle.com> Message-ID: <75c06bd1-a405-528b-25d9-307ca78d60c3@oracle.com> I'll take care of it shortly. Chris On 7/12/18 7:53 AM, Gary Adams wrote: > I've attached the patch for JDK-8206007. > I'll need a sponsor to push the changes. > > On 7/5/18, 10:48 AM, Gary Adams wrote: >> A simple test run using "exclude none" shows 625K methods are being >> observed. >> The bulk of those methods were due to the last class accessed in the >> test - VirtualMachineManager. >> >> It's not important that this particular call is used. The test is >> simply demonstrating that >> filters work for other packages than java and javax. >> >> This proposed fix uses a simpler lookup for GregorianCalendar. >> >>
Issue: https://bugs.openjdk.java.net/browse/JDK-8206007 >> Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/ > From chris.plummer at oracle.com Thu Jul 12 22:58:36 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 12 Jul 2018 15:58:36 -0700 Subject: RFR: JDK-8201513: nsk/jvmti/IterateThroughHeap/filter-* are broken In-Reply-To: References: Message-ID: +1 On 7/11/18 2:26 PM, serguei.spitsyn at oracle.com wrote: > Hi Alex, > > The fix looks good. > Thank you for fixing the typos! > > Thanks, > Serguei > > > On 7/11/18 11:39, Alex Menkov wrote: >> Hi all, >> >> please review a fix for >> https://bugs.openjdk.java.net/browse/JDK-8201513 >> webrev: >> http://cr.openjdk.java.net/~amenkov/IterateThroughHeap/webrev/ >> >> summary: >> The tests had an error which was fixed during open-sourcing. >> After that the tests started to fail. The root cause of the failures is >> incorrect verification (positive results are interpreted as negative) >> >> --alex > From mikael.vidstedt at oracle.com Fri Jul 13 00:21:00 2018 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Thu, 12 Jul 2018 17:21:00 -0700 Subject: RFR(XS): 8207217: Problem list java/lang/management/ThreadMXBean/AllThreadIds.java Message-ID: Please review this change which problem lists the frequently failing java/lang/management/ThreadMXBean/AllThreadIds.java test until the issue[1] has been fixed: Bug: https://bugs.openjdk.java.net/browse/JDK-8207217 webrev: http://cr.openjdk.java.net/~mikael/webrevs/8207217/webrev.00/open/webrev/ Cheers, Mikael [1] https://bugs.openjdk.java.net/browse/JDK-8131745 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From david.holmes at oracle.com Fri Jul 13 00:29:31 2018 From: david.holmes at oracle.com (David Holmes) Date: Fri, 13 Jul 2018 10:29:31 +1000 Subject: RFR(XS): 8207217: Problem list java/lang/management/ThreadMXBean/AllThreadIds.java In-Reply-To: References: Message-ID: <1684f644-bf38-47a7-2725-e4d3d700a573@oracle.com> Ship it! Thanks, David On 13/07/2018 10:21 AM, Mikael Vidstedt wrote: > > Please review this change which problem lists the frequently failing > java/lang/management/ThreadMXBean/AllThreadIds.java test until the > issue[1] has been fixed: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8207217 > webrev: > http://cr.openjdk.java.net/~mikael/webrevs/8207217/webrev.00/open/webrev/ > > Cheers, > Mikael > > [1] https://bugs.openjdk.java.net/browse/JDK-8131745 > From goetz.lindenmaier at sap.com Fri Jul 13 05:55:12 2018 From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz) Date: Fri, 13 Jul 2018 05:55:12 +0000 Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <5eb111d4ffd8427398a09c62a925e5d7@sap.com> Message-ID: <1156b70a17d44226a0510713e1975451@sap.com> Hi Jini, A whole bunch of tests failed on mac. I'll send you A jtr file off list, to avoid spamming the list. See below the core message. The tests passed on linuxppc64le, linuxx86_64 and solaris_sparc, the other tests are still pending. Best regards, Goetz. 
----------System.err:(32/1923)---------- Command line: ['/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/sapjvm_12/bin/java' '-Xcomp' '-cp' '/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/serviceability/sa/ClhsdbFindPC.d:/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/test/lib' 'jdk.test.lib.apps.LingeredApp' '78a4a198-8a55-4684-ac1e-2d28311a0952.lck' ] sudo: no tty present and no askpass program specified stdout: []; stderr: [] exitValue = 1 LingeredApp stdout: []; LingeredApp stderr: [] LingeredApp exitValue = 0 java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0] at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:95) at ClhsdbFindPC.main(ClhsdbFindPC.java:103) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.lang.RuntimeException: Expected to get exit value of [0] at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:396) at ClhsdbLauncher.runCmd(ClhsdbLauncher.java:128) at ClhsdbLauncher.run(ClhsdbLauncher.java:176) at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:58) ... 
7 more JavaTest Message: Test threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0] JavaTest Message: shutting down test STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0] > -----Original Message----- > From: Jini George > Sent: Thursday, July 12, 2018 6:43 PM > To: Lindenmaier, Goetz ; serviceability- > dev at openjdk.java.net > Subject: Re: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X > > > > > > I'll run the patch through our nightly tests to > > see whether they pass on mac. > > Thanks for this. Let me know in case there are timeouts due to there not > being a no-password entry for the user in the /etc/sudoers list. > > Thanks, > Jini. From jini.george at oracle.com Fri Jul 13 06:21:06 2018 From: jini.george at oracle.com (Jini George) Date: Fri, 13 Jul 2018 11:51:06 +0530 Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X In-Reply-To: <1156b70a17d44226a0510713e1975451@sap.com> References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com> <5eb111d4ffd8427398a09c62a925e5d7@sap.com> <1156b70a17d44226a0510713e1975451@sap.com> Message-ID: Thanks a bunch, Goetz. As David feared, the tests are failing due to there not being a no-password entry for the user in the /etc/sudoers list ("sudo: no tty present and no askpass program specified"). Let me see what I can do about this. Thanks, Jini. On 7/13/2018 11:25 AM, Lindenmaier, Goetz wrote: > Hi Jini, > > A whole bunch of tests failed on mac. I'll send you > a jtr file off list, to avoid spamming the list. > See below the core message. > > The tests passed on linuxppc64le, linuxx86_64 and solaris_sparc, the other > tests are still pending. > > Best regards, > Goetz.
> > ----------System.err:(32/1923)---------- > Command line: ['/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/sapjvm_12/bin/java' '-Xcomp' '-cp' '/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/serviceability/sa/ClhsdbFindPC.d:/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/test/lib' 'jdk.test.lib.apps.LingeredApp' '78a4a198-8a55-4684-ac1e-2d28311a0952.lck' ] > sudo: no tty present and no askpass program specified > stdout: []; > stderr: [] > exitValue = 1 > > LingeredApp stdout: []; > LingeredApp stderr: [] > LingeredApp exitValue = 0 > java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0] > > at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:95) > at ClhsdbFindPC.main(ClhsdbFindPC.java:103) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: java.lang.RuntimeException: Expected to get exit value of [0] > > at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:396) > at ClhsdbLauncher.runCmd(ClhsdbLauncher.java:128) > at ClhsdbLauncher.run(ClhsdbLauncher.java:176) > at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:58) > ... 
7 more > > JavaTest Message: Test threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0] > > JavaTest Message: shutting down test > > STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0] > >> -----Original Message----- >> From: Jini George >> Sent: Thursday, July 12, 2018 6:43 PM >> To: Lindenmaier, Goetz ; serviceability- >> dev at openjdk.java.net >> Subject: Re: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X >> >> >>> >>> I'll run the patch through our nightly tests to >>> see whether they pass on mac. >> >> Thanks for this. Let me know in case there are timeouts due to there not >> being a no-password entry for the user in the /etc/sudoers list. >> >> Thanks, >> Jini. From daniel.mitterdorfer at gmail.com Fri Jul 13 08:04:47 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Fri, 13 Jul 2018 10:04:47 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> References: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> Message-ID: Hi Mandy, thank you for creating the issue. One note: I spotted this in JDK 10 (build 10.0.1+10) but in the ticket it says it affects version 8. Daniel On Fri, 13 Jul 2018 at 04:15, mandy chung wrote: > > It's indeed strange that no one reports this issue. I created: > https://bugs.openjdk.java.net/browse/JDK-8207200 > > Mandy > > On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote: > > Hi, > > > > while working on a change in Elasticsearch, I discovered an interesting > > situation related to the implementation of jmm_getMemoryUsage (see > > [jdk-mem-usage]).
In one of the test runs, a test failed with the following > > exception: > > > > java.lang.IllegalArgumentException: committed = 542113792 should be < > > max = 536870912 > > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) > > at sun.management.MemoryImpl.getMemoryUsage0(Native Method) > > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) > > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) > > [...] > > > > This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags > > specified were -Xms512M -Xmx512M. So far this failure occurred only once and I > > could not reproduce it yet. > > > > The values reported in the exception message are: > > > > * "max": 536870912 = 512MB (exactly) > > * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max". > > > > As the value of "max" is exactly what we have specified with -Xmx this indicates > > to me that the problem seems to be the calculation of "committed". > > > > As the value of "max" is exactly what we have specified with -Xmx it seems to > > indicate that the problem is the calculation of "committed". I do not > > understand under which conditions this can happen thus I post this to the > > mailing list in case anybody has ideas what might cause this. > > > > I plan to run further tests with JVM trace logging enabled > > (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be > > precise) in the hope that this problem will occur again and I can provide logs > > that help to debug / fix the problem. > > > > Searching for that error message, there is [JDK-8020530] but that one is about > > *non-heap* memory usage and has already been resolved a while ago. Several > > sources (e.g.
[apache-ignite-workaround] or [netbeans-bug]) seem to indicate > > that this problem happened indeed in the wild but what I find odd is that I > > could not find a single ticket in the OpenJDK bug tracker or a discussion on a > > JDK mailing list about this problem. > > > > I'd be glad to get any pointers on what might cause this or requests for > > additional info that I need to provide to help analyze this problem. > > > > Thanks, > > Daniel > > > > [jdk-mem-usage] > > http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 > > [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530 > > [apache-ignite-workaround] > > https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 > > [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733 > > From Alan.Bateman at oracle.com Fri Jul 13 08:16:27 2018 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 13 Jul 2018 09:16:27 +0100 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> Message-ID: <9ff83f20-3c71-80cd-3d3a-5b65910d0266@oracle.com> On 13/07/2018 09:04, Daniel Mitterdorfer wrote: > Hi Mandy, > > thank you for creating the issue. One note: I spotted this in JDK 10 > (build 10.0.1+10) but in the ticket it says it affects version 8. > A bug with affects version N is assumed to be applicable to all releases > N unless tagged otherwise. So "10" could be added to the list of versions where the issue was spotted or confirmed if needed. -Alan From erik.helin at oracle.com Fri Jul 13 08:18:19 2018 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 13 Jul 2018 10:18:19 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: Hi Daniel, thanks for letting us know. 
Since you have only set -Xms512M and -Xmx512M and you are running on JDK 10, that means you are using the G1 garbage collector, so all the calls to pool->get_memory_usage() in the loop will end up in g1MemoryPool.cpp [0] which in turn will return cached values from the recalculate_sizes code in G1MonitoringSupport [1]. Since you are running with -Xmx512m you should have gotten 1 MB sized regions (see heapRegion.cpp for details [2]), so the 5 MB _could_ mean that five regions were accounted wrongly. Do you have any kind of GC logging from the test run where you encountered the bug? The code in G1MonitoringSupport::recalculate_sizes seems messy enough that there could be a small bug in there. I'm adding hotspot-gc-dev since all GC developers might not read serviceability-dev. Thanks, Erik [0]: http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/g1MemoryPool.cpp [1]: http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/g1MonitoringSupport.cpp#l182 [2]: http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/heapRegion.cpp#l63 On 07/12/2018 03:35 PM, Daniel Mitterdorfer wrote: > Hi, > > while working on a change in Elasticsearch, I discovered an interesting > situation related to the implementation of jmm_getMemoryUsage (see > [jdk-mem-usage]). In one of the test runs, a test failed with the following > exception: > > java.lang.IllegalArgumentException: committed = 542113792 should be < > max = 536870912 > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166) > at sun.management.MemoryImpl.getMemoryUsage0(Native Method) > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71) > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) > [...] > > This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags > specified were -Xms512M -Xmx512M. So far this failure occurred only once and I > could not reproduce it yet.
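The 5 MB figure in Erik's reply can be checked directly from the two numbers in the exception message. What follows is a minimal sketch, not code from the JDK or from anyone on this thread: the class name `CommittedVsMax` is hypothetical, and the 1 MB region size is the assumption Erik states for -Xmx512m. It also shows that the public `java.lang.management.MemoryUsage` constructor rejects a committed value larger than a defined max, which is consistent with the `IllegalArgumentException` in the trace.

```java
import java.lang.management.MemoryUsage;

// Hypothetical helper class; the name and structure are illustrative only.
public class CommittedVsMax {
    static final long MAX = 536870912L;        // "max" from the exception message
    static final long COMMITTED = 542113792L;  // "committed" from the exception message
    static final long REGION = 1024L * 1024L;  // assumed G1 region size of 1 MB

    // Bytes by which "committed" exceeds "max".
    static long excessBytes() {
        return COMMITTED - MAX;
    }

    // The same excess expressed in (assumed) 1 MB regions.
    static long excessRegions() {
        return excessBytes() / REGION;
    }

    // True if the MemoryUsage constructor refuses committed > max.
    static boolean constructorRejects() {
        try {
            // init is set to MAX because -Xms equals -Xmx; used = 0 is arbitrary.
            new MemoryUsage(MAX, 0L, COMMITTED, MAX);
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("excess bytes   = " + excessBytes());
        System.out.println("excess regions = " + excessRegions());
        System.out.println("rejected       = " + constructorRejects());
    }
}
```

The excess being a whole multiple of the assumed region size fits the "five regions accounted wrongly" hypothesis, although it does not prove it.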
> > The values reported in the exception message are: > > * "max": 536870912 = 512MB (exactly) > * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max". > > As the value of "max" is exactly what we have specified with -Xmx this indicates > to me that the problem seems to be the calculation of "committed". > > As the value of "max" is exactly what we have specified with -Xmx it seems to > indicate that the problem is the calculation of "committed". I do not > understand under which conditions this can happen thus I post this to the > mailing list in case anybody has ideas what might cause this. > > I plan to run further tests with JVM trace logging enabled > (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be > precise) in the hope that this problem will occur again and I can provide logs > that help to debug / fix the problem. > > Searching for that error message, there is [JDK-8020530] but that one is about > *non-heap* memory usage and has already been resolved a while ago. Several > sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate > that this problem happened indeed in the wild but what I find odd is that I > could not find a single ticket in the OpenJDK bug tracker or a discussion on a > JDK mailing list about this problem. > > I'd be glad to get any pointers on what might cause this or requests for > additional info that I need to provide to help analyze this problem. 
> > Thanks, > Daniel > > [jdk-mem-usage] > http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 > [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530 > [apache-ignite-workaround] > https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 > [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733 > From daniel.mitterdorfer at gmail.com Fri Jul 13 08:26:50 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Fri, 13 Jul 2018 10:26:50 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: <9ff83f20-3c71-80cd-3d3a-5b65910d0266@oracle.com> References: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> <9ff83f20-3c71-80cd-3d3a-5b65910d0266@oracle.com> Message-ID: Hi Alan, understood. Thanks for clarifying. Daniel On Fri, 13 Jul 2018 at 10:15, Alan Bateman wrote: > > > > On 13/07/2018 09:04, Daniel Mitterdorfer wrote: > > Hi Mandy, > > > > thank you for creating the issue. One note: I spotted this in JDK 10 > > (build 10.0.1+10) but in the ticket it says it affects version 8. > > > A bug with affects version N is assumed to be applicable to all releases > > N unless tagged otherwise. So "10" could be added to the list of > versions where the issue was spotted or confirmed if needed. > > -Alan From daniel.mitterdorfer at gmail.com Fri Jul 13 08:30:17 2018 From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer) Date: Fri, 13 Jul 2018 10:30:17 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: Hi Erik, > > Do you have any kind of GC logging from the test run where you encountered > the bug? Unfortunately, we don't have GC logging enabled by default in our test suite so the exception trace is all I got.
I am now repeatedly running the test suite with the original flags (-Xms512M -Xmx512M) and also added the following logging configuration: -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags As soon as I get another failure, I'll provide the full log file. Please let me know if you need any other logs (i.e. whether I should adjust my log configuration). Daniel From thomas.schatzl at oracle.com Fri Jul 13 08:33:33 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 13 Jul 2018 10:33:33 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote: > Hi Erik, > > > > Do you have any kind of GC logging from the test run where you > > encountered the bug? > > Unfortunately, we don't have GC logging enabled by default in our > test suite so the exception trace is all I got. I am now repeatedly > running the test suite with the original flags (-Xms512M -Xmx512M) > and also added the following logging configuration: > > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > > As soon as I get another failure, I'll provide the full log file. > Please let me know if you need any other logs (i.e. whether I should > adjust my log configuration). I think these flags are fine. Since Erik and I strongly believe the issue is with the relevant G1 code Erik mentioned, we will reassign the bug to us (he said there is already a bug reported on it). Thanks a lot, Thomas From erik.helin at oracle.com Fri Jul 13 08:34:45 2018 From: erik.helin at oracle.com (Erik Helin) Date: Fri, 13 Jul 2018 10:34:45 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> References: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com> Message-ID: On 07/12/2018 10:34 PM, mandy chung wrote: > It's indeed strange that no one reports this issue. I created: >
>     https://bugs.openjdk.java.net/browse/JDK-8207200

Mandy: I moved the bug over to hotspot/gc; this is much more likely to be
a problem with how the GC calculates the sizes. I don't think there is a
bug in the serviceability layer: the JNI getMemoryUsage function only
summarizes the data it gets from the GC.

Thanks for creating the bug, we will follow up with Daniel.
Erik

> Mandy
>
> On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote:
>> Hi,
>>
>> while working on a change in Elasticsearch, I discovered an interesting
>> situation related to the implementation of jmm_getMemoryUsage (see
>> [jdk-mem-usage]). In one of the test runs, a test failed with the
>> following exception:
>>
>> java.lang.IllegalArgumentException: committed = 542113792 should be <
>> max = 536870912
>> at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
>> at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
>> at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
>> at
>> org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246)
>>
>> [...]
>>
>> This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only
>> JVM flags specified where -Xms512M -Xmx512M. So far this failure
>> occurred only once and I could not reproduce it yet.
>>
>> The values reported in the exception message are:
>>
>> * "max": 536870912 = 512MB (exactly)
>> * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
>>
>> As the value of "max" is exactly what we have specified with -Xmx this
>> indicates to me that the problem seems to be the calculation of
>> "committed".
>>
>> As the value of "max" is exactly what we have specified with -Xmx it
>> seems to indicate that the problem is the calculation of "committed".
>> I do not understand under which conditions this can happen thus I post
>> this to the mailing list in case anybody has ideas what might cause
>> this.
>>
>> I plan to run further tests with JVM trace logging enabled
>> (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
>> to be precise) in the hope that this problem will occur again and I
>> can provide logs that help to debug / fix the problem.
>>
>> Searching for that error message, there is [JDK-8020530] but that one
>> is about *non-heap* memory usage and has already been resolved a while
>> ago. Several sources (e.g. [apache-ignite-workaround] or
>> [netbeans-bug]) seem to indicate that this problem happened indeed in
>> the wild but what I find odd is that I could not find a single ticket
>> in the OpenJDK bug tracker or a discussion on a JDK mailing list about
>> this problem.
>>
>> I'd be glad to get any pointers on what might cause this or requests for
>> additional info that I need to provide to help analyze this problem.
>>
>> Thanks,
>> Daniel
>>
>> [jdk-mem-usage]
>> http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728
>>
>> [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
>> [apache-ignite-workaround]
>> https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346
>>
>> [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
>>

From gary.adams at oracle.com Fri Jul 13 11:29:31 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 13 Jul 2018 07:29:31 -0400
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured
Message-ID: <5B488D1B.3090808@oracle.com>

This is a simple update to set the jtreg timeout to match the
internal waittime already being used by these vmTestbase/nsk/jdb tests.

  Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
  Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/

From ralf.schmelter at sap.com Fri Jul 13 13:22:54 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 13 Jul 2018 13:22:54 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
Message-ID: <26a3d03903494257ba2995d082ae1960@sap.com>

Hi Serguei,

Sorry for the late reply, but it seems the spam filter has removed your
emails. I just saw them in the archives.

Regarding this code:

    if (length != count) {
        error = JVMTI_ERROR_INTERNAL;
    }

count is the number of frames filled into the array (it is set in the
GetStackTrace JVMTI call) and length is the number of frames requested to
be filled in. Both are independent of the start index at this point. Note
that I've reused the count variable (it was first initialized to hold the
number of frames on the stack). Maybe it is clearer to use a new variable
in this call?

The packet will not be sent if an error code is set on the output stream
(see outStream_sendReply()). This should cover all cases (in both the new
and the old code).

Best regards,
Ralf

From daniel.mitterdorfer at gmail.com Fri Jul 13 14:10:37 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Fri, 13 Jul 2018 16:10:37 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To:
References:
Message-ID:

Hi,

I have good news. I was able to reproduce this issue but this time I have
logs.
A test failed around 15:06:55 with the following stack trace:

java.lang.IllegalArgumentException: committed = 537919488 should be < max = 536870912
    at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
    at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
    at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)

This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10
(build 10+46). The JVM arguments were:

-Xms512M -Xmx512M -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags

The logs are somewhat massive (~250MB uncompressed) and available at
https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0

I hope that helps identify the cause. Please let me know if you need
anything else.

Daniel

On Fri, 13 July 2018 at 10:33, Thomas Schatzl wrote:

>
> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote:
> > Hi Erik,
> > >
> > > Do you any kind of GC logging from the test run where you
> > > encountered the bug?
> >
> > Unfortunately, we don't have GC logging enabled by default in our
> > test suite so the exception trace is all I got. I am now repeatedly
> > running the test suite with the original flags (-Xms512M -Xmx512M)
> > and also added the following logging configuration:
> >
> > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> >
> > As soon as I get another failure, I'll provide the full log file.
> > Please let me know if you need any other logs (i.e. whether I should
> > adjust my log configuration).
>
> I think these flags are fine.
>
> Since Erik and me strongly believe the issue is with the relevant G1
> code Erik mentioned we will reassign the bug to us (he said there is
> already a bug reported on it).
>
> Thanks a lot,
> Thomas
>

From mandy.chung at oracle.com Fri Jul 13 15:01:53 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Fri, 13 Jul 2018 08:01:53 -0700
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To:
References: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
Message-ID: <2160e7d8-598d-6efa-6786-2c4616a1cce1@oracle.com>

Great! Thanks Erik.

Mandy

On 7/13/18 1:34 AM, Erik Helin wrote:
> On 07/12/2018 10:34 PM, mandy chung wrote:
>> It's indeed strange that no one reports this issue. I created:
>>     https://bugs.openjdk.java.net/browse/JDK-8207200
>
> Mandy: I moved the bug over to hotspot/gc, this is much more likely to
> be a problem with how the GC calculates the sizes. I don't think there
> is a bug in the serviceability layer, the JNI getMemoryUsage function
> only summarizes the data it gets from the GC.
>
> Thanks for creating the bug, we will follow up with Daniel.
> Erik
>
>> Mandy
>>
>> On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote:
>>> Hi,
>>>
>>> while working on a change in Elasticsearch, I discovered an interesting
>>> situation related to the implementation of jmm_getMemoryUsage (see
>>> [jdk-mem-usage]). In one of the test runs, a test failed with the
>>> following exception:
>>>
>>> java.lang.IllegalArgumentException: committed = 542113792 should be <
>>> max = 536870912
>>> at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
>>> at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
>>> at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
>>> at
>>> org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246)
>>>
>>> [...]
>>>
>>> This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The
>>> only JVM flags specified where -Xms512M -Xmx512M. So far this failure
>>> occurred only once and I could not reproduce it yet.
>>>
>>> The values reported in the exception message are:
>>>
>>> * "max": 536870912 = 512MB (exactly)
>>> * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
>>>
>>> As the value of "max" is exactly what we have specified with -Xmx
>>> this indicates to me that the problem seems to be the calculation of
>>> "committed".
>>>
>>> As the value of "max" is exactly what we have specified with -Xmx it
>>> seems to indicate that the problem is the calculation of "committed".
>>> I do not understand under which conditions this can happen thus I
>>> post this to the mailing list in case anybody has ideas what might
>>> cause this.
>>>
>>> I plan to run further tests with JVM trace logging enabled
>>> (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
>>> to be precise) in the hope that this problem will occur again and I
>>> can provide logs that help to debug / fix the problem.
>>>
>>> Searching for that error message, there is [JDK-8020530] but that one
>>> is about *non-heap* memory usage and has already been resolved a
>>> while ago. Several sources (e.g. [apache-ignite-workaround] or
>>> [netbeans-bug]) seem to indicate that this problem happened indeed in
>>> the wild but what I find odd is that I could not find a single ticket
>>> in the OpenJDK bug tracker or a discussion on a JDK mailing list
>>> about this problem.
>>>
>>> I'd be glad to get any pointers on what might cause this or requests for
>>> additional info that I need to provide to help analyze this problem.
>>>
>>> Thanks,
>>> Daniel
>>>
>>> [jdk-mem-usage]
>>> http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728
>>>
>>> [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
>>> [apache-ignite-workaround]
>>> https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346
>>>
>>> [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
>>>

From serguei.spitsyn at oracle.com Fri Jul 13 16:25:01 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Jul 2018 09:25:01 -0700
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured
In-Reply-To: <5B488D1B.3090808@oracle.com>
References: <5B488D1B.3090808@oracle.com>
Message-ID: <1934bc19-d78a-668a-7c05-529961f57565@oracle.com>

Hi Gary,

It looks good.

Thanks,
Serguei

On 7/13/18 04:29, Gary Adams wrote:
> This is a simple update to set the jtreg timeout to match the
> internal waittime already being used by these vmTestbase/nsk/jdb tests.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
>   Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/

From markus.gaisbauer at gmail.com Fri Jul 13 16:35:21 2018
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Fri, 13 Jul 2018 18:35:21 +0200
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
Message-ID:

Hello,

I am trying to use ThreadMXBean::getThreadAllocatedBytes
(com.sun.management) to get the amount of allocated memory of the current
thread in some performance critical code.

Unfortunately, the current implementation can be rather slow and the
duration of each call unpredictable. I ran a test in a JVM with 500
threads. Depending on which thread was queried, getThreadAllocatedBytes
took between 100 ns and 2500 ns.

The root cause of the problem is
ThreadsList::find_JavaThread_from_java_tid, which performs a linear scan
through all Java threads in the current process. The more threads a JVM
has, the slower it gets. In the worst case, the thread with the given TID
is found as the last entry in the list.

Before Java 10, the oldest thread is the slowest one to query.
Since Java 10, the youngest thread is the slowest one to query. I think
this was a side effect of introducing "Thread Safe Memory Reclamation
(Thread-SMR) support".

             Oldest Thread   Youngest Thread
Java 8             8740 ns             76 ns
Java 10             109 ns           2485 ns

A common use case is to query the metric for the current thread (e.g.
before and after performing some operation). This case can be optimized
by introducing a new method: getCurrentThreadAllocatedBytes.

I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using
the new method I saw the following improvements in my test:

             Oldest Thread   Youngest Thread
Proposal             37 ns             37 ns

This is a 60x improvement over the worst case of the current API. In the
best case of the current API, the new method is still 3 times faster.

// based on JVM_SetNativeThreadName in jvm.cpp.
JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread))
  // We don't use a ThreadsListHandle here because the current thread
  // must be alive.
  oop java_thread = JNIHandles::resolve_non_null(currentThread);
  JavaThread* thr = java_lang_Thread::thread(java_thread);
  if (thread == thr) {
    // only supported for the current thread
    return thr->cooked_allocated_bytes();
  }
  return -1;
JVM_END

The proposed method also fixes the problem that getThreadAllocatedBytes
itself allocates some memory on the current thread (two long arrays, 24
bytes) and therefore can slightly skew measurements. The new method,
getCurrentThreadAllocatedBytes, returns exactly the same value if it is
called twice without allocating any memory between those calls.
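
For illustration only (this is not part of the attached patch): the
"before and after" use case described above can be sketched with the
existing com.sun.management API as follows. The class name
AllocatedBytesProbe, the helper measure() and the sink field are made up
for this example. Each getThreadAllocatedBytes(long) call in it is one of
the thread-ID lookups discussed above, and the reported delta may include
the small amount the call itself allocates.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the JDK or of the proposed patch.
final class AllocatedBytesProbe {
    // Keeps the test allocation reachable so it cannot be optimized away.
    static volatile byte[] sink;

    // Returns the bytes allocated by the calling thread while running
    // `work`, or -1 if this JVM cannot provide the measurement.
    static long measure(Runnable work) {
        java.lang.management.ThreadMXBean platformBean =
                ManagementFactory.getThreadMXBean();
        if (!(platformBean instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean bean =
                (com.sun.management.ThreadMXBean) platformBean;
        if (!bean.isThreadAllocatedMemorySupported()
                || !bean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long tid = Thread.currentThread().getId();
        long before = bean.getThreadAllocatedBytes(tid); // lookup by thread ID
        work.run();
        long after = bean.getThreadAllocatedBytes(tid);  // second lookup
        return after - before;
    }

    public static void main(String[] args) {
        long bytes = measure(() -> sink = new byte[1 << 20]);
        System.out.println("allocated bytes: " + bytes);
    }
}
```

With the proposed getCurrentThreadAllocatedBytes, the two lookups by
thread ID in measure() would be replaced by calls that need no lookup.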

I also built a variation of this method that could be used to query
allocated memory more efficiently for anyone who already has a
java.lang.Thread object:

JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj))
  // based on code proposed in threadSMR.hpp
  ThreadsListHandle tlh;
  JavaThread* thr = NULL;
  bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
  if (is_alive) {
    return thr->cooked_allocated_bytes();
  }
  return -1;
JVM_END

This method took 70 ns in my test, which is 85% slower than
GetCurrentThreadAllocatedMemory but still 30% faster than the best case
of the current API. I currently have no immediate need for this second
method, but I think it would also be a valuable addition to the API.

I attached a patch for getCurrentThreadAllocatedBytes. I can create a
second patch for also adding getThreadAllocatedMemory(java.lang.Thread)
to the API.

I am a first-time contributor and I am not 100% sure what process I must
follow to get a change like this into OpenJDK. Can someone have a look at
my proposal and help me through the process?

Best regards,
Markus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: getCurrentThreadAllocatedBytes.diff
Type: application/octet-stream
Size: 5058 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ThreadAllocatedBytesTest.java
Type: application/octet-stream
Size: 3119 bytes
Desc: not available
URL:

From gary.adams at oracle.com Fri Jul 13 18:03:12 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 13 Jul 2018 14:03:12 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
Message-ID: <5B48E960.5060300@oracle.com>

Here's the starting point for contributing to OpenJDK:
  http://openjdk.java.net/contribute/

Here's your post in the mail archives:
  http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html

Most people will post a webrev to cr.openjdk.java.net for larger
changesets. Most attachments are stripped when sent to the mailing list.

From daniel.daugherty at oracle.com Fri Jul 13 18:44:39 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Jul 2018 14:44:39 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To:
References:
Message-ID:

On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
> Hello,
>
> I am trying to use ThreadMXBean::getThreadAllocatedBytes
> (com.sun.management) to get the amount of allocated memory of the
> current thread in some performance critical code.
>
> Unfortunately, the current implementation can be rather slow and the
> duration of each call unpredictable. I ran a test in a JVM with 500
> threads. Depending on which thread was queried,
> getThreadAllocatedBytes took between 100 ns and 2500 ns.
>
> The root cause of the problem is
> ThreadsList::find_JavaThread_from_java_tid which performs a linear
> scan through all Java threads in the current process. The more threads
> a JVM has, the slower it gets. In the worst case, the thread with the
> given TID is found as the last entry in the list.
>
> Before Java 10, the oldest thread is the slowest one to query.
> Since Java 10, the youngest thread is the slowest one to query. I
> think this was a side effect of introducing "Thread Safe Memory
> Reclamation (Thread-SMR) support".
>
>              Oldest Thread
   Youngest Thread
> Java 8             8740 ns             76 ns
> Java 10             109 ns           2485 ns

It is good to see that the longest search is much faster. Erik and Robbin
will be pleased since speeding up traversal of the ThreadsList was one
of the things that we tried to do during the Thread-SMR project.

A first step is to get a new bug filed that documents the issue with
ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
will take care of that.

Dan

> A common use case is to query the metric for the current thread (e.g.
> before and after performing some operation). This case can be
> optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>
> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using
> the new method I saw the following improvements in my test:
>              Oldest Thread   Youngest Thread
> Proposal             37 ns             37 ns
>
> This is a 60x improvement over the worst case of the current API. In
> the best case of the current API, the new method is still 3 times faster.
>
> // based on JVM_SetNativeThreadName in jvm.cpp.
> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env,
> jobject currentThread))
>   // We don't use a ThreadsListHandle here because the current thread
>   // must be alive.
>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>   if (thread == thr) {
>     // only supported for the current thread
>     return thr->cooked_allocated_bytes();
>   }
>   return -1;
> JVM_END
>
> The proposed method also fixes the problem, that
> getThreadAllocatedBytes itself allocates some memory on the current
> thread (two long arrays, 24 bytes) and therefore can slightly skew
> measurements. The new method, getCurrentThreadAllocatedBytes, returns
> exactly the same value if it is called twice without allocating any
> memory between those calls.
>
> I also built a variation of this method that could be used to query
> allocated memory more efficiently for anyone who already has a
> java.lang.Thread object:
>
> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject
> threadObj))
>   // based on code proposed in threadSMR.hpp
>   ThreadsListHandle tlh;
>   JavaThread* thr = NULL;
>   bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj,
> &thr, NULL);
>   if (is_alive) {
>     return thr->cooked_allocated_bytes();
>   }
>   return -1;
> JVM_END
>
> This method took 70 ns in my test, which is 85% slower
> than GetCurrentThreadAllocatedMemory but still 30% faster than the
> best case of the current API. I currently have no immediate need for
> this second method, but I think it would also be a valueable addition
> to the API.
>
> I attached a patch for getCurrentThreadAllocatedBytes. I can create a
> second patch for also adding
> getThreadAllocatedMemory(java.lang.Thread) to the API.
>
> I am a first time contributor and I am not 100% sure what process I
> must follow to get a change like this into OpenJDK. Can someone have a
> look at my proposal and help me through the process?
>
> Best regards,
> Markus
>

From chris.plummer at oracle.com Fri Jul 13 20:21:06 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Jul 2018 13:21:06 -0700
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured
In-Reply-To: <5B488D1B.3090808@oracle.com>
References: <5B488D1B.3090808@oracle.com>
Message-ID: <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com>

Hi Gary,

It looks like you have properly added timeout=300 wherever we use
-waittime:5. However, I'm not 100% convinced this is always the right
approach. In the bug description you said that -waittime is used as a
timeout for individual operations.
However, there could be several of those operations, and they could in
sum exceed the 300 second jtreg timeout you added. What is the default
for -waittime?

I'm also guessing that the initial application of -waittime was never
really tuned to the specific tests and just cloned across most of them.
It seems every test either needs 5m or the default, which doesn't really
make much sense. If 5m was really needed, we should have seen a lot of
failures when ported to jtreg, but as far as I know the only reason this
issue got on your radar was due to exclude001 needing 7m. Maybe rather
than adding timeout=300, you should change -waittime to 2m, since other
than exclude001, none of the tests seem to need more than 2m.

Lastly, does timeoutFactor impact -waittime? It seems it should be
applied to it also. I'm not sure if it is.

thanks,

Chris

On 7/13/18 4:29 AM, Gary Adams wrote:
> This is a simple update to set the jtreg timeout to match the
> internal waittime already being used by these vmTestbase/nsk/jdb tests.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
>   Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/

From daniel.daugherty at oracle.com Fri Jul 13 20:46:12 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Jul 2018 16:46:12 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To:
References:
Message-ID: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>

On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:
> On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>> Hello,
>>
>> I am trying to use ThreadMXBean::getThreadAllocatedBytes
>> (com.sun.management) to get the amount of allocated memory of the
>> current thread in some performance critical code.
>>
>> Unfortunately, the current implementation can be rather slow and the
>> duration of each call unpredictable. I ran a test in a JVM with 500
>> threads. Depending on which thread was queried,
>> getThreadAllocatedBytes took between 100 ns and 2500 ns.
>>
>> The root cause of the problem is
>> ThreadsList::find_JavaThread_from_java_tid which performs a linear
>> scan through all Java threads in the current process. The more
>> threads a JVM has, the slower it gets. In the worst case, the thread
>> with the given TID is found as the last entry in the list.
>>
>> Before Java 10, the oldest thread is the slowest one to query.
>> Since Java 10, the youngest thread is the slowest one to query. I
>> think this was a side effect of introducing "Thread Safe Memory
>> Reclamation (Thread-SMR) support".
>>
>>              Oldest Thread   Youngest Thread
>> Java 8             8740 ns             76 ns
>> Java 10             109 ns           2485 ns
>
> It is good to see that longest search is much faster. Erik and Robbin
> will be pleased since speeding up traversal of the ThreadsList was one
> of the things that we tried to do during the Thread-SMR project.
>
> A first step is get a new bug filed that documents the issue with
> ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
> will take care of that.
>
> Dan
>
>
>> A common use case is to query the metric for the current thread (e.g.
>> before and after performing some operation). This case can be
>> optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>>
>> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by
>> using the new method I saw the following improvements in my test:
>>              Oldest Thread   Youngest Thread
>> Proposal             37 ns             37 ns
>>
>> This is a 60x improvement over the worst case of the current API. In
>> the best case of the current API, the new method is still 3 times faster.
>>
>> // based on JVM_SetNativeThreadName in jvm.cpp.
>> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env,
>> jobject currentThread))
>>   // We don't use a ThreadsListHandle here because the current thread
>>   // must be alive.
>>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>>   if (thread == thr) {
>>     // only supported for the current thread
>>     return thr->cooked_allocated_bytes();
>>   }
>>   return -1;
>> JVM_END
>>
>> The proposed method also fixes the problem, that
>> getThreadAllocatedBytes itself allocates some memory on the current
>> thread (two long arrays, 24 bytes) and therefore can slightly skew
>> measurements. The new method, getCurrentThreadAllocatedBytes, returns
>> exactly the same value if it is called twice without allocating any
>> memory between those calls.
>>
>> I also built a variation of this method that could be used to query
>> allocated memory more efficiently for anyone who already has a
>> java.lang.Thread object:
>>
>> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject
>> threadObj))
>>   // based on code proposed in threadSMR.hpp
>>   ThreadsListHandle tlh;
>>   JavaThread* thr = NULL;
>>   bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj,
>> &thr, NULL);
>>   if (is_alive) {
>>     return thr->cooked_allocated_bytes();
>>   }
>>   return -1;
>> JVM_END
>>
>> This method took 70 ns in my test, which is 85% slower
>> than GetCurrentThreadAllocatedMemory but still 30% faster than the
>> best case of the current API. I currently have no immediate need for
>> this second method, but I think it would also be a valueable addition
>> to the API.
>>
>> I attached a patch for getCurrentThreadAllocatedBytes. I can create a
>> second patch for also adding
>> getThreadAllocatedMemory(java.lang.Thread) to the API.
>>
>> I am a first time contributor and I am not 100% sure what process I
>> must follow to get a change like this into OpenJDK. Can someone have
>> a look at my proposal and help me through the process?
>>
>> Best regards,
>> Markus
>>
>

I believe this is the code that's causing you grief:

open/src/hotspot/share/services/management.cpp:

// Gets an array containing the amount of memory allocated on the Java
// heap for a set of threads (in bytes).  Each element of the array is
// the amount of memory allocated for the thread ID specified in the
// corresponding entry in the given array of thread IDs; or -1 if the
// thread does not exist or has terminated.
JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids,
                                             jlongArray sizeArray))
  // Check if threads is null
  if (ids == NULL || sizeArray == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }

  ResourceMark rm(THREAD);
  typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
  typeArrayHandle ids_ah(THREAD, ta);

  typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
  typeArrayHandle sizeArray_h(THREAD, sa);

  // validate the thread id array
  validate_thread_id_array(ids_ah, CHECK);

  // sizeArray must be of the same length as the given array of thread IDs
  int num_threads = ids_ah->length();
  if (num_threads != sizeArray_h->length()) {
    THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
              "The length of the given long array does not match the length of "
              "the given array of thread IDs");
  }

  ThreadsListHandle tlh;
  for (int i = 0; i < num_threads; i++) {
    JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
    if (java_thread != NULL) {
      sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes());
    }
  }
JVM_END

Perhaps something like this above the "ThreadsListHandle tlh;" line:

  if (num_threads == 1 && THREAD->is_Java_thread()) {
    // Only asking for 1 thread so if we're a JavaThread, then
    // see if this request is for ourself.
    JavaThread* jt = THREAD;
    oop tobj = jt->threadObj();

    if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
      // Return the info for ourself.
      sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
      return;
    }
  }

I haven't checked to see if this will even compile, but I think you'll
get the idea.

Dan

From daniel.daugherty at oracle.com Fri Jul 13 20:52:52 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Jul 2018 16:52:52 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
References: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
Message-ID: <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>

Markus,

I filed the following bug for you:

    JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
    https://bugs.openjdk.java.net/browse/JDK-8207266

Dan

On 7/13/18 4:46 PM, Daniel D. Daugherty wrote:
> On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:
>> On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>>> Hello,
>>>
>>> I am trying to use ThreadMXBean::getThreadAllocatedBytes
>>> (com.sun.management) to get the amount of allocated memory of the
>>> current thread in some performance critical code.
>>>
>>> Unfortunately, the current implementation can be rather slow and the
>>> duration of each call unpredictable. I ran a test in a JVM with 500
>>> threads. Depending on which thread was queried,
>>> getThreadAllocatedBytes took between 100 ns and 2500 ns.
>>>
>>> The root cause of the problem is
>>> ThreadsList::find_JavaThread_from_java_tid which performs a linear
>>> scan through all Java threads in the current process. The more
>>> threads a JVM has, the slower it gets. In the worst case, the thread
>>> with the given TID is found as the last entry in the list.
>>>
>>> Before Java 10, the oldest thread is the slowest one to query.
>>> Since Java 10, the youngest thread is the slowest one to query. I
>>> think this was a side effect of introducing "Thread Safe Memory
>>> Reclamation (Thread-SMR) support".
>>>
>>>              Oldest Thread   Youngest Thread
>>> Java 8             8740 ns             76 ns
>>> Java 10             109 ns           2485 ns
>>
>> It is good to see that longest search is much faster. Erik and Robbin
>> will be pleased since speeding up traversal of the ThreadsList was one
>> of the things that we tried to do during the Thread-SMR project.
>>
>> A first step is get a new bug filed that documents the issue with
>> ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
>> will take care of that.
>>
>> Dan
>>
>>
>>> A common use case is to query the metric for the current thread
>>> (e.g. before and after performing some operation). This case can be
>>> optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>>>
>>> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by
>>> using the new method I saw the following improvements in my test:
>>>              Oldest Thread   Youngest Thread
>>> Proposal             37 ns             37 ns
>>>
>>> This is a 60x improvement over the worst case of the current API. In
>>> the best case of the current API, the new method is still 3 times
>>> faster.
>>>
>>> // based on JVM_SetNativeThreadName in jvm.cpp.
>>> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env,
>>> jobject currentThread))
>>>   // We don't use a ThreadsListHandle here because the current thread
>>>   // must be alive.
>>>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>>>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>>>   if (thread == thr) {
>>>     // only supported for the current thread
>>>     return thr->cooked_allocated_bytes();
>>>   }
return -1; >>> JVM_END >>> >>> The proposed method also fixes the problem, that >>> getThreadAllocatedBytes itself allocates some memory on the current >>> thread (two long arrays, 24 bytes) and therefore can slightly skew >>> measurements. The new method,?getCurrentThreadAllocatedBytes, >>> returns exactly the same value if it is called twice without >>> allocating any memory between those calls. >>> >>> I also built a variation of this method that could be used to query >>> allocated memory more efficiently for anyone who already has a >>> java.lang.Thread object: >>> >>> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject >>> threadObj)) >>> ? // based on code proposedin threadSMR.hpp >>> ThreadsListHandle tlh; >>> JavaThread* thr = NULL; >>> ? bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, >>> &thr, NULL); >>> ? if (is_alive) { >>> ? ? return thr->cooked_allocated_bytes(); >>> ? } >>> ? return -1; >>> JVM_END >>> >>> This method took 70 ns in my test, which is 85% slower >>> than?GetCurrentThreadAllocatedMemory but still 30% faster than the >>> best case of the current API. I currently have no immediate need for >>> this second method, but I think it would also be a valueable >>> addition to the API. >>> >>> I attached a patch for getCurrentThreadAllocatedBytes. I can create >>> a second patch for also adding >>> getThreadAllocatedMemory(java.lang.Thread) to the API. >>> >>> I am a first time contributor and I am not 100% sure what process I >>> must follow to get a change like this into OpenJDK. Can someone have >>> a look at my proposal and help me through the process? >>> >>> Best regards, >>> Markus >>> >> > > I believe this is the code that's causing you grief: > > open/src/hotspot/share/services/management.cpp: > > // Gets an array containing the amount of memory allocated on the Java > // heap for a set of threads (in bytes).? 
Each element of the array is > // the amount of memory allocated for the thread ID specified in the > // corresponding entry in the given array of thread IDs; or -1 if the > // thread does not exist or has terminated. > JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids, > ???????????????????????????????????????????? jlongArray sizeArray)) > ? // Check if threads is null > ? if (ids == NULL || sizeArray == NULL) { > ??? THROW(vmSymbols::java_lang_NullPointerException()); > ? } > > ? ResourceMark rm(THREAD); > ? typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids)); > ? typeArrayHandle ids_ah(THREAD, ta); > > ? typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray)); > ? typeArrayHandle sizeArray_h(THREAD, sa); > > ? // validate the thread id array > ? validate_thread_id_array(ids_ah, CHECK); > > ? // sizeArray must be of the same length as the given array of thread IDs > ? int num_threads = ids_ah->length(); > ? if (num_threads != sizeArray_h->length()) { > ??? THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), > ????????????? "The length of the given long array does not match the > length of " > ????????????? "the given array of thread IDs"); > ? } > > ? ThreadsListHandle tlh; > ? for (int i = 0; i < num_threads; i++) { > ??? JavaThread* java_thread = > tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i)); > ??? if (java_thread != NULL) { > ????? sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes()); > ??? } > ? } > JVM_END > > > Perhaps something like this above the "ThreadsListHandle tlh;" line: > > ? if (num_threads == 1 && THREAD->is_Java_thread()) { > ??? // Only asking for 1 thread so if we're a JavaThread, then > ??? // see if this request is for ourself. > ??? JavaThread* jt = THREAD; > ??? oop tobj = jt->threadObj(); > > ??? if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) { > ????? // Return the info for ourself. > ????? 
sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes()); > ????? return; > ??? } > ? } > > I haven't checked to see if this will even compile, but I > think you'll get the idea. > > Dan > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gary.adams at oracle.com Fri Jul 13 21:36:46 2018 From: gary.adams at oracle.com (gary.adams at oracle.com) Date: Fri, 13 Jul 2018 17:36:46 -0400 Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured In-Reply-To: <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com> References: <5B488D1B.3090808@oracle.com> <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com> Message-ID: <4afe54b7-e493-1a45-4fe4-166e9b112dae@oracle.com> We know that the default jtreg timeout is 2 minutes and typically runs with a timeoutfactor of 4 or 10. So the harness "safety net" is 8 to 20 minutes from jtreg. It does appear that most of the vmTestbase tests use a 5 minute waittime. I have seen waittime used in different ways. The one we saw most recently was waiting for a specific reply that was taking upwords of 7 minutes handling method exclude filtering. e.g. 600K methods on solaris-sparcv9-debug I've seen other tests using waittime as a total test timeout. The jtreg timeout factor has not been applied to the vmTestbase waitime. The tests have been quickly ported so they can run under jtreg harness, but have not been converted to use the all the jtreg features. The purpose of this specific fix is to prevent jtreg from an early termination at 2 minutes or 8 minutes, when the original waittime allows for 5 minutes. Reducing waittime will not speed up the tests. It would probably introduce more intermittent timeout reports. On 7/13/18 4:21 PM, Chris Plummer wrote: > Hi Gary, > > It looks like you have properly added timeout=300 wherever we use > -waittime:5. However, I'm not 100% convinced this is always the right > approach. 
In the bug description you said that -waittime is used as a > timeout for individual operations. However, there could be multiple of > those operations, and they could in sum exceed the 300 second jtreg > timeout you added. > > What is the default for -waittime? I'm also guessing that the initial > application of -waittime was never really tuned to the specific tests > and just cloned across most of them. It seems every test either needs > 5m or the default, which doesn't really make much sense. If 5m was > really needed, we should have seen a lot of failures when ported to > jtreg, but as far as I know the only reason this issue got on your > radar was due to exclude001 needing 7m. Maybe rather than adding > timeout=300? you should change -waitime to 2m, since other than > exclude001, none of the tests seem to need more than 2m. > > Lastly, does timeoutFactor impact -waittime? It seems it should be > applied to it also. I'm not sure if it is. > > thanks, > > Chris > > On 7/13/18 4:29 AM, Gary Adams wrote: >> This is a simple update to set the jtreg timeout to match the >> internal waittime already being used by these vmTestbase/nsk/jdb tests. >> >> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8206013 >> ? Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/ > > > From chris.plummer at oracle.com Fri Jul 13 22:30:08 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 13 Jul 2018 15:30:08 -0700 Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured In-Reply-To: <4afe54b7-e493-1a45-4fe4-166e9b112dae@oracle.com> References: <5B488D1B.3090808@oracle.com> <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com> <4afe54b7-e493-1a45-4fe4-166e9b112dae@oracle.com> Message-ID: Hi Gary, I wasn't suggesting a shorter waittime to speed up the tests. It's just another (of many) timeout related parameters use to detect (what should be very uncommon) timeout failures sooner. 
I guess in that case it does make the test faster in cases where it does timeout. So one question is how much do we care about timeout performance? If not at all (we think the timeout is very rare, if ever), we'd just do a something like a 1h timeout and forget about it. However, historically that is not the approach we have taken. jtreg is given a fairly short timeout of 2m, multiplied to account for platform performance. So while I understand it doesn't make sense to have the waittime be longer than the (adjusted) jtreg timeout (we'd always hit the jtreg timeout first), I don't think that implies we should make the jtreg timeout longer. Maybe we should make the waittime shorter. In any case, with the current timeoutFactor in place, it's already the case that the jtreg timeout is longer than waittime. So I'm not sure why you feel the need to make the jtreg timeout longer, unless the test is hitting the jtreg timeout already. And another thought that just came to me. Timeouts can also serve the purpose of detecting bugs. If the test author decides the test should finish in 1m, and someone bumps the timeout to 10m, that might make a performance bug introduced in the future go unnoticed. In general I don't think we should increase the timeout for tests that are not currently timing out. For ones that are, first see if there is a performance related issue. Chris On 7/13/18 2:36 PM, gary.adams at oracle.com wrote: > We know that the default jtreg timeout is 2 minutes and typically > runs with a timeoutfactor of 4 or 10. So the harness "safety net" > is 8 to 20 minutes from jtreg. > > It does appear that most of the vmTestbase tests use a 5 minute > waittime. I have seen waittime used in different ways. The one we > saw most recently was waiting for a specific reply that was taking > upwords of 7 minutes handling method exclude filtering. e.g. > 600K methods on solaris-sparcv9-debug > > I've seen other tests using waittime as a total test timeout. 
> > The jtreg timeout factor has not been applied to the vmTestbase waitime. > The tests have been quickly ported so they can run under jtreg > harness, but have not been converted to use the all the jtreg features. > > The purpose of this specific fix is to prevent jtreg from an early > termination at 2 minutes or 8 minutes, when the original waittime > allows for 5 minutes. > > Reducing waittime will not speed up the tests. It would probably > introduce > more intermittent timeout reports. > > On 7/13/18 4:21 PM, Chris Plummer wrote: >> Hi Gary, >> >> It looks like you have properly added timeout=300 wherever we use >> -waittime:5. However, I'm not 100% convinced this is always the right >> approach. In the bug description you said that -waittime is used as a >> timeout for individual operations. However, there could be multiple >> of those operations, and they could in sum exceed the 300 second >> jtreg timeout you added. >> >> What is the default for -waittime? I'm also guessing that the initial >> application of -waittime was never really tuned to the specific tests >> and just cloned across most of them. It seems every test either needs >> 5m or the default, which doesn't really make much sense. If 5m was >> really needed, we should have seen a lot of failures when ported to >> jtreg, but as far as I know the only reason this issue got on your >> radar was due to exclude001 needing 7m. Maybe rather than adding >> timeout=300? you should change -waitime to 2m, since other than >> exclude001, none of the tests seem to need more than 2m. >> >> Lastly, does timeoutFactor impact -waittime? It seems it should be >> applied to it also. I'm not sure if it is. >> >> thanks, >> >> Chris >> >> On 7/13/18 4:29 AM, Gary Adams wrote: >>> This is a simple update to set the jtreg timeout to match the >>> internal waittime already being used by these vmTestbase/nsk/jdb tests. >>> >>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8206013 >>> ? 
Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/ >> >> >> > From daniil.x.titov at oracle.com Fri Jul 13 23:34:41 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Fri, 13 Jul 2018 16:34:41 -0700 Subject: RFR 8207261: [Graal] JDI and JDWP tests that consume all memory should be filtered out to not run with Graal Message-ID: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com> Please review the change that filters out 8 JDI and JDWP tests when running with Graal. ?These tests consume all memory ( to force GC or to test that the OutOfMemory error is thrown ) that sporadically results in the exceptions in the Graal compiler threads and failure. Issue: https://bugs.openjdk.java.net/browse/JDK-8207261 Webrev: http://cr.openjdk.java.net/~dtitov/8207261/webrev.01/ Thanks! Best regards, Daniil From chris.plummer at oracle.com Sat Jul 14 00:28:31 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 13 Jul 2018 17:28:31 -0700 Subject: RFR 8207261: [Graal] JDI and JDWP tests that consume all memory should be filtered out to not run with Graal In-Reply-To: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com> References: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com> Message-ID: Looks good. Chris On 7/13/18 4:34 PM, Daniil Titov wrote: > Please review the change that filters out 8 JDI and JDWP tests when running with Graal. ?These tests consume all memory ( to force GC or to test that the OutOfMemory error is thrown ) that sporadically results in the exceptions in the Graal compiler threads and failure. > > Issue: https://bugs.openjdk.java.net/browse/JDK-8207261 > Webrev: http://cr.openjdk.java.net/~dtitov/8207261/webrev.01/ > > Thanks! 
>
> Best regards,
> Daniil
>
>
>

From serguei.spitsyn at oracle.com Sat Jul 14 00:29:46 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Jul 2018 17:29:46 -0700
Subject: RFR 8207261: [Graal] JDI and JDWP tests that consume all memory should be filtered out to not run with Graal
In-Reply-To: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>
References: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>
Message-ID: <3d7e49ef-02f7-3882-7608-39d36f141b2e@oracle.com>

Hi Daniil,

It looks good.

Thanks,
Serguei

On 7/13/18 16:34, Daniil Titov wrote:
> Please review the change that filters out 8 JDI and JDWP tests when running with Graal. These tests consume all memory ( to force GC or to test that the OutOfMemory error is thrown ) that sporadically results in the exceptions in the Graal compiler threads and failure.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8207261
> Webrev: http://cr.openjdk.java.net/~dtitov/8207261/webrev.01/
>
> Thanks!
>
> Best regards,
> Daniil
>
>
>

From kubota.yuji at gmail.com Sat Jul 14 17:56:32 2018
From: kubota.yuji at gmail.com (KUBOTA Yuji)
Date: Sun, 15 Jul 2018 02:56:32 +0900
Subject: RFR:8207048: jhsdb debugd cannot specify a port number
In-Reply-To:
References: <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com>
Message-ID:

Hi David and all,

My goal is for us to be able to set the ports of RMI and the RMI registry through command line options in jhsdb debugd. So I want to create a CSR request for JDK-8207048, which proposes changing the jhsdb command line options.

P.S.: I have never created a CSR request before. I'll need some time to learn how.

Thanks,
Yuji

2018-07-12 10:40 GMT+09:00 KUBOTA Yuji :
> Hi David,
>
> Thank you for comment and updating JBS. I'll create a CSR request
> after getting comments whether this change is welcomed by community.
> > Thanks, > Yuji > > 2018-07-12 10:21 GMT+09:00 David Holmes : >> Hi Yuji, >> >> I can't comment on the actual change proposed in this enhancement request, >> but it will need to have a CSR request created and approved due to the use >> of a new system property. >> >> Thanks, >> David >> >> >> >> >> On 11/07/2018 11:55 PM, KUBOTA Yuji wrote: >>> >>> Hi all, >>> >>> I filed bugzilla for small fix to improvement of `jhsdb debugd` to set >>> a port of UnicastRemoteObject aka >>> sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer by >>> `sun.jvm.hotspot.rmi.debugger.port`. >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8207048 >>> Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/ >>> >>> We can set an RMI registry port of debugd server by >>> `sun.jvm.hotspot.rmi.port`, but can not set a port of RemoteObject. So >>> RemoteObject always uses an anonymous port. For security, we should >>> not open ports widely to use debugd, so I want to fix. >>> >>> Could you review it? >>> >>> Thanks, >>> Yuji >>> >> From gary.adams at oracle.com Mon Jul 16 14:49:16 2018 From: gary.adams at oracle.com (Gary Adams) Date: Mon, 16 Jul 2018 10:49:16 -0400 Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured In-Reply-To: <5B4CA830.2000206@oracle.com> References: <5B4CA830.2000206@oracle.com> Message-ID: <5B4CB06C.20602@oracle.com> I agree that timeouts should be very rare and that a shorter timeout helps during test development. These tests were written a long time ago and the test developers are no longer available for ongoing adjustments. These tests were not designed to be performance regression tests. They typically have a single feature that is being tested. For instance the recent investigations with exclude001 revealed many more methods were being processed now than when the test was originally written. Increasing waittime from 5 to 7 minutes allowed it to run to completion on the slower solaris-sparcv9-debug build. 
I believe we are looking for ways to keep the continuous integration systems building and testing automatically. Intermittent timeout failures should be avoided where possible.

I believe historically we have seen both vmTestbase/nsk waittime timeouts and jtreg timeouts in this collection of tests. Increasing the jtreg timeout should allow the internal waittime timeout to have first shot at reporting a timeout.

> Hi Gary,
>
> I wasn't suggesting a shorter waittime to speed up the tests. It's just
> another (of many) timeout related parameters use to detect (what should
> be very uncommon) timeout failures sooner. I guess in that case it does
> make the test faster in cases where it does timeout.

Agreed.

>
> So one question is how much do we care about timeout performance? If not
> at all (we think the timeout is very rare, if ever), we'd just do a
> something like a 1h timeout and forget about it. However, historically
> that is not the approach we have taken. jtreg is given a fairly short
> timeout of 2m, multiplied to account for platform performance.

When/if these tests are rewritten to be more jtreg centric, the timeout and timeout factor arguments should be updated. I believe performance specific tests should catch regressions. These functional tests should be given adequate time to complete their tasks.

>
> So while I understand it doesn't make sense to have the waittime be
> longer than the (adjusted) jtreg timeout (we'd always hit the jtreg
> timeout first), I don't think that implies we should make the jtreg
> timeout longer. Maybe we should make the waittime shorter. In any case,
> with the current timeoutFactor in place, it's already the case that the
> jtreg timeout is longer than waittime. So I'm not sure why you feel the
> need to make the jtreg timeout longer, unless the test is hitting the
> jtreg timeout already.

The current waittime setting is what the original test developer designated. It corresponds closest to the jtreg timeout setting.
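
To make the arithmetic in this exchange concrete, here is a minimal sketch. It is not taken from jtreg or the vmTestbase sources. It only assumes what the thread states: a 2 minute default jtreg timeout, a timeout factor that multiplies the declared timeout, and a -waittime of 5 minutes that is not scaled. The class and method names are hypothetical.

```java
public class TimeoutSketch {
    // Values quoted in this thread; treat them as assumptions, not jtreg constants.
    static final long JTREG_DEFAULT_SECONDS = 120; // "default jtreg timeout is 2 minutes"
    static final long WAITTIME_SECONDS = 300;      // -waittime:5, not scaled by the factor

    // Harness limit: the declared (or default) timeout multiplied by the timeout factor.
    static long effectiveJtregTimeout(long declaredSeconds, int timeoutFactor) {
        return declaredSeconds * timeoutFactor;
    }

    // True when the test's own waittime expires strictly before the harness limit.
    static boolean waittimeFiresFirst(long declaredSeconds, int timeoutFactor) {
        return WAITTIME_SECONDS < effectiveJtregTimeout(declaredSeconds, timeoutFactor);
    }

    public static void main(String[] args) {
        // Factor 1 models an unscaled run; 4 and 10 are the factors mentioned in the thread.
        for (int factor : new int[] {1, 4, 10}) {
            System.out.printf(
                "factor %d: default gives %d s (waittime first: %b), timeout=300 gives %d s (waittime first: %b)%n",
                factor,
                effectiveJtregTimeout(JTREG_DEFAULT_SECONDS, factor),
                waittimeFiresFirst(JTREG_DEFAULT_SECONDS, factor),
                effectiveJtregTimeout(300, factor),
                waittimeFiresFirst(300, factor));
        }
    }
}
```

Under these assumptions the default timeout only undercuts the 5 minute waittime when no factor is applied (120 s); with a factor of 4 or 10 the harness limit is already 480 s or 1200 s, and timeout=300 raises it to 1200 s or 3000 s.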
> > And another thought that just came to me. Timeouts can also serve the > purpose of detecting bugs. If the test author decides the test should > finish in 1m, and someone bumps the timeout to 10m, that might make a > performance bug introduced in the future go unnoticed. In general I > don't think we should increase the timeout for tests that are not > currently timing out. For ones that are, first see if there is a > performance related issue. I believe the focus should be setting the timeouts that allow these tests to reliably complete on the current supported platforms and build variants. > > Chris > > On 7/13/18 2:36 PM,gary.adams at oracle.com wrote: > >/ We know that the default jtreg timeout is 2 minutes and typically > />/ runs with a timeoutfactor of 4 or 10. So the harness "safety net" > />/ is 8 to 20 minutes from jtreg. > />/ > />/ It does appear that most of the vmTestbase tests use a 5 minute > />/ waittime. I have seen waittime used in different ways. The one we > />/ saw most recently was waiting for a specific reply that was taking > />/ upwords of 7 minutes handling method exclude filtering. e.g. > />/ 600K methods on solaris-sparcv9-debug > />/ > />/ I've seen other tests using waittime as a total test timeout. > />/ > />/ The jtreg timeout factor has not been applied to the vmTestbase waitime. > />/ The tests have been quickly ported so they can run under jtreg > />/ harness, but have not been converted to use the all the jtreg features. > />/ > />/ The purpose of this specific fix is to prevent jtreg from an early > />/ termination at 2 minutes or 8 minutes, when the original waittime > />/ allows for 5 minutes. > />/ > />/ Reducing waittime will not speed up the tests. It would probably > />/ introduce > />/ more intermittent timeout reports. > />/ > />/ On 7/13/18 4:21 PM, Chris Plummer wrote: > />>/ Hi Gary, > />>/ > />>/ It looks like you have properly added timeout=300 wherever we use > />>/ -waittime:5. 
However, I'm not 100% convinced this is always the right > />>/ approach. In the bug description you said that -waittime is used as a > />>/ timeout for individual operations. However, there could be multiple > />>/ of those operations, and they could in sum exceed the 300 second > />>/ jtreg timeout you added. > />>/ > />>/ What is the default for -waittime? I'm also guessing that the initial > />>/ application of -waittime was never really tuned to the specific tests > />>/ and just cloned across most of them. It seems every test either needs > />>/ 5m or the default, which doesn't really make much sense. If 5m was > />>/ really needed, we should have seen a lot of failures when ported to > />>/ jtreg, but as far as I know the only reason this issue got on your > />>/ radar was due to exclude001 needing 7m. Maybe rather than adding > />>/ timeout=300 you should change -waitime to 2m, since other than > />>/ exclude001, none of the tests seem to need more than 2m. > />>/ > />>/ Lastly, does timeoutFactor impact -waittime? It seems it should be > />>/ applied to it also. I'm not sure if it is. > />>/ > />>/ thanks, > />>/ > />>/ Chris > />>/ > />>/ On 7/13/18 4:29 AM, Gary Adams wrote: > />>>/ This is a simple update to set the jtreg timeout to match the > />>>/ internal waittime already being used by these vmTestbase/nsk/jdb tests. > />>>/ > />>>/ Issue:https://bugs.openjdk.java.net/browse/JDK-8206013 > />>>/ Webrev:http://cr.openjdk.java.net/~gadams/8206013/webrev.00/ > />>/ > />>/ > />>/ > />/ > / > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jcbeyler at google.com Mon Jul 16 17:58:36 2018 From: jcbeyler at google.com (JC Beyler) Date: Mon, 16 Jul 2018 10:58:36 -0700 Subject: RFR(S) 8205652: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails Message-ID: Hi all, Small RFR to update a HeapMonitor test that had two issues: a test was wrong and the test was not allocating enough to get to an expected sample count. Instead of allocating 10 times more and hit some OOM on the test framework, the webrev allocates in chunks and gets the number of samples. I ran this 10k times on my machine and it passed. Serguei ran mach5 testing with it and said it looked good. Bug associated is: JDK-8205652 Webrev is here: http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.01/ Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From hohensee at amazon.com Mon Jul 16 18:22:30 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 16 Jul 2018 18:22:30 +0000 Subject: ThreadMXBean::getCurrentThreadAllocatedBytes In-Reply-To: <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com> References: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com> <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com> Message-ID: I believe you could move the code ahead of the call to validate_thread_id_array() because that method just checks for thread ids <= 0. 
diff -r 3ddf41505d54 src/hotspot/share/services/management.cpp
--- a/src/hotspot/share/services/management.cpp  Sun Jun 03 23:33:00 2018 -0700
+++ b/src/hotspot/share/services/management.cpp  Mon Jul 16 10:41:28 2018 -0700
@@ -2084,11 +2083,19 @@
   typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
   typeArrayHandle sizeArray_h(THREAD, sa);
 
+  // Special-case current thread
+  int num_threads = ids_ah->length();
+  JavaThread* java_thread = JavaThread::current();
+  if (num_threads == 1 && sizeArray_h->length() == 1 &&
+      ids_ah->long_at(0) == java_lang_Thread::thread_id(java_thread->threadObj())) {
+    sizeArray_h->long_at_put(0, java_thread->cooked_allocated_bytes());
+    return;
+  }
+
   // validate the thread id array
   validate_thread_id_array(ids_ah, CHECK);
 
   // sizeArray must be of the same length as the given array of thread IDs
-  int num_threads = ids_ah->length();
   if (num_threads != sizeArray_h->length()) {
     THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
               "The length of the given long array does not match the length of "

If performance is good enough, and if you still want to add getCurrentThreadAllocatedBytes() (imo a good idea, since getCurrentThreadCpuTime() and getCurrentThreadUserTime() already exist), you could implement it by "getThreadAllocatedBytes(Thread::currentThread().getId())".

You might also want to add getCurrentThread* methods to com.sun.management where they don't currently exist: then we'd have a complete parallel method set.

Another approach to improving things is to fix the underlying problem with find_JavaThread_from_java_tid(). https://bugs.openjdk.java.net/browse/JDK-8185005 proposes doing that in a different context. We came up with a patch for JDK8 that uses an open addressed hashtable (one where the "bucket chain" is in the index array, see https://en.wikipedia.org/wiki/Hash_table#Open_addressing) to map Java tids to JavaThread*s.
I?ve forward ported it to JDK12, see http://cr.openjdk.java.net/~phh/8185005/webrev.00/. The main disadvantage, of course, is that it?s yet another data structure that takes up memory. It?s really fast though and speeds up our profilers quite a bit. Perhaps we could replace the existing thread list with a variation on this map, since it?s quick to just run through the underlying array when you want to run through the threads. Thanks, Paul From: serviceability-dev on behalf of "Daniel D. Daugherty" Reply-To: "daniel.daugherty at oracle.com" Date: Friday, July 13, 2018 at 1:53 PM To: Markus Gaisbauer , "serviceability-dev at openjdk.java.net" , Erik ?sterlund , Robbin Ehn Subject: Re: ThreadMXBean::getCurrentThreadAllocatedBytes Markus, I filed the following bug for you: JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread https://bugs.openjdk.java.net/browse/JDK-8207266 Dan On 7/13/18 4:46 PM, Daniel D. Daugherty wrote: On 7/13/18 2:44 PM, Daniel D. Daugherty wrote: On 7/13/18 12:35 PM, Markus Gaisbauer wrote: Hello, I am trying to use ThreadMXBean::getThreadAllocatedBytes (com.sun.management) to get the amount of allocated memory of the current thread in some performance critical code. Unfortunately, the current implementation can be rather slow and the duration of each call unpredictable. I ran a test in a JVM with 500 threads. Depending on which thread was queried, getThreadAllocatedBytes took between 100 ns and 2500 ns. The root cause of the problem is ThreadsList::find_JavaThread_from_java_tid which performs a linear scan through all Java threads in the current process. The more threads a JVM has, the slower it gets. In the worst case, the thread with the given TID is found as the last entry in the list. Before Java 10, the oldest thread is the slowest one to query. Since Java 10, the youngest thread is the slowest one to query. I think this was a side effect of introducing "Thread Safe Memory Reclamation (Thread-SMR) support". 
Oldest Thread Youngest Thread Java 8 8740 ns 76 ns Java 10 109 ns 2485 ns It is good to see that longest search is much faster. Erik and Robbin will be pleased since speeding up traversal of the ThreadsList was one of the things that we tried to do during the Thread-SMR project. A first step is get a new bug filed that documents the issue with ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei will take care of that. Dan A common use case is to query the metric for the current thread (e.g. before and after performing some operation). This case can be optimized by introducing a new method: getCurrentThreadAllocatedBytes. I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using the new method I saw the following improvements in my test: Oldest Thread Youngest Thread Proposal 37 ns 37 ns This is a 60x improvement over the worst case of the current API. In the best case of the current API, the new method is still 3 times faster. // based on JVM_SetNativeThreadName in jvm.cpp. JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread)) // We don't use a ThreadsListHandle here because the current thread // must be alive. oop java_thread = JNIHandles::resolve_non_null(currentThread); JavaThread* thr = java_lang_Thread::thread(java_thread); if (thread == thr) { // only supported for the current thread return thr->cooked_allocated_bytes(); } return -1; JVM_END The proposed method also fixes the problem, that getThreadAllocatedBytes itself allocates some memory on the current thread (two long arrays, 24 bytes) and therefore can slightly skew measurements. The new method, getCurrentThreadAllocatedBytes, returns exactly the same value if it is called twice without allocating any memory between those calls. 
I also built a variation of this method that could be used to query allocated memory more efficiently for anyone who already has a java.lang.Thread object: JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj)) // based on code proposed in threadSMR.hpp ThreadsListHandle tlh; JavaThread* thr = NULL; bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL); if (is_alive) { return thr->cooked_allocated_bytes(); } return -1; JVM_END This method took 70 ns in my test, which is 85% slower than GetCurrentThreadAllocatedMemory but still 30% faster than the best case of the current API. I currently have no immediate need for this second method, but I think it would also be a valueable addition to the API. I attached a patch for getCurrentThreadAllocatedBytes. I can create a second patch for also adding getThreadAllocatedMemory(java.lang.Thread) to the API. I am a first time contributor and I am not 100% sure what process I must follow to get a change like this into OpenJDK. Can someone have a look at my proposal and help me through the process? Best regards, Markus I believe this is the code that's causing you grief: open/src/hotspot/share/services/management.cpp: // Gets an array containing the amount of memory allocated on the Java // heap for a set of threads (in bytes). Each element of the array is // the amount of memory allocated for the thread ID specified in the // corresponding entry in the given array of thread IDs; or -1 if the // thread does not exist or has terminated. 
JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids, jlongArray sizeArray)) // Check if threads is null if (ids == NULL || sizeArray == NULL) { THROW(vmSymbols::java_lang_NullPointerException()); } ResourceMark rm(THREAD); typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids)); typeArrayHandle ids_ah(THREAD, ta); typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray)); typeArrayHandle sizeArray_h(THREAD, sa); // validate the thread id array validate_thread_id_array(ids_ah, CHECK); // sizeArray must be of the same length as the given array of thread IDs int num_threads = ids_ah->length(); if (num_threads != sizeArray_h->length()) { THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(), "The length of the given long array does not match the length of " "the given array of thread IDs"); } ThreadsListHandle tlh; for (int i = 0; i < num_threads; i++) { JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i)); if (java_thread != NULL) { sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes()); } } JVM_END Perhaps something like this above the "ThreadsListHandle tlh;" line: if (num_threads == 1 && THREAD->is_Java_thread()) { // Only asking for 1 thread so if we're a JavaThread, then // see if this request is for ourself. JavaThread* jt = THREAD; oop tobj = jt->threadObj(); if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) { // Return the info for ourself. sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes()); return; } } I haven't checked to see if this will even compile, but I think you'll get the idea. Dan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jcbeyler at google.com Mon Jul 16 19:37:18 2018 From: jcbeyler at google.com (JC Beyler) Date: Mon, 16 Jul 2018 12:37:18 -0700 Subject: RFR(S) 8205541: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java fails Message-ID: Hi all, Small RFR to update two HeapMonitor tests to remove test failures when resetting a test data structure and assuming wrongly that the data structure was empty afterwards due to a second thread adding something to it. The fix is to disable sampling then reset the storage before enabling it again. Bug associated is: JDK-8205541 Webrev is here: http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.02/ Thanks all! Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Mon Jul 16 19:39:22 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 16 Jul 2018 15:39:22 -0400 Subject: ThreadMXBean::getCurrentThreadAllocatedBytes In-Reply-To: References: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com> <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com> Message-ID: <300fdd4d-8146-cdac-321e-daaff6f65bc8@oracle.com> The new block needs to be just above the "ThreadsListHandle tlh;" line in order to preserve all of the existing checks... More below... On 7/16/18 2:22 PM, Hohensee, Paul wrote: > > I believe you could move the code ahead of the call to > validate_thread_id_array() because that method just checks for thread > ids <= 0. 
>
> *diff -r 3ddf41505d54 src/hotspot/share/services/management.cpp*
> *--- a/src/hotspot/share/services/management.cpp Sun Jun 03 23:33:00 2018 -0700*
> *+++ b/src/hotspot/share/services/management.cpp Mon Jul 16 10:41:28 2018 -0700*
> @@ -2084,11 +2083,19 @@
> typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
> typeArrayHandle sizeArray_h(THREAD, sa);
> +// Special-case current thread

The next line uses ids_ah, but validate_thread_id_array() has not been called yet so you don't know whether ids_ah is valid yet.

> +int num_threads = ids_ah->length();

The original code that I posted used the existing THREAD variable rather than a call to JavaThread::current() which can be expensive.

> +JavaThread* java_thread = JavaThread::current();
> +if (num_threads == 1 && sizeArray_h->length() == 1 &&

The next line uses ids_ah, but validate_thread_id_array() has not been called yet so you don't know whether ids_ah is valid yet.

> +ids_ah->long_at(0) == java_lang_Thread::thread_id(java_thread->threadObj())) {
> +sizeArray_h->long_at_put(0, java_thread->cooked_allocated_bytes());
> +return;
> +}
> +
> // validate the thread id array
> validate_thread_id_array(ids_ah, CHECK);
> // sizeArray must be of the same length as the given array of thread IDs

It's not safe to move the next line before validate_thread_id_array().

> -int num_threads = ids_ah->length();
> if (num_threads != sizeArray_h->length()) {
> THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
> "The length of the given long array does not match the length of "
>
> If performance is good enough, and if you still want to add getCurrentThreadAllocatedBytes() (imo a good idea, since getCurrentThreadCpuTime() and getCurrentThreadUserTime() already exist), you could implement it by "getThreadAllocatedBytes(Thread::currentThread().getId())". You might also want to add getCurrentThread* methods to com.sun.management where they don't currently exist: then we'd have a complete parallel method set.
>
> Another approach to improving things is to fix the underlying problem with find_JavaThread_from_java_tid(). https://bugs.openjdk.java.net/browse/JDK-8185005 proposes doing that in a different context. We came up with a patch for JDK8 that uses an open addressed hashtable (one where the "bucket chain" is in the index array, see https://en.wikipedia.org/wiki/Hash_table#Open_addressing) to map Java tids to JavaThread*s. I've forward ported it to JDK12, see http://cr.openjdk.java.net/~phh/8185005/webrev.00/. The main disadvantage, of course, is that it's yet another data structure that takes up memory. It's really fast though and speeds up our profilers quite a bit. Perhaps we could replace the existing thread list with a variation on this map, since it's quick to just run through the underlying array when you want to run through the threads.

Hmmm... That bug got closed as will-not-fix. I'm not sure why the triage team decided that.

Dan

> Thanks,
>
> Paul
>
> *From: *serviceability-dev on behalf of "Daniel D. Daugherty"
> *Reply-To: *"daniel.daugherty at oracle.com"
> *Date: *Friday, July 13, 2018 at 1:53 PM
> *To: *Markus Gaisbauer, "serviceability-dev at openjdk.java.net", Erik Österlund, Robbin Ehn
> *Subject: *Re: ThreadMXBean::getCurrentThreadAllocatedBytes
>
> Markus,
>
> I filed the following bug for you:
>
>     JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
>     https://bugs.openjdk.java.net/browse/JDK-8207266
>
> Dan
>
> On 7/13/18 4:46 PM, Daniel D. Daugherty wrote:
>
> On 7/13/18 2:44 PM, Daniel D.
Daugherty wrote:
>
> On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>
> Hello,
>
> I am trying to use ThreadMXBean::getThreadAllocatedBytes (com.sun.management) to get the amount of allocated memory of the current thread in some performance critical code.
>
> Unfortunately, the current implementation can be rather slow and the duration of each call unpredictable. I ran a test in a JVM with 500 threads. Depending on which thread was queried, getThreadAllocatedBytes took between 100 ns and 2500 ns.
>
> The root cause of the problem is ThreadsList::find_JavaThread_from_java_tid which performs a linear scan through all Java threads in the current process. The more threads a JVM has, the slower it gets. In the worst case, the thread with the given TID is found as the last entry in the list.
>
> Before Java 10, the oldest thread is the slowest one to query.
>
> Since Java 10, the youngest thread is the slowest one to query. I think this was a side effect of introducing "Thread Safe Memory Reclamation (Thread-SMR) support".
>
>              Oldest Thread    Youngest Thread
> Java 8             8740 ns              76 ns
> Java 10             109 ns            2485 ns
>
> It is good to see that the longest search is much faster. Erik and Robbin will be pleased since speeding up traversal of the ThreadsList was one of the things that we tried to do during the Thread-SMR project.
>
> A first step is to get a new bug filed that documents the issue with ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei will take care of that.
>
> Dan
>
> A common use case is to query the metric for the current thread (e.g. before and after performing some operation). This case can be optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>
> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using the new method I saw the following improvements in my test:
>
>              Oldest Thread    Youngest Thread
> Proposal             37 ns              37 ns
>
> This is a 60x improvement over the worst case of the current API. In the best case of the current API, the new method is still 3 times faster.
>
> // based on JVM_SetNativeThreadName in jvm.cpp.
> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread))
>   // We don't use a ThreadsListHandle here because the current thread
>   // must be alive.
>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>   if (thread == thr) {
>     // only supported for the current thread
>     return thr->cooked_allocated_bytes();
>   }
>   return -1;
> JVM_END
>
> The proposed method also fixes the problem that getThreadAllocatedBytes itself allocates some memory on the current thread (two long arrays, 24 bytes) and therefore can slightly skew measurements. The new method, getCurrentThreadAllocatedBytes, returns exactly the same value if it is called twice without allocating any memory between those calls.
>
> I also built a variation of this method that could be used to query allocated memory more efficiently for anyone who already has a java.lang.Thread object:
>
> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj))
>   // based on code proposed in threadSMR.hpp
>   ThreadsListHandle tlh;
>   JavaThread* thr = NULL;
>   bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
>   if (is_alive) {
>     return thr->cooked_allocated_bytes();
>   }
>   return -1;
> JVM_END
>
> This method took 70 ns in my test, which is 85% slower than GetCurrentThreadAllocatedMemory but still 30% faster than the best case of the current API. I currently have no immediate need for this second method, but I think it would also be a valuable addition to the API.
>
> I attached a patch for getCurrentThreadAllocatedBytes. I can create a second patch for also adding getThreadAllocatedMemory(java.lang.Thread) to the API.
>
> I am a first time contributor and I am not 100% sure what process I must follow to get a change like this into OpenJDK. Can someone have a look at my proposal and help me through the process?
>
> Best regards,
>
> Markus
>
> I believe this is the code that's causing you grief:
>
> open/src/hotspot/share/services/management.cpp:
>
> // Gets an array containing the amount of memory allocated on the Java
> // heap for a set of threads (in bytes). Each element of the array is
> // the amount of memory allocated for the thread ID specified in the
> // corresponding entry in the given array of thread IDs; or -1 if the
> // thread does not exist or has terminated.
> JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids,
>                                              jlongArray sizeArray))
>   // Check if threads is null
>   if (ids == NULL || sizeArray == NULL) {
>     THROW(vmSymbols::java_lang_NullPointerException());
>   }
>
>   ResourceMark rm(THREAD);
>   typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
>   typeArrayHandle ids_ah(THREAD, ta);
>
>   typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
>   typeArrayHandle sizeArray_h(THREAD, sa);
>
>   // validate the thread id array
>   validate_thread_id_array(ids_ah, CHECK);
>
>   // sizeArray must be of the same length as the given array of thread IDs
>   int num_threads = ids_ah->length();
>   if (num_threads != sizeArray_h->length()) {
>     THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
>               "The length of the given long array does not match the length of "
>               "the given array of thread IDs");
>   }
>
>   ThreadsListHandle tlh;
>   for (int i = 0; i < num_threads; i++) {
>     JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
>     if (java_thread != NULL) {
>       sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes());
>     }
>   }
> JVM_END
>
> Perhaps something like this above the "ThreadsListHandle tlh;" line:
>
>   if (num_threads == 1 && THREAD->is_Java_thread()) {
>     // Only asking for 1 thread so if we're a JavaThread, then
>     // see if this request is for ourself.
>     JavaThread* jt = THREAD;
>     oop tobj = jt->threadObj();
>
>     if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
>       // Return the info for ourself.
>       sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
>       return;
>     }
>   }
>
> I haven't checked to see if this will even compile, but I think you'll get the idea.
>
> Dan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From markus.gaisbauer at gmail.com Mon Jul 16 19:42:20 2018
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Mon, 16 Jul 2018 21:42:20 +0200
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: 
References: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com> <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
Message-ID: 

Hi,

Thank you for all the help.

I added the code suggested by Paul and ran my small microbenchmark. The optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the current thread. All calls that aren't for the current thread will be a bit slower. As far as I am concerned, adding this optimization probably doesn't hurt. In fact it would be awesome if this could be backported to Java 8.

But I am still strongly in favor of adding a special method just for the current thread:
* It is still faster (35 ns vs 57 ns)
* No heap memory is allocated by this method
* If this method is available, callers can always expect good (constant time) performance. On the other hand two versions would exist for getThreadAllocatedBytes.
Library code would have to figure out somehow if this particular JVM has the slow or fast version. In my own use case, I would only want to get the metric if it is extremely fast. It's not worth the overhead, if it takes thousands of nanoseconds. I wasn't aware of the existence of JavaThread::current() before. My new method in management.cpp could be simplified to this: JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env)) return JavaThread::current()->cooked_allocated_bytes(); JVM_END This code is not only simpler but also a bit faster. Each call now takes 35 ns instead of 37 ns. Is there a technical reason why no new native methods should be added to ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic number 0 to indicate the current thread. Wouldn't it be cleaner to have two methods instead of using and checking a special number in Java/native code? Best regards, Markus -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus.gaisbauer at gmail.com Mon Jul 16 19:49:40 2018 From: markus.gaisbauer at gmail.com (Markus Gaisbauer) Date: Mon, 16 Jul 2018 21:49:40 +0200 Subject: ThreadMXBean::getCurrentThreadAllocatedBytes In-Reply-To: References: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com> <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com> Message-ID: I saw Daniels comment too late, but using THREAD instead of JavaThread::current() indeed makes the new method again a bit simpler and faster (now 33 ns instead of 35 ns). JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env)) return THREAD->cooked_allocated_bytes(); JVM_END On Mon, Jul 16, 2018 at 9:42 PM Markus Gaisbauer wrote: > Hi, > > Thank you for all the help. > > ?I added the code suggested by Paul and ran my small microbenchmark. The > optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the > current thread. All calls that aren't for the current thread will be a bit > slower. 
As far as I am concerned, adding this optimization probably doesn't > hurt. In fact it would be awesome if this could be backported to Java 8. > > But I am still strongly in favor of adding a special method just for the > current thread: > * It is still faster (35 ns vs 57 ns) > * No heap memory is allocated by this method > * If this method is available, callers can always expect good (constant > time) performance. On the other hand two versions would exist for getThreadAllocatedBytes. > Library code would have to figure out somehow if this particular JVM has > the slow or fast version. In my own use case, I would only want to get the > metric if it is extremely fast. It's not worth the overhead, if it takes > thousands of nanoseconds. > > I wasn't aware of the existence of JavaThread::current() before. My new > method in management.cpp could be simplified to this: > > JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env)) > return JavaThread::current()->cooked_allocated_bytes(); > JVM_END > > This code is not only simpler but also a bit faster. Each call now takes > 35 ns instead of 37 ns. > > Is there a technical reason why no new native methods should be added to > ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic > number 0 to indicate the current thread. Wouldn't it be cleaner to have > two methods instead of using and checking a special number in Java/native > code? > > Best regards, > Markus > -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Jul 16 19:58:38 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 16 Jul 2018 12:58:38 -0700 Subject: RFR(S) 8205652: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: 
From hohensee at amazon.com Mon Jul 16 20:28:46 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 16 Jul 2018 20:28:46 +0000
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: 
References: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com> <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
Message-ID: <9E5419DB-7F46-4AC8-809B-6E9DCE18825B@amazon.com>

Given your requirements, and that my proposal is slower, yours is better. :)

There's no technical reason why we can't add what you're asking for (and everyone who's weighed in so far is in favor), but what do people think of adding the rest of the missing getCurrentThread* methods? These would be

getCurrentThreadInfo()
getCurrentThreadInfo(boolean lockedMonitors, boolean lockedSynchronizers)
getCurrentThreadInfo(int maxDepth)

Shall we add these to java.lang.management or com.sun.management? Since we're doing major releases every 6 months, I'd say j.l.m, but it doesn't matter to me one way or the other. Imo, getCurrentThreadAllocatedMemory() should go in com.sun.management because that's where the *AllocatedMemory* methods are.

For getCurrentThreadAllocatedMemory(), you should add checks for isThreadAllocatedMemorySupported() and isThreadAllocatedMemoryEnabled().

Paul

From: Markus Gaisbauer
Date: Monday, July 16, 2018 at 12:50 PM
To: "Hohensee, Paul"
Cc: "daniel.daugherty at oracle.com", "serviceability-dev at openjdk.java.net", "erik.osterlund at oracle.com", "robbin.ehn at oracle.com"
Subject: Re: ThreadMXBean::getCurrentThreadAllocatedBytes

I saw Daniel's comment too late, but using THREAD instead of JavaThread::current() indeed makes the new method again a bit simpler and faster (now 33 ns instead of 35 ns).

JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
  return THREAD->cooked_allocated_bytes();
JVM_END

On Mon, Jul 16, 2018 at 9:42 PM Markus Gaisbauer wrote:

Hi,

Thank you for all the help.

I added the code suggested by Paul and ran my small microbenchmark. The optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the current thread. All calls that aren't for the current thread will be a bit slower. As far as I am concerned, adding this optimization probably doesn't hurt. In fact it would be awesome if this could be backported to Java 8.

But I am still strongly in favor of adding a special method just for the current thread:
* It is still faster (35 ns vs 57 ns)
* No heap memory is allocated by this method
* If this method is available, callers can always expect good (constant time) performance. On the other hand two versions would exist for getThreadAllocatedBytes. Library code would have to figure out somehow if this particular JVM has the slow or fast version. In my own use case, I would only want to get the metric if it is extremely fast. It's not worth the overhead, if it takes thousands of nanoseconds.

I wasn't aware of the existence of JavaThread::current() before. My new method in management.cpp could be simplified to this:

JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
  return JavaThread::current()->cooked_allocated_bytes();
JVM_END

This code is not only simpler but also a bit faster. Each call now takes 35 ns instead of 37 ns.

Is there a technical reason why no new native methods should be added to ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic number 0 to indicate the current thread. Wouldn't it be cleaner to have two methods instead of using and checking a special number in Java/native code?

Best regards,
Markus
-------------- next part --------------
An HTML attachment was scrubbed...
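For reference, the "before and after an operation" measurement discussed in this thread can be sketched on the Java side with the existing com.sun.management.ThreadMXBean calls, including the supported/enabled checks Paul mentions. This is only an illustration, not the patch under review: the class CurrentThreadAllocation and its two methods are hypothetical names, and the sketch goes through the existing ID-based getThreadAllocatedBytes(long), so it still pays the lookup cost described above.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper, for illustration only (not part of any JDK API).
final class CurrentThreadAllocation {

    // Returns the bytes allocated so far by the calling thread, or -1 if the
    // JVM does not support the measurement or has it disabled.
    static long currentThreadAllocatedBytes() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean sunBean = (com.sun.management.ThreadMXBean) bean;
        if (!sunBean.isThreadAllocatedMemorySupported()
                || !sunBean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        // Existing API: looks the thread up by its ID.
        return sunBean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    // Runs the given work on the calling thread and returns how many bytes
    // that thread allocated meanwhile, or -1 if the metric is unavailable.
    // The query itself may allocate a little, so small deltas are approximate.
    static long allocatedDuring(Runnable work) {
        long before = currentThreadAllocatedBytes();
        work.run();
        long after = currentThreadAllocatedBytes();
        return (before < 0 || after < 0) ? -1 : after - before;
    }
}
```

A caller would wrap the operation of interest, e.g. CurrentThreadAllocation.allocatedDuring(() -> parse(input)), where parse is whatever is being measured, and treat a result of -1 as "metric not available".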
URL: From serguei.spitsyn at oracle.com Mon Jul 16 23:06:40 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 16 Jul 2018 16:06:40 -0700 Subject: RFR(S) 8205541: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java fails In-Reply-To: References: Message-ID: <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com> An HTML attachment was scrubbed... URL: From jcbeyler at google.com Mon Jul 16 23:07:27 2018 From: jcbeyler at google.com (JC Beyler) Date: Mon, 16 Jul 2018 16:07:27 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> Message-ID: Hi all, The CSR has recently been approved, could someone else review the spec update webrev: http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ The associated bug is here: https://bugs.openjdk.java.net/browse/JDK-8205725 The associated CSR is here: https://bugs.openjdk.java.net/browse/JDK-8206940 Thanks all! Jc On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > It looks good to me (including the CSR that I'had already reviewed). > Thank you for preparing a fix for this issue so quickly! > > Thanks, > Serguei > > > On 7/12/18 13:45, JC Beyler wrote: > > Hi all, > > Could I get a review of an update to the JVMTI Spec for Heap Sampling: > http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ > > The assoicated bug is here: > https://bugs.openjdk.java.net/browse/JDK-8205725 > The associated CSR is here: > https://bugs.openjdk.java.net/browse/JDK-8206940 > > The basic reasoning of this webrev/bug/CSR is: > - rate is not the right word and should be renamed to interval, this is > what provokes the change in the code/tests/API naming. 
> - the spec does not mention that the new sampling interval will take time > to be taken into account (you have to wait for a TLAB to be refilled); this > adds that precision so that the user is not surprised > - the spec explicitly says that the sampling is done via a geometric > variable which averages to the sampling interval; it was asked to relax > this and the spec should just say that the sampling is pseudo-random and > the interval will average out to what the user requested. > > Thanks for all your help, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Jul 16 23:10:29 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 16 Jul 2018 16:10:29 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Jul 17 01:17:54 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 16 Jul 2018 18:17:54 -0700 Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured In-Reply-To: <5B4CB06C.20602@oracle.com> References: <5B4CA830.2000206@oracle.com> <5B4CB06C.20602@oracle.com> Message-ID: <97c9dfb8-c91e-6412-fb02-412f67a09ff8@oracle.com> An HTML attachment was scrubbed... 
URL: 
From harsha.wardhana.b at oracle.com Tue Jul 17 06:23:50 2018
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Tue, 17 Jul 2018 11:53:50 +0530
Subject: RFR : JDK-8170299 - Debugger does not stop inside the low memory notifications code
Message-ID: <0d8b21bb-10f3-8e80-d579-d890cb046d16@oracle.com>

Hi All,

Please review the fix for the bug,
JDK-8170299 - Debugger does not stop inside the low memory notifications code
webrev at, http://cr.openjdk.java.net/~hb/8170299/webrev.00/

Description of the fix:

The debugger does not stop inside the listeners registered for notification from
1. com.sun.management.GarbageCollectorMXBean
2. sun.management.MemoryImpl (MemoryMXBean)
3. com.sun.management.DiagnosticCommandMBean

The listeners registered for the above MBeans are invoked by 'ServiceThread', which is a hidden thread and is not visible to the debugger. This issue was already worked on before and below is the review thread for the same.

http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021782.html
http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-December/022611.html

With the current fix, all the user registered callbacks for the above MBeans are executed in a newly created SingleThreadExecutor. The above file is also re-factored to use CopyOnWriteArrayList for managing the listeners.

The fix has been tested in Mach5 by running all the tests under open/:jdk_management and closed/:jdk_management. The tests under open/test/jdk/java/lang/management/MemoryMXBean cover the above code changes. I can add more tests in the subsequent reviews if the need arises.

Please review the above change and let me know your comments.

Thanks
Harsha
-------------- next part --------------
An HTML attachment was scrubbed...
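The approach described in the message above (user callbacks run on a dedicated single-thread executor instead of the hidden ServiceThread, with listeners kept in a CopyOnWriteArrayList) might look roughly like the following sketch. This is not the code in the webrev: ListenerDispatcher, its methods, and the thread name "notification-dispatcher" are hypothetical and only illustrate the general pattern.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical illustration of the pattern; not taken from the webrev.
final class ListenerDispatcher implements AutoCloseable {

    // Iteration works on a snapshot, so listeners can be added or removed
    // while a notification is being delivered.
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    // One ordinary (debugger-visible) thread delivers all notifications in order.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "notification-dispatcher");
        t.setDaemon(true);
        return t;
    });

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    void removeListener(Consumer<String> listener) {
        listeners.remove(listener);
    }

    // Hands the message to the executor; the caller (in the real case the
    // internal service thread) never runs user code itself.
    CompletableFuture<Void> publish(String message) {
        return CompletableFuture.runAsync(() -> {
            for (Consumer<String> listener : listeners) {
                listener.accept(message);
            }
        }, executor);
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

Because the listener body runs on a normal Java thread in this arrangement, a breakpoint set inside it can be hit by a debugger, which is the behaviour the fix is after.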
URL: From daniil.x.titov at oracle.com Tue Jul 17 08:20:26 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 17 Jul 2018 01:20:26 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Message-ID: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> Please review the change that fix the JDI test when running with Graal. The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled. Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/ Thanks! --Daniil From bob.vandette at oracle.com Tue Jul 17 14:00:08 2018 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 17 Jul 2018 10:00:08 -0400 Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems without cpuset.effective_cpus / cpuset.effective_mem Message-ID: Please review this fix which eliminates some docker/cgroup test failures when running on older Linux kernels with missing cgroup metric files. BUGS: https://bugs.openjdk.java.net/browse/JDK-8206456 WEBREV: http://cr.openjdk.java.net/~bobv/8206456/webrev/ This fix has been verified by the reporter of the issue. Bob. 
From ralf.schmelter at sap.com Tue Jul 17 14:08:53 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Tue, 17 Jul 2018 14:08:53 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To: <21e17c666ac04930a0e4bb4869e989da@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com>
Message-ID: 

Hi all,

here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/

I've converted the shell-based test to a Java one, which had the nice side effect of speeding it up (it now takes ~1 second to run). The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Previously I reused the count variable, but this can lead to misunderstandings.

Best regards,
Ralf

-----Original Message-----
From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf
Sent: Monday, 9 July 2018 16:05
To: Chris Plummer; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Chris,

thanks for the review.

> What testing have you done?

I've tested the change by debugging by hand in eclipse and jdb, running the com/sun/jdi jtreg tests and the jdwp jck tests. Analogous code has been running in the SAP JVM for many years.

> How long does this test take to run.

15 s according to jtreg.

> What happens if for some reason SOE is never thrown? It's not clear to
> me what the script would do in this case.

It is treated as passed (which is not ideal).

> In answer to the ShellScaffold.sh question, there is already work
> underway to convert to pure java tests. See JDK-8201652.

Ok, then I think it is better to convert the test to a Java TestScaffold test.
I will update the webrev when this is done.

Best regards,
Ralf

-----Original Message-----
From: Chris Plummer [mailto:chris.plummer at oracle.com]
Sent: Friday, 6 July 2018 00:37
To: Schmelter, Ralf; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Ralf,

Overall looks good, but I do have a few comments and questions.

Please update the copyright.

What testing have you done?

How long does this test take to run?

What happens if for some reason SOE is never thrown? It's not clear to me what the script would do in this case.

In answer to the ShellScaffold.sh question, there is already work underway to convert to pure java tests. See JDK-8201652. I'm not certain if it is ok for you to just submit this new shell script, or if it should be rewritten in pure java. Most of the work to convert the scripts has already been done but was put on hold. Maybe Serguei can comment and guide you on how it would be done in java.

thanks,

Chris

On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
> Hi All,
>
> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>
> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>
> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
> > Best regards, > Ralf Schmelter From matthias.baesken at sap.com Tue Jul 17 14:13:08 2018 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 17 Jul 2018 14:13:08 +0000 Subject: 8206456 - [TESTBUG] docker jtreg tests fail on systems without cpuset.effective_cpus / cpuset.effective_mem In-Reply-To: References: Message-ID: Hi Bob, looks good (I am not a Reviewer however) ! The reported issues occured on a SUSE Linux 12 SP1 system , where /sys/fs/cgroup/cpuset/cpuset.effective_cpus and /sys/fs/cgroup/cpuset/cpuset.effective_mems are not present . I applied Bobs patch , now the jdk/internal/platform/docker - jtreg tests do not fail any more on the mentioned system . Thanks, Matthias > -----Original Message----- > From: Bob Vandette [mailto:bob.vandette at oracle.com] > Sent: Dienstag, 17. Juli 2018 16:00 > To: serviceability-dev at openjdk.java.net serviceability- > dev at openjdk.java.net ; core-libs- > dev > Cc: Baesken, Matthias ; Schmidt, Lutz > > Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems without > cpuset.effective_cpus / cpuset.effective_mem > > Please review this fix which eliminates some docker/cgroup test failures > when running on older > Linux kernels with missing cgroup metric files. > > BUGS: > https://bugs.openjdk.java.net/browse/JDK-8206456 > > WEBREV: > http://cr.openjdk.java.net/~bobv/8206456/webrev/ > > This fix has been verified by the reporter of the issue. > > Bob. > > > From gary.adams at oracle.com Tue Jul 17 15:33:54 2018 From: gary.adams at oracle.com (Gary Adams) Date: Tue, 17 Jul 2018 11:33:54 -0400 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner Message-ID: <5B4E0C62.3020808@oracle.com> A race condition exists between the debugger and the debuggee. The first test thread is started with SUSPEND_NONE policy set. 
While processing the thread start event the debugger captures an initial set of thread suspend counts and resumes the debuggee vm. If the debuggee advances quickly it reaches the breakpoint set for methodForCommunication. Since the breakpoint carries with it SUSPEND_ALL policy, when the debugger captures a second set of suspend counts, it will not match the expected counts for a SUSPEND_NONE scenario. The proposed fix introduces a yield in the debuggee test thread run method to allow the debugger to get the expected sampled values. Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: ... 186 private void setCommunicationBreakpoint(ReferenceType refType, String methodName) { 187 Method method = debuggee.methodByName(refType, methodName); 188 Location location = null; 189 try { 190 location = method.allLineLocations().get(0); 191 } catch (AbsentInformationException e) { 192 throw new Failure(e); 193 } 194 bpRequest = debuggee.makeBreakpoint(location); 195 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); 197 bpRequest.putProperty("number", "zero"); 198 bpRequest.enable(); 199 200 eventHandler.addListener( 201 new EventHandler.EventListener() { 202 public boolean eventReceived(Event event) { 203 if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) { 204 synchronized(eventHandler) { 205 display("Received communication breakpoint event."); 206 bpCount++; 207 eventHandler.notifyAll(); 208 } 209 return true; 210 } 211 return false; 212 } 213 } 214 ); 215 } test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: ... 
140 display("......--> vm.suspend();"); 141 vm.suspend(); 142 143 display(" getting : Map suspendsCounts1"); 144 145 Map suspendsCounts1 = new HashMap(); 146 for (ThreadReference threadReference : vm.allThreads()) { 147 suspendsCounts1.put(threadReference.name(), threadReference.suspendCount()); 148 } 149 display(suspendsCounts1.toString()); 150 151 display(" eventSet.resume;"); 152 eventSet.resume(); 153 154 display(" getting : Map suspendsCounts2"); This is where the breakpoint is encountered before the second set of suspend counts is acquired. 155 Map suspendsCounts2 = new HashMap(); 156 for (ThreadReference threadReference : vm.allThreads()) { 157 suspendsCounts2.put(threadReference.name(), threadReference.suspendCount()); 158 } 159 display(suspendsCounts2.toString()); From serguei.spitsyn at oracle.com Tue Jul 17 20:34:33 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 13:34:33 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Tue Jul 17 21:29:04 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 17 Jul 2018 14:29:04 -0700 Subject: RFR(S) 8205652: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails In-Reply-To: References: Message-ID: <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com> +1 --alex On 07/16/2018 12:58, serguei.spitsyn at oracle.com wrote: > Hi Jc, > > It looks good to me. > > Thanks, > Serguei > > On 7/16/18 10:58, JC Beyler wrote: >> Hi all, >> >> Small RFR to update a HeapMonitor test that had two issues: a test was >> wrong and the test was not allocating enough to get to an expected >> sample count. 
Instead of allocating 10 times more and hit some OOM on >> the test framework, the webrev allocates in chunks and gets the number >> of samples. >> >> I ran this 10k times on my machine and it passed. Serguei ran mach5 >> testing with it and said it looked good. >> >> Bug associated is: JDK-8205652 >> >> Webrev is here: >> http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.01/ >> >> >> Thanks, >> Jc > From alexey.menkov at oracle.com Tue Jul 17 21:48:26 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 17 Jul 2018 14:48:26 -0700 Subject: RFR(S) 8205541: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java fails In-Reply-To: <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com> References: <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com> Message-ID: <11d9384f-3b8b-013d-eb0f-e41736ff584b@oracle.com> Looks good to me as well --alex On 07/16/2018 16:06, serguei.spitsyn at oracle.com wrote: > Hi Jc, > > It looks good to me. > > Thanks, > Serguei > > > On 7/16/18 12:37, JC Beyler wrote: >> Hi all, >> >> Small RFR to update two HeapMonitor tests to remove test failures when >> resetting a test data structure and assuming wrongly that the data >> structure was empty afterwards due to a second thread adding something >> to it. >> The fix is to disable sampling then reset the storage before enabling >> it again. >> >> Bug associated is: JDK-8205541 >> >> Webrev is here: >> http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.02/ >> >> >> Thanks all! 
>> Jc > From daniil.x.titov at oracle.com Tue Jul 17 21:55:35 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 17 Jul 2018 14:55:35 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> Message-ID: <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> Hi Serguei, The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times calling testSourceFilter() method passing there different parameters. The testSourceFilter() method does the following: 1. creates a ClassPrepareRequest object 2. registers new ClassPrepareEventListener 3. sends a command to debuggee to a load test class 4. waits till the debuggee performed the command 5. removes ClassPrepareEventListener 6. checks if a ClassPrepareEvent was received Upon its start the EventHandler creates a default list of events listeners. The last listener in this list handles unexpected events (that are events not processed by the previous listeners) cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java /** 251 * This method sets up default listeners. 252 */ 253 private void createDefaultListeners() { 254 /** 255 * This listener catches up all unexpected events. 256 * 257 */ 258 addListener( 259 new EventListener() { 260 public boolean eventReceived(Event event) { 261 log.complain("EventHandler> Unexpected event: " + event.getClass().getName()); 262 unexpectedEventCaught = true; 263 return true; 264 } 265 } 266 ); 267 On step 2 above the ClassPrepareEventListener is added at the head of the list of the listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method. With Graal turned on after step 1 the JVMTI compiler thread starts sending class prepare events for classes it compiles. 
If any of such event is dispatched after step 5 (when ClassPrepareEventListener is removed) there is no any registered listeners to handle it and this event is handled by the "unexpected events listener" (see above) that marks the test as failed. That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after ClassPrepareEventListener is unregistered inside testSourceFilter() method. Please see below the new webrev with the changes you suggested. Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/ Thanks! Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 1:34 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Not sure, I fully understand the fix. So, let's start from some questions. Why the DefaultClassPrepareEventListener is needed? Is it not enough to filter out the other threads in the ClassPrepareEventListener.eventReceived() method ? 243 eventHandler.startListening(); 244 // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads. 245 // The listener should be added after the event listener is started to ensure that it called before 246 // the default event listener that handles unexpected events. 247 eventHandler.addListener(new DefaultClassPrepareEventListener()); It is still not clear why the default listener is added after the listening is started but not before. If the default listener is really needed then could you, please, split the lines above and L129, L160 to make a little bit shorter? I'd also suggest to replace "class prepared events" at L244 ? with "ClassPrepare event" or "class prepare event". There is also an unneeded space in the "( e.g. compiler)". 
Thanks, Serguei On 7/17/18 01:20, Daniil Titov wrote: Please review the change that fix the JDI test when running with Graal. The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled. Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/ Thanks! --Daniil From jcbeyler at google.com Tue Jul 17 22:12:06 2018 From: jcbeyler at google.com (JC Beyler) Date: Tue, 17 Jul 2018 15:12:06 -0700 Subject: RFR(S) 8205541: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java fails In-Reply-To: <11d9384f-3b8b-013d-eb0f-e41736ff584b@oracle.com> References: <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com> <11d9384f-3b8b-013d-eb0f-e41736ff584b@oracle.com> Message-ID: Hi Alex, Thanks for the review! Here is now the new webrev, ready for a push: http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.03/ Thanks all, Jc On Tue, Jul 17, 2018 at 2:48 PM Alex Menkov wrote: > Looks good to me as well > > --alex > > On 07/16/2018 16:06, serguei.spitsyn at oracle.com wrote: > > Hi Jc, > > > > It looks good to me. > > > > Thanks, > > Serguei > > > > > > On 7/16/18 12:37, JC Beyler wrote: > >> Hi all, > >> > >> Small RFR to update two HeapMonitor tests to remove test failures when > >> resetting a test data structure and assuming wrongly that the data > >> structure was empty afterwards due to a second thread adding something > >> to it. > >> The fix is to disable sampling then reset the storage before enabling > >> it again. > >> > >> Bug associated is: JDK-8205541 > >> > >> Webrev is here: > >> http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.02/ > >> > >> > >> Thanks all! 
> >> Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Tue Jul 17 22:38:50 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 17 Jul 2018 15:38:50 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> Message-ID: <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> The changes look good to me. --alex On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote: > Hi all, > > We need at least one more review before pushing it. > > Thanks, > Serguei > > > On 7/16/18 16:07, JC Beyler wrote: >> Hi all, >> >> The CSR has recently been approved, could someone else review the spec >> update webrev: >> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ >> >> >> The associated bug is here: >> https://bugs.openjdk.java.net/browse/JDK-8205725 >> The associated CSR is here: >> https://bugs.openjdk.java.net/browse/JDK-8206940 >> >> Thanks all! >> Jc >> >> >> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com >> > > wrote: >> >> Hi Jc, >> >> It looks good to me (including the CSR that I had already reviewed). >> Thank you for preparing a fix for this issue so quickly! >> >> Thanks, >> Serguei >> >> >> On 7/12/18 13:45, JC Beyler wrote: >>> Hi all, >>> >>> Could I get a review of an update to the JVMTI Spec for Heap >>> Sampling: >>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ >>> >>> >>> The associated bug is here: >>> https://bugs.openjdk.java.net/browse/JDK-8205725 >>> The associated CSR is here: >>> https://bugs.openjdk.java.net/browse/JDK-8206940 >>> >>> The basic reasoning of this webrev/bug/CSR is: >>> - rate is not the right word and should be renamed to interval, >>> this is what provokes the change in the code/tests/API naming.
>>> - the spec does not mention that the new sampling interval will >>> take time to be taken into account (you have to wait for a TLAB >>> to be refilled); this adds that precision so that the user is not >>> surprised >>> - the spec explicitly says that the sampling is done via a >>> geometric variable which averages to the sampling interval; it >>> was asked to relax this and the spec should just say that the >>> sampling is pseudo-random and the interval will average out to >>> what the user requested. >>> >>> Thanks for all your help, >>> Jc >> >> >> >> -- >> >> Thanks, >> Jc > From serguei.spitsyn at oracle.com Tue Jul 17 23:30:10 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 16:30:10 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> Message-ID: <2494a96a-c68f-3a22-a429-e49384f864bd@oracle.com> Thanks a lot for reviews, Alex! Serguei On 7/17/18 15:38, Alex Menkov wrote: > The changes look good to me. > > --alex > > On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote: >> Hi all, >> >> We need at least one more review before pushing it. >> >> Thanks, >> Serguei >> >> >> On 7/16/18 16:07, JC Beyler wrote: >>> Hi all, >>> >>> The CSR has recently been approved, could someone else review the >>> spec update webrev: >>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ >>> >>> >>> The associated bug is here: >>> https://bugs.openjdk.java.net/browse/JDK-8205725 >>> The associated CSR is here: >>> https://bugs.openjdk.java.net/browse/JDK-8206940 >>> >>> Thanks all! >>> Jc >>> >>> >>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com >>> >> > wrote: >>> >>> Hi Jc, >>> >>> It looks good to me (including the CSR that I had already >>> reviewed). >>>
Thank you for preparing a fix for this issue so quickly! >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/12/18 13:45, JC Beyler wrote: >>>> Hi all, >>>> >>>> Could I get a review of an update to the JVMTI Spec for Heap >>>> Sampling: >>>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ >>>> >>>> >>>> The associated bug is here: >>>> https://bugs.openjdk.java.net/browse/JDK-8205725 >>>> The associated CSR is here: >>>> https://bugs.openjdk.java.net/browse/JDK-8206940 >>>> >>>> The basic reasoning of this webrev/bug/CSR is: >>>> - rate is not the right word and should be renamed to interval, >>>> this is what provokes the change in the code/tests/API naming. >>>> - the spec does not mention that the new sampling interval will >>>> take time to be taken into account (you have to wait for a TLAB >>>> to be refilled); this adds that precision so that the user is not >>>> surprised >>>> - the spec explicitly says that the sampling is done via a >>>> geometric variable which averages to the sampling interval; it >>>> was asked to relax this and the spec should just say that the >>>> sampling is pseudo-random and the interval will average out to >>>> what the user requested. >>>> >>>> Thanks for all your help, >>>> Jc >>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc >> From serguei.spitsyn at oracle.com Tue Jul 17 23:53:53 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 16:53:53 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> Message-ID: An HTML attachment was scrubbed...
URL: From mandy.chung at oracle.com Wed Jul 18 00:07:03 2018 From: mandy.chung at oracle.com (mandy chung) Date: Tue, 17 Jul 2018 17:07:03 -0700 Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems without cpuset.effective_cpus / cpuset.effective_mem In-Reply-To: References: Message-ID: <598c9af3-2041-0be5-f177-b3a031b2ef61@oracle.com> On 7/17/18 7:00 AM, Bob Vandette wrote: > Please review this fix which eliminates some docker/cgroup test failures when running on older > Linux kernels with missing cgroup metric files. > > BUGS: > https://bugs.openjdk.java.net/browse/JDK-8206456 > > WEBREV: > http://cr.openjdk.java.net/~bobv/8206456/webrev/ Nit: It would be clearer to check for the specific metrics: int[] cpusets = metrics.getEffectiveCpuSetCpus(); if (cpusets.length != 0) { .... } Same applies to getEffectiveCpuSetMems. No need for a new webrev. Mandy P.S. I am not sure the conversion from the primitive to boxed type is necessary. But this is not related to this issue. You may want to take a look at that. From daniil.x.titov at oracle.com Wed Jul 18 00:25:49 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 17 Jul 2018 17:25:49 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> Message-ID: Hi Serguei, We could combine both listeners in one but in this case this listener should be DefaultClassPrepareEventListener that is registered only once at the very beginning of the whole test. We will also need to add a method to reset eventReceived counter between invocations of testSourceFilter() since every call of testSourceFilter() is a separate subtest. Just wanted to make sure that I correctly understood your proposal. 
addListener() is invoked after startListening() just due to specifics of EventHandler implementation. EventHandler.addListener() adds a listener to the head of the list, so the last added listener is the first one to be called. And default listeners (including "unhandled events" one) are created when EventHandler.startListening() method is called. So to ensure that our listener is called before the "unhandled events" we have to call addListener() after startListening() method. cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java 222 /** 223 * This is normally called in the main thread of the test debugger. 224 * It starts up an EventHandler thread that gets events coming in 225 * from the debuggee and distributes them to listeners. 226 */ 227 public void startListening() { 228 createDefaultEventRequests(); 229 createDefaultListeners(); 230 listenThread.start(); 231 } 232 250 /** 251 * This method sets up default listeners. 252 */ 253 private void createDefaultListeners() { 254 /** 255 * This listener catches up all unexpected events. 256 * 257 */ 258 addListener( 259 new EventListener() { 260 public boolean eventReceived(Event event) { 261 log.complain("EventHandler> Unexpected event: " + event.getClass().getName()); 262 unexpectedEventCaught = true; 263 return true; 264 } 265 } 266 ); 267 /** 350 * Add at beginning of the list because we want 351 * the LAST added listener to be FIRST to process 352 * current event. 353 */ 354 public void addListener(EventListener listener) { 355 display("Adding listener " + listener); 356 synchronized(listeners) { 357 listeners.add(0, listener); 358 } 359 } Thanks! 
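A minimal, self-contained sketch of the ordering argument above, for illustration only. It does not use the nsk.share.jdi.EventHandler class: the names ListenerOrderSketch, Dispatcher, Listener and run() are hypothetical, and events are modelled as plain strings. It assumes only what is described above: addListener() inserts at the head of the list, startListening() registers a catch-all listener, and dispatching stops at the first listener that returns true.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the listener chain described above; not the nsk test library.
public class ListenerOrderSketch {

    interface Listener {
        // Returns true when the event is consumed and must not reach later listeners.
        boolean eventReceived(String event);
    }

    static class Dispatcher {
        private final List<Listener> listeners = new ArrayList<>();
        boolean unexpectedEventCaught = false;

        // Mirrors the described behaviour: the default catch-all listener
        // is created when listening starts.
        void startListening() {
            addListener(event -> {
                unexpectedEventCaught = true;
                return true;
            });
        }

        // The last added listener goes to the head, so it is the first to be called.
        void addListener(Listener listener) {
            listeners.add(0, listener);
        }

        void dispatch(String event) {
            for (Listener listener : listeners) {
                if (listener.eventReceived(event)) {
                    return;
                }
            }
        }
    }

    // Returns whether a "ClassPrepare" event ended up in the catch-all listener.
    static boolean run(boolean addAfterStart) {
        Dispatcher dispatcher = new Dispatcher();
        Listener classPrepareListener = event -> event.equals("ClassPrepare");
        if (!addAfterStart) {
            dispatcher.addListener(classPrepareListener);
        }
        dispatcher.startListening();
        if (addAfterStart) {
            dispatcher.addListener(classPrepareListener);
        }
        dispatcher.dispatch("ClassPrepare");
        return dispatcher.unexpectedEventCaught;
    }

    public static void main(String[] args) {
        System.out.println("added after startListening, flagged unexpected: " + run(true));
        System.out.println("added before startListening, flagged unexpected: " + run(false));
    }
}
```

In this sketch the event is flagged as unexpected only when the specific listener is registered before startListening(), because the catch-all listener then sits ahead of it in the list.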
Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 4:53 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Thank you for the clarification and the webrev update! I still have a couple of questions though. I'd suggest a simpler approach, changing the code below: 154 public boolean eventReceived(Event event) { 155 if (event instanceof ClassPrepareEvent) { 156 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event; 157 ThreadReference thread = classPrepareEvent.thread(); 158 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) { 159 eventReceived++; 160 161 log.display("ClassPrepareEventListener: Event received: " + event + 162 " Class: " + classPrepareEvent.referenceType().name()); 163 164 vm.resume(); 165 166 return true; 167 } 168 } 169 170 return false; 171 } to something like: public boolean eventReceived(Event event) { if (event instanceof ClassPrepareEvent) { ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event; ThreadReference thread = classPrepareEvent.thread(); if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) { eventReceived++; log.display("ClassPrepareEventListener: Event received: " + event + " Class: " + classPrepareEvent.referenceType().name()); } else { log.display("ClassPrepareEventListener: Event filtered out: " + event + " Class: " + classPrepareEvent.referenceType().name() + " Thread:" + classPrepareEvent.thread().name()); } vm.resume(); return true; } return false; } 245 eventHandler.startListening(); 246 // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads. 247 // The listener should be added after the event listener is started to ensure that it 248 // called before the default event listener that handles unexpected events.
249 eventHandler.addListener(new DefaultClassPrepareEventListener()); Still unclear why addListener() is invoked after startListening() but not before. It may be that the place where this listener is added is not right and it has to be moved into testSourceFilter(). But I hope this fragment is not needed with the simplified approach. Otherwise, it looks good. Thanks, Serguei On 7/17/18 14:55, Daniil Titov wrote: Hi Serguei, The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times calling testSourceFilter() method passing there different parameters. The testSourceFilter() method does the following: 1. creates a ClassPrepareRequest object 2. registers new ClassPrepareEventListener 3. sends a command to the debuggee to load a test class 4. waits till the debuggee performed the command 5. removes ClassPrepareEventListener 6. checks if a ClassPrepareEvent was received Upon its start the EventHandler creates a default list of events listeners. The last listener in this list handles unexpected events (that are events not processed by the previous listeners) cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java 250 /** 251 * This method sets up default listeners. 252 */ 253 private void createDefaultListeners() { 254 /** 255 * This listener catches up all unexpected events. 256 * 257 */ 258 addListener( 259 new EventListener() { 260 public boolean eventReceived(Event event) { 261 log.complain("EventHandler> Unexpected event: " + event.getClass().getName()); 262 unexpectedEventCaught = true; 263 return true; 264 } 265 } 266 ); 267 On step 2 above the ClassPrepareEventListener is added at the head of the list of the listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method. With Graal turned on after step 1 the JVMTI compiler thread starts sending class prepare events for classes it compiles.
If any of such event is dispatched after step 5 (when ClassPrepareEventListener is removed) there is no any registered listeners to handle it and this event is handled by the "unexpected events listener" (see above) that marks the test as failed. That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after ClassPrepareEventListener is unregistered inside testSourceFilter() method. Please see below the new webrev with the changes you suggested. Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/ Thanks! Best regards, Daniil From: mailto:serguei.spitsyn at oracle.com mailto:serguei.spitsyn at oracle.com Date: Tuesday, July 17, 2018 at 1:34 PM To: Daniil Titov mailto:daniil.x.titov at oracle.com, mailto:serviceability-dev at openjdk.java.netserviceability-dev@openjdk.java.net mailto:serviceability-dev at openjdk.java.net Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Not sure, I fully understand the fix. So, let's start from some questions. Why the DefaultClassPrepareEventListener is needed? Is it not enough to filter out the other threads in the ClassPrepareEventListener.eventReceived() method ? 243 eventHandler.startListening(); 244 // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads. 245 // The listener should be added after the event listener is started to ensure that it called before 246 // the default event listener that handles unexpected events. 247 eventHandler.addListener(new DefaultClassPrepareEventListener()); It is still not clear why the default listener is added after the listening is started but not before. If the default listener is really needed then could you, please, split the lines above and L129, L160 to make a little bit shorter? I'd also suggest to replace "class prepared events" at L244 ? with "ClassPrepare event" or "class prepare event". 
There is also an unneeded space in the "( e.g. compiler)". Thanks, Serguei On 7/17/18 01:20, Daniil Titov wrote: Please review the change that fix the JDI test when running with Graal. The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled. Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/ Thanks! --Daniil From jcbeyler at google.com Wed Jul 18 00:54:26 2018 From: jcbeyler at google.com (JC Beyler) Date: Tue, 17 Jul 2018 17:54:26 -0700 Subject: RFR(S) 8205652: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails In-Reply-To: <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com> References: <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com> Message-ID: Hi all, Here is the webrev: http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.02/ Thanks for the reviews and future push! Jc On Tue, Jul 17, 2018 at 2:29 PM Alex Menkov wrote: > +1 > > --alex > > On 07/16/2018 12:58, serguei.spitsyn at oracle.com wrote: > > Hi Jc, > > > > It looks good to me. > > > > Thanks, > > Serguei > > > > On 7/16/18 10:58, JC Beyler wrote: > >> Hi all, > >> > >> Small RFR to update a HeapMonitor test that had two issues: a test was > >> wrong and the test was not allocating enough to get to an expected > >> sample count. Instead of allocating 10 times more and hit some OOM on > >> the test framework, the webrev allocates in chunks and gets the number > >> of samples. > >> > >> I ran this 10k times on my machine and it passed. Serguei ran mach5 > >> testing with it and said it looked good. 
> >> > >> Bug associated is: JDK-8205652 > >> > >> Webrev is here: > >> http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.01/ > >> > >> > >> Thanks, > >> Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Jul 18 01:41:48 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 18:41:48 -0700 Subject: RFR(S) 8205652: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails In-Reply-To: References: <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Jul 18 01:47:20 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 18:47:20 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> Message-ID: Hi Jc, Are you waiting for more reviewers? Otherwise, could you send me a patch for push please? Thanks, Serguei On 7/17/18 15:38, Alex Menkov wrote: > The changes look good to me. > > --alex > > On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote: >> Hi all, >> >> We need at least one more review before pushing it. >> >> Thanks, >> Serguei >> >> >> On 7/16/18 16:07, JC Beyler wrote: >>> Hi all, >>> >>> The CSR has recently been approved, could someone else review the >>> spec update webrev: >>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ >>> >>> >>> The associated bug is here: >>> https://bugs.openjdk.java.net/browse/JDK-8205725 >>> The associated CSR is here: >>> https://bugs.openjdk.java.net/browse/JDK-8206940 >>> >>> Thanks all! >>> Jc >>> >>> >>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com >>> >> > wrote: >>> >>> Hi Jc, >>> >>>
It looks good to me (including the CSR that I had already >>> reviewed). >>> Thank you for preparing a fix for this issue so quickly! >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/12/18 13:45, JC Beyler wrote: >>>> Hi all, >>>> >>>> Could I get a review of an update to the JVMTI Spec for Heap >>>> Sampling: >>>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ >>>> >>>> >>>> The associated bug is here: >>>> https://bugs.openjdk.java.net/browse/JDK-8205725 >>>> The associated CSR is here: >>>> https://bugs.openjdk.java.net/browse/JDK-8206940 >>>> >>>> The basic reasoning of this webrev/bug/CSR is: >>>> - rate is not the right word and should be renamed to interval, >>>> this is what provokes the change in the code/tests/API naming. >>>> - the spec does not mention that the new sampling interval will >>>> take time to be taken into account (you have to wait for a TLAB >>>> to be refilled); this adds that precision so that the user is not >>>> surprised >>>> - the spec explicitly says that the sampling is done via a >>>> geometric variable which averages to the sampling interval; it >>>> was asked to relax this and the spec should just say that the >>>> sampling is pseudo-random and the interval will average out to >>>> what the user requested. >>>> >>>> Thanks for all your help, >>>> Jc >>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc >> From daniil.x.titov at oracle.com Wed Jul 18 02:06:37 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 17 Jul 2018 19:06:37 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> Message-ID: <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> Hi Serguei, Please review a new version of the patch.
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

Thanks!

Best regards,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov, "serviceability-dev at openjdk.java.net"
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

Thank you for the clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach, changing the code below:

 154         public boolean eventReceived(Event event) {
 155             if (event instanceof ClassPrepareEvent) {
 156                 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
 157                 ThreadReference thread = classPrepareEvent.thread();
 158                 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
 159                     eventReceived++;
 160
 161                     log.display("ClassPrepareEventListener: Event received: " + event +
 162                             " Class: " + classPrepareEvent.referenceType().name());
 163
 164                     vm.resume();
 165
 166                     return true;
 167                 }
 168             }
 169
 170             return false;
 171         }

to something like:

          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread:" + classPrepareEvent.thread().name());
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }

 245         eventHandler.startListening();
 246         // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
 247         // The listener should be added after the event listener is started to ensure that it
 248         // called before the default event listener that handles unexpected events.
 249         eventHandler.addListener(new DefaultClassPrepareEventListener());

It is still unclear why addListener() is invoked after startListening() rather than before.
It may be that the place where this listener is added is not right and it has to be moved into testSourceFilter().
But I hope this fragment is not needed with the simplified approach.

Otherwise, it looks good.

Thanks,
Serguei

On 7/17/18 14:55, Daniil Titov wrote:

Hi Serguei,

The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times, calling the testSourceFilter() method with different parameters. The testSourceFilter() method does the following:

      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received

Upon its start the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the previous listeners).

cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java

            /**
   251       * This method sets up default listeners.
   252       */
   253      private void createDefaultListeners() {
   254          /**
   255           * This listener catches up all unexpected events.
   256           *
   257           */
   258          addListener(
   259                  new EventListener() {
   260                      public boolean eventReceived(Event event) {
   261                          log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                          unexpectedEventCaught = true;
   263                          return true;
   264                      }
   265                  }
   266          );
   267

In step 2 above the ClassPrepareEventListener is added at the head of the list of listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method.

With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when the ClassPrepareEventListener is removed) there is no registered listener left to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.

That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after the ClassPrepareEventListener is unregistered inside the testSourceFilter() method.

Please see below the new webrev with the changes you suggested.

Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/

Thanks!

Best regards,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov, "serviceability-dev at openjdk.java.net"
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

I'm not sure I fully understand the fix, so let's start with some questions.

Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the ClassPrepareEventListener.eventReceived() method?

 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());

  It is still not clear why the default listener is added
  after the listening is started rather than before.
  If the default listener is really needed then could you, please,
  split the lines above and L129, L160 to make them a little bit shorter?
  I'd also suggest replacing "class prepared events" at L244 with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in the "( e.g. compiler)".

Thanks,
Serguei

On 7/17/18 01:20, Daniil Titov wrote:

Please review the change that fixes the JDI test when running with Graal. The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/

Thanks!
--Daniil
-------------- next part --------------
An HTML attachment was scrubbed...
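
The listener ordering described in this thread can be sketched in plain Java. This is only a minimal model under stated assumptions, not the real nsk.share.jdi.EventHandler: the class ListenerChainSketch, its Listener interface, the use of String events and the event names are all hypothetical. It assumes, as the thread describes, that new listeners go to the head of the list, that a listener returning true stops further dispatch, and that the catch-all "unexpected event" listener sits at the tail.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Hypothetical stand-in for the test harness; not the real nsk.share.jdi.EventHandler. */
public class ListenerChainSketch {

    /** A listener returns true to consume the event and stop further dispatch. */
    interface Listener {
        boolean eventReceived(String event);
    }

    private final LinkedList<Listener> listeners = new LinkedList<>();
    private final List<String> unexpected = new ArrayList<>();

    ListenerChainSketch() {
        // The catch-all is registered first, so it ends up last in the list.
        addListener(event -> {
            unexpected.add(event);
            return true;
        });
    }

    /** New listeners go to the head of the list, ahead of older ones (assumption from the thread). */
    void addListener(Listener l) {
        listeners.addFirst(l);
    }

    void removeListener(Listener l) {
        listeners.remove(l);
    }

    /** Offers the event to each listener in order until one consumes it. */
    void dispatch(String event) {
        for (Listener l : new ArrayList<>(listeners)) {
            if (l.eventReceived(event)) {
                return;
            }
        }
    }

    /** Replays steps 2 and 5 of the test and returns the events that reached the catch-all. */
    public static List<String> run(boolean withDefaultClassPrepareListener) {
        ListenerChainSketch handler = new ListenerChainSketch();
        if (withDefaultClassPrepareListener) {
            // Long-lived listener that swallows late class prepare events.
            handler.addListener(e -> e.startsWith("ClassPrepare"));
        }
        Listener perTest = e -> e.startsWith("ClassPrepare");
        handler.addListener(perTest);                    // step 2: register
        handler.dispatch("ClassPrepare:TestClass");      // expected event, consumed by perTest
        handler.removeListener(perTest);                 // step 5: remove
        handler.dispatch("ClassPrepare:CompilerClass");  // late event from another thread
        return handler.unexpected;
    }

    public static void main(String[] args) {
        System.out.println("without default listener: " + run(false));
        System.out.println("with default listener:    " + run(true));
    }
}
```

In this model the late event reaches the catch-all only when no long-lived class prepare listener is registered, which mirrors the failure mode described for the test.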
URL: From serguei.spitsyn at oracle.com Wed Jul 18 03:00:02 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 20:00:02 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> Message-ID: <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com> An HTML attachment was scrubbed... URL: From jcbeyler at google.com Wed Jul 18 03:05:08 2018 From: jcbeyler at google.com (JC Beyler) Date: Tue, 17 Jul 2018 20:05:08 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> Message-ID: Hi Serguei, No I was waiting for the other patches to be pushed (thank you for doing it). Now that it is done, I prepared this one that should be clean of conflicts for you :-) Here it is: http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.04/ Thanks! Jc On Tue, Jul 17, 2018 at 6:47 PM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > Are you waiting for more reviewers? > Otherwise, could you send me a patch for push please? > > Thanks, > Serguei > > > On 7/17/18 15:38, Alex Menkov wrote: > > The changes look good to me. > > > > --alex > > > > On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote: > >> Hi all, > >> > >> We need at least one more review before pushing it. 
> >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/16/18 16:07, JC Beyler wrote: > >>> Hi all, > >>> > >>> The CSR has recently been approved, could someone else review the > >>> spec update webrev: > >>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ > >>> > >>> > >>> The associated bug is here: > >>> https://bugs.openjdk.java.net/browse/JDK-8205725 > >>> The associated CSR is here: > >>> https://bugs.openjdk.java.net/browse/JDK-8206940 > >>> > >>> Thanks all! > >>> Jc > >>> > >>> > >>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com > >>> >>> > wrote: > >>> > >>> Hi Jc, > >>> > >>> It looks good to me (including the CSR that I'had already > >>> reviewed). > >>> Thank you for preparing a fix for this issue so quickly! > >>> > >>> Thanks, > >>> Serguei > >>> > >>> > >>> On 7/12/18 13:45, JC Beyler wrote: > >>>> Hi all, > >>>> > >>>> Could I get a review of an update to the JVMTI Spec for Heap > >>>> Sampling: > >>>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ > >>>> > >>>> > >>>> The assoicated bug is here: > >>>> https://bugs.openjdk.java.net/browse/JDK-8205725 > >>>> The associated CSR is here: > >>>> https://bugs.openjdk.java.net/browse/JDK-8206940 > >>>> > >>>> The basic reasoning of this webrev/bug/CSR is: > >>>> - rate is not the right word and should be renamed to interval, > >>>> this is what provokes the change in the code/tests/API naming. > >>>> - the spec does not mention that the new sampling interval will > >>>> take time to be taken into account (you have to wait for a TLAB > >>>> to be refilled); this adds that precision so that the user is not > >>>> surprised > >>>> - the spec explicitly says that the sampling is done via a > >>>> geometric variable which averages to the sampling interval; it > >>>> was asked to relax this and the spec should just say that the > >>>> sampling is pseudo-random and the interval will average out to > >>>> what the user requested. 
> >>>> > >>>> Thanks for all your help, > >>>> Jc > >>> > >>> > >>> > >>> -- > >>> > >>> Thanks, > >>> Jc > >> > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Jul 18 03:31:41 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 20:31:41 -0700 Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling In-Reply-To: References: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com> <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com> Message-ID: <506173c6-722d-b4b5-6505-b2166b073282@oracle.com> An HTML attachment was scrubbed... URL: From daniil.x.titov at oracle.com Wed Jul 18 03:32:10 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 17 Jul 2018 20:32:10 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com> References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com> Message-ID: Hi Serguei, The changes are in the one test class vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java so they affect only this single test. No other tests depend on this class. Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 7:59 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, It looks good to me. Thank you for the update. How many tests are depending on this class? 
Could we say that all the nsk/jdi/ClassPrepareRequest tests need to be checked that there are no regressions? Thanks, Serguei On 7/17/18 19:06, Daniil Titov wrote: Hi Serguei, Please review a new version of the patch. Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03 Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 Thanks! Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 4:53 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Thank you for clarification and the webrev update! I still have a couple of questions though. I'd suggest more simple approach like below: 154 public boolean eventReceived(Event event) { 155 if (event instanceof ClassPrepareEvent) { 156 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event; 157 ThreadReference thread = classPrepareEvent.thread(); 158 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) { 159 eventReceived++; 160 161 log.display("ClassPrepareEventListener: Event received: " + event + 162 " Class: " + classPrepareEvent.referenceType().name()); 163 164 vm.resume(); 165 166 return true; 167 } 168 } 169 170 return false; 171 } to something like: public boolean eventReceived(Event event) { if (event instanceof ClassPrepareEvent) { ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event; ThreadReference thread = classPrepareEvent.thread(); if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) { eventReceived++; log.display("ClassPrepareEventListener: Event received: " + event + " Class: " + classPrepareEvent.referenceType().name()); } else { log.display("ClassPrepareEventListener: Event filtered out: " + event + " Class: " + classPrepareEvent.referenceType().name() + " Thread:" + classPrepareEvent.thread().name()); } 
vm.resume(); return true; } return false; } 245 eventHandler.startListening(); 246 // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads. 247 // The listener should be added after the event listener is started to ensure that it 248 // called before the default event listener that handles unexpected events. 249 eventHandler.addListener(new DefaultClassPrepareEventListener()); Still unclear why addListener() is invoked after startListening() but not before. It can be that a place add this listener is not right and have to be moved into testSourceFilter(). But I hope this fragment is not needed with the simplified approach. Otherwise, it looks good. Thanks, Serguei On 7/17/18 14:55, Daniil Titov wrote: Hi Serguei, The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times calling testSourceFilter() method passing there different parameters. The testSourceFilter() method does the following: 1. creates a ClassPrepareRequest object 2. registers new ClassPrepareEventListener 3. sends a command to debuggee to a load test class 4. waits till the debuggee performed the command 5. removes ClassPrepareEventListener 6. checks if a ClassPrepareEvent was received Upon its start the EventHandler creates a default list of events listeners. The last listener in this list handles unexpected events (that are events not processed by the previous listeners) cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java /** 251 * This method sets up default listeners. 252 */ 253 private void createDefaultListeners() { 254 /** 255 * This listener catches up all unexpected events. 
256 * 257 */ 258 addListener( 259 new EventListener() { 260 public boolean eventReceived(Event event) { 261 log.complain("EventHandler> Unexpected event: " + event.getClass().getName()); 262 unexpectedEventCaught = true; 263 return true; 264 } 265 } 266 ); 267 On step 2 above the ClassPrepareEventListener is added at the head of the list of the listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method. With Graal turned on after step 1 the JVMTI compiler thread starts sending class prepare events for classes it compiles. If any of such event is dispatched after step 5 (when ClassPrepareEventListener is removed) there is no any registered listeners to handle it and this event is handled by the "unexpected events listener" (see above) that marks the test as failed. That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after ClassPrepareEventListener is unregistered inside testSourceFilter() method. Please see below the new webrev with the changes you suggested. Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/ Thanks! Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 1:34 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Not sure, I fully understand the fix. So, let's start from some questions. Why the DefaultClassPrepareEventListener is needed? Is it not enough to filter out the other threads in the ClassPrepareEventListener.eventReceived() method ? 243 eventHandler.startListening(); 244 // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads. 
245 // The listener should be added after the event listener is started to ensure that it called before 246 // the default event listener that handles unexpected events. 247 eventHandler.addListener(new DefaultClassPrepareEventListener()); It is still not clear why the default listener is added after the listening is started but not before. If the default listener is really needed then could you, please, split the lines above and L129, L160 to make a little bit shorter? I'd also suggest to replace "class prepared events" at L244 with "ClassPrepare event" or "class prepare event". There is also an unneeded space in the "( e.g. compiler)". Thanks, Serguei On 7/17/18 01:20, Daniil Titov wrote: Please review the change that fix the JDI test when running with Graal. The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled. Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/ Thanks! --Daniil -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Jul 18 03:36:50 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 17 Jul 2018 20:36:50 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com> Message-ID: An HTML attachment was scrubbed... 
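
The simplified filtering approach discussed in this thread (consume every class prepare event, but count only those from the debuggee main thread) can be modelled without JDI. The sketch below is a hedged illustration only: ThreadFilterSketch, PrepareEvent and the thread names are hypothetical stand-ins for the JDI types. It also assumes that an event may arrive without a thread (the thread's own null check suggests as much), so the "filtered out" branch avoids dereferencing the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the filtering idea; plain Java types stand in for the JDI event classes. */
public class ThreadFilterSketch {

    static final String DEBUGGEE_MAIN_THREAD = "main";

    /** Stand-in for a class prepare event; threadName is null when no thread is reported. */
    public static final class PrepareEvent {
        final String className;
        final String threadName;

        public PrepareEvent(String className, String threadName) {
            this.className = className;
            this.threadName = threadName;
        }
    }

    private int eventReceived = 0;
    private final List<String> log = new ArrayList<>();

    /** Consumes every class prepare event, but counts only those from the main thread. */
    public boolean onClassPrepare(PrepareEvent event) {
        if (DEBUGGEE_MAIN_THREAD.equals(event.threadName)) {
            eventReceived++;
            log.add("Event received: " + event.className);
        } else {
            // String concatenation tolerates a null threadName, so nothing is dereferenced here.
            log.add("Event filtered out: " + event.className + " Thread: " + event.threadName);
        }
        return true; // consumed either way, so no later listener sees it
    }

    /** Feeds the events through a fresh listener and returns how many were counted. */
    public static int countMainThreadEvents(List<PrepareEvent> events) {
        ThreadFilterSketch listener = new ThreadFilterSketch();
        for (PrepareEvent e : events) {
            listener.onClassPrepare(e);
        }
        return listener.eventReceived;
    }

    public static void main(String[] args) {
        List<PrepareEvent> events = Arrays.asList(
                new PrepareEvent("TestClass", "main"),
                new PrepareEvent("CompilerHelper", "compiler-thread"), // hypothetical thread name
                new PrepareEvent("SystemClass", null));
        System.out.println("counted: " + countMainThreadEvents(events));
    }
}
```

Returning true for filtered events keeps them away from any later catch-all listener, which is the point of the simplification; whether a separate default listener is still needed once the per-test listener is removed is a separate question that this sketch does not settle.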
URL:

From chris.plummer at oracle.com Wed Jul 18 05:01:50 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Jul 2018 22:01:50 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To:
References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com>
Message-ID: <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>

Hi Ralf,

A few comments below, but overall looks good:

  27  * @summary get stack trace for large stacks took too long.

How about "Test that getting the stack trace for a very large stack does not take too long".

The max number of frames you'll test for is 100M, but the stack size is set to 4m, assuming -Xss works (and I think on some platforms it may not). 100M frames seems like overkill for a 4M stack. If the stack was nothing more than a frame link pointer on a 32-bit system, you'd only have 1M frames, but let's be more realistic than that and say you should never have more than 256k frames. Lowering the max number of frames will prevent this test from taking a very long time on platforms where -Xss has failed.

  65                 // Have some frames be removed before we call again.

Should this be: "Pop some frames so there is room on the stack for the println()"

  96         bpe = resumeTo("Frames2Targ", "callEnded", "()V");

What happens if we never get to callEnded()?

thanks,

Chris

On 7/17/18 7:08 AM, Schmelter, Ralf wrote:
> Hi all,
>
> here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/
>
> I've converted the shell based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime).
>
> The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Previously I reused the count variable, but this can lead to misunderstandings.
>
> Best regards,
> Ralf
>
>
> -----Original Message-----
> From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf
> Sent: Monday, 9 July 2018 16:05
> To: Chris Plummer ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
> Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Chris,
>
> thanks for the review.
>
>> What testing have you done?
> I've tested the change by debugging by hand in eclipse and jdb, running the com/sun/jdi jtreg tests and the jdwp jck tests. Analogous code has been running in the SAP JVM for many years.
>
>
>> How long does this test take to run.
> 15 s according to jtreg.
>
>
>> What happens if for some reason SOE is never thrown? It's not clear to
>> me what the script would do in this case.
> It is treated as passed (which is not ideal).
>
>
>> In answer to the ShellScaffold.sh question, there is already work
>> underway to convert to pure java tests. See JDK-8201652.
> Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done.
>
> Best regards,
> Ralf
>
>
>
> -----Original Message-----
> From: Chris Plummer [mailto:chris.plummer at oracle.com]
> Sent: Friday, 6 July 2018 00:37
> To: Schmelter, Ralf ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> Overall looks good, but I do have a few comments and questions.
>
> Please update the copyright.
>
> What testing have you done?
>
> How long does this test take to run.
>
> What happens if for some reason SOE is never thrown? It's not clear to
> me what the script would do in this case.
> In answer to the ShellScaffold.sh question, there is already work
> underway to convert to pure java tests. See JDK-8201652. I'm not certain
> if it is ok for you to just submit this new shell script, or if it should
> be rewritten in pure java. Most of the work to convert the scripts has
> already been done but was put on hold. Maybe Serguei can comment and
> guide you on how it would be done in java.
>
> thanks,
>
> Chris
>
> On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
>> Hi All,
>>
>> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>>
>> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>>
>> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>>
>> Best regards,
>> Ralf Schmelter
>

From chris.plummer at oracle.com Wed Jul 18 06:38:59 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Jul 2018 23:38:59 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B4E0C62.3020808@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
Message-ID: <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com>

Hi Gary,

I've been having trouble following the control flow of this test. One thing I've stumbled across is the following:

            /* A debuggee class must define 'methodForCommunication'
             * method and invoke it in points of synchronization
             * with a debugger.
             */
            setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");

So why isn't this mode of synchronization good enough? Is it because it was not designed with the understanding that the debugger might be doing suspended thread counts, and suspending all threads at the breakpoint messes up the test?
From what I can tell of the test, after the debuggee is started and hits the default breakpoint at the start of main(), the debugger then does a vm.resume() at the start of the for loop in the runTest() method. The debuggee then creates a thread and calls methodForCommunication(). There is already a breakpoint set there by the above debuggee code. It's unclear to me what happens as a result of this breakpoint and how it serves the test. Also unclear to me who is responsible for the vm.resume() after the breakpoint is hit. The debugger then requests all ThreadStart events, requesting that no threads be disabled when it is sent. I think you are saying that when the ThreadStart event comes in, sometimes we are at the methodForCommunication breakpoint, with all threads disabled, and this messes up the thread suspend counts. You want to delay 100ms so the breakpoint event can be processed and threads resumed again (although I can't see who actually resumes the thread after hitting the methodForCommunication breakpoint). Chris On 7/17/18 8:33 AM, Gary Adams wrote: > A race condition exists between the debugger and the debuggee. > > The first test thread is started with SUSPEND_NONE policy set. > While processing the thread start event the debugger captures > an initial set of thread suspend counts and resumes the > debuggee vm. If the debuggee advances quickly it reaches > the breakpoint set for methodForCommunication. Since the breakpoint > carries with it SUSPEND_ALL policy, when the debugger captures a second > set of suspend counts, it will not match the expected counts for > a SUSPEND_NONE scenario. > > The proposed fix introduces a yield in the debuggee test thread run > method > to allow the debugger to get the expected sampled values. > > ? Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 > ? Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ > > > test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: > ... > ?? 186??????? 
private void setCommunicationBreakpoint(ReferenceType > refType, String methodName) { > ?? 187??????????? Method method = debuggee.methodByName(refType, > methodName); > ?? 188??????????? Location location = null; > ?? 189??????????? try { > ?? 190??????????????? location = method.allLineLocations().get(0); > ?? 191??????????? } catch (AbsentInformationException e) { > ?? 192??????????????? throw new Failure(e); > ?? 193??????????? } > ?? 194??????????? bpRequest = debuggee.makeBreakpoint(location); > ?? 195 > > ?? 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); > > ?? 197??????????? bpRequest.putProperty("number", "zero"); > ?? 198??????????? bpRequest.enable(); > ?? 199 > ?? 200??????????? eventHandler.addListener( > ?? 201???????????????? new EventHandler.EventListener() { > ?? 202???????????????????? public boolean eventReceived(Event event) { > ?? 203??????????????????????? if (event instanceof BreakpointEvent && > bpRequest.equals(event.request())) { > ?? 204??????????????????????????? synchronized(eventHandler) { > ?? 205??????????????????????????????? display("Received communication > breakpoint event."); > ?? 206??????????????????????????????? bpCount++; > ?? 207??????????????????????????????? eventHandler.notifyAll(); > ?? 208??????????????????????????? } > ?? 209??????????????????????????? return true; > ?? 210??????????????????????? } > ?? 211??????????????????????? return false; > ?? 212???????????????????? } > ?? 213???????????????? } > ?? 214??????????? ); > ?? 215??????? } > > > test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: > ... > ?? 140??????????????????? display("......--> vm.suspend();"); > ?? 141??????????????????? vm.suspend(); > ?? 142 > ?? 143??????????????????? display("??????? getting : Map Integer> suspendsCounts1"); > ?? 144 > ?? 145??????????????????? Map suspendsCounts1 = new > HashMap(); > ?? 146??????????????????? for (ThreadReference threadReference : > vm.allThreads()) { > ?? 
147 suspendsCounts1.put(threadReference.name(), > threadReference.suspendCount()); > ?? 148??????????????????? } > ?? 149??????????????????? display(suspendsCounts1.toString()); > ?? 150 > ?? 151??????????????????? display("??????? eventSet.resume;"); > ?? 152??????????????????? eventSet.resume(); > ?? 153 > ?? 154??????????????????? display("??????? getting : Map Integer> suspendsCounts2"); > > This is where the breakpoint is encountered before the second set of > suspend counts is acquired. > > ?? 155??????????????????? Map suspendsCounts2 = new > HashMap(); > ?? 156??????????????????? for (ThreadReference threadReference : > vm.allThreads()) { > ?? 157 suspendsCounts2.put(threadReference.name(), > threadReference.suspendCount()); > ?? 158??????????????????? } > ?? 159??????????????????? display(suspendsCounts2.toString()); > From gary.adams at oracle.com Wed Jul 18 11:52:36 2018 From: gary.adams at oracle.com (Gary Adams) Date: Wed, 18 Jul 2018 07:52:36 -0400 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> Message-ID: <5B4F2A04.20409@oracle.com> There is nothing wrong with the breakpoint in methodForCommunication. The test uses it to make sure the threads are each tested separately. The breakpoint eventhandler just displays a message, increments a counter and returns. Let me step through resume008 the debugee to help clarify ... 1. The test thread is created and the synchronized break point is observed. lines 101-102 2. The thread is started. lines 104,135-137 2a. The main thread blocks on a local object. lines 133, 139 2b. The test thread is started. lines 137, A run entered message is displayed, line 159 The main thread lock object is notified, line 167 2b1. The main thread continues. line 167, 146 The next test thread is created. 
line 106 The synchronized breakpoint is observed, line 107 2b2. A run exited message is displayed, line 169 On the resume008 debugger side ... 1. On a thread start event the debuggee is suspended, line 141 2. Messages are displayed and a first set of thread suspend counts is acquired. lines 143-151 3. The threads are resumed, line 152 ---> 4. Messages are displayed and a second set of thread suspend counts is acquired. lines 154-159 The way the test is written, the expectation is that debugger steps 2, 3, and 4 will all happen while the test thread is running. When the debugger resumes the debuggee threads (debugger step 3), the debuggee continues from where it left off (debuggee steps 2b, 2b1, 2b2). If we complete debuggee step 2b1 (line 107) before the debugger completes step 4 (line 159), then the synchronized breakpoint will suspend the VM and the counts will not match for the SUSPEND_NONE test thread start. resume008a.java: 100 case 0: 101 thread0 = new Threadresume008a("thread0"); 102 methodForCommunication(); 103 104 threadStart(thread0); 105 106 thread1 = new Threadresume008a("thread1"); 107 methodForCommunication(); 108 break; ... 135 static int threadStart(Thread t) { 136 synchronized (waitnotifyObj) { 137 t.start(); 138 try { 139 waitnotifyObj.wait(); 140 } catch ( Exception e) { 141 exitCode = FAILED; 142 logErr(" Exception : " + e ); 143 return FAILED; 144 } 145 } 146 return PASSED; 147 } 149 static class Threadresume008a extends Thread { ... 157 158 public void run() { 159 log1(" 'run': enter :: threadName == " + tName); This is the proposed fix that will let the debugger complete its second acquisition of suspend counts while the test thread is still running. 160 // Yield, so the start thread event processing can be completed.
161 try { 162 Thread.sleep(100); 163 } catch (InterruptedException e) { 164 // ignored 165 } 166 synchronized (waitnotifyObj) { 167 waitnotifyObj.notify(); 168 } 169 log1(" 'run': exit :: threadName == " + tName); 170 return; 171 } 172 } 150 151 String tName = null; 152 153 public Threadresume008a(String threadName) { 154 super(threadName); 155 tName = threadName; 156 } 157 158 public void run() { 159 log1(" 'run': enter :: threadName == " + tName); 160 // Yield, so the start thread event processing can be completed. 161 try { 162 Thread.sleep(100); 163 } catch (InterruptedException e) { 164 // ignored 165 } 166 synchronized (waitnotifyObj) { 167 waitnotifyObj.notify(); 168 } 169 log1(" 'run': exit :: threadName == " + tName); 170 return; 171 } 172 } On 7/18/18, 2:38 AM, Chris Plummer wrote: > Hi Gary, > > I've been having trouble following the control flow of this test. One > thing I've stumbled across is the following: > > /* A debuggee class must define 'methodForCommunication' > * method and invoke it in points of synchronization > * with a debugger. > */ > setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); > > So why isn't this mode of synchronization good enough? Is it because > it was not designed with the understanding that the debugger might be > doing suspended thread counts, and suspending all threads at the > breakpoint messes up the test? > > From what I can tell of the test, after the debuggee is started and > hits the default breakpoint at the start of main(), the debugger then > does a vm.resume() at the start of the for loop in the runTest() > method. The debuggee then creates a thread and calls > methodForCommunication(). There is already a breakpoint set there by > the above debuggee code. It's unclear to me what happens as a result > of this breakpoint and how it serves the test. Also unclear to me who > is responsible for the vm.resume() after the breakpoint is hit. 
> > The debugger then requests all ThreadStart events, requesting that no > threads be disabled when it is sent. I think you are saying that when > the ThreadStart event comes in, sometimes we are at the > methodForCommunication breakpoint, with all threads disabled, and this > messes up the thread suspend counts. You want to delay 100ms so the > breakpoint event can be processed and threads resumed again (although > I can't see who actually resumes the thread after hitting the > methodForCommunication breakpoint). > > Chris > > On 7/17/18 8:33 AM, Gary Adams wrote: >> A race condition exists between the debugger and the debuggee. >> >> The first test thread is started with SUSPEND_NONE policy set. >> While processing the thread start event the debugger captures >> an initial set of thread suspend counts and resumes the >> debuggee vm. If the debuggee advances quickly it reaches >> the breakpoint set for methodForCommunication. Since the breakpoint >> carries with it SUSPEND_ALL policy, when the debugger captures a second >> set of suspend counts, it will not match the expected counts for >> a SUSPEND_NONE scenario. >> >> The proposed fix introduces a yield in the debuggee test thread run >> method >> to allow the debugger to get the expected sampled values. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >> Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >> >> >> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >> ... 
>> 186 private void setCommunicationBreakpoint(ReferenceType >> refType, String methodName) { >> 187 Method method = debuggee.methodByName(refType, >> methodName); >> 188 Location location = null; >> 189 try { >> 190 location = method.allLineLocations().get(0); >> 191 } catch (AbsentInformationException e) { >> 192 throw new Failure(e); >> 193 } >> 194 bpRequest = debuggee.makeBreakpoint(location); >> 195 >> >> 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >> >> 197 bpRequest.putProperty("number", "zero"); >> 198 bpRequest.enable(); >> 199 >> 200 eventHandler.addListener( >> 201 new EventHandler.EventListener() { >> 202 public boolean eventReceived(Event event) { >> 203 if (event instanceof BreakpointEvent && >> bpRequest.equals(event.request())) { >> 204 synchronized(eventHandler) { >> 205 display("Received communication >> breakpoint event."); >> 206 bpCount++; >> 207 eventHandler.notifyAll(); >> 208 } >> 209 return true; >> 210 } >> 211 return false; >> 212 } >> 213 } >> 214 ); >> 215 } >> >> >> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >> ... >> 140 display("......--> vm.suspend();"); >> 141 vm.suspend(); >> 142 >> 143 display(" getting : Map> Integer> suspendsCounts1"); >> 144 >> 145 Map suspendsCounts1 = new >> HashMap(); >> 146 for (ThreadReference threadReference : >> vm.allThreads()) { >> 147 suspendsCounts1.put(threadReference.name(), >> threadReference.suspendCount()); >> 148 } >> 149 display(suspendsCounts1.toString()); >> 150 >> 151 display(" eventSet.resume;"); >> 152 eventSet.resume(); >> 153 >> 154 display(" getting : Map> Integer> suspendsCounts2"); >> >> This is where the breakpoint is encountered before the second set of >> suspend counts is acquired. 
>> >> 155 Map suspendsCounts2 = new >> HashMap(); >> 156 for (ThreadReference threadReference : >> vm.allThreads()) { >> 157 suspendsCounts2.put(threadReference.name(), >> threadReference.suspendCount()); >> 158 } >> 159 display(suspendsCounts2.toString()); >> > From yasuenag at gmail.com Wed Jul 18 12:59:04 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Wed, 18 Jul 2018 21:59:04 +0900 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> Message-ID: <1bae36e7-3efc-3aef-6a99-324102da2549@gmail.com> PING: Could you review it? JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ This change has been reviewed by Jini. We need a Reviewer. Thanks, Yasumasa On 2018/07/12 13:42, Yasumasa Suenaga wrote: > Thanks Jini, > > I uploaded new webrev. It contains some comments and removing extra space. > > http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ > > > Yasumasa > > > > 2018-07-12 2:32 GMT+09:00 Jini George : >> Hi Yasumasa, >> >> This looks good to me except for one nit. And some more comments would help. >> For e.g., it would help to say that NSPidMap is to map the host to container >> lwpids. >> >> The nit: >> >> * >> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html >> Line 253: extra space after the parentheses >> >> Thanks, >> Jini. >> >> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: >>> >>> PING: Could you review it? 
>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>> >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2018/06/28 22:12, Yasumasa Suenaga wrote: >>>> >>>> Hi all, >>>> >>>> Please review this change. >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>> >>>> I tried to attach jhsdb to a Java process in a Docker container from >>>> the container host, but it could not attach. >>>> jcmd gained PID namespace support in JDK-8193710, but jhsdb has not yet. >>>> >>>> SA gets LWP IDs via the thread stack and functions in libthread_db.so, but they >>>> return PIDs as seen inside the container, which differ from the host's PIDs. So I added >>>> code to scan /proc/<pid>/task to get all LWP IDs, and they are kept in a >>>> Map in LinuxDebuggerLocal. >>>> >>>> Also, SA_ALTROOT is set to /proc/<pid>/root if SA detects that the debuggee runs in a >>>> container. This helps SA parse binaries in the container. >>>> >>>> This change has been pushed to the submit repo, and it failed on OS X >>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >>>> But I guess that failure is due to JDK-8205906. This change affects Linux only. >>>> >>>> Could you review it?
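A minimal sketch of the /proc/<pid>/task scan described above, written in Java rather than the native C of libsaproc. The class and method names are hypothetical, and the sketch only lists the LWP IDs visible on the host side; it does not reproduce the host-to-container ID mapping that the webrev keeps in LinuxDebuggerLocal.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class TaskScan {
    // Hypothetical helper: returns the numeric entries found under /proc/<pid>/task.
    // Returns an empty list when the directory is missing (non-Linux host or no such PID)
    // or cannot be read.
    static List<Integer> listLwpIds(String pid) {
        List<Integer> ids = new ArrayList<>();
        Path taskDir = Paths.get("/proc", pid, "task");
        if (!Files.isDirectory(taskDir)) {
            return ids;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(taskDir)) {
            for (Path entry : entries) {
                try {
                    ids.add(Integer.parseInt(entry.getFileName().toString()));
                } catch (NumberFormatException e) {
                    // Not a numeric task entry; skip it.
                }
            }
        } catch (IOException e) {
            // Directory vanished or is unreadable; return whatever was collected.
        }
        return ids;
    }

    public static void main(String[] args) {
        // On Linux, "self" refers to the calling process.
        System.out.println(listLwpIds("self"));
    }
}
```

Passing a host PID instead of "self" would list the threads of that process as the host sees them, which is the view a debugger attaching from outside the container needs.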
>>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >> From ralf.schmelter at sap.com Wed Jul 18 15:44:39 2018 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 18 Jul 2018 15:44:39 +0000 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> Message-ID: Hi Chris, here is an updated webref http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v3/ I've changed the summary text and the comment according to your suggestion. The 100M frames is surely overkill for this test. I had seen that the JIT compiler started to inline, leading to less memory needed per frame. But I've never got more than 1M frames even for very big stacks. Therefore I've reduced it in the test to 1M. When the stack overflow never occurs and callEnded() thus never gets called, the test will fail, because bpe = resumeTo("Frames2Targ", "callEnded", "()V"); will fail since the VM will exit and never reach the breakpoint. In addition, a message will be written about the missing SOE. Best regards, Ralf -----Original Message----- From: Chris Plummer [mailto:chris.plummer at oracle.com] Sent: Mittwoch, 18. Juli 2018 07:02 To: Schmelter, Ralf ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com; Stuefe, Thomas Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior Hi Ralf, A few comments below, but overall looks good: ? 27? * @summary get stack trace for large stacks took too long. How about "Test that getting the stack trace for a very large stack does not take too long". The max number of frames you'll test for is 100M, but the stack size is set to 4m, assuming -Xss works (and I think on some platforms it may not). 
100M frames seems like overkill for a 4M stack. If the stack was nothing more than a frame link pointer on a 32-bit system, you'd only have 1M frames, but let's be more realistic than that and say you should never have more than 256k frames. Lowering the max number of frames will prevent this test from taking a very long time on platforms where -Xss has failed. 65 // Have some frames be removed before we call again. Should this be: "Pop some frames so there is room on the stack for the println()" 96 bpe = resumeTo("Frames2Targ", "callEnded", "()V"); What happens if we never get to callEnded()? thanks, Chris On 7/17/18 7:08 AM, Schmelter, Ralf wrote: > Hi all, > > here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/ > > I've converted the shell-based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime). > > The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Formerly I reused the count variable, but this could lead to misunderstandings. > > Best regards, > Ralf > > > -----Original Message----- > From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf > Sent: Monday, July 9, 2018 16:05 > To: Chris Plummer ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com > Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior > > Hi Chris, > > thanks for the review. > >> What testing have you done? > I've tested the change by debugging by hand in Eclipse and jdb, and by running the com/sun/jdi jtreg tests and the JDWP JCK tests. Analogous code has been running in the SAP JVM for many years. > > >> How long does this test take to run? > 15 s according to jtreg. > > >> What happens if for some reason SOE is never thrown?
It's not clear to >> me what the script would do in this case. > It is treated as passed (which is not ideal). > > >> In answer to the ShellScaffold.sh question, there is already work >> underway to convert to pure java tests. See JDK-8201652. > Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done. > > Best regards, > Ralf > > > > -----Original Message----- > From: Chris Plummer [mailto:chris.plummer at oracle.com] > Sent: Friday, July 6, 2018 00:37 > To: Schmelter, Ralf ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com > Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior > > Hi Ralf, > > Overall looks good, but I do have a few comments and questions. > > Please update the copyright. > > What testing have you done? > > How long does this test take to run? > > What happens if for some reason SOE is never thrown? It's not clear to > me what the script would do in this case. > In answer to the ShellScaffold.sh question, there is already work > underway to convert to pure java tests. See JDK-8201652. I'm not certain > if it is ok for you to just submit this new shell script, or if it should > be rewritten in pure java. Most of the work to convert the scripts has > already been done but was put on hold. Maybe Serguei can comment and > guide you on how it would be done in java. > > thanks, > > Chris > > On 7/3/18 3:43 AM, Schmelter, Ralf wrote: >> Hi All, >> >> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ . >> >> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
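The quadratic-versus-linear difference can be illustrated with a toy cost model. This is only a sketch and not the JDWP agent code: it assumes that reaching the frame at index i means walking i + 1 frames from the top of the stack, so asking for frames one at a time costs on the order of n^2 steps, while one bulk walk into a buffer costs n steps and n slots of memory. ToyStack, frameAt and allFrames are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameWalkCost {
    // Toy stack: frame i is the string "frame" + i; 'steps' counts frames visited.
    static final class ToyStack {
        final int depth;
        long steps = 0;

        ToyStack(int depth) {
            this.depth = depth;
        }

        // Per-frame lookup: assumed to walk from the top down to 'index' on every call.
        String frameAt(int index) {
            steps += index + 1;
            return "frame" + index;
        }

        // Bulk lookup: a single walk that fills a buffer proportional to the depth.
        List<String> allFrames() {
            List<String> out = new ArrayList<>(depth);
            for (int i = 0; i < depth; i++) {
                steps++;
                out.add("frame" + i);
            }
            return out;
        }
    }

    // Total steps when every frame is requested individually: n * (n + 1) / 2.
    static long perFrameCost(int depth) {
        ToyStack stack = new ToyStack(depth);
        for (int i = 0; i < depth; i++) {
            stack.frameAt(i);
        }
        return stack.steps;
    }

    // Total steps when all frames are fetched in one pass: n.
    static long bulkCost(int depth) {
        ToyStack stack = new ToyStack(depth);
        stack.allFrames();
        return stack.steps;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("per-frame steps: " + perFrameCost(n));
        System.out.println("bulk steps:      " + bulkCost(n));
    }
}
```

Under this model the gap grows with stack depth, which matches the description of a deep-stack test timing out with the old code and finishing quickly with the new one.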
>> >> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is. >> >> Best regards, >> Ralf Schmelter > From jcbeyler at google.com Wed Jul 18 16:21:19 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 18 Jul 2018 09:21:19 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: Subject Was: Re: RFR (S): C1 still does eden allocations when TLAB is enabled + serviceability-dev Hi all, Could anyone else give me a review of this webrev and check/test the various architecture changes? http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ Thanks for all your help! Jc On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: > Hi all, > > Here is a webrev that does all the architectures in the same way: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > > Could anyone review the other architectures and test? > - arm, sparc & aarch64 are also modified now to follow the same "if no > tlab, then consider eden space allocation" logic. > > Thanks for your help! > Jc > > On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: > >> Hi Kim, >> >> I opened this bug >> https://bugs.openjdk.java.net/browse/JDK-8190862 >> >> and now I've done an update: >> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >> >> I basically have done your nits but also removed the try_eden (it was >> used to bind a label but was not used). I updated the comments to use the >> one you preferred. >> >> I still have to do the other architectures though but at least we seem to >> have a consensus on this architecture, correct? 
>> >> Thanks for the review, >> Jc >> >> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >> wrote: >> >>> > On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>> > >>> > Yes, you are right, I did those changes due to: >>> > https://bugs.openjdk.java.net/browse/JDK-8194084 >>> > >>> > If Robbin agrees to this change, and if no one sees an issue, I'll go >>> ahead >>> > and propagate the change across architectures. >>> > >>> > Thanks for the review, I'll wait for Robbin (or anyone else's comment >>> and >>> > review) :) >>> > Jc >>> > >>> > On Fri, Jul 13, 2018 at 1:08 PM John Rose >>> wrote: >>> > >>> >> On Jul 13, 2018, at 10:23 AM, JC Beyler wrote: >>> >> >>> >> >>> >> I'm not sure if we had left this case intentionally or not but, if we >>> want >>> >> it all to be consistent, we should perhaps fix it. >>> >> >>> >> >>> >> Well, you put in that logic last February, so unless somebody speaks >>> up >>> >> quickly, I support your adjusting it to be the way you want it. >>> >> >>> >> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share" >>> >> suggests that the GC group is most active in touching this feature. >>> >> If Robbin is OK with it, there's your reviewer. >>> >> >>> >> FWIW, you can use me as a reviewer, but I'd get one other person >>> >> working on the GC to OK it. >>> >> >>> >> ? John >>> >> >>> > >>> > >>> > -- >>> > >>> > Thanks, >>> > Jc >>> >>> Robbin is on vacation; you might not hear from him for a while. >>> >>> I'm assuming you'll open a new bug for this? >>> >>> Except for a few minor nits (below), this looks okay to me. >>> >>> The comment at line 1052 needs updating. >>> >>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>> >>> pre-existing: The try_eden label declared on line 1054 is bound at >>> line 1058, but unreferenced. >>> >>> I like the wording of the comment at 1139 better than the wording at >>> 1016. 
>>> >>> >> >> -- >> >> Thanks, >> Jc >> > > > -- > > Thanks, > Jc > -- Thanks, Jc From chris.plummer at oracle.com Wed Jul 18 17:10:36 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 18 Jul 2018 10:10:36 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> Message-ID: <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> Hi Ralf, Looks good. thanks, Chris On 7/18/18 8:44 AM, Schmelter, Ralf wrote: > Hi Chris, > > here is an updated webrev http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v3/ > > I've changed the summary text and the comment according to your suggestion. > > The 100M frames is surely overkill for this test. I had seen that the JIT compiler started to inline, leading to less memory needed per frame. But I've never got more than 1M frames even for very big stacks. Therefore I've reduced it in the test to 1M. > > When the stack overflow never occurs and callEnded() thus never gets called, the test will fail, because > bpe = resumeTo("Frames2Targ", "callEnded", "()V"); > will fail since the VM will exit and never reach the breakpoint. In addition, a message will be written about the missing SOE. > > Best regards, > Ralf > > > -----Original Message----- > From: Chris Plummer [mailto:chris.plummer at oracle.com] > Sent: Wednesday, July 18, 2018 07:02 > To: Schmelter, Ralf ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com; Stuefe, Thomas > Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior > > Hi Ralf, > > A few comments below, but overall looks good: > > 27
* @summary get stack trace for large stacks took too long. > > How about "Test that getting the stack trace for a very large stack does > not take too long". > > The max number of frames you'll test for is 100M, but the stack size is > set to 4m, assuming -Xss works (and I think on some platforms it may > not). 100M frames seems like overkill for a 4M stack. If the stack was > nothing more than a frame link pointer on a 32-bit system, you'd only > have 1M frames, but lets be more realistic than that and say you should > never have more than 256k frames. Lowering the max number of frames will > prevent this test from taking a very long time on platforms where -Xss > has failed. > > ? 65???????????????? // Have some frames be removed before we call again. > > Should this be: "Pop some frames so there is room on the stack for the > println()" > > ? 96???????? bpe = resumeTo("Frames2Targ", "callEnded", "()V"); > > What happens if we never get to callEnded()? > > thanks, > > Chris > > On 7/17/18 7:08 AM, Schmelter, Ralf wrote: >> Hi all, >> >> here is an updated webref at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/ >> >> I've converted the shell based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime). >> >> The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Formerly I've reused the count variable, but this can lead to misunderstandings. >> >> Best regards, >> Ralf >> >> >> -----Original Message----- >> From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf >> Sent: Montag, 9. Juli 2018 16:05 >> To: Chris Plummer ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com >> Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior >> >> Hi Chris, >> >> thanks for the review. 
>> >>> What testing have you done? >> I've tested the change by debugging by hand in eclipse and jdb, running the com/sun/jdi rtreg tests and the jdwp jck tests. And analog code is running in the SAP JVM for many years. >> >> >>> How long does this test take to run. >> 15 s according to jtreg. >> >> >>> What happens if for some reason SOE is never thrown? It's not clear to >>> me what the script would do in this case. >> It is treated as passed (which is not ideal). >> >> >>> In answer to the ShellScaffold.sh question, there is already work >>> underway to convert to pure java tests. See JDK-8201652. >> Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webref when this is done. >> >> Best regards, >> Ralf >> >> >> >> -----Original Message----- >> From: Chris Plummer [mailto:chris.plummer at oracle.com] >> Sent: Freitag, 6. Juli 2018 00:37 >> To: Schmelter, Ralf ; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com >> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior >> >> Hi Ralf, >> >> Overall looks good, but I do have a few comments and questions. >> >> Please update the copyright. >> >> What testing have you done? >> >> How long does this test take to run. >> >> What happens if for some reason SOE is never thrown? It's not clear to >> me what the script would do in this case. >> In answer to the ShellScaffold.sh question, there is already work >> underway to convert to pure java tests. See JDK-8201652. I'm not certain >> if it is ok for you to just submit this new shell script, or if should >> be rewritten in pure java. Most of the work to convert the scripts has >> already been done but was put on hold. Maybe Serguei can comment and >> guide you on how it would be done in java. 
>> >> thanks, >> >> Chris >> >> On 7/3/18 3:43 AM, Schmelter, Ralf wrote: >>> Hi All, >>> >>> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ . >>> >>> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack. >>> >>> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is. >>> >>> Best regards, >>> Ralf Schmelter > From chris.plummer at oracle.com Wed Jul 18 18:50:49 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 18 Jul 2018 11:50:49 -0700 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <5B4F2A04.20409@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> Message-ID: <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> Hi Gary, Who does the resume for the breakpoint event? eventHandler.addListener( new EventHandler.EventListener() { public boolean eventReceived(Event event) { if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) { synchronized(eventHandler) { display("Received communication breakpoint event."); bpCount++; eventHandler.notifyAll(); } return true; } return false; } } ); Also: > 1. On a thread start event the debuggee is suspended, line 141 That's not true for the first ThreadStartEvent since SUSPEND_NONE was used.
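The ordering under discussion can be pictured with a small in-process model. This is only a sketch with hypothetical names and no JDI involved: a shared counter stands in for a thread's suspend count, the "debuggee" thread bumps it when it reaches the SUSPEND_ALL breakpoint, and the "debugger" takes two samples. It swaps the proposed Thread.sleep(100) for a CountDownLatch purely to show the ordering the sleep is trying to achieve; it is an illustrative alternative, not what the webrev does.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class SuspendCountHandshake {
    // Returns {firstSample, secondSample, countAfterBreakpoint}.
    static int[] sampleTwice() {
        AtomicInteger suspendCount = new AtomicInteger(0);
        CountDownLatch secondSampleTaken = new CountDownLatch(1);

        Thread debuggee = new Thread(() -> {
            try {
                // Hold the toy debuggee back until both samples exist.
                secondSampleTaken.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Models reaching the SUSPEND_ALL communication breakpoint.
            suspendCount.incrementAndGet();
        }, "toy-debuggee");
        debuggee.start();

        int first = suspendCount.get();   // models suspendsCounts1
        int second = suspendCount.get();  // models suspendsCounts2
        secondSampleTaken.countDown();    // only now may the breakpoint be hit

        try {
            debuggee.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { first, second, suspendCount.get() };
    }

    public static void main(String[] args) {
        int[] r = sampleTwice();
        System.out.println("first=" + r[0] + " second=" + r[1] + " after=" + r[2]);
    }
}
```

Without the latch, the increment could land between the two samples, which is the mismatch the test reports. In the real test the debugger and debuggee run in separate VMs, so an in-process latch like this is not directly available there.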
Chris On 7/18/18 4:52 AM, Gary Adams wrote: > There is nothing wrong with the breakpoint in methodForCommunication. > The test uses it to make sure the threads are each tested separately. > The breakpoint eventhandler just displays a message, increments a counter > and returns. > > Let me step through resume008 the debugee to help clarify ... > > 1. The test thread is created and the synchronized break point is > observed. lines 101-102 > 2. The thread is started. lines 104,135-137 > ??? 2a. The main thread blocks on a local object. lines 133, 139 > ??? 2b. The test thread is started. lines 137, > ?????????? A run entered message is displayed, line 159 > ?????????? The main thread lock object is notified, line 167 > ????????? 2b1. The main thread continues. line 167, 146 > ????????????????? The next test thread is created. line 106 > ????????????????? The synchronized breakpoint is observed, line 107 > ????????? 2b2. A run exited message is displayed, line 169 > > On the resume008 debugger side? ... > ? 1. On a thread start event the debugee is suspended, line 141 > ? 2. Messages are displayed and a first set of thread suspend counts > is acquired. lines 143-151 > ? 3. The threads are resumed, line 152 > ---> > ? 4.? Messages are displayed and a second set of thread suspend counts > is acquired. lines 154-159 > > The way the test is written the expectation is the debugger steps > 2,3,4 will all happen > while the test thread is running. > > When the debugger resumes the debuggee threads (debugger step 3) > the debuggee continues from where it left off (debuggee steps 2b,2b1,2b2) > > If we complete debuggee step 2b1 (line 107) before the debugger > completes step 4 line 159, > then the synchronized breakpoint will suspend the vm and the counts > will not match > for the SUSPEND_NONE test thread start. > > resume008a.java: > > ?? 100??????????????????????? case 0: > ?? 101??????????????????????????????? thread0 = new > Threadresume008a("thread0"); > ?? 
102??????????????????????????????? methodForCommunication(); > ?? 103 > ?? 104??????????????????????????????? threadStart(thread0); > ?? 105 > ?? 106??????????????????????????????? thread1 = new > Threadresume008a("thread1"); > ?? 107??????????????????????????????? methodForCommunication(); > ?? 108??????????????????????????????? break; > > ?? ... > ?? 135??????? static int threadStart(Thread t) { > ?? 136??????????? synchronized (waitnotifyObj) { > ?? 137??????????????? t.start(); > ?? 138??????????????? try { > ?? 139??????????????????? waitnotifyObj.wait(); > ?? 140??????????????? } catch ( Exception e) { > ?? 141??????????????????? exitCode = FAILED; > ?? 142??????????????????? logErr("?????? Exception : " + e ); > ?? 143??????????????????? return FAILED; > ?? 144??????????????? } > ?? 145??????????? } > ?? 146??????????? return PASSED; > ?? 147??????? } > > ?? 149??????? static class Threadresume008a extends Thread { > ?? ... > ?? 157 > ?? 158??????????? public void run() { > ?? 159??????????????? log1("? 'run': enter? :: threadName == " + tName); > > This is the proposed fix that will let the debugger complete it's second > acquisition of suspend counts while the test thread is still running. > > ?? 160??????????????? // Yield, so the start thread event processing > can be completed. > ?? 161??????????????? try { > ?? 162??????????????????? Thread.sleep(100); > ?? 163??????????????? } catch (InterruptedException e) { > ?? 164??????????????????? // ignored > ?? 165??????????????? } > > ?? 166??????????????? synchronized (waitnotifyObj) { > ?? 167??????????????????????? waitnotifyObj.notify(); > ?? 168??????????????? } > ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + tName); > ?? 170??????????????? return; > ?? 171??????????? } > ?? 172??????? } > ?? 150 > ?? 151??????????? String tName = null; > ?? 152 > ?? 153??????????? public Threadresume008a(String threadName) { > ?? 154??????????????? super(threadName); > ?? 155??????????????? 
tName = threadName; > ?? 156??????????? } > ?? 157 > ?? 158??????????? public void run() { > ?? 159??????????????? log1("? 'run': enter? :: threadName == " + tName); > ?? 160??????????????? // Yield, so the start thread event processing > can be completed. > ?? 161??????????????? try { > ?? 162??????????????????? Thread.sleep(100); > ?? 163??????????????? } catch (InterruptedException e) { > ?? 164??????????????????? // ignored > ?? 165??????????????? } > ?? 166??????????????? synchronized (waitnotifyObj) { > ?? 167??????????????????????? waitnotifyObj.notify(); > ?? 168??????????????? } > ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + tName); > ?? 170??????????????? return; > ?? 171??????????? } > ?? 172??????? } > > > > On 7/18/18, 2:38 AM, Chris Plummer wrote: >> Hi Gary, >> >> I've been having trouble following the control flow of this test. One >> thing I've stumbled across is the following: >> >> ??????????? /* A debuggee class must define 'methodForCommunication' >> ???????????? * method and invoke it in points of synchronization >> ???????????? * with a debugger. >> ???????????? */ >> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >> >> So why isn't this mode of synchronization good enough? Is it because >> it was not designed with the understanding that the debugger might be >> doing suspended thread counts, and suspending all threads at the >> breakpoint messes up the test? >> >> From what I can tell of the test, after the debuggee is started and >> hits the default breakpoint at the start of main(), the debugger then >> does a vm.resume() at the start of the for loop in the runTest() >> method. The debuggee then creates a thread and calls >> methodForCommunication(). There is already a breakpoint set there by >> the above debuggee code. It's unclear to me what happens as a result >> of this breakpoint and how it serves the test. 
Also unclear to me who >> is responsible for the vm.resume() after the breakpoint is hit. >> >> The debugger then requests all ThreadStart events, requesting that no >> threads be disabled when it is sent. I think you are saying that when >> the ThreadStart event comes in, sometimes we are at the >> methodForCommunication breakpoint, with all threads disabled, and >> this messes up the thread suspend counts. You want to delay 100ms so >> the breakpoint event can be processed and threads resumed again >> (although I can't see who actually resumes the thread after hitting >> the methodForCommunication breakpoint). >> >> Chris >> >> On 7/17/18 8:33 AM, Gary Adams wrote: >>> A race condition exists between the debugger and the debuggee. >>> >>> The first test thread is started with SUSPEND_NONE policy set. >>> While processing the thread start event the debugger captures >>> an initial set of thread suspend counts and resumes the >>> debuggee vm. If the debuggee advances quickly it reaches >>> the breakpoint set for methodForCommunication. Since the breakpoint >>> carries with it SUSPEND_ALL policy, when the debugger captures a second >>> set of suspend counts, it will not match the expected counts for >>> a SUSPEND_NONE scenario. >>> >>> The proposed fix introduces a yield in the debuggee test thread run >>> method >>> to allow the debugger to get the expected sampled values. >>> >>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>> ? Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>> >>> >>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>> ... >>> ?? 186??????? private void setCommunicationBreakpoint(ReferenceType >>> refType, String methodName) { >>> ?? 187??????????? Method method = debuggee.methodByName(refType, >>> methodName); >>> ?? 188??????????? Location location = null; >>> ?? 189??????????? try { >>> ?? 190??????????????? location = method.allLineLocations().get(0); >>> ?? 191??????????? 
} catch (AbsentInformationException e) { >>> ?? 192??????????????? throw new Failure(e); >>> ?? 193??????????? } >>> ?? 194??????????? bpRequest = debuggee.makeBreakpoint(location); >>> ?? 195 >>> >>> ?? 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>> >>> ?? 197??????????? bpRequest.putProperty("number", "zero"); >>> ?? 198??????????? bpRequest.enable(); >>> ?? 199 >>> ?? 200??????????? eventHandler.addListener( >>> ?? 201???????????????? new EventHandler.EventListener() { >>> ?? 202???????????????????? public boolean eventReceived(Event event) { >>> ?? 203??????????????????????? if (event instanceof BreakpointEvent >>> && bpRequest.equals(event.request())) { >>> ?? 204??????????????????????????? synchronized(eventHandler) { >>> ?? 205??????????????????????????????? display("Received >>> communication breakpoint event."); >>> ?? 206??????????????????????????????? bpCount++; >>> ?? 207 eventHandler.notifyAll(); >>> ?? 208??????????????????????????? } >>> ?? 209??????????????????????????? return true; >>> ?? 210??????????????????????? } >>> ?? 211??????????????????????? return false; >>> ?? 212???????????????????? } >>> ?? 213???????????????? } >>> ?? 214??????????? ); >>> ?? 215??????? } >>> >>> >>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>> ... >>> ?? 140??????????????????? display("......--> vm.suspend();"); >>> ?? 141??????????????????? vm.suspend(); >>> ?? 142 >>> ?? 143??????????????????? display("??????? getting : Map>> Integer> suspendsCounts1"); >>> ?? 144 >>> ?? 145??????????????????? Map suspendsCounts1 = new >>> HashMap(); >>> ?? 146??????????????????? for (ThreadReference threadReference : >>> vm.allThreads()) { >>> ?? 147 suspendsCounts1.put(threadReference.name(), >>> threadReference.suspendCount()); >>> ?? 148??????????????????? } >>> ?? 149??????????????????? display(suspendsCounts1.toString()); >>> ?? 150 >>> ?? 151??????????????????? display("??????? eventSet.resume;"); >>> ?? 152??????????????????? 
eventSet.resume(); >>> ?? 153 >>> ?? 154??????????????????? display("??????? getting : Map>> Integer> suspendsCounts2"); >>> >>> This is where the breakpoint is encountered before the second set of >>> suspend counts is acquired. >>> >>> ?? 155??????????????????? Map suspendsCounts2 = new >>> HashMap(); >>> ?? 156??????????????????? for (ThreadReference threadReference : >>> vm.allThreads()) { >>> ?? 157 suspendsCounts2.put(threadReference.name(), >>> threadReference.suspendCount()); >>> ?? 158??????????????????? } >>> ?? 159??????????????????? display(suspendsCounts2.toString()); >>> >> > From gary.adams at oracle.com Wed Jul 18 19:45:03 2018 From: gary.adams at oracle.com (Gary Adams) Date: Wed, 18 Jul 2018 15:45:03 -0400 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> Message-ID: <5B4F98BF.1060602@oracle.com> Answers below ... On 7/18/18, 2:50 PM, Chris Plummer wrote: > Hi Gary, > > Who does the resume for the breakpoint event? > > eventHandler.addListener( > new EventHandler.EventListener() { > public boolean eventReceived(Event event) { > if (event instanceof BreakpointEvent && > bpRequest.equals(event.request())) { > synchronized(eventHandler) { > display("Received communication breakpoint > event."); > bpCount++; > eventHandler.notifyAll(); > } > return true; > } > return false; > } > } > ); I believe you are looking for this sequence. At the top of the loop a check is made if resume() should be called "shouldRunAfterBreakpoint". lines 96-99 is an early termination. And at the bottom of the loop, line 240, is the normal continue the test to the next case. resume008.java : ... 
    94            for (int i = 0; ; i++) {
    95

    96                if (!shouldRunAfterBreakpoint()) {
    97                    vm.resume();
    98                    break;
    99                }

   100
   101
   102                display(":::::: case: # " + i);
   103
   104                switch (i) {
   105
   106                    case 0:
   107                    eventRequest = settingThreadStartRequest (
   108                                           SUSPEND_NONE, "ThreadStartRequest1");
...
   238
   239                display("......--> vm.resume()");
   240                vm.resume();
   241            }
>
> Also:
>
>>   1. On a thread start event the debuggee is suspended, line 141
> That's not true for the first ThreadStartEvent since SUSPEND_NONE was
> used.
The thread start event is set to SUSPEND_NONE for thread0, but when
the thread start event is observed the resume008 test suspends the vm
immediately after fetching the "number" property.

   132                if ( !(newEvent instanceof ThreadStartEvent)) {
   133                    setFailedStatus("ERROR: new event is not ThreadStartEvent");
   134                } else {
   135
   136                    String property = (String) newEvent.request().getProperty("number");
   137                    display("       got new ThreadStartEvent with propety 'number' == " + property);
   138
   139                    display("......checking up on EventSet.resume()");
   140                    display("......--> vm.suspend();");
   141                    vm.suspend();
>
> Chris
>
> On 7/18/18 4:52 AM, Gary Adams wrote:
>> There is nothing wrong with the breakpoint in methodForCommunication.
>> The test uses it to make sure the threads are each tested separately.
>> The breakpoint eventhandler just displays a message, increments a
>> counter
>> and returns.
>>
>> Let me step through resume008a the debuggee to help clarify ...
>>
>> 1. The test thread is created and the synchronized break point is
>> observed. lines 101-102
>> 2. The thread is started. lines 104,135-137
>>     2a. The main thread blocks on a local object. lines 133, 139
>>     2b. The test thread is started. lines 137,
>>            A run entered message is displayed, line 159
>>            The main thread lock object is notified, line 167
>>           2b1. The main thread continues. line 167, 146
>>                   The next test thread is created. line 106
>>                   The synchronized breakpoint is observed, line 107
>>           2b2.
A run exited message is displayed, line 169 >> >> On the resume008 debugger side ... >> 1. On a thread start event the debugee is suspended, line 141 >> 2. Messages are displayed and a first set of thread suspend counts >> is acquired. lines 143-151 >> 3. The threads are resumed, line 152 >> ---> >> 4. Messages are displayed and a second set of thread suspend >> counts is acquired. lines 154-159 >> >> The way the test is written the expectation is the debugger steps >> 2,3,4 will all happen >> while the test thread is running. >> >> When the debugger resumes the debuggee threads (debugger step 3) >> the debuggee continues from where it left off (debuggee steps >> 2b,2b1,2b2) >> >> If we complete debuggee step 2b1 (line 107) before the debugger >> completes step 4 line 159, >> then the synchronized breakpoint will suspend the vm and the counts >> will not match >> for the SUSPEND_NONE test thread start. >> >> resume008a.java: >> >> 100 case 0: >> 101 thread0 = new >> Threadresume008a("thread0"); >> 102 methodForCommunication(); >> 103 >> 104 threadStart(thread0); >> 105 >> 106 thread1 = new >> Threadresume008a("thread1"); >> 107 methodForCommunication(); >> 108 break; >> >> ... >> 135 static int threadStart(Thread t) { >> 136 synchronized (waitnotifyObj) { >> 137 t.start(); >> 138 try { >> 139 waitnotifyObj.wait(); >> 140 } catch ( Exception e) { >> 141 exitCode = FAILED; >> 142 logErr(" Exception : " + e ); >> 143 return FAILED; >> 144 } >> 145 } >> 146 return PASSED; >> 147 } >> >> 149 static class Threadresume008a extends Thread { >> ... >> 157 >> 158 public void run() { >> 159 log1(" 'run': enter :: threadName == " + tName); >> >> This is the proposed fix that will let the debugger complete it's second >> acquisition of suspend counts while the test thread is still running. >> >> 160 // Yield, so the start thread event processing >> can be completed. 
>> 161 try { >> 162 Thread.sleep(100); >> 163 } catch (InterruptedException e) { >> 164 // ignored >> 165 } >> >> 166 synchronized (waitnotifyObj) { >> 167 waitnotifyObj.notify(); >> 168 } >> 169 log1(" 'run': exit :: threadName == " + tName); >> 170 return; >> 171 } >> 172 } >> 150 >> 151 String tName = null; >> 152 >> 153 public Threadresume008a(String threadName) { >> 154 super(threadName); >> 155 tName = threadName; >> 156 } >> 157 >> 158 public void run() { >> 159 log1(" 'run': enter :: threadName == " + tName); >> 160 // Yield, so the start thread event processing >> can be completed. >> 161 try { >> 162 Thread.sleep(100); >> 163 } catch (InterruptedException e) { >> 164 // ignored >> 165 } >> 166 synchronized (waitnotifyObj) { >> 167 waitnotifyObj.notify(); >> 168 } >> 169 log1(" 'run': exit :: threadName == " + tName); >> 170 return; >> 171 } >> 172 } >> >> >> >> On 7/18/18, 2:38 AM, Chris Plummer wrote: >>> Hi Gary, >>> >>> I've been having trouble following the control flow of this test. >>> One thing I've stumbled across is the following: >>> >>> /* A debuggee class must define 'methodForCommunication' >>> * method and invoke it in points of synchronization >>> * with a debugger. >>> */ >>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >>> >>> So why isn't this mode of synchronization good enough? Is it because >>> it was not designed with the understanding that the debugger might >>> be doing suspended thread counts, and suspending all threads at the >>> breakpoint messes up the test? >>> >>> From what I can tell of the test, after the debuggee is started and >>> hits the default breakpoint at the start of main(), the debugger >>> then does a vm.resume() at the start of the for loop in the >>> runTest() method. The debuggee then creates a thread and calls >>> methodForCommunication(). There is already a breakpoint set there by >>> the above debuggee code. 
It's unclear to me what happens as a result >>> of this breakpoint and how it serves the test. Also unclear to me >>> who is responsible for the vm.resume() after the breakpoint is hit. >>> >>> The debugger then requests all ThreadStart events, requesting that >>> no threads be disabled when it is sent. I think you are saying that >>> when the ThreadStart event comes in, sometimes we are at the >>> methodForCommunication breakpoint, with all threads disabled, and >>> this messes up the thread suspend counts. You want to delay 100ms so >>> the breakpoint event can be processed and threads resumed again >>> (although I can't see who actually resumes the thread after hitting >>> the methodForCommunication breakpoint). >>> >>> Chris >>> >>> On 7/17/18 8:33 AM, Gary Adams wrote: >>>> A race condition exists between the debugger and the debuggee. >>>> >>>> The first test thread is started with SUSPEND_NONE policy set. >>>> While processing the thread start event the debugger captures >>>> an initial set of thread suspend counts and resumes the >>>> debuggee vm. If the debuggee advances quickly it reaches >>>> the breakpoint set for methodForCommunication. Since the breakpoint >>>> carries with it SUSPEND_ALL policy, when the debugger captures a >>>> second >>>> set of suspend counts, it will not match the expected counts for >>>> a SUSPEND_NONE scenario. >>>> >>>> The proposed fix introduces a yield in the debuggee test thread run >>>> method >>>> to allow the debugger to get the expected sampled values. >>>> >>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>>> Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>>> >>>> >>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>>> ... 
>>>> 186 private void setCommunicationBreakpoint(ReferenceType >>>> refType, String methodName) { >>>> 187 Method method = debuggee.methodByName(refType, >>>> methodName); >>>> 188 Location location = null; >>>> 189 try { >>>> 190 location = method.allLineLocations().get(0); >>>> 191 } catch (AbsentInformationException e) { >>>> 192 throw new Failure(e); >>>> 193 } >>>> 194 bpRequest = debuggee.makeBreakpoint(location); >>>> 195 >>>> >>>> 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>>> >>>> 197 bpRequest.putProperty("number", "zero"); >>>> 198 bpRequest.enable(); >>>> 199 >>>> 200 eventHandler.addListener( >>>> 201 new EventHandler.EventListener() { >>>> 202 public boolean eventReceived(Event event) { >>>> 203 if (event instanceof BreakpointEvent >>>> && bpRequest.equals(event.request())) { >>>> 204 synchronized(eventHandler) { >>>> 205 display("Received >>>> communication breakpoint event."); >>>> 206 bpCount++; >>>> 207 eventHandler.notifyAll(); >>>> 208 } >>>> 209 return true; >>>> 210 } >>>> 211 return false; >>>> 212 } >>>> 213 } >>>> 214 ); >>>> 215 } >>>> >>>> >>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>>> ... >>>> 140 display("......--> vm.suspend();"); >>>> 141 vm.suspend(); >>>> 142 >>>> 143 display(" getting : Map>>> Integer> suspendsCounts1"); >>>> 144 >>>> 145 Map suspendsCounts1 = >>>> new HashMap(); >>>> 146 for (ThreadReference threadReference : >>>> vm.allThreads()) { >>>> 147 suspendsCounts1.put(threadReference.name(), >>>> threadReference.suspendCount()); >>>> 148 } >>>> 149 display(suspendsCounts1.toString()); >>>> 150 >>>> 151 display(" eventSet.resume;"); >>>> 152 eventSet.resume(); >>>> 153 >>>> 154 display(" getting : Map>>> Integer> suspendsCounts2"); >>>> >>>> This is where the breakpoint is encountered before the second set >>>> of suspend counts is acquired. 
>>>>
>>>>    155                    Map suspendsCounts2 =
>>>> new HashMap();
>>>>    156                    for (ThreadReference threadReference :
>>>> vm.allThreads()) {
>>>>    157 suspendsCounts2.put(threadReference.name(),
>>>> threadReference.suspendCount());
>>>>    158                    }
>>>>    159 display(suspendsCounts2.toString());
>>>>
>>>
>>
>

From chris.plummer at oracle.com  Wed Jul 18 20:47:09 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Jul 2018 13:47:09 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B4F98BF.1060602@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com>
 <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
Message-ID: <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>

Hi Gary

Ok, so shouldRunAfterBreakpoint() is the code that does the eventHandler.wait(), so it gets the eventHandler.notifyAll() notification from the BreakpointEvent handler.

And as a side note, I see now that resumption of execution after the breakpoint at main() is done by:

            // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
            display("RESUME DEBUGGEE VM");
            vm.resume();

            testRun();

shouldRunAfterBreakpoint() is returning true until the end of the test, when the debuggee executes "instruction = end". That's why runTests() does a "break" when shouldRunAfterBreakpoint() returns false. So this means the code that is checking shouldRunAfterBreakpoint() is not resuming execution for the first few (probably 3) methodForCommunication() breakpoints. However, it does make sure that runTests() blocks until the BreakpointEvent has been processed.

You point out the vm.resume() at the bottom of the loop in runTests(), but that's only after a bunch of ThreadStartEvent processing above it has been done already.
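For readers following the wait()/notifyAll() handshake described above, here is a minimal sketch of that pattern in plain Java. It is not the nsk test library code: the class name BreakpointHandshake, the methods onBreakpoint() and awaitBreakpoint(), and the bpSeen counter are hypothetical names chosen for illustration, and the timeout is an added assumption. Only the shared monitor, the bpCount increment, and the notifyAll() call mirror the listener quoted in this thread.

```java
// Hypothetical stand-in for the listener/waiter handshake discussed in this
// thread; it does not use JDI and is not the nsk EventHandler implementation.
class BreakpointHandshake {
    private final Object eventHandler = new Object(); // shared monitor
    private int bpCount = 0; // breakpoints reported by the listener
    private int bpSeen = 0;  // breakpoints already consumed by the test loop

    /** Listener side: record the event and wake any waiter. */
    void onBreakpoint() {
        synchronized (eventHandler) {
            bpCount++;
            eventHandler.notifyAll();
        }
    }

    /** Test-loop side: block until an unconsumed breakpoint exists or the timeout expires. */
    boolean awaitBreakpoint(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (eventHandler) {
            // Checking the counters in a loop covers both spurious wakeups and
            // a notification that arrived before the waiter got here.
            while (bpCount == bpSeen) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                eventHandler.wait(remaining);
            }
            bpSeen++;
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BreakpointHandshake h = new BreakpointHandshake();
        Thread listener = new Thread(h::onBreakpoint, "listener");
        listener.start();
        System.out.println("breakpoint observed: " + h.awaitBreakpoint(5000));
        listener.join();
    }
}
```

Because the waiter compares counters rather than relying on the timing of the notification, the outcome does not depend on which thread runs first, which is the property a fixed sleep() cannot give.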
The ThreadStartEvent would never get generated if there was not a resume at some point earlier. I think it is happening during the eventHandler.waitForRequestedEventSet() call, which does a vm.resume().

So if I understand the order of things now:

-shouldRunAfterBreakpoint() returns after the first methodForCommunication() is hit. At this point we know the first thread has been created, but there has been no attempt to start it yet. The debuggee is suspended at this point.

-runTests() requests ThreadStartEvents with SUSPEND_NONE. This also does a vm.resume().

-The debuggee starts the thread and then does another methodForCommunication() (this 2nd one is actually after the 2nd thread has been created, but not yet started). Now we have a race. Do we get the ThreadStartEvent first or the BreakpointEvent? This is because when the ThreadStartEvent is generated, the thread is not suspended due to SUSPEND_NONE. Even if the ThreadStartEvent comes in first, the async handling of the BreakpointEvent can cause problems during the ThreadStartEvent processing.

-You added a 100ms delay after the thread has started, but before methodForCommunication(), hoping it will make it so the ThreadStartEvent can be received and fully processed before the BreakpointEvent is.

I think it would be preferable to fix this by doing better synchronization. After all, that is the approach the test originally took. It could have been written with a bunch of sleep() delays instead, but that in general is not a very good approach. What if you added a shouldRunAfterBreakpoint() call after the ThreadStartEvent arrives? At this point you would know that the vm is suspended due to the breakpoint, so no need for:

                display("......checking up on EventSet.resume()");
                display("......--> vm.suspend();");
                vm.suspend();

You might then also need to add another methodForCommunication() call at the end of case 0 and 1 in the debuggee, although I think you could instead just change the shouldRunAfterBreakpoint() at the start of the loop. I think that check actually belongs at the end of the loop, and only for case 2. In fact it would be an error if shouldRunAfterBreakpoint() did not return true in that case. Then you also need to add a shouldRunAfterBreakpoint() at the start of case 0 to get things rolling (and I think at the start of case 1 also).

Chris

On 7/18/18 12:45 PM, Gary Adams wrote:
> Answers below ...
>
> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>> Hi Gary,
>>
>> Who does the resume for the breakpoint event?
>>
>>         eventHandler.addListener(
>>              new EventHandler.EventListener() {
>>                  public boolean eventReceived(Event event) {
>>                     if (event instanceof BreakpointEvent &&
>> bpRequest.equals(event.request())) {
>>                         synchronized(eventHandler) {
>>                             display("Received communication
>> breakpoint event.");
>>                             bpCount++;
>>                             eventHandler.notifyAll();
>>                         }
>>                         return true;
>>                     }
>>                     return false;
>>                  }
>>              }
>>         );
> I believe you are looking for this sequence.
> At the top of the loop a check is made if
> resume() should be called "shouldRunAfterBreakpoint".
> lines 96-99 is an early termination. And at the
> bottom of the loop, line 240, is the normal
> continue the test to the next case.
>
> resume008.java :
> ...
>     94            for (int i = 0; ; i++) {
>     95
>
>     96                if (!shouldRunAfterBreakpoint()) {
>     97                    vm.resume();
>     98                    break;
>     99                }
>
>    100
>    101
>    102
display(":::::: case: # " + i);
>    103
>    104                switch (i) {
>    105
>    106                    case 0:
>    107                    eventRequest = settingThreadStartRequest (
>    108                                           SUSPEND_NONE,
> "ThreadStartRequest1");
> ...
>    238
>    239                display("......--> vm.resume()");
>    240                vm.resume();
>    241            }
>>
>> Also:
>>
>>>   1. On a thread start event the debuggee is suspended, line 141
>> That's not true for the first ThreadStartEvent since SUSPEND_NONE was
>> used.
> The thread start event is set to SUSPEND_NONE for thread0, but when
> the thread start event is observed the resume008 test suspends the vm
> immediately after fetching the "number" property.

My point is that the debuggee continues to run after the ThreadStartEvent is sent, and relies on the debugger to stop it after receiving the event. But in the meantime the debuggee has sometimes already advanced to the next breakpoint, thus the bug you are seeing.

>
>    132                if ( !(newEvent instanceof ThreadStartEvent)) {
>    133                    setFailedStatus("ERROR: new event is not
> ThreadStartEvent");
>    134                } else {
>    135
>    136                    String property = (String)
> newEvent.request().getProperty("number");
>    137                    display("       got new ThreadStartEvent
> with propety 'number' == " + property);
>    138
>    139                    display("......checking up on
> EventSet.resume()");
>    140                    display("......--> vm.suspend();");
>    141                    vm.suspend();
>
>
>>
>> Chris
>>
>> On 7/18/18 4:52 AM, Gary Adams wrote:
>>> There is nothing wrong with the breakpoint in methodForCommunication.
>>> The test uses it to make sure the threads are each tested separately.
>>> The breakpoint eventhandler just displays a message, increments a
>>> counter
>>> and returns.
>>> >>> Let me step through resume008a the debugee to help clarify ... >>> >>> 1. The test thread is created and the synchronized break point is >>> observed. lines 101-102 >>> 2. The thread is started. lines 104,135-137 >>> ??? 2a. The main thread blocks on a local object. lines 133, 139 >>> ??? 2b. The test thread is started. lines 137, >>> ?????????? A run entered message is displayed, line 159 >>> ?????????? The main thread lock object is notified, line 167 >>> ????????? 2b1. The main thread continues. line 167, 146 >>> ????????????????? The next test thread is created. line 106 >>> ????????????????? The synchronized breakpoint is observed, line 107 >>> ????????? 2b2. A run exited message is displayed, line 169 >>> >>> On the resume008 debugger side? ... >>> ? 1. On a thread start event the debugee is suspended, line 141 >>> ? 2. Messages are displayed and a first set of thread suspend counts >>> is acquired. lines 143-151 >>> ? 3. The threads are resumed, line 152 >>> ---> >>> ? 4.? Messages are displayed and a second set of thread suspend >>> counts is acquired. lines 154-159 >>> >>> The way the test is written the expectation is the debugger steps >>> 2,3,4 will all happen >>> while the test thread is running. >>> >>> When the debugger resumes the debuggee threads (debugger step 3) >>> the debuggee continues from where it left off (debuggee steps >>> 2b,2b1,2b2) >>> >>> If we complete debuggee step 2b1 (line 107) before the debugger >>> completes step 4 line 159, >>> then the synchronized breakpoint will suspend the vm and the counts >>> will not match >>> for the SUSPEND_NONE test thread start. >>> >>> resume008a.java: >>> >>> ?? 100??????????????????????? case 0: >>> ?? 101??????????????????????????????? thread0 = new >>> Threadresume008a("thread0"); >>> ?? 102 methodForCommunication(); >>> ?? 103 >>> ?? 104??????????????????????????????? threadStart(thread0); >>> ?? 105 >>> ?? 106??????????????????????????????? 
thread1 = new >>> Threadresume008a("thread1"); >>> ?? 107 methodForCommunication(); >>> ?? 108??????????????????????????????? break; >>> >>> ?? ... >>> ?? 135??????? static int threadStart(Thread t) { >>> ?? 136??????????? synchronized (waitnotifyObj) { >>> ?? 137??????????????? t.start(); >>> ?? 138??????????????? try { >>> ?? 139??????????????????? waitnotifyObj.wait(); >>> ?? 140??????????????? } catch ( Exception e) { >>> ?? 141??????????????????? exitCode = FAILED; >>> ?? 142??????????????????? logErr("?????? Exception : " + e ); >>> ?? 143??????????????????? return FAILED; >>> ?? 144??????????????? } >>> ?? 145??????????? } >>> ?? 146??????????? return PASSED; >>> ?? 147??????? } >>> >>> ?? 149??????? static class Threadresume008a extends Thread { >>> ?? ... >>> ?? 157 >>> ?? 158??????????? public void run() { >>> ?? 159??????????????? log1("? 'run': enter? :: threadName == " + >>> tName); >>> >>> This is the proposed fix that will let the debugger complete it's >>> second >>> acquisition of suspend counts while the test thread is still running. >>> >>> ?? 160??????????????? // Yield, so the start thread event processing >>> can be completed. >>> ?? 161??????????????? try { >>> ?? 162??????????????????? Thread.sleep(100); >>> ?? 163??????????????? } catch (InterruptedException e) { >>> ?? 164??????????????????? // ignored >>> ?? 165??????????????? } >>> >>> ?? 166??????????????? synchronized (waitnotifyObj) { >>> ?? 167??????????????????????? waitnotifyObj.notify(); >>> ?? 168??????????????? } >>> ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + >>> tName); >>> ?? 170??????????????? return; >>> ?? 171??????????? } >>> ?? 172??????? } >>> ?? 150 >>> ?? 151??????????? String tName = null; >>> ?? 152 >>> ?? 153??????????? public Threadresume008a(String threadName) { >>> ?? 154??????????????? super(threadName); >>> ?? 155??????????????? tName = threadName; >>> ?? 156??????????? } >>> ?? 157 >>> ?? 158??????????? public void run() { >>> ?? 
159??????????????? log1("? 'run': enter? :: threadName == " + >>> tName); >>> ?? 160??????????????? // Yield, so the start thread event processing >>> can be completed. >>> ?? 161??????????????? try { >>> ?? 162??????????????????? Thread.sleep(100); >>> ?? 163??????????????? } catch (InterruptedException e) { >>> ?? 164??????????????????? // ignored >>> ?? 165??????????????? } >>> ?? 166??????????????? synchronized (waitnotifyObj) { >>> ?? 167??????????????????????? waitnotifyObj.notify(); >>> ?? 168??????????????? } >>> ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + >>> tName); >>> ?? 170??????????????? return; >>> ?? 171??????????? } >>> ?? 172??????? } >>> >>> >>> >>> On 7/18/18, 2:38 AM, Chris Plummer wrote: >>>> Hi Gary, >>>> >>>> I've been having trouble following the control flow of this test. >>>> One thing I've stumbled across is the following: >>>> >>>> ??????????? /* A debuggee class must define 'methodForCommunication' >>>> ???????????? * method and invoke it in points of synchronization >>>> ???????????? * with a debugger. >>>> ???????????? */ >>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >>>> >>>> So why isn't this mode of synchronization good enough? Is it >>>> because it was not designed with the understanding that the >>>> debugger might be doing suspended thread counts, and suspending all >>>> threads at the breakpoint messes up the test? >>>> >>>> From what I can tell of the test, after the debuggee is started and >>>> hits the default breakpoint at the start of main(), the debugger >>>> then does a vm.resume() at the start of the for loop in the >>>> runTest() method. The debuggee then creates a thread and calls >>>> methodForCommunication(). There is already a breakpoint set there >>>> by the above debuggee code. It's unclear to me what happens as a >>>> result of this breakpoint and how it serves the test. 
Also unclear >>>> to me who is responsible for the vm.resume() after the breakpoint >>>> is hit. >>>> >>>> The debugger then requests all ThreadStart events, requesting that >>>> no threads be disabled when it is sent. I think you are saying that >>>> when the ThreadStart event comes in, sometimes we are at the >>>> methodForCommunication breakpoint, with all threads disabled, and >>>> this messes up the thread suspend counts. You want to delay 100ms >>>> so the breakpoint event can be processed and threads resumed again >>>> (although I can't see who actually resumes the thread after hitting >>>> the methodForCommunication breakpoint). >>>> >>>> Chris >>>> >>>> On 7/17/18 8:33 AM, Gary Adams wrote: >>>>> A race condition exists between the debugger and the debuggee. >>>>> >>>>> The first test thread is started with SUSPEND_NONE policy set. >>>>> While processing the thread start event the debugger captures >>>>> an initial set of thread suspend counts and resumes the >>>>> debuggee vm. If the debuggee advances quickly it reaches >>>>> the breakpoint set for methodForCommunication. Since the breakpoint >>>>> carries with it SUSPEND_ALL policy, when the debugger captures a >>>>> second >>>>> set of suspend counts, it will not match the expected counts for >>>>> a SUSPEND_NONE scenario. >>>>> >>>>> The proposed fix introduces a yield in the debuggee test thread >>>>> run method >>>>> to allow the debugger to get the expected sampled values. >>>>> >>>>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>>>> ? Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>>>> >>>>> >>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>>>> ... >>>>> ?? 186??????? private void >>>>> setCommunicationBreakpoint(ReferenceType refType, String >>>>> methodName) { >>>>> ?? 187??????????? Method method = debuggee.methodByName(refType, >>>>> methodName); >>>>> ?? 188??????????? Location location = null; >>>>> ?? 189??????????? try { >>>>> ?? 
190??????????????? location = method.allLineLocations().get(0); >>>>> ?? 191??????????? } catch (AbsentInformationException e) { >>>>> ?? 192??????????????? throw new Failure(e); >>>>> ?? 193??????????? } >>>>> ?? 194??????????? bpRequest = debuggee.makeBreakpoint(location); >>>>> ?? 195 >>>>> >>>>> ?? 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>>>> >>>>> ?? 197??????????? bpRequest.putProperty("number", "zero"); >>>>> ?? 198??????????? bpRequest.enable(); >>>>> ?? 199 >>>>> ?? 200??????????? eventHandler.addListener( >>>>> ?? 201???????????????? new EventHandler.EventListener() { >>>>> ?? 202???????????????????? public boolean eventReceived(Event >>>>> event) { >>>>> ?? 203??????????????????????? if (event instanceof BreakpointEvent >>>>> && bpRequest.equals(event.request())) { >>>>> ?? 204 synchronized(eventHandler) { >>>>> ?? 205??????????????????????????????? display("Received >>>>> communication breakpoint event."); >>>>> ?? 206??????????????????????????????? bpCount++; >>>>> ?? 207 eventHandler.notifyAll(); >>>>> ?? 208??????????????????????????? } >>>>> ?? 209??????????????????????????? return true; >>>>> ?? 210??????????????????????? } >>>>> ?? 211??????????????????????? return false; >>>>> ?? 212???????????????????? } >>>>> ?? 213???????????????? } >>>>> ?? 214??????????? ); >>>>> ?? 215??????? } >>>>> >>>>> >>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>>>> ... >>>>> ?? 140??????????????????? display("......--> vm.suspend();"); >>>>> ?? 141??????????????????? vm.suspend(); >>>>> ?? 142 >>>>> ?? 143??????????????????? display("??????? getting : Map>>>> Integer> suspendsCounts1"); >>>>> ?? 144 >>>>> ?? 145??????????????????? Map suspendsCounts1 = >>>>> new HashMap(); >>>>> ?? 146??????????????????? for (ThreadReference threadReference : >>>>> vm.allThreads()) { >>>>> ?? 147 suspendsCounts1.put(threadReference.name(), >>>>> threadReference.suspendCount()); >>>>> ?? 148??????????????????? } >>>>> ?? 
149 display(suspendsCounts1.toString()); >>>>> ?? 150 >>>>> ?? 151??????????????????? display(" eventSet.resume;"); >>>>> ?? 152??????????????????? eventSet.resume(); >>>>> ?? 153 >>>>> ?? 154??????????????????? display("??????? getting : Map>>>> Integer> suspendsCounts2"); >>>>> >>>>> This is where the breakpoint is encountered before the second set >>>>> of suspend counts is acquired. >>>>> >>>>> ?? 155??????????????????? Map suspendsCounts2 = >>>>> new HashMap(); >>>>> ?? 156??????????????????? for (ThreadReference threadReference : >>>>> vm.allThreads()) { >>>>> ?? 157 suspendsCounts2.put(threadReference.name(), >>>>> threadReference.suspendCount()); >>>>> ?? 158??????????????????? } >>>>> ?? 159 display(suspendsCounts2.toString()); >>>>> >>>> >>> >> > From serguei.spitsyn at oracle.com Wed Jul 18 20:56:47 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 18 Jul 2018 13:56:47 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> Message-ID: <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> An HTML attachment was scrubbed... 
URL: 

From gary.adams at oracle.com  Wed Jul 18 22:09:43 2018
From: gary.adams at oracle.com (gary.adams at oracle.com)
Date: Wed, 18 Jul 2018 18:09:43 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner
In-Reply-To: <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
Message-ID: 

On 7/18/18 4:47 PM, Chris Plummer wrote:
> Hi Gary
>
> Ok, so shouldRunAfterBreakpoint() is the code that does the
> eventHandler.wait(), so it gets the eventHandler.notifyAll()
> notification from the BreakpointEvent handler.
>
> And as a side note, I see now that resumption of execution after the
> breakpoint at main() is done by:
>
>             // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
>             display("RESUME DEBUGGEE VM");
>             vm.resume();
>
>             testRun();
>
> shouldRunAfterBreakpoint() is returning true until the end of the test
> when the debuggee executes "instruction = end". That's why
> runTests() does a "break" when shouldRunAfterBreakpoint() returns
> false. So this means the code that is checking
> shouldRunAfterBreakpoint() is not resuming execution for the first few
> (probably 3) methodForCommunication() breakpoints. However, it does
> make sure that runTests() blocks until the BreakpointEvent has been
> processed.
>
> You point out the vm.resume() at the bottom of the loop in runTests(),
> but that's only after a bunch of ThreadStartEvent processing above it
> has been done already. The ThreadStartEvent would never get generated
> if there was not a resume at some point earlier.
I think it is happening
> during the eventHandler.waitForRequestedEventSet() call, which does a
> vm.resume().
>
> So if I understand the order of things now:
>
> -shouldRunAfterBreakpoint() returns after first
> methodForCommunication() is hit. At this point we know the first
> thread has been created, but no attempt to start it yet. The debuggee
> is suspended at this point.
> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also
> does a vm.resume().
> -The debuggee starts the thread and then does another
> methodForCommunication() (this 2nd one is actually after the 2nd
> thread has been created, but not yet started). Now we have a race. Do
> we get the ThreadStartEvent first or the BreakpointEvent? This is
> because when the ThreadStartEvent is generated, the thread is not
> suspended due to SUSPEND_NONE. Even if the ThreadStartEvent comes in
> first, the async handling of the BreakpointEvent can cause problems
> during the ThreadStartEvent processing.
Based on the failed log in the bug report, the thread start event
is observed, the suspend counts acquired, then after the resume,
the breakpoint message is displayed and the second set of suspend
counts acquired. I can show you the passed and failed logs tomorrow.
> -You added a 100ms delay after the thread has started, but before
> methodForCommunication(), hoping it will make it so the
> ThreadStartEvent can be received and fully processed before the
> BreakpointEvent is.
The delay is mostly just a yield so the debugger gets a chance to run.
>
> I think it would be preferable to fix this by doing better
> synchronization. After all, that is the approach the test originally
> took. It could have been written with a bunch of sleep() delays
> instead, but that in general is not a very good approach.
>
> What if you added a shouldRunAfterBreakpoint() call after the
> ThreadStartEvent arrives.
At this point you would know that the vm is
> suspended due to the breakpoint, so no need for:
>
>                 display("......checking up on EventSet.resume()");
>                 display("......--> vm.suspend();");
>                 vm.suspend();
I think the suspend is intentional to capture the suspend counts.
It also needs to resume the vm and acquire again so it can confirm
the correct suspend count behaviors. If the test waits to capture
the second set of suspend counts, the breakpoint causes incorrect values.
...
>
> You might then also need to add another methodForCommunication() call
> at the end of case 0 and 1 in the debuggee, although I think you could
> instead just change the shouldRunAfterBreakpoint() at the start of the
> loop. I think that check actually belongs at the end of the loop, and
> only for case 2. In fact it would be an error if
> shouldRunAfterBreakpoint() did not return true in that case. Then you
> also need to add a shouldRunAfterBreakpoint() at the start of case 0
> to get things rolling (and I think at the start of case 1 also).
>
> Chris
>
>
> On 7/18/18 12:45 PM, Gary Adams wrote:
>> Answers below ...
>>
>> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> Who does the resume for the breakpoint event?
>>>
>>>         eventHandler.addListener(
>>>              new EventHandler.EventListener() {
>>>                  public boolean eventReceived(Event event) {
>>>                      if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>                          synchronized(eventHandler) {
>>>                              display("Received communication breakpoint event.");
>>>                              bpCount++;
>>>                              eventHandler.notifyAll();
>>>                          }
>>>                          return true;
>>>                      }
>>>                      return false;
>>>                  }
>>>              }
>>>         
); >> I believe you are looking for this sequence. >> At the top of the loop a check is made if >> resume() should be called "shouldRunAfterBreakpoint". >> lines 96-99 is an early termination. And at the >> bottom of the loop, line 240, is the normal >> continue the test to the next case. >> >> resume008.java : >> ... >> ??? 94??????????? for (int i = 0; ; i++) { >> ??? 95 >> >> ??? 96??????????????? if (!shouldRunAfterBreakpoint()) { >> ??? 97??????????????????? vm.resume(); >> ??? 98??????????????????? break; >> ??? 99??????????????? } >> >> 100 >> ?? 101 >> ?? 102??????????????? display(":::::: case: # " + i); >> ?? 103 >> ?? 104??????????????? switch (i) { >> ?? 105 >> ?? 106??????????????????? case 0: >> ?? 107??????????????????? eventRequest = settingThreadStartRequest ( >> ?? 108?????????????????????????????????????????? SUSPEND_NONE, >> "ThreadStartRequest1"); >> ... >> ? 238 >> ?? 239??????????????? display("......--> vm.resume()"); >> ?? 240??????????????? vm.resume(); >> ?? 241??????????? } >>> >>> Also: >>> >>>> ? 1. On a thread start event the debugee is suspended, line 141 >>> That's not true for the first ThreadStartEvent since SUSPEND_NONE >>> was used. >> The thread start event is set to SUSPEND_NONE for thread0, but when >> the thread start event is observed the resume008 test suspends the vm >> immediately after fetching the "number" property. > My point is that the Debuggee continues to run after the > ThreadStartEvent is sent, and relies on the debugger to stop it after > receiving the event. But in the meantime the debuggee has advanced to > the next breakpoint, but only sometimes, thus the bug you are seeing. >> >> ?? 132??????????????? if ( !(newEvent instanceof ThreadStartEvent)) { >> ?? 133??????????????????? setFailedStatus("ERROR: new event is not >> ThreadStartEvent"); >> ?? 134??????????????? } else { >> ?? 135 >> ?? 136??????????????????? String property = (String) >> newEvent.request().getProperty("number"); >> ?? 
137??????????????????? display("?????? got new ThreadStartEvent >> with propety 'number' == " + property); >> ?? 138 >> ?? 139??????????????????? display("......checking up on >> EventSet.resume()"); >> ?? 140??????????????????? display("......--> vm.suspend();"); >> ?? 141??????????????????? vm.suspend(); >> >> >>> >>> Chris >>> >>> On 7/18/18 4:52 AM, Gary Adams wrote: >>>> There is nothing wrong with the breakpoint in methodForCommunication. >>>> The test uses it to make sure the threads are each tested separately. >>>> The breakpoint eventhandler just displays a message, increments a >>>> counter >>>> and returns. >>>> >>>> Let me step through resume008a the debugee to help clarify ... >>>> >>>> 1. The test thread is created and the synchronized break point is >>>> observed. lines 101-102 >>>> 2. The thread is started. lines 104,135-137 >>>> ??? 2a. The main thread blocks on a local object. lines 133, 139 >>>> ??? 2b. The test thread is started. lines 137, >>>> ?????????? A run entered message is displayed, line 159 >>>> ?????????? The main thread lock object is notified, line 167 >>>> ????????? 2b1. The main thread continues. line 167, 146 >>>> ????????????????? The next test thread is created. line 106 >>>> ????????????????? The synchronized breakpoint is observed, line 107 >>>> ????????? 2b2. A run exited message is displayed, line 169 >>>> >>>> On the resume008 debugger side? ... >>>> ? 1. On a thread start event the debugee is suspended, line 141 >>>> ? 2. Messages are displayed and a first set of thread suspend >>>> counts is acquired. lines 143-151 >>>> ? 3. The threads are resumed, line 152 >>>> ---> >>>> ? 4.? Messages are displayed and a second set of thread suspend >>>> counts is acquired. lines 154-159 >>>> >>>> The way the test is written the expectation is the debugger steps >>>> 2,3,4 will all happen >>>> while the test thread is running. 
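[Editorial sketch, not part of the original thread.] The ordering dependence in the walkthrough above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the test's code and not the JDI API: `ToyDebuggee` and `samplesMatch` are hypothetical names, the only behavior modeled is that `vm.suspend()` and a SUSPEND_ALL breakpoint each add one suspension to every thread, and the effect of `eventSet.resume()` (debugger step 3) is deliberately left out. It assumes, as the bug title suggests, that the SUSPEND_NONE case expects the two samples to match.

```java
import java.util.Map;
import java.util.TreeMap;

public class SuspendCountOrderingSketch {

    /** Hypothetical stand-in for the debuggee: thread name -> suspend count. */
    static final class ToyDebuggee {
        private final Map<String, Integer> suspendCounts = new TreeMap<>();

        ToyDebuggee(String... threadNames) {
            for (String name : threadNames) {
                suspendCounts.put(name, 0);
            }
        }

        /** Models vm.suspend() or a SUSPEND_ALL event: every count goes up by one. */
        void suspendAll() {
            suspendCounts.replaceAll((name, count) -> count + 1);
        }

        /** Returns a copy of the current counts, like the test's suspendsCounts maps. */
        Map<String, Integer> sample() {
            return new TreeMap<>(suspendCounts);
        }
    }

    /**
     * Replays debugger steps 1, 2 and 4, optionally letting the debuggee reach
     * the communication breakpoint (debuggee step 2b1) before step 4.
     * Returns true when the first and second samples are equal.
     */
    public static boolean samplesMatch(boolean breakpointBeforeSecondSample) {
        ToyDebuggee vm = new ToyDebuggee("main", "Common-Cleaner", "Finalizer");
        vm.suspendAll();                           // debugger step 1: vm.suspend()
        Map<String, Integer> first = vm.sample();  // debugger step 2: first counts
        // debugger step 3 (eventSet.resume()) is intentionally not modeled here
        if (breakpointBeforeSecondSample) {
            vm.suspendAll();                       // SUSPEND_ALL breakpoint wins the race
        }
        Map<String, Integer> second = vm.sample(); // debugger step 4: second counts
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println("breakpoint after step 4:  match=" + samplesMatch(false));
        System.out.println("breakpoint before step 4: match=" + samplesMatch(true));
    }
}
```

In this model the samples only match when the second sample is taken before the breakpoint suspends everything, which is the window the proposed delay in the test thread tries to keep open.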
>>>> >>>> When the debugger resumes the debuggee threads (debugger step 3) >>>> the debuggee continues from where it left off (debuggee steps >>>> 2b,2b1,2b2) >>>> >>>> If we complete debuggee step 2b1 (line 107) before the debugger >>>> completes step 4 line 159, >>>> then the synchronized breakpoint will suspend the vm and the counts >>>> will not match >>>> for the SUSPEND_NONE test thread start. >>>> >>>> resume008a.java: >>>> >>>> ?? 100??????????????????????? case 0: >>>> ?? 101??????????????????????????????? thread0 = new >>>> Threadresume008a("thread0"); >>>> ?? 102 methodForCommunication(); >>>> ?? 103 >>>> ?? 104??????????????????????????????? threadStart(thread0); >>>> ?? 105 >>>> ?? 106??????????????????????????????? thread1 = new >>>> Threadresume008a("thread1"); >>>> ?? 107 methodForCommunication(); >>>> ?? 108??????????????????????????????? break; >>>> >>>> ?? ... >>>> ?? 135??????? static int threadStart(Thread t) { >>>> ?? 136??????????? synchronized (waitnotifyObj) { >>>> ?? 137??????????????? t.start(); >>>> ?? 138??????????????? try { >>>> ?? 139??????????????????? waitnotifyObj.wait(); >>>> ?? 140??????????????? } catch ( Exception e) { >>>> ?? 141??????????????????? exitCode = FAILED; >>>> ?? 142??????????????????? logErr("?????? Exception : " + e ); >>>> ?? 143??????????????????? return FAILED; >>>> ?? 144??????????????? } >>>> ?? 145??????????? } >>>> ?? 146??????????? return PASSED; >>>> ?? 147??????? } >>>> >>>> ?? 149??????? static class Threadresume008a extends Thread { >>>> ?? ... >>>> ?? 157 >>>> ?? 158??????????? public void run() { >>>> ?? 159??????????????? log1("? 'run': enter? :: threadName == " + >>>> tName); >>>> >>>> This is the proposed fix that will let the debugger complete it's >>>> second >>>> acquisition of suspend counts while the test thread is still running. >>>> >>>> ?? 160??????????????? // Yield, so the start thread event >>>> processing can be completed. >>>> ?? 161??????????????? try { >>>> ?? 
162??????????????????? Thread.sleep(100); >>>> ?? 163??????????????? } catch (InterruptedException e) { >>>> ?? 164??????????????????? // ignored >>>> ?? 165??????????????? } >>>> >>>> ?? 166??????????????? synchronized (waitnotifyObj) { >>>> ?? 167??????????????????????? waitnotifyObj.notify(); >>>> ?? 168??????????????? } >>>> ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + >>>> tName); >>>> ?? 170??????????????? return; >>>> ?? 171??????????? } >>>> ?? 172??????? } >>>> ?? 150 >>>> ?? 151??????????? String tName = null; >>>> ?? 152 >>>> ?? 153??????????? public Threadresume008a(String threadName) { >>>> ?? 154??????????????? super(threadName); >>>> ?? 155??????????????? tName = threadName; >>>> ?? 156??????????? } >>>> ?? 157 >>>> ?? 158??????????? public void run() { >>>> ?? 159??????????????? log1("? 'run': enter? :: threadName == " + >>>> tName); >>>> ?? 160??????????????? // Yield, so the start thread event >>>> processing can be completed. >>>> ?? 161??????????????? try { >>>> ?? 162??????????????????? Thread.sleep(100); >>>> ?? 163??????????????? } catch (InterruptedException e) { >>>> ?? 164??????????????????? // ignored >>>> ?? 165??????????????? } >>>> ?? 166??????????????? synchronized (waitnotifyObj) { >>>> ?? 167??????????????????????? waitnotifyObj.notify(); >>>> ?? 168??????????????? } >>>> ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + >>>> tName); >>>> ?? 170??????????????? return; >>>> ?? 171??????????? } >>>> ?? 172??????? } >>>> >>>> >>>> >>>> On 7/18/18, 2:38 AM, Chris Plummer wrote: >>>>> Hi Gary, >>>>> >>>>> I've been having trouble following the control flow of this test. >>>>> One thing I've stumbled across is the following: >>>>> >>>>> ??????????? /* A debuggee class must define 'methodForCommunication' >>>>> ???????????? * method and invoke it in points of synchronization >>>>> ???????????? * with a debugger. >>>>> ???????????? 
*/ >>>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >>>>> >>>>> So why isn't this mode of synchronization good enough? Is it >>>>> because it was not designed with the understanding that the >>>>> debugger might be doing suspended thread counts, and suspending >>>>> all threads at the breakpoint messes up the test? >>>>> >>>>> From what I can tell of the test, after the debuggee is started >>>>> and hits the default breakpoint at the start of main(), the >>>>> debugger then does a vm.resume() at the start of the for loop in >>>>> the runTest() method. The debuggee then creates a thread and calls >>>>> methodForCommunication(). There is already a breakpoint set there >>>>> by the above debuggee code. It's unclear to me what happens as a >>>>> result of this breakpoint and how it serves the test. Also unclear >>>>> to me who is responsible for the vm.resume() after the breakpoint >>>>> is hit. >>>>> >>>>> The debugger then requests all ThreadStart events, requesting that >>>>> no threads be disabled when it is sent. I think you are saying >>>>> that when the ThreadStart event comes in, sometimes we are at the >>>>> methodForCommunication breakpoint, with all threads disabled, and >>>>> this messes up the thread suspend counts. You want to delay 100ms >>>>> so the breakpoint event can be processed and threads resumed again >>>>> (although I can't see who actually resumes the thread after >>>>> hitting the methodForCommunication breakpoint). >>>>> >>>>> Chris >>>>> >>>>> On 7/17/18 8:33 AM, Gary Adams wrote: >>>>>> A race condition exists between the debugger and the debuggee. >>>>>> >>>>>> The first test thread is started with SUSPEND_NONE policy set. >>>>>> While processing the thread start event the debugger captures >>>>>> an initial set of thread suspend counts and resumes the >>>>>> debuggee vm. If the debuggee advances quickly it reaches >>>>>> the breakpoint set for methodForCommunication. 
Since the breakpoint >>>>>> carries with it SUSPEND_ALL policy, when the debugger captures a >>>>>> second >>>>>> set of suspend counts, it will not match the expected counts for >>>>>> a SUSPEND_NONE scenario. >>>>>> >>>>>> The proposed fix introduces a yield in the debuggee test thread >>>>>> run method >>>>>> to allow the debugger to get the expected sampled values. >>>>>> >>>>>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>>>>> ? Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>>>>> >>>>>> >>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>>>>> ... >>>>>> ?? 186??????? private void >>>>>> setCommunicationBreakpoint(ReferenceType refType, String >>>>>> methodName) { >>>>>> ?? 187??????????? Method method = debuggee.methodByName(refType, >>>>>> methodName); >>>>>> ?? 188??????????? Location location = null; >>>>>> ?? 189??????????? try { >>>>>> ?? 190??????????????? location = method.allLineLocations().get(0); >>>>>> ?? 191??????????? } catch (AbsentInformationException e) { >>>>>> ?? 192??????????????? throw new Failure(e); >>>>>> ?? 193??????????? } >>>>>> ?? 194??????????? bpRequest = debuggee.makeBreakpoint(location); >>>>>> ?? 195 >>>>>> >>>>>> ?? 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>>>>> >>>>>> ?? 197??????????? bpRequest.putProperty("number", "zero"); >>>>>> ?? 198??????????? bpRequest.enable(); >>>>>> ?? 199 >>>>>> ?? 200??????????? eventHandler.addListener( >>>>>> ?? 201???????????????? new EventHandler.EventListener() { >>>>>> ?? 202???????????????????? public boolean eventReceived(Event >>>>>> event) { >>>>>> ?? 203??????????????????????? if (event instanceof >>>>>> BreakpointEvent && bpRequest.equals(event.request())) { >>>>>> ?? 204 synchronized(eventHandler) { >>>>>> ?? 205??????????????????????????????? display("Received >>>>>> communication breakpoint event."); >>>>>> ?? 206??????????????????????????????? bpCount++; >>>>>> ?? 207 eventHandler.notifyAll(); >>>>>> ?? 
208??????????????????????????? } >>>>>> ?? 209??????????????????????????? return true; >>>>>> ?? 210??????????????????????? } >>>>>> ?? 211??????????????????????? return false; >>>>>> ?? 212???????????????????? } >>>>>> ?? 213???????????????? } >>>>>> ?? 214??????????? ); >>>>>> ?? 215??????? } >>>>>> >>>>>> >>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>>>>> >>>>>> ... >>>>>> ?? 140??????????????????? display("......--> vm.suspend();"); >>>>>> ?? 141??????????????????? vm.suspend(); >>>>>> ?? 142 >>>>>> ?? 143??????????????????? display("??????? getting : Map>>>>> Integer> suspendsCounts1"); >>>>>> ?? 144 >>>>>> ?? 145??????????????????? Map suspendsCounts1 = >>>>>> new HashMap(); >>>>>> ?? 146??????????????????? for (ThreadReference threadReference : >>>>>> vm.allThreads()) { >>>>>> ?? 147 suspendsCounts1.put(threadReference.name(), >>>>>> threadReference.suspendCount()); >>>>>> ?? 148??????????????????? } >>>>>> ?? 149 display(suspendsCounts1.toString()); >>>>>> ?? 150 >>>>>> ?? 151??????????????????? display(" eventSet.resume;"); >>>>>> ?? 152??????????????????? eventSet.resume(); >>>>>> ?? 153 >>>>>> ?? 154??????????????????? display("??????? getting : Map>>>>> Integer> suspendsCounts2"); >>>>>> >>>>>> This is where the breakpoint is encountered before the second set >>>>>> of suspend counts is acquired. >>>>>> >>>>>> ?? 155??????????????????? Map suspendsCounts2 = >>>>>> new HashMap(); >>>>>> ?? 156??????????????????? for (ThreadReference threadReference : >>>>>> vm.allThreads()) { >>>>>> ?? 157 suspendsCounts2.put(threadReference.name(), >>>>>> threadReference.suspendCount()); >>>>>> ?? 158??????????????????? } >>>>>> ?? 
159 display(suspendsCounts2.toString());
>>>>>>
>>>>>
>>>>
>>>
>>
>

From serguei.spitsyn at oracle.com  Thu Jul 19 03:32:27 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Jul 2018 20:32:27 -0700
Subject: RFR(XS): 8207819: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
Message-ID: 

Please, review the fix for sub-task:
   https://bugs.openjdk.java.net/browse/JDK-8207819

The test HeapMonitorStatRateTest.java needs to be problem listed until main bug is fixed
   https://bugs.openjdk.java.net/browse/JDK-8207765

The patch is:

diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
+++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
@@ -81,6 +81,7 @@
 serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
 serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
+serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
 #############################################################################

Thanks,
Serguei

From chris.plummer at oracle.com  Thu Jul 19 03:47:56 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Jul 2018 20:47:56 -0700
Subject: RFR(XS): 8207819: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To: 
References: 
Message-ID: 

Looks good.

Chris

On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for sub-task:
>   https://bugs.openjdk.java.net/browse/JDK-8207819
>
>
> The test HeapMonitorStatRateTest.java needs to be problem listed until
> main bug is fixed
>   https://bugs.openjdk.java.net/browse/JDK-8207765
>
>
> The patch is:
>
> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
> +++ b/test/hotspot/jtreg/ProblemList.txt    
Wed Jul 18 20:27:10 2018 -0700
> @@ -81,6 +81,7 @@
>
>  serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>  serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>
>  #############################################################################
>
>
>
> Thanks,
> Serguei

From serguei.spitsyn at oracle.com  Thu Jul 19 03:49:58 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Jul 2018 20:49:58 -0700
Subject: RFR(XS): 8207819: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To: 
References: 
Message-ID: <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com>

Thanks, Chris!
This meets the Trivial Change policy, so I am pushing it now.

Thanks,
Serguei

On 7/18/18 20:47, Chris Plummer wrote:
> Looks good.
>
> Chris
>
> On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for sub-task:
>>   https://bugs.openjdk.java.net/browse/JDK-8207819
>>
>>
>> The test HeapMonitorStatRateTest.java needs to be problem listed
>> until main bug is fixed
>>   https://bugs.openjdk.java.net/browse/JDK-8207765
>>
>>
>> The patch is:
>>
>> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
>> @@ -81,6 +81,7 @@
>>
>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>>  serviceability/sa/sadebugd/SADebugDTest.java         
8163805 generic-all
>> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>>
>>  #############################################################################
>>
>>
>>
>> Thanks,
>> Serguei
>
>
>

From gary.adams at oracle.com  Thu Jul 19 12:08:12 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Thu, 19 Jul 2018 08:08:12 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner
In-Reply-To: 
References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
Message-ID: <5B507F2C.4080503@oracle.com>

In the successful run below, the sequence "the first acquire thread
suspend counts, resume, and the second acquire thread suspend counts"
is not interrupted by the breakpoint event.

Note that in the failed thread0 case the test thread finishes rapidly.

[2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0
*[2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0*

and in the successful test run, the thread0 run method exits after
thread1 has started.
debugger> :::::: case: # 1 debugger> ......waiting for new ThreadStartEvent : 1 EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 616bc3ae EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae EventHandler> waitForRequestedEventSet: vm.resume called EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD *debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0* Here's a recent mach5 failed log: [2018-01-22T20:33:45.65] # [2018-01-22T20:33:45.65] export TEST_CLEANUP [2018-01-22T20:33:45.65] export SHELL [2018-01-22T20:33:45.65] export DISPLAY [2018-01-22T20:33:45.65] export LIBJSIG_PATH [2018-01-22T20:33:45.65] export TESTBASE [2018-01-22T20:33:45.65] export JAVA_OPTS [2018-01-22T20:33:45.65] export RAS_OPTIONS [2018-01-22T20:33:45.65] export HOME [2018-01-22T20:33:45.65] export LD_LIBRARY_PATH [2018-01-22T20:33:45.65] export CLASSPATH [2018-01-22T20:33:45.65] export TEMP [2018-01-22T20:33:45.65] export TESTED_JAVA_HOME [2018-01-22T20:33:45.65] export BASH_ENV [2018-01-22T20:33:45.65] export PATH [2018-01-22T20:33:45.65] TEST_DEST_DIR="resume008" [2018-01-22T20:33:45.65] # Actual: TEST_DEST_DIR=resume008 [2018-01-22T20:33:45.65] TESTNAME="${test_case_name}" [2018-01-22T20:33:45.65] # Actual: TESTNAME=resume008 [2018-01-22T20:33:45.65] testName="nsk/jdi/EventSet/resume//resume008" [2018-01-22T20:33:45.65] # Actual: testName=nsk/jdi/EventSet/resume//resume008 [2018-01-22T20:33:45.65] TESTDIR="${test_work_dir}" [2018-01-22T20:33:45.65] # Actual: TESTDIR=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008 [2018-01-22T20:33:45.65] testWorkDir="${test_work_dir}/" [2018-01-22T20:33:45.65] # Actual: 
testWorkDir=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/ [2018-01-22T20:33:45.65] export testWorkDir [2018-01-22T20:33:45.65] tlogOutFile="${test_work_dir}/${test_name}.tlog" [2018-01-22T20:33:45.65] # Actual: tlogOutFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.tlog [2018-01-22T20:33:45.65] testErrFile="${test_work_dir}/${test_name}.err" [2018-01-22T20:33:45.65] # Actual: testErrFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.err [2018-01-22T20:33:45.65] EXECUTE_CLASS="${test_name}" [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=resume008 [2018-01-22T20:33:45.66] NSK_STRESS_METASPACE_OPTS="-XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m -Xlog:gc(ASTERISK_SUBST),gc+heap=trace" [2018-01-22T20:33:45.66] # Actual: NSK_STRESS_METASPACE_OPTS=-XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m -Xlog:gc*,gc+heap=trace [2018-01-22T20:33:45.66] export NSK_STRESS_METASPACE_OPTS [2018-01-22T20:33:45.66] EXECUTE_CLASS="nsk.jdi.EventSet.resume.resume008" [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=nsk.jdi.EventSet.resume.resume008 [2018-01-22T20:33:45.66] TEST_ARGS="${JDI_TEST_KEYS} -debugee.vmkeys=${JDI_DEBUGEE_VM_KEYS}" [2018-01-22T20:33:45.66] # Actual: TEST_ARGS=-verbose -arch=linux-amd64 -waittime=5 
-debugee.vmkind=java -transport.address=dynamic -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:45.66] JAVA="${TESTED_JAVA_HOME}/bin/${DEBUGGER_KIND_OF_JAVA}" [2018-01-22T20:33:45.66] # Actual: JAVA=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java [2018-01-22T20:33:45.66] JAVA_OPTS="${DEBUGGER_JAVA_OPTS}" [2018-01-22T20:33:45.66] # Actual: JAVA_OPTS= [2018-01-22T20:33:45.66] APPLICATION_TIMEOUT="${TIMEOUT}" [2018-01-22T20:33:45.66] # Actual: APPLICATION_TIMEOUT=30 [2018-01-22T20:33:45.66] CLASSPATH="${test_work_dir}${PS}${CLASSPATH}" [2018-01-22T20:33:45.66] # Actual: CLASSPATH=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008:/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.test/hotspot/closed/tonga/bin/classes: [2018-01-22T20:33:45.66] export CLASSPATH [2018-01-22T20:33:45.66] ${JAVA} ${JAVA_OPTS} ${EXECUTE_CLASS} ${TEST_ARGS} [2018-01-22T20:33:45.66] # Actual: /scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java nsk.jdi.EventSet.resume.resume008 -verbose -arch=linux-amd64 -waittime=5 -debugee.vmkind=java -transport.address=dynamic -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.01] binder> VirtualMachineManager: version 9.0 [2018-01-22T20:33:46.05] binder> Finding connector: default [2018-01-22T20:33:46.05] binder> LaunchingConnector: [2018-01-22T20:33:46.06] binder> name: com.sun.jdi.CommandLineLaunch [2018-01-22T20:33:46.06] binder> description: Launches target using Sun Java VM command line and attaches to it [2018-01-22T20:33:46.06] binder> transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 457e2f02 [2018-01-22T20:33:46.19] 
binder> Connector arguments: [2018-01-22T20:33:46.19] binder> home=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10 [2018-01-22T20:33:46.19] binder> vmexec=java [2018-01-22T20:33:46.19] binder> options=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.20] binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" "-arch=linux-amd64" "-waittime=5" "-debugee.vmkind=java" "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=12.5" "-pipe.port=28038" [2018-01-22T20:33:46.20] binder> quote=" [2018-01-22T20:33:46.20] binder> suspend=true [2018-01-22T20:33:46.20] binder> Launching debugee [2018-01-22T20:33:46.56] binder> Waiting for VM initialized [2018-01-22T20:33:46.60] Initial VMStartEvent received: VMStartEvent in thread main [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 1e7c7811 [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 1a3869f4 [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 77f99a05 [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 4d3167f4 [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: enabling remove of listener nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.62] EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: vm.resume called [2018-01-22T20:33:46.67] EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.68] EventHandler> Event: ClassPrepareEventImpl req class prepare request (enabled) [2018-01-22T20:33:46.69] EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent in thread main) for request(class prepare request (enabled)) [2018-01-22T20:33:46.69] EventHandler> Removing listener 
nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.69] debugger> Received ClassPrepareEvent for debuggee class: nsk.jdi.EventSet.resume.resume008a [2018-01-22T20:33:46.71] binder> Breakpoint set: [2018-01-22T20:33:46.71] breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (disabled) [2018-01-22T20:33:46.71] EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 43738a82 [2018-01-22T20:33:46.71] debugger> TESTING BEGINS [2018-01-22T20:33:46.71] debugger> RESUME DEBUGGEE VM [2018-01-22T20:33:46.72] debugger> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.72] debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. [2018-01-22T20:33:46.84] EventHandler> Received event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.84] EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled) [2018-01-22T20:33:46.84] debugger> Received communication breakpoint event. [2018-01-22T20:33:46.84] debugger> shouldRunAfterBreakpoint: received breakpoint event. [2018-01-22T20:33:46.84] debugee.stderr> **> debuggee: debuggee started! [2018-01-22T20:33:46.85] debugger> shouldRunAfterBreakpoint: exited with true. 
[2018-01-22T20:33:46.85] debugger> :::::: case: # 0 [2018-01-22T20:33:46.85] debugger> ......waiting for new ThreadStartEvent : 0 [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.85] EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: vm.resume called [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0 [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0 [2018-01-22T20:33:46.86] EventHandler> Received event set with policy = SUSPEND_NONE [2018-01-22T20:33:46.86] EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled) [2018-01-22T20:33:46.86] EventHandler> Event: ThreadStartEventImpl req thread start request (enabled) [2018-01-22T20:33:46.86] EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.86] debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest1 [2018-01-22T20:33:46.86] debugger> ......checking up on EventSet.resume() [2018-01-22T20:33:46.86] debugger> ......--> vm.suspend(); [2018-01-22T20:33:46.87] debugger> getting : Map suspendsCounts1 [2018-01-22T20:33:46.87] debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.87] debugger> eventSet.resume; [2018-01-22T20:33:46.87] debugger> getting : Map suspendsCounts2 [2018-01-22T20:33:46.87] EventHandler> Received event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.87] EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled) [2018-01-22T20:33:46.87] debugger> Received communication breakpoint event. 
[2018-01-22T20:33:46.87] debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2} [2018-01-22T20:33:46.87] debugger> getting : int policy = eventSet.suspendPolicy(); [2018-01-22T20:33:46.87] debugger> case SUSPEND_NONE [2018-01-22T20:33:46.87] debugger> checking Reference Handler [2018-01-22T20:33:46.87] # ERROR: debugger> ERROR: suspendCounts don't match for : Reference Handler [2018-01-22T20:33:46.88] The following stacktrace is for Aurora. Used to create a RULE: [2018-01-22T20:33:46.88] nsk.share.TestFailure: debugger> ERROR: suspendCounts don't match for : Reference Handler [2018-01-22T20:33:46.88] at nsk.share.Log.logExceptionForAurora(Log.java:411) [2018-01-22T20:33:46.88] at nsk.share.Log.complain(Log.java:380) [2018-01-22T20:33:46.88] at nsk.share.jdi.TestDebuggerType1.complain(TestDebuggerType1.java:63) [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.testRun(resume008.java:163) [2018-01-22T20:33:46.88] at nsk.share.jdi.TestDebuggerType1.runThis(TestDebuggerType1.java:104) [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.run(resume008.java:62) [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.main(resume008.java:57) [2018-01-22T20:33:46.88] # ERROR: debugger> before resuming : 1 [2018-01-22T20:33:46.88] # ERROR: debugger> after resuming : 2 [2018-01-22T20:33:46.88] debugger> ......--> vm.resume() [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: received breakpoint event. [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: exited with true. 
[2018-01-22T20:33:46.88] debugger> :::::: case: # 1 [2018-01-22T20:33:46.88] debugger> ......waiting for new ThreadStartEvent : 1 [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: vm.resume called [2018-01-22T20:33:46.88] EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled) [2018-01-22T20:33:46.88] EventHandler> Event: ThreadStartEventImpl req thread start request (enabled) [2018-01-22T20:33:46.88] EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest2 [2018-01-22T20:33:46.88] debugger> ......checking up on EventSet.resume() [2018-01-22T20:33:46.88] debugger> ......--> vm.suspend(); [2018-01-22T20:33:46.88] debugger> getting : Map suspendsCounts1 [2018-01-22T20:33:46.89] debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.89] debugger> eventSet.resume; [2018-01-22T20:33:46.89] debugger> getting : Map suspendsCounts2 [2018-01-22T20:33:46.89] debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.89] debugger> getting : int policy = eventSet.suspendPolicy(); [2018-01-22T20:33:46.89] debugger> case SUSPEND_THREAD [2018-01-22T20:33:46.89] debugger> checking Reference Handler [2018-01-22T20:33:46.89] debugger> checking thread1 [2018-01-22T20:33:46.89] debugger> checking Common-Cleaner [2018-01-22T20:33:46.89] debugger> checking main [2018-01-22T20:33:46.90] debugger> checking Signal Dispatcher [2018-01-22T20:33:46.90] debugger> 
checking Finalizer [2018-01-22T20:33:46.90] debugger> ......--> vm.resume() [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. [2018-01-22T20:33:46.90] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 [2018-01-22T20:33:46.90] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread1 [2018-01-22T20:33:46.90] EventHandler> Received event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled) [2018-01-22T20:33:46.90] debugger> Received communication breakpoint event. [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: received breakpoint event. [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.90] debugger> :::::: case: # 2 [2018-01-22T20:33:46.90] debugger> ......waiting for new ThreadStartEvent : 2 [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: vm.resume called [2018-01-22T20:33:46.90] EventHandler> Received event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled) [2018-01-22T20:33:46.90] EventHandler> Event: ThreadStartEventImpl req thread start request (enabled) [2018-01-22T20:33:46.90] EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest3 [2018-01-22T20:33:46.90] debugger> ......checking up on EventSet.resume() [2018-01-22T20:33:46.90] debugger> ......--> vm.suspend(); 
[2018-01-22T20:33:46.90] debugger> getting : Map suspendsCounts1 [2018-01-22T20:33:46.91] debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2} [2018-01-22T20:33:46.91] debugger> eventSet.resume; [2018-01-22T20:33:46.91] debugger> getting : Map suspendsCounts2 [2018-01-22T20:33:46.91] debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.91] debugger> getting : int policy = eventSet.suspendPolicy(); [2018-01-22T20:33:46.91] debugger> case SUSPEND_ALL [2018-01-22T20:33:46.91] debugger> checking Reference Handler [2018-01-22T20:33:46.91] debugger> checking thread2 [2018-01-22T20:33:46.91] debugger> checking Common-Cleaner [2018-01-22T20:33:46.91] debugger> checking main [2018-01-22T20:33:46.91] debugger> checking Signal Dispatcher [2018-01-22T20:33:46.91] debugger> checking Finalizer [2018-01-22T20:33:46.91] debugger> ......--> vm.resume() [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread2 [2018-01-22T20:33:46.91] EventHandler> Received event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.91] EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled) [2018-01-22T20:33:46.91] debugger> Received communication breakpoint event. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: received breakpoint event. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: received instruction from debuggee to finish. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: exited with false. 
[2018-01-22T20:33:46.91] debugger> TESTING ENDS [2018-01-22T20:33:46.91] debugger> Waiting for debuggee's exit... [2018-01-22T20:33:46.91] EventHandler> waitForVMDisconnect [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: debuggee exits [2018-01-22T20:33:46.92] EventHandler> Received event set with policy = SUSPEND_NONE [2018-01-22T20:33:46.92] EventHandler> Event: VMDeathEventImpl req null [2018-01-22T20:33:46.92] EventHandler> receieved VMDeath [2018-01-22T20:33:46.92] EventHandler> Removing listener nsk.share.jdi.EventHandler$3 at 77f99a05 [2018-01-22T20:33:47.25] EventHandler> Received event set with policy = SUSPEND_NONE [2018-01-22T20:33:47.25] EventHandler> Event: VMDisconnectEventImpl req null [2018-01-22T20:33:47.25] EventHandler> receieved VMDisconnect [2018-01-22T20:33:47.25] EventHandler> Removing listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 [2018-01-22T20:33:47.25] EventHandler> finished [2018-01-22T20:33:47.25] EventHandler> waitForVMDisconnect: done [2018-01-22T20:33:47.25] debugger> Event handler thread exited. [2018-01-22T20:33:47.25] debugger> Debuggee PASSED. 
[2018-01-22T20:33:47.26] [2018-01-22T20:33:47.26] [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] #> SUMMARY: Following errors occured [2018-01-22T20:33:47.26] #> during test execution: [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] # ERROR: debugger> ERROR: suspendCounts don't match for : Reference Handler [2018-01-22T20:33:47.26] # ERROR: debugger> before resuming : 1 [2018-01-22T20:33:47.26] # ERROR: debugger> after resuming : 2 [2018-01-22T20:33:47.27] # Test level exit status: 97 Here's a recent passed log from a local run: ----------System.out:(164/9808)---------- run [nsk.jdi.EventSet.resume.resume008, -verbose, -arch=linux-x64, -waittime=5, -debugee.vmkind=java, -transport.address=dynamic, -debugee.vmkeys=-XX:MaxRAMPercentage=2 ] binder> VirtualMachineManager: version 11.0 binder> Finding connector: default binder> LaunchingConnector: binder> name: com.sun.jdi.CommandLineLaunch binder> description: Launches target using Sun Java VM command line and attaches to it binder> transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 749dec1a binder> Connector arguments: binder> home=/export/users/gradams/ws/jdk-jdk/build/linux-x64/images/jdk binder> vmexec=java binder> options=-XX:MaxRAMPercentage=2 binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" "-arch=linux-x64" "-waittime=5" "-debugee.vmkind=java" "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=2 " "-pipe.port=35940" binder> quote=" binder> suspend=true binder> Launching debugee binder> Waiting for VM initialized Initial VMStartEvent received: VMStartEvent in thread main EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 2ab41d39 EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 2e3cb1e2 EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 57f20df9 EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 6e72e291 EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 5889e23e EventHandler> waitForRequestedEvent: enabling 
remove of listener nsk.share.jdi.EventHandler$6 at 46dcda7f EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 46dcda7f EventHandler> waitForRequestedEvent: vm.resume called EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD EventHandler> Event: ClassPrepareEventImpl req class prepare request (enabled) EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent in thread main) for request(class prepare request (enabled)) EventHandler> Removing listener nsk.share.jdi.EventHandler$6 at 46dcda7f debugger> Received ClassPrepareEvent for debuggee class: nsk.jdi.EventSet.resume.resume008a binder> Breakpoint set: breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (disabled) EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 322c2a05 debugger> TESTING BEGINS debugger> RESUME DEBUGGEE VM debugger> shouldRunAfterBreakpoint: entered debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. debugee.stderr> **> debuggee: debuggee started! EventHandler> Received event set with policy = SUSPEND_ALL EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (enabled) debugger> Received communication breakpoint event. debugger> shouldRunAfterBreakpoint: received breakpoint event. debugger> shouldRunAfterBreakpoint: exited with true. 
debugger> :::::: case: # 0 debugger> ......waiting for new ThreadStartEvent : 0 EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 78aa490d EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 78aa490d EventHandler> waitForRequestedEventSet: vm.resume called EventHandler> Received event set with policy = SUSPEND_NONE debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0 EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled) EventHandler> Event: ThreadStartEventImpl req thread start request (enabled) EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 78aa490d EventHandler> Received event set with policy = SUSPEND_ALL EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (enabled) debugger> Received communication breakpoint event. debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest1 debugger> ......checking up on EventSet.resume() debugger> ......--> vm.suspend(); debugger> getting : Map suspendsCounts1 debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2} debugger> eventSet.resume; debugger> getting : Map suspendsCounts2 debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2} debugger> getting : int policy = eventSet.suspendPolicy(); debugger> case SUSPEND_NONE debugger> checking Reference Handler debugger> checking thread0 debugger> checking Common-Cleaner debugger> checking main debugger> checking Signal Dispatcher debugger> checking Finalizer debugger> ......--> vm.resume() debugger> shouldRunAfterBreakpoint: entered debugger> shouldRunAfterBreakpoint: received breakpoint event. debugger> shouldRunAfterBreakpoint: exited with true. 
debugger> :::::: case: # 1 debugger> ......waiting for new ThreadStartEvent : 1 EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 616bc3ae EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae EventHandler> waitForRequestedEventSet: vm.resume called EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0 EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled) EventHandler> Event: ThreadStartEventImpl req thread start request (enabled) EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 616bc3ae debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest2 debugger> ......checking up on EventSet.resume() debugger> ......--> vm.suspend(); debugger> getting : Map suspendsCounts1 debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} debugger> eventSet.resume; debugger> getting : Map suspendsCounts2 debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} debugger> getting : int policy = eventSet.suspendPolicy(); debugger> case SUSPEND_THREAD debugger> checking Reference Handler debugger> checking thread1 debugger> checking Common-Cleaner debugger> checking main debugger> checking Signal Dispatcher debugger> checking Finalizer debugger> ......--> vm.resume() debugger> shouldRunAfterBreakpoint: entered debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 debugee.stderr> **> debuggee: 'run': exit :: threadName == thread1 EventHandler> Received event set with policy = SUSPEND_ALL EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (enabled) debugger> Received communication breakpoint event. 
debugger> shouldRunAfterBreakpoint: received breakpoint event. debugger> shouldRunAfterBreakpoint: exited with true. debugger> :::::: case: # 2 debugger> ......waiting for new ThreadStartEvent : 2 EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 44e265ef EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 44e265ef EventHandler> waitForRequestedEventSet: vm.resume called EventHandler> Received event set with policy = SUSPEND_ALL EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled) EventHandler> Event: ThreadStartEventImpl req thread start request (enabled) EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 44e265ef debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest3 debugger> ......checking up on EventSet.resume() debugger> ......--> vm.suspend(); debugger> getting : Map suspendsCounts1 debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2} debugger> eventSet.resume; debugger> getting : Map suspendsCounts2 debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} debugger> getting : int policy = eventSet.suspendPolicy(); debugger> case SUSPEND_ALL debugger> checking Reference Handler debugger> checking thread2 debugger> checking Common-Cleaner debugger> checking main debugger> checking Signal Dispatcher debugger> checking Finalizer debugger> ......--> vm.resume() debugger> shouldRunAfterBreakpoint: entered debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. 
debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 debugee.stderr> **> debuggee: 'run': exit :: threadName == thread2 EventHandler> Received event set with policy = SUSPEND_ALL EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (enabled) debugger> Received communication breakpoint event. debugger> shouldRunAfterBreakpoint: received breakpoint event. debugger> shouldRunAfterBreakpoint: received instruction from debuggee to finish. debugger> shouldRunAfterBreakpoint: exited with false. debugger> TESTING ENDS debugger> Waiting for debuggee's exit... debugee.stderr> **> debuggee: debuggee exits EventHandler> waitForVMDisconnect EventHandler> Received event set with policy = SUSPEND_NONE EventHandler> Event: VMDeathEventImpl req null EventHandler> receieved VMDeath EventHandler> Removing listener nsk.share.jdi.EventHandler$3 at 57f20df9 EventHandler> Received event set with policy = SUSPEND_NONE EventHandler> Event: VMDisconnectEventImpl req null EventHandler> receieved VMDisconnect EventHandler> Removing listener nsk.share.jdi.EventHandler$4 at 6e72e291 EventHandler> finished EventHandler> waitForVMDisconnect: done debugger> Event handler thread exited. debugger> Debuggee PASSED. On 7/18/18, 6:09 PM, gary.adams at oracle.com wrote: > On 7/18/18 4:47 PM, Chris Plummer wrote: >> Hi Gary >> >> Ok, so shouldRunAfterBreakpoint() is the code that does the >> eventHandler.wait(), so it gets the eventHandler.notifyAll() >> notification from the BreakpointEvent handler. >> >> And as a side note, I see now that resumption of execution after the >> breakpoint at main() is done by: >> >> // after waitForClassPrepared() main debuggee thread is >> suspended, resume it before test start >> display("RESUME DEBUGGEE VM"); >> vm.resume(); >> >> testRun(); >> >> shouldRunAfterBreakpoint() is returning true until the end of the >> test when the debuggee executes "instruction = end".
That's why >> runTests() does a "break" when shouldRunAfterBreakpoint() returns >> false. So this means the code that is checking >> shouldRunAfterBreakpoint() is not resuming execution for the first >> few (probably 3) methodForCommunication() breakpoints. However, it >> does make sure that runTests() blocks until the BreakPointEvent has >> been processed. >> >> You point out the vm.resume() at the bottom of the loop in >> runTests(), but that's only after a bunch of ThreadStartEvent >> processing above it has been done already. The ThreadStartEvent would >> never get generated if there was not a resume some point earlier. I >> think it is happening during the >> eventHandler.waitForRequestedEventSet() call, which does a vm.resume(). >> >> So if I understand the order of things now: >> >> -shouldRunAfterBreakpoint() returns after first >> methodForCommunication() is hit. At this point we know the first >> thread has been created, but no attempt to start it yet. The debuggee >> is suspended at this point. >> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also >> does a vm.resume(). >> -The debuggee starts the thread and then does another >> methodForCommunication() (this 2nd one is actually after the 2nd >> thread has been created, but not yet started). Now we have a race. Do >> we get the ThreadStartEvent first or the BreakpointEvent. This is >> because when the ThreadStartEvent is generated, the thread is not >> suspended due to SUSPEND_NONE. Even if the ThreadStartEvent comes in >> first, the async handling of the BreakpointEvent can cause problems >> during the ThreadStartEvent processing. > Based on the failed log in the bug report, the thread start event is > observed, > the suspend counts acquired, then after the resume, the breakpoint > message > is displayed and the second set of suspend counts acquired. > > I can show you the passed and failed logs tomorrow. 
>> -You added a 100ms delay after the thread has started, but before >> methodForCommunication(), hoping it will make it so the >> ThreadStartEvent can be received and fully processed before the >> BreakpointEvent is. > The delay is mostly just a yield so the debugger gets a chance to run. >> >> I think it would be preferable to fix this by doing better >> synchronization. After all, that is the approach the test originally >> took. It could have been written with a bunch of sleep() delays >> instead, but that in general is not a very good approach. >> >> What if you added a shouldRunAfterBreakpoint() call after the >> ThreadStartEvent arrives. At this point you would know that the vm is >> suspended due to the breakpoint, so no need for: >> >> display("......checking up on EventSet.resume()"); >> display("......--> vm.suspend();"); >> vm.suspend(); > I think the suspend is intentional to capture the suspend counts. > It also needs to resume the vm and acquire again so it can confirm the > correct > suspend count behaviors. > If the test waits to capture the second set of suspend counts, the > breakpoint > causes incorrect values. > > ... >> >> You might then also need to add another methodForCommunication() call >> at the end of case 0 and 1 in the debuggee, although I think you >> could instead just change the shouldRunAfterBreakpoint() at the start >> of the loop. I think that check actually belongs at the end of the >> loop, and only for case 2. In fact it would be an error if >> shouldRunAfterBreakpoint() did not return true in that case. Then you >> also need to add a shouldRunAfterBreakpoint() at the start of case 0 >> to get things rolling (and I think at the start of case 1 also). >> >> Chris >> >> >> On 7/18/18 12:45 PM, Gary Adams wrote: >>> Answers below ... >>> >>> On 7/18/18, 2:50 PM, Chris Plummer wrote: >>>> Hi Gary, >>>> >>>> Who does the resume for the breakpoint event?
>>>> >>>> eventHandler.addListener( >>>> new EventHandler.EventListener() { >>>> public boolean eventReceived(Event event) { >>>> if (event instanceof BreakpointEvent && >>>> bpRequest.equals(event.request())) { >>>> synchronized(eventHandler) { >>>> display("Received communication >>>> breakpoint event."); >>>> bpCount++; >>>> eventHandler.notifyAll(); >>>> } >>>> return true; >>>> } >>>> return false; >>>> } >>>> } >>>> ); >>> I believe you are looking for this sequence. >>> At the top of the loop a check is made if >>> resume() should be called "shouldRunAfterBreakpoint". >>> lines 96-99 is an early termination. And at the >>> bottom of the loop, line 240, is the normal >>> continue the test to the next case. >>> >>> resume008.java : >>> ... >>> 94 for (int i = 0; ; i++) { >>> 95 >>> >>> 96 if (!shouldRunAfterBreakpoint()) { >>> 97 vm.resume(); >>> 98 break; >>> 99 } >>> >>> 100 >>> 101 >>> 102 display(":::::: case: # " + i); >>> 103 >>> 104 switch (i) { >>> 105 >>> 106 case 0: >>> 107 eventRequest = settingThreadStartRequest ( >>> 108 SUSPEND_NONE, >>> "ThreadStartRequest1"); >>> ... >>> 238 >>> 239 display("......--> vm.resume()"); >>> 240 vm.resume(); >>> 241 } >>>> >>>> Also: >>>> >>>>> 1. On a thread start event the debugee is suspended, line 141 >>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE >>>> was used. >>> The thread start event is set to SUSPEND_NONE for thread0, but when >>> the thread start event is observed the resume008 test suspends the vm >>> immediately after fetching the "number" property. >> My point is that the Debuggee continues to run after the >> ThreadStartEvent is sent, and relies on the debugger to stop it after >> receiving the event. But in the meantime the debuggee has advanced to >> the next breakpoint, but only sometimes, thus the bug you are seeing. 
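For illustration only, here is a minimal in-process sketch of the difference between hoping a delay is long enough and waiting for an explicit signal. It is not the test's code. The real debugger and debuggee run in separate VMs and synchronize through the methodForCommunication() breakpoint, so a CountDownLatch could not be used there directly. The class name HandshakeSketch and the labels "sample1", "sample2" and "breakpoint" are hypothetical stand-ins for the two suspend-count samples and the debuggee reaching its next breakpoint.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// In-process analogy only: two threads stand in for debugger and debuggee.
final class HandshakeSketch {

    // Returns the order in which the three steps were recorded.
    static String run() {
        List<String> order = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch samplesTaken = new CountDownLatch(1);

        // Stand-in for the debuggee: it may not advance to its next
        // "breakpoint" until the sampling side says it is done.
        Thread worker = new Thread(() -> {
            try {
                if (samplesTaken.await(5, TimeUnit.SECONDS)) {
                    order.add("breakpoint");
                } else {
                    order.add("timeout");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "thread0");
        worker.start();

        // Stand-in for the debugger: take both samples, then release.
        order.add("sample1");
        order.add("sample2");
        samplesTaken.countDown();

        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return String.join(",", order);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With the latch, the "breakpoint" step cannot be recorded before both samples, regardless of scheduling. A fixed sleep in the worker would only make that ordering likely.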
>>> >>> 132 if ( !(newEvent instanceof ThreadStartEvent)) { >>> 133 setFailedStatus("ERROR: new event is not >>> ThreadStartEvent"); >>> 134 } else { >>> 135 >>> 136 String property = (String) >>> newEvent.request().getProperty("number"); >>> 137 display(" got new ThreadStartEvent >>> with propety 'number' == " + property); >>> 138 >>> 139 display("......checking up on >>> EventSet.resume()"); >>> 140 display("......--> vm.suspend();"); >>> 141 vm.suspend(); >>> >>> >>>> >>>> Chris >>>> >>>> On 7/18/18 4:52 AM, Gary Adams wrote: >>>>> There is nothing wrong with the breakpoint in methodForCommunication. >>>>> The test uses it to make sure the threads are each tested separately. >>>>> The breakpoint eventhandler just displays a message, increments a >>>>> counter >>>>> and returns. >>>>> >>>>> Let me step through resume008a the debugee to help clarify ... >>>>> >>>>> 1. The test thread is created and the synchronized break point is >>>>> observed. lines 101-102 >>>>> 2. The thread is started. lines 104,135-137 >>>>> 2a. The main thread blocks on a local object. lines 133, 139 >>>>> 2b. The test thread is started. lines 137, >>>>> A run entered message is displayed, line 159 >>>>> The main thread lock object is notified, line 167 >>>>> 2b1. The main thread continues. line 167, 146 >>>>> The next test thread is created. line 106 >>>>> The synchronized breakpoint is observed, line 107 >>>>> 2b2. A run exited message is displayed, line 169 >>>>> >>>>> On the resume008 debugger side ... >>>>> 1. On a thread start event the debugee is suspended, line 141 >>>>> 2. Messages are displayed and a first set of thread suspend >>>>> counts is acquired. lines 143-151 >>>>> 3. The threads are resumed, line 152 >>>>> ---> >>>>> 4. Messages are displayed and a second set of thread suspend >>>>> counts is acquired. lines 154-159 >>>>> >>>>> The way the test is written the expectation is the debugger steps >>>>> 2,3,4 will all happen >>>>> while the test thread is running. 
>>>>> >>>>> When the debugger resumes the debuggee threads (debugger step 3) >>>>> the debuggee continues from where it left off (debuggee steps >>>>> 2b,2b1,2b2) >>>>> >>>>> If we complete debuggee step 2b1 (line 107) before the debugger >>>>> completes step 4 (line 159), >>>>> then the synchronized breakpoint will suspend the vm and the >>>>> counts will not match >>>>> for the SUSPEND_NONE test thread start. >>>>> >>>>> resume008a.java: >>>>> >>>>> 100 case 0: >>>>> 101 thread0 = new >>>>> Threadresume008a("thread0"); >>>>> 102 methodForCommunication(); >>>>> 103 >>>>> 104 threadStart(thread0); >>>>> 105 >>>>> 106 thread1 = new >>>>> Threadresume008a("thread1"); >>>>> 107 methodForCommunication(); >>>>> 108 break; >>>>> >>>>> ... >>>>> 135 static int threadStart(Thread t) { >>>>> 136 synchronized (waitnotifyObj) { >>>>> 137 t.start(); >>>>> 138 try { >>>>> 139 waitnotifyObj.wait(); >>>>> 140 } catch ( Exception e) { >>>>> 141 exitCode = FAILED; >>>>> 142 logErr(" Exception : " + e ); >>>>> 143 return FAILED; >>>>> 144 } >>>>> 145 } >>>>> 146 return PASSED; >>>>> 147 } >>>>> >>>>> 149 static class Threadresume008a extends Thread { >>>>> ... >>>>> 157 >>>>> 158 public void run() { >>>>> 159 log1(" 'run': enter :: threadName == " + >>>>> tName); >>>>> >>>>> This is the proposed fix that will let the debugger complete its >>>>> second >>>>> acquisition of suspend counts while the test thread is still running. >>>>> >>>>> 160 // Yield, so the start thread event >>>>> processing can be completed.
>>>>> 161 try { >>>>> 162 Thread.sleep(100); >>>>> 163 } catch (InterruptedException e) { >>>>> 164 // ignored >>>>> 165 } >>>>> >>>>> 166 synchronized (waitnotifyObj) { >>>>> 167 waitnotifyObj.notify(); >>>>> 168 } >>>>> 169 log1(" 'run': exit :: threadName == " + >>>>> tName); >>>>> 170 return; >>>>> 171 } >>>>> 172 } >>>>> 150 >>>>> 151 String tName = null; >>>>> 152 >>>>> 153 public Threadresume008a(String threadName) { >>>>> 154 super(threadName); >>>>> 155 tName = threadName; >>>>> 156 } >>>>> 157 >>>>> 158 public void run() { >>>>> 159 log1(" 'run': enter :: threadName == " + >>>>> tName); >>>>> 160 // Yield, so the start thread event >>>>> processing can be completed. >>>>> 161 try { >>>>> 162 Thread.sleep(100); >>>>> 163 } catch (InterruptedException e) { >>>>> 164 // ignored >>>>> 165 } >>>>> 166 synchronized (waitnotifyObj) { >>>>> 167 waitnotifyObj.notify(); >>>>> 168 } >>>>> 169 log1(" 'run': exit :: threadName == " + >>>>> tName); >>>>> 170 return; >>>>> 171 } >>>>> 172 } >>>>> >>>>> >>>>> >>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote: >>>>>> Hi Gary, >>>>>> >>>>>> I've been having trouble following the control flow of this test. >>>>>> One thing I've stumbled across is the following: >>>>>> >>>>>> /* A debuggee class must define 'methodForCommunication' >>>>>> * method and invoke it in points of synchronization >>>>>> * with a debugger. >>>>>> */ >>>>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >>>>>> >>>>>> So why isn't this mode of synchronization good enough? Is it >>>>>> because it was not designed with the understanding that the >>>>>> debugger might be doing suspended thread counts, and suspending >>>>>> all threads at the breakpoint messes up the test? >>>>>> >>>>>> From what I can tell of the test, after the debuggee is started >>>>>> and hits the default breakpoint at the start of main(), the >>>>>> debugger then does a vm.resume() at the start of the for loop in >>>>>> the runTest() method. 
The debuggee then creates a thread and >>>>>> calls methodForCommunication(). There is already a breakpoint set >>>>>> there by the above debuggee code. It's unclear to me what happens >>>>>> as a result of this breakpoint and how it serves the test. Also >>>>>> unclear to me who is responsible for the vm.resume() after the >>>>>> breakpoint is hit. >>>>>> >>>>>> The debugger then requests all ThreadStart events, requesting >>>>>> that no threads be disabled when it is sent. I think you are >>>>>> saying that when the ThreadStart event comes in, sometimes we are >>>>>> at the methodForCommunication breakpoint, with all threads >>>>>> disabled, and this messes up the thread suspend counts. You want >>>>>> to delay 100ms so the breakpoint event can be processed and >>>>>> threads resumed again (although I can't see who actually resumes >>>>>> the thread after hitting the methodForCommunication breakpoint). >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/17/18 8:33 AM, Gary Adams wrote: >>>>>>> A race condition exists between the debugger and the debuggee. >>>>>>> >>>>>>> The first test thread is started with SUSPEND_NONE policy set. >>>>>>> While processing the thread start event the debugger captures >>>>>>> an initial set of thread suspend counts and resumes the >>>>>>> debuggee vm. If the debuggee advances quickly it reaches >>>>>>> the breakpoint set for methodForCommunication. Since the breakpoint >>>>>>> carries with it SUSPEND_ALL policy, when the debugger captures a >>>>>>> second >>>>>>> set of suspend counts, it will not match the expected counts for >>>>>>> a SUSPEND_NONE scenario. >>>>>>> >>>>>>> The proposed fix introduces a yield in the debuggee test thread >>>>>>> run method >>>>>>> to allow the debugger to get the expected sampled values. 
>>>>>>> >>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>>>>>> Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>>>>>> >>>>>>> >>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>>>>>> ... >>>>>>> 186 private void >>>>>>> setCommunicationBreakpoint(ReferenceType refType, String >>>>>>> methodName) { >>>>>>> 187 Method method = debuggee.methodByName(refType, >>>>>>> methodName); >>>>>>> 188 Location location = null; >>>>>>> 189 try { >>>>>>> 190 location = method.allLineLocations().get(0); >>>>>>> 191 } catch (AbsentInformationException e) { >>>>>>> 192 throw new Failure(e); >>>>>>> 193 } >>>>>>> 194 bpRequest = debuggee.makeBreakpoint(location); >>>>>>> 195 >>>>>>> >>>>>>> 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>>>>>> >>>>>>> 197 bpRequest.putProperty("number", "zero"); >>>>>>> 198 bpRequest.enable(); >>>>>>> 199 >>>>>>> 200 eventHandler.addListener( >>>>>>> 201 new EventHandler.EventListener() { >>>>>>> 202 public boolean eventReceived(Event >>>>>>> event) { >>>>>>> 203 if (event instanceof >>>>>>> BreakpointEvent && bpRequest.equals(event.request())) { >>>>>>> 204 synchronized(eventHandler) { >>>>>>> 205 display("Received >>>>>>> communication breakpoint event."); >>>>>>> 206 bpCount++; >>>>>>> 207 eventHandler.notifyAll(); >>>>>>> 208 } >>>>>>> 209 return true; >>>>>>> 210 } >>>>>>> 211 return false; >>>>>>> 212 } >>>>>>> 213 } >>>>>>> 214 ); >>>>>>> 215 } >>>>>>> >>>>>>> >>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>>>>>> >>>>>>> ... 
>>>>>>> 140 display("......--> vm.suspend();"); >>>>>>> 141 vm.suspend(); >>>>>>> 142 >>>>>>> 143 display(" getting : Map>>>>>> Integer> suspendsCounts1"); >>>>>>> 144 >>>>>>> 145 Map suspendsCounts1 = >>>>>>> new HashMap(); >>>>>>> 146 for (ThreadReference threadReference : >>>>>>> vm.allThreads()) { >>>>>>> 147 suspendsCounts1.put(threadReference.name(), >>>>>>> threadReference.suspendCount()); >>>>>>> 148 } >>>>>>> 149 display(suspendsCounts1.toString()); >>>>>>> 150 >>>>>>> 151 display(" eventSet.resume;"); >>>>>>> 152 eventSet.resume(); >>>>>>> 153 >>>>>>> 154 display(" getting : Map>>>>>> Integer> suspendsCounts2"); >>>>>>> >>>>>>> This is where the breakpoint is encountered before the second >>>>>>> set of suspend counts is acquired. >>>>>>> >>>>>>> 155 Map suspendsCounts2 = >>>>>>> new HashMap(); >>>>>>> 156 for (ThreadReference threadReference : >>>>>>> vm.allThreads()) { >>>>>>> 157 suspendsCounts2.put(threadReference.name(), >>>>>>> threadReference.suspendCount()); >>>>>>> 158 } >>>>>>> 159 display(suspendsCounts2.toString()); >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Thu Jul 19 13:46:52 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 19 Jul 2018 09:46:52 -0400 Subject: RFR(XS): 8207819: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java In-Reply-To: <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com> References: <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com> Message-ID: JDK-8207765 covers two different tests as of yesterday: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java and serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java I updated it to add a similar failure mode sighting for HeapMonitorStatIntervalTest.java Dan On 7/18/18 11:49 PM, serguei.spitsyn at oracle.com wrote: > Thanks, Chris! 
> This meets the Trivial Change policy, so that pushing now.
>
> Thanks,
> Serguei
>
>
> On 7/18/18 20:47, Chris Plummer wrote:
>> Looks good.
>>
>> Chris
>>
>> On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for sub-task:
>>>   https://bugs.openjdk.java.net/browse/JDK-8207819
>>>
>>>
>>> The test HeapMonitorStatRateTest.java needs to be problem listed
>>> until main bug is fixed
>>>   https://bugs.openjdk.java.net/browse/JDK-8207765
>>>
>>>
>>> The patch is:
>>>
>>> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
>>> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
>>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
>>> @@ -81,6 +81,7 @@
>>>
>>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>>>  serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
>>> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>>>
>>>  #############################################################################
>>>
>>>
>>>
>>> Thanks,
>>> Serguei
>>
>>
>>
>
>
From serguei.spitsyn at oracle.com Thu Jul 19 13:55:28 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 06:55:28 -0700
Subject: RFR(XS): 8207819: Problem list serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To:
References: <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com>
Message-ID: <27fd8f83-682a-738d-8203-cd20e8bf2556@oracle.com>

Hi Dan,

Thank you, Dan.
I've just discovered the same in the recent mach5 test results.
Sorry for overlooking it.
Will need another sub-task for this now.

Thanks,
Serguei

On 7/19/18 06:46, Daniel D.
Daugherty wrote:
> JDK-8207765 covers two different tests as of yesterday:
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>
> and
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>
>
> I updated it to add a similar failure mode sighting for
> HeapMonitorStatIntervalTest.java
>
> Dan
>
>
> On 7/18/18 11:49 PM, serguei.spitsyn at oracle.com wrote:
>> Thanks, Chris!
>> This meets the Trivial Change policy, so that pushing now.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/18/18 20:47, Chris Plummer wrote:
>>> Looks good.
>>>
>>> Chris
>>>
>>> On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for sub-task:
>>>>   https://bugs.openjdk.java.net/browse/JDK-8207819
>>>>
>>>>
>>>> The test HeapMonitorStatRateTest.java needs to be problem listed
>>>> until main bug is fixed
>>>>   https://bugs.openjdk.java.net/browse/JDK-8207765
>>>>
>>>>
>>>> The patch is:
>>>>
>>>> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
>>>> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
>>>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
>>>> @@ -81,6 +81,7 @@
>>>>
>>>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>>>>  serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
>>>> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>>>>
>>>>  #############################################################################
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>
>>>
>>>
>>
>>
>
From yasuenag at gmail.com Thu Jul 19 14:03:24 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 19 Jul 2018 23:03:24 +0900
Subject: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is working
Message-ID: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com>

Hi all,

Please review this webrev.
JBS: https://bugs.openjdk.java.net/browse/JDK-8207843 webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ I encountered AssertionFailure when I attached HSDB to the process which is working with ZGC as below: sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32) at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448) at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173) at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741) at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70) at java.base/java.lang.Thread.run(Thread.java:832) ObjectHeap#collectLiveRegions() branches by instance type of CollectedHeap. However it does not support ZCollectedHeap. So I add ZCollectedHeap to it and add some methods to iterate ZPageTable. Thanks, Yasumasa From erik.helin at oracle.com Thu Jul 19 14:57:25 2018 From: erik.helin at oracle.com (Erik Helin) Date: Thu, 19 Jul 2018 16:57:25 +0200 Subject: committed > max in MemoryMXBean#getHeapMemoryUsage() In-Reply-To: References: Message-ID: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com> On 07/13/2018 04:10 PM, Daniel Mitterdorfer wrote: > Hi, > > I have good news. I was able to reproduce this issue but this time I > have logs. 
A test failed with the following stack trace around
> 15:06:55 with:
>
> java.lang.IllegalArgumentException: committed = 537919488 should be < max = 536870912
> > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
> > at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
> > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
> > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)
>
> This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10
> (build 10+46). The JVM arguments were:
>
> -Xms512M -Xmx512M
> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
>
> The logs are somewhat massive (~250MB uncompressed) and available at
> https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0

Thanks for the logs Daniel, they helped a lot! Thomas and I looked through the logs and the code and, as we suspected, this code is a bit buggy :/ Please see the bug for more details:

https://bugs.openjdk.java.net/browse/JDK-8207200

Again, thanks for taking your time and reporting this issue and for getting us the logs, much appreciated!

Erik

> I hope that helps identifying the cause. Please let me know if you
> need anything else.
>
> Daniel
> On Fri, 13 Jul 2018 at 10:33, Thomas Schatzl wrote:
>>
>> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote:
>>> Hi Erik,
>>>>
>>>> Do you have any kind of GC logging from the test run where you
>>>> encountered the bug?
>>>
>>> Unfortunately, we don't have GC logging enabled by default in our
>>> test suite so the exception trace is all I got. I am now repeatedly
>>> running the test suite with the original flags (-Xms512M -Xmx512M)
>>> and also added the following logging configuration:
>>>
>>> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
>>>
>>> As soon as I get another failure, I'll provide the full log file.
>>> Please let me know if you need any other logs (i.e. whether I should
>>> adjust my log configuration).
>>
>> I think these flags are fine.
>>
>> Since Erik and me strongly believe the issue is with the relevant G1
>> code Erik mentioned we will reassign the bug to us (he said there is
>> already a bug reported on it).
>>
>> Thanks a lot,
>> Thomas
>>
From bob.vandette at oracle.com Thu Jul 19 15:34:33 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Thu, 19 Jul 2018 11:34:33 -0400
Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems without cpuset.effective_cpus / cpuset.effective_mem
In-Reply-To: <598c9af3-2041-0be5-f177-b3a031b2ef61@oracle.com>
References: <598c9af3-2041-0be5-f177-b3a031b2ef61@oracle.com>
Message-ID: <9766EFD8-9220-4AC1-A5D2-71A8F9568FC4@oracle.com>

> On Jul 17, 2018, at 8:07 PM, mandy chung wrote:
>
>
>
> On 7/17/18 7:00 AM, Bob Vandette wrote:
>> Please review this fix which eliminates some docker/cgroup test failures when running on older
>> Linux kernels with missing cgroup metric files.
>> BUGS:
>> https://bugs.openjdk.java.net/browse/JDK-8206456
>> WEBREV:
>> http://cr.openjdk.java.net/~bobv/8206456/webrev/
>
> Nit: It would be clearer to check for the specific metrics:
>
> int[] cpusets = metrics.getEffectiveCpuSetCpus();
> if (cpusets.length != 0) {
> ....
> }
>
> Same applies to getEffectiveCpuSetMems. No need for a new webrev.

Thanks, I'll do that cleanup.

>
> Mandy
> P.S. I am not sure the conversion from the primitive to boxed type
> is necessary. But this is not related to this issue. You may
> want to take a look at that.

I'll defer this issue to Harsha who wrote these tests since changing that is out of scope for this fix.

Thanks,
Bob.
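
The check suggested in the review above can be sketched as follows. This is a minimal, self-contained illustration and not the code from the webrev: the CpuSetMetrics interface is a hypothetical stand-in for the metrics object the tests use, the check helper and its result strings are invented here, and the sketch assumes (as the suggested `cpusets.length != 0` test implies) that an empty array means the kernel does not expose the metric.

```java
import java.util.Arrays;

/** Sketch: compare a cpuset metric only when the kernel actually reports it. */
final class CpuSetMetricsCheck {
    /** Hypothetical stand-in for the metrics object used by the tests. */
    interface CpuSetMetrics {
        int[] getEffectiveCpuSetCpus();
        int[] getEffectiveCpuSetMems();
    }

    /** Assumption: an empty array means the metric file is missing, so skip the comparison. */
    static String check(String name, int[] expected, int[] reported) {
        if (reported.length == 0) {
            return name + ": skipped";
        }
        return name + (Arrays.equals(expected, reported) ? ": ok" : ": mismatch");
    }

    public static void main(String[] args) {
        // Simulates an older kernel without the effective cpuset files.
        CpuSetMetrics olderKernel = new CpuSetMetrics() {
            public int[] getEffectiveCpuSetCpus() { return new int[0]; }
            public int[] getEffectiveCpuSetMems() { return new int[0]; }
        };
        // Simulates a kernel that reports both metrics.
        CpuSetMetrics newerKernel = new CpuSetMetrics() {
            public int[] getEffectiveCpuSetCpus() { return new int[] {0, 1}; }
            public int[] getEffectiveCpuSetMems() { return new int[] {0}; }
        };

        System.out.println(check("cpus", new int[] {0, 1}, olderKernel.getEffectiveCpuSetCpus()));
        System.out.println(check("mems", new int[] {0}, olderKernel.getEffectiveCpuSetMems()));
        System.out.println(check("cpus", new int[] {0, 1}, newerKernel.getEffectiveCpuSetCpus()));
        System.out.println(check("mems", new int[] {0}, newerKernel.getEffectiveCpuSetMems()));
    }
}
```

Testing each metric by its own length, rather than through one shared flag, keeps the cpus and mems checks independent on kernels that might provide only one of the two files.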
From jcbeyler at google.com Thu Jul 19 16:39:57 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 09:39:57 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
Message-ID:

Hi all,

Could I have a few reviews of:
http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/

The test assumed the size of a 1-element array but ZGC changes that assumption. The test now first allocates a bit of memory and gets the average size of the samples before assuming the size. This works with/without ZGC.

Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8207765

Thanks!
Jc
From daniel.daugherty at oracle.com Thu Jul 19 16:45:14 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 19 Jul 2018 12:45:14 -0400
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To:
References:
Message-ID: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>

JDK-8207765 covers two different tests as of yesterday:

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java

and

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java

I updated it to add a similar failure mode sighting for
HeapMonitorStatIntervalTest.java

Does your fix address both test failures?

Dan

On 7/19/18 12:39 PM, JC Beyler wrote:
> Hi all,
>
> Could I have a few reviews of:
> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>
>
> The test assumed the size of a 1-element array but ZGC changes that
> assumption. The test now first allocates a bit of memory and gets the
> average size of the samples before assuming the size. This works
> with/without ZGC.
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>
> Thanks!
> Jc
From jcbeyler at google.com Thu Jul 19 17:07:06 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 10:07:06 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
Message-ID:

Hi Dan,

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
became
serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
when we updated the spec and said "rate" was the wrong word.

So yes, it fixes both since at some point all branches should see that the StatRate test becomes renamed into the StatInterval test. Does that make sense?

Thanks!
Jc

On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty <
daniel.daugherty at oracle.com> wrote:

> JDK-8207765 covers two different tests as of yesterday:
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>
> and
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>
> I updated it to add a similar failure mode sighting for
> HeapMonitorStatIntervalTest.java
>
>
> Does your fix address both test failures?
>
> Dan
>
>
> On 7/19/18 12:39 PM, JC Beyler wrote:
>
> Hi all,
>
> Could I have a few reviews of:
> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>
> The test assumed the size of a 1-element array but ZGC changes that
> assumption. The test now first allocates a bit of memory and gets the
> average size of the samples before assuming the size. This works
> with/without ZGC.
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>
> Thanks!
> Jc
>
>
>

--

Thanks,
Jc
From jcbeyler at google.com Thu Jul 19 17:08:42 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 10:08:42 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To:
References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
Message-ID:

I forgot to put the link:
https://bugs.openjdk.java.net/browse/JDK-8207763

It got renamed in jdk11 via:
http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f

Thanks!
Jc

On Thu, Jul 19, 2018 at 10:07 AM JC Beyler wrote:

> Hi Dan,
>
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
> became
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
> when we updated the spec and said "rate" was the wrong word.
>
> So yes, it fixes both since at some point all branches should see that the
> StatRate test becomes renamed into the StatInterval test. Does that make
> sense?
>
> Thanks!
> Jc
>
>
> On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty <
> daniel.daugherty at oracle.com> wrote:
>
>> JDK-8207765 covers two different tests as of yesterday:
>>
>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>
>> and
>>
>>
>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>>
>> I updated it to add a similar failure mode sighting for
>> HeapMonitorStatIntervalTest.java
>>
>>
>> Does your fix address both test failures?
>>
>> Dan
>>
>>
>> On 7/19/18 12:39 PM, JC Beyler wrote:
>>
>> Hi all,
>>
>> Could I have a few reviews of:
>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>
>> The test assumed the size of a 1-element array but ZGC changes that
>> assumption. The test now first allocates a bit of memory and gets the
>> average size of the samples before assuming the size. This works
>> with/without ZGC.
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>>
>> Thanks!
>> Jc
>>
>>
>>
>
> --
>
> Thanks,
> Jc
>

--

Thanks,
Jc
From daniel.mitterdorfer at gmail.com Thu Jul 19 17:10:09 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Thu, 19 Jul 2018 19:10:09 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com>
References: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com>
Message-ID:

Hi Erik,

I am quite happy that I could reproduce it after running the tests repeatedly for approximately a week after the first failure. Glad I could help and thank you all for your help as well!

Daniel
On Thu, 19 Jul 2018 at 16:57, Erik Helin wrote:
>
> On 07/13/2018 04:10 PM, Daniel Mitterdorfer wrote:
> > Hi,
> >
> > I have good news. I was able to reproduce this issue but this time I
> > have logs. A test failed with the following stack trace around
> > 15:06:55 with:
> >
> > java.lang.IllegalArgumentException: committed = 537919488 should be < max = 536870912
> > > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
> > > at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
> > > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
> > > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)
> >
> > This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10
> > (build 10+46). The JVM arguments were:
> >
> > -Xms512M -Xmx512M
> > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> >
> > The logs are somewhat massive (~250MB uncompressed) and available at
> > https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0
>
> Thanks for the logs Daniel, they helped a lot!
Me and Thomas looked > through the logs and the code and as we suspected, this is code is a bit > buggy :/ Please see the bug for more details: > > https://bugs.openjdk.java.net/browse/JDK-8207200 > > Again, thanks for taking your time and reporting this issue and for > getting us the logs, much appreciated! > Erik > > > I hope that helps identifying the cause. Please let me know if you > > need anything else. > > > > Daniel > > Am Fr., 13. Juli 2018 um 10:33 Uhr schrieb Thomas Schatzl > > : > >> > >> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote: > >>> Hi Erik, > >>>> > >>>> Do you any kind of GC logging from the test run where you > >>>> encountered the bug? > >>> > >>> Unfortunately, we don't have GC logging enabled by default in our > >>> test suite so the exception trace is all I got. I am now repeatedly > >>> running the test suite with the original flags (-Xms512M -Xmx512M) > >>> and also added the following logging configuration: > >>> > >>> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags > >>> > >>> As soon as I get another failure, I'll provide the full log file. > >>> Please let me know if you need any other logs (i.e. whether I should > >>> adjust my log configuration). > >> > >> I think these flags are fine. > >> > >> Since Erik and me strongly believe the issue is with the relevant G1 > >> code Erik mentioned we will reassign the bug to us (he said there is > >> already a bug reported on it). 
> >> > >> Thanks a lot, > >> Thomas > >> From chris.plummer at oracle.com Thu Jul 19 17:10:59 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 19 Jul 2018 10:10:59 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From daniil.x.titov at oracle.com Thu Jul 19 17:22:57 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 19 Jul 2018 10:22:57 -0700 Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails In-Reply-To: References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com> Message-ID: Hi Chris, This would depend on how the particular test is implemented. The specifics of this particular test are that an event listener for ClassPrepare events was registered and unregistered multiple times and the failure happened when ClassPrepare event from Graal compiler thread was received at the moment when the listener was unregistered. Best regards, Daniil From: Chris Plummer Date: Thursday, July 19, 2018 at 10:11 AM To: Daniil Titov , "serguei.spitsyn at oracle.com" , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails It seems that any test that requests ClassPrepareEvents could be getting unexpected events when graal is enabled. 
Chris On 7/17/18 8:32 PM, Daniil Titov wrote: Hi Serguei, The changes are in the one test class vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java so they affect only this single test. No other tests depend on this class. Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 7:59 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, It looks good to me. Thank you for the update. How many tests are depending on this class? Could we say that all the nsk/jdi/ClassPrepareRequest tests need to be checked that there are no regressions? Thanks, Serguei On 7/17/18 19:06, Daniil Titov wrote: Hi Serguei, Please review a new version of the patch. Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03 Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 Thanks! Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 4:53 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Thank you for clarification and the webrev update! I still have a couple of questions though. 
I'd suggest more simple approach like below: 154 public boolean eventReceived(Event event) { 155 if (event instanceof ClassPrepareEvent) { 156 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event; 157 ThreadReference thread = classPrepareEvent.thread(); 158 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) { 159 eventReceived++; 160 161 log.display("ClassPrepareEventListener: Event received: " + event + 162 " Class: " + classPrepareEvent.referenceType().name()); 163 164 vm.resume(); 165 166 return true; 167 } 168 } 169 170 return false; 171 } to something like: public boolean eventReceived(Event event) { if (event instanceof ClassPrepareEvent) { ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event; ThreadReference thread = classPrepareEvent.thread(); if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) { eventReceived++; log.display("ClassPrepareEventListener: Event received: " + event + " Class: " + classPrepareEvent.referenceType().name()); } else { log.display("ClassPrepareEventListener: Event filtered out: " + event + " Class: " + classPrepareEvent.referenceType().name() + " Thread:" + classPrepareEvent.thread().name()); } vm.resume(); return true; } return false; } 245 eventHandler.startListening(); 246 // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads. 247 // The listener should be added after the event listener is started to ensure that it 248 // called before the default event listener that handles unexpected events. 249 eventHandler.addListener(new DefaultClassPrepareEventListener()); Still unclear why addListener() is invoked after startListening() but not before. It can be that a place add this listener is not right and have to be moved into testSourceFilter(). But I hope this fragment is not needed with the simplified approach. Otherwise, it looks good. 
Thanks, Serguei On 7/17/18 14:55, Daniil Titov wrote: Hi Serguei, The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times calling testSourceFilter() method passing there different parameters. The testSourceFilter() method does the following: 1. creates a ClassPrepareRequest object 2. registers new ClassPrepareEventListener 3. sends a command to debuggee to a load test class 4. waits till the debuggee performed the command 5. removes ClassPrepareEventListener 6. checks if a ClassPrepareEvent was received Upon its start the EventHandler creates a default list of events listeners. The last listener in this list handles unexpected events (that are events not processed by the previous listeners) cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java /** 251 * This method sets up default listeners. 252 */ 253 private void createDefaultListeners() { 254 /** 255 * This listener catches up all unexpected events. 256 * 257 */ 258 addListener( 259 new EventListener() { 260 public boolean eventReceived(Event event) { 261 log.complain("EventHandler> Unexpected event: " + event.getClass().getName()); 262 unexpectedEventCaught = true; 263 return true; 264 } 265 } 266 ); 267 On step 2 above the ClassPrepareEventListener is added at the head of the list of the listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method. With Graal turned on after step 1 the JVMTI compiler thread starts sending class prepare events for classes it compiles. If any of such event is dispatched after step 5 (when ClassPrepareEventListener is removed) there is no any registered listeners to handle it and this event is handled by the "unexpected events listener" (see above) that marks the test as failed. 
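
The dispatch order described above can be sketched as follows. This is a minimal, self-contained illustration and not the real nsk.share.jdi.EventHandler: the class ListenerChainSketch, its string-valued events, and the runScenario helper are hypothetical names introduced here. It assumes only what the message states: new listeners go to the head of the list, a listener that returns true stops further dispatch, and a last-resort listener flags whatever reaches it as unexpected.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for an event handler with a listener chain. */
final class ListenerChainSketch {
    /** Returns true if the event was consumed, which stops further dispatch. */
    interface Listener {
        boolean eventReceived(String event);
    }

    private final List<Listener> listeners = new ArrayList<>();
    private boolean unexpectedEventCaught = false;

    ListenerChainSketch() {
        // Last-resort listener: anything that reaches it counts as unexpected.
        listeners.add(event -> {
            unexpectedEventCaught = true;
            return true;
        });
    }

    /** New listeners go to the head of the list, so they see events first. */
    void addListener(Listener listener) {
        listeners.add(0, listener);
    }

    void removeListener(Listener listener) {
        listeners.remove(listener);
    }

    void dispatch(String event) {
        for (Listener listener : listeners) {
            if (listener.eventReceived(event)) {
                return;
            }
        }
    }

    /**
     * Replays the sequence from the message: register a per-iteration listener,
     * remove it, then deliver a late ClassPrepare event (as a compiler thread might).
     * Returns true if the late event ended up flagged as unexpected.
     */
    static boolean runScenario(boolean withLongLivedListener) {
        ListenerChainSketch handler = new ListenerChainSketch();
        if (withLongLivedListener) {
            // Stays registered across iterations and consumes stray ClassPrepare events.
            handler.addListener(event -> event.equals("ClassPrepare"));
        }
        Listener perIteration = event -> event.equals("ClassPrepare");
        handler.addListener(perIteration);
        handler.dispatch("ClassPrepare");      // consumed by the per-iteration listener
        handler.removeListener(perIteration);
        handler.dispatch("ClassPrepare");      // late event after the listener is gone
        return handler.unexpectedEventCaught;
    }

    public static void main(String[] args) {
        System.out.println("without long-lived listener: unexpected = " + runScenario(false));
        System.out.println("with long-lived listener:    unexpected = " + runScenario(true));
    }
}
```

In this sketch the late event is flagged only when no long-lived listener sits ahead of the last-resort one, which is the gap between steps 5 and 6 that the message describes.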
That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after ClassPrepareEventListener is unregistered inside testSourceFilter() method. Please see below the new webrev with the changes you suggested. Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/ Thanks! Best regards, Daniil From: "serguei.spitsyn at oracle.com" Date: Tuesday, July 17, 2018 at 1:34 PM To: Daniil Titov , "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails Hi Daniil, Not sure, I fully understand the fix. So, let's start from some questions. Why the DefaultClassPrepareEventListener is needed? Is it not enough to filter out the other threads in the ClassPrepareEventListener.eventReceived() method ? 243 eventHandler.startListening(); 244 // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads. 245 // The listener should be added after the event listener is started to ensure that it called before 246 // the default event listener that handles unexpected events. 247 eventHandler.addListener(new DefaultClassPrepareEventListener()); It is still not clear why the default listener is added after the listening is started but not before. If the default listener is really needed then could you, please, split the lines above and L129, L160 to make a little bit shorter? I'd also suggest to replace "class prepared events" at L244 with "ClassPrepare event" or "class prepare event". There is also an unneeded space in the "( e.g. compiler)". Thanks, Serguei On 7/17/18 01:20, Daniil Titov wrote: Please review the change that fix the JDI test when running with Graal. 
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/

Thanks!
--Daniil

-------------- next part --------------
An HTML attachment was scrubbed...

From chris.plummer at oracle.com Thu Jul 19 17:31:42 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 10:31:42 -0700
Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
In-Reply-To: <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
Message-ID:

An HTML attachment was scrubbed...

From daniil.x.titov at oracle.com Thu Jul 19 17:45:04 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 10:45:04 -0700
Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
In-Reply-To:
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
Message-ID: <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>

Hi Chris,

It solves the problem when a ClassPrepare event comes from the Graal compiler thread at a time when no listener for this event is registered by the test.
In this case it is handled by the EventHandler itself (lines 261-262 below) as an unexpected event and the test fails (the EventHandler has its own "default" listeners registered when the event handler starts to handle events if they are not handled by the listener the test registers).

cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java

            /**
    251      * This method sets up default listeners.
    252      */
    253     private void createDefaultListeners() {
    254         /**
    255          * This listener catches up all unexpected events.
    256          *
    257          */
    258         addListener(
    259                 new EventListener() {
    260                     public boolean eventReceived(Event event) {
    261                         log.complain("EventHandler> Unexpected event: " + event.getClass().getName());
    262                         unexpectedEventCaught = true;
    263                         return true;
    264                     }
    265                 }
    266         );
    267

Best regards,
Daniil

From: Chris Plummer
Date: Thursday, July 19, 2018 at 10:31 AM
To: Daniil Titov, "serguei.spitsyn at oracle.com", "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net"
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

I understand the changes in eventReceived() to filter out events that are not on the main thread, but I don't understand why you've gone from having a new listener on each iteration, to just having one listener that stays active over all iterations. What problem does that change solve?

thanks,

Chris

On 7/17/18 7:06 PM, Daniil Titov wrote:

Hi Serguei,

Please review a new version of the patch.

Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

Thanks!
Best regards,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net"
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

Thank you for the clarification and the webrev update! I still have a couple of questions though.

I'd suggest a simpler approach, changing the code below:

    154         public boolean eventReceived(Event event) {
    155             if (event instanceof ClassPrepareEvent) {
    156                 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
    157                 ThreadReference thread = classPrepareEvent.thread();
    158                 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
    159                     eventReceived++;
    160
    161                     log.display("ClassPrepareEventListener: Event received: " + event +
    162                             " Class: " + classPrepareEvent.referenceType().name());
    163
    164                     vm.resume();
    165
    166                     return true;
    167                 }
    168             }
    169
    170             return false;
    171         }

to something like:

        public boolean eventReceived(Event event) {
            if (event instanceof ClassPrepareEvent) {
                ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                ThreadReference thread = classPrepareEvent.thread();
                if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                    eventReceived++;
                    log.display("ClassPrepareEventListener: Event received: " + event +
                            " Class: " + classPrepareEvent.referenceType().name());
                } else {
                    log.display("ClassPrepareEventListener: Event filtered out: " + event +
                            " Class: " + classPrepareEvent.referenceType().name() +
                            " Thread:" + classPrepareEvent.thread().name());
                }
                vm.resume();
                return true;
            }
            return false;
        }

    245     eventHandler.startListening();
    246     // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
    247     // The listener should be added after the event listener is started to ensure that it
    248     // called before the default event listener that handles unexpected events.
    249     eventHandler.addListener(new DefaultClassPrepareEventListener());

Still unclear why addListener() is invoked after startListening() but not before. It can be that a place add this listener is not right and have to be moved into testSourceFilter(). But I hope this fragment is not needed with the simplified approach.

Otherwise, it looks good.

Thanks,
Serguei

-------------- next part --------------
An HTML attachment was scrubbed...

From chris.plummer at oracle.com Thu Jul 19 17:54:42 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 10:54:42 -0700
Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
In-Reply-To: <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
Message-ID: <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>

An HTML attachment was scrubbed...
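
The listener ordering described in this thread can be sketched as a small stand-alone model. This is only an illustration under stated assumptions: ListenerChainSketch and its Listener interface are hypothetical stand-ins, not the nsk.share.jdi.EventHandler API, and events are modeled as plain strings. It shows a catch-all listener at the tail of the chain, a per-iteration listener that is added at the head and later removed, and an optional listener that stays registered for the whole run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the listener chain; not the real nsk.share.jdi.EventHandler.
class ListenerChainSketch {
    interface Listener {
        // Returns true when the event is consumed and must not reach later listeners.
        boolean eventReceived(String event);
    }

    private final List<Listener> listeners = new ArrayList<>();
    boolean unexpectedEventCaught = false;

    ListenerChainSketch() {
        // Default catch-all listener: registered first, so it ends up last in the chain.
        addListener(event -> {
            unexpectedEventCaught = true;
            return true;
        });
    }

    // New listeners go to the head of the list, as described in the thread.
    void addListener(Listener listener) { listeners.add(0, listener); }

    void removeListener(Listener listener) { listeners.remove(listener); }

    void dispatch(String event) {
        for (Listener listener : new ArrayList<>(listeners)) {
            if (listener.eventReceived(event)) {
                return;
            }
        }
    }

    // Replays the scenario from the thread; returns true if the catch-all listener fired.
    static boolean lateEventReachesCatchAll(boolean keepPersistentListener) {
        ListenerChainSketch handler = new ListenerChainSketch();
        if (keepPersistentListener) {
            // Stays registered across iterations and swallows stray class prepare events.
            handler.addListener(event -> event.startsWith("ClassPrepare"));
        }
        Listener perIteration = event -> event.startsWith("ClassPrepare");
        handler.addListener(perIteration);               // step 2
        handler.dispatch("ClassPrepare:TestClass");      // the expected event
        handler.removeListener(perIteration);            // step 5
        handler.dispatch("ClassPrepare:CompilerClass");  // late event from another thread
        return handler.unexpectedEventCaught;
    }

    public static void main(String[] args) {
        System.out.println(lateEventReachesCatchAll(false));
        System.out.println(lateEventReachesCatchAll(true));
    }
}
```

In this model the late event reaches the catch-all only when no listener for class prepare events remains registered, which mirrors the failure mode described above.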
From daniil.x.titov at oracle.com Thu Jul 19 18:01:29 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 11:01:29 -0700
Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
In-Reply-To: <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com> <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>
Message-ID: <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>

Hi Chris,

Some events are still coming in after disable() returns. The event handler sees the request object associated with such an event (event.request()) as disabled, but it still receives the events.

Best regards,
Daniil

From: Chris Plummer
Date: Thursday, July 19, 2018 at 10:54 AM
To: Daniil Titov, "serguei.spitsyn at oracle.com", "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net"
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

But the code used to be:

    168     request.disable();
    169
    170     eventHandler.removeListener(listener);

Doesn't the disable stop any new ClassPrepareEvents from coming in, and this is done before the listener is removed? Or is there a synchronization issue here, and you can still get some events coming in after disable() returns? I'm not sure if disable() makes any guarantees about the debuggee side having fully processed it and guaranteed delivery of all pending events before it returns.

thanks,

Chris

On 7/19/18 10:45 AM, Daniil Titov wrote:

Hi Chris,

It solves the problem when ClassPrepare event comes from the Graal compiler thread at the time when no any listener listening for this event is registered by the test.
In this case it is handled by the EventHandler itself (lines 261-262 below) as an unexpected event and the test fails (the EventHandler has its own "default" listeners registered when the event handler starts to handle events if they are not handled by the listener the test registers).

cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java

            /**
    251      * This method sets up default listeners.
    252      */
    253     private void createDefaultListeners() {
    254         /**
    255          * This listener catches up all unexpected events.
    256          *
    257          */
    258         addListener(
    259                 new EventListener() {
    260                     public boolean eventReceived(Event event) {
    261                         log.complain("EventHandler> Unexpected event: " + event.getClass().getName());
    262                         unexpectedEventCaught = true;
    263                         return true;
    264                     }
    265                 }
    266         );
    267

Best regards,
Daniil

-------------- next part --------------
An HTML attachment was scrubbed...

From chris.plummer at oracle.com Thu Jul 19 18:03:16 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 11:03:16 -0700
Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
In-Reply-To: <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com> <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com> <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
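
The thread-name filter discussed in this thread can be illustrated with a stand-alone sketch. Everything here is an assumption made for illustration: ThreadRef and PrepareEvent are hypothetical stand-ins for the JDI ThreadReference and ClassPrepareEvent types, the value of DEBUGGEE_MAIN_THREAD is assumed to be "main", and the vm.resume() call from the quoted code is left out. The sketch consumes every class prepare event but counts only those from the main thread. It also reads the thread name through a null guard, since the else branch in the quoted suggestion calls thread().name() even when the thread reference may be null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a listener that filters class prepare events by thread name.
class ThreadFilterSketch {
    // Minimal stand-ins for the JDI types; only the methods used here are modeled.
    interface ThreadRef { String name(); }
    interface PrepareEvent { ThreadRef thread(); String className(); }

    static final String DEBUGGEE_MAIN_THREAD = "main"; // assumed value

    int matchedEvents = 0;
    final List<String> log = new ArrayList<>();

    // Consumes every class prepare event, but counts only those from the main thread.
    boolean eventReceived(Object event) {
        if (!(event instanceof PrepareEvent)) {
            return false; // let later listeners see other kinds of events
        }
        PrepareEvent prepareEvent = (PrepareEvent) event;
        ThreadRef thread = prepareEvent.thread();
        // Guard against a missing thread before reading its name.
        String threadName = (thread == null) ? "<no thread>" : thread.name();
        if (DEBUGGEE_MAIN_THREAD.equals(threadName)) {
            matchedEvents++;
            log.add("Event received: " + prepareEvent.className());
        } else {
            log.add("Event filtered out: " + prepareEvent.className() + " Thread: " + threadName);
        }
        return true; // the real listener would also resume the VM here
    }

    // Builds a test event; a null threadName models an event without a thread.
    static PrepareEvent event(String threadName, String className) {
        final ThreadRef ref;
        if (threadName == null) {
            ref = null;
        } else {
            ref = () -> threadName;
        }
        return new PrepareEvent() {
            public ThreadRef thread() { return ref; }
            public String className() { return className; }
        };
    }

    public static void main(String[] args) {
        ThreadFilterSketch listener = new ThreadFilterSketch();
        listener.eventReceived(event("main", "TestClass"));
        listener.eventReceived(event("compiler", "CompiledClass"));
        listener.eventReceived(event(null, "NoThreadClass"));
        System.out.println(listener.matchedEvents);
        listener.log.forEach(System.out::println);
    }
}
```

Returning true for filtered events is the design point: the event never travels further down the chain, so a catch-all listener behind this one would not see it.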
From daniil.x.titov at oracle.com Thu Jul 19 18:06:47 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 11:06:47 -0700
Subject: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
In-Reply-To:
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com> <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com> <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com> <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com> <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com> <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>
Message-ID:

Thank you Chris and Serguei for reviewing this change!

Best regards,
Daniil

From: Chris Plummer
Date: Thursday, July 19, 2018 at 11:03 AM
To: Daniil Titov, "serguei.spitsyn at oracle.com", "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net"
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Ok, your changes look good then.

thanks,

Chris

On 7/17/18 01:20, Daniil Titov wrote:

Please review the change that fix the JDI test when running with Graal.
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/

Thanks!
--Daniil

-------------- next part --------------
An HTML attachment was scrubbed...

From chris.plummer at oracle.com Thu Jul 19 18:26:49 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 11:26:49 -0700
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers
In-Reply-To: <1bae36e7-3efc-3aef-6a99-324102da2549@gmail.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com> <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com> <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com> <1bae36e7-3efc-3aef-6a99-324102da2549@gmail.com>
Message-ID: <9894ffb1-1325-f38e-afa6-90f2b56a6d89@oracle.com>

Hi Yasumasa,

     84     // It maps the LWPID in the host to it in the container.

"it" -> "the PID"

    286     // Get LWPID in the host from the container's LWPID.
    287     public int getHostPID(int id) {
    288         try {
    289             return nspidMap.get(id);
    290         } catch (NullPointerException e) {
    291             return -1;
    292         }
    293     }

What is the source of the NPE here? Is it because nspidMap was never initialized because the process is not in a container? In that case I think you should be checking for null rather than having an NPE be part of normal execution.

     42             int hostPID = ((LinuxDebuggerLocal)debugger).getHostPID(pid);
     43             if (hostPID != -1) {
     44                 pid = hostPID;
     45             }

A comment here would be helpful.

The rest looks good.
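
The null check suggested for getHostPID() might look like the sketch below. It is only an illustration, not the code from the webrev: HostPidLookupSketch and setNspidMap are hypothetical names, and the assumption that the map stays null outside a container is taken from the question above rather than confirmed. Note that in the quoted code an NPE could come from either a null map or from unboxing a missing entry, so the sketch guards against both.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the actual LinuxDebuggerLocal implementation.
class HostPidLookupSketch {
    // Maps an LWP ID seen inside the container to the LWP ID on the host.
    // Assumed to stay null when the debuggee does not run in a container.
    private Map<Integer, Integer> nspidMap;

    void setNspidMap(Map<Integer, Integer> map) {
        nspidMap = map;
    }

    // Returns the host LWP ID, or -1 when no mapping is available.
    int getHostPID(int id) {
        if (nspidMap == null) {
            return -1; // no container mapping was ever built
        }
        Integer hostId = nspidMap.get(id);
        // Avoid unboxing a null Integer when the key is absent.
        return (hostId == null) ? -1 : hostId;
    }

    public static void main(String[] args) {
        HostPidLookupSketch lookup = new HostPidLookupSketch();
        System.out.println(lookup.getHostPID(7));   // no map yet
        Map<Integer, Integer> map = new HashMap<>();
        map.put(7, 4242);
        lookup.setNspidMap(map);
        System.out.println(lookup.getHostPID(7));   // mapped entry
        System.out.println(lookup.getHostPID(8));   // missing entry
    }
}
```

With this shape the -1 result comes from explicit checks, so no exception is part of normal execution.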
I should probably run it through some internal testing. Let me know when you have a final webrev.

thanks,

Chris

On 7/18/18 5:59 AM, Yasumasa Suenaga wrote:
> PING:
>
> Could you review it?
>
>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>
> This change has been reviewed by Jini.
> We need a Reviewer.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2018/07/12 13:42, Yasumasa Suenaga wrote:
>> Thanks Jini,
>>
>> I uploaded new webrev. It contains some comments and removing extra space.
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>
>>
>> Yasumasa
>>
>>
>> 2018-07-12 2:32 GMT+09:00 Jini George :
>>> Hi Yasumasa,
>>>
>>> This looks good to me except for one nit. And some more comments would help.
>>> For e.g., it would help to say that NSPidMap is to map the host to container
>>> lwpids.
>>>
>>> The nit:
>>>
>>> *
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
>>>
>>> Line 253: extra space after the parentheses
>>>
>>> Thanks,
>>> Jini.
>>>
>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>>
>>>> PING: Could you review it?
>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>    webrev:
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Please review this change.
>>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>    webrev:
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>
>>>>> I tried to attach jhsdb to java process in docker container from
>>>>> container host, but it couldn't.
>>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet.
>>>>>
>>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they
>>>>> returns PIDs in container - they are different from host's PID. So I added
>>>>> the code to scan /proc//task to get all LWP IDs and they are kept in a
>>>>> Map in LinuxDebuggerLocal.
>>>>>
>>>>> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs in
>>>>> container. It helps SA to parse binaries in container.
>>>>>
>>>>> This change has been pushed to submit repo, and it was failed on OS X
>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>>> But I guess it causes JDK-8205906. This change affects to Linux only.
>>>>>
>>>>> Could you review it?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>

From serguei.spitsyn at oracle.com Thu Jul 19 21:33:30 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 14:33:30 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To:
References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From jcbeyler at google.com Thu Jul 19 21:52:07 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 14:52:07 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To:
References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
Message-ID:

Hi Serguei,

Done here: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/

I added:

+ // Calculate the size of a 1-element array in order to assess average sampling interval
+ // via the HeapMonitorStatIntervalTest. This is needed because various GCs could add
+ // extra memory to arrays.
+ // This is done by allocating a 1-element array and then looking in the heap monitoring
+ // samples for the average size of objects collected.

Let me know what you think and then I need one more review to prepare the patch :-)

Thanks all!
Jc On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > The fix looks good to me. > Just minor comments. > > > http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html > > 108 public static void calculateAverageOneElementSize() { > > Could you, please, add a comment before calculateAverageOneElementSize > method > explaining shortly why it is needed and what it is doing? > Otherwise, it is not easy to understand this code from scratch. > > Thanks, > Serguei > > > On 7/19/18 10:08, JC Beyler wrote: > > I forgot to put the link: > https://bugs.openjdk.java.net/browse/JDK-8207763 > > It got renamed in jdk11 via: > http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f > > Thanks! > Jc > > On Thu, Jul 19, 2018 at 10:07 AM JC Beyler wrote: > >> Hi Dan, >> >> >> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java >> became >> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java, >> when we updated the spec and said "rate" was the wrong word. >> >> So yes, it fixes both since at some point all branches should see that >> the StatRate test becomes renamed into the StatInterval test. Does that >> make sense? >> >> Thanks! >> Jc >> >> >> On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty < >> daniel.daugherty at oracle.com> wrote: >> >>> JDK-8207765 covers two different tests as of yesterday: >>> >>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java >>> >>> and >>> >>> >>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java >>> >>> I updated it to add a similar failure mode sighting for >>> HeapMonitorStatIntervalTest.java >>> >>> >>> Does your fix address both test failures? 
>>> >>> Dan >>> >>> >>> On 7/19/18 12:39 PM, JC Beyler wrote: >>> >>> Hi all, >>> >>> Could I have a few reviews of: >>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ >>> >>> The test assumed the size of a 1-element array but ZGC changes that >>> assumption. The test now first allocates a bit of memory and gets the >>> average size of the samples before assuming the size. This works >>> with/without ZGC. >>> >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765 >>> >>> Thanks! >>> Jc >>> >>> >>> >> >> -- >> >> Thanks, >> Jc >> > > > -- > > Thanks, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Jul 19 22:20:23 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 19 Jul 2018 15:20:23 -0700 Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC In-Reply-To: References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com> Message-ID: <6c66de5b-fb39-3212-cef8-2fba58aca121@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Jul 19 23:32:38 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 19 Jul 2018 16:32:38 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> Thanks, Rahul! In fact, there no good experts for this area in the serviceability team. It would be much better if anyone from the Compiler team could do it. Vladimir K., Is there anyone from the Compiler team available to review this? Otherwise, I could try to review it but am not sure about my review quality. 
Thanks, Serguei On 7/19/18 00:48, Rahul Raghavan wrote: > RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled > > (just adding + hotspot-compiler-dev also) > > > On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: > Subject Was: > Re: RFR (S): C1 still does eden allocations when TLAB is enabled > > + serviceability-dev > > Hi all, > > Could anyone else give me a review of this webrev and check/test the > various architecture changes? > > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > > > Thanks for all your help! > Jc > > >> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: >> >>> Hi all, >>> >>> Here is a webrev that does all the architectures in the same way: >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>> >>> Could anyone review the other architectures and test? >>> ?? - arm, sparc & aarch64 are also modified now to follow the same >>> "if no >>> tlab, then consider eden space allocation" logic. >>> >>> Thanks for your help! >>> Jc >>> >>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: >>> >>>> Hi Kim, >>>> >>>> I opened this bug >>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>> >>>> and now I've done an update: >>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>> >>>> I basically have done your nits but also removed the try_eden (it was >>>> used to bind a label but was not used). I updated the comments to >>>> use the >>>> one you preferred. >>>> >>>> I still have to do the other architectures though but at least we >>>> seem to >>>> have a consensus on this architecture, correct? 
>>>> >>>> Thanks for the review, >>>> Jc >>>> >>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >>>> wrote: >>>> >>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>>>>> >>>>>> Yes, you are right, I did those changes due to: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>> >>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>> I'll go >>>>> ahead >>>>>> and propagate the change across architectures. >>>>>> >>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>> comment >>>>> and >>>>>> review) :) >>>>>> Jc >>>>>> >>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >>>>> wrote: >>>>>> >>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> I'm not sure if we had left this case intentionally or not but, >>>>>>> if we >>>>> want >>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>> >>>>>>> >>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>> speaks >>>>> up >>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>> >>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>> src/hotspot/share" >>>>>>> suggests that the GC group is most active in touching this feature. >>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>> >>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>> working on the GC to OK it. >>>>>>> >>>>>>> ? John >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Thanks, >>>>>> Jc >>>>> >>>>> Robbin is on vacation; you might not hear from him for a while. >>>>> >>>>> I'm assuming you'll open a new bug for this? >>>>> >>>>> Except for a few minor nits (below), this looks okay to me. >>>>> >>>>> The comment at line 1052 needs updating. >>>>> >>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>>> >>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>> line 1058, but unreferenced. 
>>>>> >>>>> I like the wording of the comment at 1139 better than the wording at >>>>> 1016. >>>>> >>>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc >>>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc >>> >> >> From alexey.menkov at oracle.com Fri Jul 20 00:06:13 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 19 Jul 2018 17:06:13 -0700 Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC In-Reply-To: References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com> Message-ID: <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com> Looks good. --alex On 07/19/2018 14:52, JC Beyler wrote: > Hi Serguei, > > Done here: > http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/ > > > I added: > > + // Calculate the size of a 1-element array in order to assess average > sampling interval > + // via the HeapMonitorStatIntervalTest. This is needed because various > GCs could add > + // extra memory to arrays. > + // This is done by allocating a 1-element array and then looking in > the heap monitoring > + // samples for the average size of objects collected. > > > Let me know what you think and then I need one more review to prepare > the patch :-) > > Thanks all! > Jc > > On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com > > wrote: > > Hi Jc, > > The fix looks good to me. > Just minor comments. > > http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html > > 108 public static void calculateAverageOneElementSize() { > > ? Could you, please, add a comment before > calculateAverageOneElementSize method > ? explaining shortly why it is needed and what it is doing? > ? Otherwise, it is not easy to understand this code from scratch. 
> > Thanks, > Serguei > > > On 7/19/18 10:08, JC Beyler wrote: >> I forgot to put the link: >> https://bugs.openjdk.java.net/browse/JDK-8207763 >> >> It got renamed in jdk11 via: >> http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f >> >> Thanks! >> Jc >> >> On Thu, Jul 19, 2018 at 10:07 AM JC Beyler > > wrote: >> >> Hi Dan, >> >> >> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java >> became >> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java, >> when we updated the spec and said "rate" was the wrong word. >> >> So yes, it fixes both since at some point all branches should >> see that the StatRate test becomes renamed into the >> StatInterval test. Does that make sense? >> >> Thanks! >> Jc >> >> >> On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty >> > > wrote: >> >> JDK-8207765 covers two different tests as of yesterday: >> >> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java >> >> and >> >> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java >> >> I updated it to add a similar failure mode sighting for >> HeapMonitorStatIntervalTest.java >> >> >> Does your fix address both test failures? >> >> Dan >> >> >> On 7/19/18 12:39 PM, JC Beyler wrote: >>> Hi all, >>> >>> Could I have a few reviews of: >>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ >>> >>> >>> The? test assumed the size of a 1-element array but ZGC >>> changes that assumption. The test now first allocates a >>> bit of memory and gets the average size of the samples >>> before assuming the size. This works with/without ZGC. >>> >>> Webrev: >>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765 >>> >>> Thanks! 
>>> Jc
>>
>> --
>>
>> Thanks,
>> Jc
>>
>> --
>>
>> Thanks,
>> Jc
>
> --
>
> Thanks,
> Jc

From serguei.spitsyn at oracle.com Fri Jul 20 00:21:45 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 17:21:45 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com>
References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com> <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com>
Message-ID: <9f8cd454-1e48-c4f5-41fc-834adda7867e@oracle.com>

Thanks a lot, Alex!

Jc,

Could you please send me a patch for push?

Thanks,
Serguei

On 7/19/18 17:06, Alex Menkov wrote:
> Looks good.
>
> --alex
>
> On 07/19/2018 14:52, JC Beyler wrote:
>> Hi Serguei,
>>
>> Done here:
>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/
>>
>>
>> I added:
>>
>> + // Calculate the size of a 1-element array in order to assess average sampling interval
>> + // via the HeapMonitorStatIntervalTest. This is needed because various GCs could add
>> + // extra memory to arrays.
>> + // This is done by allocating a 1-element array and then looking in the heap monitoring
>> + // samples for the average size of objects collected.
>>
>>
>> Let me know what you think and then I need one more review to prepare
>> the patch :-)
>>
>> Thanks all!
>> Jc
>>
>> On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com wrote:
>>
>>     Hi Jc,
>>
>>     The fix looks good to me.
>>     Just minor comments.
>>
>> http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html
>>
>>     108 public static void calculateAverageOneElementSize() {
>>
>>       Could you, please, add a comment before calculateAverageOneElementSize method
>>       explaining shortly why it is needed and what it is doing?
>>       Otherwise, it is not easy to understand this code from scratch.
>>
>>     Thanks,
>>     Serguei
>>
>>
>>     On 7/19/18 10:08, JC Beyler wrote:
>>>     I forgot to put the link:
>>>     https://bugs.openjdk.java.net/browse/JDK-8207763
>>>
>>>     It got renamed in jdk11 via:
>>>     http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f
>>>
>>>     Thanks!
>>>     Jc
>>>
>>>     On Thu, Jul 19, 2018 at 10:07 AM JC Beyler wrote:
>>>
>>>         Hi Dan,
>>>
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>>         became
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
>>>         when we updated the spec and said "rate" was the wrong word.
>>>
>>>         So yes, it fixes both since at some point all branches should
>>>         see that the StatRate test becomes renamed into the
>>>         StatInterval test. Does that make sense?
>>>
>>>         Thanks!
>>>         Jc
>>>
>>>
>>>         On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty wrote:
>>>
>>>             JDK-8207765 covers two different tests as of yesterday:
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>>
>>>             and
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>>>
>>>             I updated it to add a similar failure mode sighting for
>>>             HeapMonitorStatIntervalTest.java
>>>
>>>
>>>             Does your fix address both test failures?
>>>
>>>             Dan
>>>
>>>
>>>             On 7/19/18 12:39 PM, JC Beyler wrote:
>>>>             Hi all,
>>>>
>>>>             Could I have a few reviews of:
>>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>>>
>>>>
>>>>             The test assumed the size of a 1-element array but ZGC
>>>>             changes that assumption. The test now first allocates a
>>>>             bit of memory and gets the average size of the samples
>>>>             before assuming the size. This works with/without ZGC.
>>>>
>>>>
Webrev: >>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ >>>> >>>> ??????????? Bug: https://bugs.openjdk.java.net/browse/JDK-8207765 >>>> >>>> ??????????? Thanks! >>>> ??????????? Jc >>> >>> >>> >>> ??????? -- >>> ??????? Thanks, >>> ??????? Jc >>> >>> >>> >>> ??? -- >>> ??? Thanks, >>> ??? Jc >> >> >> >> -- >> >> Thanks, >> Jc From jcbeyler at google.com Fri Jul 20 01:22:49 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 19 Jul 2018 18:22:49 -0700 Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC In-Reply-To: <9f8cd454-1e48-c4f5-41fc-834adda7867e@oracle.com> References: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com> <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com> <9f8cd454-1e48-c4f5-41fc-834adda7867e@oracle.com> Message-ID: Hi Serguei and Alexey, Thanks both and here you are: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.02/ Let me know if you need anything else! Jc On Thu, Jul 19, 2018 at 5:21 PM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Thanks a lot, Alex! > > Jc, > > Could you please send me a patch for push? > > Thanks, > Serguei > > On 7/19/18 17:06, Alex Menkov wrote: > > Looks good. > > > > --alex > > > > On 07/19/2018 14:52, JC Beyler wrote: > >> Hi Serguei, > >> > >> Done here: > >> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/ > >> > >> > >> I added: > >> > >> + // Calculate the size of a 1-element array in order to assess > >> average sampling interval > >> + // via the HeapMonitorStatIntervalTest. This is needed because > >> various GCs could add > >> + // extra memory to arrays. > >> + // This is done by allocating a 1-element array and then looking in > >> the heap monitoring > >> + // samples for the average size of objects collected. > >> > >> > >> Let me know what you think and then I need one more review to prepare > >> the patch :-) > >> > >> Thanks all! 
> >> Jc > >> > >> On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com > >> >> > wrote: > >> > >> Hi Jc, > >> > >> The fix looks good to me. > >> Just minor comments. > >> > >> > http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html > >> > >> 108 public static void calculateAverageOneElementSize() { > >> > >> Could you, please, add a comment before > >> calculateAverageOneElementSize method > >> explaining shortly why it is needed and what it is doing? > >> Otherwise, it is not easy to understand this code from scratch. > >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/19/18 10:08, JC Beyler wrote: > >>> I forgot to put the link: > >>> https://bugs.openjdk.java.net/browse/JDK-8207763 > >>> > >>> It got renamed in jdk11 via: > >>> http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f > >>> > >>> Thanks! > >>> Jc > >>> > >>> On Thu, Jul 19, 2018 at 10:07 AM JC Beyler >>> > wrote: > >>> > >>> Hi Dan, > >>> > >>> > >>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java > >>> became > >>> > serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java, > >>> when we updated the spec and said "rate" was the wrong word. > >>> > >>> So yes, it fixes both since at some point all branches should > >>> see that the StatRate test becomes renamed into the > >>> StatInterval test. Does that make sense? > >>> > >>> Thanks! > >>> Jc > >>> > >>> > >>> On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty > >>> >>> > wrote: > >>> > >>> JDK-8207765 covers two different tests as of yesterday: > >>> > >>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java > >>> > >>> and > >>> > >>> > serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java > >>> > >>> I updated it to add a similar failure mode sighting for > >>> HeapMonitorStatIntervalTest.java > >>> > >>> > >>> Does your fix address both test failures? 
> >>> > >>> Dan > >>> > >>> > >>> On 7/19/18 12:39 PM, JC Beyler wrote: > >>>> Hi all, > >>>> > >>>> Could I have a few reviews of: > >>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ > >>>> > >>>> > >>>> The test assumed the size of a 1-element array but ZGC > >>>> changes that assumption. The test now first allocates a > >>>> bit of memory and gets the average size of the samples > >>>> before assuming the size. This works with/without ZGC. > >>>> > >>>> Webrev: > >>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ > >>>> > >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765 > >>>> > >>>> Thanks! > >>>> Jc > >>> > >>> > >>> > >>> -- > >>> Thanks, > >>> Jc > >>> > >>> > >>> > >>> -- > >>> Thanks, > >>> Jc > >> > >> > >> > >> -- > >> > >> Thanks, > >> Jc > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From yasuenag at gmail.com Fri Jul 20 05:13:49 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Fri, 20 Jul 2018 14:13:49 +0900 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers Message-ID: Hi Chris, Thank you for your comment. I uploaded new webrev. Could you review again? http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.02/ I tested my change on Linux x64, but I cannot check it on other platform (includes older Linux). However SA tests are included in HotSpot tier 1 tests. Tests on submit repo work fine with this change (mach5-one-ysuenaga-JDK-8205992-20180720-0305-31840). Thanks, Yasumasa 2018-07-20 3:26 GMT+09:00 Chris Plummer : > Hi Yasumasa, > > 84 // It maps the LWPID in the host to it in the container. > > "it" -> "the PID" > > 286 // Get LWPID in the host from the container's LWPID. > 287 public int getHostPID(int id) { > 288 try { > 289 return nspidMap.get(id); > 290 } catch (NullPointerException e) { > 291 return -1; > 292 } > 293 } > > What is the source of the NPE here? 
Is it because nspidMap was never > initialized because the process is not in a container? In that case I think > you should be checking for null rather than having an NPE be part of normal > execution. > > 42 int hostPID = > ((LinuxDebuggerLocal)debugger).getHostPID(pid); > 43 if (hostPID != -1) { > 44 pid = hostPID; > 45 } > > A comment here would be helpful. > > The rest looks good. I should probably run it through some internal testing. > Let me know when you have a final webrev. > > thanks, > > Chris > > > On 7/18/18 5:59 AM, Yasumasa Suenaga wrote: >> >> PING: >> >> Could you review it? >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ >> >> This change has been reviewed by Jini. >> We need a Reviewer. >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2018/07/12 13:42, Yasumasa Suenaga wrote: >>> >>> Thanks Jini, >>> >>> I uploaded new webrev. It contains some comments and removing extra >>> space. >>> >>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ >>> >>> >>> Yasumasa >>> >>> >>> >>> 2018-07-12 2:32 GMT+09:00 Jini George : >>>> >>>> Hi Yasumasa, >>>> >>>> This looks good to me except for one nit. And some more comments would >>>> help. >>>> For e.g., it would help to say that NSPidMap is to map the host to >>>> container >>>> lwpids. >>>> >>>> The nit: >>>> >>>> * >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html >>>> Line 253: extra space after the parentheses >>>> >>>> Thanks, >>>> Jini. >>>> >>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: >>>>> >>>>> >>>>> PING: Could you review it? 
>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>>> >>>>> >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Yasumasa >>>>> >>>>> >>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote: >>>>>> >>>>>> >>>>>> Hi all, >>>>>> >>>>>> Please review this change. >>>>>> >>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>>>> >>>>>> I tried to attach jhsdb to java process in docker container from >>>>>> container host, but it couldn't. >>>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet. >>>>>> >>>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they >>>>>> returns PIDs in container - they are different from host's PID. So I >>>>>> added >>>>>> the code to scan /proc//task to get all LWP IDs and they are kept >>>>>> in a >>>>>> Map in LinuxDebuggerLocal. >>>>>> >>>>>> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs >>>>>> in >>>>>> container. It helps SA to parse binaries in container. >>>>>> >>>>>> This change has been pushed to submit repo, and it was failed on OS X >>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >>>>>> But I guess it causes JDK-8205906. This change affects to Linux only. >>>>>> >>>>>> Could you review it? 
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>
>
>

From ralf.schmelter at sap.com Fri Jul 20 14:28:09 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 20 Jul 2018 14:28:09 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To: <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
Message-ID: <6de6362944f84740b80abb22cbbea872@sap.com>

Hi Serguei,

I've updated the webrev: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/

JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). If it had been, the old code would have removed all native methods from the call stack. The original JVMDI call did indeed return JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the JVMDI->JVMTI transition.

I've tried to make the test more readable and added some comments to explain why it is done the way it is.

Best regards,
Ralf

From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
Sent: Wednesday, 18 July 2018 22:57
To: Chris Plummer ; Schmelter, Ralf ; serviceability-dev at openjdk.java.net; Stuefe, Thomas
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Ralf,

The fix itself looks pretty good to me.
Some minor comments.

The copyright year needs an update.

 218   jint count, filledIn;

Could you, please, split the declarations above into different lines to follow the local style?

It is interesting that the original implementation checked the error code returned from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
However, the GetFrameLocation spec does not list this error code as possible.

Some comments about the test.

  52     static void callEnded() {
  53         System.out.println("SOE occurred as expected");
  54     }
  55
  56     static int call(int depth) {
  57         if (depth == 0) {
  58             // Should have seen a stack overflow by now.
  59             System.out.println("Exited without creating SOE");
  60             System.exit(0);
  61         }
  62
  63         try {
  64             int newDepth = call(depth - 1);
  65
  66             if (newDepth == -1_000) {
  67                 // Pop some frames so there is room on the stack for the
  68                 // println()
  69                 callEnded();
  70             }
  71
  72             return newDepth - 1;
  73         } catch (StackOverflowError e) {
  74             return -1;
  75         }
  76     }
  77 }

  I'd suggest renaming the methods call() and callEnded() to something like
  recursiveMethod() and recursionEnd().
  Also, the manipulations with SOE add complexity and are confusing.
  Could it be simpler to let it propagate and then catch it in main()?
  What is the point of all these checks at lines 104-119?
  In general, I'm looking for ways to make it clearer, simpler and more stable.

Thanks,
Serguei

From vladimir.kozlov at oracle.com Fri Jul 20 17:52:56 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 20 Jul 2018 10:52:56 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To: <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
Message-ID: <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>

I asked Igor V. to look.
It seems the review is being done in another thread which does not have the bug id in the subject.
Currently at webrev.03

Vladimir

On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
> Thanks, Rahul!
> In fact, there no good experts for this area in the serviceability team.
> It would be much better if anyone from the Compiler team could do it.
> > Vladimir K., > > Is there anyone from the Compiler team available to review this? > Otherwise, I could try to review it but am not sure about my review > quality. > > Thanks, > Serguei > > > On 7/19/18 00:48, Rahul Raghavan wrote: >> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled >> >> (just adding + hotspot-compiler-dev also) >> >> >> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: >> Subject Was: >> Re: RFR (S): C1 still does eden allocations when TLAB is enabled >> >> + serviceability-dev >> >> Hi all, >> >> Could anyone else give me a review of this webrev and check/test the >> various architecture changes? >> >> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >> >> >> Thanks for all your help! >> Jc >> >> >>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: >>> >>>> Hi all, >>>> >>>> Here is a webrev that does all the architectures in the same way: >>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>> >>>> Could anyone review the other architectures and test? >>>> ?? - arm, sparc & aarch64 are also modified now to follow the same >>>> "if no >>>> tlab, then consider eden space allocation" logic. >>>> >>>> Thanks for your help! >>>> Jc >>>> >>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: >>>> >>>>> Hi Kim, >>>>> >>>>> I opened this bug >>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>>> >>>>> and now I've done an update: >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>>> >>>>> I basically have done your nits but also removed the try_eden (it was >>>>> used to bind a label but was not used). I updated the comments to >>>>> use the >>>>> one you preferred. >>>>> >>>>> I still have to do the other architectures though but at least we >>>>> seem to >>>>> have a consensus on this architecture, correct? 
>>>>> >>>>> Thanks for the review, >>>>> Jc >>>>> >>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >>>>> wrote: >>>>> >>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler wrote: >>>>>>> >>>>>>> Yes, you are right, I did those changes due to: >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>>> >>>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>>> I'll go >>>>>> ahead >>>>>>> and propagate the change across architectures. >>>>>>> >>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>>> comment >>>>>> and >>>>>>> review) :) >>>>>>> Jc >>>>>>> >>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >>>>>> wrote: >>>>>>> >>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> I'm not sure if we had left this case intentionally or not but, >>>>>>>> if we >>>>>> want >>>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>>> >>>>>>>> >>>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>>> speaks >>>>>> up >>>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>>> >>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>>> src/hotspot/share" >>>>>>>> suggests that the GC group is most active in touching this feature. >>>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>>> >>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>>> working on the GC to OK it. >>>>>>>> >>>>>>>> ? John >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Thanks, >>>>>>> Jc >>>>>> >>>>>> Robbin is on vacation; you might not hear from him for a while. >>>>>> >>>>>> I'm assuming you'll open a new bug for this? >>>>>> >>>>>> Except for a few minor nits (below), this looks okay to me. >>>>>> >>>>>> The comment at line 1052 needs updating. >>>>>> >>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. 
>>>>>> >>>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>>> line 1058, but unreferenced. >>>>>> >>>>>> I like the wording of the comment at 1139 better than the wording at >>>>>> 1016. >>>>>> >>>>>> >>>>> >>>>> -- >>>>> >>>>> Thanks, >>>>> Jc >>>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc >>>> >>> >>> > From vladimir.kozlov at oracle.com Fri Jul 20 17:57:11 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 20 Jul 2018 10:57:11 -0700 Subject: RFR (S): C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: <22211468-5b15-e6a8-be6b-7ce5d2fbdf27@oracle.com> Please, don't do review in 2 mailing threads. Thanks, Vladimir On 7/20/18 8:30 AM, JC Beyler wrote: > Awesome thanks Thomas! > > Here is the webrev with the extra information then: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > Thanks again for all the reviews everyone! > Jc > > On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl > wrote: > >> Hi, >> >> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote: >>> Hi all, >>> >>> Here is a webrev that does all the architectures in the same way: >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>> >>> Could anyone review the other architectures and test? >>> - arm, sparc & aarch64 are also modified now to follow the same "if >>> no >>> tlab, then consider eden space allocation" logic. >>> >>> Thanks for your help! >>> Jc >>> >> >> looks good. >> >> I ran the change through hs-tier1-3 with no issues. It only tests on >> sparc and x64 though. 
>> >> I do not expect issues on the other platforms though :) >> >> Thanks, >> Thomas >> >> > From serguei.spitsyn at oracle.com Fri Jul 20 18:18:20 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 20 Jul 2018 11:18:20 -0700 Subject: RFR (S): 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: Restored the bug number and added back the hotspot-dev and serviceability-dev mailing lists. Thanks, Serguei On 7/20/18 08:30, JC Beyler wrote: > Awesome thanks Thomas! > > Here is the webrev with the extra information then: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > Thanks again for all the reviews everyone! > Jc > > On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl > wrote: > >> Hi, >> >> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote: >>> Hi all, >>> >>> Here is a webrev that does all the architectures in the same way: >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>> >>> Could anyone review the other architectures and test? >>> - arm, sparc & aarch64 are also modified now to follow the same "if >>> no >>> tlab, then consider eden space allocation" logic. >>> >>> Thanks for your help! >>> Jc >>> >> looks good. >> >> I ran the change through hs-tier1-3 with no issues. It only tests on >> sparc and x64 though. 
>> >> I do not expect issues on the other platforms though :) >> >> Thanks, >> Thomas >> >> From serguei.spitsyn at oracle.com Fri Jul 20 18:21:56 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 20 Jul 2018 11:21:56 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> Message-ID: Thank you a lot, Vladimir! Yes, the webrev.03 is the latest. Jc, will correct us if it is not right. Thanks, Serguei On 7/20/18 10:52, Vladimir Kozlov wrote: > I asked Igor V. to look. > > Seems like review is done in an other thread which does not have bug > id in subject. Currently webrev.03 > > Vladimir > > On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: >> Thanks, Rahul! >> In fact, there no good experts for this area in the serviceability team. >> It would be much better if anyone from the Compiler team could do it. >> >> Vladimir K., >> >> Is there anyone from the Compiler team available to review this? >> Otherwise, I could try to review it but am not sure about my review >> quality. >> >> Thanks, >> Serguei >> >> >> On 7/19/18 00:48, Rahul Raghavan wrote: >>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled >>> >>> (just adding + hotspot-compiler-dev also) >>> >>> >>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: >>> Subject Was: >>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled >>> >>> + serviceability-dev >>> >>> Hi all, >>> >>> Could anyone else give me a review of this webrev and check/test the >>> various architecture changes? >>> >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>> >>> >>> Thanks for all your help! 
>>> Jc >>> >>> >>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: >>>> >>>>> Hi all, >>>>> >>>>> Here is a webrev that does all the architectures in the same way: >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>>> >>>>> Could anyone review the other architectures and test? >>>>> ?? - arm, sparc & aarch64 are also modified now to follow the same >>>>> "if no >>>>> tlab, then consider eden space allocation" logic. >>>>> >>>>> Thanks for your help! >>>>> Jc >>>>> >>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler >>>>> wrote: >>>>> >>>>>> Hi Kim, >>>>>> >>>>>> I opened this bug >>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>>>> >>>>>> and now I've done an update: >>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>>>> >>>>>> I basically have done your nits but also removed the try_eden (it >>>>>> was >>>>>> used to bind a label but was not used). I updated the comments to >>>>>> use the >>>>>> one you preferred. >>>>>> >>>>>> I still have to do the other architectures though but at least we >>>>>> seem to >>>>>> have a consensus on this architecture, correct? >>>>>> >>>>>> Thanks for the review, >>>>>> Jc >>>>>> >>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >>>>>> wrote: >>>>>> >>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler >>>>>>>> wrote: >>>>>>>> >>>>>>>> Yes, you are right, I did those changes due to: >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>>>> >>>>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>>>> I'll go >>>>>>> ahead >>>>>>>> and propagate the change across architectures. 
>>>>>>>> >>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>>>> comment >>>>>>> and >>>>>>>> review) :) >>>>>>>> Jc >>>>>>>> >>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose >>>>>>> wrote: >>>>>>>> >>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> I'm not sure if we had left this case intentionally or not >>>>>>>>> but, if we >>>>>>> want >>>>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>>>> >>>>>>>>> >>>>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>>>> speaks >>>>>>> up >>>>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>>>> >>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>>>> src/hotspot/share" >>>>>>>>> suggests that the GC group is most active in touching this >>>>>>>>> feature. >>>>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>>>> >>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>>>> working on the GC to OK it. >>>>>>>>> >>>>>>>>> ? John >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jc >>>>>>> >>>>>>> Robbin is on vacation; you might not hear from him for a while. >>>>>>> >>>>>>> I'm assuming you'll open a new bug for this? >>>>>>> >>>>>>> Except for a few minor nits (below), this looks okay to me. >>>>>>> >>>>>>> The comment at line 1052 needs updating. >>>>>>> >>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>>>>> >>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>>>> line 1058, but unreferenced. >>>>>>> >>>>>>> I like the wording of the comment at 1139 better than the >>>>>>> wording at >>>>>>> 1016. 
>>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Thanks, >>>>>> Jc >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Thanks, >>>>> Jc >>>>> >>>> >>>> >> From chris.plummer at oracle.com Fri Jul 20 18:37:22 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 20 Jul 2018 11:37:22 -0700 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <5B507F2C.4080503@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com> <5B507F2C.4080503@oracle.com> Message-ID: <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com> Hi Gary, The test fails if the breakpoint event comes in after the test captures the initial thread suspend counts and before the test captures the 2nd suspend counts. debugger>???????? getting : Map suspendsCounts1 debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} debugger>???????? eventSet.resume; debugger>???????? getting : Map suspendsCounts2 EventHandler> Received event set with policy = SUSPEND_ALL EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled) debugger> Received communication breakpoint event. debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2} So we end up with some threads starting with 1 suspend and ending with 2 (not clear to me why main is still at 1). It will pass if the breakpoint comes in after it does both of suspend count checks, as you have shown with the sleep(100) solution. Output looks like this: debugger>??????? got new ThreadStartEvent with propety 'number' == ThreadStartRequest1 ... debugger> ......--> vm.suspend(); debugger>???????? 
getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
...
debugger> Received communication breakpoint event.

I've also shown that it passes if the breakpoint always comes in before
capturing the initial suspend counts. I added a sleep on the debugger side
right after eventHandler.waitForRequestedEventSet() returns. Output looks like:

debugger> Received communication breakpoint event.
debugger>        got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
...
debugger> ......--> vm.suspend();
debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}

I think we should add synchronization to force one of these two outcomes.
For the first, you would need to make the debugger modify some variable
that the debuggee is watching (sitting in a loop waiting for it to change).
For the second, you can rely on the existing methodForCommunication()
approach. You just need to restructure the debugger a bit. I had started
down this path late Wednesday, but got sidetracked by a few other things.
I can look into it some more if you'd like.

thanks,

Chris

On 7/19/18 5:08 AM, Gary Adams wrote:
> In the successful run below "the first acquire thread suspend counts,
> resume,
> and the second acquire thread suspend counts" is not interrupted by the
> breakpoint event.
>
> Note that the failed thread0 case the test thread finishes rapidly.
> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: > threadName == thread0 *[2018-01-22T20:33:46.86] debugee.stderr> **> > debuggee: 'run': exit :: threadName == thread0* > > and the successful test run , the thread0 run method exits after the > thread1 > has started. > > debugger> :::::: case: # 1 > debugger> ......waiting for new ThreadStartEvent : 1 > EventHandler> waitForRequestedEventSet: enabling remove of listener > nsk.share.jdi.EventHandler$7 at 616bc3ae > EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae > EventHandler> waitForRequestedEventSet: vm.resume called > EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD > *debugee.stderr> **> debuggee:?? 'run': exit?? :: threadName == thread0* > > > Here's a recent mach5 failed log: > [2018-01-22T20:33:45.65] # [2018-01-22T20:33:45.65] export > TEST_CLEANUP [2018-01-22T20:33:45.65] export SHELL > [2018-01-22T20:33:45.65] export DISPLAY [2018-01-22T20:33:45.65] > export LIBJSIG_PATH [2018-01-22T20:33:45.65] export TESTBASE > [2018-01-22T20:33:45.65] export JAVA_OPTS [2018-01-22T20:33:45.65] > export RAS_OPTIONS [2018-01-22T20:33:45.65] export HOME > [2018-01-22T20:33:45.65] export LD_LIBRARY_PATH > [2018-01-22T20:33:45.65] export CLASSPATH [2018-01-22T20:33:45.65] > export TEMP [2018-01-22T20:33:45.65] export TESTED_JAVA_HOME > [2018-01-22T20:33:45.65] export BASH_ENV [2018-01-22T20:33:45.65] > export PATH [2018-01-22T20:33:45.65] TEST_DEST_DIR="resume008" > [2018-01-22T20:33:45.65] # Actual: TEST_DEST_DIR=resume008 > [2018-01-22T20:33:45.65] TESTNAME="${test_case_name}" > [2018-01-22T20:33:45.65] # Actual: TESTNAME=resume008 > [2018-01-22T20:33:45.65] testName="nsk/jdi/EventSet/resume//resume008" > [2018-01-22T20:33:45.65] # Actual: > testName=nsk/jdi/EventSet/resume//resume008 [2018-01-22T20:33:45.65] > TESTDIR="${test_work_dir}" [2018-01-22T20:33:45.65] # Actual: > 
TESTDIR=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008 > [2018-01-22T20:33:45.65] testWorkDir="${test_work_dir}/" > [2018-01-22T20:33:45.65] # Actual: > testWorkDir=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/ > [2018-01-22T20:33:45.65] export testWorkDir [2018-01-22T20:33:45.65] > tlogOutFile="${test_work_dir}/${test_name}.tlog" > [2018-01-22T20:33:45.65] # Actual: > tlogOutFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.tlog > [2018-01-22T20:33:45.65] > testErrFile="${test_work_dir}/${test_name}.err" > [2018-01-22T20:33:45.65] # Actual: > testErrFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.err > [2018-01-22T20:33:45.65] EXECUTE_CLASS="${test_name}" > [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=resume008 > [2018-01-22T20:33:45.66] > NSK_STRESS_METASPACE_OPTS="-XX:MaxMetaspaceSize=128m > -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m > -Xlog:gc(ASTERISK_SUBST),gc+heap=trace" [2018-01-22T20:33:45.66] # > Actual: NSK_STRESS_METASPACE_OPTS=-XX:MaxMetaspaceSize=128m > -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m > 
-Xlog:gc*,gc+heap=trace [2018-01-22T20:33:45.66] export > NSK_STRESS_METASPACE_OPTS [2018-01-22T20:33:45.66] > EXECUTE_CLASS="nsk.jdi.EventSet.resume.resume008" > [2018-01-22T20:33:45.66] # Actual: > EXECUTE_CLASS=nsk.jdi.EventSet.resume.resume008 > [2018-01-22T20:33:45.66] TEST_ARGS="${JDI_TEST_KEYS} > -debugee.vmkeys=${JDI_DEBUGEE_VM_KEYS}" [2018-01-22T20:33:45.66] # > Actual: TEST_ARGS=-verbose -arch=linux-amd64 -waittime=5 > -debugee.vmkind=java -transport.address=dynamic > -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:45.66] > JAVA="${TESTED_JAVA_HOME}/bin/${DEBUGGER_KIND_OF_JAVA}" > [2018-01-22T20:33:45.66] # Actual: > JAVA=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java > [2018-01-22T20:33:45.66] JAVA_OPTS="${DEBUGGER_JAVA_OPTS}" > [2018-01-22T20:33:45.66] # Actual: JAVA_OPTS= [2018-01-22T20:33:45.66] > APPLICATION_TIMEOUT="${TIMEOUT}" [2018-01-22T20:33:45.66] # Actual: > APPLICATION_TIMEOUT=30 [2018-01-22T20:33:45.66] > CLASSPATH="${test_work_dir}${PS}${CLASSPATH}" [2018-01-22T20:33:45.66] > # Actual: > CLASSPATH=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008:/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.test/hotspot/closed/tonga/bin/classes: > [2018-01-22T20:33:45.66] export CLASSPATH [2018-01-22T20:33:45.66] > ${JAVA} ${JAVA_OPTS} ${EXECUTE_CLASS} ${TEST_ARGS} > [2018-01-22T20:33:45.66] # Actual: > /scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java > nsk.jdi.EventSet.resume.resume008 -verbose -arch=linux-amd64 > -waittime=5 -debugee.vmkind=java -transport.address=dynamic > -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 
[2018-01-22T20:33:46.01] > binder> VirtualMachineManager: version 9.0 [2018-01-22T20:33:46.05] > binder> Finding connector: default [2018-01-22T20:33:46.05] binder> > LaunchingConnector: [2018-01-22T20:33:46.06] binder> name: > com.sun.jdi.CommandLineLaunch [2018-01-22T20:33:46.06] binder> > description: Launches target using Sun Java VM command line and > attaches to it [2018-01-22T20:33:46.06] binder> transport: > com.sun.tools.jdi.SunCommandLineLauncher$2 at 457e2f02 > [2018-01-22T20:33:46.19] binder> Connector arguments: > [2018-01-22T20:33:46.19] binder> > home=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10 > [2018-01-22T20:33:46.19] binder> vmexec=java [2018-01-22T20:33:46.19] > binder> options=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.20] > binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" > "-arch=linux-amd64" "-waittime=5" "-debugee.vmkind=java" > "-transport.address=dynamic" > "-debugee.vmkeys=-XX:MaxRAMPercentage=12.5" "-pipe.port=28038" > [2018-01-22T20:33:46.20] binder> quote=" [2018-01-22T20:33:46.20] > binder> suspend=true [2018-01-22T20:33:46.20] binder> Launching > debugee [2018-01-22T20:33:46.56] binder> Waiting for VM initialized > [2018-01-22T20:33:46.60] Initial VMStartEvent received: VMStartEvent > in thread main [2018-01-22T20:33:46.61] EventHandler> Adding listener > nsk.share.jdi.EventHandler$1 at 1e7c7811 [2018-01-22T20:33:46.61] > EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 1a3869f4 > [2018-01-22T20:33:46.61] EventHandler> Adding listener > nsk.share.jdi.EventHandler$3 at 77f99a05 [2018-01-22T20:33:46.61] > EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 > [2018-01-22T20:33:46.61] EventHandler> Adding listener > nsk.share.jdi.EventHandler$5 at 4d3167f4 [2018-01-22T20:33:46.62] > EventHandler> waitForRequestedEvent: enabling remove of listener > nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.62] > 
EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 4eb7f003 > [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: > vm.resume called [2018-01-22T20:33:46.67] EventHandler> Received event > set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.68] > EventHandler> Event: ClassPrepareEventImpl req class prepare request > (enabled) [2018-01-22T20:33:46.69] EventHandler> > waitForRequestedEvent: Received event(ClassPrepareEvent in thread > main) for request(class prepare request (enabled)) > [2018-01-22T20:33:46.69] EventHandler> Removing listener > nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.69] > debugger> Received ClassPrepareEvent for debuggee class: > nsk.jdi.EventSet.resume.resume008a [2018-01-22T20:33:46.71] binder> > Breakpoint set: [2018-01-22T20:33:46.71] breakpoint request > nsk.jdi.EventSet.resume.resume008a:60 (disabled) > [2018-01-22T20:33:46.71] EventHandler> Adding listener > nsk.share.jdi.TestDebuggerType1$1 at 43738a82 [2018-01-22T20:33:46.71] > debugger> TESTING BEGINS [2018-01-22T20:33:46.71] debugger> RESUME > DEBUGGEE VM [2018-01-22T20:33:46.72] debugger> > shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.72] debugger> > shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. > [2018-01-22T20:33:46.84] EventHandler> Received event set with policy > = SUSPEND_ALL [2018-01-22T20:33:46.84] EventHandler> Event: > BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:60 (enabled) > [2018-01-22T20:33:46.84] debugger> Received communication breakpoint > event. [2018-01-22T20:33:46.84] debugger> shouldRunAfterBreakpoint: > received breakpoint event. [2018-01-22T20:33:46.84] debugee.stderr> > **> debuggee: debuggee started! [2018-01-22T20:33:46.85] debugger> > shouldRunAfterBreakpoint: exited with true. 
> [2018-01-22T20:33:46.85] debugger> :::::: case: # 0
> [2018-01-22T20:33:46.85] debugger> ......waiting for new ThreadStartEvent : 0
> [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 6ec8211c
> [2018-01-22T20:33:46.85] EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 6ec8211c
> [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: vm.resume called
> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0
> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0
> [2018-01-22T20:33:46.86] EventHandler> Received event set with policy = SUSPEND_NONE
> [2018-01-22T20:33:46.86] EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled)
> [2018-01-22T20:33:46.86] EventHandler> Event: ThreadStartEventImpl req thread start request (enabled)
> [2018-01-22T20:33:46.86] EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 6ec8211c
> [2018-01-22T20:33:46.86] debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
> [2018-01-22T20:33:46.86] debugger> ......checking up on EventSet.resume()
> [2018-01-22T20:33:46.86] debugger> ......--> vm.suspend();
> [2018-01-22T20:33:46.87] debugger> getting : Map<String, Integer> suspendsCounts1
> [2018-01-22T20:33:46.87] debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
> [2018-01-22T20:33:46.87] debugger> eventSet.resume;
> [2018-01-22T20:33:46.87] debugger> getting : Map<String, Integer> suspendsCounts2
> [2018-01-22T20:33:46.87] EventHandler> Received event set with policy = SUSPEND_ALL
> [2018-01-22T20:33:46.87] EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
> [2018-01-22T20:33:46.87] debugger> Received communication breakpoint event.
> [2018-01-22T20:33:46.87] debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2}
> [2018-01-22T20:33:46.87] debugger> getting : int policy = eventSet.suspendPolicy();
> [2018-01-22T20:33:46.87] debugger> case SUSPEND_NONE
> [2018-01-22T20:33:46.87] debugger> checking Reference Handler
> [2018-01-22T20:33:46.87] # ERROR: debugger> ERROR: suspendCounts don't match for : Reference Handler
> [2018-01-22T20:33:46.88] The following stacktrace is for Aurora. Used to create a RULE:
> [2018-01-22T20:33:46.88] nsk.share.TestFailure: debugger> ERROR: suspendCounts don't match for : Reference Handler
> [2018-01-22T20:33:46.88] at nsk.share.Log.logExceptionForAurora(Log.java:411)
> [2018-01-22T20:33:46.88] at nsk.share.Log.complain(Log.java:380)
> [2018-01-22T20:33:46.88] at nsk.share.jdi.TestDebuggerType1.complain(TestDebuggerType1.java:63)
> [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.testRun(resume008.java:163)
> [2018-01-22T20:33:46.88] at nsk.share.jdi.TestDebuggerType1.runThis(TestDebuggerType1.java:104)
> [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.run(resume008.java:62)
> [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.main(resume008.java:57)
> [2018-01-22T20:33:46.88] # ERROR: debugger> before resuming : 1
> [2018-01-22T20:33:46.88] # ERROR: debugger> after resuming : 2
> [2018-01-22T20:33:46.88] debugger> ......--> vm.resume()
> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: entered
> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: received breakpoint event.
> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: exited with true.
[2018-01-22T20:33:46.88] > debugger> :::::: case: # 1 [2018-01-22T20:33:46.88] debugger> > ......waiting for new ThreadStartEvent : 1 [2018-01-22T20:33:46.88] > EventHandler> waitForRequestedEventSet: enabling remove of listener > nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] > EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 548ad73b > [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: > vm.resume called [2018-01-22T20:33:46.88] EventHandler> Received event > set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.88] > EventHandler> waitForRequestedEventSet: Received event set for > request: thread start request (enabled) [2018-01-22T20:33:46.88] > EventHandler> Event: ThreadStartEventImpl req thread start request > (enabled) [2018-01-22T20:33:46.88] EventHandler> Removing listener > nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] > debugger> got new ThreadStartEvent with propety 'number' == > ThreadStartRequest2 [2018-01-22T20:33:46.88] debugger> ......checking > up on EventSet.resume() [2018-01-22T20:33:46.88] debugger> ......--> > vm.suspend(); [2018-01-22T20:33:46.88] debugger> getting : Map Integer> suspendsCounts1 [2018-01-22T20:33:46.89] debugger> {Reference > Handler=1, thread1=2, Common-Cleaner=1, main=1, Signal Dispatcher=1, > Finalizer=1} [2018-01-22T20:33:46.89] debugger> eventSet.resume; > [2018-01-22T20:33:46.89] debugger> getting : Map > suspendsCounts2 [2018-01-22T20:33:46.89] debugger> {Reference > Handler=1, thread1=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, > Finalizer=1} [2018-01-22T20:33:46.89] debugger> getting : int policy = > eventSet.suspendPolicy(); [2018-01-22T20:33:46.89] debugger> case > SUSPEND_THREAD [2018-01-22T20:33:46.89] debugger> checking Reference > Handler [2018-01-22T20:33:46.89] debugger> checking thread1 > [2018-01-22T20:33:46.89] debugger> checking Common-Cleaner > [2018-01-22T20:33:46.89] debugger> checking main > [2018-01-22T20:33:46.90] 
debugger> checking Signal Dispatcher > [2018-01-22T20:33:46.90] debugger> checking Finalizer > [2018-01-22T20:33:46.90] debugger> ......--> vm.resume() > [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: entered > [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: waiting > for breakpoint event during 1 sec. [2018-01-22T20:33:46.90] > debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 > [2018-01-22T20:33:46.90] debugee.stderr> **> debuggee: 'run': exit :: > threadName == thread1 [2018-01-22T20:33:46.90] EventHandler> Received > event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:60 (enabled) > [2018-01-22T20:33:46.90] debugger> Received communication breakpoint > event. [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: > received breakpoint event. [2018-01-22T20:33:46.90] debugger> > shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.90] > debugger> :::::: case: # 2 [2018-01-22T20:33:46.90] debugger> > ......waiting for new ThreadStartEvent : 2 [2018-01-22T20:33:46.90] > EventHandler> waitForRequestedEventSet: enabling remove of listener > nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] > EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 2641e737 > [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: > vm.resume called [2018-01-22T20:33:46.90] EventHandler> Received event > set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] EventHandler> > waitForRequestedEventSet: Received event set for request: thread start > request (enabled) [2018-01-22T20:33:46.90] EventHandler> Event: > ThreadStartEventImpl req thread start request (enabled) > [2018-01-22T20:33:46.90] EventHandler> Removing listener > nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] > debugger> got new ThreadStartEvent with propety 'number' == > ThreadStartRequest3 
[2018-01-22T20:33:46.90] debugger> ......checking > up on EventSet.resume() [2018-01-22T20:33:46.90] debugger> ......--> > vm.suspend(); [2018-01-22T20:33:46.90] debugger> getting : Map Integer> suspendsCounts1 [2018-01-22T20:33:46.91] debugger> {Reference > Handler=2, thread2=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, > Finalizer=2} [2018-01-22T20:33:46.91] debugger> eventSet.resume; > [2018-01-22T20:33:46.91] debugger> getting : Map > suspendsCounts2 [2018-01-22T20:33:46.91] debugger> {Reference > Handler=1, thread2=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, > Finalizer=1} [2018-01-22T20:33:46.91] debugger> getting : int policy = > eventSet.suspendPolicy(); [2018-01-22T20:33:46.91] debugger> case > SUSPEND_ALL [2018-01-22T20:33:46.91] debugger> checking Reference > Handler [2018-01-22T20:33:46.91] debugger> checking thread2 > [2018-01-22T20:33:46.91] debugger> checking Common-Cleaner > [2018-01-22T20:33:46.91] debugger> checking main > [2018-01-22T20:33:46.91] debugger> checking Signal Dispatcher > [2018-01-22T20:33:46.91] debugger> checking Finalizer > [2018-01-22T20:33:46.91] debugger> ......--> vm.resume() > [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: entered > [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: waiting > for breakpoint event during 1 sec. [2018-01-22T20:33:46.91] > debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 > [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: 'run': exit :: > threadName == thread2 [2018-01-22T20:33:46.91] EventHandler> Received > event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.91] > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:60 (enabled) > [2018-01-22T20:33:46.91] debugger> Received communication breakpoint > event. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: > received breakpoint event. 
[2018-01-22T20:33:46.91] debugger> > shouldRunAfterBreakpoint: received instruction from debuggee to > finish. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: > exited with false. [2018-01-22T20:33:46.91] debugger> TESTING ENDS > [2018-01-22T20:33:46.91] debugger> Waiting for debuggee's exit... > [2018-01-22T20:33:46.91] EventHandler> waitForVMDisconnect > [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: debuggee exits > [2018-01-22T20:33:46.92] EventHandler> Received event set with policy > = SUSPEND_NONE [2018-01-22T20:33:46.92] EventHandler> Event: > VMDeathEventImpl req null [2018-01-22T20:33:46.92] EventHandler> > receieved VMDeath [2018-01-22T20:33:46.92] EventHandler> Removing > listener nsk.share.jdi.EventHandler$3 at 77f99a05 > [2018-01-22T20:33:47.25] EventHandler> Received event set with policy > = SUSPEND_NONE [2018-01-22T20:33:47.25] EventHandler> Event: > VMDisconnectEventImpl req null [2018-01-22T20:33:47.25] EventHandler> > receieved VMDisconnect [2018-01-22T20:33:47.25] EventHandler> Removing > listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 > [2018-01-22T20:33:47.25] EventHandler> finished > [2018-01-22T20:33:47.25] EventHandler> waitForVMDisconnect: done > [2018-01-22T20:33:47.25] debugger> Event handler thread exited. > [2018-01-22T20:33:47.25] debugger> Debuggee PASSED. 
> [2018-01-22T20:33:47.26] [2018-01-22T20:33:47.26] > [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] #> SUMMARY: > Following errors occured [2018-01-22T20:33:47.26] #> during test > execution: [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] # > ERROR: debugger> ERROR: suspendCounts don't match for : Reference > Handler [2018-01-22T20:33:47.26] # ERROR: debugger> before resuming : > 1 [2018-01-22T20:33:47.26] # ERROR: debugger> after resuming : 2 > [2018-01-22T20:33:47.27] # Test level exit status: 97 > > > Here's a recent passed log from a local run: > > ----------System.out:(164/9808)---------- > run [nsk.jdi.EventSet.resume.resume008, -verbose, -arch=linux-x64, > -waittime=5, -debugee.vmkind=java, -transport.address=dynamic, > -debugee.vmkeys=-XX:MaxRAMPercentage=2 ] > binder> VirtualMachineManager: version 11.0 > binder> Finding connector: default > binder> LaunchingConnector: > binder>???? name: com.sun.jdi.CommandLineLaunch > binder>???? description: Launches target using Sun Java VM command > line and attaches to it > binder>???? transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 749dec1a > binder> Connector arguments: > binder> home=/export/users/gradams/ws/jdk-jdk/build/linux-x64/images/jdk > binder>???? vmexec=java > binder>???? options=-XX:MaxRAMPercentage=2 > binder>???? main=nsk.jdi.EventSet.resume.resume008a "-verbose" > "-arch=linux-x64" "-waittime=5" "-debugee.vmkind=java" > "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=2 " > "-pipe.port=35940" > binder>???? quote=" > binder>???? 
suspend=true > binder> Launching debugee > binder> Waiting for VM initialized > Initial VMStartEvent received: VMStartEvent in thread main > EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 2ab41d39 > EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 2e3cb1e2 > EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 57f20df9 > EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 6e72e291 > EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 5889e23e > EventHandler> waitForRequestedEvent: enabling remove of listener > nsk.share.jdi.EventHandler$6 at 46dcda7f > EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 46dcda7f > EventHandler> waitForRequestedEvent: vm.resume called > EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD > EventHandler> Event: ClassPrepareEventImpl req class prepare request? > (enabled) > EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent > in thread main) for request(class prepare request? (enabled)) > EventHandler> Removing listener nsk.share.jdi.EventHandler$6 at 46dcda7f > debugger> Received ClassPrepareEvent for debuggee class: > nsk.jdi.EventSet.resume.resume008a > binder> Breakpoint set: > ??? breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (disabled) > EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 322c2a05 > debugger> TESTING BEGINS > debugger> RESUME DEBUGGEE VM > debugger> shouldRunAfterBreakpoint: entered > debugger> shouldRunAfterBreakpoint: waiting for breakpoint event > during 1 sec. > > debugee.stderr> **> debuggee: debuggee started! > EventHandler> Received event set with policy = SUSPEND_ALL > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:74 (enabled) > debugger> Received communication breakpoint event. > > debugger> shouldRunAfterBreakpoint: received breakpoint event. > debugger> shouldRunAfterBreakpoint: exited with true. 
> debugger> :::::: case: # 0 > debugger> ......waiting for new ThreadStartEvent : 0 > > EventHandler> waitForRequestedEventSet: enabling remove of listener > nsk.share.jdi.EventHandler$7 at 78aa490d > EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 78aa490d > EventHandler> waitForRequestedEventSet: vm.resume called > EventHandler> Received event set with policy = SUSPEND_NONE > debugee.stderr> **> debuggee:?? 'run': enter? :: threadName == thread0 > EventHandler> waitForRequestedEventSet: Received event set for > request: thread start request? (enabled) > EventHandler> Event: ThreadStartEventImpl req thread start request? > (enabled) > EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 78aa490d > EventHandler> Received event set with policy = SUSPEND_ALL > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:74 (enabled) > debugger> Received communication breakpoint event. > > debugger>??????? got new ThreadStartEvent with propety 'number' == > ThreadStartRequest1 > debugger> ......checking up on EventSet.resume() > debugger> ......--> vm.suspend(); > debugger>???????? getting : Map suspendsCounts1 > debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, > Signal Dispatcher=2, Finalizer=2} > debugger>???????? eventSet.resume; > debugger>???????? getting : Map suspendsCounts2 > debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, > Signal Dispatcher=2, Finalizer=2} > debugger>???????? getting : int policy = eventSet.suspendPolicy(); > debugger>???????? case SUSPEND_NONE > debugger>???????? checking Reference Handler > debugger>???????? checking thread0 > debugger>???????? checking Common-Cleaner > debugger>???????? checking main > debugger>???????? checking Signal Dispatcher > debugger>???????? checking Finalizer > debugger> ......--> vm.resume() > debugger> shouldRunAfterBreakpoint: entered > debugger> shouldRunAfterBreakpoint: received breakpoint event. 
> debugger> shouldRunAfterBreakpoint: exited with true. > debugger> :::::: case: # 1 > debugger> ......waiting for new ThreadStartEvent : 1 > EventHandler> waitForRequestedEventSet: enabling remove of listener > nsk.share.jdi.EventHandler$7 at 616bc3ae > EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae > EventHandler> waitForRequestedEventSet: vm.resume called > EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD > debugee.stderr> **> debuggee:?? 'run': exit?? :: threadName == thread0 > EventHandler> waitForRequestedEventSet: Received event set for > request: thread start request? (enabled) > EventHandler> Event: ThreadStartEventImpl req thread start request? > (enabled) > EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 616bc3ae > debugger>??????? got new ThreadStartEvent with propety 'number' == > ThreadStartRequest2 > debugger> ......checking up on EventSet.resume() > debugger> ......--> vm.suspend(); > debugger>???????? getting : Map suspendsCounts1 > debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, > Signal Dispatcher=1, Finalizer=1} > debugger>???????? eventSet.resume; > debugger>???????? getting : Map suspendsCounts2 > debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, > Signal Dispatcher=1, Finalizer=1} > debugger>???????? getting : int policy = eventSet.suspendPolicy(); > debugger>???????? case SUSPEND_THREAD > debugger> checking Reference Handler > debugger> checking thread1 > debugger> checking Common-Cleaner > debugger> checking main > debugger> checking Signal Dispatcher > debugger> checking Finalizer > debugger> ......--> vm.resume() > debugger> shouldRunAfterBreakpoint: entered > debugger> shouldRunAfterBreakpoint: waiting for breakpoint event > during 1 sec. > debugee.stderr> **> debuggee:?? 'run': enter? :: threadName == thread1 > debugee.stderr> **> debuggee:?? 'run': exit?? 
:: threadName == thread1 > EventHandler> Received event set with policy = SUSPEND_ALL > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:74 (enabled) > debugger> Received communication breakpoint event. > debugger> shouldRunAfterBreakpoint: received breakpoint event. > debugger> shouldRunAfterBreakpoint: exited with true. > debugger> :::::: case: # 2 > debugger> ......waiting for new ThreadStartEvent : 2 > EventHandler> waitForRequestedEventSet: enabling remove of listener > nsk.share.jdi.EventHandler$7 at 44e265ef > EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 44e265ef > EventHandler> waitForRequestedEventSet: vm.resume called > EventHandler> Received event set with policy = SUSPEND_ALL > EventHandler> waitForRequestedEventSet: Received event set for > request: thread start request? (enabled) > EventHandler> Event: ThreadStartEventImpl req thread start request? > (enabled) > EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 44e265ef > debugger>??????? got new ThreadStartEvent with propety 'number' == > ThreadStartRequest3 > debugger> ......checking up on EventSet.resume() > debugger> ......--> vm.suspend(); > debugger>???????? getting : Map suspendsCounts1 > debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, > Signal Dispatcher=2, Finalizer=2} > debugger>???????? eventSet.resume; > debugger>???????? getting : Map suspendsCounts2 > debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, > Signal Dispatcher=1, Finalizer=1} > debugger>???????? getting : int policy = eventSet.suspendPolicy(); > debugger>???????? 
case SUSPEND_ALL > debugger> checking Reference Handler > debugger> checking thread2 > debugger> checking Common-Cleaner > debugger> checking main > debugger> checking Signal Dispatcher > debugger> checking Finalizer > debugger> ......--> vm.resume() > debugger> shouldRunAfterBreakpoint: entered > debugger> shouldRunAfterBreakpoint: waiting for breakpoint event > during 1 sec. > debugee.stderr> **> debuggee:?? 'run': enter? :: threadName == thread2 > debugee.stderr> **> debuggee:?? 'run': exit?? :: threadName == thread2 > EventHandler> Received event set with policy = SUSPEND_ALL > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:74 (enabled) > debugger> Received communication breakpoint event. > debugger> shouldRunAfterBreakpoint: received breakpoint event. > debugger> shouldRunAfterBreakpoint: received instruction from debuggee > to finish. > debugger> shouldRunAfterBreakpoint: exited with false. > debugger> TESTING ENDS > debugger> Waiting for debuggee's exit... > debugee.stderr> **> debuggee: debuggee exits > EventHandler> waitForVMDisconnect > EventHandler> Received event set with policy = SUSPEND_NONE > EventHandler> Event: VMDeathEventImpl req null > EventHandler> receieved VMDeath > EventHandler> Removing listener nsk.share.jdi.EventHandler$3 at 57f20df9 > EventHandler> Received event set with policy = SUSPEND_NONE > EventHandler> Event: VMDisconnectEventImpl req null > EventHandler> receieved VMDisconnect > EventHandler> Removing listener nsk.share.jdi.EventHandler$4 at 6e72e291 > EventHandler> finished > EventHandler> waitForVMDisconnect: done > debugger> Event handler thread exited. > debugger> Debuggee PASSED. > > On 7/18/18, 6:09 PM, gary.adams at oracle.com wrote: >> On 7/18/18 4:47 PM, Chris Plummer wrote: >>> Hi Gary >>> >>> Ok, so shouldRunAfterBreakpoint() is the code that does the >>> eventHandler.wait(), so it gets the eventHandler.notifyAll() >>> notification from the BreakpointEvent handler. 
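The handshake described above (a listener bumps a counter and calls notifyAll(), a waiter blocks until the counter is non-zero) can be sketched in isolation. This is a minimal illustration of the pattern only, not the nsk test library code: the class name BreakpointHandshake, the lock field, and the methods onBreakpoint() and awaitBreakpoint() are hypothetical, and how shouldRunAfterBreakpoint() is really implemented is an assumption here. Only the bpCount counter and the notifyAll() call come from the listener quoted in this thread.

```java
class BreakpointHandshake {
    private final Object lock = new Object();
    private int bpCount = 0; // number of breakpoint events not yet consumed

    /** Event-thread side: record the event and wake up any waiter. */
    void onBreakpoint() {
        synchronized (lock) {
            bpCount++;
            lock.notifyAll();
        }
    }

    /**
     * Test-thread side: wait up to timeoutMillis for a pending event.
     * Returns true and consumes one event if one arrived, false on
     * timeout or interruption.
     */
    boolean awaitBreakpoint(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            // Loop to guard against spurious wakeups.
            while (bpCount == 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                try {
                    lock.wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            bpCount--;
            return true;
        }
    }

    public static void main(String[] args) {
        BreakpointHandshake handshake = new BreakpointHandshake();
        // Stand-in for the JDI event handler thread delivering one event.
        Thread eventThread = new Thread(handshake::onBreakpoint, "event-thread");
        eventThread.start();
        boolean received = handshake.awaitBreakpoint(1000);
        System.out.println("breakpoint received: " + received);
    }
}
```

Because the counter is checked under the lock before waiting, an event that arrives before the waiter starts waiting is not lost. That is the property a plain sleep() cannot give.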
>>> >>> And as a side note, I see now that resumption of execution after the >>> breakpoint at main() is done by: >>> >>> // after waitForClassPrepared() main debuggee thread is >>> suspended, resume it before test start >>> display("RESUME DEBUGGEE VM"); >>> vm.resume(); >>> >>> testRun(); >>> >>> shouldRunAfterBreakpoint() is returning true until the end of the >>> test when the debuggee executes "instruction = end". That's why >>> runTests() does a "break" when shouldRunAfterBreakpoint() returns >>> false. So this means the code that is checking >>> shouldRunAfterBreakpoint() is not resuming execution for the first >>> few (probably 3) methodForCommunication() breakpoints. However, it >>> does make sure that runTests() blocks until the BreakpointEvent has >>> been processed. >>> >>> You point out the vm.resume() at the bottom of the loop in >>> runTests(), but that's only after a bunch of ThreadStartEvent >>> processing above it has been done already. The ThreadStartEvent >>> would never get generated if there was not a resume at some point >>> earlier. I think it is happening during the >>> eventHandler.waitForRequestedEventSet() call, which does a vm.resume(). >>> >>> So if I understand the order of things now: >>> >>> -shouldRunAfterBreakpoint() returns after the first >>> methodForCommunication() is hit. At this point we know the first >>> thread has been created, but there has been no attempt to start it yet. The >>> debuggee is suspended at this point. >>> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also >>> does a vm.resume(). >>> -The debuggee starts the thread and then does another >>> methodForCommunication() (this 2nd one is actually after the 2nd >>> thread has been created, but not yet started). Now we have a race. >>> Do we get the ThreadStartEvent first or the BreakpointEvent? This is >>> because when the ThreadStartEvent is generated, the thread is not >>> suspended due to SUSPEND_NONE.
Even if the ThreadStartEvent comes in >>> first, the async handling of the BreakpointEvent can cause problems >>> during the ThreadStartEvent processing. >> Based on the failed log in the bug report, the thread start event is >> observed, >> the suspend counts acquired, then after the resume, the breakpoint >> message >> is displayed and the second set of suspend counts acquired. >> >> I can show you the passed and failed logs tomorrow. >>> -You added a 100ms delay after the thread has started, but before >>> methodForCommunication(), hoping it will make it so the >>> ThreadStartEvent can be received and fully processed before the >>> BreakpointEvent is. >> The delay is mostly just a yield so the debugger gets a chance to run. >>> >>> I think it would be preferable to fix this by doing better >>> synchronization. After all, that is the approach the test originally >>> took. It could have been written with a bunch of sleep() delays >>> instead, but that in general is not a very good approach. >>> >>> What if you added a shouldRunAfterBreakpoint() call after >>> the ThreadStartEvent arrives? At this point you would know that the >>> vm is suspended due to the breakpoint, so no need for: >>> >>> display("......checking up on EventSet.resume()"); >>> display("......--> vm.suspend();"); >>> vm.suspend(); >> I think the suspend is intentional to capture the suspend counts. >> It also needs to resume the vm and acquire again so it can confirm >> the correct >> suspend count behaviors. >> If the test waits to capture the second set of suspend counts, the >> breakpoint >> causes incorrect values. >> >> ... >>> >>> You might then also need to add another methodForCommunication() >>> call at the end of case 0 and 1 in the debuggee, although I think >>> you could instead just change the shouldRunAfterBreakpoint() at the >>> start of the loop.
I think that check actually belongs at the end of >>> the loop, and only for case 2. In fact it would be an error if >>> shouldRunAfterBreakpoint() did not return true in that case. Then >>> you also need to add a shouldRunAfterBreakpoint() at the start of >>> case 0 to get things rolling (and I think at the start of case 1 also). >>> >>> Chris >>> >>> >>> On 7/18/18 12:45 PM, Gary Adams wrote: >>>> Answers below? ... >>>> >>>> On 7/18/18, 2:50 PM, Chris Plummer wrote: >>>>> Hi Gary, >>>>> >>>>> Who does the resume for the breakpoint event? >>>>> >>>>> ??????? eventHandler.addListener( >>>>> ???????????? new EventHandler.EventListener() { >>>>> ???????????????? public boolean eventReceived(Event event) { >>>>> ??????????????????? if (event instanceof BreakpointEvent && >>>>> bpRequest.equals(event.request())) { >>>>> ??????????????????????? synchronized(eventHandler) { >>>>> ??????????????????????????? display("Received communication >>>>> breakpoint event."); >>>>> ??????????????????????????? bpCount++; >>>>> ??????????????????????????? eventHandler.notifyAll(); >>>>> ??????????????????????? } >>>>> ??????????????????????? return true; >>>>> ??????????????????? } >>>>> ??????????????????? return false; >>>>> ???????????????? } >>>>> ???????????? } >>>>> ??????? ); >>>> I believe you are looking for this sequence. >>>> At the top of the loop a check is made if >>>> resume() should be called "shouldRunAfterBreakpoint". >>>> lines 96-99 is an early termination. And at the >>>> bottom of the loop, line 240, is the normal >>>> continue the test to the next case. >>>> >>>> resume008.java : >>>> ... >>>> ??? 94??????????? for (int i = 0; ; i++) { >>>> ??? 95 >>>> >>>> ??? 96??????????????? if (!shouldRunAfterBreakpoint()) { >>>> ??? 97??????????????????? vm.resume(); >>>> ??? 98??????????????????? break; >>>> ??? 99??????????????? } >>>> >>>> 100 >>>> ?? 101 >>>> ?? 102??????????????? display(":::::: case: # " + i); >>>> ?? 103 >>>> ?? 104??????????????? 
switch (i) { >>>> ?? 105 >>>> ?? 106??????????????????? case 0: >>>> ?? 107??????????????????? eventRequest = settingThreadStartRequest ( >>>> ?? 108 SUSPEND_NONE, "ThreadStartRequest1"); >>>> ... >>>> ? 238 >>>> ?? 239??????????????? display("......--> vm.resume()"); >>>> ?? 240??????????????? vm.resume(); >>>> ?? 241??????????? } >>>>> >>>>> Also: >>>>> >>>>>> ? 1. On a thread start event the debugee is suspended, line 141 >>>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE >>>>> was used. >>>> The thread start event is set to SUSPEND_NONE for thread0, but when >>>> the thread start event is observed the resume008 test suspends the vm >>>> immediately after fetching the "number" property. >>> My point is that the Debuggee continues to run after the >>> ThreadStartEvent is sent, and relies on the debugger to stop it >>> after receiving the event. But in the meantime the debuggee has >>> advanced to the next breakpoint, but only sometimes, thus the bug >>> you are seeing. >>>> >>>> ?? 132??????????????? if ( !(newEvent instanceof ThreadStartEvent)) { >>>> ?? 133??????????????????? setFailedStatus("ERROR: new event is not >>>> ThreadStartEvent"); >>>> ?? 134??????????????? } else { >>>> ?? 135 >>>> ?? 136??????????????????? String property = (String) >>>> newEvent.request().getProperty("number"); >>>> ?? 137??????????????????? display("?????? got new ThreadStartEvent >>>> with propety 'number' == " + property); >>>> ?? 138 >>>> ?? 139??????????????????? display("......checking up on >>>> EventSet.resume()"); >>>> ?? 140??????????????????? display("......--> vm.suspend();"); >>>> ?? 141??????????????????? vm.suspend(); >>>> >>>> >>>>> >>>>> Chris >>>>> >>>>> On 7/18/18 4:52 AM, Gary Adams wrote: >>>>>> There is nothing wrong with the breakpoint in >>>>>> methodForCommunication. >>>>>> The test uses it to make sure the threads are each tested >>>>>> separately. 
>>>>>> The breakpoint eventhandler just displays a message, increments a >>>>>> counter >>>>>> and returns. >>>>>> >>>>>> Let me step through resume008a the debugee to help clarify ... >>>>>> >>>>>> 1. The test thread is created and the synchronized break point is >>>>>> observed. lines 101-102 >>>>>> 2. The thread is started. lines 104,135-137 >>>>>> ??? 2a. The main thread blocks on a local object. lines 133, 139 >>>>>> ??? 2b. The test thread is started. lines 137, >>>>>> ?????????? A run entered message is displayed, line 159 >>>>>> ?????????? The main thread lock object is notified, line 167 >>>>>> ????????? 2b1. The main thread continues. line 167, 146 >>>>>> ????????????????? The next test thread is created. line 106 >>>>>> ????????????????? The synchronized breakpoint is observed, line 107 >>>>>> ????????? 2b2. A run exited message is displayed, line 169 >>>>>> >>>>>> On the resume008 debugger side? ... >>>>>> ? 1. On a thread start event the debugee is suspended, line 141 >>>>>> ? 2. Messages are displayed and a first set of thread suspend >>>>>> counts is acquired. lines 143-151 >>>>>> ? 3. The threads are resumed, line 152 >>>>>> ---> >>>>>> ? 4.? Messages are displayed and a second set of thread suspend >>>>>> counts is acquired. lines 154-159 >>>>>> >>>>>> The way the test is written the expectation is the debugger steps >>>>>> 2,3,4 will all happen >>>>>> while the test thread is running. >>>>>> >>>>>> When the debugger resumes the debuggee threads (debugger step 3) >>>>>> the debuggee continues from where it left off (debuggee steps >>>>>> 2b,2b1,2b2) >>>>>> >>>>>> If we complete debuggee step 2b1 (line 107) before the debugger >>>>>> completes step 4 line 159, >>>>>> then the synchronized breakpoint will suspend the vm and the >>>>>> counts will not match >>>>>> for the SUSPEND_NONE test thread start. >>>>>> >>>>>> resume008a.java: >>>>>> >>>>>> ?? 100??????????????????????? case 0: >>>>>> ?? 101??????????????????????????????? 
thread0 = new >>>>>> Threadresume008a("thread0"); >>>>>> ?? 102 methodForCommunication(); >>>>>> ?? 103 >>>>>> ?? 104 threadStart(thread0); >>>>>> ?? 105 >>>>>> ?? 106??????????????????????????????? thread1 = new >>>>>> Threadresume008a("thread1"); >>>>>> ?? 107 methodForCommunication(); >>>>>> ?? 108??????????????????????????????? break; >>>>>> >>>>>> ?? ... >>>>>> ?? 135??????? static int threadStart(Thread t) { >>>>>> ?? 136??????????? synchronized (waitnotifyObj) { >>>>>> ?? 137??????????????? t.start(); >>>>>> ?? 138??????????????? try { >>>>>> ?? 139??????????????????? waitnotifyObj.wait(); >>>>>> ?? 140??????????????? } catch ( Exception e) { >>>>>> ?? 141??????????????????? exitCode = FAILED; >>>>>> ?? 142??????????????????? logErr("?????? Exception : " + e ); >>>>>> ?? 143??????????????????? return FAILED; >>>>>> ?? 144??????????????? } >>>>>> ?? 145??????????? } >>>>>> ?? 146??????????? return PASSED; >>>>>> ?? 147??????? } >>>>>> >>>>>> ?? 149??????? static class Threadresume008a extends Thread { >>>>>> ?? ... >>>>>> ?? 157 >>>>>> ?? 158??????????? public void run() { >>>>>> ?? 159??????????????? log1("? 'run': enter? :: threadName == " + >>>>>> tName); >>>>>> >>>>>> This is the proposed fix that will let the debugger complete it's >>>>>> second >>>>>> acquisition of suspend counts while the test thread is still >>>>>> running. >>>>>> >>>>>> ?? 160??????????????? // Yield, so the start thread event >>>>>> processing can be completed. >>>>>> ?? 161??????????????? try { >>>>>> ?? 162??????????????????? Thread.sleep(100); >>>>>> ?? 163??????????????? } catch (InterruptedException e) { >>>>>> ?? 164??????????????????? // ignored >>>>>> ?? 165??????????????? } >>>>>> >>>>>> ?? 166??????????????? synchronized (waitnotifyObj) { >>>>>> ?? 167??????????????????????? waitnotifyObj.notify(); >>>>>> ?? 168??????????????? } >>>>>> ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + >>>>>> tName); >>>>>> ?? 170??????????????? return; >>>>>> ?? 
171??????????? } >>>>>> ?? 172??????? } >>>>>> ?? 150 >>>>>> ?? 151??????????? String tName = null; >>>>>> ?? 152 >>>>>> ?? 153??????????? public Threadresume008a(String threadName) { >>>>>> ?? 154??????????????? super(threadName); >>>>>> ?? 155??????????????? tName = threadName; >>>>>> ?? 156??????????? } >>>>>> ?? 157 >>>>>> ?? 158??????????? public void run() { >>>>>> ?? 159??????????????? log1("? 'run': enter? :: threadName == " + >>>>>> tName); >>>>>> ?? 160??????????????? // Yield, so the start thread event >>>>>> processing can be completed. >>>>>> ?? 161??????????????? try { >>>>>> ?? 162??????????????????? Thread.sleep(100); >>>>>> ?? 163??????????????? } catch (InterruptedException e) { >>>>>> ?? 164??????????????????? // ignored >>>>>> ?? 165??????????????? } >>>>>> ?? 166??????????????? synchronized (waitnotifyObj) { >>>>>> ?? 167??????????????????????? waitnotifyObj.notify(); >>>>>> ?? 168??????????????? } >>>>>> ?? 169??????????????? log1("? 'run': exit?? :: threadName == " + >>>>>> tName); >>>>>> ?? 170??????????????? return; >>>>>> ?? 171??????????? } >>>>>> ?? 172??????? } >>>>>> >>>>>> >>>>>> >>>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote: >>>>>>> Hi Gary, >>>>>>> >>>>>>> I've been having trouble following the control flow of this >>>>>>> test. One thing I've stumbled across is the following: >>>>>>> >>>>>>> ??????????? /* A debuggee class must define >>>>>>> 'methodForCommunication' >>>>>>> ???????????? * method and invoke it in points of synchronization >>>>>>> ???????????? * with a debugger. >>>>>>> ???????????? */ >>>>>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >>>>>>> >>>>>>> So why isn't this mode of synchronization good enough? Is it >>>>>>> because it was not designed with the understanding that the >>>>>>> debugger might be doing suspended thread counts, and suspending >>>>>>> all threads at the breakpoint messes up the test? 
>>>>>>> >>>>>>> From what I can tell of the test, after the debuggee is started >>>>>>> and hits the default breakpoint at the start of main(), the >>>>>>> debugger then does a vm.resume() at the start of the for loop in >>>>>>> the runTest() method. The debuggee then creates a thread and >>>>>>> calls methodForCommunication(). There is already a breakpoint >>>>>>> set there by the above debuggee code. It's unclear to me what >>>>>>> happens as a result of this breakpoint and how it serves the >>>>>>> test. Also unclear to me who is responsible for the vm.resume() >>>>>>> after the breakpoint is hit. >>>>>>> >>>>>>> The debugger then requests all ThreadStart events, requesting >>>>>>> that no threads be disabled when it is sent. I think you are >>>>>>> saying that when the ThreadStart event comes in, sometimes we >>>>>>> are at the methodForCommunication breakpoint, with all threads >>>>>>> disabled, and this messes up the thread suspend counts. You want >>>>>>> to delay 100ms so the breakpoint event can be processed and >>>>>>> threads resumed again (although I can't see who actually resumes >>>>>>> the thread after hitting the methodForCommunication breakpoint). >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/17/18 8:33 AM, Gary Adams wrote: >>>>>>>> A race condition exists between the debugger and the debuggee. >>>>>>>> >>>>>>>> The first test thread is started with SUSPEND_NONE policy set. >>>>>>>> While processing the thread start event the debugger captures >>>>>>>> an initial set of thread suspend counts and resumes the >>>>>>>> debuggee vm. If the debuggee advances quickly it reaches >>>>>>>> the breakpoint set for methodForCommunication. Since the >>>>>>>> breakpoint >>>>>>>> carries with it SUSPEND_ALL policy, when the debugger captures >>>>>>>> a second >>>>>>>> set of suspend counts, it will not match the expected counts for >>>>>>>> a SUSPEND_NONE scenario. 
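The race described above can be illustrated with a toy model of per-thread suspend counts. This is a sketch under stated assumptions, not JDI code: the class SuspendCountModel and all of its methods are hypothetical stand-ins, the thread names are just examples, and the model only assumes that a VM-wide suspend (vm.suspend() or a SUSPEND_ALL event) raises every thread's count by one, while resuming an event set delivered with SUSPEND_NONE changes nothing.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SuspendCountModel {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    SuspendCountModel(String... threadNames) {
        for (String name : threadNames) {
            counts.put(name, 0);
        }
    }

    /** Models vm.suspend() or a SUSPEND_ALL event: every count goes up by one. */
    void suspendAll() {
        counts.replaceAll((name, count) -> count + 1);
    }

    /** Models resuming a SUSPEND_NONE event set: nothing was suspended, so nothing changes. */
    void resumeSuspendNoneEventSet() {
        // intentionally empty
    }

    Map<String, Integer> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    /**
     * Replays the debugger's two samples. The flag says whether the debuggee
     * reached the SUSPEND_ALL communication breakpoint between them.
     */
    static boolean countsMatch(boolean breakpointHitBetweenSamples) {
        SuspendCountModel vm = new SuspendCountModel("main", "thread0");
        vm.suspendAll();                          // debugger: vm.suspend()
        Map<String, Integer> before = vm.snapshot();
        vm.resumeSuspendNoneEventSet();           // debugger: eventSet.resume()
        if (breakpointHitBetweenSamples) {
            vm.suspendAll();                      // debuggee: breakpoint with SUSPEND_ALL
        }
        Map<String, Integer> after = vm.snapshot();
        return before.equals(after);
    }

    public static void main(String[] args) {
        System.out.println("no breakpoint in between: " + countsMatch(false)); // prints true
        System.out.println("breakpoint in between:    " + countsMatch(true));  // prints false
    }
}
```

In the second case the model reproduces the shape of the failure in the log (1 before resuming, 2 after), which is why the counts cannot match when the breakpoint lands between the two samples.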
>>>>>>>> >>>>>>>> The proposed fix introduces a yield in the debuggee test thread >>>>>>>> run method >>>>>>>> to allow the debugger to get the expected sampled values. >>>>>>>> >>>>>>>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>>>>>>> ? Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>>>>>>> >>>>>>>> >>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>>>>>>> >>>>>>>> ... >>>>>>>> ?? 186??????? private void >>>>>>>> setCommunicationBreakpoint(ReferenceType refType, String >>>>>>>> methodName) { >>>>>>>> ?? 187??????????? Method method = >>>>>>>> debuggee.methodByName(refType, methodName); >>>>>>>> ?? 188??????????? Location location = null; >>>>>>>> ?? 189??????????? try { >>>>>>>> ?? 190??????????????? location = method.allLineLocations().get(0); >>>>>>>> ?? 191??????????? } catch (AbsentInformationException e) { >>>>>>>> ?? 192??????????????? throw new Failure(e); >>>>>>>> ?? 193??????????? } >>>>>>>> ?? 194??????????? bpRequest = debuggee.makeBreakpoint(location); >>>>>>>> ?? 195 >>>>>>>> >>>>>>>> ?? 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>>>>>>> >>>>>>>> ?? 197??????????? bpRequest.putProperty("number", "zero"); >>>>>>>> ?? 198??????????? bpRequest.enable(); >>>>>>>> ?? 199 >>>>>>>> ?? 200??????????? eventHandler.addListener( >>>>>>>> ?? 201???????????????? new EventHandler.EventListener() { >>>>>>>> ?? 202???????????????????? public boolean eventReceived(Event >>>>>>>> event) { >>>>>>>> ?? 203??????????????????????? if (event instanceof >>>>>>>> BreakpointEvent && bpRequest.equals(event.request())) { >>>>>>>> ?? 204 synchronized(eventHandler) { >>>>>>>> ?? 205 display("Received communication breakpoint event."); >>>>>>>> ?? 206??????????????????????????????? bpCount++; >>>>>>>> ?? 207 eventHandler.notifyAll(); >>>>>>>> ?? 208??????????????????????????? } >>>>>>>> ?? 209??????????????????????????? return true; >>>>>>>> ?? 210??????????????????????? } >>>>>>>> ?? 
211??????????????????????? return false; >>>>>>>> ?? 212???????????????????? } >>>>>>>> ?? 213???????????????? } >>>>>>>> ?? 214??????????? ); >>>>>>>> ?? 215??????? } >>>>>>>> >>>>>>>> >>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>>>>>>> >>>>>>>> ... >>>>>>>> ?? 140??????????????????? display("......--> vm.suspend();"); >>>>>>>> ?? 141??????????????????? vm.suspend(); >>>>>>>> ?? 142 >>>>>>>> ?? 143??????????????????? display("??????? getting : >>>>>>>> Map suspendsCounts1"); >>>>>>>> ?? 144 >>>>>>>> ?? 145??????????????????? Map suspendsCounts1 >>>>>>>> = new HashMap(); >>>>>>>> ?? 146??????????????????? for (ThreadReference threadReference >>>>>>>> : vm.allThreads()) { >>>>>>>> ?? 147 suspendsCounts1.put(threadReference.name(), >>>>>>>> threadReference.suspendCount()); >>>>>>>> ?? 148??????????????????? } >>>>>>>> ?? 149 display(suspendsCounts1.toString()); >>>>>>>> ?? 150 >>>>>>>> ?? 151??????????????????? display(" eventSet.resume;"); >>>>>>>> ?? 152??????????????????? eventSet.resume(); >>>>>>>> ?? 153 >>>>>>>> ?? 154??????????????????? display("??????? getting : >>>>>>>> Map suspendsCounts2"); >>>>>>>> >>>>>>>> This is where the breakpoint is encountered before the second >>>>>>>> set of suspend counts is acquired. >>>>>>>> >>>>>>>> ?? 155??????????????????? Map suspendsCounts2 >>>>>>>> = new HashMap(); >>>>>>>> ?? 156??????????????????? for (ThreadReference threadReference >>>>>>>> : vm.allThreads()) { >>>>>>>> ?? 157 suspendsCounts2.put(threadReference.name(), >>>>>>>> threadReference.suspendCount()); >>>>>>>> ?? 158??????????????????? } >>>>>>>> ?? 
159 display(suspendsCounts2.toString()); >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From gary.adams at oracle.com Fri Jul 20 19:11:24 2018 From: gary.adams at oracle.com (Gary Adams) Date: Fri, 20 Jul 2018 15:11:24 -0400 Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false In-Reply-To: <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> References: <5B082D2E.7000408@oracle.com> <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> Message-ID: <5B5233DC.5040003@oracle.com> Here's another attempt to clear up the overlapping output from the command processing and event handler in the jdb tests. The fundamental problem is observed when "prompts" are produced interleaved with command and event output. This attempts to fix the issue by buffering the output and printing it fully assembled. Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.01/ On 5/26/18, 6:50 AM, gary.adams at oracle.com wrote: > This is a review request for a previously closed test bug. > The test was recently moved to the open repos, and the > proposed fix is in the open code. > > Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/ > > > -------- Forwarded Message -------- > Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot > find boolVar with expected value: false > Date: Fri, 25 May 2018 11:35:10 -0400 > From: Gary Adams > Reply-To: gary.adams at oracle.com > > > > > > The jdb tests use stdin to send commands to a jdb process > and parses the stdout to determine if a command was > successful and when the process is prompting for new commands > to be sent. > > Some commands are synchronous, so when the command is completed > a new prompt is sent back immediately. > > Some commands are asynchronous, so there could be a delay > until a breakpoint is reached. The event handler then sends a prompt > when the application thread is stopped and new jdb commands can be sent. 
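The synchronous/asynchronous distinction above can be sketched as a small decision helper. This is only an illustration of the idea, not the code in the webrev: the class PromptPolicy and the method promptImmediately() are hypothetical names, and the set of asynchronous commands is limited to "cont", "step" and "next", the three this message goes on to name.

```java
import java.util.Set;

class PromptPolicy {
    // Commands whose prompt should come from the event handler once the
    // debuggee stops again (assumption: only the three named in this thread).
    private static final Set<String> ASYNC_COMMANDS = Set.of("cont", "step", "next");

    /**
     * Returns true if a prompt may be printed as soon as the command has
     * been processed, false if the prompt should be left to the event
     * handler (for example after "Breakpoint hit:" or "Step completed:").
     */
    static boolean promptImmediately(String commandLine) {
        String trimmed = commandLine.trim();
        if (trimmed.isEmpty()) {
            return true; // nothing was started, so nothing to wait for
        }
        String commandName = trimmed.split("\\s+")[0];
        return !ASYNC_COMMANDS.contains(commandName);
    }

    public static void main(String[] args) {
        for (String cmd : new String[] {"locals", "cont", "step up", "next"}) {
            System.out.println(cmd + " -> prompt immediately: " + promptImmediately(cmd));
        }
    }
}
```

Keeping the decision in one place means a test harness that waits for the prompt never sees one until the debuggee has actually stopped.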
>
> The problem causing the intermittent failures was a corruption in the
> output stream when prompts were being sent at the wrong times.
>
> Instead of receiving
> "Breakpoint hit:"
>
>
> the log contained
> "Breakpoint hit:"
>
> Once out of sync, jdb commands were being sent prematurely
> and the wrong values were being compared against expected behavior.
> The simple fix proposed here recognizes that commands like "cont",
> "step" and "next" are asynchronous commands and should not send back
> a prompt immediately. Instead, the event handler will deliver the next prompt
> when the next "Breakpoint hit:" or "Step completed:" state change occurs.
>
> The bulk of the testing was done on windows-x64-debug builds where the
> intermittent failures were observed in about 5 of 1000 test runs. The fix has
> also been tested on linux-x64-debug, solaris-sparcv9-debug,
> and macosx-x64-debug, even though the failures have never been reported
> against those platforms.
>
> Failures have been observed in many of the nsk/jdb tests with similar corrupted
> output streams, but never directly associated with this issue before.
>
> redefine001, caught_exception002, locals002, eval001, next001,
> stop_at003, step002, print002, trace001, step_up001, read001,
> clear004, kill001, set001
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From jcbeyler at google.com Fri Jul 20 19:37:56 2018 From: jcbeyler at google.com (JC Beyler) Date: Fri, 20 Jul 2018 12:37:56 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> Message-ID: Yes that is right, this is the latest: http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ I apologize for the multiple threads and confusion, Jc On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Thank you a lot, Vladimir! > Yes, the webrev.03 is the latest. > Jc, will correct us if it is not right. > > Thanks, > Serguei > > > On 7/20/18 10:52, Vladimir Kozlov wrote: > > I asked Igor V. to look. > > > > Seems like review is done in an other thread which does not have bug > > id in subject. Currently webrev.03 > > > > Vladimir > > > > On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: > >> Thanks, Rahul! > >> In fact, there no good experts for this area in the serviceability team. > >> It would be much better if anyone from the Compiler team could do it. > >> > >> Vladimir K., > >> > >> Is there anyone from the Compiler team available to review this? > >> Otherwise, I could try to review it but am not sure about my review > >> quality. 
> >> > >> Thanks, > >> Serguei > >> > >> > >> On 7/19/18 00:48, Rahul Raghavan wrote: > >>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled > >>> > >>> (just adding + hotspot-compiler-dev also) > >>> > >>> > >>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: > >>> Subject Was: > >>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled > >>> > >>> + serviceability-dev > >>> > >>> Hi all, > >>> > >>> Could anyone else give me a review of this webrev and check/test the > >>> various architecture changes? > >>> > >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>> > >>> > >>> Thanks for all your help! > >>> Jc > >>> > >>> > >>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler > wrote: > >>>> > >>>>> Hi all, > >>>>> > >>>>> Here is a webrev that does all the architectures in the same way: > >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ > >>>>> > >>>>> Could anyone review the other architectures and test? > >>>>> - arm, sparc & aarch64 are also modified now to follow the same > >>>>> "if no > >>>>> tlab, then consider eden space allocation" logic. > >>>>> > >>>>> Thanks for your help! > >>>>> Jc > >>>>> > >>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler > >>>>> wrote: > >>>>> > >>>>>> Hi Kim, > >>>>>> > >>>>>> I opened this bug > >>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 > >>>>>> > >>>>>> and now I've done an update: > >>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ > >>>>>> > >>>>>> I basically have done your nits but also removed the try_eden (it > >>>>>> was > >>>>>> used to bind a label but was not used). I updated the comments to > >>>>>> use the > >>>>>> one you preferred. > >>>>>> > >>>>>> I still have to do the other architectures though but at least we > >>>>>> seem to > >>>>>> have a consensus on this architecture, correct? 
> >>>>>> > >>>>>> Thanks for the review, > >>>>>> Jc > >>>>>> > >>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett > > >>>>>> wrote: > >>>>>> > >>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>> Yes, you are right, I did those changes due to: > >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 > >>>>>>>> > >>>>>>>> If Robbin agrees to this change, and if no one sees an issue, > >>>>>>>> I'll go > >>>>>>> ahead > >>>>>>>> and propagate the change across architectures. > >>>>>>>> > >>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's > >>>>>>>> comment > >>>>>>> and > >>>>>>>> review) :) > >>>>>>>> Jc > >>>>>>>> > >>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose > > >>>>>>> wrote: > >>>>>>>> > >>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> I'm not sure if we had left this case intentionally or not > >>>>>>>>> but, if we > >>>>>>> want > >>>>>>>>> it all to be consistent, we should perhaps fix it. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Well, you put in that logic last February, so unless somebody > >>>>>>>>> speaks > >>>>>>> up > >>>>>>>>> quickly, I support your adjusting it to be the way you want it. > >>>>>>>>> > >>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I > >>>>>>>>> src/hotspot/share" > >>>>>>>>> suggests that the GC group is most active in touching this > >>>>>>>>> feature. > >>>>>>>>> If Robbin is OK with it, there's your reviewer. > >>>>>>>>> > >>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person > >>>>>>>>> working on the GC to OK it. > >>>>>>>>> > >>>>>>>>> ? John > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> -- > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> Jc > >>>>>>> > >>>>>>> Robbin is on vacation; you might not hear from him for a while. > >>>>>>> > >>>>>>> I'm assuming you'll open a new bug for this? > >>>>>>> > >>>>>>> Except for a few minor nits (below), this looks okay to me. 
> >>>>>>> > >>>>>>> The comment at line 1052 needs updating. > >>>>>>> > >>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. > >>>>>>> > >>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at > >>>>>>> line 1058, but unreferenced. > >>>>>>> > >>>>>>> I like the wording of the comment at 1139 better than the > >>>>>>> wording at > >>>>>>> 1016. > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> -- > >>>>>> > >>>>>> Thanks, > >>>>>> Jc > >>>>>> > >>>>> > >>>>> > >>>>> -- > >>>>> > >>>>> Thanks, > >>>>> Jc > >>>>> > >>>> > >>>> > >> > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Fri Jul 20 20:07:29 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 20 Jul 2018 13:07:29 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <6de6362944f84740b80abb22cbbea872@sap.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> Message-ID: Hi Ralf, Changes look good and pass all the testing I did. You can push once Serguei approves. thanks, Chris On 7/20/18 7:28 AM, Schmelter, Ralf wrote: > Hi Sergue, > > I?ve updated the webref: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/ > > JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). If it would have, the old code would have removed all native methods from the call stack. The original JVMDI call did indeed return JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the JVMDI->JVMTI transition. > > I?ve tried to make the test more readable and added some comments to explain why it is done the way it is. 
>
> Best regards,
> Ralf
>
>
>
>
> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
> Sent: Wednesday, 18 July 2018 22:57
> To: Chris Plummer; Schmelter, Ralf; serviceability-dev at openjdk.java.net; Stuefe, Thomas
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> The fix itself looks pretty good to me.
> Some minor comments.
>
> The copyright year needs an update.
>   218     jint count, filledIn;
>
> Could you, please, split the declarations above into different lines to follow the local style?
> It is interesting that the original implementation checked the error code returned
> from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
> However, the GetFrameLocation spec does not list this error code as possible.
>
>
> Some comments about the test.
>    52     static void callEnded() {
>    53         System.out.println("SOE occurred as expected");
>    54     }
>    55
>    56     static int call(int depth) {
>    57         if (depth == 0) {
>    58             // Should have seen a stack overflow by now.
>    59             System.out.println("Exited without creating SOE");
>    60             System.exit(0);
>    61         }
>    62
>    63         try {
>    64             int newDepth = call(depth - 1);
>    65
>    66             if (newDepth == -1_000) {
>    67                 // Pop some frames so there is room on the stack for the
>    68                 // println()
>    69                 callEnded();
>    70             }
>    71
>    72             return newDepth - 1;
>    73         } catch (StackOverflowError e) {
>    74             return -1;
>    75         }
>    76     }
>    77 }
>   I'd suggest to rename the methods call() and callEnded() to something like
>   recursiveMethod() and recursionEnd().
>   Also, the manipulations with SOE create a complexity and are confusing.
>   Could it be more simple to let it propagated and then catch in main()?
>   What is the point for all these checks at the lines 104-119?
>   In general, I'm looking for some ways to make it more clear, simple and stable.
> > Thanks, > Serguei From chris.plummer at oracle.com Fri Jul 20 20:12:51 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 20 Jul 2018 13:12:51 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> Message-ID: Oops. Sorry, that testing comment was for another changeset. I didn't test your changes. If you think they could use some additional testing on some more platforms, let me know. thanks, Chris On 7/20/18 1:07 PM, Chris Plummer wrote: > Hi Ralf, > > Changes look good and pass all the testing I did. You can push once > Serguei approves. > > thanks, > > Chris > > On 7/20/18 7:28 AM, Schmelter, Ralf wrote: >> Hi Sergue, >> >> I?ve updated the webref: >> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/ >> >> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). >> If it would have, the old code would have removed all native methods >> from the call stack. The original JVMDI call did indeed return >> JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the >> JVMDI->JVMTI transition. >> >> I?ve tried to make the test more readable and added some comments to >> explain why it is done the way it is. >> >> Best regards, >> Ralf >> >> >> >> >> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] >> Sent: Mittwoch, 18. 
Juli 2018 22:57 >> To: Chris Plummer ; Schmelter, Ralf >> ; serviceability-dev at openjdk.java.net; >> Stuefe, Thomas >> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c >> to prevent quadratic runtime behavior >> >> Hi Ralf, >> >> The fix itself looks pretty good to me. >> Some minor comments. >> >> The copyright year needs an update. >> ? 218???? jint count, filledIn; >> >> ? Could you, please, split the declarations above into different >> lines to follow the local style? >> Ii is interesting that the original implementation checked the error >> code returned >> from the JVMTI GetFrameLocation for being equal to >> JVMTI_ERROR_OPAQUE_FRAME. >> However, the GetFrameLocation spec does not list this error code as >> possible. >> >> >> Some comments about the test. >> ?? 52???? static void callEnded() { >> ?? 53???????? System.out.println("SOE occurred as expected"); >> ?? 54???? } >> ?? 55 >> ?? 56???? static int call(int depth) { >> ?? 57???????? if (depth == 0) { >> ?? 58???????????? // Should have seen a stack overflow by now. >> ?? 59???????????? System.out.println("Exited without creating SOE"); >> ?? 60???????????? System.exit(0); >> ?? 61???????? } >> ?? 62 >> ?? 63???????? try { >> ?? 64???????????? int newDepth = call(depth - 1); >> ?? 65 >> ?? 66???????????? if (newDepth == -1_000) { >> ?? 67???????????????? // Pop some frames so there is room on the >> stack for the >> ?? 68???????????????? // println() >> ?? 69???????????????? callEnded(); >> ?? 70???????????? } >> ?? 71 >> ?? 72???????????? return newDepth - 1; >> ?? 73???????? } catch (StackOverflowError e) { >> ?? 74???????????? return -1; >> ?? 75???????? } >> ?? 76???? } >> ?? 77 } >> ?? I'd suggest to rename the methods call() and callEnded() to >> something like >> ?? recursiveMethod() and recursionEnd(). >> ?? Also, the manipulations with SOE create a complexity and are >> confusing. >> ?? Could it be more simple to let it propagated and then catch in >> main()? >> ?? 
What is the point for all these checks at the lines 104-119? >> ?? In general, I'm looking for some ways to make it more clear, >> simple and stable. >> >> Thanks, >> Serguei > > > From chris.plummer at oracle.com Fri Jul 20 20:13:55 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 20 Jul 2018 13:13:55 -0700 Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in Docker containers In-Reply-To: References: Message-ID: Hi Yasumasa, Changes look and and passed all my testing. thanks, Chris On 7/19/18 10:13 PM, Yasumasa Suenaga wrote: > Hi Chris, > > Thank you for your comment. > I uploaded new webrev. Could you review again? > > http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.02/ > > I tested my change on Linux x64, but I cannot check it on other > platform (includes older Linux). > However SA tests are included in HotSpot tier 1 tests. Tests on submit > repo work fine with this change > (mach5-one-ysuenaga-JDK-8205992-20180720-0305-31840). > > > Thanks, > > Yasumasa > > > 2018-07-20 3:26 GMT+09:00 Chris Plummer : >> Hi Yasumasa, >> >> 84 // It maps the LWPID in the host to it in the container. >> >> "it" -> "the PID" >> >> 286 // Get LWPID in the host from the container's LWPID. >> 287 public int getHostPID(int id) { >> 288 try { >> 289 return nspidMap.get(id); >> 290 } catch (NullPointerException e) { >> 291 return -1; >> 292 } >> 293 } >> >> What is the source of the NPE here? Is it because nspidMap was never >> initialized because the process is not in a container? In that case I think >> you should be checking for null rather than having an NPE be part of normal >> execution. >> >> 42 int hostPID = >> ((LinuxDebuggerLocal)debugger).getHostPID(pid); >> 43 if (hostPID != -1) { >> 44 pid = hostPID; >> 45 } >> >> A comment here would be helpful. >> >> The rest looks good. I should probably run it through some internal testing. >> Let me know when you have a final webrev. 
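The null check suggested in the review of getHostPID() above could look roughly like the sketch below. It is only an illustration, not the code from the webrev: the class name HostPidLookup and its constructor are hypothetical, and it assumes nspidMap is a Map<Integer, Integer> that stays null when the debuggee is not in a container. Note that unboxing a missing (null) map entry into an int is another way to get a NullPointerException, so the sketch checks for that as well.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup discussed in the review;
// not the actual LinuxDebuggerLocal code.
final class HostPidLookup {
    // Assumed shape of nspidMap: container LWPID -> host LWPID.
    // Assumed to be null when the debuggee is not running in a container.
    private final Map<Integer, Integer> nspidMap;

    HostPidLookup(Map<Integer, Integer> nspidMap) {
        this.nspidMap = nspidMap;
    }

    // Returns the host PID for the given container PID, or -1 if unknown.
    int getHostPID(int id) {
        if (nspidMap == null) {
            return -1; // no PID namespace mapping was collected
        }
        Integer hostPid = nspidMap.get(id);
        // Test for a missing entry explicitly instead of unboxing null.
        return (hostPid == null) ? -1 : hostPid;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(7, 1234);
        System.out.println(new HostPidLookup(map).getHostPID(7));
        System.out.println(new HostPidLookup(map).getHostPID(8));
        System.out.println(new HostPidLookup(null).getHostPID(7));
    }
}
```

With this shape the caller's existing "hostPID != -1" test keeps working, and no exception is part of normal execution.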
>> >> thanks, >> >> Chris >> >> >> On 7/18/18 5:59 AM, Yasumasa Suenaga wrote: >>> PING: >>> >>> Could you review it? >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ >>> >>> This change has been reviewed by Jini. >>> We need a Reviewer. >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2018/07/12 13:42, Yasumasa Suenaga wrote: >>>> Thanks Jini, >>>> >>>> I uploaded new webrev. It contains some comments and removing extra >>>> space. >>>> >>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/ >>>> >>>> >>>> Yasumasa >>>> >>>> >>>> >>>> 2018-07-12 2:32 GMT+09:00 Jini George : >>>>> Hi Yasumasa, >>>>> >>>>> This looks good to me except for one nit. And some more comments would >>>>> help. >>>>> For e.g., it would help to say that NSPidMap is to map the host to >>>>> container >>>>> lwpids. >>>>> >>>>> The nit: >>>>> >>>>> * >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html >>>>> Line 253: extra space after the parentheses >>>>> >>>>> Thanks, >>>>> Jini. >>>>> >>>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote: >>>>>> >>>>>> PING: Could you review it? >>>>>> >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote: >>>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this change. >>>>>>> >>>>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8205992 >>>>>>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/ >>>>>>> >>>>>>> I tried to attach jhsdb to java process in docker container from >>>>>>> container host, but it couldn't. >>>>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet. 
>>>>>>> >>>>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they >>>>>>> returns PIDs in container - they are different from host's PID. So I >>>>>>> added >>>>>>> the code to scan /proc//task to get all LWP IDs and they are kept >>>>>>> in a >>>>>>> Map in LinuxDebuggerLocal. >>>>>>> >>>>>>> Also SA_ALTROOT is set to /proc//root if SA detects debuggee runs >>>>>>> in >>>>>>> container. It helps SA to parse binaries in container. >>>>>>> >>>>>>> This change has been pushed to submit repo, and it was failed on OS X >>>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963). >>>>>>> But I guess it causes JDK-8205906. This change affects to Linux only. >>>>>>> >>>>>>> Could you review it? >>>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> Yasumasa >>>>>>> >> From serguei.spitsyn at oracle.com Fri Jul 20 20:40:16 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 20 Jul 2018 13:40:16 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <6de6362944f84740b80abb22cbbea872@sap.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> Message-ID: <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> Hi Ralf, On 7/20/18 07:28, Schmelter, Ralf wrote: > Hi Sergue, > > I?ve updated the webref: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/ The copyright year in ThreadReferenceImpl.c still has to be 2018, not 2008. http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html 72 if (newDepth == -1_000) { 73 // Pop some frames so there is room on the stack for the 74 // call (including println()). 
75 notifyRecursionEnded(); 76 } ? I have a concern on potential issue mentioned in the comment above. ? Should a StackOverflowError be expected here? 79 } catch (StackOverflowError e) { 80 // Use negative depth to indicate the recursion has ended. 81 return -1; 82 } ? What is going to happen if the StackOverflowError was really caught above? ? If I understand it correctly, the notifyRecursionEnded() call will be missed then. ? This breakpoint will be missed as well: 107 bpe = resumeTo("Frames2Targ", "notifyRecursionEnded", "()V"); > JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). If it would have, the old code would have removed all native methods from the call stack. The original JVMDI call did indeed return JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the JVMDI->JVMTI transition. Agreed. > I?ve tried to make the test more readable and added some comments to explain why it is done the way it is. Thank you for the update! Thanks, Serguei > Best regards, > Ralf > > > > > From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] > Sent: Mittwoch, 18. Juli 2018 22:57 > To: Chris Plummer ; Schmelter, Ralf ; serviceability-dev at openjdk.java.net; Stuefe, Thomas > Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior > > Hi Ralf, > > The fix itself looks pretty good to me. > Some minor comments. > > The copyright year needs an update. > 218 jint count, filledIn; > > Could you, please, split the declarations above into different lines to follow the local style? > Ii is interesting that the original implementation checked the error code returned > from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME. > However, the GetFrameLocation spec does not list this error code as possible. > > > Some comments about the test. 
> 52 static void callEnded() { > 53 System.out.println("SOE occurred as expected"); > 54 } > 55 > 56 static int call(int depth) { > 57 if (depth == 0) { > 58 // Should have seen a stack overflow by now. > 59 System.out.println("Exited without creating SOE"); > 60 System.exit(0); > 61 } > 62 > 63 try { > 64 int newDepth = call(depth - 1); > 65 > 66 if (newDepth == -1_000) { > 67 // Pop some frames so there is room on the stack for the > 68 // println() > 69 callEnded(); > 70 } > 71 > 72 return newDepth - 1; > 73 } catch (StackOverflowError e) { > 74 return -1; > 75 } > 76 } > 77 } > ? I'd suggest to rename the methods call() and callEnded() to something like > ? recursiveMethod() and recursionEnd(). > ? Also, the manipulations with SOE create a complexity and are confusing. > ? Could it be more simple to let it propagated and then catch in main()? > ? What is the point for all these checks at the lines 104-119? > ? In general, I'm looking for some ways to make it more clear, simple and stable. 
> > Thanks, > Serguei From chris.plummer at oracle.com Fri Jul 20 20:44:55 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 20 Jul 2018 13:44:55 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> Message-ID: On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote: > Hi Ralf, > > > On 7/20/18 07:28, Schmelter, Ralf wrote: >> Hi Sergue, >> >> I?ve updated the webref: >> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/ > > The copyright year in ThreadReferenceImpl.c still has to be 2018, not > 2008. > > http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html > > > ? 72???????????? if (newDepth == -1_000) { > ? 73???????????????? // Pop some frames so there is room on the stack > for the > ? 74???????????????? // call (including println()). > ? 75???????????????? notifyRecursionEnded(); > ? 76???????????? } > > ? I have a concern on potential issue mentioned in the comment above. > ? Should a StackOverflowError be expected here? > > ? 79???????? } catch (StackOverflowError e) { > ? 80???????????? // Use negative depth to indicate the recursion has > ended. > ? 81???????????? return -1; > ? 82???????? } > > ? What is going to happen if the StackOverflowError was really caught > above? The SOE is really caught in the above code. I returns -1, and starts the unwinding of the stack. After 1000 frames have been popped via returns, notifyRecursionEnded() will be called. 
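The unwinding described here can be tried in isolation with a small sketch. It is modelled on the quoted test code but is not Frames2Test itself: the class name UnwindSketch, the notified flag and the FRAMES_TO_POP constant are made up for illustration, and the depth-zero case throws instead of calling System.exit().

```java
// Illustrative sketch only; names other than notifyRecursionEnded() are hypothetical.
final class UnwindSketch {
    static final int FRAMES_TO_POP = 1_000;
    static boolean notified = false;

    static void notifyRecursionEnded() {
        notified = true;
    }

    static int recurse(int depth) {
        if (depth == 0) {
            // Should have seen a stack overflow by now.
            throw new IllegalStateException("no StackOverflowError occurred");
        }
        try {
            int newDepth = recurse(depth - 1);
            if (newDepth == -FRAMES_TO_POP) {
                // FRAMES_TO_POP frames have returned since the overflow,
                // so there is stack space available for this call.
                notifyRecursionEnded();
            }
            // Count one more popped frame on the way back up.
            return newDepth - 1;
        } catch (StackOverflowError e) {
            // A negative value signals that the recursion has ended.
            return -1;
        }
    }

    public static void main(String[] args) {
        int result = recurse(Integer.MAX_VALUE);
        System.out.println("notified=" + notified + ", frames unwound=" + (-result));
    }
}
```

No calls are made during unwinding until the counter reaches -FRAMES_TO_POP, which is why a second overflow is not expected before the notification.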
The frames are popped so that notifyRecursionEnded() can be called without risk of another SOE.

Chris
>   If I understand it correctly, the notifyRecursionEnded() call will
> be missed then.
>   This breakpoint will be missed as well:
>
>   107         bpe = resumeTo("Frames2Targ", "notifyRecursionEnded", "()V");
>
>
>
>> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation().
>> If it would have, the old code would have removed all native methods
>> from the call stack. The original JVMDI call did indeed return
>> JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the
>> JVMDI->JVMTI transition.
>
> Agreed.
>
>> I've tried to make the test more readable and added some comments to
>> explain why it is done the way it is.
>
> Thank you for the update!
>
>
> Thanks,
> Serguei
>
>> Best regards,
>> Ralf
>>
>>
>>
>>
>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>> Sent: Wednesday, 18 July 2018 22:57
>> To: Chris Plummer; Schmelter, Ralf; serviceability-dev at openjdk.java.net;
>> Stuefe, Thomas
>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c
>> to prevent quadratic runtime behavior
>>
>> Hi Ralf,
>>
>> The fix itself looks pretty good to me.
>> Some minor comments.
>>
>> The copyright year needs an update.
>>   218     jint count, filledIn;
>>
>>   Could you, please, split the declarations above into different
>> lines to follow the local style?
>> It is interesting that the original implementation checked the error
>> code returned
>> from the JVMTI GetFrameLocation for being equal to
>> JVMTI_ERROR_OPAQUE_FRAME.
>> However, the GetFrameLocation spec does not list this error code as
>> possible.
>>
>>
>> Some comments about the test.
>>    52     static void callEnded() {
>>    53         System.out.println("SOE occurred as expected");
>>    54     }
>>    55
>>    56     static int call(int depth) {
>>    57         if (depth == 0) {
>>    58
// Should have seen a stack overflow by now. >> ?? 59???????????? System.out.println("Exited without creating SOE"); >> ?? 60???????????? System.exit(0); >> ?? 61???????? } >> ?? 62 >> ?? 63???????? try { >> ?? 64???????????? int newDepth = call(depth - 1); >> ?? 65 >> ?? 66???????????? if (newDepth == -1_000) { >> ?? 67???????????????? // Pop some frames so there is room on the >> stack for the >> ?? 68???????????????? // println() >> ?? 69???????????????? callEnded(); >> ?? 70???????????? } >> ?? 71 >> ?? 72???????????? return newDepth - 1; >> ?? 73???????? } catch (StackOverflowError e) { >> ?? 74???????????? return -1; >> ?? 75???????? } >> ?? 76???? } >> ?? 77 } >> ?? I'd suggest to rename the methods call() and callEnded() to >> something like >> ?? recursiveMethod() and recursionEnd(). >> ?? Also, the manipulations with SOE create a complexity and are >> confusing. >> ?? Could it be more simple to let it propagated and then catch in >> main()? >> ?? What is the point for all these checks at the lines 104-119? >> ?? In general, I'm looking for some ways to make it more clear, >> simple and stable. 
>> >> Thanks, >> Serguei > From serguei.spitsyn at oracle.com Fri Jul 20 21:04:22 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 20 Jul 2018 14:04:22 -0700 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> Message-ID: <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com> On 7/20/18 13:44, Chris Plummer wrote: > On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote: >> Hi Ralf, >> >> >> On 7/20/18 07:28, Schmelter, Ralf wrote: >>> Hi Sergue, >>> >>> I?ve updated the webref: >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/ >> >> The copyright year in ThreadReferenceImpl.c still has to be 2018, not >> 2008. >> >> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html >> >> >> ? 72???????????? if (newDepth == -1_000) { >> ? 73???????????????? // Pop some frames so there is room on the stack >> for the >> ? 74???????????????? // call (including println()). >> ? 75???????????????? notifyRecursionEnded(); >> ? 76???????????? } >> >> ? I have a concern on potential issue mentioned in the comment above. >> ? Should a StackOverflowError be expected here? >> >> ? 79???????? } catch (StackOverflowError e) { >> ? 80???????????? // Use negative depth to indicate the recursion has >> ended. >> ? 81???????????? return -1; >> ? 82???????? } >> >> ? What is going to happen if the StackOverflowError was really caught >> above? > The SOE is really caught in the above code. I returns -1, and starts > the unwinding of the stack. 
After 1000 frames have been popped via > returns, notifyRecursionEnded() will be called. The pops are so > notifyRecursionEnded() can be called without worry of another SOE. Got it, thanks Chris. So, I'm Okay with the fix assuming the copyright year is fixed. Thanks, Serguei > > Chris >> ? If I understand it correctly, the notifyRecursionEnded() call will >> be missed then. >> ? This breakpoint will be missed as well: >> >> ? 107???????? bpe = resumeTo("Frames2Targ", "notifyRecursionEnded", >> "()V"); >> >> >> >>> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). >>> If it would have, the old code would have removed all native methods >>> from the call stack. The original JVMDI call did indeed return >>> JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the >>> JVMDI->JVMTI transition. >> >> Agreed. >> >>> I?ve tried to make the test more readable and added some comments to >>> explain why it is done the way it is. >> >> Thank you for the update! >> >> >> Thanks, >> Serguei >> >>> Best regards, >>> Ralf >>> >>> >>> >>> >>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] >>> Sent: Mittwoch, 18. Juli 2018 22:57 >>> To: Chris Plummer ; Schmelter, Ralf >>> ; serviceability-dev at openjdk.java.net; >>> Stuefe, Thomas >>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in >>> ThreadReferenceImpl.c to prevent quadratic runtime behavior >>> >>> Hi Ralf, >>> >>> The fix itself looks pretty good to me. >>> Some minor comments. >>> >>> The copyright year needs an update. >>> ? 218???? jint count, filledIn; >>> >>> ? Could you, please, split the declarations above into different >>> lines to follow the local style? >>> Ii is interesting that the original implementation checked the error >>> code returned >>> from the JVMTI GetFrameLocation for being equal to >>> JVMTI_ERROR_OPAQUE_FRAME. >>> However, the GetFrameLocation spec does not list this error code as >>> possible. >>> >>> >>> Some comments about the test. >>> ?? 52???? 
static void callEnded() { >>>    53         System.out.println("SOE occurred as expected"); >>>    54     } >>>    55 >>>    56     static int call(int depth) { >>>    57         if (depth == 0) { >>>    58             // Should have seen a stack overflow by now. >>>    59             System.out.println("Exited without creating SOE"); >>>    60             System.exit(0); >>>    61         } >>>    62 >>>    63         try { >>>    64             int newDepth = call(depth - 1); >>>    65 >>>    66             if (newDepth == -1_000) { >>>    67                 // Pop some frames so there is room on the >>> stack for the >>>    68                 // println() >>>    69                 callEnded(); >>>    70             } >>>    71 >>>    72             return newDepth - 1; >>>    73         } catch (StackOverflowError e) { >>>    74             return -1; >>>    75         } >>>    76     } >>>    77 } >>> I'd suggest renaming the methods call() and callEnded() to >>> something like >>> recursiveMethod() and recursionEnd(). >>> Also, the manipulations with the SOE add complexity and are >>> confusing. >>> Would it be simpler to let it propagate and then catch it in >>> main()? >>> What is the point of all these checks at lines 104-119? >>> In general, I'm looking for ways to make it clearer, >>> simpler and more stable. >>> >>> Thanks, >>> Serguei >> > From hohensee at amazon.com Fri Jul 20 22:37:14 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Fri, 20 Jul 2018 22:37:14 +0000 Subject: RFR(L): 8196889: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions Message-ID: Please review. Bug: https://bugs.openjdk.java.net/browse/JDK-8196989 CSR: https://bugs.openjdk.java.net/browse/JDK-8196991 Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00 This webrev is marked 'L' because it's a behavioral change (CSR in draft state, may I have a review of that too please?) 
and because the test change fanout is large. The actual code changes are 'M'. Passes the submit repo, Hotspot tier1, the JFR gc event tests and any other test set with 'gc' or 'serviceability' in the test directory name. I found it difficult to verify the accuracy of the reported values other than manually, since they can vary from run to run of the same program. I'd appreciate suggestions for how to go about writing accuracy tests. I set out originally to revamp only the MXBeans, but decided it would be incomplete if I didn't include the jstat counters and the output of the GC.heap_info jcmd option. I can separate the latter two into their own RFEs, but I find it easier to understand it all in a single webrev and hope the reviewers will too. The basic approach is to add the new memory pools and collectors, the new jstat counters, and an archive region counter that stands in for an actual archive region set. HeapRegionSets are disjoint, so initially I tried to create a first-class archive region set (on the same level as the humongous region set), but that idea foundered on the fact that there's too much code I don't fully understand that depends on archive regions being in the existing old region set. Externally (i.e., in the MXBeans and the jstat counters), however, the old region set doesn't include archive regions (unless running in legacy mode). I used CMS's TraceCMSMemoryManagerStats class as the model for TraceConcMemoryManagerStats, which latter collects statistics on concurrent cycles. There are two STW pauses in each concurrent cycle: they are recorded separately and count as two sun.gc.collector.2 events. The humongous and archive space committed and used values are always identical, hence they are always 100% used. The revised output of jcmd GC.heap_info is in G1CollectedHeap::print_on(). 
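For readers who want to look at the values under discussion, here is a minimal sketch of how the memory pools and collectors are visible to a client through the standard java.lang.management API. It is not part of the webrev; the class name G1PoolDump and the helper describe() are hypothetical, and the pool and collector names printed depend on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper class, for illustration only.
public class G1PoolDump {

    // Formats one memory pool as "name: used=... committed=...".
    static String describe(MemoryPoolMXBean pool) {
        MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
        if (usage == null) {
            return pool.getName() + ": <not valid>";
        }
        return String.format("%s: used=%d committed=%d",
                pool.getName(), usage.getUsed(), usage.getCommitted());
    }

    public static void main(String[] args) {
        // One line per memory pool exposed by the running JVM.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(describe(pool));
        }
        // One line per collector: collection count, accumulated time, managed pools.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d timeMs=%d pools=%s%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime(),
                    String.join(",", gc.getMemoryPoolNames()));
        }
    }
}
```

Running it with -XX:+UseG1GC before and after a change of this kind would be one way to compare the reported pools and collectors by hand; it does not address the accuracy-testing question raised above.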
I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing the result type of young_list_target_length() from size_t to uint, which latter is the type of the _young_list_target_length member. I updated the copyright date in src/hotspot/share/services/memoryService.hpp to 2018, as I neglected to do so in a previous push. Thanks, Paul From igor.veresov at oracle.com Sat Jul 21 20:47:34 2018 From: igor.veresov at oracle.com (Igor Veresov) Date: Sat, 21 Jul 2018 13:47:34 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> Message-ID: <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> I think you can just predicate the emission of these stubs for !UseTLAB, and not mess with the CPU-specific code. What do you think? diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp b/src/hotspot/share/c1/c1_LIRGenerator.cpp --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp @@ -674,7 +674,7 @@ void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) { klass2reg_with_patching(klass_reg, klass, info, is_unresolved); // If klass is not loaded we do not know if the klass has finalizers: - if (UseFastNewInstance && klass->is_loaded() + if (UseFastNewInstance && !UseTLAB && klass->is_loaded() && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) { Runtime1::StubID stub_id = klass->is_initialized() ? 
Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id; igor > On Jul 20, 2018, at 12:37 PM, JC Beyler wrote: > > Yes that is right, this is the latest: > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ > > I apologize for the multiple threads and confusion, > Jc > > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com < > serguei.spitsyn at oracle.com> wrote: > >> Thank you a lot, Vladimir! >> Yes, the webrev.03 is the latest. >> Jc will correct us if it is not right. >> >> Thanks, >> Serguei >> >> >> On 7/20/18 10:52, Vladimir Kozlov wrote: >>> I asked Igor V. to look. >>> >>> Seems like review is done in another thread which does not have bug >>> id in subject. Currently webrev.03 >>> >>> Vladimir >>> >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote: >>>> Thanks, Rahul! >>>> In fact, there are no good experts for this area in the serviceability team. >>>> It would be much better if anyone from the Compiler team could do it. >>>> >>>> Vladimir K., >>>> >>>> Is there anyone from the Compiler team available to review this? >>>> Otherwise, I could try to review it but am not sure about my review >>>> quality. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/19/18 00:48, Rahul Raghavan wrote: >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled >>>>> >>>>> (just adding + hotspot-compiler-dev also) >>>>> >>>>> >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: >>>>> Subject Was: >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled >>>>> >>>>> + serviceability-dev >>>>> >>>>> Hi all, >>>>> >>>>> Could anyone else give me a review of this webrev and check/test the >>>>> various architecture changes? >>>>> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>>> >>>>> >>>>> Thanks for all your help! 
>>>>> Jc >>>>> >>>>> >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler >> wrote: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> Here is a webrev that does all the architectures in the same way: >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >>>>>>> >>>>>>> Could anyone review the other architectures and test? >>>>>>> - arm, sparc & aarch64 are also modified now to follow the same >>>>>>> "if no >>>>>>> tlab, then consider eden space allocation" logic. >>>>>>> >>>>>>> Thanks for your help! >>>>>>> Jc >>>>>>> >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler >>>>>>> wrote: >>>>>>> >>>>>>>> Hi Kim, >>>>>>>> >>>>>>>> I opened this bug >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>>>>>>> >>>>>>>> and now I've done an update: >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>>>>>>> >>>>>>>> I basically have done your nits but also removed the try_eden (it >>>>>>>> was >>>>>>>> used to bind a label but was not used). I updated the comments to >>>>>>>> use the >>>>>>>> one you preferred. >>>>>>>> >>>>>>>> I still have to do the other architectures though but at least we >>>>>>>> seem to >>>>>>>> have a consensus on this architecture, correct? >>>>>>>> >>>>>>>> Thanks for the review, >>>>>>>> Jc >>>>>>>> >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett >> >>>>>>>> wrote: >>>>>>>> >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Yes, you are right, I did those changes due to: >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 >>>>>>>>>> >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue, >>>>>>>>>> I'll go >>>>>>>>> ahead >>>>>>>>>> and propagate the change across architectures. 
>>>>>>>>>> >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's >>>>>>>>>> comment and review) :) >>>>>>>>>> Jc >>>>>>>>>> >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose wrote: >>>>>>>>>> >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I'm not sure if we had left this case intentionally or not >>>>>>>>>>> but, if we want >>>>>>>>>>> it all to be consistent, we should perhaps fix it. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Well, you put in that logic last February, so unless somebody >>>>>>>>>>> speaks up >>>>>>>>>>> quickly, I support your adjusting it to be the way you want it. >>>>>>>>>>> >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I >>>>>>>>>>> src/hotspot/share" >>>>>>>>>>> suggests that the GC group is most active in touching this >>>>>>>>>>> feature. >>>>>>>>>>> If Robbin is OK with it, there's your reviewer. >>>>>>>>>>> >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person >>>>>>>>>>> working on the GC to OK it. >>>>>>>>>>> >>>>>>>>>>> - John >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Jc >>>>>>>>> >>>>>>>>> Robbin is on vacation; you might not hear from him for a while. >>>>>>>>> >>>>>>>>> I'm assuming you'll open a new bug for this? >>>>>>>>> >>>>>>>>> Except for a few minor nits (below), this looks okay to me. >>>>>>>>> >>>>>>>>> The comment at line 1052 needs updating. >>>>>>>>> >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused. >>>>>>>>> >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at >>>>>>>>> line 1058, but unreferenced. >>>>>>>>> >>>>>>>>> I like the wording of the comment at 1139 better than the >>>>>>>>> wording at >>>>>>>>> 1016. 
>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Jc >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Thanks, >>>>>>> Jc >>>>>>> >>>>>> >>>>>> >>>> >> >> > > -- > > Thanks, > Jc From jcbeyler at google.com Sun Jul 22 02:06:26 2018 From: jcbeyler at google.com (JC Beyler) Date: Sat, 21 Jul 2018 19:06:26 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> Message-ID: Hi Igor, Thanks for looking at it! I don't know the code paths enough to know if that is sufficient (I'll trust you evidently). I can run the tests next week if we prefer that route. Were I to choose, I would prefer that interpreter/c1/c2 all follow the same kind of paths, which would be my fix I believe: 1) If TLAB, allocate there or slowpath 2) Else If contiguous inline allocations are enabled, try that 3) Goto Slowpath With your fix, even if we do not have the issue anymore, it still keeps code that is not consistent but perhaps I'm missing something? Jc On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov wrote: > I think you can just predicate the emission of these stubs for !UseTLAB, > and not mess with the CPU-specific code. What do you think? 
> > diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp > b/src/hotspot/share/c1/c1_LIRGenerator.cpp > --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp > +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp > @@ -674,7 +674,7 @@ > void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool > is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, > LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) { > klass2reg_with_patching(klass_reg, klass, info, is_unresolved); > // If klass is not loaded we do not know if the klass has finalizers: > - if (UseFastNewInstance && klass->is_loaded() > + if (UseFastNewInstance && !UseTLAB && klass->is_loaded() > && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) { > > Runtime1::StubID stub_id = klass->is_initialized() ? > Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id; > > > igor -- Thanks, Jc From igor.veresov at oracle.com Sun Jul 22 02:39:21 2018 From: igor.veresov at oracle.com (Igor Veresov) Date: Sat, 21 Jul 2018 19:39:21 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> Message-ID: <719DC045-4311-499E-9F7D-784096A044C6@oracle.com> Yeah, the fix I proposed doesn't do exactly what we'd want. Sorry for the confusion. Your fix is fine. Reviewed. igor > On Jul 21, 2018, at 7:06 PM, JC Beyler wrote: > > Hi Igor, > > Thanks for looking at it! I don't know the code paths enough to know if that is sufficient (I'll trust you evidently). 
I can run the tests next week if we prefer that route. > > Were I to choose, I would prefer that interpreter/c1/c2 all follow the same kind of paths, which would be my fix I believe: > 1) If TLAB, allocate there or slowpath > 2) Else If contiguous inline allocations are enabled, try that > 3) Goto Slowpath > > With your fix, even if we do not have the issue anymore, it still keeps code that is not consistent but perhaps I'm missing something? > Jc
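The allocation order JC describes in this thread (TLAB first, inline contiguous eden allocation only when TLAB is off, otherwise the slow path) can be modeled with a small decision function. This is only a sketch under stated assumptions, written in Java rather than as HotSpot code: the class AllocationPathSketch, the enum Path, and all four boolean parameters are hypothetical stand-ins, not names from the webrev.

```java
// Hypothetical model of the allocation order discussed in the thread; not HotSpot code.
public class AllocationPathSketch {

    enum Path { TLAB, INLINE_CONTIGUOUS, SLOW_PATH }

    /**
     * Assumed meaning of the parameters:
     *   useTlab           - stands in for the UseTLAB flag
     *   tlabHasRoom       - the TLAB bump-pointer allocation would succeed
     *   inlineContigAlloc - stands in for supports_inline_contig_alloc()
     *   edenHasRoom       - the inline eden allocation would succeed
     */
    static Path choose(boolean useTlab, boolean tlabHasRoom,
                       boolean inlineContigAlloc, boolean edenHasRoom) {
        if (useTlab) {
            // 1) If TLAB, allocate there or go to the slow path; never fall back to eden.
            return tlabHasRoom ? Path.TLAB : Path.SLOW_PATH;
        }
        if (inlineContigAlloc && edenHasRoom) {
            // 2) Else, if contiguous inline allocations are enabled, try that.
            return Path.INLINE_CONTIGUOUS;
        }
        // 3) Otherwise go to the slow path.
        return Path.SLOW_PATH;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true, true, true));    // TLAB
        System.out.println(choose(true, false, true, true));   // SLOW_PATH
        System.out.println(choose(false, false, true, true));  // INLINE_CONTIGUOUS
        System.out.println(choose(false, false, false, true)); // SLOW_PATH
    }
}
```

The second call is the case the bug title is about: with TLAB enabled and the TLAB allocation failing, the model goes to the slow path instead of attempting an eden allocation.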
From jcbeyler at google.com Mon Jul 23 03:04:15 2018 From: jcbeyler at google.com (JC Beyler) Date: Sun, 22 Jul 2018 20:04:15 -0700 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: <719DC045-4311-499E-9F7D-784096A044C6@oracle.com> References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> <719DC045-4311-499E-9F7D-784096A044C6@oracle.com> Message-ID: Thanks Igor! http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.04/ Has now your name in the reviewers. Would anyone want to push it by chance? Thanks! Jc On Sat, Jul 21, 2018 at 7:39 PM Igor Veresov wrote: > Yeah, the fix I proposed doesn't do exactly what we'd want. Sorry for the > confusion. Your fix is fine. Reviewed. > > igor -- Thanks, Jc
URL:

From serguei.spitsyn at oracle.com Mon Jul 23 06:52:59 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 22 Jul 2018 23:52:59 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To:
References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com> <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com> <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com> <719DC045-4311-499E-9F7D-784096A044C6@oracle.com>
Message-ID: <4e48d67c-9c01-c1bf-4955-a237f1484c2e@oracle.com>

An HTML attachment was scrubbed...
URL:

From chris.plummer at oracle.com Mon Jul 23 07:42:16 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 00:42:16 -0700
Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp
Message-ID:

Hello,

Please review the following fix for JDK11:

https://bugs.openjdk.java.net/browse/JDK-8151259
http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00

It fixes the following 3 tests:

vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java

Any of these could fail when run with -Xcomp, with the following error (followed by a bunch more errors):

 # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.

Although lately we've only seen this with redefclass030.java on macosx.

These 3 tests do redefinition of a "hot" method after triggering compilation for it. After the redef some testing is done to ensure that the redef was done correctly, but the issue these tests have actually comes before any redef is done.

The test attempts to trigger compilation by calling a hot method a lot. The agent detects compilation by receiving a CompiledMethodLoad event.
There was an issue discovered long ago that when -Xcomp is used, the compilation happens before the "hot" method is ever called. Then the redef would happen before compilation, and this somehow messed up the test (I'm not exactly sure how). The fix was to basically abandon the redef attempt when this problem is detected, and then supposedly just let the test run to completion (skipping the actual testing of the redef). After this change, if you ran with -Xcomp it would pass, but if you looked in the log you would see:

 # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.

However, there was a bug in the logic to make the test run to completion, which also caused the above message to not appear. Instead the test would fail with:

# ERROR: Redefinition not completed.

Followed by a bunch more error messages during the part of the test that checks if the redef was done properly.

If the CompiledMethodLoad event comes in before the hot method is ever called (which it does with -Xcomp), the test sets fire = -1. If the hot method was called, it is set to 1. The setting of fire = -1 was added to fix the -Xcomp problem mentioned above. The jvmti agent does the following:

    do {
        THREAD_sleep(1);
        /* wait for compilation to happen */
    } while(fire == 0);

    if (fire == 1) {
        /* do the redef here */
        NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
    } else {
        // fire == -1
        NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
    }

The agent then syncs with the debuggee, waiting for it to finish up. What the test expects is that waitForRedefinitionStarted() in the debuggee will time out after two seconds while waiting for fire == 1 (which it thinks will always happen because fire was set to -1).
When it times out, the test does appear to exit properly, but with the following in the log, which is intended:

 # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.

However, sometimes before waitForRedefinitionStarted() times out, the hot method is called enough times to trigger compilation. So another CompiledMethodLoad event arrives, and this time fire is set to 1. Because of this, waitForRedefinitionStarted() doesn't time out and returns with an indication that the redef has started. After this, waitForRedefinitionCompleted() is executed. It waits for the redef to complete, but it never does since the agent decided not to do the redef when it saw fire == -1. So waitForRedefinitionCompleted() times out after 10 seconds and the test fails, with:

# ERROR: Redefinition not completed.

Actually the above error is not really what causes the failure. When the above error is detected, no error status is set and the test continues as if the redef had been done. So then the logic that detects if the redef was done properly ends up failing, and that's where the test actually indicates a failure status. You see a whole bunch of other errors in the log because of all the checks that fail.

The fix is to not abandon the test when the first CompiledMethodLoad event arrives before the hot method is called. Instead just leave fire==0 and wait for the next CompiledMethodLoad event that is triggered after the method is called enough times to be recompiled. I'm not sure why it was not originally done this way. Possibly the recompilation did not happen reliably, but I have not run into this problem. The other changes in redefclass030.c are just cleaning up debug tracing.

Another fix was to properly set the error status when waitForRedefinitionStarted() or waitForRedefinitionCompleted() times out, although this is just a safety net and I didn't run into any cases where this happened after fixing the CompiledMethodLoad event handling.
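
[Editorial illustration] To make the race described above easier to follow, here is a minimal Java model of the fire flag. It is only a sketch under stated assumptions: FireModel, its constants, and its method names are hypothetical and do not appear in redefclass030.c or redefclass030.java. The real agent is C code driven by JVMTI events, and the real waits involve timeouts that this model leaves out.

```java
// Hypothetical model of the "fire" flag described above; not the test's real code.
class FireModel {
    static final int WAITING = 0;    // no usable compilation seen yet
    static final int REDEFINE = 1;   // hot method compiled after it was called
    static final int ABANDONED = -1; // old logic only: compiled before first call

    private final boolean abandonOnEarlyCompile; // true models the old logic
    private int fire = WAITING;
    private boolean agentDecided = false;
    private boolean redefined = false;

    FireModel(boolean abandonOnEarlyCompile) {
        this.abandonOnEarlyCompile = abandonOnEarlyCompile;
    }

    // Models a CompiledMethodLoad event for the hot method.
    void onCompiledMethodLoad(boolean hotMethodAlreadyCalled) {
        if (hotMethodAlreadyCalled) {
            fire = REDEFINE;
        } else {
            // Old logic gives up; fixed logic ignores this compilation and waits.
            fire = abandonOnEarlyCompile ? ABANDONED : WAITING;
        }
    }

    // Models one pass of the agent's polling loop; the agent decides only once.
    void agentPoll() {
        if (agentDecided || fire == WAITING) {
            return;
        }
        agentDecided = true;
        redefined = (fire == REDEFINE);
    }

    // What waitForRedefinitionStarted() is described as checking (fire == 1).
    boolean startedSeenByDebuggee() {
        return fire == REDEFINE;
    }

    // What waitForRedefinitionCompleted() is waiting for.
    boolean redefinitionDone() {
        return redefined;
    }

    // Replays the -Xcomp sequence: compiled before the first call, then recompiled.
    static FireModel replayXcomp(boolean abandonOnEarlyCompile) {
        FireModel m = new FireModel(abandonOnEarlyCompile);
        m.onCompiledMethodLoad(false); // early compilation under -Xcomp
        m.agentPoll();
        m.onCompiledMethodLoad(true);  // recompilation after many calls
        m.agentPoll();
        return m;
    }

    public static void main(String[] args) {
        for (boolean oldLogic : new boolean[] {true, false}) {
            FireModel m = replayXcomp(oldLogic);
            System.out.println((oldLogic ? "old" : "fixed") + " logic: started="
                    + m.startedSeenByDebuggee() + ", redefined=" + m.redefinitionDone());
        }
    }
}
```

With the old logic the model ends with the debuggee seeing "started" while no redefinition was performed, which corresponds to the "Redefinition not completed" failure. With the fixed logic both flags end up true.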
So in general the changes in redefclass030.java were not needed, but provide better error handling.

thanks,

Chris

From daniel.daugherty at oracle.com Mon Jul 23 16:10:22 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 23 Jul 2018 12:10:22 -0400
Subject: RFR(XXXS): 8208092 ProblemList serviceability/sa/ClhsdbCDSCore.java
Message-ID:

Greetings,

We have an intermittent tier1 test failure on Linux-X64 in both JDK11 and JDK12. I'm putting it on the ProblemList:

$ hg diff
diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 19:58:43 2018 +0530
+++ b/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 12:07:59 2018 -0400
@@ -79,6 +79,7 @@

 # :hotspot_serviceability

+serviceability/sa/ClhsdbCDSCore.java                 8208092 linux-x64
 serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
 serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all

Thanks, in advance, for a single (R)eview of this trivial change.

Dan

From sgehwolf at redhat.com Mon Jul 23 16:27:30 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Mon, 23 Jul 2018 18:27:30 +0200
Subject: RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
Message-ID: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>

Hi,

Could I please get a review of this one-liner change related to jhsdb jstack --mixed when attaching to a running Java process? The issue arises when threads are in native code and that native code has frame pointers not properly preserved. In such a case the SA performs a simple frame pointer validity check:

ebp >= esp

However, the code retrieving the value for esp is incorrect inasmuch as it's not in sync with the native code with regard to the register index:

native code => X86ThreadContext.SP
Java code => X86ThreadContext.ESP

X86ThreadContext.ESP is never being set by the native code.
Since X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then returns null, ebp.lessThan(esp) wrongly returns false causing the issue. This webrev fixes it by using SP as index on the Java side.

Thoughts?

webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
bug: https://bugs.openjdk.java.net/browse/JDK-8208091

Thanks,
Severin

From serguei.spitsyn at oracle.com Mon Jul 23 16:44:20 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 09:44:20 -0700
Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp
In-Reply-To:
References:
Message-ID: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>

Hi Chris,

Would it be simpler to avoid running these tests with -Xcomp?
I guess, this would work: @requires vm.compMode != "Xcomp"

Thanks,
Serguei

On 7/23/18 00:42, Chris Plummer wrote:
> Hello,
>
> Please review the following fix for JDK11:
>
> https://bugs.openjdk.java.net/browse/JDK-8151259
> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>
> It fixes the following 3 tests:
>
> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>
> Any of which could fail when run with -Xcomp with (followed by a bunch
> more errors):
>
>  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test
> ignored.
>
> Although lately we've only seen this with redefclass030.java on macosx.
>
> These 3 tests do redefinition of a "hot" method after triggering
> compilation for it. After the redef some testing is done to ensure
> that the redef was done correctly, but the issue these test have
> actually comes before any redef is done.
>
> The test attempts to trigger compilation by calling a hot method a
> lot. The agent detects compilation by receiving a CompiledMethodLoad
> event.
There was an issue discovered long ago that when -Xcomp is > used, the compilation happens before the "hot" method is ever called. > Then the redef would happen before compilation, and this somehow > messed up the test (I'm not exactly sure how). The fix was to > basically abandon the redef attempt when this problem is detected, and > then supposedly just let the test run to completion (skipping the > actual testing of the redef). After this change, if you ran with > -Xcomp it would pass, but if you looked in the log you would see: > > ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test > ignored. > > However, there was a bug in the logic to make the test run to > completion, and also causes the above message to not appear. Instead > the test would fail with: > > # ERROR: Redefinition not completed. > > Followed by a bunch more error message during the part of the test > that checks if the redef was done properly. > > If the CompiledMethodLoad event comes in before the hot method is ever > called (which it does with -Xcomp), the test sets fire = -1. If the > hot method was called, it is set to 1.? The setting of fire = -1 was > added to fix the -Xcomp problem mentioned above. The jvmti agent does > the following: > > ??? do { > ??????? THREAD_sleep(1); > ??????? /* wait for compilation to happen */ > ??? } while(fire == 0); > > ??? if (fire == 1) { > ??????? /* do the redef here */ > ??????? NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is > successfully done\n"); > ??? } else { > ??????? // fire == -1 > ??????? NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't > perform redefinition\n"); > ??? } > > The agent then syncs with the debuggee, waiting for it finish up. What > the test expects is that waitForRedefinitionStarted() in the debuggee > will time out after two seconds while waiting for fire == 1 (which it > thinks will will always happen because it was set to -1). 
When it > times out, the test does appear to exit properly with, but with the > following in the log, which is intended: > > ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test > ignored. > > However, sometimes before waitForRedefinitionStarted() times out, the > hot method is called enough times to trigger compilation. So another > CompiledMethodLoad event arrives, and this time fire is set to 1. > Because of this, waitForRedefinitionStarted() doesn't time out and > returns with an indication that the redef has started. After this > waitForRedefinitionCompleted() is executed. It waits for the redef to > complete, but it never does since the agent decided not to do the > redef when it saw fire == -1. So waitForRedefinitionCompleted() times > out after 10 seconds and the test fails, with: > > # ERROR: Redefinition not completed. > > Actually the above error is not really what causes the failure. When > the above error is detected, no error status is set and the test > continues as if the redef had been done. So then the logic that > detects if the redef was done properly ends up failing, and that's > where the test actually indicates a failure status. You see a whole > bunch of other errors in the log because of all the checks that fail. > > The fix is to not abandon the test when the first CompiledMethodLoad > event is before the hot method was called. Instead just leave fire==0 > and wait for the next CompiledMethodLoad event that is triggered after > the method is called enough times to be recompiled. I'm not sure why > it was not originally done this way. Possibly the recompilation did > not happen reliably, but I have not run into this problem. The other > changes in redefclass030.c are just cleaning up debug tracing. 
>
> Another fix was to properly set the error status when
> waitForRedefinitionStarted() or waitForRedefinitionCompleted() times
> out, although this is just a safety net and I didn't run into any
> cases where this happened after fixing the CompiledMethodLoad event
> handling. So in general the changes in redefclass030.java were not
> needed, but provide better error handling.
>
> thanks,
>
> Chris
>

From daniel.daugherty at oracle.com Mon Jul 23 17:17:30 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 23 Jul 2018 13:17:30 -0400
Subject: RFR(XXXS): 8208092 ProblemList serviceability/sa/ClhsdbCDSCore.java
In-Reply-To: <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com>
References: <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com>
Message-ID:

I added back the alias... you accidentally deleted it...

On 7/23/18 12:34 PM, serguei.spitsyn at oracle.com wrote:
> Hi Dan,
>
> The bug number in the problem list has to be 8207832, not 8208092. :)

Thanks for the catch! I knew I should have waited until after lunch to send out that RFR... sigh...

> Count it as reviewed if you fix it - trivial rule applies.

$ hg diff
diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 19:58:43 2018 +0530
+++ b/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 13:15:50 2018 -0400
@@ -79,6 +79,7 @@

 # :hotspot_serviceability

+serviceability/sa/ClhsdbCDSCore.java                 8207832 linux-x64
 serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
 serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all

Dan

>
> Thanks,
> Serguei
>
>
> On 7/23/18 09:10, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> We have an intermittent tier1 test failure on Linux-X64 in both JDK11
>> and
>> JDK12. I'm putting it on the ProblemList:
>>
>> $ hg diff
>> diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt
Mon Jul 23 19:58:43 2018 >> +0530 >> +++ b/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 12:07:59 2018 >> -0400 >> @@ -79,6 +79,7 @@ >> >> ?# :hotspot_serviceability >> >> +serviceability/sa/ClhsdbCDSCore.java???????????????? 8208092 linux-x64 >> ?serviceability/sa/TestRevPtrsForInvokeDynamic.java?? 8191270 >> generic-all >> ?serviceability/sa/sadebugd/SADebugDTest.java???????? 8163805 >> generic-all >> >> Thanks, in advance, for a single (R)eview of this trivial change. >> >> Dan >> > From serguei.spitsyn at oracle.com Mon Jul 23 18:37:08 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 23 Jul 2018 11:37:08 -0700 Subject: RFR(XXXS): 8208092 ProblemList serviceability/sa/ClhsdbCDSCore.java In-Reply-To: References: <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com> Message-ID: Looks good. Thanks, Serguei On 7/23/18 10:17, Daniel D. Daugherty wrote: > I added back the alias... you accidentally deleted it... > > > On 7/23/18 12:34 PM, serguei.spitsyn at oracle.com wrote: >> Hi Dan, >> >> The bug number in the problem list has to be 8207832, not 8208092. :) > > Thanks for the catch! I knew I should have waited until after lunch to > send out that RFR... sigh... > > >> Count it as reviewed if you fix it - trivial rule applies. > > $ hg diff > diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt > --- a/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 19:58:43 2018 > +0530 > +++ b/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 13:15:50 2018 > -0400 > @@ -79,6 +79,7 @@ > > ?# :hotspot_serviceability > > +serviceability/sa/ClhsdbCDSCore.java???????????????? 8207832 linux-x64 > ?serviceability/sa/TestRevPtrsForInvokeDynamic.java?? 8191270 generic-all > ?serviceability/sa/sadebugd/SADebugDTest.java???????? 8163805 generic-all > > > Dan > > >> >> Thanks, >> Serguei >> >> >> On 7/23/18 09:10, Daniel D. Daugherty wrote: >>> Greetings, >>> >>> We have an intermittent tier1 test failure on Linux-X64 in both >>> JDK11 and >>> JDK12. 
I'm putting it on the ProblemList: >>> >>> $ hg diff >>> diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt >>> --- a/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 19:58:43 2018 >>> +0530 >>> +++ b/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 12:07:59 2018 >>> -0400 >>> @@ -79,6 +79,7 @@ >>> >>> ?# :hotspot_serviceability >>> >>> +serviceability/sa/ClhsdbCDSCore.java???????????????? 8208092 linux-x64 >>> ?serviceability/sa/TestRevPtrsForInvokeDynamic.java?? 8191270 >>> generic-all >>> ?serviceability/sa/sadebugd/SADebugDTest.java???????? 8163805 >>> generic-all >>> >>> Thanks, in advance, for a single (R)eview of this trivial change. >>> >>> Dan >>> >> > From daniel.daugherty at oracle.com Mon Jul 23 18:40:13 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 23 Jul 2018 14:40:13 -0400 Subject: RFR(XXXS): 8208092 ProblemList serviceability/sa/ClhsdbCDSCore.java In-Reply-To: References: <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com> Message-ID: Thanks! Dan On 7/23/18 2:37 PM, serguei.spitsyn at oracle.com wrote: > Looks good. > > Thanks, > Serguei > > On 7/23/18 10:17, Daniel D. Daugherty wrote: >> I added back the alias... you accidentally deleted it... >> >> >> On 7/23/18 12:34 PM, serguei.spitsyn at oracle.com wrote: >>> Hi Dan, >>> >>> The bug number in the problem list has to be 8207832, not 8208092. :) >> >> Thanks for the catch! I knew I should have waited until after lunch to >> send out that RFR... sigh... >> >> >>> Count it as reviewed if you fix it - trivial rule applies. >> >> $ hg diff >> diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 19:58:43 2018 >> +0530 >> +++ b/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 13:15:50 2018 >> -0400 >> @@ -79,6 +79,7 @@ >> >> ?# :hotspot_serviceability >> >> +serviceability/sa/ClhsdbCDSCore.java???????????????? 8207832 linux-x64 >> ?serviceability/sa/TestRevPtrsForInvokeDynamic.java?? 
8191270 >> generic-all >> ?serviceability/sa/sadebugd/SADebugDTest.java???????? 8163805 >> generic-all >> >> >> Dan >> >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/23/18 09:10, Daniel D. Daugherty wrote: >>>> Greetings, >>>> >>>> We have an intermittent tier1 test failure on Linux-X64 in both >>>> JDK11 and >>>> JDK12. I'm putting it on the ProblemList: >>>> >>>> $ hg diff >>>> diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt >>>> --- a/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 19:58:43 >>>> 2018 +0530 >>>> +++ b/test/hotspot/jtreg/ProblemList.txt??? Mon Jul 23 12:07:59 >>>> 2018 -0400 >>>> @@ -79,6 +79,7 @@ >>>> >>>> ?# :hotspot_serviceability >>>> >>>> +serviceability/sa/ClhsdbCDSCore.java 8208092 linux-x64 >>>> ?serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 >>>> generic-all >>>> ?serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all >>>> >>>> Thanks, in advance, for a single (R)eview of this trivial change. >>>> >>>> Dan >>>> >>> >> > From chris.plummer at oracle.com Mon Jul 23 18:40:24 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 23 Jul 2018 11:40:24 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> Message-ID: <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> Hi Serguei, If the fix was complicated I would agree, but it really just boils down to this one line change: -??????????? fire = -1; +??????????? fire = 0; // Ignore this compilation. Wait for next one. Given that, I see no reason not to increase our test coverage by supporting this test during -Xcomp runs. thanks, Chris On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > Would it be more simple to avoid running these tests with -Xcomp? 
> I guess, this would work: @requires vm.compMode != "Xcomp" > > Thanks, > Serguei > > > On 7/23/18 00:42, Chris Plummer wrote: >> Hello, >> >> Please review the following fix for JDK11: >> >> https://bugs.openjdk.java.net/browse/JDK-8151259 >> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00 >> >> It fixes the following 3 tests: >> >> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java >> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java >> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java >> >> Any of which could fail when run with -Xcomp with (followed by a >> bunch more errors): >> >> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >> ignored. >> >> Although lately we've only seen this with redefclass030.java on macosx. >> >> These 3 tests do redefinition of a "hot" method after triggering >> compilation for it. After the redef some testing is done to ensure >> that the redef was done correctly, but the issue these test have >> actually comes before any redef is done. >> >> The test attempts to trigger compilation by calling a hot method a >> lot. The agent detects compilation by receiving a CompiledMethodLoad >> event. There was an issue discovered long ago that when -Xcomp is >> used, the compilation happens before the "hot" method is ever called. >> Then the redef would happen before compilation, and this somehow >> messed up the test (I'm not exactly sure how). The fix was to >> basically abandon the redef attempt when this problem is detected, >> and then supposedly just let the test run to completion (skipping the >> actual testing of the redef). After this change, if you ran with >> -Xcomp it would pass, but if you looked in the log you would see: >> >> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >> ignored. >> >> However, there was a bug in the logic to make the test run to >> completion, and also causes the above message to not appear. 
Instead >> the test would fail with: >> >> # ERROR: Redefinition not completed. >> >> Followed by a bunch more error message during the part of the test >> that checks if the redef was done properly. >> >> If the CompiledMethodLoad event comes in before the hot method is >> ever called (which it does with -Xcomp), the test sets fire = -1. If >> the hot method was called, it is set to 1.? The setting of fire = -1 >> was added to fix the -Xcomp problem mentioned above. The jvmti agent >> does the following: >> >> ??? do { >> ??????? THREAD_sleep(1); >> ??????? /* wait for compilation to happen */ >> ??? } while(fire == 0); >> >> ??? if (fire == 1) { >> ??????? /* do the redef here */ >> ??????? NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is >> successfully done\n"); >> ??? } else { >> ??????? // fire == -1 >> ??????? NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. >> Don't perform redefinition\n"); >> ??? } >> >> The agent then syncs with the debuggee, waiting for it finish up. >> What the test expects is that waitForRedefinitionStarted() in the >> debuggee will time out after two seconds while waiting for fire == 1 >> (which it thinks will will always happen because it was set to -1). >> When it times out, the test does appear to exit properly with, but >> with the following in the log, which is intended: >> >> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >> ignored. >> >> However, sometimes before waitForRedefinitionStarted() times out, the >> hot method is called enough times to trigger compilation. So another >> CompiledMethodLoad event arrives, and this time fire is set to 1. >> Because of this, waitForRedefinitionStarted() doesn't time out and >> returns with an indication that the redef has started. After this >> waitForRedefinitionCompleted() is executed. It waits for the redef to >> complete, but it never does since the agent decided not to do the >> redef when it saw fire == -1. 
So waitForRedefinitionCompleted() times >> out after 10 seconds and the test fails, with: >> >> # ERROR: Redefinition not completed. >> >> Actually the above error is not really what causes the failure. When >> the above error is detected, no error status is set and the test >> continues as if the redef had been done. So then the logic that >> detects if the redef was done properly ends up failing, and that's >> where the test actually indicates a failure status. You see a whole >> bunch of other errors in the log because of all the checks that fail. >> >> The fix is to not abandon the test when the first CompiledMethodLoad >> event is before the hot method was called. Instead just leave fire==0 >> and wait for the next CompiledMethodLoad event that is triggered >> after the method is called enough times to be recompiled. I'm not >> sure why it was not originally done this way. Possibly the >> recompilation did not happen reliably, but I have not run into this >> problem. The other changes in redefclass030.c are just cleaning up >> debug tracing. >> >> Another fix was to properly set the error status when >> waitForRedefinitionStarted() or waitForRedefinitionCompleted() times >> out, although this is just a safety net and I didn't run into any >> cases where this happened after fixing the CompiledMethodLoad event >> handling. So in general the changes in redefclass030.java were not >> needed, but provide better error handling. 
>>
>> thanks,
>>
>> Chris
>>
>

From chris.plummer at oracle.com Mon Jul 23 19:17:26 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 12:17:26 -0700
Subject: RFR(XS):8208075: Quarantine vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
Message-ID: <831a1b6d-c6e4-ea65-3cc1-c7202e75440f@oracle.com>

Hi,

Please review the following, to be pushed to JDK 11:

https://bugs.openjdk.java.net/browse/JDK-8208075

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -123,7 +123,7 @@

 vmTestbase/nsk/jvmti/ClearBreakpoint/clrbrk001/TestDescription.java 8016181 generic-all
 vmTestbase/nsk/jvmti/FieldModification/fieldmod001/TestDescription.java 8016181 generic-all
-vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java 8202896 linux-x64
+vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java 8202896,8206076,8208074 generic-all
 vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted001/TestDescription.java 7013634 generic-all
 vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted003/TestDescription.java 6606767 generic-all
 vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java 7013634,6606767 generic-all

The test was already quarantined on linux-x64 due to 8202896. However, a few days ago it started to fail for a different reason on every run and every platform. 8208074 was filed for it. Also, it fails rarely due to timeout, and 8206076 was filed for that failure a few months ago.
https://bugs.openjdk.java.net/browse/JDK-8202896
https://bugs.openjdk.java.net/browse/JDK-8206076
https://bugs.openjdk.java.net/browse/JDK-8208074

thanks,

Chris

From serguei.spitsyn at oracle.com Mon Jul 23 19:39:41 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 12:39:41 -0700
Subject: RFR(XS):8208075: Quarantine vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
In-Reply-To: <831a1b6d-c6e4-ea65-3cc1-c7202e75440f@oracle.com>
References: <831a1b6d-c6e4-ea65-3cc1-c7202e75440f@oracle.com>
Message-ID: <6e64b192-2955-c765-abf5-718fdc02f168@oracle.com>

Looks good.

Thanks,
Serguei

On 7/23/18 12:17, Chris Plummer wrote:
> Hi,
>
> Please review the following, to be pushed to JDK 11
>
> https://bugs.openjdk.java.net/browse/JDK-8208075
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt
> b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -123,7 +123,7 @@
>
>  vmTestbase/nsk/jvmti/ClearBreakpoint/clrbrk001/TestDescription.java
> 8016181 generic-all
>  vmTestbase/nsk/jvmti/FieldModification/fieldmod001/TestDescription.java
> 8016181 generic-all
> -vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
> 8202896 linux-x64
> +vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
> 8202896,8206076,8208074 generic-all
>  vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted001/TestDescription.java
> 7013634 generic-all
>  vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted003/TestDescription.java
> 6606767 generic-all
>  vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java
> 7013634,6606767 generic-all
>
> The test was already quarantined on linux-x64 due to 8202896. However
> a few days ago it started to fail for a different reason on every run
> and every platform. 8208074 was filed for it.
Also, it fails rarely
> due to timeout, and 8206076 was filed for that failure a few months ago.
>
> https://bugs.openjdk.java.net/browse/JDK-8202896
> https://bugs.openjdk.java.net/browse/JDK-8206076
> https://bugs.openjdk.java.net/browse/JDK-8208074
>
> thanks,
>
> Chris
>

From hohensee at amazon.com Mon Jul 23 21:33:28 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 23 Jul 2018 21:33:28 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions
Message-ID:

Corrected subject line: 8196889 should be 8196989.

From: hotspot-gc-dev on behalf of "Hohensee, Paul"
Date: Friday, July 20, 2018 at 3:38 PM
To: "hotspot-gc-dev at openjdk.java.net", "serviceability-dev at openjdk.java.net"
Subject: RFR(L): 8196889: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions

Please review.

Bug: https://bugs.openjdk.java.net/browse/JDK-8196989
CSR: https://bugs.openjdk.java.net/browse/JDK-8196991
Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00

This webrev is marked "L" because it's a behavioral change (CSR in draft state, may I have a review of that too please?) and because the test change fanout is large. The actual code changes are "M". Passes the submit repo, Hotspot tier1, the JFR gc event tests and any other test set with "gc" or "serviceability" in the test directory name. I found it difficult to verify the accuracy of the reported values other than manually, since they can vary from run to run of the same program. I'd appreciate suggestions for how to go about writing accuracy tests.

I set out originally to revamp only the MXBeans, but decided it would be incomplete if I didn't include the jstat counters and the output of the GC.heap_info jcmd option. I can separate the latter two into their own RFEs, but I find it easier to understand it all in a single webrev and hope the reviewers will too.
The basic approach is to add the new memory pools and collectors, the new jstat counters, and an archive region counter that stands in for an actual archive region set. HeapRegionSets are disjoint, so initially I tried to create a first-class archive region set (on the same level as the humongous region set), but that idea foundered on the fact that there's too much code I don't fully understand that depends on archive regions being in the existing old region set. Externally (i.e., in the MXBeans and the jstat counters), however, the old region set doesn't include archive regions (unless running in legacy mode).

I used CMS's TraceCMSMemoryManagerStats class as the model for TraceConcMemoryManagerStats, which latter collects statistics on concurrent cycles. There are two STW pauses in each concurrent cycle: they are recorded separately and count as two sun.gc.collector.2 events.

The humongous and archive space committed and used values are always identical, hence they are always 100% used.

The revised output of jcmd GC.heap_info is in G1CollectedHeap::print_on().

I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing the result type of young_list_target_length() from size_t to uint, which latter is the type of the _young_list_target_length member.

I updated the copyright date in src/hotspot/share/services/memoryService.hpp to 2018, as I neglected to do so in a previous push.

Thanks,

Paul

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From serguei.spitsyn at oracle.com Tue Jul 24 00:22:27 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 17:22:27 -0700
Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp
In-Reply-To: <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
Message-ID:

Hi Chris,

On 7/23/18 11:40, Chris Plummer wrote:
> Hi Serguei,
>
> If the fix was complicated I would agree, but it really just boils
> down to this one line change:
>
> -            fire = -1;
> +            fire = 0; // Ignore this compilation. Wait for next one.

It is not obvious that this will completely fix the problem.
Is it possible that there will not be a next compilation with -Xcomp?

If it is possible, then it is better to explicitly exclude these tests for -Xcomp.
Otherwise, consider this reviewed.

>
> Given that, I see no reason not to increase our test coverage by
> supporting this test during -Xcomp runs.

I'd agree if it is going to be stable.

Thanks,
Serguei

>
> thanks,
>
> Chris
>
> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>> Would it be simpler to avoid running these tests with -Xcomp?
>> I guess, this would work: @requires vm.compMode != "Xcomp" >> >> Thanks, >> Serguei >> >> >> On 7/23/18 00:42, Chris Plummer wrote: >>> Hello, >>> >>> Please review the following fix for JDK11: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8151259 >>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00 >>> >>> It fixes the following 3 tests: >>> >>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java >>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java >>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java >>> >>> Any of which could fail when run with -Xcomp with (followed by a >>> bunch more errors): >>> >>> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >>> ignored. >>> >>> Although lately we've only seen this with redefclass030.java on macosx. >>> >>> These 3 tests do redefinition of a "hot" method after triggering >>> compilation for it. After the redef some testing is done to ensure >>> that the redef was done correctly, but the issue these test have >>> actually comes before any redef is done. >>> >>> The test attempts to trigger compilation by calling a hot method a >>> lot. The agent detects compilation by receiving a CompiledMethodLoad >>> event. There was an issue discovered long ago that when -Xcomp is >>> used, the compilation happens before the "hot" method is ever >>> called. Then the redef would happen before compilation, and this >>> somehow messed up the test (I'm not exactly sure how). The fix was >>> to basically abandon the redef attempt when this problem is >>> detected, and then supposedly just let the test run to completion >>> (skipping the actual testing of the redef). After this change, if >>> you ran with -Xcomp it would pass, but if you looked in the log you >>> would see: >>> >>> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >>> ignored. 
>>>
>>> However, there was a bug in the logic to make the test run to
>>> completion, which also causes the above message not to appear.
>>> Instead the test would fail with:
>>>
>>> # ERROR: Redefinition not completed.
>>>
>>> Followed by a bunch more error messages during the part of the test
>>> that checks if the redef was done properly.
>>>
>>> If the CompiledMethodLoad event comes in before the hot method is
>>> ever called (which it does with -Xcomp), the test sets fire = -1. If
>>> the hot method was called, it is set to 1. The setting of fire = -1
>>> was added to fix the -Xcomp problem mentioned above. The jvmti agent
>>> does the following:
>>>
>>>     do {
>>>         THREAD_sleep(1);
>>>         /* wait for compilation to happen */
>>>     } while(fire == 0);
>>>
>>>     if (fire == 1) {
>>>         /* do the redef here */
>>>         NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
>>>     } else {
>>>         // fire == -1
>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>     }
>>>
>>> The agent then syncs with the debuggee, waiting for it to finish up.
>>> What the test expects is that waitForRedefinitionStarted() in the
>>> debuggee will time out after two seconds while waiting for fire == 1
>>> (which it assumes will always happen because fire was set to -1).
>>> When it times out, the test does appear to exit properly, but
>>> with the following in the log, which is intended:
>>>
>>>  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test
>>> ignored.
>>>
>>> However, sometimes before waitForRedefinitionStarted() times out,
>>> the hot method is called enough times to trigger compilation. So
>>> another CompiledMethodLoad event arrives, and this time fire is set
>>> to 1. Because of this, waitForRedefinitionStarted() doesn't time out
>>> and returns with an indication that the redef has started.
After >>> this waitForRedefinitionCompleted() is executed. It waits for the >>> redef to complete, but it never does since the agent decided not to >>> do the redef when it saw fire == -1. So >>> waitForRedefinitionCompleted() times out after 10 seconds and the >>> test fails, with: >>> >>> # ERROR: Redefinition not completed. >>> >>> Actually the above error is not really what causes the failure. When >>> the above error is detected, no error status is set and the test >>> continues as if the redef had been done. So then the logic that >>> detects if the redef was done properly ends up failing, and that's >>> where the test actually indicates a failure status. You see a whole >>> bunch of other errors in the log because of all the checks that fail. >>> >>> The fix is to not abandon the test when the first CompiledMethodLoad >>> event is before the hot method was called. Instead just leave >>> fire==0 and wait for the next CompiledMethodLoad event that is >>> triggered after the method is called enough times to be recompiled. >>> I'm not sure why it was not originally done this way. Possibly the >>> recompilation did not happen reliably, but I have not run into this >>> problem. The other changes in redefclass030.c are just cleaning up >>> debug tracing. >>> >>> Another fix was to properly set the error status when >>> waitForRedefinitionStarted() or waitForRedefinitionCompleted() times >>> out, although this is just a safety net and I didn't run into any >>> cases where this happened after fixing the CompiledMethodLoad event >>> handling. So in general the changes in redefclass030.java were not >>> needed, but provide better error handling. 
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>
>
>

From chris.plummer at oracle.com Tue Jul 24 03:19:33 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 20:19:33 -0700
Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp
In-Reply-To:
References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
Message-ID:

On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
>
> On 7/23/18 11:40, Chris Plummer wrote:
>> Hi Serguei,
>>
>> If the fix was complicated I would agree, but it really just boils
>> down to this one line change:
>>
>> -            fire = -1;
>> +            fire = 0; // Ignore this compilation. Wait for next one.
>
> It is not obvious that this will completely fix the problem.
> Is it possible that there will not be a next compilation with -Xcomp?

It's only one method that we check for. I don't see why there would be a 2nd -Xcomp compilation for it, but even if there were, the test would ignore it just like the first one. It will ignore compilations of the method until the flag has been set indicating the method has been executed once. If for some reason the method is never compiled after being executed once, the test will give up waiting for it (I think after 30 seconds) and produce an error.

>
> If it is possible then it is better to explicitly exclude these tests
> for -Xcomp.
> Otherwise, consider this reviewed.
>
>>
>> Given that, I see no reason not to increase our test coverage by
>> supporting this test during -Xcomp runs.
>
> I'd agree if it is going to be stable.
>

If problems turn up in the future, we can reconsider disabling it.

thanks,

Chris

> Thanks,
> Serguei
>
>>
>> thanks,
>>
>> Chris
>>
>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>>
>>> Would it be simpler to avoid running these tests with -Xcomp?
>>> I guess, this would work: @requires vm.compMode != "Xcomp" >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/23/18 00:42, Chris Plummer wrote: >>>> Hello, >>>> >>>> Please review the following fix for JDK11: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8151259 >>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00 >>>> >>>> It fixes the following 3 tests: >>>> >>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java >>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java >>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java >>>> >>>> Any of which could fail when run with -Xcomp with (followed by a >>>> bunch more errors): >>>> >>>> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >>>> ignored. >>>> >>>> Although lately we've only seen this with redefclass030.java on >>>> macosx. >>>> >>>> These 3 tests do redefinition of a "hot" method after triggering >>>> compilation for it. After the redef some testing is done to ensure >>>> that the redef was done correctly, but the issue these test have >>>> actually comes before any redef is done. >>>> >>>> The test attempts to trigger compilation by calling a hot method a >>>> lot. The agent detects compilation by receiving a >>>> CompiledMethodLoad event. There was an issue discovered long ago >>>> that when -Xcomp is used, the compilation happens before the "hot" >>>> method is ever called. Then the redef would happen before >>>> compilation, and this somehow messed up the test (I'm not exactly >>>> sure how). The fix was to basically abandon the redef attempt when >>>> this problem is detected, and then supposedly just let the test run >>>> to completion (skipping the actual testing of the redef). After >>>> this change, if you ran with -Xcomp it would pass, but if you >>>> looked in the log you would see: >>>> >>>> ?# ERROR: Redefinition not started. Maybe running with -Xcomp. Test >>>> ignored. 
>>>>
>>>> However, there was a bug in the logic to make the test run to
>>>> completion, which also causes the above message not to appear.
>>>> Instead the test would fail with:
>>>>
>>>> # ERROR: Redefinition not completed.
>>>>
>>>> Followed by a bunch more error messages during the part of the test
>>>> that checks if the redef was done properly.
>>>>
>>>> If the CompiledMethodLoad event comes in before the hot method is
>>>> ever called (which it does with -Xcomp), the test sets fire = -1.
>>>> If the hot method was called, it is set to 1. The setting of fire
>>>> = -1 was added to fix the -Xcomp problem mentioned above. The jvmti
>>>> agent does the following:
>>>>
>>>>     do {
>>>>         THREAD_sleep(1);
>>>>         /* wait for compilation to happen */
>>>>     } while(fire == 0);
>>>>
>>>>     if (fire == 1) {
>>>>         /* do the redef here */
>>>>         NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
>>>>     } else {
>>>>         // fire == -1
>>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>>     }
>>>>
>>>> The agent then syncs with the debuggee, waiting for it to finish up.
>>>> What the test expects is that waitForRedefinitionStarted() in the
>>>> debuggee will time out after two seconds while waiting for fire ==
>>>> 1 (which it assumes will always happen because fire was set to
>>>> -1). When it times out, the test does appear to exit properly,
>>>> but with the following in the log, which is intended:
>>>>
>>>>  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test
>>>> ignored.
>>>>
>>>> However, sometimes before waitForRedefinitionStarted() times out,
>>>> the hot method is called enough times to trigger compilation. So
>>>> another CompiledMethodLoad event arrives, and this time fire is set
>>>> to 1. Because of this, waitForRedefinitionStarted() doesn't time
>>>> out and returns with an indication that the redef has started.
>>>> After this waitForRedefinitionCompleted() is executed. It waits for >>>> the redef to complete, but it never does since the agent decided >>>> not to do the redef when it saw fire == -1. So >>>> waitForRedefinitionCompleted() times out after 10 seconds and the >>>> test fails, with: >>>> >>>> # ERROR: Redefinition not completed. >>>> >>>> Actually the above error is not really what causes the failure. >>>> When the above error is detected, no error status is set and the >>>> test continues as if the redef had been done. So then the logic >>>> that detects if the redef was done properly ends up failing, and >>>> that's where the test actually indicates a failure status. You see >>>> a whole bunch of other errors in the log because of all the checks >>>> that fail. >>>> >>>> The fix is to not abandon the test when the first >>>> CompiledMethodLoad event is before the hot method was called. >>>> Instead just leave fire==0 and wait for the next CompiledMethodLoad >>>> event that is triggered after the method is called enough times to >>>> be recompiled. I'm not sure why it was not originally done this >>>> way. Possibly the recompilation did not happen reliably, but I have >>>> not run into this problem. The other changes in redefclass030.c are >>>> just cleaning up debug tracing. >>>> >>>> Another fix was to properly set the error status when >>>> waitForRedefinitionStarted() or waitForRedefinitionCompleted() >>>> times out, although this is just a safety net and I didn't run into >>>> any cases where this happened after fixing the CompiledMethodLoad >>>> event handling. So in general the changes in redefclass030.java >>>> were not needed, but provide better error handling. 
>>>> >>>> thanks, >>>> >>>> Chris >>>> >>> >> >> > From serguei.spitsyn at oracle.com Tue Jul 24 07:25:16 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 00:25:16 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> Message-ID: <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> An HTML attachment was scrubbed... URL: From ralf.schmelter at sap.com Tue Jul 24 13:32:54 2018 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Tue, 24 Jul 2018 13:32:54 +0000 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com> References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com> Message-ID: <92dcce7000a94cf89ae2169cb1f843f2@sap.com> Hi all, here is the update webref with the fixed copyright: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/ Best regards, Ralf -----Original Message----- From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] Sent: Freitag, 20. 
Juli 2018 23:04
To: Chris Plummer; Schmelter, Ralf; serviceability-dev at openjdk.java.net; Stuefe, Thomas
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

On 7/20/18 13:44, Chris Plummer wrote:
> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Ralf,
>>
>>
>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>> Hi Serguei,
>>>
>>> I've updated the webrev:
>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>
>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not
>> 2008.
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html
>>
>>
>>   72             if (newDepth == -1_000) {
>>   73                 // Pop some frames so there is room on the stack
>> for the
>>   74                 // call (including println()).
>>   75                 notifyRecursionEnded();
>>   76             }
>>
>>   I have a concern about the potential issue mentioned in the comment above.
>>   Should a StackOverflowError be expected here?
>>
>>   79         } catch (StackOverflowError e) {
>>   80             // Use negative depth to indicate the recursion has
>> ended.
>>   81             return -1;
>>   82         }
>>
>>   What is going to happen if the StackOverflowError was really caught
>> above?
> The SOE is really caught in the above code. It returns -1 and starts
> the unwinding of the stack. After 1000 frames have been popped via
> returns, notifyRecursionEnded() will be called. The pops are so
> notifyRecursionEnded() can be called without worry of another SOE.

Got it, thanks Chris.

So, I'm okay with the fix assuming the copyright year is fixed.
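The unwinding described in the quoted reply can be sketched roughly as below. This is a simplified illustration and not the code from Frames2Test.java: the class name `UnwindSketch`, the `recurse` method, the `notified` flag and the `FRAMES_TO_POP` constant are hypothetical. Only the `-1` return on StackOverflowError and the call to `notifyRecursionEnded()` after about 1000 popped frames are taken from the quoted snippet.

```java
class UnwindSketch {
    // Hypothetical constant: frames to pop before calling out again.
    static final int FRAMES_TO_POP = 1_000;

    // Hypothetical flag so a caller can see that the hook ran.
    static boolean notified = false;

    // Stand-in for the test's hook; here it only records and prints.
    static void notifyRecursionEnded() {
        notified = true;
        System.out.println("recursion ended");
    }

    // Recurses until the stack overflows, then counts popped frames with
    // negative numbers: -1 in the frame that caught the error, -2 in its
    // caller, and so on up to the first frame.
    static int recurse(int depth) {
        try {
            int newDepth = recurse(depth + 1);
            if (newDepth == -FRAMES_TO_POP) {
                // Enough frames have been popped that this call (and the
                // println inside it) has stack space to run.
                notifyRecursionEnded();
            }
            return newDepth - 1;
        } catch (StackOverflowError e) {
            // A negative value signals that unwinding has started.
            return -1;
        }
    }

    public static void main(String[] args) {
        int result = recurse(0);
        System.out.println("frames unwound: " + (-result));
    }
}
```

The point of the delay is that the frame which catches the error has almost no stack left, so any call made there could overflow again; waiting until 1000 frames have returned leaves room for the notification call.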
Thanks,
Serguei

From serguei.spitsyn at oracle.com Tue Jul 24 16:01:34 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 09:01:34 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To: <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com> <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
Message-ID: <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>

Hi Ralf,

I think you have to consider it reviewed.
Sorry, I was not clear that no new webrev is needed.
Do you need a sponsor for the push?

Thanks,
Serguei

On 7/24/18 06:32, Schmelter, Ralf wrote:
> Hi all,
>
> here is the updated webrev with the fixed copyright: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/
>
> Best regards,
> Ralf
>
> -----Original Message-----
> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
> Sent: Freitag, 20. Juli 2018 23:04
> To: Chris Plummer; Schmelter, Ralf; serviceability-dev at openjdk.java.net; Stuefe, Thomas
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> On 7/20/18 13:44, Chris Plummer wrote:
>> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Ralf,
>>>
>>>
>>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>>> Hi Serguei,
>>>>
>>>> I've updated the webrev:
>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not
>>> 2008.
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html
>>>
>>>
>>>   72             if (newDepth == -1_000) {
>>>   73                 // Pop some frames so there is room on the stack
>>> for the
>>>   74                 // call (including println()).
>>>   75                 notifyRecursionEnded();
>>>   76             }
>>>
>>>   I have a concern about the potential issue mentioned in the comment above.
>>>   Should a StackOverflowError be expected here?
>>>
>>>   79         } catch (StackOverflowError e) {
>>>   80             // Use negative depth to indicate the recursion has
>>> ended.
>>>   81             return -1;
>>>   82         }
>>>
>>>   What is going to happen if the StackOverflowError was really caught
>>> above?
>> The SOE is really caught in the above code. It returns -1 and starts
>> the unwinding of the stack. After 1000 frames have been popped via
>> returns, notifyRecursionEnded() will be called. The pops are so
>> notifyRecursionEnded() can be called without worry of another SOE.
> Got it, thanks Chris.
>
> So, I'm okay with the fix assuming the copyright year is fixed.
>
> Thanks,
> Serguei

From chris.plummer at oracle.com Tue Jul 24 16:27:04 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 09:27:04 -0700
Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp
In-Reply-To: <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
Message-ID: <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>

An HTML attachment was scrubbed...
URL: From serguei.spitsyn at oracle.com Tue Jul 24 19:18:16 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 12:18:16 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From gary.adams at oracle.com Tue Jul 24 19:28:28 2018 From: gary.adams at oracle.com (Gary Adams) Date: Tue, 24 Jul 2018 15:28:28 -0400 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com> <5B507F2C.4080503@oracle.com> <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com> Message-ID: <5B577DDC.3000500@oracle.com> Here's a quick prototype to add a variable to the debuggee. The debugger sets it at the end of each completed test case. The debuggee can then check for the value change to delay hitting the breakpoint which interfered with suspend count checks. Would need to add a bit more error and timeout checking to complete the fix. Should also check if the other resume008 test cases need similar synchronization. Could possibly migrate the code up to TestDebuggerType1 if other tests also needed this generic capability. 
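The extra timeout checking mentioned above might take roughly the following shape on the debuggee side. This is only a sketch under assumptions: the class `TestCaseGate`, the method `waitForTestCase` and the timeout values are hypothetical and are not part of resume008a.java. Only the idea of polling a static `testCase` field that the debugger updates (through JDI's `ClassType.setValue` in the prototype) comes from the description.

```java
class TestCaseGate {
    // In the prototype the debugger writes this field through JDI.
    // It is volatile here so that the plain Java writer used in main()
    // is also guaranteed to be visible to the polling thread.
    static volatile int testCase = -1;

    /**
     * Polls until testCase reaches at least `expected` or the timeout
     * expires. Returns true if the value was observed, false on timeout
     * or if the waiting thread was interrupted.
     */
    static boolean waitForTestCase(int expected, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (testCase < expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up: the debugger never reported in
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate the debugger completing test case 0 after a short delay.
        Thread debugger = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException ignored) {
                // not expected in this demo
            }
            testCase = 0;
        });
        debugger.start();
        System.out.println("observed: " + waitForTestCase(0, 2_000));
        debugger.join();
    }
}
```

Returning a boolean rather than looping forever lets the debuggee record a failure when the debugger never updates the field, instead of hanging until the harness times out.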
diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java @@ -63,6 +63,9 @@ * to be resulting in the event. * - Upon getting new event, the debugger * performs the check corresponding to the event. + * - The debugger informs the debuggee when it completes + * each test case, so it will wait before hitting + * communication breakpoints. */ public class resume008 extends TestDebuggerType1 { @@ -234,6 +237,7 @@ default: throw new Failure("** default case 1 **"); } + informDebuggeeTestCase(i); } display("......--> vm.resume()"); @@ -255,4 +259,25 @@ } } + /** + * Inform debuggee which thread test the debugger has completed. + * Used for synchronization, so the debuggee does not move too quickly. + * @param testCase index of just completed test + */ + void informDebuggeeTestCase(int testCase) { + if (!EventHandler.isDisconnected() && debuggeeClass != null) { + try { + ((ClassType)debuggeeClass) + .setValue(debuggeeClass.fieldByName("testCase"), + vm.mirrorOf(testCase)); + } catch (InvalidTypeException ite) { + // ignored + } catch (ClassNotLoadedException cnle) { + // ignored + } catch (VMDisconnectedException e) { + // ignored } + } + } + +} diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java @@ -62,6 +62,7 @@ static int exitCode = PASSED; + static int testCase = -1; static int instruction = 1; static int end = 0; // static int quit = 0; @@ -104,6 +105,15 @@ threadStart(thread0); thread1 = new Threadresume008a("thread1"); + // Wait for debugger to complete the first test case + 
// before advancing to the next breakpoint + while (testCase < 0) { + try { + Thread.sleep(100); + } catch (InterruptedException e) { + // ignored + } + } methodForCommunication(); break; On 7/20/18, 2:37 PM, Chris Plummer wrote: > Hi Gary, > > The test fails if the breakpoint event comes in after the test > captures the initial thread suspend counts and before the test > captures the 2nd suspend counts. > > debugger> getting : Map suspendsCounts1 > debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal > Dispatcher=1, Finalizer=1} > debugger> eventSet.resume; > debugger> getting : Map suspendsCounts2 > EventHandler> Received event set with policy = SUSPEND_ALL > EventHandler> Event: BreakpointEventImpl req breakpoint request > nsk.jdi.EventSet.resume.resume008a:60 (enabled) > debugger> Received communication breakpoint event. > debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal > Dispatcher=2, Finalizer=2} > > So we end up with some threads starting with 1 suspend and ending with > 2 (not clear to me why main is still at 1). > > It will pass if the breakpoint comes in after it does both of suspend > count checks, as you have shown with the sleep(100) solution. Output > looks like this: > > debugger> got new ThreadStartEvent with propety 'number' == > ThreadStartRequest1 > ... > debugger> ......--> vm.suspend(); > debugger> getting : Map suspendsCounts1 > debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, > Signal Dispatcher=1, Finalizer=1} > debugger> eventSet.resume; > debugger> getting : Map suspendsCounts2 > debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, > Signal Dispatcher=1, Finalizer=1} > ... > debugger> Received communication breakpoint event. > > I've also shown that it passes if the breakpoint always comes in > before capturing the initial suspend counts. I added a sleep on the > debugger side right after eventHandler.waitForRequestedEventSet() > returns. 
Output looks like:
>
> debugger> Received communication breakpoint event.
> debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
> ...
> debugger> ......--> vm.suspend();
> debugger> getting : Map suspendsCounts1
> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
> debugger> eventSet.resume;
> debugger> getting : Map suspendsCounts2
> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
>
> I think we should add synchronization to force one of these two
> outcomes. For the first, you would need to make the debugger modify
> some variable that the debuggee is watching (sitting in a loop waiting
> for it to change). For the second, you can rely on the existing
> methodForCommunication() approach. You just need to restructure the
> debugger a bit. I had started down this path late Wednesday, but got
> sidetracked by a few other things. I can look into it some more if
> you'd like.
>
> thanks,
>
> Chris
>
> On 7/19/18 5:08 AM, Gary Adams wrote:
>> In the successful run below, the sequence "first acquire thread suspend
>> counts, resume, and second acquire thread suspend counts" is not
>> interrupted by the breakpoint event.
>>
>> Note that in the failed thread0 case the test thread finishes rapidly:
>> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0
>> *[2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0*
>>
>> and in the successful test run, the thread0 run method exits after
>> thread1 has started.
>> >> debugger> :::::: case: # 1 >> debugger> ......waiting for new ThreadStartEvent : 1 >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 616bc3ae >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae >> EventHandler> waitForRequestedEventSet: vm.resume called >> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD >> *debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0* >> >> >> Here's a recent mach5 failed log: >> [2018-01-22T20:33:45.65] # [2018-01-22T20:33:45.65] export >> TEST_CLEANUP [2018-01-22T20:33:45.65] export SHELL >> [2018-01-22T20:33:45.65] export DISPLAY [2018-01-22T20:33:45.65] >> export LIBJSIG_PATH [2018-01-22T20:33:45.65] export TESTBASE >> [2018-01-22T20:33:45.65] export JAVA_OPTS [2018-01-22T20:33:45.65] >> export RAS_OPTIONS [2018-01-22T20:33:45.65] export HOME >> [2018-01-22T20:33:45.65] export LD_LIBRARY_PATH >> [2018-01-22T20:33:45.65] export CLASSPATH [2018-01-22T20:33:45.65] >> export TEMP [2018-01-22T20:33:45.65] export TESTED_JAVA_HOME >> [2018-01-22T20:33:45.65] export BASH_ENV [2018-01-22T20:33:45.65] >> export PATH [2018-01-22T20:33:45.65] TEST_DEST_DIR="resume008" >> [2018-01-22T20:33:45.65] # Actual: TEST_DEST_DIR=resume008 >> [2018-01-22T20:33:45.65] TESTNAME="${test_case_name}" >> [2018-01-22T20:33:45.65] # Actual: TESTNAME=resume008 >> [2018-01-22T20:33:45.65] >> testName="nsk/jdi/EventSet/resume//resume008" >> [2018-01-22T20:33:45.65] # Actual: >> testName=nsk/jdi/EventSet/resume//resume008 [2018-01-22T20:33:45.65] >> TESTDIR="${test_work_dir}" [2018-01-22T20:33:45.65] # Actual: >> TESTDIR=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008 >> [2018-01-22T20:33:45.65] testWorkDir="${test_work_dir}/" >> 
[2018-01-22T20:33:45.65] # Actual: >> testWorkDir=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/ >> [2018-01-22T20:33:45.65] export testWorkDir [2018-01-22T20:33:45.65] >> tlogOutFile="${test_work_dir}/${test_name}.tlog" >> [2018-01-22T20:33:45.65] # Actual: >> tlogOutFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.tlog >> [2018-01-22T20:33:45.65] >> testErrFile="${test_work_dir}/${test_name}.err" >> [2018-01-22T20:33:45.65] # Actual: >> testErrFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.err >> [2018-01-22T20:33:45.65] EXECUTE_CLASS="${test_name}" >> [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=resume008 >> [2018-01-22T20:33:45.66] >> NSK_STRESS_METASPACE_OPTS="-XX:MaxMetaspaceSize=128m >> -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m >> -Xlog:gc(ASTERISK_SUBST),gc+heap=trace" [2018-01-22T20:33:45.66] # >> Actual: NSK_STRESS_METASPACE_OPTS=-XX:MaxMetaspaceSize=128m >> -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m >> -Xlog:gc*,gc+heap=trace [2018-01-22T20:33:45.66] export >> NSK_STRESS_METASPACE_OPTS [2018-01-22T20:33:45.66] >> EXECUTE_CLASS="nsk.jdi.EventSet.resume.resume008" >> [2018-01-22T20:33:45.66] # Actual: >> EXECUTE_CLASS=nsk.jdi.EventSet.resume.resume008 >> [2018-01-22T20:33:45.66] TEST_ARGS="${JDI_TEST_KEYS} >> 
-debugee.vmkeys=${JDI_DEBUGEE_VM_KEYS}" [2018-01-22T20:33:45.66] # >> Actual: TEST_ARGS=-verbose -arch=linux-amd64 -waittime=5 >> -debugee.vmkind=java -transport.address=dynamic >> -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:45.66] >> JAVA="${TESTED_JAVA_HOME}/bin/${DEBUGGER_KIND_OF_JAVA}" >> [2018-01-22T20:33:45.66] # Actual: >> JAVA=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java >> [2018-01-22T20:33:45.66] JAVA_OPTS="${DEBUGGER_JAVA_OPTS}" >> [2018-01-22T20:33:45.66] # Actual: JAVA_OPTS= >> [2018-01-22T20:33:45.66] APPLICATION_TIMEOUT="${TIMEOUT}" >> [2018-01-22T20:33:45.66] # Actual: APPLICATION_TIMEOUT=30 >> [2018-01-22T20:33:45.66] >> CLASSPATH="${test_work_dir}${PS}${CLASSPATH}" >> [2018-01-22T20:33:45.66] # Actual: >> CLASSPATH=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008:/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.test/hotspot/closed/tonga/bin/classes: >> [2018-01-22T20:33:45.66] export CLASSPATH [2018-01-22T20:33:45.66] >> ${JAVA} ${JAVA_OPTS} ${EXECUTE_CLASS} ${TEST_ARGS} >> [2018-01-22T20:33:45.66] # Actual: >> /scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java >> nsk.jdi.EventSet.resume.resume008 -verbose -arch=linux-amd64 >> -waittime=5 -debugee.vmkind=java -transport.address=dynamic >> -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.01] >> binder> VirtualMachineManager: version 9.0 [2018-01-22T20:33:46.05] >> binder> Finding connector: default [2018-01-22T20:33:46.05] binder> >> LaunchingConnector: [2018-01-22T20:33:46.06] binder> name: >> com.sun.jdi.CommandLineLaunch [2018-01-22T20:33:46.06] binder> >> 
description: Launches target using Sun Java VM command line and >> attaches to it [2018-01-22T20:33:46.06] binder> transport: >> com.sun.tools.jdi.SunCommandLineLauncher$2 at 457e2f02 >> [2018-01-22T20:33:46.19] binder> Connector arguments: >> [2018-01-22T20:33:46.19] binder> >> home=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10 >> [2018-01-22T20:33:46.19] binder> vmexec=java [2018-01-22T20:33:46.19] >> binder> options=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.20] >> binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" >> "-arch=linux-amd64" "-waittime=5" "-debugee.vmkind=java" >> "-transport.address=dynamic" >> "-debugee.vmkeys=-XX:MaxRAMPercentage=12.5" "-pipe.port=28038" >> [2018-01-22T20:33:46.20] binder> quote=" [2018-01-22T20:33:46.20] >> binder> suspend=true [2018-01-22T20:33:46.20] binder> Launching >> debugee [2018-01-22T20:33:46.56] binder> Waiting for VM initialized >> [2018-01-22T20:33:46.60] Initial VMStartEvent received: VMStartEvent >> in thread main [2018-01-22T20:33:46.61] EventHandler> Adding listener >> nsk.share.jdi.EventHandler$1 at 1e7c7811 [2018-01-22T20:33:46.61] >> EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 1a3869f4 >> [2018-01-22T20:33:46.61] EventHandler> Adding listener >> nsk.share.jdi.EventHandler$3 at 77f99a05 [2018-01-22T20:33:46.61] >> EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 >> [2018-01-22T20:33:46.61] EventHandler> Adding listener >> nsk.share.jdi.EventHandler$5 at 4d3167f4 [2018-01-22T20:33:46.62] >> EventHandler> waitForRequestedEvent: enabling remove of listener >> nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.62] >> EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 4eb7f003 >> [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: >> vm.resume called [2018-01-22T20:33:46.67] EventHandler> Received >> event set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.68] 
>> EventHandler> Event: ClassPrepareEventImpl req class prepare request >> (enabled) [2018-01-22T20:33:46.69] EventHandler> >> waitForRequestedEvent: Received event(ClassPrepareEvent in thread >> main) for request(class prepare request (enabled)) >> [2018-01-22T20:33:46.69] EventHandler> Removing listener >> nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.69] >> debugger> Received ClassPrepareEvent for debuggee class: >> nsk.jdi.EventSet.resume.resume008a [2018-01-22T20:33:46.71] binder> >> Breakpoint set: [2018-01-22T20:33:46.71] breakpoint request >> nsk.jdi.EventSet.resume.resume008a:60 (disabled) >> [2018-01-22T20:33:46.71] EventHandler> Adding listener >> nsk.share.jdi.TestDebuggerType1$1 at 43738a82 [2018-01-22T20:33:46.71] >> debugger> TESTING BEGINS [2018-01-22T20:33:46.71] debugger> RESUME >> DEBUGGEE VM [2018-01-22T20:33:46.72] debugger> >> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.72] debugger> >> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. >> [2018-01-22T20:33:46.84] EventHandler> Received event set with policy >> = SUSPEND_ALL [2018-01-22T20:33:46.84] EventHandler> Event: >> BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:60 (enabled) >> [2018-01-22T20:33:46.84] debugger> Received communication breakpoint >> event. [2018-01-22T20:33:46.84] debugger> shouldRunAfterBreakpoint: >> received breakpoint event. [2018-01-22T20:33:46.84] debugee.stderr> >> **> debuggee: debuggee started! [2018-01-22T20:33:46.85] debugger> >> shouldRunAfterBreakpoint: exited with true. 
[2018-01-22T20:33:46.85] >> debugger> :::::: case: # 0 [2018-01-22T20:33:46.85] debugger> >> ......waiting for new ThreadStartEvent : 0 [2018-01-22T20:33:46.85] >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.85] >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 6ec8211c >> [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: >> vm.resume called [2018-01-22T20:33:46.86] debugee.stderr> **> >> debuggee: 'run': enter :: threadName == thread0 >> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: >> threadName == thread0 [2018-01-22T20:33:46.86] EventHandler> Received >> event set with policy = SUSPEND_NONE [2018-01-22T20:33:46.86] >> EventHandler> waitForRequestedEventSet: Received event set for >> request: thread start request (enabled) [2018-01-22T20:33:46.86] >> EventHandler> Event: ThreadStartEventImpl req thread start request >> (enabled) [2018-01-22T20:33:46.86] EventHandler> Removing listener >> nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.86] >> debugger> got new ThreadStartEvent with propety 'number' == >> ThreadStartRequest1 [2018-01-22T20:33:46.86] debugger> ......checking >> up on EventSet.resume() [2018-01-22T20:33:46.86] debugger> ......--> >> vm.suspend(); [2018-01-22T20:33:46.87] debugger> getting : >> Map suspendsCounts1 [2018-01-22T20:33:46.87] >> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal >> Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.87] debugger> >> eventSet.resume; [2018-01-22T20:33:46.87] debugger> getting : >> Map suspendsCounts2 [2018-01-22T20:33:46.87] >> EventHandler> Received event set with policy = SUSPEND_ALL >> [2018-01-22T20:33:46.87] EventHandler> Event: BreakpointEventImpl req >> breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled) >> [2018-01-22T20:33:46.87] debugger> Received communication breakpoint >> event. 
[2018-01-22T20:33:46.87] debugger> {Reference Handler=2, >> Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2} >> [2018-01-22T20:33:46.87] debugger> getting : int policy = >> eventSet.suspendPolicy(); [2018-01-22T20:33:46.87] debugger> case >> SUSPEND_NONE [2018-01-22T20:33:46.87] debugger> checking Reference >> Handler [2018-01-22T20:33:46.87] # ERROR: debugger> ERROR: >> suspendCounts don't match for : Reference Handler >> [2018-01-22T20:33:46.88] The following stacktrace is for Aurora. Used >> to create a RULE: [2018-01-22T20:33:46.88] nsk.share.TestFailure: >> debugger> ERROR: suspendCounts don't match for : Reference Handler >> [2018-01-22T20:33:46.88] at >> nsk.share.Log.logExceptionForAurora(Log.java:411) >> [2018-01-22T20:33:46.88] at nsk.share.Log.complain(Log.java:380) >> [2018-01-22T20:33:46.88] at >> nsk.share.jdi.TestDebuggerType1.complain(TestDebuggerType1.java:63) >> [2018-01-22T20:33:46.88] at >> nsk.jdi.EventSet.resume.resume008.testRun(resume008.java:163) >> [2018-01-22T20:33:46.88] at >> nsk.share.jdi.TestDebuggerType1.runThis(TestDebuggerType1.java:104) >> [2018-01-22T20:33:46.88] at >> nsk.jdi.EventSet.resume.resume008.run(resume008.java:62) >> [2018-01-22T20:33:46.88] at >> nsk.jdi.EventSet.resume.resume008.main(resume008.java:57) >> [2018-01-22T20:33:46.88] # ERROR: debugger> before resuming : 1 >> [2018-01-22T20:33:46.88] # ERROR: debugger> after resuming : 2 >> [2018-01-22T20:33:46.88] debugger> ......--> vm.resume() >> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: entered >> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: received >> breakpoint event. [2018-01-22T20:33:46.88] debugger> >> shouldRunAfterBreakpoint: exited with true. 
[2018-01-22T20:33:46.88] >> debugger> :::::: case: # 1 [2018-01-22T20:33:46.88] debugger> >> ......waiting for new ThreadStartEvent : 1 [2018-01-22T20:33:46.88] >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 548ad73b >> [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: >> vm.resume called [2018-01-22T20:33:46.88] EventHandler> Received >> event set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.88] >> EventHandler> waitForRequestedEventSet: Received event set for >> request: thread start request (enabled) [2018-01-22T20:33:46.88] >> EventHandler> Event: ThreadStartEventImpl req thread start request >> (enabled) [2018-01-22T20:33:46.88] EventHandler> Removing listener >> nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] >> debugger> got new ThreadStartEvent with propety 'number' == >> ThreadStartRequest2 [2018-01-22T20:33:46.88] debugger> ......checking >> up on EventSet.resume() [2018-01-22T20:33:46.88] debugger> ......--> >> vm.suspend(); [2018-01-22T20:33:46.88] debugger> getting : >> Map suspendsCounts1 [2018-01-22T20:33:46.89] >> debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, >> Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.89] debugger> >> eventSet.resume; [2018-01-22T20:33:46.89] debugger> getting : >> Map suspendsCounts2 [2018-01-22T20:33:46.89] >> debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, >> Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.89] debugger> >> getting : int policy = eventSet.suspendPolicy(); >> [2018-01-22T20:33:46.89] debugger> case SUSPEND_THREAD >> [2018-01-22T20:33:46.89] debugger> checking Reference Handler >> [2018-01-22T20:33:46.89] debugger> checking thread1 >> [2018-01-22T20:33:46.89] debugger> checking Common-Cleaner >> [2018-01-22T20:33:46.89] debugger> checking main >> 
[2018-01-22T20:33:46.90] debugger> checking Signal Dispatcher >> [2018-01-22T20:33:46.90] debugger> checking Finalizer >> [2018-01-22T20:33:46.90] debugger> ......--> vm.resume() >> [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: entered >> [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: waiting >> for breakpoint event during 1 sec. [2018-01-22T20:33:46.90] >> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 >> [2018-01-22T20:33:46.90] debugee.stderr> **> debuggee: 'run': exit :: >> threadName == thread1 [2018-01-22T20:33:46.90] EventHandler> Received >> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] >> EventHandler> Event: BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:60 (enabled) >> [2018-01-22T20:33:46.90] debugger> Received communication breakpoint >> event. [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: >> received breakpoint event. [2018-01-22T20:33:46.90] debugger> >> shouldRunAfterBreakpoint: exited with true. 
[2018-01-22T20:33:46.90] >> debugger> :::::: case: # 2 [2018-01-22T20:33:46.90] debugger> >> ......waiting for new ThreadStartEvent : 2 [2018-01-22T20:33:46.90] >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 2641e737 >> [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: >> vm.resume called [2018-01-22T20:33:46.90] EventHandler> Received >> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] >> EventHandler> waitForRequestedEventSet: Received event set for >> request: thread start request (enabled) [2018-01-22T20:33:46.90] >> EventHandler> Event: ThreadStartEventImpl req thread start request >> (enabled) [2018-01-22T20:33:46.90] EventHandler> Removing listener >> nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] >> debugger> got new ThreadStartEvent with propety 'number' == >> ThreadStartRequest3 [2018-01-22T20:33:46.90] debugger> ......checking >> up on EventSet.resume() [2018-01-22T20:33:46.90] debugger> ......--> >> vm.suspend(); [2018-01-22T20:33:46.90] debugger> getting : >> Map suspendsCounts1 [2018-01-22T20:33:46.91] >> debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, >> Signal Dispatcher=2, Finalizer=2} [2018-01-22T20:33:46.91] debugger> >> eventSet.resume; [2018-01-22T20:33:46.91] debugger> getting : >> Map suspendsCounts2 [2018-01-22T20:33:46.91] >> debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, >> Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.91] debugger> >> getting : int policy = eventSet.suspendPolicy(); >> [2018-01-22T20:33:46.91] debugger> case SUSPEND_ALL >> [2018-01-22T20:33:46.91] debugger> checking Reference Handler >> [2018-01-22T20:33:46.91] debugger> checking thread2 >> [2018-01-22T20:33:46.91] debugger> checking Common-Cleaner >> [2018-01-22T20:33:46.91] debugger> checking main >> 
[2018-01-22T20:33:46.91] debugger> checking Signal Dispatcher >> [2018-01-22T20:33:46.91] debugger> checking Finalizer >> [2018-01-22T20:33:46.91] debugger> ......--> vm.resume() >> [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: entered >> [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: waiting >> for breakpoint event during 1 sec. [2018-01-22T20:33:46.91] >> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 >> [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: 'run': exit :: >> threadName == thread2 [2018-01-22T20:33:46.91] EventHandler> Received >> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.91] >> EventHandler> Event: BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:60 (enabled) >> [2018-01-22T20:33:46.91] debugger> Received communication breakpoint >> event. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: >> received breakpoint event. [2018-01-22T20:33:46.91] debugger> >> shouldRunAfterBreakpoint: received instruction from debuggee to >> finish. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: >> exited with false. [2018-01-22T20:33:46.91] debugger> TESTING ENDS >> [2018-01-22T20:33:46.91] debugger> Waiting for debuggee's exit... 
>> [2018-01-22T20:33:46.91] EventHandler> waitForVMDisconnect >> [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: debuggee exits >> [2018-01-22T20:33:46.92] EventHandler> Received event set with policy >> = SUSPEND_NONE [2018-01-22T20:33:46.92] EventHandler> Event: >> VMDeathEventImpl req null [2018-01-22T20:33:46.92] EventHandler> >> receieved VMDeath [2018-01-22T20:33:46.92] EventHandler> Removing >> listener nsk.share.jdi.EventHandler$3 at 77f99a05 >> [2018-01-22T20:33:47.25] EventHandler> Received event set with policy >> = SUSPEND_NONE [2018-01-22T20:33:47.25] EventHandler> Event: >> VMDisconnectEventImpl req null [2018-01-22T20:33:47.25] EventHandler> >> receieved VMDisconnect [2018-01-22T20:33:47.25] EventHandler> >> Removing listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 >> [2018-01-22T20:33:47.25] EventHandler> finished >> [2018-01-22T20:33:47.25] EventHandler> waitForVMDisconnect: done >> [2018-01-22T20:33:47.25] debugger> Event handler thread exited. >> [2018-01-22T20:33:47.25] debugger> Debuggee PASSED. 
>> [2018-01-22T20:33:47.26] [2018-01-22T20:33:47.26] >> [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] #> SUMMARY: >> Following errors occured [2018-01-22T20:33:47.26] #> during test >> execution: [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] # >> ERROR: debugger> ERROR: suspendCounts don't match for : Reference >> Handler [2018-01-22T20:33:47.26] # ERROR: debugger> before resuming : >> 1 [2018-01-22T20:33:47.26] # ERROR: debugger> after resuming : 2 >> [2018-01-22T20:33:47.27] # Test level exit status: 97 >> >> >> Here's a recent passed log from a local run: >> >> ----------System.out:(164/9808)---------- >> run [nsk.jdi.EventSet.resume.resume008, -verbose, -arch=linux-x64, >> -waittime=5, -debugee.vmkind=java, -transport.address=dynamic, >> -debugee.vmkeys=-XX:MaxRAMPercentage=2 ] >> binder> VirtualMachineManager: version 11.0 >> binder> Finding connector: default >> binder> LaunchingConnector: >> binder> name: com.sun.jdi.CommandLineLaunch >> binder> description: Launches target using Sun Java VM command >> line and attaches to it >> binder> transport: >> com.sun.tools.jdi.SunCommandLineLauncher$2 at 749dec1a >> binder> Connector arguments: >> binder> home=/export/users/gradams/ws/jdk-jdk/build/linux-x64/images/jdk >> binder> vmexec=java >> binder> options=-XX:MaxRAMPercentage=2 >> binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" >> "-arch=linux-x64" "-waittime=5" "-debugee.vmkind=java" >> "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=2 >> " "-pipe.port=35940" >> binder> quote=" >> binder> suspend=true >> binder> Launching debugee >> binder> Waiting for VM initialized >> Initial VMStartEvent received: VMStartEvent in thread main >> EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 2ab41d39 >> EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 2e3cb1e2 >> EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 57f20df9 >> EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 6e72e291 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 5889e23e >> EventHandler> waitForRequestedEvent: enabling remove of listener >> nsk.share.jdi.EventHandler$6 at 46dcda7f >> EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 46dcda7f >> EventHandler> waitForRequestedEvent: vm.resume called >> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD >> EventHandler> Event: ClassPrepareEventImpl req class prepare request >> (enabled) >> EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent >> in thread main) for request(class prepare request (enabled)) >> EventHandler> Removing listener nsk.share.jdi.EventHandler$6 at 46dcda7f >> debugger> Received ClassPrepareEvent for debuggee class: >> nsk.jdi.EventSet.resume.resume008a >> binder> Breakpoint set: >> breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (disabled) >> EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 322c2a05 >> debugger> TESTING BEGINS >> debugger> RESUME DEBUGGEE VM >> debugger> shouldRunAfterBreakpoint: entered >> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event >> during 1 sec. >> >> debugee.stderr> **> debuggee: debuggee started! >> EventHandler> Received event set with policy = SUSPEND_ALL >> EventHandler> Event: BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:74 (enabled) >> debugger> Received communication breakpoint event. >> >> debugger> shouldRunAfterBreakpoint: received breakpoint event. >> debugger> shouldRunAfterBreakpoint: exited with true. 
>> debugger> :::::: case: # 0 >> debugger> ......waiting for new ThreadStartEvent : 0 >> >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 78aa490d >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 78aa490d >> EventHandler> waitForRequestedEventSet: vm.resume called >> EventHandler> Received event set with policy = SUSPEND_NONE >> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0 >> EventHandler> waitForRequestedEventSet: Received event set for >> request: thread start request (enabled) >> EventHandler> Event: ThreadStartEventImpl req thread start request >> (enabled) >> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 78aa490d >> EventHandler> Received event set with policy = SUSPEND_ALL >> EventHandler> Event: BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:74 (enabled) >> debugger> Received communication breakpoint event. >> >> debugger> got new ThreadStartEvent with propety 'number' == >> ThreadStartRequest1 >> debugger> ......checking up on EventSet.resume() >> debugger> ......--> vm.suspend(); >> debugger> getting : Map suspendsCounts1 >> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, >> Signal Dispatcher=2, Finalizer=2} >> debugger> eventSet.resume; >> debugger> getting : Map suspendsCounts2 >> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, >> Signal Dispatcher=2, Finalizer=2} >> debugger> getting : int policy = eventSet.suspendPolicy(); >> debugger> case SUSPEND_NONE >> debugger> checking Reference Handler >> debugger> checking thread0 >> debugger> checking Common-Cleaner >> debugger> checking main >> debugger> checking Signal Dispatcher >> debugger> checking Finalizer >> debugger> ......--> vm.resume() >> debugger> shouldRunAfterBreakpoint: entered >> debugger> shouldRunAfterBreakpoint: received breakpoint event. >> debugger> shouldRunAfterBreakpoint: exited with true. 
>> debugger> :::::: case: # 1 >> debugger> ......waiting for new ThreadStartEvent : 1 >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 616bc3ae >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae >> EventHandler> waitForRequestedEventSet: vm.resume called >> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD >> debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0 >> EventHandler> waitForRequestedEventSet: Received event set for >> request: thread start request (enabled) >> EventHandler> Event: ThreadStartEventImpl req thread start request >> (enabled) >> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 616bc3ae >> debugger> got new ThreadStartEvent with propety 'number' == >> ThreadStartRequest2 >> debugger> ......checking up on EventSet.resume() >> debugger> ......--> vm.suspend(); >> debugger> getting : Map suspendsCounts1 >> debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, >> Signal Dispatcher=1, Finalizer=1} >> debugger> eventSet.resume; >> debugger> getting : Map suspendsCounts2 >> debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, >> Signal Dispatcher=1, Finalizer=1} >> debugger> getting : int policy = eventSet.suspendPolicy(); >> debugger> case SUSPEND_THREAD >> debugger> checking Reference Handler >> debugger> checking thread1 >> debugger> checking Common-Cleaner >> debugger> checking main >> debugger> checking Signal Dispatcher >> debugger> checking Finalizer >> debugger> ......--> vm.resume() >> debugger> shouldRunAfterBreakpoint: entered >> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event >> during 1 sec. 
>> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 >> debugee.stderr> **> debuggee: 'run': exit :: threadName == thread1 >> EventHandler> Received event set with policy = SUSPEND_ALL >> EventHandler> Event: BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:74 (enabled) >> debugger> Received communication breakpoint event. >> debugger> shouldRunAfterBreakpoint: received breakpoint event. >> debugger> shouldRunAfterBreakpoint: exited with true. >> debugger> :::::: case: # 2 >> debugger> ......waiting for new ThreadStartEvent : 2 >> EventHandler> waitForRequestedEventSet: enabling remove of listener >> nsk.share.jdi.EventHandler$7 at 44e265ef >> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 44e265ef >> EventHandler> waitForRequestedEventSet: vm.resume called >> EventHandler> Received event set with policy = SUSPEND_ALL >> EventHandler> waitForRequestedEventSet: Received event set for >> request: thread start request (enabled) >> EventHandler> Event: ThreadStartEventImpl req thread start request >> (enabled) >> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 44e265ef >> debugger> got new ThreadStartEvent with propety 'number' == >> ThreadStartRequest3 >> debugger> ......checking up on EventSet.resume() >> debugger> ......--> vm.suspend(); >> debugger> getting : Map suspendsCounts1 >> debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, >> Signal Dispatcher=2, Finalizer=2} >> debugger> eventSet.resume; >> debugger> getting : Map suspendsCounts2 >> debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, >> Signal Dispatcher=1, Finalizer=1} >> debugger> getting : int policy = eventSet.suspendPolicy(); >> debugger> case SUSPEND_ALL >> debugger> checking Reference Handler >> debugger> checking thread2 >> debugger> checking Common-Cleaner >> debugger> checking main >> debugger> checking Signal Dispatcher >> debugger> checking Finalizer >> debugger> ......--> 
vm.resume() >> debugger> shouldRunAfterBreakpoint: entered >> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event >> during 1 sec. >> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 >> debugee.stderr> **> debuggee: 'run': exit :: threadName == thread2 >> EventHandler> Received event set with policy = SUSPEND_ALL >> EventHandler> Event: BreakpointEventImpl req breakpoint request >> nsk.jdi.EventSet.resume.resume008a:74 (enabled) >> debugger> Received communication breakpoint event. >> debugger> shouldRunAfterBreakpoint: received breakpoint event. >> debugger> shouldRunAfterBreakpoint: received instruction from >> debuggee to finish. >> debugger> shouldRunAfterBreakpoint: exited with false. >> debugger> TESTING ENDS >> debugger> Waiting for debuggee's exit... >> debugee.stderr> **> debuggee: debuggee exits >> EventHandler> waitForVMDisconnect >> EventHandler> Received event set with policy = SUSPEND_NONE >> EventHandler> Event: VMDeathEventImpl req null >> EventHandler> receieved VMDeath >> EventHandler> Removing listener nsk.share.jdi.EventHandler$3 at 57f20df9 >> EventHandler> Received event set with policy = SUSPEND_NONE >> EventHandler> Event: VMDisconnectEventImpl req null >> EventHandler> receieved VMDisconnect >> EventHandler> Removing listener nsk.share.jdi.EventHandler$4 at 6e72e291 >> EventHandler> finished >> EventHandler> waitForVMDisconnect: done >> debugger> Event handler thread exited. >> debugger> Debuggee PASSED. >> >> On 7/18/18, 6:09 PM, gary.adams at oracle.com wrote: >>> On 7/18/18 4:47 PM, Chris Plummer wrote: >>>> Hi Gary >>>> >>>> Ok, so shouldRunAfterBreakpoint() is the code that does the >>>> eventHandler.wait(), so it gets the eventHandler.notifyAll() >>>> notification from the BreakpointEvent handler. 
>>>>
>>>> And as a side note, I see now that resumption of execution after
>>>> the breakpoint at main() is done by:
>>>>
>>>>     // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
>>>>     display("RESUME DEBUGGEE VM");
>>>>     vm.resume();
>>>>
>>>>     testRun();
>>>>
>>>> shouldRunAfterBreakpoint() is returning true until the end of the
>>>> test when the debuggee executes "instruction = end". That's why
>>>> runTests() does a "break" when shouldRunAfterBreakpoint() returns
>>>> false. So this means the code that is checking
>>>> shouldRunAfterBreakpoint() is not resuming execution for the first
>>>> few (probably 3) methodForCommunication() breakpoints. However, it
>>>> does make sure that runTests() blocks until the BreakpointEvent has
>>>> been processed.
>>>>
>>>> You point out the vm.resume() at the bottom of the loop in
>>>> runTests(), but that's only after a bunch of ThreadStartEvent
>>>> processing above it has been done already. The ThreadStartEvent
>>>> would never get generated if there was not a resume at some point
>>>> earlier. I think it is happening during the
>>>> eventHandler.waitForRequestedEventSet() call, which does a
>>>> vm.resume().
>>>>
>>>> So if I understand the order of things now:
>>>>
>>>> -shouldRunAfterBreakpoint() returns after first
>>>> methodForCommunication() is hit. At this point we know the first
>>>> thread has been created, but no attempt to start it yet. The
>>>> debuggee is suspended at this point.
>>>> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also
>>>> does a vm.resume().
>>>> -The debuggee starts the thread and then does another
>>>> methodForCommunication() (this 2nd one is actually after the 2nd
>>>> thread has been created, but not yet started). Now we have a race.
>>>> Do we get the ThreadStartEvent first or the BreakpointEvent? This
>>>> is because when the ThreadStartEvent is generated, the thread is
>>>> not suspended due to SUSPEND_NONE. Even if the ThreadStartEvent
>>>> comes in first, the async handling of the BreakpointEvent can cause
>>>> problems during the ThreadStartEvent processing.
>>> Based on the failed log in the bug report, the thread start event is observed,
>>> the suspend counts acquired, then after the resume, the breakpoint message
>>> is displayed and the second set of suspend counts acquired.
>>>
>>> I can show you the passed and failed logs tomorrow.
>>>> -You added a 100ms delay after the thread has started, but before
>>>> methodForCommunication(), hoping it will make it so the
>>>> ThreadStartEvent can be received and fully processed before the
>>>> BreakpointEvent is.
>>> The delay is mostly just a yield so the debugger gets a chance to run.
>>>>
>>>> I think it would be preferable to fix this by doing better
>>>> synchronization. After all, that is the approach the test originally
>>>> took. It could have been written with a bunch of sleep() delays
>>>> instead, but that in general is not a very good approach.
>>>>
>>>> What if you added a shouldRunAfterBreakpoint() call after
>>>> the ThreadStartEvent arrives? At this point you would know that the
>>>> vm is suspended due to the breakpoint, so no need for:
>>>>
>>>>     display("......checking up on EventSet.resume()");
>>>>     display("......--> vm.suspend();");
>>>>     vm.suspend();
>>> I think the suspend is intentional to capture the suspend counts.
>>> It also needs to resume the vm and acquire again so it can confirm the correct
>>> suspend count behaviors.
>>> If the test waits to capture the second set of suspend counts, the breakpoint
>>> causes incorrect values.
>>>
>>> ...
>>>>
>>>> You might then also need to add another methodForCommunication()
>>>> call at the end of case 0 and 1 in the debuggee, although I think
>>>> you could instead just change the shouldRunAfterBreakpoint() at the
>>>> start of the loop.
I think that check actually belongs at the end >>>> of the loop, and only for case 2. In fact it would be an error if >>>> shouldRunAfterBreakpoint() did not return true in that case. Then >>>> you also need to add a shouldRunAfterBreakpoint() at the start of >>>> case 0 to get things rolling (and I think at the start of case 1 >>>> also). >>>> >>>> Chris >>>> >>>> >>>> On 7/18/18 12:45 PM, Gary Adams wrote: >>>>> Answers below ... >>>>> >>>>> On 7/18/18, 2:50 PM, Chris Plummer wrote: >>>>>> Hi Gary, >>>>>> >>>>>> Who does the resume for the breakpoint event? >>>>>> >>>>>> eventHandler.addListener( >>>>>> new EventHandler.EventListener() { >>>>>> public boolean eventReceived(Event event) { >>>>>> if (event instanceof BreakpointEvent && >>>>>> bpRequest.equals(event.request())) { >>>>>> synchronized(eventHandler) { >>>>>> display("Received communication >>>>>> breakpoint event."); >>>>>> bpCount++; >>>>>> eventHandler.notifyAll(); >>>>>> } >>>>>> return true; >>>>>> } >>>>>> return false; >>>>>> } >>>>>> } >>>>>> ); >>>>> I believe you are looking for this sequence. >>>>> At the top of the loop a check is made if >>>>> resume() should be called "shouldRunAfterBreakpoint". >>>>> lines 96-99 is an early termination. And at the >>>>> bottom of the loop, line 240, is the normal >>>>> continue the test to the next case. >>>>> >>>>> resume008.java : >>>>> ... >>>>> 94 for (int i = 0; ; i++) { >>>>> 95 >>>>> >>>>> 96 if (!shouldRunAfterBreakpoint()) { >>>>> 97 vm.resume(); >>>>> 98 break; >>>>> 99 } >>>>> >>>>> 100 >>>>> 101 >>>>> 102 display(":::::: case: # " + i); >>>>> 103 >>>>> 104 switch (i) { >>>>> 105 >>>>> 106 case 0: >>>>> 107 eventRequest = settingThreadStartRequest ( >>>>> 108 SUSPEND_NONE, "ThreadStartRequest1"); >>>>> ... >>>>> 238 >>>>> 239 display("......--> vm.resume()"); >>>>> 240 vm.resume(); >>>>> 241 } >>>>>> >>>>>> Also: >>>>>> >>>>>>> 1. 
On a thread start event the debugee is suspended, line 141 >>>>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE >>>>>> was used. >>>>> The thread start event is set to SUSPEND_NONE for thread0, but when >>>>> the thread start event is observed the resume008 test suspends the vm >>>>> immediately after fetching the "number" property. >>>> My point is that the Debuggee continues to run after the >>>> ThreadStartEvent is sent, and relies on the debugger to stop it >>>> after receiving the event. But in the meantime the debuggee has >>>> advanced to the next breakpoint, but only sometimes, thus the bug >>>> you are seeing. >>>>> >>>>> 132 if ( !(newEvent instanceof ThreadStartEvent)) { >>>>> 133 setFailedStatus("ERROR: new event is not >>>>> ThreadStartEvent"); >>>>> 134 } else { >>>>> 135 >>>>> 136 String property = (String) >>>>> newEvent.request().getProperty("number"); >>>>> 137 display(" got new ThreadStartEvent >>>>> with propety 'number' == " + property); >>>>> 138 >>>>> 139 display("......checking up on >>>>> EventSet.resume()"); >>>>> 140 display("......--> vm.suspend();"); >>>>> 141 vm.suspend(); >>>>> >>>>> >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/18/18 4:52 AM, Gary Adams wrote: >>>>>>> There is nothing wrong with the breakpoint in >>>>>>> methodForCommunication. >>>>>>> The test uses it to make sure the threads are each tested >>>>>>> separately. >>>>>>> The breakpoint eventhandler just displays a message, increments >>>>>>> a counter >>>>>>> and returns. >>>>>>> >>>>>>> Let me step through resume008a the debugee to help clarify ... >>>>>>> >>>>>>> 1. The test thread is created and the synchronized break point >>>>>>> is observed. lines 101-102 >>>>>>> 2. The thread is started. lines 104,135-137 >>>>>>> 2a. The main thread blocks on a local object. lines 133, 139 >>>>>>> 2b. The test thread is started. lines 137, >>>>>>> A run entered message is displayed, line 159 >>>>>>> The main thread lock object is notified, line 167 >>>>>>> 2b1. 
The main thread continues. line 167, 146 >>>>>>> The next test thread is created. line 106 >>>>>>> The synchronized breakpoint is observed, line 107 >>>>>>> 2b2. A run exited message is displayed, line 169 >>>>>>> >>>>>>> On the resume008 debugger side ... >>>>>>> 1. On a thread start event the debugee is suspended, line 141 >>>>>>> 2. Messages are displayed and a first set of thread suspend >>>>>>> counts is acquired. lines 143-151 >>>>>>> 3. The threads are resumed, line 152 >>>>>>> ---> >>>>>>> 4. Messages are displayed and a second set of thread suspend >>>>>>> counts is acquired. lines 154-159 >>>>>>> >>>>>>> The way the test is written the expectation is the debugger >>>>>>> steps 2,3,4 will all happen >>>>>>> while the test thread is running. >>>>>>> >>>>>>> When the debugger resumes the debuggee threads (debugger step 3) >>>>>>> the debuggee continues from where it left off (debuggee steps >>>>>>> 2b,2b1,2b2) >>>>>>> >>>>>>> If we complete debuggee step 2b1 (line 107) before the debugger >>>>>>> completes step 4 line 159, >>>>>>> then the synchronized breakpoint will suspend the vm and the >>>>>>> counts will not match >>>>>>> for the SUSPEND_NONE test thread start. >>>>>>> >>>>>>> resume008a.java: >>>>>>> >>>>>>> 100 case 0: >>>>>>> 101 thread0 = new >>>>>>> Threadresume008a("thread0"); >>>>>>> 102 methodForCommunication(); >>>>>>> 103 >>>>>>> 104 threadStart(thread0); >>>>>>> 105 >>>>>>> 106 thread1 = new >>>>>>> Threadresume008a("thread1"); >>>>>>> 107 methodForCommunication(); >>>>>>> 108 break; >>>>>>> >>>>>>> ... 
>>>>>>> 135 static int threadStart(Thread t) { >>>>>>> 136 synchronized (waitnotifyObj) { >>>>>>> 137 t.start(); >>>>>>> 138 try { >>>>>>> 139 waitnotifyObj.wait(); >>>>>>> 140 } catch ( Exception e) { >>>>>>> 141 exitCode = FAILED; >>>>>>> 142 logErr(" Exception : " + e ); >>>>>>> 143 return FAILED; >>>>>>> 144 } >>>>>>> 145 } >>>>>>> 146 return PASSED; >>>>>>> 147 } >>>>>>> >>>>>>> 149 static class Threadresume008a extends Thread { >>>>>>> ... >>>>>>> 157 >>>>>>> 158 public void run() { >>>>>>> 159 log1(" 'run': enter :: threadName == " + >>>>>>> tName); >>>>>>> >>>>>>> This is the proposed fix that will let the debugger complete >>>>>>> it's second >>>>>>> acquisition of suspend counts while the test thread is still >>>>>>> running. >>>>>>> >>>>>>> 160 // Yield, so the start thread event >>>>>>> processing can be completed. >>>>>>> 161 try { >>>>>>> 162 Thread.sleep(100); >>>>>>> 163 } catch (InterruptedException e) { >>>>>>> 164 // ignored >>>>>>> 165 } >>>>>>> >>>>>>> 166 synchronized (waitnotifyObj) { >>>>>>> 167 waitnotifyObj.notify(); >>>>>>> 168 } >>>>>>> 169 log1(" 'run': exit :: threadName == " + >>>>>>> tName); >>>>>>> 170 return; >>>>>>> 171 } >>>>>>> 172 } >>>>>>> 150 >>>>>>> 151 String tName = null; >>>>>>> 152 >>>>>>> 153 public Threadresume008a(String threadName) { >>>>>>> 154 super(threadName); >>>>>>> 155 tName = threadName; >>>>>>> 156 } >>>>>>> 157 >>>>>>> 158 public void run() { >>>>>>> 159 log1(" 'run': enter :: threadName == " + >>>>>>> tName); >>>>>>> 160 // Yield, so the start thread event >>>>>>> processing can be completed. 
>>>>>>> 161 try { >>>>>>> 162 Thread.sleep(100); >>>>>>> 163 } catch (InterruptedException e) { >>>>>>> 164 // ignored >>>>>>> 165 } >>>>>>> 166 synchronized (waitnotifyObj) { >>>>>>> 167 waitnotifyObj.notify(); >>>>>>> 168 } >>>>>>> 169 log1(" 'run': exit :: threadName == " + >>>>>>> tName); >>>>>>> 170 return; >>>>>>> 171 } >>>>>>> 172 } >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote: >>>>>>>> Hi Gary, >>>>>>>> >>>>>>>> I've been having trouble following the control flow of this >>>>>>>> test. One thing I've stumbled across is the following: >>>>>>>> >>>>>>>> /* A debuggee class must define >>>>>>>> 'methodForCommunication' >>>>>>>> * method and invoke it in points of synchronization >>>>>>>> * with a debugger. >>>>>>>> */ >>>>>>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication"); >>>>>>>> >>>>>>>> >>>>>>>> So why isn't this mode of synchronization good enough? Is it >>>>>>>> because it was not designed with the understanding that the >>>>>>>> debugger might be doing suspended thread counts, and suspending >>>>>>>> all threads at the breakpoint messes up the test? >>>>>>>> >>>>>>>> From what I can tell of the test, after the debuggee is started >>>>>>>> and hits the default breakpoint at the start of main(), the >>>>>>>> debugger then does a vm.resume() at the start of the for loop >>>>>>>> in the runTest() method. The debuggee then creates a thread and >>>>>>>> calls methodForCommunication(). There is already a breakpoint >>>>>>>> set there by the above debuggee code. It's unclear to me what >>>>>>>> happens as a result of this breakpoint and how it serves the >>>>>>>> test. Also unclear to me who is responsible for the vm.resume() >>>>>>>> after the breakpoint is hit. >>>>>>>> >>>>>>>> The debugger then requests all ThreadStart events, requesting >>>>>>>> that no threads be disabled when it is sent. 
I think you are >>>>>>>> saying that when the ThreadStart event comes in, sometimes we >>>>>>>> are at the methodForCommunication breakpoint, with all threads >>>>>>>> disabled, and this messes up the thread suspend counts. You >>>>>>>> want to delay 100ms so the breakpoint event can be processed >>>>>>>> and threads resumed again (although I can't see who actually >>>>>>>> resumes the thread after hitting the methodForCommunication >>>>>>>> breakpoint). >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 7/17/18 8:33 AM, Gary Adams wrote: >>>>>>>>> A race condition exists between the debugger and the debuggee. >>>>>>>>> >>>>>>>>> The first test thread is started with SUSPEND_NONE policy set. >>>>>>>>> While processing the thread start event the debugger captures >>>>>>>>> an initial set of thread suspend counts and resumes the >>>>>>>>> debuggee vm. If the debuggee advances quickly it reaches >>>>>>>>> the breakpoint set for methodForCommunication. Since the >>>>>>>>> breakpoint >>>>>>>>> carries with it SUSPEND_ALL policy, when the debugger captures >>>>>>>>> a second >>>>>>>>> set of suspend counts, it will not match the expected counts for >>>>>>>>> a SUSPEND_NONE scenario. >>>>>>>>> >>>>>>>>> The proposed fix introduces a yield in the debuggee test >>>>>>>>> thread run method >>>>>>>>> to allow the debugger to get the expected sampled values. >>>>>>>>> >>>>>>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8170089 >>>>>>>>> Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/ >>>>>>>>> >>>>>>>>> >>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: >>>>>>>>> >>>>>>>>> ... 
>>>>>>>>> 186 private void >>>>>>>>> setCommunicationBreakpoint(ReferenceType refType, String >>>>>>>>> methodName) { >>>>>>>>> 187 Method method = >>>>>>>>> debuggee.methodByName(refType, methodName); >>>>>>>>> 188 Location location = null; >>>>>>>>> 189 try { >>>>>>>>> 190 location = >>>>>>>>> method.allLineLocations().get(0); >>>>>>>>> 191 } catch (AbsentInformationException e) { >>>>>>>>> 192 throw new Failure(e); >>>>>>>>> 193 } >>>>>>>>> 194 bpRequest = debuggee.makeBreakpoint(location); >>>>>>>>> 195 >>>>>>>>> >>>>>>>>> 196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL); >>>>>>>>> >>>>>>>>> 197 bpRequest.putProperty("number", "zero"); >>>>>>>>> 198 bpRequest.enable(); >>>>>>>>> 199 >>>>>>>>> 200 eventHandler.addListener( >>>>>>>>> 201 new EventHandler.EventListener() { >>>>>>>>> 202 public boolean eventReceived(Event >>>>>>>>> event) { >>>>>>>>> 203 if (event instanceof >>>>>>>>> BreakpointEvent && bpRequest.equals(event.request())) { >>>>>>>>> 204 synchronized(eventHandler) { >>>>>>>>> 205 display("Received communication breakpoint event."); >>>>>>>>> 206 bpCount++; >>>>>>>>> 207 eventHandler.notifyAll(); >>>>>>>>> 208 } >>>>>>>>> 209 return true; >>>>>>>>> 210 } >>>>>>>>> 211 return false; >>>>>>>>> 212 } >>>>>>>>> 213 } >>>>>>>>> 214 ); >>>>>>>>> 215 } >>>>>>>>> >>>>>>>>> >>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: >>>>>>>>> >>>>>>>>> ... 
>>>>>>>>> 140 display("......--> vm.suspend();"); >>>>>>>>> 141 vm.suspend(); >>>>>>>>> 142 >>>>>>>>> 143 display(" getting : >>>>>>>>> Map suspendsCounts1"); >>>>>>>>> 144 >>>>>>>>> 145 Map suspendsCounts1 >>>>>>>>> = new HashMap(); >>>>>>>>> 146 for (ThreadReference threadReference >>>>>>>>> : vm.allThreads()) { >>>>>>>>> 147 suspendsCounts1.put(threadReference.name(), >>>>>>>>> threadReference.suspendCount()); >>>>>>>>> 148 } >>>>>>>>> 149 display(suspendsCounts1.toString()); >>>>>>>>> 150 >>>>>>>>> 151 display(" eventSet.resume;"); >>>>>>>>> 152 eventSet.resume(); >>>>>>>>> 153 >>>>>>>>> 154 display(" getting : >>>>>>>>> Map suspendsCounts2"); >>>>>>>>> >>>>>>>>> This is where the breakpoint is encountered before the second >>>>>>>>> set of suspend counts is acquired. >>>>>>>>> >>>>>>>>> 155 Map suspendsCounts2 >>>>>>>>> = new HashMap(); >>>>>>>>> 156 for (ThreadReference threadReference >>>>>>>>> : vm.allThreads()) { >>>>>>>>> 157 suspendsCounts2.put(threadReference.name(), >>>>>>>>> threadReference.suspendCount()); >>>>>>>>> 158 } >>>>>>>>> 159 display(suspendsCounts2.toString()); >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From chris.plummer at oracle.com Tue Jul 24 19:42:41 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Jul 2018 12:42:41 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> Message-ID: <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Tue Jul 24 20:22:14 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Jul 2018 13:22:14 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Jul 24 20:55:57 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Jul 2018 13:55:57 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> Message-ID: <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com> An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Tue Jul 24 20:46:04 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 13:46:04 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> Message-ID: <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Tue Jul 24 22:00:03 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 15:00:03 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com> Message-ID: An HTML attachment was scrubbed... 
URL: From chris.plummer at oracle.com Tue Jul 24 23:23:48 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Jul 2018 16:23:48 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com> Message-ID: <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Jul 24 23:46:19 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 24 Jul 2018 16:46:19 -0700 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <5B577DDC.3000500@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com> <5B507F2C.4080503@oracle.com> <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com> <5B577DDC.3000500@oracle.com> Message-ID: <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com> Hi Gary, It looks like that should work fine. thanks, Chris On 7/24/18 12:28 PM, Gary Adams wrote: > Here's a quick prototype to add a variable to the debuggee. > The debugger sets it at the end of each completed test case. > > The debuggee can then check for the value change to delay > hitting the breakpoint which interfered with suspend count checks. > > Would need to add a bit more error and timeout checking to > complete the fix. 
Should also check if the other resume008 test cases
> need similar synchronization. Could possibly migrate the code up to
> TestDebuggerType1 if other tests also needed this generic capability.
>
>
> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
> @@ -63,6 +63,9 @@
>   *   to be resulting in the event.
>   * - Upon getting new event, the debugger
>   *   performs the check corresponding to the event.
> + * - The debugger informs the debuggee when it completes
> + *   each test case, so it will wait before hitting
> + *   communication breakpoints.
>   */
>
>  public class resume008 extends TestDebuggerType1 {
> @@ -234,6 +237,7 @@
>
>                       default: throw new Failure("** default case 1 **");
>                  }
> +                informDebuggeeTestCase(i);
>              }
>
>              display("......--> vm.resume()");
> @@ -255,4 +259,25 @@
>          }
>      }
>
> +    /**
> +     * Inform debuggee which thread test the debugger has completed.
> +     * Used for synchronization, so the debuggee does not move too quickly.
> +     * @param testCase index of just completed test
> +     */
> +    void informDebuggeeTestCase(int testCase) {
> +        if (!EventHandler.isDisconnected() && debuggeeClass != null) {
> +            try {
> +                ((ClassType)debuggeeClass)
> +                    .setValue(debuggeeClass.fieldByName("testCase"),
> +                              vm.mirrorOf(testCase));
> +            } catch (InvalidTypeException ite) {
> +                // ignored
> +            } catch (ClassNotLoadedException cnle) {
> +                // ignored
> +            } catch (VMDisconnectedException e) {
> +                // ignored
>  }
> +        }
> +    }
> +
> +}
>
>
> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
> @@ -62,6 +62,7 @@
>
>      static int exitCode = PASSED;
>
> +    static int testCase = -1;
>      static int instruction = 1;
>      static int end         = 0;
>                                     //    static int quit        = 0;
> @@ -104,6 +105,15 @@
>                              threadStart(thread0);
>
>                              thread1 = new Threadresume008a("thread1");
> +                            // Wait for debugger to complete the first test case
> +                            // before advancing to the next breakpoint
> +                            while (testCase < 0) {
> +                                try {
> +                                    Thread.sleep(100);
> +                                } catch (InterruptedException e) {
> +                                    // ignored
> +                                }
> +                            }
>                              methodForCommunication();
>                              break;
>
>
> On 7/20/18, 2:37 PM, Chris Plummer wrote:
>> Hi Gary,
>>
>> The test fails if the breakpoint event comes in after the test
>> captures the initial thread suspend counts and before the test
>> captures the 2nd suspend counts.
>>
>> debugger>         getting : Map suspendsCounts1
>> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map suspendsCounts2
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
>> debugger> Received communication breakpoint event.
>> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2}
>>
>> So we end up with some threads starting with 1 suspend and ending
>> with 2 (not clear to me why main is still at 1).
>>
>> It will pass if the breakpoint comes in after it does both of the suspend
>> count checks, as you have shown with the sleep(100) solution. Output
>> looks like this:
>>
>> debugger>        got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
>> ...
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map suspendsCounts1
>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map suspendsCounts2
>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
>> ...
>> debugger> Received communication breakpoint event.
>>
>> I've also shown that it passes if the breakpoint always comes in
>> before capturing the initial suspend counts. I added a sleep on the
>> debugger side right after eventHandler.waitForRequestedEventSet()
>> returns. Output looks like:
>>
>> debugger> Received communication breakpoint event.
>> debugger>        got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
>> ...
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map suspendsCounts1
>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map suspendsCounts2
>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
>>
>> I think we should add synchronization to force one of these two
>> outcomes. For the first, you would need to make the debugger modify
>> some variable that the debuggee is watching (sitting in a loop
>> waiting for it to change). For the second, you can rely on the
>> existing methodForCommunication() approach. You just need to
>> restructure the debugger a bit. I had started down this path late
>> Wednesday, but got sidetracked by a few other things. I can look into
>> it some more if you'd like.
>>
>> thanks,
>>
>> Chris

From ekaterina.pavlova at oracle.com  Tue Jul 24 22:10:56 2018
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Tue, 24 Jul 2018 15:10:56 -0700
Subject: [11] RFR(XS): 8195156 [Graal] serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal in Xcomp mode
Message-ID: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com>

Hi All,

serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal because
two more modules jdk.proxy1 and jdk.proxy2 are dynamically initialized by Graal code.
These modules are not part of boot modules and as a result the check fails.
It was agreed with the Serviceability team to filter these modules out.

Please review the fix.

    JBS: https://bugs.openjdk.java.net/browse/JDK-8195156
 webrev: http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html
testing: tested by running serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with Graal and -Xcomp

Thanks,
-katya

p.s.
 Igor Ignatyev volunteered to sponsor this change.
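
The filtering described in the message above could look roughly like the sketch below. This is not the code from the webrev: the class name, the method name, and the use of plain module-name strings are all assumptions made for illustration. The only details taken from the message are the two module names, jdk.proxy1 and jdk.proxy2.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of JvmtiGetAllModulesTest.java.
final class ModuleFilterSketch {

    // The two dynamically created modules named in the message.
    private static final Set<String> DYNAMIC_MODULES =
            new HashSet<>(Arrays.asList("jdk.proxy1", "jdk.proxy2"));

    // Returns a sorted copy of the reported module names with the
    // dynamic modules removed; the input set is left untouched.
    static Set<String> withoutDynamicModules(Set<String> reportedNames) {
        Set<String> filtered = new TreeSet<>(reportedNames);
        filtered.removeAll(DYNAMIC_MODULES);
        return filtered;
    }

    public static void main(String[] args) {
        Set<String> reported = new HashSet<>(Arrays.asList(
                "java.base", "jdk.proxy1", "java.logging", "jdk.proxy2"));
        // The filtered set is what would be compared against the boot modules.
        System.out.println(withoutDynamicModules(reported));
    }
}
```

Matching on the exact names rather than on a "jdk.proxy" prefix keeps the sketch limited to what the message states; whether the real fix matches by name or by some other property is not known from this thread.
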
From rahul.v.raghavan at oracle.com Thu Jul 19 07:48:26 2018 From: rahul.v.raghavan at oracle.com (Rahul Raghavan) Date: Thu, 19 Jul 2018 13:18:26 +0530 Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled In-Reply-To: References: <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com> <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com> Message-ID: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled (just adding + hotspot-compiler-dev also) On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote: Subject Was: Re: RFR (S): C1 still does eden allocations when TLAB is enabled + serviceability-dev Hi all, Could anyone else give me a review of this webrev and check/test the various architecture changes? http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ Thanks for all your help! Jc > On Mon, Jul 16, 2018 at 2:58 PM JC Beyler wrote: > >> Hi all, >> >> Here is a webrev that does all the architectures in the same way: >> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ >> >> Could anyone review the other architectures and test? >> - arm, sparc & aarch64 are also modified now to follow the same "if no >> tlab, then consider eden space allocation" logic. >> >> Thanks for your help! >> Jc >> >> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler wrote: >> >>> Hi Kim, >>> >>> I opened this bug >>> https://bugs.openjdk.java.net/browse/JDK-8190862 >>> >>> and now I've done an update: >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ >>> >>> I basically have done your nits but also removed the try_eden (it was >>> used to bind a label but was not used). I updated the comments to use the >>> one you preferred. >>> >>> I still have to do the other architectures though but at least we seem to >>> have a consensus on this architecture, correct? 
>>>
>>> Thanks for the review,
>>> Jc
>>>
>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett wrote:
>>>
>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler wrote:
>>>>>
>>>>> Yes, you are right, I did those changes due to:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>>>>>
>>>>> If Robbin agrees to this change, and if no one sees an issue, I'll go ahead
>>>>> and propagate the change across architectures.
>>>>>
>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's comment and
>>>>> review) :)
>>>>> Jc
>>>>>
>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose wrote:
>>>>>
>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler wrote:
>>>>>>
>>>>>>
>>>>>> I'm not sure if we had left this case intentionally or not but, if we want
>>>>>> it all to be consistent, we should perhaps fix it.
>>>>>>
>>>>>>
>>>>>> Well, you put in that logic last February, so unless somebody speaks up
>>>>>> quickly, I support your adjusting it to be the way you want it.
>>>>>>
>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share"
>>>>>> suggests that the GC group is most active in touching this feature.
>>>>>> If Robbin is OK with it, there's your reviewer.
>>>>>>
>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
>>>>>> working on the GC to OK it.
>>>>>>
>>>>>> -- John
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Thanks,
>>>>> Jc
>>>>
>>>> Robbin is on vacation; you might not hear from him for a while.
>>>>
>>>> I'm assuming you'll open a new bug for this?
>>>>
>>>> Except for a few minor nits (below), this looks okay to me.
>>>>
>>>> The comment at line 1052 needs updating.
>>>>
>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>
>>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>>> line 1058, but unreferenced.
>>>>
>>>> I like the wording of the comment at 1139 better than the wording at
>>>> 1016.
>>>> >>>> >>> >>> -- >>> >>> Thanks, >>> Jc >>> >> >> >> -- >> >> Thanks, >> Jc >> > > From serguei.spitsyn at oracle.com Wed Jul 25 00:51:25 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 17:51:25 -0700 Subject: [11] RFR(XS): 8195156 [Graal] serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal in Xcomp mode In-Reply-To: <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com> References: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com> <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com> Message-ID: <99f46eb4-dd71-2f4d-4aae-b0f1aa9fd62b@oracle.com> Forgot to mention that the copyright comment needs a year update. No need for a new webrev. Thanks, Serguei On 7/24/18 17:48, serguei.spitsyn at oracle.com wrote: > Hi Katya, > > Nice simple fix. > Thank you for taking care of it! > > Thanks, > Serguei > > On 7/24/18 15:10, Ekaterina Pavlova wrote: >> Hi All, >> >> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails >> with Graal because >> two more modules jdk.proxy1 and jdk.proxy2 are dynamically >> initialized by Graal code. >> These modules are not part of boot modules and as a result the check >> fails. >> It was agreed with Serviceability team to filter these modules out. >> >> Please review the fix. >> >>     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156 >>  webrev: >> http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html >> testing: tested by running >> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with >> Graal and -Xcomp >> >> >> Thanks, >> -katya >> >> p.s. >>  Igor Ignatyev volunteered to sponsor this change.
>> > From serguei.spitsyn at oracle.com Wed Jul 25 01:01:44 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 18:01:44 -0700 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com> <5B507F2C.4080503@oracle.com> <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com> <5B577DDC.3000500@oracle.com> <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com> Message-ID: <31f065d3-c178-fe6d-95c2-86096cf9e5ea@oracle.com> Hi Gary, +1 Thanks, Serguei On 7/24/18 16:46, Chris Plummer wrote: > Hi Gary, > > It looks like that should work fine. > > thanks, > > Chris > > On 7/24/18 12:28 PM, Gary Adams wrote: >> Here's a quick prototype to add a variable to the debuggee. >> The debugger sets it at the end of each completed test case. >> >> The debuggee can then check for the value change to delay >> hitting the breakpoint which interfered with suspend count checks. >> >> Would need to add a bit more error and timeout checking to >> complete the fix. Should also check if the other resume008 test cases >> need similar synchronization. Could possibly migrate the code up to >> TestDebuggerType1 if other tests also needed this generic capability. >> >> >> diff --git >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> --- >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> +++ >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> @@ -63,6 +63,9 @@ >>   *   to be resulting in the event. >>   * - Upon getting new event, the debugger >>   *   
performs the check corresponding to the event. >> + * - The debugger informs the debuggee when it completes >> + *   each test case, so it will wait before hitting >> + *   communication breakpoints. >>   */ >> >>  public class resume008 extends TestDebuggerType1 { >> @@ -234,6 +237,7 @@ >> >>                       default: throw new Failure("** default case 1 >> **"); >>                  } >> +                informDebuggeeTestCase(i); >>              } >> >>              display("......--> vm.resume()"); >> @@ -255,4 +259,25 @@ >>          } >>      } >> >> +    /** >> +     * Inform debuggee which thread test the debugger has completed. >> +     * Used for synchronization, so the debuggee does not move too >> quickly. >> +     * @param testCase index of just completed test >> +     */ >> +    void informDebuggeeTestCase(int testCase) { >> +        if (!EventHandler.isDisconnected() && debuggeeClass != null) { >> +            try { >> +                ((ClassType)debuggeeClass) >> + .setValue(debuggeeClass.fieldByName("testCase"), >> +                              vm.mirrorOf(testCase)); >> +            } catch (InvalidTypeException ite) { >> +                // ignored >> +            } catch (ClassNotLoadedException cnle) { >> +                // ignored >> +            } catch (VMDisconnectedException e) { >> +                // ignored >>  } >> +        } >> +    } >> + >> +} >> >> >> diff --git >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> --- >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> +++ >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> @@ -62,6 +62,7 @@ >> >>      static int exitCode = PASSED; >> >> +    static int testCase = -1; >>      static int instruction = 1; >>      static int end         = 0; >>                                     //    
static int quit = 0; >> @@ -104,6 +105,15 @@ >>                              threadStart(thread0); >> >>                              thread1 = new Threadresume008a("thread1"); >> +                            // Wait for debugger to complete the >> first test case >> +                            // before advancing to the next breakpoint >> +                            while (testCase < 0) { >> +                                try { >> +                                    Thread.sleep(100); >> +                                } catch (InterruptedException e) { >> +                                    // ignored >> +                                } >> +                            } >>                              methodForCommunication(); >>                              break; >> >> >> On 7/20/18, 2:37 PM, Chris Plummer wrote: >>> Hi Gary, >>> >>> The test fails if the breakpoint event comes in after the test >>> captures the initial thread suspend counts and before the test >>> captures the 2nd suspend counts. >>> >>> debugger>         getting : Map suspendsCounts1 >>> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal >>> Dispatcher=1, Finalizer=1} >>> debugger>         eventSet.resume; >>> debugger>         getting : Map suspendsCounts2 >>> EventHandler> Received event set with policy = SUSPEND_ALL >>> EventHandler> Event: BreakpointEventImpl req breakpoint request >>> nsk.jdi.EventSet.resume.resume008a:60 (enabled) >>> debugger> Received communication breakpoint event. >>> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal >>> Dispatcher=2, Finalizer=2} >>> >>> So we end up with some threads starting with 1 suspend and ending >>> with 2 (not clear to me why main is still at 1). >>> >>> It will pass if the breakpoint comes in after it does both of >>> suspend count checks, as you have shown with the sleep(100) >>> solution. Output looks like this: >>> >>> debugger>        
got new ThreadStartEvent with propety 'number' == >>> ThreadStartRequest1 >>> ... >>> debugger> ......--> vm.suspend(); >>> debugger>         getting : Map suspendsCounts1 >>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, >>> Signal Dispatcher=1, Finalizer=1} >>> debugger>         eventSet.resume; >>> debugger>         getting : Map suspendsCounts2 >>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, >>> Signal Dispatcher=1, Finalizer=1} >>> ... >>> debugger> Received communication breakpoint event. >>> >>> I've also shown that it passes if the breakpoint always comes in >>> before capturing the initial suspend counts. I added a sleep on the >>> debugger side right after eventHandler.waitForRequestedEventSet() >>> returns. Output looks like: >>> >>> debugger> Received communication breakpoint event. >>> debugger>         got new ThreadStartEvent with propety 'number' == >>> ThreadStartRequest1 >>> ... >>> debugger> ......--> vm.suspend(); >>> debugger>         getting : Map suspendsCounts1 >>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, >>> Signal Dispatcher=2, Finalizer=2} >>> debugger>         eventSet.resume; >>> debugger>         getting : Map suspendsCounts2 >>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, >>> Signal Dispatcher=2, Finalizer=2} >>> >>> I think we should add synchronization to force one of these two >>> outcomes. For the first, you would need to make the debugger modify >>> some variable that the debuggee is watching (sitting in a loop >>> waiting for it to change). For the second, you can rely on the >>> existing methodForCommunication() approach. You just need to >>> restructure the debugger a bit. I had started down this path late >>> Wednesday, but got sidetracked by a few other things. I can look >>> into it some more if you'd like.
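A minimal sketch of the first option described above (the debuggee sits in a loop until a variable changes), with a bounded wait so the debuggee cannot hang forever if the debugger never signals. This is an illustration only, not code from any webrev in this thread: the class name HandshakeSketch, the method awaitTestCase and the timeout values are hypothetical, a plain thread stands in for the debugger (which in the real test would write the field through JDI), and the volatile modifier is an assumption added so the polling thread is guaranteed to see the write.

```java
import java.util.concurrent.TimeUnit;

public class HandshakeSketch {
    // Written by the "debugger" side; volatile (an assumption of this sketch)
    // so that the polling thread observes the update.
    static volatile int testCase = -1;

    /**
     * Polls until testCase >= expected or the timeout elapses.
     * Returns true if the value was observed in time,
     * false on timeout or if the waiting thread is interrupted.
     */
    static boolean awaitTestCase(int expected, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (testCase < expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out: let the caller report a failure
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the debugger: signals completion of test case 0 after a delay.
        Thread debugger = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException ignored) {
                // ignored in this sketch
            }
            testCase = 0;
        });
        debugger.start();
        System.out.println("signalled in time: " + awaitTestCase(0, 5000));
        debugger.join();
        // Nobody signals test case 1, so this wait runs into its timeout.
        System.out.println("second wait succeeded: " + awaitTestCase(1, 100));
    }
}
```

Returning false on timeout lets the caller fail the test explicitly instead of silently advancing to the communication breakpoint.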
>>> >>> thanks, >>> >>> Chris > From serguei.spitsyn at oracle.com Wed Jul 25 00:48:58 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 24 Jul 2018 17:48:58 -0700 Subject: [11] RFR(XS): 8195156 [Graal] serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal in Xcomp mode In-Reply-To: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com> References: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com> Message-ID: <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com> Hi Katya, Nice simple fix. Thank you for taking care of it! Thanks, Serguei On 7/24/18 15:10, Ekaterina Pavlova wrote: > Hi All, > > serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails > with Graal because > two more modules jdk.proxy1 and jdk.proxy2 are dynamically initialized > by Graal code. > These modules are not part of boot modules and as a result the check > fails. > It was agreed with Serviceability team to filter these modules out. > > Please review the fix. > >     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156 >  webrev: > http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html > testing: tested by running > serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with > Graal and -Xcomp > > > Thanks, > -katya > > p.s. >  Igor Ignatyev volunteered to sponsor this change. > From ekaterina.pavlova at oracle.com Wed Jul 25 02:08:22 2018 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Tue, 24 Jul 2018 19:08:22 -0700 Subject: [11] RFR(XS): 8195156 [Graal] serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal in Xcomp mode In-Reply-To: <99f46eb4-dd71-2f4d-4aae-b0f1aa9fd62b@oracle.com> References: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com> <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com> <99f46eb4-dd71-2f4d-4aae-b0f1aa9fd62b@oracle.com> Message-ID: <72710916-021f-6dd9-7f05-b1b1b410734d@oracle.com> Vladimir, Serguei, thanks for your reviews!
I fixed the copyright year. -katya On 7/24/18 5:51 PM, serguei.spitsyn at oracle.com wrote: > Forgot to mention that the copyright comment needs a year update. > No need for a new webrev. > > Thanks, > Serguei > > On 7/24/18 17:48, serguei.spitsyn at oracle.com wrote: >> Hi Katya, >> >> Nice simple fix. >> Thank you for taking care of it! >> >> Thanks, >> Serguei >> >> On 7/24/18 15:10, Ekaterina Pavlova wrote: >>> Hi All, >>> >>> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal because >>> two more modules jdk.proxy1 and jdk.proxy2 are dynamically initialized by Graal code. >>> These modules are not part of boot modules and as a result the check fails. >>> It was agreed with Serviceability team to filter these modules out. >>> >>> Please review the fix. >>> >>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156 >>>  webrev: http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html >>> testing: tested by running serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with Graal and -Xcomp >>> >>> >>> Thanks, >>> -katya >>> >>> p.s. >>>  Igor Ignatyev volunteered to sponsor this change.
>>> >> > From fairoz.matte at oracle.com Wed Jul 25 08:23:48 2018 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Wed, 25 Jul 2018 01:23:48 -0700 (PDT) Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] Message-ID: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default> Hi, Kindly review the backport of "JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]" to 8u Webrev - http://cr.openjdk.java.net/~fmatte/8191948/webrev.00/ JDK 11 bug - https://bugs.openjdk.java.net/browse/JDK-8191948 JDK 11 changeset - http://hg.openjdk.java.net/jdk/jdk11/rev/73c769e0486a Review thread - http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024405.html Thanks, Fairoz From gary.adams at oracle.com Wed Jul 25 12:56:40 2018 From: gary.adams at oracle.com (gary.adams at oracle.com) Date: Wed, 25 Jul 2018 08:56:40 -0400 Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR: suspendCounts don't match for : Common-Cleaner In-Reply-To: <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com> References: <5B4E0C62.3020808@oracle.com> <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com> <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com> <5B4F98BF.1060602@oracle.com> <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com> <5B507F2C.4080503@oracle.com> <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com> <5B577DDC.3000500@oracle.com> <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com> Message-ID: <8f5e2612-f348-012a-e4d8-9f3c4a082b8d@oracle.com> During some longer testing runs I noticed similar failures for resume002, resume003 and resume006. I'll spend a few more cycles to see if a more general purpose solution could be shared across these tests. On 7/24/18 7:46 PM, Chris Plummer wrote: > Hi Gary, > > It looks like that should work fine. > > thanks, > > Chris > > On 7/24/18 12:28 PM, Gary Adams wrote: >> Here's a quick prototype to add a variable to the debuggee. 
>> The debugger sets it at the end of each completed test case. >> >> The debuggee can then check for the value change to delay >> hitting the breakpoint which interfered with suspend count checks. >> >> Would need to add a bit more error and timeout checking to >> complete the fix. Should also check if the other resume008 test cases >> need similar synchronization. Could possibly migrate the code up to >> TestDebuggerType1 if other tests also needed this generic capability. >> >> >> diff --git >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> --- >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> +++ >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java >> @@ -63,6 +63,9 @@ >>   *   to be resulting in the event. >>   * - Upon getting new event, the debugger >>   *   performs the check corresponding to the event. >> + * - The debugger informs the debuggee when it completes >> + *   each test case, so it will wait before hitting >> + *   communication breakpoints. >>   */ >> >>  public class resume008 extends TestDebuggerType1 { >> @@ -234,6 +237,7 @@ >> >>                       default: throw new Failure("** default case 1 >> **"); >>                  } >> +                informDebuggeeTestCase(i); >>              } >> >>              display("......--> vm.resume()"); >> @@ -255,4 +259,25 @@ >>          } >>      } >> >> +    /** >> +     * Inform debuggee which thread test the debugger has completed. >> +     * Used for synchronization, so the debuggee does not move too >> quickly. >> +     * @param testCase index of just completed test >> +     */ >> +    void informDebuggeeTestCase(int testCase) { >> +        if (!EventHandler.isDisconnected() && debuggeeClass != null) { >> +            try { >> +                
((ClassType)debuggeeClass) >> + .setValue(debuggeeClass.fieldByName("testCase"), >> +                              vm.mirrorOf(testCase)); >> +            } catch (InvalidTypeException ite) { >> +                // ignored >> +            } catch (ClassNotLoadedException cnle) { >> +                // ignored >> +            } catch (VMDisconnectedException e) { >> +                // ignored >>  } >> +        } >> +    } >> + >> +} >> >> >> diff --git >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> --- >> a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> +++ >> b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java >> @@ -62,6 +62,7 @@ >> >>      static int exitCode = PASSED; >> >> +    static int testCase = -1; >>      static int instruction = 1; >>      static int end         = 0; >>                                     //    static int quit = 0; >> @@ -104,6 +105,15 @@ >>                              threadStart(thread0); >> >>                              thread1 = new Threadresume008a("thread1"); >> +                            // Wait for debugger to complete the >> first test case >> +                            // before advancing to the next breakpoint >> +                            while (testCase < 0) { >> +                                try { >> +                                    Thread.sleep(100); >> +                                } catch (InterruptedException e) { >> +                                    // ignored >> +                                } >> +                            } >>                              methodForCommunication(); >>                              
break; >> >> >> On 7/20/18, 2:37 PM, Chris Plummer wrote: >>> Hi Gary, >>> >>> The test fails if the breakpoint event comes in after the test >>> captures the initial thread suspend counts and before the test >>> captures the 2nd suspend counts. >>> >>> debugger>         getting : Map suspendsCounts1 >>> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal >>> Dispatcher=1, Finalizer=1} >>> debugger>         eventSet.resume; >>> debugger>         getting : Map suspendsCounts2 >>> EventHandler> Received event set with policy = SUSPEND_ALL >>> EventHandler> Event: BreakpointEventImpl req breakpoint request >>> nsk.jdi.EventSet.resume.resume008a:60 (enabled) >>> debugger> Received communication breakpoint event. >>> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal >>> Dispatcher=2, Finalizer=2} >>> >>> So we end up with some threads starting with 1 suspend and ending >>> with 2 (not clear to me why main is still at 1). >>> >>> It will pass if the breakpoint comes in after it does both of >>> suspend count checks, as you have shown with the sleep(100) >>> solution. Output looks like this: >>> >>> debugger>         got new ThreadStartEvent with propety 'number' == >>> ThreadStartRequest1 >>> ... >>> debugger> ......--> vm.suspend(); >>> debugger>         getting : Map suspendsCounts1 >>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, >>> Signal Dispatcher=1, Finalizer=1} >>> debugger>         eventSet.resume; >>> debugger>         getting : Map suspendsCounts2 >>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, >>> Signal Dispatcher=1, Finalizer=1} >>> ... >>> debugger> Received communication breakpoint event. >>> >>> I've also shown that it passes if the breakpoint always comes in >>> before capturing the initial suspend counts. I added a sleep on the >>> debugger side right after eventHandler.waitForRequestedEventSet() >>> returns.
Output looks like: >>> >>> debugger> Received communication breakpoint event. >>> debugger>         got new ThreadStartEvent with propety 'number' == >>> ThreadStartRequest1 >>> ... >>> debugger> ......--> vm.suspend(); >>> debugger>         getting : Map suspendsCounts1 >>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, >>> Signal Dispatcher=2, Finalizer=2} >>> debugger>         eventSet.resume; >>> debugger>         getting : Map suspendsCounts2 >>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, >>> Signal Dispatcher=2, Finalizer=2} >>> >>> I think we should add synchronization to force one of these two >>> outcomes. For the first, you would need to make the debugger modify >>> some variable that the debuggee is watching (sitting in a loop >>> waiting for it to change). For the second, you can rely on the >>> existing methodForCommunication() approach. You just need to >>> restructure the debugger a bit. I had started down this path late >>> Wednesday, but got sidetracked by a few other things. I can look >>> into it some more if you'd like. >>> >>> thanks, >>> >>> Chris > From jcbeyler at google.com Wed Jul 25 14:20:11 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 25 Jul 2018 07:20:11 -0700 Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails Message-ID: Hi all, There seems to be an intermittent failure with the HeapMonitorInterpreterArrayTest. I believe it is due to the possibility of a huge interval being chosen at the end of the test and GC arriving before checking the samples. This fix should help alleviate it by reducing the interval to 100k and also checking the garbage collected objects, could someone review it please? Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8208059 As always, thanks for your help! Jc
From daniel.daugherty at oracle.com Wed Jul 25 14:42:37 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 25 Jul 2018 10:42:37 -0400 Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails In-Reply-To: References: Message-ID: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> On 7/25/18 10:20 AM, JC Beyler wrote: > Hi all, > > There seems to be an intermittent failure with the > HeapMonitorInterpreterArrayTest. I believe it is due to the > possibility of a huge interval being chosen at the end of the test and > GC arriving before checking the samples. > > This fix should help alleviate it by reducing the interval to 100k and > also checking the garbage collected objects, could someone review it > please? > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/ > test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java     No comments. Thumbs up. We'll need this fix in both JDK11 and JDK12. Dan > Bug: https://bugs.openjdk.java.net/browse/JDK-8208059 > > As always, thanks for your help! > Jc From chris.plummer at oracle.com Wed Jul 25 17:31:17 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 25 Jul 2018 10:31:17 -0700 Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] In-Reply-To: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default> References: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default> Message-ID: <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com> Hi Fairoz, The changes look good. I'm not sure what the policy is when part of the (full) backport contains test changes that aren't directly applicable to 8u. You might need some sort of noreg label on the backport CR.
thanks, Chris On 7/25/18 1:23 AM, Fairoz Matte wrote: > Hi, > > Kindly review the backport of "JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]" to 8u > > Webrev - http://cr.openjdk.java.net/~fmatte/8191948/webrev.00/ > > JDK 11 bug - https://bugs.openjdk.java.net/browse/JDK-8191948 > > JDK 11 changeset - http://hg.openjdk.java.net/jdk/jdk11/rev/73c769e0486a > > Review thread - http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024405.html > > Thanks, > Fairoz From jcbeyler at google.com Wed Jul 25 17:34:54 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 25 Jul 2018 10:34:54 -0700 Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails In-Reply-To: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> References: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> Message-ID: Thanks for your help Daniel, Could I get a second review and I'll prepare an updated webrev :) Jc On Wed, Jul 25, 2018 at 7:42 AM Daniel D. Daugherty < daniel.daugherty at oracle.com> wrote: > On 7/25/18 10:20 AM, JC Beyler wrote: > > Hi all, > > There seems to be an intermittent failure with the > HeapMonitorInterpreterArrayTest. I believe it is due to the possibility of > a huge interval being chosen at the end of the test and GC arriving before > checking the samples. > > This fix should help alleviate it by reducing the interval to 100k and > also checking the garbage collected objects, could someone review it please? > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/ > > > > test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java > No comments. > > Thumbs up. We'll need this fix in both JDK11 and JDK12. > > Dan > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8208059 > > As always, thanks for your help! > Jc > > > -- Thanks, Jc
From serguei.spitsyn at oracle.com Wed Jul 25 17:37:13 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 25 Jul 2018 10:37:13 -0700 Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails In-Reply-To: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> References: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> Message-ID: <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com> An HTML attachment was scrubbed... URL: From jcbeyler at google.com Wed Jul 25 17:54:01 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 25 Jul 2018 10:54:01 -0700 Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails In-Reply-To: <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com> References: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com> Message-ID: Hi Serguei, Here it is: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.01/ Let me know if you need anything else and thanks for your help! Jc On Wed, Jul 25, 2018 at 10:37 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > It looks good. > I'll push it after you send me a patch. > > > On 7/25/18 07:42, Daniel D. Daugherty wrote: > > On 7/25/18 10:20 AM, JC Beyler wrote: > > Hi all, > > There seems to be an intermittent failure with the > HeapMonitorInterpreterArrayTest. I believe it is due to the possibility of > a huge interval being chosen at the end of the test and GC arriving before > checking the samples. > > This fix should help alleviate it by reducing the interval to 100k and > also checking the garbage collected objects, could someone review it please? > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/ > > > > test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java > No comments. > > Thumbs up. We'll need this fix in both JDK11 and JDK12. > > > Okay, I've changed the 'Fix Version' to 11.
> > Thanks, > Serguei > > > Dan > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8208059 > > As always, thanks for your help! > Jc > > > > -- Thanks, Jc From serguei.spitsyn at oracle.com Wed Jul 25 17:54:12 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 25 Jul 2018 10:54:12 -0700 Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][] In-Reply-To: <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com> References: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default> <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com> Message-ID: <96be102f-67c4-462c-89ba-a728b8b81ddc@oracle.com> An HTML attachment was scrubbed... URL: From alexey.menkov at oracle.com Wed Jul 25 18:00:01 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 25 Jul 2018 11:00:01 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com> <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com> Message-ID: <112afd52-cd22-4d43-173f-08dd924eb7cc@oracle.com> Looks good to me --alex On 07/24/2018 16:23, Chris Plummer wrote: > Thanks, Serguei. > > I could use one more reviewer. > > thanks, > > Chris > > On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote: >> Chris, >> >> Thank you for the explanations. >> I'm Okay with this webrev as it is.
>> >> Thanks, >> Serguei >> >> >> On 7/24/18 13:55, Chris Plummer wrote: >>> On 7/24/18 1:46 PM, serguei.spitsyn at oracle.com wrote: >>>> >>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html >>>> - log.complain("Redefinition not started. Maybe running with -Xcomp. >>>> Test ignored."); >>>> + log.complain("Redefinition not started. May need more time for >>>> -Xcomp."); >>>> + status = Consts.TEST_FAILED; >>>> return false; >>>> } >>>> . . . >>>> - log.complain("Redefinition not completed."); >>>> + log.complain("Redefinition not completed. May need more time for >>>> -Xcomp."); >>>> + status = Consts.TEST_FAILED; >>>> + return false; The complain is not fully correct if this can happen >>>> not only with the -Xcomp. Could this message be relaxed a little bit? >>> I think it is relaxed. It says *may* need more time for -Xcomp. I'm >>> not sure how else to word it unless you want me to just say >>> "Redefinition not completed". >>>> Also, just a side comment: The changes above are not that harmless. >>>> As the status now is set to TEST_FAILED there is a potential for the >>>> tests to start failing where they were passed before. >>> Yes, that was intentional. It's still the case that you only need the >>> fail = 0 change to fix the bug, but having these methods properly >>> cause the test to fail is necessary if something were to ever go >>> wrong and the redef was not started or completed. Otherwise the test >>> would either silently pass (if redef was not started) or just produce >>> error messages like it has been when it checks for the proper redef >>> (if the redef never completed). >>> >>> thanks, >>> >>> Chris >>>> Otherwise, looks good. 
>>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/24/18 13:22, Chris Plummer wrote: >>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01 >>>>> >>>>> Since I was removed the "else", there was no need for the "if", so >>>>> I removed it also. I had to re-indent the body of the "if" section >>>>> because of that. The webrev seems to not call out the whitespace >>>>> changes, although I also did a couple of other minor formatting >>>>> changes in the code that do show up. >>>>> >>>>> Chris >>>>> >>>>> On 7/24/18 12:42 PM, Chris Plummer wrote: >>>>>> Yes. I'm just retesting first. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/24/18 12:18 PM, serguei.spitsyn at oracle.com wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> You have my all my comments and I leave it up to you to decide >>>>>>> what approach to pick. >>>>>>> Could you send an updated webrev, please? >>>>>>> >>>>>>> Thanks, >>>>>>> Serguei >>>>>>> >>>>>>> >>>>>>> On 7/24/18 09:27, Chris Plummer wrote: >>>>>>>> On 7/24/18 12:25 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>> Hi Chris, >>>>>>>>> >>>>>>>>> I still feel, this fix adds more confusion and complexity. >>>>>>>>> Let's look at some fragments. >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html >>>>>>>>> >>>>>>>>> 116 if ((strcmp(name, expHSMethod) == 0) && >>>>>>>>> 117 (strcmp(sig, expHSSignature) == 0)) { >>>>>>>>> 118 NSK_DISPLAY0("CompiledMethodLoad: a tested hotspot method found\n"); >>>>>>>>> 119 >>>>>>>>> 120 // CR 6604375: check whether "hot" method was entered >>>>>>>>> 121 if (enteredHotMethod) { >>>>>>>>> 122 hsMethodID = method; >>>>>>>>> 123 fire = 1; >>>>>>>>> 124 } else { >>>>>>>>> 125 NSK_DISPLAY0("Compilation occured before method execution\n"); >>>>>>>>> 126 fire = 0; // Ignore this compilation. Wait for next one. 
>>>>>>>>> 127 } >>>>>>>>> 128 } >>>>>>>>> >>>>>>>>> I think, the line #126 is not needed. >>>>>>>>> It just creates a confusion. >>>>>>>>> The fire == 0 from beginning. >>>>>>>>> Why do we need it to set to 0 again? >>>>>>>> Yes, it can be removed. I just didn't give it much thought when >>>>>>>> changing the code from -1 to 0. >>>>>>>>> Is it because it can be already set to 1? >>>>>>>>> If so, I'm not sure I understand this code then. >>>>>>>>> >>>>>>>>> 187 } while(fire == 0); >>>>>>>>> 188 >>>>>>>>> 189 NSK_DISPLAY0("agentProc: hotspot method compiled\n\n"); >>>>>>>>> 190 >>>>>>>>> 192 if (fire == 1) { >>>>>>>>> . . . >>>>>>>>> 224 } else { >>>>>>>>> 225 // fire == -1 >>>>>>>>> 226 // NOTE: This isn't suppose to happen anymore. Hot method >>>>>>>>> should always end up being entered. >>>>>>>>> 227 NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed. >>>>>>>>> Don't perform redefinition\n"); >>>>>>>>> 228 } >>>>>>>>> I don't understand why do we need the check at the line #192. >>>>>>>>> The variable fire can be only equal to 0 or 1. >>>>>>>>> The only way out of the loop at the line #187 is if fire == 1. >>>>>>>>> >>>>>>>>> Then the else statement at the lines 224-228 confuses even more. >>>>>>>> The else section can be removed. I left it in as sort of an >>>>>>>> assert, but I see now that it just causes confusion. >>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>>> >>>>>>>>> >>>>>>>>> On 7/23/18 20:19, Chris Plummer wrote: >>>>>>>>>> On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 7/23/18 11:40, Chris Plummer wrote: >>>>>>>>>>>> Hi Serguei, >>>>>>>>>>>> >>>>>>>>>>>> If the fix was complicated I would agree, but it really just >>>>>>>>>>>> boils down to this one line change: >>>>>>>>>>>> >>>>>>>>>>>> -            fire = -1; >>>>>>>>>>>> +            fire = 0; // Ignore this compilation. Wait for >>>>>>>>>>>> next one.
>>>>>>>>>>> >>>>>>>>>>> It is not obvious that this will completely fix the problem. >>>>>>>>>>> Is it possible that there will not be a next compilation with >>>>>>>>>>> the -Xcomp? >>>>>>>>>> It's only one method that we check for. I don't see why there >>>>>>>>>> would be a 2nd -Xcomp compilation for it, but even if there was, >>>>>>>>>> the test will ignore it just like the first one. It will >>>>>>>>>> ignore compilations of the method until the flag has been set >>>>>>>>>> indicating the method has been executed once. >>>>>>>>> >>>>>>>>>> If for some reason the method is never compiled after being >>>>>>>>>> executed once, the test will give up waiting for it (I think >>>>>>>>>> after 30 seconds) and produce an error. >>>>>>>>> >>>>>>>>> I'm afraid that it is what will always happen with the -Xcomp. >>>>>>>>> Then there is no point to waste time by waiting for timeout as >>>>>>>>> the test will successfully complete without testing anything. >>>>>>>>> It seems to be not worth this complexity. >>>>>>>>> >>>>>>>>> I guess, you would want some extra tracing though. :) >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Serguei >>>>>>>>> >>>>>>>>> >>>>>>>>>>> If it is possible then it is better to explicitly exclude >>>>>>>>>>> these tests for -Xcomp. >>>>>>>>>>> Otherwise, consider this reviewed. >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Given that, I see no reason not to increase our test >>>>>>>>>>>> coverage by supporting this test during -Xcomp runs. >>>>>>>>>>> >>>>>>>>>>> I'd agree if it is going to be stable. >>>>>>>>>>> >>>>>>>>>> If problems turn up in the future, we can reconsider disabling >>>>>>>>>> it.
>>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>>> Thanks, >>>>>>>>>>> Serguei >>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> Would it be more simple to avoid running these tests with >>>>>>>>>>>>> -Xcomp? >>>>>>>>>>>>> I guess, this would work: @requires vm.compMode != "Xcomp" >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> Serguei >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 7/23/18 00:42, Chris Plummer wrote: >>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Please review the following fix for JDK11: >>>>>>>>>>>>>> >>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151259 >>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00 >>>>>>>>>>>>>> >>>>>>>>>>>>>> It fixes the following 3 tests: >>>>>>>>>>>>>> >>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java >>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java >>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java >>>>>>>>>>>>>> >>>>>>>>>>>>>> Any of which could fail when run with -Xcomp with >>>>>>>>>>>>>> (followed by a bunch more errors): >>>>>>>>>>>>>> >>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with >>>>>>>>>>>>>> -Xcomp. Test ignored. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Although lately we've only seen this with >>>>>>>>>>>>>> redefclass030.java on macosx. >>>>>>>>>>>>>> >>>>>>>>>>>>>> These 3 tests do redefinition of a "hot" method after >>>>>>>>>>>>>> triggering compilation for it. After the redef some >>>>>>>>>>>>>> testing is done to ensure that the redef was done >>>>>>>>>>>>>> correctly, but the issue these tests have actually comes >>>>>>>>>>>>>> before any redef is done. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The test attempts to trigger compilation by calling a hot >>>>>>>>>>>>>> method a lot.
The agent detects compilation by receiving a >>>>>>>>>>>>>> CompiledMethodLoad event. There was an issue discovered >>>>>>>>>>>>>> long ago that when -Xcomp is used, the compilation happens >>>>>>>>>>>>>> before the "hot" method is ever called. Then the redef >>>>>>>>>>>>>> would happen before compilation, and this somehow messed >>>>>>>>>>>>>> up the test (I'm not exactly sure how). The fix was to >>>>>>>>>>>>>> basically abandon the redef attempt when this problem is >>>>>>>>>>>>>> detected, and then supposedly just let the test run to >>>>>>>>>>>>>> completion (skipping the actual testing of the redef). >>>>>>>>>>>>>> After this change, if you ran with -Xcomp it would pass, >>>>>>>>>>>>>> but if you looked in the log you would see: >>>>>>>>>>>>>> >>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with >>>>>>>>>>>>>> -Xcomp. Test ignored. >>>>>>>>>>>>>> >>>>>>>>>>>>>> However, there was a bug in the logic to make the test run >>>>>>>>>>>>>> to completion, which also caused the above message to not >>>>>>>>>>>>>> appear. Instead the test would fail with: >>>>>>>>>>>>>> >>>>>>>>>>>>>> # ERROR: Redefinition not completed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Followed by a bunch more error messages during the part of >>>>>>>>>>>>>> the test that checks if the redef was done properly. >>>>>>>>>>>>>> >>>>>>>>>>>>>> If the CompiledMethodLoad event comes in before the hot >>>>>>>>>>>>>> method is ever called (which it does with -Xcomp), the >>>>>>>>>>>>>> test sets fire = -1. If the hot method was called, it is >>>>>>>>>>>>>> set to 1. The setting of fire = -1 was added to fix the >>>>>>>>>>>>>> -Xcomp problem mentioned above. The jvmti agent does the >>>>>>>>>>>>>> following: >>>>>>>>>>>>>> >>>>>>>>>>>>>> do { >>>>>>>>>>>>>> THREAD_sleep(1); >>>>>>>>>>>>>> /* wait for compilation to happen */ >>>>>>>>>>>>>> } while(fire == 0); >>>>>>>>>>>>>> >>>>>>>>>>>>>> if (fire == 1) { >>>>>>>>>>>>>>
/* do the redef here */ >>>>>>>>>>>>>> NSK_DISPLAY0("agentProc: <<<<<<<< >>>>>>>>>>>>>> RedefineClasses() is successfully done\n"); >>>>>>>>>>>>>> } else { >>>>>>>>>>>>>> // fire == -1 >>>>>>>>>>>>>> NSK_DISPLAY0("agentProc: \"hot\" method wasn't >>>>>>>>>>>>>> executed. Don't perform redefinition\n"); >>>>>>>>>>>>>> } >>>>>>>>>>>>>> >>>>>>>>>>>>>> The agent then syncs with the debuggee, waiting for it >>>>>>>>>>>>>> to finish up. What the test expects is that >>>>>>>>>>>>>> waitForRedefinitionStarted() in the debuggee will time out >>>>>>>>>>>>>> after two seconds while waiting for fire == 1 (which it >>>>>>>>>>>>>> thinks will always happen because it was set to -1). >>>>>>>>>>>>>> When it times out, the test does appear to exit properly, >>>>>>>>>>>>>> but with the following in the log, which is intended: >>>>>>>>>>>>>> >>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with >>>>>>>>>>>>>> -Xcomp. Test ignored. >>>>>>>>>>>>>> >>>>>>>>>>>>>> However, sometimes before waitForRedefinitionStarted() >>>>>>>>>>>>>> times out, the hot method is called enough times to >>>>>>>>>>>>>> trigger compilation. So another CompiledMethodLoad event >>>>>>>>>>>>>> arrives, and this time fire is set to 1. Because of this, >>>>>>>>>>>>>> waitForRedefinitionStarted() doesn't time out and returns >>>>>>>>>>>>>> with an indication that the redef has started. After this >>>>>>>>>>>>>> waitForRedefinitionCompleted() is executed. It waits for >>>>>>>>>>>>>> the redef to complete, but it never does since the agent >>>>>>>>>>>>>> decided not to do the redef when it saw fire == -1. So >>>>>>>>>>>>>> waitForRedefinitionCompleted() times out after 10 seconds >>>>>>>>>>>>>> and the test fails, with: >>>>>>>>>>>>>> >>>>>>>>>>>>>> # ERROR: Redefinition not completed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Actually the above error is not really what causes the >>>>>>>>>>>>>> failure.
When the above error is detected, no error status >>>>>>>>>>>>>> is set and the test continues as if the redef had been >>>>>>>>>>>>>> done. So then the logic that detects if the redef was done >>>>>>>>>>>>>> properly ends up failing, and that's where the test >>>>>>>>>>>>>> actually indicates a failure status. You see a whole bunch >>>>>>>>>>>>>> of other errors in the log because of all the checks that >>>>>>>>>>>>>> fail. >>>>>>>>>>>>>> >>>>>>>>>>>>>> The fix is to not abandon the test when the first >>>>>>>>>>>>>> CompiledMethodLoad event is before the hot method was >>>>>>>>>>>>>> called. Instead just leave fire==0 and wait for the next >>>>>>>>>>>>>> CompiledMethodLoad event that is triggered after the >>>>>>>>>>>>>> method is called enough times to be recompiled. I'm not >>>>>>>>>>>>>> sure why it was not originally done this way. Possibly the >>>>>>>>>>>>>> recompilation did not happen reliably, but I have not run >>>>>>>>>>>>>> into this problem. The other changes in redefclass030.c >>>>>>>>>>>>>> are just cleaning up debug tracing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Another fix was to properly set the error status when >>>>>>>>>>>>>> waitForRedefinitionStarted() or >>>>>>>>>>>>>> waitForRedefinitionCompleted() times out, although this is >>>>>>>>>>>>>> just a safety net and I didn't run into any cases where >>>>>>>>>>>>>> this happened after fixing the CompiledMethodLoad event >>>>>>>>>>>>>> handling. So in general the changes in redefclass030.java >>>>>>>>>>>>>> were not needed, but provide better error handling. 
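For readers following the thread, the fixed event handling described above can be sketched in C roughly as follows. This is a simplified model under stated assumptions, not the code from the webrev: `on_compiled_method_load` and `agent_ready` are hypothetical helper names, the JVMTI callback signature is replaced by plain strings, and only `fire`, `enteredHotMethod`, and the 0/1 states come from the discussion.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the agent's globals; the names follow the thread. */
static volatile int fire = 0;             /* 0 = keep waiting, 1 = do the redef */
static volatile int enteredHotMethod = 0; /* set once the hot method has run */

/* Simplified model of the CompiledMethodLoad handling after the fix.
 * Returns 1 if this event releases the agent, 0 if the event is ignored. */
static int on_compiled_method_load(const char *name, const char *sig,
                                   const char *expName, const char *expSig) {
    if (strcmp(name, expName) != 0 || strcmp(sig, expSig) != 0) {
        return 0; /* not the tested hot method */
    }
    if (!enteredHotMethod) {
        /* -Xcomp case: compiled before the first call. Leave fire at 0
         * and wait for a later compilation instead of setting -1. */
        return 0;
    }
    fire = 1;
    return 1;
}

/* Model of one check in the agent's wait loop: 1 means "redefine now". */
static int agent_ready(void) {
    assert(fire == 0 || fire == 1); /* -1 is no longer a reachable state */
    return fire == 1;
}
```

With only two states left, the agent's `if (fire == 1) ... else ...` branch after the wait loop has nothing to distinguish, which is why the reviewers later agree that the `else` (and then the `if`) can go.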
>>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > From serguei.spitsyn at oracle.com Wed Jul 25 17:59:58 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 25 Jul 2018 10:59:58 -0700 Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails In-Reply-To: References: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com> <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com> Message-ID: <87a12963-bbdc-1ba3-6476-63bdf3af34a0@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Wed Jul 25 18:13:09 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 25 Jul 2018 11:13:09 -0700 Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of outer fields of the class" when running with -Xcomp In-Reply-To: <112afd52-cd22-4d43-173f-08dd924eb7cc@oracle.com> References: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com> <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com> <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com> <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com> <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com> <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com> <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com> <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com> <112afd52-cd22-4d43-173f-08dd924eb7cc@oracle.com> Message-ID: <7dec2692-c3d0-4903-b9ae-8a22528539a7@oracle.com> Thanks! On 7/25/18 11:00 AM, Alex Menkov wrote: > Looks good to me > > --alex > > On 07/24/2018 16:23, Chris Plummer wrote: >> Thanks, Serguei. >> >> I could use one more reviewer. >> >> thanks, >> >> Chris >> >> On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote: >>> Chris, >>> >>> Thank you for the explanations. >>> I'm Okay with this webrev as it is.
From alexey.menkov at oracle.com Wed Jul 25 18:22:55 2018 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 25 Jul 2018 11:22:55 -0700 Subject: RFR: JDK-8199155 : Accessibility issues in jdk.jdi Message-ID: <3320a618-4522-d442-eeb9-9f1a676b7a00@oracle.com> Hi, please review the following fix for https://bugs.openjdk.java.net/browse/JDK-8199155 webrev: http://cr.openjdk.java.net/~amenkov/accessibility/webrev/ The fix adds standard "banner", "navigation", "main" regions and fixes "
without
" issue. For
styles which are used by most browsers are used: dl { margin-top: 1em; margin-bottom: 1em; } dd { margin-left: 40px; } --alex From daniil.x.titov at oracle.com Wed Jul 25 18:47:52 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Wed, 25 Jul 2018 11:47:52 -0700 Subject: JDK-8199155 : Accessibility issues in jdk.jdi In-Reply-To: References: Message-ID: Looks good to me. --Daniil On 7/25/18, 11:23 AM, "serviceability-dev on behalf of Alex Menkov" wrote: Hi, please review the following fix for https://bugs.openjdk.java.net/browse/JDK-8199155 webrev: http://cr.openjdk.java.net/~amenkov/accessibility/webrev/ The fix adds standard "banner", "navigation", "main" regions and fixes "
without
" issue. For
styles which are used by most browsers are used: dl { margin-top: 1em; margin-bottom: 1em; } dd { margin-left: 40px; } --alex From daniel.daugherty at oracle.com Wed Jul 25 19:03:35 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 25 Jul 2018 15:03:35 -0400 Subject: RFR(XS): 8208205: ProblemList tests that fail due to 'Error attaching to process: Can't create thread_db agent!' Message-ID: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com> Greetings, I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so I need a single (R)eviewer for the following fix: JDK-8208205 ProblemList tests that fail due to 'Error attaching to process: Can't create thread_db agent!' https://bugs.openjdk.java.net/browse/JDK-8208205 Here's the diff: $ hg diff test/hotspot/jtreg/ProblemList.txt diff -r 628718bf8970 test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 12:32:06 2018 -0400 +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 14:47:58 2018 -0400 @@ -74,14 +74,43 @@ # :hotspot_runtime runtime/CompressedOops/UseCompressedOops.java 8079353 generic-all +runtime/SharedArchiveFile/SASymbolTableTest.java 8193639 solaris ############################################################################# # :hotspot_serviceability -serviceability/sa/ClhsdbCDSCore.java                 8207832 linux-x64 -serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all -serviceability/sa/sadebugd/SADebugDTest.java        
8163805 generic-all +serviceability/sa/ClhsdbAttach.java 8193639 solaris +serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64 +serviceability/sa/ClhsdbField.java 8193639 solaris +serviceability/sa/ClhsdbFindPC.java 8193639 solaris +serviceability/sa/ClhsdbInspect.java 8193639 solaris +serviceability/sa/ClhsdbJdis.java 8193639 solaris +serviceability/sa/ClhsdbJhisto.java 8193639 solaris +serviceability/sa/ClhsdbJstack.java 8193639 solaris +serviceability/sa/ClhsdbLongConstant.java 8193639 solaris +serviceability/sa/ClhsdbPmap.java 8193639 solaris +serviceability/sa/ClhsdbPrintAll.java 8193639 solaris +serviceability/sa/ClhsdbPrintAs.java 8193639 solaris +serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris +serviceability/sa/ClhsdbPstack.java 8193639 solaris +serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris +serviceability/sa/ClhsdbScanOops.java 8193639 solaris +serviceability/sa/ClhsdbSource.java 8193639 solaris +serviceability/sa/ClhsdbSymbol.java 8193639 solaris +serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris +serviceability/sa/ClhsdbThread.java 8193639 solaris +serviceability/sa/ClhsdbWhere.java 8193639 solaris +serviceability/sa/DeadlockDetectionTest.java 8193639 solaris +serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris +serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all +serviceability/sa/TestClassDump.java 8193639 solaris +serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris +serviceability/sa/TestDefaultMethods.java 8193639 solaris +serviceability/sa/TestG1HeapRegion.java 8193639 solaris +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all +serviceability/sa/TestType.java 8193639 solaris +serviceability/sa/TestUniverse.java 8193639 solaris ############################################################################# In the above diff, it looks like I deleted these entries: -serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
-serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
-serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all

What I really did was delete the spaces (like most of the other entries
in the hotspot ProblemList):

+serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
+serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
+serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all

I also put those entries in sort order which is why a 'diff -w'
wasn't used...

Thanks, in advance, for any questions, comments or suggestions.

Dan


From chris.plummer at oracle.com  Wed Jul 25 19:32:50 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Jul 2018 12:32:50 -0700
Subject: RFR(XS): 8208205: ProblemList tests that fail due to 'Error attaching to process: Can't create thread_db agent!'
In-Reply-To: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>
References: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>
Message-ID: <7b268608-36c2-8243-9e21-fd06b44f0dab@oracle.com>

Hi Dan,

Looks good to me. Thanks for cleaning up the noise.

Chris


From daniel.daugherty at oracle.com  Wed Jul 25 19:33:31 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 15:33:31 -0400
Subject: RFR(XS): 8208205: ProblemList tests that fail due to 'Error attaching to process: Can't create thread_db agent!'
In-Reply-To: <7b268608-36c2-8243-9e21-fd06b44f0dab@oracle.com>
References: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>
 <7b268608-36c2-8243-9e21-fd06b44f0dab@oracle.com>
Message-ID: <99ca6ff2-d806-825f-e10b-a7f1354755c7@oracle.com>

Chris,

Thanks for the quick review!

Dan


From serguei.spitsyn at oracle.com  Wed Jul 25 20:31:28 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 13:31:28 -0700
Subject: JDK-8199155 : Accessibility issues in jdk.jdi
In-Reply-To:
References:
Message-ID:

Hi Alex,

+1

Thanks,
Serguei


From daniel.daugherty at oracle.com  Wed Jul 25 20:50:34 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 16:50:34 -0400
Subject: RFR(XXS): 8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
Message-ID: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>

Greetings,

I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
I need a single (R)eviewer for the following fix:

  JDK-8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
  https://bugs.openjdk.java.net/browse/JDK-8208226

Here's the diff:

$ hg diff
diff -r ec6d5843068a test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt Wed Jul 25 15:38:37 2018 -0400
+++ b/test/jdk/ProblemList.txt Wed Jul 25 16:44:33 2018 -0400
@@ -834,6 +834,8 @@

 # jdk_jdi

+com/sun/jdi/BasicJDWPConnectionTest.java 8195703 generic-all
+
 com/sun/jdi/RedefineImplementor.sh 8004127 generic-all

 com/sun/jdi/JdbExprTest.sh 8203393 solaris-all

Thanks, in advance, for any questions, comments or suggestions.

Dan


From serguei.spitsyn at oracle.com  Wed Jul 25 21:03:16 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 14:03:16 -0700
Subject: RFR(XXS): 8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
In-Reply-To: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>
References: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>
Message-ID: <5ba15e38-e3bc-1a75-2f5b-0c1d4806a206@oracle.com>

Hi Dan,

Looks good.

Thanks,
Serguei


From daniel.daugherty at oracle.com  Wed Jul 25 21:16:16 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 17:16:16 -0400
Subject: RFR(XXS): 8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
In-Reply-To: <5ba15e38-e3bc-1a75-2f5b-0c1d4806a206@oracle.com>
References: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>
 <5ba15e38-e3bc-1a75-2f5b-0c1d4806a206@oracle.com>
Message-ID:

Serguei,

Thanks for the very fast review!

Dan


From daniil.x.titov at oracle.com  Wed Jul 25 23:11:57 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 25 Jul 2018 16:11:57 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start
Message-ID: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>

Hello,

Please review the change that fixes the test issue. The fix increases
the metaspace size and corrects the path to the class files.

Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
Issue: https://bugs.openjdk.java.net/browse/JDK-8207364

Thanks!

Best regards,
Daniil


From serguei.spitsyn at oracle.com  Wed Jul 25 23:32:07 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 16:32:07 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start
In-Reply-To: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
Message-ID: <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>

Hi Daniil,

It looks good to me.
What is the need to increase the metaspace size?

Thanks,
Serguei


From daniil.x.titov at oracle.com  Thu Jul 26 00:38:13 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 25 Jul 2018 17:38:13 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start
In-Reply-To: <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
Message-ID: <7076F356-EC65-41C4-A8AD-E1C9D53223AC@oracle.com>

Hi Serguei,

On 64 bit machines Java fails to initialize a VM and prints the
"MaxMetaspaceSize is too small." diagnostic if the max metaspace size
is set to 8MB or less (java -XX:MaxMetaspaceSize=8m).

Per open/src/hotspot/share/memory/metaspace.cpp (line 1140) and
open/src/hotspot/share/runtime/globals.hpp (line 1059), MaxMetaspaceSize
on 64 bit machines should be greater than 8MB. Comparing it to the
behavior of Java 8, it seems that these settings were increased since
Java 8, where the metaspace size should be greater than 4MB only.

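To make the 8MB threshold concrete, here is a minimal sketch of the arithmetic on a 64-bit VM. It is an illustration only, not HotSpot code: it assumes compressed class pointers are enabled (the condition under which the cited metaspace.cpp check runs), the class name MetaspaceFloor and its methods are hypothetical, and only the two constants (VIRTUALSPACEMULTIPLIER = 2 and LP64_ONLY(4*M)) are taken from the cited sources.

```java
public class MetaspaceFloor {
    static final long K = 1024;
    static final long M = 1024 * K;

    // From metaspace.cpp: #define VIRTUALSPACEMULTIPLIER 2
    static final long VIRTUAL_SPACE_MULTIPLIER = 2;

    // From globals.hpp: InitialBootClassLoaderMetaspaceSize is LP64_ONLY(4*M)
    static final long INITIAL_BOOT_CLASS_LOADER_METASPACE_SIZE = 4 * M;

    /** Hypothetical helper: min_metaspace_sz on a 64-bit VM, i.e. 2 * 4M. */
    static long minMetaspaceSize() {
        return VIRTUAL_SPACE_MULTIPLIER * INITIAL_BOOT_CLASS_LOADER_METASPACE_SIZE;
    }

    /**
     * Hypothetical helper mirroring the
     * "min_metaspace_sz >= MaxMetaspaceSize" test that exits the VM.
     */
    static boolean isTooSmall(long maxMetaspaceSize) {
        return minMetaspaceSize() >= maxMetaspaceSize;
    }

    public static void main(String[] args) {
        long[] candidatesInMb = {4, 8, 9, 16};
        for (long mb : candidatesInMb) {
            String verdict = isTooSmall(mb * M) ? "too small" : "passes this check";
            System.out.println("-XX:MaxMetaspaceSize=" + mb + "m: " + verdict);
        }
    }
}
```

Under these assumptions any limit at or below 8m trips the check, while larger limits pass it.
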
cat -n open/src/hotspot/share/memory/metaspace.cpp

   880
   881  #define VIRTUALSPACEMULTIPLIER 2
   882

  1135    // Initial virtual space size will be calculated at global_initialize()
  1136    size_t min_metaspace_sz =
  1137        VIRTUALSPACEMULTIPLIER * InitialBootClassLoaderMetaspaceSize;
  1138    if (UseCompressedClassPointers) {
  1139      if ((min_metaspace_sz + CompressedClassSpaceSize) > MaxMetaspaceSize) {
  1140        if (min_metaspace_sz >= MaxMetaspaceSize) {
  1141          vm_exit_during_initialization("MaxMetaspaceSize is too small.");
  1142        } else {
  1143          FLAG_SET_ERGO(size_t, CompressedClassSpaceSize,
  1144                        MaxMetaspaceSize - min_metaspace_sz);
  1145        }
  1146      }

cat -n open/src/hotspot/share/runtime/globals.hpp

  1058    product(size_t, InitialBootClassLoaderMetaspaceSize,                      \
  1059            NOT_LP64(2200*K) LP64_ONLY(4*M),                                  \
  1060            "Initial size of the boot class loader data metaspace")           \
  1061            range(30*K, max_uintx/BytesPerWord)                               \
  1062            constraint(InitialBootClassLoaderMetaspaceSizeConstraintFunc, AfterErgo)\

Best regards,
Daniil

On 7/25/18, 4:32 PM, "serguei.spitsyn at oracle.com" wrote:

    Hi Daniil,

    It looks good to me.
    What is the need to increase the metaspace size?

    Thanks,
    Serguei


    On 7/25/18 16:11, Daniil Titov wrote:
    > Hello,
    >
    > Please review the change that fixes the test issue. The fix increases the metaspace size and corrects the path to the class files.
    >
    > Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
    > Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
    >
    > Thanks!
    >
    > Best regards,
    > Daniil


From serguei.spitsyn at oracle.com  Thu Jul 26 01:38:02 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 18:38:02 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start
In-Reply-To: <7076F356-EC65-41C4-A8AD-E1C9D53223AC@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
 <7076F356-EC65-41C4-A8AD-E1C9D53223AC@oracle.com>
Message-ID:

Daniil,

Thank you for the explanation.

Thanks,
Serguei


From chris.plummer at oracle.com  Thu Jul 26 04:09:14 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Jul 2018 21:09:14 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start
In-Reply-To: <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
Message-ID: <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>

Hi Daniil,

After reading some old comments I added to JDK-6606767, I wonder if
bumping the metaspace size all the way up to 16m is the right thing to
do. It seems the test wants to exhaust the metaspace, so maybe it should
be set to the smallest allowed size. Is the test still exhausting the
metaspace even when it is 16M? Is there a smaller size that will also
work?

Also, regarding the class path, what impact was this bug having on the
test?

thanks,

Chris


From fairoz.matte at oracle.com  Thu Jul 26 04:20:43 2018
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Wed, 25 Jul 2018 21:20:43 -0700 (PDT)
Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]
In-Reply-To: <96be102f-67c4-462c-89ba-a728b8b81ddc@oracle.com>
References: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default>
 <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com>
 <96be102f-67c4-462c-89ba-a728b8b81ddc@oracle.com>
Message-ID: <309fc6ca-8099-4ebc-8502-a4e88e899b8e@default>

Hi Chris and Serguei,

Thanks for the review, I will add the appropriate noreg label.

Thanks,
Fairoz

From: Serguei Spitsyn
Sent: Wednesday, July 25, 2018 11:24 PM
To: Chris Plummer ; Fairoz Matte ; serviceability-dev at openjdk.java.net
Subject: Re: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]

Hi Fairoz,

Looks good to me too.
Thank you for taking care about this backport!

On 7/25/18 10:31, Chris Plummer wrote:

Hi Fairoz,

The changes look good. I'm not sure what the policy is when part of the
(full) backport contains test changes that aren't directly applicable to
8u. You might need some sort of noreg label on the backport CR.

The test test/hotspot/jtreg/vmTestbase/nsk/jdb/eval/eval001 is located
in the VM testbase which is a separate repository for jdk 8.
I agree with Chris, noreg label on the backport CR is probably needed.

Thanks,
Serguei

thanks,

Chris

On 7/25/18 1:23 AM, Fairoz Matte wrote:

Hi,

Kindly review the backport of "JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]" to 8u

Webrev - http://cr.openjdk.java.net/~fmatte/8191948/webrev.00/
JDK 11 bug - https://bugs.openjdk.java.net/browse/JDK-8191948
JDK 11 changeset - http://hg.openjdk.java.net/jdk/jdk11/rev/73c769e0486a
Review thread - http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024405.html

Thanks,
Fairoz

-------------- next part --------------
An HTML attachment was scrubbed...
URL:


From sgehwolf at redhat.com  Thu Jul 26 07:33:32 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 26 Jul 2018 09:33:32 +0200
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
In-Reply-To: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
Message-ID: <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>

On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> Hi,
>
> Could I please get a review of this one-liner change related to jhsdb
> --mixed when attaching to a running Java process? The issue arises when
> threads are in native code and that native code has frame pointers not
> properly preserved. In such a case the SA performs a simple frame
> pointer validity check: ebp >= esp
>
> However, the code of retrieving the value for esp is incorrect in as
> much as it's not in sync with native code in regards to the register
> index:
>
> native code => X86ThreadContext.SP
> Java code => X86ThreadContext.ESP
>
> X86ThreadContext.ESP is never being set by the native code. Since
> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
> returns null, ebp.lessThan(esp) wrongly returns false causing the
> issue. This webrev fixes it by using SP as index on the Java side.
> Thoughts?
>
> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
> bug: https://bugs.openjdk.java.net/browse/JDK-8208091

Anyone willing to review this one-liner?

Thanks,
Severin

> Thanks,
> Severin


From volker.simonis at gmail.com  Thu Jul 26 09:36:15 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 26 Jul 2018 11:36:15 +0200
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
In-Reply-To: <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
 <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>
 <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
 <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>
Message-ID:

Hi Sergey,

thanks for your help, but I've just pushed the fix now.

@Thomas: sorry, I really apologize, but I've just realized that I
forgot to add you as a Reviewer :( I promise to look more carefully
next time.

Regards,
Volker


On Tue, Jul 24, 2018 at 6:01 PM, serguei.spitsyn at oracle.com wrote:
> Hi Ralf,
>
> I think, you have to consider it reviewed.
> Sorry, I was not clear: no new webrev is needed.
>
> Do you need a sponsor for the push?
>
> Thanks,
> Serguei
>
>
>
> On 7/24/18 06:32, Schmelter, Ralf wrote:
>>
>> Hi all,
>>
>> here is the updated webrev with the fixed copyright:
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/
>>
>> Best regards,
>> Ralf
>>
>> -----Original Message-----
>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>> Sent: Friday, 20 July 2018 23:04
>> To: Chris Plummer ; Schmelter, Ralf
>> ; serviceability-dev at openjdk.java.net; Stuefe,
>> Thomas
>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
>> prevent quadratic runtime behavior
>>
>> On 7/20/18 13:44, Chris Plummer wrote:
>>>
>>> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> Hi Ralf,
>>>>
>>>>
>>>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>>>>
>>>>> Hi Sergue,
>>>>>
>>>>> I've updated the webrev:
>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>>>
>>>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not
>>>> 2008.
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html
>>>>
>>>>
>>>>   72             if (newDepth == -1_000) {
>>>>   73                 // Pop some frames so there is room on the stack for the
>>>>   74                 // call (including println()).
>>>>   75                 notifyRecursionEnded();
>>>>   76             }
>>>>
>>>> I have a concern about the potential issue mentioned in the comment above.
>>>> Should a StackOverflowError be expected here?
>>>>
>>>>   79         } catch (StackOverflowError e) {
>>>>   80             // Use negative depth to indicate the recursion has ended.
>>>>   81             return -1;
>>>>   82         }
>>>>
>>>> What is going to happen if the StackOverflowError was really caught
>>>> above?
>>>
>>> The SOE is really caught in the above code. It returns -1, and starts
>>> the unwinding of the stack. After 1000 frames have been popped via
>>> returns, notifyRecursionEnded() will be called. The pops are so
>>> notifyRecursionEnded() can be called without worry of another SOE.
>>
>> Got it, thanks Chris.
>>
>> So, I'm Okay with the fix assuming the copyright year is fixed.
>> >> Thanks, >> Serguei > > From volker.simonis at gmail.com Thu Jul 26 09:43:34 2018 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 26 Jul 2018 11:43:34 +0200 Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior In-Reply-To: References: <709161f438f848b0af5fb079c9c0242a@sap.com> <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com> <21e17c666ac04930a0e4bb4869e989da@sap.com> <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com> <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com> <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com> <6de6362944f84740b80abb22cbbea872@sap.com> <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com> <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com> <92dcce7000a94cf89ae2169cb1f843f2@sap.com> <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com> Message-ID: Oh my good! And I've also forgot to add Ralf as a Contributer :(:(:( I really desperately need a vacation! Sorry Ralf, Volker On Thu, Jul 26, 2018 at 11:36 AM, Volker Simonis wrote: > Hi Sergey, > > thanks for your help, but I've just pushed the fix now. > > @Thomas: sorry, I really apologize, but I've just realized that I've > forgot to add you as a Reviewer :( I'll promise to look more carefully > next time. > > Regards, > Volker > > > On Tue, Jul 24, 2018 at 6:01 PM, serguei.spitsyn at oracle.com > wrote: >> Hi Ralf, >> >> I think, you have to consider it reviewed. >> Sorry, I was not clear no new webrev is needed. >> >> Do you need a sponsor for the push? >> >> Thanks, >> Serguei >> >> >> >> On 7/24/18 06:32, Schmelter, Ralf wrote: >>> >>> Hi all, >>> >>> here is the update webref with the fixed copyright: >>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/ >>> >>> Best regards, >>> Ralf >>> >>> -----Original Message----- >>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] >>> Sent: Freitag, 20. 
Juli 2018 23:04 >>> To: Chris Plummer ; Schmelter, Ralf >>> ; serviceability-dev at openjdk.java.net; Stuefe, >>> Thomas >>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to >>> prevent quadratic runtime behavior >>> >>> On 7/20/18 13:44, Chris Plummer wrote: >>>> >>>> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote: >>>>> >>>>> Hi Ralf, >>>>> >>>>> >>>>> On 7/20/18 07:28, Schmelter, Ralf wrote: >>>>>> >>>>>> Hi Sergue, >>>>>> >>>>>> I?ve updated the webref: >>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/ >>>>> >>>>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not >>>>> 2008. >>>>> >>>>> >>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html >>>>> >>>>> >>>>> 72 if (newDepth == -1_000) { >>>>> 73 // Pop some frames so there is room on the stack >>>>> for the >>>>> 74 // call (including println()). >>>>> 75 notifyRecursionEnded(); >>>>> 76 } >>>>> >>>>> I have a concern on potential issue mentioned in the comment above. >>>>> Should a StackOverflowError be expected here? >>>>> >>>>> 79 } catch (StackOverflowError e) { >>>>> 80 // Use negative depth to indicate the recursion has >>>>> ended. >>>>> 81 return -1; >>>>> 82 } >>>>> >>>>> What is going to happen if the StackOverflowError was really caught >>>>> above? >>>> >>>> The SOE is really caught in the above code. I returns -1, and starts >>>> the unwinding of the stack. After 1000 frames have been popped via >>>> returns, notifyRecursionEnded() will be called. The pops are so >>>> notifyRecursionEnded() can be called without worry of another SOE. >>> >>> Got it, thanks Chris. >>> >>> So, I'm Okay with the fix assuming the copyright year is fixed. 
>>> >>> Thanks, >>> Serguei >> >> From yasuenag at gmail.com Thu Jul 26 12:30:44 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 26 Jul 2018 21:30:44 +0900 Subject: PING: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is working In-Reply-To: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com> References: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com> Message-ID: <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com> PING: Could you review it? > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ Yasumasa On 2018/07/19 23:03, Yasumasa Suenaga wrote: > Hi all, > > Please review this webrev. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8207843 > webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ > > I encountered AssertionFailure when I attached HSDB to the process which is working with ZGC as below: > > sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap > at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32) > at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448) > at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173) > at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741) > at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70) > at java.base/java.lang.Thread.run(Thread.java:832) > > ObjectHeap#collectLiveRegions() branches by instance type of CollectedHeap. However it does not support ZCollectedHeap. > So I add ZCollectedHeap to it and add some methods to iterate ZPageTable.
> > > Thanks, > > Yasumasa From yasuenag at gmail.com Thu Jul 26 13:52:10 2018 From: yasuenag at gmail.com (Yasumasa Suenaga) Date: Thu, 26 Jul 2018 22:52:10 +0900 Subject: ZGC: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is working In-Reply-To: <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com> References: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com> <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com> Message-ID: <06ceb864-bca5-d89c-c54e-fbfce3585066@gmail.com> CC'ing to hotspot-gc-dev On 2018/07/26 21:30, Yasumasa Suenaga wrote: > PING: Could you review it? > >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ > > > Yasumasa > > > On 2018/07/19 23:03, Yasumasa Suenaga wrote: >> Hi all, >> >> Please review this webrev. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8207843 >> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/ >> >> I encountered AssertionFailure when I attached HSDB to the process which is working with ZGC as below: >> >> sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap >> at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32) >> at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448) >> at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173) >> at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741) >> at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70) >> at java.base/java.lang.Thread.run(Thread.java:832) >> >> ObjectHeap#collectLiveRegions() branches by instance type of CollectedHeap. However it does not support ZCollectedHeap. >> So I add ZCollectedHeap to it and add some methods to iterate ZPageTable.
>> >> >> Thanks, >> >> Yasumasa From thomas.schatzl at oracle.com Thu Jul 26 14:06:42 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 26 Jul 2018 16:06:42 +0200 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions In-Reply-To: References: Message-ID: Hi Paul, Erik may not have time in the next few months to review such a large change. But it would also be better to do the changes in steps for other reviewers. Also see below. On Mon, 2018-07-23 at 21:33 +0000, Hohensee, Paul wrote: > Please review. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8196989 I may have missed this in the previous discussion (which has been a while), but has there been any discussion about a "Free (Region) Space" for the committed but free regions? It seems a bit random to assign free regions to the "old space", seemingly just a repeat of the current behavior (where everything has been put into "old gen"). Also, imho the second survivor space should preferably be dropped completely. :) > CSR: https://bugs.openjdk.java.net/browse/JDK-8196991 > Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00 > > This webrev is marked "L" because it's a behavioral change (CSR in > draft state, may I have a review of that too please?) and because the > test change fanout is large. The actual code changes are "M". > > Passes the submit repo, Hotspot tier1, the JFR gc event tests and any > other test set with "gc" or "serviceability" in the test directory > name. I found it difficult to verify the accuracy of the reported > values other than manually, since they can vary from run to run of > the same program. I'd appreciate suggestions for how to go about > writing accuracy tests. > > I set out originally to revamp only the MXBeans, but decided it would > be incomplete if I didn't include the jstat counters and the output > of the GC.heap_info jcmd option.
I can separate the latter two into > their own RFEs, but I find it easier to understand it all in a single > webrev and hope the reviewers will too. > > The basic approach is to add the new memory pools and collectors, the > new jstat counters, and an archive region counter that stands in for > an actual archive region set. HeapRegionSets are disjoint, so One option would be to add a HeapRegionSet tailored for archives that does not check the disjoint-criteria (it is superficially used for verification only anyway) - we already have special classes/flags for different kinds of regions (humongous/free/old) in the HeapRegionSet hierarchy. > initially I tried to create a first-class archive region set (on the > same level as the humongous region set), but that idea foundered on > the fact that there's too much code I don't fully understand that > depends on archive regions being in the existing old region set. Probably to simplify the implementation of archive regions :) This is another option, and does not look too bad actually, we only need to check and change all HeapRegion::is_old() or HeapRegion::is_old_or_humongous() checks. Now we only need a good name for is_old_or_archive_or_humongous() because that one is a bit lengthy :) > Externally (i.e., in the MXBeans and the jstat counters), however, > the old region set doesn't include archive regions (unless running in > legacy mode). > > I used CMS's TraceCMSMemoryManagerStats class as the model for > TraceConcMemoryManagerStats, which latter collects statistics on > concurrent cycles. There are two STW pauses in each concurrent cycle: > they are recorded separately and count as two sun.gc.collector.2 > events. I would like to move serviceability code away from G1CollectedHeap.h/cpp as much as possible; e.g. it would be very nice to make G1MonitoringSupport the owner of all the serviceability related data. Also the _use_legacy_monitoring member should probably move there too.
> The humongous and archive space committed and used values are always > identical, This is because, for some reason, G1 counts the memory filled with filler objects as "used". Other collectors don't. > hence they are always 100% used. You may have noticed that just recently we got a bug (https://bugs.openjdk.java.net/browse/JDK-8207200) filed against the G1 MXBeans because of races in the code particularly code to be not-racy. The reason is the really weird calculation of used/committed for eden space/survivor space/old gen and that the precondition written down in G1MonitoringSupport::recalculate_sizes() does not hold. G1 MemoryMXBeans basically fabricate some numbers as you might have noticed :), so in addition to fixing that issue with the race I am still working on improving the accuracy of the used values. Also, in the course of this change I am considering removing some other backwards-bending in returned values for G1 (the mentioned and e.g. funky stuff like assuming that adding together max-capacities of the pools gives you total heap size). I also have a preliminary webrev for that at http://cr.openjdk.java.net/~tschatzl/8207200/webrev/ which unfortunately clashes a lot with your changes. The reason why it is a single webrev is because I am not finished yet - I tend to split it up in parts for much better reviewing at the very end only. Could we work together on first refactoring the code before adding new kinds of spaces to the MXBeans? Looking at this change and mine roughly the following issues would need to be resolved first: - find a solution for archive regions as suggested above :) At the moment, without doing the change, I would tend to make archive regions separate from old regions. - move serviceability stuff as much as possible to g1MonitoringSupport - clean up MemoryPool, remove duplicate information - provide and return sane memory pool used/committed values to the MXBeans - clean up G1MonitoringSupport, e.g.
avoid "*used/*committed" variables for every single memory pool. Use MemoryUsage structs for them. Make reading of memory pool information atomic wrt to its readers (note that I think it is currently just impossible to get consistent output for other statistics like jstat) - that's JDK-8207200. - add whatever serviceability stuff for the new pools/jstat/* in steps. > The revised output of jcmd GC.heap_info is in > G1CollectedHeap::print_on(). > I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing > the result type of young_list_target_length() from size_t to uint, > which latter is the type of the _young_list_target_length member. > I updated the copyright date in > src/hotspot/share/services/memoryService.hpp to 2018, as I neglected > to do so in a previous push. Comments? Thanks, Thomas From jcbeyler at google.com Thu Jul 26 16:53:48 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 26 Jul 2018 09:53:48 -0700 Subject: RFR (XS) 8208251: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails intermittently on Linux-X64 Message-ID: Hi all, As we fixed the HeapMonitorTest to not fail from time to time, there seems to be the same issue and risk in HeapMonitorGCTest. Could someone review the similar fix: Webrev: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8208251 The risk is that the last interval is too big and no sampled object is live after the allocation method. If a GC happens before the check for sample code, it is possible no live objects still exist. The solution is to reduce the sampling interval to make it highly unlikely for no samples to happen in any allocation iteration, keeping at least one sampled object live. But also check the GC'd objects in the system in case they did actually all already get GC'd. Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From daniil.x.titov at oracle.com Thu Jul 26 16:56:08 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 26 Jul 2018 09:56:08 -0700 Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start In-Reply-To: <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com> References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com> <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com> <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com> Message-ID: <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com> Hi Chris, The smallest allowed metaspace size for the test is 9MB. In both cases (when the metaspace size is set to 9Mb and to 16 Mb) the expected OutOfMemoryError is thrown and the test passes. I did update the patch to use the smallest settings. Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02 The test uses a custom class loader to load a class from the byte array read from the predefined specified class file. The incorrect path passed to the test made the test fail to read this class file. 
java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class' at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74) at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89) at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.base/java.lang.reflect.Method.invoke(Method.java:566) at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) at java.base/java.lang.Thread.run(Thread.java:834) Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory) at java.base/java.io.FileInputStream.open0(Native Method) at java.base/java.io.FileInputStream.open(FileInputStream.java:219) at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157) at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64) ... 8 more Best regards, Daniil On 7/25/18, 9:09 PM, "Chris Plummer" wrote: Hi Daniil, After reading some old comments I added to JDK-6606767, I wonder if bumping the metaspace size all the way up to 16m is the right thing to do. It seems the test wants to exhaust the metaspace, so maybe it should be set to the smallest allowed size. Is the test still exhausting the metaspace even when it is 16M? Is there a smaller size that will also work? Also, regarding the class path, what impact was this bug having on the test? thanks, Chris On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote: > Hi Daniil, > > It looks good to me. > What is the need to increase the metaspace size?
> > Thanks, > Serguei > > > On 7/25/18 16:11, Daniil Titov wrote: >> Hello, >> >> Please review the change that fix the test issue. The fix increases >> the metaspace size and corrects the path to the class files. >> >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/ >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364 >> >> Thanks! >> >> Best regards, >> Daniil >> >> >> > From jcbeyler at google.com Thu Jul 26 16:58:24 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 26 Jul 2018 09:58:24 -0700 Subject: RFR 8208303: Track JNI failures and fail tests Message-ID: Hi all, The tests in the HeapMonitor subsystem has a lot of JNI calls. There is a need for verification and testing if anything in the JNI subsystem failed unexpectedly. Here is a webrev that tracks if a JNI call does fail and the tests will fail if any JNI call does fail. Could I have a few reviews please for: Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Thu Jul 26 16:59:34 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 26 Jul 2018 09:59:34 -0700 Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start In-Reply-To: <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com> References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com> <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com> <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com> <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com> Message-ID: <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com> Thanks for the explanation. Update looks good. Chris On 7/26/18 9:56 AM, Daniil Titov wrote: > Hi Chris, > > The smallest allowed metaspace size for the test is 9MB. In both cases (when the metaspace size is set to 9Mb and to 16 Mb) the expected OutOfMemoryError is thrown and the test passes. 
> > I did update the patch to use the smallest settings. > > Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02 > > > The test uses a custom class loader to load a class from the byte array read from the predefined specified class file. The incorrect path passed to the test made the test fail to read this class file. > > > java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class' > at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74) > at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89) > at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.lang.reflect.Method.invoke(Method.java:566) > at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) > at java.base/java.lang.Thread.run(Thread.java:834) > Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory) > at java.base/java.io.FileInputStream.open0(Native Method) > at java.base/java.io.FileInputStream.open(FileInputStream.java:219) > at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157) > at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64) > ... 8 more > > Best regards, > Daniil > > On 7/25/18, 9:09 PM, "Chris Plummer" wrote: > > Hi Daniil, > > After reading some old comments I added to JDK-6606767, I wonder if > bumping the metaspace size all the way up to 16m is the right thing to > do. It seems the test wants to exhaust the metaspace, so maybe it should > be set to the smallest allowed size.
Is the test still exhausting the > metaspace even when it is 16M. Is there a smaller size that will also work? > > Also, regarding the class path, what impact was this bug having on the test? > > thanks, > > Chris > > On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote: > > Hi Daniil, > > > > It looks good to me. > > What is the need to increase the metaspace size? > > > > Thanks, > > Serguei > > > > > > On 7/25/18 16:11, Daniil Titov wrote: > >> Hello, > >> > >> Please review the change that fix the test issue. The fix increases > >> the metaspace size and corrects the path to the class files. > >> > >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/ > >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364 > >> > >> Thanks! > >> > >> Best regards, > >> Daniil > >> > >> > >> > > > > > > > From serguei.spitsyn at oracle.com Thu Jul 26 17:00:36 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 10:00:36 -0700 Subject: RFR (XS) 8208251: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails intermittently on Linux-X64 In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Thu Jul 26 17:01:03 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 10:01:03 -0700 Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start In-Reply-To: <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com> References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com> <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com> <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com> <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com> <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com> Message-ID: +1 Thanks, Serguei On 7/26/18 09:59, Chris Plummer wrote: > Thanks for the explanation. Update looks good. 
> > Chris > > On 7/26/18 9:56 AM, Daniil Titov wrote: >> Hi Chris, >> >> The smallest allowed metaspace size for the test is 9MB. In both >> cases (when the metaspace size is set to 9MB and to 16MB) the >> expected OutOfMemoryError is thrown and the test passes. >> >> I did update the patch to use the smallest settings. >> >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02 >> >> >> The test uses a custom class loader to load a class from the byte >> array read from the predefined specified class file. The incorrect >> path passed to the test made the test fail to read this class file. >> >> java.lang.RuntimeException: Exception when reading file >> './bin/nsk/jvmti/ResourceExhausted/Helper.class' >> at >> nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74) >> at >> nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89) >> at >> nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129) >> at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native >> Method) >> at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at >> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:566) >> at >> com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115) >> at java.base/java.lang.Thread.run(Thread.java:834) >> Caused by: java.io.FileNotFoundException: >> ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or >> directory) >> at java.base/java.io.FileInputStream.open0(Native Method) >> at java.base/java.io.FileInputStream.open(FileInputStream.java:219) >> at >> java.base/java.io.FileInputStream.<init>(FileInputStream.java:157) >> at >> nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64) >> ...
8 more >> >> Best regards, >> Daniil >> >> On 7/25/18, 9:09 PM, "Chris Plummer" wrote: >> >> Hi Daniil, >> After reading some old comments I added to JDK-6606767, I >> wonder if >> bumping the metaspace size all the way up to 16m is the right >> thing to >> do. It seems the test wants to exhaust the metaspace, so maybe >> it should >> be set to the smallest allowed size. Is the test still >> exhausting the >> metaspace even when it is 16M? Is there a smaller size that will >> also work? >> Also, regarding the class path, what impact was this bug >> having on the test? >> thanks, >> Chris >> On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote: >> > Hi Daniil, >> > >> > It looks good to me. >> > What is the need to increase the metaspace size? >> > >> > Thanks, >> > Serguei >> > >> > >> > On 7/25/18 16:11, Daniil Titov wrote: >> >> Hello, >> >> >> >> Please review the change that fixes the test issue. The fix >> increases >> >> the metaspace size and corrects the path to the class files. >> >> >> >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/ >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364 >> >> >> >> Thanks! >> >> >> >> Best regards, >> >> Daniil >> >> >> >> >> >> >> > >> >> > > From sharath.ballal at oracle.com Thu Jul 26 17:04:39 2018 From: sharath.ballal at oracle.com (Sharath Ballal) Date: Thu, 26 Jul 2018 10:04:39 -0700 (PDT) Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> Message-ID: <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> Changes look good, Severin.
I am not a reviewer though, so you still need a Reviewer to review. Thanks, Sharath -----Original Message----- From: Severin Gehwolf [mailto:sgehwolf at redhat.com] Sent: Thursday, July 26, 2018 1:04 PM To: serviceability-dev Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote: > Hi, > > Could I please get a review of this one-liner change related to jhsdb > --mixed when attaching to a running Java process? The issue arises > when threads are in native code and that native code has frame > pointers not properly preserved. In such a case the SA performs a > simple frame pointer valididy check: ebp >= esp > > However, the code of retrieving the value for esp is incorrect in as > much as it's not in sync with native code in regards to the register > index: > > native code => X86ThreadContext.SP > Java code => X86ThreadContext.ESP > > X86ThreadContext.ESP is never being set by the native code. Since > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then > returns null, ebp.lessThan(esp) wrongly returns false causing the > issue. This webrev fixes it by using SP as index on the Java side. > Thoughts? > > webrev: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/ > bug: https://bugs.openjdk.java.net/browse/JDK-8208091 Anyone willing to review this one-liner? Thanks, Severin > Thanks, > Severin From serguei.spitsyn at oracle.com Thu Jul 26 17:04:55 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 10:04:55 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: References: Message-ID: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> An HTML attachment was scrubbed... 
URL: From sgehwolf at redhat.com Thu Jul 26 17:11:30 2018 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Thu, 26 Jul 2018 19:11:30 +0200 Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> Message-ID: <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote: > Changes looks good Severin. Thanks for the review, Sharath! > I am not a reviewer though, so you still need a Reviewer to review. Anyone? Thanks, Severin > -----Original Message----- > From: Severin Gehwolf [mailto:sgehwolf at redhat.com] > Sent: Thursday, July 26, 2018 1:04 PM > To: serviceability-dev > Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 > > On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote: > > Hi, > > > > Could I please get a review of this one-liner change related to jhsdb > > --mixed when attaching to a running Java process? The issue arises > > when threads are in native code and that native code has frame > > pointers not properly preserved. In such a case the SA performs a > > simple frame pointer valididy check: ebp >= esp > > > > However, the code of retrieving the value for esp is incorrect in as > > much as it's not in sync with native code in regards to the register > > index: > > > > native code => X86ThreadContext.SP > > Java code => X86ThreadContext.ESP > > > > X86ThreadContext.ESP is never being set by the native code. Since > > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then > > returns null, ebp.lessThan(esp) wrongly returns false causing the > > issue. This webrev fixes it by using SP as index on the Java side. > > Thoughts? 
> > > > webrev: > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/ > > bug: https://bugs.openjdk.java.net/browse/JDK-8208091 > > Anyone willing to review this one-liner? > > Thanks, > Severin > > > Thanks, > > Severin > > From jcbeyler at google.com Thu Jul 26 17:40:09 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 26 Jul 2018 10:40:09 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> Message-ID: Hi Serguei, As I was looking at another test bug ( https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal for that bug is to have a JNI call to FatalError to provoke a failure. If we went down that route, this webrev is simpler, no? Instead of setting failure_status and checking it later; just fail fatally and be done with it, no? That way, the tests in Java land don't have to be changed actually, no? What would we prefer for tests? Remember there was a failure and test it later or fail fast via JNI's FatalError? Thanks, Jc On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > It looks good to me. > > Thanks, > Serguei > > > On 7/26/18 09:58, JC Beyler wrote: > > Hi all, > > The tests in the HeapMonitor subsystem has a lot of JNI calls. There is a > need for verification and testing if anything in the JNI subsystem failed > unexpectedly. > > Here is a webrev that tracks if a JNI call does fail and the tests will > fail if any JNI call does fail. > > Could I have a few reviews please for: > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 > > Thanks, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... 
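The two options weighed in the message above (remember that a JNI call failed and have the test check it later, or abort at the first failed call) can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in the webrev: the class and method names are hypothetical, the "calls" are simulated booleans, and an unchecked exception stands in for JNI's FatalError, which in native code would abort the VM rather than throw.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical names, not taken from the webrev.
public class JniFailureStrategies {

    // Option 1: remember that something failed; the test must query it later.
    static final class DeferredStatus {
        private final List<String> failures = new ArrayList<>();

        void check(boolean ok, String call) {
            if (!ok) {
                failures.add(call);
            }
        }

        boolean failed() {
            return !failures.isEmpty();
        }

        List<String> failures() {
            return failures;
        }
    }

    // Option 2: fail fast. In native code this would be a call to JNI's
    // FatalError; here an unchecked exception stands in for the abort.
    static void checkOrAbort(boolean ok, String call) {
        if (!ok) {
            throw new IllegalStateException("JNI call failed: " + call);
        }
    }

    public static void main(String[] args) {
        DeferredStatus status = new DeferredStatus();
        status.check(true, "FindClass");
        status.check(false, "GetMethodID"); // simulated failure
        // The Java side of the test has to remember to look at the status.
        System.out.println("deferred failed = " + status.failed());

        try {
            checkOrAbort(false, "GetMethodID"); // simulated failure
        } catch (IllegalStateException e) {
            // With fail-fast, the test needs no extra check of its own.
            System.out.println("fail-fast: " + e.getMessage());
        }
    }
}
```

The trade-off the sketch tries to show: with the deferred variant every test has to query the recorded status at the end, whereas with the fail-fast variant the Java side of the tests can stay unchanged.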
URL: From serguei.spitsyn at oracle.com Thu Jul 26 17:45:56 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 10:45:56 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Thu Jul 26 18:05:27 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 26 Jul 2018 14:05:27 -0400 Subject: RFR (XS) 8208251: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails intermittently on Linux-X64 In-Reply-To: References: Message-ID: On 7/26/18 12:53 PM, JC Beyler wrote: > Hi all, > > As we fixed the HeapMonitorTest to not fail from time to time, there > seems to be the same issue and risk in HeapMonitorGCTest. Could > someone review the similar fix: > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.00/ > test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCTest.java No comments. Thumbs up! Perhaps consider filing a bug to refactor HeapMonitorTest and HeapMonitorGCTest.java so that they share code... then we won't have to fix the same bug in two places... Dan > Bug: https://bugs.openjdk.java.net/browse/JDK-8208251 > > The risk is that the last interval is too big and no sampled object is > live after the allocation method. If a GC happens before the check for > sample code, it is possible no live objects still exist. > > The solution is to reduce the sampling interval to make it highly > unlikely for no samples to happen in any allocation iteration, keeping > at least one sampled object live. But also check the GC'd objects in > the system in case they did actually all already get GC'd. > > Thanks, > Jc -------------- next part -------------- An HTML attachment was scrubbed...
URL: From serguei.spitsyn at oracle.com Thu Jul 26 18:11:37 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 11:11:37 -0700 Subject: RFR (XS) 8208251: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails intermittently on Linux-X64 In-Reply-To: References: Message-ID: <92e82eb8-02fb-a6a3-5bd9-f5f8d85591c6@oracle.com> An HTML attachment was scrubbed... URL: From jcbeyler at google.com Thu Jul 26 18:57:10 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 26 Jul 2018 11:57:10 -0700 Subject: RFR (XS) 8208251: serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails intermittently on Linux-X64 In-Reply-To: <92e82eb8-02fb-a6a3-5bd9-f5f8d85591c6@oracle.com> References: <92e82eb8-02fb-a6a3-5bd9-f5f8d85591c6@oracle.com> Message-ID: Here you are Serguei: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.01/ Thanks for the push! Jc Ps: @Daniel I created the issue and assigned it to me ( https://bugs.openjdk.java.net/browse/JDK-8208352) On Thu, Jul 26, 2018 at 11:11 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Jc, > > Could you send me a patch? > I'll sponsor the push. > > Thanks, > Serguei > > > On 7/26/18 11:05, Daniel D. Daugherty wrote: > > On 7/26/18 12:53 PM, JC Beyler wrote: > > Hi all, > > As we fixed the HeapMonitorTest to not fail from time to time, there seems > to be the same issue and risk in HeapMonitorGCTest. Could someone review > the similar fix: > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.00/ > > > > test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCTest.java > No comments. > > Thumbs up! > > Perhaps consider filing a bug to refactor HeapMonitorTest and > HeapMonitorGCTest.java so that they share code... then we won't > have to fix the same bug in two places... 
> > Dan > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8208251 > > The risk is that the last interval is too big and no sampled object is > live after the allocation method. If a GC happens before the check for > sample code, it is possible no live objects still exist. > > The solution is to reduce the sampling interval to make it highly unlikely > for no samples to happen in any allocation iteration, keeping at least one > sampled object live. But also check the GC'd objects in the system in case > they did actually all already get GC'd. > > Thanks, > Jc > > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From jcbeyler at google.com Thu Jul 26 19:03:37 2018 From: jcbeyler at google.com (JC Beyler) Date: Thu, 26 Jul 2018 12:03:37 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> Message-ID: Hi all, With the FatalError idea, here is the webrev to consider, note it no longer changes the tests. If a JNI call fails, then we call FatalError. Let me know what you think: Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/ Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 Thanks! Jc On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com < serguei.spitsyn at oracle.com> wrote: > Hi Jc, > > Good idea. > I was thinking about something like this. > > Thanks, > Serguei > > > On 7/26/18 10:40, JC Beyler wrote: > > Hi Serguei, > > As I was looking at another test bug ( > https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal for that > bug is to have a JNI call to FatalError to provoke a failure. > > If we went down that route, this webrev is simpler, no? Instead of setting > failure_status and checking it later; just fail fatally and be done with > it, no? That way, the tests in Java land don't have to be changed actually, > no? > > What would we prefer for tests? 
Remember there was a failure and test it > later or fail fast via JNI's FatalError? > > Thanks, > Jc > > > On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com < > serguei.spitsyn at oracle.com> wrote: > >> Hi Jc, >> >> It looks good to me. >> >> Thanks, >> Serguei >> >> >> On 7/26/18 09:58, JC Beyler wrote: >> >> Hi all, >> >> The tests in the HeapMonitor subsystem has a lot of JNI calls. There is a >> need for verification and testing if anything in the JNI subsystem failed >> unexpectedly. >> >> Here is a webrev that tracks if a JNI call does fail and the tests will >> fail if any JNI call does fail. >> >> Could I have a few reviews please for: >> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 >> >> Thanks, >> Jc >> >> >> > > -- > > Thanks, > Jc > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Thu Jul 26 19:08:02 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 26 Jul 2018 15:08:02 -0400 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> Message-ID: <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> Please make sure this fix is well tested in Mach5 prior to pushing. In particular, I'm focused on reducing the noise in Mach5 tier[1-3] so adding any new failures there will make me grumpy :-) Dan On 7/26/18 3:03 PM, JC Beyler wrote: > Hi all, > > With the FatalError idea, here is the webrev to consider, note it no > longer changes the tests. If a JNI call fails, then we call FatalError. > > Let me know what you think: > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/ > > Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 > > Thanks! > Jc > > On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com > > wrote: > > Hi Jc, > > Good idea. 
> I was thinking about something like this. > > Thanks, > Serguei > > > On 7/26/18 10:40, JC Beyler wrote: >> Hi Serguei, >> >> As I was looking at another test bug >> (https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal >> for that bug is to have a JNI call to FatalError to provoke a >> failure. >> >> If we went down that route, this webrev is simpler, no? Instead >> of setting failure_status and checking it later; just fail >> fatally and be done with it, no? That way, the tests in Java land >> don't have to be changed actually, no? >> >> What would we prefer for tests? Remember there was a failure and >> test it later or fail fast via JNI's FatalError? >> >> Thanks, >> Jc >> >> >> On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com >> wrote: >> >> Hi Jc, >> >> It looks good to me. >> >> Thanks, >> Serguei >> >> >> On 7/26/18 09:58, JC Beyler wrote: >>> Hi all, >>> >>> The tests in the HeapMonitor subsystem has a lot of JNI >>> calls. There is a need for verification and testing if >>> anything in the JNI subsystem failed unexpectedly. >>> >>> Here is a webrev that tracks if a JNI call does fail and the >>> tests will fail if any JNI call does fail. >>> >>> Could I have a few reviews please for: >>> Webrev: >>> http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/ >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 >>> >>> Thanks, >>> Jc >> >> >> >> -- >> >> Thanks, >> Jc > > > > -- > > Thanks, > Jc -------------- next part -------------- An HTML attachment was scrubbed... 
From serguei.spitsyn at oracle.com Thu Jul 26 19:14:03 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 12:14:03 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> Message-ID: <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com> An HTML attachment was scrubbed... From daniel.daugherty at oracle.com Thu Jul 26 19:15:08 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Thu, 26 Jul 2018 15:15:08 -0400 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com> References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com> Message-ID: <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com> We entered RDP2 today (07.26). So only P1 and P2 bug fixes allowed. Dan On 7/26/18 3:14 PM, serguei.spitsyn at oracle.com wrote: > Yes, of course it has to be well tested before the push. > Does it make sense to plan it to push to 11 (after the testing is done)? > > Thanks, > Serguei > > > On 7/26/18 12:08, Daniel D. Daugherty wrote: >> Please make sure this fix is well tested in Mach5 prior to pushing. >> In particular, I'm focused on reducing the noise in Mach5 tier[1-3] >> so adding any new failures there will make me grumpy :-) >> >> Dan >> >> >> On 7/26/18 3:03 PM, JC Beyler wrote: >>> Hi all, >>> >>> With the FatalError idea, here is the webrev to consider, note it no >>> longer changes the tests. If a JNI call fails, then we call FatalError. >>> >>> Let me know what you think: >>> >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/ >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 >>> >>> Thanks! 
>>> Jc >>> >>> On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com >>> wrote: >>> >>> Hi Jc, >>> >>> Good idea. >>> I was thinking about something like this. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/26/18 10:40, JC Beyler wrote: >>>> Hi Serguei, >>>> >>>> As I was looking at another test bug >>>> (https://bugs.openjdk.java.net/browse/JDK-8191519); the >>>> proposal for that bug is to have a JNI call to FatalError to >>>> provoke a failure. >>>> >>>> If we went down that route, this webrev is simpler, no? Instead >>>> of setting failure_status and checking it later; just fail >>>> fatally and be done with it, no? That way, the tests in Java >>>> land don't have to be changed actually, no? >>>> >>>> What would we prefer for tests? Remember there was a failure >>>> and test it later or fail fast via JNI's FatalError? >>>> >>>> Thanks, >>>> Jc >>>> >>>> >>>> On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com >>>> wrote: >>>> >>>> Hi Jc, >>>> >>>> It looks good to me. >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/26/18 09:58, JC Beyler wrote: >>>>> Hi all, >>>>> >>>>> The tests in the HeapMonitor subsystem has a lot of JNI >>>>> calls. There is a need for verification and testing if >>>>> anything in the JNI subsystem failed unexpectedly. >>>>> >>>>> Here is a webrev that tracks if a JNI call does fail and >>>>> the tests will fail if any JNI call does fail. >>>>> >>>>> Could I have a few reviews please for: >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/ >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 >>>>> >>>>> Thanks, >>>>> Jc >>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> Jc >>> >>> >>> >>> -- >>> >>> Thanks, >>> Jc >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Thu Jul 26 19:17:15 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Thu, 26 Jul 2018 12:17:15 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com> References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com> <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Thu Jul 26 19:52:30 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 26 Jul 2018 12:52:30 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com> References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com> <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com> Message-ID: <3a2d66d6-e4f8-51ad-3552-dfe3748dff89@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Thu Jul 26 21:07:00 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Thu, 26 Jul 2018 14:07:00 -0700 Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> Message-ID: <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com> Hi Severin, I had looked at this review when it came out, but was hesitant to ok it because I really don't know this code at all. If you can get another reviewer who does know the code, then I'll approve it. This only impacts 32-bit, right? 
If so, keep in mind that it won't get tested by Oracle testing, including the submit repo, so make sure you do thorough testing. Also, why is there any code being executed that was not compiled with -fno-omit-frame-pointer? The description in the CR just shows a simple java program reproducing this, so all the mixed stack traces belong to the JVM and libs, and I thought we made sure to compile all of them with -fno-omit-frame-pointer. thanks, Chris On 7/26/18 10:11 AM, Severin Gehwolf wrote: > On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote: >> Changes looks good Severin. > Thanks for the review, Sharath! > >> I am not a reviewer though, so you still need a Reviewer to review. > Anyone? > > Thanks, > Severin > >> -----Original Message----- >> From: Severin Gehwolf [mailto:sgehwolf at redhat.com] >> Sent: Thursday, July 26, 2018 1:04 PM >> To: serviceability-dev >> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 >> >> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote: >>> Hi, >>> >>> Could I please get a review of this one-liner change related to jhsdb >>> --mixed when attaching to a running Java process? The issue arises >>> when threads are in native code and that native code has frame >>> pointers not properly preserved. In such a case the SA performs a >>> simple frame pointer validity check: ebp >= esp >>> >>> However, the code of retrieving the value for esp is incorrect in as >>> much as it's not in sync with native code in regards to the register >>> index: >>> >>> native code => X86ThreadContext.SP >>> Java code => X86ThreadContext.ESP >>> >>> X86ThreadContext.ESP is never being set by the native code. Since >>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then >>> returns null, ebp.lessThan(esp) wrongly returns false causing the >>> issue. This webrev fixes it by using SP as index on the Java side. >>> Thoughts? 
>>> >>> webrev: >>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091 >> Anyone willing to review this one-liner? >> >> Thanks, >> Severin >> >>> Thanks, >>> Severin >> From daniil.x.titov at oracle.com Thu Jul 26 23:24:48 2018 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Thu, 26 Jul 2018 16:24:48 -0700 Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to start In-Reply-To: References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com> <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com> <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com> <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com> <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com> Message-ID: <0D426A78-5C8C-456B-BEEC-495B86622D0A@oracle.com> Thank you Serguei and Chris for reviewing this change. Best regards, Daniil On 7/26/18, 10:01 AM, "serguei.spitsyn at oracle.com" wrote: +1 Thanks, Serguei On 7/26/18 09:59, Chris Plummer wrote: > Thanks for the explanation. Update looks good. > > Chris > > On 7/26/18 9:56 AM, Daniil Titov wrote: >> Hi Chris, >> >> The smallest allowed metaspace size for the test is 9MB. In both >> cases (when the metaspace size is set to 9Mb and to 16 Mb) the >> expected OutOfMemoryError is thrown and the test passes. >> >> I did update the patch to use the smallest settings. >> >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02 >> >> >> The test uses a custom class loader to load a class from the byte >> array read from the predefined specified class file. The incorrect >> path passed to the test made the test fail to read this class file. 
>>
>> java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class'
>>     at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74)
>>     at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89)
>>     at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129)
>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
>>     at java.base/java.lang.Thread.run(Thread.java:834)
>> Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory)
>>     at java.base/java.io.FileInputStream.open0(Native Method)
>>     at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
>>     at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
>>     at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64)
>>     ... 8 more
>>
>> Best regards,
>> Daniil
>>
>> On 7/25/18, 9:09 PM, "Chris Plummer" wrote:
>>
>> Hi Daniil,
>> After reading some old comments I added to JDK-6606767, I wonder if bumping the metaspace size all the way up to 16m is the right thing to do. It seems the test wants to exhaust the metaspace, so maybe it should be set to the smallest allowed size. Is the test still exhausting the metaspace even when it is 16M? Is there a smaller size that will also work?
>> Also, regarding the class path, what impact was this bug having on the test?
>> thanks, >> Chris >> On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote: >> > Hi Daniil, >> > >> > It looks good to me. >> > What is the need to increase the metaspace size? >> > >> > Thanks, >> > Serguei >> > >> > >> > On 7/25/18 16:11, Daniil Titov wrote: >> >> Hello, >> >> >> >> Please review the change that fix the test issue. The fix >> increases >> >> the metaspace size and corrects the path to the class files. >> >> >> >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/ >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364 >> >> >> >> Thanks! >> >> >> >> Best regards, >> >> Daniil >> >> >> >> >> >> >> > >> >> > > From chris.plummer at oracle.com Fri Jul 27 23:27:45 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Fri, 27 Jul 2018 16:27:45 -0700 Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false In-Reply-To: <5B5233DC.5040003@oracle.com> References: <5B082D2E.7000408@oracle.com> <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> <5B5233DC.5040003@oracle.com> Message-ID: <853aba55-fafc-2797-ed44-818760bd5571@oracle.com> An HTML attachment was scrubbed... URL: From jcbeyler at google.com Fri Jul 27 23:36:41 2018 From: jcbeyler at google.com (JC Beyler) Date: Fri, 27 Jul 2018 16:36:41 -0700 Subject: RFR 8208303: Track JNI failures and fail tests In-Reply-To: <3a2d66d6-e4f8-51ad-3552-dfe3748dff89@oracle.com> References: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com> <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com> <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com> <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com> <3a2d66d6-e4f8-51ad-3552-dfe3748dff89@oracle.com> Message-ID: Hi all, I did the new version that calls FatalError if JNI fails a call. This has the advantage of not having to complicate the Java tests at all, while adding the post-JNI call checks. 
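
(A rough sketch of the two checking styles discussed in this thread, not the code in the webrev. To keep it buildable without jni.h, `FakeEnv` is a hypothetical stand-in for the JNIEnv; `ExceptionCheck` and `FatalError` mirror the names of the real JNI functions, while `check_deferred`, `check_fail_fast` and the `failure_status` parameter are made up for illustration. The real `FatalError` does not return and takes the VM down; the stand-in throws so the behaviour can be observed.)

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for JNIEnv so the sketch builds without jni.h.
struct FakeEnv {
  bool pending_exception = false;

  // Mirrors JNI's ExceptionCheck: true if an exception is pending.
  bool ExceptionCheck() const { return pending_exception; }

  // The real FatalError never returns and aborts the VM; throwing here is an
  // assumption made only to keep the sketch testable.
  [[noreturn]] void FatalError(const char* msg) const {
    throw std::runtime_error(msg);
  }
};

// Deferred style: record the failure and rely on the Java side to look at
// the flag later. Returns false when the preceding call failed.
bool check_deferred(const FakeEnv& env, int& failure_status) {
  if (env.ExceptionCheck()) {
    failure_status = 1;
    return false;
  }
  return true;
}

// Fail-fast style: stop right after the call that left an exception pending,
// so the Java test needs no extra status check.
void check_fail_fast(const FakeEnv& env, const char* what) {
  if (env.ExceptionCheck()) {
    env.FatalError(what);
  }
}
```

With the fail-fast variant the failure surfaces at the call site, which is presumably why the Java tests no longer need to change.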
Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.03/ Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 Thanks all! Jc On Thu, Jul 26, 2018 at 12:52 PM Chris Plummer wrote: > I'm pretty sure changes that only affect tests can be any priority. But > still, be a lot more cautious the closer we get to release. > > Chris > > On 7/26/18 12:15 PM, Daniel D. Daugherty wrote: > > We entered RDP2 today (07.26). So only P1 and P2 bug fixes allowed. > > Dan > > > On 7/26/18 3:14 PM, serguei.spitsyn at oracle.com wrote: > > Yes, of course it has to be well tested before the push. > Does it make sense to plan it to push to 11 (after the testing is done)? > > Thanks, > Serguei > > > On 7/26/18 12:08, Daniel D. Daugherty wrote: > > Please make sure this fix is well tested in Mach5 prior to pushing. > In particular, I'm focused on reducing the noise in Mach5 tier[1-3] > so adding any new failures there will make me grumpy :-) > > Dan > > > On 7/26/18 3:03 PM, JC Beyler wrote: > > Hi all, > > With the FatalError idea, here is the webrev to consider, note it no > longer changes the tests. If a JNI call fails, then we call FatalError. > > Let me know what you think: > > Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 > > Thanks! > Jc > > On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com < > serguei.spitsyn at oracle.com> wrote: > >> Hi Jc, >> >> Good idea. >> I was thinking about something like this. >> >> Thanks, >> Serguei >> >> >> On 7/26/18 10:40, JC Beyler wrote: >> >> Hi Serguei, >> >> As I was looking at another test bug ( >> https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal for that >> bug is to have a JNI call to FatalError to provoke a failure. >> >> If we went down that route, this webrev is simpler, no? Instead of >> setting failure_status and checking it later; just fail fatally and be done >> with it, no? 
That way, the tests in Java land don't have to be changed >> actually, no? >> >> What would we prefer for tests? Remember there was a failure and test it >> later or fail fast via JNI's FatalError? >> >> Thanks, >> Jc >> >> >> On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com < >> serguei.spitsyn at oracle.com> wrote: >> >>> Hi Jc, >>> >>> It looks good to me. >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/26/18 09:58, JC Beyler wrote: >>> >>> Hi all, >>> >>> The tests in the HeapMonitor subsystem has a lot of JNI calls. There is >>> a need for verification and testing if anything in the JNI subsystem failed >>> unexpectedly. >>> >>> Here is a webrev that tracks if a JNI call does fail and the tests will >>> fail if any JNI call does fail. >>> >>> Could I have a few reviews please for: >>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/ >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303 >>> >>> Thanks, >>> Jc >>> >>> >>> >> >> -- >> >> Thanks, >> Jc >> >> >> > > -- > > Thanks, > Jc > > > > > > -- Thanks, Jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Mon Jul 30 05:05:56 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Sun, 29 Jul 2018 22:05:56 -0700 Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false In-Reply-To: <853aba55-fafc-2797-ed44-818760bd5571@oracle.com> References: <5B082D2E.7000408@oracle.com> <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> <5B5233DC.5040003@oracle.com> <853aba55-fafc-2797-ed44-818760bd5571@oracle.com> Message-ID: <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com> An HTML attachment was scrubbed... 
URL: From serguei.spitsyn at oracle.com Mon Jul 30 07:47:04 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 30 Jul 2018 00:47:04 -0700 Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false In-Reply-To: <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com> References: <5B082D2E.7000408@oracle.com> <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> <5B5233DC.5040003@oracle.com> <853aba55-fafc-2797-ed44-818760bd5571@oracle.com> <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com> Message-ID: <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com> An HTML attachment was scrubbed... URL: From sgehwolf at redhat.com Mon Jul 30 08:28:22 2018 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Mon, 30 Jul 2018 10:28:22 +0200 Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com> References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com> Message-ID: Hi Chris, On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote: > I had looked at this review when it came out, but was hesitant to ok it > because I really don't know this code at all. If you can get another > reviewer who does know the code, then I'll approve it. Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK census. As to whether he knows the code, I don't know. He's on CC. > This only impacts 32-bit, right? If so, keep in mind that it won't get tested by Oracle > testing, including the submit repo, so make sure you do thorough testing. It only impacts 32-bit, yes. I understand that Oracle isn't testing 32- bit x86 any more. 
The change itself should be fairly low risk since it's changing only a 32-bit-x86-linux-only file and the native bits don't seem to match what the Java code does[1]. REG_INDEX(reg) being defined as: #define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg and being used as: REG_INDEX(SP) Thus, using sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP The Java code uses: sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP > Also, why is there any code being executed that was not compiled with > -fno-omit-frame-pointer? The description in the CR just shows a simple > java program reproducing this, so all the mixed stack traces belong to > the JVM and libs, and I thought we made sure to compile all of them with > -fno-omit-frame-pointer. The JVM uses glibc and that simple program is enough to see some thread's stack currently being in a glibc function when getting a mixed stack trace. We've originally seen this in JDK 8 with jstack -m and was reported in [2]. That comment has more details. The problem here isn't that it's a JDK lib which gets compiled without -fno-omit-frame- pointer. It's glibc not being compiled with that option. 
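
(A small sketch of the index mismatch described above, under loud assumptions: the two register constants carry made-up values, `kNumRegs`, `fill_from_native`, `less_than` and `frame_pointer_invalid` are hypothetical names, and `std::optional` merely models a Java null. Only the `REG_INDEX` macro is taken from the message. It is meant to show why a check that reads the never-populated ESP slot cannot reject a bogus frame pointer, while the same check against the SP slot can.)

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Made-up register indices for illustration; the real constants are generated
// from sun.jvm.hotspot.debugger.x86.X86ThreadContext and have other values.
constexpr int sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP = 0;
constexpr int sun_jvm_hotspot_debugger_x86_X86ThreadContext_ESP = 1;
constexpr int kNumRegs = 2;  // hypothetical register count

// Same token-pasting macro as quoted in the message.
#define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg

using Address = std::optional<std::uintptr_t>;  // empty models Java's null
using Registers = std::array<Address, kNumRegs>;

// Native side: only the SP slot is ever populated.
Registers fill_from_native(std::uintptr_t stack_pointer) {
  Registers regs{};  // every slot starts out empty
  regs[REG_INDEX(SP)] = stack_pointer;
  return regs;
}

// Models the behaviour described in the thread: comparing against a missing
// address yields false, so an "ebp < esp" sanity check silently passes.
bool less_than(Address lhs, Address rhs) {
  return lhs.has_value() && rhs.has_value() && *lhs < *rhs;
}

// Returns true when the frame pointer looks bogus (ebp below esp).
bool frame_pointer_invalid(const Registers& regs, Address ebp, int sp_index) {
  return less_than(ebp, regs[sp_index]);
}
```

Reading through `REG_INDEX(ESP)` never flags the bad frame pointer, whereas reading through `REG_INDEX(SP)` does, which is the essence of the one-line fix.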
An example stack trace for a system where this happens looks like this:

Thread 7 (Thread 0xa3863b40 (LWP 834)):
#0 0xf771f430 in __kernel_vsyscall ()
#1 0xf7703acc in futex_abstimed_wait (cancel=true, private=, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
#2 do_futex_wait (sem=0xf770f000, sem@entry=0xf70ea854 , abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
#3 0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 , abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
#4 0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522
#5 0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250

That is, frames 0-3 are JDK foreign. This bug will happen on all systems which use any native library which isn't compiled with -fno-omit-frame-pointer. Be it glibc or some other library. Thanks, Severin [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4 > thanks, > > Chris > > On 7/26/18 10:11 AM, Severin Gehwolf wrote: > > On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote: > > > Changes looks good Severin. > > > > Thanks for the review, Sharath! > > > > > I am not a reviewer though, so you still need a Reviewer to review. > > > > Anyone? 
> > > > Thanks, > > Severin > > > > > -----Original Message----- > > > From: Severin Gehwolf [mailto:sgehwolf at redhat.com] > > > Sent: Thursday, July 26, 2018 1:04 PM > > > To: serviceability-dev > > > Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 > > > > > > On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote: > > > > Hi, > > > > > > > > Could I please get a review of this one-liner change related to jhsdb > > > > --mixed when attaching to a running Java process? The issue arises > > > > when threads are in native code and that native code has frame > > > > pointers not properly preserved. In such a case the SA performs a > > > > simple frame pointer validity check: ebp >= esp > > > > > > > > However, the code of retrieving the value for esp is incorrect in as > > > > much as it's not in sync with native code in regards to the register > > > > index: > > > > > > > > native code => X86ThreadContext.SP > > > > Java code => X86ThreadContext.ESP > > > > > > > > X86ThreadContext.ESP is never being set by the native code. Since > > > > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then > > > > returns null, ebp.lessThan(esp) wrongly returns false causing the > > > > issue. This webrev fixes it by using SP as index on the Java side. > > > > Thoughts? > > > > > > > > webrev: > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/ > > > > bug: https://bugs.openjdk.java.net/browse/JDK-8208091 > > > > > > Anyone willing to review this one-liner? 
> > > > > > Thanks, > > > Severin > > > > > > > Thanks, > > > > Severin > > From thomas.schatzl at oracle.com Mon Jul 30 13:03:20 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 30 Jul 2018 15:03:20 +0200 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions In-Reply-To: References: Message-ID: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> Hi Paul, I did some prototyping and wanted to show you the results and get your input: On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote: > [...] > Could we work together on first refactoring the code before adding new > kinds of spaces to the MXBeans? > > Looking at this change and mine roughly the following issues would > need to be resolved first: > - find a solution for archive regions as suggested above :) At the > moment, without doing the change, I would tend to make archive > regions separate from old regions. I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/ > - move serviceability stuff as much as possible to > g1MonitoringSupport Preliminary webrev: http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/ I think this came out better than expected: while we may want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than the current code as it tightens the G1MonitoringSupport interface quite a bit. Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it, something indicating the GC directly would probably be better, too.) It would be nice if something similar could be made for the concurrent Trace*Stats. 
> - clean up MemoryPool, remove duplicate information > - provide and return sane memory pool used/committed values to the > MXBeans > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" > variables > for every single memory pool. Use MemoryUsage structs for them. Make > reading of memory pool information atomic wrt to its readers (note > that I think it is currently just impossible to get consistent output > for other statistics like jstat) - that's JDK-8207200. > - add whatever serviceability stuff for the new pools/jstat/* in > steps. Thanks, Thomas From chris.plummer at oracle.com Mon Jul 30 16:33:15 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 30 Jul 2018 09:33:15 -0700 Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com> Message-ID: Hi Severin, On 7/30/18 1:28 AM, Severin Gehwolf wrote: > Hi Chris, > > On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote: >> I had looked at this review when it came out, but was hesitant to ok it >> because I really don't know this code at all. If you can get another >> reviewer who does know the code, then I'll approve it. > Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK > census. As to whether he knows the code, I don't know. He's on CC. Yes, but I was asking for a second reviewer (not counting me). > >> This only impacts 32-bit, right? If so, keep in mind that it won't get tested by Oracle >> testing, including the submit repo, so make sure you do thorough testing. > It only impacts 32-bit, yes. I understand that Oracle isn't testing 32- > bit x86 any more. 
The change itself should be fairly low risk since > it's changing only a 32-bit-x86-linux-only file and the native bits > don't seem to match what the Java code does[1]. REG_INDEX(reg) being > defined as: > > #define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg > > and being used as: > > REG_INDEX(SP) > > Thus, using > > sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP > > The Java code uses: > > sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP > >> Also, why is there any code being executed that was not compiled with >> -fno-omit-frame-pointer? The description in the CR just shows a simple >> java program reproducing this, so all the mixed stack traces belong to >> the JVM and libs, and I thought we made sure to compile all of them with >> -fno-omit-frame-pointer. > The JVM uses glibc and that simple program is enough to see some > thread's stack currently being in a glibc function when getting a mixed > stack trace. We've originally seen this in JDK 8 with jstack -m and was > reported in [2]. That comment has more details. The problem here isn't > that it's a JDK lib which gets compiled without -fno-omit-frame- > pointer. It's glibc not being compiled with that option. 
> > An example stack trace for a system where this happens looks like this: > > Thread 7 (Thread 0xa3863b40 (LWP 834)): > #0 0xf771f430 in __kernel_vsyscall () > #1 0xf7703acc in futex_abstimed_wait (cancel=true, private=, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43 > #2 do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 , abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226 > #3 0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 , abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407 > #4 0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522 > #5 0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250 > > That is, frames 0-3 are JDK foreign. This bug will happen on all > systems which use any native library which isn't compiled with -fno- > omit-frame-pointer. Be it glibc or some other library. Ok. It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added. thanks, Chris > > Thanks, > Severin > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9 > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4 > >> thanks, >> >> Chris >> >> On 7/26/18 10:11 AM, Severin Gehwolf wrote: >>> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote: >>>> Changes looks good Severin. >>> Thanks for the review, Sharath! >>> >>>> I am not a reviewer though, so you still need a Reviewer to review. >>> Anyone? 
>>> >>> Thanks, >>> Severin >>> >>>> -----Original Message----- >>>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com] >>>> Sent: Thursday, July 26, 2018 1:04 PM >>>> To: serviceability-dev >>>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 >>>> >>>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote: >>>>> Hi, >>>>> >>>>> Could I please get a review of this one-liner change related to jhsdb >>>>> --mixed when attaching to a running Java process? The issue arises >>>>> when threads are in native code and that native code has frame >>>>> pointers not properly preserved. In such a case the SA performs a >>>>> simple frame pointer valididy check: ebp >= esp >>>>> >>>>> However, the code of retrieving the value for esp is incorrect in as >>>>> much as it's not in sync with native code in regards to the register >>>>> index: >>>>> >>>>> native code => X86ThreadContext.SP >>>>> Java code => X86ThreadContext.ESP >>>>> >>>>> X86ThreadContext.ESP is never being set by the native code. Since >>>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then >>>>> returns null, ebp.lessThan(esp) wrongly returns false causing the >>>>> issue. This webrev fixes it by using SP as index on the Java side. >>>>> Thoughts? >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/ >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091 >>>> Anyone willing to review this one-liner? 
>>>> >>>> Thanks, >>>> Severin >>>> >>>>> Thanks, >>>>> Severin >> From chris.plummer at oracle.com Mon Jul 30 16:46:35 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 30 Jul 2018 09:46:35 -0700 Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false In-Reply-To: <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com> References: <5B082D2E.7000408@oracle.com> <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> <5B5233DC.5040003@oracle.com> <853aba55-fafc-2797-ed44-818760bd5571@oracle.com> <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com> <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com> Message-ID: <8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com> An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Mon Jul 30 18:05:39 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 30 Jul 2018 14:05:39 -0400 Subject: RFR(XXS): 8208521 ProblemList more tests that fail due to 'Error attaching to process: Can't create thread_db agent!' Message-ID: Greetings, I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so I need a single (R)eviewer for the following fix: JDK-8208521 ProblemList more tests that fail due to 'Error attaching to process: Can't create thread_db agent!' https://bugs.openjdk.java.net/browse/JDK-8208521 Here's the diff: $ hg diff diff -r 24517a097dc1 test/hotspot/jtreg/ProblemList.txt --- a/test/hotspot/jtreg/ProblemList.txt Fri Jul 27 00:00:28 2018 -0700 +++ b/test/hotspot/jtreg/ProblemList.txt
Mon Jul 30 13:58:45 2018 -0400 @@ -101,6 +101,7 @@ serviceability/sa/ClhsdbSymbol.java 8193639 solaris serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris serviceability/sa/ClhsdbThread.java 8193639 solaris +serviceability/sa/ClhsdbVmStructsDump.java 8193639 solaris serviceability/sa/ClhsdbWhere.java 8193639 solaris serviceability/sa/DeadlockDetectionTest.java 8193639 solaris serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris @@ -109,6 +110,7 @@ serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris serviceability/sa/TestDefaultMethods.java 8193639 solaris serviceability/sa/TestG1HeapRegion.java 8193639 solaris +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all serviceability/sa/TestType.java 8193639 solaris serviceability/sa/TestUniverse.java 8193639 solaris This is an add-on to the following fix that I pushed last week: JDK-8208205 ProblemList tests that fail due to 'Error attaching to process: Can't create thread_db agent!' https://bugs.openjdk.java.net/browse/JDK-8208205 The above two tests failed in last weekend's jdk-11+24 Thread-SMR stress test run on Solaris-X64. Thanks, in advance, for any questions, comments or suggestions. Dan From chris.plummer at oracle.com Mon Jul 30 18:17:38 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 30 Jul 2018 11:17:38 -0700 Subject: RFR(XXS): 8208521 ProblemList more tests that fail due to 'Error attaching to process: Can't create thread_db agent!' In-Reply-To: References: Message-ID: Looks good. Chris On 7/30/18 11:05 AM, Daniel D. Daugherty wrote: > Greetings, > > I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so > I need a single (R)eviewer for the following fix: > > JDK-8208521 ProblemList more tests that fail due to 'Error attaching to > process: Can't create thread_db agent!' >
https://bugs.openjdk.java.net/browse/JDK-8208521 > > Here's the diff: > > $ hg diff > diff -r 24517a097dc1 test/hotspot/jtreg/ProblemList.txt > --- a/test/hotspot/jtreg/ProblemList.txt Fri Jul 27 00:00:28 2018 > -0700 > +++ b/test/hotspot/jtreg/ProblemList.txt Mon Jul 30 13:58:45 2018 > -0400 > @@ -101,6 +101,7 @@ > serviceability/sa/ClhsdbSymbol.java 8193639 solaris > serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris > serviceability/sa/ClhsdbThread.java 8193639 solaris > +serviceability/sa/ClhsdbVmStructsDump.java 8193639 solaris > serviceability/sa/ClhsdbWhere.java 8193639 solaris > serviceability/sa/DeadlockDetectionTest.java 8193639 solaris > serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris > @@ -109,6 +110,7 @@ > serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris > serviceability/sa/TestDefaultMethods.java 8193639 solaris > serviceability/sa/TestG1HeapRegion.java 8193639 solaris > +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris > serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all > serviceability/sa/TestType.java 8193639 solaris > serviceability/sa/TestUniverse.java 8193639 solaris > > > This is an add-on to the following fix that I pushed last week: > > JDK-8208205 ProblemList tests that fail due to 'Error attaching to > process: Can't create thread_db agent!' > https://bugs.openjdk.java.net/browse/JDK-8208205 > > The above two tests failed in last weekend's jdk-11+24 Thread-SMR > stress test run on Solaris-X64. > > Thanks, in advance, for any questions, comments or suggestions. > > Dan > From daniel.daugherty at oracle.com Mon Jul 30 18:19:10 2018 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Mon, 30 Jul 2018 14:19:10 -0400 Subject: RFR(XXS): 8208521 ProblemList more tests that fail due to 'Error attaching to process: Can't create thread_db agent!'
In-Reply-To: References: Message-ID: <1414847c-9490-1b92-58f0-a0bcf42662ae@oracle.com> Chris, Thanks for the fast review! Dan On 7/30/18 2:17 PM, Chris Plummer wrote: > Looks good. > > Chris > > On 7/30/18 11:05 AM, Daniel D. Daugherty wrote: >> Greetings, >> >> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so >> I need a single (R)eviewer for the following fix: >> >> JDK-8208521 ProblemList more tests that fail due to 'Error >> attaching to >> process: Can't create thread_db agent!' >> https://bugs.openjdk.java.net/browse/JDK-8208521 >> >> Here's the diff: >> >> $ hg diff >> diff -r 24517a097dc1 test/hotspot/jtreg/ProblemList.txt >> --- a/test/hotspot/jtreg/ProblemList.txt Fri Jul 27 00:00:28 2018 >> -0700 >> +++ b/test/hotspot/jtreg/ProblemList.txt Mon Jul 30 13:58:45 2018 >> -0400 >> @@ -101,6 +101,7 @@ >> serviceability/sa/ClhsdbSymbol.java 8193639 solaris >> serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris >> serviceability/sa/ClhsdbThread.java 8193639 solaris >> +serviceability/sa/ClhsdbVmStructsDump.java 8193639 solaris >> serviceability/sa/ClhsdbWhere.java 8193639 solaris >> serviceability/sa/DeadlockDetectionTest.java 8193639 solaris >> serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris >> @@ -109,6 +110,7 @@ >> serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris >> serviceability/sa/TestDefaultMethods.java 8193639 solaris >> serviceability/sa/TestG1HeapRegion.java 8193639 solaris >> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris >> serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all >> serviceability/sa/TestType.java 8193639 solaris >> serviceability/sa/TestUniverse.java 8193639 solaris >> >> >> This is an add-on to the following fix that I pushed last week: >> >> JDK-8208205 ProblemList tests that fail due to 'Error attaching to >> process: Can't create thread_db agent!' >>
https://bugs.openjdk.java.net/browse/JDK-8208205 >> >> The above two tests failed in last weekend's jdk-11+24 Thread-SMR >> stress test run on Solaris-X64. >> >> Thanks, in advance, for any questions, comments or suggestions. >> >> Dan >> > > From hohensee at amazon.com Mon Jul 30 19:18:27 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 30 Jul 2018 19:18:27 +0000 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions In-Reply-To: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> References: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> Message-ID: <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com> At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones. Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it. I'd not have thought of making a G1MonitoringScope, looks good. Thanks, Paul On 7/30/18, 6:04 AM, "Thomas Schatzl" wrote: Hi Paul, did some prototyping and wanted to show you the results and get your input: On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote: > [...] > Could we work together on first refactoring the code before adding > new > kinds of spaces to the MXBeans? > > Looking at this change and mine roughly the following issues would > need to be resolved first: > - find a solution for archive regions as suggested above :) At the > moment, without doing the change, I would tend to make archive > regions separate from old regions.
I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/ > - move serviceability stuff as much as possible to > g1MonitoringSupport Preliminary webrev: http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/ I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit. Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too). It would be nice if something similar could be made for the concurrent Trace*Stats. > - clean up MemoryPool, remove duplicate information > - provide and return sane memory pool used/committed values to the > MXBeans > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" > variables > for every single memory pool. Use MemoryUsage structs for them. Make > reading of memory pool information atomic wrt to its readers (note > that I think it is currently just impossible to get consistent output > for other statistics like jstat) - that's JDK-8207200. > - add whatever serviceability stuff for the new pools/jstat/* in > steps.
Thanks, Thomas From coleen.phillimore at oracle.com Mon Jul 30 20:49:57 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Mon, 30 Jul 2018 16:49:57 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException Message-ID: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Summary: fixed refactoring caused by JDK-8203820 open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev bug link https://bugs.openjdk.java.net/browse/JDK-8208074 Ran the test in mach5 on all Oracle supported platforms. Also took the test out of ProblemList.txt because JDK-8203820 fixes https://bugs.openjdk.java.net/browse/JDK-8202896. Thanks, Coleen From david.holmes at oracle.com Mon Jul 30 21:46:21 2018 From: david.holmes at oracle.com (David Holmes) Date: Tue, 31 Jul 2018 07:46:21 +1000 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote: > Summary: fixed refactoring caused by JDK-8203820 > > open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8208074 For the sake of other readers who don't want to have to reverse engineer the actual cause of the problem, the original code has two Method.invoke sequences: one for a static method and which passed a null receiver; one for a non-static method which passed a non-null receiver. The refactoring extracted the invoke logic but always passed a null receiver - which was wrong for the non-static case.
The fix always passes a non-null receiver to fix the non-static case, and which is ignored in the static case. Reviewed. Trivial. Thanks, David > Ran the test in mach5 on all Oracle supported platforms. Also took the > test out of ProblemList.txt because JDK-8203820 fixes > https://bugs.openjdk.java.net/browse/JDK-8202896. > > Thanks, > Coleen From hohensee at amazon.com Mon Jul 30 23:26:57 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 30 Jul 2018 23:26:57 +0000 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions In-Reply-To: <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com> References: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com> Message-ID: <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com> A couple nits on http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/. g1CollectedHeap.cpp: in initialize_serviceability(), memory_managers(), and memory_pools(), use g1mm() instead of _g1mm. g1MonitoringSupport.cpp: there's an extra newline after ~G1MonitoringSupport(). Otherwise looks good. Paul On 7/30/18, 12:18 PM, "Hohensee, Paul" wrote: At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones. Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it. I'd not have thought of making a G1MonitoringScope, looks good.
Thanks, Paul On 7/30/18, 6:04 AM, "Thomas Schatzl" wrote: Hi Paul, did some prototyping and wanted to show you the results and get your input: On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote: > [...] > Could we work together on first refactoring the code before adding > new > kinds of spaces to the MXBeans? > > Looking at this change and mine roughly the following issues would > need to be resolved first: > - find a solution for archive regions as suggested above :) At the > moment, without doing the change, I would tend to make archive > regions separate from old regions. I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/ > - move serviceability stuff as much as possible to > g1MonitoringSupport Preliminary webrev: http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/ I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit. Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too). It would be nice if something similar could be made for the concurrent Trace*Stats. > - clean up MemoryPool, remove duplicate information > - provide and return sane memory pool used/committed values to the > MXBeans > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" > variables > for every single memory pool. Use MemoryUsage structs for them.
Make > reading of memory pool information atomic wrt to its readers (note > that I think it is currently just impossible to get consistent output > for other statistics like jstat) - that's JDK-8207200. > - add whatever serviceability stuff for the new pools/jstat/* in > steps. Thanks, Thomas From chris.plummer at oracle.com Tue Jul 31 01:34:21 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 30 Jul 2018 18:34:21 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: Hi Coleen, Now that this had been pushed, I assume JDK-8202896 should be closed as a dup. And what about JDK-8206076? Is it fixed by this change also? thanks, Chris On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: > Summary: fixed refactoring caused by JDK-8203820 > > open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev > bug link https://bugs.openjdk.java.net/browse/JDK-8208074 > > Ran the test in mach5 on all Oracle supported platforms.? Also took > the test out of ProblemList.txt because JDK-8203820 fixes > https://bugs.openjdk.java.net/browse/JDK-8202896. > > Thanks, > Coleen From sharath.ballal at oracle.com Tue Jul 31 04:23:46 2018 From: sharath.ballal at oracle.com (Sharath Ballal) Date: Mon, 30 Jul 2018 21:23:46 -0700 (PDT) Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com> Message-ID: <8cb164e3-0492-4c92-81d0-469f16158ff4@default> > Ok. 
It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added. That's a good suggestion. Severin you can either write a test or open a bug for it. Thanks, Sharath -----Original Message----- From: Chris Plummer Sent: Monday, July 30, 2018 10:03 PM To: Severin Gehwolf; Sharath Ballal; serviceability-dev Subject: Re: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 Hi Severin, On 7/30/18 1:28 AM, Severin Gehwolf wrote: > Hi Chris, > > On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote: >> I had looked at this review when it came out, but was hesitant to ok >> it because I really don't know this code at all. If you can get >> another reviewer who does know the code, then I'll approve it. > Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK > census. As to whether he knows the code, I don't know. He's on CC. Yes, but I was asking for a second reviewer (not counting me). > >> This only impacts 32-bit, right? If so, keep in mind that it won't >> get tested by Oracle testing, including the submit repo, so make sure you do thorough testing. > It only impacts 32-bit, yes. I understand that Oracle isn't testing > 32- bit x86 any more. The change itself should be fairly low risk > since it's changing only a 32-bit-x86-linux-only file and the native > bits don't seem to match what the Java code does[1].
REG_INDEX(reg) > being defined as: > > #define REG_INDEX(reg) > sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg > > and being used as: > > REG_INDEX(SP) > > Thus, using > > sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP > > The Java code uses: > > sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP > >> Also, why is there any code being executed that was not compiled with >> -fno-omit-frame-pointer? The description in the CR just shows a >> simple java program reproducing this, so all the mixed stack traces >> belong to the JVM and libs, and I thought we made sure to compile all >> of them with -fno-omit-frame-pointer. > The JVM uses glibc and that simple program is enough to see some > thread's stack currently being in a glibc function when getting a > mixed stack trace. We've originally seen this in JDK 8 with jstack -m > and was reported in [2]. That comment has more details. The problem > here isn't that it's a JDK lib which gets compiled without > -fno-omit-frame- pointer. It's glibc not being compiled with that option. 
> > An example stack trace for a system where this happens looks like this: > > Thread 7 (Thread 0xa3863b40 (LWP 834)): > #0 0xf771f430 in __kernel_vsyscall () > #1 0xf7703acc in futex_abstimed_wait (cancel=true, private= out>, abstime=0x0, expected=1, futex=0xf770f000) at > ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43 > #2 do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 , > abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226 > #3 0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 , > abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407 > #4 0xf6cc18d4 in check_pending_signals (wait=true) at > /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/h > otspot/src/os/linux/vm/os_linux.cpp:2522 > #5 0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, > __the_thread__=0xa37a4800) at > /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/h > otspot/src/share/vm/runtime/os.cpp:250 > > That is, frames 0-3 are JDK foreign. This bug will happen on all > systems which use any native library which isn't compiled with -fno- > omit-frame-pointer. Be it glibc or some other library. Ok. It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added. thanks, Chris > > Thanks, > Severin > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9 > [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4 > >> thanks, >> >> Chris >> >> On 7/26/18 10:11 AM, Severin Gehwolf wrote: >>> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote: >>>> Changes looks good Severin. >>> Thanks for the review, Sharath! >>> >>>> I am not a reviewer though, so you still need a Reviewer to review. >>> Anyone? 
>>> >>> Thanks, >>> Severin >>> >>>> -----Original Message----- >>>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com] >>>> Sent: Thursday, July 26, 2018 1:04 PM >>>> To: serviceability-dev >>>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws >>>> UnmappedAddressException on i686 >>>> >>>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote: >>>>> Hi, >>>>> >>>>> Could I please get a review of this one-liner change related to >>>>> jhsdb --mixed when attaching to a running Java process? The issue >>>>> arises when threads are in native code and that native code has >>>>> frame pointers not properly preserved. In such a case the SA >>>>> performs a simple frame pointer valididy check: ebp >= esp >>>>> >>>>> However, the code of retrieving the value for esp is incorrect in >>>>> as much as it's not in sync with native code in regards to the >>>>> register >>>>> index: >>>>> >>>>> native code => X86ThreadContext.SP >>>>> Java code => X86ThreadContext.ESP >>>>> >>>>> X86ThreadContext.ESP is never being set by the native code. Since >>>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then >>>>> returns null, ebp.lessThan(esp) wrongly returns false causing the >>>>> issue. This webrev fixes it by using SP as index on the Java side. >>>>> Thoughts? >>>>> >>>>> webrev: >>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01 >>>>> / >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091 >>>> Anyone willing to review this one-liner? 
>>>> >>>> Thanks, >>>> Severin >>>> >>>>> Thanks, >>>>> Severin >> From chris.plummer at oracle.com Tue Jul 31 07:16:03 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 31 Jul 2018 00:16:03 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> Sorry, I thought this had been pushed already, but it hasn't. But it still looks like JDK-8202896 should be closed as a dup, and it's unclear to me if JDK-8206076 has been fixed and this test can be removed from the problem list. Chris On 7/30/18 6:34 PM, Chris Plummer wrote: > Hi Coleen, > > Now that this had been pushed, I assume JDK-8202896 should be closed > as a dup. And what about JDK-8206076? Is it fixed by this change also? > > thanks, > > Chris > > On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >> >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896.
>> >> Thanks, >> Coleen > > > From serguei.spitsyn at oracle.com Tue Jul 31 07:20:54 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 00:20:54 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> Message-ID: Hi Coleen, The explanation from David is very helpful - thanks! So the fix looks good to me as well. We still need to answer questions from Chris though. Thanks, Serguei On 7/30/18 14:46, David Holmes wrote: > On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 > > For the sake of other readers who don't want to have to reverse > engineer the actual cause of the problem, the original code has two > Method.invoke sequences: one for a static method and which passed a > null receiver; one for a non-static method which passed a non-null > receiver. The refactoring extracted the invoke logic but always passed > a null receiver - which was wrong for the non-static case. The fix > always passes a non-null receiver to fix the non-static case, and > which is ignored in the static case. > > Reviewed. Trivial. > > Thanks, > David > >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896.
>> >> Thanks, >> Coleen From serguei.spitsyn at oracle.com Tue Jul 31 07:29:59 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 00:29:59 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> Message-ID: <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> Hi Chris, Good catch. It is possible that this webrev does not fix the JDK-8202896. The JDK-8202896 is about timeouts which are normally intermittent (is it right?). There are two options here: A: close 8202896 as a dup of 8208074 B: keep the test problem listed and labeled with 8202896 Let's wait for Coleen's answer. Thanks, Serguei On 7/31/18 00:16, Chris Plummer wrote: > Sorry, I thought this had been pushed already, but it hasn't. But it > still looks like JDK-8202896 should be closed as a dup, and it's > unclear to me if JDK-8206076 has been fixed and this test can be > removed from the problem list. > > Chris > > On 7/30/18 6:34 PM, Chris Plummer wrote: >> Hi Coleen, >> >> Now that this had been pushed, I assume JDK-8202896 should be closed >> as a dup. And what about JDK-8206076? Is it fixed by this change also? >> >> thanks, >> >> Chris >> >> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>> Summary: fixed refactoring caused by JDK-8203820 >>> >>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>> >>> Ran the test in mach5 on all Oracle supported platforms. Also took >>> the test out of ProblemList.txt because JDK-8203820 fixes >>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>> >>> Thanks, >>> Coleen >> >> >> > > From sgehwolf at redhat.com Tue Jul 31 08:14:50 2018 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 31 Jul 2018 10:14:50 +0200 Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686 In-Reply-To: <8cb164e3-0492-4c92-81d0-469f16158ff4@default> References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com> <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com> <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default> <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com> <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com> <8cb164e3-0492-4c92-81d0-469f16158ff4@default> Message-ID: <994a4b8b3ef5456404d83e3aad1a3ec9027fbc1e.camel@redhat.com> On Mon, 2018-07-30 at 21:23 -0700, Sharath Ballal wrote: > > Ok. It looks like we don't even have a "jstack --mixed" test. Could > > you add one? It would be even better if the test included a JNI lib > > that wasn't compiled with -fno-omit-frame-pointer so you don't need > > to rely on glibc to reproduce this issue (or is glibc pretty much > > always compiled without -fno-omit-frame-pointer)? Or if Sharath > > agrees, file a bug to have a test added. > > That's a good suggestion. Severin you can either write a test or > open a bug for it. I'll write a test for it.
Thanks, Severin From serguei.spitsyn at oracle.com Tue Jul 31 08:32:34 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 01:32:34 -0700 Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find boolVar with expected value: false In-Reply-To: <8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com> References: <5B082D2E.7000408@oracle.com> <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com> <5B5233DC.5040003@oracle.com> <853aba55-fafc-2797-ed44-818760bd5571@oracle.com> <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com> <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com> <8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From coleen.phillimore at oracle.com Tue Jul 31 11:56:15 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 07:56:15 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com> Message-ID: <72b2618f-ec62-00ff-8af5-53dbc67156ef@oracle.com> On 7/30/18 5:46 PM, David Holmes wrote: > On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 > > For the sake of other readers who don't want to have to reverse > engineer the actual cause of the problem, the original code has two > Method.invoke sequences: one for a static method and which passed a > null receiver; one for a non-static method which passed a non-null > receiver.
The refactoring extracted the invoke logic but always passed > a null receiver - which was wrong for the non-static case. The fix > always passes a non-null receiver to fix the non-static case, and > which is ignored in the static case. Thank you David for summarizing the bug(s) and the review. Coleen > > Reviewed. Trivial. > > Thanks, > David > >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896. >> >> Thanks, >> Coleen From coleen.phillimore at oracle.com Tue Jul 31 12:01:08 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 08:01:08 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> Message-ID: <2f6bf712-594c-d859-128d-cb30343ec591@oracle.com> On 7/30/18 9:34 PM, Chris Plummer wrote: > Hi Coleen, > > Now that this had been pushed, I assume JDK-8202896 should be closed > as a dup. And what about JDK-8206076? Is it fixed by this change also? Yes, it should be closed also. I didn't see this bug. When I was fixing the first one: https://bugs.openjdk.java.net/browse/JDK-8203820 , I looked for similar patterns in the vmTestbase tests and found this test also. All of these tests were calling InMemoryJavaCompiler from within a loop and from within multiple threads to get the same result. I can imagine this easily timing out for -Xcomp. I haven't pushed it yet. I was hoping you'd see this and comment on it, since you had comments for the whole set of bugs. Thanks!
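A minimal Java sketch of the receiver issue David Holmes summarizes in this thread, for readers who want to see it in isolation. Everything named here (ReceiverDemo, staticHello, instanceHello, invokeOn, invokeWithNullReceiver) is hypothetical and is not taken from the test or the webrev; the only library behavior relied on is that java.lang.reflect.Method.invoke ignores the receiver for a static method and throws NullPointerException when an instance method is given a null receiver.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the class whose methods the test invokes reflectively.
class ReceiverDemo {
    public static String staticHello() { return "static"; }
    public String instanceHello() { return "instance"; }

    // Shape of the fix described in the thread: always pass a non-null receiver.
    // Method.invoke ignores the receiver when the method is static.
    static Object invokeOn(Object receiver, String name) {
        try {
            Method m = receiver.getClass().getMethod(name);
            return m.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Shape of the broken refactoring: always pass a null receiver.
    // This works for static methods but throws NullPointerException
    // for instance methods.
    static Object invokeWithNullReceiver(Class<?> type, String name) {
        try {
            Method m = type.getMethod(name);
            return m.invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ReceiverDemo target = new ReceiverDemo();
        System.out.println(invokeOn(target, "staticHello"));
        System.out.println(invokeOn(target, "instanceHello"));
        System.out.println(invokeWithNullReceiver(ReceiverDemo.class, "staticHello"));
        try {
            invokeWithNullReceiver(ReceiverDemo.class, "instanceHello");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for instance method with null receiver");
        }
    }
}
```

A single helper that takes a non-null receiver covers both the static and the non-static call sites, which is why the fix could stay trivial.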
Coleen > > thanks, > > Chris > > On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >> Summary: fixed refactoring caused by JDK-8203820 >> >> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >> >> Ran the test in mach5 on all Oracle supported platforms. Also took >> the test out of ProblemList.txt because JDK-8203820 fixes >> https://bugs.openjdk.java.net/browse/JDK-8202896. >> >> Thanks, >> Coleen > > > From coleen.phillimore at oracle.com Tue Jul 31 12:06:35 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 08:06:35 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> Message-ID: <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > Good catch. > It is possible that this webrev does not fix the JDK-8202896. > The JDK-8202896 is about timeouts which are normally intermittent (is > it right?). > > There are two options here: > A: close 8202896 as a dup of 8208074 > B: keep the test problem listed and labeled with 8202896 > > Let's wait for Coleen's answer. I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts with -Xcomp) as a duplicate of https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took InMemoryCompiler out of the threads) because that's where the attempted fix was. I think https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many open files intermittently) should be closed as a duplicate too because it's the same root cause.
And this one: https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) fixes my fix and will remove the test from the ProblemList.txt. I believe it should be removed from the problem list because I don't think it will time out or intermittently fail again for the same reason. If it times out or fails for a different reason, we should file a whole new bug, with that specific analysis. Thanks, Coleen > > Thanks, > Serguei > > > On 7/31/18 00:16, Chris Plummer wrote: >> Sorry, I thought this had been pushed already, but it hasn't. But it >> still looks like JDK-8202896 should be closed as a dup, and it's >> unclear to me if JDK-8206076 has been fixed and this test can be >> removed from the problem list. >> >> Chris >> >> On 7/30/18 6:34 PM, Chris Plummer wrote: >>> Hi Coleen, >>> >>> Now that this had been pushed, I assume JDK-8202896 should be closed >>> as a dup. And what about JDK-8206076? Is it fixed by this change also? >>> >>> thanks, >>> >>> Chris >>> >>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>> Summary: fixed refactoring caused by JDK-8203820 >>>> >>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>> >>>> Ran the test in mach5 on all Oracle supported platforms. Also took >>>> the test out of ProblemList.txt because JDK-8203820 fixes >>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>> >>>> Thanks, >>>> Coleen >>> >>> >>> >> >> > From chris.plummer at oracle.com Tue Jul 31 16:13:19 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 31 Jul 2018 09:13:19 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: > > > On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >> Hi Chris, >> >> Good catch. >> It is possible that this webrev does not fix the JDK-8202896. >> The JDK-8202896 is about timeouts which are normally intermittent (is >> it right?). >> >> There are two options here: >> A: close 8202896 as a dup of 8208074 >> B: keep the test problem listed and labeled with 8202896 >> >> Let's wait for Coleen's answer. > > I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts > with -Xcomp) > as a duplicate of > https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took > InMemoryCompiler out of the threads) > because that's where the attempted fix was. > > I think > https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many > open files intermittently) > should be closed as a duplicate too because it's the same root cause. > > And this one: > https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) > fixes my fix and will remove the test from the ProblemList.txt. > > I believe it should be removed from the problem list because I don't > think it will time out or intermittently fail again for the same > reason. If it times out or fails for a different reason, we should > file a whole new bug, with that specific analysis.
> > Thanks, > Coleen Hi Coleen, That all sounds reasonable. Thanks for cleaning up the bug situation. Chris > >> >> Thanks, >> Serguei >> >> >> On 7/31/18 00:16, Chris Plummer wrote: >>> Sorry, I thought this had been pushed already, but it hasn't. But it >>> still looks like JDK-8202896 should be closed as a dup, and it's >>> unclear to me if JDK-8206076 has been fixed and this test can be >>> removed from the problem list. >>> >>> Chris >>> >>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>> Hi Coleen, >>>> >>>> Now that this had been pushed, I assume JDK-8202896 should be >>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>> change also? >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>> >>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>> >>>>> Ran the test in mach5 on all Oracle supported platforms. Also took >>>>> the test out of ProblemList.txt because JDK-8203820 fixes >>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>> >>>>> Thanks, >>>>> Coleen >>>> >>>> >>>> >>> >>> >> > From chris.plummer at oracle.com Tue Jul 31 17:43:31 2018 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 31 Jul 2018 10:43:31 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: Hi Coleen, I just realized that there is also https://bugs.openjdk.java.net/browse/JDK-8208234 filed for this test last week. It results in an OOME. I think it's the same issue, but just want to check with you.
Please close it as a dup if you think it is the same. thanks, Chris On 7/31/18 9:13 AM, Chris Plummer wrote: > On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> Good catch. >>> It is possible that this webrev does not fix the JDK-8202896. >>> The JDK-8202896 is about timeouts which are normally intermittent >>> (is it right?). >>> >>> There are two options here: >>> A: close 8202896 as a dup of 8208074 >>> B: keep the test problem listed and labeled with 8202896 >>> >>> Let's wait for Coleen's answer. >> >> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >> with -Xcomp) >> as a duplicate of >> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >> InMemoryCompiler out of the threads) >> because that's where the attempted fix was. >> >> I think >> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >> open files intermittently) >> should be closed as a duplicate too because it's the same root cause. >> >> And this one: >> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >> fixes my fix and will remove the test from the ProblemList.txt. >> >> I believe it should be removed from the problem list because I don't >> think it will time out or intermittently fail again for the same >> reason. If it times out or fails for a different reason, we should >> file a whole new bug, with that specific analysis. >> >> Thanks, >> Coleen > > Hi Coleen, > > That all sounds reasonable. Thanks for cleaning up the bug situation. > > Chris >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/31/18 00:16, Chris Plummer wrote: >>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>> it still looks like JDK-8202896 should be closed as a dup, and it's >>>> unclear to me if JDK-8206076 has been fixed and this test can be >>>> removed from the problem list.
>>>> >>>> Chris >>>> >>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>> Hi Coleen, >>>>> >>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>> change also? >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>> >>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>>> >>>>> >>>> >>>> >>> >> > > From serguei.spitsyn at oracle.com Tue Jul 31 18:07:35 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 11:07:35 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> On 7/31/18 09:13, Chris Plummer wrote: > On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >> >> >> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>> Hi Chris, >>> >>> Good catch. >>> It is possible that this webrev does not fix the JDK-8202896. >>> The JDK-8202896 is about timeouts which are normally intermittent >>> (is it right?). >>> >>> There are two options here: >>> A: close 8202896 as a dup of 8208074 >>>
B: keep the test problem listed and labeled with 8202896 >>> >>> Let's wait for Coleen's answer. >> >> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >> with -Xcomp) >> as a duplicate of >> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >> InMemoryCompiler out of the threads) >> because that's where the attempted fix was. >> >> I think >> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >> open files intermittently) >> should be closed as a duplicate too because it's the same root cause. >> >> And this one: >> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >> fixes my fix and will remove the test from the ProblemList.txt. >> >> I believe it should be removed from the problem list because I don't >> think it will time out or intermittently fail again for the same >> reason. If it times out or fails for a different reason, we should >> file a whole new bug, with that specific analysis. >> >> Thanks, >> Coleen > > Hi Coleen, > > That all sounds reasonable. Thanks for cleaning up the bug situation. +1 Thanks, Serguei > > Chris >> >>> >>> Thanks, >>> Serguei >>> >>> >>> On 7/31/18 00:16, Chris Plummer wrote: >>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>> it still looks like JDK-8202896 should be closed as a dup, and it's >>>> unclear to me if JDK-8206076 has been fixed and this test can be >>>> removed from the problem list. >>>> >>>> Chris >>>> >>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>> Hi Coleen, >>>>> >>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>> change also?
>>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>> >>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>> >>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>>> >>>>>> Thanks, >>>>>> Coleen >>>>> >>>>> >>>>> >>>> >>>> >>> >> > > From hohensee at amazon.com Tue Jul 31 18:45:14 2018 From: hohensee at amazon.com (Hohensee, Paul) Date: Tue, 31 Jul 2018 18:45:14 +0000 Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions In-Reply-To: <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com> References: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com> <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com> <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com> Message-ID: A few small things for http://cr.openjdk.java.net/~tschatzl/8208498/webrev/, otherwise looks good. collectionSetChooser.cpp: Doesn't !r->is_old() include is_archive()? g1CollectedHeap.hpp: Add archive_region_add(), archive_region_remove(), and old_set_bulk_remove(). In non_young_capacity_bytes(), use old_regions_count(), humongous_regions_count(), and archive_regions_count(). g1CollectedHeap.cpp: Use old_set_add() and friends where possible. "// humongous regions set." -> "// humongous and archive region sets." On 7/30/18, 4:27 PM, "Hohensee, Paul" wrote: A couple nits on http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/. g1CollectedHeap.cpp: in initialize_serviceability(), memory_managers(), and memory_pools(), use g1mm() instead of _g1mm. g1MonitoringSupport.cpp: there's an extra newline after ~G1MonitoringSupport(). Otherwise looks good.
Paul On 7/30/18, 12:18 PM, "Hohensee, Paul" wrote: At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones. Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it. I'd not have thought of making a G1MonitoringScope, looks good. Thanks, Paul On 7/30/18, 6:04 AM, "Thomas Schatzl" wrote: Hi Paul, did some prototyping and wanted to show you the results and get your input: On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote: > [...] > Could we work together on first refactoring the code before adding > new > kinds of spaces to the MXBeans? > > Looking at this change and mine roughly the following issues would > need to be resolved first: > - find a solution for archive regions as suggested above :) At the > moment, without doing the change, I would tend to make archive > regions separate from old regions. I went with that and I am currently testing https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/ > - move serviceability stuff as much as possible to > g1MonitoringSupport Preliminary webrev: http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/ I think this came out better than expected: while we maybe want to add a ServiceabilitySupport interface that collects the get_memory_manager/pools/* methods in the future, imho this is a lot better than current code as it tightens the G1MonitoringSupport interface quite a bit.
Particularly of note should be the G1MonitoringScope class that collects both TraceCollectorStats and TraceMemoryManagerStats into a single class. (Instead of the two bools passed to it something indicating the GC directly would probably be better too). It would be nice if something similar could be made for the concurrent Trace*Stats. > - clean up MemoryPool, remove duplicate information > - provide and return sane memory pool used/committed values to the > MXBeans > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed" > variables > for every single memory pool. Use MemoryUsage structs for them. Make > reading of memory pool information atomic wrt to its readers (note > that I think it is currently just impossible to get consistent output > for other statistics like jstat) - that's JDK-8207200. > - add whatever serviceability stuff for the new pools/jstat/* in > steps. Thanks, Thomas From coleen.phillimore at oracle.com Tue Jul 31 20:07:25 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 16:07:25 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> Message-ID: On 7/31/18 1:43 PM, Chris Plummer wrote: > Hi Coleen, > > I just realized that there is also > https://bugs.openjdk.java.net/browse/JDK-8208234 filed for this test > last week. It results in an OOME. I think it's the same issue, but > just want to check with you. Please close it as a dup if you think it is > the same. Yes, I think this is the same thing. One call to InMemoryCompiler shouldn't OOME but multiple concurrent calls could.
thanks, Coleen > > thanks, > > Chris > > On 7/31/18 9:13 AM, Chris Plummer wrote: >> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> Good catch. >>>> It is possible that this webrev does not fix the JDK-8202896. >>>> The JDK-8202896 is about timeouts which are normally intermittent >>>> (is it right?). >>>> >>>> There are two options here: >>>> A: close 8202896 as a dup of 8208074 >>>> B: keep the test problem listed and labeled with 8202896 >>>> >>>> Let's wait for Coleen's answer. >>> >>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >>> with -Xcomp) >>> as a duplicate of >>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >>> InMemoryCompiler out of the threads) >>> because that's where the attempted fix was. >>> >>> I think >>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >>> open files intermittently) >>> should be closed as a duplicate too because it's the same root cause. >>> >>> And this one: >>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >>> fixes my fix and will remove the test from the ProblemList.txt. >>> >>> I believe it should be removed from the problem list because I don't >>> think it will time out or intermittently fail again for the same >>> reason. If it times out or fails for a different reason, we should >>> file a whole new bug, with that specific analysis. >>> >>> Thanks, >>> Coleen >> >> Hi Coleen, >> >> That all sounds reasonable. Thanks for cleaning up the bug situation. >> >> Chris >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/31/18 00:16, Chris Plummer wrote: >>>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>>> it still looks like JDK-8202896 should be closed as a dup, and >>>>> it's unclear to me if JDK-8206076 has been fixed and this test can >>>>> be removed from the problem list.
>>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>>> change also? >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>>> >>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> >> > > From coleen.phillimore at oracle.com Tue Jul 31 20:09:20 2018 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 31 Jul 2018 16:09:20 -0400 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> Message-ID: <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com> On 7/31/18 2:07 PM, serguei.spitsyn at oracle.com wrote: > On 7/31/18 09:13, Chris Plummer wrote: >> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >>> >>> >>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>>> Hi Chris, >>>> >>>> Good catch. >>>> It is possible that this webrev does not fix the JDK-8202896. 
>>>> The JDK-8202896 is about timeouts which are normally intermittent >>>> (is it right?). >>>> >>>> There are two options here: >>>> A: close 8202896 as a dup of 8208074 >>>> B: keep the test problem listed and labeled with 8202896 >>>> >>>> Let's wait for Coleen's answer. >>> >>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >>> with -Xcomp) >>> as a duplicate of >>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >>> InMemoryCompiler out of the threads) >>> because that's where the attempted fix was. >>> >>> I think >>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >>> open files intermittently) >>> should be closed as a duplicate too because it's the same root cause. >>> >>> And this one: >>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >>> fixes my fix and will remove the test from the ProblemList.txt. >>> >>> I believe it should be removed from the problem list because I don't >>> think it will time out or intermittently fail again for the same >>> reason. If it times out or fails for a different reason, we should >>> file a whole new bug, with that specific analysis. >>> >>> Thanks, >>> Coleen >> >> Hi Coleen, >> >> That all sounds reasonable. Thanks for cleaning up the bug situation. > > +1 Thanks Chris and Serguei for your discussion of this bug. Hopefully this test becomes stable and useful now. Coleen > > Thanks, > Serguei >> >> Chris >>> >>>> >>>> Thanks, >>>> Serguei >>>> >>>> >>>> On 7/31/18 00:16, Chris Plummer wrote: >>>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>>> it still looks like JDK-8202896 should be closed as a dup, and >>>>> it's unclear to me if JDK-8206076 has been fixed and this test can >>>>> be removed from the problem list. >>>>> >>>>> Chris >>>>> >>>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>>> Hi Coleen, >>>>>> >>>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>>> closed as a dup.
And what about JDK-8206076? Is it fixed by this >>>>>> change also? >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>>> >>>>>>> open webrev at >>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>>> >>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>>>> >>>>>>> Thanks, >>>>>>> Coleen >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> >> > From serguei.spitsyn at oracle.com Tue Jul 31 22:55:39 2018 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 31 Jul 2018 15:55:39 -0700 Subject: RFR (trivial) 8208074: [TESTBUG] vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java failed with NullPointerException In-Reply-To: <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com> References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com> <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com> <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com> <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com> <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com> <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com> Message-ID: <306e1860-9066-34e3-036e-1ded191d0cd4@oracle.com> On 7/31/18 13:09, coleen.phillimore at oracle.com wrote: > > > On 7/31/18 2:07 PM, serguei.spitsyn at oracle.com wrote: >> On 7/31/18 09:13, Chris Plummer wrote: >>> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> >>>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote: >>>>> Hi Chris, >>>>> >>>>> Good catch. >>>>> It is possible that this webrev does not fix the JDK-8202896. >>>>> The JDK-8202896 is about timeouts which are normally intermittent >>>>> (is it right?). 
>>>>> >>>>> There are two options here: >>>>> A: close 8202896 as a dup of 8208074 >>>>> B: keep the test problem listed and labeled with 8202896 >>>>> >>>>> Let's wait for Coleen's answer. >>>> >>>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts >>>> with -Xcomp) >>>> as a duplicate of >>>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took >>>> InMemoryCompiler out of the threads) >>>> because that's where the attempted fix was. >>>> >>>> I think >>>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many >>>> open files intermittently) >>>> should be closed as a duplicate too because it's the same root cause. >>>> >>>> And this one: >>>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix) >>>> fixes my fix and will remove the test from the ProblemList.txt. >>>> >>>> I believe it should be removed from the problem list because I >>>> don't think it will time out or intermittently fail again for the >>>> same reason. If it times out or fails for a different reason, we >>>> should file a whole new bug, with that specific analysis. >>>> >>>> Thanks, >>>> Coleen >>> >>> Hi Coleen, >>> >>> That all sounds reasonable. Thanks for cleaning up the bug situation. >> >> +1 > > Thanks Chris and Serguei for your discussion of this bug. Hopefully > this test becomes stable and useful now. Thanks a lot for taking care about this issue, Coleen! Thanks, Serguei > Coleen > >> >> Thanks, >> Serguei >>> >>> Chris >>>> >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 7/31/18 00:16, Chris Plummer wrote: >>>>>> Sorry, I thought this had been pushed already, but it hasn't. But >>>>>> it still looks like JDK-8202896 should be closed as a dup, and >>>>>> it's unclear to me if JDK-8206076 has been fixed and this test >>>>>> can be removed from the problem list.
>>>>>> >>>>>> Chris >>>>>> >>>>>> On 7/30/18 6:34 PM, Chris Plummer wrote: >>>>>>> Hi Coleen, >>>>>>> >>>>>>> Now that this had been pushed, I assume JDK-8202896 should be >>>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this >>>>>>> change also? >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote: >>>>>>>> Summary: fixed refactoring caused by JDK-8203820 >>>>>>>> >>>>>>>> open webrev at >>>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev >>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074 >>>>>>>> >>>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also >>>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Coleen >>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >> >