From david.holmes at oracle.com  Sun Jul  1 12:19:50 2018
From: david.holmes at oracle.com (David Holmes)
Date: Sun, 1 Jul 2018 22:19:50 +1000
Subject: [11] RFR: 8205653:
 test/jdk/sun/management/jmxremote/bootstrap/RmiRegistrySslTest.java and
 RmiSslBootstrapTest.sh fail with handshake_failure
In-Reply-To: <687b7dd7-5be5-cfdb-c411-2cf4a7008b12@oracle.com>
References: <3e4af336-6863-4145-9dce-60b08ea64a79@default>
 <74fabdff-3523-49cc-5ac2-4b766c8bbb30@oracle.com>
 <5764f7f2-f0a7-4bf1-44dc-977953cc6cab@oracle.com>
 <06b1c50c-401c-4c50-9ddf-7876c0638e63@default>
 <687b7dd7-5be5-cfdb-c411-2cf4a7008b12@oracle.com>
Message-ID: <84c24bf6-05d6-3120-6f23-2483f51e3175@oracle.com>

On 29/06/2018 6:32 PM, Alan Bateman wrote:
> On 29/06/2018 09:22, Sibabrata Sahoo wrote:
>> May I get the approval from serviceability-dev at openjdk.java.net.
>>
> This is a test-only change to update the keystores and the list of
> ciphers/protocols that the test uses. There's nothing serviceability
> specific here so having a Reviewer from the security area should be okay
> in the event that you don't get a quick review on the serviceability-dev list.

+1

I certainly can't comment on any of this keystore stuff.

David

> -Alan

From rafael.wth at gmail.com  Mon Jul  2 08:41:38 2018
From: rafael.wth at gmail.com (Rafael Winterhalter)
Date: Mon, 2 Jul 2018 10:41:38 +0200
Subject: Review Request JDK-8200559: Java agents doing instrumentation
 need a means to define auxiliary classes
In-Reply-To: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
Message-ID: <CA+DM0Amu8L-vkgHOfhe5YCZO_+c2CMq3vVp3yb5dfdhyz+PK2A@mail.gmail.com>

Hi,

I was wondering if a solution for this problem is still planned for JDK 11,
given the beginning of ramp-down.

With sun.misc.Unsafe::defineClass removed, the only option left to Java
agents is jdk.internal.misc.Unsafe::defineClass for the use cases that I
described.

I think it would be a missed opportunity not to offer an alternative as of
JDK 11 as a second migration would make it even less likely that agents
would avoid unsafe API.

Thanks for the information,
best regards, Rafael

On Sun, 15 Apr 2018, 08:23, mandy chung <mandy.chung at oracle.com> wrote:

> Background:
>
> Java agents support both load-time and dynamic instrumentation.  At load
> time, the agent's ClassFileTransformer is invoked to transform class bytes.
> There is no Class object at this time.  Dynamic instrumentation is when
> redefineClasses or retransformClasses is used to redefine an existing
> loaded class.  The ClassFileTransformer is invoked with class bytes where
> the Class object is present.
>
> A Java agent doing instrumentation needs a means to define auxiliary classes
> that are visible and accessible to the instrumented class.  Existing agents
> have been using sun.misc.Unsafe::defineClass to define aux classes directly,
> or accessing the protected ClassLoader::defineClass method with setAccessible
> to suppress the language access check (see [1] where this issue was brought
> up).
>
> Instrumentation::appendToBootstrapClassLoaderSearch and
> appendToSystemClassLoaderSearch are the existing means to supply additional
> classes.  They are too limited; for example, they can't inject a class in
> the same runtime package as the class being transformed.
>
> Proposal:
>
> This proposes to add a new ClassFileTransformer.transform method taking
> an additional ClassDefiner parameter.  A transformer can define additional
> classes during the transformation process, i.e.
> when ClassFileTransformer::transform is invoked.  Some details:
>
> 1. ClassDefiner::defineClass defines a class in the same runtime package
>    as the class being transformed.
> 2. The class is defined in the same thread as the transformers are being
>    invoked.  ClassDefiner::defineClass returns the Class object directly,
>    before the transformed class is defined.
> 3. No transformation is applied to classes defined by
>    ClassDefiner::defineClass.
>
> The first prototype we did was to collect the auxiliary classes, defer
> defining them until all transformers have been invoked, and have these aux
> classes go through the transformation pipeline.  Several complicated issues
> would need to be resolved, for example timing: whether the auxiliary classes
> should be defined before the transformed class (otherwise there is a
> potential race where some other thread references the transformed class and
> causes code to execute that in turn references the auxiliary classes).  The
> current implementation has a native reentrancy check that ensures only one
> class is being transformed, to avoid potential circularity issues.  This may
> need JVM TI support to be reliable.
>
> This proposal would allow Java agents to migrate from internal API, and
> allow ClassDefiner to be enhanced in the future.
>
> Webrev:
>    http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8200559/webrev.00/
>
> Mandy
> [1]
> http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html
>

From christoph.langer at sap.com  Mon Jul  2 09:03:58 2018
From: christoph.langer at sap.com (Langer, Christoph)
Date: Mon, 2 Jul 2018 09:03:58 +0000
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
Message-ID: <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>

Hi Matthias,

Forwarding to serviceability-dev, because debugging is usually discussed there.

Yes, I would think this code should be fixed, too. Can you open a bug and prepare a change?

Thanks
Christoph

> -----Original Message-----
> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
> Norman Maurer
> Sent: Monday, 2 July 2018 10:23
> To: Baesken, Matthias <matthias.baesken at sap.com>
> Cc: Stuefe, Thomas <thomas.stuefe at sap.com>; net-dev at openjdk.java.net
> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> 
> +1 retry a close on EINTR has most likely not the outcome you expect and
> may even close a wrong FD if the same FD is reused already (as even if EINTR
> is returned it may have closed the FD)
> 
> > On 02.07.2018 at 10:17, Baesken, Matthias <matthias.baesken at sap.com> wrote:
> >
> > Hello, there is a similar pattern (an attempt to restart close in case of
> > EINTR) in the code in socket_md.c as well
> > (src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c, lines 147-153):
> >
> >     int rv;
> >     do {
> >         rv = close(fd);
> >     } while (rv == -1 && errno == EINTR);
> >
> >     return rv;
> > }
> >
> > Do you think this needs adjustment (on Linux) as well?
> >
> > Best regards, Matthias
> >
> >
> >> Message: 2
> >> Date: Thu, 28 Jun 2018 18:19:46 +0100
> >> From: Alan Bateman <Alan.Bateman at oracle.com>
> >> To: David Lloyd <david.lloyd at redhat.com>, ivan.gerasimov at oracle.com
> >> Cc: OpenJDK Network Dev list <net-dev at openjdk.java.net>
> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
> >> Content-Type: text/plain; charset=utf-8; format=flowed
> >>
> >>> On 28/06/2018 17:35, David Lloyd wrote:
> >>> :
> >>> Do you (or Alan) think that this might have accounted for real-world
> >>> connection problems?
> >>>
> >> In the file I/O area, with NFS I think, we had an issue a long time ago
> >> where close was retried after EIO. That issue was fixed a long time ago
> >> but it's one that comes to mind in this general area.
> >>
> >> -Alan
> >>
> >
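
For reference, a minimal sketch of the non-restarting variant discussed
above. This is an illustration only, not the actual socket_md.c change; the
helper name close_once is hypothetical. It calls close() a single time and
hands any error, including EINTR, back to the caller instead of looping:

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the JDK code): close fd exactly once.
 * If close() fails with EINTR the descriptor may already have been
 * released, so a retry could close an unrelated descriptor that was
 * handed out with the same number in the meantime.
 */
static int close_once(int fd) {
    int rv = close(fd);   /* single attempt, no do/while around it */
    return rv;            /* -1 with errno set on failure, 0 on success */
}
```

A caller may log or ignore an EINTR result, but under this approach it
treats the descriptor as gone either way.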

From Alan.Bateman at oracle.com  Mon Jul  2 09:55:45 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 2 Jul 2018 10:55:45 +0100
Subject: Review Request JDK-8200559: Java agents doing instrumentation
 need a means to define auxiliary classes
In-Reply-To: <CA+DM0Amu8L-vkgHOfhe5YCZO_+c2CMq3vVp3yb5dfdhyz+PK2A@mail.gmail.com>
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
 <CA+DM0Amu8L-vkgHOfhe5YCZO_+c2CMq3vVp3yb5dfdhyz+PK2A@mail.gmail.com>
Message-ID: <813342b9-d670-e60f-4cc0-e2b5d0542b5f@oracle.com>

On 02/07/2018 09:41, Rafael Winterhalter wrote:
> Hi,
>
> I was wondering if a solution for this problem is still planned for 
> JDK 11 giving the beginning ramp down.
>
> With removing sun.misc.Unsafe::defineClass, Java agents only have an 
> option to use jdk.internal.misc.Unsafe::defineClass for the use-cases 
> that I described.
>
> I think it would be a missed opportunity not to offer an alternative 
> as of JDK 11 as a second migration would make it even less likely that 
> agents would avoid unsafe API.
>
Mandy's proposal to allow agents doing instrumentation to define
auxiliary classes in the same runtime package as the class being loaded
or redefined is a good proposal that makes complete sense and fits with
the intended use of this API. Unfortunately it didn't make JDK 11.

I read the mails and arguments for an Instrumentation.defineClass but I
don't think it's the right API to add. The Instrumentation API was
designed for tool agents, not libraries, and a lot of discussion seems
to be trying to use the API for cases that it was never intended for. Also
an unrestricted defineClass creates an attractive nuisance that would
likely create a lot of problems further down the road. I think it would
be better to focus on some of the use-cases to see if we can identify
cases where a standard API makes sense.

-Alan

From thomas.stuefe at gmail.com  Mon Jul  2 10:08:15 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 2 Jul 2018 12:08:15 +0200
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
 <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
Message-ID: <CAA-vtUzrC5E1U8yYibbRLx7fABJv9sQuNdaL7FdHbqvah0tVGQ@mail.gmail.com>

+1. Please fix this for Linux! Thanks.

On Mon, Jul 2, 2018 at 11:03 AM, Langer, Christoph
<christoph.langer at sap.com> wrote:
> Hi Matthias,
>
> forwarding to serviceability-dev, because debugging is usually discussed there.
>
> Yes, I would think this coding should be fixed, too. Can you open a bug and prepare a change?
>
> Thanks
> Christoph
>
>> -----Original Message-----
>> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
>> Norman Maurer
>> Sent: Monday, 2 July 2018 10:23
>> To: Baesken, Matthias <matthias.baesken at sap.com>
>> Cc: Stuefe, Thomas <thomas.stuefe at sap.com>; net-dev at openjdk.java.net
>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>>
>> +1 retry a close on EINTR has most likely not the outcome you expect and
>> may even close a wrong FD if the same FD is reused already (as even if EINTR
>> is returned it may have closed the FD)
>>
>> > On 02.07.2018 at 10:17, Baesken, Matthias <matthias.baesken at sap.com> wrote:
>> >
>> > Hello, there is a similar pattern (an attempt to restart close in case of
>> > EINTR) in the code in socket_md.c as well
>> > (src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c, lines 147-153):
>> >
>> >     int rv;
>> >     do {
>> >         rv = close(fd);
>> >     } while (rv == -1 && errno == EINTR);
>> >
>> >     return rv;
>> > }
>> >
>> > Do you think this needs adjustment (on Linux) as well?
>> >
>> > Best regards, Matthias
>> >
>> >
>> >> Message: 2
>> >> Date: Thu, 28 Jun 2018 18:19:46 +0100
>> >> From: Alan Bateman <Alan.Bateman at oracle.com>
>> >> To: David Lloyd <david.lloyd at redhat.com>, ivan.gerasimov at oracle.com
>> >> Cc: OpenJDK Network Dev list <net-dev at openjdk.java.net>
>> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>> >> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
>> >> Content-Type: text/plain; charset=utf-8; format=flowed
>> >>
>> >>> On 28/06/2018 17:35, David Lloyd wrote:
>> >>> :
>> >>> Do you (or Alan) think that this might have accounted for real-world
>> >>> connection problems?
>> >>>
>> >> In the file I/O area, with NFS I think, we had an issue a long time ago
>> >> where close was retried after EIO. That issue was fixed a long time ago
>> >> but it's one that comes to mind in this general area.
>> >>
>> >> -Alan
>> >>
>> >

From david.holmes at oracle.com  Mon Jul  2 12:03:55 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 2 Jul 2018 22:03:55 +1000
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
 <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
Message-ID: <5a2ab4d2-ed90-20fd-6340-e73153d4313d@oracle.com>

In reference to 8205959, where is it stated that dup2 is any more 
restartable than close ??

AFAICS both leave things undefined/unspecified if they set EINTR.

David

On 2/07/2018 7:03 PM, Langer, Christoph wrote:
> Hi Matthias,
> 
> forwarding to serviceability-dev, because debugging is usually discussed there.
> 
> Yes, I would think this coding should be fixed, too. Can you open a bug and prepare a change?
> 
> Thanks
> Christoph
> 
>> -----Original Message-----
>> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
>> Norman Maurer
>> Sent: Monday, 2 July 2018 10:23
>> To: Baesken, Matthias <matthias.baesken at sap.com>
>> Cc: Stuefe, Thomas <thomas.stuefe at sap.com>; net-dev at openjdk.java.net
>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>>
>> +1 retry a close on EINTR has most likely not the outcome you expect and
>> may even close a wrong FD if the same FD is reused already (as even if EINTR
>> is returned it may have closed the FD)
>>
>>> On 02.07.2018 at 10:17, Baesken, Matthias <matthias.baesken at sap.com> wrote:
>>>
>>> Hello, there is a similar pattern (an attempt to restart close in case of
>>> EINTR) in the code in socket_md.c as well
>>> (src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c, lines 147-153):
>>>
>>>     int rv;
>>>     do {
>>>         rv = close(fd);
>>>     } while (rv == -1 && errno == EINTR);
>>>
>>>     return rv;
>>> }
>>>
>>> Do you think this needs adjustment (on Linux) as well?
>>>
>>> Best regards, Matthias
>>>
>>>
>>>> Message: 2
>>>> Date: Thu, 28 Jun 2018 18:19:46 +0100
>>>> From: Alan Bateman <Alan.Bateman at oracle.com>
>>>> To: David Lloyd <david.lloyd at redhat.com>, ivan.gerasimov at oracle.com
>>>> Cc: OpenJDK Network Dev list <net-dev at openjdk.java.net>
>>>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
>>>> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
>>>> Content-Type: text/plain; charset=utf-8; format=flowed
>>>>
>>>>> On 28/06/2018 17:35, David Lloyd wrote:
>>>>> :
>>>>> Do you (or Alan) think that this might have accounted for real-world
>>>>> connection problems?
>>>>>
>>>> In the file I/O area, with NFS I think, we had an issue a long time ago
>>>> where close was retried after EIO. That issue was fixed a long time ago
>>>> but it's one that comes to mind in this general area.
>>>>
>>>> -Alan
>>>>
>>>

From matthias.baesken at sap.com  Mon Jul  2 13:44:22 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Mon, 2 Jul 2018 13:44:22 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is EINTR
 [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
Message-ID: <228035d2f64c494eaefe31b07ac72083@sap.com>

I created a bug and a webrev; please review.

https://bugs.openjdk.java.net/browse/JDK-8206145

http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/

(The other bug, where a similar issue was addressed, is https://bugs.openjdk.java.net/browse/JDK-8205959)

Best regards, Matthias


> -----Original Message-----
> From: Thomas Stüfe [mailto:thomas.stuefe at gmail.com]
> Sent: Monday, 2 July 2018 12:08
> To: Baesken, Matthias <matthias.baesken at sap.com>; Langer, Christoph
> <christoph.langer at sap.com>
> Cc: serviceability-dev (serviceability-dev at openjdk.java.net) <serviceability-
> dev at openjdk.java.net>; Stuefe, Thomas <thomas.stuefe at sap.com>; net-
> dev at openjdk.java.net
> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> 
> +1. Please fix this for Linux! Thanks.
> 
> On Mon, Jul 2, 2018 at 11:03 AM, Langer, Christoph
> <christoph.langer at sap.com> wrote:
> > Hi Matthias,
> >
> > forwarding to serviceability-dev, because debugging is usually discussed
> there.
> >
> > Yes, I would think this coding should be fixed, too. Can you open a bug and
> prepare a change?
> >
> > Thanks
> > Christoph
> >
> >> -----Original Message-----
> >> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
> >> Norman Maurer
> >> Sent: Monday, 2 July 2018 10:23
> >> To: Baesken, Matthias <matthias.baesken at sap.com>
> >> Cc: Stuefe, Thomas <thomas.stuefe at sap.com>; net-
> dev at openjdk.java.net
> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >>
> >> +1 retry a close on EINTR has most likely not the outcome you expect and
> >> may even close a wrong FD if the same FD is reused already (as even if
> EINTR
> >> is returned it may have closed the FD)
> >>
> >> > On 02.07.2018 at 10:17, Baesken, Matthias <matthias.baesken at sap.com> wrote:
> >> >
> >> > Hello, there is a similar pattern (an attempt to restart close in case of
> >> > EINTR) in the code in socket_md.c as well
> >> > (src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c, lines 147-153):
> >> >
> >> >     int rv;
> >> >     do {
> >> >         rv = close(fd);
> >> >     } while (rv == -1 && errno == EINTR);
> >> >
> >> >     return rv;
> >> > }
> >> >
> >> > Do you think this needs adjustment (on Linux) as well?
> >> >
> >> > Best regards, Matthias
> >> >
> >> >
> >> >> Message: 2
> >> >> Date: Thu, 28 Jun 2018 18:19:46 +0100
> >> >> From: Alan Bateman <Alan.Bateman at oracle.com>
> >> >> To: David Lloyd <david.lloyd at redhat.com>,
> ivan.gerasimov at oracle.com
> >> >> Cc: OpenJDK Network Dev list <net-dev at openjdk.java.net>
> >> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >> >> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
> >> >> Content-Type: text/plain; charset=utf-8; format=flowed
> >> >>
> >> >>> On 28/06/2018 17:35, David Lloyd wrote:
> >> >>> :
> >> >>> Do you (or Alan) think that this might have accounted for real-world
> >> >>> connection problems?
> >> >>>
> >> >> In the file I/O area, with NFS I think, we had an issue a long time ago
> >> >> where close was retried after EIO. That issue was fixed a long time ago
> >> >> but it's one that comes to mind in this general area.
> >> >>
> >> >> -Alan
> >> >>
> >> >

From Alan.Bateman at oracle.com  Mon Jul  2 14:09:02 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 2 Jul 2018 15:09:02 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <228035d2f64c494eaefe31b07ac72083@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
Message-ID: <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>

On 02/07/2018 14:44, Baesken, Matthias wrote:
> I created a bug  and a webrev  , please review .
>
>
> https://bugs.openjdk.java.net/browse/JDK-8206145
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>
Why is this Linux only? I assume the do-while should be removed completely.

-Alan.

From thomas.stuefe at gmail.com  Mon Jul  2 15:41:52 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 2 Jul 2018 17:41:52 +0200
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
Message-ID: <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>

Hi Alan,

Whether to repeat close() in case of EINTR seems to differ between
platforms. POSIX leaves it open:

"If close() is interrupted by a signal that is to be caught, it shall
return -1 with errno set to [EINTR] and the state of fildes is
unspecified."

Linux recommends *not* repeating the call since the file descriptor is
closed already and repeating the close may close a reopened fd
belonging to someone else.

AIX, for instance, recommends repeating the call:

"EINTR  The state of the FileDescriptor is undetermined. Retry the
close routine to ensure that the FileDescriptor is closed."
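
To make the platform difference concrete, here is a rough sketch of a
per-platform retry policy. It is only an illustration: the macro
RETRY_CLOSE_ON_EINTR and the function close_portable are hypothetical names,
it assumes the compiler predefines _AIX on AIX, and defaulting every other
platform to "no retry" is an assumption that would need checking per
platform.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical policy switch: retry only where the platform docs ask for it. */
#if defined(_AIX)
#define RETRY_CLOSE_ON_EINTR 1   /* AIX documents retrying close on EINTR */
#else
#define RETRY_CLOSE_ON_EINTR 0   /* assumed default, e.g. Linux: never retry */
#endif

static int close_portable(int fd) {
    int rv;
    do {
        rv = close(fd);
        /* The loop repeats only when the policy allows it and the failure
         * was EINTR; with the policy off this is a single close() call. */
    } while (RETRY_CLOSE_ON_EINTR && rv == -1 && errno == EINTR);
    return rv;
}
```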

Best Regards, Thomas


On Mon, Jul 2, 2018 at 4:09 PM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> On 02/07/2018 14:44, Baesken, Matthias wrote:
>>
>> I created a bug  and a webrev  , please review .
>>
>>
>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>>
> Why is this Linux only? I assume the do-while should be removed completely.
>
> -Alan.

From Alan.Bateman at oracle.com  Mon Jul  2 15:55:25 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Mon, 2 Jul 2018 16:55:25 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
Message-ID: <376d39dc-ede0-99f0-d2be-af8db618a4bd@oracle.com>

On 02/07/2018 16:41, Thomas Stüfe wrote:
> Hi Alan,
>
> Whether to repeat close() in case of EINTR seems to differ between
> platforms. POSIX leaves it open:
>
> "If close() is interrupted by a signal that is to be caught, it shall
> return -1 with errno set to [EINTR] and the state of fildes is
> unspecified."
>
> Linux recommends *not* repeating the call since the file descriptor is
> closed already and repeating the close may close a reopened fd
> belonging to someone else.
>
> AIX, for instance, recommends to repeat the call:
>
> "EINTR  The state of the FileDescriptor is undetermined. Retry the
> close routine to ensure that the FileDescriptor is closed."
I think we should double check macOS and Solaris too as we've been 
careful in other areas to not retry close when interrupted.

-Alan

From mandy.chung at oracle.com  Mon Jul  2 17:17:31 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Mon, 2 Jul 2018 10:17:31 -0700
Subject: Review Request JDK-8200559: Java agents doing instrumentation
 need a means to define auxiliary classes
In-Reply-To: <CA+DM0Amu8L-vkgHOfhe5YCZO_+c2CMq3vVp3yb5dfdhyz+PK2A@mail.gmail.com>
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
 <CA+DM0Amu8L-vkgHOfhe5YCZO_+c2CMq3vVp3yb5dfdhyz+PK2A@mail.gmail.com>
Message-ID: <7a27c86c-6915-21e0-866f-f62c7626e330@oracle.com>

My proposed ClassDefiner API allows a Java agent to define
auxiliary classes in the same runtime package as the class being
instrumented.  You raised other use cases that are not addressed by this
proposal.  As Alan replied, the ability to define any arbitrary class
would be an attractive nuisance and we think Instrumentation.defineClass
isn't the right API to add.

I think the proposed ClassDefiner API is useful for the specific use
case (defining auxiliary classes in the runtime package of the class being
instrumented).  I held it off, so it didn't make 11.  For the other use
cases, perhaps we should create JBS issues for further investigation.

Mandy

On 7/2/18 1:41 AM, Rafael Winterhalter wrote:
> Hi,
> 
> I was wondering if a solution for this problem is still planned for JDK 
> 11 giving the beginning ramp down.
> 
> With removing sun.misc.Unsafe::defineClass, Java agents only have an 
> option to use jdk.internal.misc.Unsafe::defineClass for the use-cases 
> that I described.
> 
> I think it would be a missed opportunity not to offer an alternative as 
> of JDK 11 as a second migration would make it even less likely that 
> agents would avoid unsafe API.
> 
> Thanks for the information,
> best regards, Rafael
> 
> On Sun, 15 Apr 2018, 08:23, mandy chung <mandy.chung at oracle.com> wrote:
> 
>     Background:
> 
>     Java agents support both load-time and dynamic instrumentation.  At
>     load time, the agent's ClassFileTransformer is invoked to transform
>     class bytes.  There is no Class object at this time.  Dynamic
>     instrumentation is when redefineClasses or retransformClasses is used
>     to redefine an existing loaded class.  The ClassFileTransformer is
>     invoked with class bytes where the Class object is present.
> 
>     A Java agent doing instrumentation needs a means to define auxiliary
>     classes that are visible and accessible to the instrumented class.
>     Existing agents have been using sun.misc.Unsafe::defineClass to define
>     aux classes directly, or accessing the protected
>     ClassLoader::defineClass method with setAccessible to suppress the
>     language access check (see [1] where this issue was brought up).
> 
>     Instrumentation::appendToBootstrapClassLoaderSearch and
>     appendToSystemClassLoaderSearch are the existing means to supply
>     additional classes.  They are too limited; for example, they can't
>     inject a class in the same runtime package as the class being
>     transformed.
> 
>     Proposal:
> 
>     This proposes to add a new ClassFileTransformer.transform method
>     taking an additional ClassDefiner parameter.  A transformer can define
>     additional classes during the transformation process, i.e.
>     when ClassFileTransformer::transform is invoked.  Some details:
> 
>     1. ClassDefiner::defineClass defines a class in the same runtime package
>        as the class being transformed.
>     2. The class is defined in the same thread as the transformers are being
>        invoked.  ClassDefiner::defineClass returns the Class object directly,
>        before the transformed class is defined.
>     3. No transformation is applied to classes defined by
>        ClassDefiner::defineClass.
> 
>     The first prototype we did was to collect the auxiliary classes, defer
>     defining them until all transformers have been invoked, and have these
>     aux classes go through the transformation pipeline.  Several
>     complicated issues would need to be resolved, for example timing:
>     whether the auxiliary classes should be defined before the transformed
>     class (otherwise there is a potential race where some other thread
>     references the transformed class and causes code to execute that in
>     turn references the auxiliary classes).  The current implementation has
>     a native reentrancy check that ensures only one class is being
>     transformed, to avoid potential circularity issues.  This may need
>     JVM TI support to be reliable.
> 
>     This proposal would allow Java agents to migrate from internal API,
>     and allow ClassDefiner to be enhanced in the future.
> 
>     Webrev:
>     http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8200559/webrev.00/
> 
>     Mandy
>     [1]
>     http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html
> 

From david.lloyd at redhat.com  Mon Jul  2 13:43:06 2018
From: david.lloyd at redhat.com (David Lloyd)
Date: Mon, 2 Jul 2018 08:43:06 -0500
Subject: RFR : 8205959 : Do not restart close if errno is EINTR
In-Reply-To: <5a2ab4d2-ed90-20fd-6340-e73153d4313d@oracle.com>
References: <2e9e20817ecf49d995cd2f939fefd774@sap.com>
 <2E0FD75E-3AEF-4251-B324-7F0BA864CFC2@googlemail.com>
 <9bf3ebbba0014f918bf53eb0b4d0c464@sap.com>
 <5a2ab4d2-ed90-20fd-6340-e73153d4313d@oracle.com>
Message-ID: <CANghgrRRA6iAi=Ho5MKqUA-ykE+TObzvWVGuSz3v2EsuOdJOdg@mail.gmail.com>

I think because the only two possible outcomes are either that the FD
was not dup'd, in which case things carry on as before, or that it was
dup'd, in which case (at least in the JVM) re-dupping won't really do
anything harmful since the target FD already references the dead
socket FD.

The POSIX manpage doesn't seem to include any other possibilities.
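
As a minimal sketch of that argument (illustrative only, not the JDK code;
the name dup2_retry is hypothetical): repeating dup2 with the same pair of
descriptors leaves the target referring to the same open file, so a retry
loop on EINTR cannot redirect it anywhere else, assuming nothing else changes
either descriptor in between.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: make newfd refer to the same open file as oldfd,
 * repeating the call if it is interrupted. A repeated dup2 with the same
 * arguments ends in the same state as a single successful call.
 */
static int dup2_retry(int oldfd, int newfd) {
    int rv;
    do {
        rv = dup2(oldfd, newfd);
    } while (rv == -1 && errno == EINTR);
    return rv;   /* newfd on success, -1 with errno set otherwise */
}
```

Whether the interrupted state really is limited to those two outcomes is
exactly the question raised below, so this only holds under that assumption.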

On Mon, Jul 2, 2018 at 7:04 AM David Holmes <david.holmes at oracle.com> wrote:
>
> In reference to 8205959, where is it stated that dup2 is any more
> restartable than close ??
>
> AFAICS both leave things undefined/unspecified if they set EINTR.
>
> David
>
> On 2/07/2018 7:03 PM, Langer, Christoph wrote:
> > Hi Matthias,
> >
> > forwarding to serviceability-dev, because debugging is usually discussed there.
> >
> > Yes, I would think this coding should be fixed, too. Can you open a bug and prepare a change?
> >
> > Thanks
> > Christoph
> >
> >> -----Original Message-----
> >> From: net-dev [mailto:net-dev-bounces at openjdk.java.net] On Behalf Of
> >> Norman Maurer
> >> Sent: Monday, 2 July 2018 10:23
> >> To: Baesken, Matthias <matthias.baesken at sap.com>
> >> Cc: Stuefe, Thomas <thomas.stuefe at sap.com>; net-dev at openjdk.java.net
> >> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >>
> >> +1 retry a close on EINTR has most likely not the outcome you expect and
> >> may even close a wrong FD if the same FD is reused already (as even if EINTR
> >> is returned it may have closed the FD)
> >>
> >>> On 02.07.2018 at 10:17, Baesken, Matthias <matthias.baesken at sap.com> wrote:
> >>>
> >>> Hello, there is a similar pattern (an attempt to restart close in case of
> >>> EINTR) in the code in socket_md.c as well
> >>> (src/jdk.jdwp.agent/unix/native/libdt_socket/socket_md.c, lines 147-153):
> >>>
> >>>     int rv;
> >>>     do {
> >>>         rv = close(fd);
> >>>     } while (rv == -1 && errno == EINTR);
> >>>
> >>>     return rv;
> >>> }
> >>>
> >>> Do you think this needs adjustment (on Linux) as well?
> >>>
> >>> Best regards, Matthias
> >>>
> >>>
> >>>> Message: 2
> >>>> Date: Thu, 28 Jun 2018 18:19:46 +0100
> >>>> From: Alan Bateman <Alan.Bateman at oracle.com>
> >>>> To: David Lloyd <david.lloyd at redhat.com>, ivan.gerasimov at oracle.com
> >>>> Cc: OpenJDK Network Dev list <net-dev at openjdk.java.net>
> >>>> Subject: Re: RFR : 8205959 : Do not restart close if errno is EINTR
> >>>> Message-ID: <3fd1496f-ab83-a2d5-0699-13c8b735d70b at oracle.com>
> >>>> Content-Type: text/plain; charset=utf-8; format=flowed
> >>>>
> >>>>> On 28/06/2018 17:35, David Lloyd wrote:
> >>>>> :
> >>>>> Do you (or Alan) think that this might have accounted for real-world
> >>>>> connection problems?
> >>>>>
> >>>> In the file I/O area, with NFS I think, we had an issue a long time ago
> >>>> where close was retried after EIO. That issue was fixed a long time ago
> >>>> but it's one that comes to mind in this general area.
> >>>>
> >>>> -Alan
> >>>>
> >>>


-- 
- DML

From david.holmes at oracle.com  Tue Jul  3 04:28:43 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Jul 2018 14:28:43 +1000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
Message-ID: <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>

On 3/07/2018 1:41 AM, Thomas Stüfe wrote:
> Hi Alan,
> 
> Whether to repeat close() in case of EINTR seems to differ between
> platforms. POSIX leaves it open:
> 
> "If close() is interrupted by a signal that is to be caught, it shall
> return -1 with errno set to [EINTR] and the state of fildes is
> unspecified."
> 
> Linux recommends *not* repeating the call since the file descriptor is
> closed already and repeating the close may close a reopened fd
> belonging to someone else.
> 
> AIX, for instance, recommends to repeat the call:
> 
> "EINTR  The state of the FileDescriptor is undetermined. Retry the
> close routine to ensure that the FileDescriptor is closed."

As does HP-UX according to:

http://man7.org/linux/man-pages/man2/close.2.html

Solaris leaves things unspecified as per POSIX.

David

> Best Regards, Thomas
> 
> 
> 
> 
> 
> 
> 
> On Mon, Jul 2, 2018 at 4:09 PM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>> On 02/07/2018 14:44, Baesken, Matthias wrote:
>>>
>>> I created a bug  and a webrev  , please review .
>>>
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>>>
>> Why is this Linux only? I assume the do-while should be removed completely.
>>
>> -Alan.

From Alan.Bateman at oracle.com  Tue Jul  3 07:35:29 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 3 Jul 2018 08:35:29 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
Message-ID: <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>

On 03/07/2018 05:28, David Holmes wrote:
>
>
> Solaris leaves things unspecified as per POSIX.
We've had problems on Solaris in other areas on exactly this topic so 
they have been fixed to not retry. I think we should do the same here so 
that we are at least consistent.

-Alan

From matthias.baesken at sap.com  Tue Jul  3 07:47:17 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 3 Jul 2018 07:47:17 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
Message-ID: <832f9ddb14f4415b98adafa004ff196f@sap.com>

Hello, so should I change my webrev for 8206145 to
- retry on AIX
- not retry on Linux + Solaris?

Any remarks on Mac / BSD?


Thanks, Matthias


> -----Original Message-----
> From: Alan Bateman [mailto:Alan.Bateman at oracle.com]
> Sent: Tuesday, 3 July 2018 09:35
> To: David Holmes <david.holmes at oracle.com>; Thomas Stüfe
> <thomas.stuefe at gmail.com>
> Cc: serviceability-dev (serviceability-dev at openjdk.java.net) <serviceability-
> dev at openjdk.java.net>; Baesken, Matthias <matthias.baesken at sap.com>
> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
> 
> On 03/07/2018 05:28, David Holmes wrote:
> >
> >
> > Solaris leaves things unspecified as per POSIX.
> We've had problems on Solaris in other areas on exactly this topic so
> they have been fixed to not retry. I think we should do the same here so
> that we are at least consistent.
> 
> -Alan

From thomas.stuefe at gmail.com  Tue Jul  3 08:20:49 2018
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 3 Jul 2018 10:20:49 +0200
Subject: RFR(xxs): 8206243: java -XshowSettings fails if memory.limit_in_bytes
 overflows LONG.max
Message-ID: <CAA-vtUzOiyRQz7RC4cH-9u9eq+K1jX2rp0N4t84PL2+CSN0LwA@mail.gmail.com>

Hi all,

may I please have reviews for this small fix.

https://bugs.openjdk.java.net/browse/JDK-8206243
http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/


On some Linux kernels, the unlimited value of memory.limit_in_bytes is
returned as ULONG_MAX, not LONG_MAX.

- .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes
18446744073709551615

In those cases, java -XshowSettings will fail:

java -XshowSettings
....
Operating System Metrics:
    Provider: cgroupv1
    Effective CPU Count: 8
    CPU Period: 100000us
    CPU Quota: -1
    CPU Shares: -1
    List of Processors, 8 total:
    0 1 2 3 4 5 6 7
    List of Effective Processors, 0 total:
        List of Memory Nodes, 1 total:
    0
    List of Available Memory Nodes, 0 total:
        CPUSet Memory Pressure Enabled: false
Exception in thread "main" java.lang.NumberFormatException: For input
string: "18446744073709551615"
        at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.base/java.lang.Long.parseLong(Long.java:692)
        at java.base/java.lang.Long.parseLong(Long.java:817)
        at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106)
        at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374)
        at java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385)


Thank you,

Thomas
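
A minimal sketch of the kind of handling discussed in this thread, for readers without access to the webrev: parse the cgroup value with BigInteger and clamp anything above Long.MAX_VALUE, so that the "unlimited" value 18446744073709551615 no longer triggers a NumberFormatException. The class name CgroupLimitParser and the method parseClamped are hypothetical and are not taken from the JDK sources; the actual change in SubSystem.getLongValue may be structured differently.

```java
import java.math.BigInteger;

// Hypothetical helper, not the JDK's SubSystem class.
class CgroupLimitParser {
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // Parses a decimal string as read from a cgroup file.
    // Values larger than Long.MAX_VALUE (for example ULONG_MAX) are
    // clamped to Long.MAX_VALUE instead of raising NumberFormatException.
    static long parseClamped(String text) {
        BigInteger value = new BigInteger(text.trim());
        if (value.compareTo(LONG_MAX) > 0) {
            return Long.MAX_VALUE;
        }
        return value.longValue();
    }

    public static void main(String[] args) {
        // The value from the report above, followed by an ordinary 1 GiB limit.
        System.out.println(parseClamped("18446744073709551615"));
        System.out.println(parseClamped("1073741824"));
    }
}
```

Clamping treats every out-of-range value as "unlimited", which matches the LONG_MAX convention mentioned above; whether callers should instead see a distinct sentinel is a design choice this sketch does not settle.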

From david.holmes at oracle.com  Tue Jul  3 08:37:46 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 3 Jul 2018 18:37:46 +1000
Subject: RFR(xxs): 8206243: java -XshowSettings fails if
 memory.limit_in_bytes overflows LONG.max
In-Reply-To: <CAA-vtUzOiyRQz7RC4cH-9u9eq+K1jX2rp0N4t84PL2+CSN0LwA@mail.gmail.com>
References: <CAA-vtUzOiyRQz7RC4cH-9u9eq+K1jX2rp0N4t84PL2+CSN0LwA@mail.gmail.com>
Message-ID: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com>

Hi Thomas,

This seems okay.

Minor nit:

if(bigInt

Please add a space after 'if'

Thanks,
David

On 3/07/2018 6:20 PM, Thomas Stüfe wrote:
> Hi all,
> 
> may I please have reviews for this small fix.
> 
> https://bugs.openjdk.java.net/browse/JDK-8206243
> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/
> 
> 
> On some Linux kernels, the unlimited value of memory.limit_in_bytes is
> returned as ULONG_MAX, not LONG_MAX.
> 
> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes
> 18446744073709551615
> 
> In those cases, java -XshowSettings will fail:
> 
> java -XshowSettings
> ....
> Operating System Metrics:
>      Provider: cgroupv1
>      Effective CPU Count: 8
>      CPU Period: 100000us
>      CPU Quota: -1
>      CPU Shares: -1
>      List of Processors, 8 total:
>      0 1 2 3 4 5 6 7
>      List of Effective Processors, 0 total:
>          List of Memory Nodes, 1 total:
>      0
>      List of Available Memory Nodes, 0 total:
>          CPUSet Memory Pressure Enabled: false
> Exception in thread "main" java.lang.NumberFormatException: For input
> string: "18446744073709551615"
>          at java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>          at java.base/java.lang.Long.parseLong(Long.java:692)
>          at java.base/java.lang.Long.parseLong(Long.java:817)
>          at java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106)
>          at java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374)
>          at java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385)
> 
> 
> Thank you,
> 
> Thomas
> 

From per.liden at oracle.com  Tue Jul  3 08:47:27 2018
From: per.liden at oracle.com (Per Liden)
Date: Tue, 3 Jul 2018 10:47:27 +0200
Subject: HotSpot Serviceability Agent (SA) Survey
In-Reply-To: <ed2092d2-594a-8329-53fa-063a6f683c42@oracle.com>
References: <ed2092d2-594a-8329-53fa-063a6f683c42@oracle.com>
Message-ID: <c0aaa892-3f62-c4b1-d2b8-50052ff767c9@oracle.com>

Hi Stephen,

On 03/21/2018 07:14 PM, Stephen Fitch wrote:
> Hi,
> 
> The HotSpot Serviceability Agent (SA) is a set of APIs and tools for 
> debugging HotSpot Virtual Machine and has been a part of the JVM/JDK for 
> a long time, however we don't have a lot of data about how it is used in 
> practice, especially outside of Oracle. Therefore, we have created an 
> initial survey to gather more information and help us evaluate and 
> understand how others are using it.
> 
> If you have used, or have (support) processes that utilize the 
> Serviceability Agent or related APIs, then we would definitely 
> appreciate if you would complete this survey:
> 
> https://www.surveymonkey.com/r/CF3MYDL
> 
> We are specifically interested in your use-cases and how SA is effective 
> for you in resolving JVM issues.
> 
> The survey will remain open through March 31st. The results of the 
> survey will be made public after the survey closes.

Have the results been published yet?

cheers,
Per

> 
> Regards, Stephen
> 
>  Java Platform Group - JVM - Sustaining Engineering

From thomas.stuefe at gmail.com  Tue Jul  3 09:15:56 2018
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 3 Jul 2018 11:15:56 +0200
Subject: RFR(xxs): 8206243: java -XshowSettings fails if
 memory.limit_in_bytes overflows LONG.max
In-Reply-To: <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com>
References: <CAA-vtUzOiyRQz7RC4cH-9u9eq+K1jX2rp0N4t84PL2+CSN0LwA@mail.gmail.com>
 <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com>
Message-ID: <CAA-vtUy1dYhVTBR2w4Su8EBap9tx_ZmikBFKxMyhTSdhpdWKyA@mail.gmail.com>

Thank you David!

I changed the webrev in place.

Thanks, Thomas

On Tue, Jul 3, 2018 at 10:37 AM, David Holmes <david.holmes at oracle.com> wrote:
> Hi Thomas,
>
> This seems okay.
>
> Minor nit:
>
> if(bigInt
>
> Please add a space after 'if'
>
> Thanks,
> David
>
>
> On 3/07/2018 6:20 PM, Thomas Stüfe wrote:
>>
>> Hi all,
>>
>> may I please have reviews for this small fix.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8206243
>>
>> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/
>>
>>
>> On some Linux kernels, the unlimited value of memory.limit_in_bytes is
>> returned as ULONG_MAX, not LONG_MAX.
>>
>> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes
>> 18446744073709551615
>>
>> In those cases, java -XshowSettings will fail:
>>
>> java -XshowSettings
>> ....
>> Operating System Metrics:
>>      Provider: cgroupv1
>>      Effective CPU Count: 8
>>      CPU Period: 100000us
>>      CPU Quota: -1
>>      CPU Shares: -1
>>      List of Processors, 8 total:
>>      0 1 2 3 4 5 6 7
>>      List of Effective Processors, 0 total:
>>          List of Memory Nodes, 1 total:
>>      0
>>      List of Available Memory Nodes, 0 total:
>>          CPUSet Memory Pressure Enabled: false
>> Exception in thread "main" java.lang.NumberFormatException: For input
>> string: "18446744073709551615"
>>          at
>> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>>          at java.base/java.lang.Long.parseLong(Long.java:692)
>>          at java.base/java.lang.Long.parseLong(Long.java:817)
>>          at
>> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106)
>>          at
>> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374)
>>          at
>> java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385)
>>
>>
>> Thank you,
>>
>> Thomas
>>
>

From Alan.Bateman at oracle.com  Tue Jul  3 10:08:43 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 3 Jul 2018 11:08:43 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <832f9ddb14f4415b98adafa004ff196f@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
Message-ID: <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>

On 03/07/2018 08:47, Baesken, Matthias wrote:
> Hello , so should I change  my  webrev  for  8206145  to
> - retry on AIX
> - not retry  on Linux + Solaris  ?
Yes.

> Any remarks on Mac / BSD ?
>
I see a few issues in the FreeBSD bugzilla on this topic. I assume it 
would be safer to not retry if interrupted.

-Alan

From ralf.schmelter at sap.com  Tue Jul  3 10:43:46 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Tue, 3 Jul 2018 10:43:46 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent
 quadratic runtime behavior
Message-ID: <709161f438f848b0af5fb079c9c0242a@sap.com>

Hi All,

Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .

This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.

I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.

Best regards,
Ralf Schmelter
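
To illustrate the quadratic-versus-linear point in general terms, here is a small sketch. It is an analogy only: the real fix is in the C code of ThreadReferenceImpl.c and is not reproduced here. The Frame class, the step counter, and all method names are hypothetical. The sketch assumes that reaching the frame at depth d requires walking d links from the top of the stack, so fetching every frame by depth costs on the order of n*n/2 steps, while a single pass that stores each frame as it goes costs n steps plus memory proportional to n.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a thread stack; not the JDWP agent's data structures.
class FrameWalkDemo {
    static final class Frame {
        final int id;
        final Frame caller;
        Frame(int id, Frame caller) { this.id = id; this.caller = caller; }
    }

    // Counts how many caller links were followed.
    static long steps;

    // Builds a stack whose top frame has id 0 and whose bottom frame has id depth-1.
    static Frame buildStack(int depth) {
        Frame top = null;
        for (int i = depth - 1; i >= 0; i--) {
            top = new Frame(i, top);
        }
        return top;
    }

    // Walks from the top every time: cost grows with the requested depth.
    static Frame frameAt(Frame top, int depth) {
        Frame f = top;
        for (int i = 0; i < depth; i++) {
            f = f.caller;
            steps++;
        }
        return f;
    }

    // Quadratic: one fresh walk per frame.
    static List<Integer> collectByDepth(Frame top, int count) {
        List<Integer> ids = new ArrayList<>();
        for (int d = 0; d < count; d++) {
            ids.add(frameAt(top, d).id);
        }
        return ids;
    }

    // Linear: a single walk that stores every frame it passes.
    static List<Integer> collectInOnePass(Frame top) {
        List<Integer> ids = new ArrayList<>();
        for (Frame f = top; f != null; f = f.caller) {
            ids.add(f.id);
            steps++;
        }
        return ids;
    }

    public static void main(String[] args) {
        Frame top = buildStack(1000);
        steps = 0;
        collectByDepth(top, 1000);
        System.out.println("by depth: " + steps + " steps");
        steps = 0;
        collectInOnePass(top);
        System.out.println("one pass: " + steps + " steps");
    }
}
```

Both methods return the same list of frame ids; only the amount of walking differs.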

From Stephen.Fitch at oracle.com  Tue Jul  3 12:17:43 2018
From: Stephen.Fitch at oracle.com (Stephen Fitch)
Date: Tue, 3 Jul 2018 05:17:43 -0700
Subject: HotSpot Serviceability Agent (SA) Survey
In-Reply-To: <c0aaa892-3f62-c4b1-d2b8-50052ff767c9@oracle.com>
References: <ed2092d2-594a-8329-53fa-063a6f683c42@oracle.com>
 <c0aaa892-3f62-c4b1-d2b8-50052ff767c9@oracle.com>
Message-ID: <b73814c0-fcfc-21f1-e6c9-91ef5b4f09b8@oracle.com>

Hi Per,

Sadly delayed by other things; I'll put some further solid effort
into a published summary ASAP, ideally before the end of July, if
not before.

It's not forgotten, but behind other priorities.

Regards,

Stephen

On 7/3/18 1:47 AM, Per Liden wrote:
> Hi Stephen,
>
> On 03/21/2018 07:14 PM, Stephen Fitch wrote:
>> Hi,
>>
>> The HotSpot Serviceability Agent (SA) is a set of APIs and tools for 
>> debugging HotSpot Virtual Machine and has been a part of the JVM/JDK for a 
>> long time, however we don't have a lot of data about how it is used in 
>> practice, especially outside of Oracle. Therefore, we have created an initial 
>> survey to gather more information and help us evaluate and understand how 
>> others are using it.
>>
>> If you have used, or have (support) processes that utilize the Serviceability 
>> Agent or related APIs, then we would definitely appreciate if you would 
>> complete this survey:
>>
>> https://www.surveymonkey.com/r/CF3MYDL
>>
>> We are specifically interested in your use-cases and how SA is effective for 
>> you in resolving JVM issues.
>>
>> The survey will remain open through March 31st. The results of the survey 
>> will be made public after the survey closes.
>
> Have the results been published yet?
>
> cheers,
> Per
>
>>
>> Regards, Stephen
>>
>>  Java Platform Group - JVM - Sustaining Engineering


From bob.vandette at oracle.com  Tue Jul  3 12:59:55 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 3 Jul 2018 08:59:55 -0400
Subject: RFR(xxs): 8206243: java -XshowSettings fails if
 memory.limit_in_bytes overflows LONG.max
In-Reply-To: <CAA-vtUy1dYhVTBR2w4Su8EBap9tx_ZmikBFKxMyhTSdhpdWKyA@mail.gmail.com>
References: <CAA-vtUzOiyRQz7RC4cH-9u9eq+K1jX2rp0N4t84PL2+CSN0LwA@mail.gmail.com>
 <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com>
 <CAA-vtUy1dYhVTBR2w4Su8EBap9tx_ZmikBFKxMyhTSdhpdWKyA@mail.gmail.com>
Message-ID: <DFE27EE5-DA0B-4844-A53B-9D9692693409@oracle.com>

Looks ok.

Bob.

> On Jul 3, 2018, at 5:15 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> 
> Thank you David!
> 
> I changed the webrev in place.
> 
> Thanks, Thomas
> 
> On Tue, Jul 3, 2018 at 10:37 AM, David Holmes <david.holmes at oracle.com> wrote:
>> Hi Thomas,
>> 
>> This seems okay.
>> 
>> Minor nit:
>> 
>> if(bigInt
>> 
>> Please add a space after 'if'
>> 
>> Thanks,
>> David
>> 
>> 
>> On 3/07/2018 6:20 PM, Thomas Stüfe wrote:
>>> 
>>> Hi all,
>>> 
>>> may I please have reviews for this small fix.
>>> 
>>> https://bugs.openjdk.java.net/browse/JDK-8206243
>>> 
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/
>>> 
>>> 
>>> On some Linux kernels, the unlimited value of memory.limit_in_bytes is
>>> returned as ULONG_MAX, not LONG_MAX.
>>> 
>>> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes
>>> 18446744073709551615
>>> 
>>> In those cases, java -XshowSettings will fail:
>>> 
>>> java -XshowSettings
>>> ....
>>> Operating System Metrics:
>>>     Provider: cgroupv1
>>>     Effective CPU Count: 8
>>>     CPU Period: 100000us
>>>     CPU Quota: -1
>>>     CPU Shares: -1
>>>     List of Processors, 8 total:
>>>     0 1 2 3 4 5 6 7
>>>     List of Effective Processors, 0 total:
>>>         List of Memory Nodes, 1 total:
>>>     0
>>>     List of Available Memory Nodes, 0 total:
>>>         CPUSet Memory Pressure Enabled: false
>>> Exception in thread "main" java.lang.NumberFormatException: For input
>>> string: "18446744073709551615"
>>>         at
>>> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>>>         at java.base/java.lang.Long.parseLong(Long.java:692)
>>>         at java.base/java.lang.Long.parseLong(Long.java:817)
>>>         at
>>> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106)
>>>         at
>>> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374)
>>>         at
>>> java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385)
>>> 
>>> 
>>> Thank you,
>>> 
>>> Thomas
>>> 
>> 


From thomas.stuefe at gmail.com  Tue Jul  3 13:04:45 2018
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 3 Jul 2018 15:04:45 +0200
Subject: RFR(xxs): 8206243: java -XshowSettings fails if
 memory.limit_in_bytes overflows LONG.max
In-Reply-To: <DFE27EE5-DA0B-4844-A53B-9D9692693409@oracle.com>
References: <CAA-vtUzOiyRQz7RC4cH-9u9eq+K1jX2rp0N4t84PL2+CSN0LwA@mail.gmail.com>
 <539f5555-e057-4272-88a1-3da3ad0fd61d@oracle.com>
 <CAA-vtUy1dYhVTBR2w4Su8EBap9tx_ZmikBFKxMyhTSdhpdWKyA@mail.gmail.com>
 <DFE27EE5-DA0B-4844-A53B-9D9692693409@oracle.com>
Message-ID: <CAA-vtUxqitisFHj0tsLcK6yFbCbZK_-VZ3e=BfwcqdMG9BF8eQ@mail.gmail.com>

Thank you Bob!

On Tue, Jul 3, 2018 at 2:59 PM, Bob Vandette <bob.vandette at oracle.com> wrote:
> Looks ok.
>
> Bob.
>
>> On Jul 3, 2018, at 5:15 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
>>
>> Thank you David!
>>
>> I changed the webrev in place.
>>
>> Thanks, Thomas
>>
>> On Tue, Jul 3, 2018 at 10:37 AM, David Holmes <david.holmes at oracle.com> wrote:
>>> Hi Thomas,
>>>
>>> This seems okay.
>>>
>>> Minor nit:
>>>
>>> if(bigInt
>>>
>>> Please add a space after 'if'
>>>
>>> Thanks,
>>> David
>>>
>>>
>>> On 3/07/2018 6:20 PM, Thomas Stüfe wrote:
>>>>
>>>> Hi all,
>>>>
>>>> may I please have reviews for this small fix.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8206243
>>>>
>>>> http://cr.openjdk.java.net/~stuefe/webrevs/8206243-java-xshowsettings-fails-for-large-values-of-memory-limit_in_bytes/webrev.00/webrev/
>>>>
>>>>
>>>> On some Linux kernels, the unlimited value of memory.limit_in_bytes is
>>>> returned as ULONG_MAX, not LONG_MAX.
>>>>
>>>> - .../nightly $ cat //sys/fs/cgroup/memory/memory.limit_in_bytes
>>>> 18446744073709551615
>>>>
>>>> In those cases, java -XshowSettings will fail:
>>>>
>>>> java -XshowSettings
>>>> ....
>>>> Operating System Metrics:
>>>>     Provider: cgroupv1
>>>>     Effective CPU Count: 8
>>>>     CPU Period: 100000us
>>>>     CPU Quota: -1
>>>>     CPU Shares: -1
>>>>     List of Processors, 8 total:
>>>>     0 1 2 3 4 5 6 7
>>>>     List of Effective Processors, 0 total:
>>>>         List of Memory Nodes, 1 total:
>>>>     0
>>>>     List of Available Memory Nodes, 0 total:
>>>>         CPUSet Memory Pressure Enabled: false
>>>> Exception in thread "main" java.lang.NumberFormatException: For input
>>>> string: "18446744073709551615"
>>>>         at
>>>> java.base/java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>>>>         at java.base/java.lang.Long.parseLong(Long.java:692)
>>>>         at java.base/java.lang.Long.parseLong(Long.java:817)
>>>>         at
>>>> java.base/jdk.internal.platform.cgroupv1.SubSystem.getLongValue(SubSystem.java:106)
>>>>         at
>>>> java.base/jdk.internal.platform.cgroupv1.Metrics.getMemoryLimit(Metrics.java:374)
>>>>         at
>>>> java.base/sun.launcher.LauncherHelper.printSystemMetrics(LauncherHelper.java:385)
>>>>
>>>>
>>>> Thank you,
>>>>
>>>> Thomas
>>>>
>>>
>

From bob.vandette at oracle.com  Tue Jul  3 13:13:04 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 3 Jul 2018 09:13:04 -0400
Subject: RFR: 8205928 - [TESTBUG]:
 jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on
 kernel config
Message-ID: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>

Please review this small fix to correct a test failure when the Linux system kernel is
not configured with the CONFIG_MEMCG_KMEM option.

The Container Metric tests depend on docker, which allows us to assume a certain minimum
Linux kernel configuration level. However, the kernel memory resource limiting feature is not a hard
requirement for docker. This test needs to be updated so that it can run on kernels without this
option.  A return value of 0 from getKernelMemoryLimit is defined to indicate that this API is not available.

BUG: https://bugs.openjdk.java.net/browse/JDK-8205928

PROPOSED FIX:

diff --git a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
--- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
+++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
@@ -95,10 +95,11 @@
 
     private static void testKernelMemoryLimit(String value) {
         long limit = getMemoryValue(value);
-        if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) {
+        long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit();
+        if (kmemlimit != 0 && limit != kmemlimit) {
             throw new RuntimeException("Kernel Memory limit not equal, expected : ["
                     + limit + "]" + ", got : ["
-                    + Metrics.systemMetrics().getKernelMemoryLimit() + "]");
+                    + kmemlimit + "]");
         }
         System.out.println("TEST PASSED!!!");
     }

From thomas.stuefe at gmail.com  Tue Jul  3 13:38:40 2018
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 3 Jul 2018 15:38:40 +0200
Subject: RFR: 8205928 - [TESTBUG]:
 jdk/internal/platform/docker/TestDockerMemoryMetrics.java
 fails depending on kernel config
In-Reply-To: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>
References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>
Message-ID: <CAA-vtUxJSvgRA9=A293+YMGw4YzX4aY2qYKqPACnXkE0pRNvog@mail.gmail.com>

Hi Bob,

It does look fine from the outside. I did not test it though, since I
have no suitable kernel.

Best Regards, Thomas

On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette <bob.vandette at oracle.com> wrote:
> Please review this small fix to correct a test failure when the Linux system kernel is
> not configured with the CONFIG_MEMCG_KMEM option.
>
> The Container Metric tests are dependent on docker which allow us to assume a certain minimum
> Linux kernel configuration level. However, the kernel memory resource limiting feature is not a hard
> requirement for docker. This test will need to be updated to allow for running on kernels without this
> option.  A 0 return from the getKernelMemoryLimit is defined to indicate that this API is not available.
>
> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928
>
> PROPOSED FIX:
>
> diff --git a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> @@ -95,10 +95,11 @@
>
>      private static void testKernelMemoryLimit(String value) {
>          long limit = getMemoryValue(value);
> -        if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) {
> +        long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit();
> +        if (kmemlimit != 0 && limit != kmemlimit) {
>              throw new RuntimeException("Kernel Memory limit not equal, expected : ["
>                      + limit + "]" + ", got : ["
> -                    + Metrics.systemMetrics().getKernelMemoryLimit() + "]");
> +                    + kmemlimit + "]");
>          }
>          System.out.println("TEST PASSED!!!");
>      }

From matthias.baesken at sap.com  Tue Jul  3 13:57:50 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 3 Jul 2018 13:57:50 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
Message-ID: <4411e9aedba54b16bc779acdf8de184d@sap.com>

>>> I created a bug  and a webrev  , please review .
>>>
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/


Hello, here is the second webrev, including Solaris:

http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.1/

Please review!

Thanks, Matthias


> -----Original Message-----
> From: Alan Bateman [mailto:Alan.Bateman at oracle.com]
> Sent: Tuesday, 3 July 2018 12:09
> To: Baesken, Matthias <matthias.baesken at sap.com>; David Holmes
> <david.holmes at oracle.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
> Cc: serviceability-dev (serviceability-dev at openjdk.java.net) <serviceability-
> dev at openjdk.java.net>
> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
> 
> On 03/07/2018 08:47, Baesken, Matthias wrote:
> > Hello , so should I change  my  webrev  for  8206145  to
> > - retry on AIX
> > - not retry  on Linux + Solaris  ?
> Yes.
> 
> > Any remarks on Mac / BSD ?
> >
> I see a few issues in the FreeBSD bugzilla on this topic. I assume it
> would be safer to not retry if interrupted.
> 
> -Alan

From bob.vandette at oracle.com  Tue Jul  3 14:02:55 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 3 Jul 2018 10:02:55 -0400
Subject: RFR: 8205928 - [TESTBUG]:
 jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on
 kernel config
In-Reply-To: <CAA-vtUxJSvgRA9=A293+YMGw4YzX4aY2qYKqPACnXkE0pRNvog@mail.gmail.com>
References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>
 <CAA-vtUxJSvgRA9=A293+YMGw4YzX4aY2qYKqPACnXkE0pRNvog@mail.gmail.com>
Message-ID: <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com>

Matthias, who reported the issue, confirmed that this patch solves the problem.

Thanks,
Bob.

> On Jul 3, 2018, at 9:38 AM, Thomas Stüfe <thomas.stuefe at gmail.com> wrote:
> 
> Hi Bob,
> 
> It does look fine from the outside. I did not test it though, since I
> have no suitable kernel.
> 
> Best Regards, Thomas
> 
> On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette <bob.vandette at oracle.com> wrote:
>> Please review this small fix to correct a test failure when the Linux system kernel is
>> not configured with the CONFIG_MEMCG_KMEM option.
>> 
>> The Container Metric tests are dependent on docker which allow us to assume a certain minimum
>> Linux kernel configuration level. However, the kernel memory resource limiting feature is not a hard
>> requirement for docker. This test will need to be updated to allow for running on kernels without this
>> option.  A 0 return from the getKernelMemoryLimit is defined to indicate that this API is not available.
>> 
>> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928
>> 
>> PROPOSED FIX:
>> 
>> diff --git a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
>> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
>> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
>> @@ -95,10 +95,11 @@
>> 
>>     private static void testKernelMemoryLimit(String value) {
>>         long limit = getMemoryValue(value);
>> -        if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) {
>> +        long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit();
>> +        if (kmemlimit != 0 && limit != kmemlimit) {
>>             throw new RuntimeException("Kernel Memory limit not equal, expected : ["
>>                     + limit + "]" + ", got : ["
>> -                    + Metrics.systemMetrics().getKernelMemoryLimit() + "]");
>> +                    + kmemlimit + "]");
>>         }
>>         System.out.println("TEST PASSED!!!");
>>     }


From coleen.phillimore at oracle.com  Tue Jul  3 15:34:16 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 3 Jul 2018 11:34:16 -0400
Subject: [12] RFR (S) 8205534: Remove SymbolTable dependency from
 serviceability agent
In-Reply-To: <f9fe11de-33e3-8280-9347-ee4bc067f518@oracle.com>
References: <88e391a8-78a2-8dbc-a489-fce9c6b922b5@oracle.com>
 <f9fe11de-33e3-8280-9347-ee4bc067f518@oracle.com>
Message-ID: <29f661c2-ce48-b174-cb2f-7300bd4aa8f2@oracle.com>


Hi Jini,  Thank you for reviewing this.

On 6/29/18 12:02 PM, Jini George wrote:
> Hi Coleen,
>
> Apologies for the delay. Your changes look good to me overall. A few 
> comments:
>
> It might make sense to also remove the corresponding lines in the 
> vmStructs files. Like:
>
> File           Line
> vmStructs.cpp   170 typedef RehashableHashtable<Symbol*, mtSymbol> RehashableSymbolHashtable;
> vmStructs.cpp   477 static_field(RehashableSymbolHashtable, _seed, juint) \
> vmStructs.cpp  1362 declare_type(RehashableSymbolHashtable, BasicHashtable<mtSymbol>) \
> vmStructs.cpp   475 static_field(SymbolTable, _the_table, SymbolTable*) \
> vmStructs.cpp   476 static_field(SymbolTable, _shared_table, SymbolCompactHashTable) \
>

Gerard has these changes in his changeset for rewriting the SymbolTable 
so I am going to leave this part of the change to him.

> You could also remove the "friend class VMStructs" from the 
> corresponding C++ data types.
>

Good point.  We'll make sure it's not there in his changes.

> The test case: test/jdk/sun/tools/jhsdb/AlternateHashingTest.java with 
> the file: test/jdk/sun/tools/jhsdb/LingeredAppWithAltHashing.java were 
> created to test the alternate hashing mechanism of the SymbolTable in 
> SA. Don't know if it makes sense to retain these.
>

Ok, I was debating with myself whether to remove these.  It makes sense 
not to test something that doesn't test what's intended anymore.? I'll 
remove them.


> One nit:
>
> Line 1079 of HeapHprofBinWriter.java: Extra spaces needed.
>
Fixed.

Thanks!
Coleen
> Thanks,
> Jini.
>
>
> On 6/23/2018 3:10 AM, coleen.phillimore at oracle.com wrote:
>> Summary: Modify SA code to not use SymbolTable and remove it.
>>
>> This is to support the concurrent hashtable for SymbolTable.
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8205534.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8205534
>>
>> Tested with hs-tier1-5.
>>
>> Thanks,
>> Coleen


From matthias.baesken at sap.com  Tue Jul  3 15:45:23 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 3 Jul 2018 15:45:23 +0000
Subject: RFR: 8205928 - [TESTBUG]:
 jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails depending on
 kernel config
In-Reply-To: <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com>
References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>
 <CAA-vtUxJSvgRA9=A293+YMGw4YzX4aY2qYKqPACnXkE0pRNvog@mail.gmail.com>
 <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com>
Message-ID: <aad0b21b66054397b97150057d60f98d@sap.com>

Hi Bob and Thomas, I had the patch in our internal queue and it fixed the problem.
(However, I am not a Reviewer.)

Best regards, Matthias


> -----Original Message-----
> From: Bob Vandette [mailto:bob.vandette at oracle.com]
> Sent: Tuesday, 3 July 2018 16:03
> To: Thomas Stüfe <thomas.stuefe at gmail.com>
> Cc: serviceability-dev at openjdk.java.net serviceability-
> dev at openjdk.java.net <serviceability-dev at openjdk.java.net>; Baesken,
> Matthias <matthias.baesken at sap.com>
> Subject: Re: RFR: 8205928 - [TESTBUG]:
> jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails
> depending on kernel config
> 
> Matthias, who reported the issue, confirmed that this patch solves the
> problem.
> 
> Thanks,
> Bob.
> 
> > On Jul 3, 2018, at 9:38 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
> wrote:
> >
> > Hi Bob,
> >
> > It does look fine from the outside. I did not test it though, since I
> > have no suitable kernel.
> >
> > Best Regards, Thomas
> >
> > On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette <bob.vandette at oracle.com>
> wrote:
> >> Please review this small fix to correct a test failure when the Linux system
> kernel is
> >> not configured with the CONFIG_MEMCG_KMEM option.
> >>
> >> The Container Metric tests are dependent on docker which allow us to
> assume a certain minimum
> >> Linux kernel configuration level. However, the kernel memory resource
> limiting feature is not a hard
> >> requirement for docker. This test will need to be updated to allow for
> running on kernels without this
> >> option.  A 0 return from the getKernelMemoryLimit is defined to indicate
> that this API is not available.
> >>
> >> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928
> >>
> >> PROPOSED FIX:
> >>
> >> diff --git
> a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> >> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> >> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> >> @@ -95,10 +95,11 @@
> >>
> >>     private static void testKernelMemoryLimit(String value) {
> >>         long limit = getMemoryValue(value);
> >> -        if (limit != Metrics.systemMetrics().getKernelMemoryLimit()) {
> >> +        long kmemlimit = Metrics.systemMetrics().getKernelMemoryLimit();
> >> +        if (kmemlimit != 0 && limit != kmemlimit) {
> >>             throw new RuntimeException("Kernel Memory limit not equal,
> expected : ["
> >>                     + limit + "]" + ", got : ["
> >> -                    + Metrics.systemMetrics().getKernelMemoryLimit() + "]");
> >> +                    + kmemlimit + "]");
> >>         }
> >>         System.out.println("TEST PASSED!!!");
> >>     }


From Alan.Bateman at oracle.com  Tue Jul  3 16:49:15 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 3 Jul 2018 17:49:15 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <4411e9aedba54b16bc779acdf8de184d@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
Message-ID: <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>


On 03/07/2018 14:57, Baesken, Matthias wrote:
>>>> I created a bug  and a webrev  , please review .
>>>>
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>>
>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>
> Hello, here is  the second webrev  including Solaris  :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.1/
>
This looks okay to me (although I think we should include macOS in the 
list too).

-Alan

From thomas.stuefe at gmail.com  Tue Jul  3 17:07:28 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 3 Jul 2018 19:07:28 +0200
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
Message-ID: <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>

On Tue, Jul 3, 2018 at 6:49 PM, Alan Bateman <Alan.Bateman at oracle.com> wrote:
>
>
> On 03/07/2018 14:57, Baesken, Matthias wrote:
>>>>>
>>>>> I created a bug  and a webrev  , please review .
>>>>>
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8206145
>>>>>
>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145/
>>
>>
>> Hello, here is  the second webrev  including Solaris  :
>>
>> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.1/
>>
> This looks okay to me (although I think we should include macOS in the list
> too).
>
> -Alan

+1

Actually, at this point we could just:

#if defined(__AIX)
     do {
         rv = close(fd);
     } while (rv == -1 && errno == EINTR);
#else
    rv = close(fd);
#endif

But boy this close() EINTR business is evil. Choosing between risking
file descriptor leaks or random double closes ...

..Thomas
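
For readers following the thread, the proposal above can be wrapped in a small helper. This is only a sketch under stated assumptions: the names close_fd_portably and close_fd_portably_demo are made up for illustration and are not the JDK's functions, and _AIX is assumed to be the macro that AIX compilers predefine (the mail above spells it __AIX). The comment about Linux reflects the close(2) man page, which says the descriptor is released even when the call reports EINTR.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper: close fd, retrying on EINTR only on the platform
 * that recommends doing so. Returns the result of the last close() call.
 */
static int close_fd_portably(int fd)
{
    int rv;

#if defined(_AIX)   /* assumption: macro predefined by AIX compilers */
    do {
        rv = close(fd);
    } while (rv == -1 && errno == EINTR);
#else
    /*
     * No retry here. On Linux the descriptor is already released when
     * close() reports EINTR, so a second close() could hit a descriptor
     * that another thread has opened in the meantime.
     */
    rv = close(fd);
#endif
    return rv;
}

/* Small self-check: a pipe end closes once; closing it again is EBADF. */
static void close_fd_portably_demo(void)
{
    int fds[2];
    int rc = pipe(fds);

    assert(rc == 0);
    rc = close_fd_portably(fds[0]);
    assert(rc == 0);
    rc = close_fd_portably(fds[0]);     /* already closed */
    assert(rc == -1 && errno == EBADF);
    rc = close_fd_portably(fds[1]);
    assert(rc == 0);
    (void)rc;
}
```

The single-call branch accepts a possible descriptor leak in exchange for never closing a descriptor twice, which is the trade-off described in the mail above.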

From Alan.Bateman at oracle.com  Tue Jul  3 17:14:18 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 3 Jul 2018 18:14:18 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
Message-ID: <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>

On 03/07/2018 18:07, Thomas Stüfe wrote:
> :
> Actually, at this point we could just:
>
> #if defined(__AIX)
>       do {
>           rv = close(fd);
>       } while (rv == -1 && errno == EINTR);
> #else
>      rv = close(fd);
> #endif
Right, might be the simplest.

>
> But boy this close() EINTR business is evil. Choosing between risking
> file descriptor leaks or random double closes ...
>
and we aren't out of the woods yet, there are a few other places that 
need similar attention.

From thomas.stuefe at gmail.com  Tue Jul  3 18:32:17 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 3 Jul 2018 20:32:17 +0200
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <709161f438f848b0af5fb079c9c0242a@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
Message-ID: <CAA-vtUzj4gor7Z_RKUJzAkzsC7QXubp1mn6ukb5rByJUiJgw4A@mail.gmail.com>

Hi Ralf,

patch looks good and makes sense.


Some remarks:

+    /* Should not usually happen. */
+    if (length != count) {
+        error = JVMTI_ERROR_INTERNAL;
+    }

Cosmetics: I would also probably explicitly return:

/* Should not usually happen. */
if (length != count) {
  jvmtiDeallocate(frames);
  outStream_setError(out, JDWP_ERROR(INTERNAL));
  return JNI_TRUE;
}

.. makes the code clearer and should someone change the loop cancel
condition it will still work.

======

+    for (fnum = 0; (fnum < count) && (error == JVMTI_ERROR_NONE); ++fnum) {

you could lose the inner brackets.

--

Cosmetics: you changed meaning of fnum. Before it was really the frame
number. Now, fnum is a zero based index into your array. So I would
probably have renamed the variable too, maybe index? or somesuch.

======

Do we not have to handle opaque frames like the code before did? Or
does GetStackTrace already filter out opaque frames? Would that not
mean that GetStackTrace returns fewer frames than expected, and then
count could be smaller than length?

-- oh wait I see GetFrameLocation never really returned
JVMTI_ERROR_OPAQUE_FRAME? So it is probably fine.

=====

How large can the depth get? In stack overflow scenarios?

To limit memory usage and to make it more predictable, I would not
retrieve all frames in one go but in a loop, in chunks of n frames. E.g.
4086 frames would mean your buffer never exceeds 64K on 64-bit
platforms. You would sacrifice a tiny bit of performance (again
needlessly walking up to the starting position) but would not choke when
stacks are ridiculously large.
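
A rough sketch of that chunked loop, so the buffer size stays fixed however deep the stack is. Everything named here is an assumption for illustration: frame_info stands in for jvmtiFrameInfo (a method id plus a 64-bit location, so about 16 bytes per frame on 64-bit, and 4086 * 16 = 65376 bytes, just under 64K), get_frames_fn stands in for a GetStackTrace-style call, and fake_get_frames and count_frame are test doubles. The loop is bounded by a frame count obtained beforehand, since, as far as I understand the JVMTI spec, GetStackTrace rejects a start depth at or beyond the stack depth.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for jvmtiFrameInfo (a method id plus a 64-bit location). */
typedef struct {
    void   *method;
    int64_t location;
} frame_info;

/* Stand-in for a GetStackTrace-style call: fills buf with up to max_count
 * frames starting at start_depth, stores the number written in *count,
 * and returns 0 on success. */
typedef int (*get_frames_fn)(void *ctx, int start_depth, int max_count,
                             frame_info *buf, int *count);

typedef void (*frame_visitor)(void *ctx, int depth, const frame_info *frame);

enum { CHUNK_FRAMES = 4086 };   /* chunk size taken from the mail above */

/* Visit frames 0..total-1 using one fixed-size buffer. Returns 0 on
 * success, -1 if the buffer cannot be allocated, -2 if a chunk came back
 * shorter than requested, or the error code reported by get. */
static int walk_frames_in_chunks(get_frames_fn get, void *get_ctx, int total,
                                 frame_visitor visit, void *visit_ctx)
{
    frame_info *buf = malloc(CHUNK_FRAMES * sizeof *buf);
    int start = 0;
    int err = 0;

    if (buf == NULL) {
        return -1;
    }
    while (err == 0 && start < total) {
        int want = total - start;
        int got = 0;
        int i;

        if (want > CHUNK_FRAMES) {
            want = CHUNK_FRAMES;
        }
        err = get(get_ctx, start, want, buf, &got);
        if (err == 0 && got != want) {
            err = -2;   /* fewer frames than the announced count */
        }
        for (i = 0; err == 0 && i < got; i++) {
            visit(visit_ctx, start + i, &buf[i]);
        }
        start += got;
    }
    free(buf);
    return err;
}

/* Test double: a fake stack whose depth is *(int *)ctx. */
static int fake_get_frames(void *ctx, int start_depth, int max_count,
                           frame_info *buf, int *count)
{
    int depth = *(const int *)ctx;
    int n = 0;

    if (start_depth < 0 || start_depth >= depth) {
        return 1;   /* mimics an "illegal argument" style error */
    }
    while (n < max_count && start_depth + n < depth) {
        buf[n].method = NULL;
        buf[n].location = start_depth + n;
        n++;
    }
    *count = n;
    return 0;
}

/* Test double: checks ordering and counts visited frames in *(int *)ctx. */
static void count_frame(void *ctx, int depth, const frame_info *frame)
{
    assert(frame->location == depth);
    ++*(int *)ctx;
}
```

With a fake depth of 10000 the loop issues three requests (4086, 4086 and 1828 frames), so the cost over a single bulk call is a couple of extra walks to the start position.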

======

I cannot comment on the jtreg test. Looks fine to me, but I wonder
whether there is a better way to script jdb, is this how we are
supposed to do this?

Maybe someone from the Oracle serviceability group can comment.


Thanks & Best Regards, Thomas


On Tue, Jul 3, 2018 at 12:43 PM, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
> Hi All,
>
> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>
> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>
> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>
> Best regards,
> Ralf Schmelter

From david.holmes at oracle.com  Tue Jul  3 21:26:13 2018
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 4 Jul 2018 07:26:13 +1000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
Message-ID: <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>

On 4/07/2018 3:14 AM, Alan Bateman wrote:
> On 03/07/2018 18:07, Thomas Stüfe wrote:
>> :
>> Actually, at this point we could just:
>>
>> #if defined(__AIX)
>>       do {
>>           rv = close(fd);
>>       } while (rv == -1 && errno == EINTR);
>> #else
>>      rv = close(fd);
>> #endif
> Right, might be the simplest.

+1 with suitable comment

>>
>> But boy this close() EINTR business is evil. Choosing between risking
>> file descriptor leaks or random double closes ...

Hopefully it's somewhat academic and we don't actually take signals in 
arbitrary threads.

Cheers,
David

>>
> and we aren't out of the woods yet, there are a few other places that 
> need similar attention.

From yasuenag at gmail.com  Tue Jul  3 23:04:32 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 4 Jul 2018 08:04:32 +0900
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in
 Docker containers
In-Reply-To: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
Message-ID: <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>

PING: Could you review it?

>   JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/


Thanks,

Yasumasa


On 2018/06/28 22:12, Yasumasa Suenaga wrote:
> Hi all,
>
> Please review this change.
>
>   JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>
> I tried to attach jhsdb to a Java process in a Docker container from the container host, but it could not attach.
> jcmd gained PID namespace support in JDK-8193710, but jhsdb has not yet.
>
> SA gets LWP IDs via the thread stack and functions in libthread_db.so, but they return the PIDs inside the container, which differ from the host's PIDs. So I added code to scan /proc/<PID>/task to get all LWP IDs; they are kept in a Map in LinuxDebuggerLocal.
>
> Also, SA_ALTROOT is set to /proc/<PID>/root if SA detects that the debuggee runs in a container. This helps SA parse binaries in the container.
>
> This change has been pushed to the submit repo, and it failed on OS X (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
> But I guess that is caused by JDK-8205906. This change affects Linux only.
>
> Could you review it?
>
>
> Thanks,
>
> Yasumasa
>
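
To make the /proc/<PID>/task scan described above concrete, here is a minimal sketch. It is not the code from the webrev (that lives in the SA's Java class LinuxDebuggerLocal and keeps the ids in a Map); the function name list_lwp_ids and its parameters are hypothetical. The proc_root argument is only there so the helper can be pointed at a different directory for testing; the ids it returns are the ones visible in the PID namespace of the /proc mount being read.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper: collect the numeric entries of
 * <proc_root>/<pid>/task, each of which names one LWP (thread) id.
 * Returns the number of ids stored (at most max), or -1 if the
 * directory cannot be opened.
 */
static int list_lwp_ids(const char *proc_root, long pid, pid_t *ids, int max)
{
    char path[256];
    DIR *dir;
    struct dirent *ent;
    int n = 0;

    assert(proc_root != NULL && ids != NULL && max >= 0);

    snprintf(path, sizeof path, "%s/%ld/task", proc_root, pid);
    dir = opendir(path);
    if (dir == NULL) {
        return -1;
    }
    while (n < max && (ent = readdir(dir)) != NULL) {
        char *end;
        long id = strtol(ent->d_name, &end, 10);

        /* Skip ".", ".." and anything else that is not a plain number. */
        if (end != ent->d_name && *end == '\0' && id > 0) {
            ids[n++] = (pid_t)id;
        }
    }
    closedir(dir);
    return n;
}
```

Called from the container host as list_lwp_ids("/proc", pid, ids, max), this yields host-side LWP ids that a debugger can attach to, rather than the container-local ids reported by libthread_db.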

From thomas.stuefe at gmail.com  Wed Jul  4 05:11:51 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 4 Jul 2018 07:11:51 +0200
Subject: RFR: 8205928 - [TESTBUG]:
 jdk/internal/platform/docker/TestDockerMemoryMetrics.java
 fails depending on kernel config
In-Reply-To: <aad0b21b66054397b97150057d60f98d@sap.com>
References: <3481D570-DAAF-49D8-862B-5B13BC6C2B48@oracle.com>
 <CAA-vtUxJSvgRA9=A293+YMGw4YzX4aY2qYKqPACnXkE0pRNvog@mail.gmail.com>
 <596B8DE4-821B-4C0E-AB79-205F887645CB@oracle.com>
 <aad0b21b66054397b97150057d60f98d@sap.com>
Message-ID: <CAA-vtUxxP+kX9m0_pNWu-zUSdo5rVt9=x-n7PR4c-mYhi0MeKw@mail.gmail.com>

Thanks for confirming, Matthias.

On Tue, Jul 3, 2018, 17:45 Baesken, Matthias <matthias.baesken at sap.com>
wrote:

> Hi Bob and Thomas ,  I had  the patch in our internal  queue  and it fixed
> the problem .
> ( however I am not reviewer )
>
> Best regards, Matthias
>
>
> > -----Original Message-----
> > From: Bob Vandette [mailto:bob.vandette at oracle.com]
> > Sent: Tuesday, 3 July 2018 16:03
> > To: Thomas Stüfe <thomas.stuefe at gmail.com>
> > Cc: serviceability-dev at openjdk.java.net serviceability-
> > dev at openjdk.java.net <serviceability-dev at openjdk.java.net>; Baesken,
> > Matthias <matthias.baesken at sap.com>
> > Subject: Re: RFR: 8205928 - [TESTBUG]:
> > jdk/internal/platform/docker/TestDockerMemoryMetrics.java fails
> > depending on kernel config
> >
> > Matthias, who reported the issue, confirmed that this patch solves the
> > problem.
> >
> > Thanks,
> > Bob.
> >
> > > On Jul 3, 2018, at 9:38 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
> > wrote:
> > >
> > > Hi Bob,
> > >
> > > It does look fine from the outside. I did not test it though, since I
> > > have no suitable kernel.
> > >
> > > Best Regards, Thomas
> > >
> > > On Tue, Jul 3, 2018 at 3:13 PM, Bob Vandette <bob.vandette at oracle.com>
> > wrote:
> > >> Please review this small fix to correct a test failure when the Linux
> system
> > kernel is
> > >> not configured with the CONFIG_MEMCG_KMEM option.
> > >>
> > >> The Container Metric tests are dependent on docker which allow us to
> > assume a certain minimum
> > >> Linux kernel configuration level. However, the kernel memory resource
> > limiting feature is not a hard
> > >> requirement for docker. This test will need to be updated to allow for
> > running on kernels without this
> > >> option.  A 0 return from the getKernelMemoryLimit is defined to
> indicate
> > that this API is not available.
> > >>
> > >> BUG: https://bugs.openjdk.java.net/browse/JDK-8205928
> > >>
> > >> PROPOSED FIX:
> > >>
> > >> diff --git
> > a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> > b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> > >> --- a/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> > >> +++ b/test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
> > >> @@ -95,10 +95,11 @@
> > >>
> > >>     private static void testKernelMemoryLimit(String value) {
> > >>         long limit = getMemoryValue(value);
> > >> -        if (limit != Metrics.systemMetrics().getKernelMemoryLimit())
> {
> > >> +        long kmemlimit =
> Metrics.systemMetrics().getKernelMemoryLimit();
> > >> +        if (kmemlimit != 0 && limit != kmemlimit) {
> > >>             throw new RuntimeException("Kernel Memory limit not equal,
> > expected : ["
> > >>                     + limit + "]" + ", got : ["
> > >> -                    + Metrics.systemMetrics().getKernelMemoryLimit()
> + "]");
> > >> +                    + kmemlimit + "]");
> > >>         }
> > >>         System.out.println("TEST PASSED!!!");
> > >>     }
>
>

From matthias.baesken at sap.com  Wed Jul  4 11:37:44 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Wed, 4 Jul 2018 11:37:44 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
Message-ID: <c180c0c797b8484bae051f9903610101@sap.com>

Hi all, here is another webrev:

http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/

- switched to the code proposed by Thomas
- added a small comment


Best regards, Matthias


> -----Original Message-----
> From: David Holmes [mailto:david.holmes at oracle.com]
> Sent: Tuesday, 3 July 2018 23:26
> To: Alan Bateman <Alan.Bateman at oracle.com>; Thomas Stüfe
> <thomas.stuefe at gmail.com>
> Cc: Baesken, Matthias <matthias.baesken at sap.com>; serviceability-dev
> (serviceability-dev at openjdk.java.net) <serviceability-
> dev at openjdk.java.net>
> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
> 
> On 4/07/2018 3:14 AM, Alan Bateman wrote:
> > On 03/07/2018 18:07, Thomas Stüfe wrote:
> >> :
> >> Actually, at this point we could just:
> >>
> >> #if defined(__AIX)
> >>       do {
> >>           rv = close(fd);
> >>       } while (rv == -1 && errno == EINTR);
> >> #else
> >>      rv = close(fd);
> >> #endif
> > Right, might be the simplest.
> 
> +1 with suitable comment
> 
> >>
> >> But boy this close() EINTR business is evil. Choosing between risking
> >> file descriptor leaks or random double closes ...
> 
> Hopefully it's somewhat academic and we don't actually take signals in
> arbitrary threads.
> 
> Cheers,
> David
> 
> >>
> > and we aren't out of the woods yet, there are a few other places that
> > need similar attention.

From thomas.stuefe at gmail.com  Wed Jul  4 11:39:05 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 4 Jul 2018 13:39:05 +0200
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <c180c0c797b8484bae051f9903610101@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
 <c180c0c797b8484bae051f9903610101@sap.com>
Message-ID: <CAA-vtUzY6_Wvnua+Smf9d6=d=-1=RL0mRjaFStsS1+xM3ad1Pg@mail.gmail.com>

Looks good. Thank you Matthias!

..Thomas

On Wed, Jul 4, 2018 at 1:37 PM, Baesken, Matthias
<matthias.baesken at sap.com> wrote:
> Hi all,  here is another  webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/
>
> - switched to  the coding proposed by Thomas
> - added a small comment
>
>
>
> Best regards, Matthias
>
>
>> -----Original Message-----
>> From: David Holmes [mailto:david.holmes at oracle.com]
>> Sent: Tuesday, 3 July 2018 23:26
>> To: Alan Bateman <Alan.Bateman at oracle.com>; Thomas Stüfe
>> <thomas.stuefe at gmail.com>
>> Cc: Baesken, Matthias <matthias.baesken at sap.com>; serviceability-dev
>> (serviceability-dev at openjdk.java.net) <serviceability-
>> dev at openjdk.java.net>
>> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
>> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
>>
>> On 4/07/2018 3:14 AM, Alan Bateman wrote:
>> > On 03/07/2018 18:07, Thomas Stüfe wrote:
>> >> :
>> >> Actually, at this point we could just:
>> >>
>> >> #if defined(__AIX)
>> >>       do {
>> >>           rv = close(fd);
>> >>       } while (rv == -1 && errno == EINTR);
>> >> #else
>> >>      rv = close(fd);
>> >> #endif
>> > Right, might be the simplest.
>>
>> +1 with suitable comment
>>
>> >>
>> >> But boy this close() EINTR business is evil. Choosing between risking
>> >> file descriptor leaks or random double closes ...
>>
>> Hopefully it's somewhat academic and we don't actually take signals in
>> arbitrary threads.
>>
>> Cheers,
>> David
>>
>> >>
>> > and we aren't out of the woods yet, there are a few other places that
>> > need similar attention.

From Alan.Bateman at oracle.com  Wed Jul  4 11:42:45 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 4 Jul 2018 12:42:45 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <c180c0c797b8484bae051f9903610101@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
 <c180c0c797b8484bae051f9903610101@sap.com>
Message-ID: <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>

On 04/07/2018 12:37, Baesken, Matthias wrote:
> Hi all,  here is another  webrev :
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/
>
> - switched to  the coding proposed by Thomas
> - added a small comment
>
The code looks okay but the comment is a bit strange. A simple "AIX 
recommends to repeat the close call on EINTR" should be ignore and drop 
the bug reference.

-Alan

From Alan.Bateman at oracle.com  Wed Jul  4 11:44:34 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 4 Jul 2018 12:44:34 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
 <c180c0c797b8484bae051f9903610101@sap.com>
 <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>
Message-ID: <f33eb8f4-02fe-c22f-7c84-99166e7708b7@oracle.com>


On 04/07/2018 12:42, Alan Bateman wrote:
> A simple "AIX recommends to repeat the close call on EINTR" should be 
> ignore
I meant "should be okay" of course as AIX is the outlier.

From matthias.baesken at sap.com  Wed Jul  4 13:00:38 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Wed, 4 Jul 2018 13:00:38 +0000
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
 <c180c0c797b8484bae051f9903610101@sap.com>
 <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>
Message-ID: <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com>

OK, I will change the comment to "AIX recommends to repeat the close call on EINTR" and push. Is that fine with you?

Best regards, Matthias

> -----Original Message-----
> From: Alan Bateman [mailto:Alan.Bateman at oracle.com]
> Sent: Wednesday, 4 July 2018 13:43
> To: Baesken, Matthias <matthias.baesken at sap.com>; David Holmes
> <david.holmes at oracle.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
> Cc: serviceability-dev (serviceability-dev at openjdk.java.net) <serviceability-
> dev at openjdk.java.net>
> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
> 
> On 04/07/2018 12:37, Baesken, Matthias wrote:
> > Hi all,  here is another  webrev :
> >
> > http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/
> >
> > - switched to  the coding proposed by Thomas
> > - added a small comment
> >
> The code looks okay but the comment is a bit strange. A simple "AIX
> recommends to repeat the close call on EINTR" should be ignore and drop
> the bug reference.
> 
> -Alan

From thomas.stuefe at gmail.com  Wed Jul  4 13:02:19 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 4 Jul 2018 15:02:19 +0200
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
 <c180c0c797b8484bae051f9903610101@sap.com>
 <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>
 <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com>
Message-ID: <CAA-vtUwPW4pnj1u_-9=99nHkGjN_aiWNog=EbbSBGKaEL7Fk2g@mail.gmail.com>

On Wed, Jul 4, 2018 at 3:00 PM, Baesken, Matthias
<matthias.baesken at sap.com> wrote:
> OK, I will change the comment to "AIX recommends to repeat the close call on EINTR" and push. Is that fine with you?
>

Sure. I do not need another webrev.

..Thomas

> Best regards, Matthias
>
>> -----Original Message-----
>> From: Alan Bateman [mailto:Alan.Bateman at oracle.com]
>> Sent: Wednesday, July 4, 2018 13:43
>> To: Baesken, Matthias <matthias.baesken at sap.com>; David Holmes
>> <david.holmes at oracle.com>; Thomas Stüfe <thomas.stuefe at gmail.com>
>> Cc: serviceability-dev (serviceability-dev at openjdk.java.net) <serviceability-
>> dev at openjdk.java.net>
>> Subject: Re: 8206145 : dbgsysSocketClose - do not restart close if errno is
>> EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is EINTR
>>
>> On 04/07/2018 12:37, Baesken, Matthias wrote:
>> > Hi all,  here is another  webrev :
>> >
>> > http://cr.openjdk.java.net/~mbaesken/webrevs/8206145.2/
>> >
>> > - switched to  the coding proposed by Thomas
>> > - added a small comment
>> >
>> The code looks okay but the comment is a bit strange. A simple "AIX
>> recommends to repeat the close call on EINTR" should be fine, and drop
>> the bug reference.
>>
>> -Alan

From Alan.Bateman at oracle.com  Wed Jul  4 13:03:39 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 4 Jul 2018 14:03:39 +0100
Subject: 8206145 : dbgsysSocketClose - do not restart close if errno is
 EINTR [linux] - was : RE: RFR : 8205959 : Do not restart close if errno is
 EINTR
In-Reply-To: <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com>
References: <228035d2f64c494eaefe31b07ac72083@sap.com>
 <cc6114a2-9125-d9e2-ee63-e422111acbfa@oracle.com>
 <CAA-vtUxVg0omb_Ar=SiQg_-hh1K-KW7Ky-epeyCyOWV152WFuQ@mail.gmail.com>
 <c64df6de-ce3f-6a61-0272-e0620ee63d89@oracle.com>
 <93f37d63-7e03-5420-9949-c8b8ca9e8d06@oracle.com>
 <832f9ddb14f4415b98adafa004ff196f@sap.com>
 <8dc3f5e9-a4d4-a841-6a4b-220820576344@oracle.com>
 <4411e9aedba54b16bc779acdf8de184d@sap.com>
 <f2590a30-ebc8-f81d-2c73-fd1f3c3860b0@oracle.com>
 <CAA-vtUx6-vYVE6TFWvvS58Aa+tnKMXuS2EB4nRT8+oz2mAsO6g@mail.gmail.com>
 <0cc810d2-8679-3759-ae8c-d2c848183c00@oracle.com>
 <fc56cc53-6d36-de3f-0817-de0b5c553725@oracle.com>
 <c180c0c797b8484bae051f9903610101@sap.com>
 <1ba5013e-323e-26f9-c764-536f7d11e046@oracle.com>
 <8c8b6fc981e24fd1b6d1c6bf40e6e0fc@sap.com>
Message-ID: <09bfcdb3-29d5-7788-c73f-bba488a8ec30@oracle.com>


On 04/07/2018 14:00, Baesken, Matthias wrote:
> OK, I will change the comment to "AIX recommends to repeat the close call on EINTR" and push. Is that fine with you?
>
Works for me, no need to refresh the webrev of course.

-Alan
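
[Annotation] The behaviour agreed on in this thread can be sketched roughly as below. This is only an illustration and not the code from the webrev: the function name close_socket_sketch is made up, and selecting the retry loop with the _AIX macro is an assumption about how the platform split might be expressed.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper; NOT the actual dbgsysSocketClose from the webrev. */
int close_socket_sketch(int fd) {
    int rv;
#if defined(_AIX)
    /* AIX recommends to repeat the close call on EINTR. */
    do {
        rv = close(fd);
    } while (rv == -1 && errno == EINTR);
#else
    /* Elsewhere (e.g. Linux) close is not restarted on EINTR: the
     * descriptor may already have been released, so a retry could close
     * an unrelated descriptor that another thread has just opened. */
    rv = close(fd);
#endif
    return rv;
}
```

On non-AIX platforms the caller simply sees the result of the single close call, including a possible EINTR.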

From ralf.schmelter at sap.com  Wed Jul  4 13:47:28 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Wed, 4 Jul 2018 13:47:28 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <CAA-vtUzj4gor7Z_RKUJzAkzsC7QXubp1mn6ukb5rByJUiJgw4A@mail.gmail.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <CAA-vtUzj4gor7Z_RKUJzAkzsC7QXubp1mn6ukb5rByJUiJgw4A@mail.gmail.com>
Message-ID: <5f4b279d105e44d9a46bfe16a15bbb34@sap.com>

Hi Thomas,

thank you for reviewing the change.


> +    /* Should not usually happen. */
> +    if (length != count) {
> +        error = JVMTI_ERROR_INTERNAL;
> +    }
>
> Cosmetics: I would also probably explicitly return:
>
> /* Should not usually happen. */
> if (length != count) {
>   jvmtiDeallocate(frames);
>   outStream_setError(out, JDWP_ERROR(INTERNAL));
>   return JNI_TRUE;
> }
>
> .. makes the code clearer and should someone change the loop cancel
> condition it will still work.

This would still rely on the error check in the loop, since the GetStackTrace JVMTI call sets the error variable too.

This means it should be either explicit for both cases:
    error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace)
                          (gdata->jvmti, thread, startIndex, length, frames, &count);

    if (error != JVMTI_ERROR_NONE) {
        jvmtiDeallocate(frames);
        outStream_setError(out, map2jdwpError(error));
        return JNI_TRUE;
    }

    /* Should not happen. */
    if (length != count) {
        jvmtiDeallocate(frames);
        outStream_setError(out, JDWP_ERROR(INTERNAL));
        return JNI_TRUE;
    }

or none (note that the original code could overwrite the error from the GetStackTrace call, which is fixed here):
    error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace)
                          (gdata->jvmti, thread, startIndex, length, frames, &count);

    /* Should not happen. */
    if (error == JVMTI_ERROR_NONE  && length != count) {
        error = JVMTI_ERROR_INTERNAL;
    }


> +    for (fnum = 0; (fnum < count) && (error == JVMTI_ERROR_NONE); ++fnum) {
>
> you could lose the inner brackets.
>
> --
>
> Cosmetics: you changed meaning of fnum. Before it was really the frame
> number. Now, fnum is a zero based index into your array. So I would
> probably have renamed the variable too, maybe index? or somesuch.

Ok, index it is. 


> Do we not have to handle opaque frames like the code before did? Or
> does GetStackTrace already filter out opaque frames? Would that not
> mean that GetStackTrace returns fewer frames than expected, and then
> count could be smaller than length?

> -- oh wait I see GetFrameLocation never really returned
> JVMTI_ERROR_OPAQUE_FRAME? So it is probably fine.

Exactly. The old code would have skipped native methods in the stack trace if JVMTI_ERROR_OPAQUE_FRAME had been returned. But since this was in fact never returned, the stacks should look the same.


> How large can the depth get? In stack overflow scenarios?
>
> To limit memory usage and to make it more predictable, I would not
> retrieve all frames in one go but in a loop, in chunks of n frames. E.g.
> 4086 frames would mean your buffer never exceeds 64K on 64bit
> platforms. You would sacrifice a tiny bit of performance (again
> needless walking up to starting position) but would not choke out when
> stacks are ridiculously large.

In theory or in practice? In practice a stack overflow will have at most a few hundred thousand frames, usually far fewer (10 to 20 thousand). But one can imagine a scenario where the JIT statically inlines a lot of calls, leading to many Java frames per (small) physical frame.

But consider that the whole stack is already written to memory, since the packet output stream is backed completely by memory. So the memory requirement is already O(nrOfFrames).
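
[Annotation] For reference, the chunked retrieval discussed here could look roughly like the sketch below. Everything in it is hypothetical: frame_info_sketch stands in for jvmtiFrameInfo (a method ID plus a location, 16 bytes on 64-bit platforms), get_frames_fn stands in for the JVMTI GetStackTrace call, and the chunk size of 4096 frames (4096 * 16 bytes = 64K) is chosen here for illustration. The fake_* helpers exist only so the sketch can be exercised without a JVM.

```c
#include <assert.h>
#include <stdlib.h>

#define CHUNK_FRAMES 4096 /* 4096 * 16 bytes = 64K on 64-bit platforms */

/* Hypothetical stand-in for jvmtiFrameInfo. */
typedef struct {
    void     *method;
    long long location;
} frame_info_sketch;

/* Hypothetical stand-in for GetStackTrace: fills up to max frames starting
 * at depth start, stores the number written in *count, returns 0 on success. */
typedef int (*get_frames_fn)(int start, int max, frame_info_sketch *buf,
                             int *count, void *ctx);
typedef void (*emit_frame_fn)(int depth, const frame_info_sketch *f, void *ctx);

/* Walks length frames beginning at start, never buffering more than
 * CHUNK_FRAMES entries at once. Returns 0 on success, non-zero on error. */
int walk_frames_chunked(int start, int length,
                        get_frames_fn get, emit_frame_fn emit, void *ctx) {
    frame_info_sketch *buf;
    int done = 0;
    int error = 0;

    assert(start >= 0 && length >= 0);
    buf = malloc(CHUNK_FRAMES * sizeof *buf);
    if (buf == NULL) {
        return -1;
    }
    while (error == 0 && done < length) {
        int want = length - done;
        int count = 0;
        int i;
        if (want > CHUNK_FRAMES) {
            want = CHUNK_FRAMES;
        }
        error = get(start + done, want, buf, &count, ctx);
        /* Should not happen: fewer frames returned than requested. */
        if (error == 0 && count != want) {
            error = -1;
        }
        for (i = 0; error == 0 && i < count; ++i) {
            emit(start + done + i, &buf[i], ctx);
        }
        done += want;
    }
    free(buf); /* single deallocation on every path */
    return error;
}

/* Fake stack used only to exercise the sketch. */
typedef struct { int depth; int emitted; } fake_stack;

int fake_get(int start, int max, frame_info_sketch *buf, int *count, void *ctx) {
    fake_stack *s = ctx;
    int i;
    if (start < 0 || max < 0 || start + max > s->depth) {
        return -1;
    }
    for (i = 0; i < max; ++i) {
        buf[i].method = NULL;
        buf[i].location = start + i;
    }
    *count = max;
    return 0;
}

void fake_emit(int depth, const frame_info_sketch *f, void *ctx) {
    fake_stack *s = ctx;
    if (f->location == depth) {
        s->emitted++;
    }
}

/* Returns the number of frames emitted for a fake stack, or -1 on error. */
int demo_walk(int depth) {
    fake_stack s = { depth, 0 };
    if (walk_frames_chunked(0, depth, fake_get, fake_emit, &s) != 0) {
        return -1;
    }
    return s.emitted;
}
```

The cost of this variant is that each chunk makes the underlying call walk the stack up to its starting depth again, which is the small performance sacrifice mentioned in the review comment.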


> I cannot comment on the jtreg test. Looks fine to me, but I wonder
> whether there is a better way to script jdb, is this how we are
> supposed to do this?

I don't know. But the ShellScaffold.sh library is used by over 40 other JDI tests, so I used it too.

Best regards,
Ralf


From thomas.stuefe at gmail.com  Wed Jul  4 14:43:17 2018
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 4 Jul 2018 16:43:17 +0200
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <5f4b279d105e44d9a46bfe16a15bbb34@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <CAA-vtUzj4gor7Z_RKUJzAkzsC7QXubp1mn6ukb5rByJUiJgw4A@mail.gmail.com>
 <5f4b279d105e44d9a46bfe16a15bbb34@sap.com>
Message-ID: <CAA-vtUwUsz5aUGsaQ9xMYCQKWY52wWf9KpThD=tHx8SyGe4E5Q@mail.gmail.com>

Hi Ralf,

On Wed, Jul 4, 2018 at 3:47 PM, Schmelter, Ralf <ralf.schmelter at sap.com> wrote:
> Hi Thomas,
>
> thank you for reviewing the change.
>
>
>> +    /* Should not usually happen. */
>> +    if (length != count) {
>> +        error = JVMTI_ERROR_INTERNAL;
>> +    }
>>
>> Cosmetics: I would also probably explicitly return:
>>
>> /* Should not usually happen. */
>> if (length != count) {
>>   jvmtiDeallocate(frames);
>>   outStream_setError(out, JDWP_ERROR(INTERNAL));
>>   return JNI_TRUE;
>> }
>>
>> .. makes the code clearer and should someone change the loop cancel
>> condition it will still work.
>
> This would still rely on the error check in the loop, since the GetStackTrace JVMTI call sets the error variable too.
>
> This means it should be either explicit for both cases:
>     error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace)
>                           (gdata->jvmti, thread, startIndex, length, frames, &count);
>
>     if (error != JVMTI_ERROR_NONE) {
>         jvmtiDeallocate(frames);
>         outStream_setError(out, map2jdwpError(error));
>         return JNI_TRUE;
>     }
>
>     /* Should not happen. */
>     if (length != count) {
>         jvmtiDeallocate(frames);
>         outStream_setError(out, JDWP_ERROR(INTERNAL));
>         return JNI_TRUE;
>     }
>
> or none (note that the original code could overwrite the error from the GetStackTrace call, which is fixed here):
>     error = JVMTI_FUNC_PTR(gdata->jvmti, GetStackTrace)
>                           (gdata->jvmti, thread, startIndex, length, frames, &count);
>
>     /* Should not happen. */
>     if (error == JVMTI_ERROR_NONE  && length != count) {
>         error = JVMTI_ERROR_INTERNAL;
>     }
>

Okay, in that case I prefer the second variant. At least only one
deallocate call then.

>
>
>> +    for (fnum = 0; (fnum < count) && (error == JVMTI_ERROR_NONE); ++fnum) {
>>
>> you could lose the inner brackets.
>>
>> --
>>
>> Cosmetics: you changed meaning of fnum. Before it was really the frame
>> number. Now, fnum is a zero based index into your array. So I would
>> probably have renamed the variable too, maybe index? or somesuch.
>
> Ok, index it is.
>
>

Thanks.

>
>> Do we not have to handle opaque frames like the code before did? Or
>> does GetStackTrace already filter out opaque frames? Would that not
>> mean that GetStackTrace returns fewer frames than expected, and then
>> count could be smaller than length?
>
>> -- oh wait I see GetFrameLocation never really returned
>> JVMTI_ERROR_OPAQUE_FRAME? So it is probably fine.
>
> Exactly. The old code would have skipped native methods in the stack trace if JVMTI_ERROR_OPAQUE_FRAME had been returned. But since this was in fact never returned, the stacks should look the same.
>
>
>
>
>> How large can the depth get? In stack overflow scenarios?
>>
>> To limit memory usage and to make it more predictable, I would not
>> retrieve all frames in one go but in a loop, in chunks of n frames. E.g.
>> 4086 frames would mean your buffer never exceeds 64K on 64bit
>> platforms. You would sacrifice a tiny bit of performance (again
>> needless walking up to starting position) but would not choke out when
>> stacks are ridiculously large.
>
> In theory or in practice? In practice a stack overflow will have at most a few hundred thousand frames, usually far fewer (10 to 20 thousand). But one can imagine a scenario where the JIT statically inlines a lot of calls, leading to many Java frames per (small) physical frame.
>
> But consider that the whole stack is already written to memory, since the packet output stream is backed completely by memory. So the memory requirement is already O(nrOfFrames).

Okay. Just did a quick calculation: we already need 33 bytes per frame
in the output stream, and now we need 16 more. I find it difficult to
see how one would be a problem and the other would not. So okay, let's
keep the code simple.

>
>
>> I cannot comment on the jtreg test. Looks fine to me, but I wonder
>> whether there is a better way to script jdb, is this how we are
>> supposed to do this?
>
> I don't know. But the ShellScaffold.sh library is used by over 40 other JDI tests, so I used it too.
>
> Best regards,
> Ralf

Okay. From my point of view, this is reviewed.

Thanks & Best Regards, Thomas

>

From rafael.wth at gmail.com  Wed Jul  4 19:08:04 2018
From: rafael.wth at gmail.com (Rafael Winterhalter)
Date: Wed, 4 Jul 2018 21:08:04 +0200
Subject: Review Request JDK-8200559: Java agents doing instrumentation
 need a means to define auxiliary classes
In-Reply-To: <7a27c86c-6915-21e0-866f-f62c7626e330@oracle.com>
References: <78719929-5cc1-7770-c7bf-107bef65e57e@oracle.com>
 <CA+DM0Amu8L-vkgHOfhe5YCZO_+c2CMq3vVp3yb5dfdhyz+PK2A@mail.gmail.com>
 <7a27c86c-6915-21e0-866f-f62c7626e330@oracle.com>
Message-ID: <CA+DM0An-gzfGQA3Qv0Qt9f_8291LQjRTk37PnAf6OTSsuOjRpQ@mail.gmail.com>

Hi Mandy and Alan,

I very much understand your points of view that the Java instrumentation
API should retain its original intended scope and that every problem should
be solved at its own time. I do however claim that the proposed API does
not even solve this problem of auxiliary classes for the general case. By
bringing up several examples for why I want to suggest a
package-independent injection mechanism, this argument was maybe lost
between the lines. Allow me to repeat it in a possibly clearer manner:

Consider a case where an instrumentation is not isolated to a single class
but involves the classes foo.Bar and qux.Baz which are both transformed. To
implement the instrumentation some code is added to a method of foo.Bar
which then invokes the method qux.Baz::baz(Object) which is also
transformed. To apply the instrumentation, an auxiliary class needs to be
created which transports some state from foo.Bar to qux.Baz as the method
argument. For this to work, the code that is added to qux.Baz::baz(Object)
checks and casts the argument to the known auxiliary class.

Using the suggested API, it is very difficult to apply such a
transformation because it is not clear whether the auxiliary class should
live in the foo or the qux package: the Java agent does not control which
of the foo.Bar and qux.Baz classes is loaded first. To solve this, one
would need to prepare two instrumentations, with the auxiliary class in
either the foo or the qux package, and define the class according to the
load order. This problem explodes into multidimensional complexity as more
classes are involved in an instrumentation.

This example might seem artificial but it makes perfect sense when, for
example, instrumenting actors in an actor framework such as Akka, where a
new message is added at runtime and handling is added to two actor classes.
I have also encountered similar instrumentation circles in I/O processing
frameworks.

And while an instrumentation might start isolated to transforming a single
class, there might be a requirement to evolve it later. Given the proposed
API, such evolutions are now difficult to implement as defining auxiliary
classes requires considering class loading order.

This is my argument for this API not being suited for instrumentation
purposes and why I would favor an Instrumentation::defineClass API instead.
It is simply too difficult to find a good limiting factor; one could of
course consider allowing class definitions in the same class loader or
module, but since class loaders and modules can stand in exporting
relationships to one another, the problem with unpredictable load order
would occur again.

Beyond that I still claim that, apart from the use case of auxiliary
classes, the following facts justify an introduction of
Instrumentation::defineClass, all of which apply to the "tool agents"
towards which the Instrumentation API is targeted:
1. The need to inject dispatchers into specific class loaders to allow
cross-class loader communication. This is typically the bootstrap class
loader, which is already accessible via
Instrumentation::appendToBootstrapClassLoaderSearch, but this might be too
specific.
2. The factual possibility of an owner of an instrumentation instance to
inject classes into any package without using internal API simply by
"pseudo transforming" a class that resides in the desired package.
3. The history of agents being developed using sun.misc.Unsafe::defineClass
for many years, which makes migration to a much different API unlikely if
this involves heavy costs.
4. Retaining some equivalence to native agents where an API for defining
classes is available via JNI. It would be a shame if JVMTI is favored over
the Java agent API only for this.

I really hope you take this concern into consideration. To strengthen my
argument, please also consider that many regard me to be one of the
leading experts for JVM agents (due to my library Byte Buddy that is often
used for Java agents) and I have worked with a multitude of Java agent
vendors whose concerns I am also voicing here. Currently, none of the
vendors I regularly talk to is considering using the suggested API, whereas
most plan to simply migrate to jdk.internal.misc.Unsafe, which is further
fostered by no alternative being offered in Java 11.

I also understand that it is probably too late to make an API decision at
this point. However, due to this important use case for
sun.misc.Unsafe::defineClass not being currently covered, I want to
suggest reintroducing the latter method in Java 11 to avoid a further
spread of internal API usage by forcing people into
jdk.internal.misc.Unsafe, where many will grow comfortable and not even
consider future APIs. The migration to jdk.internal.misc.Unsafe is also
what I observe being used for EA builds at this point, so this is a partial
reality already.

Thank you for hearing me out, I really hope I can change your mind on this
issue and add Instrumentation::defineClass.
best regards, Rafael

2018-07-02 19:17 GMT+02:00 mandy chung <mandy.chung at oracle.com>:

> My proposal of ClassDefiner API allows the java agent to define auxiliary
> classes in the same runtime package of the class being instrumented.  You
> raised other use cases that are not addressed by this proposal.  As Alan
> replied, the ability to define any arbitrary class would be an attractive
> nuisance and we think Instrumentation.defineClass isn't the right API to
> add.
>
> I think the proposed ClassDefiner API is useful for the specific use case
> (define auxiliary classes in the runtime package of the class being
> instrumented).  I held it off and so it didn't make 11.  For the other use
> cases, perhaps we should create JBS issues for further investigation.
>
> Mandy
>
> On 7/2/18 1:41 AM, Rafael Winterhalter wrote:
>
>> Hi,
>>
>> I was wondering if a solution for this problem is still planned for JDK
>> 11 given the beginning of ramp down.
>>
>> With removing sun.misc.Unsafe::defineClass, Java agents only have an
>> option to use jdk.internal.misc.Unsafe::defineClass for the use-cases
>> that I described.
>>
>> I think it would be a missed opportunity not to offer an alternative as
>> of JDK 11 as a second migration would make it even less likely that agents
>> would avoid unsafe API.
>>
>> Thanks for the information,
>> best regards, Rafael
>>
>> mandy chung <mandy.chung at oracle.com <mailto:mandy.chung at oracle.com>>
>> wrote on Sun, Apr 15, 2018, 08:23:
>>
>>     Background:
>>
>>     Java agents support both load time and dynamic instrumentation. At
>>     load time, the agent's ClassFileTransformer is invoked to transform
>>     class bytes.  There are no Class objects at this time.  Dynamic
>>     instrumentation is when redefineClasses or retransformClasses is
>>     used to redefine an existing loaded class.  The ClassFileTransformer
>>     is invoked with class bytes where the Class object is present.
>>
>>     A Java agent doing instrumentation needs a means to define auxiliary
>>     classes that are visible and accessible to the instrumented class.
>>     Existing agents have been using sun.misc.Unsafe::defineClass to
>>     define aux classes directly or accessing the protected
>>     ClassLoader::defineClass method with setAccessible to suppress the
>>     language access check (see [1] where this issue was brought up).
>>
>>     Instrumentation::appendToBootstrapClassLoaderSearch and
>>     appendToSystemClassLoaderSearch APIs are existing means to supply
>>     additional classes.  They are too limited: for example, they can't
>>     inject a class in the same runtime package as the class being
>>     transformed.
>>
>>     Proposal:
>>
>>     This proposes to add a new ClassFileTransformer.transform method
>>     taking an additional ClassDefiner parameter.  A transformer can
>>     define additional classes during the transformation process, i.e.
>>     when ClassFileTransformer::transform is invoked. Some details:
>>
>>     1. ClassDefiner::defineClass defines a class in the same runtime
>>         package as the class being transformed.
>>     2. The class is defined in the same thread as the transformers are
>>         being invoked.   ClassDefiner::defineClass returns Class object
>>         directly before the transformed class is defined.
>>     3. No transformation is applied to classes defined by
>>         ClassDefiner::defineClass.
>>
>>     The first prototype we did was to collect the auxiliary classes and
>>     defer defining them until all transformers are invoked, and have
>>     these aux classes go through the transformation pipeline.  Several
>>     complicated issues would need to be resolved, for example timing:
>>     whether the auxiliary classes should be defined before the
>>     transformed class (otherwise there is a potential race where some
>>     other thread references the transformed class and causes code to
>>     execute that in turn references the auxiliary classes).  The current
>>     implementation has a native reentrancy check that ensures one class
>>     is being transformed to avoid potential circularity issues.  This
>>     may need JVM TI support to be reliable.
>>
>>     This proposal would allow java agents to migrate from internal API
>>     and ClassDefiner to be enhanced in the future.
>>
>>     Webrev:
>>     http://cr.openjdk.java.net/~mchung/jdk11/webrevs/8200559/webrev.00/
>>
>>     Mandy
>>     [1]
>>     http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html
>>
>>

From david.holmes at oracle.com  Thu Jul  5 08:19:17 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Jul 2018 18:19:17 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0 code
Message-ID: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/

Problem:

The tests create native threads that attach to the VM through 
JNI_AttachCurrentThread but which then terminate without detaching 
themselves. When the VM exits and we're using Flight Recorder 
"dumponexit" this leads to a call to VM_PrintThreads that in part wants 
to print the per-thread CPU usage. When we encounter the threads that 
have terminated already the low level pthread_getcpuclockid call 
returns ESRCH but the code doesn't expect that and so fails an assert in 
debug mode and can SEGV in product mode.

Solution:

Serviceability-side: fix the tests

Change the tests so that the threads detach before terminating. The two 
tests are (surprisingly) written in completely different styles, so the 
solution also takes on two different styles.

Runtime-side: make the VM more robust in the face of JNI attached 
threads that terminate before detaching, and add a regression test

I took a good look at the low-level code for interacting with arbitrary 
threads and as far as I can see the problem only exists for this one 
case of pthread_getcpuclockid on Linux. Elsewhere the potential for a 
library call failure just reports an error value (such as -1 for the cpu 
time used).

So the fix is simply to allow for ESRCH when calling 
pthread_getcpuclockid and return -1 for the cpu usage in that case.

I created a new regression test to create a new native thread, attach it 
and then let it terminate while still attached. The java code then calls 
various Thread and ThreadMXBean functions on it to ensure there are no 
crashes or unexpected exceptions.

Testing:
  - old tests with fixed run-time
  - old run-time with fixed tests
  - mach tier4 (which exposed the problem - that's where we enable 
Flight recorder for the tests) [in progress]
  - mach5 tier 1-3 for good measure [in progress]
  - new regression test

Thanks,
David
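
[Annotation] The runtime-side idea described above (tolerate ESRCH and report -1) can be sketched as follows. This is a minimal illustration and not the HotSpot code from the webrev: the helper name thread_cpu_time_sketch is made up, and returning nanoseconds is an assumption made for the example. Note that pthread_getcpuclockid reports failure through its return value rather than through errno.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <time.h>

/* Hypothetical helper; NOT the actual HotSpot implementation.
 * Returns the CPU time used by the thread in nanoseconds, or -1 if it
 * cannot be determined (for example, the thread has already terminated). */
long long thread_cpu_time_sketch(pthread_t thread) {
    clockid_t clockid;
    struct timespec ts;

    /* The error number (e.g. ESRCH) is returned directly, not via errno. */
    int rc = pthread_getcpuclockid(thread, &clockid);
    if (rc != 0) {
        /* Report "unknown" instead of asserting or using a bogus clock id. */
        return -1;
    }
    if (clock_gettime(clockid, &ts) != 0) {
        return -1;
    }
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

A caller that prints per-thread CPU usage would treat -1 as "not available" and carry on with the remaining threads.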

From david.holmes at oracle.com  Thu Jul  5 09:58:39 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 5 Jul 2018 19:58:39 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
Message-ID: <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>

<sigh> Solaris compiler complains about doing a return from inside a 
do-while loop. I'll have to rework part of the fix tomorrow.

David

On 5/07/2018 6:19 PM, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
> 
> Problem:
> 
> The tests create native threads that attach to the VM through 
> JNI_AttachCurrentThread but which then terminate without detaching 
> themselves. When the VM exits and we're using Flight Recorder 
> "dumponexit" this leads to a call to VM_PrintThreads that in part wants 
> to print the per-thread CPU usage. When we encounter the threads that 
> have terminated already the low level pthread_getcpuclockid call 
> returns ESRCH but the code doesn't expect that and so fails an assert in 
> debug mode and can SEGV in product mode.
> 
> Solution:
> 
> Serviceability-side: fix the tests
> 
> Change the tests so that the threads detach before terminating. The two 
> tests are (surprisingly) written in completely different styles, so the 
> solution also takes on two different styles.
> 
> Runtime-side: make the VM more robust in the face of JNI attached 
> threads that terminate before detaching, and add a regression test
> 
> I took a good look at the low-level code for interacting with arbitrary 
> threads and as far as I can see the problem only exists for this one 
> case of pthread_getcpuclockid on Linux. Elsewhere the potential for a 
> library call failure just reports an error value (such as -1 for the cpu 
> time used).
> 
> So the fix is simply to allow for ESRCH when calling 
> pthread_getcpuclockid and return -1 for the cpu usage in that case.
> 
> I created a new regression test to create a new native thread, attach it 
> and then let it terminate while still attached. The java code then calls 
> various Thread and ThreadMXBean functions on it to ensure there are no 
> crashes or unexpected exceptions.
> 
> Testing:
>   - old tests with fixed run-time
>   - old run-time with fixed tests
>   - mach tier4 (which exposed the problem - that's where we enable 
> Flight recorder for the tests) [in progress]
>   - mach5 tier 1-3 for good measure [in progress]
>   - new regression test
> 
> Thanks,
> David

From gary.adams at oracle.com  Thu Jul  5 14:48:39 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Thu, 05 Jul 2018 10:48:39 -0400
Subject: RFR:  JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
Message-ID: <5B3E2FC7.1060303@oracle.com>

A simple test run using "exclude none" shows 625K methods are being
observed. The bulk of those methods were due to the last class accessed
in the test - VirtualMachineManager.

It's not important that this particular call is used. The test is simply
demonstrating that filters work for packages other than java and javax.

This proposed fix uses a simpler lookup for GregorianCalendar.

   Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
   Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/

From chris.plummer at oracle.com  Thu Jul  5 21:28:15 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 5 Jul 2018 14:28:15 -0700
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <5B3E2FC7.1060303@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
Message-ID: <e81f2757-b5a9-25c3-be0d-d71531dcb3a5@oracle.com>

Hi Gary,

The changes look good. How much does this reduce the execution time by?

thanks,

Chris

On 7/5/18 7:48 AM, Gary Adams wrote:
> A simple test run using "exclude none" shows 625K methods are being 
> observed.
> The bulk of those methods were due to the last class accessed in the 
> test - VirtualMachineManager.
>
> It's not important that this particular call is used. The test is 
> simply demonstrating that
> filters work for other packages than java and javax.
>
> This proposed fix uses a simpler lookup for GregorianCalendar.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
>   Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/


From chris.plummer at oracle.com  Thu Jul  5 21:55:36 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 5 Jul 2018 14:55:36 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
Message-ID: <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>

Hi David,

Solaris problems aside, overall it looks fine. Some minor things I noted:

I noticed that exitCode is never modified in agentA() or agentB(), so 
there isn't much point to having it. If you reach the bottom of the 
function, it passed, so PASSED can be returned. The code would be more 
clear if it did this. As-is it is implied that you can reach the bottom 
when it fails.

Is detaching the threads along the failure paths really needed? exit() 
is called, so this would seem to make it unnecessary.

I prefer assignments not to be embedded inside the "if" condition. The 
DetachCurrentThread code in THREAD_return() is much more readable than 
the similar code in agentA() and agentB().

In the test:

   54         // Generally as long as we don't crash of throw unexpected
   55         // exceptions then the test passes. In some cases we know exactly

"of" should be "or".

Shouldn't you be catching exceptions for all the Thread methods you are 
calling? Otherwise the test will exit if one is thrown, and the above 
comment indicates that you don't want this.

Don't we normally put these tests in a package?

thanks,

Chris

On 7/5/18 2:58 AM, David Holmes wrote:
> <sigh> Solaris compiler complains about doing a return from inside a 
> do-while loop. I'll have to rework part of the fix tomorrow.
>
> David
>
> On 5/07/2018 6:19 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>
>> Problem:
>>
>> The tests create native threads that attach to the VM through 
>> JNI_AttachCurrentThread but which then terminate without detaching 
>> themselves. When the VM exits and we're using Flight Recorder 
>> "dumponexit" this leads to a call to VM_PrintThreads that in part 
>> wants to print the per-thread CPU usage. When we encounter the 
>> threads that have terminated already the low level 
>> pthread_getcpuclockid call returns ESRCH but the code doesn't expect 
>> that and so fails an assert in debug mode and can SEGV in product mode.
>>
>> Solution:
>>
>> Serviceability-side: fix the tests
>>
>> Change the tests so that the threads detach before terminating. The 
>> two tests are (surprisingly) written in completely different styles, 
>> so the solution also takes on two different styles.
>>
>> Runtime-side: make the VM more robust in the face of JNI attached 
>> threads that terminate before detaching, and add a regression test
>>
>> I took a good look at the low-level code for interacting with 
>> arbitrary threads and as far as I can see the problem only exists for 
>> this one case of pthread_getcpuclockid on Linux. Elsewhere the 
>> potential for a library call failure just reports an error value 
>> (such as -1 for the cpu time used).
>>
>> So the fix is simply to allow for ESRCH when calling 
>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>
>> I created a new regression test to create a new native thread, attach 
>> it and then let it terminate while still attached. The java code then 
>> calls various Thread and ThreadMXBean functions on it to ensure there 
>> are no crashes or unexpected exceptions.
>>
>> Testing:
>>   - old tests with fixed run-time
>>   - old run-time with fixed tests
>>   - mach tier4 (which exposed the problem - that's where we enable 
>> Flight recorder for the tests) [in progress]
>>   - mach5 tier 1-3 for good measure [in progress]
>>   - new regression test
>>
>> Thanks,
>> David


From david.holmes at oracle.com  Thu Jul  5 22:18:52 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 08:18:52 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
Message-ID: <5752c0cc-f2bd-ed5c-0579-51ed639ee4cb@oracle.com>

On 5/07/2018 7:58 PM, David Holmes wrote:
> <sigh> Solaris compiler complains about doing a return from inside a 
> do-while loop. I'll have to rework part of the fix tomorrow.

Webrev updated in-place. The only change is to the makefile to disable a 
warning:

+ ifeq ($(TOOLCHAIN_TYPE), solstudio)
+   BUILD_HOTSPOT_JTREG_LIBRARIES_CFLAGS_libji06t001 += -erroff=E_END_OF_LOOP_CODE_NOT_REACHED
+ endif
+

David
-----

> David
> 
> On 5/07/2018 6:19 PM, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>
>> Problem:
>>
>> The tests create native threads that attach to the VM through 
>> JNI_AttachCurrentThread but which then terminate without detaching 
>> themselves. When the VM exits and we're using Flight Recorder 
>> "dumponexit" this leads to a call to VM_PrintThreads that in part 
>> wants to print the per-thread CPU usage. When we encounter the threads 
>> that have terminated already the low level pthread_getcpuclockid call 
>> returns ESRCH but the code doesn't expect that and so fails an assert 
>> in debug mode and can SEGV in product mode.
>>
>> Solution:
>>
>> Serviceability-side: fix the tests
>>
>> Change the tests so that the threads detach before terminating. The 
>> two tests are (surprisingly) written in completely different styles, 
>> so the solution also takes on two different styles.
>>
>> Runtime-side: make the VM more robust in the face of JNI attached 
>> threads that terminate before detaching, and add a regression test
>>
>> I took a good look at the low-level code for interacting with 
>> arbitrary threads and as far as I can see the problem only exists for 
>> this one case of pthread_getcpuclockid on Linux. Elsewhere the 
>> potential for a library call failure just reports an error value (such 
>> as -1 for the cpu time used).
>>
>> So the fix is simply to allow for ESRCH when calling 
>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>
>> I created a new regression test to create a new native thread, attach 
>> it and then let it terminate while still attached. The java code then 
>> calls various Thread and ThreadMXBean functions on it to ensure there 
>> are no crashes or unexpected exceptions.
>>
>> Testing:
>>   - old tests with fixed run-time
>>   - old run-time with fixed tests
>>   - mach tier4 (which exposed the problem - that's where we enable 
>> Flight recorder for the tests) [in progress]
>>   - mach5 tier 1-3 for good measure [in progress]
>>   - new regression test
>>
>> Thanks,
>> David

From chris.plummer at oracle.com  Thu Jul  5 22:37:21 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 5 Jul 2018 15:37:21 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <709161f438f848b0af5fb079c9c0242a@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
Message-ID: <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>

Hi Ralf,

Overall looks good, but I do have a few comments and questions.

Please update the copyright.

What testing have you done?

How long does this test take to run?

What happens if for some reason SOE is never thrown? It's not clear to 
me what the script would do in this case.

In answer to the ShellScaffold.sh question, there is already work 
underway to convert to pure java tests. See JDK-8201652. I'm not certain 
if it is ok for you to just submit this new shell script, or if it should 
be rewritten in pure java. Most of the work to convert the scripts has 
already been done but was put on hold. Maybe Serguei can comment and 
guide you on how it would be done in java.

thanks,

Chris

On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
> Hi All,
>
> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>
> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>
> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>
> Best regards,
> Ralf Schmelter
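
For readers following the quadratic-versus-linear point, here is a generic sketch of the difference. It is not taken from the webrev: the Frame class, the method names and the assumption that the old cost came from restarting the walk for every frame index are all hypothetical. It only shows how a single walk that remembers each frame trades memory proportional to the frame count for linear time.

```java
public class FrameWalkSketch {

    // Hypothetical stand-in for a stack frame; not a JDWP or JVMTI type.
    static final class Frame {
        final Frame caller;
        Frame(Frame caller) { this.caller = caller; }
    }

    // Builds a chain of `count` frames and returns the top-most one.
    static Frame buildStack(int count) {
        Frame f = null;
        for (int i = 0; i < count; i++) {
            f = new Frame(f);
        }
        return f;
    }

    // Quadratic: every index lookup restarts the walk from the top frame.
    // Returns the number of caller links that were followed.
    static long stepsWithIndexLookups(int count) {
        Frame top = buildStack(count);
        long steps = 0;
        for (int i = 0; i < count; i++) {
            Frame f = top;
            for (int j = 0; j < i; j++) {
                f = f.caller;
                steps++;
            }
        }
        return steps;
    }

    // Linear: one walk that stores each frame as it is passed, at the cost
    // of an array with one slot per frame.
    static long stepsWithSingleWalk(int count) {
        Frame[] cache = new Frame[count];
        Frame f = buildStack(count);
        long steps = 0;
        for (int i = 0; i < count; i++) {
            cache[i] = f;
            f = f.caller;
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int count = 1000;
        System.out.println("index lookups: " + stepsWithIndexLookups(count));
        System.out.println("single walk:   " + stepsWithSingleWalk(count));
    }
}
```

With n frames the first strategy follows n(n-1)/2 links and the second follows n, which is why a deep stack (such as one built up on the way to a StackOverflowError) makes the difference so visible.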


From david.holmes at oracle.com  Thu Jul  5 22:40:06 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 08:40:06 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
Message-ID: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>

Hi Chris,

Thanks for looking at this.

Updated webrev:

http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/

Only real changes in ji05t001.c. (And fixed typo in the new test)

More below ...

On 6/07/2018 7:55 AM, Chris Plummer wrote:
> Hi David,
> 
> Solaris problems aside, overall it looks fine. Some minor things I noted:
> 
> I noticed that exitCode is never modified in agentA() or agentB(), so 
> there isn't much point to having it. If you reach the bottom of the 
> function, it passed, so PASSED can be returned. The code would be more 
> clear if it did this. As-is it is implied that you can reach the bottom 
> when it fails.

I resisted any and all urges to do any kind of unrelated code cleanup in 
the tests - once you start you may end up doing a full rewrite.

> Is detaching the threads along the failure paths really needed? exit() 
> is called, so this would seem to make it unnecessary.

You're right that isn't necessary. I'll remove the changes from before 
the exits in ji05t001.c

> I prefer assignments not to be embedded inside the "if" condition. The 
> DetachCurrentThread code in THREAD_return() is much more readable than 
> the similar code in agentA() and agentB().

It's an existing style already used in that test e.g.

  287     if ((res =
  288             JNI_ENV_PTR(vm)->AttachCurrentThread(
  289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {

and I don't mind it, so I'd prefer not to change it.

> In the test:
> 
>    54         // Generally as long as we don't crash of throw unexpected
>    55         // exceptions then the test passes. In some cases we know exactly
> 
> "of" should be "or".

Well spotted. Thanks.

> Shouldn't you be catching exceptions for all the Thread methods you are 
> calling? Otherwise the test will exit if one is thrown, and the above 
> comment indicates that you don't want this.

I'm not expecting there to be any exceptions from any of the called 
methods. That would potentially indicate a problem in handling the 
terminated native thread, so would indicate a test failure.
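
A minimal sketch of that reasoning, assuming an ordinary Java thread as a stand-in for the terminated native thread (the class and method names are hypothetical and not from the webrev). The queries are made with no try/catch around them, so an unexpected exception propagates out and the run fails.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class TerminatedThreadQueries {

    // Runs a thread to completion and returns it. ASSUMPTION: a plain Java
    // thread stands in for the native thread that attached and then exited.
    static Thread startAndFinish() {
        Thread t = new Thread(() -> { }, "stand-in");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return t;
    }

    // Deliberately no try/catch: an exception here is treated as a failure.
    static long cpuTimeOf(Thread t) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.getThreadCpuTime(t.getId());
    }

    public static void main(String[] args) {
        Thread t = startAndFinish();
        System.out.println("alive:    " + t.isAlive());
        System.out.println("state:    " + t.getState());
        System.out.println("cpu time: " + cpuTimeOf(t));
    }
}
```

ThreadMXBean.getThreadCpuTime is documented to return -1 for a thread that is not alive, which lines up with the -1 the fix reports for the cpu usage in the ESRCH case.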

> Don't we normally put these tests in a package?

Doesn't seem to be any hard and fast rule. I only use packages when 
they are important for the test. In runtime we have 905 java files and 
only 116 have a package statement. It varies elsewhere.

Thanks,
David

> thanks,
> 
> Chris
> 
> On 7/5/18 2:58 AM, David Holmes wrote:
>> <sigh> Solaris compiler complains about doing a return from inside a 
>> do-while loop. I'll have to rework part of the fix tomorrow.
>>
>> David
>>
>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>
>>> Problem:
>>>
>>> The tests create native threads that attach to the VM through 
>>> JNI_AttachCurrentThread but which then terminate without detaching 
>>> themselves. When the VM exits and we're using Flight Recorder 
>>> "dumponexit" this leads to a call to VM_PrintThreads that in part 
>>> wants to print the per-thread CPU usage. When we encounter the 
>>> threads that have terminated already the low level 
>>> pthread_getcpuclockid call returns ESRCH but the code doesn't expect 
>>> that and so fails an assert in debug mode and can SEGV in product mode.
>>>
>>> Solution:
>>>
>>> Serviceability-side: fix the tests
>>>
>>> Change the tests so that the threads detach before terminating. The 
>>> two tests are (surprisingly) written in completely different styles, 
>>> so the solution also takes on two different styles.
>>>
>>> Runtime-side: make the VM more robust in the face of JNI attached 
>>> threads that terminate before detaching, and add a regression test
>>>
>>> I took a good look at the low-level code for interacting with 
>>> arbitrary threads and as far as I can see the problem only exists for 
>>> this one case of pthread_getcpuclockid on Linux. Elsewhere the 
>>> potential for a library call failure just reports an error value 
>>> (such as -1 for the cpu time used).
>>>
>>> So the fix is simply to allow for ESRCH when calling 
>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>
>>> I created a new regression test to create a new native thread, attach 
>>> it and then let it terminate while still attached. The java code then 
>>> calls various Thread and ThreadMXBean functions on it to ensure there 
>>> are no crashes or unexpected exceptions.
>>>
>>> Testing:
>>>   - old tests with fixed run-time
>>>   - old run-time with fixed tests
>>>   - mach tier4 (which exposed the problem - that's where we enable 
>>> Flight recorder for the tests) [in progress]
>>>   - mach5 tier 1-3 for good measure [in progress]
>>>   - new regression test
>>>
>>> Thanks,
>>> David
> 
> 
> 

From chris.plummer at oracle.com  Thu Jul  5 23:00:39 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 5 Jul 2018 16:00:39 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
Message-ID: <e6177f45-420f-df33-5ac3-0a72cf142740@oracle.com>

Hi David,

Looks good. Regarding the test being in a package, looks like this was 
the convention for the nsk tests, so that's why I noted it.

thanks,

Chris

On 7/5/18 3:40 PM, David Holmes wrote:
> Hi Chris,
>
> Thanks for looking at this.
>
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>
> Only real changes in ji05t001.c. (And fixed typo in the new test)
>
> More below ...
>
> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>> Hi David,
>>
>> Solaris problems aside, overall it looks fine. Some minor things I 
>> noted:
>>
>> I noticed that exitCode is never modified in agentA() or agentB(), so 
>> there isn't much point to having it. If you reach the bottom of the 
>> function, it passed, so PASSED can be returned. The code would be 
>> more clear if it did this. As-is it is implied that you can reach the 
>> bottom when it fails.
>
> I resisted any and all urges to do any kind of unrelated code cleanup 
> in the tests - once you start you may end up doing a full rewrite.
>
>> Is detaching the threads along the failure paths really needed? 
>> exit() is called, so this would seem to make it unnecessary.
>
> You're right that isn't necessary. I'll remove the changes from before 
> the exits in ji05t001.c
>
>> I prefer assignments not to be embedded inside the "if" condition. 
>> The DetachCurrentThread code in THREAD_return() is much more readable 
>> than the similar code in agentA() and agentB().
>
> It's an existing style already used in that test e.g.
>
>   287     if ((res =
>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>
> and I don't mind it, so I'd prefer not to change it.
>
>> In the test:
>>
>>    54         // Generally as long as we don't crash of throw unexpected
>>    55         // exceptions then the test passes. In some cases we know exactly
>>
>> "of" should be "or".
>
> Well spotted. Thanks.
>
>> Shouldn't you be catching exceptions for all the Thread methods you 
>> are calling? Otherwise the test will exit if one is thrown, and the 
>> above comment indicates that you don't want this.
>
> I'm not expecting there to be any exceptions from any of the called 
> methods. That would potentially indicate a problem in handling the 
> terminated native thread, so would indicate a test failure.
>
>> Don't we normally put these tests in a package?
>
> Doesn't seem to be any hard and fast rule. I only use packages when 
> they are important for the test. In runtime we have 905 java files and 
> only 116 have a package statement. It varies elsewhere.
>
> Thanks,
> David
>
>> thanks,
>>
>> Chris
>>
>> On 7/5/18 2:58 AM, David Holmes wrote:
>>> <sigh> Solaris compiler complains about doing a return from inside a 
>>> do-while loop. I'll have to rework part of the fix tomorrow.
>>>
>>> David
>>>
>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>
>>>> Problem:
>>>>
>>>> The tests create native threads that attach to the VM through 
>>>> JNI_AttachCurrentThread but which then terminate without detaching 
>>>> themselves. When the VM exits and we're using Flight Recorder 
>>>> "dumponexit" this leads to a call to VM_PrintThreads that in part 
>>>> wants to print the per-thread CPU usage. When we encounter the 
>>>> threads that have terminated already the low level 
>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't 
>>>> expect that and so fails an assert in debug mode and can SEGV in 
>>>> product mode.
>>>>
>>>> Solution:
>>>>
>>>> Serviceability-side: fix the tests
>>>>
>>>> Change the tests so that the threads detach before terminating. The 
>>>> two tests are (surprisingly) written in completely different 
>>>> styles, so the solution also takes on two different styles.
>>>>
>>>> Runtime-side: make the VM more robust in the face of JNI attached 
>>>> threads that terminate before detaching, and add a regression test
>>>>
>>>> I took a good look at the low-level code for interacting with 
>>>> arbitrary threads and as far as I can see the problem only exists 
>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the 
>>>> potential for a library call failure just reports an error value 
>>>> (such as -1 for the cpu time used).
>>>>
>>>> So the fix is simply to allow for ESRCH when calling 
>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>
>>>> I created a new regression test to create a new native thread, 
>>>> attach it and then let it terminate while still attached. The java 
>>>> code then calls various Thread and ThreadMXBean functions on it to 
>>>> ensure there are no crashes or unexpected exceptions.
>>>>
>>>> Testing:
>>>>   - old tests with fixed run-time
>>>>   - old run-time with fixed tests
>>>>   - mach tier4 (which exposed the problem - that's where we enable 
>>>> Flight recorder for the tests) [in progress]
>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>   - new regression test
>>>>
>>>> Thanks,
>>>> David
>>
>>
>>


From serguei.spitsyn at oracle.com  Thu Jul  5 23:26:40 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Jul 2018 16:26:40 -0700
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <5B3E2FC7.1060303@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
Message-ID: <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180705/0a8e744e/attachment.html>

From serguei.spitsyn at oracle.com  Fri Jul  6 02:17:36 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Jul 2018 19:17:36 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
Message-ID: <7b8e9622-b4a1-78c1-6228-dd9c9e845e30@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180705/0b9905dd/attachment.html>

From serguei.spitsyn at oracle.com  Fri Jul  6 02:35:54 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 5 Jul 2018 19:35:54 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <7b8e9622-b4a1-78c1-6228-dd9c9e845e30@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <7b8e9622-b4a1-78c1-6228-dd9c9e845e30@oracle.com>
Message-ID: <332c6e87-aba9-0fb5-4b41-4e80507792bb@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180705/774b6c61/attachment-0001.html>

From david.holmes at oracle.com  Fri Jul  6 05:21:11 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 15:21:11 +1000
Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times out
 with Xcomp on sparc
Message-ID: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>

Bug: https://bugs.openjdk.java.net/browse/JDK-8205966
webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/

One of the @run variants was taking around 15x longer to execute. That 
variant uses the InMemoryJavaCompiler which involves a lot of classes 
and code execution. The test was enabling method entry event generation 
for all of main, resulting in the massive slowdown.

The fix is to add a new breakpoint() function that gets called after the 
in-memory compilation setup is done, and we initially run the test to 
that point before enabling the events.
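
The shape of that change can be simulated in plain Java. This is only a sketch under stated assumptions: the class, the enter() bookkeeping and the flag are hypothetical stand-ins for the debugger's event request, and no JDI calls are made. In the real test the marker method would be empty and the debugger would enable method entry events once its breakpoint there is hit.

```java
import java.util.ArrayList;
import java.util.List;

public class BreakpointMarkerSketch {

    // Stand-in for the debugger's "method entry events enabled" state.
    private static boolean eventsEnabled;
    // Records the method entries that would have generated events.
    private static final List<String> observed = new ArrayList<>();

    // Simulates the VM reporting a method entry to the debugger.
    private static void enter(String method) {
        if (eventsEnabled) {
            observed.add(method);
        }
    }

    // Stand-in for the in-memory compilation setup: many method entries
    // that the test has no interest in.
    private static void expensiveSetup() {
        for (int i = 0; i < 10_000; i++) {
            enter("setupHelper" + i);
        }
    }

    // Marker method. The simulation flips the flag directly; a debugger
    // would do the equivalent after stopping at a breakpoint set here.
    private static void breakpoint() {
        eventsEnabled = true;
    }

    private static void methodUnderTest() {
        enter("methodUnderTest");
    }

    // Runs the simulated target and returns the entries that were observed.
    static List<String> run() {
        eventsEnabled = false;
        observed.clear();
        expensiveSetup();
        breakpoint();
        methodUnderTest();
        return new ArrayList<>(observed);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

None of the 10,000 setup entries is observed; only the entry made after the marker is recorded.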

The problem @run now only takes 2x the other tests and so should avoid 
the timeouts.

Testing: mach5 tier4 solaris-sparc
          mach5 tier 1-3

Thanks,
David

From mikael.vidstedt at oracle.com  Fri Jul  6 05:29:28 2018
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 5 Jul 2018 22:29:28 -0700
Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times
 out with Xcomp on sparc
In-Reply-To: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>
References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>
Message-ID: <1EAE1E8D-6DF4-4930-9494-16E6428819C5@oracle.com>


Looks good. Nice speedup!

Cheers,
Mikael


> On Jul 5, 2018, at 10:21 PM, David Holmes <david.holmes at oracle.com> wrote:
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966
> webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/
> 
> One of the @run variants was taking around 15x longer to execute. That variant uses the InMemoryJavaCompiler which involves a lot of classes and code execution. The test was enabling method entry event generation for all of main, resulting in the massive slowdown.
> 
> The fix is to add a new breakpoint() function that gets called after the in-memory compilation setup is done, and we initially run the test to that point before enabling the events.
> 
> The problem @run now only takes 2x the other tests and so should avoid the timeouts.
> 
> Testing: mach5 tier4 solaris-sparc
>         mach5 tier 1-3
> 
> Thanks,
> David


From david.holmes at oracle.com  Fri Jul  6 05:48:54 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 15:48:54 +1000
Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times
 out with Xcomp on sparc
In-Reply-To: <1EAE1E8D-6DF4-4930-9494-16E6428819C5@oracle.com>
References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>
 <1EAE1E8D-6DF4-4930-9494-16E6428819C5@oracle.com>
Message-ID: <ad6c8d54-9250-802c-7844-03092c653aae@oracle.com>

On 6/07/2018 3:29 PM, Mikael Vidstedt wrote:
> Looks good. Nice speedup!

Thanks for looking at it Mikael!

Still the second longest test in com/sun/jdi at 21 minutes!!!

David

> Cheers,
> Mikael
> 
> 
>> On Jul 5, 2018, at 10:21 PM, David Holmes <david.holmes at oracle.com> wrote:
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966
>> webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/
>>
>> One of the @run variants was taking around 15x longer to execute. That variant uses the InMemoryJavaCompiler which involves a lot of classes and code execution. The test was enabling method entry event generation for all of main, resulting in the massive slowdown.
>>
>> The fix is to add a new breakpoint() function that gets called after the in-memory compilation setup is done, and we initially run the test to that point before enabling the events.
>>
>> The problem @run now only takes 2x the other tests and so should avoid the timeouts.
>>
>> Testing: mach5 tier4 solaris-sparc
>>          mach5 tier 1-3
>>
>> Thanks,
>> David
> 

From david.holmes at oracle.com  Fri Jul  6 08:07:37 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 6 Jul 2018 18:07:37 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
Message-ID: <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>

<sigh> The new test is hanging on Solaris. I just discovered we don't 
run these tests on Solaris until tier4.

David

On 6/07/2018 8:40 AM, David Holmes wrote:
> Hi Chris,
> 
> Thanks for looking at this.
> 
> Updated webrev:
> 
> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
> 
> Only real changes in ji05t001.c. (And fixed typo in the new test)
> 
> More below ...
> 
> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>> Hi David,
>>
>> Solaris problems aside, overall it looks fine. Some minor things I noted:
>>
>> I noticed that exitCode is never modified in agentA() or agentB(), so 
>> there isn't much point to having it. If you reach the bottom of the 
>> function, it passed, so PASSED can be returned. The code would be more 
>> clear if it did this. As-is it is implied that you can reach the 
>> bottom when it fails.
> 
> I resisted any and all urges to do any kind of unrelated code cleanup in 
> the tests - once you start you may end up doing a full rewrite.
> 
>> Is detaching the threads along the failure paths really needed? exit() 
>> is called, so this would seem to make it unnecessary.
> 
> You're right that isn't necessary. I'll remove the changes from before 
> the exits in ji05t001.c
> 
>> I prefer assignments not to be embedded inside the "if" condition. The 
>> DetachCurrentThread code in THREAD_return() is much more readable than 
>> the similar code in agentA() and agentB().
> 
> It's an existing style already used in that test e.g.
> 
>   287     if ((res =
>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
> 
> and I don't mind it, so I'd prefer not to change it.
> 
>> In the test:
>>
>>    54         // Generally as long as we don't crash of throw unexpected
>>    55         // exceptions then the test passes. In some cases we know exactly
>>
>> "of" should be "or".
> 
> Well spotted. Thanks.
> 
>> Shouldn't you be catching exceptions for all the Thread methods you 
>> are calling? Otherwise the test will exit if one is thrown, and the 
>> above comment indicates that you don't want this.
> 
> I'm not expecting there to be any exceptions from any of the called 
> methods. That would potentially indicate a problem in handling the 
> terminated native thread, so would indicate a test failure.
> 
>> Don't we normally put these tests in a package?
> 
> Doesn't seem to be any hard and fast rule. I only use packages when 
> they are important for the test. In runtime we have 905 java files and 
> only 116 have a package statement. It varies elsewhere.
> 
> Thanks,
> David
> 
>> thanks,
>>
>> Chris
>>
>> On 7/5/18 2:58 AM, David Holmes wrote:
>>> <sigh> Solaris compiler complains about doing a return from inside a 
>>> do-while loop. I'll have to rework part of the fix tomorrow.
>>>
>>> David
>>>
>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>
>>>> Problem:
>>>>
>>>> The tests create native threads that attach to the VM through 
>>>> JNI_AttachCurrentThread but which then terminate without detaching 
>>>> themselves. When the VM exits and we're using Flight Recorder 
>>>> "dumponexit" this leads to a call to VM_PrintThreads that in part 
>>>> wants to print the per-thread CPU usage. When we encounter the 
>>>> threads that have terminated already the low level 
>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't 
>>>> expect that and so fails an assert in debug mode and can SEGV in 
>>>> product mode.
>>>>
>>>> Solution:
>>>>
>>>> Serviceability-side: fix the tests
>>>>
>>>> Change the tests so that the threads detach before terminating. The 
>>>> two tests are (surprisingly) written in completely different styles, 
>>>> so the solution also takes on two different styles.
>>>>
>>>> Runtime-side: make the VM more robust in the face of JNI attached 
>>>> threads that terminate before detaching, and add a regression test
>>>>
>>>> I took a good look at the low-level code for interacting with 
>>>> arbitrary threads and as far as I can see the problem only exists 
>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the 
>>>> potential for a library call failure just reports an error value 
>>>> (such as -1 for the cpu time used).
>>>>
>>>> So the fix is simply to allow for ESRCH when calling 
>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>
>>>> I created a new regression test to create a new native thread, 
>>>> attach it and then let it terminate while still attached. The java 
>>>> code then calls various Thread and ThreadMXBean functions on it to 
>>>> ensure there are no crashes or unexpected exceptions.
>>>>
>>>> Testing:
>>>>   - old tests with fixed run-time
>>>>   - old run-time with fixed tests
>>>>   - mach tier4 (which exposed the problem - that's where we enable 
>>>> Flight recorder for the tests) [in progress]
>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>   - new regression test
>>>>
>>>> Thanks,
>>>> David
>>
>>
>>

From gary.adams at oracle.com  Fri Jul  6 12:54:58 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 06 Jul 2018 08:54:58 -0400
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
 <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com>
Message-ID: <5B3F66A2.9020803@oracle.com>

Yes, as part of the testing I went back to include windows-x64-debug
and found the crashes are removed by this simplification that
removes VirtualMachineManager. Once this fix is pushed, I'll close
8197938 as a duplicate.

On 7/5/18, 7:26 PM, serguei.spitsyn at oracle.com wrote:
> Hi Gary,
>
> One thing is not clear.
> The 8206007 is linked to the 8197938 which tags this test in the 
> ProblemList.txt.
> This line is removed:
> -vmTestbase/nsk/jdb/exclude/exclude001/exclude001.java 8197938 windows-all
>
>
> but the bug 8197938 is still open.
> Is it intentional or some kind of a typo?
> Or maybe we have to close the 8197938 as a dup of 8206007?
>
> Otherwise, the fix looks good to me.
>
> Thank you for the extra testing!
>
> Thanks,
> Serguei
>
>
> On 7/5/18 07:48, Gary Adams wrote:
>> A simple test run using "exclude none" shows 625K methods are being 
>> observed.
>> The bulk of those methods were due to the last class accessed in the 
>> test - VirtualMachineManager.
>>
>> It's not important that this particular call is used. The test is 
>> simply demonstrating that
>> filters work for other packages than java and javax.
>>
>> This proposed fix uses a simpler lookup for GregorianCalendar.
>>
>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
>>   Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/
>


From gary.adams at oracle.com  Fri Jul  6 12:55:19 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 06 Jul 2018 08:55:19 -0400
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <e81f2757-b5a9-25c3-be0d-d71531dcb3a5@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
 <e81f2757-b5a9-25c3-be0d-d71531dcb3a5@oracle.com>
Message-ID: <5B3F66B7.4000400@oracle.com>

This change reduces the test by ~180K method observations (29%).
It also depends on less complicated methods; VirtualMachineManager, for
example, deals with more class and service loaders.

On 7/5/18, 5:28 PM, Chris Plummer wrote:
> Hi Gary,
>
> The changes look good. How much does this reduce execution time by?
>
> thanks,
>
> Chris
>
> On 7/5/18 7:48 AM, Gary Adams wrote:
>> A simple test run using "exclude none" shows 625K methods are being 
>> observed.
>> The bulk of those methods were due to the last class accessed in the 
>> test - VirtualMachineManager.
>>
>> It's not important that this particular call is used. The test is 
>> simply demonstrating that
>> filters work for other packages than java and javax.
>>
>> This proposed fix uses a simpler lookup for GregorianCalendar.
>>
>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
>>   Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/
>
>
>


From serguei.spitsyn at oracle.com  Fri Jul  6 16:25:58 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Jul 2018 09:25:58 -0700
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <5B3F66A2.9020803@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
 <1a60027a-f7c2-d892-997f-655fe9612718@oracle.com>
 <5B3F66A2.9020803@oracle.com>
Message-ID: <40cbc6d4-4472-f500-3fb9-c70f53d496b2@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180706/abca0934/attachment.html>

From serguei.spitsyn at oracle.com  Fri Jul  6 17:47:57 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 6 Jul 2018 10:47:57 -0700
Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times
 out with Xcomp on sparc
In-Reply-To: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>
References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>
Message-ID: <7c533564-8deb-189c-2065-8cc73be2a785@oracle.com>

Hi David,

It looks good.
I agree with Mikael, it is a nice speedup!

Thanks,
Serguei


On 7/5/18 22:21, David Holmes wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966
> webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/
>
> One of the @run variants was taking around 15x longer to execute. That 
> variant uses the InMemoryJavaCompiler which involves a lot of classes 
> and code execution. The test was enabling method entry event 
> generation for all of main, resulting in the massive slowdown.
>
> The fix is to add a new breakpoint() function that gets called after 
> the in-memory compilation setup is done, and we initially run the test 
> to that point before enabling the events.
>
> The problem @run now only takes 2x the other tests and so should avoid 
> the timeouts.
>
> Testing: mach5 tier4 solaris-sparc
>          mach5 tier 1-3
>
> Thanks,
> David
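The effect of delaying event enablement until breakpoint() can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use JDI, the class MethodEntryGateSketch is hypothetical, and the call counts are made-up inputs rather than measurements from the test.

```java
// Hypothetical simulation: counts "method entry" notifications the way a
// debugger would see them, but only while events are enabled.
final class MethodEntryGateSketch {
    private boolean eventsEnabled;
    private long observed;

    // Stands in for one method entry in the debuggee.
    private void enter() {
        if (eventsEnabled) {
            observed++;
        }
    }

    // setupCalls models the in-memory compilation setup; testCalls models
    // the part of main that the test actually wants to observe.
    static long observedEvents(int setupCalls, int testCalls, boolean enableAtStart) {
        MethodEntryGateSketch g = new MethodEntryGateSketch();
        g.eventsEnabled = enableAtStart;
        for (int i = 0; i < setupCalls; i++) {
            g.enter();
        }
        g.eventsEnabled = true; // the point where breakpoint() would be reached
        for (int i = 0; i < testCalls; i++) {
            g.enter();
        }
        return g.observed;
    }
}
```

With events enabled from the start, every setup call is reported; with events enabled only at the breakpoint, the debugger sees just the calls made afterwards.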


From chris.plummer at oracle.com  Fri Jul  6 19:17:03 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 6 Jul 2018 12:17:03 -0700
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <5B3F66B7.4000400@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
 <e81f2757-b5a9-25c3-be0d-d71531dcb3a5@oracle.com>
 <5B3F66B7.4000400@oracle.com>
Message-ID: <308e0c82-514f-a63c-89a0-1c9e983d549a@oracle.com>

Ok. It still seems like an awful lot of methods being invoked, but it is a 
nice improvement.

Chris

On 7/6/18 5:55 AM, Gary Adams wrote:
> This change reduces the test by ~180K method observations (29%).
> It also depends on less complicated methods; e.g., VirtualMachineManager
> deals with more class and service loaders.
>
> On 7/5/18, 5:28 PM, Chris Plummer wrote:
>> Hi Gary,
>>
>> The changes look good. How much does this reduce the execution time by?
>>
>> thanks,
>>
>> Chris
>>
>> On 7/5/18 7:48 AM, Gary Adams wrote:
>>> A simple test run using "exclude none" shows 625K methods are being 
>>> observed.
>>> The bulk of those methods were due to the last class accessed in the 
>>> test - VirtualMachineManager.
>>>
>>> It's not important that this particular call is used. The test is 
>>> simply demonstrating that
>>> filters work for other packages than java and javax.
>>>
>>> This proposed fix uses a simpler lookup for GregorianCalendar.
>>>
>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
>>>   Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/
>>
>>
>>
>


From david.holmes at oracle.com  Sun Jul  8 23:58:32 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 9 Jul 2018 09:58:32 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
Message-ID: <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>

tl;dr skip the new regression test on Solaris

New webrev:

http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/

This excludes the test from running on Solaris, so the makefile doesn't 
bother compiling this native test and the Java part of the test adds:

! * @requires os.family != "windows" & os.family != "solaris"
   * @summary Basic test of Thread and ThreadMXBean queries on a natively
   *          attached thread that has failed to detach before terminating.
+ * @comment The native code only supports POSIX so no windows testing; also
+ *          we have to skip solaris as a terminating thread that fails to
+ *          detach will hit an infinite loop due to TLS destructor issues - see
+ *          comments in JDK-8156708

Note this means that Solaris is not affected by the original issue 
because a still-attached native thread can't actually terminate due to 
the TLS destructor infinite-loop issue.

Thanks,
David

On 6/07/2018 6:07 PM, David Holmes wrote:
> <sigh> The new test is hanging on Solaris. I just discovered we don't 
> run these tests on Solaris until tier4.
> 
> David
> 
> On 6/07/2018 8:40 AM, David Holmes wrote:
>> Hi Chris,
>>
>> Thanks for looking at this.
>>
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>
>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>
>> More below ...
>>
>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>> Hi David,
>>>
>>> Solaris problems aside, overall it looks fine. Some minor things I 
>>> noted:
>>>
>>> I noticed that exitCode is never modified in agentA() or agentB(), so 
>>> there isn't much point to having it. If you reach the bottom of the 
>>> function, it passed, so PASSED can be returned. The code would be 
>>> more clear if it did this. As-is it is implied that you can reach the 
>>> bottom when it fails.
>>
>> I resisted any and all urges to do any kind of unrelated code cleanup 
>> in the tests - once you start you may end up doing a full rewrite.
>>
>>> Is detaching the threads along the failure paths really needed? 
>>> exit() is called, so this would seem to make it unnecessary.
>>
>> You're right that isn't necessary. I'll remove the changes from before 
>> the exits in ji05t001.c
>>
>>> I prefer assignments not to be embedded inside the "if" condition. 
>>> The DetachCurrentThread code in THREAD_return() is much more readable 
>>> than the similar code in agentA() and agentB().
>>
>> It's an existing style already used in that test e.g.
>>
>>     if ((res =
>>             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>
>> and I don't mind it, so I'd prefer not to change it.
>>
>>> In the test:
>>>
>>>         // Generally as long as we don't crash of throw unexpected
>>>         // exceptions then the test passes. In some cases we know exactly
>>>
>>> "of" should be "or".
>>
>> Well spotted. Thanks.
>>
>>> Shouldn't you be catching exceptions for all the Thread methods you 
>>> are calling? Otherwise the test will exit if one is thrown, and the 
>>> above comment indicates that you don't want this.
>>
>> I'm not expecting there to be any exceptions from any of the called 
>> methods. That would potentially indicate a problem in handling the 
>> terminated native thread, so would indicate a test failure.
>>
>>> Don't we normally put these tests in a package?
>>
>> Doesn't seem to be any hard and fast rule. I only use packages when 
>> they are important for the test. In runtime we have 905 java files and 
>> only 116 have a package statement. It varies elsewhere.
>>
>> Thanks,
>> David
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>> <sigh> Solaris compiler complains about doing a return from inside a 
>>>> do-while loop. I'll have to rework part of the fix tomorrow.
>>>>
>>>> David
>>>>
>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>
>>>>> Problem:
>>>>>
>>>>> The tests create native threads that attach to the VM through 
>>>>> JNI_AttachCurrentThread but which then terminate without detaching 
>>>>> themselves. When the VM exits and we're using Flight Recorder 
>>>>> "dumponexit" this leads to a call to VM_PrintThreads that in part 
>>>>> wants to print the per-thread CPU usage. When we encounter the 
>>>>> threads that have terminated already the low level 
>>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't 
>>>>> expect that and so fails an assert in debug mode and can SEGV in 
>>>>> product mode.
>>>>>
>>>>> Solution:
>>>>>
>>>>> Serviceability-side: fix the tests
>>>>>
>>>>> Change the tests so that the threads detach before terminating. The 
>>>>> two tests are (surprisingly) written in completely different 
>>>>> styles, so the solution also takes on two different styles.
>>>>>
>>>>> Runtime-side: make the VM more robust in the face of JNI attached 
>>>>> threads that terminate before detaching, and add a regression test
>>>>>
>>>>> I took a good look at the low-level code for interacting with 
>>>>> arbitrary threads and as far as I can see the problem only exists 
>>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere the 
>>>>> potential for a library call failure just reports an error value 
>>>>> (such as -1 for the cpu time used).
>>>>>
>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>>
>>>>> I created a new regression test to create a new native thread, 
>>>>> attach it and then let it terminate while still attached. The java 
>>>>> code then calls various Thread and ThreadMXBean functions on it to 
>>>>> ensure there are no crashes or unexpected exceptions.
>>>>>
>>>>> Testing:
>>>>>   - old tests with fixed run-time
>>>>>   - old run-time with fixed tests
>>>>>   - mach tier4 (which exposed the problem - that's where we enable
>>>>>     Flight recorder for the tests) [in progress]
>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>   - new regression test
>>>>>
>>>>> Thanks,
>>>>> David
>>>
>>>
>>>
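The "-1 for the cpu time used" convention mentioned above is also what the Java-level API documents: ThreadMXBean.getThreadCpuTime returns -1 when the thread with the given ID is not alive. The sketch below shows that behaviour with an ordinary Java thread. It is an illustration only: it does not attach a native thread through JNI, so it does not reproduce the original crash, and the class name TerminatedThreadCpuTimeSketch is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class TerminatedThreadCpuTimeSketch {
    // Queries the CPU time of a thread that has already finished running.
    static long cpuTimeAfterExit() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Thread t = new Thread(() -> { /* no work */ });
        t.start();
        try {
            t.join(); // the thread is terminated from here on
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (!mx.isThreadCpuTimeSupported()) {
            return -1; // nothing to query on this VM
        }
        // Documented to return -1 for a thread that is not alive.
        return mx.getThreadCpuTime(t.getId());
    }
}
```

A caller that prints per-thread CPU usage can treat the -1 result as "not available" instead of assuming every thread ID still maps to a live thread.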

From david.holmes at oracle.com  Sun Jul  8 23:59:10 2018
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 9 Jul 2018 09:59:10 +1000
Subject: (11) RFR (XS): 8205966: [testbug] New Nestmates JDI test times
 out with Xcomp on sparc
In-Reply-To: <7c533564-8deb-189c-2065-8cc73be2a785@oracle.com>
References: <78907b9e-9e75-a218-c8b7-2b8bdbdb7779@oracle.com>
 <7c533564-8deb-189c-2065-8cc73be2a785@oracle.com>
Message-ID: <e58baa48-3991-6228-0ae8-a7fc76460e38@oracle.com>

Thanks Serguei!

David

On 7/07/2018 3:47 AM, serguei.spitsyn at oracle.com wrote:
> Hi David,
> 
> It looks good.
> I agree with Mikael, it is a nice speedup!
> 
> Thanks,
> Serguei
> 
> 
> 
> On 7/5/18 22:21, David Holmes wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205966
>> webrev: http://cr.openjdk.java.net/~dholmes/8205966/webrev/
>>
>> One of the @run variants was taking around 15x longer to execute. That 
>> variant uses the InMemoryJavaCompiler which involves a lot of classes 
>> and code execution. The test was enabling method entry event 
>> generation for all of main, resulting in the massive slowdown.
>>
>> The fix is to add a new breakpoint() function that gets called after 
>> the in-memory compilation setup is done, and we initially run the test 
>> to that point before enabling the events.
>>
>> The problem @run now only takes 2x the other tests and so should avoid 
>> the timeouts.
>>
>> Testing: mach5 tier4 solaris-sparc
>>          mach5 tier 1-3
>>
>> Thanks,
>> David
> 

From ralf.schmelter at sap.com  Mon Jul  9 14:04:34 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Mon, 9 Jul 2018 14:04:34 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
Message-ID: <21e17c666ac04930a0e4bb4869e989da@sap.com>

Hi Chris,

thanks for the review.

> What testing have you done?

I've tested the change by debugging by hand in Eclipse and jdb, and by running the com/sun/jdi jtreg tests and the JDWP JCK tests. Analogous code has been running in the SAP JVM for many years.


> How long does this test take to run?

15 s according to jtreg. 


> What happens if for some reason SOE is never thrown? It's not clear to 
> me what the script would do in this case.

It is treated as passed (which is not ideal).


> In answer to the ShellScaffold.sh question, there is already work 
> underway to convert to pure java tests. See JDK-8201652.

Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done.

Best regards,
Ralf 


-----Original Message-----
From: Chris Plummer [mailto:chris.plummer at oracle.com] 
Sent: Freitag, 6. Juli 2018 00:37
To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Ralf,

Overall looks good, but I do have a few comments and questions.

Please update the copyright.

What testing have you done?

How long does this test take to run?

What happens if for some reason SOE is never thrown? It's not clear to 
me what the script would do in this case.
In answer to the ShellScaffold.sh question, there is already work 
underway to convert to pure java tests. See JDK-8201652. I'm not certain 
if it is ok for you to just submit this new shell script, or if it should 
be rewritten in pure java. Most of the work to convert the scripts has 
already been done but was put on hold. Maybe Serguei can comment and 
guide you on how it would be done in java.

thanks,

Chris

On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
> Hi All,
>
> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>
> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>
> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>
> Best regards,
> Ralf Schmelter
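The difference between quadratic and linear frame collection can be made concrete with a simple cost model. This is a sketch under an assumption that is not taken from the webrev: reaching the frame at depth d costs d + 1 steps because the walk restarts from the top of the stack each time. The class FrameWalkCostSketch and its methods are hypothetical.

```java
// Hypothetical cost model: reaching the frame at depth d requires walking
// d + 1 frames from the top of the stack.
final class FrameWalkCostSketch {
    // One lookup per depth, each restarting from the top:
    // 1 + 2 + ... + n = n * (n + 1) / 2 steps.
    static long perFrameLookupSteps(int frameCount) {
        long steps = 0;
        for (int depth = 0; depth < frameCount; depth++) {
            steps += depth + 1;
        }
        return steps;
    }

    // A single pass that records every frame into a buffer: n steps,
    // at the price of memory proportional to the number of frames.
    static long singlePassSteps(int frameCount) {
        long[] buffer = new long[frameCount];
        long steps = 0;
        for (int depth = 0; depth < frameCount; depth++) {
            buffer[depth] = depth; // stands in for storing one frame's data
            steps++;
        }
        return steps;
    }
}
```

For a stack of 1000 frames the model gives 500500 steps for per-frame lookups against 1000 for the single pass, which is the kind of gap that turns minutes into a short run on very deep stacks.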


From chris.plummer at oracle.com  Mon Jul  9 18:22:46 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Jul 2018 11:22:46 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
Message-ID: <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>

Hi David,

Would it be better to problem list this test on solaris using 
JDK-8156708? That way when JDK-8156708 is fixed it can come off the 
problem list and start executing on solaris.

thanks,

Chris

On 7/8/18 4:58 PM, David Holmes wrote:
> tl;dr skip the new regression test on Solaris
>
> New webrev:
>
> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>
> This excludes the test from running on Solaris, so the makefile 
> doesn't bother compiling this native test and the Java part of the 
> test adds:
>
> ! * @requires os.family != "windows" & os.family != "solaris"
>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>   *          attached thread that has failed to detach before terminating.
> + * @comment The native code only supports POSIX so no windows testing; also
> + *          we have to skip solaris as a terminating thread that fails to
> + *          detach will hit an infinite loop due to TLS destructor issues - see
> + *          comments in JDK-8156708
>
> Note this means that Solaris is not affected by the original issue 
> because a still-attached native thread can't actually terminate due to 
> the TLS destructor infinite-loop issue.
>
> Thanks,
> David
>
> On 6/07/2018 6:07 PM, David Holmes wrote:
>> <sigh> The new test is hanging on Solaris. I just discovered we don't 
>> run these tests on Solaris until tier4.
>>
>> David
>>
>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> Thanks for looking at this.
>>>
>>> Updated webrev:
>>>
>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>
>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>
>>> More below ...
>>>
>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>> Hi David,
>>>>
>>>> Solaris problems aside, overall it looks fine. Some minor things I 
>>>> noted:
>>>>
>>>> I noticed that exitCode is never modified in agentA() or agentB(), 
>>>> so there isn't much point to having it. If you reach the bottom of 
>>>> the function, it passed, so PASSED can be returned. The code would 
>>>> be more clear if it did this. As-is it is implied that you can 
>>>> reach the bottom when it fails.
>>>
>>> I resisted any and all urges to do any kind of unrelated code 
>>> cleanup in the tests - once you start you may end up doing a full 
>>> rewrite.
>>>
>>>> Is detaching the threads along the failure paths really needed? 
>>>> exit() is called, so this would seem to make it unnecessary.
>>>
>>> You're right that isn't necessary. I'll remove the changes from 
>>> before the exits in ji05t001.c
>>>
>>>> I prefer assignments not to be embedded inside the "if" condition. 
>>>> The DetachCurrentThread code in THREAD_return() is much more 
>>>> readable than the similar code in agentA() and agentB().
>>>
>>> It's an existing style already used in that test e.g.
>>>
>>>     if ((res =
>>>             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>
>>> and I don't mind it, so I'd prefer not to change it.
>>>
>>>> In the test:
>>>>
>>>>         // Generally as long as we don't crash of throw unexpected
>>>>         // exceptions then the test passes. In some cases we know exactly
>>>>
>>>> "of" should be "or".
>>>
>>> Well spotted. Thanks.
>>>
>>>> Shouldn't you be catching exceptions for all the Thread methods you 
>>>> are calling? Otherwise the test will exit if one is thrown, and the 
>>>> above comment indicates that you don't want this.
>>>
>>> I'm not expecting there to be any exceptions from any of the called 
>>> methods. That would potentially indicate a problem in handling the 
>>> terminated native thread, so would indicate a test failure.
>>>
>>>> Don't we normally put these tests in a package?
>>>
>>> Doesn't seem to be any hard and fast rule. I only use packages when 
>>> they are important for the test. In runtime we have 905 java files 
>>> and only 116 have a package statement. It varies elsewhere.
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>> <sigh> Solaris compiler complains about doing a return from inside 
>>>>> a do-while loop. I'll have to rework part of the fix tomorrow.
>>>>>
>>>>> David
>>>>>
>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>
>>>>>> Problem:
>>>>>>
>>>>>> The tests create native threads that attach to the VM through 
>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>> detaching themselves. When the VM exits and we're using Flight 
>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads 
>>>>>> that in part wants to print the per-thread CPU usage. When we 
>>>>>> encounter the threads that have terminated already the low level 
>>>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't 
>>>>>> expect that and so fails an assert in debug mode and can SEGV in 
>>>>>> product mode.
>>>>>>
>>>>>> Solution:
>>>>>>
>>>>>> Serviceability-side: fix the tests
>>>>>>
>>>>>> Change the tests so that the threads detach before terminating. 
>>>>>> The two tests are (surprisingly) written in completely different 
>>>>>> styles, so the solution also takes on two different styles.
>>>>>>
>>>>>> Runtime-side: make the VM more robust in the face of JNI attached 
>>>>>> threads that terminate before detaching, and add a regression test
>>>>>>
>>>>>> I took a good look at the low-level code for interacting with 
>>>>>> arbitrary threads and as far as I can see the problem only exists 
>>>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere 
>>>>>> the potential for a library call failure just reports an error 
>>>>>> value (such as -1 for the cpu time used).
>>>>>>
>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>>>
>>>>>> I created a new regression test to create a new native thread, 
>>>>>> attach it and then let it terminate while still attached. The 
>>>>>> java code then calls various Thread and ThreadMXBean functions on 
>>>>>> it to ensure there are no crashes or unexpected exceptions.
>>>>>>
>>>>>> Testing:
>>>>>>   - old tests with fixed run-time
>>>>>>   - old run-time with fixed tests
>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>     enable Flight recorder for the tests) [in progress]
>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>   - new regression test
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>
>>>>
>>>>


From jini.george at oracle.com  Mon Jul  9 18:44:33 2018
From: jini.george at oracle.com (Jini George)
Date: Tue, 10 Jul 2018 00:14:33 +0530
Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
Message-ID: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>

Requesting reviews for enabling SA tests on OS X for Mach5.

https://bugs.openjdk.java.net/browse/JDK-8199700

Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/

The changes mostly add sudo privileges to the SA launchers on OS X when 
Platform.shouldSAAttach() fails. Some tests (those using clhsdb) have been 
refactored to use ClhsdbLauncher for ease of maintenance. This also avoids 
checks for Platform.shouldSAAttach() for corefile-related test cases. More 
details have been provided in JIRA.

Thanks,
Jini.

From david.holmes at oracle.com  Mon Jul  9 21:41:02 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 07:41:02 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
Message-ID: <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>

Hi Chris,

On 10/07/2018 4:22 AM, Chris Plummer wrote:
> Hi David,
> 
> Would it be better to problem list this test on solaris using 
> JDK-8156708? That way when JDK-8156708 is fixed it can come off the 
> problem list and start executing on solaris.

JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could only 
fix this for VM created threads. The general problem of TLS destructors 
looping if a thread terminates without detaching from the VM is not 
solvable - other than by not using TLS in the VM.

Thanks,
David

> thanks,
> 
> Chris
> 
> On 7/8/18 4:58 PM, David Holmes wrote:
>> tl;dr skip the new regression test on Solaris
>>
>> New webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>
>> This excludes the test from running on Solaris, so the makefile 
>> doesn't bother compiling this native test and the Java part of the 
>> test adds:
>>
>> ! * @requires os.family != "windows" & os.family != "solaris"
>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>   *          attached thread that has failed to detach before terminating.
>> + * @comment The native code only supports POSIX so no windows testing; also
>> + *          we have to skip solaris as a terminating thread that fails to
>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>> + *          comments in JDK-8156708
>>
>> Note this means that Solaris is not affected by the original issue 
>> because a still-attached native thread can't actually terminate due to 
>> the TLS destructor infinite-loop issue.
>>
>> Thanks,
>> David
>>
>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>> <sigh> The new test is hanging on Solaris. I just discovered we don't 
>>> run these tests on Solaris until tier4.
>>>
>>> David
>>>
>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> Updated webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>
>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>
>>>> More below ...
>>>>
>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> Solaris problems aside, overall it looks fine. Some minor things I 
>>>>> noted:
>>>>>
>>>>> I noticed that exitCode is never modified in agentA() or agentB(), 
>>>>> so there isn't much point to having it. If you reach the bottom of 
>>>>> the function, it passed, so PASSED can be returned. The code would 
>>>>> be more clear if it did this. As-is it is implied that you can 
>>>>> reach the bottom when it fails.
>>>>
>>>> I resisted any and all urges to do any kind of unrelated code 
>>>> cleanup in the tests - once you start you may end up doing a full 
>>>> rewrite.
>>>>
>>>>> Is detaching the threads along the failure paths really needed? 
>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>
>>>> You're right that isn't necessary. I'll remove the changes from 
>>>> before the exits in ji05t001.c
>>>>
>>>>> I prefer assignments not to be embedded inside the "if" condition. 
>>>>> The DetachCurrentThread code in THREAD_return() is much more 
>>>>> readable than the similar code in agentA() and agentB().
>>>>
>>>> It's an existing style already used in that test e.g.
>>>>
>>>>     if ((res =
>>>>             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>
>>>> and I don't mind it, so I'd prefer not to change it.
>>>>
>>>>> In the test:
>>>>>
>>>>>         // Generally as long as we don't crash of throw unexpected
>>>>>         // exceptions then the test passes. In some cases we know exactly
>>>>>
>>>>> "of" should be "or".
>>>>
>>>> Well spotted. Thanks.
>>>>
>>>>> Shouldn't you be catching exceptions for all the Thread methods you 
>>>>> are calling? Otherwise the test will exit if one is thrown, and the 
>>>>> above comment indicates that you don't want this.
>>>>
>>>> I'm not expecting there to be any exceptions from any of the called 
>>>> methods. That would potentially indicate a problem in handling the 
>>>> terminated native thread, so would indicate a test failure.
>>>>
>>>>> Don't we normally put these tests in a package?
>>>>
>>>> Doesn't seem to be any hard and fast rule. I only use packages when 
>>>> they are important for the test. In runtime we have 905 java files 
>>>> and only 116 have a package statement. It varies elsewhere.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>> <sigh> Solaris compiler complains about doing a return from inside 
>>>>>> a do-while loop. I'll have to rework part of the fix tomorrow.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>
>>>>>>> Problem:
>>>>>>>
>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>> detaching themselves. When the VM exits and we're using Flight 
>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads 
>>>>>>> that in part wants to print the per-thread CPU usage. When we 
>>>>>>> encounter the threads that have terminated already the low level 
>>>>>>> pthread_getcpuclockid call returns ESRCH but the code doesn't 
>>>>>>> expect that and so fails an assert in debug mode and can SEGV in 
>>>>>>> product mode.
>>>>>>>
>>>>>>> Solution:
>>>>>>>
>>>>>>> Serviceability-side: fix the tests
>>>>>>>
>>>>>>> Change the tests so that the threads detach before terminating. 
>>>>>>> The two tests are (surprisingly) written in completely different 
>>>>>>> styles, so the solution also takes on two different styles.
>>>>>>>
>>>>>>> Runtime-side: make the VM more robust in the face of JNI attached 
>>>>>>> threads that terminate before detaching, and add a regression test
>>>>>>>
>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>> arbitrary threads and as far as I can see the problem only exists 
>>>>>>> for this one case of pthread_getcpuclockid on Linux. Elsewhere 
>>>>>>> the potential for a library call failure just reports an error 
>>>>>>> value (such as -1 for the cpu time used).
>>>>>>>
>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that case.
>>>>>>>
>>>>>>> I created a new regression test to create a new native thread, 
>>>>>>> attach it and then let it terminate while still attached. The 
>>>>>>> java code then calls various Thread and ThreadMXBean functions on 
>>>>>>> it to ensure there are no crashes or unexpected exceptions.
>>>>>>>
>>>>>>> Testing:
>>>>>>>   - old tests with fixed run-time
>>>>>>>   - old run-time with fixed tests
>>>>>>>   - mach tier4 (which exposed the problem - that's where we
>>>>>>>     enable Flight recorder for the tests) [in progress]
>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>   - new regression test
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>
>>>>>
>>>>>
> 
> 

From chris.plummer at oracle.com  Mon Jul  9 21:50:09 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 9 Jul 2018 14:50:09 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
Message-ID: <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>

On 7/9/18 2:41 PM, David Holmes wrote:
> Hi Chris,
>
> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>> Hi David,
>>
>> Would it be better to problem list this test on solaris using 
>> JDK-8156708? That way when JDK-8156708 is fixed it can come off the 
>> problem list and start executing on solaris.
>
> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
> only fix this for VM created threads. The general problem of TLS 
> destructors looping if a thread terminates without detaching from the 
> VM is not solvable - other than by not using TLS in the VM.
Ok, I misunderstood your comments in the test.

Changes look fine.

Chris
>
> Thanks,
> David
>
>> thanks,
>>
>> Chris
>>
>> On 7/8/18 4:58 PM, David Holmes wrote:
>>> tl;dr skip the new regression test on Solaris
>>>
>>> New webrev:
>>>
>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>
>>> This excludes the test from running on Solaris, so the makefile 
>>> doesn't bother compiling this native test and the Java part of the 
>>> test adds:
>>>
>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>   *          attached thread that has failed to detach before terminating.
>>> + * @comment The native code only supports POSIX so no windows testing; also
>>> + *          we have to skip solaris as a terminating thread that fails to
>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>> + *          comments in JDK-8156708
>>>
>>> Note this means that Solaris is not affected by the original issue 
>>> because a still-attached native thread can't actually terminate due 
>>> to the TLS destructor infinite-loop issue.
>>>
>>> Thanks,
>>> David
>>>
>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>> don't run these tests on Solaris until tier4.
>>>>
>>>> David
>>>>
>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Thanks for looking at this.
>>>>>
>>>>> Updated webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>
>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>
>>>>> More below ...
>>>>>
>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Solaris problems aside, overall it looks fine. Some minor things 
>>>>>> I noted:
>>>>>>
>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>> agentB(), so there isn't much point to having it. If you reach 
>>>>>> the bottom of the function, it passed, so PASSED can be returned. 
>>>>>> The code would be more clear if it did this. As-is it is implied 
>>>>>> that you can reach the bottom when it fails.
>>>>>
>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>> cleanup in the tests - once you start you may end up doing a full 
>>>>> rewrite.
>>>>>
>>>>>> Is detaching the threads along the failure paths really needed? 
>>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>>
>>>>> You're right that isn't necessary. I'll remove the changes from 
>>>>> before the exits in ji05t001.c
>>>>>
>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>> much more readable than the similar code in agentA() and agentB().
>>>>>
>>>>> It's an existing style already used in that test e.g.
>>>>>
>>>>>   287     if ((res =
>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>
>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>
>>>>>> In the test:
>>>>>>
>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>
>>>>>> "of" should be "or".
>>>>>
>>>>> Well spotted. Thanks.
>>>>>
>>>>>> Shouldn't you be catching exceptions for all the Thread methods 
>>>>>> you are calling? Otherwise the test will exit if one is thrown, 
>>>>>> and the above comment indicates that you don't want this.
>>>>>
>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>> called methods. That would potentially indicate a problem in 
>>>>> handling the terminated native thread, so would indicate a test 
>>>>> failure.
>>>>>
>>>>>> Don't we normally put these tests in a package?
>>>>>
>>>>> Doesn't seem to be any hard and fast rule. I only use packages 
>>>>> when they are important for the test. In runtime we have 905 java 
>>>>> files and only 116 have a package statement. It varies elsewhere.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>> tomorrow.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>
>>>>>>>> Problem:
>>>>>>>>
>>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>>> detaching themselves. When the VM exits and we're using Flight 
>>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads 
>>>>>>>> that in part wants to print the per-thread CPU usage. When we 
>>>>>>>> encounter the threads that have terminated already the low 
>>>>>>>> level pthread_getcpuclockid call returns ESRCH but the code 
>>>>>>>> doesn't expect that and so fails an assert in debug mode and 
>>>>>>>> can SEGV in product mode.
>>>>>>>>
>>>>>>>> Solution:
>>>>>>>>
>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>
>>>>>>>> Change the tests so that the threads detach before terminating. 
>>>>>>>> The two tests are (surprisingly) written in completely 
>>>>>>>> different styles, so the solution also takes on two different 
>>>>>>>> styles.
>>>>>>>>
>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>> regression test
>>>>>>>>
>>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>>> arbitrary threads and as far as I can see the problem only 
>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>> Elsewhere the potential for a library call failure just reports 
>>>>>>>> an error value (such as -1 for the cpu time used).
>>>>>>>>
>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that 
>>>>>>>> case.
>>>>>>>>
>>>>>>>> I created a new regression test to create a new native thread, 
>>>>>>>> attach it and then let it terminate while still attached. The 
>>>>>>>> java code then calls various Thread and ThreadMXBean functions 
>>>>>>>> on it to ensure there are no crashes or unexpected exceptions.
>>>>>>>>
>>>>>>>> Testing:
>>>>>>>>   - old tests with fixed run-time
>>>>>>>>   - old run-time with fixed tests
>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>   - new regression test
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>
>>>>>>
>>>>>>
>>
>>


From david.holmes at oracle.com  Mon Jul  9 22:17:13 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 08:17:13 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
Message-ID: <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>

Thanks Chris!

Can I please get a second review.

David

On 10/07/2018 7:50 AM, Chris Plummer wrote:
> On 7/9/18 2:41 PM, David Holmes wrote:
>> Hi Chris,
>>
>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>> Hi David,
>>>
>>> Would it be better to problem list this test on solaris using 
>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the 
>>> problem list and start executing on solaris.
>>
>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>> only fix this for VM created threads. The general problem of TLS 
>> destructors looping if a thread terminates without detaching from the 
>> VM is not solvable - other than by not using TLS in the VM.
> Ok, I misunderstood your comments in the test.
> 
> Changes look fine.
> 
> Chris
>>
>> Thanks,
>> David
>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>> tl;dr skip the new regression test on Solaris
>>>>
>>>> New webrev:
>>>>
>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>
>>>> This excludes the test from running on Solaris, so the makefile 
>>>> doesn't bother compiling this native test and the Java part of the 
>>>> test adds:
>>>>
>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>   *          attached thread that has failed to detach before terminating.
>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>> + *          comments in JDK-8156708
>>>>
>>>> Note this means that Solaris is not affected by the original issue 
>>>> because a still-attached native thread can't actually terminate due 
>>>> to the TLS destructor infinite-loop issue.
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>> don't run these tests on Solaris until tier4.
>>>>>
>>>>> David
>>>>>
>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> Thanks for looking at this.
>>>>>>
>>>>>> Updated webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>
>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>
>>>>>> More below ...
>>>>>>
>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Solaris problems aside, overall it looks fine. Some minor things 
>>>>>>> I noted:
>>>>>>>
>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>> agentB(), so there isn't much point to having it. If you reach 
>>>>>>> the bottom of the function, it passed, so PASSED can be returned. 
>>>>>>> The code would be more clear if it did this. As-is it is implied 
>>>>>>> that you can reach the bottom when it fails.
>>>>>>
>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>> cleanup in the tests - once you start you may end up doing a full 
>>>>>> rewrite.
>>>>>>
>>>>>>> Is detaching the threads along the failure paths really needed? 
>>>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>>>
>>>>>> You're right that isn't necessary. I'll remove the changes from 
>>>>>> before the exits in ji05t001.c
>>>>>>
>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>> much more readable than the similar code in agentA() and agentB().
>>>>>>
>>>>>> It's an existing style already used in that test e.g.
>>>>>>
>>>>>>   287     if ((res =
>>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>
>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>
>>>>>>> In the test:
>>>>>>>
>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>
>>>>>>> "of" should be "or".
>>>>>>
>>>>>> Well spotted. Thanks.
>>>>>>
>>>>>>> Shouldn't you be catching exceptions for all the Thread methods 
>>>>>>> you are calling? Otherwise the test will exit if one is thrown, 
>>>>>>> and the above comment indicates that you don't want this.
>>>>>>
>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>> called methods. That would potentially indicate a problem in 
>>>>>> handling the terminated native thread, so would indicate a test 
>>>>>> failure.
>>>>>>
>>>>>>> Don't we normally put these tests in a package?
>>>>>>
>>>>>> Doesn't seem to be any hard and fast rule. I only use packages 
>>>>>> when they are important for the test. In runtime we have 905 java 
>>>>>> files and only 116 have a package statement. It varies elsewhere.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>> tomorrow.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>
>>>>>>>>> Problem:
>>>>>>>>>
>>>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>>>> detaching themselves. When the VM exits and we're using Flight 
>>>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads 
>>>>>>>>> that in part wants to print the per-thread CPU usage. When we 
>>>>>>>>> encounter the threads that have terminated already the low 
>>>>>>>>> level pthread_getcpuclockid call returns ESRCH but the code 
>>>>>>>>> doesn't expect that and so fails an assert in debug mode and 
>>>>>>>>> can SEGV in product mode.
>>>>>>>>>
>>>>>>>>> Solution:
>>>>>>>>>
>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>
>>>>>>>>> Change the tests so that the threads detach before terminating. 
>>>>>>>>> The two tests are (surprisingly) written in completely 
>>>>>>>>> different styles, so the solution also takes on two different 
>>>>>>>>> styles.
>>>>>>>>>
>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>> regression test
>>>>>>>>>
>>>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>>>> arbitrary threads and as far as I can see the problem only 
>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>>> Elsewhere the potential for a library call failure just reports 
>>>>>>>>> an error value (such as -1 for the cpu time used).
>>>>>>>>>
>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that 
>>>>>>>>> case.
>>>>>>>>>
>>>>>>>>> I created a new regression test to create a new native thread, 
>>>>>>>>> attach it and then let it terminate while still attached. The 
>>>>>>>>> java code then calls various Thread and ThreadMXBean functions 
>>>>>>>>> on it to ensure there are no crashes or unexpected exceptions.
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>   - new regression test
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>
>>>>>>>
>>>>>>>
>>>
>>>
> 
> 

From alexey.menkov at oracle.com  Mon Jul  9 22:45:41 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 9 Jul 2018 15:45:41 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
 <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
Message-ID: <b508fa01-7803-5b30-c0d7-8bcf6c22807b@oracle.com>

+1

couple minor notes (no need to resend review)

src/hotspot/os/linux/os_linux.cpp
please replace

5581     }
5582     else {

with
     } else {


test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
please fix error reporting (I suppose you mean "TEST ERROR: 
pthread_create failed"/"TEST ERROR: pthread_join failed"):

   85   if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
   86     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
   87     exit(1);
   88   }
   89
   90   if ((res = pthread_join(thread, NULL)) != 0) {
   91     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
   92     exit(1);
   93   }
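
A minimal sketch of how the two messages could read once each names the call that actually failed. The thread body and the helper name run_and_join are hypothetical (the rest of libterminatedThread.c is not shown in this thread), and the helper returns the error code rather than calling exit(1) so that it can be exercised on its own:

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical thread body; the real test attaches to the VM here. */
static void *thread_start(void *arg) {
  (void) arg;
  return NULL;
}

/* Hypothetical helper: create a thread, join it, and report the failing
 * call by its real name. Returns 0 on success, else the pthread error. */
static int run_and_join(void) {
  pthread_t thread;
  int res;

  if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
    fprintf(stderr, "TEST ERROR: pthread_create failed: %s (%d)\n",
            strerror(res), res);
    return res;
  }

  if ((res = pthread_join(thread, NULL)) != 0) {
    fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n",
            strerror(res), res);
    return res;
  }
  return 0;
}
```

The pthread functions return the error number directly instead of setting errno, which is why res is passed to strerror.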

--alex

On 07/09/2018 15:17, David Holmes wrote:
> Thanks Chris!
> 
> Can I please get a second review.
> 
> David
> 
> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>> On 7/9/18 2:41 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>> Hi David,
>>>>
>>>> Would it be better to problem list this test on solaris using 
>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the 
>>>> problem list and start executing on solaris.
>>>
>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>>> only fix this for VM created threads. The general problem of TLS 
>>> destructors looping if a thread terminates without detaching from the 
>>> VM is not solvable - other than by not using TLS in the VM.
>> Ok, I misunderstood your comments in the test.
>>
>> Changes look fine.
>>
>> Chris
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>> tl;dr skip the new regression test on Solaris
>>>>>
>>>>> New webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>
>>>>> This excludes the test from running on Solaris, so the makefile 
>>>>> doesn't bother compiling this native test and the Java part of the 
>>>>> test adds:
>>>>>
>>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>>   *          attached thread that has failed to detach before terminating.
>>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>>> + *          comments in JDK-8156708
>>>>>
>>>>> Note this means that Solaris is not affected by the original issue 
>>>>> because a still-attached native thread can't actually terminate due 
>>>>> to the TLS destructor infinite-loop issue.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>>> don't run these tests on Solaris until tier4.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Thanks for looking at this.
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>>
>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>>
>>>>>>> More below ...
>>>>>>>
>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Solaris problems aside, overall it looks fine. Some minor things 
>>>>>>>> I noted:
>>>>>>>>
>>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>>> agentB(), so there isn't much point to having it. If you reach 
>>>>>>>> the bottom of the function, it passed, so PASSED can be 
>>>>>>>> returned. The code would be more clear if it did this. As-is it 
>>>>>>>> is implied that you can reach the bottom when it fails.
>>>>>>>
>>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>>> cleanup in the tests - once you start you may end up doing a full 
>>>>>>> rewrite.
>>>>>>>
>>>>>>>> Is detaching the threads along the failure paths really needed? 
>>>>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>>>>
>>>>>>> You're right that isn't necessary. I'll remove the changes from 
>>>>>>> before the exits in ji05t001.c
>>>>>>>
>>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>>> much more readable than the similar code in agentA() and agentB().
>>>>>>>
>>>>>>> It's an existing style already used in that test e.g.
>>>>>>>
>>>>>>>   287     if ((res =
>>>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>>
>>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>>
>>>>>>>> In the test:
>>>>>>>>
>>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>>
>>>>>>>> "of" should be "or".
>>>>>>>
>>>>>>> Well spotted. Thanks.
>>>>>>>
>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods 
>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, 
>>>>>>>> and the above comment indicates that you don't want this.
>>>>>>>
>>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>>> called methods. That would potentially indicate a problem in 
>>>>>>> handling the terminated native thread, so would indicate a test 
>>>>>>> failure.
>>>>>>>
>>>>>>>> Don't we normally put these tests in a package?
>>>>>>>
>>>>>>> Doesn't seem to be any hard and fast rule. I only use packages 
>>>>>>> when they are important for the test. In runtime we have 905 java 
>>>>>>> files and only 116 have a package statement. It varies elsewhere.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>>> tomorrow.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>>
>>>>>>>>>> Problem:
>>>>>>>>>>
>>>>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>>>>> detaching themselves. When the VM exits and we're using Flight 
>>>>>>>>>> Recorder "dumponexit" this leads to a call to VM_PrintThreads 
>>>>>>>>>> that in part wants to print the per-thread CPU usage. When we 
>>>>>>>>>> encounter the threads that have terminated already the low 
>>>>>>>>>> level pthread_getcpuclockid call returns ESRCH but the code 
>>>>>>>>>> doesn't expect that and so fails an assert in debug mode and 
>>>>>>>>>> can SEGV in product mode.
>>>>>>>>>>
>>>>>>>>>> Solution:
>>>>>>>>>>
>>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>>
>>>>>>>>>> Change the tests so that the threads detach before 
>>>>>>>>>> terminating. The two tests are (surprisingly) written in 
>>>>>>>>>> completely different styles, so the solution also takes on two 
>>>>>>>>>> different styles.
>>>>>>>>>>
>>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>>> regression test
>>>>>>>>>>
>>>>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>>>>> arbitrary threads and as far as I can see the problem only 
>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>>>> Elsewhere the potential for a library call failure just 
>>>>>>>>>> reports an error value (such as -1 for the cpu time used).
>>>>>>>>>>
>>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that 
>>>>>>>>>> case.
>>>>>>>>>>
>>>>>>>>>> I created a new regression test to create a new native thread, 
>>>>>>>>>> attach it and then let it terminate while still attached. The 
>>>>>>>>>> java code then calls various Thread and ThreadMXBean functions 
>>>>>>>>>> on it to ensure there are no crashes or unexpected exceptions.
>>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>   - new regression test
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>>
>>
>>

From david.holmes at oracle.com  Mon Jul  9 23:22:21 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 09:22:21 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <b508fa01-7803-5b30-c0d7-8bcf6c22807b@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
 <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
 <b508fa01-7803-5b30-c0d7-8bcf6c22807b@oracle.com>
Message-ID: <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com>

Adding back runtime

On 10/07/2018 8:45 AM, Alex Menkov wrote:
> +1

Thanks for looking at this Alex!

> couple minor notes (no need to resend review)

Webrev updated in place (v3) for others to see.

> src/hotspot/os/linux/os_linux.cpp
> please replace
> 
> 5581     }
> 5582     else {
> 
> with
>      } else {

Done.
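
For readers without the webrev open: the branch being restyled is the one that tolerates a failing pthread_getcpuclockid. A rough sketch of that shape in plain C follows. The helper name thread_cpu_time_ns is hypothetical and this is not the os_linux.cpp code itself; it only illustrates returning -1 instead of assuming the call succeeds:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: CPU time consumed by `thread` in nanoseconds,
 * or -1 if the thread's CPU clock cannot be obtained or read. */
static long long thread_cpu_time_ns(pthread_t thread) {
  clockid_t clockid;
  struct timespec ts;
  int rc = pthread_getcpuclockid(thread, &clockid);

  if (rc == 0) {
    if (clock_gettime(clockid, &ts) != 0) {
      return -1;
    }
    return (long long) ts.tv_sec * 1000000000LL + ts.tv_nsec;
  } else {
    /* ESRCH is the error expected for a thread that no longer exists;
     * here any failure is reported as -1 rather than asserted on. */
    return -1;
  }
}
```

Called as thread_cpu_time_ns(pthread_self()), this should give a non-negative value on Linux. The sketch only covers the error-return path; it does not make it safe to pass the ID of a thread that has already been joined.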

> 
> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
> please fix error reporting (I suppose you mean "TEST ERROR: 
> pthread_create failed"/"TEST ERROR: pthread_join failed"):
> 
>    85   if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
>    86     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>    87     exit(1);
>    88   }
>    89
>    90   if ((res = pthread_join(thread, NULL)) != 0) {
>    91     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>    92     exit(1);
>    93   }

Fixed - well spotted!

Thanks,
David

> --alex
> 
> On 07/09/2018 15:17, David Holmes wrote:
>> Thanks Chris!
>>
>> Can I please get a second review.
>>
>> David
>>
>> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>>> On 7/9/18 2:41 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> Would it be better to problem list this test on solaris using 
>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the 
>>>>> problem list and start executing on solaris.
>>>>
>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>>>> only fix this for VM created threads. The general problem of TLS 
>>>> destructors looping if a thread terminates without detaching from 
>>>> the VM is not solvable - other than by not using TLS in the VM.
>>> Ok, I misunderstood your comments in the test.
>>>
>>> Changes look fine.
>>>
>>> Chris
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>>> tl;dr skip the new regression test on Solaris
>>>>>>
>>>>>> New webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>>
>>>>>> This excludes the test from running on Solaris, so the makefile 
>>>>>> doesn't bother compiling this native test and the Java part of the 
>>>>>> test adds:
>>>>>>
>>>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>>>   *          attached thread that has failed to detach before terminating.
>>>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>>>> + *          comments in JDK-8156708
>>>>>>
>>>>>> Note this means that Solaris is not affected by the original issue 
>>>>>> because a still-attached native thread can't actually terminate 
>>>>>> due to the TLS destructor infinite-loop issue.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>>>> don't run these tests on Solaris until tier4.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thanks for looking at this.
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>>>
>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>>>
>>>>>>>> More below ...
>>>>>>>>
>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor 
>>>>>>>>> things I noted:
>>>>>>>>>
>>>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>>>> agentB(), so there isn't much point to having it. If you reach 
>>>>>>>>> the bottom of the function, it passed, so PASSED can be 
>>>>>>>>> returned. The code would be more clear if it did this. As-is it 
>>>>>>>>> is implied that you can reach the bottom when it fails.
>>>>>>>>
>>>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>>>> cleanup in the tests - once you start you may end up doing a 
>>>>>>>> full rewrite.
>>>>>>>>
>>>>>>>>> Is detaching the threads along the failure paths really needed? 
>>>>>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>>>>>
>>>>>>>> You're right that isn't necessary. I'll remove the changes from 
>>>>>>>> before the exits in ji05t001.c
>>>>>>>>
>>>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>>>> much more readable than the similar code in agentA() and agentB().
>>>>>>>>
>>>>>>>> It's an existing style already used in that test e.g.
>>>>>>>>
>>>>>>>>   287     if ((res =
>>>>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>>>
>>>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>>>
>>>>>>>>> In the test:
>>>>>>>>>
>>>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>>>
>>>>>>>>> "of" should be "or".
>>>>>>>>
>>>>>>>> Well spotted. Thanks.
>>>>>>>>
>>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods 
>>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, 
>>>>>>>>> and the above comment indicates that you don't want this.
>>>>>>>>
>>>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>>>> called methods. That would potentially indicate a problem in 
>>>>>>>> handling the terminated native thread, so would indicate a test 
>>>>>>>> failure.
>>>>>>>>
>>>>>>>>> Don't we normally put these tests in a package?
>>>>>>>>
>>>>>>>> Doesn't seem to be any hard and fast rule. I only use packages 
>>>>>>>> when they are important for the test. In runtime we have 905 
>>>>>>>> java files and only 116 have a package statement. It varies 
>>>>>>>> elsewhere.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>>>> tomorrow.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>>>
>>>>>>>>>>> Problem:
>>>>>>>>>>>
>>>>>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>>>>>> detaching themselves. When the VM exits and we're using 
>>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to 
>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread 
>>>>>>>>>>> CPU usage. When we encounter the threads that have terminated 
>>>>>>>>>>> already the low level pthread_getcpuclockid call returns 
>>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert 
>>>>>>>>>>> in debug mode and can SEGV in product mode.
>>>>>>>>>>>
>>>>>>>>>>> Solution:
>>>>>>>>>>>
>>>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>>>
>>>>>>>>>>> Change the tests so that the threads detach before 
>>>>>>>>>>> terminating. The two tests are (surprisingly) written in 
>>>>>>>>>>> completely different styles, so the solution also takes on 
>>>>>>>>>>> two different styles.
>>>>>>>>>>>
>>>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>>>> regression test
>>>>>>>>>>>
>>>>>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>>>>>> arbitrary threads and as far as I can see the problem only 
>>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>>>>> Elsewhere the potential for a library call failure just 
>>>>>>>>>>> reports an error value (such as -1 for the cpu time used).
>>>>>>>>>>>
>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that 
>>>>>>>>>>> case.
>>>>>>>>>>>
>>>>>>>>>>> I created a new regression test to create a new native 
>>>>>>>>>>> thread, attach it and then let it terminate while still 
>>>>>>>>>>> attached. The java code then calls various Thread and 
>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes 
>>>>>>>>>>> or unexpected exceptions.
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>>>>     enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>   - new regression test
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>>
>>>
>>>
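
The runtime-side fix described in the quoted thread (allow for ESRCH
from pthread_getcpuclockid and report -1 as the CPU time) can be
sketched roughly as follows. This is only an illustrative sketch for
Linux, not the actual os_linux.cpp change; the helper name
thread_cpu_time_ns is hypothetical.

```c
/* Ask for POSIX.1-2008 declarations if nothing has been requested yet. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not the HotSpot code: returns the CPU time used by
 * the thread `tid` in nanoseconds, or -1 if it cannot be determined. */
int64_t thread_cpu_time_ns(pthread_t tid) {
    clockid_t clockid;
    struct timespec ts;

    /* pthread_getcpuclockid returns an error number directly (for example
     * ESRCH for a thread that no longer exists); it does not set errno. */
    int rc = pthread_getcpuclockid(tid, &clockid);
    if (rc != 0) {
        return -1;  /* treat "no such thread" as "no CPU time available" */
    }

    if (clock_gettime(clockid, &ts) != 0) {
        return -1;
    }
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

A caller would pass a thread ID it still owns, for example
thread_cpu_time_ns(pthread_self()), and treat a negative result as
"unknown" rather than as an error worth asserting on.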

From david.holmes at oracle.com  Tue Jul 10 00:14:44 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 10:14:44 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
 <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
 <b508fa01-7803-5b30-c0d7-8bcf6c22807b@oracle.com>
 <7c5dc8ba-aee8-a1d1-6821-6f9607a9e478@oracle.com>
 <95f95813-52e7-aebd-10e4-306ecb534fc3@oracle.com>
Message-ID: <ab2779ba-d0dc-3993-9c97-12cec528d8ca@oracle.com>

Thanks for looking at this Coleen.

David

On 10/07/2018 10:11 AM, coleen.phillimore at oracle.com wrote:
> 
> This looks good! Thank you for fixing these failures.
> Coleen
> 
> On 7/9/18 7:22 PM, David Holmes wrote:
>> Adding back runtime
>>
>> On 10/07/2018 8:45 AM, Alex Menkov wrote:
>>> +1
>>
>> Thanks for looking at this Alex!
>>
>>> couple minor notes (no need to resend review)
>>
>> Webrev updated in place (v3) for others to see.
>>
>>> src/hotspot/os/linux/os_linux.cpp
>>> please replace
>>>
>>> 5581     }
>>> 5582     else {
>>>
>>> with
>>>      } else {
>>
>> Done.
>>
>>>
>>> test/hotspot/jtreg/runtime/jni/terminatedThread/libterminatedThread.c
>>> please fix error reporting (I suppose you mean "TEST ERROR: 
>>> pthread_create failed"/"TEST ERROR: pthread_join failed"):
>>>
>>>    85   if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
>>>    86     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>>>    87     exit(1);
>>>    88   }
>>>    89
>>>    90   if ((res = pthread_join(thread, NULL)) != 0) {
>>>    91     fprintf(stderr, "TEST ERROR: pthread_created failed: %s (%d)\n", strerror(res), res);
>>>    92     exit(1);
>>>    93   }
>>
>> Fixed - well spotted!
>>
>> Thanks,
>> David
>>
>>> --alex
>>>
>>> On 07/09/2018 15:17, David Holmes wrote:
>>>> Thanks Chris!
>>>>
>>>> Can I please get a second review.
>>>>
>>>> David
>>>>
>>>> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>>>>> On 7/9/18 2:41 PM, David Holmes wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Would it be better to problem list this test on solaris using 
>>>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off 
>>>>>>> the problem list and start executing on solaris.
>>>>>>
>>>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>>>>>> only fix this for VM created threads. The general problem of TLS 
>>>>>> destructors looping if a thread terminates without detaching from 
>>>>>> the VM is not solvable - other than by not using TLS in the VM.
>>>>> Ok, I misunderstood your comments in the test.
>>>>>
>>>>> Changes look fine.
>>>>>
>>>>> Chris
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>>>>> tl;dr skip the new regression test on Solaris
>>>>>>>>
>>>>>>>> New webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>>>>
>>>>>>>> This excludes the test from running on Solaris, so the makefile 
>>>>>>>> doesn't bother compiling this native test and the Java part of 
>>>>>>>> the test adds:
>>>>>>>>
>>>>>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>>>>>   *          attached thread that has failed to detach before terminating.
>>>>>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>>>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>>>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>>>>>> + *          comments in JDK-8156708
>>>>>>>>
>>>>>>>> Note this means that Solaris is not affected by the original 
>>>>>>>> issue because a still-attached native thread can't actually 
>>>>>>>> terminate due to the TLS destructor infinite-loop issue.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>>>>>> don't run these tests on Solaris until tier4.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> Thanks for looking at this.
>>>>>>>>>>
>>>>>>>>>> Updated webrev:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>>>>>
>>>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>>>>>
>>>>>>>>>> More below ...
>>>>>>>>>>
>>>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>>>>>> Hi David,
>>>>>>>>>>>
>>>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor 
>>>>>>>>>>> things I noted:
>>>>>>>>>>>
>>>>>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>>>>>> agentB(), so there isn't much point to having it. If you 
>>>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be 
>>>>>>>>>>> returned. The code would be more clear if it did this. As-is 
>>>>>>>>>>> it is implied that you can reach the bottom when it fails.
>>>>>>>>>>
>>>>>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>>>>>> cleanup in the tests - once you start you may end up doing a 
>>>>>>>>>> full rewrite.
>>>>>>>>>>
>>>>>>>>>>> Is detaching the threads along the failure paths really 
>>>>>>>>>>> needed? exit() is called, so this would seem to make it 
>>>>>>>>>>> unnecessary.
>>>>>>>>>>
>>>>>>>>>> You're right that isn't necessary. I'll remove the changes 
>>>>>>>>>> from before the exits in ji05t001.c
>>>>>>>>>>
>>>>>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>>>>>> much more readable than the similar code in agentA() and 
>>>>>>>>>>> agentB().
>>>>>>>>>>
>>>>>>>>>> It's an existing style already used in that test e.g.
>>>>>>>>>>
>>>>>>>>>>   287     if ((res =
>>>>>>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>>>>>
>>>>>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>>>>>
>>>>>>>>>>> In the test:
>>>>>>>>>>>
>>>>>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>>>>>
>>>>>>>>>>> "of" should be "or".
>>>>>>>>>>
>>>>>>>>>> Well spotted. Thanks.
>>>>>>>>>>
>>>>>>>>>>> Shouldn't you be catching exceptions for all the Thread 
>>>>>>>>>>> methods you are calling? Otherwise the test will exit if one 
>>>>>>>>>>> is thrown, and the above comment indicates that you don't 
>>>>>>>>>>> want this.
>>>>>>>>>>
>>>>>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>>>>>> called methods. That would potentially indicate a problem in 
>>>>>>>>>> handling the terminated native thread, so would indicate a 
>>>>>>>>>> test failure.
>>>>>>>>>>
>>>>>>>>>>> Don't we normally put these tests in a package?
>>>>>>>>>>
>>>>>>>>>> Doesn't seem to be any hard and fast rule. I only use 
>>>>>>>>>> packages when they are important for the test. In runtime we 
>>>>>>>>>> have 905 java files and only 116 have a package statement. It 
>>>>>>>>>> varies elsewhere.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>>>>>> tomorrow.
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Problem:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The tests create native threads that attach to the VM 
>>>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate 
>>>>>>>>>>>>> without detaching themselves. When the VM exits and we're 
>>>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to 
>>>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread 
>>>>>>>>>>>>> CPU usage. When we encounter the threads that have 
>>>>>>>>>>>>> terminated already the low level pthread_getcpuclockid 
>>>>>>>>>>>>> call returns ESRCH but the code doesn't expect that and so 
>>>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Solution:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>>>>>
>>>>>>>>>>>>> Change the tests so that the threads detach before 
>>>>>>>>>>>>> terminating. The two tests are (surprisingly) written in 
>>>>>>>>>>>>> completely different styles, so the solution also takes on 
>>>>>>>>>>>>> two different styles.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>>>>>> regression test
>>>>>>>>>>>>>
>>>>>>>>>>>>> I took a good look at the low-level code for interacting 
>>>>>>>>>>>>> with arbitrary threads and as far as I can see the problem 
>>>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on 
>>>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure 
>>>>>>>>>>>>> just reports an error value (such as -1 for the cpu time 
>>>>>>>>>>>>> used).
>>>>>>>>>>>>>
>>>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in 
>>>>>>>>>>>>> that case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I created a new regression test to create a new native 
>>>>>>>>>>>>> thread, attach it and then let it terminate while still 
>>>>>>>>>>>>> attached. The java code then calls various Thread and 
>>>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes 
>>>>>>>>>>>>> or unexpected exceptions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Testing:
>>>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>>>>>>     enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>>>   - new regression test
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
> 
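
For reference, the error-reporting correction requested in the review
above (each message should name the call that actually failed) might
look like the sketch below. It is a stand-alone illustration rather
than the libterminatedThread.c source: the function name
run_native_thread is hypothetical, and it returns the error number
instead of calling exit(1) as the test does.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body; the real test attaches to the VM here. */
static void *thread_start(void *arg) {
    (void)arg;
    return NULL;
}

/* Hypothetical wrapper: creates and joins one native thread.
 * Returns 0 on success, otherwise the pthread error number. */
int run_native_thread(void) {
    pthread_t thread;
    int res;

    if ((res = pthread_create(&thread, NULL, thread_start, NULL)) != 0) {
        fprintf(stderr, "TEST ERROR: pthread_create failed: %s (%d)\n",
                strerror(res), res);
        return res;
    }

    if ((res = pthread_join(thread, NULL)) != 0) {
        fprintf(stderr, "TEST ERROR: pthread_join failed: %s (%d)\n",
                strerror(res), res);
        return res;
    }
    return 0;
}
```

Both pthread calls return the error number directly, which is why
strerror(res) is used rather than errno.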

From serguei.spitsyn at oracle.com  Tue Jul 10 02:07:39 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Jul 2018 19:07:39 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
 <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
Message-ID: <416fc226-e389-df65-9487-736efc9e7528@oracle.com>

Hi David,

It looks good modulo the minor comments that others have already found.
Could I ask you to fix a couple of really minor issues in new test?

Unneeded spaces are at lines 84 and 51 in .java and .c files:

   83         if (mbean.isThreadCpuTimeSupported() &&
   84             mbean.isThreadCpuTimeEnabled() ) {
   . . .

   51   class_id = (*env)->FindClass (env, "java/lang/Thread");

Thanks,
Serguei


On 7/9/18 15:17, David Holmes wrote:
> Thanks Chris!
>
> Can I please get a second review.
>
> David
>
> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>> On 7/9/18 2:41 PM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>> Hi David,
>>>>
>>>> Would it be better to problem list this test on solaris using 
>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the 
>>>> problem list and start executing on solaris.
>>>
>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>>> only fix this for VM created threads. The general problem of TLS 
>>> destructors looping if a thread terminates without detaching from 
>>> the VM is not solvable - other than by not using TLS in the VM.
>> Ok, I misunderstood your comments in the test.
>>
>> Changes look fine.
>>
>> Chris
>>>
>>> Thanks,
>>> David
>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>> tl;dr skip the new regression test on Solaris
>>>>>
>>>>> New webrev:
>>>>>
>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>
>>>>> This excludes the test from running on Solaris, so the makefile 
>>>>> doesn't bother compiling this native test and the Java part of the 
>>>>> test adds:
>>>>>
>>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>>   *          attached thread that has failed to detach before terminating.
>>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>>> + *          comments in JDK-8156708
>>>>>
>>>>> Note this means that Solaris is not affected by the original issue 
>>>>> because a still-attached native thread can't actually terminate 
>>>>> due to the TLS destructor infinite-loop issue.
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>>> don't run these tests on Solaris until tier4.
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> Thanks for looking at this.
>>>>>>>
>>>>>>> Updated webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>>
>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>>
>>>>>>> More below ...
>>>>>>>
>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>>> Hi David,
>>>>>>>>
>>>>>>>> Solaris problems aside, overall it looks fine. Some minor 
>>>>>>>> things I noted:
>>>>>>>>
>>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>>> agentB(), so there isn't much point to having it. If you reach 
>>>>>>>> the bottom of the function, it passed, so PASSED can be 
>>>>>>>> returned. The code would be more clear if it did this. As-is it 
>>>>>>>> is implied that you can reach the bottom when it fails.
>>>>>>>
>>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>>> cleanup in the tests - once you start you may end up doing a 
>>>>>>> full rewrite.
>>>>>>>
>>>>>>>> Is detaching the threads along the failure paths really needed? 
>>>>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>>>>
>>>>>>> You're right that isn't necessary. I'll remove the changes from 
>>>>>>> before the exits in ji05t001.c
>>>>>>>
>>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>>> much more readable than the similar code in agentA() and agentB().
>>>>>>>
>>>>>>> It's an existing style already used in that test e.g.
>>>>>>>
>>>>>>>   287     if ((res =
>>>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>>
>>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>>
>>>>>>>> In the test:
>>>>>>>>
>>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>>
>>>>>>>> "of" should be "or".
>>>>>>>
>>>>>>> Well spotted. Thanks.
>>>>>>>
>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods 
>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, 
>>>>>>>> and the above comment indicates that you don't want this.
>>>>>>>
>>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>>> called methods. That would potentially indicate a problem in 
>>>>>>> handling the terminated native thread, so would indicate a test 
>>>>>>> failure.
>>>>>>>
>>>>>>>> Don't we normally put these tests in a package?
>>>>>>>
>>>>>>> Doesn't seem to be any hard and fast rule. I only use packages 
>>>>>>> when they are important for the test. In runtime we have 905 
>>>>>>> java files and only 116 have a package statement. It varies 
>>>>>>> elsewhere.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>>> tomorrow.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>>
>>>>>>>>>> Problem:
>>>>>>>>>>
>>>>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>>>>> detaching themselves. When the VM exits and we're using 
>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to 
>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread 
>>>>>>>>>> CPU usage. When we encounter the threads that have terminated 
>>>>>>>>>> already the low level pthread_getcpuclockid call returns 
>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert 
>>>>>>>>>> in debug mode and can SEGV in product mode.
>>>>>>>>>>
>>>>>>>>>> Solution:
>>>>>>>>>>
>>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>>
>>>>>>>>>> Change the tests so that the threads detach before 
>>>>>>>>>> terminating. The two tests are (surprisingly) written in 
>>>>>>>>>> completely different styles, so the solution also takes on 
>>>>>>>>>> two different styles.
>>>>>>>>>>
>>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>>> regression test
>>>>>>>>>>
>>>>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>>>>> arbitrary threads and as far as I can see the problem only 
>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>>>> Elsewhere the potential for a library call failure just 
>>>>>>>>>> reports an error value (such as -1 for the cpu time used).
>>>>>>>>>>
>>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that 
>>>>>>>>>> case.
>>>>>>>>>>
>>>>>>>>>> I created a new regression test to create a new native 
>>>>>>>>>> thread, attach it and then let it terminate while still 
>>>>>>>>>> attached. The java code then calls various Thread and 
>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes 
>>>>>>>>>> or unexpected exceptions.
>>>>>>>>>>
>>>>>>>>>> Testing:
>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>>>     enable Flight recorder for the tests) [in progress]
>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>   - new regression test
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> David
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>>>>
>>
>>
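
As background to the TLS destructor looping mentioned in the quoted
thread: POSIX re-runs a thread-specific-data destructor for as long as
the key still holds a non-NULL value, and only permits (does not
require) an implementation to stop after PTHREAD_DESTRUCTOR_ITERATIONS
rounds. The sketch below shows that repetition with a destructor that
re-installs its value a bounded number of times. It assumes this
re-installation pattern purely for illustration and does not claim to
reproduce the VM's destructor; all names in it are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t key;
static int destructor_calls = 0;

/* Bounded on purpose so the demonstration terminates on every platform. */
enum { MAX_RESTORES = 2 };

/* Destructor that puts the value back while under the bound. The
 * implementation clears the slot before calling us; storing a non-NULL
 * value again means the destructor is eligible to run another round. */
static void restore_value(void *p) {
    destructor_calls++;
    if (destructor_calls <= MAX_RESTORES) {
        pthread_setspecific(key, p);
    }
}

/* Thread body: set a non-NULL value and exit without clearing it. */
static void *worker(void *arg) {
    pthread_setspecific(key, arg);
    return NULL;
}

/* Runs one worker thread and reports how many times the destructor ran,
 * or -1 if any pthread call failed. */
int count_destructor_calls(void) {
    pthread_t t;
    int token = 0;

    destructor_calls = 0;
    if (pthread_key_create(&key, restore_value) != 0) return -1;
    if (pthread_create(&t, NULL, worker, &token) != 0) return -1;
    if (pthread_join(t, NULL) != 0) return -1;
    pthread_key_delete(key);
    return destructor_calls;
}
```

A destructor that restored the value unconditionally would rely
entirely on the platform choosing to stop, which is the kind of
behaviour difference the Solaris exclusion works around.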


From david.holmes at oracle.com  Tue Jul 10 02:35:41 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 10 Jul 2018 12:35:41 +1000
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <416fc226-e389-df65-9487-736efc9e7528@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
 <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
 <416fc226-e389-df65-9487-736efc9e7528@oracle.com>
Message-ID: <864787be-ca72-2422-0399-716a1abe7d27@oracle.com>

On 10/07/2018 12:07 PM, serguei.spitsyn at oracle.com wrote:
> Hi David,
> 
> It looks good modulo the minor comments that others have already found.

Thanks for taking a look.

> Could I ask you to fix a couple of really minor issues in new test?
> 
> Unneeded spaces are at lines 84 and 51 in .java and .c files:
> 
>    83         if (mbean.isThreadCpuTimeSupported() &&
>    84             mbean.isThreadCpuTimeEnabled() ) {
>    . . .
> 
>    51   class_id = (*env)->FindClass (env, "java/lang/Thread");

Sorry Serguei, too late.

David

> Thanks,
> Serguei
> 
> 
> On 7/9/18 15:17, David Holmes wrote:
>> Thanks Chris!
>>
>> Can I please get a second review.
>>
>> David
>>
>> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>>> On 7/9/18 2:41 PM, David Holmes wrote:
>>>> Hi Chris,
>>>>
>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>>> Hi David,
>>>>>
>>>>> Would it be better to problem list this test on solaris using 
>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off the 
>>>>> problem list and start executing on solaris.
>>>>
>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>>>> only fix this for VM created threads. The general problem of TLS 
>>>> destructors looping if a thread terminates without detaching from 
>>>> the VM is not solvable - other than by not using TLS in the VM.
>>> Ok, I misunderstood your comments in the test.
>>>
>>> Changes look fine.
>>>
>>> Chris
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>>> tl;dr skip the new regression test on Solaris
>>>>>>
>>>>>> New webrev:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>>
>>>>>> This excludes the test from running on Solaris, so the makefile 
>>>>>> doesn't bother compiling this native test and the Java part of the 
>>>>>> test adds:
>>>>>>
>>>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>>>   *          attached thread that has failed to detach before terminating.
>>>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>>>> + *          comments in JDK-8156708
>>>>>>
>>>>>> Note this means that Solaris is not affected by the original issue 
>>>>>> because a still-attached native thread can't actually terminate 
>>>>>> due to the TLS destructor infinite-loop issue.
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>>>> don't run these tests on Solaris until tier4.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> Thanks for looking at this.
>>>>>>>>
>>>>>>>> Updated webrev:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>>>
>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>>>
>>>>>>>> More below ...
>>>>>>>>
>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>>>> Hi David,
>>>>>>>>>
>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor 
>>>>>>>>> things I noted:
>>>>>>>>>
>>>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>>>> agentB(), so there isn't much point to having it. If you reach 
>>>>>>>>> the bottom of the function, it passed, so PASSED can be 
>>>>>>>>> returned. The code would be more clear if it did this. As-is it 
>>>>>>>>> is implied that you can reach the bottom when it fails.
>>>>>>>>
>>>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>>>> cleanup in the tests - once you start you may end up doing a 
>>>>>>>> full rewrite.
>>>>>>>>
>>>>>>>>> Is detaching the threads along the failure paths really needed? 
>>>>>>>>> exit() is called, so this would seem to make it unnecessary.
>>>>>>>>
>>>>>>>> You're right that isn't necessary. I'll remove the changes from 
>>>>>>>> before the exits in ji05t001.c
>>>>>>>>
>>>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>>>> much more readable than the similar code in agentA() and agentB().
>>>>>>>>
>>>>>>>> It's an existing style already used in that test e.g.
>>>>>>>>
>>>>>>>>   287     if ((res =
>>>>>>>>   288             JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>>>
>>>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>>>
>>>>>>>>> In the test:
>>>>>>>>>
>>>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>>>
>>>>>>>>> "of" should be "or".
>>>>>>>>
>>>>>>>> Well spotted. Thanks.
>>>>>>>>
>>>>>>>>> Shouldn't you be catching exceptions for all the Thread methods 
>>>>>>>>> you are calling? Otherwise the test will exit if one is thrown, 
>>>>>>>>> and the above comment indicates that you don't want this.
>>>>>>>>
>>>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>>>> called methods. That would potentially indicate a problem in 
>>>>>>>> handling the terminated native thread, so would indicate a test 
>>>>>>>> failure.
>>>>>>>>
>>>>>>>>> Don't we normally put these tests in a package?
>>>>>>>>
>>>>>>>> Doesn't seem to be any hard and fast rule. I only use packages 
>>>>>>>> when they are important for the test. In runtime we have 905 
>>>>>>>> java files and only 116 have a package statement. It varies 
>>>>>>>> elsewhere.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> David
>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>>>> tomorrow.
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>>>
>>>>>>>>>>> Problem:
>>>>>>>>>>>
>>>>>>>>>>> The tests create native threads that attach to the VM through 
>>>>>>>>>>> JNI_AttachCurrentThread but which then terminate without 
>>>>>>>>>>> detaching themselves. When the VM exits and we're using 
>>>>>>>>>>> Flight Recorder "dumponexit" this leads to a call to 
>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread 
>>>>>>>>>>> CPU usage. When we encounter the threads that have terminated 
>>>>>>>>>>> already the low level pthread_getcpuclockid call returns 
>>>>>>>>>>> ESRCH but the code doesn't expect that and so fails an assert 
>>>>>>>>>>> in debug mode and can SEGV in product mode.
>>>>>>>>>>>
>>>>>>>>>>> Solution:
>>>>>>>>>>>
>>>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>>>
>>>>>>>>>>> Change the tests so that the threads detach before 
>>>>>>>>>>> terminating. The two tests are (surprisingly) written in 
>>>>>>>>>>> completely different styles, so the solution also takes on 
>>>>>>>>>>> two different styles.
>>>>>>>>>>>
>>>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>>>> regression test
>>>>>>>>>>>
>>>>>>>>>>> I took a good look at the low-level code for interacting with 
>>>>>>>>>>> arbitrary threads and as far as I can see the problem only 
>>>>>>>>>>> exists for this one case of pthread_getcpuclockid on Linux. 
>>>>>>>>>>> Elsewhere the potential for a library call failure just 
>>>>>>>>>>> reports an error value (such as -1 for the cpu time used).
>>>>>>>>>>>
>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in that 
>>>>>>>>>>> case.
>>>>>>>>>>>
>>>>>>>>>>> I created a new regression test to create a new native 
>>>>>>>>>>> thread, attach it and then let it terminate while still 
>>>>>>>>>>> attached. The java code then calls various Thread and 
>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes 
>>>>>>>>>>> or unexpected exceptions.
>>>>>>>>>>>
>>>>>>>>>>> Testing:
>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>   - new regression test
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>>>>
>>>
>>>
> 

From serguei.spitsyn at oracle.com  Tue Jul 10 02:42:57 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 9 Jul 2018 19:42:57 -0700
Subject: RFR (11): 8205878: pthread_getcpuclockid is expected to return 0
 code
In-Reply-To: <864787be-ca72-2422-0399-716a1abe7d27@oracle.com>
References: <1f09c27c-4691-b33c-e006-a6c9a2cbedea@oracle.com>
 <c865179b-8ea3-7224-60f6-cea83f3d45be@oracle.com>
 <c9cee9b4-f761-fd79-8a9d-8c3df9c04c67@oracle.com>
 <49ab47f5-7cf9-4fbe-f594-d3295ded7f82@oracle.com>
 <bd26e602-8f03-377a-49c8-badb092736dd@oracle.com>
 <75c3a658-e119-54e3-75b2-258401c491ab@oracle.com>
 <01a73af9-0783-3536-e4d3-a2faefe2fbd4@oracle.com>
 <81e92666-77b5-ac53-4139-44f8c3010cda@oracle.com>
 <e760f370-c54e-fa0a-a261-c8df8e9b15bb@oracle.com>
 <d9212cbf-5f41-2aff-a448-035dd9e63963@oracle.com>
 <416fc226-e389-df65-9487-736efc9e7528@oracle.com>
 <864787be-ca72-2422-0399-716a1abe7d27@oracle.com>
Message-ID: <5d5822c1-e5fb-f38f-f44d-26086a9ff3b8@oracle.com>

On 7/9/18 19:35, David Holmes wrote:
> On 10/07/2018 12:07 PM, serguei.spitsyn at oracle.com wrote:
>> Hi David,
>>
>> It looks good modulo the minor comments that others have already found.
>
> Thanks for taking a look.
>
>> Could I ask you to fix a couple of really minor issues in the new test?
>>
>> There are unneeded spaces at lines 84 and 51 in the .java and .c files:
>>
>>    83         if (mbean.isThreadCpuTimeSupported() &&
>>    84             mbean.isThreadCpuTimeEnabled() ) {
>>    . . .
>>
>>    51   class_id = (*env)->FindClass (env, "java/lang/Thread");
>
> Sorry Serguei, too late.

Not a problem, David.
Sorry for being late.

Thanks,
Serguei

> David
>
>> Thanks,
>> Serguei
>>
>>
>> On 7/9/18 15:17, David Holmes wrote:
>>> Thanks Chris!
>>>
>>> Can I please get a second review.
>>>
>>> David
>>>
>>> On 10/07/2018 7:50 AM, Chris Plummer wrote:
>>>> On 7/9/18 2:41 PM, David Holmes wrote:
>>>>> Hi Chris,
>>>>>
>>>>> On 10/07/2018 4:22 AM, Chris Plummer wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> Would it be better to problem list this test on solaris using 
>>>>>> JDK-8156708. That way when JDK-8156708 is fixed it can come off 
>>>>>> the problem list and start executing on solaris.
>>>>>
>>>>> JDK-8156708 is already fixed - it's a dup of JDK-8154715. We could 
>>>>> only fix this for VM created threads. The general problem of TLS 
>>>>> destructors looping if a thread terminates without detaching from 
>>>>> the VM is not solvable - other than by not using TLS in the VM.
>>>> Ok, I misunderstood your comments in the test.
>>>>
>>>> Changes look fine.
>>>>
>>>> Chris
>>>>>
>>>>> Thanks,
>>>>> David
>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/8/18 4:58 PM, David Holmes wrote:
>>>>>>> tl;dr skip the new regression test on Solaris
>>>>>>>
>>>>>>> New webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v3/
>>>>>>>
>>>>>>> This excludes the test from running on Solaris, so the makefile 
>>>>>>> doesn't bother compiling this native test and the Java part of 
>>>>>>> the test adds:
>>>>>>>
>>>>>>> ! * @requires os.family != "windows" & os.family != "solaris"
>>>>>>>   * @summary Basic test of Thread and ThreadMXBean queries on a natively
>>>>>>>   *          attached thread that has failed to detach before terminating.
>>>>>>> + * @comment The native code only supports POSIX so no windows testing; also
>>>>>>> + *          we have to skip solaris as a terminating thread that fails to
>>>>>>> + *          detach will hit an infinite loop due to TLS destructor issues - see
>>>>>>> + *          comments in JDK-8156708
>>>>>>>
>>>>>>> Note this means that Solaris is not affected by the original 
>>>>>>> issue because a still-attached native thread can't actually 
>>>>>>> terminate due to the TLS destructor infinite-loop issue.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> David
>>>>>>>
>>>>>>> On 6/07/2018 6:07 PM, David Holmes wrote:
>>>>>>>> <sigh> The new test is hanging on Solaris. I just discovered we 
>>>>>>>> don't run these tests on Solaris until tier4.
>>>>>>>>
>>>>>>>> David
>>>>>>>>
>>>>>>>> On 6/07/2018 8:40 AM, David Holmes wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> Thanks for looking at this.
>>>>>>>>>
>>>>>>>>> Updated webrev:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dholmes/8205878/webrev.v2/
>>>>>>>>>
>>>>>>>>> Only real changes in ji05t001.c. (And fixed typo in the new test)
>>>>>>>>>
>>>>>>>>> More below ...
>>>>>>>>>
>>>>>>>>> On 6/07/2018 7:55 AM, Chris Plummer wrote:
>>>>>>>>>> Hi David,
>>>>>>>>>>
>>>>>>>>>> Solaris problems aside, overall it looks fine. Some minor 
>>>>>>>>>> things I noted:
>>>>>>>>>>
>>>>>>>>>> I noticed that exitCode is never modified in agentA() or 
>>>>>>>>>> agentB(), so there isn't much point to having it. If you 
>>>>>>>>>> reach the bottom of the function, it passed, so PASSED can be 
>>>>>>>>>> returned. The code would be more clear if it did this. As-is 
>>>>>>>>>> it is implied that you can reach the bottom when it fails.
>>>>>>>>>
>>>>>>>>> I resisted any and all urges to do any kind of unrelated code 
>>>>>>>>> cleanup in the tests - once you start you may end up doing a 
>>>>>>>>> full rewrite.
>>>>>>>>>
>>>>>>>>>> Is detaching the threads along the failure paths really 
>>>>>>>>>> needed? exit() is called, so this would seem to make it 
>>>>>>>>>> unnecessary.
>>>>>>>>>
>>>>>>>>> You're right that isn't necessary. I'll remove the changes 
>>>>>>>>> from before the exits in ji05t001.c
>>>>>>>>>
>>>>>>>>>> I prefer assignments not to be embedded inside the "if" 
>>>>>>>>>> condition. The DetachCurrentThread code in THREAD_return() is 
>>>>>>>>>> much more readable than the similar code in agentA() and 
>>>>>>>>>> agentB().
>>>>>>>>>
>>>>>>>>> It's an existing style already used in that test e.g.
>>>>>>>>>
>>>>>>>>>   287     if ((res =
>>>>>>>>>   288 JNI_ENV_PTR(vm)->AttachCurrentThread(
>>>>>>>>>   289                 JNI_ENV_ARG(vm, (void **) &env), (void *) 0)) != 0) {
>>>>>>>>>
>>>>>>>>> and I don't mind it, so I'd prefer not to change it.
>>>>>>>>>
>>>>>>>>>> In the test:
>>>>>>>>>>
>>>>>>>>>>    54         // Generally as long as we don't crash of throw unexpected
>>>>>>>>>>    55         // exceptions then the test passes. In some cases we know exactly
>>>>>>>>>>
>>>>>>>>>> "of" should be "or".
>>>>>>>>>
>>>>>>>>> Well spotted. Thanks.
>>>>>>>>>
>>>>>>>>>> Shouldn't you be catching exceptions for all the Thread 
>>>>>>>>>> methods you are calling? Otherwise the test will exit if one 
>>>>>>>>>> is thrown, and the above comment indicates that you don't 
>>>>>>>>>> want this.
>>>>>>>>>
>>>>>>>>> I'm not expecting there to be any exceptions from any of the 
>>>>>>>>> called methods. That would potentially indicate a problem in 
>>>>>>>>> handling the terminated native thread, so would indicate a 
>>>>>>>>> test failure.
>>>>>>>>>
>>>>>>>>>> Don't we normally put these tests in a package?
>>>>>>>>>
>>>>>>>>> Doesn't seem to be any hard and fast rule. I only use 
>>>>>>>>> packages when they are important for the test. In runtime we 
>>>>>>>>> have 905 java files and only 116 have a package statement. It 
>>>>>>>>> varies elsewhere.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>> On 7/5/18 2:58 AM, David Holmes wrote:
>>>>>>>>>>> <sigh> Solaris compiler complains about doing a return from 
>>>>>>>>>>> inside a do-while loop. I'll have to rework part of the fix 
>>>>>>>>>>> tomorrow.
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>> On 5/07/2018 6:19 PM, David Holmes wrote:
>>>>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8205878
>>>>>>>>>>>> Webrev: http://cr.openjdk.java.net/~dholmes/8205878/webrev/
>>>>>>>>>>>>
>>>>>>>>>>>> Problem:
>>>>>>>>>>>>
>>>>>>>>>>>> The tests create native threads that attach to the VM 
>>>>>>>>>>>> through JNI_AttachCurrentThread but which then terminate 
>>>>>>>>>>>> without detaching themselves. When the VM exits and we're 
>>>>>>>>>>>> using Flight Recorder "dumponexit" this leads to a call to 
>>>>>>>>>>>> VM_PrintThreads that in part wants to print the per-thread 
>>>>>>>>>>>> CPU usage. When we encounter the threads that have 
>>>>>>>>>>>> terminated already the low level pthread_getcpuclockid 
>>>>>>>>>>>> call returns ESRCH but the code doesn't expect that and so 
>>>>>>>>>>>> fails an assert in debug mode and can SEGV in product mode.
>>>>>>>>>>>>
>>>>>>>>>>>> Solution:
>>>>>>>>>>>>
>>>>>>>>>>>> Serviceability-side: fix the tests
>>>>>>>>>>>>
>>>>>>>>>>>> Change the tests so that the threads detach before 
>>>>>>>>>>>> terminating. The two tests are (surprisingly) written in 
>>>>>>>>>>>> completely different styles, so the solution also takes on 
>>>>>>>>>>>> two different styles.
>>>>>>>>>>>>
>>>>>>>>>>>> Runtime-side: make the VM more robust in the face of JNI 
>>>>>>>>>>>> attached threads that terminate before detaching, and add a 
>>>>>>>>>>>> regression test
>>>>>>>>>>>>
>>>>>>>>>>>> I took a good look at the low-level code for interacting 
>>>>>>>>>>>> with arbitrary threads and as far as I can see the problem 
>>>>>>>>>>>> only exists for this one case of pthread_getcpuclockid on 
>>>>>>>>>>>> Linux. Elsewhere the potential for a library call failure 
>>>>>>>>>>>> just reports an error value (such as -1 for the cpu time 
>>>>>>>>>>>> used).
>>>>>>>>>>>>
>>>>>>>>>>>> So the fix is simply to allow for ESRCH when calling 
>>>>>>>>>>>> pthread_getcpuclockid and return -1 for the cpu usage in 
>>>>>>>>>>>> that case.
>>>>>>>>>>>>
>>>>>>>>>>>> I created a new regression test to create a new native 
>>>>>>>>>>>> thread, attach it and then let it terminate while still 
>>>>>>>>>>>> attached. The java code then calls various Thread and 
>>>>>>>>>>>> ThreadMXBean functions on it to ensure there are no crashes 
>>>>>>>>>>>> or unexpected exceptions.
>>>>>>>>>>>>
>>>>>>>>>>>> Testing:
>>>>>>>>>>>>   - old tests with fixed run-time
>>>>>>>>>>>>   - old run-time with fixed tests
>>>>>>>>>>>>   - mach tier4 (which exposed the problem - that's where we 
>>>>>>>>>>>> enable Flight recorder for the tests) [in progress]
>>>>>>>>>>>>   - mach5 tier 1-3 for good measure [in progress]
>>>>>>>>>>>>   - new regression test
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
>>
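
The -1 convention described in the RFR quoted above (return -1 for the
cpu usage when the thread is gone) is also what the Java-level API
documents: ThreadMXBean.getThreadCpuTime(long) returns -1 when the thread
is not alive. The following is only a rough sketch of that behaviour and
is not the regression test from the webrev. The class name
DeadThreadCpuTime and its helper are made up for illustration, and it
uses an ordinary Java thread rather than a natively attached one.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class DeadThreadCpuTime {
    // Hypothetical helper: starts a short-lived thread, waits for it to
    // terminate, then asks ThreadMXBean for its CPU time.
    static long cpuTimeOfTerminatedThread() {
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        Thread t = new Thread(() -> { /* nothing to do */ });
        t.start();
        try {
            t.join();                   // the thread is no longer alive after this
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while joining", e);
        }
        if (mbean.isThreadCpuTimeSupported() && mbean.isThreadCpuTimeEnabled()) {
            // Documented to return -1 for a thread that is not alive.
            return mbean.getThreadCpuTime(t.getId());
        }
        return -1;                      // CPU time measurement unavailable
    }

    public static void main(String[] args) {
        System.out.println("CPU time of terminated thread: "
                           + cpuTimeOfTerminatedThread());
    }
}
```

The interesting case in this thread is the native one, where the Java
Thread object still looks attached although the underlying pthread has
exited. The sketch only shows the value a caller should be prepared to
see.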


From jcbeyler at google.com  Tue Jul 10 18:41:49 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 10 Jul 2018 11:41:49 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
 fails
Message-ID: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>

Hi All,

Could someone review the one liner for the bug:
https://bugs.openjdk.java.net/browse/JDK-8205643

The webrev is here:
http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/

Basically, the test exercises CMS, and Graal does not seem to play well
with CMS, so this change stops the test from being run with Graal.

Thanks,
Jc

From alexey.menkov at oracle.com  Tue Jul 10 19:26:34 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 10 Jul 2018 12:26:34 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
In-Reply-To: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
Message-ID: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>

Hi JC,

you need also to remove the test from ProblemList

--alex

On 07/10/2018 11:41, JC Beyler wrote:
> Hi All,
> 
> Could someone review the one liner for the bug:
> https://bugs.openjdk.java.net/browse/JDK-8205643
> 
> The webrev is here:
> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/
> 
> Basically, the test is testing CMS and Graal does not play well with CMS 
> it seems so this removes Graal being tested with it.
> 
> Thanks,
> Jc

From jcbeyler at google.com  Tue Jul 10 20:37:37 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 10 Jul 2018 13:37:37 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
 fails
In-Reply-To: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
 <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
Message-ID: <CAF9BGBzBbj-QV61jcXacG4e9EfBxV0pPMJPBe1BPNXt982+s3A@mail.gmail.com>

Hi Alex,

Done here:
http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.01/

Any other issues with this fix?

Thanks!
Jc

On Tue, Jul 10, 2018 at 12:26 PM Alex Menkov <alexey.menkov at oracle.com>
wrote:

> Hi JC,
>
> you need also to remove the test from ProblemList
>
> --alex
>
> On 07/10/2018 11:41, JC Beyler wrote:
> > Hi All,
> >
> > Could someone review the one liner for the bug:
> > https://bugs.openjdk.java.net/browse/JDK-8205643
> >
> > The webrev is here:
> > http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/
> >
> > Basically, the test is testing CMS and Graal does not play well with CMS
> > it seems so this removes Graal being tested with it.
> >
> > Thanks,
> > Jc
>


-- 

Thanks,
Jc

From alexey.menkov at oracle.com  Tue Jul 10 21:42:18 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 10 Jul 2018 14:42:18 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
In-Reply-To: <CAF9BGBzBbj-QV61jcXacG4e9EfBxV0pPMJPBe1BPNXt982+s3A@mail.gmail.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
 <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
 <CAF9BGBzBbj-QV61jcXacG4e9EfBxV0pPMJPBe1BPNXt982+s3A@mail.gmail.com>
Message-ID: <a5cfa087-6660-3b97-5d48-b2e952e405e8@oracle.com>

Looks good to me.

--alex

On 07/10/2018 13:37, JC Beyler wrote:
> Hi Alex,
> 
> Done here:
> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.01/
> 
> Any other issues with this fix?
> 
> Thanks!
> Jc
> 
> On Tue, Jul 10, 2018 at 12:26 PM Alex Menkov <alexey.menkov at oracle.com> wrote:
> 
>     Hi JC,
> 
>     you need also to remove the test from ProblemList
> 
>     --alex
> 
>     On 07/10/2018 11:41, JC Beyler wrote:
>      > Hi All,
>      >
>      > Could someone review the one liner for the bug:
>      > https://bugs.openjdk.java.net/browse/JDK-8205643
>      >
>      > The webrev is here:
>      > http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/
>      >
>      > Basically, the test is testing CMS and Graal does not play well
>     with CMS
>      > it seems so this removes Graal being tested with it.
>      >
>      > Thanks,
>      > Jc
> 
> 
> 
> -- 
> 
> Thanks,
> Jc

From serguei.spitsyn at oracle.com  Tue Jul 10 21:54:32 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Jul 2018 14:54:32 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
In-Reply-To: <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
 <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
Message-ID: <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com>

Hi Jc,

The fix looks good.
Alex is right.
I forgot to tell you that the test has to be excluded from the file:
   open/test/hotspot/jtreg/ProblemList.txt

Thanks,
Serguei


On 7/10/18 12:26, Alex Menkov wrote:
> Hi JC,
>
> you need also to remove the test from ProblemList
>
> --alex
>
> On 07/10/2018 11:41, JC Beyler wrote:
>> Hi All,
>>
>> Could someone review the one liner for the bug:
>> https://bugs.openjdk.java.net/browse/JDK-8205643
>>
>> The webrev is here:
>> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/
>>
>> Basically, the test is testing CMS and Graal does not play well with 
>> CMS it seems so this removes Graal being tested with it.
>>
>> Thanks,
>> Jc


From serguei.spitsyn at oracle.com  Tue Jul 10 21:56:37 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Jul 2018 14:56:37 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
In-Reply-To: <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
 <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
 <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com>
Message-ID: <e5b7a72c-49c7-43d8-026f-58fe7236e0fd@oracle.com>

Sorry, did not see your reply to Alex.
Looks good - ship it!

Thanks,
Serguei

On 7/10/18 14:54, serguei.spitsyn at oracle.com wrote:
> Hi Jc,
>
> The fix looks good.
> Alex is right.
> I forgot to tell you that the test has to be excluded from the file:
>    open/test/hotspot/jtreg/ProblemList.txt
>
> Thanks,
> Serguei
>
>
> On 7/10/18 12:26, Alex Menkov wrote:
>> Hi JC,
>>
>> you need also to remove the test from ProblemList
>>
>> --alex
>>
>> On 07/10/2018 11:41, JC Beyler wrote:
>>> Hi All,
>>>
>>> Could someone review the one liner for the bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8205643
>>>
>>> The webrev is here:
>>> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/
>>>
>>> Basically, the test is testing CMS and Graal does not play well with 
>>> CMS it seems so this removes Graal being tested with it.
>>>
>>> Thanks,
>>> Jc
>


From jcbeyler at google.com  Tue Jul 10 22:31:08 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 10 Jul 2018 15:31:08 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
 fails
In-Reply-To: <e5b7a72c-49c7-43d8-026f-58fe7236e0fd@oracle.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
 <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
 <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com>
 <e5b7a72c-49c7-43d8-026f-58fe7236e0fd@oracle.com>
Message-ID: <CAF9BGBw7-trGxV8tTpfV8GVaaO9d4LCM_dny7AO=SmQWFmvz4Q@mail.gmail.com>

Hi Serguei,

Here it is:
http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.02/

Could someone test/push it please?

Thanks!
Jc

On Tue, Jul 10, 2018 at 2:56 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Sorry, did not see your reply to Alex.
> Looks good - ship it!
>
> Thanks,
> Serguei
>
> On 7/10/18 14:54, serguei.spitsyn at oracle.com wrote:
> > Hi Jc,
> >
> > The fix looks good.
> > Alex is right.
> > I forgot to tell you that the test has to be excluded from the file:
> >    open/test/hotspot/jtreg/ProblemList.txt
> >
> > Thanks,
> > Serguei
> >
> >
> > On 7/10/18 12:26, Alex Menkov wrote:
> >> Hi JC,
> >>
> >> you need also to remove the test from ProblemList
> >>
> >> --alex
> >>
> >> On 07/10/2018 11:41, JC Beyler wrote:
> >>> Hi All,
> >>>
> >>> Could someone review the one liner for the bug:
> >>> https://bugs.openjdk.java.net/browse/JDK-8205643
> >>>
> >>> The webrev is here:
> >>> http://cr.openjdk.java.net/~jcbeyler/8205643/webrev.00/
> >>>
> >>> Basically, the test is testing CMS and Graal does not play well with
> >>> CMS it seems so this removes Graal being tested with it.
> >>>
> >>> Thanks,
> >>> Jc
> >
>
>

-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Tue Jul 10 22:38:08 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 10 Jul 2018 15:38:08 -0700
Subject: RFR (S) 8205643: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
In-Reply-To: <CAF9BGBw7-trGxV8tTpfV8GVaaO9d4LCM_dny7AO=SmQWFmvz4Q@mail.gmail.com>
References: <CAF9BGByaoFnwH=FPnXRuTP+MYXSu++9-xtunTho-g0OzNt_nEQ@mail.gmail.com>
 <82dd9582-368e-d53d-a648-ead834415bb4@oracle.com>
 <6bfc1d3a-33fd-bad7-fe63-da8dcec8400f@oracle.com>
 <e5b7a72c-49c7-43d8-026f-58fe7236e0fd@oracle.com>
 <CAF9BGBw7-trGxV8tTpfV8GVaaO9d4LCM_dny7AO=SmQWFmvz4Q@mail.gmail.com>
Message-ID: <484b69a2-c644-94da-6071-795b25900eb3@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180710/13834297/attachment.html>

From jini.george at oracle.com  Wed Jul 11 02:38:37 2018
From: jini.george at oracle.com (Jini George)
Date: Wed, 11 Jul 2018 08:08:37 +0530
Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
Message-ID: <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com>

Gentle reminder !

Thanks,
Jini.

On 7/10/2018 12:14 AM, Jini George wrote:
> Requesting reviews for enabling SA tests on OS X for Mach5.
> 
> https://bugs.openjdk.java.net/browse/JDK-8199700
> 
> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/
> 
> The changes are mostly to include the addition of sudo privileges to the 
> SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests 
> (those using clhsdb) have been refactored to use ClhsdbLauncher for ease 
> of maintenance. This also avoids checks for Platform.shouldSAAttach() 
> for corefile related test cases. More details have been provided in JIRA.
> 
> Thanks,
> Jini.

From david.holmes at oracle.com  Wed Jul 11 08:24:29 2018
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 11 Jul 2018 18:24:29 +1000
Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com>
Message-ID: <e43c7d7e-2af1-bc6b-a1db-f3133531c82b@oracle.com>

Hi Jini,

There are quite a few changes to digest in this - it may have been 
better to break them up individually:
- sudo use
- refactor to use ClhsdbLauncher
- changes to use regex matching

Focusing on the main sudo change the assumption is that on OSX you can 
run sudo without needing to provide a password - correct? That may be 
the case in mach5 but I'm not sure how others will go running these 
tests either in their test farms or locally.

I'm not sure about the regex changes from contains to matches - won't 
you need additional wildcards at the start and end of the strings to 
allow the string to be embedded in a longer string?

Thanks,
David

PS. I start vacation in 48 hours :)

On 11/07/2018 12:38 PM, Jini George wrote:
> Gentle reminder !
> 
> Thanks,
> Jini.
> 
> On 7/10/2018 12:14 AM, Jini George wrote:
>> Requesting reviews for enabling SA tests on OS X for Mach5.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8199700
>>
>> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/
>>
>> The changes are mostly to include the addition of sudo privileges to 
>> the SA launchers for OSX if Platform.shouldSAAttach() fails. Some 
>> tests (those using clhsdb) have been refactored to use ClhsdbLauncher 
>> for ease of maintenance. This also avoids checks for 
>> Platform.shouldSAAttach() for corefile related test cases. More 
>> details have been provided in JIRA.
>>
>> Thanks,
>> Jini.

From jini.george at oracle.com  Wed Jul 11 10:00:06 2018
From: jini.george at oracle.com (Jini George)
Date: Wed, 11 Jul 2018 15:30:06 +0530
Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <e43c7d7e-2af1-bc6b-a1db-f3133531c82b@oracle.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com>
 <e43c7d7e-2af1-bc6b-a1db-f3133531c82b@oracle.com>
Message-ID: <50204c5b-62c5-e427-734b-cf5de2ffbb49@oracle.com>

Thank you, David. My answers inline:

On 7/11/2018 1:54 PM, David Holmes wrote:
> Hi Jini,
> 
> There are quite a few changes to digest in this - it may have been 
> better to break them up individually:
> - sudo use
> - refactor to use ClhsdbLauncher
> - changes to use regex matching
> 
> Focusing on the main sudo change the assumption is that on OSX you can 
> run sudo without needing to provide a password - correct? That may be 
> the case in mach5 but I'm not sure how others will go running these 
> tests either in their test farms or locally.
Right -- you would need to provide the password, so on OSX the test 
prompts for it (just as it would if you had run the test itself with 
'sudo'). Examining the /etc/sudoers file to check whether a password is 
needed could have been an option, but that itself would need sudo, and 
would probably add unwanted complexity.

> I'm not sure about the regex changes from contains to matches - won't 
> you need additional wildcards at the start and end of the strings to 
> allow the string to be embedded in a longer string?

OutputAnalyzer's shouldMatch() uses the find() method of the Matcher 
class which matches sub-sequences.
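
As a small illustration of the difference between the two Matcher
methods (this is plain java.util.regex, not the OutputAnalyzer source;
the class name and the sample output string are invented):

```java
import java.util.regex.Pattern;

public class FindVersusMatches {
    // Whole-input match: the pattern must cover the entire string.
    static boolean wholeMatch(String regex, String output) {
        return Pattern.compile(regex).matcher(output).matches();
    }

    // Sub-sequence match: the pattern may occur anywhere in the string.
    static boolean subsequenceMatch(String regex, String output) {
        return Pattern.compile(regex).matcher(output).find();
    }

    public static void main(String[] args) {
        // Made-up tool output, for illustration only.
        String output = "Attaching to process ID 1234, please wait...";
        String regex = "process ID \\d+";
        System.out.println("matches(): " + wholeMatch(regex, output));
        System.out.println("find():    " + subsequenceMatch(regex, output));
    }
}
```

With matches() the pattern would need leading and trailing wildcards to
succeed on this input; with find() it does not.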

Thanks,
Jini.


> 
> Thanks,
> David
> 
> PS. I start vacation in 48 hours :)
> 
> On 11/07/2018 12:38 PM, Jini George wrote:
>> Gentle reminder !
>>
>> Thanks,
>> Jini.
>>
>> On 7/10/2018 12:14 AM, Jini George wrote:
>>> Requesting reviews for enabling SA tests on OS X for Mach5.
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8199700
>>>
>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/
>>>
>>> The changes are mostly to include the addition of sudo privileges to 
>>> the SA launchers for OSX if Platform.shouldSAAttach() fails. Some 
>>> tests (those using clhsdb) have been refactored to use ClhsdbLauncher 
>>> for ease of maintenance. This also avoids checks for 
>>> Platform.shouldSAAttach() for corefile related test cases. More 
>>> details have been provided in JIRA.
>>>
>>> Thanks,
>>> Jini.

From kubota.yuji at gmail.com  Wed Jul 11 13:55:02 2018
From: kubota.yuji at gmail.com (KUBOTA Yuji)
Date: Wed, 11 Jul 2018 22:55:02 +0900
Subject: RFR:8207048: jhsdb debugd cannot specify a port number
Message-ID: <CABU-27O2z0QDfsJJr2+RZ_VpDT0q67Ww-A8mqzAqMZ6SrT72GA@mail.gmail.com>

Hi all,

I filed an issue for a small improvement to `jhsdb debugd` that allows
setting the port of the UnicastRemoteObject (aka
sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer) via
`sun.jvm.hotspot.rmi.debugger.port`.

Issue: https://bugs.openjdk.java.net/browse/JDK-8207048
Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/

We can set the RMI registry port of the debugd server with
`sun.jvm.hotspot.rmi.port`, but we cannot set the port of the remote
object, so the remote object always uses an anonymous port. For security,
we should not have to open a wide range of ports to use debugd, so I
would like to fix this.
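
To make the distinction concrete, here is a minimal sketch of exporting
a remote object on a port taken from a system property. It is not the
code in the webrev: the Ping interface, the class names and the helper
portFromProperty are hypothetical, and reading the property with
Integer.getInteger is just one assumed way to do it.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

public class FixedPortExport {
    // Hypothetical remote interface, for illustration only.
    public interface Ping extends Remote {
        String ping() throws RemoteException;
    }

    static class PingImpl implements Ping {
        public String ping() { return "pong"; }
    }

    // Reads the port from a system property; 0 means "anonymous port".
    static int portFromProperty(String name) {
        return Integer.getInteger(name, 0);
    }

    public static void main(String[] args) throws RemoteException {
        int port = portFromProperty("sun.jvm.hotspot.rmi.debugger.port");
        PingImpl impl = new PingImpl();
        // exportObject(Remote, int) exports on the given port, or on an
        // anonymous port when the argument is 0.
        Ping stub = (Ping) UnicastRemoteObject.exportObject(impl, port);
        System.out.println("exported " + stub);
        // Unexport so that the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
    }
}
```

The registry port and the remote object port are independent, which is
why fixing only the former still leaves an unpredictable port to open in
a firewall.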

Could you review it?

Thanks,
Yuji

From jcbeyler at google.com  Wed Jul 11 17:04:17 2018
From: jcbeyler at google.com (JC Beyler)
Date: Wed, 11 Jul 2018 10:04:17 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
Message-ID: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>

Hi all,

Could someone review the small-ish webrev for the bug:
https://bugs.openjdk.java.net/browse/JDK-8206960

The webrev is here:
http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/

Basically, the tests were failing for two reasons:
  - VMEventTest was failing because Graal does not support the
DisableIntrinsic option required by the test, so I disabled running this
test with Graal
  - The other tests were failing because the BCI <-> source code line
numbers are not always correct when using Graal via uncommon traps;
therefore the tests now check if Graal is being used and, if so, only
check the method names. This allows us to still have tests working with
Graal, albeit a bit more coarsely.
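
The coarser check can be sketched roughly as follows. This is not the
code from the webrev: the Frame class, the framesMatch helper and the
methodNamesOnly flag are invented for illustration, and how the tests
actually detect Graal is not shown.

```java
import java.util.List;

public class FrameCheck {
    // Hypothetical frame description: a method name plus a source line.
    static final class Frame {
        final String method;
        final int line;
        Frame(String method, int line) { this.method = method; this.line = line; }
    }

    // Compares expected and observed frames. When methodNamesOnly is true
    // (the Graal case), line numbers are ignored.
    static boolean framesMatch(List<Frame> expected, List<Frame> actual,
                               boolean methodNamesOnly) {
        if (expected.size() != actual.size()) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            Frame e = expected.get(i);
            Frame a = actual.get(i);
            if (!e.method.equals(a.method)) {
                return false;
            }
            if (!methodNamesOnly && e.line != a.line) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Frame> expected = List.of(new Frame("allocate", 42), new Frame("main", 10));
        List<Frame> observed = List.of(new Frame("allocate", 45), new Frame("main", 10));
        System.out.println("strict:     " + framesMatch(expected, observed, false));
        System.out.println("names only: " + framesMatch(expected, observed, true));
    }
}
```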

This passes all the HeapMonitor tests
with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI
-XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal"

(Except the GCCMS one which is being fixed via the one-liner for
JDK-8205643).

Let me know what you think,
Jc

From jini.george at oracle.com  Wed Jul 11 17:32:51 2018
From: jini.george at oracle.com (Jini George)
Date: Wed, 11 Jul 2018 23:02:51 +0530
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
In-Reply-To: <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
 <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
Message-ID: <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>

Hi Yasumasa,

This looks good to me except for one nit. Some more comments would also 
help. For example, it would help to say that NSPidMap maps the host 
lwpids to the container lwpids.
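
To show the kind of mapping I mean, here is a rough Java sketch. It
assumes the IDs come from the "NSpid:" line of
/proc/<PID>/task/<TID>/status (available on Linux 4.1 and later), which
may or may not be how the webrev obtains them. The class name, the
helper and the sample line are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class NSpidLine {
    // Parses an "NSpid:" line from a Linux /proc status file. The first
    // value is the ID as seen from the namespace of the procfs mount (the
    // host here); the last value is the ID in the innermost namespace.
    static List<Integer> parseNSpid(String statusLine) {
        List<Integer> ids = new ArrayList<>();
        if (!statusLine.startsWith("NSpid:")) {
            return ids;                 // not an NSpid line
        }
        String[] fields = statusLine.substring("NSpid:".length()).trim().split("\\s+");
        for (String field : fields) {
            if (!field.isEmpty()) {
                ids.add(Integer.parseInt(field));
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Sample line invented for illustration: host LWP 28963, container LWP 7.
        List<Integer> ids = parseNSpid("NSpid:\t28963\t7");
        int hostId = ids.get(0);
        int containerId = ids.get(ids.size() - 1);
        System.out.println(hostId + " -> " + containerId);
    }
}
```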

The nit:

* 
http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
Line 253: extra space after the parentheses

Thanks,
Jini.

On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
> PING: Could you review it?
> 
>>   JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
> 
> 
> Thanks,
> 
> Yasumasa
> 
> 
> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this change.
>>
>>   JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>
>> I tried to attach jhsdb to a Java process in a Docker container from the 
>> container host, but it could not attach.
>> jcmd gained PID namespace support in JDK-8193710, but jhsdb has not yet.
>>
>> SA gets LWP IDs via the thread stack and functions in libthread_db.so, but 
>> they return PIDs as seen in the container, which differ from the host's 
>> PIDs. So I added code to scan /proc/<PID>/task to get all LWP IDs, and 
>> they are kept in a Map in LinuxDebuggerLocal.
>>
>> Also, SA_ALTROOT is set to /proc/<PID>/root if SA detects that the debuggee 
>> runs in a container. This helps SA parse binaries in the container.
>>
>> This change has been pushed to the submit repo, and it failed on OS X 
>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>> But I guess the failure is caused by JDK-8205906; this change affects Linux only.
>>
>> Could you review it?
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
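
For readers who want to see what scanning /proc/<PID>/task amounts to, here is a minimal Java sketch. It is not the code from the webrev (which touches LinuxDebuggerLocal and native code); the class and method names are made up, and it only assumes the standard Linux layout in which every thread of a process appears as a numerically named directory under /proc/<PID>/task.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class TaskScanSketch {
    // Returns the numeric entry names found directly under taskDir, sorted.
    // On Linux, each entry of /proc/<PID>/task is named after one LWP ID.
    static List<Integer> listLwpIds(Path taskDir) {
        List<Integer> ids = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(taskDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                // Skip anything that is not a plain decimal ID.
                if (name.matches("\\d{1,9}")) {
                    ids.add(Integer.parseInt(name));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(ids);
        return ids;
    }

    public static void main(String[] args) {
        // "self" refers to the current process; pass a host PID to inspect another one.
        String pid = args.length > 0 ? args[0] : "self";
        Path taskDir = Paths.get("/proc", pid, "task");
        if (Files.isDirectory(taskDir)) {
            System.out.println("LWP IDs: " + listLwpIds(taskDir));
        } else {
            System.out.println(taskDir + " is not available on this system");
        }
    }
}
```

Run from the container host against the host PID of the debuggee, the directory names are the host-side LWP IDs, which is the view a host-side debugger needs.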

From alexey.menkov at oracle.com  Wed Jul 11 18:39:33 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 11 Jul 2018 11:39:33 -0700
Subject: RFR: JDK-8201513: nsk/jvmti/IterateThroughHeap/filter-* are broken
Message-ID: <cd61a65d-4191-39a5-ddeb-648c614aaab6@oracle.com>

Hi all,

please review a fix for
https://bugs.openjdk.java.net/browse/JDK-8201513
webrev:
http://cr.openjdk.java.net/~amenkov/IterateThroughHeap/webrev/

summary:
The tests had an error which was fixed during open-sourcing.
After that the tests started to fail. The root cause of the failures is 
incorrect verification (positive results are interpreted as negative).

--alex

From serguei.spitsyn at oracle.com  Wed Jul 11 21:26:36 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Jul 2018 14:26:36 -0700
Subject: RFR: JDK-8201513: nsk/jvmti/IterateThroughHeap/filter-* are broken
In-Reply-To: <cd61a65d-4191-39a5-ddeb-648c614aaab6@oracle.com>
References: <cd61a65d-4191-39a5-ddeb-648c614aaab6@oracle.com>
Message-ID: <d6819adf-08b0-213a-5def-e999d4c46b7f@oracle.com>

Hi Alex,

The fix looks good.
Thank you for fixing the typos!

Thanks,
Serguei


On 7/11/18 11:39, Alex Menkov wrote:
> Hi all,
>
> please review a fix for
> https://bugs.openjdk.java.net/browse/JDK-8201513
> webrev:
> http://cr.openjdk.java.net/~amenkov/IterateThroughHeap/webrev/
>
> summary:
> The tests had an error which was fixed during open-sourcing.
> After that the tests started to fail. The root cause of the failures is 
> incorrect verification (positive results are interpreted as negative).
>
> --alex


From serguei.spitsyn at oracle.com  Wed Jul 11 21:42:25 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Jul 2018 14:42:25 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
In-Reply-To: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
References: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
Message-ID: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180711/8ce772a2/attachment-0001.html>

From david.holmes at oracle.com  Thu Jul 12 00:55:19 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 12 Jul 2018 10:55:19 +1000
Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <50204c5b-62c5-e427-734b-cf5de2ffbb49@oracle.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <4a595c59-b457-2ed5-30fb-5d0436b9d55c@oracle.com>
 <e43c7d7e-2af1-bc6b-a1db-f3133531c82b@oracle.com>
 <50204c5b-62c5-e427-734b-cf5de2ffbb49@oracle.com>
Message-ID: <9644b36d-bb13-8625-a770-b11a9ee6c2eb@oracle.com>

On 11/07/2018 8:00 PM, Jini George wrote:
> Thank you, David. My answers inline:
> 
> On 7/11/2018 1:54 PM, David Holmes wrote:
>> Hi Jini,
>>
>> There are quite a few changes to digest in this - it may have been 
>> better to break them up individually:
>> - sudo use
>> - refactor to use ClhsdbLauncher
>> - changes to use regex matching
>>
>> Focusing on the main sudo change the assumption is that on OSX you can 
>> run sudo without needing to provide a password - correct? That may be 
>> the case in mach5 but I'm not sure how others will go running these 
>> tests either in their test farms or locally.
> Right -- you would need to provide the password. So it prompts for the 
> password for OSX. (Like how it would have been needed if you had run the 
> test itself with 'sudo'). Examining the /etc/sudoers file to check if no 
> password is needed could have been an option, but that itself would need 
> an sudo, and probably would add unwanted complexity.

So I'm not sure this change is acceptable when it may cause other 
testing environments to break. At a minimum I'd want to get the opinions 
of the SAP folk and anyone else doing regular build/test runs.

>> I'm not sure about the regex changes from contains to matches - won't 
>> you need additional wildcards at the start and end of the strings to 
>> allow the string to be embedded in a longer string ??
> 
> OutputAnalyzer's shouldMatch() uses the find() method of the Matcher 
> class which matches sub-sequences.

Ok.
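
For reference, the difference between the two java.util.regex.Matcher methods can be seen with plain JDK classes. The sketch below does not use OutputAnalyzer itself; the class name and the sample output string are made up for illustration.

```java
import java.util.regex.Pattern;

final class FindVersusMatches {
    // True if the regex matches some sub-sequence of the output.
    static boolean foundAnywhere(String regex, String output) {
        return Pattern.compile(regex).matcher(output).find();
    }

    // True only if the regex matches the entire output.
    static boolean matchesWhole(String regex, String output) {
        return Pattern.compile(regex).matcher(output).matches();
    }

    public static void main(String[] args) {
        String output = "Attaching to process ID 1234, please wait...";
        // find(): no leading or trailing wildcards needed.
        System.out.println(foundAnywhere("process ID \\d+", output));     // true
        // matches(): fails because the regex does not cover the whole string.
        System.out.println(matchesWhole("process ID \\d+", output));      // false
        // matches() only succeeds once wildcards are added at both ends.
        System.out.println(matchesWhole(".*process ID \\d+.*", output));  // true
    }
}
```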

Thanks,
David

> Thanks,
> Jini.
> 
> 
>>
>> Thanks,
>> David
>>
>> PS. I start vacation in 48 hours :)
>>
>> On 11/07/2018 12:38 PM, Jini George wrote:
>>> Gentle reminder !
>>>
>>> Thanks,
>>> Jini.
>>>
>>> On 7/10/2018 12:14 AM, Jini George wrote:
>>>> Requesting reviews for enabling SA tests on OS X for Mach5.
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8199700
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/
>>>>
>>>> The changes are mostly to include the addition of sudo privileges to 
>>>> the SA launchers for OSX if Platform.shouldSAAttach() fails. Some 
>>>> tests (those using clhsdb) have been refactored to use 
>>>> ClhsdbLauncher for ease of maintenance. This also avoids checks for 
>>>> Platform.shouldSAAttach() for corefile related test cases. More 
>>>> details have been provided in JIRA.
>>>>
>>>> Thanks,
>>>> Jini.

From david.holmes at oracle.com  Thu Jul 12 01:21:39 2018
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 12 Jul 2018 11:21:39 +1000
Subject: RFR:8207048: jhsdb debugd cannot specify a port number
In-Reply-To: <CABU-27O2z0QDfsJJr2+RZ_VpDT0q67Ww-A8mqzAqMZ6SrT72GA@mail.gmail.com>
References: <CABU-27O2z0QDfsJJr2+RZ_VpDT0q67Ww-A8mqzAqMZ6SrT72GA@mail.gmail.com>
Message-ID: <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com>

Hi Yuji,

I can't comment on the actual change proposed in this enhancement 
request, but it will need to have a CSR request created and approved due 
to the use of a new system property.

Thanks,
David


On 11/07/2018 11:55 PM, KUBOTA Yuji wrote:
> Hi all,
> 
> I filed an issue for a small improvement to `jhsdb debugd` that lets you set
> the port of the UnicastRemoteObject (i.e.
> sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer) via
> `sun.jvm.hotspot.rmi.debugger.port`.
> 
> Issue: https://bugs.openjdk.java.net/browse/JDK-8207048
> Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/
> 
> We can set the RMI registry port of the debugd server via
> `sun.jvm.hotspot.rmi.port`, but we cannot set the port of the RemoteObject,
> so the RemoteObject always uses an anonymous port. For security, we should
> not have to open a wide range of ports to use debugd, so I want to fix this.
> 
> Could you review it?
> 
> Thanks,
> Yuji
> 
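
To make the distinction between the registry port and the remote object's own port concrete, here is a small sketch using only the standard java.rmi API. It is not the proposed patch: the `Ping` interface, the class names, and the property name `example.rmi.object.port` are placeholders invented for this example. Passing 0 to `UnicastRemoteObject.exportObject` requests an anonymous port, which is the behaviour described above; a non-zero value pins the object to that port.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

final class FixedPortExportSketch {
    // Placeholder remote interface, only here to have something to export.
    interface Ping extends Remote {
        String ping() throws RemoteException;
    }

    static final class PingImpl implements Ping {
        @Override
        public String ping() {
            return "pong";
        }
    }

    // Reads the port from a system property; falls back to 0 (anonymous port)
    // when the property is unset, not a number, or out of range.
    static int resolvePort(String propertyName) {
        int port = Integer.getInteger(propertyName, 0);
        return (port >= 0 && port <= 65535) ? port : 0;
    }

    public static void main(String[] args) throws RemoteException {
        int port = resolvePort("example.rmi.object.port"); // hypothetical property name
        PingImpl impl = new PingImpl();
        // Export on the chosen port; 0 lets the RMI runtime pick one.
        UnicastRemoteObject.exportObject(impl, port);
        System.out.println("exported on " + (port == 0 ? "an anonymous port" : "port " + port));
        // Unexport so that the JVM can exit.
        UnicastRemoteObject.unexportObject(impl, true);
    }
}
```

With a fixed object port, a firewall only needs to allow the registry port and that one additional port.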

From kubota.yuji at gmail.com  Thu Jul 12 01:40:47 2018
From: kubota.yuji at gmail.com (KUBOTA Yuji)
Date: Thu, 12 Jul 2018 10:40:47 +0900
Subject: RFR:8207048: jhsdb debugd cannot specify a port number
In-Reply-To: <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com>
References: <CABU-27O2z0QDfsJJr2+RZ_VpDT0q67Ww-A8mqzAqMZ6SrT72GA@mail.gmail.com>
 <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com>
Message-ID: <CABU-27OA6wGoPvY8Zycm6VHPjQq6V-7e93Z-27vREX8vttu-EQ@mail.gmail.com>

Hi David,

Thank you for the comment and for updating JBS. I'll create a CSR request
after getting feedback on whether this change is welcomed by the community.

Thanks,
Yuji

2018-07-12 10:21 GMT+09:00 David Holmes <david.holmes at oracle.com>:
> Hi Yuji,
>
> I can't comment on the actual change proposed in this enhancement request,
> but it will need to have a CSR request created and approved due to the use
> of a new system property.
>
> Thanks,
> David
>
>
>
>
> On 11/07/2018 11:55 PM, KUBOTA Yuji wrote:
>>
>> Hi all,
>>
>> I filed bugzilla for small fix to improvement of `jhsdb debugd` to set
>> a port of UnicastRemoteObject aka
>> sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer by
>> `sun.jvm.hotspot.rmi.debugger.port`.
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8207048
>> Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/
>>
>> We can set an RMI registry port of debugd server by
>> `sun.jvm.hotspot.rmi.port`, but can not set a port of RemoteObject. So
>> RemoteObject always uses an anonymous port. For security, we should
>> not open ports widely to use debugd, so I want to fix.
>>
>> Could you review it?
>>
>> Thanks,
>> Yuji
>>
>

From yasuenag at gmail.com  Thu Jul 12 04:42:10 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 12 Jul 2018 13:42:10 +0900
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
In-Reply-To: <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
 <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
 <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>
Message-ID: <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>

Thanks Jini,

I uploaded a new webrev. It adds some comments and removes the extra space.

http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/


Yasumasa


2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
> Hi Yasumasa,
>
> This looks good to me except for one nit. And some more comments would help.
> For e.g., it would help to say that NSPidMap is to map the host to container
> lwpids.
>
> The nit:
>
> *
> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
> Line 253: extra space after the parentheses
>
> Thanks,
> Jini.
>
> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>
>> PING: Could you review it?
>>
>>>   JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>
>>> Hi all,
>>>
>>> Please review this change.
>>>
>>>   JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>
>>> I tried to attach jhsdb to java process in docker container from
>>> container host, but it couldn't.
>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet.
>>>
>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they
>>> returns PIDs in container - they are different from host's PID. So I added
>>> the code to scan /proc/<PID>/task to get all LWP IDs and they are kept in a
>>> Map in LinuxDebuggerLocal.
>>>
>>> Also SA_ALTROOT is set to /proc/<PID>/root if SA detects debuggee runs in
>>> container. It helps SA to parse binaries in container.
>>>
>>> This change has been pushed to submit repo, and it was failed on OS X
>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>> But I guess it causes JDK-8205906. This change affects to Linux only.
>>>
>>> Could you review it?
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>

From jini.george at oracle.com  Thu Jul 12 05:09:35 2018
From: jini.george at oracle.com (Jini George)
Date: Thu, 12 Jul 2018 10:39:35 +0530
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
In-Reply-To: <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
 <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
 <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>
 <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>
Message-ID: <ab4be109-586b-5a80-02c0-487993db3e87@oracle.com>

Looks good to me.

Thanks!
Jini (Not a Reviewer).

On 7/12/2018 10:12 AM, Yasumasa Suenaga wrote:
> Thanks Jini,
> 
> I uploaded new webrev. It contains some comments and removing extra space.
> 
> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
> 
> 
> Yasumasa
> 
> 
> 
> 2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
>> Hi Yasumasa,
>>
>> This looks good to me except for one nit. And some more comments would help.
>> For e.g., it would help to say that NSPidMap is to map the host to container
>> lwpids.
>>
>> The nit:
>>
>> *
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
>> Line 253: extra space after the parentheses
>>
>> Thanks,
>> Jini.
>>
>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>
>>> PING: Could you review it?
>>>
>>>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Please review this change.
>>>>
>>>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>
>>>> I tried to attach jhsdb to java process in docker container from
>>>> container host, but it couldn't.
>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet.
>>>>
>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they
>>>> returns PIDs in container - they are different from host's PID. So I added
>>>> the code to scan /proc/<PID>/task to get all LWP IDs and they are kept in a
>>>> Map in LinuxDebuggerLocal.
>>>>
>>>> Also SA_ALTROOT is set to /proc/<PID>/root if SA detects debuggee runs in
>>>> container. It helps SA to parse binaries in container.
>>>>
>>>> This change has been pushed to submit repo, and it was failed on OS X
>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>> But I guess it causes JDK-8205906. This change affects to Linux only.
>>>>
>>>> Could you review it?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>

From daniil.x.titov at oracle.com  Thu Jul 12 05:23:18 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 11 Jul 2018 22:23:18 -0700
Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign
 double[][][] to double[][][]
Message-ID: <BC3FDEC3-43C1-46A5-A9B4-40309F4BC07C@oracle.com>

Please review the changes that fix jdb issue with evaluation of multidimensional arrays of primitives.

The problem is that for N-dimensional arrays of primitives with N greater than 2, JDI fails to find the component type (which is an array of dimension N-1) because it assumes that it is a boot type.
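
To illustrate the component-type relationship mentioned above, the following sketch uses plain reflection inside a single JVM. It is not JDI code (in JDI the corresponding lookup goes through `ArrayType.componentType()`), and the class name is made up for the example.

```java
final class ComponentTypeSketch {
    // Walks from an array class down to its innermost element type,
    // recording the JVM name of each level.
    static String describe(Class<?> arrayType) {
        StringBuilder sb = new StringBuilder();
        for (Class<?> c = arrayType; c != null; c = c.getComponentType()) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(c.getName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The component type of double[][][] is double[][], which is itself
        // an array type and not a primitive.
        System.out.println(describe(double[][][].class)); // [[[D -> [[D -> [D -> double
    }
}
```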

Thanks!
 
Issue: https://bugs.openjdk.java.net/browse/JDK-8191948 
Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01 
 
Best regards,
Daniil
 
 
From serguei.spitsyn at oracle.com  Thu Jul 12 05:26:46 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 11 Jul 2018 22:26:46 -0700
Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign
 double[][][] to double[][][]
In-Reply-To: <BC3FDEC3-43C1-46A5-A9B4-40309F4BC07C@oracle.com>
References: <BC3FDEC3-43C1-46A5-A9B4-40309F4BC07C@oracle.com>
Message-ID: <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com>

Hi Daniil,

It looks good.

Thanks,
Serguei


On 7/11/18 22:23, Daniil Titov wrote:
> Please review the changes that fix jdb issue with evaluation of multidimensional arrays of primitives.
>
> The problem is that for N-dimensional arrays of primitives with N greater than 2, JDI fails to find the component type (which is an array of dimension N-1) because it assumes that it is a boot type.
>
> Thanks!
>   
> Issue: https://bugs.openjdk.java.net/browse/JDK-8191948
> Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01
>   
> Best regards,
> Daniil
>   
>   
>   
>
>
>


From yasuenag at gmail.com  Thu Jul 12 05:29:05 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 12 Jul 2018 14:29:05 +0900
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
In-Reply-To: <ab4be109-586b-5a80-02c0-487993db3e87@oracle.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
 <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
 <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>
 <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>
 <ab4be109-586b-5a80-02c0-487993db3e87@oracle.com>
Message-ID: <CAGFVN2DyoT8GHgyYG7CpXB43wT-FnM8p8ivS5iGBOfn9nZaBGQ@mail.gmail.com>

Thanks Jini!

I'm waiting for a Reviewer.


Yasumasa


2018-07-12 14:09 GMT+09:00 Jini George <jini.george at oracle.com>:
> Looks good to me.
>
> Thanks!
> Jini (Not a Reviewer).
>
>
> On 7/12/2018 10:12 AM, Yasumasa Suenaga wrote:
>>
>> Thanks Jini,
>>
>> I uploaded new webrev. It contains some comments and removing extra space.
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>
>>
>> Yasumasa
>>
>>
>>
>> 2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
>>>
>>> Hi Yasumasa,
>>>
>>> This looks good to me except for one nit. And some more comments would
>>> help.
>>> For e.g., it would help to say that NSPidMap is to map the host to
>>> container
>>> lwpids.
>>>
>>> The nit:
>>>
>>> *
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
>>> Line 253: extra space after the parentheses
>>>
>>> Thanks,
>>> Jini.
>>>
>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>>
>>>>
>>>> PING: Could you review it?
>>>>
>>>>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>>
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Please review this change.
>>>>>
>>>>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>
>>>>> I tried to attach jhsdb to java process in docker container from
>>>>> container host, but it couldn't.
>>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet.
>>>>>
>>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they
>>>>> returns PIDs in container - they are different from host's PID. So I
>>>>> added
>>>>> the code to scan /proc/<PID>/task to get all LWP IDs and they are kept
>>>>> in a
>>>>> Map in LinuxDebuggerLocal.
>>>>>
>>>>> Also SA_ALTROOT is set to /proc/<PID>/root if SA detects debuggee runs
>>>>> in
>>>>> container. It helps SA to parse binaries in container.
>>>>>
>>>>> This change has been pushed to submit repo, and it was failed on OS X
>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>>> But I guess it causes JDK-8205906. This change affects to Linux only.
>>>>>
>>>>> Could you review it?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>
>

From goetz.lindenmaier at sap.com  Thu Jul 12 10:11:52 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Thu, 12 Jul 2018 10:11:52 +0000
Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
Message-ID: <5eb111d4ffd8427398a09c62a925e5d7@sap.com>

Hi Jini,

I had a look at your change. 
It makes tests fail if shouldSAAttach returns false.

Now, these tests say "Error: cannot attach",
while before they would terminate silently.

It is not an Error if the SA can not attach. 

You can reproduce this by just changing
Platform.shouldSAAttach() to always return false.

I'll run the patch through our nightly tests to 
see whether they pass on Mac.

Best regards,
  Goetz.

> -----Original Message-----
> From: serviceability-dev [mailto:serviceability-dev-
> bounces at openjdk.java.net] On Behalf Of Jini George
> Sent: Montag, 9. Juli 2018 20:45
> To: serviceability-dev at openjdk.java.net
> Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
> 
> Requesting reviews for enabling SA tests on OS X for Mach5.
> 
> https://bugs.openjdk.java.net/browse/JDK-8199700
> 
> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/
> 
> The changes are mostly to include the addition of sudo privileges to the
> SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests
> (those using clhsdb) have been refactored to use ClhsdbLauncher for ease
> of maintenance. This also avoids checks for Platform.shouldSAAttach()
> for corefile related test cases. More details have been provided in JIRA.
> 
> Thanks,
> Jini.

From daniel.mitterdorfer at gmail.com  Thu Jul 12 13:35:39 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Thu, 12 Jul 2018 15:35:39 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
Message-ID: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>

Hi,

while working on a change in Elasticsearch, I discovered an interesting
situation related to the implementation of jmm_getMemoryUsage (see
[jdk-mem-usage]). In one of the test runs, a test failed with the following
exception:

java.lang.IllegalArgumentException: committed = 542113792 should be < max = 536870912
at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246)
[...]

This happened on macOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags
specified were -Xms512M -Xmx512M. So far this failure has occurred only once and I
have not been able to reproduce it yet.

The values reported in the exception message are:

* "max": 536870912 = 512MB (exactly)
* "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
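
The arithmetic and the failing check can be reproduced outside the VM with the public `java.lang.management.MemoryUsage` constructor, which rejects `committed > max` whenever `max` is defined. The sketch below is only an illustration; the class name and the `heapUsageOrNull` helper are made up and are not what Elasticsearch or the JDK does.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapUsageSketch {
    static final long MIB = 1024L * 1024L;

    // Mirrors the invariant from the exception message: committed must not
    // exceed max unless max is undefined (-1).
    static boolean violatesInvariant(long committed, long max) {
        return max != -1 && committed > max;
    }

    // Defensive read: returns null instead of propagating the exception.
    static MemoryUsage heapUsageOrNull() {
        try {
            return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        long max = 536870912L;
        long committed = 542113792L;
        System.out.println("max       = " + (max / MIB) + " MiB");               // 512
        System.out.println("committed = " + (committed / MIB) + " MiB");         // 517
        System.out.println("excess    = " + ((committed - max) / MIB) + " MiB"); // 5
        System.out.println("violates  = " + violatesInvariant(committed, max));  // true
        try {
            // Same values as in the report; the constructor throws.
            new MemoryUsage(0L, 0L, committed, max);
        } catch (IllegalArgumentException e) {
            System.out.println("constructor rejected the values: " + e.getMessage());
        }
        System.out.println("current heap usage: " + heapUsageOrNull());
    }
}
```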

As the value of "max" is exactly what we have specified with -Xmx, it seems to
indicate that the problem is the calculation of "committed". I do not
understand under which conditions this can happen, so I am posting this to the
mailing list in case anybody has ideas about what might cause this.

I plan to run further tests with JVM trace logging enabled
(-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be
precise) in the hope that this problem will occur again and I can provide logs
that help to debug / fix the problem.

Searching for that error message, there is [JDK-8020530] but that one is about
*non-heap* memory usage and has already been resolved a while ago. Several
sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate
that this problem happened indeed in the wild but what I find odd is that I
could not find a single ticket in the OpenJDK bug tracker or a discussion on a
JDK mailing list about this problem.

I'd be glad to get any pointers on what might cause this or requests for
additional info that I need to provide to help analyze this problem.

Thanks,
Daniel

[jdk-mem-usage]
http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728
[JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
[apache-ignite-workaround]
https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346
[netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733

From jcbeyler at google.com  Thu Jul 12 14:25:29 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 12 Jul 2018 07:25:29 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
In-Reply-To: <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>
References: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
 <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>
Message-ID: <CAF9BGBwv7PEOgHxeeddbZj3rtHRVypqfaLidj5bHREo_Z4JOZA@mail.gmail.com>

Thanks Serguei!

Anybody motivated to give this a review please?

Thanks!
Jc

On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> The fix looks good.
> I'll sponsor a push once it has been reviewed.
>
> Thanks,
> Serguei
>
>
> On 7/11/18 10:04, JC Beyler wrote:
>
> Hi all,
>
> Could someone review the small-ish webrev for the bug:
> https://bugs.openjdk.java.net/browse/JDK-8206960
>
> The webrev is here:
> http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/
>
> Basically, the tests were failing for two reasons:
>   - VMEventTest was failing because Graal does not support the
> DisableIntrinsic flag required by the test; I disabled running this test
> with Graal
>   - The other tests were failing because the BCI <-> source code line
> numbers are not always correct when using Graal via uncommon traps;
> therefore the tests now check whether Graal is being used and, if so, only
> check the method names. This allows us to still have tests working with
> Graal, albeit with coarser checks.
>
> This passes all the HeapMonitor tests
> with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI
> -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal"
>
> (Except the GCCMS one which is being fixed via the one-liner for
> JDK-8205643).
>
> Let me know what you think,
> Jc
>
>
>

-- 

Thanks,
Jc

From gary.adams at oracle.com  Thu Jul 12 14:53:38 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Thu, 12 Jul 2018 10:53:38 -0400
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <5B3E2FC7.1060303@oracle.com>
References: <5B3E2FC7.1060303@oracle.com>
Message-ID: <5B476B72.7060203@oracle.com>

I've attached the patch for JDK-8206007.
I'll need a sponsor to push the changes.

On 7/5/18, 10:48 AM, Gary Adams wrote:
> A simple test run using "exclude none" shows 625K methods are being 
> observed.
> The bulk of those methods were due to the last class accessed in the 
> test - VirtualMachineManager.
>
> It's not important that this particular call is used. The test is 
> simply demonstrating that
> filters work for other packages than java and javax.
>
> This proposed fix uses a simpler lookup for GregorianCalendar.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
>   Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 8206007.patch
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180712/69ffaffc/8206007.patch>

From jini.george at oracle.com  Thu Jul 12 16:32:31 2018
From: jini.george at oracle.com (Jini George)
Date: Thu, 12 Jul 2018 22:02:31 +0530
Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <5eb111d4ffd8427398a09c62a925e5d7@sap.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <5eb111d4ffd8427398a09c62a925e5d7@sap.com>
Message-ID: <e52da678-be09-ec3d-db58-d342093b84af@oracle.com>

Thanks, Goetz. The "Error: cannot attach" was put in deliberately so 
that we get to know that this is not getting tested. I can change this 
to retain the old behaviour of skipping the tests if we cannot attach.

Thanks,
Jini.

On 7/12/2018 3:41 PM, Lindenmaier, Goetz wrote:
> Hi Jini,
> 
> I had a look at your change.
> It makes tests fail if shouldSAAttach returns false.
> 
> Now, these tests say "Error: cannot attach",
> while before they would terminate silently.
> 
> It is not an Error if the SA can not attach.
> 
> You can reproduce this by just changing
> Platform.shouldSAAttach() to always return false.
> 
> I'll run the patch through our nightly tests to
> see whether they pass on Mac.
> 
> Best regards,
>    Goetz.
> 
>> -----Original Message-----
>> From: serviceability-dev [mailto:serviceability-dev-
>> bounces at openjdk.java.net] On Behalf Of Jini George
>> Sent: Montag, 9. Juli 2018 20:45
>> To: serviceability-dev at openjdk.java.net
>> Subject: RFR: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
>>
>> Requesting reviews for enabling SA tests on OS X for Mach5.
>>
>> https://bugs.openjdk.java.net/browse/JDK-8199700
>>
>> Webrev: http://cr.openjdk.java.net/~jgeorge/8199700/webrev.00/
>>
>> The changes are mostly to include the addition of sudo privileges to the
>> SA launchers for OSX if Platform.shouldSAAttach() fails. Some tests
>> (those using clhsdb) have been refactored to use ClhsdbLauncher for ease
>> of maintenance. This also avoids checks for Platform.shouldSAAttach()
>> for corefile related test cases. More details have been provided in JIRA.
>>
>> Thanks,
>> Jini.

From jini.george at oracle.com  Thu Jul 12 16:43:00 2018
From: jini.george at oracle.com (Jini George)
Date: Thu, 12 Jul 2018 22:13:00 +0530
Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <5eb111d4ffd8427398a09c62a925e5d7@sap.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <5eb111d4ffd8427398a09c62a925e5d7@sap.com>
Message-ID: <a9afadbd-a610-09ec-0424-9d11c4e4e209@oracle.com>


> 
> I'll run the patch through our nightly tests to
> see whether they pass on Mac.

Thanks for this. Let me know in case there are timeouts due to there not 
being a no-password entry for the user in the /etc/sudoers list.

Thanks,
Jini.

From alexey.menkov at oracle.com  Thu Jul 12 18:21:55 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 12 Jul 2018 11:21:55 -0700
Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign
 double[][][] to double[][][]
In-Reply-To: <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com>
References: <BC3FDEC3-43C1-46A5-A9B4-40309F4BC07C@oracle.com>
 <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com>
Message-ID: <9923479f-eb69-2d5b-3939-453c77c81aef@oracle.com>

+1

--alex

On 07/11/2018 22:26, serguei.spitsyn at oracle.com wrote:
> Hi Daniil,
> 
> It looks good.
> 
> Thanks,
> Serguei
> 
> 
> On 7/11/18 22:23, Daniil Titov wrote:
>> Please review the changes that fix jdb issue with evaluation of 
>> multidimensional arrays of primitives.
>>
>> The problem is that for N-dimensional arrays of primitives with N 
>> greater than 2, JDI fails to find the component type (which is an array 
>> of dimension N-1) because it assumes that it is a boot type.
>>
>> Thanks!
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8191948
>> Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01
>> Best regards,
>> Daniil
>>
>>
>>
> 

From alexey.menkov at oracle.com  Thu Jul 12 18:30:37 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 12 Jul 2018 11:30:37 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
In-Reply-To: <CAF9BGBwv7PEOgHxeeddbZj3rtHRVypqfaLidj5bHREo_Z4JOZA@mail.gmail.com>
References: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
 <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>
 <CAF9BGBwv7PEOgHxeeddbZj3rtHRVypqfaLidj5bHREo_Z4JOZA@mail.gmail.com>
Message-ID: <ec918693-996e-2017-94b1-add2d654f17c@oracle.com>

Looks good to me as well.

--alex

On 07/12/2018 07:25, JC Beyler wrote:
> Thanks Serguei!
> 
> Anybody motivated to give this a review please?
> 
> Thanks!
> Jc
> 
> On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com 
> <serguei.spitsyn at oracle.com> wrote:
> 
>     Hi Jc,
> 
>     The fix looks good.
>     I'll sponsor a push once it has been reviewed.
> 
>     Thanks,
>     Serguei
> 
> 
>     On 7/11/18 10:04, JC Beyler wrote:
>>     Hi all,
>>
>>     Could someone review the small-ish webrev for the bug:
>>     https://bugs.openjdk.java.net/browse/JDK-8206960
>>
>>     The webrev is here:
>>     http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/
>>     <http://cr.openjdk.java.net/%7Ejcbeyler/8206960/webrev.00/>
>>
>>     Basically, the tests were failing for two reasons:
>>       - VMEventTest was failing because Graal does not support
>>     DisableIntrinsic required by the test, I disabled testing the test
>>     with Graal in this case
>>       - The other tests were failing because the BCI <-> source code
>>     line numbers are not always correct when using Graal via uncommon
>>     traps; therefore the tests now check if Graal is being used and,
>>     if so, only checks the method names. This allows us to still have
>>     tests working with Graal, albeit a bit more coarse.
>>
>>     This passes all the HeapMonitor tests
>>     with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI
>>     -XX:+TieredCompilation -XX:+UseJVMCICompiler -Djvmci.Compiler=graal"
>>
>>     (Except the GCCMS one which is being fixed via the one-liner for
>>     JDK-8205643).
>>
>>     Let me know what you think,
>>     Jc
> 
> 
> 
> -- 
> 
> Thanks,
> Jc

From serguei.spitsyn at oracle.com  Thu Jul 12 18:33:00 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Jul 2018 11:33:00 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
In-Reply-To: <ec918693-996e-2017-94b1-add2d654f17c@oracle.com>
References: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
 <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>
 <CAF9BGBwv7PEOgHxeeddbZj3rtHRVypqfaLidj5bHREo_Z4JOZA@mail.gmail.com>
 <ec918693-996e-2017-94b1-add2d654f17c@oracle.com>
Message-ID: <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com>

Thanks, Alex!

Jc,

I'll push it if you send me a patch.

Thanks,
Serguei


On 7/12/18 11:30, Alex Menkov wrote:
> Looks good to me as well.
>
> --alex
>
> On 07/12/2018 07:25, JC Beyler wrote:
>> Thanks Serguei!
>>
>> Anybody motivated to give this a review please?
>>
>> Thanks!
>> Jc
>>
>> On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>     Hi Jc,
>>
>>     The fix looks good.
>>     I'll sponsor a push once it has been reviewed.
>>
>>     Thanks,
>>     Serguei
>>
>>
>>     On 7/11/18 10:04, JC Beyler wrote:
>>>     Hi all,
>>>
>>>     Could someone review the small-ish webrev for the bug:
>>>     https://bugs.openjdk.java.net/browse/JDK-8206960
>>>
>>>     The webrev is here:
>>>     http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/
>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8206960/webrev.00/>
>>>
>>>     Basically, the tests were failing for two reasons:
>>>       - VMEventTest was failing because Graal does not support
>>>     DisableIntrinsic required by the test, I disabled testing the test
>>>     with Graal in this case
>>>       - The other tests were failing because the BCI <-> source code
>>>     line numbers are not always correct when using Graal via uncommon
>>>     traps; therefore the tests now check if Graal is being used and,
>>>     if so, only checks the method names. This allows us to still have
>>>     tests working with Graal, albeit a bit more coarse.
>>>
>>>     This passes all the HeapMonitor tests
>>>     with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI
>>>     -XX:+TieredCompilation -XX:+UseJVMCICompiler 
>>> -Djvmci.Compiler=graal"
>>>
>>>     (Except the GCCMS one which is being fixed via the one-liner for
>>>     JDK-8205643).
>>>
>>>     Let me know what you think,
>>>     Jc
>>
>>
>>
>> -- 
>>
>> Thanks,
>> Jc


From jcbeyler at google.com  Thu Jul 12 19:02:21 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 12 Jul 2018 12:02:21 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
In-Reply-To: <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com>
References: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
 <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>
 <CAF9BGBwv7PEOgHxeeddbZj3rtHRVypqfaLidj5bHREo_Z4JOZA@mail.gmail.com>
 <ec918693-996e-2017-94b1-add2d654f17c@oracle.com>
 <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com>
Message-ID: <CAF9BGByAhKtD1PC5u5+xYR6QOY4vizFVSUfxJnJPGwiTqt+HmQ@mail.gmail.com>

Hi Serguei,

Here you are:
http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.01/

Thanks for the push!
Jc

On Thu, Jul 12, 2018 at 11:33 AM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Thanks, Alex!
>
> Jc,
>
> I'll push it if you send me a patch.
>
> Thanks,
> Serguei
>
>
> On 7/12/18 11:30, Alex Menkov wrote:
> > Looks good to me as well.
> >
> > --alex
> >
> > On 07/12/2018 07:25, JC Beyler wrote:
> >> Thanks Serguei!
> >>
> >> Anybody motivated to give this a review please?
> >>
> >> Thanks!
> >> Jc
> >>
> >> On Wed, Jul 11, 2018 at 2:42 PM serguei.spitsyn at oracle.com
> >> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
> >> <mailto:serguei.spitsyn at oracle.com>> wrote:
> >>
> >>     Hi Jc,
> >>
> >>     The fix looks good.
> >>     I'll sponsor a push once it has been reviewed.
> >>
> >>     Thanks,
> >>     Serguei
> >>
> >>
> >>     On 7/11/18 10:04, JC Beyler wrote:
> >>>     Hi all,
> >>>
> >>>     Could someone review the small-ish webrev for the bug:
> >>>     https://bugs.openjdk.java.net/browse/JDK-8206960
> >>>
> >>>     The webrev is here:
> >>>     http://cr.openjdk.java.net/~jcbeyler/8206960/webrev.00/
> >>> <http://cr.openjdk.java.net/%7Ejcbeyler/8206960/webrev.00/>
> >>>
> >>>     Basically, the tests were failing for two reasons:
> >>>       - VMEventTest was failing because Graal does not support
> >>>     DisableIntrinsic required by the test, I disabled testing the test
> >>>     with Graal in this case
> >>>       - The other tests were failing because the BCI <-> source code
> >>>     line numbers are not always correct when using Graal via uncommon
> >>>     traps; therefore the tests now check if Graal is being used and,
> >>>     if so, only checks the method names. This allows us to still have
> >>>     tests working with Graal, albeit a bit more coarse.
> >>>
> >>>     This passes all the HeapMonitor tests
> >>>     with -vmoptions:"-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCI
> >>>     -XX:+TieredCompilation -XX:+UseJVMCICompiler
> >>> -Djvmci.Compiler=graal"
> >>>
> >>>     (Except the GCCMS one which is being fixed via the one-liner for
> >>>     JDK-8205643).
> >>>
> >>>     Let me know what you think,
> >>>     Jc
> >>
> >>
> >>
> >> --
> >>
> >> Thanks,
> >> Jc
>
>

-- 

Thanks,
Jc
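
The quoted description says the tests now check whether Graal is in use. One way such a check could look is sketched below, assuming the `EnableJVMCI` and `UseJVMCICompiler` flags from the -vmoptions line above. The class `GraalCheck` and its methods are hypothetical names for illustration; the actual webrev may well do this differently.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class GraalCheck {
    // Returns true only if the running VM knows the flag and reports it as "true".
    static boolean isFlagEnabled(String flagName) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return false; // no HotSpot diagnostic bean available on this VM
            }
            return Boolean.parseBoolean(bean.getVMOption(flagName).getValue());
        } catch (IllegalArgumentException e) {
            return false; // the flag does not exist in this VM build
        }
    }

    // Assumption: a JVMCI compiler is active when both flags are enabled.
    static boolean usesJvmciCompiler() {
        return isFlagEnabled("EnableJVMCI") && isFlagEnabled("UseJVMCICompiler");
    }

    public static void main(String[] args) {
        System.out.println("JVMCI compiler in use: " + usesJvmciCompiler());
    }
}
```

A test could then compare only method names when `usesJvmciCompiler()` returns true and keep the stricter line-number comparison otherwise.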

From daniil.x.titov at oracle.com  Thu Jul 12 19:08:31 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 12 Jul 2018 12:08:31 -0700
Subject: RFR JDK-8191948 : jdb error: InvalidTypeException: Can't assign
 double[][][] to double[][][]
In-Reply-To: <9923479f-eb69-2d5b-3939-453c77c81aef@oracle.com>
References: <BC3FDEC3-43C1-46A5-A9B4-40309F4BC07C@oracle.com>
 <01fb43cf-3f19-e4e2-fae1-30c2a5665b44@oracle.com>
 <9923479f-eb69-2d5b-3939-453c77c81aef@oracle.com>
Message-ID: <309C0D5B-365F-4EAC-8D8C-A87A197640BD@oracle.com>

Thank you, Alex and Serguei for reviewing this change!


Best regards,
Daniil

On 7/12/18, 11:21 AM, "Alex Menkov" <alexey.menkov at oracle.com> wrote:

    +1
    
    --alex
    
    On 07/11/2018 22:26, serguei.spitsyn at oracle.com wrote:
    > Hi Daniil,
    > 
    > It looks good.
    > 
    > Thanks,
    > Serguei
    > 
    > 
    > On 7/11/18 22:23, Daniil Titov wrote:
    >> Please review the changes that fix jdb issue with evaluation of 
    >> multidimensional arrays of primitives.
    >>
    >> The problem here is that for N-dimensional arrays of the primitives 
    >> with N greater than 2, JDI fails to find its component type (which is 
    >> an array of dimension N-1) assuming that it is a boot type.
    >>
    >> Thanks!
    >> Issue: https://bugs.openjdk.java.net/browse/JDK-8191948
    >> Webrev: http://cr.openjdk.java.net/~dtitov/8191948/webrev.01
    >> Best regards,
    >> Daniil
    >>
    >>
    >>
    > 
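
To make the component-type relationship from the quoted description concrete, here is a small sketch that uses plain Java reflection rather than JDI. The class `ArrayComponentDemo` and its helpers are made up for illustration and are not part of the fix under review.

```java
final class ArrayComponentDemo {
    // Hypothetical helper: strip one leading '[' from a JVM array signature.
    static String componentSignature(String arraySignature) {
        if (!arraySignature.startsWith("[")) {
            throw new IllegalArgumentException("not an array signature: " + arraySignature);
        }
        return arraySignature.substring(1);
    }

    // A signature names a primitive only when it is a single character such as "D".
    static boolean isPrimitiveSignature(String signature) {
        return signature.length() == 1;
    }

    public static void main(String[] args) {
        // The component type of double[][][] is the array type double[][], not double.
        Class<?> component = double[][][].class.getComponentType();
        System.out.println(component == double[][].class);

        String sig = double[][][].class.getName(); // JVM signature of the array class
        System.out.println(sig + " -> " + componentSignature(sig));
        System.out.println(isPrimitiveSignature(componentSignature(sig)));
    }
}
```

Only a one-dimensional array such as `double[]` has a primitive component; for N greater than 1 the component is itself an array type and has to be looked up as one.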
    

From serguei.spitsyn at oracle.com  Thu Jul 12 19:40:10 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Jul 2018 12:40:10 -0700
Subject: RFR (S) 8206960: [Graal]
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor tests fail
In-Reply-To: <CAF9BGByAhKtD1PC5u5+xYR6QOY4vizFVSUfxJnJPGwiTqt+HmQ@mail.gmail.com>
References: <CAF9BGBx8FSLMaj47PrOyk-zPQpZuCh-+hqpPZT6H0GazE9bqsQ@mail.gmail.com>
 <1cb6163e-6bd5-2bcf-7baf-6168ee745fda@oracle.com>
 <CAF9BGBwv7PEOgHxeeddbZj3rtHRVypqfaLidj5bHREo_Z4JOZA@mail.gmail.com>
 <ec918693-996e-2017-94b1-add2d654f17c@oracle.com>
 <79de22b1-f7c6-9406-53fa-9a1614029097@oracle.com>
 <CAF9BGByAhKtD1PC5u5+xYR6QOY4vizFVSUfxJnJPGwiTqt+HmQ@mail.gmail.com>
Message-ID: <bafa669f-5f06-dbff-d1e4-42fd19f4d9f3@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180712/9df46a6a/attachment.html>

From mandy.chung at oracle.com  Thu Jul 12 20:34:52 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Thu, 12 Jul 2018 13:34:52 -0700
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
Message-ID: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>

It's indeed strange that no one has reported this issue.  I created:
    https://bugs.openjdk.java.net/browse/JDK-8207200

Mandy

On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote:
> Hi,
> 
> while working on a change in Elasticsearch, I discovered an interesting
> situation related to the implementation of jmm_getMemoryUsage (see
> [jdk-mem-usage]). In one of the test runs, a test failed with the following
> exception:
> 
> java.lang.IllegalArgumentException: committed = 542113792 should be <
> max = 536870912
> at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
> at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
> at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
> at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246)
> [...]
> 
> This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags
> specified were -Xms512M -Xmx512M. So far this failure occurred only once and I
> could not reproduce it yet.
> 
> The values reported in the exception message are:
> 
> * "max": 536870912 = 512MB (exactly)
> * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
> 
> As the value of "max" is exactly what we have specified with -Xmx, it seems to
> indicate that the problem is the calculation of "committed". I do not
> understand under which conditions this can happen, so I am posting this to the
> mailing list in case anybody has ideas about what might cause this.
> 
> I plan to run further tests with JVM trace logging enabled
> (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be
> precise) in the hope that this problem will occur again and I can provide logs
> that help to debug / fix the problem.
> 
> Searching for that error message, there is [JDK-8020530] but that one is about
> *non-heap* memory usage and has already been resolved a while ago. Several
> sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate
> that this problem happened indeed in the wild but what I find odd is that I
> could not find a single ticket in the OpenJDK bug tracker or a discussion on a
> JDK mailing list about this problem.
> 
> I'd be glad to get any pointers on what might cause this or requests for
> additional info that I need to provide to help analyze this problem.
> 
> Thanks,
> Daniel
> 
> [jdk-mem-usage]
> http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728
> [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
> [apache-ignite-workaround]
> https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346
> [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
> 
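
The numbers in the report can be reproduced against the public `java.lang.management.MemoryUsage` constructor, which is where the exception is thrown. The sketch below also shows a defensive read of the kind an application might use until the bug is fixed; `HeapUsageProbe` and the "unknown" fallback values are assumptions for illustration, not what Elasticsearch or Apache Ignite actually do.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapUsageProbe {
    static final long MIB = 1024L * 1024L;

    // Defensive read: fall back to an "unknown" snapshot if the consistency check fails.
    static MemoryUsage safeHeapUsage() {
        try {
            return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        } catch (IllegalArgumentException e) {
            // init and max of -1 mean "undefined"; used and committed are set to 0.
            return new MemoryUsage(-1, 0, 0, -1);
        }
    }

    // True when the MemoryUsage constructor rejects the committed/max pair.
    static boolean rejects(long committed, long max) {
        try {
            new MemoryUsage(-1, 0, committed, max);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        long committed = 542113792L; // value from the exception message
        long max = 536870912L;       // 512 MiB, as set with -Xmx512M
        System.out.println((committed - max) / MIB + " MiB over max");
        System.out.println("rejected by MemoryUsage: " + rejects(committed, max));
        System.out.println(safeHeapUsage());
    }
}
```

Swallowing the exception hides the underlying accounting problem, so a fallback like this is only a stopgap on the application side.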

From jcbeyler at google.com  Thu Jul 12 20:45:03 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 12 Jul 2018 13:45:03 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
Message-ID: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>

Hi all,

Could I get a review of an update to the JVMTI Spec for Heap Sampling:
http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/

The associated bug is here: https://bugs.openjdk.java.net/browse/JDK-8205725
The associated CSR is here: https://bugs.openjdk.java.net/browse/JDK-8206940

The basic reasoning of this webrev/bug/CSR is:
- "rate" is not the right word and should be renamed to "interval"; this is
what prompts the changes in the code/tests/API naming.
- the spec does not mention that a new sampling interval takes time to come
into effect (you have to wait for a TLAB to be refilled); this adds that
clarification so that the user is not surprised
- the spec explicitly says that the sampling is done via a geometric
variable which averages to the sampling interval; it was asked to relax
this, so the spec should just say that the sampling is pseudo-random and
that the interval will average out to what the user requested.
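
As an illustration of "pseudo-random with an average equal to the requested interval", the sketch below draws the distance to the next sample from an exponential distribution, the continuous counterpart of a geometric variable. This is not HotSpot's implementation; `SamplingIntervalSketch`, the seed, and the 512 KiB interval are assumptions chosen for the example.

```java
import java.util.Random;

final class SamplingIntervalSketch {
    // Bytes to allocate before the next sample, exponentially distributed around the mean.
    static long nextSampleDistance(Random random, long meanIntervalBytes) {
        double u = 1.0 - random.nextDouble(); // in (0, 1], so log(u) is finite
        return (long) Math.ceil(-Math.log(u) * meanIntervalBytes);
    }

    // Average of many draws; should land close to the requested interval.
    static double averageDistance(long seed, long meanIntervalBytes, int draws) {
        Random random = new Random(seed);
        double sum = 0;
        for (int i = 0; i < draws; i++) {
            sum += nextSampleDistance(random, meanIntervalBytes);
        }
        return sum / draws;
    }

    public static void main(String[] args) {
        long interval = 512 * 1024; // requested sampling interval in bytes (assumed value)
        System.out.println("requested: " + interval);
        System.out.println("observed average: " + averageDistance(42L, interval, 200_000));
    }
}
```

Individual distances vary widely, which is why the spec wording only promises the average.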

Thanks for all your help,
Jc

From serguei.spitsyn at oracle.com  Thu Jul 12 21:27:10 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 12 Jul 2018 14:27:10 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
Message-ID: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180712/d7db6270/attachment-0001.html>

From chris.plummer at oracle.com  Thu Jul 12 22:48:34 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Jul 2018 15:48:34 -0700
Subject: RFR: JDK-8206007: nsk/jdb/exclude001 test is taking a long time
 on some builds
In-Reply-To: <5B476B72.7060203@oracle.com>
References: <5B3E2FC7.1060303@oracle.com> <5B476B72.7060203@oracle.com>
Message-ID: <75c06bd1-a405-528b-25d9-307ca78d60c3@oracle.com>

I'll take care of it shortly.

Chris

On 7/12/18 7:53 AM, Gary Adams wrote:
> I've attached the patch for JDK-8206007.
> I'll need a sponsor to push the changes.
>
> On 7/5/18, 10:48 AM, Gary Adams wrote:
>> A simple test run using "exclude none" shows 625K methods are being 
>> observed.
>> The bulk of those methods were due to the last class accessed in the 
>> test - VirtualMachineManager.
>>
>> It's not important that this particular call is used. The test is 
>> simply demonstrating that
>> filters work for other packages than java and javax.
>>
>> This proposed fix uses a simpler lookup for GregorianCalendar.
>>
>> ? Issue: https://bugs.openjdk.java.net/browse/JDK-8206007
>> ? Webrev: http://cr.openjdk.java.net/~gadams/8206007/webrev.00/
>


From chris.plummer at oracle.com  Thu Jul 12 22:58:36 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 12 Jul 2018 15:58:36 -0700
Subject: RFR: JDK-8201513: nsk/jvmti/IterateThroughHeap/filter-* are broken
In-Reply-To: <d6819adf-08b0-213a-5def-e999d4c46b7f@oracle.com>
References: <cd61a65d-4191-39a5-ddeb-648c614aaab6@oracle.com>
 <d6819adf-08b0-213a-5def-e999d4c46b7f@oracle.com>
Message-ID: <efb286f7-097a-948d-e50f-4f498aeeca31@oracle.com>

+1

On 7/11/18 2:26 PM, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> The fix looks good.
> Thank you for fixing the typos!
>
> Thanks,
> Serguei
>
>
> On 7/11/18 11:39, Alex Menkov wrote:
>> Hi all,
>>
>> please review a fix for
>> https://bugs.openjdk.java.net/browse/JDK-8201513
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/IterateThroughHeap/webrev/
>>
>> summary:
>> The tests had an error which was fixed during open-sourcing.
>> After that the tests started to fail. Root cause of the failures is 
>> wrong verification (positive results are interpreted as negative)
>>
>> --alex
>


From mikael.vidstedt at oracle.com  Fri Jul 13 00:21:00 2018
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Thu, 12 Jul 2018 17:21:00 -0700
Subject: RFR(XS): 8207217: Problem list
 java/lang/management/ThreadMXBean/AllThreadIds.java
Message-ID: <BA264B02-A4D4-402E-BBA5-48AAB4EFE0E0@oracle.com>


Please review this change which problem lists the frequently failing java/lang/management/ThreadMXBean/AllThreadIds.java test until the issue[1] has been fixed:

Bug: https://bugs.openjdk.java.net/browse/JDK-8207217 <https://bugs.openjdk.java.net/browse/JDK-8207217>
webrev: http://cr.openjdk.java.net/~mikael/webrevs/8207217/webrev.00/open/webrev/ <http://cr.openjdk.java.net/~mikael/webrevs/8207217/webrev.00/open/webrev/>

Cheers,
Mikael

[1] https://bugs.openjdk.java.net/browse/JDK-8131745 <https://bugs.openjdk.java.net/browse/JDK-8131745>


From david.holmes at oracle.com  Fri Jul 13 00:29:31 2018
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 13 Jul 2018 10:29:31 +1000
Subject: RFR(XS): 8207217: Problem list
 java/lang/management/ThreadMXBean/AllThreadIds.java
In-Reply-To: <BA264B02-A4D4-402E-BBA5-48AAB4EFE0E0@oracle.com>
References: <BA264B02-A4D4-402E-BBA5-48AAB4EFE0E0@oracle.com>
Message-ID: <1684f644-bf38-47a7-2725-e4d3d700a573@oracle.com>

Ship it!

Thanks,
David

On 13/07/2018 10:21 AM, Mikael Vidstedt wrote:
> 
> Please review this change which problem lists the frequently failing 
> java/lang/management/ThreadMXBean/AllThreadIds.java test until the 
> issue[1] has been fixed:
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8207217
> webrev: 
> http://cr.openjdk.java.net/~mikael/webrevs/8207217/webrev.00/open/webrev/ <http://cr.openjdk.java.net/%7Emikael/webrevs/8207217/webrev.00/open/webrev/>
> 
> Cheers,
> Mikael
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8131745
> 

From goetz.lindenmaier at sap.com  Fri Jul 13 05:55:12 2018
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Fri, 13 Jul 2018 05:55:12 +0000
Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <a9afadbd-a610-09ec-0424-9d11c4e4e209@oracle.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <5eb111d4ffd8427398a09c62a925e5d7@sap.com>
 <a9afadbd-a610-09ec-0424-9d11c4e4e209@oracle.com>
Message-ID: <1156b70a17d44226a0510713e1975451@sap.com>

Hi Jini, 

A whole bunch of tests failed on Mac.  I'll send you
a jtr file off list, to avoid spamming the list.
See below for the core message.

The tests passed on linuxppc64le, linuxx86_64 and solaris_sparc, the other
tests are still pending.

Best regards,
  Goetz.

----------System.err:(32/1923)----------
Command line: ['/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/sapjvm_12/bin/java' '-Xcomp' '-cp' '/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/serviceability/sa/ClhsdbFindPC.d:/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/test/lib' 'jdk.test.lib.apps.LingeredApp' '78a4a198-8a55-4684-ac1e-2d28311a0952.lck' ]
sudo: no tty present and no askpass program specified
 stdout: [];
 stderr: []
 exitValue = 1

 LingeredApp stdout: [];
 LingeredApp stderr: []
 LingeredApp exitValue = 0
java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]

	at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:95)
	at ClhsdbFindPC.main(ClhsdbFindPC.java:103)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.RuntimeException: Expected to get exit value of [0]

	at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:396)
	at ClhsdbLauncher.runCmd(ClhsdbLauncher.java:128)
	at ClhsdbLauncher.run(ClhsdbLauncher.java:176)
	at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:58)
	... 7 more

JavaTest Message: Test threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]

JavaTest Message: shutting down test

STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]

> -----Original Message-----
> From: Jini George <jini.george at oracle.com>
> Sent: Thursday, July 12, 2018 6:43 PM
> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; serviceability-
> dev at openjdk.java.net
> Subject: Re: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
> 
> 
> >
> > I'll run the patch through our nightly tests to
> > see whether they pass on Mac.
> 
> Thanks for this. Let me know in case there are timeouts due to there not
> being a no-password entry for the user in the /etc/sudoers list.
> 
> Thanks,
> Jini.

From jini.george at oracle.com  Fri Jul 13 06:21:06 2018
From: jini.george at oracle.com (Jini George)
Date: Fri, 13 Jul 2018 11:51:06 +0530
Subject: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
In-Reply-To: <1156b70a17d44226a0510713e1975451@sap.com>
References: <63bc303e-82e9-678b-1da5-67f656307001@oracle.com>
 <5eb111d4ffd8427398a09c62a925e5d7@sap.com>
 <a9afadbd-a610-09ec-0424-9d11c4e4e209@oracle.com>
 <1156b70a17d44226a0510713e1975451@sap.com>
Message-ID: <a3ea8d62-db69-ea19-b548-3f7e319e3033@oracle.com>

Thanks a bunch, Goetz. As David feared, the tests are failing due to 
there not being a no-password entry for the user in the /etc/sudoers 
list ("sudo: no tty present and no askpass program specified"). Let me 
see what I can do about this.

Thanks,
Jini.
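
One possible way for a test to detect this situation up front, rather than fail, is to probe for non-interactive sudo before launching the debugger. The sketch below relies on `sudo -n`, which makes sudo exit with an error instead of prompting for a password; the class `SudoCheck` is hypothetical and this is not necessarily how the tests will end up handling it.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

final class SudoCheck {
    // True if "sudo -n true" succeeds, i.e. sudo works without asking for a password.
    static boolean canSudoWithoutPassword() {
        ProcessBuilder pb = new ProcessBuilder("sudo", "-n", "true");
        pb.redirectErrorStream(true);
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD); // requires Java 9 or later
        try {
            Process process = pb.start();
            if (!process.waitFor(10, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // treat a hang as "not available"
                return false;
            }
            return process.exitValue() == 0;
        } catch (IOException e) {
            return false; // sudo is not installed or could not be started
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("passwordless sudo available: " + canSudoWithoutPassword());
    }
}
```

A test could skip itself when the probe returns false instead of reporting a failure.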

On 7/13/2018 11:25 AM, Lindenmaier, Goetz wrote:
> Hi Jini,
> 
> A whole bunch of tests failed on Mac.  I'll send you
> a jtr file off list, to avoid spamming the list.
> See below for the core message.
> 
> The tests passed on linuxppc64le, linuxx86_64 and solaris_sparc, the other
> tests are still pending.
> 
> Best regards,
>    Goetz.
> 
> ----------System.err:(32/1923)----------
> Command line: ['/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/sapjvm_12/bin/java' '-Xcomp' '-cp' '/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/serviceability/sa/ClhsdbFindPC.d:/priv/jvmtests/output_sapjvm12_o_jdk-test_dbgU_darwinintel64/jtreg_hotspot_work/JTwork/classes/test/lib' 'jdk.test.lib.apps.LingeredApp' '78a4a198-8a55-4684-ac1e-2d28311a0952.lck' ]
> sudo: no tty present and no askpass program specified
>   stdout: [];
>   stderr: []
>   exitValue = 1
> 
>   LingeredApp stdout: [];
>   LingeredApp stderr: []
>   LingeredApp exitValue = 0
> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]
> 
> 	at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:95)
> 	at ClhsdbFindPC.main(ClhsdbFindPC.java:103)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.lang.RuntimeException: Expected to get exit value of [0]
> 
> 	at jdk.test.lib.process.OutputAnalyzer.shouldHaveExitValue(OutputAnalyzer.java:396)
> 	at ClhsdbLauncher.runCmd(ClhsdbLauncher.java:128)
> 	at ClhsdbLauncher.run(ClhsdbLauncher.java:176)
> 	at ClhsdbFindPC.testFindPC(ClhsdbFindPC.java:58)
> 	... 7 more
> 
> JavaTest Message: Test threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]
> 
> JavaTest Message: shutting down test
> 
> STATUS:Failed.`main' threw exception: java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: Expected to get exit value of [0]
> 
>> -----Original Message-----
>> From: Jini George <jini.george at oracle.com>
>> Sent: Thursday, July 12, 2018 6:43 PM
>> To: Lindenmaier, Goetz <goetz.lindenmaier at sap.com>; serviceability-
>> dev at openjdk.java.net
>> Subject: Re: JDK-8199700: SA: Enable jhsdb jtreg tests for Mac OS X
>>
>>
>>>
>>> I'll run the patch through our nightly tests to
>>> see whether they pass on Mac.
>>
>> Thanks for this. Let me know in case there are timeouts due to there not
>> being a no-password entry for the user in the /etc/sudoers list.
>>
>> Thanks,
>> Jini.

From daniel.mitterdorfer at gmail.com  Fri Jul 13 08:04:47 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Fri, 13 Jul 2018 10:04:47 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
Message-ID: <CAJmnTFhQjEL1rMmqfODDTcmcgpwQQ+TBDipn4HOfN5jXg3GdRA@mail.gmail.com>

Hi Mandy,

thank you for creating the issue. One note: I spotted this in JDK 10
(build 10.0.1+10) but in the ticket it says it affects version 8.

Daniel

On Fri, 13 Jul 2018 at 04:15, mandy chung <mandy.chung at oracle.com> wrote:
>
> It's indeed strange that no one has reported this issue.  I created:
>     https://bugs.openjdk.java.net/browse/JDK-8207200
>
> Mandy
>
> On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote:
> > Hi,
> >
> > while working on a change in Elasticsearch, I discovered an interesting
> > situation related to the implementation of jmm_getMemoryUsage (see
> > [jdk-mem-usage]). In one of the test runs, a test failed with the following
> > exception:
> >
> > java.lang.IllegalArgumentException: committed = 542113792 should be <
> > max = 536870912
> > at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
> > at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
> > at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
> > at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246)
> > [...]
> >
> > This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags
> > specified were -Xms512M -Xmx512M. So far this failure occurred only once and I
> > could not reproduce it yet.
> >
> > The values reported in the exception message are:
> >
> > * "max": 536870912 = 512MB (exactly)
> > * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
> >
> > As the value of "max" is exactly what we have specified with -Xmx, it seems to
> > indicate that the problem is the calculation of "committed". I do not
> > understand under which conditions this can happen, so I am posting this to the
> > mailing list in case anybody has ideas about what might cause this.
> >
> > I plan to run further tests with JVM trace logging enabled
> > (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be
> > precise) in the hope that this problem will occur again and I can provide logs
> > that help to debug / fix the problem.
> >
> > Searching for that error message, there is [JDK-8020530] but that one is about
> > *non-heap* memory usage and has already been resolved a while ago. Several
> > sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate
> > that this problem happened indeed in the wild but what I find odd is that I
> > could not find a single ticket in the OpenJDK bug tracker or a discussion on a
> > JDK mailing list about this problem.
> >
> > I'd be glad to get any pointers on what might cause this or requests for
> > additional info that I need to provide to help analyze this problem.
> >
> > Thanks,
> > Daniel
> >
> > [jdk-mem-usage]
> > http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728
> > [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
> > [apache-ignite-workaround]
> > https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346
> > [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
> >

From Alan.Bateman at oracle.com  Fri Jul 13 08:16:27 2018
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 13 Jul 2018 09:16:27 +0100
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <CAJmnTFhQjEL1rMmqfODDTcmcgpwQQ+TBDipn4HOfN5jXg3GdRA@mail.gmail.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
 <CAJmnTFhQjEL1rMmqfODDTcmcgpwQQ+TBDipn4HOfN5jXg3GdRA@mail.gmail.com>
Message-ID: <9ff83f20-3c71-80cd-3d3a-5b65910d0266@oracle.com>


On 13/07/2018 09:04, Daniel Mitterdorfer wrote:
> Hi Mandy,
>
> thank you for creating the issue. One note: I spotted this in JDK 10
> (build 10.0.1+10) but in the ticket it says it affects version 8.
>
A bug with "Affects Version" N is assumed to be applicable to all releases
later than N unless tagged otherwise. So "10" could be added to the list of
versions where the issue was spotted or confirmed if needed.

-Alan

From erik.helin at oracle.com  Fri Jul 13 08:18:19 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 13 Jul 2018 10:18:19 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
Message-ID: <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>

Hi Daniel,

thanks for letting us know. Since you have only set -Xms512M and -Xmx512M 
and you are running on JDK 10, that means you are using the G1 garbage 
collector, so all the calls to pool->get_memory_usage() in the loop will 
end up in g1MemoryPool.cpp [0] which in turn will return cached values 
from the recalculate_sizes code in G1MonitoringSupport [1]. Since you 
are running with -Xmx512m you should have gotten 1 MB sized regions (see 
heapRegion.cpp for details [2]), so the 5 MB _could_ mean that five 
regions were accounted wrongly.
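
The region arithmetic above can be checked with a small Java sketch. The class and method names are hypothetical, and the sizing heuristic (aim for about 2048 regions, round down to a power of two, clamp between 1 MiB and 32 MiB) is an assumption about what heapRegion.cpp does, not a copy of it.

```java
final class RegionMath {
    static final long MIB = 1024L * 1024L;

    // Assumed heuristic, not taken verbatim from heapRegion.cpp: target about
    // 2048 regions, round down to a power of two, clamp to [1 MiB, 32 MiB].
    static long assumedRegionSize(long initialHeap, long maxHeap) {
        long average = (initialHeap + maxHeap) / 2;
        long size = Math.max(average / 2048, MIB);
        size = Long.highestOneBit(size); // round down to a power of two
        return Math.min(size, 32 * MIB);
    }

    // Number of whole regions by which "committed" exceeds "max".
    static long excessRegions(long committed, long max, long regionSize) {
        return (committed - max) / regionSize;
    }

    public static void main(String[] args) {
        long max = 512 * MIB;          // -Xmx512M
        long committed = 542113792L;   // value from the exception message
        long region = assumedRegionSize(max, max);
        System.out.println(region / MIB + " MiB regions, excess = "
                + excessRegions(committed, max, region));
    }
}
```

Under these assumptions the reported excess corresponds to a whole number of regions, which is consistent with a region-accounting problem rather than a rounding issue.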

Do you have any kind of GC logging from the test run where you encountered 
the bug?

The code in G1MonitoringSupport::recalculate_sizes seems messy enough 
that there could be a small bug in there. I'm adding hotspot-gc-dev 
since all GC developers might not read serviceability-dev.

Thanks,
Erik

[0]: 
http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/g1MemoryPool.cpp
[1]: 
http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/g1MonitoringSupport.cpp#l182
[2]: 
http://hg.openjdk.java.net/jdk/jdk/file/tip/src/hotspot/share/gc/g1/heapRegion.cpp#l63

On 07/12/2018 03:35 PM, Daniel Mitterdorfer wrote:
> Hi,
> 
> while working on a change in Elasticsearch, I discovered an interesting
> situation related to the implementation of jmm_getMemoryUsage (see
> [jdk-mem-usage]). In one of the test runs, a test failed with the following
> exception:
> 
> java.lang.IllegalArgumentException: committed = 542113792 should be <
> max = 536870912
> at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
> at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
> at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
> at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246)
> [...]
> 
> This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only JVM flags
> specified were -Xms512M -Xmx512M. So far this failure occurred only once and I
> could not reproduce it yet.
> 
> The values reported in the exception message are:
> 
> * "max": 536870912 = 512MB (exactly)
> * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
> 
> As the value of "max" is exactly what we have specified with -Xmx it seems to
> indicate that the problem is the calculation of "committed". I do not
> understand under which conditions this can happen thus I post this to the
> mailing list in case anybody has ideas what might cause this.
> 
> I plan to run further tests with JVM trace logging enabled
> (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags to be
> precise) in the hope that this problem will occur again and I can provide logs
> that help to debug / fix the problem.
> 
> Searching for that error message, there is [JDK-8020530] but that one is about
> *non-heap* memory usage and has already been resolved a while ago. Several
> sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to indicate
> that this problem happened indeed in the wild but what I find odd is that I
> could not find a single ticket in the OpenJDK bug tracker or a discussion on a
> JDK mailing list about this problem.
> 
> I'd be glad to get any pointers on what might cause this or requests for
> additional info that I need to provide to help analyze this problem.
> 
> Thanks,
> Daniel
> 
> [jdk-mem-usage]
> http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728
> [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
> [apache-ignite-workaround]
> https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346
> [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
> 

From daniel.mitterdorfer at gmail.com  Fri Jul 13 08:26:50 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Fri, 13 Jul 2018 10:26:50 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <9ff83f20-3c71-80cd-3d3a-5b65910d0266@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
 <CAJmnTFhQjEL1rMmqfODDTcmcgpwQQ+TBDipn4HOfN5jXg3GdRA@mail.gmail.com>
 <9ff83f20-3c71-80cd-3d3a-5b65910d0266@oracle.com>
Message-ID: <CAJmnTFihrQy=t-UrX7ndAKpwR9r8_sMKKEMBMS5a3=miLj4L_A@mail.gmail.com>

Hi Alan,

understood. Thanks for clarifying.

Daniel
On Fri, 13 Jul 2018 at 10:15, Alan Bateman
<Alan.Bateman at oracle.com> wrote:
>
>
>
> On 13/07/2018 09:04, Daniel Mitterdorfer wrote:
> > Hi Mandy,
> >
> > thank you for creating the issue. One note: I spotted this in JDK 10
> > (build 10.0.1+10) but in the ticket it says it affects version 8.
> >
> A bug with affects version N is assumed to be applicable to all releases > N
> unless tagged otherwise. So "10" could be added to the list of versions
> where the issue was spotted or confirmed if needed.
>
> -Alan

From daniel.mitterdorfer at gmail.com  Fri Jul 13 08:30:17 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Fri, 13 Jul 2018 10:30:17 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>
Message-ID: <CAJmnTFi23XsD5zzZT6b+YSPZevyAzPSbcoLNAZ+J5szj7ZQjNA@mail.gmail.com>

Hi Erik,
>
> Do you have any kind of GC logging from the test run where you encountered
> the bug?

Unfortunately, we don't have GC logging enabled by default in our test
suite so the exception trace is all I got. I am now repeatedly running
the test suite with the original flags (-Xms512M -Xmx512M) and also
added the following logging configuration:

-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags

As soon as I get another failure, I'll provide the full log file.
Please let me know if you need any other logs (i.e. whether I should
adjust my log configuration).

Daniel

From thomas.schatzl at oracle.com  Fri Jul 13 08:33:33 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 13 Jul 2018 10:33:33 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <CAJmnTFi23XsD5zzZT6b+YSPZevyAzPSbcoLNAZ+J5szj7ZQjNA@mail.gmail.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>
 <CAJmnTFi23XsD5zzZT6b+YSPZevyAzPSbcoLNAZ+J5szj7ZQjNA@mail.gmail.com>
Message-ID: <b12eaec322cc513348a9155fbcdf2b916a43559b.camel@oracle.com>

On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote:
> Hi Erik,
> > 
> > Do you have any kind of GC logging from the test run where you
> > encountered the bug?
> 
> Unfortunately, we don't have GC logging enabled by default in our
> test suite so the exception trace is all I got. I am now repeatedly
> running the test suite with the original flags (-Xms512M -Xmx512M)
> and also added the following logging configuration:
> 
> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> 
> As soon as I get another failure, I'll provide the full log file.
> Please let me know if you need any other logs (i.e. whether I should
> adjust my log configuration).

  I think these flags are fine.

Since Erik and I strongly believe the issue is with the relevant G1
code Erik mentioned we will reassign the bug to us (he said there is
already a bug reported on it).

Thanks a lot,
  Thomas


From erik.helin at oracle.com  Fri Jul 13 08:34:45 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Fri, 13 Jul 2018 10:34:45 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
Message-ID: <a0ef450d-96e5-0507-aaa5-6b7dd265283b@oracle.com>

On 07/12/2018 10:34 PM, mandy chung wrote:
> It's indeed strange that no one reports this issue. I created:
>    https://bugs.openjdk.java.net/browse/JDK-8207200

Mandy: I moved the bug over to hotspot/gc, this is much more likely to 
be a problem with how the GC calculates the sizes. I don't think there 
is a bug in the serviceability layer, the JNI getMemoryUsage function 
only summarizes the data it gets from the GC.

Thanks for creating the bug, we will follow up with Daniel.
Erik

> Mandy
> 
> On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote:
>> Hi,
>>
>> while working on a change in Elasticsearch, I discovered an interesting
>> situation related to the implementation of jmm_getMemoryUsage (see
>> [jdk-mem-usage]). In one of the test runs, a test failed with the 
>> following
>> exception:
>>
>> java.lang.IllegalArgumentException: committed = 542113792 should be <
>> max = 536870912
>> at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
>> at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
>> at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
>> at 
>> org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) 
>>
>> [...]
>>
>> This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The only 
>> JVM flags
>> specified were -Xms512M -Xmx512M. So far this failure occurred only 
>> once and I
>> could not reproduce it yet.
>>
>> The values reported in the exception message are:
>>
>> * "max": 536870912 = 512MB (exactly)
>> * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
>>
>> As the value of "max" is exactly what we have specified with -Xmx it 
>> seems to
>> indicate that the problem is the calculation of "committed". I do not
>> understand under which conditions this can happen thus I post this to the
>> mailing list in case anybody has ideas what might cause this.
>>
>> I plan to run further tests with JVM trace logging enabled
>> (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags 
>> to be
>> precise) in the hope that this problem will occur again and I can 
>> provide logs
>> that help to debug / fix the problem.
>>
>> Searching for that error message, there is [JDK-8020530] but that one 
>> is about
>> *non-heap* memory usage and has already been resolved a while ago. 
>> Several
>> sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to 
>> indicate
>> that this problem happened indeed in the wild but what I find odd is 
>> that I
>> could not find a single ticket in the OpenJDK bug tracker or a 
>> discussion on a
>> JDK mailing list about this problem.
>>
>> I'd be glad to get any pointers on what might cause this or requests for
>> additional info that I need to provide to help analyze this problem.
>>
>> Thanks,
>> Daniel
>>
>> [jdk-mem-usage]
>> http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 
>>
>> [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
>> [apache-ignite-workaround]
>> https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 
>>
>> [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
>>

From gary.adams at oracle.com  Fri Jul 13 11:29:31 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 13 Jul 2018 07:29:31 -0400
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout configured
Message-ID: <5B488D1B.3090808@oracle.com>

This is a simple update to set the jtreg timeout to match the
internal waittime already being used by these vmTestbase/nsk/jdb tests.

   Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
   Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/

From ralf.schmelter at sap.com  Fri Jul 13 13:22:54 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 13 Jul 2018 13:22:54 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
Message-ID: <26a3d03903494257ba2995d082ae1960@sap.com>

Hi Serguei,

Sorry for the late reply, but it seems the spam filter has removed your emails. I just saw them in the archives.

Regarding this code:

    if (length != count) {
        error = JVMTI_ERROR_INTERNAL;
    }

count is the number of frames filled into the array (it is set in the GetStackTrace JVMTI call) and length is the number of frames requested to be filled in. Both are independent of the start index at this point. Note that I've reused the count variable (it was first initialized to hold the number of frames on the stack). Maybe it is clearer to use a new variable in this call?

The packet will not be sent if an error code is set on the output stream (see outStream_sendReply()). This should cover all cases (in both the new and the old code).

Best regards,
Ralf

From daniel.mitterdorfer at gmail.com  Fri Jul 13 14:10:37 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Fri, 13 Jul 2018 16:10:37 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <b12eaec322cc513348a9155fbcdf2b916a43559b.camel@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>
 <CAJmnTFi23XsD5zzZT6b+YSPZevyAzPSbcoLNAZ+J5szj7ZQjNA@mail.gmail.com>
 <b12eaec322cc513348a9155fbcdf2b916a43559b.camel@oracle.com>
Message-ID: <CAJmnTFiQmyxU9HAawnSa9r75hkmAT9115C8F4Y0T_rAK=TXQXg@mail.gmail.com>

Hi,

I have good news. I was able to reproduce this issue but this time I
have logs. A test failed around 15:06:55 with the following stack
trace:

java.lang.IllegalArgumentException: committed = 537919488 should be <
max = 536870912
    at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
    at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
    at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)

This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10
(build 10+46). The JVM arguments were:

-Xms512M -Xmx512M
-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
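
For anyone hitting the same exception in the meantime, a caller can guard the call site. The following is a minimal Java sketch, similar in spirit to the Apache Ignite workaround cited earlier in the thread. The class name SafeHeapUsage and the method heapUsageOr are hypothetical, and the sketch only masks the symptom. It does not address the underlying accounting problem.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class SafeHeapUsage {
    // Returns the current heap usage, or the given fallback when the JVM
    // reports values that the MemoryUsage constructor rejects
    // (for example committed > max).
    static MemoryUsage heapUsageOr(MemoryUsage fallback) {
        try {
            return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // init and max of -1 mean "undefined"; used and committed start at 0.
        MemoryUsage last = new MemoryUsage(-1, 0, 0, -1);
        last = heapUsageOr(last);
        System.out.println("used=" + last.getUsed()
                + " committed=" + last.getCommitted());
    }
}
```

A caller would typically pass the last successfully read value as the fallback, so that a single inconsistent reading does not fail the whole operation.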

The logs are somewhat massive (~250MB uncompressed) and available at
https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0

I hope that helps identify the cause. Please let me know if you
need anything else.

Daniel
On Fri, 13 Jul 2018 at 10:33, Thomas Schatzl
<thomas.schatzl at oracle.com> wrote:
>
> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote:
> > Hi Erik,
> > >
> > > Do you have any kind of GC logging from the test run where you
> > > encountered the bug?
> >
> > Unfortunately, we don't have GC logging enabled by default in our
> > test suite so the exception trace is all I got. I am now repeatedly
> > running the test suite with the original flags (-Xms512M -Xmx512M)
> > and also added the following logging configuration:
> >
> > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> >
> > As soon as I get another failure, I'll provide the full log file.
> > Please let me know if you need any other logs (i.e. whether I should
> > adjust my log configuration).
>
>   I think these flags are fine.
>
> Since Erik and I strongly believe the issue is with the relevant G1
> code Erik mentioned we will reassign the bug to us (he said there is
> already a bug reported on it).
>
> Thanks a lot,
>   Thomas
>

From mandy.chung at oracle.com  Fri Jul 13 15:01:53 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Fri, 13 Jul 2018 08:01:53 -0700
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <a0ef450d-96e5-0507-aaa5-6b7dd265283b@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <3c85d3a1-ed79-8e6a-0839-92214f1b85c7@oracle.com>
 <a0ef450d-96e5-0507-aaa5-6b7dd265283b@oracle.com>
Message-ID: <2160e7d8-598d-6efa-6786-2c4616a1cce1@oracle.com>

Great! Thanks Erik.

Mandy

On 7/13/18 1:34 AM, Erik Helin wrote:
> On 07/12/2018 10:34 PM, mandy chung wrote:
>> It's indeed strange that no one reports this issue. I created:
>>    https://bugs.openjdk.java.net/browse/JDK-8207200
> 
> Mandy: I moved the bug over to hotspot/gc, this is much more likely to 
> be a problem with how the GC calculates the sizes. I don't think there 
> is a bug in the serviceability layer, the JNI getMemoryUsage function 
> only summarizes the data it gets from the GC.
> 
> Thanks for creating the bug, we will follow up with Daniel.
> Erik
> 
>> Mandy
>>
>> On 7/12/18 6:35 AM, Daniel Mitterdorfer wrote:
>>> Hi,
>>>
>>> while working on a change in Elasticsearch, I discovered an interesting
>>> situation related to the implementation of jmm_getMemoryUsage (see
>>> [jdk-mem-usage]). In one of the test runs, a test failed with the 
>>> following
>>> exception:
>>>
>>> java.lang.IllegalArgumentException: committed = 542113792 should be <
>>> max = 536870912
>>> at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
>>> at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
>>> at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
>>> at 
>>> org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:246) 
>>>
>>> [...]
>>>
>>> This happened on MacOS 10.12.6 with JDK 10 (build 10.0.1+10). The 
>>> only JVM flags
>>> specified were -Xms512M -Xmx512M. So far this failure occurred only 
>>> once and I
>>> could not reproduce it yet.
>>>
>>> The values reported in the exception message are:
>>>
>>> * "max": 536870912 = 512MB (exactly)
>>> * "committed": 542113792 = 517MB (exactly), i.e. 5MB more than "max".
>>>
>>> As the value of "max" is exactly what we have specified with -Xmx it 
>>> seems to
>>> indicate that the problem is the calculation of "committed". I do not
>>> understand under which conditions this can happen thus I post this to 
>>> the
>>> mailing list in case anybody has ideas what might cause this.
>>>
>>> I plan to run further tests with JVM trace logging enabled
>>> (-Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags 
>>> to be
>>> precise) in the hope that this problem will occur again and I can 
>>> provide logs
>>> that help to debug / fix the problem.
>>>
>>> Searching for that error message, there is [JDK-8020530] but that one 
>>> is about
>>> *non-heap* memory usage and has already been resolved a while ago. 
>>> Several
>>> sources (e.g. [apache-ignite-workaround] or [netbeans-bug]) seem to 
>>> indicate
>>> that this problem happened indeed in the wild but what I find odd is 
>>> that I
>>> could not find a single ticket in the OpenJDK bug tracker or a 
>>> discussion on a
>>> JDK mailing list about this problem.
>>>
>>> I'd be glad to get any pointers on what might cause this or requests for
>>> additional info that I need to provide to help analyze this problem.
>>>
>>> Thanks,
>>> Daniel
>>>
>>> [jdk-mem-usage]
>>> http://hg.openjdk.java.net/jdk-updates/jdk10u/file/142f0ed9ff5b/src/hotspot/share/services/management.cpp#l728 
>>>
>>> [JDK-8020530] https://bugs.openjdk.java.net/browse/JDK-8020530
>>> [apache-ignite-workaround]
>>> https://github.com/apache/ignite/blob/df4fd65a32/modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java#L336-L346 
>>>
>>> [netbeans-bug] https://netbeans.org/bugzilla/show_bug.cgi?id=194733
>>>

From serguei.spitsyn at oracle.com  Fri Jul 13 16:25:01 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Jul 2018 09:25:01 -0700
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout
 configured
In-Reply-To: <5B488D1B.3090808@oracle.com>
References: <5B488D1B.3090808@oracle.com>
Message-ID: <1934bc19-d78a-668a-7c05-529961f57565@oracle.com>

Hi Gary,

It looks good.

Thanks,
Serguei


On 7/13/18 04:29, Gary Adams wrote:
> This is a simple update to set the jtreg timeout to match the
> internal waittime already being used by these vmTestbase/nsk/jdb tests.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
>   Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/


From markus.gaisbauer at gmail.com  Fri Jul 13 16:35:21 2018
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Fri, 13 Jul 2018 18:35:21 +0200
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
Message-ID: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>

Hello,

I am trying to use ThreadMXBean::getThreadAllocatedBytes
(com.sun.management) to get the amount of allocated memory of the current
thread in some performance critical code.

Unfortunately, the current implementation can be rather slow and the
duration of each call unpredictable. I ran a test in a JVM with 500
threads. Depending on which thread was queried, getThreadAllocatedBytes
took between 100 ns and 2500 ns.

The root cause of the problem is ThreadsList::find_JavaThread_from_java_tid
which performs a linear scan through all Java threads in the current
process. The more threads a JVM has, the slower it gets. In the worst case,
the thread with the given TID is found as the last entry in the list.
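
The position dependence described above can be illustrated with a toy Java model. This is not HotSpot's code: the class LinearScanModel and the method probes are hypothetical, and a plain array of thread IDs stands in for the real ThreadsList.

```java
final class LinearScanModel {
    // Returns how many entries are inspected before tid is found,
    // or -1 if tid is not in the list.
    static int probes(long[] threadIds, long tid) {
        for (int i = 0; i < threadIds.length; i++) {
            if (threadIds[i] == tid) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // 500 threads, as in the test described in this mail.
        long[] ids = new long[500];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i + 1;
        }
        System.out.println(probes(ids, 1) + " vs " + probes(ids, 500));
    }
}
```

The cost of a lookup grows with the position of the entry, so whichever end of the list a thread is stored at determines whether it is the cheapest or the most expensive one to query.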

Before Java 10, the oldest thread is the slowest one to query.
Since Java 10, the youngest thread is the slowest one to query. I think
this was a side effect of introducing "Thread Safe Memory Reclamation
(Thread-SMR) support".

             Oldest Thread   Youngest Thread
Java 8             8740 ns             76 ns
Java 10             109 ns           2485 ns

A common use case is to query the metric for the current thread (e.g.
before and after performing some operation). This case can be optimized by
introducing a new method: getCurrentThreadAllocatedBytes.
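
For reference, the before/after pattern with the existing API looks roughly like the following Java sketch. The class AllocationProbe and its method allocatedBy are hypothetical names. The cast assumes a HotSpot-based JVM where the platform ThreadMXBean implements com.sun.management.ThreadMXBean.

```java
import java.lang.management.ManagementFactory;

final class AllocationProbe {
    // com.sun.management.ThreadMXBean extends the standard ThreadMXBean;
    // the cast is an assumption that holds on HotSpot-based JVMs.
    private static final com.sun.management.ThreadMXBean THREADS =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    // Bytes allocated by the current thread while running the task,
    // or -1 if the measurement is not supported or not enabled.
    static long allocatedBy(Runnable task) {
        if (!THREADS.isThreadAllocatedMemorySupported()
                || !THREADS.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long tid = Thread.currentThread().getId();
        long before = THREADS.getThreadAllocatedBytes(tid);
        task.run();
        long after = THREADS.getThreadAllocatedBytes(tid);
        return after - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBy(() -> {
            byte[] buffer = new byte[1 << 20];
            buffer[0] = 1;
        });
        System.out.println("allocated about " + bytes + " bytes");
    }
}
```

Each measurement makes two calls to getThreadAllocatedBytes, so the lookup cost discussed above is paid twice per measured operation.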

I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using the
new method I saw the following improvements in my test:

             Oldest Thread   Youngest Thread
Proposal             37 ns             37 ns

This is a 60x improvement over the worst case of the current API. In the
best case of the current API, the new method is still 3 times faster.

// based on JVM_SetNativeThreadName in jvm.cpp.
JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread))
  // We don't use a ThreadsListHandle here because the current thread
  // must be alive.
  oop java_thread = JNIHandles::resolve_non_null(currentThread);
  JavaThread* thr = java_lang_Thread::thread(java_thread);
  if (thread == thr) {
    // only supported for the current thread
    return thr->cooked_allocated_bytes();
  }
  return -1;
JVM_END

The proposed method also fixes the problem that getThreadAllocatedBytes
itself allocates some memory on the current thread (two long arrays, 24
bytes) and therefore can slightly skew measurements. The new
method, getCurrentThreadAllocatedBytes, returns exactly the same value if
it is called twice without allocating any memory between those calls.

I also built a variation of this method that could be used to query
allocated memory more efficiently for anyone who already has a
java.lang.Thread object:

JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj))
  // based on code proposed in threadSMR.hpp
  ThreadsListHandle tlh;
  JavaThread* thr = NULL;
  bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
  if (is_alive) {
    return thr->cooked_allocated_bytes();
  }
  return -1;
JVM_END

This method took 70 ns in my test, which is 85% slower
than GetCurrentThreadAllocatedMemory but still 30% faster than the best
case of the current API. I currently have no immediate need for this second
method, but I think it would also be a valuable addition to the API.

I attached a patch for getCurrentThreadAllocatedBytes. I can create a
second patch for also adding getThreadAllocatedMemory(java.lang.Thread) to
the API.

I am a first time contributor and I am not 100% sure what process I must
follow to get a change like this into OpenJDK. Can someone have a look at
my proposal and help me through the process?

Best regards,
Markus
-------------- next part --------------
A non-text attachment was scrubbed...
Name: getCurrentThreadAllocatedBytes.diff
Type: application/octet-stream
Size: 5058 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180713/d0007655/getCurrentThreadAllocatedBytes-0001.diff>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ThreadAllocatedBytesTest.java
Type: application/octet-stream
Size: 3119 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180713/d0007655/ThreadAllocatedBytesTest-0001.java>

From gary.adams at oracle.com  Fri Jul 13 18:03:12 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 13 Jul 2018 14:03:12 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
Message-ID: <5B48E960.5060300@oracle.com>

Here's the starting point for openjdk contributing:
   http://openjdk.java.net/contribute/

Here's your post in the mail archives:
   http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html

Most people will post a webrev to cr.openjdk.java.net for larger changesets.
Most attachments are stripped when sent to the mailing list.

From daniel.daugherty at oracle.com  Fri Jul 13 18:44:39 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Jul 2018 14:44:39 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
Message-ID: <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>

On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
> Hello,
>
> I am trying to use ThreadMXBean::getThreadAllocatedBytes 
> (com.sun.management) to get the amount of allocated memory of the 
> current thread in some performance critical code.
>
> Unfortunately, the current implementation can be rather slow and the 
> duration of each call unpredictable. I ran a test in a JVM with 500 
> threads. Depending on which thread was queried, 
> getThreadAllocatedBytes took between 100 ns and 2500 ns.
>
> The root cause of the problem is 
> ThreadsList::find_JavaThread_from_java_tid which performs a linear 
> scan through all Java threads in the current process. The more threads 
> a JVM has, the slower it gets. In the worst case, the thread with the 
> given TID is found as the last entry in the list.
>
> Before Java 10, the oldest thread is the slowest one to query.
> Since Java 10, the youngest thread is the slowest one to query. I 
> think this was a side effect of introducing "Thread Safe Memory 
> Reclamation (Thread-SMR) support".
>
>              Oldest Thread   Youngest Thread
> Java 8             8740 ns             76 ns
> Java 10             109 ns           2485 ns

It is good to see that the longest search is much faster. Erik and Robbin
will be pleased since speeding up traversal of the ThreadsList was one
of the things that we tried to do during the Thread-SMR project.

A first step is to get a new bug filed that documents the issue with
ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
will take care of that.

Dan


> A common use case is to query the metric for the current thread (e.g. 
> before and after performing some operation). This case can be 
> optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>
> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using 
> the new method I saw the following improvements in my test:
>              Oldest Thread   Youngest Thread
> Proposal             37 ns             37 ns
>
> This is a 60x improvement over the worst case of the current API. In 
> the best case of the current API, the new method is still 3 times faster.
>
> // based on JVM_SetNativeThreadName in jvm.cpp.
> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread))
>   // We don't use a ThreadsListHandle here because the current thread
>   // must be alive.
>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>   if (thread == thr) {
>     // only supported for the current thread
>     return thr->cooked_allocated_bytes();
>   }
>   return -1;
> JVM_END
>
> The proposed method also fixes the problem that
> getThreadAllocatedBytes itself allocates some memory on the current
> thread (two long arrays, 24 bytes) and therefore can slightly skew
> measurements. The new method, getCurrentThreadAllocatedBytes, returns
> exactly the same value if it is called twice without allocating any
> memory between those calls.
>
> I also built a variation of this method that could be used to query 
> allocated memory more efficiently for anyone who already has a 
> java.lang.Thread object:
>
> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj))
>   // based on code proposed in threadSMR.hpp
>   ThreadsListHandle tlh;
>   JavaThread* thr = NULL;
>   bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
>   if (is_alive) {
>     return thr->cooked_allocated_bytes();
>   }
>   return -1;
> JVM_END
>
> This method took 70 ns in my test, which is 85% slower
> than GetCurrentThreadAllocatedMemory but still 30% faster than the
> best case of the current API. I currently have no immediate need for
> this second method, but I think it would also be a valuable addition
> to the API.
>
> I attached a patch for getCurrentThreadAllocatedBytes. I can create a 
> second patch for also adding 
> getThreadAllocatedMemory(java.lang.Thread) to the API.
>
> I am a first time contributor and I am not 100% sure what process I 
> must follow to get a change like this into OpenJDK. Can someone have a 
> look at my proposal and help me through the process?
>
> Best regards,
> Markus
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180713/1abf55e8/attachment.html>

From chris.plummer at oracle.com  Fri Jul 13 20:21:06 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Jul 2018 13:21:06 -0700
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout
 configured
In-Reply-To: <5B488D1B.3090808@oracle.com>
References: <5B488D1B.3090808@oracle.com>
Message-ID: <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com>

Hi Gary,

It looks like you have properly added timeout=300 wherever we use 
-waittime:5. However, I'm not 100% convinced this is always the right 
approach. In the bug description you said that -waittime is used as a 
timeout for individual operations. However, there could be multiple of 
those operations, and they could in sum exceed the 300 second jtreg 
timeout you added.

What is the default for -waittime? I'm also guessing that the initial 
application of -waittime was never really tuned to the specific tests 
and just cloned across most of them. It seems every test either needs 5m 
or the default, which doesn't really make much sense. If 5m was really 
needed, we should have seen a lot of failures when ported to jtreg, but 
as far as I know the only reason this issue got on your radar was due to 
exclude001 needing 7m. Maybe rather than adding timeout=300 you should 
change -waittime to 2m, since other than exclude001, none of the tests 
seem to need more than 2m.

Lastly, does timeoutFactor impact -waittime? It seems it should be 
applied to it also. I'm not sure if it is.

thanks,

Chris

On 7/13/18 4:29 AM, Gary Adams wrote:
> This is a simple update to set the jtreg timeout to match the
> internal waittime already being used by these vmTestbase/nsk/jdb tests.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
>   Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/


From daniel.daugherty at oracle.com  Fri Jul 13 20:46:12 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Jul 2018 16:46:12 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
Message-ID: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>

On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:
> On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>> Hello,
>>
>> I am trying to use ThreadMXBean::getThreadAllocatedBytes 
>> (com.sun.management) to get the amount of allocated memory of the 
>> current thread in some performance critical code.
>>
>> Unfortunately, the current implementation can be rather slow and the 
>> duration of each call unpredictable. I ran a test in a JVM with 500 
>> threads. Depending on which thread was queried, 
>> getThreadAllocatedBytes took between 100 ns and 2500 ns.
>>
>> The root cause of the problem is 
>> ThreadsList::find_JavaThread_from_java_tid which performs a linear 
>> scan through all Java threads in the current process. The more 
>> threads a JVM has, the slower it gets. In the worst case, the thread 
>> with the given TID is found as the last entry in the list.
>>
>> Before Java 10, the oldest thread is the slowest one to query.
>> Since Java 10, the youngest thread is the slowest one to query. I 
>> think this was a side effect of introducing "Thread Safe Memory 
>> Reclamation (Thread-SMR) support".
>>
>>              Oldest Thread    Youngest Thread
>> Java 8             8740 ns              76 ns
>> Java 10             109 ns            2485 ns
>
> It is good to see that longest search is much faster. Erik and Robbin
> will be pleased since speeding up traversal of the ThreadsList was one
> of the things that we tried to do during the Thread-SMR project.
>
> A first step is to get a new bug filed that documents the issue with
> ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
> will take care of that.
>
> Dan
>
>
>> A common use case is to query the metric for the current thread (e.g. 
>> before and after performing some operation). This case can be 
>> optimized by introducing a new method: getCurrentThreadAllocatedBytes.
>>
>> I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by 
>> using the new method I saw the following improvements in my test:
>>              Oldest Thread    Youngest Thread
>> Proposal             37 ns              37 ns
>>
>> This is a 60x improvement over the worst case of the current API. In 
>> the best case of the current API, the new method is still 3 times faster.
>>
>> // based on JVM_SetNativeThreadName in jvm.cpp.
>> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env,
>>                                                      jobject currentThread))
>>   // We don't use a ThreadsListHandle here because the current thread
>>   // must be alive.
>>   oop java_thread = JNIHandles::resolve_non_null(currentThread);
>>   JavaThread* thr = java_lang_Thread::thread(java_thread);
>>   if (thread == thr) {
>>     // only supported for the current thread
>>     return thr->cooked_allocated_bytes();
>>   }
>>   return -1;
>> JVM_END
>>
>> The proposed method also fixes the problem that 
>> getThreadAllocatedBytes itself allocates some memory on the current 
>> thread (two long arrays, 24 bytes) and therefore can slightly skew 
>> measurements. The new method, getCurrentThreadAllocatedBytes, returns 
>> exactly the same value if it is called twice without allocating any 
>> memory between those calls.
>>
>> I also built a variation of this method that could be used to query 
>> allocated memory more efficiently for anyone who already has a 
>> java.lang.Thread object:
>>
>> JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env,
>>                                               jobject threadObj))
>>   // based on code proposed in threadSMR.hpp
>>   ThreadsListHandle tlh;
>>   JavaThread* thr = NULL;
>>   bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj,
>>                                                        &thr, NULL);
>>   if (is_alive) {
>>     return thr->cooked_allocated_bytes();
>>   }
>>   return -1;
>> JVM_END
>>
>> This method took 70 ns in my test, which is 85% slower 
>> than GetCurrentThreadAllocatedMemory but still 30% faster than the 
>> best case of the current API. I currently have no immediate need for 
>> this second method, but I think it would also be a valuable addition 
>> to the API.
>>
>> I attached a patch for getCurrentThreadAllocatedBytes. I can create a 
>> second patch for also adding 
>> getThreadAllocatedMemory(java.lang.Thread) to the API.
>>
>> I am a first time contributor and I am not 100% sure what process I 
>> must follow to get a change like this into OpenJDK. Can someone have 
>> a look at my proposal and help me through the process?
>>
>> Best regards,
>> Markus
>>
>

I believe this is the code that's causing you grief:

open/src/hotspot/share/services/management.cpp:

// Gets an array containing the amount of memory allocated on the Java
// heap for a set of threads (in bytes).  Each element of the array is
// the amount of memory allocated for the thread ID specified in the
// corresponding entry in the given array of thread IDs; or -1 if the
// thread does not exist or has terminated.
JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids,
                                             jlongArray sizeArray))
  // Check if threads is null
  if (ids == NULL || sizeArray == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }

  ResourceMark rm(THREAD);
  typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
  typeArrayHandle ids_ah(THREAD, ta);

  typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
  typeArrayHandle sizeArray_h(THREAD, sa);

  // validate the thread id array
  validate_thread_id_array(ids_ah, CHECK);

  // sizeArray must be of the same length as the given array of thread IDs
  int num_threads = ids_ah->length();
  if (num_threads != sizeArray_h->length()) {
    THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
              "The length of the given long array does not match the length of "
              "the given array of thread IDs");
  }

  ThreadsListHandle tlh;
  for (int i = 0; i < num_threads; i++) {
    JavaThread* java_thread =
        tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
    if (java_thread != NULL) {
      sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes());
    }
  }
JVM_END


Perhaps something like this above the "ThreadsListHandle tlh;" line:

  if (num_threads == 1 && THREAD->is_Java_thread()) {
    // Only asking for 1 thread so if we're a JavaThread, then
    // see if this request is for ourself.
    JavaThread* jt = THREAD;
    oop tobj = jt->threadObj();

    if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
      // Return the info for ourself.
      sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
      return;
    }
  }

I haven't checked to see if this will even compile, but I
think you'll get the idea.

Dan


From daniel.daugherty at oracle.com  Fri Jul 13 20:52:52 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Fri, 13 Jul 2018 16:52:52 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
 <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
Message-ID: <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>

Markus,

I filed the following bug for you:

    JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
    https://bugs.openjdk.java.net/browse/JDK-8207266

Dan




From gary.adams at oracle.com  Fri Jul 13 21:36:46 2018
From: gary.adams at oracle.com (gary.adams at oracle.com)
Date: Fri, 13 Jul 2018 17:36:46 -0400
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout
 configured
In-Reply-To: <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com>
References: <5B488D1B.3090808@oracle.com>
 <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com>
Message-ID: <4afe54b7-e493-1a45-4fe4-166e9b112dae@oracle.com>

We know that the default jtreg timeout is 2 minutes and typically
runs with a timeoutfactor of 4 or 10. So the harness "safety net"
is 8 to 20 minutes from jtreg.

It does appear that most of the vmTestbase tests use a 5 minute
waittime. I have seen waittime used in different ways. The one we
saw most recently was waiting for a specific reply that was taking
upwards of 7 minutes handling method exclude filtering, e.g.
600K methods on solaris-sparcv9-debug.

I've seen other tests using waittime as a total test timeout.

The jtreg timeout factor has not been applied to the vmTestbase waittime.
The tests have been quickly ported so they can run under the jtreg
harness, but have not been converted to use all the jtreg features.

The purpose of this specific fix is to prevent jtreg from an early
termination at 2 minutes or 8 minutes, when the original waittime
allows for 5 minutes.
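
To make the arithmetic concrete, here is a small Java sketch of how the
harness limit compares with waittime. It assumes the timeout factor is
applied as a plain multiplier on the jtreg timeout; the class and method
names are hypothetical and are not part of jtreg or the nsk framework.

```java
import java.time.Duration;

// Hypothetical helper, for illustration only.
final class TimeoutBudget {
    /** Harness limit: the jtreg timeout scaled by the timeout factor (assumed multiplicative). */
    static Duration harnessLimit(Duration timeout, int timeoutFactor) {
        return timeout.multipliedBy(timeoutFactor);
    }

    /** True if 'waits' back-to-back waits of 'waittime' each could outlast the harness limit. */
    static boolean harnessMayFireFirst(Duration limit, Duration waittime, int waits) {
        return waittime.multipliedBy(waits).compareTo(limit) > 0;
    }

    public static void main(String[] args) {
        Duration defaultTimeout = Duration.ofMinutes(2); // jtreg default
        Duration waittime = Duration.ofMinutes(5);       // -waittime:5
        for (int factor : new int[] {1, 4, 10}) {
            Duration limit = harnessLimit(defaultTimeout, factor);
            System.out.printf(
                "factor %d: limit %d min, one wait cut short: %b, two waits cut short: %b%n",
                factor, limit.toMinutes(),
                harnessMayFireFirst(limit, waittime, 1),
                harnessMayFireFirst(limit, waittime, 2));
        }
    }
}
```

With the default 2 minute timeout, a single 5 minute wait is cut short at
factor 1 but not at factor 4, while two consecutive 5 minute waits still
exceed the 8 minute limit.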

Reducing waittime will not speed up the tests. It would probably introduce
more intermittent timeout reports.



From chris.plummer at oracle.com  Fri Jul 13 22:30:08 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Jul 2018 15:30:08 -0700
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout
 configured
In-Reply-To: <4afe54b7-e493-1a45-4fe4-166e9b112dae@oracle.com>
References: <5B488D1B.3090808@oracle.com>
 <4271a8e2-1087-bc35-0697-00717ef9ffb9@oracle.com>
 <4afe54b7-e493-1a45-4fe4-166e9b112dae@oracle.com>
Message-ID: <d3243f44-0a12-26c8-e0b8-dba37f4fdc02@oracle.com>

Hi Gary,

I wasn't suggesting a shorter waittime to speed up the tests. It's just 
another (of many) timeout-related parameters used to detect (what should 
be very uncommon) timeout failures sooner. I guess in that case it does 
make the test faster in cases where it does time out.

So one question is how much do we care about timeout performance? If not 
at all (we think the timeout is very rare, if ever), we'd just do 
something like a 1h timeout and forget about it. However, historically 
that is not the approach we have taken. jtreg is given a fairly short 
timeout of 2m, multiplied to account for platform performance.

So while I understand it doesn't make sense to have the waittime be 
longer than the (adjusted) jtreg timeout (we'd always hit the jtreg 
timeout first), I don't think that implies we should make the jtreg 
timeout longer. Maybe we should make the waittime shorter. In any case, 
with the current timeoutFactor in place, it's already the case that the 
jtreg timeout is longer than waittime. So I'm not sure why you feel the 
need to make the jtreg timeout longer, unless the test is hitting the 
jtreg timeout already.

And another thought that just came to me. Timeouts can also serve the 
purpose of detecting bugs. If the test author decides the test should 
finish in 1m, and someone bumps the timeout to 10m, that might make a 
performance bug introduced in the future go unnoticed. In general I 
don't think we should increase the timeout for tests that are not 
currently timing out. For ones that are, first see if there is a 
performance related issue.

Chris



From daniil.x.titov at oracle.com  Fri Jul 13 23:34:41 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Fri, 13 Jul 2018 16:34:41 -0700
Subject: RFR 8207261: [Graal] JDI and JDWP tests that consume all memory
 should be filtered out to not run with Graal
Message-ID: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>

Please review the change that filters out 8 JDI and JDWP tests when running with Graal. These tests consume all memory (to force GC or to test that the OutOfMemory error is thrown), which sporadically results in exceptions in the Graal compiler threads and test failure.

Issue: https://bugs.openjdk.java.net/browse/JDK-8207261 
Webrev: http://cr.openjdk.java.net/~dtitov/8207261/webrev.01/    

Thanks!

Best regards,
Daniil


From chris.plummer at oracle.com  Sat Jul 14 00:28:31 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 13 Jul 2018 17:28:31 -0700
Subject: RFR 8207261: [Graal] JDI and JDWP tests that consume all memory
 should be filtered out to not run with Graal
In-Reply-To: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>
References: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>
Message-ID: <f7bc5cec-86b0-817b-1153-10b1104fa50d@oracle.com>

Looks good.

Chris

On 7/13/18 4:34 PM, Daniil Titov wrote:
> Please review the change that filters out 8 JDI and JDWP tests when running with Graal. These tests consume all memory (to force GC or to test that the OutOfMemory error is thrown), which sporadically results in exceptions in the Graal compiler threads and test failure.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8207261
> Webrev: http://cr.openjdk.java.net/~dtitov/8207261/webrev.01/
>
> Thanks!
>
> Best regards,
> Daniil
>
>
>


From serguei.spitsyn at oracle.com  Sat Jul 14 00:29:46 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 13 Jul 2018 17:29:46 -0700
Subject: RFR 8207261: [Graal] JDI and JDWP tests that consume all memory
 should be filtered out to not run with Graal
In-Reply-To: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>
References: <810B1BA0-92AE-4801-A825-2024DB22D81F@oracle.com>
Message-ID: <3d7e49ef-02f7-3882-7608-39d36f141b2e@oracle.com>

Hi Daniil,

It looks good.

Thanks,
Serguei

On 7/13/18 16:34, Daniil Titov wrote:
> Please review the change that filters out 8 JDI and JDWP tests when running with Graal. These tests consume all memory (to force GC or to test that the OutOfMemory error is thrown), which sporadically results in exceptions in the Graal compiler threads and test failure.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8207261
> Webrev: http://cr.openjdk.java.net/~dtitov/8207261/webrev.01/
>
> Thanks!
>
> Best regards,
> Daniil
>
>
>


From kubota.yuji at gmail.com  Sat Jul 14 17:56:32 2018
From: kubota.yuji at gmail.com (KUBOTA Yuji)
Date: Sun, 15 Jul 2018 02:56:32 +0900
Subject: RFR:8207048: jhsdb debugd cannot specify a port number
In-Reply-To: <CABU-27OA6wGoPvY8Zycm6VHPjQq6V-7e93Z-27vREX8vttu-EQ@mail.gmail.com>
References: <CABU-27O2z0QDfsJJr2+RZ_VpDT0q67Ww-A8mqzAqMZ6SrT72GA@mail.gmail.com>
 <5b422de0-ec5e-924e-004f-d58ab1474f85@oracle.com>
 <CABU-27OA6wGoPvY8Zycm6VHPjQq6V-7e93Z-27vREX8vttu-EQ@mail.gmail.com>
Message-ID: <CABU-27NzLFCw0jCSnv7icjed34=MrUdcX+PWsqG9z_NQ2+VufA@mail.gmail.com>

Hi David and all,

My goal is to make the RMI port and the RMI registry port configurable
through a command line option in jhsdb debugd. So I want to create a CSR
request for JDK-8207048, which proposes changing the jhsdb command line
options.

P.S.: I have never created a CSR request before. I'll need some time
to learn that.

Thanks,
Yuji

2018-07-12 10:40 GMT+09:00 KUBOTA Yuji <kubota.yuji at gmail.com>:
> Hi David,
>
> Thank you for comment and updating JBS. I'll create a CSR request
> after getting comments whether this change is welcomed by community.
>
> Thanks,
> Yuji
>
> 2018-07-12 10:21 GMT+09:00 David Holmes <david.holmes at oracle.com>:
>> Hi Yuji,
>>
>> I can't comment on the actual change proposed in this enhancement request,
>> but it will need to have a CSR request created and approved due to the use
>> of a new system property.
>>
>> Thanks,
>> David
>>
>>
>>
>>
>> On 11/07/2018 11:55 PM, KUBOTA Yuji wrote:
>>>
>>> Hi all,
>>>
>>> I filed bugzilla for small fix to improvement of `jhsdb debugd` to set
>>> a port of UnicastRemoteObject aka
>>> sun.jvm.hotspot.debugger.remote.RemoteDebuggerServer by
>>> `sun.jvm.hotspot.rmi.debugger.port`.
>>>
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8207048
>>> Webrev: http://cr.openjdk.java.net/~ykubota/8207048/webrev.00/
>>>
>>> We can set an RMI registry port of debugd server by
>>> `sun.jvm.hotspot.rmi.port`, but can not set a port of RemoteObject. So
>>> RemoteObject always uses an anonymous port. For security, we should
>>> not open ports widely to use debugd, so I want to fix.
>>>
>>> Could you review it?
>>>
>>> Thanks,
>>> Yuji
>>>
>>

From gary.adams at oracle.com  Mon Jul 16 14:49:16 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Mon, 16 Jul 2018 10:49:16 -0400
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout
 configured
In-Reply-To: <5B4CA830.2000206@oracle.com>
References: <5B4CA830.2000206@oracle.com>
Message-ID: <5B4CB06C.20602@oracle.com>

I agree that timeouts should be very rare and that a shorter timeout
helps during test development. These tests were written a long time
ago and the test developers are no longer available for ongoing
adjustments.

These tests were not designed to be performance regression tests.
They typically have a single feature that is being tested. For instance
the recent investigations with exclude001 revealed many more methods
were being processed now than when the test was originally written.
Increasing waittime from 5 to 7 minutes allowed it to run to completion
on the slower solaris-sparcv9-debug build.

I believe we are looking for ways to keep the continuous integration
systems building and testing automatically. Intermittent timeout
failures should be avoided where possible. I believe historically
we have seen both vmTestbase/nsk waittime timeouts and
jtreg timeouts in this collection of tests. Increasing the jtreg timeout
should allow the internal waittime timeout to have first shot at
reporting a timeout.
> Hi Gary,
>
> I wasn't suggesting a shorter waittime to speed up the tests. It's just
> another (of many) timeout-related parameters used to detect (what should
> be very uncommon) timeout failures sooner. I guess in that case it does
> make the test faster in cases where it does time out.
Agreed.
>
> So one question is how much do we care about timeout performance? If not
> at all (we think the timeout is very rare, if ever), we'd just do
> something like a 1h timeout and forget about it. However, historically
> that is not the approach we have taken. jtreg is given a fairly short
> timeout of 2m, multiplied to account for platform performance.
When/if these tests are rewritten to be more jtreg centric, the timeout and
time factor arguments should be updated.

I believe performance specific tests should catch regressions.
These functional tests should be given adequate time to complete
their tasks.
>
> So while I understand it doesn't make sense to have the waittime be
> longer than the (adjusted) jtreg timeout (we'd always hit the jtreg
> timeout first), I don't think that implies we should make the jtreg
> timeout longer. Maybe we should make the waittime shorter. In any case,
> with the current timeoutFactor in place, it's already the case that the
> jtreg timeout is longer than waittime. So I'm not sure why you feel the
> need to make the jtreg timeout longer, unless the test is hitting the
> jtreg timeout already.
The current waittime setting is what the original test developer
designated. It corresponds closest to the jtreg timeout setting.

>
> And another thought that just came to me. Timeouts can also serve the
> purpose of detecting bugs. If the test author decides the test should
> finish in 1m, and someone bumps the timeout to 10m, that might make a
> performance bug introduced in the future go unnoticed. In general I
> don't think we should increase the timeout for tests that are not
> currently timing out. For ones that are, first see if there is a
> performance related issue.
I believe the focus should be setting the timeouts that allow these
tests to reliably complete on the current supported platforms
and build variants.
>
> Chris
>
> On 7/13/18 2:36 PM, gary.adams at oracle.com wrote:
> > We know that the default jtreg timeout is 2 minutes and typically
> > runs with a timeoutfactor of 4 or 10. So the harness "safety net"
> > is 8 to 20 minutes from jtreg.
> >
> > It does appear that most of the vmTestbase tests use a 5 minute
> > waittime. I have seen waittime used in different ways. The one we
> > saw most recently was waiting for a specific reply that was taking
> > upwards of 7 minutes handling method exclude filtering, e.g.
> > 600K methods on solaris-sparcv9-debug.
> >
> > I've seen other tests using waittime as a total test timeout.
> >
> > The jtreg timeout factor has not been applied to the vmTestbase waittime.
> > The tests have been quickly ported so they can run under the jtreg
> > harness, but have not been converted to use all the jtreg features.
> >
> > The purpose of this specific fix is to prevent jtreg from an early
> > termination at 2 minutes or 8 minutes, when the original waittime
> > allows for 5 minutes.
> >
> > Reducing waittime will not speed up the tests. It would probably
> > introduce more intermittent timeout reports.
> >
> > On 7/13/18 4:21 PM, Chris Plummer wrote:
> >> Hi Gary,
> >>
> >> It looks like you have properly added timeout=300 wherever we use
> >> -waittime:5. However, I'm not 100% convinced this is always the right
> >> approach. In the bug description you said that -waittime is used as a
> >> timeout for individual operations. However, there could be multiple
> >> of those operations, and they could in sum exceed the 300 second
> >> jtreg timeout you added.
> >>
> >> What is the default for -waittime? I'm also guessing that the initial
> >> application of -waittime was never really tuned to the specific tests
> >> and just cloned across most of them. It seems every test either needs
> >> 5m or the default, which doesn't really make much sense. If 5m was
> >> really needed, we should have seen a lot of failures when ported to
> >> jtreg, but as far as I know the only reason this issue got on your
> >> radar was due to exclude001 needing 7m. Maybe rather than adding
> >> timeout=300 you should change -waittime to 2m, since other than
> >> exclude001, none of the tests seem to need more than 2m.
> >>
> >> Lastly, does timeoutFactor impact -waittime? It seems it should be
> >> applied to it also. I'm not sure if it is.
> >>
> >> thanks,
> >>
> >> Chris
> >>
> >> On 7/13/18 4:29 AM, Gary Adams wrote:
> >>> This is a simple update to set the jtreg timeout to match the
> >>> internal waittime already being used by these vmTestbase/nsk/jdb tests.
> >>>
> >>>    Issue: https://bugs.openjdk.java.net/browse/JDK-8206013
> >>>    Webrev: http://cr.openjdk.java.net/~gadams/8206013/webrev.00/
> >>
> >
>
>
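
The timeout arithmetic discussed in this thread can be sketched in a few lines of Java. This is only an illustration under the assumptions stated above: jtreg multiplies a test's timeout by the timeout factor, and -waittime is not scaled by that factor. The class and method names are hypothetical and are not part of jtreg or vmTestbase.

```java
import java.time.Duration;

// Hypothetical helper, not part of jtreg or vmTestbase.
final class JtregTimeouts {
    // Assumption from the thread: the harness limit is timeout * timeoutFactor.
    static Duration effectiveTimeout(Duration testTimeout, double timeoutFactor) {
        return Duration.ofMillis(Math.round(testTimeout.toMillis() * timeoutFactor));
    }

    // True when the harness would stop the test before one unscaled
    // waittime interval could expire.
    static boolean harnessFiresFirst(Duration testTimeout, double timeoutFactor,
                                     Duration waittime) {
        return effectiveTimeout(testTimeout, timeoutFactor).compareTo(waittime) < 0;
    }

    public static void main(String[] args) {
        Duration defaultTimeout = Duration.ofMinutes(2); // jtreg default quoted above
        Duration waittime = Duration.ofMinutes(5);       // -waittime:5
        for (double factor : new double[] {1, 4, 10}) {
            System.out.println("factor " + factor + ": harness limit "
                    + effectiveTimeout(defaultTimeout, factor)
                    + ", fires before waittime: "
                    + harnessFiresFirst(defaultTimeout, factor, waittime));
        }
    }
}
```

With the default 2 minute timeout and no factor, the harness limit is below the 5 minute waittime, which is the case timeout=300 is meant to address. A test that performs several waits in sequence can still exceed the harness limit, as Chris points out.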


From jcbeyler at google.com  Mon Jul 16 17:58:36 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 16 Jul 2018 10:58:36 -0700
Subject: RFR(S) 8205652:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
 fails
Message-ID: <CAF9BGBxd32Pok1fvfst+02xS1MxSyVo-R9Wq++JCRfpg3kZX+w@mail.gmail.com>

Hi all,

Small RFR to update a HeapMonitor test that had two issues: a test was
wrong and the test was not allocating enough to get to an expected sample
count. Instead of allocating 10 times more and hitting an OOM in the test
framework, the webrev allocates in chunks and gets the number of samples.

I ran this 10k times on my machine and it passed. Serguei ran mach5 testing
with it and said it looked good.

Bug associated is: JDK-8205652
<https://bugs.openjdk.java.net/browse/JDK-8205652>
Webrev is here: http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.01/

Thanks,
Jc

From hohensee at amazon.com  Mon Jul 16 18:22:30 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 16 Jul 2018 18:22:30 +0000
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
 <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
 <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
Message-ID: <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>

I believe you could move the code ahead of the call to validate_thread_id_array() because that method just checks for thread ids <= 0.


diff -r 3ddf41505d54 src/hotspot/share/services/management.cpp
--- a/src/hotspot/share/services/management.cpp Sun Jun 03 23:33:00 2018 -0700
+++ b/src/hotspot/share/services/management.cpp Mon Jul 16 10:41:28 2018 -0700
@@ -2084,11 +2083,19 @@
   typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
   typeArrayHandle sizeArray_h(THREAD, sa);

+  // Special-case current thread
+  int num_threads = ids_ah->length();
+  JavaThread* java_thread = JavaThread::current();
+  if (num_threads == 1 && sizeArray_h->length() == 1 &&
+      ids_ah->long_at(0) == java_lang_Thread::thread_id(java_thread->threadObj())) {
+    sizeArray_h->long_at_put(0, java_thread->cooked_allocated_bytes());
+    return;
+  }
+
   // validate the thread id array
   validate_thread_id_array(ids_ah, CHECK);

   // sizeArray must be of the same length as the given array of thread IDs
-  int num_threads = ids_ah->length();
   if (num_threads != sizeArray_h->length()) {
     THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
               "The length of the given long array does not match the length of "

If performance is good enough, and if you still want to add getCurrentThreadAllocatedBytes() (imo a good idea, since getCurrentThreadCpuTime() and getCurrentThreadUserTime() already exist), you could implement it by "getThreadAllocatedBytes(Thread::currentThread().getId())". You might also want to add getCurrentThread* methods to com.sun.management where they don't currently exist: then we'd have a complete parallel method set.
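
For illustration, the Java-side wrapper suggested above might look roughly like the following. This is a sketch, not the proposed patch: it assumes a HotSpot JVM where the platform ThreadMXBean can be cast to com.sun.management.ThreadMXBean, and the class and method names are made up for the example.

```java
import java.lang.management.ManagementFactory;

// Hypothetical wrapper class; only the com.sun.management calls are real API.
final class CurrentThreadAllocation {
    // Assumption: on HotSpot the platform bean also implements the
    // com.sun.management extension interface.
    private static final com.sun.management.ThreadMXBean BEAN =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();

    // Bytes allocated so far by the calling thread, or -1 if not reported.
    static long currentThreadAllocatedBytes() {
        if (!BEAN.isThreadAllocatedMemorySupported()
                || !BEAN.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        return BEAN.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    // Before/after measurement around a piece of work on the calling thread.
    static long allocatedBy(Runnable work) {
        long before = currentThreadAllocatedBytes();
        work.run();
        long after = currentThreadAllocatedBytes();
        return (before < 0 || after < 0) ? -1 : after - before;
    }

    public static void main(String[] args) {
        long bytes = allocatedBy(() -> {
            byte[] block = new byte[1 << 20];
            block[0] = 1; // touch the array so the allocation is not trivially dead
        });
        System.out.println("allocated by work: " + bytes + " bytes");
    }
}
```

The measured delta can include a small amount of memory allocated by the bean call itself, which is the skew Markus mentions in his original mail.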


Another approach to improving things is to fix the underlying problem with find_JavaThread_from_java_tid(). https://bugs.openjdk.java.net/browse/JDK-8185005 proposes doing that in a different context. We came up with a patch for JDK8 that uses an open addressed hashtable (one where the "bucket chain" is in the index array, see https://en.wikipedia.org/wiki/Hash_table#Open_addressing) to map Java tids to JavaThread*s. I've forward ported it to JDK12, see http://cr.openjdk.java.net/~phh/8185005/webrev.00/. The main disadvantage, of course, is that it's yet another data structure that takes up memory. It's really fast though and speeds up our profilers quite a bit. Perhaps we could replace the existing thread list with a variation on this map, since it's quick to just run through the underlying array when you want to run through the threads.
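
To make the open-addressing idea concrete, here is a small Java sketch of a tid-to-value table with linear probing. It is not the code from the webrev, which is C++ inside HotSpot. The class name, the hash function, and the growth policy are all assumptions for the example, and removal and thread safety are left out.

```java
// Hypothetical illustration of open addressing; not taken from the webrev.
final class TidTable<V> {
    private long[] keys;     // 0 marks an empty slot; Java thread ids are positive
    private Object[] values;
    private int size;

    TidTable() {
        keys = new long[16]; // capacity stays a power of two
        values = new Object[16];
    }

    private static int slot(long tid, int mask) {
        long h = tid * 0x9E3779B97F4A7C15L; // multiplicative hash spreads sequential ids
        return ((int) (h >>> 32)) & mask;
    }

    void put(long tid, V value) {
        if (tid <= 0) throw new IllegalArgumentException("tid must be positive");
        if ((size + 1) * 2 > keys.length) grow(); // keep the table at most half full
        int mask = keys.length - 1;
        int i = slot(tid, mask);
        while (keys[i] != 0 && keys[i] != tid) {
            i = (i + 1) & mask; // collision: probe the next slot in the same array
        }
        if (keys[i] == 0) size++;
        keys[i] = tid;
        values[i] = value;
    }

    @SuppressWarnings("unchecked")
    V get(long tid) {
        int mask = keys.length - 1;
        int i = slot(tid, mask);
        while (keys[i] != 0) { // an empty slot ends the probe sequence
            if (keys[i] == tid) return (V) values[i];
            i = (i + 1) & mask;
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    private void grow() {
        long[] oldKeys = keys;
        Object[] oldValues = values;
        keys = new long[oldKeys.length * 2];
        values = new Object[oldValues.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != 0) put(oldKeys[i], (V) oldValues[i]);
        }
    }
}
```

A lookup touches only a few adjacent array slots on average, instead of walking the whole thread list. A real implementation would also need to handle threads exiting, for example with tombstones or backward-shift deletion, and concurrent access.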


Thanks,


Paul

From: serviceability-dev <serviceability-dev-bounces at openjdk.java.net> on behalf of "Daniel D. Daugherty" <daniel.daugherty at oracle.com>
Reply-To: "daniel.daugherty at oracle.com" <daniel.daugherty at oracle.com>
Date: Friday, July 13, 2018 at 1:53 PM
To: Markus Gaisbauer <markus.gaisbauer at gmail.com>, "serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>, Erik Österlund <erik.osterlund at oracle.com>, Robbin Ehn <robbin.ehn at oracle.com>
Subject: Re: ThreadMXBean::getCurrentThreadAllocatedBytes

Markus,

I filed the following bug for you:

    JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
    https://bugs.openjdk.java.net/browse/JDK-8207266

Dan

On 7/13/18 4:46 PM, Daniel D. Daugherty wrote:
On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:

On 7/13/18 12:35 PM, Markus Gaisbauer wrote:

Hello,

I am trying to use ThreadMXBean::getThreadAllocatedBytes (com.sun.management) to get the amount of allocated memory of the current thread in some performance critical code.

Unfortunately, the current implementation can be rather slow and the duration of each call unpredictable. I ran a test in a JVM with 500 threads. Depending on which thread was queried, getThreadAllocatedBytes took between 100 ns and 2500 ns.

The root cause of the problem is ThreadsList::find_JavaThread_from_java_tid which performs a linear scan through all Java threads in the current process. The more threads a JVM has, the slower it gets. In the worst case, the thread with the given TID is found as the last entry in the list.

Before Java 10, the oldest thread is the slowest one to query.
Since Java 10, the youngest thread is the slowest one to query. I think this was a side effect of introducing "Thread Safe Memory Reclamation (Thread-SMR) support".

             Oldest Thread   Youngest Thread
Java 8             8740 ns             76 ns
Java 10             109 ns           2485 ns

It is good to see that the longest search is much faster. Erik and Robbin
will be pleased since speeding up traversal of the ThreadsList was one
of the things that we tried to do during the Thread-SMR project.

A first step is to get a new bug filed that documents the issue with
ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
will take care of that.

Dan


A common use case is to query the metric for the current thread (e.g. before and after performing some operation). This case can be optimized by introducing a new method: getCurrentThreadAllocatedBytes.

I created a patch for http://hg.openjdk.java.net/jdk/jdk/ and by using the new method I saw the following improvements in my test:

             Oldest Thread   Youngest Thread
Proposal             37 ns             37 ns

This is a 60x improvement over the worst case of the current API. In the best case of the current API, the new method is still 3 times faster.

// based on JVM_SetNativeThreadName in jvm.cpp.
JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread))
  // We don't use a ThreadsListHandle here because the current thread
  // must be alive.
  oop java_thread = JNIHandles::resolve_non_null(currentThread);
  JavaThread* thr = java_lang_Thread::thread(java_thread);
  if (thread == thr) {
    // only supported for the current thread
    return thr->cooked_allocated_bytes();
  }
  return -1;
JVM_END

The proposed method also fixes the problem, that getThreadAllocatedBytes itself allocates some memory on the current thread (two long arrays, 24 bytes) and therefore can slightly skew measurements. The new method, getCurrentThreadAllocatedBytes, returns exactly the same value if it is called twice without allocating any memory between those calls.

I also built a variation of this method that could be used to query allocated memory more efficiently for anyone who already has a java.lang.Thread object:

JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj))
  // based on code proposed in threadSMR.hpp
  ThreadsListHandle tlh;
  JavaThread* thr = NULL;
  bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
  if (is_alive) {
    return thr->cooked_allocated_bytes();
  }
  return -1;
JVM_END

This method took 70 ns in my test, which is 85% slower than GetCurrentThreadAllocatedMemory but still 30% faster than the best case of the current API. I currently have no immediate need for this second method, but I think it would also be a valuable addition to the API.

I attached a patch for getCurrentThreadAllocatedBytes. I can create a second patch for also adding getThreadAllocatedMemory(java.lang.Thread) to the API.

I am a first time contributor and I am not 100% sure what process I must follow to get a change like this into OpenJDK. Can someone have a look at my proposal and help me through the process?

Best regards,
Markus


I believe this is the code that's causing you grief:

open/src/hotspot/share/services/management.cpp:

// Gets an array containing the amount of memory allocated on the Java
// heap for a set of threads (in bytes).  Each element of the array is
// the amount of memory allocated for the thread ID specified in the
// corresponding entry in the given array of thread IDs; or -1 if the
// thread does not exist or has terminated.
JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids,
                                             jlongArray sizeArray))
  // Check if threads is null
  if (ids == NULL || sizeArray == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }

  ResourceMark rm(THREAD);
  typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
  typeArrayHandle ids_ah(THREAD, ta);

  typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
  typeArrayHandle sizeArray_h(THREAD, sa);

  // validate the thread id array
  validate_thread_id_array(ids_ah, CHECK);

  // sizeArray must be of the same length as the given array of thread IDs
  int num_threads = ids_ah->length();
  if (num_threads != sizeArray_h->length()) {
    THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
              "The length of the given long array does not match the length of "
              "the given array of thread IDs");
  }

  ThreadsListHandle tlh;
  for (int i = 0; i < num_threads; i++) {
    JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
    if (java_thread != NULL) {
      sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes());
    }
  }
JVM_END


Perhaps something like this above the "ThreadsListHandle tlh;" line:

  if (num_threads == 1 && THREAD->is_Java_thread()) {
    // Only asking for 1 thread so if we're a JavaThread, then
    // see if this request is for ourself.
    JavaThread* jt = THREAD;
    oop tobj = jt->threadObj();

    if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
      // Return the info for ourself.
      sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
      return;
    }
  }

I haven't checked to see if this will even compile, but I
think you'll get the idea.

Dan



From jcbeyler at google.com  Mon Jul 16 19:37:18 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 16 Jul 2018 12:37:18 -0700
Subject: RFR(S) 8205541:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
 fails
Message-ID: <CAF9BGBz1Mz0v5QcsjTOmXKUXt0neQzD8z0mqHRdjfO32eRsZzQ@mail.gmail.com>

Hi all,

Small RFR to update two HeapMonitor tests. They failed because they reset a
test data structure and then wrongly assumed it was still empty, while a
second thread could add something to it in the meantime.

The fix is to disable sampling then reset the storage before enabling it
again.
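
As a rough illustration of the ordering described above (disable, reset, check, enable), here is a self-contained Java sketch. It does not use the real HeapMonitor test API. SampleStore and its methods are hypothetical stand-ins, with a plain lock in place of whatever the JVMTI agent does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the test's sample storage; not the HeapMonitor API.
final class SampleStore {
    private final List<String> samples = new ArrayList<>();
    private boolean enabled = true;

    // Called by sampling threads; drops the sample while sampling is disabled.
    synchronized void record(String sample) {
        if (enabled) samples.add(sample);
    }

    synchronized void setEnabled(boolean value) {
        enabled = value;
    }

    synchronized void reset() {
        samples.clear();
    }

    synchronized int size() {
        return samples.size();
    }

    public static void main(String[] args) {
        SampleStore store = new SampleStore();
        Thread sampler = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) store.record("sample");
        });
        sampler.setDaemon(true);
        sampler.start();

        store.setEnabled(false); // 1. stop new samples from arriving
        store.reset();           // 2. clear what was collected so far
        System.out.println("samples after reset: " + store.size());
        store.setEnabled(true);  // 3. resume sampling
        sampler.interrupt();
    }
}
```

Resetting without disabling first would leave a window in which the second thread can add a sample between the reset and the emptiness check.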

Bug associated is: JDK-8205541
<https://bugs.openjdk.java.net/browse/JDK-8205541>
Webrev is here: http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.02/

Thanks all!
Jc

From daniel.daugherty at oracle.com  Mon Jul 16 19:39:22 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 16 Jul 2018 15:39:22 -0400
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
 <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
 <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
 <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>
Message-ID: <300fdd4d-8146-cdac-321e-daaff6f65bc8@oracle.com>

The new block needs to be just above the "ThreadsListHandle tlh;"
line in order to preserve all of the existing checks...

More below...


On 7/16/18 2:22 PM, Hohensee, Paul wrote:
>
> I believe you could move the code ahead of the call to 
> validate_thread_id_array() because that method just checks for thread 
> ids <= 0.
>
> diff -r 3ddf41505d54 src/hotspot/share/services/management.cpp
> --- a/src/hotspot/share/services/management.cpp Sun Jun 03 23:33:00 2018 -0700
> +++ b/src/hotspot/share/services/management.cpp Mon Jul 16 10:41:28 2018 -0700
> @@ -2084,11 +2083,19 @@
>    typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
>    typeArrayHandle sizeArray_h(THREAD, sa);
>
> +  // Special-case current thread
>

The next line uses ids_ah, but validate_thread_id_array() has
not been called yet so you don't know whether ids_ah is valid
yet.

> +  int num_threads = ids_ah->length();
>

The original code that I posted used the existing THREAD
variable rather than a call to JavaThread::current() which
can be expensive.

> +  JavaThread* java_thread = JavaThread::current();
> +  if (num_threads == 1 && sizeArray_h->length() == 1 &&
>

The next line uses ids_ah, but validate_thread_id_array() has
not been called yet so you don't know whether ids_ah is valid
yet.


> +      ids_ah->long_at(0) == java_lang_Thread::thread_id(java_thread->threadObj())) {
> +    sizeArray_h->long_at_put(0, java_thread->cooked_allocated_bytes());
> +    return;
> +  }
> +
>    // validate the thread id array
>    validate_thread_id_array(ids_ah, CHECK);
>
>    // sizeArray must be of the same length as the given array of thread IDs
>

It's not safe to move the next line before validate_thread_id_array().


> -  int num_threads = ids_ah->length();
>    if (num_threads != sizeArray_h->length()) {
>      THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
>                "The length of the given long array does not match the length of "
>
> If performance is good enough, and if you still want to add
> getCurrentThreadAllocatedBytes() (imo a good idea, since
> getCurrentThreadCpuTime() and getCurrentThreadUserTime() already
> exist), you could implement it by
> "getThreadAllocatedBytes(Thread::currentThread().getId())". You might
> also want to add getCurrentThread* methods to com.sun.management
> where they don't currently exist: then we'd have a complete parallel
> method set.
>
> Another approach to improving things is to fix the underlying problem
> with find_JavaThread_from_java_tid().
> https://bugs.openjdk.java.net/browse/JDK-8185005 proposes doing that
> in a different context. We came up with a patch for JDK8 that uses an
> open addressed hashtable (one where the "bucket chain" is in the index
> array, see https://en.wikipedia.org/wiki/Hash_table#Open_addressing)
> to map Java tids to JavaThread*s. I've forward ported it to JDK12, see
> http://cr.openjdk.java.net/~phh/8185005/webrev.00/. The main
> disadvantage, of course, is that it's yet another data structure that
> takes up memory. It's really fast though and speeds up our profilers
> quite a bit. Perhaps we could replace the existing thread list with a
> variation on this map, since it's quick to just run through the
> underlying array when you want to run through the threads.
>

Hmmm... That bug got closed as will-not-fix. I'm not sure why
the triage team decided that.

Dan


> Thanks,
>
> Paul
>
> From: serviceability-dev
> <serviceability-dev-bounces at openjdk.java.net> on behalf of "Daniel D.
> Daugherty" <daniel.daugherty at oracle.com>
> Reply-To: "daniel.daugherty at oracle.com" <daniel.daugherty at oracle.com>
> Date: Friday, July 13, 2018 at 1:53 PM
> To: Markus Gaisbauer <markus.gaisbauer at gmail.com>,
> "serviceability-dev at openjdk.java.net"
> <serviceability-dev at openjdk.java.net>, Erik Österlund
> <erik.osterlund at oracle.com>, Robbin Ehn <robbin.ehn at oracle.com>
> Subject: Re: ThreadMXBean::getCurrentThreadAllocatedBytes
>
> Markus,
>
> I filed the following bug for you:
>
>     JDK-8207266 ThreadMXBean::getThreadAllocatedBytes() can be quicker
>     for self thread
>     https://bugs.openjdk.java.net/browse/JDK-8207266
>
> Dan
>
> On 7/13/18 4:46 PM, Daniel D. Daugherty wrote:
>
>     On 7/13/18 2:44 PM, Daniel D. Daugherty wrote:
>
>         On 7/13/18 12:35 PM, Markus Gaisbauer wrote:
>
>             Hello,
>
>             I am trying to use ThreadMXBean::getThreadAllocatedBytes
>             (com.sun.management) to get the amount of allocated memory
>             of the current thread in some performance critical code.
>
>             Unfortunately, the current implementation can be rather
>             slow and the duration of each call unpredictable. I ran a
>             test in a JVM with 500 threads. Depending on which thread
>             was queried, getThreadAllocatedBytes took between 100 ns
>             and 2500 ns.
>
>             The root cause of the problem is
>             ThreadsList::find_JavaThread_from_java_tid which performs
>             a linear scan through all Java threads in the current
>             process. The more threads a JVM has, the slower it gets.
>             In the worst case, the thread with the given TID is found
>             as the last entry in the list.
>
>             Before Java 10, the oldest thread is the slowest one to query.
>
>             Since Java 10, the youngest thread is the slowest one to
>             query. I think this was a side effect of introducing
>             "Thread Safe Memory Reclamation (Thread-SMR) support".
>
>                          Oldest Thread   Youngest Thread
>             Java 8             8740 ns             76 ns
>             Java 10             109 ns           2485 ns
>
>
>         It is good to see that the longest search is much faster. Erik and
>         Robbin will be pleased since speeding up traversal of the
>         ThreadsList was one of the things that we tried to do during the
>         Thread-SMR project.
>
>         A first step is to get a new bug filed that documents the issue with
>         ThreadMXBean::getThreadAllocatedBytes(). Perhaps Gary or Serguei
>         will take care of that.
>
>         Dan
>
>
>
>             A common use case is to query the metric for the current
>             thread (e.g. before and after performing some operation).
>             This case can be optimized by introducing a new method:
>             getCurrentThreadAllocatedBytes.
>
>             I created a patch for http://hg.openjdk.java.net/jdk/jdk/
>             and by using the new method I saw the following
>             improvements in my test:
>
>                          Oldest Thread   Youngest Thread
>             Proposal             37 ns             37 ns
>
>             This is a 60x improvement over the worst case of the
>             current API. In the best case of the current API, the new
>             method is still 3 times faster.
>
>             // based on JVM_SetNativeThreadName in jvm.cpp.
>             JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env, jobject currentThread))
>               // We don't use a ThreadsListHandle here because the current thread
>               // must be alive.
>               oop java_thread = JNIHandles::resolve_non_null(currentThread);
>               JavaThread* thr = java_lang_Thread::thread(java_thread);
>               if (thread == thr) {
>                 // only supported for the current thread
>                 return thr->cooked_allocated_bytes();
>               }
>               return -1;
>             JVM_END
>
>             The proposed method also fixes the problem, that
>             getThreadAllocatedBytes itself allocates some memory on
>             the current thread (two long arrays, 24 bytes) and
>             therefore can slightly skew measurements. The new
>             method, getCurrentThreadAllocatedBytes, returns exactly
>             the same value if it is called twice without allocating
>             any memory between those calls.
>
>             I also built a variation of this method that could be used
>             to query allocated memory more efficiently for anyone who
>             already has a java.lang.Thread object:
>
>             JVM_ENTRY(jlong, jmm_GetThreadAllocatedMemory(JNIEnv *env, jobject threadObj))
>               // based on code proposed in threadSMR.hpp
>               ThreadsListHandle tlh;
>               JavaThread* thr = NULL;
>               bool is_alive = tlh.cv_internal_thread_to_JavaThread(threadObj, &thr, NULL);
>               if (is_alive) {
>                 return thr->cooked_allocated_bytes();
>               }
>               return -1;
>             JVM_END
>
>             This method took 70 ns in my test, which is 85% slower
>             than GetCurrentThreadAllocatedMemory but still 30% faster
>             than the best case of the current API. I currently have no
>             immediate need for this second method, but I think it
>             would also be a valuable addition to the API.
>
>             I attached a patch for getCurrentThreadAllocatedBytes. I
>             can create a second patch for also adding
>             getThreadAllocatedMemory(java.lang.Thread) to the API.
>
>             I am a first time contributor and I am not 100% sure what
>             process I must follow to get a change like this into
>             OpenJDK. Can someone have a look at my proposal and help
>             me through the process?
>
>             Best regards,
>
>             Markus
>
>
>     I believe this is the code that's causing you grief:
>
>     open/src/hotspot/share/services/management.cpp:
>
>     // Gets an array containing the amount of memory allocated on the Java
>     // heap for a set of threads (in bytes).  Each element of the array is
>     // the amount of memory allocated for the thread ID specified in the
>     // corresponding entry in the given array of thread IDs; or -1 if the
>     // thread does not exist or has terminated.
>     JVM_ENTRY(void, jmm_GetThreadAllocatedMemory(JNIEnv *env, jlongArray ids,
>                                                  jlongArray sizeArray))
>       // Check if threads is null
>       if (ids == NULL || sizeArray == NULL) {
>         THROW(vmSymbols::java_lang_NullPointerException());
>       }
>
>       ResourceMark rm(THREAD);
>       typeArrayOop ta = typeArrayOop(JNIHandles::resolve_non_null(ids));
>       typeArrayHandle ids_ah(THREAD, ta);
>
>       typeArrayOop sa = typeArrayOop(JNIHandles::resolve_non_null(sizeArray));
>       typeArrayHandle sizeArray_h(THREAD, sa);
>
>       // validate the thread id array
>       validate_thread_id_array(ids_ah, CHECK);
>
>       // sizeArray must be of the same length as the given array of thread IDs
>       int num_threads = ids_ah->length();
>       if (num_threads != sizeArray_h->length()) {
>         THROW_MSG(vmSymbols::java_lang_IllegalArgumentException(),
>                   "The length of the given long array does not match the length of "
>                   "the given array of thread IDs");
>       }
>
>       ThreadsListHandle tlh;
>       for (int i = 0; i < num_threads; i++) {
>         JavaThread* java_thread = tlh.list()->find_JavaThread_from_java_tid(ids_ah->long_at(i));
>         if (java_thread != NULL) {
>           sizeArray_h->long_at_put(i, java_thread->cooked_allocated_bytes());
>         }
>       }
>     JVM_END
>
>
>     Perhaps something like this above the "ThreadsListHandle tlh;" line:
>
>       if (num_threads == 1 && THREAD->is_Java_thread()) {
>         // Only asking for 1 thread so if we're a JavaThread, then
>         // see if this request is for ourself.
>         JavaThread* jt = THREAD;
>         oop tobj = jt->threadObj();
>
>         if (ids_ah->long_at(0) == java_lang_Thread::thread_id(tobj)) {
>           // Return the info for ourself.
>           sizeArray_h->long_at_put(0, jt->cooked_allocated_bytes());
>           return;
>         }
>       }
>
>     I haven't checked to see if this will even compile, but I
>     think you'll get the idea.
>
>     Dan
>
>
>


From markus.gaisbauer at gmail.com  Mon Jul 16 19:42:20 2018
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Mon, 16 Jul 2018 21:42:20 +0200
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
 <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
 <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
 <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>
Message-ID: <CAD0+aVbJbiqcdFOJZty_XzdbsvaUMYh85Kw4_QGggy0rWNWPkQ@mail.gmail.com>

 Hi,

Thank you for all the help.

I added the code suggested by Paul and ran my small microbenchmark. The
optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the
current thread. All calls that aren't for the current thread will be a bit
slower. As far as I am concerned, adding this optimization probably doesn't
hurt. In fact it would be awesome if this could be backported to Java 8.

But I am still strongly in favor of adding a special method just for the
current thread:
* It is still faster (35 ns vs 57 ns)
* No heap memory is allocated by this method
* If this method is available, callers can always expect good (constant
  time) performance. Otherwise two versions of getThreadAllocatedBytes
  would exist, and library code would have to figure out somehow whether
  this particular JVM has the slow or the fast version.

In my own use case, I would only want to get the metric if it is extremely
fast. It's not worth the overhead if it takes thousands of nanoseconds.

I wasn't aware of the existence of JavaThread::current() before. My new
method in management.cpp could be simplified to this:

JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
  return JavaThread::current()->cooked_allocated_bytes();
JVM_END

This code is not only simpler but also a bit faster. Each call now takes 35
ns instead of 37 ns.

Is there a technical reason why no new native methods should be added to
ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic
number 0 to indicate the current thread. Wouldn't it be cleaner to have two
methods instead of using and checking a special number in Java/native code?

Best regards,
Markus

From markus.gaisbauer at gmail.com  Mon Jul 16 19:49:40 2018
From: markus.gaisbauer at gmail.com (Markus Gaisbauer)
Date: Mon, 16 Jul 2018 21:49:40 +0200
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <CAD0+aVbJbiqcdFOJZty_XzdbsvaUMYh85Kw4_QGggy0rWNWPkQ@mail.gmail.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
 <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
 <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
 <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>
 <CAD0+aVbJbiqcdFOJZty_XzdbsvaUMYh85Kw4_QGggy0rWNWPkQ@mail.gmail.com>
Message-ID: <CAD0+aVZ1U=XLgGWsYOoyS513NRdennn3v1ys5EFRccx5nj87Hg@mail.gmail.com>

I saw Daniel's comment too late, but using THREAD instead
of JavaThread::current() indeed makes the new method again a bit simpler
and faster (now 33 ns instead of 35 ns).

JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
  return THREAD->cooked_allocated_bytes();
JVM_END

On Mon, Jul 16, 2018 at 9:42 PM Markus Gaisbauer <markus.gaisbauer at gmail.com>
wrote:

> Hi,
>
> Thank you for all the help.
>
> I added the code suggested by Paul and ran my small microbenchmark. The
> optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the
> current thread. All calls that aren't for the current thread will be a bit
> slower. As far as I am concerned, adding this optimization probably doesn't
> hurt. In fact it would be awesome if this could be backported to Java 8.
>
> But I am still strongly in favor of adding a special method just for the
> current thread:
> * It is still faster (35 ns vs 57 ns)
> * No heap memory is allocated by this method
> * If this method is available, callers can always expect good (constant
> time) performance. On the other hand two versions would exist for getThreadAllocatedBytes.
> Library code would have to figure out somehow if this particular JVM has
> the slow or fast version. In my own use case, I would only want to get the
> metric if it is extremely fast. It's not worth the overhead, if it takes
> thousands of nanoseconds.
>
> I wasn't aware of the existence of JavaThread::current() before. My new
> method in management.cpp could be simplified to this:
>
> JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
>   return JavaThread::current()->cooked_allocated_bytes();
> JVM_END
>
> This code is not only simpler but also a bit faster. Each call now takes
> 35 ns instead of 37 ns.
>
> Is there a technical reason why no new native methods should be added to
> ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic
> number 0 to indicate the current thread. Wouldn't it be cleaner to have
> two methods instead of using and checking a special number in Java/native
> code?
>
> Best regards,
> Markus
>

From serguei.spitsyn at oracle.com  Mon Jul 16 19:58:38 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Jul 2018 12:58:38 -0700
Subject: RFR(S) 8205652:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails
In-Reply-To: <CAF9BGBxd32Pok1fvfst+02xS1MxSyVo-R9Wq++JCRfpg3kZX+w@mail.gmail.com>
References: <CAF9BGBxd32Pok1fvfst+02xS1MxSyVo-R9Wq++JCRfpg3kZX+w@mail.gmail.com>
Message-ID: <b52f6fb2-df67-bd8b-b8ca-da36b7095a3c@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180716/68563820/attachment.html>

From hohensee at amazon.com  Mon Jul 16 20:28:46 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 16 Jul 2018 20:28:46 +0000
Subject: ThreadMXBean::getCurrentThreadAllocatedBytes
In-Reply-To: <CAD0+aVZ1U=XLgGWsYOoyS513NRdennn3v1ys5EFRccx5nj87Hg@mail.gmail.com>
References: <CAD0+aVaUZUgqwmJYa6zf_htSzhF6FjpjZEVQG0vjq9FQmyofAw@mail.gmail.com>
 <c7627bf9-62f3-d492-6cad-428035f20adf@oracle.com>
 <0c037cb2-ce99-7a31-de51-27df6a6fedf1@oracle.com>
 <54735477-0c2f-354e-4348-16414e4c1a6d@oracle.com>
 <F5E5DEF4-1B88-4E5A-B97C-156D4B088C0C@amazon.com>
 <CAD0+aVbJbiqcdFOJZty_XzdbsvaUMYh85Kw4_QGggy0rWNWPkQ@mail.gmail.com>
 <CAD0+aVZ1U=XLgGWsYOoyS513NRdennn3v1ys5EFRccx5nj87Hg@mail.gmail.com>
Message-ID: <9E5419DB-7F46-4AC8-809B-6E9DCE18825B@amazon.com>

Given your requirements, and that my proposal is slower, yours is better. :)

There's no technical reason why we can't add what you're asking for (and everyone who's weighed in so far is in favor), but what do people think of adding the rest of the missing getCurrentThread* methods? These would be

getCurrentThreadInfo()
getCurrentThreadInfo(boolean lockedMonitors, boolean lockedSynchronizers)
getCurrentThreadInfo(int maxDepth)

Shall we add these to java.lang.management or com.sun.management? Since we're doing major releases every 6 months, I'd say j.l.m, but it doesn't matter to me one way or the other. Imo, getCurrentThreadAllocatedMemory() should go in com.sun.management because that's where the *AllocatedMemory* methods are.

For getCurrentThreadAllocatedMemory(), you should add checks for isThreadAllocatedMemorySupported() and isThreadAllocatedMemoryEnabled().
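
For illustration, the same two checks can already be applied around the existing per-ID call. This is only a sketch against the current com.sun.management.ThreadMXBean API; the class CurrentThreadAllocation and its allocatedBytes() helper are made-up names, and the proposed getCurrentThreadAllocatedMemory() is deliberately not used because it does not exist yet.

```java
import java.lang.management.ManagementFactory;

public class CurrentThreadAllocation {
    /**
     * Returns the bytes allocated so far by the calling thread, or -1 if the
     * metric is unavailable. Hypothetical helper, not part of the JDK.
     */
    static long allocatedBytes() {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1; // platform bean lacks the com.sun.management extension
        }
        com.sun.management.ThreadMXBean sunBean = (com.sun.management.ThreadMXBean) bean;
        // Check "supported" first: the "enabled" query may throw if unsupported.
        if (!sunBean.isThreadAllocatedMemorySupported()
                || !sunBean.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        // Existing API: pass the current thread's ID explicitly.
        return sunBean.getThreadAllocatedBytes(Thread.currentThread().getId());
    }

    public static void main(String[] args) {
        long before = allocatedBytes();
        byte[] junk = new byte[1 << 20];
        long after = allocatedBytes();
        System.out.println("allocated between calls: " + (after - before)
                + " bytes (array length " + junk.length + ")");
    }
}
```

A dedicated current-thread method would presumably perform the same two checks before going native, just without the ID lookup.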

Paul

From: Markus Gaisbauer <markus.gaisbauer at gmail.com>
Date: Monday, July 16, 2018 at 12:50 PM
To: "Hohensee, Paul" <hohensee at amazon.com>
Cc: "daniel.daugherty at oracle.com" <daniel.daugherty at oracle.com>, "serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>, "erik.osterlund at oracle.com" <erik.osterlund at oracle.com>, "robbin.ehn at oracle.com" <robbin.ehn at oracle.com>
Subject: Re: ThreadMXBean::getCurrentThreadAllocatedBytes

I saw Daniel's comment too late, but using THREAD instead of JavaThread::current() indeed makes the new method again a bit simpler and faster (now 33 ns instead of 35 ns).

JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
  return THREAD->cooked_allocated_bytes();
JVM_END

On Mon, Jul 16, 2018 at 9:42 PM Markus Gaisbauer <markus.gaisbauer at gmail.com> wrote:
Hi,

Thank you for all the help.

I added the code suggested by Paul and ran my small microbenchmark. The optimized getThreadAllocatedBytes now takes 57 ns if fed with the ID of the current thread. All calls that aren't for the current thread will be a bit slower. As far as I am concerned, adding this optimization probably doesn't hurt. In fact it would be awesome if this could be backported to Java 8.

But I am still strongly in favor of adding a special method just for the current thread:
* It is still faster (35 ns vs 57 ns)
* No heap memory is allocated by this method
* If this method is available, callers can always expect good (constant time) performance. On the other hand two versions would exist for getThreadAllocatedBytes. Library code would have to figure out somehow if this particular JVM has the slow or fast version. In my own use case, I would only want to get the metric if it is extremely fast. It's not worth the overhead, if it takes thousands of nanoseconds.

I wasn't aware of the existence of JavaThread::current() before. My new method in management.cpp could be simplified to this:

JVM_ENTRY(jlong, jmm_GetCurrentThreadAllocatedMemory(JNIEnv *env))
  return JavaThread::current()->cooked_allocated_bytes();
JVM_END

This code is not only simpler but also a bit faster. Each call now takes 35 ns instead of 37 ns.

Is there a technical reason why no new native methods should be added to ThreadImpl.java? I saw that getCurrentThreadCpuTime also uses this magic number 0 to indicate the current thread. Wouldn't it be cleaner to have two methods instead of using and checking a special number in Java/native code?

Best regards,
Markus

From serguei.spitsyn at oracle.com  Mon Jul 16 23:06:40 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Jul 2018 16:06:40 -0700
Subject: RFR(S) 8205541:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
 fails
In-Reply-To: <CAF9BGBz1Mz0v5QcsjTOmXKUXt0neQzD8z0mqHRdjfO32eRsZzQ@mail.gmail.com>
References: <CAF9BGBz1Mz0v5QcsjTOmXKUXt0neQzD8z0mqHRdjfO32eRsZzQ@mail.gmail.com>
Message-ID: <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180716/1ba59fab/attachment.html>

From jcbeyler at google.com  Mon Jul 16 23:07:27 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 16 Jul 2018 16:07:27 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
Message-ID: <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>

Hi all,

The CSR has recently been approved, could someone else review the spec
update webrev:
http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/

The associated bug is here: https://bugs.openjdk.java.net/browse/JDK-8205725
The associated CSR is here: https://bugs.openjdk.java.net/browse/JDK-8206940

Thanks all!
Jc


On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> It looks good to me (including the CSR that I had already reviewed).
> Thank you for preparing a fix for this issue so quickly!
>
> Thanks,
> Serguei
>
>
> On 7/12/18 13:45, JC Beyler wrote:
>
> Hi all,
>
> Could I get a review of an update to the JVMTI Spec for Heap Sampling:
> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/
>
> The associated bug is here:
> https://bugs.openjdk.java.net/browse/JDK-8205725
> The associated CSR is here:
> https://bugs.openjdk.java.net/browse/JDK-8206940
>
> The basic reasoning of this webrev/bug/CSR is:
> - rate is not the right word and should be renamed to interval, this is
> what provokes the change in the code/tests/API naming.
> - the spec does not mention that the new sampling interval will take time
> to be taken into account (you have to wait for a TLAB to be refilled); this
> adds that precision so that the user is not surprised
> - the spec explicitly says that the sampling is done via a geometric
> variable which averages to the sampling interval; it was asked to relax
> this and the spec should just say that the sampling is pseudo-random and
> the interval will average out to what the user requested.
>
> Thanks for all your help,
> Jc
>
>
>

-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Mon Jul 16 23:10:29 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 16 Jul 2018 16:10:29 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
 <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
Message-ID: <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180716/600c3d03/attachment.html>

From chris.plummer at oracle.com  Tue Jul 17 01:17:54 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 16 Jul 2018 18:17:54 -0700
Subject: RFR: JDK-8206013: vmTestbase/nsk tests should have /timeout
 configured
In-Reply-To: <5B4CB06C.20602@oracle.com>
References: <5B4CA830.2000206@oracle.com> <5B4CB06C.20602@oracle.com>
Message-ID: <97c9dfb8-c91e-6412-fb02-412f67a09ff8@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180716/cd4167e7/attachment-0001.html>

From harsha.wardhana.b at oracle.com  Tue Jul 17 06:23:50 2018
From: harsha.wardhana.b at oracle.com (Harsha Wardhana B)
Date: Tue, 17 Jul 2018 11:53:50 +0530
Subject: RFR : JDK-8170299 - Debugger does not stop inside the low memory
 notifications code
Message-ID: <0d8b21bb-10f3-8e80-d579-d890cb046d16@oracle.com>

Hi All,

Please review the fix for the bug,

JDK-8170299 - Debugger does not stop inside the low memory notifications 
code
<https://bugs.openjdk.java.net/browse/JDK-8170299>
webrev at,

http://cr.openjdk.java.net/~hb/8170299/webrev.00/

Description of the fix:

The debugger does not stop inside the listeners registered for 
notification from

1. com.sun.management.GarbageCollectorMXBean
2. sun.management.MemoryImpl (MemoryMXBean)
3. com.sun.management.DiagnosticCommandMBean

The listeners registered for above MBeans are invoked by 'ServiceThread' which is a hidden thread and is not visible to the debugger.

This issue was already worked on before; the earlier review threads are below.

http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-September/021782.html
http://mail.openjdk.java.net/pipermail/serviceability-dev/2017-December/022611.html

With the current fix, all the user registered callbacks for above MBeans 
are executed in a newly created SingleThreadExecutor. The above file is 
also re-factored to use CopyOnWriteArrayList for managing the listeners.
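
The general shape of that approach can be sketched as follows. This is not the code from the webrev: the class ListenerDispatchSketch, the thread name "notification-dispatch", and the use of Consumer<String> as the listener type are all assumptions made for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class ListenerDispatchSketch {
    // Copy-on-write list: iteration during dispatch never sees a concurrent
    // modification, even if listeners are added or removed meanwhile.
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    // One ordinary (non-hidden) Java thread runs every user callback, so a
    // debugger can see the thread and stop inside the callback.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "notification-dispatch");
        t.setDaemon(true);
        return t;
    });

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    void removeListener(Consumer<String> listener) {
        listeners.remove(listener);
    }

    // Called by the internal notifying thread; user code runs on the executor.
    void sendNotification(String message) {
        for (Consumer<String> listener : listeners) {
            executor.execute(() -> listener.accept(message));
        }
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Because the executor has a single worker, callbacks still run one at a time and in submission order, which keeps the ordering that listeners saw when they were invoked directly.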

The fix has been tested in Mach5 by running all the tests under 
open/:jdk_management and closed/:jdk_management. The tests under 
open/test/jdk/java/lang/management/MemoryMXBean cover the above code 
changes. I can add more tests in the subsequent reviews if need arises.

Please review the above change and let me know your comments.

Thanks
Harsha



From daniil.x.titov at oracle.com  Tue Jul 17 08:20:26 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Jul 2018 01:20:26 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
Message-ID: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>

Please review the change that fixes the JDI test when running with Graal.

The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/

Thanks!
--Daniil


From bob.vandette at oracle.com  Tue Jul 17 14:00:08 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 17 Jul 2018 10:00:08 -0400
Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems without
 cpuset.effective_cpus / cpuset.effective_mem
Message-ID: <A8D8CB9F-D5E8-4FBF-A921-247B33CDCF6D@oracle.com>

Please review this fix which eliminates some docker/cgroup test failures when running on older
Linux kernels with missing cgroup metric files.

BUGS:
https://bugs.openjdk.java.net/browse/JDK-8206456

WEBREV:
http://cr.openjdk.java.net/~bobv/8206456/webrev/

This fix has been verified by the reporter of the issue.

Bob.


From ralf.schmelter at sap.com  Tue Jul 17 14:08:53 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Tue, 17 Jul 2018 14:08:53 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <21e17c666ac04930a0e4bb4869e989da@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
Message-ID: <a7e3536694e84e859bb14e3cb19f292c@sap.com>

Hi all,

here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/ 

I've converted the shell based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime).

The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Previously I reused the count variable, but this could lead to misunderstandings.
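
As a rough analogy for the quadratic-versus-linear difference (not the actual C code in ThreadReferenceImpl.c), the Java sketch below models a stack as a linked chain of frames. Everything in it (FrameWalkSketch, frameAt, fillFrames) is a hypothetical stand-in; it assumes that looking up a frame by index has to walk down from the top of the stack, while a batch call fills a buffer in one pass and reports how many entries it filled.

```java
import java.util.ArrayList;
import java.util.List;

public class FrameWalkSketch {
    // Hypothetical stand-in for a stack frame; id 0 is the top of the stack.
    static final class Frame {
        final int id;
        final Frame caller;
        Frame(int id, Frame caller) { this.id = id; this.caller = caller; }
    }

    static long steps; // number of frame-to-caller hops performed

    static Frame buildStack(int depth) {
        Frame top = null;
        for (int id = depth - 1; id >= 0; id--) {
            top = new Frame(id, top);
        }
        return top;
    }

    // Per-index lookup: walks 'index' links from the top every time.
    static Frame frameAt(Frame top, int index) {
        Frame f = top;
        for (int i = 0; i < index; i++) {
            f = f.caller;
            steps++;
        }
        return f;
    }

    // Quadratic: depth lookups, each costing up to depth hops.
    static List<Integer> framesOneByOne(Frame top, int depth) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < depth; i++) {
            ids.add(frameAt(top, i).id);
        }
        return ids;
    }

    // Linear: one pass fills the buffer and returns how many slots were
    // actually filled, kept separate from the requested buffer size.
    static int fillFrames(Frame top, Frame[] buffer) {
        int filledIn = 0;
        for (Frame f = top; f != null && filledIn < buffer.length; f = f.caller) {
            buffer[filledIn++] = f;
            steps++;
        }
        return filledIn;
    }
}
```

For a stack of depth n the one-by-one version performs n(n-1)/2 hops, while the batch version performs n, at the cost of a buffer proportional to the stack depth.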

Best regards,
Ralf


-----Original Message-----
From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf
Sent: Montag, 9. Juli 2018 16:05
To: Chris Plummer <chris.plummer at oracle.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Chris,

thanks for the review.

> What testing have you done?

I've tested the change by debugging by hand in Eclipse and jdb, and by running the com/sun/jdi jtreg tests and the jdwp jck tests. Analogous code has been running in the SAP JVM for many years.


> How long does this test take to run?

15 s according to jtreg. 


> What happens if for some reason SOE is never thrown? It's not clear to 
> me what the script would do in this case.

It is treated as passed (which is not ideal).


> In answer to the ShellScaffold.sh question, there is already work 
> underway to convert to pure java tests. See JDK-8201652.

Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done.

Best regards,
Ralf 


-----Original Message-----
From: Chris Plummer [mailto:chris.plummer at oracle.com] 
Sent: Freitag, 6. Juli 2018 00:37
To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Ralf,

Overall looks good, but I do have a few comments and questions.

Please update the copyright.

What testing have you done?

How long does this test take to run?

What happens if for some reason SOE is never thrown? It's not clear to 
me what the script would do in this case.

In answer to the ShellScaffold.sh question, there is already work 
underway to convert to pure java tests. See JDK-8201652. I'm not certain 
if it is ok for you to just submit this new shell script, or if it should 
be rewritten in pure java. Most of the work to convert the scripts has 
already been done but was put on hold. Maybe Serguei can comment and 
guide you on how it would be done in java.

thanks,

Chris

On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
> Hi All,
>
> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>
> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>
> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>
> Best regards,
> Ralf Schmelter


From matthias.baesken at sap.com  Tue Jul 17 14:13:08 2018
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Tue, 17 Jul 2018 14:13:08 +0000
Subject: 8206456 - [TESTBUG] docker jtreg tests fail on systems without
 cpuset.effective_cpus / cpuset.effective_mem
In-Reply-To: <A8D8CB9F-D5E8-4FBF-A921-247B33CDCF6D@oracle.com>
References: <A8D8CB9F-D5E8-4FBF-A921-247B33CDCF6D@oracle.com>
Message-ID: <bad62aea444440c1ba1133d5e847f09a@sap.com>

Hi Bob, looks good (I am not a Reviewer, however)!

The reported issues occurred on a SUSE Linux 12 SP1 system, where

/sys/fs/cgroup/cpuset/cpuset.effective_cpus and /sys/fs/cgroup/cpuset/cpuset.effective_mems are not present.

I applied Bob's patch; now the jdk/internal/platform/docker jtreg tests no longer fail on the mentioned system.
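
For readers unfamiliar with the failure mode, here is a minimal sketch of treating such a metric file as optional. It is not Bob's patch; the class OptionalCgroupMetric and its readIfPresent helper are hypothetical, and only the file path is taken from the report above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

public class OptionalCgroupMetric {
    /**
     * Reads a single-value cgroup file, returning an empty Optional when the
     * kernel does not provide it. Hypothetical helper for illustration.
     */
    static Optional<String> readIfPresent(Path file) {
        if (!Files.isReadable(file)) {
            return Optional.empty(); // older kernels: the file is simply absent
        }
        try {
            byte[] bytes = Files.readAllBytes(file);
            return Optional.of(new String(bytes, StandardCharsets.UTF_8).trim());
        } catch (IOException e) {
            return Optional.empty(); // treat a read failure like a missing metric
        }
    }

    public static void main(String[] args) {
        Path cpus = Paths.get("/sys/fs/cgroup/cpuset/cpuset.effective_cpus");
        System.out.println(readIfPresent(cpus)
                .map(value -> "effective_cpus = " + value)
                .orElse("effective_cpus not available, skipping the check"));
    }
}
```

A test built this way would skip the comparison for an absent metric instead of failing on the missing file.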

Thanks, Matthias


> -----Original Message-----
> From: Bob Vandette [mailto:bob.vandette at oracle.com]
> Sent: Dienstag, 17. Juli 2018 16:00
> To: serviceability-dev at openjdk.java.net serviceability-
> dev at openjdk.java.net <serviceability-dev at openjdk.java.net>; core-libs-
> dev <core-libs-dev at openjdk.java.net>
> Cc: Baesken, Matthias <matthias.baesken at sap.com>; Schmidt, Lutz
> <lutz.schmidt at sap.com>
> Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems without
> cpuset.effective_cpus / cpuset.effective_mem
> 
> Please review this fix which eliminates some docker/cgroup test failures
> when running on older
> Linux kernels with missing cgroup metric files.
> 
> BUGS:
> https://bugs.openjdk.java.net/browse/JDK-8206456
> 
> WEBREV:
> http://cr.openjdk.java.net/~bobv/8206456/webrev/
> 
> This fix has been verified by the reporter of the issue.
> 
> Bob.
> 
> 
> 


From gary.adams at oracle.com  Tue Jul 17 15:33:54 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Tue, 17 Jul 2018 11:33:54 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
Message-ID: <5B4E0C62.3020808@oracle.com>

A race condition exists between the debugger and the debuggee.

The first test thread is started with SUSPEND_NONE policy set.
While processing the thread start event the debugger captures
an initial set of thread suspend counts and resumes the
debuggee vm. If the debuggee advances quickly it reaches
the breakpoint set for methodForCommunication. Since the breakpoint
carries with it SUSPEND_ALL policy, when the debugger captures a second
set of suspend counts, it will not match the expected counts for
a SUSPEND_NONE scenario.

The proposed fix introduces a yield in the debuggee test thread run method
to allow the debugger to get the expected sampled values.

   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/


test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
...
    private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
        Method method = debuggee.methodByName(refType, methodName);
        Location location = null;
        try {
            location = method.allLineLocations().get(0);
        } catch (AbsentInformationException e) {
            throw new Failure(e);
        }
        bpRequest = debuggee.makeBreakpoint(location);

        bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);

        bpRequest.putProperty("number", "zero");
        bpRequest.enable();

        eventHandler.addListener(
             new EventHandler.EventListener() {
                 public boolean eventReceived(Event event) {
                    if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
                        synchronized(eventHandler) {
                            display("Received communication breakpoint event.");
                            bpCount++;
                            eventHandler.notifyAll();
                        }
                        return true;
                    }
                    return false;
                 }
             }
        );
    }


test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
...
    display("......--> vm.suspend();");
    vm.suspend();

    display("        getting : Map<String, Integer> suspendsCounts1");

    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
    for (ThreadReference threadReference : vm.allThreads()) {
        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
    }
    display(suspendsCounts1.toString());

    display("        eventSet.resume;");
    eventSet.resume();

    display("        getting : Map<String, Integer> suspendsCounts2");

This is where the breakpoint is encountered before the second set of 
suspend counts is acquired.

    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
    for (ThreadReference threadReference : vm.allThreads()) {
        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
    }
    display(suspendsCounts2.toString());


From serguei.spitsyn at oracle.com  Tue Jul 17 20:34:33 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 13:34:33 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
Message-ID: <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180717/77e23a79/attachment.html>

From alexey.menkov at oracle.com  Tue Jul 17 21:29:04 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 17 Jul 2018 14:29:04 -0700
Subject: RFR(S) 8205652:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails
In-Reply-To: <b52f6fb2-df67-bd8b-b8ca-da36b7095a3c@oracle.com>
References: <CAF9BGBxd32Pok1fvfst+02xS1MxSyVo-R9Wq++JCRfpg3kZX+w@mail.gmail.com>
 <b52f6fb2-df67-bd8b-b8ca-da36b7095a3c@oracle.com>
Message-ID: <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com>

+1

--alex

On 07/16/2018 12:58, serguei.spitsyn at oracle.com wrote:
> Hi Jc,
> 
> It looks good to me.
> 
> Thanks,
> Serguei
> 
> On 7/16/18 10:58, JC Beyler wrote:
>> Hi all,
>>
>> Small RFR to update a HeapMonitor test that had two issues: a test was 
>> wrong and the test was not allocating enough to get to an expected 
>> sample count. Instead of allocating 10 times more and hit some OOM on 
>> the test framework, the webrev allocates in chunks and gets the number 
>> of samples.
>>
>> I ran this 10k times on my machine and it passed. Serguei ran mach5 
>> testing with it and said it looked good.
>>
>> Bug associated is: JDK-8205652 
>> <https://bugs.openjdk.java.net/browse/JDK-8205652>
>> Webrev is here: 
>> http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.01/ 
>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205652/webrev.01/>
>>
>> Thanks,
>> Jc
> 

From alexey.menkov at oracle.com  Tue Jul 17 21:48:26 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 17 Jul 2018 14:48:26 -0700
Subject: RFR(S) 8205541:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
 fails
In-Reply-To: <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com>
References: <CAF9BGBz1Mz0v5QcsjTOmXKUXt0neQzD8z0mqHRdjfO32eRsZzQ@mail.gmail.com>
 <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com>
Message-ID: <11d9384f-3b8b-013d-eb0f-e41736ff584b@oracle.com>

Looks good to me as well

--alex

On 07/16/2018 16:06, serguei.spitsyn at oracle.com wrote:
> Hi Jc,
> 
> It looks good to me.
> 
> Thanks,
> Serguei
> 
> 
> On 7/16/18 12:37, JC Beyler wrote:
>> Hi all,
>>
>> Small RFR to update two HeapMonitor tests to remove test failures when 
>> resetting a test data structure and assuming wrongly that the data 
>> structure was empty afterwards due to a second thread adding something 
>> to it.
>> The fix is to disable sampling then reset the storage before enabling 
>> it again.
>>
>> Bug associated is: JDK-8205541 
>> <https://bugs.openjdk.java.net/browse/JDK-8205541>
>> Webrev is here: 
>> http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.02/ 
>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205541/webrev.02/>
>>
>> Thanks all!
>> Jc
> 

From daniil.x.titov at oracle.com  Tue Jul 17 21:55:35 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Jul 2018 14:55:35 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
Message-ID: <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>

Hi Serguei,

The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times calling testSourceFilter() method passing there different parameters.  The testSourceFilter() method does the following:
      1.  creates a ClassPrepareRequest object
      2. registers new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits till the debuggee performed the command
      5. removes ClassPrepareEventListener
      6. checks if a ClassPrepareEvent was received
 

Upon its start the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the previous listeners).

cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251	     * This method sets up default listeners.
   252	     */
   253	    private void createDefaultListeners() {
   254	        /**
   255	         * This listener catches up all unexpected events.
   256	         *
   257	         */
   258	        addListener(
   259	                new EventListener() {
   260	                    public boolean eventReceived(Event event) {
   261	                        log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262	                        unexpectedEventCaught = true;
   263	                        return true;
   264	                    }
   265	                }
   266	        );
   267	

On step 2 above the ClassPrepareEventListener is added at the head of the list of the listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method. 

With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when ClassPrepareEventListener is removed), there are no registered listeners to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.

That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after ClassPrepareEventListener  is unregistered inside testSourceFilter() method.
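
The dispatch order described above can be sketched in a few lines. This is a simplified model, not nsk.share.jdi.EventHandler itself: the class ListenerChainSketch, the use of Predicate<String> as the listener type, and plain strings as events are assumptions made for illustration.

```java
import java.util.LinkedList;
import java.util.function.Predicate;

public class ListenerChainSketch {
    // Listeners are tried from head to tail; returning true consumes the event.
    private final LinkedList<Predicate<String>> listeners = new LinkedList<>();
    boolean unexpectedEventCaught = false;

    ListenerChainSketch() {
        // Catch-all at the tail: anything that reaches it fails the test.
        listeners.add(event -> {
            unexpectedEventCaught = true;
            return true;
        });
    }

    // New listeners go to the head, so they run before the catch-all.
    void addListener(Predicate<String> listener) {
        listeners.addFirst(listener);
    }

    void removeListener(Predicate<String> listener) {
        listeners.remove(listener);
    }

    void dispatch(String event) {
        for (Predicate<String> listener : listeners) {
            if (listener.test(event)) {
                return; // consumed, later listeners are not invoked
            }
        }
    }
}
```

In this model, a long-lived listener that consumes class prepare events plays the role of DefaultClassPrepareEventListener: once the per-test listener has been removed, a late event from a compiler thread still stops there instead of reaching the catch-all.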

Please see below the new webrev with the changes you suggested.

Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/


Thanks!

Best regards,
Daniil


From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

I'm not sure I fully understand the fix.
So, let's start with some questions.

Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method ?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());

  It is still not clear why the default listener is added
  after the listening is started but not before.
  If the default listener is really needed then could you, please,
  split the lines above and L129, L160 to make them a little bit shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in the "( e.g. compiler)".

Thanks,
Serguei


On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.

The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/

Thanks!
--Daniil


From jcbeyler at google.com  Tue Jul 17 22:12:06 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 17 Jul 2018 15:12:06 -0700
Subject: RFR(S) 8205541:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatArrayCorrectnessTest.java
 fails
In-Reply-To: <11d9384f-3b8b-013d-eb0f-e41736ff584b@oracle.com>
References: <CAF9BGBz1Mz0v5QcsjTOmXKUXt0neQzD8z0mqHRdjfO32eRsZzQ@mail.gmail.com>
 <0fe71fe7-1d40-0d81-e7a8-5797ffcd4471@oracle.com>
 <11d9384f-3b8b-013d-eb0f-e41736ff584b@oracle.com>
Message-ID: <CAF9BGBzg9MfNDmUdtdwMt16K3DCJ0itan9XiktZ1jJrUkUp61g@mail.gmail.com>

Hi Alex,

Thanks for the review!

Here is now the new webrev, ready for a push:
http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.03/

Thanks all,
Jc

On Tue, Jul 17, 2018 at 2:48 PM Alex Menkov <alexey.menkov at oracle.com>
wrote:

> Looks good to me as well
>
> --alex
>
> On 07/16/2018 16:06, serguei.spitsyn at oracle.com wrote:
> > Hi Jc,
> >
> > It looks good to me.
> >
> > Thanks,
> > Serguei
> >
> >
> > On 7/16/18 12:37, JC Beyler wrote:
> >> Hi all,
> >>
> >> Small RFR to update two HeapMonitor tests to remove test failures when
> >> resetting a test data structure and assuming wrongly that the data
> >> structure was empty afterwards due to a second thread adding something
> >> to it.
> >> The fix is to disable sampling then reset the storage before enabling
> >> it again.
> >>
> >> Bug associated is: JDK-8205541
> >> <https://bugs.openjdk.java.net/browse/JDK-8205541>
> >> Webrev is here:
> >> http://cr.openjdk.java.net/~jcbeyler/8205541/webrev.02/
> >> <http://cr.openjdk.java.net/%7Ejcbeyler/8205541/webrev.02/>
> >>
> >> Thanks all!
> >> Jc
> >
>


-- 

Thanks,
Jc

From alexey.menkov at oracle.com  Tue Jul 17 22:38:50 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Tue, 17 Jul 2018 15:38:50 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
 <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
 <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>
Message-ID: <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>

The changes look good to me.

--alex

On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote:
> Hi all,
> 
> We need at least one more review before pushing it.
> 
> Thanks,
> Serguei
> 
> 
> On 7/16/18 16:07, JC Beyler wrote:
>> Hi all,
>>
>> The CSR has recently been approved, could someone else review the spec 
>> update webrev:
>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ 
>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205725/webrev.03/>
>>
>> The associated bug is here: 
>> https://bugs.openjdk.java.net/browse/JDK-8205725
>> The associated CSR is here: 
>> https://bugs.openjdk.java.net/browse/JDK-8206940
>>
>> Thanks all!
>> Jc
>>
>>
>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>     Hi Jc,
>>
>>     It looks good to me (including the CSR that I had already reviewed).
>>     Thank you for preparing a fix for this issue so quickly!
>>
>>     Thanks,
>>     Serguei
>>
>>
>>     On 7/12/18 13:45, JC Beyler wrote:
>>>     Hi all,
>>>
>>>     Could I get a review of an update to the JVMTI Spec for Heap
>>>     Sampling:
>>>     http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/
>>>     <http://cr.openjdk.java.net/%7Ejcbeyler/8205725/webrev.03/>
>>>
>>>     The associated bug is here:
>>>     https://bugs.openjdk.java.net/browse/JDK-8205725
>>>     The associated CSR is here:
>>>     https://bugs.openjdk.java.net/browse/JDK-8206940
>>>
>>>     The basic reasoning of this webrev/bug/CSR is:
>>>     - rate is not the right word and should be renamed to interval,
>>>     this is what provokes the change in the code/tests/API naming.
>>>     - the spec does not mention that the new sampling interval will
>>>     take time to be taken into account (you have to wait for a TLAB
>>>     to be refilled); this adds that precision so that the user is not
>>>     surprised
>>>     - the spec explicitly says that the sampling is done via a
>>>     geometric variable which averages to the sampling interval; it
>>>     was asked to relax this and the spec should just say that the
>>>     sampling is pseudo-random and the interval will average out to
>>>     what the user requested.
>>>
>>>     Thanks for all your help,
>>>     Jc
>>
>>
>>
>> -- 
>>
>> Thanks,
>> Jc
> 

From serguei.spitsyn at oracle.com  Tue Jul 17 23:30:10 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 16:30:10 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
 <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
 <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>
 <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>
Message-ID: <2494a96a-c68f-3a22-a429-e49384f864bd@oracle.com>

Thanks a lot for reviews, Alex!
Serguei

On 7/17/18 15:38, Alex Menkov wrote:
> The changes look good to me.
>
> --alex
>
> On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote:
>> Hi all,
>>
>> We need at least one more review before pushing it.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/16/18 16:07, JC Beyler wrote:
>>> Hi all,
>>>
>>> The CSR has recently been approved, could someone else review the 
>>> spec update webrev:
>>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ 
>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205725/webrev.03/>
>>>
>>> The associated bug is here: 
>>> https://bugs.openjdk.java.net/browse/JDK-8205725
>>> The associated CSR is here: 
>>> https://bugs.openjdk.java.net/browse/JDK-8206940
>>>
>>> Thanks all!
>>> Jc
>>>
>>>
>>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com 
>>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>
>>>     Hi Jc,
>>>
>>>     It looks good to me (including the CSR that I had already 
>>> reviewed).
>>>     Thank you for preparing a fix for this issue so quickly!
>>>
>>>     Thanks,
>>>     Serguei
>>>
>>>
>>>     On 7/12/18 13:45, JC Beyler wrote:
>>>>     Hi all,
>>>>
>>>>     Could I get a review of an update to the JVMTI Spec for Heap
>>>>     Sampling:
>>>>     http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/
>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205725/webrev.03/>
>>>>
>>>>     The associated bug is here:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8205725
>>>>     The associated CSR is here:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8206940
>>>>
>>>>     The basic reasoning of this webrev/bug/CSR is:
>>>>     - rate is not the right word and should be renamed to interval,
>>>>     this is what provokes the change in the code/tests/API naming.
>>>>     - the spec does not mention that the new sampling interval will
>>>>     take time to be taken into account (you have to wait for a TLAB
>>>>     to be refilled); this adds that precision so that the user is not
>>>>     surprised
>>>>     - the spec explicitly says that the sampling is done via a
>>>>     geometric variable which averages to the sampling interval; it
>>>>     was asked to relax this and the spec should just say that the
>>>>     sampling is pseudo-random and the interval will average out to
>>>>     what the user requested.
>>>>
>>>>     Thanks for all your help,
>>>>     Jc
>>>
>>>
>>>
>>> -- 
>>>
>>> Thanks,
>>> Jc
>>


From serguei.spitsyn at oracle.com  Tue Jul 17 23:53:53 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 16:53:53 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
Message-ID: <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180717/48bbbd94/attachment.html>

From mandy.chung at oracle.com  Wed Jul 18 00:07:03 2018
From: mandy.chung at oracle.com (mandy chung)
Date: Tue, 17 Jul 2018 17:07:03 -0700
Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems
 without cpuset.effective_cpus / cpuset.effective_mem
In-Reply-To: <A8D8CB9F-D5E8-4FBF-A921-247B33CDCF6D@oracle.com>
References: <A8D8CB9F-D5E8-4FBF-A921-247B33CDCF6D@oracle.com>
Message-ID: <598c9af3-2041-0be5-f177-b3a031b2ef61@oracle.com>


On 7/17/18 7:00 AM, Bob Vandette wrote:
> Please review this fix which eliminates some docker/cgroup test failures when running on older
> Linux kernels with missing cgroup metric files.
> 
> BUGS:
> https://bugs.openjdk.java.net/browse/JDK-8206456
> 
> WEBREV:
> http://cr.openjdk.java.net/~bobv/8206456/webrev/

Nit: It would be clearer to check for the specific metrics:

int[] cpusets = metrics.getEffectiveCpuSetCpus();
if (cpusets.length != 0) {
     ....
}

Same applies to getEffectiveCpuSetMems.  No need for a new webrev.
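
A minimal sketch of that shape of check, for illustration only. CpuSetCheckSketch and effectiveCpusMatch are hypothetical names, the arrays are passed in directly instead of being read from the Metrics API or the cgroup files, and it assumes (as the snippet above implies) that a missing metric shows up as an empty array:

```java
import java.util.Arrays;

// Hypothetical helper; the real test obtains these arrays from the Metrics API and cgroup files.
class CpuSetCheckSketch {
    // Returns true when the values match, or when the check is skipped
    // because the metric is unavailable (empty array).
    static boolean effectiveCpusMatch(int[] fromMetrics, int[] fromCgroupFile) {
        if (fromMetrics.length == 0) {
            // Assumption: older kernels without cpuset.effective_cpus yield an empty array,
            // so there is nothing to compare.
            return true;
        }
        return Arrays.equals(fromMetrics, fromCgroupFile);
    }

    public static void main(String[] args) {
        System.out.println(effectiveCpusMatch(new int[0], new int[] {0, 1}));
        System.out.println(effectiveCpusMatch(new int[] {0, 1}, new int[] {0, 1}));
    }
}
```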

Mandy
P.S. I am not sure the conversion from the primitive to boxed type
is necessary.  But this is not related to this issue.  You may
want to take a look at that.

From daniil.x.titov at oracle.com  Wed Jul 18 00:25:49 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Jul 2018 17:25:49 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
Message-ID: <EC5727CD-49F5-4A3D-BA0D-E01CB9FA1A06@oracle.com>

Hi Serguei,

We could combine both listeners into one, but in that case the combined listener should be DefaultClassPrepareEventListener, registered only once at the very beginning of the whole test. We would also need to add a method to reset the eventReceived counter between invocations of testSourceFilter(), since every call of testSourceFilter() is a separate subtest.

Just wanted to make sure that I correctly understood your proposal.

addListener() is invoked after startListening() simply because of how EventHandler is implemented. EventHandler.addListener() adds a listener to the head of the list, so the last added listener is the first one to be called. The default listeners (including the "unhandled events" one) are created when EventHandler.startListening() is called. So, to ensure that our listener is called before the "unhandled events" listener, we have to call addListener() after startListening().

cat -n test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java

   222	    /**
   223	     * This is normally called in the main thread of the test debugger.
   224	     * It starts up an <code>EventHandler</code> thread that gets events coming in
   225	     * from the debuggee and distributes them to listeners.
   226	     */
   227	    public void startListening() {
   228	        createDefaultEventRequests();
   229	        createDefaultListeners();
   230	        listenThread.start();
   231	    }
   232	

250	    /**
   251	     * This method sets up default listeners.
   252	     */
   253	    private void createDefaultListeners() {
   254	        /**
   255	         * This listener catches up all unexpected events.
   256	         *
   257	         */
   258	        addListener(
   259	                new EventListener() {
   260	                    public boolean eventReceived(Event event) {
   261	                        log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262	                        unexpectedEventCaught = true;
   263	                        return true;
   264	                    }
   265	                }
   266	        );
   267	

               /**
   350	     * Add at beginning of the list because we want
   351	     * the LAST added listener to be FIRST to process
   352	     * current event.
   353	     */
   354	    public void addListener(EventListener listener) {
   355	        display("Adding listener " + listener);
   356	        synchronized(listeners) {
   357	            listeners.add(0, listener);
   358	        }
   359	    }
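
The effect of the call order can be shown with a small stand-alone sketch. ListenerOrderSketch is a made-up class, listeners are represented by plain strings, and startListening() here only registers the catch-all listener; it mirrors the listeners.add(0, listener) behaviour quoted above but is not the real EventHandler:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the ordering only; NOT the real nsk.share.jdi.EventHandler.
class ListenerOrderSketch {
    private final List<String> listeners = new ArrayList<>();

    // Mirrors listeners.add(0, listener): the newest listener goes to the head.
    void addListener(String name) { listeners.add(0, name); }

    // Stands in for createDefaultListeners(), which registers the catch-all listener.
    void startListening() { addListener("unexpected-events"); }

    List<String> order() { return new ArrayList<>(listeners); }

    // addListener() after startListening(): our listener ends up in front of the catch-all.
    static List<String> addedAfterStart() {
        ListenerOrderSketch h = new ListenerOrderSketch();
        h.startListening();
        h.addListener("class-prepare");
        return h.order();
    }

    // addListener() before startListening(): the catch-all ends up in front of our listener.
    static List<String> addedBeforeStart() {
        ListenerOrderSketch h = new ListenerOrderSketch();
        h.addListener("class-prepare");
        h.startListening();
        return h.order();
    }

    public static void main(String[] args) {
        System.out.println("after start:  " + addedAfterStart());
        System.out.println("before start: " + addedBeforeStart());
    }
}
```

In the second ordering the catch-all listener would see every event first, which is exactly what the test has to avoid.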

 
Thanks!

Best regards,
Daniil

From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

Thank you for clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach: change the code below
 154         public boolean eventReceived(Event event) {
 155             if (event instanceof ClassPrepareEvent) {
 156                 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
 157                 ThreadReference thread = classPrepareEvent.thread();
 158                 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
 159                     eventReceived++;
 160 
 161                     log.display("ClassPrepareEventListener: Event received: " + event +
 162                             " Class: " + classPrepareEvent.referenceType().name());
 163 
 164                     vm.resume();
 165 
 166                     return true;
 167                 }
 168             }
 169 
 170             return false;
 171         }

to something like:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread:" + classPrepareEvent.thread().name());
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }

 245         eventHandler.startListening();
 246         // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
 247         // The listener should be added after the event listener is started to ensure that it
 248         // called before the default event listener that handles unexpected events.
 249         eventHandler.addListener(new DefaultClassPrepareEventListener());
  Still unclear why addListener() is invoked after startListening() but not before.
  It may be that the place where this listener is added is not right and it has to be moved into testSourceFilter().
  But I hope this fragment is not needed with the simplified approach.
  Otherwise, it looks good.

Thanks,
Serguei


On 7/17/18 14:55, Daniil Titov wrote:
Hi Serguei,

The test starts the event handler (nsk.share.jdi.EventHandler) and then iterates several times, calling the testSourceFilter() method with different parameters. The testSourceFilter() method does the following:
      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received
 

Upon its start the EventHandler creates a default list of events listeners. The last listener in this list handles unexpected events (that are events not processed by the previous listeners)

cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251	     * This method sets up default listeners.
   252	     */
   253	    private void createDefaultListeners() {
   254	        /**
   255	         * This listener catches up all unexpected events.
   256	         *
   257	         */
   258	        addListener(
   259	                new EventListener() {
   260	                    public boolean eventReceived(Event event) {
   261	                        log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262	                        unexpectedEventCaught = true;
   263	                        return true;
   264	                    }
   265	                }
   266	        );
   267	

On step 2 above the ClassPrepareEventListener is added at the head of the list of the listeners. It handles ClassPrepareEvents and prevents the next listeners from being invoked by returning "true" from its eventReceived(Event) method. 

With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when ClassPrepareEventListener is removed), there are no registered listeners left to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.

That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after ClassPrepareEventListener  is unregistered inside testSourceFilter() method.

Please see below the new webrev with the changes you suggested.

Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/


Thanks!

Best regards,
Daniil


From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

Hi Daniil,

Not sure, I fully understand the fix.
So, let's start from some questions.

Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method ?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());

  It is still not clear why the default listener is added
  after the listening is started but not before.
  If the default listener is really needed then could you, please,
  split the lines above and L129, L160 to make them a little bit shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in the "( e.g. compiler)".

Thanks,
Serguei


On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.

The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/

Thanks!
--Daniil


From jcbeyler at google.com  Wed Jul 18 00:54:26 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 17 Jul 2018 17:54:26 -0700
Subject: RFR(S) 8205652:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
 fails
In-Reply-To: <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com>
References: <CAF9BGBxd32Pok1fvfst+02xS1MxSyVo-R9Wq++JCRfpg3kZX+w@mail.gmail.com>
 <b52f6fb2-df67-bd8b-b8ca-da36b7095a3c@oracle.com>
 <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com>
Message-ID: <CAF9BGBxC2a6qAzYvdZrxQz2EtKWeW=hqJjRKKoGEozEWFW61JA@mail.gmail.com>

Hi all,

Here is the webrev:
http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.02/

Thanks for the reviews and future push!
Jc

On Tue, Jul 17, 2018 at 2:29 PM Alex Menkov <alexey.menkov at oracle.com>
wrote:

> +1
>
> --alex
>
> On 07/16/2018 12:58, serguei.spitsyn at oracle.com wrote:
> > Hi Jc,
> >
> > It looks good to me.
> >
> > Thanks,
> > Serguei
> >
> > On 7/16/18 10:58, JC Beyler wrote:
> >> Hi all,
> >>
> >> Small RFR to update a HeapMonitor test that had two issues: a test was
> >> wrong and the test was not allocating enough to get to an expected
> >> sample count. Instead of allocating 10 times more and hit some OOM on
> >> the test framework, the webrev allocates in chunks and gets the number
> >> of samples.
> >>
> >> I ran this 10k times on my machine and it passed. Serguei ran mach5
> >> testing with it and said it looked good.
> >>
> >> Bug associated is: JDK-8205652
> >> <https://bugs.openjdk.java.net/browse/JDK-8205652>
> >> Webrev is here:
> >> http://cr.openjdk.java.net/~jcbeyler/8205652/webrev.01/
> >> <http://cr.openjdk.java.net/%7Ejcbeyler/8205652/webrev.01/>
> >>
> >> Thanks,
> >> Jc
> >
>


-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Wed Jul 18 01:41:48 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 18:41:48 -0700
Subject: RFR(S) 8205652:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java fails
In-Reply-To: <CAF9BGBxC2a6qAzYvdZrxQz2EtKWeW=hqJjRKKoGEozEWFW61JA@mail.gmail.com>
References: <CAF9BGBxd32Pok1fvfst+02xS1MxSyVo-R9Wq++JCRfpg3kZX+w@mail.gmail.com>
 <b52f6fb2-df67-bd8b-b8ca-da36b7095a3c@oracle.com>
 <0894a3e4-b279-5f2a-e6f7-33f065cdabd7@oracle.com>
 <CAF9BGBxC2a6qAzYvdZrxQz2EtKWeW=hqJjRKKoGEozEWFW61JA@mail.gmail.com>
Message-ID: <c8c7563e-6aaf-763e-d17c-e284f1abf129@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180717/abddf60d/attachment.html>

From serguei.spitsyn at oracle.com  Wed Jul 18 01:47:20 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 18:47:20 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
 <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
 <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>
 <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>
Message-ID: <b9a4b08f-194c-7dc6-a6ac-ebdbabcbec12@oracle.com>

Hi Jc,

Are you waiting for more reviewers?
Otherwise, could you send me a patch for push please?

Thanks,
Serguei


On 7/17/18 15:38, Alex Menkov wrote:
> The changes look good to me.
>
> --alex
>
> On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote:
>> Hi all,
>>
>> We need at least one more review before pushing it.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/16/18 16:07, JC Beyler wrote:
>>> Hi all,
>>>
>>> The CSR has recently been approved, could someone else review the 
>>> spec update webrev:
>>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/ 
>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205725/webrev.03/>
>>>
>>> The associated bug is here: 
>>> https://bugs.openjdk.java.net/browse/JDK-8205725
>>> The associated CSR is here: 
>>> https://bugs.openjdk.java.net/browse/JDK-8206940
>>>
>>> Thanks all!
>>> Jc
>>>
>>>
>>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com 
>>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>>
>>>     Hi Jc,
>>>
>>>     It looks good to me (including the CSR that I had already 
>>> reviewed).
>>>     Thank you for preparing a fix for this issue so quickly!
>>>
>>>     Thanks,
>>>     Serguei
>>>
>>>
>>>     On 7/12/18 13:45, JC Beyler wrote:
>>>>     Hi all,
>>>>
>>>>     Could I get a review of an update to the JVMTI Spec for Heap
>>>>     Sampling:
>>>>     http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/
>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8205725/webrev.03/>
>>>>
>>>>     The associated bug is here:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8205725
>>>>     The associated CSR is here:
>>>>     https://bugs.openjdk.java.net/browse/JDK-8206940
>>>>
>>>>     The basic reasoning of this webrev/bug/CSR is:
>>>>     - rate is not the right word and should be renamed to interval,
>>>>     this is what provokes the change in the code/tests/API naming.
>>>>     - the spec does not mention that the new sampling interval will
>>>>     take time to be taken into account (you have to wait for a TLAB
>>>>     to be refilled); this adds that precision so that the user is not
>>>>     surprised
>>>>     - the spec explicitly says that the sampling is done via a
>>>>     geometric variable which averages to the sampling interval; it
>>>>     was asked to relax this and the spec should just say that the
>>>>     sampling is pseudo-random and the interval will average out to
>>>>     what the user requested.
>>>>
>>>>     Thanks for all your help,
>>>>     Jc
>>>
>>>
>>>
>>> -- 
>>>
>>> Thanks,
>>> Jc
>>


From daniil.x.titov at oracle.com  Wed Jul 18 02:06:37 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Jul 2018 19:06:37 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
Message-ID: <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>

Hi Serguei,

 
Please review a new version of the patch.

 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

 
Thanks!

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

Thank you for clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach: change the code below
 154         public boolean eventReceived(Event event) {
 155             if (event instanceof ClassPrepareEvent) {
 156                 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
 157                 ThreadReference thread = classPrepareEvent.thread();
 158                 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
 159                     eventReceived++;
 160 
 161                     log.display("ClassPrepareEventListener: Event received: " + event +
 162                             " Class: " + classPrepareEvent.referenceType().name());
 163 
 164                     vm.resume();
 165 
 166                     return true;
 167                 }
 168             }
 169 
 170             return false;
 171         }

to something like:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread:" + classPrepareEvent.thread().name());
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }
 
 245         eventHandler.startListening();
 246         // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
 247         // The listener should be added after the event listener is started to ensure that it
 248         // called before the default event listener that handles unexpected events.
 249         eventHandler.addListener(new DefaultClassPrepareEventListener());
  Still unclear why addListener() is invoked after startListening() but not before.
  It may be that the place where this listener is added is not right and it has to be moved into testSourceFilter().
  But I hope this fragment is not needed with the simplified approach.
  Otherwise, it looks good.

Thanks,
Serguei


On 7/17/18 14:55, Daniil Titov wrote:
Hi Serguei,
 
The test starts the event handler (nsk.share.jdi.EventHandler) and then calls the testSourceFilter() method several times with different parameters. The testSourceFilter() method does the following:
      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received
 
 
On startup the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the previous listeners).
 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   
 
In step 2 above the ClassPrepareEventListener is added at the head of the list of listeners. It handles ClassPrepareEvents and prevents the subsequent listeners from being invoked by returning "true" from its eventReceived(Event) method.
 
With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when the ClassPrepareEventListener is removed), there is no registered listener to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.
 
That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after the ClassPrepareEventListener is unregistered inside the testSourceFilter() method.
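
The dispatch order described above can be sketched roughly as follows. This is only an illustrative model: the class and method names are hypothetical stand-ins rather than the real nsk.share.jdi.EventHandler API, events are plain strings, and the only assumptions are that a newly added listener goes to the head of the list and that a listener returning true stops the dispatch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the listener chain; not the real nsk.share.jdi.EventHandler.
class ListenerChainSketch {
    interface Listener {
        boolean eventReceived(String event); // true = handled, stop dispatching
    }

    final List<Listener> listeners = new ArrayList<>();
    boolean unexpectedEventCaught = false;

    ListenerChainSketch() {
        // Last-resort listener: whatever reaches it is treated as unexpected.
        listeners.add(event -> {
            unexpectedEventCaught = true;
            return true;
        });
    }

    // New listeners are placed at the head, so they run before older ones.
    void addListener(Listener l) { listeners.add(0, l); }

    void removeListener(Listener l) { listeners.remove(l); }

    void dispatch(String event) {
        for (Listener l : listeners) {
            if (l.eventReceived(event)) {
                return; // handled, later listeners are not invoked
            }
        }
    }

    public static void main(String[] args) {
        Listener perTest = e -> e.startsWith("ClassPrepare");

        // Without a fallback: an event arriving after removal is "unexpected".
        ListenerChainSketch plain = new ListenerChainSketch();
        plain.addListener(perTest);
        plain.dispatch("ClassPrepare:main");
        plain.removeListener(perTest);
        plain.dispatch("ClassPrepare:compiler");
        System.out.println("without fallback: unexpected=" + plain.unexpectedEventCaught);

        // With a fallback listener that stays registered for the whole run.
        ListenerChainSketch guarded = new ListenerChainSketch();
        guarded.addListener(e -> e.startsWith("ClassPrepare"));
        guarded.addListener(perTest);
        guarded.dispatch("ClassPrepare:main");
        guarded.removeListener(perTest);
        guarded.dispatch("ClassPrepare:compiler");
        System.out.println("with fallback: unexpected=" + guarded.unexpectedEventCaught);
    }
}
```

In this model the first scenario ends with the unexpected flag set and the second does not, which mirrors the role the mail assigns to DefaultClassPrepareEventListener.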
 
Please see below the new webrev with the changes you suggested.
 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/
 
 
Thanks!
 
Best regards,
Daniil
 
 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
 
Hi Daniil,
 
I'm not sure I fully understand the fix.
So, let's start with some questions.
 
Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());
 
  It is still not clear why the default listener is added
  after listening is started rather than before.
  If the default listener is really needed, could you please
  split the lines above and L129, L160 to make them a little shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in "( e.g. compiler)".
 
Thanks,
Serguei
 
 
On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.
 
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.
 
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/
 
Thanks!
--Daniil
 
 

From serguei.spitsyn at oracle.com  Wed Jul 18 03:00:02 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 20:00:02 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
Message-ID: <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180717/12b12b08/attachment.html>

From jcbeyler at google.com  Wed Jul 18 03:05:08 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 17 Jul 2018 20:05:08 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <b9a4b08f-194c-7dc6-a6ac-ebdbabcbec12@oracle.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
 <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
 <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>
 <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>
 <b9a4b08f-194c-7dc6-a6ac-ebdbabcbec12@oracle.com>
Message-ID: <CAF9BGBzUwJ_Urw3CHM1R4Y+iGiXzrv_Sg9ke+cyd_-uDi=UsWA@mail.gmail.com>

Hi Serguei,

No I was waiting for the other patches to be pushed (thank you for doing
it). Now that it is done, I prepared this one that should be clean of
conflicts for you :-)

Here it is:
http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.04/

Thanks!
Jc


On Tue, Jul 17, 2018 at 6:47 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> Are you waiting for more reviewers?
> Otherwise, could you send me a patch for push please?
>
> Thanks,
> Serguei
>
>
> On 7/17/18 15:38, Alex Menkov wrote:
> > The changes look good to me.
> >
> > --alex
> >
> > On 07/16/2018 16:10, serguei.spitsyn at oracle.com wrote:
> >> Hi all,
> >>
> >> We need at least one more review before pushing it.
> >>
> >> Thanks,
> >> Serguei
> >>
> >>
> >> On 7/16/18 16:07, JC Beyler wrote:
> >>> Hi all,
> >>>
> >>> The CSR has recently been approved, could someone else review the
> >>> spec update webrev:
> >>> http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/
> >>>
> >>> The associated bug is here:
> >>> https://bugs.openjdk.java.net/browse/JDK-8205725
> >>> The associated CSR is here:
> >>> https://bugs.openjdk.java.net/browse/JDK-8206940
> >>>
> >>> Thanks all!
> >>> Jc
> >>>
> >>>
> >>> On Thu, Jul 12, 2018 at 2:27 PM serguei.spitsyn at oracle.com
> >>> <serguei.spitsyn at oracle.com> wrote:
> >>>
> >>>     Hi Jc,
> >>>
> >>>     It looks good to me (including the CSR that I had already
> >>>     reviewed).
> >>>     Thank you for preparing a fix for this issue so quickly!
> >>>
> >>>     Thanks,
> >>>     Serguei
> >>>
> >>>
> >>>     On 7/12/18 13:45, JC Beyler wrote:
> >>>>     Hi all,
> >>>>
> >>>>     Could I get a review of an update to the JVMTI Spec for Heap
> >>>>     Sampling:
> >>>>     http://cr.openjdk.java.net/~jcbeyler/8205725/webrev.03/
> >>>>
> >>>>     The associated bug is here:
> >>>>     https://bugs.openjdk.java.net/browse/JDK-8205725
> >>>>     The associated CSR is here:
> >>>>     https://bugs.openjdk.java.net/browse/JDK-8206940
> >>>>
> >>>>     The basic reasoning of this webrev/bug/CSR is:
> >>>>     - rate is not the right word and should be renamed to interval,
> >>>>     this is what provokes the change in the code/tests/API naming.
> >>>>     - the spec does not mention that the new sampling interval will
> >>>>     take time to be taken into account (you have to wait for a TLAB
> >>>>     to be refilled); this adds that precision so that the user is not
> >>>>     surprised
> >>>>     - the spec explicitly says that the sampling is done via a
> >>>>     geometric variable which averages to the sampling interval; it
> >>>>     was asked to relax this and the spec should just say that the
> >>>>     sampling is pseudo-random and the interval will average out to
> >>>>     what the user requested.
> >>>>
> >>>>     Thanks for all your help,
> >>>>     Jc
> >>>
> >>>
> >>>
> >>> --
> >>>
> >>> Thanks,
> >>> Jc
> >>
>
>

-- 

Thanks,
Jc
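
For readers unfamiliar with the wording discussed in the quoted thread ("pseudo-random", with the interval averaging out to what the user requested), here is one way such an interval could be drawn. This is a hedged sketch only, not HotSpot's sampler: the class and method names are hypothetical, and it uses an exponential distribution (the continuous analogue of the geometric variable mentioned in the old spec text), which the relaxed wording permits but does not require.

```java
import java.util.Random;

// Hypothetical sketch; not the HotSpot heap sampler.
class SamplingIntervalSketch {
    // Draws the number of bytes to allocate before the next sample.
    // Inverse-transform sampling of an exponential distribution whose
    // mean is meanIntervalBytes.
    static long nextIntervalBytes(Random rng, long meanIntervalBytes) {
        double u = rng.nextDouble(); // uniform in [0, 1)
        return (long) (-Math.log(1.0 - u) * meanIntervalBytes);
    }

    // Averages many draws to show that they settle near the requested mean.
    static double averageInterval(long seed, long meanIntervalBytes, int draws) {
        Random rng = new Random(seed);
        long total = 0;
        for (int i = 0; i < draws; i++) {
            total += nextIntervalBytes(rng, meanIntervalBytes);
        }
        return (double) total / draws;
    }

    public static void main(String[] args) {
        long requested = 512 * 1024; // example interval in bytes (an assumption)
        double avg = averageInterval(42L, requested, 200_000);
        System.out.println("requested=" + requested + " average=" + avg);
    }
}
```

Individual intervals vary widely, but the average over many draws stays close to the requested value, which is the only property the relaxed spec text promises.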

From serguei.spitsyn at oracle.com  Wed Jul 18 03:31:41 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 20:31:41 -0700
Subject: RFR (S) 8205725: Update the JVMTI Spec for Heap Sampling
In-Reply-To: <CAF9BGBzUwJ_Urw3CHM1R4Y+iGiXzrv_Sg9ke+cyd_-uDi=UsWA@mail.gmail.com>
References: <CAF9BGBzQY2+rx-ec7n7Y5Sq9N4E=XLUwx3hs_SKqhB2RXyf4TQ@mail.gmail.com>
 <8c440689-10d1-c233-e5d0-0c8aa1f792c1@oracle.com>
 <CAF9BGBwAR1nnS5Pr3obvFZD7d_-KKMj2j-mnkj+g9jDfy1zqEQ@mail.gmail.com>
 <d38e8a76-595b-2f87-15b5-33546a9d8b6b@oracle.com>
 <18f6b7c6-899b-cf89-d4fd-175508163806@oracle.com>
 <b9a4b08f-194c-7dc6-a6ac-ebdbabcbec12@oracle.com>
 <CAF9BGBzUwJ_Urw3CHM1R4Y+iGiXzrv_Sg9ke+cyd_-uDi=UsWA@mail.gmail.com>
Message-ID: <506173c6-722d-b4b5-6505-b2166b073282@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180717/b8a41202/attachment.html>

From daniil.x.titov at oracle.com  Wed Jul 18 03:32:10 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 17 Jul 2018 20:32:10 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com>
Message-ID: <B5A65954-0F1A-4B21-924A-ED5E591CDED6@oracle.com>

Hi Serguei,

 
The changes are in the one test class vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java so they affect only this single test. No other tests depend on this class.

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 7:59 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

It looks good to me.
Thank you for the update.

How many tests are depending on this class?
Could we say that all the nsk/jdi/ClassPrepareRequest tests
need to be checked that there are no regressions?

Thanks,
Serguei


On 7/17/18 19:06, Daniil Titov wrote:

Hi Serguei,

 
Please review a new version of the patch.

 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

 
Thanks!

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

Thank you for clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach. Change the code below:
 154         public boolean eventReceived(Event event) {
 155             if (event instanceof ClassPrepareEvent) {
 156                 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
 157                 ThreadReference thread = classPrepareEvent.thread();
 158                 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
 159                     eventReceived++;
 160 
 161                     log.display("ClassPrepareEventListener: Event received: " + event +
 162                             " Class: " + classPrepareEvent.referenceType().name());
 163 
 164                     vm.resume();
 165 
 166                     return true;
 167                 }
 168             }
 169 
 170             return false;
 171         }

to something like:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread:" + classPrepareEvent.thread().name());
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }
 
 245         eventHandler.startListening();
 246         // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
 247         // The listener should be added after the event listener is started to ensure that it
 248         // called before the default event listener that handles unexpected events.
 249         eventHandler.addListener(new DefaultClassPrepareEventListener());
  It is still unclear why addListener() is invoked after startListening() rather than before.
  It may be that this is not the right place to add the listener and it has to be moved into testSourceFilter().
  But I hope this fragment is not needed with the simplified approach.
  Otherwise, it looks good.

Thanks,
Serguei


On 7/17/18 14:55, Daniil Titov wrote:
Hi Serguei,
 
The test starts the event handler (nsk.share.jdi.EventHandler) and then calls the testSourceFilter() method several times with different parameters. The testSourceFilter() method does the following:
      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received
 
 
On startup the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the previous listeners).
 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   
 
In step 2 above the ClassPrepareEventListener is added at the head of the list of listeners. It handles ClassPrepareEvents and prevents the subsequent listeners from being invoked by returning "true" from its eventReceived(Event) method.
 
With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when the ClassPrepareEventListener is removed), there is no registered listener to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.
 
That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after the ClassPrepareEventListener is unregistered inside the testSourceFilter() method.
 
Please see below the new webrev with the changes you suggested.
 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/
 
 
Thanks!
 
Best regards,
Daniil
 
 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
 
Hi Daniil,
 
I'm not sure I fully understand the fix.
So, let's start with some questions.
 
Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());
 
  It is still not clear why the default listener is added
  after listening is started rather than before.
  If the default listener is really needed, could you please
  split the lines above and L129, L160 to make them a little shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in "( e.g. compiler)".
 
Thanks,
Serguei
 
 
On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.
 
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.
 
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/
 
Thanks!
--Daniil
 
 

From serguei.spitsyn at oracle.com  Wed Jul 18 03:36:50 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 17 Jul 2018 20:36:50 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <B5A65954-0F1A-4B21-924A-ED5E591CDED6@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com>
 <B5A65954-0F1A-4B21-924A-ED5E591CDED6@oracle.com>
Message-ID: <a5384e77-8b2f-f8f1-21c1-863a0a2a6f59@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180717/1f80c222/attachment.html>

From chris.plummer at oracle.com  Wed Jul 18 05:01:50 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Jul 2018 22:01:50 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <a7e3536694e84e859bb14e3cb19f292c@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
Message-ID: <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>

Hi Ralf,

A few comments below, but overall looks good:

   27  * @summary get stack trace for large stacks took too long.

How about "Test that getting the stack trace for a very large stack does 
not take too long".

The max number of frames you'll test for is 100M, but the stack size is 
set to 4m, assuming -Xss works (and I think on some platforms it may 
not). 100M frames seems like overkill for a 4M stack. If the stack was 
nothing more than a frame link pointer on a 32-bit system, you'd only 
have 1M frames, but let's be more realistic than that and say you should 
never have more than 256k frames. Lowering the max number of frames will 
prevent this test from taking a very long time on platforms where -Xss 
has failed.
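
The estimate above can be checked with a few lines of arithmetic. The sketch below is only a worked calculation; the class and method names are hypothetical, and the 16 bytes per frame figure is derived here from the 4M stack and the 256k bound rather than stated anywhere in the thread.

```java
// Worked arithmetic for the frame-count bound; names are illustrative only.
class FrameBudgetSketch {
    // Upper bound on frames if every frame used exactly bytesPerFrame bytes.
    static long maxFrames(long stackBytes, long bytesPerFrame) {
        return stackBytes / bytesPerFrame;
    }

    public static void main(String[] args) {
        long stackBytes = 4L * 1024 * 1024; // -Xss4m

        // A frame that is nothing but a 4-byte link pointer (32-bit system).
        System.out.println(maxFrames(stackBytes, 4));       // 1048576, i.e. 1M frames

        // The suggested 256k bound corresponds to this many bytes per frame.
        System.out.println(stackBytes / (256L * 1024));     // 16
    }
}
```

Either bound is several hundred times smaller than the 100M frames the test currently allows for.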

   65                 // Have some frames be removed before we call again.

Should this be: "Pop some frames so there is room on the stack for the 
println()"

   96         bpe = resumeTo("Frames2Targ", "callEnded", "()V");

What happens if we never get to callEnded()?

thanks,

Chris

On 7/17/18 7:08 AM, Schmelter, Ralf wrote:
> Hi all,
>
> here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/
>
> I've converted the shell based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime).
>
> The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Formerly I've reused the count variable, but this can lead to misunderstandings.
>
> Best regards,
> Ralf
>
>
> -----Original Message-----
> From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf
> Sent: Montag, 9. Juli 2018 16:05
> To: Chris Plummer <chris.plummer at oracle.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
> Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Chris,
>
> thanks for the review.
>
>> What testing have you done?
> I've tested the change by debugging by hand in Eclipse and jdb, and by running the com/sun/jdi jtreg tests and the JDWP JCK tests. Analogous code has been running in the SAP JVM for many years.
>
>
>> How long does this test take to run.
> 15 s according to jtreg.
>
>
>> What happens if for some reason SOE is never thrown? It's not clear to
>> me what the script would do in this case.
> It is treated as passed (which is not ideal).
>
>
>> In answer to the ShellScaffold.sh question, there is already work
>> underway to convert to pure java tests. See JDK-8201652.
> Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done.
>
> Best regards,
> Ralf
>
>
>
> -----Original Message-----
> From: Chris Plummer [mailto:chris.plummer at oracle.com]
> Sent: Freitag, 6. Juli 2018 00:37
> To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> Overall looks good, but I do have a few comments and questions.
>
> Please update the copyright.
>
> What testing have you done?
>
> How long does this test take to run.
>
> What happens if for some reason SOE is never thrown? It's not clear to
> me what the script would do in this case.
> In answer to the ShellScaffold.sh question, there is already work
> underway to convert to pure java tests. See JDK-8201652. I'm not certain
> if it is ok for you to just submit this new shell script, or if should
> be rewritten in pure java. Most of the work to convert the scripts has
> already been done but was put on hold. Maybe Serguei can comment and
> guide you on how it would be done in java.
>
> thanks,
>
> Chris
>
> On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
>> Hi All,
>>
>> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>>
>> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>>
>> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>>
>> Best regards,
>> Ralf Schmelter
>
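
To illustrate the quadratic-versus-linear point made in the thread above, here is a small cost model. It is a sketch under stated assumptions, not the ThreadReferenceImpl.c code: the stack is modelled as a linked list, reaching frame i is assumed to require walking from the top frame, and the batch method stands in for a single GetStackTrace-style call. Only the name filledIn is borrowed from the mail; everything else is hypothetical.

```java
// Hypothetical cost model for fetching all frames of a deep stack.
class FramesSketch {
    static final class Frame {
        final int id;
        final Frame caller;
        Frame(int id, Frame caller) { this.id = id; this.caller = caller; }
    }

    static long steps; // counts simulated stack-walk steps

    // Builds a stack whose top frame has id 0 and whose bottom frame has id depth-1.
    static Frame buildStack(int depth) {
        Frame f = null;
        for (int i = depth - 1; i >= 0; i--) {
            f = new Frame(i, f);
        }
        return f;
    }

    // Reaching frame `index` means walking down from the top every time.
    static Frame frameAt(Frame top, int index) {
        Frame f = top;
        steps++;
        for (int i = 0; i < index; i++) {
            f = f.caller;
            steps++;
        }
        return f;
    }

    // One lookup per frame: about depth^2 / 2 steps in total.
    static int[] framesOneByOne(Frame top, int depth) {
        int[] ids = new int[depth];
        for (int i = 0; i < depth; i++) {
            ids[i] = frameAt(top, i).id;
        }
        return ids;
    }

    // One walk that fills a caller-supplied buffer: depth steps, depth extra memory.
    static int fillFrames(Frame top, int[] buffer) {
        int filledIn = 0;
        for (Frame f = top; f != null && filledIn < buffer.length; f = f.caller) {
            buffer[filledIn++] = f.id;
            steps++;
        }
        return filledIn;
    }

    public static void main(String[] args) {
        int depth = 2000;
        Frame top = buildStack(depth);

        steps = 0;
        framesOneByOne(top, depth);
        System.out.println("one by one: " + steps + " steps");

        steps = 0;
        int filledIn = fillFrames(top, new int[depth]);
        System.out.println("single walk: " + steps + " steps, filledIn=" + filledIn);
    }
}
```

In this model a depth of n costs n(n+1)/2 steps one frame at a time and n steps with a single walk, at the price of a buffer proportional to the number of frames.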


From chris.plummer at oracle.com  Wed Jul 18 06:38:59 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 17 Jul 2018 23:38:59 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B4E0C62.3020808@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
Message-ID: <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com>

Hi Gary,

I've been having trouble following the control flow of this test. One 
thing I've stumbled across is the following:

             /* A debuggee class must define 'methodForCommunication'
              * method and invoke it in points of synchronization
              * with a debugger.
              */
             setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");

So why isn't this mode of synchronization good enough? Is it because it 
was not designed with the understanding that the debugger might be doing 
suspended thread counts, and suspending all threads at the breakpoint 
messes up the test?

 From what I can tell of the test, after the debuggee is started and 
hits the default breakpoint at the start of main(), the debugger then 
does a vm.resume() at the start of the for loop in the runTest() method. 
The debuggee then creates a thread and calls methodForCommunication(). 
There is already a breakpoint set there by the above debuggee code. It's 
unclear to me what happens as a result of this breakpoint and how it 
serves the test. Also unclear to me who is responsible for the 
vm.resume() after the breakpoint is hit.

The debugger then requests all ThreadStart events, requesting that no 
threads be disabled when it is sent. I think you are saying that when 
the ThreadStart event comes in, sometimes we are at the 
methodForCommunication breakpoint, with all threads disabled, and this 
messes up the thread suspend counts. You want to delay 100ms so the 
breakpoint event can be processed and threads resumed again (although I 
can't see who actually resumes the thread after hitting the 
methodForCommunication breakpoint).

Chris

On 7/17/18 8:33 AM, Gary Adams wrote:
> A race condition exists between the debugger and the debuggee.
>
> The first test thread is started with SUSPEND_NONE policy set.
> While processing the thread start event the debugger captures
> an initial set of thread suspend counts and resumes the
> debuggee vm. If the debuggee advances quickly it reaches
> the breakpoint set for methodForCommunication. Since the breakpoint
> carries with it SUSPEND_ALL policy, when the debugger captures a second
> set of suspend counts, it will not match the expected counts for
> a SUSPEND_NONE scenario.
>
> The proposed fix introduces a yield in the debuggee test thread run 
> method
> to allow the debugger to get the expected sampled values.
>
>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>
>
> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
> ...
>    186        private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
>    187            Method method = debuggee.methodByName(refType, methodName);
>    188            Location location = null;
>    189            try {
>    190                location = method.allLineLocations().get(0);
>    191            } catch (AbsentInformationException e) {
>    192                throw new Failure(e);
>    193            }
>    194            bpRequest = debuggee.makeBreakpoint(location);
>    195
>
>    196            bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>
>    197            bpRequest.putProperty("number", "zero");
>    198            bpRequest.enable();
>    199
>    200            eventHandler.addListener(
>    201                 new EventHandler.EventListener() {
>    202                     public boolean eventReceived(Event event) {
>    203                        if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>    204                            synchronized(eventHandler) {
>    205                                display("Received communication breakpoint event.");
>    206                                bpCount++;
>    207                                eventHandler.notifyAll();
>    208                            }
>    209                            return true;
>    210                        }
>    211                        return false;
>    212                     }
>    213                 }
>    214            );
>    215        }
>
>
> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
> ...
>    140                    display("......--> vm.suspend();");
>    141                    vm.suspend();
>    142
>    143                    display("        getting : Map<String, Integer> suspendsCounts1");
>    144
>    145                    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
>    146                    for (ThreadReference threadReference : vm.allThreads()) {
>    147                        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
>    148                    }
>    149                    display(suspendsCounts1.toString());
>    150
>    151                    display("        eventSet.resume;");
>    152                    eventSet.resume();
>    153
>    154                    display("        getting : Map<String, Integer> suspendsCounts2");
>
> This is where the breakpoint is encountered before the second set of 
> suspend counts is acquired.
>
>    155                    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
>    156                    for (ThreadReference threadReference : vm.allThreads()) {
>    157                        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
>    158                    }
>    159                    display(suspendsCounts2.toString());
>


From gary.adams at oracle.com  Wed Jul 18 11:52:36 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Wed, 18 Jul 2018 07:52:36 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com>
Message-ID: <5B4F2A04.20409@oracle.com>

There is nothing wrong with the breakpoint in methodForCommunication.
The test uses it to make sure the threads are each tested separately.
The breakpoint event handler just displays a message, increments a counter
and returns.

Let me step through the resume008 debuggee to help clarify ...

1. The test thread is created and the synchronized breakpoint is 
observed. lines 101-102
2. The thread is started. lines 104,135-137
     2a. The main thread blocks on a local object. lines 133, 139
     2b. The test thread is started. line 137
            A run entered message is displayed, line 159
            The main thread lock object is notified, line 167
           2b1. The main thread continues. lines 167, 146
                   The next test thread is created. line 106
                   The synchronized breakpoint is observed, line 107
           2b2. A run exited message is displayed, line 169

On the resume008 debugger side ...
   1. On a thread start event the debuggee is suspended, line 141
   2. Messages are displayed and a first set of thread suspend counts is 
acquired. lines 143-151
   3. The threads are resumed, line 152
--->
   4. Messages are displayed and a second set of thread suspend counts 
is acquired. lines 154-159

The way the test is written the expectation is the debugger steps 2,3,4 
will all happen
while the test thread is running.

When the debugger resumes the debuggee threads (debugger step 3)
the debuggee continues from where it left off (debuggee steps 2b,2b1,2b2)

If we complete debuggee step 2b1 (line 107) before the debugger 
completes step 4 line 159,
then the synchronized breakpoint will suspend the vm and the counts will 
not match
for the SUSPEND_NONE test thread start.
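
The mismatch described above can be sketched as a comparison of two
snapshots of per-thread suspend counts. This is only an illustration: the
class SuspendCountDiff and its method are hypothetical names, the exact
comparison done by resume008 is not shown in this thread, and the sample
counts assume that a SUSPEND_ALL breakpoint adds one suspension to every
thread on top of the earlier vm.suspend().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of resume008: compares two suspend-count snapshots.
final class SuspendCountDiff {

    // Returns the names of threads whose count differs between the snapshots.
    // Threads present in only one snapshot are ignored in this sketch.
    static List<String> mismatches(Map<String, Integer> before,
                                   Map<String, Integer> after) {
        List<String> names = new ArrayList<>();
        // TreeMap gives a stable, sorted iteration order for reporting.
        for (Map.Entry<String, Integer> e : new TreeMap<>(before).entrySet()) {
            Integer later = after.get(e.getKey());
            if (later != null && !later.equals(e.getValue())) {
                names.add(e.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // After vm.suspend() alone, every thread is assumed to have count 1.
        Map<String, Integer> counts1 =
            Map.of("main", 1, "thread0", 1, "Common-Cleaner", 1);
        // Assumed effect of a SUSPEND_ALL breakpoint arriving in between.
        Map<String, Integer> counts2 =
            Map.of("main", 2, "thread0", 2, "Common-Cleaner", 2);
        System.out.println("no race:   " + mismatches(counts1, counts1));
        System.out.println("with race: " + mismatches(counts1, counts2));
    }
}
```

With identical snapshots the list is empty; once the breakpoint lands
between the two samples, every thread shows up as a mismatch, which is
consistent with the failure naming an unrelated thread such as
Common-Cleaner.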

resume008a.java:

    100                        case 0:
    101                                thread0 = new 
Threadresume008a("thread0");
    102                                methodForCommunication();
    103
    104                                threadStart(thread0);
    105
    106                                thread1 = new 
Threadresume008a("thread1");
    107                                methodForCommunication();
    108                                break;

    ...
    135        static int threadStart(Thread t) {
    136            synchronized (waitnotifyObj) {
    137                t.start();
    138                try {
    139                    waitnotifyObj.wait();
    140                } catch ( Exception e) {
    141                    exitCode = FAILED;
    142                    logErr("       Exception : " + e );
    143                    return FAILED;
    144                }
    145            }
    146            return PASSED;
    147        }

    149        static class Threadresume008a extends Thread {
    150
    151            String tName = null;
    152
    153            public Threadresume008a(String threadName) {
    154                super(threadName);
    155                tName = threadName;
    156            }
    157
    158            public void run() {
    159                log1("  'run': enter  :: threadName == " + tName);

This is the proposed fix that will let the debugger complete its second
acquisition of suspend counts while the test thread is still running.

    160                // Yield, so the start thread event processing can be completed.
    161                try {
    162                    Thread.sleep(100);
    163                } catch (InterruptedException e) {
    164                    // ignored
    165                }

    166                synchronized (waitnotifyObj) {
    167                        waitnotifyObj.notify();
    168                }
    169                log1("  'run': exit   :: threadName == " + tName);
    170                return;
    171            }
    172        }
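
For readers who want to try the handshake outside the test suite, here is
a minimal stand-alone sketch of the same pattern: the starter blocks until
the new thread notifies, and the new thread sleeps before notifying. The
class StartHandshake and its members are illustrative names, not code from
resume008a. Unlike the test, the sketch guards the wait with a flag so a
spurious wakeup cannot release the starter early.

```java
// Stand-alone sketch; names are illustrative and not taken from resume008a.
final class StartHandshake {
    private static final Object lock = new Object();
    private static boolean started;   // guarded by lock

    // Starts a worker that sleeps delayMillis before notifying, blocks until
    // the notification arrives, and returns the elapsed time in milliseconds.
    static long startAndAwait(long delayMillis) {
        long begin = System.nanoTime();
        synchronized (lock) {
            started = false;
            Thread worker = new Thread(() -> {
                try {
                    // The delay keeps the starter parked for a while; in the
                    // test this is the window the debugger uses to sample counts.
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                synchronized (lock) {
                    started = true;
                    lock.notify();
                }
            }, "worker");
            worker.start();
            try {
                while (!started) {    // loop guards against spurious wakeups
                    lock.wait();      // releases the lock while waiting
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return (System.nanoTime() - begin) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("released after ~" + startAndAwait(100) + " ms");
    }
}
```

A fixed sleep only makes the race less likely; it does not order the
debugger's second sample before the breakpoint in any strict sense.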


On 7/18/18, 2:38 AM, Chris Plummer wrote:
> Hi Gary,
>
> I've been having trouble following the control flow of this test. One 
> thing I've stumbled across is the following:
>
>             /* A debuggee class must define 'methodForCommunication'
>              * method and invoke it in points of synchronization
>              * with a debugger.
>              */
> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>
> So why isn't this mode of synchronization good enough? Is it because 
> it was not designed with the understanding that the debugger might be 
> doing suspended thread counts, and suspending all threads at the 
> breakpoint messes up the test?
>
> From what I can tell of the test, after the debuggee is started and 
> hits the default breakpoint at the start of main(), the debugger then 
> does a vm.resume() at the start of the for loop in the runTest() 
> method. The debuggee then creates a thread and calls 
> methodForCommunication(). There is already a breakpoint set there by 
> the above debuggee code. It's unclear to me what happens as a result 
> of this breakpoint and how it serves the test. Also unclear to me who 
> is responsible for the vm.resume() after the breakpoint is hit.
>
> The debugger then requests all ThreadStart events, requesting that no 
> threads be suspended when one is sent. I think you are saying that when 
> the ThreadStart event comes in, sometimes we are at the 
> methodForCommunication breakpoint, with all threads suspended, and this 
> messes up the thread suspend counts. You want to delay 100ms so the 
> breakpoint event can be processed and threads resumed again (although 
> I can't see who actually resumes the thread after hitting the 
> methodForCommunication breakpoint).
>
> Chris
>
> On 7/17/18 8:33 AM, Gary Adams wrote:
>> A race condition exists between the debugger and the debuggee.
>>
>> The first test thread is started with SUSPEND_NONE policy set.
>> While processing the thread start event the debugger captures
>> an initial set of thread suspend counts and resumes the
>> debuggee vm. If the debuggee advances quickly it reaches
>> the breakpoint set for methodForCommunication. Since the breakpoint
>> carries with it SUSPEND_ALL policy, when the debugger captures a second
>> set of suspend counts, it will not match the expected counts for
>> a SUSPEND_NONE scenario.
>>
>> The proposed fix introduces a yield in the debuggee test thread run 
>> method
>> to allow the debugger to get the expected sampled values.
>>
>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>
>>
>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>> ...
>>    186        private void setCommunicationBreakpoint(ReferenceType 
>> refType, String methodName) {
>>    187            Method method = debuggee.methodByName(refType, 
>> methodName);
>>    188            Location location = null;
>>    189            try {
>>    190                location = method.allLineLocations().get(0);
>>    191            } catch (AbsentInformationException e) {
>>    192                throw new Failure(e);
>>    193            }
>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>    195
>>
>>    196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>
>>    197            bpRequest.putProperty("number", "zero");
>>    198            bpRequest.enable();
>>    199
>>    200            eventHandler.addListener(
>>    201                 new EventHandler.EventListener() {
>>    202                     public boolean eventReceived(Event event) {
>>    203                        if (event instanceof BreakpointEvent && 
>> bpRequest.equals(event.request())) {
>>    204                            synchronized(eventHandler) {
>>    205                                display("Received communication 
>> breakpoint event.");
>>    206                                bpCount++;
>>    207                                eventHandler.notifyAll();
>>    208                            }
>>    209                            return true;
>>    210                        }
>>    211                        return false;
>>    212                     }
>>    213                 }
>>    214            );
>>    215        }
>>
>>
>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
>> ...
>>    140                    display("......--> vm.suspend();");
>>    141                    vm.suspend();
>>    142
>>    143                    display("        getting : Map<String, 
>> Integer> suspendsCounts1");
>>    144
>>    145                    Map<String, Integer> suspendsCounts1 = new 
>> HashMap<String, Integer>();
>>    146                    for (ThreadReference threadReference : 
>> vm.allThreads()) {
>>    147 suspendsCounts1.put(threadReference.name(), 
>> threadReference.suspendCount());
>>    148                    }
>>    149                    display(suspendsCounts1.toString());
>>    150
>>    151                    display("        eventSet.resume;");
>>    152                    eventSet.resume();
>>    153
>>    154                    display("        getting : Map<String, 
>> Integer> suspendsCounts2");
>>
>> This is where the breakpoint is encountered before the second set of 
>> suspend counts is acquired.
>>
>>    155                    Map<String, Integer> suspendsCounts2 = new 
>> HashMap<String, Integer>();
>>    156                    for (ThreadReference threadReference : 
>> vm.allThreads()) {
>>    157 suspendsCounts2.put(threadReference.name(), 
>> threadReference.suspendCount());
>>    158                    }
>>    159                    display(suspendsCounts2.toString());
>>
>


From yasuenag at gmail.com  Wed Jul 18 12:59:04 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Wed, 18 Jul 2018 21:59:04 +0900
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running in
 Docker containers
In-Reply-To: <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
 <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
 <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>
 <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>
Message-ID: <1bae36e7-3efc-3aef-6a99-324102da2549@gmail.com>

PING:

Could you review it?

    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/

This change has been reviewed by Jini.
We need a Reviewer.


Thanks,

Yasumasa


On 2018/07/12 13:42, Yasumasa Suenaga wrote:
> Thanks Jini,
> 
> I uploaded a new webrev. It adds some comments and removes the extra space.
> 
> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
> 
> 
> Yasumasa
> 
> 
> 
> 2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
>> Hi Yasumasa,
>>
>> This looks good to me except for one nit. And some more comments would help.
>> For example, it would help to say that NSPidMap maps host lwpids to container
>> lwpids.
>>
>> The nit:
>>
>> *
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
>> Line 253: extra space after the parentheses
>>
>> Thanks,
>> Jini.
>>
>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>
>>> PING: Could you review it?
>>>
>>>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Please review this change.
>>>>
>>>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>
>>>> I tried to attach jhsdb to a Java process in a Docker container from the
>>>> container host, but it could not attach.
>>>> jcmd has supported PID namespaces since JDK-8193710, but jhsdb does not yet.
>>>>
>>>> SA gets LWP IDs via the thread stack and functions in libthread_db.so, but they
>>>> return PIDs as seen in the container, which differ from the host's PIDs. So I
>>>> added code to scan /proc/<PID>/task to get all LWP IDs, which are kept in a
>>>> Map in LinuxDebuggerLocal.
>>>>
>>>> Also, SA_ALTROOT is set to /proc/<PID>/root if SA detects that the debuggee
>>>> runs in a container. This helps SA parse binaries in the container.
>>>>
>>>> This change has been pushed to the submit repo, and it failed on OS X
>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>> But I guess that failure is caused by JDK-8205906. This change affects Linux only.
>>>>
>>>> Could you review it?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>
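
The /proc scan described in the original request can be sketched in Java
as follows. This is not the code from the webrev (which changes
LinuxDebuggerLocal and libsaproc); ProcTaskScan and both method names are
hypothetical. The second helper assumes the mapping to container IDs is
read from the NSpid line of /proc/<PID>/task/<LWP>/status, which Linux
lists from the outermost namespace to the innermost; whether the patch
uses that line is an assumption here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not the implementation proposed for JDK-8205992.
final class ProcTaskScan {

    // Lists the numeric entries of <procRoot>/<pid>/task in ascending order.
    // Read from the host's /proc, these are the LWP ids as the host sees them.
    static List<Integer> lwpIds(Path procRoot, long pid) {
        Path taskDir = procRoot.resolve(Long.toString(pid)).resolve("task");
        List<Integer> ids = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(taskDir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (name.chars().allMatch(Character::isDigit)) {
                    ids.add(Integer.parseInt(name));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(ids);
        return ids;
    }

    // Parses a line such as "NSpid:\t4250\t17" and returns the last value,
    // i.e. the id in the innermost PID namespace (assumed: the container).
    static int innermostId(String nspidLine) {
        if (!nspidLine.startsWith("NSpid:")) {
            throw new IllegalArgumentException("not an NSpid line: " + nspidLine);
        }
        String[] fields = nspidLine.substring("NSpid:".length()).trim().split("\\s+");
        return Integer.parseInt(fields[fields.length - 1]);
    }

    public static void main(String[] args) {
        Path proc = Paths.get("/proc");
        if (Files.isDirectory(proc.resolve("self").resolve("task"))) {
            long pid = ProcessHandle.current().pid();
            System.out.println("LWP ids of this JVM: " + lwpIds(proc, pid));
        } else {
            System.out.println("no Linux-style /proc on this system");
        }
    }
}
```

Taking the proc root as a parameter keeps the scan testable against a
fake directory tree.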

From ralf.schmelter at sap.com  Wed Jul 18 15:44:39 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Wed, 18 Jul 2018 15:44:39 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
Message-ID: <f96a16917c934a539523e078f902b880@sap.com>

Hi Chris,

here is an updated webrev: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v3/

I've changed the summary text and the comment according to your suggestion.

The 100M frames is surely overkill for this test. I had seen that the JIT compiler started to inline, leading to less memory needed per frame. But I've never got more than 1M frames even for very big stacks. Therefore I've reduced it in the test to 1M.

When the stack overflow never occurs and callEnded() thus never gets called, the test will fail, because
bpe = resumeTo("Frames2Targ", "callEnded", "()V");
will fail since the VM will exit and never reach the breakpoint. In addition, a message will be written about the missing SOE.

Best regards,
Ralf


-----Original Message-----
From: Chris Plummer [mailto:chris.plummer at oracle.com] 
Sent: Wednesday, July 18, 2018 07:02
To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com; Stuefe, Thomas <thomas.stuefe at sap.com>
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Ralf,

A few comments below, but overall looks good:

    27  * @summary get stack trace for large stacks took too long.

How about "Test that getting the stack trace for a very large stack does 
not take too long".

The max number of frames you'll test for is 100M, but the stack size is 
set to 4m, assuming -Xss works (and I think on some platforms it may 
not). 100M frames seems like overkill for a 4M stack. If the stack was 
nothing more than a frame link pointer on a 32-bit system, you'd only 
have 1M frames, but let's be more realistic than that and say you should 
never have more than 256k frames. Lowering the max number of frames will 
prevent this test from taking a very long time on platforms where -Xss 
has failed.

    65                 // Have some frames be removed before we call again.

Should this be: "Pop some frames so there is room on the stack for the 
println()"

    96         bpe = resumeTo("Frames2Targ", "callEnded", "()V");

What happens if we never get to callEnded()?

thanks,

Chris

On 7/17/18 7:08 AM, Schmelter, Ralf wrote:
> Hi all,
>
> here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/
>
> I've converted the shell based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime).
>
> The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Previously I reused the count variable, but that could be misleading.
>
> Best regards,
> Ralf
>
>
> -----Original Message-----
> From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf
> Sent: Monday, July 9, 2018 16:05
> To: Chris Plummer <chris.plummer at oracle.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
> Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Chris,
>
> thanks for the review.
>
>> What testing have you done?
> I've tested the change by debugging by hand in Eclipse and jdb, and by running the com/sun/jdi jtreg tests and the JDWP JCK tests. Analogous code has been running in the SAP JVM for many years.
>
>
>> How long does this test take to run.
> 15 s according to jtreg.
>
>
>> What happens if for some reason SOE is never thrown? It's not clear to
>> me what the script would do in this case.
> It is treated as passed (which is not ideal).
>
>
>> In answer to the ShellScaffold.sh question, there is already work
>> underway to convert to pure java tests. See JDK-8201652.
> Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done.
>
> Best regards,
> Ralf
>
>
>
> -----Original Message-----
> From: Chris Plummer [mailto:chris.plummer at oracle.com]
> Sent: Friday, July 6, 2018 00:37
> To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> Overall looks good, but I do have a few comments and questions.
>
> Please update the copyright.
>
> What testing have you done?
>
> How long does this test take to run?
>
> What happens if for some reason SOE is never thrown? It's not clear to
> me what the script would do in this case.
> In answer to the ShellScaffold.sh question, there is already work
> underway to convert to pure java tests. See JDK-8201652. I'm not certain
> if it is ok for you to just submit this new shell script, or if it should
> be rewritten in pure java. Most of the work to convert the scripts has
> already been done but was put on hold. Maybe Serguei can comment and
> guide you on how it would be done in java.
>
> thanks,
>
> Chris
>
> On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
>> Hi All,
>>
>> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>>
>> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>>
>> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>>
>> Best regards,
>> Ralf Schmelter
>
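
The difference between quadratic and linear frame retrieval mentioned in
the original request can be illustrated with a small cost model. This is
a simulation only, not the JDWP agent code in ThreadReferenceImpl.c. It
assumes that a per-frame lookup has to walk the stack from the top down to
the requested frame, while a single bulk request walks the stack once into
a buffer. FrameFetchCost and its methods are hypothetical names.

```java
// Cost model only; it does not call JVMTI or inspect any real stack.
final class FrameFetchCost {

    // Assumption: looking up frame i re-walks i + 1 frames from the top,
    // so fetching all frames one by one costs 1 + 2 + ... + depth steps.
    static long perFrameLookups(int depth) {
        long steps = 0;
        for (int i = 0; i < depth; i++) {
            steps += i + 1;
        }
        return steps;
    }

    // A single bulk request walks the stack once and fills a buffer, which
    // costs extra memory proportional to the number of frames.
    static long bulkLookup(int depth) {
        long steps = 0;
        int[] buffer = new int[depth];
        for (int i = 0; i < depth; i++) {
            buffer[i] = i;   // stand-in for storing one frame's data
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        int depth = 100_000;
        System.out.println("per-frame steps: " + perFrameLookups(depth));
        System.out.println("bulk steps:      " + bulkLookup(depth));
    }
}
```

Under this model the per-frame approach needs depth * (depth + 1) / 2
steps, which is why deep stacks near a StackOverflowError make the
difference so visible.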


From jcbeyler at google.com  Wed Jul 18 16:21:19 2018
From: jcbeyler at google.com (JC Beyler)
Date: Wed, 18 Jul 2018 09:21:19 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
In-Reply-To: <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
Message-ID: <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>

Subject Was:
Re: RFR (S): C1 still does eden allocations when TLAB is enabled

+ serviceability-dev

Hi all,

Could anyone else give me a review of this webrev and check/test the
various architecture changes?

http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/


Thanks for all your help!
Jc


On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com> wrote:

> Hi all,
>
> Here is a webrev that does all the architectures in the same way:
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>
> Could anyone review the other architectures and test?
>   - arm, sparc & aarch64 are also modified now to follow the same "if no
> tlab, then consider eden space allocation" logic.
>
> Thanks for your help!
> Jc
>
> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com> wrote:
>
>> Hi Kim,
>>
>> I opened this bug
>> https://bugs.openjdk.java.net/browse/JDK-8190862
>>
>> and now I've done an update:
>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>>
>> I basically have done your nits but also removed the try_eden (it was
>> used to bind a label but was not used). I updated the comments to use the
>> one you preferred.
>>
>> I still have to do the other architectures though but at least we seem to
>> have a consensus on this architecture, correct?
>>
>> Thanks for the review,
>> Jc
>>
>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com>
>> wrote:
>>
>>> > On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com> wrote:
>>> >
>>> > Yes, you are right, I did those changes due to:
>>> > https://bugs.openjdk.java.net/browse/JDK-8194084
>>> >
>>> > If Robbin agrees to this change, and if no one sees an issue, I'll go
>>> ahead
>>> > and propagate the change across architectures.
>>> >
>>> > Thanks for the review, I'll wait for Robbin (or anyone else's comment
>>> and
>>> > review) :)
>>> > Jc
>>> >
>>> > On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com>
>>> wrote:
>>> >
>>> >> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com> wrote:
>>> >>
>>> >>
>>> >> I'm not sure if we had left this case intentionally or not but, if we
>>> want
>>> >> it all to be consistent, we should perhaps fix it.
>>> >>
>>> >>
>>> >> Well, you put in that logic last February, so unless somebody speaks
>>> up
>>> >> quickly, I support your adjusting it to be the way you want it.
>>> >>
>>> >> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share"
>>> >> suggests that the GC group is most active in touching this feature.
>>> >> If Robbin is OK with it, there's your reviewer.
>>> >>
>>> >> FWIW, you can use me as a reviewer, but I'd get one other person
>>> >> working on the GC to OK it.
>>> >>
>>> >> - John
>>> >>
>>> >
>>> >
>>> > --
>>> >
>>> > Thanks,
>>> > Jc
>>>
>>> Robbin is on vacation; you might not hear from him for a while.
>>>
>>> I'm assuming you'll open a new bug for this?
>>>
>>> Except for a few minor nits (below), this looks okay to me.
>>>
>>> The comment at line 1052 needs updating.
>>>
>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>
>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>> line 1058, but unreferenced.
>>>
>>> I like the wording of the comment at 1139 better than the wording at
>>> 1016.
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>
>
> --
>
> Thanks,
> Jc
>


-- 

Thanks,
Jc

From chris.plummer at oracle.com  Wed Jul 18 17:10:36 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Jul 2018 10:10:36 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <f96a16917c934a539523e078f902b880@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
Message-ID: <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>

Hi Ralf,

Looks good.

thanks,

Chris

On 7/18/18 8:44 AM, Schmelter, Ralf wrote:
> Hi Chris,
>
> here is an updated webrev: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v3/
>
> I've changed the summary text and the comment according to your suggestion.
>
> The 100M frames is surely overkill for this test. I had seen that the JIT compiler started to inline, leading to less memory needed per frame. But I've never got more than 1M frames even for very big stacks. Therefore I've reduced it in the test to 1M.
>
> When the stack overflow never occurs and callEnded() thus never gets called, the test will fail, because
> bpe = resumeTo("Frames2Targ", "callEnded", "()V");
> will fail since the VM will exit and never reach the breakpoint. In addition, a message will be written about the missing SOE.
>
> Best regards,
> Ralf
>
>
> -----Original Message-----
> From: Chris Plummer [mailto:chris.plummer at oracle.com]
> Sent: Wednesday, July 18, 2018 07:02
> To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com; Stuefe, Thomas <thomas.stuefe at sap.com>
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> A few comments below, but overall looks good:
>
>    27  * @summary get stack trace for large stacks took too long.
>
> How about "Test that getting the stack trace for a very large stack does
> not take too long".
>
> The max number of frames you'll test for is 100M, but the stack size is
> set to 4m, assuming -Xss works (and I think on some platforms it may
> not). 100M frames seems like overkill for a 4M stack. If the stack was
> nothing more than a frame link pointer on a 32-bit system, you'd only
> have 1M frames, but lets be more realistic than that and say you should
> never have more than 256k frames. Lowering the max number of frames will
> prevent this test from taking a very long time on platforms where -Xss
> has failed.
>
>    65                 // Have some frames be removed before we call again.
>
> Should this be: "Pop some frames so there is room on the stack for the
> println()"
>
>    96         bpe = resumeTo("Frames2Targ", "callEnded", "()V");
>
> What happens if we never get to callEnded()?
>
> thanks,
>
> Chris
>
> On 7/17/18 7:08 AM, Schmelter, Ralf wrote:
>> Hi all,
>>
>> here is an updated webrev at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v2/
>>
>> I've converted the shell based test to a Java one, which had the nice side effect of speeding it up (now takes ~1 second runtime).
>>
>> The fix itself is mainly unchanged, but I've added the variable 'filledIn' to store the number of frames actually filled in by the GetStackTrace call. Previously I reused the count variable, but that could be misleading.
>>
>> Best regards,
>> Ralf
>>
>>
>> -----Original Message-----
>> From: serviceability-dev [mailto:serviceability-dev-bounces at openjdk.java.net] On Behalf Of Schmelter, Ralf
>> Sent: Monday, July 9, 2018 16:05
>> To: Chris Plummer <chris.plummer at oracle.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
>> Subject: [CAUTION] RE: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>>
>> Hi Chris,
>>
>> thanks for the review.
>>
>>> What testing have you done?
>> I've tested the change by debugging by hand in Eclipse and jdb, and by running the com/sun/jdi jtreg tests and the JDWP JCK tests. Analogous code has been running in the SAP JVM for many years.
>>
>>
>>> How long does this test take to run.
>> 15 s according to jtreg.
>>
>>
>>> What happens if for some reason SOE is never thrown? It's not clear to
>>> me what the script would do in this case.
>> It is treated as passed (which is not ideal).
>>
>>
>>> In answer to the ShellScaffold.sh question, there is already work
>>> underway to convert to pure java tests. See JDK-8201652.
>> Ok, then I think it is better to convert the test to a Java TestScaffold test. I will update the webrev when this is done.
>>
>> Best regards,
>> Ralf
>>
>>
>>
>> -----Original Message-----
>> From: Chris Plummer [mailto:chris.plummer at oracle.com]
>> Sent: Friday, July 6, 2018 00:37
>> To: Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; serguei.spitsyn at oracle.com
>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>>
>> Hi Ralf,
>>
>> Overall looks good, but I do have a few comments and questions.
>>
>> Please update the copyright.
>>
>> What testing have you done?
>>
>> How long does this test take to run.
>>
>> What happens if for some reason SOE is never thrown? It's not clear to
>> me what the script would do in this case.
>> In answer to the ShellScaffold.sh question, there is already work
>> underway to convert to pure java tests. See JDK-8201652. I'm not certain
>> if it is ok for you to just submit this new shell script, or if should
>> be rewritten in pure java. Most of the work to convert the scripts has
>> already been done but was put on hold. Maybe Serguei can comment and
>> guide you on how it would be done in java.
>>
>> thanks,
>>
>> Chris
>>
>> On 7/3/18 3:43 AM, Schmelter, Ralf wrote:
>>> Hi All,
>>>
>>> Please review the fix for the bug https://bugs.openjdk.java.net/browse/JDK-8205608 . The webrev is at http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608/ .
>>>
>>> This fixes the quadratic runtime (in the number of frames) of the frames() method, making it linear instead. It uses additional memory proportional to the number of frames on the stack.
>>>
>>> I've included a jtreg test, which would time out in the old implementation (since it takes minutes to get the stack frames). I'm not sure how useful this is.
>>>
>>> Best regards,
>>> Ralf Schmelter
>


From chris.plummer at oracle.com  Wed Jul 18 18:50:49 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Jul 2018 11:50:49 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B4F2A04.20409@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
Message-ID: <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>

Hi Gary,

Who does the resume for the breakpoint event?

        eventHandler.addListener(
             new EventHandler.EventListener() {
                 public boolean eventReceived(Event event) {
                    if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
                        synchronized(eventHandler) {
                            display("Received communication breakpoint event.");
                            bpCount++;
                            eventHandler.notifyAll();
                        }
                        return true;
                    }
                    return false;
                 }
             }
        );

Also:

>   1. On a thread start event the debuggee is suspended, line 141
That's not true for the first ThreadStartEvent since SUSPEND_NONE was used.

Chris

On 7/18/18 4:52 AM, Gary Adams wrote:
> There is nothing wrong with the breakpoint in methodForCommunication.
> The test uses it to make sure the threads are each tested separately.
> The breakpoint eventhandler just displays a message, increments a counter
> and returns.
>
> Let me step through resume008 the debugee to help clarify ...
>
> 1. The test thread is created and the synchronized break point is 
> observed. lines 101-102
> 2. The thread is started. lines 104,135-137
>     2a. The main thread blocks on a local object. lines 133, 139
>     2b. The test thread is started. lines 137,
>            A run entered message is displayed, line 159
>            The main thread lock object is notified, line 167
>           2b1. The main thread continues. line 167, 146
>                   The next test thread is created. line 106
>                   The synchronized breakpoint is observed, line 107
>           2b2. A run exited message is displayed, line 169
>
> On the resume008 debugger side  ...
>   1. On a thread start event the debuggee is suspended, line 141
>   2. Messages are displayed and a first set of thread suspend counts 
> is acquired. lines 143-151
>   3. The threads are resumed, line 152
> --->
>   4.  Messages are displayed and a second set of thread suspend counts 
> is acquired. lines 154-159
>
> The way the test is written the expectation is the debugger steps 
> 2,3,4 will all happen
> while the test thread is running.
>
> When the debugger resumes the debuggee threads (debugger step 3)
> the debuggee continues from where it left off (debuggee steps 2b,2b1,2b2)
>
> If we complete debuggee step 2b1 (line 107) before the debugger 
> completes step 4 line 159,
> then the synchronized breakpoint will suspend the vm and the counts 
> will not match
> for the SUSPEND_NONE test thread start.
>
> resume008a.java:
>
>    100                        case 0:
>    101                                thread0 = new Threadresume008a("thread0");
>    102                                methodForCommunication();
>    103
>    104                                threadStart(thread0);
>    105
>    106                                thread1 = new Threadresume008a("thread1");
>    107                                methodForCommunication();
>    108                                break;
>
>    ...
>    135        static int threadStart(Thread t) {
>    136            synchronized (waitnotifyObj) {
>    137                t.start();
>    138                try {
>    139                    waitnotifyObj.wait();
>    140                } catch ( Exception e) {
>    141                    exitCode = FAILED;
>    142                    logErr("       Exception : " + e );
>    143                    return FAILED;
>    144                }
>    145            }
>    146            return PASSED;
>    147        }
>
>    149        static class Threadresume008a extends Thread {
>    ...
>    157
>    158            public void run() {
>    159                log1("  'run': enter  :: threadName == " + tName);
>
> This is the proposed fix that will let the debugger complete its second
> acquisition of suspend counts while the test thread is still running.
>
>    160                // Yield, so the start thread event processing can be completed.
>    161                try {
>    162                    Thread.sleep(100);
>    163                } catch (InterruptedException e) {
>    164                    // ignored
>    165                }
>
>    166                synchronized (waitnotifyObj) {
>    167                        waitnotifyObj.notify();
>    168                }
>    169                log1("  'run': exit   :: threadName == " + tName);
>    170                return;
>    171            }
>    172        }
>    150
>    151            String tName = null;
>    152
>    153            public Threadresume008a(String threadName) {
>    154                super(threadName);
>    155                tName = threadName;
>    156            }
>    157
>    158            public void run() {
>    159                log1("  'run': enter  :: threadName == " + tName);
>    160                // Yield, so the start thread event processing can be completed.
>    161                try {
>    162                    Thread.sleep(100);
>    163                } catch (InterruptedException e) {
>    164                    // ignored
>    165                }
>    166                synchronized (waitnotifyObj) {
>    167                        waitnotifyObj.notify();
>    168                }
>    169                log1("  'run': exit   :: threadName == " + tName);
>    170                return;
>    171            }
>    172        }
>
>
>
> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>> Hi Gary,
>>
>> I've been having trouble following the control flow of this test. One 
>> thing I've stumbled across is the following:
>>
>>             /* A debuggee class must define 'methodForCommunication'
>>              * method and invoke it in points of synchronization
>>              * with a debugger.
>>              */
>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>
>> So why isn't this mode of synchronization good enough? Is it because 
>> it was not designed with the understanding that the debugger might be 
>> doing suspended thread counts, and suspending all threads at the 
>> breakpoint messes up the test?
>>
>> From what I can tell of the test, after the debuggee is started and 
>> hits the default breakpoint at the start of main(), the debugger then 
>> does a vm.resume() at the start of the for loop in the runTest() 
>> method. The debuggee then creates a thread and calls 
>> methodForCommunication(). There is already a breakpoint set there by 
>> the above debuggee code. It's unclear to me what happens as a result 
>> of this breakpoint and how it serves the test. Also unclear to me who 
>> is responsible for the vm.resume() after the breakpoint is hit.
>>
>> The debugger then requests all ThreadStart events, requesting that no 
>> threads be disabled when it is sent. I think you are saying that when 
>> the ThreadStart event comes in, sometimes we are at the 
>> methodForCommunication breakpoint, with all threads disabled, and 
>> this messes up the thread suspend counts. You want to delay 100ms so 
>> the breakpoint event can be processed and threads resumed again 
>> (although I can't see who actually resumes the thread after hitting 
>> the methodForCommunication breakpoint).
>>
>> Chris
>>
>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>> A race condition exists between the debugger and the debuggee.
>>>
>>> The first test thread is started with SUSPEND_NONE policy set.
>>> While processing the thread start event the debugger captures
>>> an initial set of thread suspend counts and resumes the
>>> debuggee vm. If the debuggee advances quickly it reaches
>>> the breakpoint set for methodForCommunication. Since the breakpoint
>>> carries with it SUSPEND_ALL policy, when the debugger captures a second
>>> set of suspend counts, it will not match the expected counts for
>>> a SUSPEND_NONE scenario.
>>>
>>> The proposed fix introduces a yield in the debuggee test thread run 
>>> method
>>> to allow the debugger to get the expected sampled values.
>>>
>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>
>>>
>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>>> ...
>>>    186        private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
>>>    187            Method method = debuggee.methodByName(refType, methodName);
>>>    188            Location location = null;
>>>    189            try {
>>>    190                location = method.allLineLocations().get(0);
>>>    191            } catch (AbsentInformationException e) {
>>>    192                throw new Failure(e);
>>>    193            }
>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>    195
>>>
>>>    196            bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>
>>>    197            bpRequest.putProperty("number", "zero");
>>>    198            bpRequest.enable();
>>>    199
>>>    200            eventHandler.addListener(
>>>    201                 new EventHandler.EventListener() {
>>>    202                     public boolean eventReceived(Event event) {
>>>    203                        if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>    204                            synchronized(eventHandler) {
>>>    205                                display("Received communication breakpoint event.");
>>>    206                                bpCount++;
>>>    207                                eventHandler.notifyAll();
>>>    208                            }
>>>    209                            return true;
>>>    210                        }
>>>    211                        return false;
>>>    212                     }
>>>    213                 }
>>>    214            );
>>>    215        }
>>>
>>>
>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
>>> ...
>>>    140                    display("......--> vm.suspend();");
>>>    141                    vm.suspend();
>>>    142
>>>    143                    display("        getting : Map<String, Integer> suspendsCounts1");
>>>    144
>>>    145                    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
>>>    146                    for (ThreadReference threadReference : vm.allThreads()) {
>>>    147                        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
>>>    148                    }
>>>    149                    display(suspendsCounts1.toString());
>>>    150
>>>    151                    display("        eventSet.resume;");
>>>    152                    eventSet.resume();
>>>    153
>>>    154                    display("        getting : Map<String, Integer> suspendsCounts2");
>>>
>>> This is where the breakpoint is encountered before the second set of 
>>> suspend counts is acquired.
>>>
>>>    155                    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
>>>    156                    for (ThreadReference threadReference : vm.allThreads()) {
>>>    157                        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
>>>    158                    }
>>>    159                    display(suspendsCounts2.toString());
>>>
>>
>


From gary.adams at oracle.com  Wed Jul 18 19:45:03 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Wed, 18 Jul 2018 15:45:03 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
Message-ID: <5B4F98BF.1060602@oracle.com>

Answers below  ...

On 7/18/18, 2:50 PM, Chris Plummer wrote:
> Hi Gary,
>
> Who does the resume for the breakpoint event?
>
>         eventHandler.addListener(
>              new EventHandler.EventListener() {
>                  public boolean eventReceived(Event event) {
>                     if (event instanceof BreakpointEvent && 
> bpRequest.equals(event.request())) {
>                         synchronized(eventHandler) {
>                             display("Received communication breakpoint 
> event.");
>                             bpCount++;
>                             eventHandler.notifyAll();
>                         }
>                         return true;
>                     }
>                     return false;
>                  }
>              }
>         );
I believe you are looking for this sequence.
At the top of the loop, shouldRunAfterBreakpoint() is checked to
decide whether the test keeps running. Lines 96-99 are the early
termination path. The vm.resume() at the bottom of the loop,
line 240, is the normal path that continues the test with the
next case.

resume008.java :
...
     94            for (int i = 0; ; i++) {
     95

     96                if (!shouldRunAfterBreakpoint()) {
     97                    vm.resume();
     98                    break;
     99                }

    100
    101
    102                display(":::::: case: # " + i);
    103
    104                switch (i) {
    105
    106                    case 0:
    107                    eventRequest = settingThreadStartRequest (
    108                                           SUSPEND_NONE,   
"ThreadStartRequest1");
...
   238
    239                display("......--> vm.resume()");
    240                vm.resume();
    241            }
>
> Also:
>
>>   1. On a thread start event the debugee is suspended, line 141 
> That's not true for the first ThreadStartEvent since SUSPEND_NONE was 
> used.
The thread start event is set to SUSPEND_NONE for thread0, but when
the thread start event is observed the resume008 test suspends the vm
immediately after fetching the "number" property.

    132                if ( !(newEvent instanceof ThreadStartEvent)) {
    133                    setFailedStatus("ERROR: new event is not 
ThreadStartEvent");
    134                } else {
    135
    136                    String property = (String) 
newEvent.request().getProperty("number");
    137                    display("       got new ThreadStartEvent with 
propety 'number' == " + property);
    138
    139                    display("......checking up on 
EventSet.resume()");
    140                    display("......--> vm.suspend();");
    141                    vm.suspend();
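
To make the bookkeeping concrete, here is a small in-process model of the suspend counts. It is only a sketch and does not use JDI. The class and method names are made up for illustration, and it assumes the usual JDI semantics: vm.suspend() bumps every thread's count by one, EventSet.resume() is a no-op for a SUSPEND_NONE event set, and a SUSPEND_ALL breakpoint bumps every count again.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of per-thread suspend counts; not part of the nsk test library.
public class SuspendCountModel {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    public SuspendCountModel(String... threadNames) {
        for (String name : threadNames) {
            counts.put(name, 0);
        }
    }

    // Models vm.suspend(): every thread's suspend count goes up by one.
    public void suspendAll() {
        counts.replaceAll((name, count) -> count + 1);
    }

    // Models eventSet.resume() for a SUSPEND_NONE event set: nothing to undo.
    public void resumeSuspendNoneEventSet() {
        // intentionally empty
    }

    // Models the debuggee hitting a SUSPEND_ALL breakpoint.
    public void hitSuspendAllBreakpoint() {
        suspendAll();
    }

    // Copy of the current counts, like the suspendsCounts1/2 maps in the test.
    public Map<String, Integer> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    public static void main(String[] args) {
        SuspendCountModel vm = new SuspendCountModel("main", "thread0");
        vm.suspendAll();
        Map<String, Integer> counts1 = vm.snapshot();
        vm.resumeSuspendNoneEventSet();
        // Expected ordering: the second sample is taken before the breakpoint.
        System.out.println("match without breakpoint: " + counts1.equals(vm.snapshot()));
        // Racy ordering: the breakpoint is hit before the second sample.
        vm.hitSuspendAllBreakpoint();
        System.out.println("match after breakpoint:   " + counts1.equals(vm.snapshot()));
    }
}
```

In this model the two samples agree only when the second one is taken before the breakpoint is reached, which is the ordering the test depends on.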


>
> Chris
>
> On 7/18/18 4:52 AM, Gary Adams wrote:
>> There is nothing wrong with the breakpoint in methodForCommunication.
>> The test uses it to make sure the threads are each tested separately.
>> The breakpoint eventhandler just displays a message, increments a 
>> counter
>> and returns.
>>
>> Let me step through resume008a the debugee to help clarify ...
>>
>> 1. The test thread is created and the synchronized break point is 
>> observed. lines 101-102
>> 2. The thread is started. lines 104,135-137
>>     2a. The main thread blocks on a local object. lines 133, 139
>>     2b. The test thread is started. lines 137,
>>            A run entered message is displayed, line 159
>>            The main thread lock object is notified, line 167
>>           2b1. The main thread continues. line 167, 146
>>                   The next test thread is created. line 106
>>                   The synchronized breakpoint is observed, line 107
>>           2b2. A run exited message is displayed, line 169
>>
>> On the resume008 debugger side  ...
>>   1. On a thread start event the debugee is suspended, line 141
>>   2. Messages are displayed and a first set of thread suspend counts 
>> is acquired. lines 143-151
>>   3. The threads are resumed, line 152
>> --->
>>   4.  Messages are displayed and a second set of thread suspend 
>> counts is acquired. lines 154-159
>>
>> The way the test is written the expectation is the debugger steps 
>> 2,3,4 will all happen
>> while the test thread is running.
>>
>> When the debugger resumes the debuggee threads (debugger step 3)
>> the debuggee continues from where it left off (debuggee steps 
>> 2b,2b1,2b2)
>>
>> If we complete debuggee step 2b1 (line 107) before the debugger 
>> completes step 4 line 159,
>> then the synchronized breakpoint will suspend the vm and the counts 
>> will not match
>> for the SUSPEND_NONE test thread start.
>>
>> resume008a.java:
>>
>>    100                        case 0:
>>    101                                thread0 = new 
>> Threadresume008a("thread0");
>>    102                                methodForCommunication();
>>    103
>>    104                                threadStart(thread0);
>>    105
>>    106                                thread1 = new 
>> Threadresume008a("thread1");
>>    107                                methodForCommunication();
>>    108                                break;
>>
>>    ...
>>    135        static int threadStart(Thread t) {
>>    136            synchronized (waitnotifyObj) {
>>    137                t.start();
>>    138                try {
>>    139                    waitnotifyObj.wait();
>>    140                } catch ( Exception e) {
>>    141                    exitCode = FAILED;
>>    142                    logErr("       Exception : " + e );
>>    143                    return FAILED;
>>    144                }
>>    145            }
>>    146            return PASSED;
>>    147        }
>>
>>    149        static class Threadresume008a extends Thread {
>>    ...
>>    157
>>    158            public void run() {
>>    159                log1("  'run': enter  :: threadName == " + tName);
>>
>> This is the proposed fix that will let the debugger complete its second
>> acquisition of suspend counts while the test thread is still running.
>>
>>    160                // Yield, so the start thread event processing 
>> can be completed.
>>    161                try {
>>    162                    Thread.sleep(100);
>>    163                } catch (InterruptedException e) {
>>    164                    // ignored
>>    165                }
>>
>>    166                synchronized (waitnotifyObj) {
>>    167                        waitnotifyObj.notify();
>>    168                }
>>    169                log1("  'run': exit   :: threadName == " + tName);
>>    170                return;
>>    171            }
>>    172        }
>>    150
>>    151            String tName = null;
>>    152
>>    153            public Threadresume008a(String threadName) {
>>    154                super(threadName);
>>    155                tName = threadName;
>>    156            }
>>    157
>>    158            public void run() {
>>    159                log1("  'run': enter  :: threadName == " + tName);
>>    160                // Yield, so the start thread event processing 
>> can be completed.
>>    161                try {
>>    162                    Thread.sleep(100);
>>    163                } catch (InterruptedException e) {
>>    164                    // ignored
>>    165                }
>>    166                synchronized (waitnotifyObj) {
>>    167                        waitnotifyObj.notify();
>>    168                }
>>    169                log1("  'run': exit   :: threadName == " + tName);
>>    170                return;
>>    171            }
>>    172        }
>>
>>
>>
>> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> I've been having trouble following the control flow of this test. 
>>> One thing I've stumbled across is the following:
>>>
>>>             /* A debuggee class must define 'methodForCommunication'
>>>              * method and invoke it in points of synchronization
>>>              * with a debugger.
>>>              */
>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>>
>>> So why isn't this mode of synchronization good enough? Is it because 
>>> it was not designed with the understanding that the debugger might 
>>> be doing suspended thread counts, and suspending all threads at the 
>>> breakpoint messes up the test?
>>>
>>> From what I can tell of the test, after the debuggee is started and 
>>> hits the default breakpoint at the start of main(), the debugger 
>>> then does a vm.resume() at the start of the for loop in the 
>>> runTest() method. The debuggee then creates a thread and calls 
>>> methodForCommunication(). There is already a breakpoint set there by 
>>> the above debuggee code. It's unclear to me what happens as a result 
>>> of this breakpoint and how it serves the test. Also unclear to me 
>>> who is responsible for the vm.resume() after the breakpoint is hit.
>>>
>>> The debugger then requests all ThreadStart events, requesting that 
>>> no threads be disabled when it is sent. I think you are saying that 
>>> when the ThreadStart event comes in, sometimes we are at the 
>>> methodForCommunication breakpoint, with all threads disabled, and 
>>> this messes up the thread suspend counts. You want to delay 100ms so 
>>> the breakpoint event can be processed and threads resumed again 
>>> (although I can't see who actually resumes the thread after hitting 
>>> the methodForCommunication breakpoint).
>>>
>>> Chris
>>>
>>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>>> A race condition exists between the debugger and the debuggee.
>>>>
>>>> The first test thread is started with SUSPEND_NONE policy set.
>>>> While processing the thread start event the debugger captures
>>>> an initial set of thread suspend counts and resumes the
>>>> debuggee vm. If the debuggee advances quickly it reaches
>>>> the breakpoint set for methodForCommunication. Since the breakpoint
>>>> carries with it SUSPEND_ALL policy, when the debugger captures a 
>>>> second
>>>> set of suspend counts, it will not match the expected counts for
>>>> a SUSPEND_NONE scenario.
>>>>
>>>> The proposed fix introduces a yield in the debuggee test thread run 
>>>> method
>>>> to allow the debugger to get the expected sampled values.
>>>>
>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>>
>>>>
>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>>>> ...
>>>>    186        private void setCommunicationBreakpoint(ReferenceType 
>>>> refType, String methodName) {
>>>>    187            Method method = debuggee.methodByName(refType, 
>>>> methodName);
>>>>    188            Location location = null;
>>>>    189            try {
>>>>    190                location = method.allLineLocations().get(0);
>>>>    191            } catch (AbsentInformationException e) {
>>>>    192                throw new Failure(e);
>>>>    193            }
>>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>>    195
>>>>
>>>>    196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>>
>>>>    197            bpRequest.putProperty("number", "zero");
>>>>    198            bpRequest.enable();
>>>>    199
>>>>    200            eventHandler.addListener(
>>>>    201                 new EventHandler.EventListener() {
>>>>    202                     public boolean eventReceived(Event event) {
>>>>    203                        if (event instanceof BreakpointEvent 
>>>> && bpRequest.equals(event.request())) {
>>>>    204                            synchronized(eventHandler) {
>>>>    205                                display("Received 
>>>> communication breakpoint event.");
>>>>    206                                bpCount++;
>>>>    207 eventHandler.notifyAll();
>>>>    208                            }
>>>>    209                            return true;
>>>>    210                        }
>>>>    211                        return false;
>>>>    212                     }
>>>>    213                 }
>>>>    214            );
>>>>    215        }
>>>>
>>>>
>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
>>>> ...
>>>>    140                    display("......--> vm.suspend();");
>>>>    141                    vm.suspend();
>>>>    142
>>>>    143                    display("        getting : Map<String, 
>>>> Integer> suspendsCounts1");
>>>>    144
>>>>    145                    Map<String, Integer> suspendsCounts1 = 
>>>> new HashMap<String, Integer>();
>>>>    146                    for (ThreadReference threadReference : 
>>>> vm.allThreads()) {
>>>>    147 suspendsCounts1.put(threadReference.name(), 
>>>> threadReference.suspendCount());
>>>>    148                    }
>>>>    149                    display(suspendsCounts1.toString());
>>>>    150
>>>>    151                    display("        eventSet.resume;");
>>>>    152                    eventSet.resume();
>>>>    153
>>>>    154                    display("        getting : Map<String, 
>>>> Integer> suspendsCounts2");
>>>>
>>>> This is where the breakpoint is encountered before the second set 
>>>> of suspend counts is acquired.
>>>>
>>>>    155                    Map<String, Integer> suspendsCounts2 = 
>>>> new HashMap<String, Integer>();
>>>>    156                    for (ThreadReference threadReference : 
>>>> vm.allThreads()) {
>>>>    157 suspendsCounts2.put(threadReference.name(), 
>>>> threadReference.suspendCount());
>>>>    158                    }
>>>>    159                    display(suspendsCounts2.toString());
>>>>
>>>
>>
>


From chris.plummer at oracle.com  Wed Jul 18 20:47:09 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Jul 2018 13:47:09 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B4F98BF.1060602@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
Message-ID: <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>

Hi Gary

Ok, so shouldRunAfterBreakpoint() is the code that does the 
eventHandler.wait(), so it gets the eventHandler.notifyAll() 
notification from the BreakpointEvent handler.
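
As I understand it, the pattern is roughly the following. This is a stripped-down sketch, not the actual TestDebuggerType1 code: the class name BreakpointGate and both method names are hypothetical, and the timeout handling is my own assumption.

```java
// Hypothetical stand-in for the eventHandler/bpCount handshake; not the nsk code.
public class BreakpointGate {
    private int bpCount = 0;

    // Called from the event listener thread when a breakpoint event arrives.
    public synchronized void eventReceived() {
        bpCount++;
        notifyAll();
    }

    // Blocks until a breakpoint event is pending or the timeout expires.
    // Returns true and consumes one event if one arrived in time.
    public synchronized boolean awaitBreakpoint(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            // Loop on the condition so spurious wakeups are harmless.
            while (bpCount == 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;
                }
                wait(remaining);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        bpCount--;
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        BreakpointGate gate = new BreakpointGate();
        Thread listener = new Thread(gate::eventReceived, "listener");
        listener.start();
        System.out.println("event seen: " + gate.awaitBreakpoint(5000));
        listener.join();
        System.out.println("second event seen: " + gate.awaitBreakpoint(50));
    }
}
```

Because the waiter checks the counter rather than just waiting for a notification, it does not matter whether the event arrives before or after the wait starts.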

And as a side note, I see now that resumption of execution after the 
breakpoint at main() is done by:

             // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
             display("RESUME DEBUGGEE VM");
             vm.resume();

             testRun();

shouldRunAfterBreakpoint() is returning true until the end of the test 
when the debuggee executes "instruction = end". That's why runTests() 
does a "break" when shouldRunAfterBreakpoint() returns false. So this 
means the code that is checking shouldRunAfterBreakpoint() is not 
resuming execution for the first few (probably 3) 
methodForCommunication() breakpoints. However, it does make sure that 
runTests() blocks until the BreakpointEvent has been processed.

You point out the vm.resume() at the bottom of the loop in runTests(), 
but that's only after a bunch of ThreadStartEvent processing above it 
has been done already. The ThreadStartEvent would never get generated if 
there was not a resume at some point earlier. I think it is happening 
during the eventHandler.waitForRequestedEventSet() call, which does a 
vm.resume().

So if I understand the order of things now:

-shouldRunAfterBreakpoint() returns after first methodForCommunication() 
is hit. At this point we know the first thread has been created, but no 
attempt to start it yet. The debuggee is suspended at this point.
-runTests() requests ThreadStartEvents with SUSPEND_NONE. This also does 
a vm.resume().
-The debuggee starts the thread and then does another 
methodForCommunication() (this 2nd one is actually after the 2nd thread 
has been created, but not yet started). Now we have a race. Do we get 
the ThreadStartEvent first or the BreakpointEvent? The race exists because 
when the ThreadStartEvent is generated, the thread is not suspended due to 
SUSPEND_NONE. Even if the ThreadStartEvent comes in first, the async 
handling of the BreakpointEvent can cause problems during the 
ThreadStartEvent processing.
-You added a 100ms delay after the thread has started, but before 
methodForCommunication(), hoping it will make it so the ThreadStartEvent 
can be received and fully processed before the BreakpointEvent is.

I think it would be preferable to fix this by doing better 
synchronization. After all, that is the approach the test originally 
took. It could have been written with a bunch of sleep() delays instead, 
but that in general is not a very good approach.
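
To illustrate the difference, here is a minimal in-process sketch of an explicit handshake replacing a fixed sleep. It is only an analogy: in the real test the debugger and debuggee are separate processes and would have to synchronize through JDI events such as the methodForCommunication() breakpoint, not a latch. The class name and the "sampling" step are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: the worker proceeds only after the observer has sampled.
public class HandshakeSketch {

    // Returns true if the worker moved on only after sampling had finished.
    public static boolean runOnce() {
        CountDownLatch started = new CountDownLatch(1);  // worker -> observer
        CountDownLatch sampled = new CountDownLatch(1);  // observer -> worker
        AtomicBoolean samplingDone = new AtomicBoolean(false);
        AtomicBoolean orderOk = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            started.countDown();                         // announce "I am running"
            try {
                // Wait for the observer instead of sleeping a fixed 100 ms.
                if (sampled.await(5, TimeUnit.SECONDS)) {
                    orderOk.set(samplingDone.get());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "worker");
        worker.start();

        try {
            if (!started.await(5, TimeUnit.SECONDS)) {
                return false;
            }
            samplingDone.set(true);                      // stands in for reading suspend counts
            sampled.countDown();                         // release the worker
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return orderOk.get();
    }

    public static void main(String[] args) {
        System.out.println("worker waited for the sample: " + runOnce());
    }
}
```

With a sleep the ordering is merely likely. With the handshake the worker cannot pass the second latch until the observer has finished, regardless of machine load.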

What if you added a shouldRunAfterBreakpoint() call after the 
ThreadStartEvent arrives? At this point you would know that the vm is 
suspended due to the breakpoint, so no need for:

                 display("......checking up on EventSet.resume()");
                 display("......--> vm.suspend();");
                 vm.suspend();

You might then also need to add another methodForCommunication() call at 
the end of case 0 and 1 in the debuggee, although I think you could 
instead just change the shouldRunAfterBreakpoint() at the start of the 
loop. I think that check actually belongs at the end of the loop, and 
only for case 2. In fact it would be an error if 
shouldRunAfterBreakpoint() did not return true in that case. Then you 
also need to add a shouldRunAfterBreakpoint() at the start of case 0 to 
get things rolling (and I think at the start of case 1 also).

Chris


On 7/18/18 12:45 PM, Gary Adams wrote:
> Answers below  ...
>
> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>> Hi Gary,
>>
>> Who does the resume for the breakpoint event?
>>
>>         eventHandler.addListener(
>>              new EventHandler.EventListener() {
>>                  public boolean eventReceived(Event event) {
>>                     if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>                         synchronized(eventHandler) {
>>                             display("Received communication breakpoint event.");
>>                             bpCount++;
>>                             eventHandler.notifyAll();
>>                         }
>>                         return true;
>>                     }
>>                     return false;
>>                  }
>>              }
>>         );
> I believe you are looking for this sequence.
> At the top of the loop a check is made if
> resume() should be called "shouldRunAfterBreakpoint".
> lines 96-99 is an early termination. And at the
> bottom of the loop, line 240, is the normal
> continue the test to the next case.
>
> resume008.java :
> ...
>      94            for (int i = 0; ; i++) {
>      95
>
>      96                if (!shouldRunAfterBreakpoint()) {
>      97                    vm.resume();
>      98                    break;
>      99                }
>
>     100
>     101
>     102                display(":::::: case: # " + i);
>     103
>     104                switch (i) {
>     105
>     106                    case 0:
>     107                    eventRequest = settingThreadStartRequest (
>     108                                           SUSPEND_NONE, "ThreadStartRequest1");
> ...
>     238
>     239                display("......--> vm.resume()");
>     240                vm.resume();
>     241            }
>>
>> Also:
>>
>>>   1. On a thread start event the debuggee is suspended, line 141
>> That's not true for the first ThreadStartEvent since SUSPEND_NONE was 
>> used.
> The thread start event is set to SUSPEND_NONE for thread0, but when
> the thread start event is observed the resume008 test suspends the vm
> immediately after fetching the "number" property.
My point is that the debuggee continues to run after the 
ThreadStartEvent is sent, and relies on the debugger to suspend it after 
receiving the event. In the meantime the debuggee sometimes advances to 
the next breakpoint, which is the bug you are seeing.
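
The two orderings can be written down as a toy model. This is plain 
Java, not JDI and not code from the test: the class SuspendCountModel, 
its methods and the three thread names are made up for illustration. 
The debugger steps are replayed the way they are described in this 
thread; the model does not try to reproduce the exact JDI semantics of 
EventSet.resume().

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of per-thread suspend counts. Hypothetical, not JDI. */
public class SuspendCountModel {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    SuspendCountModel(String... threadNames) {
        for (String name : threadNames) {
            counts.put(name, 0);
        }
    }

    /** Models a suspend-all: every thread's count goes up by one. */
    void suspendAll() {
        counts.replaceAll((name, c) -> c + 1);
    }

    /** Models a resume-all: every positive count goes down by one. */
    void resumeAll() {
        counts.replaceAll((name, c) -> Math.max(0, c - 1));
    }

    /** Returns a copy of the current counts. */
    Map<String, Integer> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    /**
     * Replays the debugger steps and returns the second snapshot.
     * When breakpointWins is true, the SUSPEND_ALL breakpoint lands
     * between the resume and the second snapshot.
     */
    public static Map<String, Integer> secondSnapshot(boolean breakpointWins) {
        SuspendCountModel vm =
                new SuspendCountModel("main", "thread0", "Common-Cleaner");
        vm.suspendAll();          // debugger: vm.suspend()
        vm.snapshot();            // debugger: first set of counts
        vm.resumeAll();           // debugger: resume
        if (breakpointWins) {
            vm.suspendAll();      // debuggee: reaches methodForCommunication()
        }
        return vm.snapshot();     // debugger: second set of counts
    }

    public static void main(String[] args) {
        System.out.println("debugger wins:   " + secondSnapshot(false));
        System.out.println("breakpoint wins: " + secondSnapshot(true));
    }
}
```

In the model the second snapshot differs by one for every thread when 
the breakpoint wins, which is the kind of mismatch the test reports.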
>
>    132                if ( !(newEvent instanceof ThreadStartEvent)) {
>    133                    setFailedStatus("ERROR: new event is not ThreadStartEvent");
>    134                } else {
>    135
>    136                    String property = (String) newEvent.request().getProperty("number");
>    137                    display("       got new ThreadStartEvent with propety 'number' == " + property);
>    138
>    139                    display("......checking up on EventSet.resume()");
>    140                    display("......--> vm.suspend();");
>    141                    vm.suspend();
>
>
>>
>> Chris
>>
>> On 7/18/18 4:52 AM, Gary Adams wrote:
>>> There is nothing wrong with the breakpoint in methodForCommunication.
>>> The test uses it to make sure the threads are each tested separately.
>>> The breakpoint eventhandler just displays a message, increments a 
>>> counter
>>> and returns.
>>>
>>> Let me step through resume008a, the debuggee, to help clarify ...
>>>
>>> 1. The test thread is created and the synchronized breakpoint is
>>> observed. lines 101-102
>>> 2. The thread is started. lines 104,135-137
>>>     2a. The main thread blocks on a local object. lines 133, 139
>>>     2b. The test thread is started. line 137
>>>            A run entered message is displayed, line 159
>>>            The main thread lock object is notified, line 167
>>>           2b1. The main thread continues. lines 167, 146
>>>                   The next test thread is created. line 106
>>>                   The synchronized breakpoint is observed, line 107
>>>           2b2. A run exited message is displayed, line 169
>>>
>>> On the resume008 debugger side ...
>>>   1. On a thread start event the debuggee is suspended, line 141
>>>   2. Messages are displayed and a first set of thread suspend counts
>>> is acquired. lines 143-151
>>>   3. The threads are resumed, line 152
>>> --->
>>>   4. Messages are displayed and a second set of thread suspend
>>> counts is acquired. lines 154-159
>>>
>>> The way the test is written, the expectation is that debugger steps
>>> 2, 3 and 4 will all happen while the test thread is running.
>>>
>>> When the debugger resumes the debuggee threads (debugger step 3),
>>> the debuggee continues from where it left off (debuggee steps
>>> 2b, 2b1, 2b2).
>>>
>>> If debuggee step 2b1 (line 107) completes before the debugger
>>> completes step 4 (line 159), then the synchronized breakpoint will
>>> suspend the vm and the counts will not match for the SUSPEND_NONE
>>> test thread start.
>>>
>>> resume008a.java:
>>>
>>>    100                        case 0:
>>>    101                                thread0 = new Threadresume008a("thread0");
>>>    102                                methodForCommunication();
>>>    103
>>>    104                                threadStart(thread0);
>>>    105
>>>    106                                thread1 = new Threadresume008a("thread1");
>>>    107                                methodForCommunication();
>>>    108                                break;
>>>
>>>    ...
>>>    135        static int threadStart(Thread t) {
>>>    136            synchronized (waitnotifyObj) {
>>>    137                t.start();
>>>    138                try {
>>>    139                    waitnotifyObj.wait();
>>>    140                } catch ( Exception e) {
>>>    141                    exitCode = FAILED;
>>>    142                    logErr("       Exception : " + e );
>>>    143                    return FAILED;
>>>    144                }
>>>    145            }
>>>    146            return PASSED;
>>>    147        }
>>>
>>>    149        static class Threadresume008a extends Thread {
>>>    150
>>>    151            String tName = null;
>>>    152
>>>    153            public Threadresume008a(String threadName) {
>>>    154                super(threadName);
>>>    155                tName = threadName;
>>>    156            }
>>>    157
>>>    158            public void run() {
>>>    159                log1("  'run': enter  :: threadName == " + tName);
>>>
>>> This is the proposed fix that will let the debugger complete its
>>> second acquisition of suspend counts while the test thread is still
>>> running.
>>>
>>>    160                // Yield, so the start thread event processing can be completed.
>>>    161                try {
>>>    162                    Thread.sleep(100);
>>>    163                } catch (InterruptedException e) {
>>>    164                    // ignored
>>>    165                }
>>>
>>>    166                synchronized (waitnotifyObj) {
>>>    167                        waitnotifyObj.notify();
>>>    168                }
>>>    169                log1("  'run': exit   :: threadName == " + tName);
>>>    170                return;
>>>    171            }
>>>    172        }
>>>
>>>
>>>
>>> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>>>> Hi Gary,
>>>>
>>>> I've been having trouble following the control flow of this test. 
>>>> One thing I've stumbled across is the following:
>>>>
>>>>             /* A debuggee class must define 'methodForCommunication'
>>>>              * method and invoke it in points of synchronization
>>>>              * with a debugger.
>>>>              */
>>>>             setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>>>
>>>> So why isn't this mode of synchronization good enough? Is it 
>>>> because it was not designed with the understanding that the 
>>>> debugger might be doing suspended thread counts, and suspending all 
>>>> threads at the breakpoint messes up the test?
>>>>
>>>> From what I can tell of the test, after the debuggee is started and 
>>>> hits the default breakpoint at the start of main(), the debugger 
>>>> then does a vm.resume() at the start of the for loop in the 
>>>> runTest() method. The debuggee then creates a thread and calls 
>>>> methodForCommunication(). There is already a breakpoint set there 
>>>> by the above debuggee code. It's unclear to me what happens as a 
>>>> result of this breakpoint and how it serves the test. Also unclear 
>>>> to me who is responsible for the vm.resume() after the breakpoint 
>>>> is hit.
>>>>
>>>> The debugger then requests all ThreadStart events, requesting that 
>>>> no threads be disabled when it is sent. I think you are saying that 
>>>> when the ThreadStart event comes in, sometimes we are at the 
>>>> methodForCommunication breakpoint, with all threads disabled, and 
>>>> this messes up the thread suspend counts. You want to delay 100ms 
>>>> so the breakpoint event can be processed and threads resumed again 
>>>> (although I can't see who actually resumes the thread after hitting 
>>>> the methodForCommunication breakpoint).
>>>>
>>>> Chris
>>>>
>>>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>>>> A race condition exists between the debugger and the debuggee.
>>>>>
>>>>> The first test thread is started with SUSPEND_NONE policy set.
>>>>> While processing the thread start event the debugger captures
>>>>> an initial set of thread suspend counts and resumes the
>>>>> debuggee vm. If the debuggee advances quickly it reaches
>>>>> the breakpoint set for methodForCommunication. Since the breakpoint
>>>>> carries with it SUSPEND_ALL policy, when the debugger captures a 
>>>>> second
>>>>> set of suspend counts, it will not match the expected counts for
>>>>> a SUSPEND_NONE scenario.
>>>>>
>>>>> The proposed fix introduces a yield in the debuggee test thread 
>>>>> run method
>>>>> to allow the debugger to get the expected sampled values.
>>>>>
>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>>>
>>>>>
>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>>>>> ...
>>>>>    186        private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
>>>>>    187            Method method = debuggee.methodByName(refType, methodName);
>>>>>    188            Location location = null;
>>>>>    189            try {
>>>>>    190                location = method.allLineLocations().get(0);
>>>>>    191            } catch (AbsentInformationException e) {
>>>>>    192                throw new Failure(e);
>>>>>    193            }
>>>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>>>    195
>>>>>
>>>>>    196            bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>>>
>>>>>    197            bpRequest.putProperty("number", "zero");
>>>>>    198            bpRequest.enable();
>>>>>    199
>>>>>    200            eventHandler.addListener(
>>>>>    201                 new EventHandler.EventListener() {
>>>>>    202                     public boolean eventReceived(Event event) {
>>>>>    203                        if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>    204                            synchronized(eventHandler) {
>>>>>    205                                display("Received communication breakpoint event.");
>>>>>    206                                bpCount++;
>>>>>    207                                eventHandler.notifyAll();
>>>>>    208                            }
>>>>>    209                            return true;
>>>>>    210                        }
>>>>>    211                        return false;
>>>>>    212                     }
>>>>>    213                 }
>>>>>    214            );
>>>>>    215        }
>>>>>
>>>>>
>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
>>>>> ...
>>>>>    140                    display("......--> vm.suspend();");
>>>>>    141                    vm.suspend();
>>>>>    142
>>>>>    143                    display("        getting : Map<String, Integer> suspendsCounts1");
>>>>>    144
>>>>>    145                    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
>>>>>    146                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>    147                        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
>>>>>    148                    }
>>>>>    149                    display(suspendsCounts1.toString());
>>>>>    150
>>>>>    151                    display(" eventSet.resume;");
>>>>>    152                    eventSet.resume();
>>>>>    153
>>>>>    154                    display("        getting : Map<String, Integer> suspendsCounts2");
>>>>>
>>>>> This is where the breakpoint is encountered before the second set
>>>>> of suspend counts is acquired.
>>>>>
>>>>>    155                    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
>>>>>    156                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>    157                        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
>>>>>    158                    }
>>>>>    159                    display(suspendsCounts2.toString());
>>>>>
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Wed Jul 18 20:56:47 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Jul 2018 13:56:47 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
Message-ID: <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180718/a86bcf46/attachment.html>

From gary.adams at oracle.com  Wed Jul 18 22:09:43 2018
From: gary.adams at oracle.com (gary.adams at oracle.com)
Date: Wed, 18 Jul 2018 18:09:43 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
Message-ID: <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>

On 7/18/18 4:47 PM, Chris Plummer wrote:
> Hi Gary
>
> Ok, so shouldRunAfterBreakpoint() is the code that does the 
> eventHandler.wait(), so it gets the eventHandler.notifyAll() 
> notification from the BreakpointEvent handler.
>
> And as a side note, I see now that resumption of execution after the 
> breakpoint at main() is done by:
>
>             // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
>             display("RESUME DEBUGGEE VM");
>             vm.resume();
>
>             testRun();
>
> shouldRunAfterBreakpoint() is returning true until the end of the test 
> when the debuggee executes "instruction = end". That's why 
> runTests() does a "break" when shouldRunAfterBreakpoint() returns 
> false. So this means the code that is checking 
> shouldRunAfterBreakpoint() is not resuming execution for the first few 
> (probably 3) methodForCommunication() breakpoints. However, it does 
> make sure that runTests() blocks until the BreakpointEvent has been 
> processed.
>
> You point out the vm.resume() at the bottom of the loop in runTests(), 
> but that's only after a bunch of ThreadStartEvent processing above it 
> has been done already. The ThreadStartEvent would never get generated 
> if there was not a resume some point earlier. I think it is happening 
> during the eventHandler.waitForRequestedEventSet() call, which does a 
> vm.resume().
>
> So if I understand the order of things now:
>
> -shouldRunAfterBreakpoint() returns after first 
> methodForCommunication() is hit. At this point we know the first 
> thread has been created, but no attempt to start it yet. The debuggee 
> is suspended at this point.
> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also 
> does a vm.resume().
> -The debuggee starts the thread and then does another 
> methodForCommunication() (this 2nd one is actually after the 2nd 
> thread has been created, but not yet started). Now we have a race: do 
> we get the ThreadStartEvent first or the BreakpointEvent? The race 
> exists because when the ThreadStartEvent is generated, the thread is 
> not suspended due to SUSPEND_NONE. Even if the ThreadStartEvent comes 
> in first, the async handling of the BreakpointEvent can cause problems 
> during the ThreadStartEvent processing.
Based on the failed log in the bug report, the thread start event is 
observed and the first set of suspend counts is acquired. Then, after 
the resume, the breakpoint message is displayed and the second set of 
suspend counts is acquired.

I can show you the passed and failed logs tomorrow.
> -You added a 100ms delay after the thread has started, but before 
> methodForCommunication(), hoping it will make it so the 
> ThreadStartEvent can be received and fully processed before the 
> BreakpointEvent is.
The delay is mostly just a yield so the debugger gets a chance to run.
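
For comparison, here is a minimal single-JVM sketch of an explicit 
handshake in place of a timed delay. It is only an analogy: the real 
debugger and debuggee are separate processes, so a latch like this 
could not be shared between them, and the class HandshakeSketch and its 
method are hypothetical names that do not come from the test.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

/** Single-JVM analogy only. Hypothetical, not part of resume008a. */
public class HandshakeSketch {

    /** Returns true if the worker moved on only after sampling finished. */
    public static boolean workerSawSample() {
        CountDownLatch samplingDone = new CountDownLatch(1);
        AtomicBoolean sampled = new AtomicBoolean(false);
        AtomicBoolean sawSample = new AtomicBoolean(false);

        Thread worker = new Thread(() -> {
            try {
                // Instead of Thread.sleep(100): block until the sampler is done.
                samplingDone.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Only now would the worker advance to its next synchronization point.
            sawSample.set(sampled.get());
        }, "thread0");
        worker.start();

        sampled.set(true);         // stands in for acquiring the second set of counts
        samplingDone.countDown();  // release the worker

        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return sawSample.get();
    }

    public static void main(String[] args) {
        System.out.println("worker advanced only after sampling: " + workerSawSample());
    }
}
```

A sleep makes the wanted ordering likely; the latch makes it the only 
possible ordering, at the cost of needing a channel both sides can 
reach.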
>
> I think it would be preferable to fix this by doing better 
> synchronization. After all, that is the approach the test originally 
> took. It could have been written with a bunch of sleep() delays 
> instead, but that in general is not a very good approach.
>
> What if you added a shouldRunAfterBreakpoint() call after the 
> ThreadStartEvent arrives? At this point you would know that the vm is 
> suspended due to the breakpoint, so no need for:
>
>                 display("......checking up on EventSet.resume()");
>                 display("......--> vm.suspend();");
>                 vm.suspend();
I think the suspend is intentional, to capture the suspend counts.
It also needs to resume the vm and acquire the counts again so it can 
confirm the correct suspend count behavior.
If the test waits to capture the second set of suspend counts, the 
breakpoint causes incorrect values.
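
A minimal sketch of comparing the two sets of counts, assuming both 
are maps keyed by thread name as in the quoted test code. The class 
SuspendCountDiff, the method mismatches and the sample values are 
hypothetical; the actual check in resume008.java may be written 
differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

/** Hypothetical helper; not taken from resume008.java. */
public class SuspendCountDiff {

    /** Names of threads whose count differs, or that appear in only one snapshot. */
    public static List<String> mismatches(Map<String, Integer> first,
                                          Map<String, Integer> second) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : first.entrySet()) {
            if (!Objects.equals(entry.getValue(), second.get(entry.getKey()))) {
                names.add(entry.getKey());
            }
        }
        for (String name : second.keySet()) {
            if (!first.containsKey(name)) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Made-up values; TreeMap keeps the iteration order stable for printing.
        Map<String, Integer> counts1 = new TreeMap<>();
        counts1.put("main", 1);
        counts1.put("thread0", 1);
        counts1.put("Common-Cleaner", 1);

        Map<String, Integer> counts2 = new TreeMap<>(counts1);
        counts2.put("Common-Cleaner", 2);   // made-up mismatch

        for (String name : mismatches(counts1, counts2)) {
            System.out.println("ERROR: suspendCounts don't match for : " + name);
        }
    }
}
```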

...
>
> You might then also need to add another methodForCommunication() call 
> at the end of case 0 and 1 in the debuggee, although I think you could 
> instead just change the shouldRunAfterBreakpoint() at the start of the 
> loop. I think that check actually belongs at the end of the loop, and 
> only for case 2. In fact it would be an error if 
> shouldRunAfterBreakpoint() did not return true in that case. Then you 
> also need to add a shouldRunAfterBreakpoint() at the start of case 0 
> to get things rolling (and I think at the start of case 1 also).
>
> Chris
>
>
> On 7/18/18 12:45 PM, Gary Adams wrote:
>> Answers below ...
>>
>> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> Who does the resume for the breakpoint event?
>>>
>>>         eventHandler.addListener(
>>>              new EventHandler.EventListener() {
>>>                  public boolean eventReceived(Event event) {
>>>                     if (event instanceof BreakpointEvent &&
>>>                             bpRequest.equals(event.request())) {
>>>                         synchronized(eventHandler) {
>>>                             display("Received communication breakpoint event.");
>>>                             bpCount++;
>>>                             eventHandler.notifyAll();
>>>                         }
>>>                         return true;
>>>                     }
>>>                     return false;
>>>                  }
>>>              }
>>>         );
>> I believe you are looking for this sequence.
>> At the top of the loop, shouldRunAfterBreakpoint() checks whether
>> resume() should be called. Lines 96-99 are an early termination,
>> and the resume at the bottom of the loop, line 240, is the normal
>> path that continues the test to the next case.
>>
>> resume008.java :
>> ...
>>     94            for (int i = 0; ; i++) {
>>     95
>>
>>     96                if (!shouldRunAfterBreakpoint()) {
>>     97                    vm.resume();
>>     98                    break;
>>     99                }
>>
>>    100
>>    101
>>    102                display(":::::: case: # " + i);
>>    103
>>    104                switch (i) {
>>    105
>>    106                    case 0:
>>    107                    eventRequest = settingThreadStartRequest (
>>    108                                           SUSPEND_NONE, "ThreadStartRequest1");
>> ...
>>    238
>>    239                display("......--> vm.resume()");
>>    240                vm.resume();
>>    241            }
>>>
>>> Also:
>>>
>>>>   1. On a thread start event the debuggee is suspended, line 141
>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE 
>>> was used.
>> The thread start event is set to SUSPEND_NONE for thread0, but when
>> the thread start event is observed the resume008 test suspends the vm
>> immediately after fetching the "number" property.
> My point is that the debuggee continues to run after the 
> ThreadStartEvent is sent, and relies on the debugger to suspend it after 
> receiving the event. In the meantime the debuggee sometimes advances to 
> the next breakpoint, which is the bug you are seeing.
>>
>>    132                if ( !(newEvent instanceof ThreadStartEvent)) {
>>    133                    setFailedStatus("ERROR: new event is not ThreadStartEvent");
>>    134                } else {
>>    135
>>    136                    String property = (String) newEvent.request().getProperty("number");
>>    137                    display("       got new ThreadStartEvent with propety 'number' == " + property);
>>    138
>>    139                    display("......checking up on EventSet.resume()");
>>    140                    display("......--> vm.suspend();");
>>    141                    vm.suspend();
>>
>>
>>>
>>> Chris
>>>
>>> On 7/18/18 4:52 AM, Gary Adams wrote:
>>>> There is nothing wrong with the breakpoint in methodForCommunication.
>>>> The test uses it to make sure the threads are each tested separately.
>>>> The breakpoint eventhandler just displays a message, increments a 
>>>> counter
>>>> and returns.
>>>>
>>>> Let me step through resume008a, the debuggee, to help clarify ...
>>>>
>>>> 1. The test thread is created and the synchronized breakpoint is
>>>> observed. lines 101-102
>>>> 2. The thread is started. lines 104,135-137
>>>>     2a. The main thread blocks on a local object. lines 133, 139
>>>>     2b. The test thread is started. line 137
>>>>            A run entered message is displayed, line 159
>>>>            The main thread lock object is notified, line 167
>>>>           2b1. The main thread continues. lines 167, 146
>>>>                   The next test thread is created. line 106
>>>>                   The synchronized breakpoint is observed, line 107
>>>>           2b2. A run exited message is displayed, line 169
>>>>
>>>> On the resume008 debugger side ...
>>>>   1. On a thread start event the debuggee is suspended, line 141
>>>>   2. Messages are displayed and a first set of thread suspend
>>>> counts is acquired. lines 143-151
>>>>   3. The threads are resumed, line 152
>>>> --->
>>>>   4. Messages are displayed and a second set of thread suspend
>>>> counts is acquired. lines 154-159
>>>>
>>>> The way the test is written, the expectation is that debugger steps
>>>> 2, 3 and 4 will all happen while the test thread is running.
>>>>
>>>> When the debugger resumes the debuggee threads (debugger step 3),
>>>> the debuggee continues from where it left off (debuggee steps
>>>> 2b, 2b1, 2b2).
>>>>
>>>> If debuggee step 2b1 (line 107) completes before the debugger
>>>> completes step 4 (line 159), then the synchronized breakpoint will
>>>> suspend the vm and the counts will not match for the SUSPEND_NONE
>>>> test thread start.
>>>>
>>>> resume008a.java:
>>>>
>>>>    100                        case 0:
>>>>    101                                thread0 = new Threadresume008a("thread0");
>>>>    102                                methodForCommunication();
>>>>    103
>>>>    104                                threadStart(thread0);
>>>>    105
>>>>    106                                thread1 = new Threadresume008a("thread1");
>>>>    107                                methodForCommunication();
>>>>    108                                break;
>>>>
>>>>    ...
>>>>    135        static int threadStart(Thread t) {
>>>>    136            synchronized (waitnotifyObj) {
>>>>    137                t.start();
>>>>    138                try {
>>>>    139                    waitnotifyObj.wait();
>>>>    140                } catch ( Exception e) {
>>>>    141                    exitCode = FAILED;
>>>>    142                    logErr("       Exception : " + e );
>>>>    143                    return FAILED;
>>>>    144                }
>>>>    145            }
>>>>    146            return PASSED;
>>>>    147        }
>>>>
>>>>    149        static class Threadresume008a extends Thread {
>>>>    150
>>>>    151            String tName = null;
>>>>    152
>>>>    153            public Threadresume008a(String threadName) {
>>>>    154                super(threadName);
>>>>    155                tName = threadName;
>>>>    156            }
>>>>    157
>>>>    158            public void run() {
>>>>    159                log1("  'run': enter  :: threadName == " + tName);
>>>>
>>>> This is the proposed fix that will let the debugger complete its
>>>> second acquisition of suspend counts while the test thread is still
>>>> running.
>>>>
>>>>    160                // Yield, so the start thread event processing can be completed.
>>>>    161                try {
>>>>    162                    Thread.sleep(100);
>>>>    163                } catch (InterruptedException e) {
>>>>    164                    // ignored
>>>>    165                }
>>>>
>>>>    166                synchronized (waitnotifyObj) {
>>>>    167                        waitnotifyObj.notify();
>>>>    168                }
>>>>    169                log1("  'run': exit   :: threadName == " + tName);
>>>>    170                return;
>>>>    171            }
>>>>    172        }
>>>>
>>>>
>>>>
>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>>>>> Hi Gary,
>>>>>
>>>>> I've been having trouble following the control flow of this test. 
>>>>> One thing I've stumbled across is the following:
>>>>>
>>>>>             /* A debuggee class must define 'methodForCommunication'
>>>>>              * method and invoke it in points of synchronization
>>>>>              * with a debugger.
>>>>>              */
>>>>>             setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>>>>
>>>>> So why isn't this mode of synchronization good enough? Is it 
>>>>> because it was not designed with the understanding that the 
>>>>> debugger might be doing suspended thread counts, and suspending 
>>>>> all threads at the breakpoint messes up the test?
>>>>>
>>>>> From what I can tell of the test, after the debuggee is started 
>>>>> and hits the default breakpoint at the start of main(), the 
>>>>> debugger then does a vm.resume() at the start of the for loop in 
>>>>> the runTest() method. The debuggee then creates a thread and calls 
>>>>> methodForCommunication(). There is already a breakpoint set there 
>>>>> by the above debuggee code. It's unclear to me what happens as a 
>>>>> result of this breakpoint and how it serves the test. Also unclear 
>>>>> to me who is responsible for the vm.resume() after the breakpoint 
>>>>> is hit.
>>>>>
>>>>> The debugger then requests all ThreadStart events, requesting that 
>>>>> no threads be disabled when it is sent. I think you are saying 
>>>>> that when the ThreadStart event comes in, sometimes we are at the 
>>>>> methodForCommunication breakpoint, with all threads disabled, and 
>>>>> this messes up the thread suspend counts. You want to delay 100ms 
>>>>> so the breakpoint event can be processed and threads resumed again 
>>>>> (although I can't see who actually resumes the thread after 
>>>>> hitting the methodForCommunication breakpoint).
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>>>>> A race condition exists between the debugger and the debuggee.
>>>>>>
>>>>>> The first test thread is started with SUSPEND_NONE policy set.
>>>>>> While processing the thread start event the debugger captures
>>>>>> an initial set of thread suspend counts and resumes the
>>>>>> debuggee vm. If the debuggee advances quickly it reaches
>>>>>> the breakpoint set for methodForCommunication. Since the breakpoint
>>>>>> carries with it SUSPEND_ALL policy, when the debugger captures a 
>>>>>> second
>>>>>> set of suspend counts, it will not match the expected counts for
>>>>>> a SUSPEND_NONE scenario.
>>>>>>
>>>>>> The proposed fix introduces a yield in the debuggee test thread 
>>>>>> run method
>>>>>> to allow the debugger to get the expected sampled values.
>>>>>>
>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>>>>
>>>>>>
>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>>>>>> ...
>>>>>>    186        private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
>>>>>>    187            Method method = debuggee.methodByName(refType, methodName);
>>>>>>    188            Location location = null;
>>>>>>    189            try {
>>>>>>    190                location = method.allLineLocations().get(0);
>>>>>>    191            } catch (AbsentInformationException e) {
>>>>>>    192                throw new Failure(e);
>>>>>>    193            }
>>>>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>>>>    195
>>>>>>
>>>>>>    196            bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>>>>
>>>>>>    197            bpRequest.putProperty("number", "zero");
>>>>>>    198            bpRequest.enable();
>>>>>>    199
>>>>>>    200            eventHandler.addListener(
>>>>>>    201                 new EventHandler.EventListener() {
>>>>>>    202                     public boolean eventReceived(Event event) {
>>>>>>    203                        if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>>    204                            synchronized(eventHandler) {
>>>>>>    205                                display("Received communication breakpoint event.");
>>>>>>    206                                bpCount++;
>>>>>>    207                                eventHandler.notifyAll();
>>>>>>    208                            }
>>>>>>    209                            return true;
>>>>>>    210                        }
>>>>>>    211                        return false;
>>>>>>    212                     }
>>>>>>    213                 }
>>>>>>    214            );
>>>>>>    215        }
>>>>>>
>>>>>>
>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: 
>>>>>>
>>>>>> ...
>>>>>>    140                    display("......--> vm.suspend();");
>>>>>>    141                    vm.suspend();
>>>>>>    142
>>>>>>    143                    display("        getting : Map<String, Integer> suspendsCounts1");
>>>>>>    144
>>>>>>    145                    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
>>>>>>    146                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>>    147                        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
>>>>>>    148                    }
>>>>>>    149                    display(suspendsCounts1.toString());
>>>>>>    150
>>>>>>    151                    display(" eventSet.resume;");
>>>>>>    152                    eventSet.resume();
>>>>>>    153
>>>>>>    154                    display("        getting : Map<String, Integer> suspendsCounts2");
>>>>>>
>>>>>> This is where the breakpoint is encountered before the second set
>>>>>> of suspend counts is acquired.
>>>>>>
>>>>>>    155                    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
>>>>>>    156                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>>    157                        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
>>>>>>    158                    }
>>>>>>    159                    display(suspendsCounts2.toString());
>>>>>>
>>>>>
>>>>
>>>
>>
>


From serguei.spitsyn at oracle.com  Thu Jul 19 03:32:27 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Jul 2018 20:32:27 -0700
Subject: RFR(XS): 8207819: Problem list
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
Message-ID: <c19f5f7d-6c51-457f-d240-db48ab53580d@oracle.com>

Please review the fix for the sub-task:
   https://bugs.openjdk.java.net/browse/JDK-8207819


The test HeapMonitorStatRateTest.java needs to be problem listed until 
the main bug is fixed:
   https://bugs.openjdk.java.net/browse/JDK-8207765


The patch is:

diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
+++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
@@ -81,6 +81,7 @@

 serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
 serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
+serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all

 #############################################################################
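
For readers unfamiliar with the file, here is a minimal sketch of how the
added entry can be read. It assumes each entry has three whitespace-separated
fields (test path, comma-separated bug IDs, comma-separated platforms), which
matches the lines in the patch above. ProblemListEntry is a hypothetical
helper written for this illustration; it is not part of jtreg.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for one ProblemList.txt entry; not part of jtreg. */
final class ProblemListEntry {
    final String test;
    final List<String> bugIds;
    final List<String> platforms;

    private ProblemListEntry(String test, List<String> bugIds, List<String> platforms) {
        this.test = test;
        this.bugIds = bugIds;
        this.platforms = platforms;
    }

    /** Parses one line; returns null for blank lines and '#' comment lines. */
    static ProblemListEntry parse(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty() || trimmed.startsWith("#")) {
            return null;
        }
        // Assumed layout: <test> <bug ids> <platforms>
        String[] fields = trimmed.split("\\s+");
        if (fields.length < 3) {
            throw new IllegalArgumentException("expected <test> <bug ids> <platforms>: " + line);
        }
        return new ProblemListEntry(
                fields[0],
                Arrays.asList(fields[1].split(",")),
                Arrays.asList(fields[2].split(",")));
    }

    public static void main(String[] args) {
        ProblemListEntry e = parse(
                "serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all");
        System.out.println(e.test + " is excluded on " + e.platforms
                + " because of JDK-" + e.bugIds.get(0));
    }
}
```

Note that the entry refers to the main bug (8207765), not to the sub-task
under review (8207819).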


Thanks,
Serguei

From chris.plummer at oracle.com  Thu Jul 19 03:47:56 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 18 Jul 2018 20:47:56 -0700
Subject: RFR(XS): 8207819: Problem list
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To: <c19f5f7d-6c51-457f-d240-db48ab53580d@oracle.com>
References: <c19f5f7d-6c51-457f-d240-db48ab53580d@oracle.com>
Message-ID: <a768ea17-4f76-d860-c691-fb89fc20a98b@oracle.com>

Looks good.

Chris

On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
> Please, review the fix for sub-task:
>   https://bugs.openjdk.java.net/browse/JDK-8207819
>
>
> The test HeapMonitorStatRateTest.java needs to be problem listed until 
> main bug is fixed
>   https://bugs.openjdk.java.net/browse/JDK-8207765
>
>
> The patch is:
>
> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
> @@ -81,6 +81,7 @@
>
>  serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>  serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>
>  #############################################################################
>
>
>
> Thanks,
> Serguei


From serguei.spitsyn at oracle.com  Thu Jul 19 03:49:58 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 18 Jul 2018 20:49:58 -0700
Subject: RFR(XS): 8207819: Problem list
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To: <a768ea17-4f76-d860-c691-fb89fc20a98b@oracle.com>
References: <c19f5f7d-6c51-457f-d240-db48ab53580d@oracle.com>
 <a768ea17-4f76-d860-c691-fb89fc20a98b@oracle.com>
Message-ID: <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com>

Thanks, Chris!
This meets the Trivial Change policy, so I am pushing it now.

Thanks,
Serguei


On 7/18/18 20:47, Chris Plummer wrote:
> Looks good.
>
> Chris
>
> On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
>> Please, review the fix for sub-task:
>>   https://bugs.openjdk.java.net/browse/JDK-8207819
>>
>>
>> The test HeapMonitorStatRateTest.java needs to be problem listed 
>> until main bug is fixed
>>   https://bugs.openjdk.java.net/browse/JDK-8207765
>>
>>
>> The patch is:
>>
>> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
>> @@ -81,6 +81,7 @@
>>
>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>>  serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
>> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>>
>>  #############################################################################
>>
>>
>>
>> Thanks,
>> Serguei
>
>
>


From gary.adams at oracle.com  Thu Jul 19 12:08:12 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Thu, 19 Jul 2018 08:08:12 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
 <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
Message-ID: <5B507F2C.4080503@oracle.com>

In the successful run below, the sequence "first acquire of the thread
suspend counts, resume, second acquire of the thread suspend counts" is
not interrupted by the breakpoint event.

Note that in the failed thread0 case the test thread finishes rapidly:


[2018-01-22T20:33:46.86] debugee.stderr>  **>  debuggee:   'run': enter  :: threadName == thread0
*[2018-01-22T20:33:46.86] debugee.stderr>  **>  debuggee:   'run': exit   :: threadName == thread0*


In the successful test run, the thread0 run method exits after thread1
has started:

debugger> :::::: case: # 1
debugger> ......waiting for new ThreadStartEvent : 1
EventHandler> waitForRequestedEventSet: enabling remove of listener 
nsk.share.jdi.EventHandler$7 at 616bc3ae
EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae
EventHandler> waitForRequestedEventSet: vm.resume called
EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
*debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread0*
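
To make the mismatch in the failed log below easier to see, here is a small
sketch that diffs the two suspend-count snapshots the test prints for case 0.
The values are copied from that log. SuspendCountDiff and its methods are
hypothetical names used only for this illustration. It assumes that in the
SUSPEND_NONE case the test expects the counts to be identical before and
after eventSet.resume(), which is what the "suspendCounts don't match" error
suggests.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for comparing two suspend-count snapshots. */
final class SuspendCountDiff {

    /** Returns thread name -> "before -> after" for every thread whose count changed. */
    static Map<String, String> changed(Map<String, Integer> before, Map<String, Integer> after) {
        Map<String, String> diff = new TreeMap<>();
        for (Map.Entry<String, Integer> e : before.entrySet()) {
            Integer later = after.get(e.getKey());
            if (later != null && !later.equals(e.getValue())) {
                diff.put(e.getKey(), e.getValue() + " -> " + later);
            }
        }
        return diff;
    }

    private static Map<String, Integer> snapshot(int referenceHandler, int commonCleaner,
                                                 int mainThread, int signalDispatcher,
                                                 int finalizer) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("Reference Handler", referenceHandler);
        counts.put("Common-Cleaner", commonCleaner);
        counts.put("main", mainThread);
        counts.put("Signal Dispatcher", signalDispatcher);
        counts.put("Finalizer", finalizer);
        return counts;
    }

    /** suspendsCounts1 from case 0 of the failed log. */
    static Map<String, Integer> failedRunBefore() {
        return snapshot(1, 1, 1, 1, 1);
    }

    /** suspendsCounts2 from case 0 of the failed log, taken after the breakpoint event arrived. */
    static Map<String, Integer> failedRunAfter() {
        return snapshot(2, 2, 1, 2, 2);
    }

    public static void main(String[] args) {
        changed(failedRunBefore(), failedRunAfter())
                .forEach((name, counts) -> System.out.println(name + ": " + counts));
    }
}
```

Every thread except main shows a changed count, which is consistent with the
SUSPEND_ALL breakpoint event landing between the two snapshots.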


Here's a recent mach5 failed log:

[2018-01-22T20:33:45.65] #
[2018-01-22T20:33:45.65] export TEST_CLEANUP
[2018-01-22T20:33:45.65] export SHELL
[2018-01-22T20:33:45.65] export DISPLAY
[2018-01-22T20:33:45.65] export LIBJSIG_PATH
[2018-01-22T20:33:45.65] export TESTBASE
[2018-01-22T20:33:45.65] export JAVA_OPTS
[2018-01-22T20:33:45.65] export RAS_OPTIONS
[2018-01-22T20:33:45.65] export HOME
[2018-01-22T20:33:45.65] export LD_LIBRARY_PATH
[2018-01-22T20:33:45.65] export CLASSPATH
[2018-01-22T20:33:45.65] export TEMP
[2018-01-22T20:33:45.65] export TESTED_JAVA_HOME
[2018-01-22T20:33:45.65] export BASH_ENV
[2018-01-22T20:33:45.65] export PATH
[2018-01-22T20:33:45.65] TEST_DEST_DIR="resume008"
[2018-01-22T20:33:45.65] # Actual: TEST_DEST_DIR=resume008
[2018-01-22T20:33:45.65] TESTNAME="${test_case_name}"
[2018-01-22T20:33:45.65] # Actual: TESTNAME=resume008
[2018-01-22T20:33:45.65] testName="nsk/jdi/EventSet/resume//resume008"
[2018-01-22T20:33:45.65] # Actual: testName=nsk/jdi/EventSet/resume//resume008
[2018-01-22T20:33:45.65] TESTDIR="${test_work_dir}"
[2018-01-22T20:33:45.65] # Actual: TESTDIR=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008
[2018-01-22T20:33:45.65] testWorkDir="${test_work_dir}/"
[2018-01-22T20:33:45.65] # Actual: testWorkDir=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/
[2018-01-22T20:33:45.65] export testWorkDir
[2018-01-22T20:33:45.65] tlogOutFile="${test_work_dir}/${test_name}.tlog"
[2018-01-22T20:33:45.65] # Actual: tlogOutFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.tlog
[2018-01-22T20:33:45.65] testErrFile="${test_work_dir}/${test_name}.err"
[2018-01-22T20:33:45.65] # Actual: testErrFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.err
[2018-01-22T20:33:45.65] EXECUTE_CLASS="${test_name}"
[2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=resume008
[2018-01-22T20:33:45.66] NSK_STRESS_METASPACE_OPTS="-XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m -Xlog:gc(ASTERISK_SUBST),gc+heap=trace"
[2018-01-22T20:33:45.66] # Actual: NSK_STRESS_METASPACE_OPTS=-XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m -Xlog:gc*,gc+heap=trace
[2018-01-22T20:33:45.66] export NSK_STRESS_METASPACE_OPTS
[2018-01-22T20:33:45.66] EXECUTE_CLASS="nsk.jdi.EventSet.resume.resume008"
[2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=nsk.jdi.EventSet.resume.resume008
[2018-01-22T20:33:45.66] TEST_ARGS="${JDI_TEST_KEYS} -debugee.vmkeys=${JDI_DEBUGEE_VM_KEYS}"
[2018-01-22T20:33:45.66] # Actual: TEST_ARGS=-verbose -arch=linux-amd64 -waittime=5 -debugee.vmkind=java -transport.address=dynamic -debugee.vmkeys=-XX:MaxRAMPercentage=12.5
[2018-01-22T20:33:45.66] JAVA="${TESTED_JAVA_HOME}/bin/${DEBUGGER_KIND_OF_JAVA}"
[2018-01-22T20:33:45.66] # Actual: JAVA=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java
[2018-01-22T20:33:45.66] JAVA_OPTS="${DEBUGGER_JAVA_OPTS}"
[2018-01-22T20:33:45.66] # Actual: JAVA_OPTS=
[2018-01-22T20:33:45.66] APPLICATION_TIMEOUT="${TIMEOUT}"
[2018-01-22T20:33:45.66] # Actual: APPLICATION_TIMEOUT=30
[2018-01-22T20:33:45.66] CLASSPATH="${test_work_dir}${PS}${CLASSPATH}"
[2018-01-22T20:33:45.66] # Actual: CLASSPATH=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008:/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.test/hotspot/closed/tonga/bin/classes:
[2018-01-22T20:33:45.66] export CLASSPATH
[2018-01-22T20:33:45.66] ${JAVA} ${JAVA_OPTS} ${EXECUTE_CLASS} ${TEST_ARGS}
[2018-01-22T20:33:45.66] # Actual: /scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java nsk.jdi.EventSet.resume.resume008 -verbose -arch=linux-amd64 -waittime=5 -debugee.vmkind=java -transport.address=dynamic -debugee.vmkeys=-XX:MaxRAMPercentage=12.5
[2018-01-22T20:33:46.01] binder>  VirtualMachineManager: version 9.0
[2018-01-22T20:33:46.05] binder>  Finding connector: default
[2018-01-22T20:33:46.05] binder>  LaunchingConnector:
[2018-01-22T20:33:46.06] binder>      name: com.sun.jdi.CommandLineLaunch
[2018-01-22T20:33:46.06] binder>      description: Launches target using Sun Java VM command line and attaches to it
[2018-01-22T20:33:46.06] binder>      transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 457e2f02
[2018-01-22T20:33:46.19] binder>  Connector arguments:
[2018-01-22T20:33:46.19] binder>      home=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10
[2018-01-22T20:33:46.19] binder>      vmexec=java
[2018-01-22T20:33:46.19] binder>      options=-XX:MaxRAMPercentage=12.5
[2018-01-22T20:33:46.20] binder>      main=nsk.jdi.EventSet.resume.resume008a "-verbose" "-arch=linux-amd64" "-waittime=5" "-debugee.vmkind=java" "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=12.5" "-pipe.port=28038"
[2018-01-22T20:33:46.20] binder>      quote="
[2018-01-22T20:33:46.20] binder>      suspend=true
[2018-01-22T20:33:46.20] binder>  Launching debugee
[2018-01-22T20:33:46.56] binder>  Waiting for VM initialized
[2018-01-22T20:33:46.60] Initial VMStartEvent received: VMStartEvent in thread main
[2018-01-22T20:33:46.61] EventHandler>  Adding listener nsk.share.jdi.EventHandler$1 at 1e7c7811
[2018-01-22T20:33:46.61] EventHandler>  Adding listener nsk.share.jdi.EventHandler$2 at 1a3869f4
[2018-01-22T20:33:46.61] EventHandler>  Adding listener nsk.share.jdi.EventHandler$3 at 77f99a05
[2018-01-22T20:33:46.61] EventHandler>  Adding listener nsk.share.jdi.EventHandler$4 at 3aeaafa6
[2018-01-22T20:33:46.61] EventHandler>  Adding listener nsk.share.jdi.EventHandler$5 at 4d3167f4
[2018-01-22T20:33:46.62] EventHandler>  waitForRequestedEvent: enabling remove of listener nsk.share.jdi.EventHandler$6 at 4eb7f003
[2018-01-22T20:33:46.62] EventHandler>  Adding listener nsk.share.jdi.EventHandler$6 at 4eb7f003
[2018-01-22T20:33:46.62] EventHandler>  waitForRequestedEvent: vm.resume called
[2018-01-22T20:33:46.67] EventHandler>  Received event set with policy = SUSPEND_EVENT_THREAD
[2018-01-22T20:33:46.68] EventHandler>  Event: ClassPrepareEventImpl req class prepare request  (enabled)
[2018-01-22T20:33:46.69] EventHandler>  waitForRequestedEvent: Received event(ClassPrepareEvent in thread main) for request(class prepare request  (enabled))
[2018-01-22T20:33:46.69] EventHandler>  Removing listener nsk.share.jdi.EventHandler$6 at 4eb7f003
[2018-01-22T20:33:46.69] debugger>  Received ClassPrepareEvent for debuggee class: nsk.jdi.EventSet.resume.resume008a
[2018-01-22T20:33:46.71] binder>  Breakpoint set:
[2018-01-22T20:33:46.71] 	breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (disabled)
[2018-01-22T20:33:46.71] EventHandler>  Adding listener nsk.share.jdi.TestDebuggerType1$1 at 43738a82
[2018-01-22T20:33:46.71] debugger>  TESTING BEGINS
[2018-01-22T20:33:46.71] debugger>  RESUME DEBUGGEE VM
[2018-01-22T20:33:46.72] debugger>  shouldRunAfterBreakpoint: entered
[2018-01-22T20:33:46.72] debugger>  shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec.

[2018-01-22T20:33:46.84] EventHandler>  Received event set with policy = SUSPEND_ALL
[2018-01-22T20:33:46.84] EventHandler>  Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
[2018-01-22T20:33:46.84] debugger>  Received communication breakpoint event.

[2018-01-22T20:33:46.84] debugger>  shouldRunAfterBreakpoint: received breakpoint event.
[2018-01-22T20:33:46.84] debugee.stderr>  **>  debuggee: debuggee started!
[2018-01-22T20:33:46.85] debugger>  shouldRunAfterBreakpoint: exited with true.

[2018-01-22T20:33:46.85] debugger>  :::::: case: # 0
[2018-01-22T20:33:46.85] debugger>  ......waiting for new ThreadStartEvent : 0
[2018-01-22T20:33:46.85] EventHandler>  waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 6ec8211c
[2018-01-22T20:33:46.85] EventHandler>  Adding listener nsk.share.jdi.EventHandler$7 at 6ec8211c
[2018-01-22T20:33:46.85] EventHandler>  waitForRequestedEventSet: vm.resume called
[2018-01-22T20:33:46.86] debugee.stderr>  **>  debuggee:   'run': enter  :: threadName == thread0
[2018-01-22T20:33:46.86] debugee.stderr>  **>  debuggee:   'run': exit   :: threadName == thread0
[2018-01-22T20:33:46.86] EventHandler>  Received event set with policy = SUSPEND_NONE
[2018-01-22T20:33:46.86] EventHandler>  waitForRequestedEventSet: Received event set for request: thread start request  (enabled)
[2018-01-22T20:33:46.86] EventHandler>  Event: ThreadStartEventImpl req thread start request  (enabled)
[2018-01-22T20:33:46.86] EventHandler>  Removing listener nsk.share.jdi.EventHandler$7 at 6ec8211c
[2018-01-22T20:33:46.86] debugger>         got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
[2018-01-22T20:33:46.86] debugger>  ......checking up on EventSet.resume()
[2018-01-22T20:33:46.86] debugger>  ......-->  vm.suspend();
[2018-01-22T20:33:46.87] debugger>          getting : Map<String, Integer>  suspendsCounts1
[2018-01-22T20:33:46.87] debugger>  {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
[2018-01-22T20:33:46.87] debugger>          eventSet.resume;
[2018-01-22T20:33:46.87] debugger>          getting : Map<String, Integer>  suspendsCounts2

[2018-01-22T20:33:46.87] EventHandler>  Received event set with policy = SUSPEND_ALL
[2018-01-22T20:33:46.87] EventHandler>  Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
[2018-01-22T20:33:46.87] debugger>  Received communication breakpoint event.

[2018-01-22T20:33:46.87] debugger>  {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2}
[2018-01-22T20:33:46.87] debugger>          getting : int policy = eventSet.suspendPolicy();
[2018-01-22T20:33:46.87] debugger>          case SUSPEND_NONE
[2018-01-22T20:33:46.87] debugger>          checking Reference Handler
[2018-01-22T20:33:46.87] # ERROR: debugger>  ERROR: suspendCounts don't match for : Reference Handler
[2018-01-22T20:33:46.88] The following stacktrace is for Aurora. Used to create a RULE:
[2018-01-22T20:33:46.88] nsk.share.TestFailure: debugger>  ERROR: suspendCounts don't match for : Reference Handler
[2018-01-22T20:33:46.88] 	at nsk.share.Log.logExceptionForAurora(Log.java:411)
[2018-01-22T20:33:46.88] 	at nsk.share.Log.complain(Log.java:380)
[2018-01-22T20:33:46.88] 	at nsk.share.jdi.TestDebuggerType1.complain(TestDebuggerType1.java:63)
[2018-01-22T20:33:46.88] 	at nsk.jdi.EventSet.resume.resume008.testRun(resume008.java:163)
[2018-01-22T20:33:46.88] 	at nsk.share.jdi.TestDebuggerType1.runThis(TestDebuggerType1.java:104)
[2018-01-22T20:33:46.88] 	at nsk.jdi.EventSet.resume.resume008.run(resume008.java:62)
[2018-01-22T20:33:46.88] 	at nsk.jdi.EventSet.resume.resume008.main(resume008.java:57)
[2018-01-22T20:33:46.88] # ERROR: debugger>  before resuming : 1
[2018-01-22T20:33:46.88] # ERROR: debugger>  after  resuming : 2
[2018-01-22T20:33:46.88] debugger>  ......-->  vm.resume()
[2018-01-22T20:33:46.88] debugger>  shouldRunAfterBreakpoint: entered
[2018-01-22T20:33:46.88] debugger>  shouldRunAfterBreakpoint: received breakpoint event.
[2018-01-22T20:33:46.88] debugger>  shouldRunAfterBreakpoint: exited with true.
[2018-01-22T20:33:46.88] debugger>  :::::: case: # 1
[2018-01-22T20:33:46.88] debugger>  ......waiting for new ThreadStartEvent : 1
[2018-01-22T20:33:46.88] EventHandler>  waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 548ad73b
[2018-01-22T20:33:46.88] EventHandler>  Adding listener nsk.share.jdi.EventHandler$7 at 548ad73b
[2018-01-22T20:33:46.88] EventHandler>  waitForRequestedEventSet: vm.resume called
[2018-01-22T20:33:46.88] EventHandler>  Received event set with policy = SUSPEND_EVENT_THREAD
[2018-01-22T20:33:46.88] EventHandler>  waitForRequestedEventSet: Received event set for request: thread start request  (enabled)
[2018-01-22T20:33:46.88] EventHandler>  Event: ThreadStartEventImpl req thread start request  (enabled)
[2018-01-22T20:33:46.88] EventHandler>  Removing listener nsk.share.jdi.EventHandler$7 at 548ad73b
[2018-01-22T20:33:46.88] debugger>         got new ThreadStartEvent with propety 'number' == ThreadStartRequest2
[2018-01-22T20:33:46.88] debugger>  ......checking up on EventSet.resume()
[2018-01-22T20:33:46.88] debugger>  ......-->  vm.suspend();
[2018-01-22T20:33:46.88] debugger>          getting : Map<String, Integer>  suspendsCounts1
[2018-01-22T20:33:46.89] debugger>  {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
[2018-01-22T20:33:46.89] debugger>          eventSet.resume;
[2018-01-22T20:33:46.89] debugger>          getting : Map<String, Integer>  suspendsCounts2
[2018-01-22T20:33:46.89] debugger>  {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
[2018-01-22T20:33:46.89] debugger>          getting : int policy = eventSet.suspendPolicy();
[2018-01-22T20:33:46.89] debugger>          case SUSPEND_THREAD
[2018-01-22T20:33:46.89] debugger>  checking Reference Handler
[2018-01-22T20:33:46.89] debugger>  checking thread1
[2018-01-22T20:33:46.89] debugger>  checking Common-Cleaner
[2018-01-22T20:33:46.89] debugger>  checking main
[2018-01-22T20:33:46.90] debugger>  checking Signal Dispatcher
[2018-01-22T20:33:46.90] debugger>  checking Finalizer
[2018-01-22T20:33:46.90] debugger>  ......-->  vm.resume()
[2018-01-22T20:33:46.90] debugger>  shouldRunAfterBreakpoint: entered
[2018-01-22T20:33:46.90] debugger>  shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec.
[2018-01-22T20:33:46.90] debugee.stderr>  **>  debuggee:   'run': enter  :: threadName == thread1
[2018-01-22T20:33:46.90] debugee.stderr>  **>  debuggee:   'run': exit   :: threadName == thread1
[2018-01-22T20:33:46.90] EventHandler>  Received event set with policy = SUSPEND_ALL
[2018-01-22T20:33:46.90] EventHandler>  Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
[2018-01-22T20:33:46.90] debugger>  Received communication breakpoint event.
[2018-01-22T20:33:46.90] debugger>  shouldRunAfterBreakpoint: received breakpoint event.
[2018-01-22T20:33:46.90] debugger>  shouldRunAfterBreakpoint: exited with true.
[2018-01-22T20:33:46.90] debugger>  :::::: case: # 2
[2018-01-22T20:33:46.90] debugger>  ......waiting for new ThreadStartEvent : 2
[2018-01-22T20:33:46.90] EventHandler>  waitForRequestedEventSet: enabling remove of listener nsk.share.jdi.EventHandler$7 at 2641e737
[2018-01-22T20:33:46.90] EventHandler>  Adding listener nsk.share.jdi.EventHandler$7 at 2641e737
[2018-01-22T20:33:46.90] EventHandler>  waitForRequestedEventSet: vm.resume called
[2018-01-22T20:33:46.90] EventHandler>  Received event set with policy = SUSPEND_ALL
[2018-01-22T20:33:46.90] EventHandler>  waitForRequestedEventSet: Received event set for request: thread start request  (enabled)
[2018-01-22T20:33:46.90] EventHandler>  Event: ThreadStartEventImpl req thread start request  (enabled)
[2018-01-22T20:33:46.90] EventHandler>  Removing listener nsk.share.jdi.EventHandler$7 at 2641e737
[2018-01-22T20:33:46.90] debugger>         got new ThreadStartEvent with propety 'number' == ThreadStartRequest3
[2018-01-22T20:33:46.90] debugger>  ......checking up on EventSet.resume()
[2018-01-22T20:33:46.90] debugger>  ......-->  vm.suspend();
[2018-01-22T20:33:46.90] debugger>          getting : Map<String, Integer>  suspendsCounts1
[2018-01-22T20:33:46.91] debugger>  {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
[2018-01-22T20:33:46.91] debugger>          eventSet.resume;
[2018-01-22T20:33:46.91] debugger>          getting : Map<String, Integer>  suspendsCounts2
[2018-01-22T20:33:46.91] debugger>  {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
[2018-01-22T20:33:46.91] debugger>          getting : int policy = eventSet.suspendPolicy();
[2018-01-22T20:33:46.91] debugger>          case SUSPEND_ALL
[2018-01-22T20:33:46.91] debugger>  checking Reference Handler
[2018-01-22T20:33:46.91] debugger>  checking thread2
[2018-01-22T20:33:46.91] debugger>  checking Common-Cleaner
[2018-01-22T20:33:46.91] debugger>  checking main
[2018-01-22T20:33:46.91] debugger>  checking Signal Dispatcher
[2018-01-22T20:33:46.91] debugger>  checking Finalizer
[2018-01-22T20:33:46.91] debugger>  ......-->  vm.resume()
[2018-01-22T20:33:46.91] debugger>  shouldRunAfterBreakpoint: entered
[2018-01-22T20:33:46.91] debugger>  shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec.
[2018-01-22T20:33:46.91] debugee.stderr>  **>  debuggee:   'run': enter  :: threadName == thread2
[2018-01-22T20:33:46.91] debugee.stderr>  **>  debuggee:   'run': exit   :: threadName == thread2
[2018-01-22T20:33:46.91] EventHandler>  Received event set with policy = SUSPEND_ALL
[2018-01-22T20:33:46.91] EventHandler>  Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
[2018-01-22T20:33:46.91] debugger>  Received communication breakpoint event.
[2018-01-22T20:33:46.91] debugger>  shouldRunAfterBreakpoint: received breakpoint event.
[2018-01-22T20:33:46.91] debugger>  shouldRunAfterBreakpoint: received instruction from debuggee to finish.
[2018-01-22T20:33:46.91] debugger>  shouldRunAfterBreakpoint: exited with false.
[2018-01-22T20:33:46.91] debugger>  TESTING ENDS
[2018-01-22T20:33:46.91] debugger>  Waiting for debuggee's exit...
[2018-01-22T20:33:46.91] EventHandler>  waitForVMDisconnect
[2018-01-22T20:33:46.91] debugee.stderr>  **>  debuggee: debuggee exits
[2018-01-22T20:33:46.92] EventHandler>  Received event set with policy = SUSPEND_NONE
[2018-01-22T20:33:46.92] EventHandler>  Event: VMDeathEventImpl req null
[2018-01-22T20:33:46.92] EventHandler>  receieved VMDeath
[2018-01-22T20:33:46.92] EventHandler>  Removing listener nsk.share.jdi.EventHandler$3 at 77f99a05
[2018-01-22T20:33:47.25] EventHandler>  Received event set with policy = SUSPEND_NONE
[2018-01-22T20:33:47.25] EventHandler>  Event: VMDisconnectEventImpl req null
[2018-01-22T20:33:47.25] EventHandler>  receieved VMDisconnect
[2018-01-22T20:33:47.25] EventHandler>  Removing listener nsk.share.jdi.EventHandler$4 at 3aeaafa6
[2018-01-22T20:33:47.25] EventHandler>  finished
[2018-01-22T20:33:47.25] EventHandler>  waitForVMDisconnect: done
[2018-01-22T20:33:47.25] debugger>  Event handler thread exited.
[2018-01-22T20:33:47.25] debugger>  Debuggee PASSED.
[2018-01-22T20:33:47.26]
[2018-01-22T20:33:47.26]
[2018-01-22T20:33:47.26] #>
[2018-01-22T20:33:47.26] #>   SUMMARY: Following errors occured
[2018-01-22T20:33:47.26] #>       during test execution:
[2018-01-22T20:33:47.26] #>
[2018-01-22T20:33:47.26] # ERROR: debugger>  ERROR: suspendCounts don't match for : Reference Handler
[2018-01-22T20:33:47.26] # ERROR: debugger>  before resuming : 1
[2018-01-22T20:33:47.26] # ERROR: debugger>  after  resuming : 2
[2018-01-22T20:33:47.27] # Test level exit status: 97


Here's a recent passed log from a local run:

----------System.out:(164/9808)----------
run [nsk.jdi.EventSet.resume.resume008, -verbose, -arch=linux-x64, 
-waittime=5, -debugee.vmkind=java, -transport.address=dynamic, 
-debugee.vmkeys=-XX:MaxRAMPercentage=2 ]
binder> VirtualMachineManager: version 11.0
binder> Finding connector: default
binder> LaunchingConnector:
binder>     name: com.sun.jdi.CommandLineLaunch
binder>     description: Launches target using Sun Java VM command line 
and attaches to it
binder>     transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 749dec1a
binder> Connector arguments:
binder>     home=/export/users/gradams/ws/jdk-jdk/build/linux-x64/images/jdk
binder>     vmexec=java
binder>     options=-XX:MaxRAMPercentage=2
binder>     main=nsk.jdi.EventSet.resume.resume008a "-verbose" 
"-arch=linux-x64" "-waittime=5" "-debugee.vmkind=java" 
"-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=2 " 
"-pipe.port=35940"
binder>     quote="
binder>     suspend=true
binder> Launching debugee
binder> Waiting for VM initialized
Initial VMStartEvent received: VMStartEvent in thread main
EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 2ab41d39
EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 2e3cb1e2
EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 57f20df9
EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 6e72e291
EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 5889e23e
EventHandler> waitForRequestedEvent: enabling remove of listener 
nsk.share.jdi.EventHandler$6 at 46dcda7f
EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 46dcda7f
EventHandler> waitForRequestedEvent: vm.resume called
EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
EventHandler> Event: ClassPrepareEventImpl req class prepare request  
(enabled)
EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent in 
thread main) for request(class prepare request  (enabled))
EventHandler> Removing listener nsk.share.jdi.EventHandler$6 at 46dcda7f
debugger> Received ClassPrepareEvent for debuggee class: 
nsk.jdi.EventSet.resume.resume008a
binder> Breakpoint set:
     breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (disabled)
EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 322c2a05
debugger> TESTING BEGINS
debugger> RESUME DEBUGGEE VM
debugger> shouldRunAfterBreakpoint: entered
debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 
1 sec.

debugee.stderr> **> debuggee: debuggee started!
EventHandler> Received event set with policy = SUSPEND_ALL
EventHandler> Event: BreakpointEventImpl req breakpoint request 
nsk.jdi.EventSet.resume.resume008a:74 (enabled)
debugger> Received communication breakpoint event.

debugger> shouldRunAfterBreakpoint: received breakpoint event.
debugger> shouldRunAfterBreakpoint: exited with true.
debugger> :::::: case: # 0
debugger> ......waiting for new ThreadStartEvent : 0

EventHandler> waitForRequestedEventSet: enabling remove of listener 
nsk.share.jdi.EventHandler$7 at 78aa490d
EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 78aa490d
EventHandler> waitForRequestedEventSet: vm.resume called
EventHandler> Received event set with policy = SUSPEND_NONE
debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread0
EventHandler> waitForRequestedEventSet: Received event set for request: 
thread start request  (enabled)
EventHandler> Event: ThreadStartEventImpl req thread start request  
(enabled)
EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 78aa490d
EventHandler> Received event set with policy = SUSPEND_ALL
EventHandler> Event: BreakpointEventImpl req breakpoint request 
nsk.jdi.EventSet.resume.resume008a:74 (enabled)
debugger> Received communication breakpoint event.

debugger>        got new ThreadStartEvent with propety 'number' == 
ThreadStartRequest1
debugger> ......checking up on EventSet.resume()
debugger> ......--> vm.suspend();
debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
Signal Dispatcher=2, Finalizer=2}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
Signal Dispatcher=2, Finalizer=2}
debugger>         getting : int policy = eventSet.suspendPolicy();
debugger>         case SUSPEND_NONE
debugger>         checking Reference Handler
debugger>         checking thread0
debugger>         checking Common-Cleaner
debugger>         checking main
debugger>         checking Signal Dispatcher
debugger>         checking Finalizer
debugger> ......--> vm.resume()
debugger> shouldRunAfterBreakpoint: entered
debugger> shouldRunAfterBreakpoint: received breakpoint event.
debugger> shouldRunAfterBreakpoint: exited with true.
debugger> :::::: case: # 1
debugger> ......waiting for new ThreadStartEvent : 1
EventHandler> waitForRequestedEventSet: enabling remove of listener 
nsk.share.jdi.EventHandler$7 at 616bc3ae
EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae
EventHandler> waitForRequestedEventSet: vm.resume called
EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread0
EventHandler> waitForRequestedEventSet: Received event set for request: 
thread start request  (enabled)
EventHandler> Event: ThreadStartEventImpl req thread start request  
(enabled)
EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 616bc3ae
debugger>        got new ThreadStartEvent with propety 'number' == 
ThreadStartRequest2
debugger> ......checking up on EventSet.resume()
debugger> ......--> vm.suspend();
debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, 
Signal Dispatcher=1, Finalizer=1}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, 
Signal Dispatcher=1, Finalizer=1}
debugger>         getting : int policy = eventSet.suspendPolicy();
debugger>         case SUSPEND_THREAD
debugger> checking Reference Handler
debugger> checking thread1
debugger> checking Common-Cleaner
debugger> checking main
debugger> checking Signal Dispatcher
debugger> checking Finalizer
debugger> ......--> vm.resume()
debugger> shouldRunAfterBreakpoint: entered
debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 
1 sec.
debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread1
debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread1
EventHandler> Received event set with policy = SUSPEND_ALL
EventHandler> Event: BreakpointEventImpl req breakpoint request 
nsk.jdi.EventSet.resume.resume008a:74 (enabled)
debugger> Received communication breakpoint event.
debugger> shouldRunAfterBreakpoint: received breakpoint event.
debugger> shouldRunAfterBreakpoint: exited with true.
debugger> :::::: case: # 2
debugger> ......waiting for new ThreadStartEvent : 2
EventHandler> waitForRequestedEventSet: enabling remove of listener 
nsk.share.jdi.EventHandler$7@44e265ef
EventHandler> Adding listener nsk.share.jdi.EventHandler$7@44e265ef
EventHandler> waitForRequestedEventSet: vm.resume called
EventHandler> Received event set with policy = SUSPEND_ALL
EventHandler> waitForRequestedEventSet: Received event set for request: 
thread start request  (enabled)
EventHandler> Event: ThreadStartEventImpl req thread start request  
(enabled)
EventHandler> Removing listener nsk.share.jdi.EventHandler$7@44e265ef
debugger>        got new ThreadStartEvent with propety 'number' == 
ThreadStartRequest3
debugger> ......checking up on EventSet.resume()
debugger> ......--> vm.suspend();
debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, 
Signal Dispatcher=2, Finalizer=2}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, 
Signal Dispatcher=1, Finalizer=1}
debugger>         getting : int policy = eventSet.suspendPolicy();
debugger>         case SUSPEND_ALL
debugger> checking Reference Handler
debugger> checking thread2
debugger> checking Common-Cleaner
debugger> checking main
debugger> checking Signal Dispatcher
debugger> checking Finalizer
debugger> ......--> vm.resume()
debugger> shouldRunAfterBreakpoint: entered
debugger> shouldRunAfterBreakpoint: waiting for breakpoint event during 
1 sec.
debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread2
debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread2
EventHandler> Received event set with policy = SUSPEND_ALL
EventHandler> Event: BreakpointEventImpl req breakpoint request 
nsk.jdi.EventSet.resume.resume008a:74 (enabled)
debugger> Received communication breakpoint event.
debugger> shouldRunAfterBreakpoint: received breakpoint event.
debugger> shouldRunAfterBreakpoint: received instruction from debuggee 
to finish.
debugger> shouldRunAfterBreakpoint: exited with false.
debugger> TESTING ENDS
debugger> Waiting for debuggee's exit...
debugee.stderr> **> debuggee: debuggee exits
EventHandler> waitForVMDisconnect
EventHandler> Received event set with policy = SUSPEND_NONE
EventHandler> Event: VMDeathEventImpl req null
EventHandler> receieved VMDeath
EventHandler> Removing listener nsk.share.jdi.EventHandler$3@57f20df9
EventHandler> Received event set with policy = SUSPEND_NONE
EventHandler> Event: VMDisconnectEventImpl req null
EventHandler> receieved VMDisconnect
EventHandler> Removing listener nsk.share.jdi.EventHandler$4@6e72e291
EventHandler> finished
EventHandler> waitForVMDisconnect: done
debugger> Event handler thread exited.
debugger> Debuggee PASSED.

On 7/18/18, 6:09 PM, gary.adams at oracle.com wrote:
> On 7/18/18 4:47 PM, Chris Plummer wrote:
>> Hi Gary
>>
>> Ok, so shouldRunAfterBreakpoint() is the code that does the 
>> eventHandler.wait(), so it gets the eventHandler.notifyAll() 
>> notification from the BreakpointEvent handler.
>>
>> And as a side note, I see now that resumption of execution after the 
>> breakpoint at main() is done by:
>>
>>             // after waitForClassPrepared() main debuggee thread is 
>> suspended, resume it before test start
>>             display("RESUME DEBUGGEE VM");
>>             vm.resume();
>>
>>             testRun();
>>
>> shouldRunAfterBreakpoint() is returning true until the end of the 
>> test when the debuggee executes "instruction = end". That's why 
>> runTests() does a "break" when shouldRunAfterBreakpoint() returns 
>> false. So this means the code that is checking 
>> shouldRunAfterBreakpoint() is not resuming execution for the first 
>> few (probably 3) methodForCommunication() breakpoints. However, it 
>> does make sure that runTests() blocks until the BreakPointEvent has 
>> been processed.
>>
>> You point out the vm.resume() at the bottom of the loop in 
>> runTests(), but that's only after a bunch of ThreadStartEvent 
>> processing above it has been done already. The ThreadStartEvent would 
>> never get generated if there was not a resume some point earlier. I 
>> think it is happening during the 
>> eventHandler.waitForRequestedEventSet() call, which does a vm.resume().
>>
>> So if I understand the order of things now:
>>
>> -shouldRunAfterBreakpoint() returns after first 
>> methodForCommunication() is hit. At this point we know the first 
>> thread has been created, but no attempt to start it yet. The debuggee 
>> is suspended at this point.
>> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also 
>> does a vm.resume().
>> -The debuggee starts the thread and then does another 
>> methodForCommunication() (this 2nd one is actually after the 2nd 
>> thread has been created, but not yet started). Now we have a race. Do 
>> we get the ThreadStartEvent first or the BreakpointEvent. This is 
>> because when the ThreadStartEvent is generated, the thread is not 
>> suspended due to SUSPEND_NONE. Even if the ThreadStartEvent comes in 
>> first, the async handling of the BreakpointEvent can cause problems 
>> during the ThreadStartEvent processing.
> Based on the failed log in the bug report, the thread start event is 
> observed,
> the suspend counts acquired, then after the resume, the breakpoint 
> message
> is displayed and the second set of suspend counts acquired.
>
> I can show you the passed and failed logs tomorrow.
>> -You added a 100ms delay after the thread has started, but before 
>> methodForCommunication(), hoping it will make it so the 
>> ThreadStartEvent can be received and fully processed before the 
>> BreakpointEvent is.
> The delay is mostly just a yield so the debugger gets a chance to run.
>>
>> I think it would be preferable to fix this by doing better 
>> synchronization. After all, that is the approach the test originally 
>> took. It could have been written with a bunch of sleep() delays 
>> instead, but that in general is not a very good approach.
>>
>> What if you added a shouldRunAfterBreakpoint() call after the 
>> ThreadStartEvent arrives. At this point you would know that the vm is 
>> suspended due to the breakpoint, so no need for:
>>
>>                 display("......checking up on EventSet.resume()");
>>                 display("......--> vm.suspend();");
>>                 vm.suspend();
> I think the suspend is intentional to capture the suspend counts.
> It also needs to resume the vm and acquire again so it can confirm the 
> correct
> suspend count behaviors.
> If the test waits to capture the second set of suspend counts, the 
> breakpoint
> causes incorrect values.
>
> ...
>>
>> You might then also need to add another methodForCommunication() call 
>> at the end of case 0 and 1 in the debuggee, although I think you 
>> could instead just change the shouldRunAfterBreakpoint() at the start 
>> of the loop. I think that check actually belongs at the end of the 
>> loop, and only for case 2. In fact it would be an error if 
>> shouldRunAfterBreakpoint() did not return true in that case. Then you 
>> also need to add a shouldRunAfterBreakpoint() at the start of case 0 
>> to get things rolling (and I think at the start of case 1 also).
>>
>> Chris
>>
>>
>> On 7/18/18 12:45 PM, Gary Adams wrote:
>>> Answers below  ...
>>>
>>> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>>>> Hi Gary,
>>>>
>>>> Who does the resume for the breakpoint event?
>>>>
>>>>         eventHandler.addListener(
>>>>              new EventHandler.EventListener() {
>>>>                  public boolean eventReceived(Event event) {
>>>>                     if (event instanceof BreakpointEvent && 
>>>> bpRequest.equals(event.request())) {
>>>>                         synchronized(eventHandler) {
>>>>                             display("Received communication 
>>>> breakpoint event.");
>>>>                             bpCount++;
>>>>                             eventHandler.notifyAll();
>>>>                         }
>>>>                         return true;
>>>>                     }
>>>>                     return false;
>>>>                  }
>>>>              }
>>>>         );
>>> I believe you are looking for this sequence.
>>> At the top of the loop a check is made ("shouldRunAfterBreakpoint")
>>> whether resume() should be called.
>>> Lines 96-99 are the early termination path. And at the
>>> bottom of the loop, line 240 is the normal path that
>>> continues the test to the next case.
>>>
>>> resume008.java :
>>> ...
>>>     94            for (int i = 0; ; i++) {
>>>     95
>>>
>>>     96                if (!shouldRunAfterBreakpoint()) {
>>>     97                    vm.resume();
>>>     98                    break;
>>>     99                }
>>>
>>> 100
>>>    101
>>>    102                display(":::::: case: # " + i);
>>>    103
>>>    104                switch (i) {
>>>    105
>>>    106                    case 0:
>>>    107                    eventRequest = settingThreadStartRequest (
>>>    108                                           SUSPEND_NONE, 
>>> "ThreadStartRequest1");
>>> ...
>>>   238
>>>    239                display("......--> vm.resume()");
>>>    240                vm.resume();
>>>    241            }
>>>>
>>>> Also:
>>>>
>>>>>   1. On a thread start event the debugee is suspended, line 141 
>>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE 
>>>> was used.
>>> The thread start event is set to SUSPEND_NONE for thread0, but when
>>> the thread start event is observed the resume008 test suspends the vm
>>> immediately after fetching the "number" property.
>> My point is that the Debuggee continues to run after the 
>> ThreadStartEvent is sent, and relies on the debugger to stop it after 
>> receiving the event. But in the meantime the debuggee has advanced to 
>> the next breakpoint, but only sometimes, thus the bug you are seeing.
>>>
>>>    132                if ( !(newEvent instanceof ThreadStartEvent)) {
>>>    133                    setFailedStatus("ERROR: new event is not 
>>> ThreadStartEvent");
>>>    134                } else {
>>>    135
>>>    136                    String property = (String) 
>>> newEvent.request().getProperty("number");
>>>    137                    display("       got new ThreadStartEvent 
>>> with propety 'number' == " + property);
>>>    138
>>>    139                    display("......checking up on 
>>> EventSet.resume()");
>>>    140                    display("......--> vm.suspend();");
>>>    141                    vm.suspend();
>>>
>>>
>>>>
>>>> Chris
>>>>
>>>> On 7/18/18 4:52 AM, Gary Adams wrote:
>>>>> There is nothing wrong with the breakpoint in methodForCommunication.
>>>>> The test uses it to make sure the threads are each tested separately.
>>>>> The breakpoint eventhandler just displays a message, increments a 
>>>>> counter
>>>>> and returns.
>>>>>
>>>>> Let me step through resume008a the debugee to help clarify ...
>>>>>
>>>>> 1. The test thread is created and the synchronized break point is 
>>>>> observed. lines 101-102
>>>>> 2. The thread is started. lines 104,135-137
>>>>>     2a. The main thread blocks on a local object. lines 133, 139
>>>>>     2b. The test thread is started. lines 137,
>>>>>            A run entered message is displayed, line 159
>>>>>            The main thread lock object is notified, line 167
>>>>>           2b1. The main thread continues. line 167, 146
>>>>>                   The next test thread is created. line 106
>>>>>                   The synchronized breakpoint is observed, line 107
>>>>>           2b2. A run exited message is displayed, line 169
>>>>>
>>>>> On the resume008 debugger side  ...
>>>>>   1. On a thread start event the debugee is suspended, line 141
>>>>>   2. Messages are displayed and a first set of thread suspend 
>>>>> counts is acquired. lines 143-151
>>>>>   3. The threads are resumed, line 152
>>>>> --->
>>>>>   4.  Messages are displayed and a second set of thread suspend 
>>>>> counts is acquired. lines 154-159
>>>>>
>>>>> The way the test is written the expectation is the debugger steps 
>>>>> 2,3,4 will all happen
>>>>> while the test thread is running.
>>>>>
>>>>> When the debugger resumes the debuggee threads (debugger step 3)
>>>>> the debuggee continues from where it left off (debuggee steps 
>>>>> 2b,2b1,2b2)
>>>>>
>>>>> If we complete debuggee step 2b1 (line 107) before the debugger 
>>>>> completes step 4 line 159,
>>>>> then the synchronized breakpoint will suspend the vm and the 
>>>>> counts will not match
>>>>> for the SUSPEND_NONE test thread start.
>>>>>
>>>>> resume008a.java:
>>>>>
>>>>>    100                        case 0:
>>>>>    101                                thread0 = new 
>>>>> Threadresume008a("thread0");
>>>>>    102 methodForCommunication();
>>>>>    103
>>>>>    104                                threadStart(thread0);
>>>>>    105
>>>>>    106                                thread1 = new 
>>>>> Threadresume008a("thread1");
>>>>>    107 methodForCommunication();
>>>>>    108                                break;
>>>>>
>>>>>    ...
>>>>>    135        static int threadStart(Thread t) {
>>>>>    136            synchronized (waitnotifyObj) {
>>>>>    137                t.start();
>>>>>    138                try {
>>>>>    139                    waitnotifyObj.wait();
>>>>>    140                } catch ( Exception e) {
>>>>>    141                    exitCode = FAILED;
>>>>>    142                    logErr("       Exception : " + e );
>>>>>    143                    return FAILED;
>>>>>    144                }
>>>>>    145            }
>>>>>    146            return PASSED;
>>>>>    147        }
>>>>>
>>>>>    149        static class Threadresume008a extends Thread {
>>>>>    ...
>>>>>    157
>>>>>    158            public void run() {
>>>>>    159                log1("  'run': enter  :: threadName == " + 
>>>>> tName);
>>>>>
>>>>> This is the proposed fix that will let the debugger complete its 
>>>>> second
>>>>> acquisition of suspend counts while the test thread is still running.
>>>>>
>>>>>    160                // Yield, so the start thread event 
>>>>> processing can be completed.
>>>>>    161                try {
>>>>>    162                    Thread.sleep(100);
>>>>>    163                } catch (InterruptedException e) {
>>>>>    164                    // ignored
>>>>>    165                }
>>>>>
>>>>>    166                synchronized (waitnotifyObj) {
>>>>>    167                        waitnotifyObj.notify();
>>>>>    168                }
>>>>>    169                log1("  'run': exit   :: threadName == " + 
>>>>> tName);
>>>>>    170                return;
>>>>>    171            }
>>>>>    172        }
>>>>>    150
>>>>>    151            String tName = null;
>>>>>    152
>>>>>    153            public Threadresume008a(String threadName) {
>>>>>    154                super(threadName);
>>>>>    155                tName = threadName;
>>>>>    156            }
>>>>>    157
>>>>>    158            public void run() {
>>>>>    159                log1("  'run': enter  :: threadName == " + 
>>>>> tName);
>>>>>    160                // Yield, so the start thread event 
>>>>> processing can be completed.
>>>>>    161                try {
>>>>>    162                    Thread.sleep(100);
>>>>>    163                } catch (InterruptedException e) {
>>>>>    164                    // ignored
>>>>>    165                }
>>>>>    166                synchronized (waitnotifyObj) {
>>>>>    167                        waitnotifyObj.notify();
>>>>>    168                }
>>>>>    169                log1("  'run': exit   :: threadName == " + 
>>>>> tName);
>>>>>    170                return;
>>>>>    171            }
>>>>>    172        }
>>>>>
>>>>>
>>>>>
>>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>>>>>> Hi Gary,
>>>>>>
>>>>>> I've been having trouble following the control flow of this test. 
>>>>>> One thing I've stumbled across is the following:
>>>>>>
>>>>>>             /* A debuggee class must define 'methodForCommunication'
>>>>>>              * method and invoke it in points of synchronization
>>>>>>              * with a debugger.
>>>>>>              */
>>>>>> setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>>>>>
>>>>>> So why isn't this mode of synchronization good enough? Is it 
>>>>>> because it was not designed with the understanding that the 
>>>>>> debugger might be doing suspended thread counts, and suspending 
>>>>>> all threads at the breakpoint messes up the test?
>>>>>>
>>>>>> From what I can tell of the test, after the debuggee is started 
>>>>>> and hits the default breakpoint at the start of main(), the 
>>>>>> debugger then does a vm.resume() at the start of the for loop in 
>>>>>> the runTest() method. The debuggee then creates a thread and 
>>>>>> calls methodForCommunication(). There is already a breakpoint set 
>>>>>> there by the above debuggee code. It's unclear to me what happens 
>>>>>> as a result of this breakpoint and how it serves the test. Also 
>>>>>> unclear to me who is responsible for the vm.resume() after the 
>>>>>> breakpoint is hit.
>>>>>>
>>>>>> The debugger then requests all ThreadStart events, requesting 
>>>>>> that no threads be disabled when it is sent. I think you are 
>>>>>> saying that when the ThreadStart event comes in, sometimes we are 
>>>>>> at the methodForCommunication breakpoint, with all threads 
>>>>>> disabled, and this messes up the thread suspend counts. You want 
>>>>>> to delay 100ms so the breakpoint event can be processed and 
>>>>>> threads resumed again (although I can't see who actually resumes 
>>>>>> the thread after hitting the methodForCommunication breakpoint).
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>>>>>> A race condition exists between the debugger and the debuggee.
>>>>>>>
>>>>>>> The first test thread is started with SUSPEND_NONE policy set.
>>>>>>> While processing the thread start event the debugger captures
>>>>>>> an initial set of thread suspend counts and resumes the
>>>>>>> debuggee vm. If the debuggee advances quickly it reaches
>>>>>>> the breakpoint set for methodForCommunication. Since the breakpoint
>>>>>>> carries with it SUSPEND_ALL policy, when the debugger captures a 
>>>>>>> second
>>>>>>> set of suspend counts, it will not match the expected counts for
>>>>>>> a SUSPEND_NONE scenario.
>>>>>>>
>>>>>>> The proposed fix introduces a yield in the debuggee test thread 
>>>>>>> run method
>>>>>>> to allow the debugger to get the expected sampled values.
>>>>>>>
>>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>>>>>
>>>>>>>
>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>>>>>>> ...
>>>>>>>    186        private void 
>>>>>>> setCommunicationBreakpoint(ReferenceType refType, String 
>>>>>>> methodName) {
>>>>>>>    187            Method method = debuggee.methodByName(refType, 
>>>>>>> methodName);
>>>>>>>    188            Location location = null;
>>>>>>>    189            try {
>>>>>>>    190                location = method.allLineLocations().get(0);
>>>>>>>    191            } catch (AbsentInformationException e) {
>>>>>>>    192                throw new Failure(e);
>>>>>>>    193            }
>>>>>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>>>>>    195
>>>>>>>
>>>>>>>    196 bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>>>>>
>>>>>>>    197            bpRequest.putProperty("number", "zero");
>>>>>>>    198            bpRequest.enable();
>>>>>>>    199
>>>>>>>    200            eventHandler.addListener(
>>>>>>>    201                 new EventHandler.EventListener() {
>>>>>>>    202                     public boolean eventReceived(Event 
>>>>>>> event) {
>>>>>>>    203                        if (event instanceof 
>>>>>>> BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>>>    204 synchronized(eventHandler) {
>>>>>>>    205                                display("Received 
>>>>>>> communication breakpoint event.");
>>>>>>>    206                                bpCount++;
>>>>>>>    207 eventHandler.notifyAll();
>>>>>>>    208                            }
>>>>>>>    209                            return true;
>>>>>>>    210                        }
>>>>>>>    211                        return false;
>>>>>>>    212                     }
>>>>>>>    213                 }
>>>>>>>    214            );
>>>>>>>    215        }
>>>>>>>
>>>>>>>
>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: 
>>>>>>>
>>>>>>> ...
>>>>>>>    140                    display("......--> vm.suspend();");
>>>>>>>    141                    vm.suspend();
>>>>>>>    142
>>>>>>>    143                    display("        getting : Map<String, 
>>>>>>> Integer> suspendsCounts1");
>>>>>>>    144
>>>>>>>    145                    Map<String, Integer> suspendsCounts1 = 
>>>>>>> new HashMap<String, Integer>();
>>>>>>>    146                    for (ThreadReference threadReference : 
>>>>>>> vm.allThreads()) {
>>>>>>>    147 suspendsCounts1.put(threadReference.name(), 
>>>>>>> threadReference.suspendCount());
>>>>>>>    148                    }
>>>>>>>    149 display(suspendsCounts1.toString());
>>>>>>>    150
>>>>>>>    151                    display(" eventSet.resume;");
>>>>>>>    152                    eventSet.resume();
>>>>>>>    153
>>>>>>>    154                    display("        getting : Map<String, 
>>>>>>> Integer> suspendsCounts2");
>>>>>>>
>>>>>>> This is where the breakpoint is encountered before the second 
>>>>>>> set of suspend counts is acquired.
>>>>>>>
>>>>>>>    155                    Map<String, Integer> suspendsCounts2 = 
>>>>>>> new HashMap<String, Integer>();
>>>>>>>    156                    for (ThreadReference threadReference : 
>>>>>>> vm.allThreads()) {
>>>>>>>    157 suspendsCounts2.put(threadReference.name(), 
>>>>>>> threadReference.suspendCount());
>>>>>>>    158                    }
>>>>>>>    159 display(suspendsCounts2.toString());
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
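
The start handshake in the quoted resume008a.java excerpt (threadStart() waits on
waitnotifyObj while holding its monitor, and run() notifies it) can be sketched as
a standalone program. This is only an illustrative sketch under stated assumptions,
not the test's code: the class names StartHandshake and Worker and the "started"
guard flag are hypothetical additions, and the flag is there so a spurious wakeup
cannot end the wait early.

```java
class StartHandshake {
    // Shared monitor, named after the one in the quoted debuggee code.
    private static final Object waitnotifyObj = new Object();
    // Hypothetical guard flag (not in the quoted test).
    private static boolean started = false;

    static class Worker extends Thread {
        Worker(String name) {
            super(name);
        }

        @Override
        public void run() {
            // Tell the starter that this thread is running.
            synchronized (waitnotifyObj) {
                started = true;
                waitnotifyObj.notify();
            }
        }
    }

    // Starts t and blocks until t has signalled from run().
    // The monitor is held across start(), so the notification cannot be missed.
    static boolean threadStart(Thread t) {
        synchronized (waitnotifyObj) {
            started = false;
            t.start();
            try {
                while (!started) {
                    waitnotifyObj.wait(); // releases the monitor while waiting
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("thread0 started: " + threadStart(new Worker("thread0")));
    }
}
```

Note that this handshake only orders the debuggee's own threads. It says nothing
about when the debugger has finished processing the ThreadStartEvent, which is the
ordering the thread above is trying to establish.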


From daniel.daugherty at oracle.com  Thu Jul 19 13:46:52 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 19 Jul 2018 09:46:52 -0400
Subject: RFR(XS): 8207819: Problem list
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To: <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com>
References: <c19f5f7d-6c51-457f-d240-db48ab53580d@oracle.com>
 <a768ea17-4f76-d860-c691-fb89fc20a98b@oracle.com>
 <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com>
Message-ID: <bf98cf63-6cac-bbbd-dd6e-d1c88868c819@oracle.com>

JDK-8207765 covers two different tests as of yesterday:

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java

and

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java

I updated it to add a similar failure mode sighting for 
HeapMonitorStatIntervalTest.java

Dan


On 7/18/18 11:49 PM, serguei.spitsyn at oracle.com wrote:
> Thanks, Chris!
> This meets the Trivial Change policy, so I am pushing it now.
>
> Thanks,
> Serguei
>
>
> On 7/18/18 20:47, Chris Plummer wrote:
>> Looks good.
>>
>> Chris
>>
>> On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
>>> Please, review the fix for sub-task:
>>>   https://bugs.openjdk.java.net/browse/JDK-8207819
>>>
>>>
>>> The test HeapMonitorStatRateTest.java needs to be problem listed 
>>> until main bug is fixed
>>>   https://bugs.openjdk.java.net/browse/JDK-8207765
>>>
>>>
>>> The patch is:
>>>
>>> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
>>> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
>>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
>>> @@ -81,6 +81,7 @@
>>>
>>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>>>  serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
>>> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>>>
>>>  #############################################################################
>>>
>>>
>>>
>>> Thanks,
>>> Serguei
>>
>>
>>
>
>


From serguei.spitsyn at oracle.com  Thu Jul 19 13:55:28 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 06:55:28 -0700
Subject: RFR(XS): 8207819: Problem list
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
In-Reply-To: <bf98cf63-6cac-bbbd-dd6e-d1c88868c819@oracle.com>
References: <c19f5f7d-6c51-457f-d240-db48ab53580d@oracle.com>
 <a768ea17-4f76-d860-c691-fb89fc20a98b@oracle.com>
 <5c714d80-2216-84bd-5391-a7fbcccd34a7@oracle.com>
 <bf98cf63-6cac-bbbd-dd6e-d1c88868c819@oracle.com>
Message-ID: <27fd8f83-682a-738d-8203-cd20e8bf2556@oracle.com>

Hi Dan,

Thank you, Dan.
I've just discovered the same in the recent mach5 test results.
Sorry for overlooking it.
Will need another sub-task for this now.

Thanks,
Serguei

On 7/19/18 06:46, Daniel D. Daugherty wrote:
> JDK-8207765 covers two different tests as of yesterday:
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>
> and
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java 
>
>
> I updated it to add a similar failure mode sighting for 
> HeapMonitorStatIntervalTest.java
>
> Dan
>
>
> On 7/18/18 11:49 PM, serguei.spitsyn at oracle.com wrote:
>> Thanks, Chris!
>> This meets the Trivial Change policy, so I am pushing it now.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/18/18 20:47, Chris Plummer wrote:
>>> Looks good.
>>>
>>> Chris
>>>
>>> On 7/18/18 8:32 PM, serguei.spitsyn at oracle.com wrote:
>>>> Please, review the fix for sub-task:
>>>>   https://bugs.openjdk.java.net/browse/JDK-8207819
>>>>
>>>>
>>>> The test HeapMonitorStatRateTest.java needs to be problem listed 
>>>> until main bug is fixed
>>>>   https://bugs.openjdk.java.net/browse/JDK-8207765
>>>>
>>>>
>>>> The patch is:
>>>>
>>>> diff -r 3c0a5bf931e4 test/hotspot/jtreg/ProblemList.txt
>>>> --- a/test/hotspot/jtreg/ProblemList.txt    Thu Jul 19 10:30:24 2018 +0800
>>>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 18 20:27:10 2018 -0700
>>>> @@ -81,6 +81,7 @@
>>>>
>>>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>>>>  serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
>>>> +serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java 8207765 generic-all
>>>>
>>>>  #############################################################################
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>
>>>
>>>
>>
>>
>


From yasuenag at gmail.com  Thu Jul 19 14:03:24 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 19 Jul 2018 23:03:24 +0900
Subject: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is working
Message-ID: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com>

Hi all,

Please review this webrev.

      JBS: https://bugs.openjdk.java.net/browse/JDK-8207843
   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/

I encountered the following AssertionFailure when I attached HSDB to a process running with ZGC:

sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap
     at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32)
     at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448)
     at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173)
     at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741)
     at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70)
     at java.base/java.lang.Thread.run(Thread.java:832)

ObjectHeap#collectLiveRegions() branches on the instance type of CollectedHeap, but it does not support ZCollectedHeap.
So I added ZCollectedHeap to it, along with some methods to iterate ZPageTable.


Thanks,

Yasumasa
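
The kind of instance-type dispatch described above can be illustrated with a small
standalone sketch. Everything in it is a hypothetical stand-in: the nested types
only mimic the shape of the SA heap classes, the returned strings are placeholders,
and IllegalStateException is used here in place of the agent's AssertionFailure.
It is not the code in the webrev.

```java
class HeapDispatch {
    // Hypothetical stand-ins for the serviceability agent's heap types.
    interface CollectedHeap {}
    static final class G1Heap implements CollectedHeap {}
    static final class ZHeap implements CollectedHeap {}
    static final class OtherHeap implements CollectedHeap {}

    // Picks an iteration strategy from the runtime type of the heap.
    // A type without a branch falls through to the failure at the end,
    // which mirrors the situation in the reported stack trace.
    static String describe(CollectedHeap heap) {
        if (heap instanceof G1Heap) {
            return "iterate G1 regions";
        } else if (heap instanceof ZHeap) {
            return "iterate Z page table";
        }
        throw new IllegalStateException(
                "Unexpected CollectedHeap type: " + heap.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(describe(new ZHeap()));
        try {
            describe(new OtherHeap());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```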

From erik.helin at oracle.com  Thu Jul 19 14:57:25 2018
From: erik.helin at oracle.com (Erik Helin)
Date: Thu, 19 Jul 2018 16:57:25 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <CAJmnTFiQmyxU9HAawnSa9r75hkmAT9115C8F4Y0T_rAK=TXQXg@mail.gmail.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>
 <CAJmnTFi23XsD5zzZT6b+YSPZevyAzPSbcoLNAZ+J5szj7ZQjNA@mail.gmail.com>
 <b12eaec322cc513348a9155fbcdf2b916a43559b.camel@oracle.com>
 <CAJmnTFiQmyxU9HAawnSa9r75hkmAT9115C8F4Y0T_rAK=TXQXg@mail.gmail.com>
Message-ID: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com>

On 07/13/2018 04:10 PM, Daniel Mitterdorfer wrote:
> Hi,
> 
> I have good news. I was able to reproduce this issue but this time I
> have logs. A test failed with the following stack trace around
> 15:06:55 with:
> 
> java.lang.IllegalArgumentException: committed = 537919488 should be <
> max = 536870912
>     >    at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
>     >    at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
>     >    at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
>     >    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)
> 
> This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10
> (build 10+46). The JVM arguments were:
> 
> -Xms512M -Xmx512M
> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> 
> The logs are somewhat massive (~250MB uncompressed) and available at
> https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0

Thanks for the logs, Daniel, they helped a lot! Thomas and I looked 
through the logs and the code and, as we suspected, this code is a bit 
buggy :/ Please see the bug for more details:

https://bugs.openjdk.java.net/browse/JDK-8207200

Again, thanks for taking the time to report this issue and for 
getting us the logs, much appreciated!
Erik

> I hope that helps identify the cause. Please let me know if you
> need anything else.
> 
> Daniel
> On Fri, 13 Jul 2018 at 10:33, Thomas Schatzl
> <thomas.schatzl at oracle.com> wrote:
>>
>> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote:
>>> Hi Erik,
>>>>
>>>> Do you have any kind of GC logging from the test run where you
>>>> encountered the bug?
>>>
>>> Unfortunately, we don't have GC logging enabled by default in our
>>> test suite so the exception trace is all I got. I am now repeatedly
>>> running the test suite with the original flags (-Xms512M -Xmx512M)
>>> and also added the following logging configuration:
>>>
>>> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
>>>
>>> As soon as I get another failure, I'll provide the full log file.
>>> Please let me know if you need any other logs (i.e. whether I should
>>> adjust my log configuration).
>>
>>    I think these flags are fine.
>>
>> Since Erik and I strongly believe the issue is with the relevant G1
>> code Erik mentioned, we will reassign the bug to us (he said there is
>> already a bug reported on it).
>>
>> Thanks a lot,
>>    Thomas
>>

From bob.vandette at oracle.com  Thu Jul 19 15:34:33 2018
From: bob.vandette at oracle.com (Bob Vandette)
Date: Thu, 19 Jul 2018 11:34:33 -0400
Subject: RFR: 8206456 - [TESTBUG] docker jtreg tests fail on systems
 without cpuset.effective_cpus / cpuset.effective_mem
In-Reply-To: <598c9af3-2041-0be5-f177-b3a031b2ef61@oracle.com>
References: <A8D8CB9F-D5E8-4FBF-A921-247B33CDCF6D@oracle.com>
 <598c9af3-2041-0be5-f177-b3a031b2ef61@oracle.com>
Message-ID: <9766EFD8-9220-4AC1-A5D2-71A8F9568FC4@oracle.com>


> On Jul 17, 2018, at 8:07 PM, mandy chung <mandy.chung at oracle.com> wrote:
> 
> 
> 
> On 7/17/18 7:00 AM, Bob Vandette wrote:
>> Please review this fix which eliminates some docker/cgroup test failures when running on older
>> Linux kernels with missing cgroup metric files.
>> BUGS:
>> https://bugs.openjdk.java.net/browse/JDK-8206456
>> WEBREV:
>> http://cr.openjdk.java.net/~bobv/8206456/webrev/
> 
> Nit: It would be clearer to check for the specific metrics:
> 
> int[] cpusets = metrics.getEffectiveCpuSetCpus();
> if (cpusets.length != 0) {
>    ....
> }
> 
> Same applies to getEffectiveCpuSetMems.  No need for a new webrev.

Thanks, I'll do that cleanup.

> 
> Mandy
> P.S. I am not sure the conversion from the primitive to boxed type
> is necessary.  But this is not related to this issue.  You may
> want to take a look at that.

I'll defer this issue to Harsha who wrote these tests since changing that is
out of scope for this fix.

Thanks,
Bob.


From jcbeyler at google.com  Thu Jul 19 16:39:57 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 09:39:57 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
Message-ID: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>

Hi all,

Could I have a few reviews of:
http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/

The test assumed the size of a 1-element array but ZGC changes that
assumption. The test now first allocates a bit of memory and gets the
average size of the samples before assuming the size. This works
with/without ZGC.

Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8207765

Thanks!
Jc

From daniel.daugherty at oracle.com  Thu Jul 19 16:45:14 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 19 Jul 2018 12:45:14 -0400
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
Message-ID: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>

JDK-8207765 covers two different tests as of yesterday:

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java

and

serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java

I updated it to add a similar failure mode sighting for 
HeapMonitorStatIntervalTest.java


Does your fix address both test failures?

Dan


On 7/19/18 12:39 PM, JC Beyler wrote:
> Hi all,
>
> Could I have a few reviews of:
> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ 
>
> The test assumed the size of a 1-element array but ZGC changes that 
> assumption. The test now first allocates a bit of memory and gets the 
> average size of the samples before assuming the size. This works 
> with/without ZGC.
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/ 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>
> Thanks!
> Jc


From jcbeyler at google.com  Thu Jul 19 17:07:06 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 10:07:06 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
Message-ID: <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>

Hi Dan,


serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
became
serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
when we updated the spec and said "rate" was the wrong word.

So yes, it fixes both, since at some point all branches should see the
StatRate test renamed to the StatInterval test. Does that make
sense?

Thanks!
Jc


On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty <
daniel.daugherty at oracle.com> wrote:

> JDK-8207765 covers two different tests as of yesterday:
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>
> and
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>
> I updated it to add a similar failure mode sighting for
> HeapMonitorStatIntervalTest.java
>
>
> Does your fix address both test failures?
>
> Dan
>
>
> On 7/19/18 12:39 PM, JC Beyler wrote:
>
> Hi all,
>
> Could I have a few reviews of:
> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>
> The test assumed the size of a 1-element array but ZGC changes that
> assumption. The test now first allocates a bit of memory and gets the
> average size of the samples before assuming the size. This works
> with/without ZGC.
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>
> Thanks!
> Jc
>
>
>

-- 

Thanks,
Jc

From jcbeyler at google.com  Thu Jul 19 17:08:42 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 10:08:42 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
Message-ID: <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>

I forgot to include the link:
https://bugs.openjdk.java.net/browse/JDK-8207763

It got renamed in jdk11 via:
http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f

Thanks!
Jc

On Thu, Jul 19, 2018 at 10:07 AM JC Beyler <jcbeyler at google.com> wrote:

> Hi Dan,
>
>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
> became
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
> when we updated the spec and said "rate" was the wrong word.
>
> So yes, it fixes both, since at some point all branches should see the
> StatRate test renamed to the StatInterval test. Does that make
> sense?
>
> Thanks!
> Jc
>
>
> On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty <
> daniel.daugherty at oracle.com> wrote:
>
>> JDK-8207765 covers two different tests as of yesterday:
>>
>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>
>> and
>>
>>
>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>>
>> I updated it to add a similar failure mode sighting for
>> HeapMonitorStatIntervalTest.java
>>
>>
>> Does your fix address both test failures?
>>
>> Dan
>>
>>
>> On 7/19/18 12:39 PM, JC Beyler wrote:
>>
>> Hi all,
>>
>> Could I have a few reviews of:
>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>
>> The test assumed the size of a 1-element array but ZGC changes that
>> assumption. The test now first allocates a bit of memory and gets the
>> average size of the samples before assuming the size. This works
>> with/without ZGC.
>>
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>>
>> Thanks!
>> Jc
>>
>>
>>
>
> --
>
> Thanks,
> Jc
>


-- 

Thanks,
Jc

From daniel.mitterdorfer at gmail.com  Thu Jul 19 17:10:09 2018
From: daniel.mitterdorfer at gmail.com (Daniel Mitterdorfer)
Date: Thu, 19 Jul 2018 19:10:09 +0200
Subject: committed > max in MemoryMXBean#getHeapMemoryUsage()
In-Reply-To: <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com>
References: <CAJmnTFgVxSC_O6CoOj=en9ZrVVNvW7kGxzb60YYf14FQ44kqVQ@mail.gmail.com>
 <c52de47f-b7ad-ed01-2f94-46378a95d725@oracle.com>
 <CAJmnTFi23XsD5zzZT6b+YSPZevyAzPSbcoLNAZ+J5szj7ZQjNA@mail.gmail.com>
 <b12eaec322cc513348a9155fbcdf2b916a43559b.camel@oracle.com>
 <CAJmnTFiQmyxU9HAawnSa9r75hkmAT9115C8F4Y0T_rAK=TXQXg@mail.gmail.com>
 <35365ee0-e158-ad99-93c0-6f60251573a1@oracle.com>
Message-ID: <CAJmnTFhDD6k3-vcPObecCQfRruTLaJT-e8NaJDnZaQu8W6RKPA@mail.gmail.com>

Hi Erik,

I am quite happy that I could reproduce it after running the tests
repeatedly for approximately a week after the first failure.

Glad I could help, and thank you all for your help as well!

Daniel
On Thu, Jul 19, 2018 at 16:57, Erik Helin <erik.helin at oracle.com> wrote:
>
> On 07/13/2018 04:10 PM, Daniel Mitterdorfer wrote:
> > Hi,
> >
> > I have good news. I was able to reproduce this issue but this time I
> > have logs. A test failed with the following stack trace around
> > 15:06:55 with:
> >
> > java.lang.IllegalArgumentException: committed = 537919488 should be <
> > max = 536870912
> >     >    at java.lang.management.MemoryUsage.<init>(MemoryUsage.java:166)
> >     >    at sun.management.MemoryImpl.getMemoryUsage0(Native Method)
> >     >    at sun.management.MemoryImpl.getHeapMemoryUsage(MemoryImpl.java:71)
> >     >    at org.elasticsearch.indices.breaker.HierarchyCircuitBreakerService.currentMemoryUsage(HierarchyCircuitBreakerService.java:242)
> >
> > This time it happened on Linux (kernel 4.13.0-45-generic) with JDK 10
> > (build 10+46). The JVM arguments were:
> >
> > -Xms512M -Xmx512M
> > -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> >
> > The logs are somewhat massive (~250MB uncompressed) and available at
> > https://www.dropbox.com/s/wir9sv1dk5cf54y/JDK-8207200_test_log.txt.zip?dl=0
>
> Thanks for the logs, Daniel, they helped a lot! Thomas and I looked
> through the logs and the code and, as we suspected, this code is a bit
> buggy :/ Please see the bug for more details:
>
> https://bugs.openjdk.java.net/browse/JDK-8207200
>
> Again, thanks for taking your time and reporting this issue and for
> getting us the logs, much appreciated!
> Erik
>
> > I hope that helps identifying the cause. Please let me know if you
> > need anything else.
> >
> > Daniel
> > On Fri, Jul 13, 2018 at 10:33, Thomas Schatzl
> > <thomas.schatzl at oracle.com> wrote:
> >>
> >> On Fri, 2018-07-13 at 10:30 +0200, Daniel Mitterdorfer wrote:
> >>> Hi Erik,
> >>>>
> >>>> Do you have any kind of GC logging from the test run where you
> >>>> encountered the bug?
> >>>
> >>> Unfortunately, we don't have GC logging enabled by default in our
> >>> test suite so the exception trace is all I got. I am now repeatedly
> >>> running the test suite with the original flags (-Xms512M -Xmx512M)
> >>> and also added the following logging configuration:
> >>>
> >>> -Xlog:gc*=trace,heap*=trace,tlab*=off:stdout:time,pid,tid,level,tags
> >>>
> >>> As soon as I get another failure, I'll provide the full log file.
> >>> Please let me know if you need any other logs (i.e. whether I should
> >>> adjust my log configuration).
> >>
> >>    I think these flags are fine.
> >>
> >> Since Erik and I strongly believe the issue is with the relevant G1
> >> code Erik mentioned, we will reassign the bug to us (he said there is
> >> already a bug reported on it).
> >>
> >> Thanks a lot,
> >>    Thomas
> >>

From chris.plummer at oracle.com  Thu Jul 19 17:10:59 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 10:10:59 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <B5A65954-0F1A-4B21-924A-ED5E591CDED6@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com>
 <B5A65954-0F1A-4B21-924A-ED5E591CDED6@oracle.com>
Message-ID: <b7e42d27-b87b-4c65-7c35-ccc2ae3fe33c@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180719/7ce1835c/attachment-0001.html>

From daniil.x.titov at oracle.com  Thu Jul 19 17:22:57 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 10:22:57 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <b7e42d27-b87b-4c65-7c35-ccc2ae3fe33c@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <7be59224-6013-b163-daf2-2ed6bb90a3b8@oracle.com>
 <B5A65954-0F1A-4B21-924A-ED5E591CDED6@oracle.com>
 <b7e42d27-b87b-4c65-7c35-ccc2ae3fe33c@oracle.com>
Message-ID: <F4AC09FB-F9A2-411B-B1F7-6C6B036E0F2F@oracle.com>

Hi Chris,

 
This depends on how the particular test is implemented. In this test, an event listener for ClassPrepare events is registered and unregistered multiple times, and the failure happened when a ClassPrepare event from the Graal compiler thread arrived while the listener was unregistered. 

 
Best regards,

Daniil 

 
From: Chris Plummer <chris.plummer at oracle.com>
Date: Thursday, July 19, 2018 at 10:11 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
It seems that any test that requests ClassPrepareEvents could be getting unexpected events when graal is enabled.

Chris

On 7/17/18 8:32 PM, Daniil Titov wrote:

Hi Serguei,

 
The changes are in the one test class vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java so they affect only this single test. No other tests depend on this class.

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 7:59 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

It looks good to me.
Thank you for the update.

How many tests are depending on this class?
Could we say that all the nsk/jdi/ClassPrepareRequest tests
need to be checked that there are no regressions?

Thanks,
Serguei


On 7/17/18 19:06, Daniil Titov wrote:

Hi Serguei,

 
Please review a new version of the patch.

 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

 
Thanks!

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

Thank you for clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach: change the current code:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;

                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());

                      vm.resume();

                      return true;
                  }
              }

              return false;
          }

to something like:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      // thread may be null in this branch, so guard the call to thread.name()
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread: " + (thread != null ? thread.name() : "null"));
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }
 
          eventHandler.startListening();
          // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
          // The listener should be added after the event listener is started to ensure that it
          // is called before the default event listener that handles unexpected events.
          eventHandler.addListener(new DefaultClassPrepareEventListener());
  It is still unclear why addListener() is invoked after startListening() rather than before.
  It may be that the place where this listener is added is not right and it has to be moved into testSourceFilter(). 
  But I hope this fragment is not needed with the simplified approach.
  Otherwise, it looks good.

Thanks,
Serguei


On 7/17/18 14:55, Daniil Titov wrote:
Hi Serguei,
 
The test starts the event handler (nsk.share.jdi.EventHandler) and then calls the testSourceFilter() method several times with different parameters.  The testSourceFilter() method does the following:
      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received
 
 
When it starts, the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the previous listeners):
 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   
 
In step 2 above, the ClassPrepareEventListener is added at the head of the list of listeners. It handles ClassPrepareEvents and prevents the subsequent listeners from being invoked by returning "true" from its eventReceived(Event) method. 
 
With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when the ClassPrepareEventListener is removed), there is no registered listener to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed. 
 
That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after the ClassPrepareEventListener is unregistered inside the testSourceFilter() method.
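 
The dispatch order described above can be modelled with a small stand-alone sketch. It is not the nsk.share.jdi.EventHandler code: the Listener interface, the string events and the runScenario() helper are simplified, hypothetical stand-ins. The only behaviour taken from the description is that new listeners go to the head of the list and that the catch-all listener comes last.

```java
import java.util.ArrayList;
import java.util.List;

final class ListenerChainDemo {
    /** Hypothetical stand-in for the harness listener: return true to consume the event. */
    interface Listener {
        boolean eventReceived(String event);
    }

    private final List<Listener> listeners = new ArrayList<>();
    boolean unexpectedEventCaught = false;

    ListenerChainDemo() {
        // Catch-all listener; it stays last because later listeners are added at the head.
        listeners.add(event -> {
            unexpectedEventCaught = true;
            return true;
        });
    }

    void addListener(Listener l) { listeners.add(0, l); }

    void removeListener(Listener l) { listeners.remove(l); }

    /** Offers the event to each listener in order until one consumes it. */
    void dispatch(String event) {
        for (Listener l : listeners) {
            if (l.eventReceived(event)) {
                return;
            }
        }
    }

    /** Returns true if the late event reached the catch-all listener. */
    static boolean runScenario(boolean withFallbackListener) {
        ListenerChainDemo handler = new ListenerChainDemo();
        if (withFallbackListener) {
            // Long-lived listener that swallows class prepare events nobody else handles.
            handler.addListener(event -> event.equals("ClassPrepare"));
        }
        Listener testListener = event -> event.equals("ClassPrepare");
        handler.addListener(testListener);     // step 2
        handler.dispatch("ClassPrepare");      // expected event, consumed by testListener
        handler.removeListener(testListener);  // step 5
        handler.dispatch("ClassPrepare");      // late event from another thread
        return handler.unexpectedEventCaught;
    }

    public static void main(String[] args) {
        System.out.println("without fallback: " + runScenario(false));
        System.out.println("with fallback:    " + runScenario(true));
    }
}
```

In this model the late event reaches the catch-all listener only when no long-lived fallback listener is registered.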
 
Please see below the new webrev with the changes you suggested.
 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/
 
 
Thanks!
 
Best regards,
Daniil
 
 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
 
Hi Daniil,
 
I'm not sure I fully understand the fix.
So, let's start with some questions.
 
Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());
 
  It is still not clear why the default listener is added
  after the listening is started rather than before.
  If the default listener is really needed, then could you please
  split the lines above and L129, L160 to make them a little bit shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in the "( e.g. compiler)".
 
Thanks,
Serguei
 
 
On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.
 
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.
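 
The idea of counting only the events raised by the debuggee's main thread can be sketched without JDI as follows. PrepareEvent is a hypothetical, simplified stand-in for a class prepare event, the value "main" for DEBUGGEE_MAIN_THREAD is an assumption, and the class and thread names in main() are made up.

```java
import java.util.Arrays;
import java.util.List;

final class ThreadFilterDemo {
    // Assumed value; the actual constant in the test is not shown in this thread.
    static final String DEBUGGEE_MAIN_THREAD = "main";

    /** Hypothetical, simplified stand-in for a class prepare event. */
    static final class PrepareEvent {
        final String className;
        final String threadName; // null models an event that carries no thread

        PrepareEvent(String className, String threadName) {
            this.className = className;
            this.threadName = threadName;
        }
    }

    /** Counts only the events raised on the debuggee's main thread. */
    static int countMainThreadEvents(List<PrepareEvent> events) {
        int count = 0;
        for (PrepareEvent e : events) {
            // Constant-first equals() is null-safe for events without a thread.
            if (DEBUGGEE_MAIN_THREAD.equals(e.threadName)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<PrepareEvent> events = Arrays.asList(
                new PrepareEvent("TestClass", "main"),
                new PrepareEvent("SomeCompilerClass", "compiler-thread"),
                new PrepareEvent("OtherClass", null));
        System.out.println(countMainThreadEvents(events));
    }
}
```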
 
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/
 
Thanks!
--Daniil
 
 

From chris.plummer at oracle.com  Thu Jul 19 17:31:42 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 10:31:42 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
Message-ID: <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180719/d600f476/attachment.html>

From daniil.x.titov at oracle.com  Thu Jul 19 17:45:04 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 10:45:04 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>
Message-ID: <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>

Hi Chris,

 
It solves the problem that occurs when a ClassPrepare event comes from the Graal compiler thread at a time when the test has no listener registered for this event. In this case the event is handled by the EventHandler itself (lines 261-262 below) as an unexpected event and the test fails (the EventHandler has its own "default" listeners, registered when the event handler starts, which handle events that are not handled by the listeners the test registers).

 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java

  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   

 
Best regards,

Daniil

 
From: Chris Plummer <chris.plummer at oracle.com>
Date: Thursday, July 19, 2018 at 10:31 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

I understand the changes in eventReceived() to filter out events that are not on the main thread, but I don't understand why you've gone from having a new listener on each iteration, to just having one listener that stays active over all iterations. What problem does that change solve?

thanks,

Chris

On 7/17/18 7:06 PM, Daniil Titov wrote:

Hi Serguei,

 
Please review a new version of the patch.

 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

 
Thanks!

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

Thank you for clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach: change the current code:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;

                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());

                      vm.resume();

                      return true;
                  }
              }

              return false;
          }

to something like:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      // thread may be null in this branch, so guard the call to thread.name()
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread: " + (thread != null ? thread.name() : "null"));
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }
 
          eventHandler.startListening();
          // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
          // The listener should be added after the event listener is started to ensure that it
          // is called before the default event listener that handles unexpected events.
          eventHandler.addListener(new DefaultClassPrepareEventListener());
  It is still unclear why addListener() is invoked after startListening() rather than before.
  It may be that the place where this listener is added is not right and it has to be moved into testSourceFilter(). 
  But I hope this fragment is not needed with the simplified approach.
  Otherwise, it looks good.

Thanks,
Serguei


On 7/17/18 14:55, Daniil Titov wrote:
Hi Serguei,
 
The test starts the event handler (nsk.share.jdi.EventHandler) and then calls the testSourceFilter() method several times, passing different parameters each time. The testSourceFilter() method does the following:
      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received


On startup, the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the preceding listeners).
 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   
 
In step 2 above, the ClassPrepareEventListener is added at the head of the list of listeners. It handles ClassPrepareEvents and prevents the subsequent listeners from being invoked by returning "true" from its eventReceived(Event) method.

With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when the ClassPrepareEventListener is removed), no registered listener is left to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.

That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after the ClassPrepareEventListener is unregistered inside the testSourceFilter() method.
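
The dispatch order described above can be sketched roughly as follows. This is a minimal model, not the actual nsk.share.jdi.EventHandler: the Dispatcher class, the string-typed events and the runScenario() helper are hypothetical names introduced only for illustration. It assumes, as described, that new listeners go to the head of the list, that a listener returning true stops further dispatch, and that a catch-all listener sits at the tail.

```java
import java.util.LinkedList;

public class ListenerChainSketch {

    /** Stand-in for the test library's listener type: return true to consume the event. */
    public interface Listener {
        boolean eventReceived(String event);
    }

    /** Hypothetical, much simplified dispatcher (not the real EventHandler). */
    public static class Dispatcher {
        private final LinkedList<Listener> listeners = new LinkedList<>();
        public boolean unexpectedEventCaught = false;

        public Dispatcher() {
            // Catch-all listener: anything that reaches it counts as unexpected.
            listeners.add(event -> {
                unexpectedEventCaught = true;
                return true;
            });
        }

        /** New listeners go to the head, so they run before older ones. */
        public void addListener(Listener listener) {
            listeners.addFirst(listener);
        }

        public void removeListener(Listener listener) {
            listeners.remove(listener);
        }

        /** Offers the event to each listener in order until one consumes it. */
        public void dispatch(String event) {
            for (Listener listener : listeners) {
                if (listener.eventReceived(event)) {
                    return;
                }
            }
        }
    }

    /** Returns true if a late ClassPrepare event ended up in the catch-all listener. */
    public static boolean runScenario(boolean addPersistentListener) {
        Dispatcher dispatcher = new Dispatcher();
        if (addPersistentListener) {
            // Plays the role of DefaultClassPrepareEventListener: it stays registered
            // for the whole run and silently consumes ClassPrepare events.
            dispatcher.addListener(event -> event.equals("ClassPrepare"));
        }

        // One test iteration: register, receive the expected event, unregister.
        Listener perIteration = event -> event.equals("ClassPrepare");
        dispatcher.addListener(perIteration);
        dispatcher.dispatch("ClassPrepare");
        dispatcher.removeListener(perIteration);

        // A late event (for example from a compiler thread) arrives after removal.
        dispatcher.dispatch("ClassPrepare");
        return dispatcher.unexpectedEventCaught;
    }

    public static void main(String[] args) {
        System.out.println("without persistent listener: " + runScenario(false));
        System.out.println("with persistent listener:    " + runScenario(true));
    }
}
```

In this model the late event reaches the catch-all listener only when no persistent ClassPrepare listener is registered, which mirrors the failure mode described above.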
 
Please see below the new webrev with the changes you suggested.
 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/
 
 
Thanks!
 
Best regards,
Daniil
 
 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
 
Hi Daniil,
 
I'm not sure I fully understand the fix.
So, let's start with some questions.

Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());
 
  It is still not clear why the default listener is added
  after the listening is started rather than before.
  If the default listener is really needed, could you please
  split the lines above and L129, L160 to make them a little shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in the "( e.g. compiler)".
 
Thanks,
Serguei
 
 
On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.
 
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.
 
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/
 
Thanks!
--Daniil
 
 

From chris.plummer at oracle.com  Thu Jul 19 17:54:42 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 10:54:42 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>
 <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
Message-ID: <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180719/cc1a4d50/attachment-0001.html>

From daniil.x.titov at oracle.com  Thu Jul 19 18:01:29 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 11:01:29 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>
 <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
 <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>
Message-ID: <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>

Hi Chris,

 
Some events still come in after disable() returns. The event handler sees the request object associated with such an event (event.request()) as disabled, but it still receives the event.
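
A toy model of this observation is sketched below. It rests on the assumption, not confirmed in this thread, that events already queued for delivery are not flushed when the request is disabled. Request, Event and the in-flight queue are hypothetical stand-ins, not JDI types.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class LateEventSketch {

    /** Hypothetical stand-in for an event request that can be switched off. */
    public static class Request {
        private boolean enabled = true;

        public void disable() {
            enabled = false;
        }

        public boolean isEnabled() {
            return enabled;
        }
    }

    /** Hypothetical stand-in for an event that remembers its originating request. */
    public static class Event {
        private final Request request;

        public Event(Request request) {
            this.request = request;
        }

        public Request request() {
            return request;
        }
    }

    /** Counts events that reach the handler although their request is already disabled. */
    public static int countLateEvents(int queuedBeforeDisable) {
        Request request = new Request();
        Queue<Event> inFlight = new ArrayDeque<>();

        // Events generated while the request was still enabled but not yet delivered.
        for (int i = 0; i < queuedBeforeDisable; i++) {
            inFlight.add(new Event(request));
        }

        // Assumption: disabling stops new events but does not drain the queue.
        request.disable();

        int late = 0;
        Event event;
        while ((event = inFlight.poll()) != null) {
            // The handler sees the request as disabled, yet still gets the event.
            if (!event.request().isEnabled()) {
                late++;
            }
        }
        return late;
    }

    public static void main(String[] args) {
        System.out.println("late events: " + countLateEvents(3));
    }
}
```

Under that assumption, a listener that is removed right after disable() can miss events that are still in flight, so something has to stay registered to consume them.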

 
Best regards,

Daniil

 
From: Chris Plummer <chris.plummer at oracle.com>
Date: Thursday, July 19, 2018 at 10:54 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
But the code used to be:

 168         request.disable();
 169 
 170         eventHandler.removeListener(listener);

Doesn't disable() stop any new ClassPrepareEvents from coming in? It is called before the listener is removed. Or is there a synchronization issue here, so that you can still get some events coming in after disable() returns? I'm not sure whether disable() makes any guarantees that the debuggee side has fully processed it and delivered all pending events before it returns.

thanks,

Chris

On 7/19/18 10:45 AM, Daniil Titov wrote:

Hi Chris,

 
It solves the problem that occurs when a ClassPrepare event comes from the Graal compiler thread at a time when the test has no listener registered for this event. In this case the event is handled by the EventHandler itself (lines 261-262 below) as an unexpected event and the test fails (the EventHandler has its own "default" listeners, registered when the event handler starts, to handle events that are not handled by the listener the test registers).

 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   

 
Best regards,

Daniil

 
From: Chris Plummer <chris.plummer at oracle.com>
Date: Thursday, July 19, 2018 at 10:31 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

I understand the changes in eventReceived() to filter out events that are not on the main thread, but I don't understand why you've gone from having a new listener on each iteration, to just having one listener that stays active over all iterations. What problem does that change solve?

thanks,

Chris

On 7/17/18 7:06 PM, Daniil Titov wrote:

Hi Serguei,

 
Please review a new version of the patch.

 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.03

Bug: https://bugs.openjdk.java.net/browse/JDK-8204695

 
Thanks!

 
Best regards,

Daniil

 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 4:53 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Hi Daniil,

Thank you for the clarification and the webrev update!
I still have a couple of questions though.

I'd suggest a simpler approach, changing this:
 154         public boolean eventReceived(Event event) {
 155             if (event instanceof ClassPrepareEvent) {
 156                 ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
 157                 ThreadReference thread = classPrepareEvent.thread();
 158                 if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
 159                     eventReceived++;
 160 
 161                     log.display("ClassPrepareEventListener: Event received: " + event +
 162                             " Class: " + classPrepareEvent.referenceType().name());
 163 
 164                     vm.resume();
 165 
 166                     return true;
 167                 }
 168             }
 169 
 170             return false;
 171         }

to something like:
          public boolean eventReceived(Event event) {
              if (event instanceof ClassPrepareEvent) {
                  ClassPrepareEvent classPrepareEvent = (ClassPrepareEvent) event;
                  ThreadReference thread = classPrepareEvent.thread();
                  if (thread != null && DEBUGGEE_MAIN_THREAD.equals(thread.name())) {
                      eventReceived++;
                      log.display("ClassPrepareEventListener: Event received: " + event +
                              " Class: " + classPrepareEvent.referenceType().name());
                  } else {
                      log.display("ClassPrepareEventListener: Event filtered out: " + event +
                              " Class: " + classPrepareEvent.referenceType().name() +
                              " Thread: " + (thread != null ? thread.name() : "null"));
                  }
                  vm.resume();
                  return true;
              }
              return false;
          }
 
 245         eventHandler.startListening();
 246         // Add a listener to handle ClassPrepare events fired by other (e.g. compiler) threads.
 247         // The listener should be added after the event listener is started to ensure that it
 248         // called before the default event listener that handles unexpected events.
 249         eventHandler.addListener(new DefaultClassPrepareEventListener());
  It is still unclear why addListener() is invoked after startListening() rather than before.
  It may be that this is not the right place to add the listener and that it has to be moved into testSourceFilter().
  But I hope this fragment is not needed with the simplified approach.
  Otherwise, it looks good.

Thanks,
Serguei


On 7/17/18 14:55, Daniil Titov wrote:
Hi Serguei,
 
The test starts the event handler (nsk.share.jdi.EventHandler) and then calls the testSourceFilter() method several times, passing different parameters each time. The testSourceFilter() method does the following:
      1. creates a ClassPrepareRequest object
      2. registers a new ClassPrepareEventListener
      3. sends a command to the debuggee to load a test class
      4. waits until the debuggee has performed the command
      5. removes the ClassPrepareEventListener
      6. checks whether a ClassPrepareEvent was received


On startup, the EventHandler creates a default list of event listeners. The last listener in this list handles unexpected events (that is, events not processed by the preceding listeners).
 
cat -n  test/hotspot/jtreg/vmTestbase/nsk/share/jdi/EventHandler.java
  /**
   251        * This method sets up default listeners.
   252        */
   253       private void createDefaultListeners() {
   254           /**
   255            * This listener catches up all unexpected events.
   256            *
   257            */
   258           addListener(
   259                   new EventListener() {
   260                       public boolean eventReceived(Event event) {
   261                           log.complain("EventHandler>  Unexpected event: " + event.getClass().getName());
   262                           unexpectedEventCaught = true;
   263                           return true;
   264                       }
   265                   }
   266           );
   267   
 
In step 2 above, the ClassPrepareEventListener is added at the head of the list of listeners. It handles ClassPrepareEvents and prevents the subsequent listeners from being invoked by returning "true" from its eventReceived(Event) method.

With Graal turned on, after step 1 the JVMTI compiler thread starts sending class prepare events for the classes it compiles. If any such event is dispatched after step 5 (when the ClassPrepareEventListener is removed), no registered listener is left to handle it, so the event is handled by the "unexpected events listener" (see above), which marks the test as failed.

That is why DefaultClassPrepareEventListener is needed: to process ClassPrepare events dispatched after the ClassPrepareEventListener is unregistered inside the testSourceFilter() method.
 
Please see below the new webrev with the changes you suggested.
 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.02/
 
 
Thanks!
 
Best regards,
Daniil
 
 
From: "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>
Date: Tuesday, July 17, 2018 at 1:34 PM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails
 
Hi Daniil,
 
I'm not sure I fully understand the fix.
So, let's start with some questions.

Why is the DefaultClassPrepareEventListener needed?
Is it not enough to filter out the other threads in the
ClassPrepareEventListener.eventReceived() method?
 243         eventHandler.startListening();
 244         // Add a listener to handle class prepared events fired by other ( e.g. compiler) threads.
 245         // The listener should be added after the event listener is started to ensure that it called before
 246         // the default event listener that handles unexpected events.
 247         eventHandler.addListener(new DefaultClassPrepareEventListener());
 
  It is still not clear why the default listener is added
  after the listening is started rather than before.
  If the default listener is really needed, could you please
  split the lines above and L129, L160 to make them a little shorter?
  
  I'd also suggest replacing "class prepared events" at L244
  with "ClassPrepare event" or "class prepare event".
  There is also an unneeded space in the "( e.g. compiler)".
 
Thanks,
Serguei
 
 
On 7/17/18 01:20, Daniil Titov wrote:
Please review the change that fixes the JDI test when running with Graal.
 
The problem here is that the test verifies that a class prepare event is generated when the target VM loads a specific test class, but with Graal turned on additional class prepare events are generated by the compiler threads. The test doesn't expect them and fails. The fix ensures that additional class prepare events are ignored by the test and properly handled.
 
Bug: https://bugs.openjdk.java.net/browse/JDK-8204695 
Webrev: http://cr.openjdk.java.net/~dtitov/8204695/webrev.01/
 
Thanks!
--Daniil
 
 

From chris.plummer at oracle.com  Thu Jul 19 18:03:16 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 11:03:16 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>
 <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
 <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>
 <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>
Message-ID: <a0f4c83e-0ff9-c019-0095-9a154d168932@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180719/a3d91ae8/attachment-0001.html>

From daniil.x.titov at oracle.com  Thu Jul 19 18:06:47 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 19 Jul 2018 11:06:47 -0700
Subject: RFR 8204695: [Graal]
 vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java
 fails
In-Reply-To: <a0f4c83e-0ff9-c019-0095-9a154d168932@oracle.com>
References: <30C81603-624E-4D13-B877-F1383B2F2698@oracle.com>
 <e305ba43-79bc-f4a2-41cf-1012c94d85f7@oracle.com>
 <9651ACD9-362A-4104-AD90-45E1D9AB28E8@oracle.com>
 <d2d3ddfa-d269-0eff-712f-1753bfa9cbea@oracle.com>
 <9323EEC7-A9B2-47C2-B1FD-0466B19B21B8@oracle.com>
 <b1865316-9b3b-a32e-9c9b-14d5666b3d01@oracle.com>
 <5471B2B0-6D2B-41E6-B02A-F48386E64862@oracle.com>
 <870171c6-89e7-d019-a3a4-10a7df2230d3@oracle.com>
 <9A1E81BE-B05D-48B9-8123-E2739BD68DFD@oracle.com>
 <a0f4c83e-0ff9-c019-0095-9a154d168932@oracle.com>
Message-ID: <D4B3C010-3D28-4AD7-98F8-12EF6916086F@oracle.com>

Thank you Chris and Serguei for reviewing this change!

 
Best regards,

Daniil

 
From: Chris Plummer <chris.plummer at oracle.com>
Date: Thursday, July 19, 2018 at 11:03 AM
To: Daniil Titov <daniil.x.titov at oracle.com>, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com>, "serviceability-dev at openjdk.java.net serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: Re: RFR 8204695: [Graal] vmTestbase/nsk/jdi/ClassPrepareRequest/addSourceNameFilter/addSourceNameFilter002/addSourceNameFilter002.java fails

 
Ok, your changes look good then.

thanks,

Chris

 

From chris.plummer at oracle.com  Thu Jul 19 18:26:49 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 19 Jul 2018 11:26:49 -0700
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
In-Reply-To: <1bae36e7-3efc-3aef-6a99-324102da2549@gmail.com>
References: <5acf45c6-b666-f4f5-2947-8e026701b29a@gmail.com>
 <2f2ab689-8ec6-4c39-96c6-f2be6bbb21c8@gmail.com>
 <192b41c2-e646-e0dc-0739-d8347046c645@oracle.com>
 <CAGFVN2A+OtA7KmCmN9aJCQv-kvnXoQB4jz75SXiweLbaG=EYKQ@mail.gmail.com>
 <1bae36e7-3efc-3aef-6a99-324102da2549@gmail.com>
Message-ID: <9894ffb1-1325-f38e-afa6-90f2b56a6d89@oracle.com>

Hi Yasumasa,

   84     // It maps the LWPID in the host to it in the container.

"it" -> "the PID"

  286     // Get LWPID in the host from the container's LWPID.
  287     public int getHostPID(int id) {
  288         try {
  289             return nspidMap.get(id);
  290         } catch (NullPointerException e) {
  291             return -1;
  292         }
  293     }

What is the source of the NPE here? Is it because nspidMap was never 
initialized because the process is not in a container? In that case I 
think you should be checking for null rather than having an NPE be part 
of normal execution.
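
As a rough sketch of what I mean, assuming nspidMap is a
Map<Integer, Integer> that stays null when the process is not in a
container (HostPidLookup is just a hypothetical stand-in for the relevant
part of LinuxDebuggerLocal):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the part of LinuxDebuggerLocal under review.
final class HostPidLookup {
    // Maps a container LWPID to the host LWPID. Assumed to be null when
    // the debuggee is not running in a container.
    private final Map<Integer, Integer> nspidMap;

    HostPidLookup(Map<Integer, Integer> nspidMap) {
        this.nspidMap = nspidMap;
    }

    // Returns the host LWPID for the given container LWPID, or -1 if unknown.
    int getHostPID(int id) {
        if (nspidMap == null) {
            return -1;
        }
        // A missing key yields null; check it before unboxing to int.
        Integer hostPid = nspidMap.get(id);
        return (hostPid == null) ? -1 : hostPid;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(7, 12345);
        System.out.println(new HostPidLookup(map).getHostPID(7));
        System.out.println(new HostPidLookup(map).getHostPID(8));
        System.out.println(new HostPidLookup(null).getHostPID(7));
    }
}
```

Note that a missing key would also trigger an NPE in the original code,
because the null Integer returned by get() is unboxed to int.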

   42             int hostPID = ((LinuxDebuggerLocal)debugger).getHostPID(pid);
   43             if (hostPID != -1) {
   44                 pid = hostPID;
   45             }
A comment here would be helpful.

The rest looks good. I should probably run it through some internal 
testing. Let me know when you have a final webrev.

thanks,

Chris

On 7/18/18 5:59 AM, Yasumasa Suenaga wrote:
> PING:
>
> Could you review it?
>
>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>
> This change has been reviewed by Jini.
> We need a Reviewer.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2018/07/12 13:42, Yasumasa Suenaga wrote:
>> Thanks Jini,
>>
>> I uploaded a new webrev. It adds some comments and removes the extra
>> space.
>>
>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>
>>
>> Yasumasa
>>
>>
>>
>> 2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
>>> Hi Yasumasa,
>>>
>>> This looks good to me except for one nit. And some more comments 
>>> would help.
>>> For example, it would help to say that NSPidMap maps the host to
>>> container lwpids.
>>>
>>> The nit:
>>>
>>> *
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html 
>>>
>>> Line 253: extra space after the parentheses
>>>
>>> Thanks,
>>> Jini.
>>>
>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>>
>>>> PING: Could you review it?
>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Please review this change.
>>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>
>>>>> I tried to attach jhsdb to a Java process in a Docker container from
>>>>> the container host, but it couldn't attach.
>>>>> jcmd supports PID namespaces as of JDK-8193710, but jhsdb doesn't yet.
>>>>>
>>>>> SA gets LWP IDs via the thread stack and functions in libthread_db.so,
>>>>> but they return PIDs in the container, which differ from the host's
>>>>> PIDs. So I added code to scan /proc/<PID>/task to get all LWP IDs,
>>>>> and they are kept in a Map in LinuxDebuggerLocal.
>>>>>
>>>>> Also, SA_ALTROOT is set to /proc/<PID>/root if SA detects that the
>>>>> debuggee runs in a container. This helps SA parse binaries in the
>>>>> container.
>>>>>
>>>>> This change has been pushed to the submit repo, and it failed on OS X
>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>>> But I guess that is caused by JDK-8205906. This change affects Linux only.
>>>>>
>>>>> Could you review it?
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>
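
For illustration, the /proc/<PID>/task scan described in the quoted RFR
could be sketched roughly as follows. This assumes the IDs are read from
the NSpid field of each task's status file (present since Linux 4.1,
listing the ID from the outermost to the innermost PID namespace);
NsPidScanner, parseNspid and scan are hypothetical names, not taken from
the webrev.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; a sketch only, not the code under review.
final class NsPidScanner {
    private static final String KEY = "NSpid:";

    // Parses the "NSpid:" line of a status file and returns
    // {hostId, containerId}, or null if the line is absent or empty.
    static int[] parseNspid(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith(KEY)) {
                String[] ids = line.substring(KEY.length()).trim().split("\\s+");
                if (ids[0].isEmpty()) {
                    return null;
                }
                int host = Integer.parseInt(ids[0]);
                int container = Integer.parseInt(ids[ids.length - 1]);
                return new int[] { host, container };
            }
        }
        return null;
    }

    // Builds a container-LWPID -> host-LWPID map from a task directory,
    // e.g. /proc/<PID>/task as seen from the host. Tasks that exit while
    // being scanned may surface as an IOException in this simple sketch.
    static Map<Integer, Integer> scan(Path taskDir) throws IOException {
        Map<Integer, Integer> map = new HashMap<>();
        try (DirectoryStream<Path> tasks = Files.newDirectoryStream(taskDir)) {
            for (Path task : tasks) {
                Path status = task.resolve("status");
                if (!Files.isReadable(status)) {
                    continue;
                }
                int[] ids = parseNspid(Files.readAllLines(status));
                if (ids != null) {
                    map.put(ids[1], ids[0]);
                }
            }
        }
        return map;
    }
}
```

When the process is not in a nested PID namespace, the NSpid line holds a
single value, so host and container IDs come out identical.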


From serguei.spitsyn at oracle.com  Thu Jul 19 21:33:30 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 14:33:30 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
 <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
Message-ID: <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180719/5a5b9fde/attachment.html>

From jcbeyler at google.com  Thu Jul 19 21:52:07 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 14:52:07 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
 <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
 <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>
Message-ID: <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>

Hi Serguei,

Done here:
http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/

I added:

+  // Calculate the size of a 1-element array in order to assess average sampling interval
+  // via the HeapMonitorStatIntervalTest. This is needed because various GCs could add
+  // extra memory to arrays.
+  // This is done by allocating a 1-element array and then looking in the heap monitoring
+  // samples for the average size of objects collected.
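
To make the idea concrete, here is a minimal sketch of the averaging step,
assuming the sampled object sizes are already available as a long[]. The
class and method names are hypothetical, and the real test obtains the
sizes from the heap monitoring samples rather than from a plain array.

```java
// Hypothetical illustration of the averaging idea; not the code in the webrev.
final class AverageElementSize {
    // Average size in bytes of the sampled objects.
    static double average(long[] sampledSizes) {
        if (sampledSizes.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long total = 0;
        for (long size : sampledSizes) {
            total += size;
        }
        return (double) total / sampledSizes.length;
    }

    // Expected number of samples when allocating `allocations` arrays of
    // the measured average size with the given sampling interval in bytes.
    static double expectedSamples(long allocations, double averageSize, long intervalBytes) {
        return allocations * averageSize / intervalBytes;
    }
}
```

Using the measured average instead of a hard-coded size keeps the expected
sample count independent of how a particular GC lays out arrays.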


Let me know what you think and then I need one more review to prepare the
patch :-)

Thanks all!
Jc

On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> The fix looks good to me.
> Just minor comments.
>
>
> http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html
>
>  108   public static void calculateAverageOneElementSize() {
>
>   Could you, please, add a comment before calculateAverageOneElementSize
> method
>   explaining shortly why it is needed and what it is doing?
>   Otherwise, it is not easy to understand this code from scratch.
>
> Thanks,
> Serguei
>
>
> On 7/19/18 10:08, JC Beyler wrote:
>
> I forgot to put the link:
> https://bugs.openjdk.java.net/browse/JDK-8207763
>
> It got renamed in jdk11 via:
> http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f
>
> Thanks!
> Jc
>
> On Thu, Jul 19, 2018 at 10:07 AM JC Beyler <jcbeyler at google.com> wrote:
>
>> Hi Dan,
>>
>>
>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>> became
>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
>> when we updated the spec and said "rate" was the wrong word.
>>
>> So yes, it fixes both since at some point all branches should see that
>> the StatRate test becomes renamed into the StatInterval test. Does that
>> make sense?
>>
>> Thanks!
>> Jc
>>
>>
>> On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty <
>> daniel.daugherty at oracle.com> wrote:
>>
>>> JDK-8207765 covers two different tests as of yesterday:
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>>
>>> and
>>>
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>>>
>>> I updated it to add a similar failure mode sighting for
>>> HeapMonitorStatIntervalTest.java
>>>
>>>
>>> Does your fix address both test failures?
>>>
>>> Dan
>>>
>>>
>>> On 7/19/18 12:39 PM, JC Beyler wrote:
>>>
>>> Hi all,
>>>
>>> Could I have a few reviews of:
>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>>
>>> The  test assumed the size of a 1-element array but ZGC changes that
>>> assumption. The test now first allocates a bit of memory and gets the
>>> average size of the samples before assuming the size. This works
>>> with/without ZGC.
>>>
>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>>>
>>> Thanks!
>>> Jc
>>>
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>
>
> --
>
> Thanks,
> Jc
>
>
>

-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Thu Jul 19 22:20:23 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 15:20:23 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
 <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
 <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>
 <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>
Message-ID: <6c66de5b-fb39-3212-cef8-2fba58aca121@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180719/00c34a8a/attachment.html>

From serguei.spitsyn at oracle.com  Thu Jul 19 23:32:38 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 16:32:38 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
Message-ID: <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>

Thanks, Rahul!
In fact, there are no good experts in this area on the serviceability team.
It would be much better if anyone from the Compiler team could do it.

Vladimir K.,

Is there anyone from the Compiler team available to review this?
Otherwise, I could try to review it but am not sure about my review quality.

Thanks,
Serguei


On 7/19/18 00:48, Rahul Raghavan wrote:
> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>
> (just adding + hotspot-compiler-dev also)
>
>
> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
> Subject Was:
> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>
> + serviceability-dev
>
> Hi all,
>
> Could anyone else give me a review of this webrev and check/test the
> various architecture changes?
>
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>
>
> Thanks for all your help!
> Jc
>
>
>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com> wrote:
>>
>>> Hi all,
>>>
>>> Here is a webrev that does all the architectures in the same way:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>> Could anyone review the other architectures and test?
>>>    - arm, sparc & aarch64 are also modified now to follow the same
>>> "if no tlab, then consider eden space allocation" logic.
>>>
>>> Thanks for your help!
>>> Jc
>>>
>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com> wrote:
>>>
>>>> Hi Kim,
>>>>
>>>> I opened this bug
>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
>>>>
>>>> and now I've done an update:
>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>>>>
>>>> I have basically addressed your nits and also removed try_eden (the
>>>> label was bound but never referenced). I updated the comments to
>>>> use the wording you preferred.
>>>>
>>>> I still have to do the other architectures though but at least we 
>>>> seem to
>>>> have a consensus on this architecture, correct?
>>>>
>>>> Thanks for the review,
>>>> Jc
>>>>
>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com>
>>>> wrote:
>>>>
>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>>>>
>>>>>> Yes, you are right, I did those changes due to:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>>>>>>
>>>>>> If Robbin agrees to this change, and if no one sees an issue,
>>>>>> I'll go ahead and propagate the change across architectures.
>>>>>>
>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's
>>>>>> comment and review) :)
>>>>>> Jc
>>>>>>
>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>
>>>>>>> I'm not sure if we had left this case intentionally or not but,
>>>>>>> if we want it all to be consistent, we should perhaps fix it.
>>>>>>>
>>>>>>>
>>>>>>> Well, you put in that logic last February, so unless somebody
>>>>>>> speaks up quickly, I support your adjusting it to be the way you
>>>>>>> want it.
>>>>>>>
>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share"
>>>>>>> suggests that the GC group is most active in touching this feature.
>>>>>>> If Robbin is OK with it, there's your reviewer.
>>>>>>>
>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
>>>>>>> working on the GC to OK it.
>>>>>>>
>>>>>>> - John
>>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>>
>>>>>> Thanks,
>>>>>> Jc
>>>>>
>>>>> Robbin is on vacation; you might not hear from him for a while.
>>>>>
>>>>> I'm assuming you'll open a new bug for this?
>>>>>
>>>>> Except for a few minor nits (below), this looks okay to me.
>>>>>
>>>>> The comment at line 1052 needs updating.
>>>>>
>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>>
>>>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>>>> line 1058, but unreferenced.
>>>>>
>>>>> I like the wording of the comment at 1139 better than the wording at
>>>>> 1016.
>>>>>
>>>>>
>>>>
>>>> -- 
>>>>
>>>> Thanks,
>>>> Jc
>>>>
>>>
>>>
>>> -- 
>>>
>>> Thanks,
>>> Jc
>>>
>>
>>


From alexey.menkov at oracle.com  Fri Jul 20 00:06:13 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 19 Jul 2018 17:06:13 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
 <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
 <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>
 <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>
Message-ID: <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com>

Looks good.

--alex

On 07/19/2018 14:52, JC Beyler wrote:
> Hi Serguei,
> 
> Done here:
> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/ 
> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.01/>
> 
> I added:
> 
> + // Calculate the size of a 1-element array in order to assess average 
> sampling interval
> + // via the HeapMonitorStatIntervalTest. This is needed because various 
> GCs could add
> + // extra memory to arrays.
> + // This is done by allocating a 1-element array and then looking in 
> the heap monitoring
> + // samples for the average size of objects collected.
> 
> 
> Let me know what you think and then I need one more review to prepare 
> the patch :-)
> 
> Thanks all!
> Jc
> 
> On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com>> wrote:
> 
>     Hi Jc,
> 
>     The fix looks good to me.
>     Just minor comments.
> 
>     http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html
> 
>     108 public static void calculateAverageOneElementSize() {
> 
>        Could you, please, add a comment before
>     calculateAverageOneElementSize method
>        explaining shortly why it is needed and what it is doing?
>        Otherwise, it is not easy to understand this code from scratch.
> 
>     Thanks,
>     Serguei
> 
> 
>     On 7/19/18 10:08, JC Beyler wrote:
>>     I forgot to put the link:
>>     https://bugs.openjdk.java.net/browse/JDK-8207763
>>
>>     It got renamed in jdk11 via:
>>     http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f
>>
>>     Thanks!
>>     Jc
>>
>>     On Thu, Jul 19, 2018 at 10:07 AM JC Beyler <jcbeyler at google.com
>>     <mailto:jcbeyler at google.com>> wrote:
>>
>>         Hi Dan,
>>
>>
>>         serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>         became
>>         serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
>>         when we updated the spec and said "rate" was the wrong word.
>>
>>         So yes, it fixes both since at some point all branches should
>>         see that the StatRate test becomes renamed into the
>>         StatInterval test. Does that make sense?
>>
>>         Thanks!
>>         Jc
>>
>>
>>         On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty
>>         <daniel.daugherty at oracle.com
>>         <mailto:daniel.daugherty at oracle.com>> wrote:
>>
>>             JDK-8207765 covers two different tests as of yesterday:
>>
>>             serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>
>>             and
>>
>>             serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>>
>>             I updated it to add a similar failure mode sighting for
>>             HeapMonitorStatIntervalTest.java
>>
>>
>>             Does your fix address both test failures?
>>
>>             Dan
>>
>>
>>             On 7/19/18 12:39 PM, JC Beyler wrote:
>>>             Hi all,
>>>
>>>             Could I have a few reviews of:
>>>             http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>>             <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/>
>>>
>>>             The test assumed the size of a 1-element array but ZGC
>>>             changes that assumption. The test now first allocates a
>>>             bit of memory and gets the average size of the samples
>>>             before assuming the size. This works with/without ZGC.
>>>
>>>             Webrev:
>>>             http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>>             <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/>
>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>>>
>>>             Thanks!
>>>             Jc
>>
>>
>>
>>         -- 
>>
>>         Thanks,
>>         Jc
>>
>>
>>
>>     -- 
>>
>>     Thanks,
>>     Jc
> 
> 
> 
> -- 
> 
> Thanks,
> Jc

From serguei.spitsyn at oracle.com  Fri Jul 20 00:21:45 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 19 Jul 2018 17:21:45 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
 <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
 <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>
 <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>
 <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com>
Message-ID: <9f8cd454-1e48-c4f5-41fc-834adda7867e@oracle.com>

Thanks a lot, Alex!

Jc,

Could you please send me a patch for push?

Thanks,
Serguei

On 7/19/18 17:06, Alex Menkov wrote:
> Looks good.
>
> --alex
>
> On 07/19/2018 14:52, JC Beyler wrote:
>> Hi Serguei,
>>
>> Done here:
>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/ 
>> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.01/>
>>
>> I added:
>>
>> + // Calculate the size of a 1-element array in order to assess 
>> average sampling interval
>> + // via the HeapMonitorStatIntervalTest. This is needed because 
>> various GCs could add
>> + // extra memory to arrays.
>> + // This is done by allocating a 1-element array and then looking in 
>> the heap monitoring
>> + // samples for the average size of objects collected.
>>
>>
>> Let me know what you think and then I need one more review to prepare 
>> the patch :-)
>>
>> Thanks all!
>> Jc
>>
>> On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
>> <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>     Hi Jc,
>>
>>     The fix looks good to me.
>>     Just minor comments.
>>
>> http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html
>>
>>     108 public static void calculateAverageOneElementSize() {
>>
>>        Could you, please, add a comment before
>>     calculateAverageOneElementSize method
>>        explaining shortly why it is needed and what it is doing?
>>        Otherwise, it is not easy to understand this code from scratch.
>>
>>     Thanks,
>>     Serguei
>>
>>
>>     On 7/19/18 10:08, JC Beyler wrote:
>>>     I forgot to put the link:
>>>     https://bugs.openjdk.java.net/browse/JDK-8207763
>>>
>>>     It got renamed in jdk11 via:
>>>     http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f
>>>
>>>     Thanks!
>>>     Jc
>>>
>>>     On Thu, Jul 19, 2018 at 10:07 AM JC Beyler <jcbeyler at google.com
>>>     <mailto:jcbeyler at google.com>> wrote:
>>>
>>>         Hi Dan,
>>>
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>>         became
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
>>>         when we updated the spec and said "rate" was the wrong word.
>>>
>>>         So yes, it fixes both since at some point all branches should
>>>         see that the StatRate test becomes renamed into the
>>>         StatInterval test. Does that make sense?
>>>
>>>         Thanks!
>>>         Jc
>>>
>>>
>>>         On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty
>>>         <daniel.daugherty at oracle.com
>>>         <mailto:daniel.daugherty at oracle.com>> wrote:
>>>
>>>             JDK-8207765 covers two different tests as of yesterday:
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
>>>
>>>             and
>>>
>>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
>>>
>>>             I updated it to add a similar failure mode sighting for
>>>             HeapMonitorStatIntervalTest.java
>>>
>>>
>>>             Does your fix address both test failures?
>>>
>>>             Dan
>>>
>>>
>>>             On 7/19/18 12:39 PM, JC Beyler wrote:
>>>>             Hi all,
>>>>
>>>>             Could I have a few reviews of:
>>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/>
>>>>
>>>>             The test assumed the size of a 1-element array but ZGC
>>>>             changes that assumption. The test now first allocates a
>>>>             bit of memory and gets the average size of the samples
>>>>             before assuming the size. This works with/without ZGC.
>>>>
>>>>             Webrev:
>>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
>>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/>
>>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
>>>>
>>>>             Thanks!
>>>>             Jc
>>>
>>>
>>>
>>>         --
>>>         Thanks,
>>>         Jc
>>>
>>>
>>>
>>>     --
>>>     Thanks,
>>>     Jc
>>
>>
>>
>> -- 
>>
>> Thanks,
>> Jc


From jcbeyler at google.com  Fri Jul 20 01:22:49 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 19 Jul 2018 18:22:49 -0700
Subject: RFR(S) 8207765: HeapMonitorIntervalRateTest fails with ZGC
In-Reply-To: <9f8cd454-1e48-c4f5-41fc-834adda7867e@oracle.com>
References: <CAF9BGBzZEwuh-gihhEXUPdZyJtdx0qSAiAC=eN7S7usB6hVJLg@mail.gmail.com>
 <2e4b591f-81e7-3298-fe73-3c7dc9ba168e@oracle.com>
 <CAF9BGBx=Y8BunE0aQYueCEKD-anm7z8vh+_mOG0tkSDAV3h=2A@mail.gmail.com>
 <CAF9BGBwjwQ2GLJ+_grHOwXDuYRRymHeiTxkKZ_DSvEEaU2n9fg@mail.gmail.com>
 <f6bb401b-c97a-2454-1a8e-6d341bf408f7@oracle.com>
 <CAF9BGBwkQAm4KQyHgGBKLVnHNpn4APS2nLZLNcPEsHw+Q_G7kw@mail.gmail.com>
 <9a75ac9e-a1db-63dc-07aa-3be97442ce7e@oracle.com>
 <9f8cd454-1e48-c4f5-41fc-834adda7867e@oracle.com>
Message-ID: <CAF9BGBzSFR=Lr8SCCncQAMBres_yHtHs9cjs+nVfhTLOb3mwjQ@mail.gmail.com>

Hi Serguei and Alexey,

Thanks both and here you are:
http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.02/

Let me know if you need anything else!
Jc

On Thu, Jul 19, 2018 at 5:21 PM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Thanks a lot, Alex!
>
> Jc,
>
> Could you please send me a patch for push?
>
> Thanks,
> Serguei
>
> On 7/19/18 17:06, Alex Menkov wrote:
> > Looks good.
> >
> > --alex
> >
> > On 07/19/2018 14:52, JC Beyler wrote:
> >> Hi Serguei,
> >>
> >> Done here:
> >> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.01/
> >> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.01/>
> >>
> >> I added:
> >>
> >> + // Calculate the size of a 1-element array in order to assess
> >> average sampling interval
> >> + // via the HeapMonitorStatIntervalTest. This is needed because
> >> various GCs could add
> >> + // extra memory to arrays.
> >> + // This is done by allocating a 1-element array and then looking in
> >> the heap monitoring
> >> + // samples for the average size of objects collected.
> >>
> >>
> >> Let me know what you think and then I need one more review to prepare
> >> the patch :-)
> >>
> >> Thanks all!
> >> Jc
> >>
> >> On Thu, Jul 19, 2018 at 2:33 PM serguei.spitsyn at oracle.com
> >> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
> >> <mailto:serguei.spitsyn at oracle.com>> wrote:
> >>
> >>     Hi Jc,
> >>
> >>     The fix looks good to me.
> >>     Just minor comments.
> >>
> >>
> http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitor.java.frames.html
> >>
> >>     108 public static void calculateAverageOneElementSize() {
> >>
> >>        Could you, please, add a comment before
> >>     calculateAverageOneElementSize method
> >>        explaining shortly why it is needed and what it is doing?
> >>        Otherwise, it is not easy to understand this code from scratch.
> >>
> >>     Thanks,
> >>     Serguei
> >>
> >>
> >>     On 7/19/18 10:08, JC Beyler wrote:
> >>>     I forgot to put the link:
> >>>     https://bugs.openjdk.java.net/browse/JDK-8207763
> >>>
> >>>     It got renamed in jdk11 via:
> >>>     http://hg.openjdk.java.net/jdk/jdk11/rev/1edcf36fe15f
> >>>
> >>>     Thanks!
> >>>     Jc
> >>>
> >>>     On Thu, Jul 19, 2018 at 10:07 AM JC Beyler <jcbeyler at google.com
> >>>     <mailto:jcbeyler at google.com>> wrote:
> >>>
> >>>         Hi Dan,
> >>>
> >>>
> >>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
> >>>         became
> >>>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java,
> >>>         when we updated the spec and said "rate" was the wrong word.
> >>>
> >>>         So yes, it fixes both since at some point all branches should
> >>>         see that the StatRate test becomes renamed into the
> >>>         StatInterval test. Does that make sense?
> >>>
> >>>         Thanks!
> >>>         Jc
> >>>
> >>>
> >>>         On Thu, Jul 19, 2018 at 9:45 AM Daniel D. Daugherty
> >>>         <daniel.daugherty at oracle.com
> >>>         <mailto:daniel.daugherty at oracle.com>> wrote:
> >>>
> >>>             JDK-8207765 covers two different tests as of yesterday:
> >>>
> >>> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatRateTest.java
> >>>
> >>>             and
> >>>
> >>>
> serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatIntervalTest.java
> >>>
> >>>             I updated it to add a similar failure mode sighting for
> >>>             HeapMonitorStatIntervalTest.java
> >>>
> >>>
> >>>             Does your fix address both test failures?
> >>>
> >>>             Dan
> >>>
> >>>
> >>>             On 7/19/18 12:39 PM, JC Beyler wrote:
> >>>>             Hi all,
> >>>>
> >>>>             Could I have a few reviews of:
> >>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
> >>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/>
> >>>>
> >>>>             The  test assumed the size of a 1-element array but ZGC
> >>>>             changes that assumption. The test now first allocates a
> >>>>             bit of memory and gets the average size of the samples
> >>>>             before assuming the size. This works with/without ZGC.
> >>>>
> >>>>             Webrev:
> >>>> http://cr.openjdk.java.net/~jcbeyler/8207765/webrev.00/
> >>>> <http://cr.openjdk.java.net/%7Ejcbeyler/8207765/webrev.00/>
> >>>>             Bug: https://bugs.openjdk.java.net/browse/JDK-8207765
> >>>>
> >>>>             Thanks!
> >>>>             Jc
> >>>
> >>>
> >>>
> >>>         --
> >>>         Thanks,
> >>>         Jc
> >>>
> >>>
> >>>
> >>>     --
> >>>     Thanks,
> >>>     Jc
> >>
> >>
> >>
> >> --
> >>
> >> Thanks,
> >> Jc
>
>

-- 

Thanks,
Jc

From yasuenag at gmail.com  Fri Jul 20 05:13:49 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Fri, 20 Jul 2018 14:13:49 +0900
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
Message-ID: <CAGFVN2A5iZeUg7bysrbdaB2FNUvfobc_gDOc97GS_eTz2DrhDw@mail.gmail.com>

Hi Chris,

Thank you for your comment.
I uploaded a new webrev. Could you review it again?

  http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.02/

I tested my change on Linux x64, but I cannot check it on other
platforms (including older Linux).
However, SA tests are included in the HotSpot tier 1 tests. Tests on the
submit repo work fine with this change
(mach5-one-ysuenaga-JDK-8205992-20180720-0305-31840).


Thanks,

Yasumasa


2018-07-20 3:26 GMT+09:00 Chris Plummer <chris.plummer at oracle.com>:
> Hi Yasumasa,
>
>   84     // It maps the LWPID in the host to it in the container.
>
> "it" -> "the PID"
>
>  286     // Get LWPID in the host from the container's LWPID.
>  287     public int getHostPID(int id) {
>  288         try {
>  289             return nspidMap.get(id);
>  290         } catch (NullPointerException e) {
>  291             return -1;
>  292         }
>  293     }
>
> What is the source of the NPE here? Is it because nspidMap was never
> initialized because the process is not in a container? In that case I think
> you should be checking for null rather than having an NPE be part of normal
> execution.
>
>   42             int hostPID =
> ((LinuxDebuggerLocal)debugger).getHostPID(pid);
>   43             if (hostPID != -1) {
>   44                 pid = hostPID;
>   45             }
>
> A comment here would be helpful.
>
> The rest looks good. I should probably run it through some internal testing.
> Let me know when you have a final webrev.
>
> thanks,
>
> Chris
>
>
> On 7/18/18 5:59 AM, Yasumasa Suenaga wrote:
>>
>> PING:
>>
>> Could you review it?
>>
>>    JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>
>> This change has been reviewed by Jini.
>> We need a Reviewer.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2018/07/12 13:42, Yasumasa Suenaga wrote:
>>>
>>> Thanks Jini,
>>>
>>> I uploaded new webrev. It contains some comments and removing extra
>>> space.
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>>
>>>
>>> Yasumasa
>>>
>>>
>>>
>>> 2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
>>>>
>>>> Hi Yasumasa,
>>>>
>>>> This looks good to me except for one nit. And some more comments would
>>>> help.
>>>> For e.g., it would help to say that NSPidMap is to map the host to
>>>> container
>>>> lwpids.
>>>>
>>>> The nit:
>>>>
>>>> *
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
>>>> Line 253: extra space after the parentheses
>>>>
>>>> Thanks,
>>>> Jini.
>>>>
>>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>>>
>>>>>
>>>>> PING: Could you review it?
>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>>>
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this change.
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>>
>>>>>> I tried to attach jhsdb to java process in docker container from
>>>>>> container host, but it couldn't.
>>>>>> jcmd supports PID namespace in JDK-8193710, but jhsdb hasn't yet.
>>>>>>
>>>>>> SA gets LWP ID via thread stack and funcs in libthread_db.so, but they
>>>>>> returns PIDs in container - they are different from host's PID. So I
>>>>>> added
>>>>>> the code to scan /proc/<PID>/task to get all LWP IDs and they are kept
>>>>>> in a
>>>>>> Map in LinuxDebuggerLocal.
>>>>>>
>>>>>> Also SA_ALTROOT is set to /proc/<PID>/root if SA detects debuggee runs
>>>>>> in
>>>>>> container. It helps SA to parse binaries in container.
>>>>>>
>>>>>> This change has been pushed to submit repo, and it was failed on OS X
>>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>>>> But I guess it causes JDK-8205906. This change affects to Linux only.
>>>>>>
>>>>>> Could you review it?
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>
>
>
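
On Chris's question above about the NullPointerException in getHostPID():
a minimal sketch of the null-check variant, assuming nspidMap is a
Map<Integer, Integer> that stays null when the debuggee is not running in a
container. The class name HostPidLookupSketch, the constructor and the field
type are assumptions for illustration and are not taken from the webrev. Note
that an NPE could also come from auto-unboxing the result for a missing key,
so the sketch checks both cases.

```java
import java.util.HashMap;
import java.util.Map;

public class HostPidLookupSketch {
    // Hypothetical field: maps a container LWPID to the host LWPID.
    // Assumed to be null when the process is not in a container.
    private final Map<Integer, Integer> nspidMap;

    public HostPidLookupSketch(Map<Integer, Integer> nspidMap) {
        this.nspidMap = nspidMap;
    }

    // Get the LWPID in the host from the container's LWPID, or -1 if unknown.
    public int getHostPID(int id) {
        if (nspidMap == null) {
            return -1; // not in a container: no mapping was built
        }
        Integer hostPid = nspidMap.get(id);
        // Explicit null check avoids an NPE from auto-unboxing a missing key.
        return (hostPid == null) ? -1 : hostPid;
    }

    public static void main(String[] args) {
        Map<Integer, Integer> map = new HashMap<>();
        map.put(7, 12345);
        System.out.println(new HostPidLookupSketch(map).getHostPID(7));
        System.out.println(new HostPidLookupSketch(map).getHostPID(8));
        System.out.println(new HostPidLookupSketch(null).getHostPID(7));
    }
}
```

With this shape the caller can keep its "hostPID != -1" test unchanged, and
no exception is part of normal execution.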

From ralf.schmelter at sap.com  Fri Jul 20 14:28:09 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 20 Jul 2018 14:28:09 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
Message-ID: <6de6362944f84740b80abb22cbbea872@sap.com>

Hi Serguei,

I've updated the webrev: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/

JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). If it had been, the old code would have removed all native methods from the call stack. The original JVMDI call did indeed return JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the JVMDI->JVMTI transition.

I've tried to make the test more readable and added some comments to explain why it is done the way it is.

Best regards,
Ralf


From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] 
Sent: Mittwoch, 18. Juli 2018 22:57
To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe, Thomas <thomas.stuefe at sap.com>
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

Hi Ralf,

The fix itself looks pretty good to me.
Some minor comments.

The copyright year needs an update.
 218     jint count, filledIn;

Could you please split the declarations above into separate lines to follow the local style?
It is interesting that the original implementation checked the error code returned
from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
However, the GetFrameLocation spec does not list this error code as possible.


Some comments about the test.
  52     static void callEnded() {
  53         System.out.println("SOE occurred as expected");
  54     }
  55 
  56     static int call(int depth) {
  57         if (depth == 0) {
  58             // Should have seen a stack overflow by now.
  59             System.out.println("Exited without creating SOE");
  60             System.exit(0);
  61         }
  62 
  63         try {
  64             int newDepth = call(depth - 1);
  65 
  66             if (newDepth == -1_000) {
  67                 // Pop some frames so there is room on the stack for the
  68                 // println()
  69                 callEnded();
  70             }
  71 
  72             return newDepth - 1;
  73         } catch (StackOverflowError e) {
  74             return -1;
  75         }
  76     }
  77 }
  I'd suggest to rename the methods call() and callEnded() to something like
  recursiveMethod() and recursionEnd().
  Also, the manipulations with SOE create a complexity and are confusing.
  Could it be more simple to let it propagated and then catch in main()?
  What is the point for all these checks at the lines 104-119?
  In general, I'm looking for some ways to make it more clear, simple and stable.

Thanks,
Serguei
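
The suggestion quoted above, to let the StackOverflowError propagate and
catch it in one place, could look roughly like the sketch below. The class
name RecursionSketch and the depthAtOverflow() helper are hypothetical, and
this is not the code from the webrev. A real JDI test also has to keep the
deep stack alive at the point where the debugger inspects it, which this
sketch does not address.

```java
public class RecursionSketch {
    // Deepest recursion level reached before the stack overflowed.
    private static int deepest = 0;

    // Recurses without bound; the StackOverflowError is not caught here.
    static void recursiveMethod(int depth) {
        deepest = depth;
        recursiveMethod(depth + 1);
    }

    // Runs the recursion and catches the error once, at the outermost level.
    static int depthAtOverflow() {
        deepest = 0;
        try {
            recursiveMethod(1);
        } catch (StackOverflowError e) {
            // The stack has fully unwound here, so there is room to continue.
        }
        return deepest;
    }

    public static void main(String[] args) {
        int depth = depthAtOverflow();
        System.out.println("SOE occurred as expected at depth " + depth);
    }
}
```

Catching the error only at the top avoids the "pop some frames before
println()" bookkeeping, at the cost of losing the deep stack by the time the
message is printed.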

From vladimir.kozlov at oracle.com  Fri Jul 20 17:52:56 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 20 Jul 2018 10:52:56 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
Message-ID: <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>

I asked Igor V. to look.

It seems the review is being done in another thread, which does not have the
bug id in its subject. The current version is webrev.03.

Vladimir

On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
> Thanks, Rahul!
> In fact, there no good experts for this area in the serviceability team.
> It would be much better if anyone from the Compiler team could do it.
> 
> Vladimir K.,
> 
> Is there anyone from the Compiler team available to review this?
> Otherwise, I could try to review it but am not sure about my review 
> quality.
> 
> Thanks,
> Serguei
> 
> 
> On 7/19/18 00:48, Rahul Raghavan wrote:
>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>>
>> (just adding + hotspot-compiler-dev also)
>>
>>
>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
>> Subject Was:
>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>>
>> + serviceability-dev
>>
>> Hi all,
>>
>> Could anyone else give me a review of this webrev and check/test the
>> various architecture changes?
>>
>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>
>>
>> Thanks for all your help!
>> Jc
>>
>>
>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Here is a webrev that does all the architectures in the same way:
>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>>
>>>> Could anyone review the other architectures and test?
>>>>    - arm, sparc & aarch64 are also modified now to follow the same 
>>>> "if no
>>>> tlab, then consider eden space allocation" logic.
>>>>
>>>> Thanks for your help!
>>>> Jc
>>>>
>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com> wrote:
>>>>
>>>>> Hi Kim,
>>>>>
>>>>> I opened this bug
>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
>>>>>
>>>>> and now I've done an update:
>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>>>>>
>>>>> I basically have done your nits but also removed the try_eden (it was
>>>>> used to bind a label but was not used). I updated the comments to 
>>>>> use the
>>>>> one you preferred.
>>>>>
>>>>> I still have to do the other architectures though but at least we 
>>>>> seem to
>>>>> have a consensus on this architecture, correct?
>>>>>
>>>>> Thanks for the review,
>>>>> Jc
>>>>>
>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com>
>>>>> wrote:
>>>>>
>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>>>>>
>>>>>>> Yes, you are right, I did those changes due to:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>>>>>>>
>>>>>>> If Robbin agrees to this change, and if no one sees an issue, 
>>>>>>> I'll go
>>>>>> ahead
>>>>>>> and propagate the change across architectures.
>>>>>>>
>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's 
>>>>>>> comment
>>>>>> and
>>>>>>> review) :)
>>>>>>> Jc
>>>>>>>
>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not sure if we had left this case intentionally or not but, 
>>>>>>>> if we
>>>>>> want
>>>>>>>> it all to be consistent, we should perhaps fix it.
>>>>>>>>
>>>>>>>>
>>>>>>>> Well, you put in that logic last February, so unless somebody 
>>>>>>>> speaks
>>>>>> up
>>>>>>>> quickly, I support your adjusting it to be the way you want it.
>>>>>>>>
>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I 
>>>>>>>> src/hotspot/share"
>>>>>>>> suggests that the GC group is most active in touching this feature.
>>>>>>>> If Robbin is OK with it, there's your reviewer.
>>>>>>>>
>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
>>>>>>>> working on the GC to OK it.
>>>>>>>>
>>>>>>>> - John
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- 
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jc
>>>>>>
>>>>>> Robbin is on vacation; you might not hear from him for a while.
>>>>>>
>>>>>> I'm assuming you'll open a new bug for this?
>>>>>>
>>>>>> Except for a few minor nits (below), this looks okay to me.
>>>>>>
>>>>>> The comment at line 1052 needs updating.
>>>>>>
>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>>>
>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>>>>> line 1058, but unreferenced.
>>>>>>
>>>>>> I like the wording of the comment at 1139 better than the wording at
>>>>>> 1016.
>>>>>>
>>>>>>
>>>>>
>>>>> -- 
>>>>>
>>>>> Thanks,
>>>>> Jc
>>>>>
>>>>
>>>>
>>>> -- 
>>>>
>>>> Thanks,
>>>> Jc
>>>>
>>>
>>>
> 
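
For readers following the quoted thread, the "if no tlab, then consider eden
space allocation" logic can be modelled in a few lines. This is only a toy
model in Java of my reading of the thread, not the C1 stub code itself: the
class, enum and method names are hypothetical, and the two booleans stand in
for the UseTLAB flag and the supports_inline_contig_alloc query mentioned
above.

```java
public class AllocationPathSketch {
    // Hypothetical outcomes for the allocation fallback being discussed.
    enum Path { EDEN_INLINE, RUNTIME_CALL }

    // Model of the decision: inline eden allocation is only considered when
    // TLABs are disabled AND the heap supports inline contiguous allocation.
    static Path choosePath(boolean useTlab, boolean supportsInlineContigAlloc) {
        if (!useTlab && supportsInlineContigAlloc) {
            return Path.EDEN_INLINE;
        }
        // With TLABs enabled, eden is never touched directly on this path.
        return Path.RUNTIME_CALL;
    }

    public static void main(String[] args) {
        boolean[] flags = {true, false};
        for (boolean tlab : flags) {
            for (boolean contig : flags) {
                System.out.println("UseTLAB=" + tlab
                        + " supportsInlineContigAlloc=" + contig
                        + " -> " + choosePath(tlab, contig));
            }
        }
    }
}
```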

From vladimir.kozlov at oracle.com  Fri Jul 20 17:57:11 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Fri, 20 Jul 2018 10:57:11 -0700
Subject: RFR (S): C1 still does eden allocations when TLAB is enabled
In-Reply-To: <CAF9BGBweLjX0BOOmcNCwwsLiAz2Jimx8AHjWiMC7mdeCYvdtJg@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <a44df8bb9dbd7e951d6fdc2b46d106c42a556f16.camel@oracle.com>
 <CAF9BGBweLjX0BOOmcNCwwsLiAz2Jimx8AHjWiMC7mdeCYvdtJg@mail.gmail.com>
Message-ID: <22211468-5b15-e6a8-be6b-7ce5d2fbdf27@oracle.com>

Please, don't do review in 2 mailing threads.

Thanks,
Vladimir

On 7/20/18 8:30 AM, JC Beyler wrote:
> Awesome thanks Thomas!
> 
> Here is the webrev with the extra information then:
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
> 
> Thanks again for all the reviews everyone!
> Jc
> 
> On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl <thomas.schatzl at oracle.com>
> wrote:
> 
>> Hi,
>>
>> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote:
>>> Hi all,
>>>
>>> Here is a webrev that does all the architectures in the same way:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>> Could anyone review the other architectures and test?
>>>    - arm, sparc & aarch64 are also modified now to follow the same "if
>>> no
>>> tlab, then consider eden space allocation" logic.
>>>
>>> Thanks for your help!
>>> Jc
>>>
>>
>>    looks good.
>>
>> I ran the change through hs-tier1-3 with no issues. It only tests on
>> sparc and x64 though.
>>
>> I do not expect issues on the other platforms though :)
>>
>> Thanks,
>>    Thomas
>>
>>
> 

From serguei.spitsyn at oracle.com  Fri Jul 20 18:18:20 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Jul 2018 11:18:20 -0700
Subject: RFR (S): 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <CAF9BGBweLjX0BOOmcNCwwsLiAz2Jimx8AHjWiMC7mdeCYvdtJg@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <a44df8bb9dbd7e951d6fdc2b46d106c42a556f16.camel@oracle.com>
 <CAF9BGBweLjX0BOOmcNCwwsLiAz2Jimx8AHjWiMC7mdeCYvdtJg@mail.gmail.com>
Message-ID: <c9561d70-8831-9b43-fb38-38f94426c7f6@oracle.com>

Restored the bug number and added back the hotspot-dev and 
serviceability-dev mailing lists.

Thanks,
Serguei


On 7/20/18 08:30, JC Beyler wrote:
> Awesome thanks Thomas!
>
> Here is the webrev with the extra information then:
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
>
> Thanks again for all the reviews everyone!
> Jc
>
> On Fri, Jul 20, 2018 at 3:23 AM Thomas Schatzl <thomas.schatzl at oracle.com>
> wrote:
>
>> Hi,
>>
>> On Mon, 2018-07-16 at 14:58 -0700, JC Beyler wrote:
>>> Hi all,
>>>
>>> Here is a webrev that does all the architectures in the same way:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>> Could anyone review the other architectures and test?
>>>    - arm, sparc & aarch64 are also modified now to follow the same "if
>>> no
>>> tlab, then consider eden space allocation" logic.
>>>
>>> Thanks for your help!
>>> Jc
>>>
>>    looks good.
>>
>> I ran the change through hs-tier1-3 with no issues. It only tests on
>> sparc and x64 though.
>>
>> I do not expect issues on the other platforms though :)
>>
>> Thanks,
>>    Thomas
>>
>>


From serguei.spitsyn at oracle.com  Fri Jul 20 18:21:56 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Jul 2018 11:21:56 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
Message-ID: <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>

Thank you a lot, Vladimir!
Yes, webrev.03 is the latest.
Jc will correct us if that is not right.

Thanks,
Serguei


On 7/20/18 10:52, Vladimir Kozlov wrote:
> I asked Igor V. to look.
>
> Seems like review is done in an other thread which does not have bug 
> id in subject. Currently webrev.03
>
> Vladimir
>
> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
>> Thanks, Rahul!
>> In fact, there no good experts for this area in the serviceability team.
>> It would be much better if anyone from the Compiler team could do it.
>>
>> Vladimir K.,
>>
>> Is there anyone from the Compiler team available to review this?
>> Otherwise, I could try to review it but am not sure about my review 
>> quality.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/19/18 00:48, Rahul Raghavan wrote:
>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>>>
>>> (just adding + hotspot-compiler-dev also)
>>>
>>>
>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
>>> Subject Was:
>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>>>
>>> + serviceability-dev
>>>
>>> Hi all,
>>>
>>> Could anyone else give me a review of this webrev and check/test the
>>> various architecture changes?
>>>
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>
>>>
>>> Thanks for all your help!
>>> Jc
>>>
>>>
>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Here is a webrev that does all the architectures in the same way:
>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>>>
>>>>> Could anyone review the other architectures and test?
>>>>>    - arm, sparc & aarch64 are also modified now to follow the same 
>>>>> "if no
>>>>> tlab, then consider eden space allocation" logic.
>>>>>
>>>>> Thanks for your help!
>>>>> Jc
>>>>>
>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com> 
>>>>> wrote:
>>>>>
>>>>>> Hi Kim,
>>>>>>
>>>>>> I opened this bug
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
>>>>>>
>>>>>> and now I've done an update:
>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>>>>>>
>>>>>> I basically have done your nits but also removed the try_eden (it 
>>>>>> was
>>>>>> used to bind a label but was not used). I updated the comments to 
>>>>>> use the
>>>>>> one you preferred.
>>>>>>
>>>>>> I still have to do the other architectures though but at least we 
>>>>>> seem to
>>>>>> have a consensus on this architecture, correct?
>>>>>>
>>>>>> Thanks for the review,
>>>>>> Jc
>>>>>>
>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com>
>>>>>> wrote:
>>>>>>
>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com> 
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Yes, you are right, I did those changes due to:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>>>>>>>>
>>>>>>>> If Robbin agrees to this change, and if no one sees an issue, 
>>>>>>>> I'll go
>>>>>>> ahead
>>>>>>>> and propagate the change across architectures.
>>>>>>>>
>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's 
>>>>>>>> comment
>>>>>>> and
>>>>>>>> review) :)
>>>>>>>> Jc
>>>>>>>>
>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm not sure if we had left this case intentionally or not 
>>>>>>>>> but, if we
>>>>>>> want
>>>>>>>>> it all to be consistent, we should perhaps fix it.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Well, you put in that logic last February, so unless somebody 
>>>>>>>>> speaks
>>>>>>> up
>>>>>>>>> quickly, I support your adjusting it to be the way you want it.
>>>>>>>>>
>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I 
>>>>>>>>> src/hotspot/share"
>>>>>>>>> suggests that the GC group is most active in touching this 
>>>>>>>>> feature.
>>>>>>>>> If Robbin is OK with it, there's your reviewer.
>>>>>>>>>
>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
>>>>>>>>> working on the GC to OK it.
>>>>>>>>>
>>>>>>>>> - John
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jc
>>>>>>>
>>>>>>> Robbin is on vacation; you might not hear from him for a while.
>>>>>>>
>>>>>>> I'm assuming you'll open a new bug for this?
>>>>>>>
>>>>>>> Except for a few minor nits (below), this looks okay to me.
>>>>>>>
>>>>>>> The comment at line 1052 needs updating.
>>>>>>>
>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>>>>
>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>>>>>> line 1058, but unreferenced.
>>>>>>>
>>>>>>> I like the wording of the comment at 1139 better than the 
>>>>>>> wording at
>>>>>>> 1016.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> -- 
>>>>>>
>>>>>> Thanks,
>>>>>> Jc
>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>>
>>>>> Thanks,
>>>>> Jc
>>>>>
>>>>
>>>>
>>


From chris.plummer at oracle.com  Fri Jul 20 18:37:22 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Jul 2018 11:37:22 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B507F2C.4080503@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
 <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
 <5B507F2C.4080503@oracle.com>
Message-ID: <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com>

Hi Gary,

The test fails if the breakpoint event comes in after the test captures 
the initial thread suspend counts and before the test captures the 2nd 
suspend counts.

debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
EventHandler> Received event set with policy = SUSPEND_ALL
EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
debugger> Received communication breakpoint event.
debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2}

So we end up with some threads starting with 1 suspend and ending with 2 
(not clear to me why main is still at 1).

It will pass if the breakpoint comes in after it does both of suspend 
count checks, as you have shown with the sleep(100) solution. Output 
looks like this:

debugger>        got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
...
debugger> ......--> vm.suspend();
debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
...
debugger> Received communication breakpoint event.

I've also shown that it passes if the breakpoint always comes in before 
capturing the initial suspend counts. I added a sleep on the debugger 
side right after eventHandler.waitForRequestedEventSet() returns. Output 
looks like:

debugger> Received communication breakpoint event.
debugger>        got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
...
debugger> ......--> vm.suspend();
debugger>         getting : Map<String, Integer> suspendsCounts1
debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}
debugger>         eventSet.resume;
debugger>         getting : Map<String, Integer> suspendsCounts2
debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, Finalizer=2}

I think we should add synchronization to force one of these two 
outcomes. For the first, you would need to make the debugger modify some 
variable that the debuggee is watching (sitting in a loop waiting for it 
to change). For the second, you can rely on the existing 
methodForCommunication() approach. You just need to restructure the 
debugger a bit. I had started down this path late Wednesday, but got 
sidetracked by a few other things. I can look into it some more if you'd 
like.
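
As a rough illustration of the first option (the debuggee sitting in a loop
until the debugger changes a variable), here is a minimal single-process
sketch. The class name, the flag name and the awaitFlag() helper are
hypothetical; a second thread stands in for the debugger, which in a real
JDI test would flip the field remotely, for example through
ClassType.setValue(). The timeout is only there so a broken handshake fails
instead of hanging.

```java
import java.util.concurrent.TimeUnit;

public class HandshakeSketch {
    // Hypothetical flag the debuggee watches; volatile so the write made by
    // the other thread is visible to the polling loop.
    static volatile boolean countsCaptured = false;

    // Debuggee side: wait until the flag is set or the timeout expires.
    // Returns true if the flag was observed, false on timeout.
    static boolean awaitFlag(long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!countsCaptured) {
            if (System.nanoTime() - deadline > 0) {
                return false;
            }
            Thread.sleep(10);
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the debugger: it would capture the first suspend
        // counts, resume, capture the second counts, then release the debuggee.
        Thread debugger = new Thread(() -> countsCaptured = true);
        debugger.start();

        boolean released = awaitFlag(5000);
        debugger.join();
        System.out.println(released
                ? "released: safe to call methodForCommunication()"
                : "timed out waiting for the debugger");
    }
}
```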

thanks,

Chris

On 7/19/18 5:08 AM, Gary Adams wrote:
> In the successful run below "the first acquire thread suspend counts, 
> resume,
> and the second acquire thread suspend counts" is not interrupted by the
> breakpoint event.
>
> Note that in the failed case the thread0 test thread finishes rapidly:
> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0
> *[2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0*
>
> and in the successful test run, the thread0 run method exits after
> thread1 has started.
>
> debugger> :::::: case: # 1
> debugger> ......waiting for new ThreadStartEvent : 1
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 616bc3ae
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae
> EventHandler> waitForRequestedEventSet: vm.resume called
> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
> *debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread0*
>
>
> Here's a recent mach5 failed log:
> [2018-01-22T20:33:45.65] #
> [2018-01-22T20:33:45.65] export TEST_CLEANUP
> [2018-01-22T20:33:45.65] export SHELL
> [2018-01-22T20:33:45.65] export DISPLAY
> [2018-01-22T20:33:45.65] export LIBJSIG_PATH
> [2018-01-22T20:33:45.65] export TESTBASE
> [2018-01-22T20:33:45.65] export JAVA_OPTS
> [2018-01-22T20:33:45.65] export RAS_OPTIONS
> [2018-01-22T20:33:45.65] export HOME
> [2018-01-22T20:33:45.65] export LD_LIBRARY_PATH
> [2018-01-22T20:33:45.65] export CLASSPATH
> [2018-01-22T20:33:45.65] export TEMP
> [2018-01-22T20:33:45.65] export TESTED_JAVA_HOME
> [2018-01-22T20:33:45.65] export BASH_ENV
> [2018-01-22T20:33:45.65] export PATH
> [2018-01-22T20:33:45.65] TEST_DEST_DIR="resume008"
> [2018-01-22T20:33:45.65] # Actual: TEST_DEST_DIR=resume008
> [2018-01-22T20:33:45.65] TESTNAME="${test_case_name}"
> [2018-01-22T20:33:45.65] # Actual: TESTNAME=resume008
> [2018-01-22T20:33:45.65] testName="nsk/jdi/EventSet/resume//resume008"
> [2018-01-22T20:33:45.65] # Actual: testName=nsk/jdi/EventSet/resume//resume008
> [2018-01-22T20:33:45.65] TESTDIR="${test_work_dir}"
> [2018-01-22T20:33:45.65] # Actual: TESTDIR=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008
> [2018-01-22T20:33:45.65] testWorkDir="${test_work_dir}/"
> [2018-01-22T20:33:45.65] # Actual: testWorkDir=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/
> [2018-01-22T20:33:45.65] export testWorkDir
> [2018-01-22T20:33:45.65] tlogOutFile="${test_work_dir}/${test_name}.tlog"
> [2018-01-22T20:33:45.65] # Actual: tlogOutFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.tlog
> [2018-01-22T20:33:45.65] testErrFile="${test_work_dir}/${test_name}.err"
> [2018-01-22T20:33:45.65] # Actual: testErrFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.err
> [2018-01-22T20:33:45.65] EXECUTE_CLASS="${test_name}"
> [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=resume008
> [2018-01-22T20:33:45.66] NSK_STRESS_METASPACE_OPTS="-XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m -Xlog:gc(ASTERISK_SUBST),gc+heap=trace"
> [2018-01-22T20:33:45.66] # Actual: NSK_STRESS_METASPACE_OPTS=-XX:MaxMetaspaceSize=128m -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m -Xlog:gc*,gc+heap=trace
> [2018-01-22T20:33:45.66] export NSK_STRESS_METASPACE_OPTS
> [2018-01-22T20:33:45.66] EXECUTE_CLASS="nsk.jdi.EventSet.resume.resume008"
> [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=nsk.jdi.EventSet.resume.resume008
> [2018-01-22T20:33:45.66] TEST_ARGS="${JDI_TEST_KEYS} -debugee.vmkeys=${JDI_DEBUGEE_VM_KEYS}"
> [2018-01-22T20:33:45.66] # Actual: TEST_ARGS=-verbose -arch=linux-amd64 -waittime=5 -debugee.vmkind=java -transport.address=dynamic -debugee.vmkeys=-XX:MaxRAMPercentage=12.5
> [2018-01-22T20:33:45.66] JAVA="${TESTED_JAVA_HOME}/bin/${DEBUGGER_KIND_OF_JAVA}"
> [2018-01-22T20:33:45.66] # Actual: JAVA=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java
> [2018-01-22T20:33:45.66] JAVA_OPTS="${DEBUGGER_JAVA_OPTS}"
> [2018-01-22T20:33:45.66] # Actual: JAVA_OPTS=
> [2018-01-22T20:33:45.66] APPLICATION_TIMEOUT="${TIMEOUT}"
> [2018-01-22T20:33:45.66] # Actual: APPLICATION_TIMEOUT=30
> [2018-01-22T20:33:45.66] CLASSPATH="${test_work_dir}${PS}${CLASSPATH}"
> [2018-01-22T20:33:45.66] # Actual: CLASSPATH=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008:/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.test/hotspot/closed/tonga/bin/classes:
> [2018-01-22T20:33:45.66] export CLASSPATH
> [2018-01-22T20:33:45.66] ${JAVA} ${JAVA_OPTS} ${EXECUTE_CLASS} ${TEST_ARGS}
> [2018-01-22T20:33:45.66] # Actual: /scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java nsk.jdi.EventSet.resume.resume008 -verbose -arch=linux-amd64 -waittime=5 -debugee.vmkind=java -transport.address=dynamic -debugee.vmkeys=-XX:MaxRAMPercentage=12.5
> [2018-01-22T20:33:46.01] binder> VirtualMachineManager: version 9.0
> [2018-01-22T20:33:46.05] binder> Finding connector: default
> [2018-01-22T20:33:46.05] binder> LaunchingConnector:
> [2018-01-22T20:33:46.06] binder> name: com.sun.jdi.CommandLineLaunch
> [2018-01-22T20:33:46.06] binder> description: Launches target using Sun Java VM command line and attaches to it
> [2018-01-22T20:33:46.06] binder> transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 457e2f02
> [2018-01-22T20:33:46.19] binder> Connector arguments:
> [2018-01-22T20:33:46.19] binder> home=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10
> [2018-01-22T20:33:46.19] binder> vmexec=java
> [2018-01-22T20:33:46.19] binder> options=-XX:MaxRAMPercentage=12.5
> [2018-01-22T20:33:46.20] binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" "-arch=linux-amd64" "-waittime=5" "-debugee.vmkind=java" "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=12.5" "-pipe.port=28038"
> [2018-01-22T20:33:46.20] binder> quote="
> [2018-01-22T20:33:46.20] binder> suspend=true
> [2018-01-22T20:33:46.20] binder> Launching debugee
> [2018-01-22T20:33:46.56] binder> Waiting for VM initialized
> [2018-01-22T20:33:46.60] Initial VMStartEvent received: VMStartEvent in thread main
> [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 1e7c7811
> [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 1a3869f4
> [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 77f99a05
> [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 3aeaafa6
> [2018-01-22T20:33:46.61] EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 4d3167f4
> [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: enabling remove of listener nsk.share.jdi.EventHandler$6 at 4eb7f003
> [2018-01-22T20:33:46.62] EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 4eb7f003
> [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: 
> vm.resume called [2018-01-22T20:33:46.67] EventHandler> Received event 
> set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.68] 
> EventHandler> Event: ClassPrepareEventImpl req class prepare request 
> (enabled) [2018-01-22T20:33:46.69] EventHandler> 
> waitForRequestedEvent: Received event(ClassPrepareEvent in thread 
> main) for request(class prepare request (enabled)) 
> [2018-01-22T20:33:46.69] EventHandler> Removing listener 
> nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.69] 
> debugger> Received ClassPrepareEvent for debuggee class: 
> nsk.jdi.EventSet.resume.resume008a [2018-01-22T20:33:46.71] binder> 
> Breakpoint set: [2018-01-22T20:33:46.71] breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:60 (disabled) 
> [2018-01-22T20:33:46.71] EventHandler> Adding listener 
> nsk.share.jdi.TestDebuggerType1$1 at 43738a82 [2018-01-22T20:33:46.71] 
> debugger> TESTING BEGINS [2018-01-22T20:33:46.71] debugger> RESUME 
> DEBUGGEE VM [2018-01-22T20:33:46.72] debugger> 
> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.72] debugger> 
> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. 
> [2018-01-22T20:33:46.84] EventHandler> Received event set with policy 
> = SUSPEND_ALL [2018-01-22T20:33:46.84] EventHandler> Event: 
> BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
> [2018-01-22T20:33:46.84] debugger> Received communication breakpoint 
> event. [2018-01-22T20:33:46.84] debugger> shouldRunAfterBreakpoint: 
> received breakpoint event. [2018-01-22T20:33:46.84] debugee.stderr> 
> **> debuggee: debuggee started! [2018-01-22T20:33:46.85] debugger> 
> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.85] 
> debugger> :::::: case: # 0 [2018-01-22T20:33:46.85] debugger> 
> ......waiting for new ThreadStartEvent : 0 [2018-01-22T20:33:46.85] 
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.85] 
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 6ec8211c 
> [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: 
> vm.resume called [2018-01-22T20:33:46.86] debugee.stderr> **> 
> debuggee: 'run': enter :: threadName == thread0 
> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: 
> threadName == thread0 [2018-01-22T20:33:46.86] EventHandler> Received 
> event set with policy = SUSPEND_NONE [2018-01-22T20:33:46.86] 
> EventHandler> waitForRequestedEventSet: Received event set for 
> request: thread start request (enabled) [2018-01-22T20:33:46.86] 
> EventHandler> Event: ThreadStartEventImpl req thread start request 
> (enabled) [2018-01-22T20:33:46.86] EventHandler> Removing listener 
> nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.86] 
> debugger> got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest1 [2018-01-22T20:33:46.86] debugger> ......checking 
> up on EventSet.resume() [2018-01-22T20:33:46.86] debugger> ......--> 
> vm.suspend(); [2018-01-22T20:33:46.87] debugger> getting : Map<String, 
> Integer> suspendsCounts1 [2018-01-22T20:33:46.87] debugger> {Reference 
> Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1} 
> [2018-01-22T20:33:46.87] debugger> eventSet.resume; 
> [2018-01-22T20:33:46.87] debugger> getting : Map<String, Integer> 
> suspendsCounts2 [2018-01-22T20:33:46.87] EventHandler> Received event 
> set with policy = SUSPEND_ALL [2018-01-22T20:33:46.87] EventHandler> 
> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
> [2018-01-22T20:33:46.87] debugger> Received communication breakpoint 
> event. [2018-01-22T20:33:46.87] debugger> {Reference Handler=2, 
> Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2} 
> [2018-01-22T20:33:46.87] debugger> getting : int policy = 
> eventSet.suspendPolicy(); [2018-01-22T20:33:46.87] debugger> case 
> SUSPEND_NONE [2018-01-22T20:33:46.87] debugger> checking Reference 
> Handler [2018-01-22T20:33:46.87] # ERROR: debugger> ERROR: 
> suspendCounts don't match for : Reference Handler 
> [2018-01-22T20:33:46.88] The following stacktrace is for Aurora. Used 
> to create a RULE: [2018-01-22T20:33:46.88] nsk.share.TestFailure: 
> debugger> ERROR: suspendCounts don't match for : Reference Handler 
> [2018-01-22T20:33:46.88] at 
> nsk.share.Log.logExceptionForAurora(Log.java:411) 
> [2018-01-22T20:33:46.88] at nsk.share.Log.complain(Log.java:380) 
> [2018-01-22T20:33:46.88] at 
> nsk.share.jdi.TestDebuggerType1.complain(TestDebuggerType1.java:63) 
> [2018-01-22T20:33:46.88] at 
> nsk.jdi.EventSet.resume.resume008.testRun(resume008.java:163) 
> [2018-01-22T20:33:46.88] at 
> nsk.share.jdi.TestDebuggerType1.runThis(TestDebuggerType1.java:104) 
> [2018-01-22T20:33:46.88] at 
> nsk.jdi.EventSet.resume.resume008.run(resume008.java:62) 
> [2018-01-22T20:33:46.88] at 
> nsk.jdi.EventSet.resume.resume008.main(resume008.java:57) 
> [2018-01-22T20:33:46.88] # ERROR: debugger> before resuming : 1 
> [2018-01-22T20:33:46.88] # ERROR: debugger> after resuming : 2 
> [2018-01-22T20:33:46.88] debugger> ......--> vm.resume() 
> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: entered 
> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: received 
> breakpoint event. [2018-01-22T20:33:46.88] debugger> 
> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.88] 
> debugger> :::::: case: # 1 [2018-01-22T20:33:46.88] debugger> 
> ......waiting for new ThreadStartEvent : 1 [2018-01-22T20:33:46.88] 
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] 
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 548ad73b 
> [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: 
> vm.resume called [2018-01-22T20:33:46.88] EventHandler> Received event 
> set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.88] 
> EventHandler> waitForRequestedEventSet: Received event set for 
> request: thread start request (enabled) [2018-01-22T20:33:46.88] 
> EventHandler> Event: ThreadStartEventImpl req thread start request 
> (enabled) [2018-01-22T20:33:46.88] EventHandler> Removing listener 
> nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] 
> debugger> got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest2 [2018-01-22T20:33:46.88] debugger> ......checking 
> up on EventSet.resume() [2018-01-22T20:33:46.88] debugger> ......--> 
> vm.suspend(); [2018-01-22T20:33:46.88] debugger> getting : Map<String, 
> Integer> suspendsCounts1 [2018-01-22T20:33:46.89] debugger> {Reference 
> Handler=1, thread1=2, Common-Cleaner=1, main=1, Signal Dispatcher=1, 
> Finalizer=1} [2018-01-22T20:33:46.89] debugger> eventSet.resume; 
> [2018-01-22T20:33:46.89] debugger> getting : Map<String, Integer> 
> suspendsCounts2 [2018-01-22T20:33:46.89] debugger> {Reference 
> Handler=1, thread1=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, 
> Finalizer=1} [2018-01-22T20:33:46.89] debugger> getting : int policy = 
> eventSet.suspendPolicy(); [2018-01-22T20:33:46.89] debugger> case 
> SUSPEND_THREAD [2018-01-22T20:33:46.89] debugger> checking Reference 
> Handler [2018-01-22T20:33:46.89] debugger> checking thread1 
> [2018-01-22T20:33:46.89] debugger> checking Common-Cleaner 
> [2018-01-22T20:33:46.89] debugger> checking main 
> [2018-01-22T20:33:46.90] debugger> checking Signal Dispatcher 
> [2018-01-22T20:33:46.90] debugger> checking Finalizer 
> [2018-01-22T20:33:46.90] debugger> ......--> vm.resume() 
> [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: entered 
> [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: waiting 
> for breakpoint event during 1 sec. [2018-01-22T20:33:46.90] 
> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 
> [2018-01-22T20:33:46.90] debugee.stderr> **> debuggee: 'run': exit :: 
> threadName == thread1 [2018-01-22T20:33:46.90] EventHandler> Received 
> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] 
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
> [2018-01-22T20:33:46.90] debugger> Received communication breakpoint 
> event. [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: 
> received breakpoint event. [2018-01-22T20:33:46.90] debugger> 
> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.90] 
> debugger> :::::: case: # 2 [2018-01-22T20:33:46.90] debugger> 
> ......waiting for new ThreadStartEvent : 2 [2018-01-22T20:33:46.90] 
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] 
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 2641e737 
> [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: 
> vm.resume called [2018-01-22T20:33:46.90] EventHandler> Received event 
> set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] EventHandler> 
> waitForRequestedEventSet: Received event set for request: thread start 
> request (enabled) [2018-01-22T20:33:46.90] EventHandler> Event: 
> ThreadStartEventImpl req thread start request (enabled) 
> [2018-01-22T20:33:46.90] EventHandler> Removing listener 
> nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] 
> debugger> got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest3 [2018-01-22T20:33:46.90] debugger> ......checking 
> up on EventSet.resume() [2018-01-22T20:33:46.90] debugger> ......--> 
> vm.suspend(); [2018-01-22T20:33:46.90] debugger> getting : Map<String, 
> Integer> suspendsCounts1 [2018-01-22T20:33:46.91] debugger> {Reference 
> Handler=2, thread2=2, Common-Cleaner=2, main=2, Signal Dispatcher=2, 
> Finalizer=2} [2018-01-22T20:33:46.91] debugger> eventSet.resume; 
> [2018-01-22T20:33:46.91] debugger> getting : Map<String, Integer> 
> suspendsCounts2 [2018-01-22T20:33:46.91] debugger> {Reference 
> Handler=1, thread2=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, 
> Finalizer=1} [2018-01-22T20:33:46.91] debugger> getting : int policy = 
> eventSet.suspendPolicy(); [2018-01-22T20:33:46.91] debugger> case 
> SUSPEND_ALL [2018-01-22T20:33:46.91] debugger> checking Reference 
> Handler [2018-01-22T20:33:46.91] debugger> checking thread2 
> [2018-01-22T20:33:46.91] debugger> checking Common-Cleaner 
> [2018-01-22T20:33:46.91] debugger> checking main 
> [2018-01-22T20:33:46.91] debugger> checking Signal Dispatcher 
> [2018-01-22T20:33:46.91] debugger> checking Finalizer 
> [2018-01-22T20:33:46.91] debugger> ......--> vm.resume() 
> [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: entered 
> [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: waiting 
> for breakpoint event during 1 sec. [2018-01-22T20:33:46.91] 
> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 
> [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: 'run': exit :: 
> threadName == thread2 [2018-01-22T20:33:46.91] EventHandler> Received 
> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.91] 
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
> [2018-01-22T20:33:46.91] debugger> Received communication breakpoint 
> event. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: 
> received breakpoint event. [2018-01-22T20:33:46.91] debugger> 
> shouldRunAfterBreakpoint: received instruction from debuggee to 
> finish. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: 
> exited with false. [2018-01-22T20:33:46.91] debugger> TESTING ENDS 
> [2018-01-22T20:33:46.91] debugger> Waiting for debuggee's exit... 
> [2018-01-22T20:33:46.91] EventHandler> waitForVMDisconnect 
> [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: debuggee exits 
> [2018-01-22T20:33:46.92] EventHandler> Received event set with policy 
> = SUSPEND_NONE [2018-01-22T20:33:46.92] EventHandler> Event: 
> VMDeathEventImpl req null [2018-01-22T20:33:46.92] EventHandler> 
> receieved VMDeath [2018-01-22T20:33:46.92] EventHandler> Removing 
> listener nsk.share.jdi.EventHandler$3 at 77f99a05 
> [2018-01-22T20:33:47.25] EventHandler> Received event set with policy 
> = SUSPEND_NONE [2018-01-22T20:33:47.25] EventHandler> Event: 
> VMDisconnectEventImpl req null [2018-01-22T20:33:47.25] EventHandler> 
> receieved VMDisconnect [2018-01-22T20:33:47.25] EventHandler> Removing 
> listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 
> [2018-01-22T20:33:47.25] EventHandler> finished 
> [2018-01-22T20:33:47.25] EventHandler> waitForVMDisconnect: done 
> [2018-01-22T20:33:47.25] debugger> Event handler thread exited. 
> [2018-01-22T20:33:47.25] debugger> Debuggee PASSED. 
> [2018-01-22T20:33:47.26] [2018-01-22T20:33:47.26] 
> [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] #> SUMMARY: 
> Following errors occured [2018-01-22T20:33:47.26] #> during test 
> execution: [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] # 
> ERROR: debugger> ERROR: suspendCounts don't match for : Reference 
> Handler [2018-01-22T20:33:47.26] # ERROR: debugger> before resuming : 
> 1 [2018-01-22T20:33:47.26] # ERROR: debugger> after resuming : 2 
> [2018-01-22T20:33:47.27] # Test level exit status: 97
>
>
> Here's a recent passed log from a local run:
>
> ----------System.out:(164/9808)----------
> run [nsk.jdi.EventSet.resume.resume008, -verbose, -arch=linux-x64, 
> -waittime=5, -debugee.vmkind=java, -transport.address=dynamic, 
> -debugee.vmkeys=-XX:MaxRAMPercentage=2 ]
> binder> VirtualMachineManager: version 11.0
> binder> Finding connector: default
> binder> LaunchingConnector:
> binder>      name: com.sun.jdi.CommandLineLaunch
> binder>      description: Launches target using Sun Java VM command 
> line and attaches to it
> binder>      transport: com.sun.tools.jdi.SunCommandLineLauncher$2 at 749dec1a
> binder> Connector arguments:
> binder> home=/export/users/gradams/ws/jdk-jdk/build/linux-x64/images/jdk
> binder>      vmexec=java
> binder>      options=-XX:MaxRAMPercentage=2
> binder>      main=nsk.jdi.EventSet.resume.resume008a "-verbose" 
> "-arch=linux-x64" "-waittime=5" "-debugee.vmkind=java" 
> "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=2 " 
> "-pipe.port=35940"
> binder>      quote="
> binder>      suspend=true
> binder> Launching debugee
> binder> Waiting for VM initialized
> Initial VMStartEvent received: VMStartEvent in thread main
> EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 2ab41d39
> EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 2e3cb1e2
> EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 57f20df9
> EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 6e72e291
> EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 5889e23e
> EventHandler> waitForRequestedEvent: enabling remove of listener 
> nsk.share.jdi.EventHandler$6 at 46dcda7f
> EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 46dcda7f
> EventHandler> waitForRequestedEvent: vm.resume called
> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
> EventHandler> Event: ClassPrepareEventImpl req class prepare request 
> (enabled)
> EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent 
> in thread main) for request(class prepare request  (enabled))
> EventHandler> Removing listener nsk.share.jdi.EventHandler$6 at 46dcda7f
> debugger> Received ClassPrepareEvent for debuggee class: 
> nsk.jdi.EventSet.resume.resume008a
> binder> Breakpoint set:
>     breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (disabled)
> EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 322c2a05
> debugger> TESTING BEGINS
> debugger> RESUME DEBUGGEE VM
> debugger> shouldRunAfterBreakpoint: entered
> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event 
> during 1 sec.
>
> debugee.stderr> **> debuggee: debuggee started!
> EventHandler> Received event set with policy = SUSPEND_ALL
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
> debugger> Received communication breakpoint event.
>
> debugger> shouldRunAfterBreakpoint: received breakpoint event.
> debugger> shouldRunAfterBreakpoint: exited with true.
> debugger> :::::: case: # 0
> debugger> ......waiting for new ThreadStartEvent : 0
>
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 78aa490d
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 78aa490d
> EventHandler> waitForRequestedEventSet: vm.resume called
> EventHandler> Received event set with policy = SUSPEND_NONE
> debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread0
> EventHandler> waitForRequestedEventSet: Received event set for 
> request: thread start request  (enabled)
> EventHandler> Event: ThreadStartEventImpl req thread start request 
> (enabled)
> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 78aa490d
> EventHandler> Received event set with policy = SUSPEND_ALL
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
> debugger> Received communication breakpoint event.
>
> debugger>        got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest1
> debugger> ......checking up on EventSet.resume()
> debugger> ......--> vm.suspend();
> debugger>         getting : Map<String, Integer> suspendsCounts1
> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
> Signal Dispatcher=2, Finalizer=2}
> debugger>         eventSet.resume;
> debugger>         getting : Map<String, Integer> suspendsCounts2
> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
> Signal Dispatcher=2, Finalizer=2}
> debugger>         getting : int policy = eventSet.suspendPolicy();
> debugger>         case SUSPEND_NONE
> debugger>         checking Reference Handler
> debugger>         checking thread0
> debugger>         checking Common-Cleaner
> debugger>         checking main
> debugger>         checking Signal Dispatcher
> debugger>         checking Finalizer
> debugger> ......--> vm.resume()
> debugger> shouldRunAfterBreakpoint: entered
> debugger> shouldRunAfterBreakpoint: received breakpoint event.
> debugger> shouldRunAfterBreakpoint: exited with true.
> debugger> :::::: case: # 1
> debugger> ......waiting for new ThreadStartEvent : 1
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 616bc3ae
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae
> EventHandler> waitForRequestedEventSet: vm.resume called
> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
> debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread0
> EventHandler> waitForRequestedEventSet: Received event set for 
> request: thread start request  (enabled)
> EventHandler> Event: ThreadStartEventImpl req thread start request 
> (enabled)
> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 616bc3ae
> debugger>        got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest2
> debugger> ......checking up on EventSet.resume()
> debugger> ......--> vm.suspend();
> debugger>         getting : Map<String, Integer> suspendsCounts1
> debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, 
> Signal Dispatcher=1, Finalizer=1}
> debugger>         eventSet.resume;
> debugger>         getting : Map<String, Integer> suspendsCounts2
> debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, 
> Signal Dispatcher=1, Finalizer=1}
> debugger>         getting : int policy = eventSet.suspendPolicy();
> debugger>         case SUSPEND_THREAD
> debugger> checking Reference Handler
> debugger> checking thread1
> debugger> checking Common-Cleaner
> debugger> checking main
> debugger> checking Signal Dispatcher
> debugger> checking Finalizer
> debugger> ......--> vm.resume()
> debugger> shouldRunAfterBreakpoint: entered
> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event 
> during 1 sec.
> debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread1
> debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread1
> EventHandler> Received event set with policy = SUSPEND_ALL
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
> debugger> Received communication breakpoint event.
> debugger> shouldRunAfterBreakpoint: received breakpoint event.
> debugger> shouldRunAfterBreakpoint: exited with true.
> debugger> :::::: case: # 2
> debugger> ......waiting for new ThreadStartEvent : 2
> EventHandler> waitForRequestedEventSet: enabling remove of listener 
> nsk.share.jdi.EventHandler$7 at 44e265ef
> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 44e265ef
> EventHandler> waitForRequestedEventSet: vm.resume called
> EventHandler> Received event set with policy = SUSPEND_ALL
> EventHandler> waitForRequestedEventSet: Received event set for 
> request: thread start request  (enabled)
> EventHandler> Event: ThreadStartEventImpl req thread start request 
> (enabled)
> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 44e265ef
> debugger>        got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest3
> debugger> ......checking up on EventSet.resume()
> debugger> ......--> vm.suspend();
> debugger>         getting : Map<String, Integer> suspendsCounts1
> debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, 
> Signal Dispatcher=2, Finalizer=2}
> debugger>         eventSet.resume;
> debugger>         getting : Map<String, Integer> suspendsCounts2
> debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, 
> Signal Dispatcher=1, Finalizer=1}
> debugger>         getting : int policy = eventSet.suspendPolicy();
> debugger>         case SUSPEND_ALL
> debugger> checking Reference Handler
> debugger> checking thread2
> debugger> checking Common-Cleaner
> debugger> checking main
> debugger> checking Signal Dispatcher
> debugger> checking Finalizer
> debugger> ......--> vm.resume()
> debugger> shouldRunAfterBreakpoint: entered
> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event 
> during 1 sec.
> debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread2
> debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread2
> EventHandler> Received event set with policy = SUSPEND_ALL
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
> debugger> Received communication breakpoint event.
> debugger> shouldRunAfterBreakpoint: received breakpoint event.
> debugger> shouldRunAfterBreakpoint: received instruction from debuggee 
> to finish.
> debugger> shouldRunAfterBreakpoint: exited with false.
> debugger> TESTING ENDS
> debugger> Waiting for debuggee's exit...
> debugee.stderr> **> debuggee: debuggee exits
> EventHandler> waitForVMDisconnect
> EventHandler> Received event set with policy = SUSPEND_NONE
> EventHandler> Event: VMDeathEventImpl req null
> EventHandler> receieved VMDeath
> EventHandler> Removing listener nsk.share.jdi.EventHandler$3 at 57f20df9
> EventHandler> Received event set with policy = SUSPEND_NONE
> EventHandler> Event: VMDisconnectEventImpl req null
> EventHandler> receieved VMDisconnect
> EventHandler> Removing listener nsk.share.jdi.EventHandler$4 at 6e72e291
> EventHandler> finished
> EventHandler> waitForVMDisconnect: done
> debugger> Event handler thread exited.
> debugger> Debuggee PASSED.
>
> On 7/18/18, 6:09 PM, gary.adams at oracle.com wrote:
>> On 7/18/18 4:47 PM, Chris Plummer wrote:
>>> Hi Gary
>>>
>>> Ok, so shouldRunAfterBreakpoint() is the code that does the 
>>> eventHandler.wait(), so it gets the eventHandler.notifyAll() 
>>> notification from the BreakpointEvent handler.
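
For readers following along, the wait()/notifyAll() handshake described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the nsk.share.jdi code: the class BreakpointGate and its method names are hypothetical, and only the bpCount counter and the notifyAll() call mirror the listener quoted further down in this thread.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the test's breakpoint handshake (not the real test library).
class BreakpointGate {
    private final Object lock = new Object();
    private int bpCount = 0; // breakpoints reported but not yet consumed by the test thread

    // Called on the event-handling thread when a communication breakpoint arrives.
    void breakpointReceived() {
        synchronized (lock) {
            bpCount++;
            lock.notifyAll();
        }
    }

    // Called on the test thread. Returns true once a breakpoint is pending,
    // false if the timeout elapses (or the thread is interrupted) first.
    boolean awaitBreakpoint(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (bpCount == 0) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    return false;
                }
                try {
                    lock.wait(remainingMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            bpCount--; // consume exactly one breakpoint
            return true;
        }
    }
}
```

Because the counter persists, a breakpoint that arrives before the test thread starts waiting is consumed without blocking. That would be consistent with the log entries where "received breakpoint event" follows "entered" with no "waiting for breakpoint event" line in between, though that reading of the logs is an inference.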
>>>
>>> And as a side note, I see now that resumption of execution after the 
>>> breakpoint at main() is done by:
>>>
>>>             // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
>>>             display("RESUME DEBUGGEE VM");
>>>             vm.resume();
>>>
>>>             testRun();
>>>
>>> shouldRunAfterBreakpoint() returns true until the end of the 
>>> test, when the debuggee executes "instruction = end". That's why 
>>> runTests() does a "break" when shouldRunAfterBreakpoint() returns 
>>> false. So this means the code that is checking 
>>> shouldRunAfterBreakpoint() is not resuming execution for the first 
>>> few (probably 3) methodForCommunication() breakpoints. However, it 
>>> does make sure that runTests() blocks until the BreakpointEvent has 
>>> been processed.
>>>
>>> You point out the vm.resume() at the bottom of the loop in 
>>> runTests(), but that's only after a bunch of ThreadStartEvent 
>>> processing above it has been done already. The ThreadStartEvent 
>>> would never get generated if there was not a resume some point 
>>> earlier. I think it is happening during the 
>>> eventHandler.waitForRequestedEventSet() call, which does a vm.resume().
>>>
>>> So if I understand the order of things now:
>>>
>>> -shouldRunAfterBreakpoint() returns after first 
>>> methodForCommunication() is hit. At this point we know the first 
>>> thread has been created, but no attempt to start it yet. The 
>>> debuggee is suspended at this point.
>>> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also 
>>> does a vm.resume().
>>> -The debuggee starts the thread and then does another 
>>> methodForCommunication() (this 2nd one is actually after the 2nd 
>>> thread has been created, but not yet started). Now we have a race: 
>>> do we get the ThreadStartEvent first or the BreakpointEvent? This is 
>>> because when the ThreadStartEvent is generated, the thread is not 
>>> suspended due to SUSPEND_NONE. Even if the ThreadStartEvent comes in 
>>> first, the async handling of the BreakpointEvent can cause problems 
>>> during the ThreadStartEvent processing.
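
The effect of this race on the suspend counts can be illustrated with a toy model. This is a simplification and an assumption about the mechanism, not JDI itself: SuspendCountModel and its methods are hypothetical, and the model only assumes that vm.suspend() and a SUSPEND_ALL event each raise every thread's count by one, while EventSet.resume() on a SUSPEND_NONE event set changes nothing.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of per-thread suspend counts; an illustration only, not the JDI implementation.
class SuspendCountModel {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    SuspendCountModel(List<String> threadNames) {
        for (String name : threadNames) {
            counts.put(name, 0);
        }
    }

    // Models vm.suspend() or the arrival of a SUSPEND_ALL event: every count goes up by one.
    void suspendAll() {
        counts.replaceAll((name, count) -> count + 1);
    }

    // Models EventSet.resume() for a SUSPEND_NONE event set: nothing was suspended, so nothing changes.
    void resumeSuspendNoneEventSet() {
        // intentionally empty
    }

    Map<String, Integer> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    // Replays "case 0" of the test in the model and reports whether the two snapshots agree.
    static boolean countsMatch(boolean breakpointArrivesBetweenSnapshots) {
        SuspendCountModel vm = new SuspendCountModel(
                Arrays.asList("main", "Reference Handler", "Finalizer"));
        vm.suspendAll();                               // the test's explicit vm.suspend()
        Map<String, Integer> before = vm.snapshot();   // suspendsCounts1
        vm.resumeSuspendNoneEventSet();                // eventSet.resume()
        if (breakpointArrivesBetweenSnapshots) {
            vm.suspendAll();                           // SUSPEND_ALL breakpoint lands here
        }
        Map<String, Integer> after = vm.snapshot();    // suspendsCounts2
        return before.equals(after);
    }
}
```

In this model the snapshots agree only when the breakpoint does not land between them, which is the same shape as the "before resuming : 1" / "after resuming : 2" mismatch in the failed log.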
>> Based on the failed log in the bug report, the thread start event is 
>> observed,
>> the suspend counts acquired, then after the resume, the breakpoint 
>> message
>> is displayed and the second set of suspend counts acquired.
>>
>> I can show you the passed and failed logs tomorrow.
>>> -You added a 100ms delay after the thread has started, but before 
>>> methodForCommunication(), hoping it will make it so the 
>>> ThreadStartEvent can be received and fully processed before the 
>>> BreakpointEvent is.
>> The delay is mostly just a yield so the debugger gets a chance to run.
>>>
>>> I think it would be preferable to fix this by doing better 
>>> synchronization. After all, that is the approach the test originally 
>>> took. It could have been written with a bunch of sleep() delays 
>>> instead, but that in general is not a very good approach.
>>>
>>> What if you added a shouldRunAfterBreakpoint() call after the 
>>> ThreadStartEvent arrives? At this point you would know that the 
>>> vm is suspended due to the breakpoint, so no need for:
>>>
>>>                 display("......checking up on EventSet.resume()");
>>>                 display("......--> vm.suspend();");
>>>                 vm.suspend();
>> I think the suspend is intentional to capture the suspend counts.
>> It also needs to resume the vm and acquire again so it can confirm 
>> the correct
>> suspend count behaviors.
>> If the test waits to capture the second set of suspend counts, the 
>> breakpoint
>> causes incorrect values.
>>
>> ...
>>>
>>> You might then also need to add another methodForCommunication() 
>>> call at the end of case 0 and 1 in the debuggee, although I think 
>>> you could instead just change the shouldRunAfterBreakpoint() at the 
>>> start of the loop. I think that check actually belongs at the end of 
>>> the loop, and only for case 2. In fact it would be an error if 
>>> shouldRunAfterBreakpoint() did not return true in that case. Then 
>>> you also need to add a shouldRunAfterBreakpoint() at the start of 
>>> case 0 to get things rolling (and I think at the start of case 1 also).
>>>
>>> Chris
>>>
>>>
>>> On 7/18/18 12:45 PM, Gary Adams wrote:
>>>> Answers below ...
>>>>
>>>> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>>>>> Hi Gary,
>>>>>
>>>>> Who does the resume for the breakpoint event?
>>>>>
>>>>>         eventHandler.addListener(
>>>>>              new EventHandler.EventListener() {
>>>>>                  public boolean eventReceived(Event event) {
>>>>>                     if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>                         synchronized(eventHandler) {
>>>>>                             display("Received communication breakpoint event.");
>>>>>                             bpCount++;
>>>>>                             eventHandler.notifyAll();
>>>>>                         }
>>>>>                         return true;
>>>>>                     }
>>>>>                     return false;
>>>>>                  }
>>>>>              }
>>>>>         );
>>>> I believe you are looking for this sequence.
>>>> At the top of the loop a check is made
>>>> ("shouldRunAfterBreakpoint") whether resume() should be called.
>>>> Lines 96-99 are the early termination path. At the
>>>> bottom of the loop, line 240 is the normal path that
>>>> continues the test to the next case.
>>>>
>>>> resume008.java :
>>>> ...
>>>>     94            for (int i = 0; ; i++) {
>>>>     95
>>>>     96                if (!shouldRunAfterBreakpoint()) {
>>>>     97                    vm.resume();
>>>>     98                    break;
>>>>     99                }
>>>>    100
>>>>    101
>>>>    102                display(":::::: case: # " + i);
>>>>    103
>>>>    104                switch (i) {
>>>>    105
>>>>    106                    case 0:
>>>>    107                    eventRequest = settingThreadStartRequest (
>>>>    108                            SUSPEND_NONE, "ThreadStartRequest1");
>>>> ...
>>>>    238
>>>>    239                display("......--> vm.resume()");
>>>>    240                vm.resume();
>>>>    241            }
>>>>>
>>>>> Also:
>>>>>
>>>>>>   1. On a thread start event the debuggee is suspended, line 141
>>>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE 
>>>>> was used.
>>>> The thread start event is set to SUSPEND_NONE for thread0, but when
>>>> the thread start event is observed the resume008 test suspends the vm
>>>> immediately after fetching the "number" property.
>>> My point is that the Debuggee continues to run after the 
>>> ThreadStartEvent is sent, and relies on the debugger to stop it 
>>> after receiving the event. But in the meantime the debuggee has 
>>> advanced to the next breakpoint, but only sometimes, thus the bug 
>>> you are seeing.
>>>>
>>>>    132                if ( !(newEvent instanceof ThreadStartEvent)) {
>>>>    133                    setFailedStatus("ERROR: new event is not ThreadStartEvent");
>>>>    134                } else {
>>>>    135
>>>>    136                    String property = (String) newEvent.request().getProperty("number");
>>>>    137                    display("       got new ThreadStartEvent with propety 'number' == " + property);
>>>>    138
>>>>    139                    display("......checking up on EventSet.resume()");
>>>>    140                    display("......--> vm.suspend();");
>>>>    141                    vm.suspend();
>>>>
>>>>
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/18/18 4:52 AM, Gary Adams wrote:
>>>>>> There is nothing wrong with the breakpoint in 
>>>>>> methodForCommunication.
>>>>>> The test uses it to make sure the threads are each tested 
>>>>>> separately.
>>>>>> The breakpoint eventhandler just displays a message, increments a 
>>>>>> counter
>>>>>> and returns.
>>>>>>
>>>>>> Let me step through resume008a, the debuggee, to help clarify ...
>>>>>>
>>>>>> 1. The test thread is created and the synchronized break point is
>>>>>> observed. lines 101-102
>>>>>> 2. The thread is started. lines 104,135-137
>>>>>>     2a. The main thread blocks on a local object. lines 133, 139
>>>>>>     2b. The test thread is started. lines 137,
>>>>>>           A run entered message is displayed, line 159
>>>>>>           The main thread lock object is notified, line 167
>>>>>>          2b1. The main thread continues. line 167, 146
>>>>>>                  The next test thread is created. line 106
>>>>>>                  The synchronized breakpoint is observed, line 107
>>>>>>          2b2. A run exited message is displayed, line 169
>>>>>>
>>>>>> On the resume008 debugger side ...
>>>>>>   1. On a thread start event the debuggee is suspended, line 141
>>>>>>   2. Messages are displayed and a first set of thread suspend
>>>>>> counts is acquired. lines 143-151
>>>>>>   3. The threads are resumed, line 152
>>>>>> --->
>>>>>>   4. Messages are displayed and a second set of thread suspend
>>>>>> counts is acquired. lines 154-159
>>>>>>
>>>>>> The way the test is written the expectation is the debugger steps 
>>>>>> 2,3,4 will all happen
>>>>>> while the test thread is running.
>>>>>>
>>>>>> When the debugger resumes the debuggee threads (debugger step 3)
>>>>>> the debuggee continues from where it left off (debuggee steps 
>>>>>> 2b,2b1,2b2)
>>>>>>
>>>>>> If we complete debuggee step 2b1 (line 107) before the debugger 
>>>>>> completes step 4 line 159,
>>>>>> then the synchronized breakpoint will suspend the vm and the 
>>>>>> counts will not match
>>>>>> for the SUSPEND_NONE test thread start.
>>>>>>
>>>>>> resume008a.java:
>>>>>>
>>>>>>    100                        case 0:
>>>>>>    101                                thread0 = new Threadresume008a("thread0");
>>>>>>    102                                methodForCommunication();
>>>>>>    103
>>>>>>    104                                threadStart(thread0);
>>>>>>    105
>>>>>>    106                                thread1 = new Threadresume008a("thread1");
>>>>>>    107                                methodForCommunication();
>>>>>>    108                                break;
>>>>>>
>>>>>>    ...
>>>>>>    135        static int threadStart(Thread t) {
>>>>>>    136            synchronized (waitnotifyObj) {
>>>>>>    137                t.start();
>>>>>>    138                try {
>>>>>>    139                    waitnotifyObj.wait();
>>>>>>    140                } catch ( Exception e) {
>>>>>>    141                    exitCode = FAILED;
>>>>>>    142                    logErr("       Exception : " + e );
>>>>>>    143                    return FAILED;
>>>>>>    144                }
>>>>>>    145            }
>>>>>>    146            return PASSED;
>>>>>>    147        }
>>>>>>
>>>>>>    149        static class Threadresume008a extends Thread {
>>>>>>    150
>>>>>>    151            String tName = null;
>>>>>>    152
>>>>>>    153            public Threadresume008a(String threadName) {
>>>>>>    154                super(threadName);
>>>>>>    155                tName = threadName;
>>>>>>    156            }
>>>>>>    157
>>>>>>    158            public void run() {
>>>>>>    159                log1("  'run': enter  :: threadName == " + tName);
>>>>>>
>>>>>> This is the proposed fix that will let the debugger complete its
>>>>>> second acquisition of suspend counts while the test thread is still
>>>>>> running.
>>>>>>
>>>>>>    160                // Yield, so the start thread event processing can be completed.
>>>>>>    161                try {
>>>>>>    162                    Thread.sleep(100);
>>>>>>    163                } catch (InterruptedException e) {
>>>>>>    164                    // ignored
>>>>>>    165                }
>>>>>>    166                synchronized (waitnotifyObj) {
>>>>>>    167                        waitnotifyObj.notify();
>>>>>>    168                }
>>>>>>    169                log1("  'run': exit   :: threadName == " + tName);
>>>>>>    170                return;
>>>>>>    171            }
>>>>>>    172        }
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>>>>>>> Hi Gary,
>>>>>>>
>>>>>>> I've been having trouble following the control flow of this 
>>>>>>> test. One thing I've stumbled across is the following:
>>>>>>>
>>>>>>>             /* A debuggee class must define 'methodForCommunication'
>>>>>>>              * method and invoke it in points of synchronization
>>>>>>>              * with a debugger.
>>>>>>>              */
>>>>>>>             setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>>>>>>
>>>>>>> So why isn't this mode of synchronization good enough? Is it 
>>>>>>> because it was not designed with the understanding that the 
>>>>>>> debugger might be doing suspended thread counts, and suspending 
>>>>>>> all threads at the breakpoint messes up the test?
>>>>>>>
>>>>>>> From what I can tell of the test, after the debuggee is started 
>>>>>>> and hits the default breakpoint at the start of main(), the 
>>>>>>> debugger then does a vm.resume() at the start of the for loop in 
>>>>>>> the runTest() method. The debuggee then creates a thread and 
>>>>>>> calls methodForCommunication(). There is already a breakpoint 
>>>>>>> set there by the above debuggee code. It's unclear to me what 
>>>>>>> happens as a result of this breakpoint and how it serves the 
>>>>>>> test. Also unclear to me who is responsible for the vm.resume() 
>>>>>>> after the breakpoint is hit.
>>>>>>>
>>>>>>> The debugger then requests all ThreadStart events, requesting 
>>>>>>> that no threads be disabled when it is sent. I think you are 
>>>>>>> saying that when the ThreadStart event comes in, sometimes we 
>>>>>>> are at the methodForCommunication breakpoint, with all threads 
>>>>>>> disabled, and this messes up the thread suspend counts. You want 
>>>>>>> to delay 100ms so the breakpoint event can be processed and 
>>>>>>> threads resumed again (although I can't see who actually resumes 
>>>>>>> the thread after hitting the methodForCommunication breakpoint).
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>>>>>>> A race condition exists between the debugger and the debuggee.
>>>>>>>>
>>>>>>>> The first test thread is started with SUSPEND_NONE policy set.
>>>>>>>> While processing the thread start event the debugger captures
>>>>>>>> an initial set of thread suspend counts and resumes the
>>>>>>>> debuggee vm. If the debuggee advances quickly it reaches
>>>>>>>> the breakpoint set for methodForCommunication. Since the 
>>>>>>>> breakpoint
>>>>>>>> carries with it SUSPEND_ALL policy, when the debugger captures 
>>>>>>>> a second
>>>>>>>> set of suspend counts, it will not match the expected counts for
>>>>>>>> a SUSPEND_NONE scenario.
>>>>>>>>
>>>>>>>> The proposed fix introduces a yield in the debuggee test thread 
>>>>>>>> run method
>>>>>>>> to allow the debugger to get the expected sampled values.
>>>>>>>>
>>>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>>>>>>
>>>>>>>>
>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java:
>>>>>>>>
>>>>>>>> ...
>>>>>>>>    186        private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
>>>>>>>>    187            Method method = debuggee.methodByName(refType, methodName);
>>>>>>>>    188            Location location = null;
>>>>>>>>    189            try {
>>>>>>>>    190                location = method.allLineLocations().get(0);
>>>>>>>>    191            } catch (AbsentInformationException e) {
>>>>>>>>    192                throw new Failure(e);
>>>>>>>>    193            }
>>>>>>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>>>>>>    195
>>>>>>>>
>>>>>>>>    196            bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>>>>>>
>>>>>>>>    197            bpRequest.putProperty("number", "zero");
>>>>>>>>    198            bpRequest.enable();
>>>>>>>>    199
>>>>>>>>    200            eventHandler.addListener(
>>>>>>>>    201                 new EventHandler.EventListener() {
>>>>>>>>    202                     public boolean eventReceived(Event event) {
>>>>>>>>    203                        if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>>>>    204                            synchronized(eventHandler) {
>>>>>>>>    205                                display("Received communication breakpoint event.");
>>>>>>>>    206                                bpCount++;
>>>>>>>>    207                                eventHandler.notifyAll();
>>>>>>>>    208                            }
>>>>>>>>    209                            return true;
>>>>>>>>    210                        }
>>>>>>>>    211                        return false;
>>>>>>>>    212                     }
>>>>>>>>    213                 }
>>>>>>>>    214            );
>>>>>>>>    215        }
>>>>>>>>
>>>>>>>>
>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java:
>>>>>>>>
>>>>>>>> ...
>>>>>>>>    140                    display("......--> vm.suspend();");
>>>>>>>>    141                    vm.suspend();
>>>>>>>>    142
>>>>>>>>    143                    display("        getting : Map<String, Integer> suspendsCounts1");
>>>>>>>>    144
>>>>>>>>    145                    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
>>>>>>>>    146                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>>>>    147                        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
>>>>>>>>    148                    }
>>>>>>>>    149                    display(suspendsCounts1.toString());
>>>>>>>>    150
>>>>>>>>    151                    display(" eventSet.resume;");
>>>>>>>>    152                    eventSet.resume();
>>>>>>>>    153
>>>>>>>>    154                    display("        getting : Map<String, Integer> suspendsCounts2");
>>>>>>>>
>>>>>>>> This is where the breakpoint is encountered before the second
>>>>>>>> set of suspend counts is acquired.
>>>>>>>>
>>>>>>>>    155                    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
>>>>>>>>    156                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>>>>    157                        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
>>>>>>>>    158                    }
>>>>>>>>    159                    display(suspendsCounts2.toString());
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
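
For readers following the quoted thread, the debuggee-side handshake
it describes (threadStart() blocking on waitnotifyObj until the new
thread calls notify(), with the proposed 100 ms sleep before that
notify) can be reduced to a small standalone sketch. This is not the
resume008a source: the class name HandshakeSketch, the helper
newTestThread() and the "notified" flag are assumptions added here,
the flag only to make the wait robust against spurious wakeups.

```java
public class HandshakeSketch {
    // Shared monitor, named after the field quoted from resume008a.
    static final Object waitnotifyObj = new Object();
    // Hypothetical flag (not in the quoted test): guards against spurious wakeups.
    static boolean notified = false;

    // Starts t and blocks the caller until t signals through waitnotifyObj.
    public static boolean threadStart(Thread t) throws InterruptedException {
        synchronized (waitnotifyObj) {
            notified = false;
            t.start();
            while (!notified) {
                waitnotifyObj.wait();
            }
        }
        return true;
    }

    // Builds a thread that sleeps first and only then releases the starter,
    // which is the shape of the proposed fix in the run() method.
    public static Thread newTestThread(String name, long delayMillis) {
        return new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            synchronized (waitnotifyObj) {
                notified = true;
                waitnotifyObj.notify();
            }
        }, name);
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        threadStart(newTestThread("thread0", 100));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("main thread released after about " + elapsedMs + " ms");
    }
}
```

While the starter is blocked it cannot reach the next
methodForCommunication() call, so the sleep widens the window in which
a debugger could take its second sample. It does not remove the race,
which is the concern raised in the thread about relying on timing.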


From gary.adams at oracle.com  Fri Jul 20 19:11:24 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Fri, 20 Jul 2018 15:11:24 -0400
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
 boolVar with expected value: false
In-Reply-To: <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
References: <5B082D2E.7000408@oracle.com>
 <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
Message-ID: <5B5233DC.5040003@oracle.com>

Here's another attempt to clear up the overlapping output from
the command processing and event handler in the jdb tests.

The fundamental problem is observed when "prompts"
are produced interleaved with command and event output.

This attempts to fix the issue by buffering the output and
printing it fully assembled.

  Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev.01/
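
As a rough illustration of "buffering the output and printing it fully
assembled" (a minimal sketch only, not the code in the webrev), the
idea is to join all fragments of one message, plus the prompt that
follows it, before anything is written, and then to write the result
in a single call under one lock. The class name, the method name, the
lock object and the sample location text below are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class AssembledOutputSketch {
    // Single lock shared by every writer that goes through printAssembled().
    private static final Object OUTPUT_LOCK = new Object();

    // Joins the fragments and the trailing prompt first, then emits them with
    // one print call, so no other caller of this method can interleave a prompt.
    public static void printAssembled(PrintStream out, String prompt, String... fragments) {
        StringBuilder message = new StringBuilder();
        for (String fragment : fragments) {
            message.append(fragment);
        }
        message.append('\n').append(prompt);
        synchronized (OUTPUT_LOCK) {
            out.print(message.toString());
            out.flush();
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(bytes, true);
        // The location string is made up for this example.
        printAssembled(out, "main[1] ", "Breakpoint hit: ", "<location>");
        System.out.print(bytes.toString());
    }
}
```

The guarantee only holds for writers that use the same helper; any
code that still prints the prompt directly to the stream could
interleave as before.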

On 5/26/18, 6:50 AM, gary.adams at oracle.com wrote:
> This is a review request for a previously closed test bug.
> The test was recently moved to the open repos, and the
> proposed fix is in the open code.
>
>   Webrev: http://cr.openjdk.java.net/~gadams/8169718/webrev/
>
>
> -------- Forwarded Message --------
> Subject: 	RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot 
> find boolVar with expected value: false
> Date: 	Fri, 25 May 2018 11:35:10 -0400
> From: 	Gary Adams <gary.adams at oracle.com>
> Reply-To: 	gary.adams at oracle.com
>
> 	
>
>
>
> The jdb tests use stdin to send commands to a jdb process
> and parse the stdout to determine if a command was
> successful and when the process is prompting for new commands
> to be sent.
>
> Some commands are synchronous, so when the command is completed
> a new prompt is sent back immediately.
>
> Some commands are asynchronous, so there could be a delay
> until a breakpoint is reached. The event handler then sends a prompt
> when the application thread is stopped and new jdb commands can be sent.
>
> The problem causing the intermittent failures was a corruption in the
> output stream when prompts were being sent at the wrong times.
>
> Instead of receiving
>    "Breakpoint hit:"<location>
>     <prompt>
>
> the log contained
>    "Breakpoint hit:"<prompt>  <location>
>
> Once out of sync, jdb commands were being sent prematurely
> and the wrong values were being compared against expected behavior.
> The simple fix proposed here recognizes that commands like "cont",
> "step" and "next" are asynchronous commands and should not send back
> a prompt immediately. Instead, the event handler will deliver the next prompt
> when the next "Breakpoint hit:" or "Step completed:" state change occurs.
>
> The bulk of the testing was done on windows-x64-debug builds where the
> intermittent failures were observed in ~5 in 1000 testruns. The fix has
> also been tested on linux-x64-debug, solaris-sparcv9-debug,
> and macosx-x64-debug, even though the failures have never been reported
> against those platforms.
>
> Failures have been observed in many of the nsk/jdb tests with similar corrupted
> output streams, but never directly associated with this issue before.
>
>   redefine001, caught_exception002, locals002, eval001, next001,
>   stop_at003, step002, print002, trace001, step_up001, read001,
>   clear004, kill001, set001
>
>


From jcbeyler at google.com  Fri Jul 20 19:37:56 2018
From: jcbeyler at google.com (JC Beyler)
Date: Fri, 20 Jul 2018 12:37:56 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
 <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
Message-ID: <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>

Yes that is right, this is the latest:
http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/

I apologize for the multiple threads and confusion,
Jc

On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Thank you a lot, Vladimir!
> Yes, the webrev.03 is the latest.
> Jc, will correct us if it is not right.
>
> Thanks,
> Serguei
>
>
> On 7/20/18 10:52, Vladimir Kozlov wrote:
> > I asked Igor V. to look.
> >
> > Seems like the review is being done in another thread which does not
> > have the bug id in its subject. Currently at webrev.03
> >
> > Vladimir
> >
> > On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
> >> Thanks, Rahul!
> >> In fact, there are no good experts for this area in the serviceability team.
> >> It would be much better if anyone from the Compiler team could do it.
> >>
> >> Vladimir K.,
> >>
> >> Is there anyone from the Compiler team available to review this?
> >> Otherwise, I could try to review it but am not sure about my review
> >> quality.
> >>
> >> Thanks,
> >> Serguei
> >>
> >>
> >> On 7/19/18 00:48, Rahul Raghavan wrote:
> >>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
> >>>
> >>> (just adding + hotspot-compiler-dev also)
> >>>
> >>>
> >>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
> >>> Subject Was:
> >>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
> >>>
> >>> + serviceability-dev
> >>>
> >>> Hi all,
> >>>
> >>> Could anyone else give me a review of this webrev and check/test the
> >>> various architecture changes?
> >>>
> >>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
> >>>
> >>>
> >>> Thanks for all your help!
> >>> Jc
> >>>
> >>>
> >>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Here is a webrev that does all the architectures in the same way:
> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
> >>>>>
> >>>>> Could anyone review the other architectures and test?
> >>>>>    - arm, sparc & aarch64 are also modified now to follow the same
> >>>>> "if no
> >>>>> tlab, then consider eden space allocation" logic.
> >>>>>
> >>>>> Thanks for your help!
> >>>>> Jc
> >>>>>
> >>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi Kim,
> >>>>>>
> >>>>>> I opened this bug
> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
> >>>>>>
> >>>>>> and now I've done an update:
> >>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
> >>>>>>
> >>>>>> I basically have done your nits but also removed the try_eden (it
> >>>>>> was
> >>>>>> used to bind a label but was not used). I updated the comments to
> >>>>>> use the
> >>>>>> one you preferred.
> >>>>>>
> >>>>>> I still have to do the other architectures though but at least we
> >>>>>> seem to
> >>>>>> have a consensus on this architecture, correct?
> >>>>>>
> >>>>>> Thanks for the review,
> >>>>>> Jc
> >>>>>>
> >>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com> wrote:
> >>>>>>
> >>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Yes, you are right, I did those changes due to:
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
> >>>>>>>>
> >>>>>>>> If Robbin agrees to this change, and if no one sees an issue,
> >>>>>>>> I'll go
> >>>>>>> ahead
> >>>>>>>> and propagate the change across architectures.
> >>>>>>>>
> >>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's
> >>>>>>>> comment
> >>>>>>> and
> >>>>>>>> review) :)
> >>>>>>>> Jc
> >>>>>>>>
> >>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com> wrote:
> >>>>>>>>
> >>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I'm not sure if we had left this case intentionally or not
> >>>>>>>>> but, if we
> >>>>>>> want
> >>>>>>>>> it all to be consistent, we should perhaps fix it.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Well, you put in that logic last February, so unless somebody
> >>>>>>>>> speaks
> >>>>>>> up
> >>>>>>>>> quickly, I support your adjusting it to be the way you want it.
> >>>>>>>>>
> >>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I
> >>>>>>>>> src/hotspot/share"
> >>>>>>>>> suggests that the GC group is most active in touching this
> >>>>>>>>> feature.
> >>>>>>>>> If Robbin is OK with it, there's your reviewer.
> >>>>>>>>>
> >>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
> >>>>>>>>> working on the GC to OK it.
> >>>>>>>>>
> >>>>>>>>> - John
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Jc
> >>>>>>>
> >>>>>>> Robbin is on vacation; you might not hear from him for a while.
> >>>>>>>
> >>>>>>> I'm assuming you'll open a new bug for this?
> >>>>>>>
> >>>>>>> Except for a few minor nits (below), this looks okay to me.
> >>>>>>>
> >>>>>>> The comment at line 1052 needs updating.
> >>>>>>>
> >>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
> >>>>>>>
> >>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at
> >>>>>>> line 1058, but unreferenced.
> >>>>>>>
> >>>>>>> I like the wording of the comment at 1139 better than the
> >>>>>>> wording at
> >>>>>>> 1016.
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Jc
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>>
> >>>>> Thanks,
> >>>>> Jc
> >>>>>
> >>>>
> >>>>
> >>
>
>

-- 

Thanks,
Jc

From chris.plummer at oracle.com  Fri Jul 20 20:07:29 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Jul 2018 13:07:29 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <6de6362944f84740b80abb22cbbea872@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
Message-ID: <c4831aef-fd48-3d1c-123d-642cfd3c897c@oracle.com>

Hi Ralf,

Changes look good and pass all the testing I did. You can push once 
Serguei approves.

thanks,

Chris

On 7/20/18 7:28 AM, Schmelter, Ralf wrote:
> Hi Serguei,
>
> I've updated the webrev: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>
> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). If it had been, the old code would have removed all native methods from the call stack. The original JVMDI call did indeed return JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the JVMDI->JVMTI transition.
>
> I've tried to make the test more readable and added some comments to explain why it is done the way it is.
>
> Best regards,
> Ralf
>
>
>
>
> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
> Sent: Mittwoch, 18. Juli 2018 22:57
> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe, Thomas <thomas.stuefe at sap.com>
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> The fix itself looks pretty good to me.
> Some minor comments.
>
> The copyright year needs an update.
>   218     jint count, filledIn;
>
>   Could you, please, split the declarations above into different lines to follow the local style?
> It is interesting that the original implementation checked the error code returned
> from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
> However, the GetFrameLocation spec does not list this error code as possible.
>
>
> Some comments about the test.
>    52     static void callEnded() {
>    53         System.out.println("SOE occurred as expected");
>    54     }
>    55
>    56     static int call(int depth) {
>    57         if (depth == 0) {
>    58             // Should have seen a stack overflow by now.
>    59             System.out.println("Exited without creating SOE");
>    60             System.exit(0);
>    61         }
>    62
>    63         try {
>    64             int newDepth = call(depth - 1);
>    65
>    66             if (newDepth == -1_000) {
>    67                 // Pop some frames so there is room on the stack for the
>    68                 // println()
>    69                 callEnded();
>    70             }
>    71
>    72             return newDepth - 1;
>    73         } catch (StackOverflowError e) {
>    74             return -1;
>    75         }
>    76     }
>    77 }
>    I'd suggest renaming the methods call() and callEnded() to something like
>    recursiveMethod() and recursionEnd().
>    Also, the manipulations with SOE add complexity and are confusing.
>    Would it be simpler to let it propagate and then catch it in main()?
>    What is the point of all these checks at lines 104-119?
>    In general, I'm looking for ways to make it clearer, simpler and more stable.
>
> Thanks,
> Serguei
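
The simpler control flow asked about in the quoted review (let the
StackOverflowError propagate out of the recursion and catch it once
near the top) would look roughly like the sketch below. The names
RecursionSketch, runUntilOverflow() and depthReached are assumptions
made for illustration; recursiveMethod() is the name suggested in the
review. Note that catching the error this way unwinds the whole stack,
so it may not fit a test that needs the deep stack to still be present
when the debugger inspects it.

```java
public class RecursionSketch {
    // Deepest recursion level seen before the stack overflowed.
    public static int depthReached = 0;

    // Recurses without a bound; the only way out is a StackOverflowError.
    static void recursiveMethod(int depth) {
        depthReached = depth;
        recursiveMethod(depth + 1);
    }

    // Lets the error propagate through every recursive frame and handles it
    // once here, where the stack is shallow again and printing is safe.
    public static boolean runUntilOverflow() {
        try {
            recursiveMethod(0);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        boolean overflowed = runUntilOverflow();
        System.out.println("SOE occurred: " + overflowed + ", depth reached: " + depthReached);
    }
}
```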


From chris.plummer at oracle.com  Fri Jul 20 20:12:51 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Jul 2018 13:12:51 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <c4831aef-fd48-3d1c-123d-642cfd3c897c@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <c4831aef-fd48-3d1c-123d-642cfd3c897c@oracle.com>
Message-ID: <dc32a759-7196-71dd-cff9-bd3e5b10b01a@oracle.com>

Oops. Sorry, that testing comment was for another changeset. I didn't 
test your changes. If you think they could use some additional testing 
on some more platforms, let me know.

thanks,

Chris

On 7/20/18 1:07 PM, Chris Plummer wrote:
> Hi Ralf,
>
> Changes look good and pass all the testing I did. You can push once 
> Serguei approves.
>
> thanks,
>
> Chris
>
> On 7/20/18 7:28 AM, Schmelter, Ralf wrote:
>> Hi Serguei,
>>
>> I've updated the webrev: 
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>
>> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). 
>> If it had been, the old code would have removed all native methods 
>> from the call stack. The original JVMDI call did indeed return 
>> JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the 
>> JVMDI->JVMTI transition.
>>
>> I've tried to make the test more readable and added some comments to 
>> explain why it is done the way it is.
>>
>> Best regards,
>> Ralf
>>
>>
>>
>>
>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>> Sent: Mittwoch, 18. Juli 2018 22:57
>> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf 
>> <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; 
>> Stuefe, Thomas <thomas.stuefe at sap.com>
>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c 
>> to prevent quadratic runtime behavior
>>
>> Hi Ralf,
>>
>> The fix itself looks pretty good to me.
>> Some minor comments.
>>
>> The copyright year needs an update.
>>   218     jint count, filledIn;
>>
>>   Could you, please, split the declarations above into different 
>> lines to follow the local style?
>> It is interesting that the original implementation checked the error 
>> code returned
>> from the JVMTI GetFrameLocation for being equal to 
>> JVMTI_ERROR_OPAQUE_FRAME.
>> However, the GetFrameLocation spec does not list this error code as 
>> possible.
>>
>>
>> Some comments about the test.
>>    52     static void callEnded() {
>>    53         System.out.println("SOE occurred as expected");
>>    54     }
>>    55
>>    56     static int call(int depth) {
>>    57         if (depth == 0) {
>>    58             // Should have seen a stack overflow by now.
>>    59             System.out.println("Exited without creating SOE");
>>    60             System.exit(0);
>>    61         }
>>    62
>>    63         try {
>>    64             int newDepth = call(depth - 1);
>>    65
>>    66             if (newDepth == -1_000) {
>>    67                 // Pop some frames so there is room on the stack for the
>>    68                 // println()
>>    69                 callEnded();
>>    70             }
>>    71
>>    72             return newDepth - 1;
>>    73         } catch (StackOverflowError e) {
>>    74             return -1;
>>    75         }
>>    76     }
>>    77 }
>>    I'd suggest renaming the methods call() and callEnded() to 
>> something like
>>    recursiveMethod() and recursionEnd().
>>    Also, the manipulations with SOE add complexity and are 
>> confusing.
>>    Would it be simpler to let it propagate and then catch it in 
>> main()?
>>    What is the point of all these checks at lines 104-119?
>>    In general, I'm looking for ways to make it clearer, 
>> simpler and more stable.
>>
>> Thanks,
>> Serguei
>
>
>


From chris.plummer at oracle.com  Fri Jul 20 20:13:55 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Jul 2018 13:13:55 -0700
Subject: PING: RFR: 8205992: jhsdb cannot attach to Java processes running
 in Docker containers
In-Reply-To: <CAGFVN2A5iZeUg7bysrbdaB2FNUvfobc_gDOc97GS_eTz2DrhDw@mail.gmail.com>
References: <CAGFVN2A5iZeUg7bysrbdaB2FNUvfobc_gDOc97GS_eTz2DrhDw@mail.gmail.com>
Message-ID: <abf21c24-4893-b509-0701-0a11ed29cb42@oracle.com>

Hi Yasumasa,

Changes look good and passed all my testing.

thanks,

Chris

On 7/19/18 10:13 PM, Yasumasa Suenaga wrote:
> Hi Chris,
>
> Thank you for your comment.
> I uploaded new webrev. Could you review again?
>
>    http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.02/
>
> I tested my change on Linux x64, but I cannot check it on other
> platforms (including older Linux).
> However SA tests are included in HotSpot tier 1 tests. Tests on submit
> repo work fine with this change
> (mach5-one-ysuenaga-JDK-8205992-20180720-0305-31840).
>
>
> Thanks,
>
> Yasumasa
>
>
> 2018-07-20 3:26 GMT+09:00 Chris Plummer <chris.plummer at oracle.com>:
>> Hi Yasumasa,
>>
>>    84     // It maps the LWPID in the host to it in the container.
>>
>> "it" -> "the PID"
>>
>>   286     // Get LWPID in the host from the container's LWPID.
>>   287     public int getHostPID(int id) {
>>   288         try {
>>   289             return nspidMap.get(id);
>>   290         } catch (NullPointerException e) {
>>   291             return -1;
>>   292         }
>>   293     }
>>
>> What is the source of the NPE here? Is it because nspidMap was never
>> initialized because the process is not in a container? In that case I think
>> you should be checking for null rather than having an NPE be part of normal
>> execution.
>>
>>    42             int hostPID =
>> ((LinuxDebuggerLocal)debugger).getHostPID(pid);
>>    43             if (hostPID != -1) {
>>    44                 pid = hostPID;
>>    45             }
>>
>> A comment here would be helpful.
>>
>> The rest looks good. I should probably run it through some internal testing.
>> Let me know when you have a final webrev.
>>
>> thanks,
>>
>> Chris
>>
>>
>> On 7/18/18 5:59 AM, Yasumasa Suenaga wrote:
>>> PING:
>>>
>>> Could you review it?
>>>
>>>     JBS:    https://bugs.openjdk.java.net/browse/JDK-8205992
>>>     webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>>
>>> This change has been reviewed by Jini.
>>> We need a Reviewer.
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2018/07/12 13:42, Yasumasa Suenaga wrote:
>>>> Thanks Jini,
>>>>
>>>> I uploaded a new webrev. It adds some comments and removes the extra
>>>> space.
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.01/
>>>>
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>>
>>>> 2018-07-12 2:32 GMT+09:00 Jini George <jini.george at oracle.com>:
>>>>> Hi Yasumasa,
>>>>>
>>>>> This looks good to me except for one nit. And some more comments would
>>>>> help.
>>>>> For example, it would help to say that NSPidMap is there to map the
>>>>> host to container lwpids.
>>>>>
>>>>> The nit:
>>>>>
>>>>> *
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.c.sdiff.html
>>>>> Line 253: extra space after the parentheses
>>>>>
>>>>> Thanks,
>>>>> Jini.
>>>>>
>>>>> On 7/4/2018 4:34 AM, Yasumasa Suenaga wrote:
>>>>>>
>>>>>> PING: Could you review it?
>>>>>>
>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>>>     webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2018/06/28 22:12, Yasumasa Suenaga wrote:
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this change.
>>>>>>>
>>>>>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8205992
>>>>>>>     webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8205992/webrev.00/
>>>>>>>
>>>>>>> I tried to attach jhsdb to a Java process in a Docker container from the
>>>>>>> container host, but it could not attach.
>>>>>>> jcmd supports PID namespaces as of JDK-8193710, but jhsdb does not yet.
>>>>>>>
>>>>>>> SA gets LWP IDs via the thread stack and functions in libthread_db.so, but
>>>>>>> they return PIDs in the container, which differ from the host's PIDs. So I
>>>>>>> added code to scan /proc/<PID>/task to get all LWP IDs, and they are kept
>>>>>>> in a Map in LinuxDebuggerLocal.
>>>>>>>
>>>>>>> Also, SA_ALTROOT is set to /proc/<PID>/root if SA detects that the debuggee
>>>>>>> runs in a container. It helps SA to parse binaries in the container.
>>>>>>>
>>>>>>> This change has been pushed to the submit repo, and it failed on OS X
>>>>>>> (mach5-one-ysuenaga-JDK-8205992-20180628-1015-28963).
>>>>>>> But I guess that failure is caused by JDK-8205906. This change affects Linux only.
>>>>>>>
>>>>>>> Could you review it?
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>
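
On the getHostPID() question quoted above: in the quoted code the NPE can come
either from a null map or from unboxing a null Integer returned by get(). Below
is a minimal sketch of the null-check variant, assuming the map is keyed by the
container-side LWPID and holds the host-side LWPID. The class name
NsPidMapSketch and the method parseNSpid are hypothetical and are not taken
from the webrev. The NSpid line layout (leftmost value is the ID as seen from
the namespace that mounted procfs, rightmost is the innermost namespace)
follows proc(5).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; not the code from the webrev.
final class NsPidMapSketch {
    // Maps an LWPID as seen inside the container to the LWPID on the host.
    // Stays null when the debuggee is not running in a container.
    private Map<Integer, Integer> nspidMap;

    // Parses one "NSpid:" line from /proc/<PID>/task/<LWPID>/status, e.g.
    // "NSpid:\t4242\t7", and records container ID -> host ID.
    void parseNSpid(String statusLine) {
        if (!statusLine.startsWith("NSpid:")) {
            return;
        }
        String[] ids = statusLine.substring("NSpid:".length()).trim().split("\\s+");
        if (ids.length < 2) {
            return; // only one namespace: host and debuggee IDs are the same
        }
        if (nspidMap == null) {
            nspidMap = new HashMap<>();
        }
        int hostId = Integer.parseInt(ids[0]);
        int containerId = Integer.parseInt(ids[ids.length - 1]);
        nspidMap.put(containerId, hostId);
    }

    // Returns the host LWPID, or -1 when there is no mapping. No
    // NullPointerException is used for control flow.
    int getHostPID(int id) {
        if (nspidMap == null) {
            return -1;
        }
        Integer hostId = nspidMap.get(id);
        return (hostId == null) ? -1 : hostId;
    }
}
```

A caller can then keep the existing "if (hostPID != -1) pid = hostPID;" check
unchanged, since -1 still means "no translation needed or known".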


From serguei.spitsyn at oracle.com  Fri Jul 20 20:40:16 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Jul 2018 13:40:16 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <6de6362944f84740b80abb22cbbea872@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
Message-ID: <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>

Hi Ralf,


On 7/20/18 07:28, Schmelter, Ralf wrote:
> Hi Serguei,
>
> I've updated the webrev: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/

The copyright year in ThreadReferenceImpl.c still has to be 2018, not 2008.

http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html

   72             if (newDepth == -1_000) {
   73                 // Pop some frames so there is room on the stack for the
   74                 // call (including println()).
   75                 notifyRecursionEnded();
   76             }

   I have a concern about the potential issue mentioned in the comment above.
   Should a StackOverflowError be expected here?

   79         } catch (StackOverflowError e) {
   80             // Use negative depth to indicate the recursion has ended.
   81             return -1;
   82         }

   What is going to happen if the StackOverflowError was really caught above?
   If I understand it correctly, the notifyRecursionEnded() call will be missed then.
   This breakpoint will be missed as well:

   107         bpe = resumeTo("Frames2Targ", "notifyRecursionEnded", "()V");


> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). If it would have, the old code would have removed all native methods from the call stack. The original JVMDI call did indeed return JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the JVMDI->JVMTI transition.

Agreed.

> I've tried to make the test more readable and added some comments to explain why it is done the way it is.

Thank you for the update!


Thanks,
Serguei

> Best regards,
> Ralf
>
>
>
>
> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
> Sent: Mittwoch, 18. Juli 2018 22:57
> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe, Thomas <thomas.stuefe at sap.com>
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> Hi Ralf,
>
> The fix itself looks pretty good to me.
> Some minor comments.
>
> The copyright year needs an update.
>   218     jint count, filledIn;
>
>   Could you, please, split the declarations above into different lines to follow the local style?
> It is interesting that the original implementation checked the error code returned
> from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
> However, the GetFrameLocation spec does not list this error code as possible.
>
>
> Some comments about the test.
>    52     static void callEnded() {
>    53         System.out.println("SOE occurred as expected");
>    54     }
>    55
>    56     static int call(int depth) {
>    57         if (depth == 0) {
>    58             // Should have seen a stack overflow by now.
>    59             System.out.println("Exited without creating SOE");
>    60             System.exit(0);
>    61         }
>    62
>    63         try {
>    64             int newDepth = call(depth - 1);
>    65
>    66             if (newDepth == -1_000) {
>    67                 // Pop some frames so there is room on the stack for the
>    68                 // println()
>    69                 callEnded();
>    70             }
>    71
>    72             return newDepth - 1;
>    73         } catch (StackOverflowError e) {
>    74             return -1;
>    75         }
>    76     }
>    77 }
>   I'd suggest to rename the methods call() and callEnded() to something like
>   recursiveMethod() and recursionEnd().
>   Also, the manipulations with SOE create a complexity and are confusing.
>   Could it be more simple to let it propagated and then catch in main()?
>   What is the point for all these checks at the lines 104-119?
>   In general, I'm looking for some ways to make it more clear, simple and stable.
>
> Thanks,
> Serguei


From chris.plummer at oracle.com  Fri Jul 20 20:44:55 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 20 Jul 2018 13:44:55 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
Message-ID: <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>

On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
> Hi Ralf,
>
>
> On 7/20/18 07:28, Schmelter, Ralf wrote:
>> Hi Serguei,
>>
>> I've updated the webrev:
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>
> The copyright year in ThreadReferenceImpl.c still has to be 2018, not 
> 2008.
>
> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html 
>
>
>    72             if (newDepth == -1_000) {
>    73                 // Pop some frames so there is room on the stack for the
>    74                 // call (including println()).
>    75                 notifyRecursionEnded();
>    76             }
>
>   I have a concern about the potential issue mentioned in the comment above.
>   Should a StackOverflowError be expected here?
>
>    79         } catch (StackOverflowError e) {
>    80             // Use negative depth to indicate the recursion has ended.
>    81             return -1;
>    82         }
>
>   What is going to happen if the StackOverflowError was really caught above?
The SOE really is caught in the above code. It returns -1, which starts the
unwinding of the stack. After 1000 frames have been popped via returns,
notifyRecursionEnded() will be called. The pops are there so that
notifyRecursionEnded() can be called without worry of another SOE.

Chris
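
The unwinding described above can be sketched as a standalone program. This is
an illustration only, not the Frames2Test code: the class name SoeUnwindSketch
and the notifications counter are made up, and the 1000-frame margin simply
mirrors the -1_000 check in the quoted test.

```java
// Hypothetical sketch of the catch-and-unwind pattern; not the actual test.
final class SoeUnwindSketch {
    static int notifications = 0;

    // Stands in for notifyRecursionEnded(), where the debugger test sets
    // its breakpoint.
    static void notifyRecursionEnded() {
        notifications++;
    }

    static int recurse(int depth) {
        if (depth == 0) {
            // Should have seen a stack overflow by now.
            throw new IllegalStateException("Exited without creating SOE");
        }
        try {
            int newDepth = recurse(depth - 1);
            if (newDepth == -1_000) {
                // 1000 frames have been popped since the overflow, so
                // there is room on the stack for this call.
                notifyRecursionEnded();
            }
            return newDepth - 1;
        } catch (StackOverflowError e) {
            // A negative value tells the callers the recursion has ended.
            return -1;
        }
    }

    public static void main(String[] args) {
        int result = recurse(Integer.MAX_VALUE);
        System.out.println("unwound with result " + result
                + ", notifications = " + notifications);
    }
}
```

Only the frame that receives exactly -1_000 makes the call, so the breakpoint
method is entered once, and only after the stack has shrunk by 1000 frames.
The final return value depends on the stack size, so it is not fixed.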
>   If I understand it correctly, the notifyRecursionEnded() call will be missed then.
>   This breakpoint will be missed as well:
>
>    107         bpe = resumeTo("Frames2Targ", "notifyRecursionEnded", "()V");
>
>
>
>> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). 
>> If it would have, the old code would have removed all native methods 
>> from the call stack. The original JVMDI call did indeed return 
>> JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the 
>> JVMDI->JVMTI transition.
>
> Agreed.
>
>> I've tried to make the test more readable and added some comments to
>> explain why it is done the way it is.
>
> Thank you for the update!
>
>
> Thanks,
> Serguei
>
>> Best regards,
>> Ralf
>>
>>
>>
>>
>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>> Sent: Mittwoch, 18. Juli 2018 22:57
>> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf 
>> <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; 
>> Stuefe, Thomas <thomas.stuefe at sap.com>
>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c 
>> to prevent quadratic runtime behavior
>>
>> Hi Ralf,
>>
>> The fix itself looks pretty good to me.
>> Some minor comments.
>>
>> The copyright year needs an update.
>>   218     jint count, filledIn;
>>
>>   Could you, please, split the declarations above into different lines to follow the local style?
>> It is interesting that the original implementation checked the error code returned
>> from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
>> However, the GetFrameLocation spec does not list this error code as possible.
>>
>>
>> Some comments about the test.
>>    52     static void callEnded() {
>>    53         System.out.println("SOE occurred as expected");
>>    54     }
>>    55
>>    56     static int call(int depth) {
>>    57         if (depth == 0) {
>>    58             // Should have seen a stack overflow by now.
>>    59             System.out.println("Exited without creating SOE");
>>    60             System.exit(0);
>>    61         }
>>    62
>>    63         try {
>>    64             int newDepth = call(depth - 1);
>>    65
>>    66             if (newDepth == -1_000) {
>>    67                 // Pop some frames so there is room on the stack for the
>>    68                 // println()
>>    69                 callEnded();
>>    70             }
>>    71
>>    72             return newDepth - 1;
>>    73         } catch (StackOverflowError e) {
>>    74             return -1;
>>    75         }
>>    76     }
>>    77 }
>>   I'd suggest to rename the methods call() and callEnded() to something like
>>   recursiveMethod() and recursionEnd().
>>   Also, the manipulations with SOE create a complexity and are confusing.
>>   Could it be more simple to let it propagated and then catch in main()?
>>   What is the point for all these checks at the lines 104-119?
>>   In general, I'm looking for some ways to make it more clear, simple and stable.
>>
>> Thanks,
>> Serguei
>


From serguei.spitsyn at oracle.com  Fri Jul 20 21:04:22 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Fri, 20 Jul 2018 14:04:22 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
 <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>
Message-ID: <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>

On 7/20/18 13:44, Chris Plummer wrote:
> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Ralf,
>>
>>
>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>> Hi Serguei,
>>>
>>> I've updated the webrev:
>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>
>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not 
>> 2008.
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html 
>>
>>
>>    72             if (newDepth == -1_000) {
>>    73                 // Pop some frames so there is room on the stack for the
>>    74                 // call (including println()).
>>    75                 notifyRecursionEnded();
>>    76             }
>>
>>   I have a concern about the potential issue mentioned in the comment above.
>>   Should a StackOverflowError be expected here?
>>
>>    79         } catch (StackOverflowError e) {
>>    80             // Use negative depth to indicate the recursion has ended.
>>    81             return -1;
>>    82         }
>>
>>   What is going to happen if the StackOverflowError was really caught above?
> The SOE really is caught in the above code. It returns -1, which starts
> the unwinding of the stack. After 1000 frames have been popped via
> returns, notifyRecursionEnded() will be called. The pops are there so that
> notifyRecursionEnded() can be called without worry of another SOE.

Got it, thanks Chris.

So, I'm Okay with the fix assuming the copyright year is fixed.

Thanks,
Serguei

>
> Chris
>>   If I understand it correctly, the notifyRecursionEnded() call will be missed then.
>>   This breakpoint will be missed as well:
>>
>>    107         bpe = resumeTo("Frames2Targ", "notifyRecursionEnded", "()V");
>>
>>
>>
>>> JVMTI_ERROR_OPAQUE_FRAME was never returned from GetFrameLocation(). 
>>> If it would have, the old code would have removed all native methods 
>>> from the call stack. The original JVMDI call did indeed return 
>>> JVMDI_ERROR_OPAQUE_FRAME, so maybe it was a leftover from the 
>>> JVMDI->JVMTI transition.
>>
>> Agreed.
>>
>>> I've tried to make the test more readable and added some comments to
>>> explain why it is done the way it is.
>>
>> Thank you for the update!
>>
>>
>> Thanks,
>> Serguei
>>
>>> Best regards,
>>> Ralf
>>>
>>>
>>>
>>>
>>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>>> Sent: Mittwoch, 18. Juli 2018 22:57
>>> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf 
>>> <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; 
>>> Stuefe, Thomas <thomas.stuefe at sap.com>
>>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in 
>>> ThreadReferenceImpl.c to prevent quadratic runtime behavior
>>>
>>> Hi Ralf,
>>>
>>> The fix itself looks pretty good to me.
>>> Some minor comments.
>>>
>>> The copyright year needs an update.
>>>   218     jint count, filledIn;
>>>
>>>   Could you, please, split the declarations above into different lines to follow the local style?
>>> It is interesting that the original implementation checked the error code returned
>>> from the JVMTI GetFrameLocation for being equal to JVMTI_ERROR_OPAQUE_FRAME.
>>> However, the GetFrameLocation spec does not list this error code as possible.
>>>
>>>
>>> Some comments about the test.
>>>    52     static void callEnded() {
>>>    53         System.out.println("SOE occurred as expected");
>>>    54     }
>>>    55
>>>    56     static int call(int depth) {
>>>    57         if (depth == 0) {
>>>    58             // Should have seen a stack overflow by now.
>>>    59             System.out.println("Exited without creating SOE");
>>>    60             System.exit(0);
>>>    61         }
>>>    62
>>>    63         try {
>>>    64             int newDepth = call(depth - 1);
>>>    65
>>>    66             if (newDepth == -1_000) {
>>>    67                 // Pop some frames so there is room on the stack for the
>>>    68                 // println()
>>>    69                 callEnded();
>>>    70             }
>>>    71
>>>    72             return newDepth - 1;
>>>    73         } catch (StackOverflowError e) {
>>>    74             return -1;
>>>    75         }
>>>    76     }
>>>    77 }
>>>   I'd suggest to rename the methods call() and callEnded() to something like
>>>   recursiveMethod() and recursionEnd().
>>>   Also, the manipulations with SOE create a complexity and are confusing.
>>>   Could it be more simple to let it propagated and then catch in main()?
>>>   What is the point for all these checks at the lines 104-119?
>>>   In general, I'm looking for some ways to make it more clear, simple and stable.
>>>
>>> Thanks,
>>> Serguei
>>
>


From hohensee at amazon.com  Fri Jul 20 22:37:14 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 20 Jul 2018 22:37:14 +0000
Subject: RFR(L): 8196889: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
Message-ID: <E3D02D84-76D2-48A8-A209-9B777F64551D@amazon.com>

Please review.

Bug: https://bugs.openjdk.java.net/browse/JDK-8196989
CSR: https://bugs.openjdk.java.net/browse/JDK-8196991
Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00

This webrev is marked 'L' because it's a behavioral change (CSR in draft state, may I have a review of that too please?) and because the test change fanout is large. The actual code changes are 'M'.

Passes the submit repo, Hotspot tier1, the JFR gc event tests and any other test set with 'gc' or 'serviceability' in the test directory name. I found it difficult to verify the accuracy of the reported values other than manually, since they can vary from run to run of the same program. I'd appreciate suggestions for how to go about writing accuracy tests.

I set out originally to revamp only the MXBeans, but decided it would be incomplete if I didn't include the jstat counters and the output of the GC.heap_info jcmd option. I can separate the latter two into their own RFEs, but I find it easier to understand it all in a single webrev and hope the reviewers will too.

The basic approach is to add the new memory pools and collectors, the new jstat counters, and an archive region counter that stands in for an actual archive region set. HeapRegionSets are disjoint, so initially I tried to create a first-class archive region set (on the same level as the humongous region set), but that idea foundered on the fact that there's too much code I don't fully understand that depends on archive regions being in the existing old region set. Externally (i.e., in the MXBeans and the jstat counters), however, the old region set doesn't include archive regions (unless running in legacy mode).

I used CMS's TraceCMSMemoryManagerStats class as the model for TraceConcMemoryManagerStats, which collects statistics on concurrent cycles. There are two STW pauses in each concurrent cycle: they are recorded separately and count as two sun.gc.collector.2 events.

The humongous and archive space committed and used values are always identical, hence they are always 100% used.

The revised output of jcmd GC.heap_info is in G1CollectedHeap::print_on().
I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing the result type of young_list_target_length() from size_t to uint, which latter is the type of the _young_list_target_length member.
I updated the copyright date in src/hotspot/share/services/memoryService.hpp to 2018, as I neglected to do so in a previous push.
Thanks,
Paul
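
For anyone who wants to look at the values under discussion, here is a small
sketch that dumps what the platform MXBeans report for the running JVM. It
uses only the standard java.lang.management API. The class name MXBeanDump is
made up, and the pool and collector names it prints depend on the collector
and JDK build in use, so none are assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; prints whatever pools and collectors the JVM exposes.
final class MXBeanDump {
    // One line per memory pool with its used and committed byte counts.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // getUsage() returns null for a pool that is not valid
            }
            lines.add(pool.getName() + ": used=" + usage.getUsed()
                    + " committed=" + usage.getCommitted());
        }
        return lines;
    }

    // One line per collector with its collection count and accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
        describeCollectors().forEach(System.out::println);
    }
}
```

Running it with and without the patch under -XX:+UseG1GC is one manual way to
compare the pool and collector lists; it does not address the accuracy
question raised above.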



From igor.veresov at oracle.com  Sat Jul 21 20:47:34 2018
From: igor.veresov at oracle.com (Igor Veresov)
Date: Sat, 21 Jul 2018 13:47:34 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
 <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
 <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>
Message-ID: <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com>

I think you can just predicate the emission of these stubs for !UseTLAB, and not mess with the CPU-specific code. What do you think?

diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp b/src/hotspot/share/c1/c1_LIRGenerator.cpp
--- a/src/hotspot/share/c1/c1_LIRGenerator.cpp
+++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp
@@ -674,7 +674,7 @@
 void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) {
   klass2reg_with_patching(klass_reg, klass, info, is_unresolved);
   // If klass is not loaded we do not know if the klass has finalizers:
-  if (UseFastNewInstance && klass->is_loaded()
+  if (UseFastNewInstance && !UseTLAB && klass->is_loaded()
       && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) {

     Runtime1::StubID stub_id = klass->is_initialized() ? Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id;
 

igor

> On Jul 20, 2018, at 12:37 PM, JC Beyler <jcbeyler at google.com> wrote:
> 
> Yes that is right, this is the latest:
> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
> 
> I apologize for the multiple threads and confusion,
> Jc
> 
> On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com <
> serguei.spitsyn at oracle.com> wrote:
> 
>> Thank you a lot, Vladimir!
>> Yes, the webrev.03 is the latest.
>> Jc will correct us if it is not right.
>> 
>> Thanks,
>> Serguei
>> 
>> 
>> On 7/20/18 10:52, Vladimir Kozlov wrote:
>>> I asked Igor V. to look.
>>> 
>>> It seems the review is being done in another thread that does not have the
>>> bug id in its subject. Currently at webrev.03
>>> 
>>> Vladimir
>>> 
>>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
>>>> Thanks, Rahul!
>>>> In fact, there are no good experts for this area in the serviceability team.
>>>> It would be much better if anyone from the Compiler team could do it.
>>>> 
>>>> Vladimir K.,
>>>> 
>>>> Is there anyone from the Compiler team available to review this?
>>>> Otherwise, I could try to review it but am not sure about my review
>>>> quality.
>>>> 
>>>> Thanks,
>>>> Serguei
>>>> 
>>>> 
>>>> On 7/19/18 00:48, Rahul Raghavan wrote:
>>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>>>>> 
>>>>> (just adding + hotspot-compiler-dev also)
>>>>> 
>>>>> 
>>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
>>>>> Subject Was:
>>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>>>>> 
>>>>> + serviceability-dev
>>>>> 
>>>>> Hi all,
>>>>> 
>>>>> Could anyone else give me a review of this webrev and check/test the
>>>>> various architecture changes?
>>>>> 
>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>>> 
>>>>> 
>>>>> Thanks for all your help!
>>>>> Jc
>>>>> 
>>>>> 
>>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com>
>> wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> Here is a webrev that does all the architectures in the same way:
>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>>>>>> 
>>>>>>> Could anyone review the other architectures and test?
>>>>>>>   - arm, sparc & aarch64 are also modified now to follow the same
>>>>>>> "if no
>>>>>>> tlab, then consider eden space allocation" logic.
>>>>>>> 
>>>>>>> Thanks for your help!
>>>>>>> Jc
>>>>>>> 
>>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Kim,
>>>>>>>> 
>>>>>>>> I opened this bug
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
>>>>>>>> 
>>>>>>>> and now I've done an update:
>>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>>>>>>>> 
>>>>>>>> I basically have done your nits but also removed the try_eden (it
>>>>>>>> was
>>>>>>>> used to bind a label but was not used). I updated the comments to
>>>>>>>> use the
>>>>>>>> one you preferred.
>>>>>>>> 
>>>>>>>> I still have to do the other architectures though but at least we
>>>>>>>> seem to
>>>>>>>> have a consensus on this architecture, correct?
>>>>>>>> 
>>>>>>>> Thanks for the review,
>>>>>>>> Jc
>>>>>>>> 
>>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com
>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Yes, you are right, I did those changes due to:
>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>>>>>>>>>> 
>>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue,
>>>>>>>>>> I'll go
>>>>>>>>> ahead
>>>>>>>>>> and propagate the change across architectures.
>>>>>>>>>> 
>>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's
>>>>>>>>>> comment
>>>>>>>>> and
>>>>>>>>>> review) :)
>>>>>>>>>> Jc
>>>>>>>>>> 
>>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com
>>> 
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> I'm not sure if we had left this case intentionally or not
>>>>>>>>>>> but, if we
>>>>>>>>> want
>>>>>>>>>>> it all to be consistent, we should perhaps fix it.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Well, you put in that logic last February, so unless somebody
>>>>>>>>>>> speaks
>>>>>>>>> up
>>>>>>>>>>> quickly, I support your adjusting it to be the way you want it.
>>>>>>>>>>> 
>>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I
>>>>>>>>>>> src/hotspot/share"
>>>>>>>>>>> suggests that the GC group is most active in touching this
>>>>>>>>>>> feature.
>>>>>>>>>>> If Robbin is OK with it, there's your reviewer.
>>>>>>>>>>> 
>>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
>>>>>>>>>>> working on the GC to OK it.
>>>>>>>>>>> 
>>>>>>>>>>> ? John
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Jc
>>>>>>>>> 
>>>>>>>>> Robbin is on vacation; you might not hear from him for a while.
>>>>>>>>> 
>>>>>>>>> I'm assuming you'll open a new bug for this?
>>>>>>>>> 
>>>>>>>>> Except for a few minor nits (below), this looks okay to me.
>>>>>>>>> 
>>>>>>>>> The comment at line 1052 needs updating.
>>>>>>>>> 
>>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>>>>>> 
>>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>>>>>>>> line 1058, but unreferenced.
>>>>>>>>> 
>>>>>>>>> I like the wording of the comment at 1139 better than the
>>>>>>>>> wording at
>>>>>>>>> 1016.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Jc
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Jc
>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>> 
>> 
> 
> -- 
> 
> Thanks,
> Jc


From jcbeyler at google.com  Sun Jul 22 02:06:26 2018
From: jcbeyler at google.com (JC Beyler)
Date: Sat, 21 Jul 2018 19:06:26 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
 <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
 <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>
 <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com>
Message-ID: <CAF9BGByRFu-KAb-PStS0ZGrjyeP35LS1OK4MW5E8AB7bsFM==A@mail.gmail.com>

Hi Igor,

Thanks for looking at it! I don't know the code paths well enough to know if
that is sufficient (I'll trust you evidently). I can run the tests next
week if we prefer that route.

Were I to choose, I would prefer that interpreter/c1/c2 all follow the same
kind of paths, which would be my fix I believe:
1) If TLAB, allocate there or slowpath
2) Else If contiguous inline allocations are enabled, try that
3) Goto Slowpath
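
A minimal sketch of the three-step ordering above, written as standalone C++
rather than as HotSpot code. Every name here (AllocPath, AllocState,
choose_path) is hypothetical and only models the decision order; it is not the
code in the webrev.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical outcome labels; these are not HotSpot identifiers.
enum class AllocPath { Tlab, InlineContiguous, SlowPath };

// Hypothetical, simplified inputs standing in for the VM state.
struct AllocState {
    bool use_tlab;                 // stands in for the UseTLAB flag
    bool inline_contig_supported;  // stands in for supports_inline_contig_alloc
    std::size_t tlab_free;         // bytes left in the current TLAB
    std::size_t eden_free;         // bytes left in the shared eden
};

// Pick the allocation path in the order proposed in the message above.
AllocPath choose_path(const AllocState& s, std::size_t size) {
    assert(size > 0);  // a zero-sized request makes no sense in this model
    if (s.use_tlab) {
        // 1) With TLABs on, either the TLAB satisfies the request or we go
        //    straight to the slow path; eden is never tried inline.
        return size <= s.tlab_free ? AllocPath::Tlab : AllocPath::SlowPath;
    }
    if (s.inline_contig_supported && size <= s.eden_free) {
        // 2) No TLAB: try the contiguous inline (eden) allocation.
        return AllocPath::InlineContiguous;
    }
    // 3) Everything else falls through to the slow path.
    return AllocPath::SlowPath;
}
```

The point of the ordering is that a full TLAB never falls back to an inline
eden allocation; that case always goes to the slow path.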

With your fix, even if we do not have the issue anymore, it still keeps
code that is not consistent but perhaps I'm missing something?
Jc

On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov <igor.veresov at oracle.com>
wrote:

> I think you can just predicate the emission of these stubs for !UseTLAB,
> and not mess with the CPU-specific code. What do you think?
>
> diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp
> b/src/hotspot/share/c1/c1_LIRGenerator.cpp
> --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp
> +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp
> @@ -674,7 +674,7 @@
>  void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool
> is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3,
> LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) {
>    klass2reg_with_patching(klass_reg, klass, info, is_unresolved);
>    // If klass is not loaded we do not know if the klass has finalizers:
> -  if (UseFastNewInstance && klass->is_loaded()
> +  if (UseFastNewInstance && !UseTLAB && klass->is_loaded()
>        && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) {
>
>      Runtime1::StubID stub_id = klass->is_initialized() ?
> Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id;
>
>
> igor
>
> > On Jul 20, 2018, at 12:37 PM, JC Beyler <jcbeyler at google.com> wrote:
> >
> > Yes that is right, this is the latest:
> > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
> >
> > I apologize for the multiple threads and confusion,
> > Jc
> >
> > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com <
> > serguei.spitsyn at oracle.com> wrote:
> >
> >> Thank you a lot, Vladimir!
> >> Yes, the webrev.03 is the latest.
> >> Jc, will correct us if it is not right.
> >>
> >> Thanks,
> >> Serguei
> >>
> >>
> >> On 7/20/18 10:52, Vladimir Kozlov wrote:
> >>> I asked Igor V. to look.
> >>>
> >>> Seems like review is done in an other thread which does not have bug
> >>> id in subject. Currently webrev.03
> >>>
> >>> Vladimir
> >>>
> >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
> >>>> Thanks, Rahul!
> >>>> In fact, there no good experts for this area in the serviceability
> team.
> >>>> It would be much better if anyone from the Compiler team could do it.
> >>>>
> >>>> Vladimir K.,
> >>>>
> >>>> Is there anyone from the Compiler team available to review this?
> >>>> Otherwise, I could try to review it but am not sure about my review
> >>>> quality.
> >>>>
> >>>> Thanks,
> >>>> Serguei
> >>>>
> >>>>
> >>>> On 7/19/18 00:48, Rahul Raghavan wrote:
> >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
> >>>>>
> >>>>> (just adding + hotspot-compiler-dev also)
> >>>>>
> >>>>>
> >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
> >>>>> Subject Was:
> >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
> >>>>>
> >>>>> + serviceability-dev
> >>>>>
> >>>>> Hi all,
> >>>>>
> >>>>> Could anyone else give me a review of this webrev and check/test the
> >>>>> various architecture changes?
> >>>>>
> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
> >>>>>
> >>>>>
> >>>>> Thanks for all your help!
> >>>>> Jc
> >>>>>
> >>>>>
> >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com>
> >> wrote:
> >>>>>>
> >>>>>>> Hi all,
> >>>>>>>
> >>>>>>> Here is a webrev that does all the architectures in the same way:
> >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
> >>>>>>>
> >>>>>>> Could anyone review the other architectures and test?
> >>>>>>>   - arm, sparc & aarch64 are also modified now to follow the same
> >>>>>>> "if no
> >>>>>>> tlab, then consider eden space allocation" logic.
> >>>>>>>
> >>>>>>> Thanks for your help!
> >>>>>>> Jc
> >>>>>>>
> >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Kim,
> >>>>>>>>
> >>>>>>>> I opened this bug
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
> >>>>>>>>
> >>>>>>>> and now I've done an update:
> >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
> >>>>>>>>
> >>>>>>>> I basically have done your nits but also removed the try_eden (it
> >>>>>>>> was
> >>>>>>>> used to bind a label but was not used). I updated the comments to
> >>>>>>>> use the
> >>>>>>>> one you preferred.
> >>>>>>>>
> >>>>>>>> I still have to do the other architectures though but at least we
> >>>>>>>> seem to
> >>>>>>>> have a consensus on this architecture, correct?
> >>>>>>>>
> >>>>>>>> Thanks for the review,
> >>>>>>>> Jc
> >>>>>>>>
> >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <
> kim.barrett at oracle.com
> >>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Yes, you are right, I did those changes due to:
> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
> >>>>>>>>>>
> >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue,
> >>>>>>>>>> I'll go
> >>>>>>>>> ahead
> >>>>>>>>>> and propagate the change across architectures.
> >>>>>>>>>>
> >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's
> >>>>>>>>>> comment
> >>>>>>>>> and
> >>>>>>>>>> review) :)
> >>>>>>>>>> Jc
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <
> john.r.rose at oracle.com
> >>>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not sure if we had left this case intentionally or not
> >>>>>>>>>>> but, if we
> >>>>>>>>> want
> >>>>>>>>>>> it all to be consistent, we should perhaps fix it.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Well, you put in that logic last February, so unless somebody
> >>>>>>>>>>> speaks
> >>>>>>>>> up
> >>>>>>>>>>> quickly, I support your adjusting it to be the way you want it.
> >>>>>>>>>>>
> >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I
> >>>>>>>>>>> src/hotspot/share"
> >>>>>>>>>>> suggests that the GC group is most active in touching this
> >>>>>>>>>>> feature.
> >>>>>>>>>>> If Robbin is OK with it, there's your reviewer.
> >>>>>>>>>>>
> >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other
> person
> >>>>>>>>>>> working on the GC to OK it.
> >>>>>>>>>>>
> >>>>>>>>>>> -- John
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Jc
> >>>>>>>>>
> >>>>>>>>> Robbin is on vacation; you might not hear from him for a while.
> >>>>>>>>>
> >>>>>>>>> I'm assuming you'll open a new bug for this?
> >>>>>>>>>
> >>>>>>>>> Except for a few minor nits (below), this looks okay to me.
> >>>>>>>>>
> >>>>>>>>> The comment at line 1052 needs updating.
> >>>>>>>>>
> >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is
> unused.
> >>>>>>>>>
> >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound
> at
> >>>>>>>>> line 1058, but unreferenced.
> >>>>>>>>>
> >>>>>>>>> I like the wording of the comment at 1139 better than the
> >>>>>>>>> wording at
> >>>>>>>>> 1016.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Jc
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Jc
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>
> >>
> >
> > --
> >
> > Thanks,
> > Jc
>
>

-- 

Thanks,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180721/9137f8de/attachment-0001.html>

From igor.veresov at oracle.com  Sun Jul 22 02:39:21 2018
From: igor.veresov at oracle.com (Igor Veresov)
Date: Sat, 21 Jul 2018 19:39:21 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <CAF9BGByRFu-KAb-PStS0ZGrjyeP35LS1OK4MW5E8AB7bsFM==A@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
 <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
 <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>
 <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com>
 <CAF9BGByRFu-KAb-PStS0ZGrjyeP35LS1OK4MW5E8AB7bsFM==A@mail.gmail.com>
Message-ID: <719DC045-4311-499E-9F7D-784096A044C6@oracle.com>

Yeah, the fix I proposed doesn't do exactly what we'd want. Sorry for the confusion. Your fix is fine. Reviewed.

igor

> On Jul 21, 2018, at 7:06 PM, JC Beyler <jcbeyler at google.com> wrote:
> 
> Hi Igor,
> 
> Thanks for looking at it! I don't know the code paths enough to know if that is sufficient (I'll trust you evidently). I can run the tests next week if we prefer that route.
> 
> Were I to choose, I would prefer that interpreter/c1/c2 all follow the same kind of paths, which would be my fix I believe: 
> 1) If TLAB, allocate there or slowpath
> 2) Else If contiguous inline allocations are enabled, try that
> 3) Goto Slowpath
> 
> With your fix, even if we do not have the issue anymore, it still keeps code that is not consistent but perhaps I'm missing something?
> Jc
> 
> On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov <igor.veresov at oracle.com <mailto:igor.veresov at oracle.com>> wrote:
> I think you can just predicate the emission of these stubs for !UseTLAB, and not mess with the CPU-specific code. What do you think?
> 
> diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp b/src/hotspot/share/c1/c1_LIRGenerator.cpp
> --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp
> +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp
> @@ -674,7 +674,7 @@
>  void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass, bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3, LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) {
>    klass2reg_with_patching(klass_reg, klass, info, is_unresolved);
>    // If klass is not loaded we do not know if the klass has finalizers:
> -  if (UseFastNewInstance && klass->is_loaded()
> +  if (UseFastNewInstance && !UseTLAB && klass->is_loaded()
>        && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) {
> 
>      Runtime1::StubID stub_id = klass->is_initialized() ? Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id;
> 
> 
> igor
> 
> > On Jul 20, 2018, at 12:37 PM, JC Beyler <jcbeyler at google.com <mailto:jcbeyler at google.com>> wrote:
> > 
> > Yes that is right, this is the latest:
> > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/ <http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/>
> > 
> > I apologize for the multiple threads and confusion,
> > Jc
> > 
> > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> <
> > serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com>> wrote:
> > 
> >> Thank you a lot, Vladimir!
> >> Yes, the webrev.03 is the latest.
> >> Jc, will correct us if it is not right.
> >> 
> >> Thanks,
> >> Serguei
> >> 
> >> 
> >> On 7/20/18 10:52, Vladimir Kozlov wrote:
> >>> I asked Igor V. to look.
> >>> 
> >>> Seems like review is done in an other thread which does not have bug
> >>> id in subject. Currently webrev.03
> >>> 
> >>> Vladimir
> >>> 
> >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com <mailto:serguei.spitsyn at oracle.com> wrote:
> >>>> Thanks, Rahul!
> >>>> In fact, there no good experts for this area in the serviceability team.
> >>>> It would be much better if anyone from the Compiler team could do it.
> >>>> 
> >>>> Vladimir K.,
> >>>> 
> >>>> Is there anyone from the Compiler team available to review this?
> >>>> Otherwise, I could try to review it but am not sure about my review
> >>>> quality.
> >>>> 
> >>>> Thanks,
> >>>> Serguei
> >>>> 
> >>>> 
> >>>> On 7/19/18 00:48, Rahul Raghavan wrote:
> >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
> >>>>> 
> >>>>> (just adding + hotspot-compiler-dev also)
> >>>>> 
> >>>>> 
> >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
> >>>>> Subject Was:
> >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
> >>>>> 
> >>>>> + serviceability-dev
> >>>>> 
> >>>>> Hi all,
> >>>>> 
> >>>>> Could anyone else give me a review of this webrev and check/test the
> >>>>> various architecture changes?
> >>>>> 
> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ <http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/>
> >>>>> 
> >>>>> 
> >>>>> Thanks for all your help!
> >>>>> Jc
> >>>>> 
> >>>>> 
> >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com <mailto:jcbeyler at google.com>>
> >> wrote:
> >>>>>> 
> >>>>>>> Hi all,
> >>>>>>> 
> >>>>>>> Here is a webrev that does all the architectures in the same way:
> >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/ <http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/>
> >>>>>>> 
> >>>>>>> Could anyone review the other architectures and test?
> >>>>>>>   - arm, sparc & aarch64 are also modified now to follow the same
> >>>>>>> "if no
> >>>>>>> tlab, then consider eden space allocation" logic.
> >>>>>>> 
> >>>>>>> Thanks for your help!
> >>>>>>> Jc
> >>>>>>> 
> >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com <mailto:jcbeyler at google.com>>
> >>>>>>> wrote:
> >>>>>>> 
> >>>>>>>> Hi Kim,
> >>>>>>>> 
> >>>>>>>> I opened this bug
> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862 <https://bugs.openjdk.java.net/browse/JDK-8190862>
> >>>>>>>> 
> >>>>>>>> and now I've done an update:
> >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/ <http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/>
> >>>>>>>> 
> >>>>>>>> I basically have done your nits but also removed the try_eden (it
> >>>>>>>> was
> >>>>>>>> used to bind a label but was not used). I updated the comments to
> >>>>>>>> use the
> >>>>>>>> one you preferred.
> >>>>>>>> 
> >>>>>>>> I still have to do the other architectures though but at least we
> >>>>>>>> seem to
> >>>>>>>> have a consensus on this architecture, correct?
> >>>>>>>> 
> >>>>>>>> Thanks for the review,
> >>>>>>>> Jc
> >>>>>>>> 
> >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com <mailto:kim.barrett at oracle.com>
> >>> 
> >>>>>>>> wrote:
> >>>>>>>> 
> >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com <mailto:jcbeyler at google.com>>
> >>>>>>>>>> wrote:
> >>>>>>>>>> 
> >>>>>>>>>> Yes, you are right, I did those changes due to:
> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084 <https://bugs.openjdk.java.net/browse/JDK-8194084>
> >>>>>>>>>> 
> >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue,
> >>>>>>>>>> I'll go
> >>>>>>>>> ahead
> >>>>>>>>>> and propagate the change across architectures.
> >>>>>>>>>> 
> >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's
> >>>>>>>>>> comment
> >>>>>>>>> and
> >>>>>>>>>> review) :)
> >>>>>>>>>> Jc
> >>>>>>>>>> 
> >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com <mailto:john.r.rose at oracle.com>
> >>> 
> >>>>>>>>> wrote:
> >>>>>>>>>> 
> >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com <mailto:jcbeyler at google.com>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>> 
> >>>>>>>>>>> 
> >>>>>>>>>>> I'm not sure if we had left this case intentionally or not
> >>>>>>>>>>> but, if we
> >>>>>>>>> want
> >>>>>>>>>>> it all to be consistent, we should perhaps fix it.
> >>>>>>>>>>> 
> >>>>>>>>>>> 
> >>>>>>>>>>> Well, you put in that logic last February, so unless somebody
> >>>>>>>>>>> speaks
> >>>>>>>>> up
> >>>>>>>>>>> quickly, I support your adjusting it to be the way you want it.
> >>>>>>>>>>> 
> >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I
> >>>>>>>>>>> src/hotspot/share"
> >>>>>>>>>>> suggests that the GC group is most active in touching this
> >>>>>>>>>>> feature.
> >>>>>>>>>>> If Robbin is OK with it, there's your reviewer.
> >>>>>>>>>>> 
> >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
> >>>>>>>>>>> working on the GC to OK it.
> >>>>>>>>>>> 
> >>>>>>>>>>> -- John
> >>>>>>>>>>> 
> >>>>>>>>>> 
> >>>>>>>>>> 
> >>>>>>>>>> --
> >>>>>>>>>> 
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Jc
> >>>>>>>>> 
> >>>>>>>>> Robbin is on vacation; you might not hear from him for a while.
> >>>>>>>>> 
> >>>>>>>>> I'm assuming you'll open a new bug for this?
> >>>>>>>>> 
> >>>>>>>>> Except for a few minor nits (below), this looks okay to me.
> >>>>>>>>> 
> >>>>>>>>> The comment at line 1052 needs updating.
> >>>>>>>>> 
> >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
> >>>>>>>>> 
> >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound at
> >>>>>>>>> line 1058, but unreferenced.
> >>>>>>>>> 
> >>>>>>>>> I like the wording of the comment at 1139 better than the
> >>>>>>>>> wording at
> >>>>>>>>> 1016.
> >>>>>>>>> 
> >>>>>>>>> 
> >>>>>>>> 
> >>>>>>>> --
> >>>>>>>> 
> >>>>>>>> Thanks,
> >>>>>>>> Jc
> >>>>>>>> 
> >>>>>>> 
> >>>>>>> 
> >>>>>>> --
> >>>>>>> 
> >>>>>>> Thanks,
> >>>>>>> Jc
> >>>>>>> 
> >>>>>> 
> >>>>>> 
> >>>> 
> >> 
> >> 
> > 
> > -- 
> > 
> > Thanks,
> > Jc
> 
> 
> 
> -- 
> 
> Thanks,
> Jc

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180721/3d7aab10/attachment-0001.html>

From jcbeyler at google.com  Mon Jul 23 03:04:15 2018
From: jcbeyler at google.com (JC Beyler)
Date: Sun, 22 Jul 2018 20:04:15 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <719DC045-4311-499E-9F7D-784096A044C6@oracle.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
 <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
 <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>
 <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com>
 <CAF9BGByRFu-KAb-PStS0ZGrjyeP35LS1OK4MW5E8AB7bsFM==A@mail.gmail.com>
 <719DC045-4311-499E-9F7D-784096A044C6@oracle.com>
Message-ID: <CAF9BGBwEDkKSPoCxYz1bAcE5a-bYpJErFyJ2s4y5WWz5zqg-KQ@mail.gmail.com>

Thanks Igor!

http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.04/

It now has your name in the reviewers. Would anyone want to push it by chance?

Thanks!
Jc

On Sat, Jul 21, 2018 at 7:39 PM Igor Veresov <igor.veresov at oracle.com>
wrote:

> Yeah, the fix I proposed doesn't do exactly what we'd want. Sorry for the
> confusion. Your fix is fine. Reviewed.
>
> igor
>
> On Jul 21, 2018, at 7:06 PM, JC Beyler <jcbeyler at google.com> wrote:
>
> Hi Igor,
>
> Thanks for looking at it! I don't know the code paths enough to know if
> that is sufficient (I'll trust you evidently). I can run the tests next
> week if we prefer that route.
>
> Were I to choose, I would prefer that interpreter/c1/c2 all follow the
> same kind of paths, which would be my fix I believe:
> 1) If TLAB, allocate there or slowpath
> 2) Else If contiguous inline allocations are enabled, try that
> 3) Goto Slowpath
>
> With your fix, even if we do not have the issue anymore, it still keeps
> code that is not consistent but perhaps I'm missing something?
> Jc
>
> On Sat, Jul 21, 2018 at 1:47 PM Igor Veresov <igor.veresov at oracle.com>
> wrote:
>
>> I think you can just predicate the emission of these stubs for !UseTLAB,
>> and not mess with the CPU-specific code. What do you think?
>>
>> diff --git a/src/hotspot/share/c1/c1_LIRGenerator.cpp
>> b/src/hotspot/share/c1/c1_LIRGenerator.cpp
>> --- a/src/hotspot/share/c1/c1_LIRGenerator.cpp
>> +++ b/src/hotspot/share/c1/c1_LIRGenerator.cpp
>> @@ -674,7 +674,7 @@
>>  void LIRGenerator::new_instance(LIR_Opr dst, ciInstanceKlass* klass,
>> bool is_unresolved, LIR_Opr scratch1, LIR_Opr scratch2, LIR_Opr scratch3,
>> LIR_Opr scratch4, LIR_Opr klass_reg, CodeEmitInfo* info) {
>>    klass2reg_with_patching(klass_reg, klass, info, is_unresolved);
>>    // If klass is not loaded we do not know if the klass has finalizers:
>> -  if (UseFastNewInstance && klass->is_loaded()
>> +  if (UseFastNewInstance && !UseTLAB && klass->is_loaded()
>>        && !Klass::layout_helper_needs_slow_path(klass->layout_helper())) {
>>
>>      Runtime1::StubID stub_id = klass->is_initialized() ?
>> Runtime1::fast_new_instance_id : Runtime1::fast_new_instance_init_check_id;
>>
>>
>> igor
>>
>> > On Jul 20, 2018, at 12:37 PM, JC Beyler <jcbeyler at google.com> wrote:
>> >
>> > Yes that is right, this is the latest:
>> > http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.03/
>> >
>> > I apologize for the multiple threads and confusion,
>> > Jc
>> >
>> > On Fri, Jul 20, 2018 at 11:22 AM serguei.spitsyn at oracle.com <
>> > serguei.spitsyn at oracle.com> wrote:
>> >
>> >> Thank you a lot, Vladimir!
>> >> Yes, the webrev.03 is the latest.
>> >> Jc, will correct us if it is not right.
>> >>
>> >> Thanks,
>> >> Serguei
>> >>
>> >>
>> >> On 7/20/18 10:52, Vladimir Kozlov wrote:
>> >>> I asked Igor V. to look.
>> >>>
>> >>> Seems like review is done in an other thread which does not have bug
>> >>> id in subject. Currently webrev.03
>> >>>
>> >>> Vladimir
>> >>>
>> >>> On 7/19/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
>> >>>> Thanks, Rahul!
>> >>>> In fact, there no good experts for this area in the serviceability
>> team.
>> >>>> It would be much better if anyone from the Compiler team could do it.
>> >>>>
>> >>>> Vladimir K.,
>> >>>>
>> >>>> Is there anyone from the Compiler team available to review this?
>> >>>> Otherwise, I could try to review it but am not sure about my review
>> >>>> quality.
>> >>>>
>> >>>> Thanks,
>> >>>> Serguei
>> >>>>
>> >>>>
>> >>>> On 7/19/18 00:48, Rahul Raghavan wrote:
>> >>>>> RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled
>> >>>>>
>> >>>>> (just adding + hotspot-compiler-dev also)
>> >>>>>
>> >>>>>
>> >>>>> On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
>> >>>>> Subject Was:
>> >>>>> Re: RFR (S): C1 still does eden allocations when TLAB is enabled
>> >>>>>
>> >>>>> + serviceability-dev
>> >>>>>
>> >>>>> Hi all,
>> >>>>>
>> >>>>> Could anyone else give me a review of this webrev and check/test the
>> >>>>> various architecture changes?
>> >>>>>
>> >>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>> >>>>>
>> >>>>>
>> >>>>> Thanks for all your help!
>> >>>>> Jc
>> >>>>>
>> >>>>>
>> >>>>>> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com>
>> >> wrote:
>> >>>>>>
>> >>>>>>> Hi all,
>> >>>>>>>
>> >>>>>>> Here is a webrev that does all the architectures in the same way:
>> >>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>> >>>>>>>
>> >>>>>>> Could anyone review the other architectures and test?
>> >>>>>>>   - arm, sparc & aarch64 are also modified now to follow the same
>> >>>>>>> "if no
>> >>>>>>> tlab, then consider eden space allocation" logic.
>> >>>>>>>
>> >>>>>>> Thanks for your help!
>> >>>>>>> Jc
>> >>>>>>>
>> >>>>>>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> Hi Kim,
>> >>>>>>>>
>> >>>>>>>> I opened this bug
>> >>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8190862
>> >>>>>>>>
>> >>>>>>>> and now I've done an update:
>> >>>>>>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>> >>>>>>>>
>> >>>>>>>> I basically have done your nits but also removed the try_eden (it
>> >>>>>>>> was
>> >>>>>>>> used to bind a label but was not used). I updated the comments to
>> >>>>>>>> use the
>> >>>>>>>> one you preferred.
>> >>>>>>>>
>> >>>>>>>> I still have to do the other architectures though but at least we
>> >>>>>>>> seem to
>> >>>>>>>> have a consensus on this architecture, correct?
>> >>>>>>>>
>> >>>>>>>> Thanks for the review,
>> >>>>>>>> Jc
>> >>>>>>>>
>> >>>>>>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <
>> kim.barrett at oracle.com
>> >>>
>> >>>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> Yes, you are right, I did those changes due to:
>> >>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>> >>>>>>>>>>
>> >>>>>>>>>> If Robbin agrees to this change, and if no one sees an issue,
>> >>>>>>>>>> I'll go
>> >>>>>>>>> ahead
>> >>>>>>>>>> and propagate the change across architectures.
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's
>> >>>>>>>>>> comment
>> >>>>>>>>> and
>> >>>>>>>>>> review) :)
>> >>>>>>>>>> Jc
>> >>>>>>>>>>
>> >>>>>>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <
>> john.r.rose at oracle.com
>> >>>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> I'm not sure if we had left this case intentionally or not
>> >>>>>>>>>>> but, if we
>> >>>>>>>>> want
>> >>>>>>>>>>> it all to be consistent, we should perhaps fix it.
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> Well, you put in that logic last February, so unless somebody
>> >>>>>>>>>>> speaks
>> >>>>>>>>> up
>> >>>>>>>>>>> quickly, I support your adjusting it to be the way you want
>> it.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I
>> >>>>>>>>>>> src/hotspot/share"
>> >>>>>>>>>>> suggests that the GC group is most active in touching this
>> >>>>>>>>>>> feature.
>> >>>>>>>>>>> If Robbin is OK with it, there's your reviewer.
>> >>>>>>>>>>>
>> >>>>>>>>>>> FWIW, you can use me as a reviewer, but I'd get one other
>> person
>> >>>>>>>>>>> working on the GC to OK it.
>> >>>>>>>>>>>
>> >>>>>>>>>>> -- John
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>>
>> >>>>>>>>>> Thanks,
>> >>>>>>>>>> Jc
>> >>>>>>>>>
>> >>>>>>>>> Robbin is on vacation; you might not hear from him for a while.
>> >>>>>>>>>
>> >>>>>>>>> I'm assuming you'll open a new bug for this?
>> >>>>>>>>>
>> >>>>>>>>> Except for a few minor nits (below), this looks okay to me.
>> >>>>>>>>>
>> >>>>>>>>> The comment at line 1052 needs updating.
>> >>>>>>>>>
>> >>>>>>>>> pre-existing: The retry_tlab label declared on line 1054 is
>> unused.
>> >>>>>>>>>
>> >>>>>>>>> pre-existing: The try_eden label declared on line 1054 is bound
>> at
>> >>>>>>>>> line 1058, but unreferenced.
>> >>>>>>>>>
>> >>>>>>>>> I like the wording of the comment at 1139 better than the
>> >>>>>>>>> wording at
>> >>>>>>>>> 1016.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>> --
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>> Jc
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>>
>> >>>>>>> Thanks,
>> >>>>>>> Jc
>> >>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>
>> >>
>> >
>> > --
>> >
>> > Thanks,
>> > Jc
>>
>>
>
> --
>
> Thanks,
> Jc
>
>
>

-- 

Thanks,
Jc
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180722/bc10e9ae/attachment.html>

From serguei.spitsyn at oracle.com  Mon Jul 23 06:52:59 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Sun, 22 Jul 2018 23:52:59 -0700
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <CAF9BGBwEDkKSPoCxYz1bAcE5a-bYpJErFyJ2s4y5WWz5zqg-KQ@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
 <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>
 <09523277-c374-9243-9eb4-0d1f57dd2f55@oracle.com>
 <8832ef4e-e9c8-ebad-6c69-98e2e85ec279@oracle.com>
 <b3c9630e-8434-b2da-f732-6569e629debf@oracle.com>
 <CAF9BGBw+gUm+yHRGueGrrBJ5Pn5BA0RTpFyPeOVrBkj8+VAZWg@mail.gmail.com>
 <3515C5E6-12CF-400E-B4F3-6CF2D211C587@oracle.com>
 <CAF9BGByRFu-KAb-PStS0ZGrjyeP35LS1OK4MW5E8AB7bsFM==A@mail.gmail.com>
 <719DC045-4311-499E-9F7D-784096A044C6@oracle.com>
 <CAF9BGBwEDkKSPoCxYz1bAcE5a-bYpJErFyJ2s4y5WWz5zqg-KQ@mail.gmail.com>
Message-ID: <4e48d67c-9c01-c1bf-4955-a237f1484c2e@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180722/46b8e8d9/attachment-0001.html>

From chris.plummer at oracle.com  Mon Jul 23 07:42:16 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 00:42:16 -0700
Subject: RFR(S): 8151259: [TESTBUG] nsk/jvmti/RedefineClasses/redefclass030
 fails with "unexpected values of outer fields of the class" when running with
 -Xcomp
Message-ID: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>

Hello,

Please review the following fix for JDK11:

https://bugs.openjdk.java.net/browse/JDK-8151259
http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00

It fixes the following 3 tests:

vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java

Any of which could fail when run with -Xcomp with (followed by a bunch 
more errors):

# ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.

Although lately we've only seen this with redefclass030.java on macosx.

These 3 tests do redefinition of a "hot" method after triggering 
compilation for it. After the redef some testing is done to ensure that 
the redef was done correctly, but the issue these tests have actually 
comes before any redef is done.

The test attempts to trigger compilation by calling a hot method a lot. 
The agent detects compilation by receiving a CompiledMethodLoad event. 
There was an issue discovered long ago that when -Xcomp is used, the 
compilation happens before the "hot" method is ever called. Then the 
redef would happen before compilation, and this somehow messed up the 
test (I'm not exactly sure how). The fix was to basically abandon the 
redef attempt when this problem is detected, and then supposedly just 
let the test run to completion (skipping the actual testing of the 
redef). After this change, if you ran with -Xcomp it would pass, but if 
you looked in the log you would see:

  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.

However, there was a bug in the logic to make the test run to 
completion, which also causes the above message not to appear. Instead 
the test would fail with:

# ERROR: Redefinition not completed.

Followed by a bunch more error messages during the part of the test that 
checks if the redef was done properly.

If the CompiledMethodLoad event comes in before the hot method is ever 
called (which it does with -Xcomp), the test sets fire = -1. If the hot 
method was called, it is set to 1. The setting of fire = -1 was added 
to fix the -Xcomp problem mentioned above. The jvmti agent does the 
following:

    do {
        THREAD_sleep(1);
        /* wait for compilation to happen */
    } while(fire == 0);

    if (fire == 1) {
        /* do the redef here */
        NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
    } else {
        // fire == -1
        NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
    }

The agent then syncs with the debuggee, waiting for it to finish up. What 
the test expects is that waitForRedefinitionStarted() in the debuggee 
will time out after two seconds while waiting for fire == 1 (which it 
thinks will always happen because it was set to -1). When it times 
out, the test does appear to exit properly, but with the following 
in the log, which is intended:

  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.

However, sometimes before waitForRedefinitionStarted() times out, the 
hot method is called enough times to trigger compilation. So another 
CompiledMethodLoad event arrives, and this time fire is set to 1. 
Because of this, waitForRedefinitionStarted() doesn't time out and 
returns with an indication that the redef has started. After this 
waitForRedefinitionCompleted() is executed. It waits for the redef to 
complete, but it never does since the agent decided not to do the redef 
when it saw fire == -1. So waitForRedefinitionCompleted() times out 
after 10 seconds and the test fails, with:

# ERROR: Redefinition not completed.

Actually the above error is not really what causes the failure. When the 
above error is detected, no error status is set and the test continues 
as if the redef had been done. So then the logic that detects if the 
redef was done properly ends up failing, and that's where the test 
actually indicates a failure status. You see a whole bunch of other 
errors in the log because of all the checks that fail.

The fix is to not abandon the test when the first CompiledMethodLoad 
event is before the hot method was called. Instead just leave fire==0 
and wait for the next CompiledMethodLoad event that is triggered after 
the method is called enough times to be recompiled. I'm not sure why it 
was not originally done this way. Possibly the recompilation did not 
happen reliably, but I have not run into this problem. The other changes 
in redefclass030.c are just cleaning up debug tracing.
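
[Editor's sketch] The race described above can be replayed in a few lines of Java. This is only a simplified model under stated assumptions, not the test's actual code: the class name FireFlagSketch, the replay() helper and the Outcome type are all hypothetical, and the agent is modeled as acting on the first nonzero value of fire while the debuggee looks at the final value.

```java
import java.util.List;

public class FireFlagSketch {
    /** What the agent decided and what the debuggee later observed. */
    static final class Outcome {
        final boolean agentRedefined;
        final boolean debuggeeSawStart;

        Outcome(boolean agentRedefined, boolean debuggeeSawStart) {
            this.agentRedefined = agentRedefined;
            this.debuggeeSawStart = debuggeeSawStart;
        }

        @Override
        public String toString() {
            return "agentRedefined=" + agentRedefined
                    + ", debuggeeSawStart=" + debuggeeSawStart;
        }
    }

    /**
     * Replays CompiledMethodLoad events. Each element says whether the hot
     * method had already been called when that event arrived. earlyValue is
     * what the handler stores for an "early" event: -1 models the old code,
     * 0 models the proposed fix.
     */
    static Outcome replay(List<Boolean> hotMethodCalled, int earlyValue) {
        int fire = 0;
        int agentSaw = 0; // first nonzero value, which ends the agent's wait loop
        for (boolean called : hotMethodCalled) {
            fire = called ? 1 : earlyValue;
            if (agentSaw == 0) {
                agentSaw = fire;
            }
        }
        // The agent redefines only if it woke up on 1; the debuggee checks for 1 later.
        return new Outcome(agentSaw == 1, fire == 1);
    }

    public static void main(String[] args) {
        // -Xcomp case: compiled before the first call, then recompiled after many calls.
        List<Boolean> events = List.of(false, true);
        System.out.println("old (fire = -1): " + replay(events, -1));
        System.out.println("new (fire = 0):  " + replay(events, 0));
    }
}
```

With the old value the two sides disagree (the agent skips the redef while the debuggee believes it started); with the new value both agree.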

Another fix was to properly set the error status when 
waitForRedefinitionStarted() or waitForRedefinitionCompleted() times 
out, although this is just a safety net and I didn't run into any cases 
where this happened after fixing the CompiledMethodLoad event handling. 
So in general the changes in redefclass030.java were not needed, but 
provide better error handling.
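
[Editor's sketch] The safety net described above, a wait that records a failure status when it times out, might look roughly like this. The names TimeoutStatusSketch, waitFor(), PASSED and FAILED are hypothetical and are not taken from redefclass030.java.

```java
import java.util.function.BooleanSupplier;

public class TimeoutStatusSketch {
    static final int PASSED = 0;
    static final int FAILED = 2;

    /** Overall test status; a timed-out wait flips it to FAILED. */
    static int status = PASSED;

    /** Polls the condition until it holds or the timeout expires. */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, String what) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                System.err.println("# ERROR: " + what + " timed out");
                status = FAILED; // record the failure instead of carrying on silently
                return false;
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                status = FAILED;
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean ok = waitFor(() -> false, 50, "Redefinition start");
        System.out.println("wait succeeded: " + ok + ", status: " + status);
    }
}
```

The point of the design is that a timeout can no longer be followed by checks that assume the redefinition happened.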

thanks,

Chris


From daniel.daugherty at oracle.com  Mon Jul 23 16:10:22 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 23 Jul 2018 12:10:22 -0400
Subject: RFR(XXXS): 8208092 ProblemList serviceability/sa/ClhsdbCDSCore.java
Message-ID: <d153e64b-5208-d026-e465-f0f85af27db6@oracle.com>

Greetings,

We have an intermittent tier1 test failure on Linux-X64 in both JDK11 and
JDK12. I'm putting it on the ProblemList:

$ hg diff
diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 19:58:43 2018 +0530
+++ b/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 12:07:59 2018 -0400
@@ -79,6 +79,7 @@

 # :hotspot_serviceability

+serviceability/sa/ClhsdbCDSCore.java                 8208092 linux-x64
 serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
 serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all

Thanks, in advance, for a single (R)eview of this trivial change.

Dan


From sgehwolf at redhat.com  Mon Jul 23 16:27:30 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Mon, 23 Jul 2018 18:27:30 +0200
Subject: RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
Message-ID: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>

Hi,

Could I please get a review of this one-liner change related to jhsdb
jstack --mixed when attaching to a running Java process? The issue arises
when threads are in native code and that native code has frame pointers
not properly preserved. In such a case the SA performs a simple frame
pointer validity check: ebp >= esp

However, the code retrieving the value for esp is incorrect inasmuch
as it's not in sync with native code with regard to the register
index:

native code => X86ThreadContext.SP
Java code   => X86ThreadContext.ESP

X86ThreadContext.ESP is never being set by the native code. Since
X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
returns null, ebp.lessThan(esp) wrongly returns false causing the
issue. This webrev fixes it by using SP as index on the Java side.
Thoughts?
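
[Editor's sketch] A minimal Java model of the index mismatch described above. The slot numbers, the Context class and the helper names are hypothetical (the real X86ThreadContext layout is not shown in this message), and the null-handling of lessThan() is an assumption modeled on the behaviour described here.

```java
public class RegisterIndexSketch {
    // Hypothetical register slots; not the real X86ThreadContext values.
    static final int FP = 0;
    static final int SP = 1;   // slot the native side fills in
    static final int ESP = 2;  // slot the Java side was reading; never filled in

    static final class Context {
        private final Long[] regs = new Long[3];

        void set(int index, long value) { regs[index] = value; }

        Long get(int index) { return regs[index]; } // null when never set
    }

    /** Assumed semantics: comparing against a missing address yields false. */
    static boolean lessThan(Long a, Long b) {
        return a != null && b != null && Long.compareUnsigned(a, b) < 0;
    }

    /** The sanity check: a frame pointer below the stack pointer is bogus. */
    static boolean looksBogus(Context ctx, int spIndex) {
        return lessThan(ctx.get(FP), ctx.get(spIndex));
    }

    static Context sampleContext() {
        Context ctx = new Context();
        ctx.set(FP, 0x1000L); // clobbered frame pointer
        ctx.set(SP, 0x7000L); // real stack pointer, written to the SP slot
        return ctx;
    }

    public static void main(String[] args) {
        Context ctx = sampleContext();
        System.out.println("check via ESP slot flags bogus frame: " + looksBogus(ctx, ESP));
        System.out.println("check via SP slot flags bogus frame:  " + looksBogus(ctx, SP));
    }
}
```

Reading the slot that was never written makes the check pass for a bad frame pointer; reading the slot the native side actually fills in lets the check reject it.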

webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
bug: https://bugs.openjdk.java.net/browse/JDK-8208091

Thanks,
Severin

From serguei.spitsyn at oracle.com  Mon Jul 23 16:44:20 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 09:44:20 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
Message-ID: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>

Hi Chris,

Would it be simpler to avoid running these tests with -Xcomp?
I guess, this would work: @requires vm.compMode != "Xcomp"
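
[Editor's sketch] For illustration, this is roughly where such a tag sits in a jtreg test description. The class RequiresSketch and its wouldRun() helper are hypothetical; the helper only mirrors the selection expression for the usual vm.compMode values and is not part of jtreg.

```java
/*
 * @test
 * @summary Hypothetical test that is not selected when the VM runs with -Xcomp
 * @requires vm.compMode != "Xcomp"
 * @run main RequiresSketch
 */
public class RequiresSketch {
    /** Mirrors the @requires expression above for a given vm.compMode value. */
    static boolean wouldRun(String compMode) {
        return !"Xcomp".equals(compMode);
    }

    public static void main(String[] args) {
        for (String mode : new String[] {"Xint", "Xmixed", "Xcomp"}) {
            System.out.println(mode + " -> selected: " + wouldRun(mode));
        }
    }
}
```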

Thanks,
Serguei




From daniel.daugherty at oracle.com  Mon Jul 23 17:17:30 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 23 Jul 2018 13:17:30 -0400
Subject: RFR(XXXS): 8208092 ProblemList
 serviceability/sa/ClhsdbCDSCore.java
In-Reply-To: <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com>
References: <d153e64b-5208-d026-e465-f0f85af27db6@oracle.com>
 <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com>
Message-ID: <b7222d8e-0820-f327-e65c-a368e431ae72@oracle.com>

I added back the alias... you accidentally deleted it...


On 7/23/18 12:34 PM, serguei.spitsyn at oracle.com wrote:
> Hi Dan,
>
> The bug number in the problem list has to be 8207832, not 8208092. :)

Thanks for the catch! I knew I should have waited until after lunch to
send out that RFR... sigh...


> Count it as reviewed if you fix it - trivial rule applies.

$ hg diff
diff -r d9b22cbe3e7a test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 19:58:43 2018 +0530
+++ b/test/hotspot/jtreg/ProblemList.txt    Mon Jul 23 13:15:50 2018 -0400
@@ -79,6 +79,7 @@

 # :hotspot_serviceability

+serviceability/sa/ClhsdbCDSCore.java                 8207832 linux-x64
 serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
 serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all


Dan




From serguei.spitsyn at oracle.com  Mon Jul 23 18:37:08 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 11:37:08 -0700
Subject: RFR(XXXS): 8208092 ProblemList
 serviceability/sa/ClhsdbCDSCore.java
In-Reply-To: <b7222d8e-0820-f327-e65c-a368e431ae72@oracle.com>
References: <d153e64b-5208-d026-e465-f0f85af27db6@oracle.com>
 <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com>
 <b7222d8e-0820-f327-e65c-a368e431ae72@oracle.com>
Message-ID: <ee273602-b691-91b2-a82a-d4ff66a12d62@oracle.com>

Looks good.

Thanks,
Serguei



From daniel.daugherty at oracle.com  Mon Jul 23 18:40:13 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 23 Jul 2018 14:40:13 -0400
Subject: RFR(XXXS): 8208092 ProblemList
 serviceability/sa/ClhsdbCDSCore.java
In-Reply-To: <ee273602-b691-91b2-a82a-d4ff66a12d62@oracle.com>
References: <d153e64b-5208-d026-e465-f0f85af27db6@oracle.com>
 <6d7b229d-0d0b-d828-4fc8-3693cc2e3c1d@oracle.com>
 <b7222d8e-0820-f327-e65c-a368e431ae72@oracle.com>
 <ee273602-b691-91b2-a82a-d4ff66a12d62@oracle.com>
Message-ID: <e7cf4240-3f3c-6ddb-ef4c-939aed3bc7cd@oracle.com>

Thanks!

Dan




From chris.plummer at oracle.com  Mon Jul 23 18:40:24 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 11:40:24 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
Message-ID: <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>

Hi Serguei,

If the fix was complicated I would agree, but it really just boils down 
to this one line change:

-            fire = -1;
+            fire = 0; // Ignore this compilation. Wait for next one.

Given that, I see no reason not to increase our test coverage by 
supporting this test during -Xcomp runs.

thanks,

Chris



From chris.plummer at oracle.com  Mon Jul 23 19:17:26 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 12:17:26 -0700
Subject: RFR(XS):8208075: Quarantine
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
Message-ID: <831a1b6d-c6e4-ea65-3cc1-c7202e75440f@oracle.com>

Hi,

Please review the following, to be pushed to JDK 11

https://bugs.openjdk.java.net/browse/JDK-8208075

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -123,7 +123,7 @@

 vmTestbase/nsk/jvmti/ClearBreakpoint/clrbrk001/TestDescription.java 8016181 generic-all
 vmTestbase/nsk/jvmti/FieldModification/fieldmod001/TestDescription.java 8016181 generic-all
-vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java 8202896 linux-x64
+vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java 8202896,8206076,8208074 generic-all
 vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted001/TestDescription.java 7013634 generic-all
 vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted003/TestDescription.java 6606767 generic-all
 vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted004/TestDescription.java 7013634,6606767 generic-all

The test was already quarantined on linux-x64 due to 8202896. However a 
few days ago it started to fail for a different reason on every run and 
every platform. 8208074 was filed for it. Also, it fails rarely due to 
timeout, and 8206076 was filed for that failure a few months ago.
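
[Editor's sketch] The ProblemList entries in the diff above follow a "test path, comma-separated bug IDs, comma-separated platforms" layout. A small Java sketch of splitting one such entry, assuming whitespace-separated fields; the class ProblemListEntrySketch is hypothetical and is not part of jtreg.

```java
import java.util.Arrays;
import java.util.List;

public class ProblemListEntrySketch {
    final String test;
    final List<String> bugIds;
    final List<String> platforms;

    ProblemListEntrySketch(String test, List<String> bugIds, List<String> platforms) {
        this.test = test;
        this.bugIds = bugIds;
        this.platforms = platforms;
    }

    /** Splits "path bugid[,bugid...] platform[,platform...]" into its parts. */
    static ProblemListEntrySketch parse(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 3) {
            throw new IllegalArgumentException("expected 3 fields: " + line);
        }
        return new ProblemListEntrySketch(
                fields[0],
                Arrays.asList(fields[1].split(",")),
                Arrays.asList(fields[2].split(",")));
    }

    public static void main(String[] args) {
        ProblemListEntrySketch e = parse(
                "vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java"
                + " 8202896,8206076,8208074 generic-all");
        System.out.println(e.test);
        System.out.println("bugs: " + e.bugIds + ", platforms: " + e.platforms);
    }
}
```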

https://bugs.openjdk.java.net/browse/JDK-8202896
https://bugs.openjdk.java.net/browse/JDK-8206076
https://bugs.openjdk.java.net/browse/JDK-8208074

thanks,

Chris


From serguei.spitsyn at oracle.com  Mon Jul 23 19:39:41 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 12:39:41 -0700
Subject: RFR(XS):8208075: Quarantine
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
In-Reply-To: <831a1b6d-c6e4-ea65-3cc1-c7202e75440f@oracle.com>
References: <831a1b6d-c6e4-ea65-3cc1-c7202e75440f@oracle.com>
Message-ID: <6e64b192-2955-c765-abf5-718fdc02f168@oracle.com>

Looks good.

Thanks,
Serguei




From hohensee at amazon.com  Mon Jul 23 21:33:28 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 23 Jul 2018 21:33:28 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
Message-ID: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>

Corrected subject line: 8196889 s/b 8196989.

From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> on behalf of "Hohensee, Paul" <hohensee at amazon.com>
Date: Friday, July 20, 2018 at 3:38 PM
To: "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>, "serviceability-dev at openjdk.java.net" <serviceability-dev at openjdk.java.net>
Subject: RFR(L): 8196889: Revamp G1 JMX MemoryPoolMXBean, GarbageCollectorMXBean, and jstat counter definitions

Please review.

Bug: https://bugs.openjdk.java.net/browse/JDK-8196989
CSR: https://bugs.openjdk.java.net/browse/JDK-8196991
Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00

This webrev is marked "L" because it's a behavioral change (CSR in draft state, may I have a review of that too please?) and because the test change fanout is large. The actual code changes are "M".

Passes the submit repo, Hotspot tier1, the JFR gc event tests and any other test set with "gc" or "serviceability" in the test directory name. I found it difficult to verify the accuracy of the reported values other than manually, since they can vary from run to run of the same program. I'd appreciate suggestions for how to go about writing accuracy tests.

I set out originally to revamp only the MXBeans, but decided it would be incomplete if I didn't include the jstat counters and the output of the GC.heap_info jcmd option. I can separate the latter two into their own RFEs, but I find it easier to understand it all in a single webrev and hope the reviewers will too.

The basic approach is to add the new memory pools and collectors, the new jstat counters, and an archive region counter that stands in for an actual archive region set. HeapRegionSets are disjoint, so initially I tried to create a first-class archive region set (on the same level as the humongous region set), but that idea foundered on the fact that there's too much code I don't fully understand that depends on archive regions being in the existing old region set. Externally (i.e., in the MXBeans and the jstat counters), however, the old region set doesn't include archive regions (unless running in legacy mode).

I used CMS's TraceCMSMemoryManagerStats class as the model for TraceConcMemoryManagerStats, which latter collects statistics on concurrent cycles. There are two STW pauses in each concurrent cycle: they are recorded separately and count as two sun.gc.collector.2 events.

The humongous and archive space committed and used values are always identical, hence they are always 100% used.

The revised output of jcmd GC.heap_info is in G1CollectedHeap::print_on().

I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing the result type of young_list_target_length() from size_t to uint, which latter is the type of the _young_list_target_length member.

I updated the copyright date in src/hotspot/share/services/memoryService.hpp to 2018, as I neglected to do so in a previous push.

Thanks,
Paul



From serguei.spitsyn at oracle.com  Tue Jul 24 00:22:27 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 23 Jul 2018 17:22:27 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
Message-ID: <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>

Hi Chris,


On 7/23/18 11:40, Chris Plummer wrote:
> Hi Serguei,
>
> If the fix was complicated I would agree, but it really just boils 
> down to this one line change:
>
> -            fire = -1;
> +            fire = 0; // Ignore this compilation. Wait for next one.

It is not obvious that this will completely fix the problem.
Is it possible that there will be no next compilation with -Xcomp?

If it is possible then it is better to explicitly exclude these tests 
for -Xcomp.
Otherwise, consider this reviewed.

>
> Given that, I see no reason not to increase our test coverage by 
> supporting this test during -Xcomp runs.

I'd agree if it is going to be stable.

Thanks,
Serguei

>
> thanks,
>
> Chris
>
> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>> Would it be simpler to avoid running these tests with -Xcomp?
>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/23/18 00:42, Chris Plummer wrote:
>>> Hello,
>>>
>>> Please review the following fix for JDK11:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>
>>> It fixes the following 3 tests:
>>>
>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>
>>> Any of which could fail when run with -Xcomp with (followed by a 
>>> bunch more errors):
>>>
>>> # ERROR: Redefinition not started. Maybe running with -Xcomp. Test 
>>> ignored.
>>>
>>> Although lately we've only seen this with redefclass030.java on macosx.
>>>
>>> These 3 tests do redefinition of a "hot" method after triggering 
>>> compilation for it. After the redef some testing is done to ensure 
>>> that the redef was done correctly, but the issue these tests have 
>>> actually comes before any redef is done.
>>>
>>> The test attempts to trigger compilation by calling a hot method a 
>>> lot. The agent detects compilation by receiving a CompiledMethodLoad 
>>> event. There was an issue discovered long ago that when -Xcomp is 
>>> used, the compilation happens before the "hot" method is ever 
>>> called. Then the redef would happen before compilation, and this 
>>> somehow messed up the test (I'm not exactly sure how). The fix was 
>>> to basically abandon the redef attempt when this problem is 
>>> detected, and then supposedly just let the test run to completion 
>>> (skipping the actual testing of the redef). After this change, if 
>>> you ran with -Xcomp it would pass, but if you looked in the log you 
>>> would see:
>>>
>>> # ERROR: Redefinition not started. Maybe running with -Xcomp. Test 
>>> ignored.
>>>
>>> However, there was a bug in the logic meant to make the test run to 
>>> completion, which also caused the above message not to appear. Instead 
>>> the test would fail with:
>>>
>>> # ERROR: Redefinition not completed.
>>>
>>> Followed by a bunch more error messages during the part of the test 
>>> that checks if the redef was done properly.
>>>
>>> If the CompiledMethodLoad event comes in before the hot method is 
>>> ever called (which it does with -Xcomp), the test sets fire = -1. If 
>>> the hot method was called, it is set to 1. The setting of fire = -1 
>>> was added to fix the -Xcomp problem mentioned above. The jvmti agent 
>>> does the following:
>>>
>>>     do {
>>>         THREAD_sleep(1);
>>>         /* wait for compilation to happen */
>>>     } while(fire == 0);
>>>
>>>     if (fire == 1) {
>>>         /* do the redef here */
>>>         NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
>>>     } else {
>>>         // fire == -1
>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>     }
>>>
>>> The agent then syncs with the debuggee, waiting for it to finish up. 
>>> What the test expects is that waitForRedefinitionStarted() in the 
>>> debuggee will time out after two seconds while waiting for fire == 1 
>>> (which it thinks will always happen because fire was set to -1). 
>>> When it times out, the test does appear to exit properly, but 
>>> with the following in the log, which is intended:
>>>
>>> # ERROR: Redefinition not started. Maybe running with -Xcomp. Test 
>>> ignored.
>>>
>>> However, sometimes before waitForRedefinitionStarted() times out, 
>>> the hot method is called enough times to trigger compilation. So 
>>> another CompiledMethodLoad event arrives, and this time fire is set 
>>> to 1. Because of this, waitForRedefinitionStarted() doesn't time out 
>>> and returns with an indication that the redef has started. After 
>>> this waitForRedefinitionCompleted() is executed. It waits for the 
>>> redef to complete, but it never does since the agent decided not to 
>>> do the redef when it saw fire == -1. So 
>>> waitForRedefinitionCompleted() times out after 10 seconds and the 
>>> test fails, with:
>>>
>>> # ERROR: Redefinition not completed.
>>>
>>> Actually the above error is not really what causes the failure. When 
>>> the above error is detected, no error status is set and the test 
>>> continues as if the redef had been done. So then the logic that 
>>> detects if the redef was done properly ends up failing, and that's 
>>> where the test actually indicates a failure status. You see a whole 
>>> bunch of other errors in the log because of all the checks that fail.
>>>
>>> The fix is to not abandon the test when the first CompiledMethodLoad 
>>> event is before the hot method was called. Instead just leave 
>>> fire==0 and wait for the next CompiledMethodLoad event that is 
>>> triggered after the method is called enough times to be recompiled. 
>>> I'm not sure why it was not originally done this way. Possibly the 
>>> recompilation did not happen reliably, but I have not run into this 
>>> problem. The other changes in redefclass030.c are just cleaning up 
>>> debug tracing.
>>>
>>> Another fix was to properly set the error status when 
>>> waitForRedefinitionStarted() or waitForRedefinitionCompleted() times 
>>> out, although this is just a safety net and I didn't run into any 
>>> cases where this happened after fixing the CompiledMethodLoad event 
>>> handling. So in general the changes in redefclass030.java were not 
>>> needed, but provide better error handling.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>
>
>


From chris.plummer at oracle.com  Tue Jul 24 03:19:33 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 23 Jul 2018 20:19:33 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
Message-ID: <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>

On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
>
> On 7/23/18 11:40, Chris Plummer wrote:
>> Hi Serguei,
>>
>> If the fix was complicated I would agree, but it really just boils 
>> down to this one line change:
>>
>> -            fire = -1;
>> +            fire = 0; // Ignore this compilation. Wait for next one.
>
> It is not obvious that this will completely fix the problem.
> Is it possible that there will be no next compilation with -Xcomp?
It's only one method that we check for. I don't see why there would be 
a 2nd -Xcomp compilation for it, but even if there was, the test will 
ignore it just like the first one. It will ignore compilations of the 
method until the flag has been set indicating the method has been 
executed once. If for some reason the method is never compiled after 
being executed once, the test will give up waiting for it (I think after 
30 seconds) and produce an error.
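
To make the intended behavior concrete, here is a rough Java model of the flag handling described above. The real agent is C code in redefclass030.c; the class FireFlagSketch, its method names, and the timeout parameter are all hypothetical and only illustrate the logic: compilations seen before the hot method has run are ignored, and the waiter gives up after a timeout.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the agent's "fire" flag; not the actual test code.
public class FireFlagSketch {
    // 0 = keep waiting, 1 = compiled after the hot method has been executed.
    private final AtomicInteger fire = new AtomicInteger(0);
    private volatile boolean hotMethodExecuted = false;

    // Called by the test once the hot method has run at least once.
    void hotMethodCalled() {
        hotMethodExecuted = true;
    }

    // Stand-in for the CompiledMethodLoad callback.
    void onCompiledMethodLoad() {
        if (hotMethodExecuted) {
            fire.set(1);
        }
        // Otherwise leave fire at 0 (the old code set it to -1 here)
        // and wait for the next compilation.
    }

    // Stand-in for the agent's wait loop. Returns true if a usable
    // compilation was seen, false if the timeout expired first.
    boolean waitForCompilation(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (fire.get() == 0) {
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            Thread.sleep(1);
        }
        return true;
    }
}
```

With this shape, an early -Xcomp compilation cannot put the agent into a state where it skips the redefinition while the debuggee still expects one.
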
>
> If it is possible then it is better to explicitly exclude these tests 
> for -Xcomp.
> Otherwise, consider this reviewed.
>
>>
>> Given that, I see no reason not to increase our test coverage by 
>> supporting this test during -Xcomp runs.
>
> I'd agree if it is going to be stable.
>
If problems turn up in the future, we can reconsider disabling it.

thanks,

Chris
> Thanks,
> Serguei
>
>>
>> thanks,
>>
>> Chris
>>
>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>>
>>> Would it be simpler to avoid running these tests with -Xcomp?
>>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/23/18 00:42, Chris Plummer wrote:
>>>> Hello,
>>>>
>>>> Please review the following fix for JDK11:
>>>>
>>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>>
>>>> It fixes the following 3 tests:
>>>>
>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>>
>>>> Any of which could fail when run with -Xcomp with (followed by a 
>>>> bunch more errors):
>>>>
>>>> # ERROR: Redefinition not started. Maybe running with -Xcomp. Test 
>>>> ignored.
>>>>
>>>> Although lately we've only seen this with redefclass030.java on 
>>>> macosx.
>>>>
>>>> These 3 tests do redefinition of a "hot" method after triggering 
>>>> compilation for it. After the redef some testing is done to ensure 
>>>> that the redef was done correctly, but the issue these tests have 
>>>> actually comes before any redef is done.
>>>>
>>>> The test attempts to trigger compilation by calling a hot method a 
>>>> lot. The agent detects compilation by receiving a 
>>>> CompiledMethodLoad event. There was an issue discovered long ago 
>>>> that when -Xcomp is used, the compilation happens before the "hot" 
>>>> method is ever called. Then the redef would happen before 
>>>> compilation, and this somehow messed up the test (I'm not exactly 
>>>> sure how). The fix was to basically abandon the redef attempt when 
>>>> this problem is detected, and then supposedly just let the test run 
>>>> to completion (skipping the actual testing of the redef). After 
>>>> this change, if you ran with -Xcomp it would pass, but if you 
>>>> looked in the log you would see:
>>>>
>>>> # ERROR: Redefinition not started. Maybe running with -Xcomp. Test 
>>>> ignored.
>>>>
>>>> However, there was a bug in the logic meant to make the test run to 
>>>> completion, which also caused the above message not to appear. 
>>>> Instead the test would fail with:
>>>>
>>>> # ERROR: Redefinition not completed.
>>>>
>>>> Followed by a bunch more error messages during the part of the test 
>>>> that checks if the redef was done properly.
>>>>
>>>> If the CompiledMethodLoad event comes in before the hot method is 
>>>> ever called (which it does with -Xcomp), the test sets fire = -1. 
>>>> If the hot method was called, it is set to 1. The setting of fire 
>>>> = -1 was added to fix the -Xcomp problem mentioned above. The jvmti 
>>>> agent does the following:
>>>>
>>>>     do {
>>>>         THREAD_sleep(1);
>>>>         /* wait for compilation to happen */
>>>>     } while(fire == 0);
>>>>
>>>>     if (fire == 1) {
>>>>         /* do the redef here */
>>>>         NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
>>>>     } else {
>>>>         // fire == -1
>>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>>     }
>>>>
>>>> The agent then syncs with the debuggee, waiting for it to finish up. 
>>>> What the test expects is that waitForRedefinitionStarted() in the 
>>>> debuggee will time out after two seconds while waiting for fire == 
>>>> 1 (which it thinks will always happen because fire was set to 
>>>> -1). When it times out, the test does appear to exit properly, 
>>>> but with the following in the log, which is intended:
>>>>
>>>> # ERROR: Redefinition not started. Maybe running with -Xcomp. Test 
>>>> ignored.
>>>>
>>>> However, sometimes before waitForRedefinitionStarted() times out, 
>>>> the hot method is called enough times to trigger compilation. So 
>>>> another CompiledMethodLoad event arrives, and this time fire is set 
>>>> to 1. Because of this, waitForRedefinitionStarted() doesn't time 
>>>> out and returns with an indication that the redef has started. 
>>>> After this waitForRedefinitionCompleted() is executed. It waits for 
>>>> the redef to complete, but it never does since the agent decided 
>>>> not to do the redef when it saw fire == -1. So 
>>>> waitForRedefinitionCompleted() times out after 10 seconds and the 
>>>> test fails, with:
>>>>
>>>> # ERROR: Redefinition not completed.
>>>>
>>>> Actually the above error is not really what causes the failure. 
>>>> When the above error is detected, no error status is set and the 
>>>> test continues as if the redef had been done. So then the logic 
>>>> that detects if the redef was done properly ends up failing, and 
>>>> that's where the test actually indicates a failure status. You see 
>>>> a whole bunch of other errors in the log because of all the checks 
>>>> that fail.
>>>>
>>>> The fix is to not abandon the test when the first 
>>>> CompiledMethodLoad event is before the hot method was called. 
>>>> Instead just leave fire==0 and wait for the next CompiledMethodLoad 
>>>> event that is triggered after the method is called enough times to 
>>>> be recompiled. I'm not sure why it was not originally done this 
>>>> way. Possibly the recompilation did not happen reliably, but I have 
>>>> not run into this problem. The other changes in redefclass030.c are 
>>>> just cleaning up debug tracing.
>>>>
>>>> Another fix was to properly set the error status when 
>>>> waitForRedefinitionStarted() or waitForRedefinitionCompleted() 
>>>> times out, although this is just a safety net and I didn't run into 
>>>> any cases where this happened after fixing the CompiledMethodLoad 
>>>> event handling. So in general the changes in redefclass030.java 
>>>> were not needed, but provide better error handling.
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>
>>
>>
>


From serguei.spitsyn at oracle.com  Tue Jul 24 07:25:16 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 00:25:16 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
Message-ID: <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/67e5f3d9/attachment-0001.html>

From ralf.schmelter at sap.com  Tue Jul 24 13:32:54 2018
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Tue, 24 Jul 2018 13:32:54 +0000
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
 <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>
 <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>
Message-ID: <92dcce7000a94cf89ae2169cb1f843f2@sap.com>

Hi all,

Here is the updated webrev with the fixed copyright: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/

Best regards,
Ralf

-----Original Message-----
From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com] 
Sent: Freitag, 20. Juli 2018 23:04
To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe, Thomas <thomas.stuefe at sap.com>
Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior

On 7/20/18 13:44, Chris Plummer wrote:
> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>> Hi Ralf,
>>
>>
>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>> Hi Serguei,
>>>
>>> I've updated the webrev: 
>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>
>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not 
>> 2008.
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html 
>>
>>
>>   72             if (newDepth == -1_000) {
>>   73                 // Pop some frames so there is room on the stack for the
>>   74                 // call (including println()).
>>   75                 notifyRecursionEnded();
>>   76             }
>>
>>   I have a concern about the potential issue mentioned in the comment above.
>>   Should a StackOverflowError be expected here?
>>
>>   79         } catch (StackOverflowError e) {
>>   80             // Use negative depth to indicate the recursion has ended.
>>   81             return -1;
>>   82         }
>>
>>   What is going to happen if the StackOverflowError was really caught 
>> above?
> The SOE is really caught in the above code. It returns -1, and starts 
> the unwinding of the stack. After 1000 frames have been popped via 
> returns, notifyRecursionEnded() will be called. The pops are so 
> notifyRecursionEnded() can be called without worry of another SOE.
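
The unwinding described here can be sketched as follows. This is not the code from Frames2Test.java; the class RecursionUnwindSketch, the method recurseToMax, and the way the negative depth is decremented on each return are assumptions made to reproduce the behavior Chris describes: the StackOverflowError is caught at the deepest frame, -1 is returned, and the notification is only made once 1000 frames have been popped.

```java
// Hypothetical reconstruction for illustration; names and structure are assumed.
public class RecursionUnwindSketch {
    static final int FRAMES_TO_POP = 1_000;
    static boolean notified = false;

    // Stand-in for the test's notifyRecursionEnded(); needs some free stack.
    static void notifyRecursionEnded() {
        notified = true;
    }

    // Recurses until the stack overflows, then unwinds. A negative return
    // value -n means "the recursion has ended and n frames have returned".
    static int recurseToMax(int depth) {
        try {
            int newDepth = recurseToMax(depth + 1);
            if (newDepth == -FRAMES_TO_POP) {
                // Enough frames have been popped that a call is safe again.
                notifyRecursionEnded();
            }
            return newDepth - 1;
        } catch (StackOverflowError e) {
            // Use a negative depth to indicate the recursion has ended.
            return -1;
        }
    }

    public static void main(String[] args) {
        int result = recurseToMax(0);
        System.out.println("frames unwound: " + (-result) + ", notified: " + notified);
    }
}
```

The frames between the overflow and the notification do nothing but compare and return, so they cannot trigger a second StackOverflowError.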

Got it, thanks Chris.

So, I'm Okay with the fix assuming the copyright year is fixed.

Thanks,
Serguei

From serguei.spitsyn at oracle.com  Tue Jul 24 16:01:34 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 09:01:34 -0700
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
 <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>
 <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>
 <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
Message-ID: <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>

Hi Ralf,

I think you can consider it reviewed.
Sorry, I was not clear that no new webrev is needed.

Do you need a sponsor for the push?

Thanks,
Serguei


On 7/24/18 06:32, Schmelter, Ralf wrote:
> Hi all,
>
> Here is the updated webrev with the fixed copyright: http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/
>
> Best regards,
> Ralf
>
> -----Original Message-----
> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
> Sent: Freitag, 20. Juli 2018 23:04
> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe, Thomas <thomas.stuefe at sap.com>
> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to prevent quadratic runtime behavior
>
> On 7/20/18 13:44, Chris Plummer wrote:
>> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>>> Hi Ralf,
>>>
>>>
>>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>>> Hi Serguei,
>>>>
>>>> I've updated the webrev:
>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not
>>> 2008.
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html
>>>
>>>
>>>   72             if (newDepth == -1_000) {
>>>   73                 // Pop some frames so there is room on the stack for the
>>>   74                 // call (including println()).
>>>   75                 notifyRecursionEnded();
>>>   76             }
>>>
>>>   I have a concern about the potential issue mentioned in the comment above.
>>>   Should a StackOverflowError be expected here?
>>>
>>>   79         } catch (StackOverflowError e) {
>>>   80             // Use negative depth to indicate the recursion has ended.
>>>   81             return -1;
>>>   82         }
>>>
>>>   What is going to happen if the StackOverflowError was really caught
>>> above?
>> The SOE is really caught in the above code. It returns -1, and starts
>> the unwinding of the stack. After 1000 frames have been popped via
>> returns, notifyRecursionEnded() will be called. The pops are so
>> notifyRecursionEnded() can be called without worry of another SOE.
> Got it, thanks Chris.
>
> So, I'm Okay with the fix assuming the copyright year is fixed.
>
> Thanks,
> Serguei


From chris.plummer at oracle.com  Tue Jul 24 16:27:04 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 09:27:04 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
Message-ID: <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/a1738c7a/attachment-0001.html>

From serguei.spitsyn at oracle.com  Tue Jul 24 19:18:16 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 12:18:16 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
Message-ID: <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/36d511e7/attachment.html>

From gary.adams at oracle.com  Tue Jul 24 19:28:28 2018
From: gary.adams at oracle.com (Gary Adams)
Date: Tue, 24 Jul 2018 15:28:28 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
 <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
 <5B507F2C.4080503@oracle.com>
 <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com>
Message-ID: <5B577DDC.3000500@oracle.com>

Here's a quick prototype to add a variable to the debuggee.
The debugger sets it at the end of each completed test case.

The debuggee can then check for the value change to delay
hitting the breakpoint which interfered with suspend count checks.

Would need to add a bit more error and timeout checking to
complete the fix. Should also check if the other resume008 test cases
need similar synchronization. Could possibly migrate the code up to
TestDebuggerType1 if other tests also needed this generic capability.
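
As a sketch of what the missing timeout checking on the debuggee side could look like (the class TestCaseGate and the method waitForTestCase are hypothetical names, and in the real test the field would be written by the debugger through JDI rather than by another Java thread):

```java
// Hypothetical helper for illustration; not part of the prototype diff below.
public class TestCaseGate {
    // In the real debuggee this is the static field the debugger updates.
    static volatile int testCase = -1;

    // Polls until testCase reaches at least 'expected' or the timeout expires.
    // Returns true on success, false on timeout or interruption.
    static boolean waitForTestCase(int expected, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (testCase < expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // the debugger never reported completion
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The debuggee could then fail the test with a clear message when the wait returns false, instead of looping forever.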


diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
--- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
+++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
@@ -63,6 +63,9 @@
   *   to be resulting in the event.
   * - Upon getting new event, the debugger
   *   performs the check corresponding to the event.
+ * - The debugger informs the debuggee when it completes
+ *   each test case, so it will wait before hitting
+ *   communication breakpoints.
   */

  public class resume008 extends TestDebuggerType1 {
@@ -234,6 +237,7 @@

                       default: throw new Failure("** default case 1 **");
                  }
+                informDebuggeeTestCase(i);
              }

              display("......--> vm.resume()");
@@ -255,4 +259,25 @@
          }
      }

+    /**
+     * Inform debuggee which thread test the debugger has completed.
+     * Used for synchronization, so the debuggee does not move too quickly.
+     * @param testCase index of just completed test
+     */
+    void informDebuggeeTestCase(int testCase) {
+        if (!EventHandler.isDisconnected() && debuggeeClass != null) {
+            try {
+                ((ClassType)debuggeeClass)
+                    .setValue(debuggeeClass.fieldByName("testCase"),
+                              vm.mirrorOf(testCase));
+            } catch (InvalidTypeException ite) {
+                // ignored
+            } catch (ClassNotLoadedException cnle) {
+                // ignored
+            } catch (VMDisconnectedException e) {
+                // ignored
  }
+        }
+    }
+
+}


diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
--- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
+++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
@@ -62,6 +62,7 @@

      static int exitCode = PASSED;

+    static int testCase = -1;
      static int instruction = 1;
      static int end         = 0;
                                     //    static int quit        = 0;
@@ -104,6 +105,15 @@
                              threadStart(thread0);

                              thread1 = new Threadresume008a("thread1");
+                            // Wait for debugger to complete the first test case
+                            // before advancing to the next breakpoint
+                            while (testCase < 0) {
+                                try {
+                                    Thread.sleep(100);
+                                } catch (InterruptedException e) {
+                                    // ignored
+                                }
+                            }
                              methodForCommunication();
                              break;


On 7/20/18, 2:37 PM, Chris Plummer wrote:
> Hi Gary,
>
> The test fails if the breakpoint event comes in after the test 
> captures the initial thread suspend counts and before the test 
> captures the 2nd suspend counts.
>
> debugger>         getting : Map<String, Integer> suspendsCounts1
> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal 
> Dispatcher=1, Finalizer=1}
> debugger>         eventSet.resume;
> debugger>         getting : Map<String, Integer> suspendsCounts2
> EventHandler> Received event set with policy = SUSPEND_ALL
> EventHandler> Event: BreakpointEventImpl req breakpoint request 
> nsk.jdi.EventSet.resume.resume008a:60 (enabled)
> debugger> Received communication breakpoint event.
> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal 
> Dispatcher=2, Finalizer=2}
>
> So we end up with some threads starting with 1 suspend and ending with 
> 2 (not clear to me why main is still at 1).
>
> It will pass if the breakpoint comes in after it does both of the suspend 
> count checks, as you have shown with the sleep(100) solution. Output 
> looks like this:
>
> debugger>        got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest1
> ...
> debugger> ......--> vm.suspend();
> debugger>         getting : Map<String, Integer> suspendsCounts1
> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
> Signal Dispatcher=1, Finalizer=1}
> debugger>         eventSet.resume;
> debugger>         getting : Map<String, Integer> suspendsCounts2
> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
> Signal Dispatcher=1, Finalizer=1}
> ...
> debugger> Received communication breakpoint event.
>
> I've also shown that it passes if the breakpoint always comes in 
> before capturing the initial suspend counts. I added a sleep on the 
> debugger side right after eventHandler.waitForRequestedEventSet() 
> returns. Output looks like:
>
> debugger> Received communication breakpoint event.
> debugger>        got new ThreadStartEvent with propety 'number' == 
> ThreadStartRequest1
> ...
> debugger> ......--> vm.suspend();
> debugger>         getting : Map<String, Integer> suspendsCounts1
> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
> Signal Dispatcher=2, Finalizer=2}
> debugger>         eventSet.resume;
> debugger>         getting : Map<String, Integer> suspendsCounts2
> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
> Signal Dispatcher=2, Finalizer=2}
>
> I think we should add synchronization to force one of these two 
> outcomes. For the first, you would need to make the debugger modify 
> some variable that the debuggee is watching (sitting in a loop waiting 
> for it to change). For the second, you can rely on the existing 
> methodForCommunication() approach. You just need to restructure the 
> debugger a bit. I had started down this path late Wednesday, but got 
> sidetracked by a few other things. I can look into it some more if 
> you'd like.
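
For the first option, the debuggee side could look roughly like the sketch below. The class DebuggeeGate, the field debuggerReady and the timeout handling are all assumptions for illustration; nothing like this exists in resume008a today. In a real debugger the field would presumably be flipped through JDI (for example ClassType.setValue on the static field), whereas here an ordinary thread stands in for the debugger.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical debuggee-side gate; the debugger is expected to set the flag. */
final class DebuggeeGate {
    // volatile so a write from another thread (or from the debugger) is seen by the loop.
    static volatile boolean debuggerReady = false;

    /**
     * Polls until debuggerReady becomes true or the timeout elapses.
     * Returns true if the flag was observed, false on timeout or interrupt.
     */
    static boolean awaitDebugger(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!debuggerReady) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

The debuggee would call awaitDebugger() before hitting methodForCommunication(), so the breakpoint cannot arrive until the debugger has taken both snapshots and released the gate.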
>
> thanks,
>
> Chris
>
> On 7/19/18 5:08 AM, Gary Adams wrote:
>> In the successful run below, the sequence "the first acquire thread 
>> suspend counts, resume, and the second acquire thread suspend counts" 
>> is not interrupted by the breakpoint event.
>>
>> Note that in the failed thread0 case the test thread finishes rapidly.
>> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0
>> *[2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0*
>>
>> and in the successful test run, the thread0 run method exits after 
>> thread1 has started.
>>
>> debugger> :::::: case: # 1
>> debugger> ......waiting for new ThreadStartEvent : 1
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 616bc3ae
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae
>> EventHandler> waitForRequestedEventSet: vm.resume called
>> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
>> *debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread0*
>>
>>
>> Here's a recent mach5 failed log:
>> [2018-01-22T20:33:45.65] # [2018-01-22T20:33:45.65] export 
>> TEST_CLEANUP [2018-01-22T20:33:45.65] export SHELL 
>> [2018-01-22T20:33:45.65] export DISPLAY [2018-01-22T20:33:45.65] 
>> export LIBJSIG_PATH [2018-01-22T20:33:45.65] export TESTBASE 
>> [2018-01-22T20:33:45.65] export JAVA_OPTS [2018-01-22T20:33:45.65] 
>> export RAS_OPTIONS [2018-01-22T20:33:45.65] export HOME 
>> [2018-01-22T20:33:45.65] export LD_LIBRARY_PATH 
>> [2018-01-22T20:33:45.65] export CLASSPATH [2018-01-22T20:33:45.65] 
>> export TEMP [2018-01-22T20:33:45.65] export TESTED_JAVA_HOME 
>> [2018-01-22T20:33:45.65] export BASH_ENV [2018-01-22T20:33:45.65] 
>> export PATH [2018-01-22T20:33:45.65] TEST_DEST_DIR="resume008" 
>> [2018-01-22T20:33:45.65] # Actual: TEST_DEST_DIR=resume008 
>> [2018-01-22T20:33:45.65] TESTNAME="${test_case_name}" 
>> [2018-01-22T20:33:45.65] # Actual: TESTNAME=resume008 
>> [2018-01-22T20:33:45.65] 
>> testName="nsk/jdi/EventSet/resume//resume008" 
>> [2018-01-22T20:33:45.65] # Actual: 
>> testName=nsk/jdi/EventSet/resume//resume008 [2018-01-22T20:33:45.65] 
>> TESTDIR="${test_work_dir}" [2018-01-22T20:33:45.65] # Actual: 
>> TESTDIR=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008 
>> [2018-01-22T20:33:45.65] testWorkDir="${test_work_dir}/" 
>> [2018-01-22T20:33:45.65] # Actual: 
>> testWorkDir=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/ 
>> [2018-01-22T20:33:45.65] export testWorkDir [2018-01-22T20:33:45.65] 
>> tlogOutFile="${test_work_dir}/${test_name}.tlog" 
>> [2018-01-22T20:33:45.65] # Actual: 
>> tlogOutFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.tlog 
>> [2018-01-22T20:33:45.65] 
>> testErrFile="${test_work_dir}/${test_name}.err" 
>> [2018-01-22T20:33:45.65] # Actual: 
>> testErrFile=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008/resume008.err 
>> [2018-01-22T20:33:45.65] EXECUTE_CLASS="${test_name}" 
>> [2018-01-22T20:33:45.66] # Actual: EXECUTE_CLASS=resume008 
>> [2018-01-22T20:33:45.66] 
>> NSK_STRESS_METASPACE_OPTS="-XX:MaxMetaspaceSize=128m 
>> -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m 
>> -Xlog:gc(ASTERISK_SUBST),gc+heap=trace" [2018-01-22T20:33:45.66] # 
>> Actual: NSK_STRESS_METASPACE_OPTS=-XX:MaxMetaspaceSize=128m 
>> -XX:CompressedClassSpaceSize=64m -XX:MaxHeapSize=512m 
>> -Xlog:gc*,gc+heap=trace [2018-01-22T20:33:45.66] export 
>> NSK_STRESS_METASPACE_OPTS [2018-01-22T20:33:45.66] 
>> EXECUTE_CLASS="nsk.jdi.EventSet.resume.resume008" 
>> [2018-01-22T20:33:45.66] # Actual: 
>> EXECUTE_CLASS=nsk.jdi.EventSet.resume.resume008 
>> [2018-01-22T20:33:45.66] TEST_ARGS="${JDI_TEST_KEYS} 
>> -debugee.vmkeys=${JDI_DEBUGEE_VM_KEYS}" [2018-01-22T20:33:45.66] # 
>> Actual: TEST_ARGS=-verbose -arch=linux-amd64 -waittime=5 
>> -debugee.vmkind=java -transport.address=dynamic 
>> -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:45.66] 
>> JAVA="${TESTED_JAVA_HOME}/bin/${DEBUGGER_KIND_OF_JAVA}" 
>> [2018-01-22T20:33:45.66] # Actual: 
>> JAVA=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java 
>> [2018-01-22T20:33:45.66] JAVA_OPTS="${DEBUGGER_JAVA_OPTS}" 
>> [2018-01-22T20:33:45.66] # Actual: JAVA_OPTS= 
>> [2018-01-22T20:33:45.66] APPLICATION_TIMEOUT="${TIMEOUT}" 
>> [2018-01-22T20:33:45.66] # Actual: APPLICATION_TIMEOUT=30 
>> [2018-01-22T20:33:45.66] 
>> CLASSPATH="${test_work_dir}${PS}${CLASSPATH}" 
>> [2018-01-22T20:33:45.66] # Actual: 
>> CLASSPATH=/scratch/opt/mach5/mesos/work_dir/slaves/450fa0f5-8733-43f8-b866-79fe3e86d200-S1/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/73521160-5cd7-4903-8c37-ced4a457889d/runs/43aad3bc-5f94-43a8-85b4-40524a4342b6/testoutput/tonga/mach5-one.Linux.amd64/resume008:/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.test/hotspot/closed/tonga/bin/classes: 
>> [2018-01-22T20:33:45.66] export CLASSPATH [2018-01-22T20:33:45.66] 
>> ${JAVA} ${JAVA_OPTS} ${EXECUTE_CLASS} ${TEST_ARGS} 
>> [2018-01-22T20:33:45.66] # Actual: 
>> /scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10/bin/java 
>> nsk.jdi.EventSet.resume.resume008 -verbose -arch=linux-amd64 
>> -waittime=5 -debugee.vmkind=java -transport.address=dynamic 
>> -debugee.vmkeys=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.01] 
>> binder> VirtualMachineManager: version 9.0 [2018-01-22T20:33:46.05] 
>> binder> Finding connector: default [2018-01-22T20:33:46.05] binder> 
>> LaunchingConnector: [2018-01-22T20:33:46.06] binder> name: 
>> com.sun.jdi.CommandLineLaunch [2018-01-22T20:33:46.06] binder> 
>> description: Launches target using Sun Java VM command line and 
>> attaches to it [2018-01-22T20:33:46.06] binder> transport: 
>> com.sun.tools.jdi.SunCommandLineLauncher$2 at 457e2f02 
>> [2018-01-22T20:33:46.19] binder> Connector arguments: 
>> [2018-01-22T20:33:46.19] binder> 
>> home=/scratch/opt/mach5/mesos/work_dir/jib-master/install/2018-01-22-1955354.leonid.mesnik.hs/linux-x64.jdk/jdk-10 
>> [2018-01-22T20:33:46.19] binder> vmexec=java [2018-01-22T20:33:46.19] 
>> binder> options=-XX:MaxRAMPercentage=12.5 [2018-01-22T20:33:46.20] 
>> binder> main=nsk.jdi.EventSet.resume.resume008a "-verbose" 
>> "-arch=linux-amd64" "-waittime=5" "-debugee.vmkind=java" 
>> "-transport.address=dynamic" 
>> "-debugee.vmkeys=-XX:MaxRAMPercentage=12.5" "-pipe.port=28038" 
>> [2018-01-22T20:33:46.20] binder> quote=" [2018-01-22T20:33:46.20] 
>> binder> suspend=true [2018-01-22T20:33:46.20] binder> Launching 
>> debugee [2018-01-22T20:33:46.56] binder> Waiting for VM initialized 
>> [2018-01-22T20:33:46.60] Initial VMStartEvent received: VMStartEvent 
>> in thread main [2018-01-22T20:33:46.61] EventHandler> Adding listener 
>> nsk.share.jdi.EventHandler$1 at 1e7c7811 [2018-01-22T20:33:46.61] 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 1a3869f4 
>> [2018-01-22T20:33:46.61] EventHandler> Adding listener 
>> nsk.share.jdi.EventHandler$3 at 77f99a05 [2018-01-22T20:33:46.61] 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 
>> [2018-01-22T20:33:46.61] EventHandler> Adding listener 
>> nsk.share.jdi.EventHandler$5 at 4d3167f4 [2018-01-22T20:33:46.62] 
>> EventHandler> waitForRequestedEvent: enabling remove of listener 
>> nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.62] 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 4eb7f003 
>> [2018-01-22T20:33:46.62] EventHandler> waitForRequestedEvent: 
>> vm.resume called [2018-01-22T20:33:46.67] EventHandler> Received 
>> event set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.68] 
>> EventHandler> Event: ClassPrepareEventImpl req class prepare request 
>> (enabled) [2018-01-22T20:33:46.69] EventHandler> 
>> waitForRequestedEvent: Received event(ClassPrepareEvent in thread 
>> main) for request(class prepare request (enabled)) 
>> [2018-01-22T20:33:46.69] EventHandler> Removing listener 
>> nsk.share.jdi.EventHandler$6 at 4eb7f003 [2018-01-22T20:33:46.69] 
>> debugger> Received ClassPrepareEvent for debuggee class: 
>> nsk.jdi.EventSet.resume.resume008a [2018-01-22T20:33:46.71] binder> 
>> Breakpoint set: [2018-01-22T20:33:46.71] breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:60 (disabled) 
>> [2018-01-22T20:33:46.71] EventHandler> Adding listener 
>> nsk.share.jdi.TestDebuggerType1$1 at 43738a82 [2018-01-22T20:33:46.71] 
>> debugger> TESTING BEGINS [2018-01-22T20:33:46.71] debugger> RESUME 
>> DEBUGGEE VM [2018-01-22T20:33:46.72] debugger> 
>> shouldRunAfterBreakpoint: entered [2018-01-22T20:33:46.72] debugger> 
>> shouldRunAfterBreakpoint: waiting for breakpoint event during 1 sec. 
>> [2018-01-22T20:33:46.84] EventHandler> Received event set with policy 
>> = SUSPEND_ALL [2018-01-22T20:33:46.84] EventHandler> Event: 
>> BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
>> [2018-01-22T20:33:46.84] debugger> Received communication breakpoint 
>> event. [2018-01-22T20:33:46.84] debugger> shouldRunAfterBreakpoint: 
>> received breakpoint event. [2018-01-22T20:33:46.84] debugee.stderr> 
>> **> debuggee: debuggee started! [2018-01-22T20:33:46.85] debugger> 
>> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.85] 
>> debugger> :::::: case: # 0 [2018-01-22T20:33:46.85] debugger> 
>> ......waiting for new ThreadStartEvent : 0 [2018-01-22T20:33:46.85] 
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 6ec8211c [2018-01-22T20:33:46.85] 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 6ec8211c 
>> [2018-01-22T20:33:46.85] EventHandler> waitForRequestedEventSet: vm.resume called
>> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': enter :: threadName == thread0
>> [2018-01-22T20:33:46.86] debugee.stderr> **> debuggee: 'run': exit :: threadName == thread0
>> [2018-01-22T20:33:46.86] EventHandler> Received event set with policy = SUSPEND_NONE
>> [2018-01-22T20:33:46.86] EventHandler> waitForRequestedEventSet: Received event set for request: thread start request (enabled)
>> [2018-01-22T20:33:46.86] EventHandler> Event: ThreadStartEventImpl req thread start request (enabled)
>> [2018-01-22T20:33:46.86] EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 6ec8211c
>> [2018-01-22T20:33:46.86] debugger> got new ThreadStartEvent with propety 'number' == ThreadStartRequest1
>> [2018-01-22T20:33:46.86] debugger> ......checking up on EventSet.resume()
>> [2018-01-22T20:33:46.86] debugger> ......--> vm.suspend();
>> [2018-01-22T20:33:46.87] debugger> getting : Map<String, Integer> suspendsCounts1
>> [2018-01-22T20:33:46.87] debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal Dispatcher=1, Finalizer=1}
>> [2018-01-22T20:33:46.87] debugger> eventSet.resume;
>> [2018-01-22T20:33:46.87] debugger> getting : Map<String, Integer> suspendsCounts2
>> [2018-01-22T20:33:46.87] EventHandler> Received event set with policy = SUSPEND_ALL
>> [2018-01-22T20:33:46.87] EventHandler> Event: BreakpointEventImpl req breakpoint request nsk.jdi.EventSet.resume.resume008a:60 (enabled)
>> [2018-01-22T20:33:46.87] debugger> Received communication breakpoint event.
>> [2018-01-22T20:33:46.87] debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal Dispatcher=2, Finalizer=2}
>> [2018-01-22T20:33:46.87] debugger> getting : int policy = eventSet.suspendPolicy();
>> [2018-01-22T20:33:46.87] debugger> case SUSPEND_NONE
>> [2018-01-22T20:33:46.87] debugger> checking Reference Handler
>> [2018-01-22T20:33:46.87] # ERROR: debugger> ERROR: suspendCounts don't match for : Reference Handler
>> [2018-01-22T20:33:46.88] The following stacktrace is for Aurora. Used to create a RULE:
>> [2018-01-22T20:33:46.88] nsk.share.TestFailure: debugger> ERROR: suspendCounts don't match for : Reference Handler
>> [2018-01-22T20:33:46.88] at nsk.share.Log.logExceptionForAurora(Log.java:411)
>> [2018-01-22T20:33:46.88] at nsk.share.Log.complain(Log.java:380)
>> [2018-01-22T20:33:46.88] at nsk.share.jdi.TestDebuggerType1.complain(TestDebuggerType1.java:63)
>> [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.testRun(resume008.java:163)
>> [2018-01-22T20:33:46.88] at nsk.share.jdi.TestDebuggerType1.runThis(TestDebuggerType1.java:104)
>> [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.run(resume008.java:62)
>> [2018-01-22T20:33:46.88] at nsk.jdi.EventSet.resume.resume008.main(resume008.java:57)
>> [2018-01-22T20:33:46.88] # ERROR: debugger> before resuming : 1
>> [2018-01-22T20:33:46.88] # ERROR: debugger> after resuming : 2
>> [2018-01-22T20:33:46.88] debugger> ......--> vm.resume() 
>> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: entered 
>> [2018-01-22T20:33:46.88] debugger> shouldRunAfterBreakpoint: received 
>> breakpoint event. [2018-01-22T20:33:46.88] debugger> 
>> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.88] 
>> debugger> :::::: case: # 1 [2018-01-22T20:33:46.88] debugger> 
>> ......waiting for new ThreadStartEvent : 1 [2018-01-22T20:33:46.88] 
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 548ad73b 
>> [2018-01-22T20:33:46.88] EventHandler> waitForRequestedEventSet: 
>> vm.resume called [2018-01-22T20:33:46.88] EventHandler> Received 
>> event set with policy = SUSPEND_EVENT_THREAD [2018-01-22T20:33:46.88] 
>> EventHandler> waitForRequestedEventSet: Received event set for 
>> request: thread start request (enabled) [2018-01-22T20:33:46.88] 
>> EventHandler> Event: ThreadStartEventImpl req thread start request 
>> (enabled) [2018-01-22T20:33:46.88] EventHandler> Removing listener 
>> nsk.share.jdi.EventHandler$7 at 548ad73b [2018-01-22T20:33:46.88] 
>> debugger> got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest2 [2018-01-22T20:33:46.88] debugger> ......checking 
>> up on EventSet.resume() [2018-01-22T20:33:46.88] debugger> ......--> 
>> vm.suspend(); [2018-01-22T20:33:46.88] debugger> getting : 
>> Map<String, Integer> suspendsCounts1 [2018-01-22T20:33:46.89] 
>> debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.89] debugger> 
>> eventSet.resume; [2018-01-22T20:33:46.89] debugger> getting : 
>> Map<String, Integer> suspendsCounts2 [2018-01-22T20:33:46.89] 
>> debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.89] debugger> 
>> getting : int policy = eventSet.suspendPolicy(); 
>> [2018-01-22T20:33:46.89] debugger> case SUSPEND_THREAD 
>> [2018-01-22T20:33:46.89] debugger> checking Reference Handler 
>> [2018-01-22T20:33:46.89] debugger> checking thread1 
>> [2018-01-22T20:33:46.89] debugger> checking Common-Cleaner 
>> [2018-01-22T20:33:46.89] debugger> checking main 
>> [2018-01-22T20:33:46.90] debugger> checking Signal Dispatcher 
>> [2018-01-22T20:33:46.90] debugger> checking Finalizer 
>> [2018-01-22T20:33:46.90] debugger> ......--> vm.resume() 
>> [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: entered 
>> [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: waiting 
>> for breakpoint event during 1 sec. [2018-01-22T20:33:46.90] 
>> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread1 
>> [2018-01-22T20:33:46.90] debugee.stderr> **> debuggee: 'run': exit :: 
>> threadName == thread1 [2018-01-22T20:33:46.90] EventHandler> Received 
>> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] 
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
>> [2018-01-22T20:33:46.90] debugger> Received communication breakpoint 
>> event. [2018-01-22T20:33:46.90] debugger> shouldRunAfterBreakpoint: 
>> received breakpoint event. [2018-01-22T20:33:46.90] debugger> 
>> shouldRunAfterBreakpoint: exited with true. [2018-01-22T20:33:46.90] 
>> debugger> :::::: case: # 2 [2018-01-22T20:33:46.90] debugger> 
>> ......waiting for new ThreadStartEvent : 2 [2018-01-22T20:33:46.90] 
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] 
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 2641e737 
>> [2018-01-22T20:33:46.90] EventHandler> waitForRequestedEventSet: 
>> vm.resume called [2018-01-22T20:33:46.90] EventHandler> Received 
>> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.90] 
>> EventHandler> waitForRequestedEventSet: Received event set for 
>> request: thread start request (enabled) [2018-01-22T20:33:46.90] 
>> EventHandler> Event: ThreadStartEventImpl req thread start request 
>> (enabled) [2018-01-22T20:33:46.90] EventHandler> Removing listener 
>> nsk.share.jdi.EventHandler$7 at 2641e737 [2018-01-22T20:33:46.90] 
>> debugger> got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest3 [2018-01-22T20:33:46.90] debugger> ......checking 
>> up on EventSet.resume() [2018-01-22T20:33:46.90] debugger> ......--> 
>> vm.suspend(); [2018-01-22T20:33:46.90] debugger> getting : 
>> Map<String, Integer> suspendsCounts1 [2018-01-22T20:33:46.91] 
>> debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, 
>> Signal Dispatcher=2, Finalizer=2} [2018-01-22T20:33:46.91] debugger> 
>> eventSet.resume; [2018-01-22T20:33:46.91] debugger> getting : 
>> Map<String, Integer> suspendsCounts2 [2018-01-22T20:33:46.91] 
>> debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1} [2018-01-22T20:33:46.91] debugger> 
>> getting : int policy = eventSet.suspendPolicy(); 
>> [2018-01-22T20:33:46.91] debugger> case SUSPEND_ALL 
>> [2018-01-22T20:33:46.91] debugger> checking Reference Handler 
>> [2018-01-22T20:33:46.91] debugger> checking thread2 
>> [2018-01-22T20:33:46.91] debugger> checking Common-Cleaner 
>> [2018-01-22T20:33:46.91] debugger> checking main 
>> [2018-01-22T20:33:46.91] debugger> checking Signal Dispatcher 
>> [2018-01-22T20:33:46.91] debugger> checking Finalizer 
>> [2018-01-22T20:33:46.91] debugger> ......--> vm.resume() 
>> [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: entered 
>> [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: waiting 
>> for breakpoint event during 1 sec. [2018-01-22T20:33:46.91] 
>> debugee.stderr> **> debuggee: 'run': enter :: threadName == thread2 
>> [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: 'run': exit :: 
>> threadName == thread2 [2018-01-22T20:33:46.91] EventHandler> Received 
>> event set with policy = SUSPEND_ALL [2018-01-22T20:33:46.91] 
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:60 (enabled) 
>> [2018-01-22T20:33:46.91] debugger> Received communication breakpoint 
>> event. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: 
>> received breakpoint event. [2018-01-22T20:33:46.91] debugger> 
>> shouldRunAfterBreakpoint: received instruction from debuggee to 
>> finish. [2018-01-22T20:33:46.91] debugger> shouldRunAfterBreakpoint: 
>> exited with false. [2018-01-22T20:33:46.91] debugger> TESTING ENDS 
>> [2018-01-22T20:33:46.91] debugger> Waiting for debuggee's exit... 
>> [2018-01-22T20:33:46.91] EventHandler> waitForVMDisconnect 
>> [2018-01-22T20:33:46.91] debugee.stderr> **> debuggee: debuggee exits 
>> [2018-01-22T20:33:46.92] EventHandler> Received event set with policy 
>> = SUSPEND_NONE [2018-01-22T20:33:46.92] EventHandler> Event: 
>> VMDeathEventImpl req null [2018-01-22T20:33:46.92] EventHandler> 
>> receieved VMDeath [2018-01-22T20:33:46.92] EventHandler> Removing 
>> listener nsk.share.jdi.EventHandler$3 at 77f99a05 
>> [2018-01-22T20:33:47.25] EventHandler> Received event set with policy 
>> = SUSPEND_NONE [2018-01-22T20:33:47.25] EventHandler> Event: 
>> VMDisconnectEventImpl req null [2018-01-22T20:33:47.25] EventHandler> 
>> receieved VMDisconnect [2018-01-22T20:33:47.25] EventHandler> 
>> Removing listener nsk.share.jdi.EventHandler$4 at 3aeaafa6 
>> [2018-01-22T20:33:47.25] EventHandler> finished 
>> [2018-01-22T20:33:47.25] EventHandler> waitForVMDisconnect: done 
>> [2018-01-22T20:33:47.25] debugger> Event handler thread exited. 
>> [2018-01-22T20:33:47.25] debugger> Debuggee PASSED. 
>> [2018-01-22T20:33:47.26] [2018-01-22T20:33:47.26] 
>> [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] #> SUMMARY: 
>> Following errors occured [2018-01-22T20:33:47.26] #> during test 
>> execution: [2018-01-22T20:33:47.26] #> [2018-01-22T20:33:47.26] # 
>> ERROR: debugger> ERROR: suspendCounts don't match for : Reference 
>> Handler [2018-01-22T20:33:47.26] # ERROR: debugger> before resuming : 
>> 1 [2018-01-22T20:33:47.26] # ERROR: debugger> after resuming : 2 
>> [2018-01-22T20:33:47.27] # Test level exit status: 97
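
Judging only from the log output, the per-policy check seems to expect: no change for SUSPEND_NONE, the event thread decremented by one for SUSPEND_EVENT_THREAD, and every thread decremented by one for SUSPEND_ALL. A rough sketch of that comparison follows. SuspendCountCheck, Policy and firstMismatch are made-up names and this is inferred from the logs, not copied from resume008.java.

```java
import java.util.Map;

/** Hypothetical restatement of the suspend-count comparison, inferred from the logs. */
final class SuspendCountCheck {
    enum Policy { NONE, EVENT_THREAD, ALL }

    /**
     * Compares the counts taken before and after eventSet.resume().
     * Returns the name of the first thread whose "after" count is not what the
     * policy predicts, or null if every thread matches.
     */
    static String firstMismatch(Policy policy, String eventThread,
                                Map<String, Integer> before, Map<String, Integer> after) {
        for (Map.Entry<String, Integer> entry : before.entrySet()) {
            String name = entry.getKey();
            // A thread is expected to be resumed once if the policy covers it.
            boolean resumed = policy == Policy.ALL
                    || (policy == Policy.EVENT_THREAD && name.equals(eventThread));
            int expected = entry.getValue() - (resumed ? 1 : 0);
            Integer actual = after.get(name);
            if (actual == null || actual != expected) {
                return name;
            }
        }
        return null;
    }
}
```

Fed the SUSPEND_NONE snapshots from the failed run (Reference Handler 1 before, 2 after), this reports Reference Handler, the same thread the test complains about.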
>>
>>
>> Here's a recent passed log from a local run:
>>
>> ----------System.out:(164/9808)----------
>> run [nsk.jdi.EventSet.resume.resume008, -verbose, -arch=linux-x64, 
>> -waittime=5, -debugee.vmkind=java, -transport.address=dynamic, 
>> -debugee.vmkeys=-XX:MaxRAMPercentage=2 ]
>> binder> VirtualMachineManager: version 11.0
>> binder> Finding connector: default
>> binder> LaunchingConnector:
>> binder>     name: com.sun.jdi.CommandLineLaunch
>> binder>     description: Launches target using Sun Java VM command 
>> line and attaches to it
>> binder>     transport: 
>> com.sun.tools.jdi.SunCommandLineLauncher$2 at 749dec1a
>> binder> Connector arguments:
>> binder> home=/export/users/gradams/ws/jdk-jdk/build/linux-x64/images/jdk
>> binder>     vmexec=java
>> binder>     options=-XX:MaxRAMPercentage=2
>> binder>     main=nsk.jdi.EventSet.resume.resume008a "-verbose" 
>> "-arch=linux-x64" "-waittime=5" "-debugee.vmkind=java" 
>> "-transport.address=dynamic" "-debugee.vmkeys=-XX:MaxRAMPercentage=2 
>> " "-pipe.port=35940"
>> binder>     quote="
>> binder>     suspend=true
>> binder> Launching debugee
>> binder> Waiting for VM initialized
>> Initial VMStartEvent received: VMStartEvent in thread main
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$1 at 2ab41d39
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$2 at 2e3cb1e2
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$3 at 57f20df9
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$4 at 6e72e291
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$5 at 5889e23e
>> EventHandler> waitForRequestedEvent: enabling remove of listener 
>> nsk.share.jdi.EventHandler$6 at 46dcda7f
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$6 at 46dcda7f
>> EventHandler> waitForRequestedEvent: vm.resume called
>> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
>> EventHandler> Event: ClassPrepareEventImpl req class prepare request  
>> (enabled)
>> EventHandler> waitForRequestedEvent: Received event(ClassPrepareEvent 
>> in thread main) for request(class prepare request  (enabled))
>> EventHandler> Removing listener nsk.share.jdi.EventHandler$6 at 46dcda7f
>> debugger> Received ClassPrepareEvent for debuggee class: 
>> nsk.jdi.EventSet.resume.resume008a
>> binder> Breakpoint set:
>>     breakpoint request nsk.jdi.EventSet.resume.resume008a:74 (disabled)
>> EventHandler> Adding listener nsk.share.jdi.TestDebuggerType1$1 at 322c2a05
>> debugger> TESTING BEGINS
>> debugger> RESUME DEBUGGEE VM
>> debugger> shouldRunAfterBreakpoint: entered
>> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event 
>> during 1 sec.
>>
>> debugee.stderr> **> debuggee: debuggee started!
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
>> debugger> Received communication breakpoint event.
>>
>> debugger> shouldRunAfterBreakpoint: received breakpoint event.
>> debugger> shouldRunAfterBreakpoint: exited with true.
>> debugger> :::::: case: # 0
>> debugger> ......waiting for new ThreadStartEvent : 0
>>
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 78aa490d
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 78aa490d
>> EventHandler> waitForRequestedEventSet: vm.resume called
>> EventHandler> Received event set with policy = SUSPEND_NONE
>> debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread0
>> EventHandler> waitForRequestedEventSet: Received event set for 
>> request: thread start request  (enabled)
>> EventHandler> Event: ThreadStartEventImpl req thread start request  
>> (enabled)
>> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 78aa490d
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
>> debugger> Received communication breakpoint event.
>>
>> debugger>        got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest1
>> debugger> ......checking up on EventSet.resume()
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map<String, Integer> suspendsCounts1
>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>> Signal Dispatcher=2, Finalizer=2}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map<String, Integer> suspendsCounts2
>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>> Signal Dispatcher=2, Finalizer=2}
>> debugger>         getting : int policy = eventSet.suspendPolicy();
>> debugger>         case SUSPEND_NONE
>> debugger>         checking Reference Handler
>> debugger>         checking thread0
>> debugger>         checking Common-Cleaner
>> debugger>         checking main
>> debugger>         checking Signal Dispatcher
>> debugger>         checking Finalizer
>> debugger> ......--> vm.resume()
>> debugger> shouldRunAfterBreakpoint: entered
>> debugger> shouldRunAfterBreakpoint: received breakpoint event.
>> debugger> shouldRunAfterBreakpoint: exited with true.
>> debugger> :::::: case: # 1
>> debugger> ......waiting for new ThreadStartEvent : 1
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 616bc3ae
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 616bc3ae
>> EventHandler> waitForRequestedEventSet: vm.resume called
>> EventHandler> Received event set with policy = SUSPEND_EVENT_THREAD
>> debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread0
>> EventHandler> waitForRequestedEventSet: Received event set for 
>> request: thread start request  (enabled)
>> EventHandler> Event: ThreadStartEventImpl req thread start request  
>> (enabled)
>> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 616bc3ae
>> debugger>        got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest2
>> debugger> ......checking up on EventSet.resume()
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map<String, Integer> suspendsCounts1
>> debugger> {Reference Handler=1, thread1=2, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map<String, Integer> suspendsCounts2
>> debugger> {Reference Handler=1, thread1=1, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1}
>> debugger>         getting : int policy = eventSet.suspendPolicy();
>> debugger>         case SUSPEND_THREAD
>> debugger> checking Reference Handler
>> debugger> checking thread1
>> debugger> checking Common-Cleaner
>> debugger> checking main
>> debugger> checking Signal Dispatcher
>> debugger> checking Finalizer
>> debugger> ......--> vm.resume()
>> debugger> shouldRunAfterBreakpoint: entered
>> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event 
>> during 1 sec.
>> debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread1
>> debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread1
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
>> debugger> Received communication breakpoint event.
>> debugger> shouldRunAfterBreakpoint: received breakpoint event.
>> debugger> shouldRunAfterBreakpoint: exited with true.
>> debugger> :::::: case: # 2
>> debugger> ......waiting for new ThreadStartEvent : 2
>> EventHandler> waitForRequestedEventSet: enabling remove of listener 
>> nsk.share.jdi.EventHandler$7 at 44e265ef
>> EventHandler> Adding listener nsk.share.jdi.EventHandler$7 at 44e265ef
>> EventHandler> waitForRequestedEventSet: vm.resume called
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> waitForRequestedEventSet: Received event set for 
>> request: thread start request  (enabled)
>> EventHandler> Event: ThreadStartEventImpl req thread start request  
>> (enabled)
>> EventHandler> Removing listener nsk.share.jdi.EventHandler$7 at 44e265ef
>> debugger>        got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest3
>> debugger> ......checking up on EventSet.resume()
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map<String, Integer> suspendsCounts1
>> debugger> {Reference Handler=2, thread2=2, Common-Cleaner=2, main=2, 
>> Signal Dispatcher=2, Finalizer=2}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map<String, Integer> suspendsCounts2
>> debugger> {Reference Handler=1, thread2=1, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1}
>> debugger>         getting : int policy = eventSet.suspendPolicy();
>> debugger>         case SUSPEND_ALL
>> debugger> checking Reference Handler
>> debugger> checking thread2
>> debugger> checking Common-Cleaner
>> debugger> checking main
>> debugger> checking Signal Dispatcher
>> debugger> checking Finalizer
>> debugger> ......--> vm.resume()
>> debugger> shouldRunAfterBreakpoint: entered
>> debugger> shouldRunAfterBreakpoint: waiting for breakpoint event 
>> during 1 sec.
>> debugee.stderr> **> debuggee:   'run': enter  :: threadName == thread2
>> debugee.stderr> **> debuggee:   'run': exit   :: threadName == thread2
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:74 (enabled)
>> debugger> Received communication breakpoint event.
>> debugger> shouldRunAfterBreakpoint: received breakpoint event.
>> debugger> shouldRunAfterBreakpoint: received instruction from 
>> debuggee to finish.
>> debugger> shouldRunAfterBreakpoint: exited with false.
>> debugger> TESTING ENDS
>> debugger> Waiting for debuggee's exit...
>> debugee.stderr> **> debuggee: debuggee exits
>> EventHandler> waitForVMDisconnect
>> EventHandler> Received event set with policy = SUSPEND_NONE
>> EventHandler> Event: VMDeathEventImpl req null
>> EventHandler> receieved VMDeath
>> EventHandler> Removing listener nsk.share.jdi.EventHandler$3 at 57f20df9
>> EventHandler> Received event set with policy = SUSPEND_NONE
>> EventHandler> Event: VMDisconnectEventImpl req null
>> EventHandler> receieved VMDisconnect
>> EventHandler> Removing listener nsk.share.jdi.EventHandler$4 at 6e72e291
>> EventHandler> finished
>> EventHandler> waitForVMDisconnect: done
>> debugger> Event handler thread exited.
>> debugger> Debuggee PASSED.
>>
>> On 7/18/18, 6:09 PM, gary.adams at oracle.com wrote:
>>> On 7/18/18 4:47 PM, Chris Plummer wrote:
>>>> Hi Gary
>>>>
>>>> Ok, so shouldRunAfterBreakpoint() is the code that does the 
>>>> eventHandler.wait(), so it gets the eventHandler.notifyAll() 
>>>> notification from the BreakpointEvent handler.
>>>>
>>>> And as a side note, I see now that resumption of execution after 
>>>> the breakpoint at main() is done by:
>>>>
>>>>             // after waitForClassPrepared() main debuggee thread is suspended, resume it before test start
>>>>             display("RESUME DEBUGGEE VM");
>>>>             vm.resume();
>>>>
>>>>             testRun();
>>>>
>>>> shouldRunAfterBreakpoint() is returning true until the end of the
>>>> test when the debuggee executes "instruction = end". That's why
>>>> runTests() does a "break" when shouldRunAfterBreakpoint() returns
>>>> false. So this means the code that is checking
>>>> shouldRunAfterBreakpoint() is not resuming execution for the first
>>>> few (probably 3) methodForCommunication() breakpoints. However, it
>>>> does make sure that runTests() blocks until the BreakpointEvent has
>>>> been processed.
>>>>
>>>> You point out the vm.resume() at the bottom of the loop in 
>>>> runTests(), but that's only after a bunch of ThreadStartEvent 
>>>> processing above it has been done already. The ThreadStartEvent 
>>>> would never get generated if there was not a resume some point 
>>>> earlier. I think it is happening during the 
>>>> eventHandler.waitForRequestedEventSet() call, which does a 
>>>> vm.resume().
>>>>
>>>> So if I understand the order of things now:
>>>>
>>>> -shouldRunAfterBreakpoint() returns after first 
>>>> methodForCommunication() is hit. At this point we know the first 
>>>> thread has been created, but no attempt to start it yet. The 
>>>> debuggee is suspended at this point.
>>>> -runTests() requests ThreadStartEvents with SUSPEND_NONE. This also 
>>>> does a vm.resume().
>>>> -The debuggee starts the thread and then does another
>>>> methodForCommunication() (this 2nd one is actually after the 2nd
>>>> thread has been created, but not yet started). Now we have a race.
>>>> Do we get the ThreadStartEvent first or the BreakpointEvent? The
>>>> race exists because when the ThreadStartEvent is generated, the
>>>> thread is not suspended due to SUSPEND_NONE. Even if the
>>>> ThreadStartEvent comes in first, the async handling of the
>>>> BreakpointEvent can cause problems during the ThreadStartEvent
>>>> processing.
>>> Based on the failed log in the bug report, the thread start event is
>>> observed, the suspend counts acquired, then after the resume, the
>>> breakpoint message is displayed and the second set of suspend counts
>>> acquired.
>>>
>>> I can show you the passed and failed logs tomorrow.
>>>> -You added a 100ms delay after the thread has started, but before 
>>>> methodForCommunication(), hoping it will make it so the 
>>>> ThreadStartEvent can be received and fully processed before the 
>>>> BreakpointEvent is.
>>> The delay is mostly just a yield so the debugger gets a chance to run.
>>>>
>>>> I think it would be preferable to fix this by doing better
>>>> synchronization. After all, that is the approach the test originally
>>>> took. It could have been written with a bunch of sleep() delays
>>>> instead, but that in general is not a very good approach.
>>>>
>>>> What if you added a shouldRunAfterBreakpoint() call after the
>>>> ThreadStartEvent arrives? At this point you would know that the
>>>> vm is suspended due to the breakpoint, so no need for:
>>>>
>>>>                 display("......checking up on EventSet.resume()");
>>>>                 display("......--> vm.suspend();");
>>>>                 vm.suspend();
>>> I think the suspend is intentional to capture the suspend counts.
>>> It also needs to resume the vm and acquire again so it can confirm
>>> the correct suspend count behaviors.
>>> If the test waits to capture the second set of suspend counts, the
>>> breakpoint causes incorrect values.
>>>
>>> ...
>>>>
>>>> You might then also need to add another methodForCommunication() 
>>>> call at the end of case 0 and 1 in the debuggee, although I think 
>>>> you could instead just change the shouldRunAfterBreakpoint() at the 
>>>> start of the loop. I think that check actually belongs at the end 
>>>> of the loop, and only for case 2. In fact it would be an error if 
>>>> shouldRunAfterBreakpoint() did not return true in that case. Then 
>>>> you also need to add a shouldRunAfterBreakpoint() at the start of 
>>>> case 0 to get things rolling (and I think at the start of case 1 
>>>> also).
>>>>
>>>> Chris
>>>>
>>>>
>>>> On 7/18/18 12:45 PM, Gary Adams wrote:
>>>>> Answers below  ...
>>>>>
>>>>> On 7/18/18, 2:50 PM, Chris Plummer wrote:
>>>>>> Hi Gary,
>>>>>>
>>>>>> Who does the resume for the breakpoint event?
>>>>>>
>>>>>>         eventHandler.addListener(
>>>>>>              new EventHandler.EventListener() {
>>>>>>                  public boolean eventReceived(Event event) {
>>>>>>                     if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>>                         synchronized(eventHandler) {
>>>>>>                             display("Received communication breakpoint event.");
>>>>>>                             bpCount++;
>>>>>>                             eventHandler.notifyAll();
>>>>>>                         }
>>>>>>                         return true;
>>>>>>                     }
>>>>>>                     return false;
>>>>>>                  }
>>>>>>              }
>>>>>>         );
>>>>> I believe you are looking for this sequence.
>>>>> At the top of the loop, shouldRunAfterBreakpoint() checks
>>>>> whether the test should keep running. Lines 96-99 are the
>>>>> early termination, and line 240, at the bottom of the
>>>>> loop, is the normal resume that continues to the next case.
>>>>>
>>>>> resume008.java :
>>>>> ...
>>>>>     94            for (int i = 0; ; i++) {
>>>>>     95
>>>>>     96                if (!shouldRunAfterBreakpoint()) {
>>>>>     97                    vm.resume();
>>>>>     98                    break;
>>>>>     99                }
>>>>>    100
>>>>>    101
>>>>>    102                display(":::::: case: # " + i);
>>>>>    103
>>>>>    104                switch (i) {
>>>>>    105
>>>>>    106                    case 0:
>>>>>    107                    eventRequest = settingThreadStartRequest (
>>>>>    108                                           SUSPEND_NONE, "ThreadStartRequest1");
>>>>> ...
>>>>>    238
>>>>>    239                display("......--> vm.resume()");
>>>>>    240                vm.resume();
>>>>>    241            }
>>>>>>
>>>>>> Also:
>>>>>>
>>>>>>>   1. On a thread start event the debuggee is suspended, line 141
>>>>>> That's not true for the first ThreadStartEvent since SUSPEND_NONE 
>>>>>> was used.
>>>>> The thread start event is set to SUSPEND_NONE for thread0, but when
>>>>> the thread start event is observed the resume008 test suspends the vm
>>>>> immediately after fetching the "number" property.
>>>> My point is that the Debuggee continues to run after the 
>>>> ThreadStartEvent is sent, and relies on the debugger to stop it 
>>>> after receiving the event. But in the meantime the debuggee has 
>>>> advanced to the next breakpoint, but only sometimes, thus the bug 
>>>> you are seeing.
>>>>>
>>>>>    132                if ( !(newEvent instanceof ThreadStartEvent)) {
>>>>>    133                    setFailedStatus("ERROR: new event is not ThreadStartEvent");
>>>>>    134                } else {
>>>>>    135
>>>>>    136                    String property = (String) newEvent.request().getProperty("number");
>>>>>    137                    display("       got new ThreadStartEvent with propety 'number' == " + property);
>>>>>    138
>>>>>    139                    display("......checking up on EventSet.resume()");
>>>>>    140                    display("......--> vm.suspend();");
>>>>>    141                    vm.suspend();
>>>>>
>>>>>
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/18/18 4:52 AM, Gary Adams wrote:
>>>>>>> There is nothing wrong with the breakpoint in methodForCommunication.
>>>>>>> The test uses it to make sure the threads are each tested separately.
>>>>>>> The breakpoint eventhandler just displays a message, increments a
>>>>>>> counter and returns.
>>>>>>>
>>>>>>> Let me step through resume008a, the debuggee, to help clarify ...
>>>>>>>
>>>>>>> 1. The test thread is created and the synchronized breakpoint
>>>>>>>    is observed. lines 101-102
>>>>>>> 2. The thread is started. lines 104,135-137
>>>>>>>     2a. The main thread blocks on a local object. lines 133, 139
>>>>>>>     2b. The test thread is started. line 137
>>>>>>>            A run entered message is displayed, line 159
>>>>>>>            The main thread lock object is notified, line 167
>>>>>>>           2b1. The main thread continues. line 167, 146
>>>>>>>                   The next test thread is created. line 106
>>>>>>>                   The synchronized breakpoint is observed, line 107
>>>>>>>           2b2. A run exited message is displayed, line 169
>>>>>>>
>>>>>>> On the resume008 debugger side  ...
>>>>>>>   1. On a thread start event the debuggee is suspended, line 141
>>>>>>>   2. Messages are displayed and a first set of thread suspend
>>>>>>>      counts is acquired. lines 143-151
>>>>>>>   3. The threads are resumed, line 152
>>>>>>> --->
>>>>>>>   4. Messages are displayed and a second set of thread suspend
>>>>>>>      counts is acquired. lines 154-159
>>>>>>>
>>>>>>> The way the test is written the expectation is the debugger
>>>>>>> steps 2,3,4 will all happen while the test thread is running.
>>>>>>>
>>>>>>> When the debugger resumes the debuggee threads (debugger step 3)
>>>>>>> the debuggee continues from where it left off (debuggee steps
>>>>>>> 2b,2b1,2b2).
>>>>>>>
>>>>>>> If we complete debuggee step 2b1 (line 107) before the debugger
>>>>>>> completes step 4 (line 159), then the synchronized breakpoint will
>>>>>>> suspend the vm and the counts will not match for the SUSPEND_NONE
>>>>>>> test thread start.
>>>>>>>
>>>>>>> resume008a.java:
>>>>>>>
>>>>>>>    100                        case 0:
>>>>>>>    101                                thread0 = new Threadresume008a("thread0");
>>>>>>>    102                                methodForCommunication();
>>>>>>>    103
>>>>>>>    104                                threadStart(thread0);
>>>>>>>    105
>>>>>>>    106                                thread1 = new Threadresume008a("thread1");
>>>>>>>    107                                methodForCommunication();
>>>>>>>    108                                break;
>>>>>>>
>>>>>>>    ...
>>>>>>>    135        static int threadStart(Thread t) {
>>>>>>>    136            synchronized (waitnotifyObj) {
>>>>>>>    137                t.start();
>>>>>>>    138                try {
>>>>>>>    139                    waitnotifyObj.wait();
>>>>>>>    140                } catch ( Exception e) {
>>>>>>>    141                    exitCode = FAILED;
>>>>>>>    142                    logErr("       Exception : " + e );
>>>>>>>    143                    return FAILED;
>>>>>>>    144                }
>>>>>>>    145            }
>>>>>>>    146            return PASSED;
>>>>>>>    147        }
>>>>>>>
>>>>>>>    149        static class Threadresume008a extends Thread {
>>>>>>>    150
>>>>>>>    151            String tName = null;
>>>>>>>    152
>>>>>>>    153            public Threadresume008a(String threadName) {
>>>>>>>    154                super(threadName);
>>>>>>>    155                tName = threadName;
>>>>>>>    156            }
>>>>>>>    157
>>>>>>>    158            public void run() {
>>>>>>>    159                log1("  'run': enter  :: threadName == " + tName);
>>>>>>>
>>>>>>> This is the proposed fix that will let the debugger complete its
>>>>>>> second acquisition of suspend counts while the test thread is
>>>>>>> still running.
>>>>>>>
>>>>>>>    160                // Yield, so the start thread event processing can be completed.
>>>>>>>    161                try {
>>>>>>>    162                    Thread.sleep(100);
>>>>>>>    163                } catch (InterruptedException e) {
>>>>>>>    164                    // ignored
>>>>>>>    165                }
>>>>>>>
>>>>>>>    166                synchronized (waitnotifyObj) {
>>>>>>>    167                        waitnotifyObj.notify();
>>>>>>>    168                }
>>>>>>>    169                log1("  'run': exit   :: threadName == " + tName);
>>>>>>>    170                return;
>>>>>>>    171            }
>>>>>>>    172        }
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 7/18/18, 2:38 AM, Chris Plummer wrote:
>>>>>>>> Hi Gary,
>>>>>>>>
>>>>>>>> I've been having trouble following the control flow of this 
>>>>>>>> test. One thing I've stumbled across is the following:
>>>>>>>>
>>>>>>>>             /* A debuggee class must define 'methodForCommunication'
>>>>>>>>              * method and invoke it in points of synchronization
>>>>>>>>              * with a debugger.
>>>>>>>>              */
>>>>>>>>             setCommunicationBreakpoint(debuggeeClass,"methodForCommunication");
>>>>>>>>
>>>>>>>>
>>>>>>>> So why isn't this mode of synchronization good enough? Is it 
>>>>>>>> because it was not designed with the understanding that the 
>>>>>>>> debugger might be doing suspended thread counts, and suspending 
>>>>>>>> all threads at the breakpoint messes up the test?
>>>>>>>>
>>>>>>>> From what I can tell of the test, after the debuggee is started 
>>>>>>>> and hits the default breakpoint at the start of main(), the 
>>>>>>>> debugger then does a vm.resume() at the start of the for loop 
>>>>>>>> in the runTest() method. The debuggee then creates a thread and 
>>>>>>>> calls methodForCommunication(). There is already a breakpoint 
>>>>>>>> set there by the above debuggee code. It's unclear to me what 
>>>>>>>> happens as a result of this breakpoint and how it serves the 
>>>>>>>> test. Also unclear to me who is responsible for the vm.resume() 
>>>>>>>> after the breakpoint is hit.
>>>>>>>>
>>>>>>>> The debugger then requests all ThreadStart events, requesting 
>>>>>>>> that no threads be disabled when it is sent. I think you are 
>>>>>>>> saying that when the ThreadStart event comes in, sometimes we 
>>>>>>>> are at the methodForCommunication breakpoint, with all threads 
>>>>>>>> disabled, and this messes up the thread suspend counts. You 
>>>>>>>> want to delay 100ms so the breakpoint event can be processed 
>>>>>>>> and threads resumed again (although I can't see who actually 
>>>>>>>> resumes the thread after hitting the methodForCommunication 
>>>>>>>> breakpoint).
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>
>>>>>>>> On 7/17/18 8:33 AM, Gary Adams wrote:
>>>>>>>>> A race condition exists between the debugger and the debuggee.
>>>>>>>>>
>>>>>>>>> The first test thread is started with SUSPEND_NONE policy set.
>>>>>>>>> While processing the thread start event the debugger captures
>>>>>>>>> an initial set of thread suspend counts and resumes the
>>>>>>>>> debuggee vm. If the debuggee advances quickly it reaches
>>>>>>>>> the breakpoint set for methodForCommunication. Since the
>>>>>>>>> breakpoint carries with it SUSPEND_ALL policy, when the debugger
>>>>>>>>> captures a second set of suspend counts, it will not match the
>>>>>>>>> expected counts for a SUSPEND_NONE scenario.
>>>>>>>>>
>>>>>>>>> The proposed fix introduces a yield in the debuggee test thread
>>>>>>>>> run method to allow the debugger to get the expected sampled values.
>>>>>>>>>
>>>>>>>>>   Issue: https://bugs.openjdk.java.net/browse/JDK-8170089
>>>>>>>>>   Webrev: http://cr.openjdk.java.net/~gadams/8170089/webrev.00/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/share/jdi/TestDebuggerType1.java: 
>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>>    186        private void setCommunicationBreakpoint(ReferenceType refType, String methodName) {
>>>>>>>>>    187            Method method = debuggee.methodByName(refType, methodName);
>>>>>>>>>    188            Location location = null;
>>>>>>>>>    189            try {
>>>>>>>>>    190                location = method.allLineLocations().get(0);
>>>>>>>>>    191            } catch (AbsentInformationException e) {
>>>>>>>>>    192                throw new Failure(e);
>>>>>>>>>    193            }
>>>>>>>>>    194            bpRequest = debuggee.makeBreakpoint(location);
>>>>>>>>>    195
>>>>>>>>>
>>>>>>>>>    196            bpRequest.setSuspendPolicy(EventRequest.SUSPEND_ALL);
>>>>>>>>>
>>>>>>>>>    197            bpRequest.putProperty("number", "zero");
>>>>>>>>>    198            bpRequest.enable();
>>>>>>>>>    199
>>>>>>>>>    200            eventHandler.addListener(
>>>>>>>>>    201                 new EventHandler.EventListener() {
>>>>>>>>>    202                     public boolean eventReceived(Event event) {
>>>>>>>>>    203                        if (event instanceof BreakpointEvent && bpRequest.equals(event.request())) {
>>>>>>>>>    204                            synchronized(eventHandler) {
>>>>>>>>>    205                                display("Received communication breakpoint event.");
>>>>>>>>>    206                                bpCount++;
>>>>>>>>>    207                                eventHandler.notifyAll();
>>>>>>>>>    208                            }
>>>>>>>>>    209                            return true;
>>>>>>>>>    210                        }
>>>>>>>>>    211                        return false;
>>>>>>>>>    212                     }
>>>>>>>>>    213                 }
>>>>>>>>>    214            );
>>>>>>>>>    215        }
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java: 
>>>>>>>>>
>>>>>>>>> ...
>>>>>>>>>    140                    display("......--> vm.suspend();");
>>>>>>>>>    141                    vm.suspend();
>>>>>>>>>    142
>>>>>>>>>    143                    display("        getting : Map<String, Integer> suspendsCounts1");
>>>>>>>>>    144
>>>>>>>>>    145                    Map<String, Integer> suspendsCounts1 = new HashMap<String, Integer>();
>>>>>>>>>    146                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>>>>>    147                        suspendsCounts1.put(threadReference.name(), threadReference.suspendCount());
>>>>>>>>>    148                    }
>>>>>>>>>    149                    display(suspendsCounts1.toString());
>>>>>>>>>    150
>>>>>>>>>    151                    display("        eventSet.resume;");
>>>>>>>>>    152                    eventSet.resume();
>>>>>>>>>    153
>>>>>>>>>    154                    display("        getting : Map<String, Integer> suspendsCounts2");
>>>>>>>>>
>>>>>>>>> This is where the breakpoint is encountered before the second
>>>>>>>>> set of suspend counts is acquired.
>>>>>>>>>
>>>>>>>>>    155                    Map<String, Integer> suspendsCounts2 = new HashMap<String, Integer>();
>>>>>>>>>    156                    for (ThreadReference threadReference : vm.allThreads()) {
>>>>>>>>>    157                        suspendsCounts2.put(threadReference.name(), threadReference.suspendCount());
>>>>>>>>>    158                    }
>>>>>>>>>    159                    display(suspendsCounts2.toString());
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
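
The suspend-count mismatch discussed in this thread can be worked through
with a toy model. The sketch below is not JDI code: SuspendCountModel and
its methods are hypothetical names, and the model only assumes the counting
rules described above (vm.suspend() and a SUSPEND_ALL event each raise every
thread's suspend count by one, and resuming a SUSPEND_NONE event set changes
nothing). It ignores per-thread details such as main staying at 1 in the
failing log.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of per-thread suspend counts (hypothetical, not the JDI API). */
public class SuspendCountModel {
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    public SuspendCountModel(String... threadNames) {
        for (String name : threadNames) {
            counts.put(name, 0);
        }
    }

    /** Models vm.suspend() or a SUSPEND_ALL event: every count goes up by one. */
    public void suspendAll() {
        counts.replaceAll((name, c) -> c + 1);
    }

    /** Models eventSet.resume() for a SUSPEND_NONE event set: nothing to undo. */
    public void resumeSuspendNoneEventSet() {
        // intentionally empty: the event itself suspended no threads
    }

    /** Copies the current counts, like the suspendsCounts1/2 maps in the test. */
    public Map<String, Integer> snapshot() {
        return new LinkedHashMap<>(counts);
    }

    /**
     * Replays the debugger's check. When breakpointWinsRace is true, the
     * SUSPEND_ALL communication breakpoint lands between the resume and the
     * second snapshot, which is the failing interleaving.
     */
    public static boolean countsMatch(boolean breakpointWinsRace) {
        SuspendCountModel vm = new SuspendCountModel("main", "thread0", "Finalizer");
        vm.suspendAll();                       // debugger: vm.suspend()
        Map<String, Integer> first = vm.snapshot();
        vm.resumeSuspendNoneEventSet();        // debugger: eventSet.resume()
        if (breakpointWinsRace) {
            vm.suspendAll();                   // debuggee hits methodForCommunication()
        }
        Map<String, Integer> second = vm.snapshot();
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println("breakpoint after 2nd snapshot, counts match: " + countsMatch(false));
        System.out.println("breakpoint before 2nd snapshot, counts match: " + countsMatch(true));
    }
}
```

In this model the check only passes when the breakpoint arrives after the
second snapshot, which is the ordering both the sleep(100) change and the
explicit debugger-to-debuggee handshake are trying to guarantee.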


From chris.plummer at oracle.com  Tue Jul 24 19:42:41 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 12:42:41 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
Message-ID: <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/baee05f5/attachment.html>

From chris.plummer at oracle.com  Tue Jul 24 20:22:14 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 13:22:14 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
Message-ID: <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/af61a804/attachment-0001.html>

From chris.plummer at oracle.com  Tue Jul 24 20:55:57 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 13:55:57 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
 <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
 <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>
Message-ID: <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/e3705399/attachment.html>

From serguei.spitsyn at oracle.com  Tue Jul 24 20:46:04 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 13:46:04 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
 <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
Message-ID: <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/5a056ce4/attachment-0001.html>

From serguei.spitsyn at oracle.com  Tue Jul 24 22:00:03 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 15:00:03 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
 <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
 <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>
 <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com>
Message-ID: <fb82a78a-c1eb-b0a9-0059-91bdc7a79a6d@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/263376fd/attachment-0001.html>

From chris.plummer at oracle.com  Tue Jul 24 23:23:48 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 16:23:48 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <fb82a78a-c1eb-b0a9-0059-91bdc7a79a6d@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
 <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
 <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>
 <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com>
 <fb82a78a-c1eb-b0a9-0059-91bdc7a79a6d@oracle.com>
Message-ID: <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180724/8f4fd0c0/attachment-0001.html>

From chris.plummer at oracle.com  Tue Jul 24 23:46:19 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 24 Jul 2018 16:46:19 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <5B577DDC.3000500@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
 <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
 <5B507F2C.4080503@oracle.com>
 <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com>
 <5B577DDC.3000500@oracle.com>
Message-ID: <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com>

Hi Gary,

It looks like that should work fine.

thanks,

Chris
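
The prototype quoted below polls a static field and notes that timeout
checking is still missing. A minimal sketch of a bounded poll, assuming a
field named testCase as in the prototype; BoundedPoll, waitForTestCase and
the timeout values are hypothetical, and a plain Java thread stands in for
the debugger's setValue() call:

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a poll loop with a deadline; not the actual test code. */
public class BoundedPoll {
    // Stand-in for the debuggee's static field. It is volatile here because
    // in this sketch another Java thread writes it.
    public static volatile int testCase = -1;

    /**
     * Waits until testCase >= expected or the timeout elapses.
     * Returns true if the value was seen in time, false on timeout or interrupt.
     */
    public static boolean waitForTestCase(int expected, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (testCase < expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false;               // give up instead of hanging the test
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the debugger completing test case 0 a little later.
        Thread debuggerStandIn = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                return;
            }
            testCase = 0;
        });
        debuggerStandIn.start();
        System.out.println("signalled in time: " + waitForTestCase(0, 5000));
        debuggerStandIn.join();
    }
}
```

On a false return the debuggee could log an error and set its exit code to
FAILED rather than proceed to methodForCommunication().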

On 7/24/18 12:28 PM, Gary Adams wrote:
> Here's a quick prototype to add a variable to the debuggee.
> The debugger sets it at the end of each completed test case.
>
> The debuggee can then check for the value change to delay
> hitting the breakpoint which interfered with suspend count checks.
>
> Would need to add a bit more error and timeout checking to
> complete the fix. Should also check if the other resume008 test cases
> need similar synchronization. Could possibly migrate the code up to
> TestDebuggerType1 if other tests also needed this generic capability.
>
>
> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
> @@ -63,6 +63,9 @@
>   *   to be resulting in the event.
>   * - Upon getting new event, the debugger
>   *   performs the check corresponding to the event.
> + * - The debugger informs the debuggee when it completes
> + *   each test case, so it will wait before hitting
> + *   communication breakpoints.
>   */
>
>  public class resume008 extends TestDebuggerType1 {
> @@ -234,6 +237,7 @@
>
>                       default: throw new Failure("** default case 1 **");
>                  }
> +                informDebuggeeTestCase(i);
>              }
>
>              display("......--> vm.resume()");
> @@ -255,4 +259,25 @@
>          }
>      }
>
> +    /**
> +     * Inform debuggee which thread test the debugger has completed.
> +     * Used for synchronization, so the debuggee does not move too quickly.
> +     * @param testCase index of just completed test
> +     */
> +    void informDebuggeeTestCase(int testCase) {
> +        if (!EventHandler.isDisconnected() && debuggeeClass != null) {
> +            try {
> +                ((ClassType)debuggeeClass)
> +                    .setValue(debuggeeClass.fieldByName("testCase"),
> +                              vm.mirrorOf(testCase));
> +            } catch (InvalidTypeException ite) {
> +                // ignored
> +            } catch (ClassNotLoadedException cnle) {
> +                // ignored
> +            } catch (VMDisconnectedException e) {
> +                // ignored
> +            }
> +        }
> +    }
> +
> +}
>
>
> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
> @@ -62,6 +62,7 @@
>
>      static int exitCode = PASSED;
>
> +    static int testCase = -1;
>      static int instruction = 1;
>      static int end         = 0;
>                                     //    static int quit        = 0;
> @@ -104,6 +105,15 @@
>                              threadStart(thread0);
>
>                              thread1 = new Threadresume008a("thread1");
> +                            // Wait for debugger to complete the first test case
> +                            // before advancing to the next breakpoint
> +                            while (testCase < 0) {
> +                                try {
> +                                    Thread.sleep(100);
> +                                } catch (InterruptedException e) {
> +                                    // ignored
> +                                }
> +                            }
>                              methodForCommunication();
>                              break;
>
>
> On 7/20/18, 2:37 PM, Chris Plummer wrote:
>> Hi Gary,
>>
>> The test fails if the breakpoint event comes in after the test 
>> captures the initial thread suspend counts and before the test 
>> captures the 2nd suspend counts.
>>
>> debugger>         getting : Map<String, Integer> suspendsCounts1
>> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal 
>> Dispatcher=1, Finalizer=1}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map<String, Integer> suspendsCounts2
>> EventHandler> Received event set with policy = SUSPEND_ALL
>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>> nsk.jdi.EventSet.resume.resume008a:60 (enabled)
>> debugger> Received communication breakpoint event.
>> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal 
>> Dispatcher=2, Finalizer=2}
>>
>> So we end up with some threads starting with 1 suspend and ending 
>> with 2 (not clear to me why main is still at 1).
>>
>> It will pass if the breakpoint comes in after it does both of suspend 
>> count checks, as you have shown with the sleep(100) solution. Output 
>> looks like this:
>>
>> debugger>        got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest1
>> ...
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map<String, Integer> suspendsCounts1
>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map<String, Integer> suspendsCounts2
>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
>> Signal Dispatcher=1, Finalizer=1}
>> ...
>> debugger> Received communication breakpoint event.
>>
>> I've also shown that it passes if the breakpoint always comes in 
>> before capturing the initial suspend counts. I added a sleep on the 
>> debugger side right after eventHandler.waitForRequestedEventSet() 
>> returns. Output looks like:
>>
>> debugger> Received communication breakpoint event.
>> debugger>        got new ThreadStartEvent with propety 'number' == 
>> ThreadStartRequest1
>> ...
>> debugger> ......--> vm.suspend();
>> debugger>         getting : Map<String, Integer> suspendsCounts1
>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>> Signal Dispatcher=2, Finalizer=2}
>> debugger>         eventSet.resume;
>> debugger>         getting : Map<String, Integer> suspendsCounts2
>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>> Signal Dispatcher=2, Finalizer=2}
>>
>> I think we should add synchronization to force one of these two 
>> outcomes. For the first, you would need to make the debugger modify 
>> some variable that the debuggee is watching (sitting in a loop 
>> waiting for it to change). For the second, you can rely on the 
>> existing methodForCommunication() approach. You just need to 
>> restructure the debugger a bit. I had started down this path late 
>> Wednesday, but got sidetracked by a few other things. I can look into 
>> it some more if you'd like.
>>
>> thanks,
>>
>> Chris 


From ekaterina.pavlova at oracle.com  Tue Jul 24 22:10:56 2018
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Tue, 24 Jul 2018 15:10:56 -0700
Subject: [11] RFR(XS): 8195156 [Graal]
 serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with
 Graal in Xcomp mode
Message-ID: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com>

Hi All,

serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal because
two more modules, jdk.proxy1 and jdk.proxy2, are dynamically initialized by Graal code.
These modules are not part of the boot modules, and as a result the check fails.
It was agreed with the Serviceability team to filter these modules out.
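
A minimal sketch of the kind of filtering described above, for illustration only; it is
not the code from the webrev. The class and method names (ProxyModuleFilter,
isDynamicProxyModule, withoutProxyModules) are hypothetical, and it assumes the
dynamically created modules are always named jdk.proxy followed by a number.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper, not the code from the webrev.
final class ProxyModuleFilter {

    // Assumption: dynamic proxy modules are named jdk.proxy1, jdk.proxy2, ...
    static boolean isDynamicProxyModule(String moduleName) {
        return moduleName != null && moduleName.matches("jdk\\.proxy\\d+");
    }

    // Returns the reported module names with the dynamic proxy modules removed.
    static Set<String> withoutProxyModules(Set<String> reportedNames) {
        return reportedNames.stream()
                .filter(name -> !isDynamicProxyModule(name))
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Names of the modules in the boot layer (all of them are named modules).
        Set<String> bootNames = ModuleLayer.boot().modules().stream()
                .map(Module::getName)
                .collect(Collectors.toCollection(TreeSet::new));

        // Stand-in for what JVMTI GetAllModules might report under Graal.
        Set<String> reported = new TreeSet<>(bootNames);
        reported.add("jdk.proxy1");
        reported.add("jdk.proxy2");

        // After filtering, the reported set can be compared with the boot set.
        System.out.println(withoutProxyModules(reported).equals(bootNames));
    }
}
```

Matching on the name pattern keeps the comparison against the boot modules strict for
every other module, so an unexpected extra module would still fail the check.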

Please review the fix.

     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156
  webrev: http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html
testing: tested by running serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with Graal and -Xcomp


Thanks,
-katya

p.s.
  Igor Ignatyev volunteered to sponsor this change.


From rahul.v.raghavan at oracle.com  Thu Jul 19 07:48:26 2018
From: rahul.v.raghavan at oracle.com (Rahul Raghavan)
Date: Thu, 19 Jul 2018 13:18:26 +0530
Subject: RFR (S) 8207252: C1 still does eden allocations when TLAB is
 enabled
In-Reply-To: <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
References: <CAF9BGBzAVdC3JZCu2945n8JXo0usocBxVxYzdfdBU_gWOTe8wg@mail.gmail.com>
 <9216B675-A5EB-4D45-8F16-D48B83BE93C9@oracle.com>
 <CAF9BGBx7TnVo5vmNx75TUhd1VSeow6KsgcSk2ztKdibKuzv9fw@mail.gmail.com>
 <497549C6-6A68-4407-BA8A-3F001A47EDD0@oracle.com>
 <CAF9BGBxLchBQ3g1uwD=34E+QR5ARO7DqBfxZpFU2TRWTSaTajQ@mail.gmail.com>
 <CAF9BGBzGTsbna1nAh-iQOE-GXqfS6M5wGSBwWEGQX1ADd492JA@mail.gmail.com>
 <CAF9BGBz+SNUffJWNgkvTTK6jB7-8yf96HGSroxvY13FT80XPCg@mail.gmail.com>
Message-ID: <de54ede9-e29f-5e58-8d7f-6ad3c74d558c@oracle.com>

RFR (S) 8207252: C1 still does eden allocations when TLAB is enabled

(just adding + hotspot-compiler-dev also)


On Wednesday 18 July 2018 09:51 PM, JC Beyler wrote:
Subject Was:
Re: RFR (S): C1 still does eden allocations when TLAB is enabled

+ serviceability-dev

Hi all,

Could anyone else give me a review of this webrev and check/test the
various architecture changes?

http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/


Thanks for all your help!
Jc


> On Mon, Jul 16, 2018 at 2:58 PM JC Beyler <jcbeyler at google.com> wrote:
> 
>> Hi all,
>>
>> Here is a webrev that does all the architectures in the same way:
>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.02/
>>
>> Could anyone review the other architectures and test?
>>    - arm, sparc & aarch64 are also modified now to follow the same "if no
>> tlab, then consider eden space allocation" logic.
>>
>> Thanks for your help!
>> Jc
>>
>> On Fri, Jul 13, 2018 at 9:16 PM JC Beyler <jcbeyler at google.com> wrote:
>>
>>> Hi Kim,
>>>
>>> I opened this bug
>>> https://bugs.openjdk.java.net/browse/JDK-8190862
>>>
>>> and now I've done an update:
>>> http://cr.openjdk.java.net/~jcbeyler/8207252/webrev.01/
>>>
>>> I basically have done your nits but also removed the try_eden (it was
>>> used to bind a label but was not used). I updated the comments to use the
>>> one you preferred.
>>>
>>> I still have to do the other architectures though but at least we seem to
>>> have a consensus on this architecture, correct?
>>>
>>> Thanks for the review,
>>> Jc
>>>
>>> On Fri, Jul 13, 2018 at 5:17 PM Kim Barrett <kim.barrett at oracle.com>
>>> wrote:
>>>
>>>>> On Jul 13, 2018, at 4:54 PM, JC Beyler <jcbeyler at google.com> wrote:
>>>>>
>>>>> Yes, you are right, I did those changes due to:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8194084
>>>>>
>>>>> If Robbin agrees to this change, and if no one sees an issue, I'll go
>>>> ahead
>>>>> and propagate the change across architectures.
>>>>>
>>>>> Thanks for the review, I'll wait for Robbin (or anyone else's comment
>>>> and
>>>>> review) :)
>>>>> Jc
>>>>>
>>>>> On Fri, Jul 13, 2018 at 1:08 PM John Rose <john.r.rose at oracle.com>
>>>> wrote:
>>>>>
>>>>>> On Jul 13, 2018, at 10:23 AM, JC Beyler <jcbeyler at google.com> wrote:
>>>>>>
>>>>>>
>>>>>> I'm not sure if we had left this case intentionally or not but, if we
>>>> want
>>>>>> it all to be consistent, we should perhaps fix it.
>>>>>>
>>>>>>
>>>>>> Well, you put in that logic last February, so unless somebody speaks
>>>> up
>>>>>> quickly, I support your adjusting it to be the way you want it.
>>>>>>
>>>>>> Doing "hg grep -u supports_inline_contig_alloc -I src/hotspot/share"
>>>>>> suggests that the GC group is most active in touching this feature.
>>>>>> If Robbin is OK with it, there's your reviewer.
>>>>>>
>>>>>> FWIW, you can use me as a reviewer, but I'd get one other person
>>>>>> working on the GC to OK it.
>>>>>>
>>>>>> -- John
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Thanks,
>>>>> Jc
>>>>
>>>> Robbin is on vacation; you might not hear from him for a while.
>>>>
>>>> I'm assuming you'll open a new bug for this?
>>>>
>>>> Except for a few minor nits (below), this looks okay to me.
>>>>
>>>> The comment at line 1052 needs updating.
>>>>
>>>> pre-existing: The retry_tlab label declared on line 1054 is unused.
>>>>
>>>> pre-existing: The try_eden label declared on line 1054 is bound at
>>>> line 1058, but unreferenced.
>>>>
>>>> I like the wording of the comment at 1139 better than the wording at
>>>> 1016.
>>>>
>>>>
>>>
>>> --
>>>
>>> Thanks,
>>> Jc
>>>
>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
> 
> 
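
The "if no tlab, then consider eden space allocation" rule quoted above can be
sketched as a small decision function. This is an illustration in Java, not the
HotSpot C1 code from the webrev; the class, enum and parameter names are
hypothetical, and only supports_inline_contig_alloc is a name taken from the thread.

```java
// Illustrative model of the allocation-path choice discussed in the thread.
// All names here are hypothetical; this is not HotSpot code.
final class C1AllocationPolicy {

    enum Path {
        TLAB_THEN_SLOW_PATH,   // try the thread-local buffer, then call the runtime
        EDEN_THEN_SLOW_PATH,   // try inline eden allocation, then call the runtime
        SLOW_PATH_ONLY         // always call the runtime
    }

    // useTlab models the UseTLAB flag; supportsInlineContigAlloc models
    // the heap's supports_inline_contig_alloc() answer.
    static Path choose(boolean useTlab, boolean supportsInlineContigAlloc) {
        if (useTlab) {
            // With TLABs enabled, eden is never tried directly.
            return Path.TLAB_THEN_SLOW_PATH;
        }
        if (supportsInlineContigAlloc) {
            // Only without TLABs is inline eden allocation considered.
            return Path.EDEN_THEN_SLOW_PATH;
        }
        return Path.SLOW_PATH_ONLY;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, true));
        System.out.println(choose(false, true));
        System.out.println(choose(false, false));
    }
}
```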

From serguei.spitsyn at oracle.com  Wed Jul 25 00:51:25 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 17:51:25 -0700
Subject: [11] RFR(XS): 8195156 [Graal]
 serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with
 Graal in Xcomp mode
In-Reply-To: <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com>
References: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com>
 <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com>
Message-ID: <99f46eb4-dd71-2f4d-4aae-b0f1aa9fd62b@oracle.com>

I forgot to mention that the copyright comment needs a year update.
No need for a new webrev.

Thanks,
Serguei

On 7/24/18 17:48, serguei.spitsyn at oracle.com wrote:
> Hi Katya,
>
> Nice simple fix.
> Thank you for taking care about it!
>
> Thanks,
> Serguei
>
> On 7/24/18 15:10, Ekaterina Pavlova wrote:
>> Hi All,
>>
>> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails 
>> with Graal because
>> two more modules jdk.proxy1 and jdk.proxy2 are dynamically 
>> initialized by Graal code.
>> These modules are not part of boot modules and as results the check 
>> fails.
>> It was agreed with Serviceability team to filter these modules out.
>>
>> Please review the fix.
>>
>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156
>>  webrev: http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html
>> testing: tested by running 
>> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with 
>> Graal and -Xcomp
>>
>>
>> Thanks,
>> -katya
>>
>> p.s.
>>  Igor Ignatyev volunteered to sponsor this change.
>>
>


From serguei.spitsyn at oracle.com  Wed Jul 25 01:01:44 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 18:01:44 -0700
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
 <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
 <5B507F2C.4080503@oracle.com>
 <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com>
 <5B577DDC.3000500@oracle.com>
 <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com>
Message-ID: <31f065d3-c178-fe6d-95c2-86096cf9e5ea@oracle.com>

Hi Gary,

+1

Thanks,
Serguei


On 7/24/18 16:46, Chris Plummer wrote:
> Hi Gary,
>
> It looks like that should work fine.
>
> thanks,
>
> Chris
>
> On 7/24/18 12:28 PM, Gary Adams wrote:
>> Here's a quick prototype to add a variable to the debuggee.
>> The debugger sets it at the end of each completed test case.
>>
>> The debuggee can then check for the value change to delay
>> hitting the breakpoint which interfered with suspend count checks.
>>
>> Would need to add a bit more error and timeout checking to
>> complete the fix. Should also check if the other resume008 test cases
>> need similar synchronization. Could possibly migrate the code up to
>> TestDebuggerType1 if other tests also needed this generic capability.
>>
>>
>> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
>> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
>> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
>> @@ -63,6 +63,9 @@
>>   *   to be resulting in the event.
>>   * - Upon getting new event, the debugger
>>   *   performs the check corresponding to the event.
>> + * - The debugger informs the debuggee when it completes
>> + *   each test case, so it will wait before hitting
>> + *   communication breakpoints.
>>   */
>>
>>  public class resume008 extends TestDebuggerType1 {
>> @@ -234,6 +237,7 @@
>>
>>                       default: throw new Failure("** default case 1 **");
>>                  }
>> +                informDebuggeeTestCase(i);
>>              }
>>
>>              display("......--> vm.resume()");
>> @@ -255,4 +259,25 @@
>>          }
>>      }
>>
>> +    /**
>> +     * Inform debuggee which thread test the debugger has completed.
>> +     * Used for synchronization, so the debuggee does not move too quickly.
>> +     * @param testCase index of just completed test
>> +     */
>> +    void informDebuggeeTestCase(int testCase) {
>> +        if (!EventHandler.isDisconnected() && debuggeeClass != null) {
>> +            try {
>> +                ((ClassType)debuggeeClass)
>> +                    .setValue(debuggeeClass.fieldByName("testCase"),
>> +                              vm.mirrorOf(testCase));
>> +            } catch (InvalidTypeException ite) {
>> +                // ignored
>> +            } catch (ClassNotLoadedException cnle) {
>> +                // ignored
>> +            } catch (VMDisconnectedException e) {
>> +                // ignored
>>  }
>> +        }
>> +    }
>> +
>> +}
>>
>>
>> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
>> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
>> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
>> @@ -62,6 +62,7 @@
>>
>>      static int exitCode = PASSED;
>>
>> +    static int testCase = -1;
>>      static int instruction = 1;
>>      static int end         = 0;
>>                                      //    static int quit = 0;
>> @@ -104,6 +105,15 @@
>>                              threadStart(thread0);
>>
>>                              thread1 = new Threadresume008a("thread1");
>> +                            // Wait for debugger to complete the first test case
>> +                            // before advancing to the next breakpoint
>> +                            while (testCase < 0) {
>> +                                try {
>> +                                    Thread.sleep(100);
>> +                                } catch (InterruptedException e) {
>> +                                    // ignored
>> +                                }
>> +                            }
>>                              methodForCommunication();
>>                              break;
>>
>>
>> On 7/20/18, 2:37 PM, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> The test fails if the breakpoint event comes in after the test 
>>> captures the initial thread suspend counts and before the test 
>>> captures the 2nd suspend counts.
>>>
>>> debugger>         getting : Map<String, Integer> suspendsCounts1
>>> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal 
>>> Dispatcher=1, Finalizer=1}
>>> debugger>         eventSet.resume;
>>> debugger>         getting : Map<String, Integer> suspendsCounts2
>>> EventHandler> Received event set with policy = SUSPEND_ALL
>>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>>> nsk.jdi.EventSet.resume.resume008a:60 (enabled)
>>> debugger> Received communication breakpoint event.
>>> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal 
>>> Dispatcher=2, Finalizer=2}
>>>
>>> So we end up with some threads starting with 1 suspend and ending 
>>> with 2 (not clear to me why main is still at 1).
>>>
>>> It will pass if the breakpoint comes in after it does both of 
>>> suspend count checks, as you have shown with the sleep(100) 
>>> solution. Output looks like this:
>>>
>>> debugger>        got new ThreadStartEvent with propety 'number' == 
>>> ThreadStartRequest1
>>> ...
>>> debugger> ......--> vm.suspend();
>>> debugger>         getting : Map<String, Integer> suspendsCounts1
>>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
>>> Signal Dispatcher=1, Finalizer=1}
>>> debugger>         eventSet.resume;
>>> debugger>         getting : Map<String, Integer> suspendsCounts2
>>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
>>> Signal Dispatcher=1, Finalizer=1}
>>> ...
>>> debugger> Received communication breakpoint event.
>>>
>>> I've also shown that it passes if the breakpoint always comes in 
>>> before capturing the initial suspend counts. I added a sleep on the 
>>> debugger side right after eventHandler.waitForRequestedEventSet() 
>>> returns. Output looks like:
>>>
>>> debugger> Received communication breakpoint event.
>>> debugger>        got new ThreadStartEvent with propety 'number' == 
>>> ThreadStartRequest1
>>> ...
>>> debugger> ......--> vm.suspend();
>>> debugger>         getting : Map<String, Integer> suspendsCounts1
>>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>>> Signal Dispatcher=2, Finalizer=2}
>>> debugger>         eventSet.resume;
>>> debugger>         getting : Map<String, Integer> suspendsCounts2
>>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>>> Signal Dispatcher=2, Finalizer=2}
>>>
>>> I think we should add synchronization to force one of these two 
>>> outcomes. For the first, you would need to make the debugger modify 
>>> some variable that the debuggee is watching (sitting in a loop 
>>> waiting for it to change). For the second, you can rely on the 
>>> existing methodForCommunication() approach. You just need to 
>>> restructure the debugger a bit. I had started down this path late 
>>> Wednesday, but got sidetracked by a few other things. I can look 
>>> into it some more if you'd like.
>>>
>>> thanks,
>>>
>>> Chris 
>


From serguei.spitsyn at oracle.com  Wed Jul 25 00:48:58 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 24 Jul 2018 17:48:58 -0700
Subject: [11] RFR(XS): 8195156 [Graal]
 serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with
 Graal in Xcomp mode
In-Reply-To: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com>
References: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com>
Message-ID: <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com>

Hi Katya,

Nice simple fix.
Thank you for taking care of it!

Thanks,
Serguei

On 7/24/18 15:10, Ekaterina Pavlova wrote:
> Hi All,
>
> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails 
> with Graal because
> two more modules jdk.proxy1 and jdk.proxy2 are dynamically initialized 
> by Graal code.
> These modules are not part of boot modules and as results the check 
> fails.
> It was agreed with Serviceability team to filter these modules out.
>
> Please review the fix.
>
>     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156
>  webrev: http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html
> testing: tested by running 
> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with 
> Graal and -Xcomp
>
>
> Thanks,
> -katya
>
> p.s.
>  Igor Ignatyev volunteered to sponsor this change.
>


From ekaterina.pavlova at oracle.com  Wed Jul 25 02:08:22 2018
From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova)
Date: Tue, 24 Jul 2018 19:08:22 -0700
Subject: [11] RFR(XS): 8195156 [Graal]
 serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with
 Graal in Xcomp mode
In-Reply-To: <99f46eb4-dd71-2f4d-4aae-b0f1aa9fd62b@oracle.com>
References: <82547f4a-7bd8-447f-4aa8-af83ed156823@oracle.com>
 <18ba24bb-0498-2d35-45b3-fdad9e6c8c5d@oracle.com>
 <99f46eb4-dd71-2f4d-4aae-b0f1aa9fd62b@oracle.com>
Message-ID: <72710916-021f-6dd9-7f05-b1b1b410734d@oracle.com>

Vladimir, Serguei, thanks for your reviews!
I fixed the copyright year.

-katya

On 7/24/18 5:51 PM, serguei.spitsyn at oracle.com wrote:
> Forgot to tell that a copyright comment needs a year update.
> No need in new webrev.
> 
> Thanks,
> Serguei
> 
> On 7/24/18 17:48, serguei.spitsyn at oracle.com wrote:
>> Hi Katya,
>>
>> Nice simple fix.
>> Thank you for taking care about it!
>>
>> Thanks,
>> Serguei
>>
>> On 7/24/18 15:10, Ekaterina Pavlova wrote:
>>> Hi All,
>>>
>>> serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java fails with Graal because
>>> two more modules jdk.proxy1 and jdk.proxy2 are dynamically initialized by Graal code.
>>> These modules are not part of boot modules and as results the check fails.
>>> It was agreed with Serviceability team to filter these modules out.
>>>
>>> Please review the fix.
>>>
>>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8195156
>>>  webrev: http://cr.openjdk.java.net/~epavlova//8195156/webrev.00/index.html
>>> testing: tested by running serviceability/jvmti/GetModulesInfo/JvmtiGetAllModulesTest.java with Graal and -Xcomp
>>>
>>>
>>> Thanks,
>>> -katya
>>>
>>> p.s.
>>>  Igor Ignatyev volunteered to sponsor this change.
>>>
>>
> 


From fairoz.matte at oracle.com  Wed Jul 25 08:23:48 2018
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Wed, 25 Jul 2018 01:23:48 -0700 (PDT)
Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException:
 Can't assign double[][][] to double[][][]
Message-ID: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default>

Hi,

Kindly review the backport of "JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]" to 8u

Webrev - http://cr.openjdk.java.net/~fmatte/8191948/webrev.00/    

JDK 11 bug - https://bugs.openjdk.java.net/browse/JDK-8191948 

JDK 11 changeset - http://hg.openjdk.java.net/jdk/jdk11/rev/73c769e0486a 

Review thread - http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024405.html   

Thanks,
Fairoz

From gary.adams at oracle.com  Wed Jul 25 12:56:40 2018
From: gary.adams at oracle.com (gary.adams at oracle.com)
Date: Wed, 25 Jul 2018 08:56:40 -0400
Subject: RFR JDK-8170089: nsk/jdi/EventSet/resume/resume008: ERROR:
 suspendCounts don't match for : Common-Cleaner
In-Reply-To: <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com>
References: <5B4E0C62.3020808@oracle.com>
 <5122a4fc-f5b7-3151-d550-ad2e8930ffb7@oracle.com> <5B4F2A04.20409@oracle.com>
 <4aeb7167-47a5-8a69-b326-010993bdfc14@oracle.com>
 <5B4F98BF.1060602@oracle.com>
 <1aab79c7-a927-1e87-834c-171ed7122a21@oracle.com>
 <c309dffe-f935-60ce-ce4b-5c99cd01406b@oracle.com>
 <5B507F2C.4080503@oracle.com>
 <2bd15cea-86b5-1af0-ef53-2c82a4685224@oracle.com>
 <5B577DDC.3000500@oracle.com>
 <00dd4347-7869-b566-b7f6-d8b897e73c80@oracle.com>
Message-ID: <8f5e2612-f348-012a-e4d8-9f3c4a082b8d@oracle.com>

During some longer testing runs I noticed similar failures for resume002,
resume003 and resume006. I'll spend a few more cycles to see if a more
general purpose solution could be shared across these tests.
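
As a rough illustration of what a shared helper might look like, here is a minimal
sketch of a debuggee-side wait that is bounded by a timeout, which the prototype
quoted below does not yet have. It is not part of the posted patch: the class name
DebuggeeSync, the method names, and the volatile modifier are assumptions made for
the example. In the real test the debugger updates the field through JDI.

```java
import java.util.function.BooleanSupplier;

// Hypothetical shared helper; not part of the posted prototype.
final class DebuggeeSync {

    // Index of the last test case the debugger reports as completed.
    // Assumption: volatile, so a write from another thread is seen promptly.
    static volatile int testCase = -1;

    // Polls the condition until it holds or the timeout expires.
    // Returns true if the condition held, false on timeout.
    static boolean waitUntil(BooleanSupplier condition,
                             long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;               // timed out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop waiting.
                Thread.currentThread().interrupt();
                return condition.getAsBoolean();
            }
        }
        return true;
    }

    // Waits until the debugger has completed at least the given test case.
    static boolean waitForTestCase(int expected, long timeoutMillis) {
        return waitUntil(() -> testCase >= expected, timeoutMillis, 100);
    }
}
```

A debuggee could call waitForTestCase(0, timeout) before methodForCommunication()
and fail the test when it returns false, instead of looping forever.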

On 7/24/18 7:46 PM, Chris Plummer wrote:
> Hi Gary,
>
> It looks like that should work fine.
>
> thanks,
>
> Chris
>
> On 7/24/18 12:28 PM, Gary Adams wrote:
>> Here's a quick prototype to add a variable to the debuggee.
>> The debugger sets it at the end of each completed test case.
>>
>> The debuggee can then check for the value change to delay
>> hitting the breakpoint which interfered with suspend count checks.
>>
>> Would need to add a bit more error and timeout checking to
>> complete the fix. Should also check if the other resume008 test cases
>> need similar synchronization. Could possibly migrate the code up to
>> TestDebuggerType1 if other tests also needed this generic capability.
>>
>>
>> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
>> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
>> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008.java
>> @@ -63,6 +63,9 @@
>>   *   to be resulting in the event.
>>   * - Upon getting new event, the debugger
>>   *   performs the check corresponding to the event.
>> + * - The debugger informs the debuggee when it completes
>> + *   each test case, so it will wait before hitting
>> + *   communication breakpoints.
>>   */
>>
>>  public class resume008 extends TestDebuggerType1 {
>> @@ -234,6 +237,7 @@
>>
>>                       default: throw new Failure("** default case 1 **");
>>                  }
>> +                informDebuggeeTestCase(i);
>>              }
>>
>>              display("......--> vm.resume()");
>> @@ -255,4 +259,25 @@
>>          }
>>      }
>>
>> +    /**
>> +     * Inform debuggee which thread test the debugger has completed.
>> +     * Used for synchronization, so the debuggee does not move too quickly.
>> +     * @param testCase index of just completed test
>> +     */
>> +    void informDebuggeeTestCase(int testCase) {
>> +        if (!EventHandler.isDisconnected() && debuggeeClass != null) {
>> +            try {
>> +                ((ClassType)debuggeeClass)
>> +                    .setValue(debuggeeClass.fieldByName("testCase"),
>> +                              vm.mirrorOf(testCase));
>> +            } catch (InvalidTypeException ite) {
>> +                // ignored
>> +            } catch (ClassNotLoadedException cnle) {
>> +                // ignored
>> +            } catch (VMDisconnectedException e) {
>> +                // ignored
>>  }
>> +        }
>> +    }
>> +
>> +}
>>
>>
>> diff --git a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
>> --- a/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
>> +++ b/test/hotspot/jtreg/vmTestbase/nsk/jdi/EventSet/resume/resume008a.java
>> @@ -62,6 +62,7 @@
>>
>>      static int exitCode = PASSED;
>>
>> +    static int testCase = -1;
>>      static int instruction = 1;
>>      static int end         = 0;
>>                                      //    static int quit = 0;
>> @@ -104,6 +105,15 @@
>>                              threadStart(thread0);
>>
>>                              thread1 = new Threadresume008a("thread1");
>> +                            // Wait for debugger to complete the first test case
>> +                            // before advancing to the next breakpoint
>> +                            while (testCase < 0) {
>> +                                try {
>> +                                    Thread.sleep(100);
>> +                                } catch (InterruptedException e) {
>> +                                    // ignored
>> +                                }
>> +                            }
>>                              methodForCommunication();
>>                              break;
>>
>>
>> On 7/20/18, 2:37 PM, Chris Plummer wrote:
>>> Hi Gary,
>>>
>>> The test fails if the breakpoint event comes in after the test 
>>> captures the initial thread suspend counts and before the test 
>>> captures the 2nd suspend counts.
>>>
>>> debugger>         getting : Map<String, Integer> suspendsCounts1
>>> debugger> {Reference Handler=1, Common-Cleaner=1, main=1, Signal 
>>> Dispatcher=1, Finalizer=1}
>>> debugger>         eventSet.resume;
>>> debugger>         getting : Map<String, Integer> suspendsCounts2
>>> EventHandler> Received event set with policy = SUSPEND_ALL
>>> EventHandler> Event: BreakpointEventImpl req breakpoint request 
>>> nsk.jdi.EventSet.resume.resume008a:60 (enabled)
>>> debugger> Received communication breakpoint event.
>>> debugger> {Reference Handler=2, Common-Cleaner=2, main=1, Signal 
>>> Dispatcher=2, Finalizer=2}
>>>
>>> So we end up with some threads starting with 1 suspend and ending 
>>> with 2 (not clear to me why main is still at 1).
>>>
>>> It will pass if the breakpoint comes in after it does both of 
>>> suspend count checks, as you have shown with the sleep(100) 
>>> solution. Output looks like this:
>>>
>>> debugger>        got new ThreadStartEvent with propety 'number' == 
>>> ThreadStartRequest1
>>> ...
>>> debugger> ......--> vm.suspend();
>>> debugger>         getting : Map<String, Integer> suspendsCounts1
>>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
>>> Signal Dispatcher=1, Finalizer=1}
>>> debugger>         eventSet.resume;
>>> debugger>         getting : Map<String, Integer> suspendsCounts2
>>> debugger> {Reference Handler=1, thread0=1, Common-Cleaner=1, main=1, 
>>> Signal Dispatcher=1, Finalizer=1}
>>> ...
>>> debugger> Received communication breakpoint event.
>>>
>>> I've also shown that it passes if the breakpoint always comes in 
>>> before capturing the initial suspend counts. I added a sleep on the 
>>> debugger side right after eventHandler.waitForRequestedEventSet() 
>>> returns. Output looks like:
>>>
>>> debugger> Received communication breakpoint event.
>>> debugger>        got new ThreadStartEvent with propety 'number' == 
>>> ThreadStartRequest1
>>> ...
>>> debugger> ......--> vm.suspend();
>>> debugger>         getting : Map<String, Integer> suspendsCounts1
>>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>>> Signal Dispatcher=2, Finalizer=2}
>>> debugger>         eventSet.resume;
>>> debugger>         getting : Map<String, Integer> suspendsCounts2
>>> debugger> {Reference Handler=2, thread0=2, Common-Cleaner=2, main=2, 
>>> Signal Dispatcher=2, Finalizer=2}
>>>
>>> I think we should add synchronization to force one of these two 
>>> outcomes. For the first, you would need to make the debugger modify 
>>> some variable that the debuggee is watching (sitting in a loop 
>>> waiting for it to change). For the second, you can rely on the 
>>> existing methodForCommunication() approach. You just need to 
>>> restructure the debugger a bit. I had started down this path late 
>>> Wednesday, but got sidetracked by a few other things. I can look 
>>> into it some more if you'd like.
>>>
>>> thanks,
>>>
>>> Chris 
>


From jcbeyler at google.com  Wed Jul 25 14:20:11 2018
From: jcbeyler at google.com (JC Beyler)
Date: Wed, 25 Jul 2018 07:20:11 -0700
Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java fails
Message-ID: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>

Hi all,

There seems to be an intermittent failure with the
HeapMonitorInterpreterArrayTest. I believe it is due to the possibility of
a huge interval being chosen at the end of the test and GC arriving before
checking the samples.

This fix should help alleviate it by reducing the interval to 100k and also
checking the garbage-collected objects. Could someone review it, please?

Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208059

As always, thanks for your help!
Jc

From daniel.daugherty at oracle.com  Wed Jul 25 14:42:37 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 10:42:37 -0400
Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java
 fails
In-Reply-To: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>
References: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>
Message-ID: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>

On 7/25/18 10:20 AM, JC Beyler wrote:
> Hi all,
>
> There seems to be an intermittent failure with the 
> HeapMonitorInterpreterArrayTest. I believe it is due to the 
> possibility of a huge interval being chosen at the end of the test and 
> GC arriving before checking the samples.
>
> This fix should help alleviate it by reducing the interval to 100k and 
> also checking the garbage collected objects, could someone review it 
> please?
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/

test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java
    No comments.

Thumbs up. We'll need this fix in both JDK11 and JDK12.

Dan


> Bug: https://bugs.openjdk.java.net/browse/JDK-8208059
>
> As always, thanks for your help!
> Jc


From chris.plummer at oracle.com  Wed Jul 25 17:31:17 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Jul 2018 10:31:17 -0700
Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException:
 Can't assign double[][][] to double[][][]
In-Reply-To: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default>
References: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default>
Message-ID: <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com>

Hi Fairoz,

The changes look good. I'm not sure what the policy is when part of the 
(full) backport contains test changes that aren't directly applicable to 
8u. You might need some sort of noreg label on the backport CR.

thanks,

Chris

On 7/25/18 1:23 AM, Fairoz Matte wrote:
> Hi,
>
> Kindly review the backport of "JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]" to 8u
>
> Webrev - http://cr.openjdk.java.net/~fmatte/8191948/webrev.00/
>
> JDK 11 bug - https://bugs.openjdk.java.net/browse/JDK-8191948
>
> JDK 11 changeset - http://hg.openjdk.java.net/jdk/jdk11/rev/73c769e0486a
>
> Review thread - http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024405.html
>
> Thanks,
> Fairoz


From jcbeyler at google.com  Wed Jul 25 17:34:54 2018
From: jcbeyler at google.com (JC Beyler)
Date: Wed, 25 Jul 2018 10:34:54 -0700
Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java
 fails
In-Reply-To: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>
References: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>
 <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>
Message-ID: <CAF9BGBxJFHT0BA_YwheTd4MrQifyPVoYvPPa43Pxt2gqmVQ=nw@mail.gmail.com>

Thanks for your help Daniel,

Could I get a second review and I'll prepare an updated webrev :)
Jc

On Wed, Jul 25, 2018 at 7:42 AM Daniel D. Daugherty <
daniel.daugherty at oracle.com> wrote:

> On 7/25/18 10:20 AM, JC Beyler wrote:
>
> Hi all,
>
> There seems to be an intermittent failure with the
> HeapMonitorInterpreterArrayTest. I believe it is due to the possibility of
> a huge interval being chosen at the end of the test and GC arriving before
> checking the samples.
>
> This fix should help alleviate it by reducing the interval to 100k and
> also checking the garbage collected objects, could someone review it please?
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/
>
>
>
> test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java
>     No comments.
>
> Thumbs up. We'll need this fix in both JDK11 and JDK12.
>
> Dan
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208059
>
> As always, thanks for your help!
> Jc
>
>
>

-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Wed Jul 25 17:37:13 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 10:37:13 -0700
Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java
 fails
In-Reply-To: <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>
References: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>
 <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>
Message-ID: <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180725/14c20860/attachment.html>

From jcbeyler at google.com  Wed Jul 25 17:54:01 2018
From: jcbeyler at google.com (JC Beyler)
Date: Wed, 25 Jul 2018 10:54:01 -0700
Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java
 fails
In-Reply-To: <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com>
References: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>
 <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>
 <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com>
Message-ID: <CAF9BGBzA6VctuOareGffKD+ygeidMoN5rbqPZ1ANW_9ML5LuPg@mail.gmail.com>

Hi Serguei,

Here it is:
http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.01/

Let me know if you need anything else and thanks for your help!
Jc

On Wed, Jul 25, 2018 at 10:37 AM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> It looks good.
> I'll push it after you send me a patch.
>
>
> On 7/25/18 07:42, Daniel D. Daugherty wrote:
>
> On 7/25/18 10:20 AM, JC Beyler wrote:
>
> Hi all,
>
> There seems to be an intermittent failure with the
> HeapMonitorInterpreterArrayTest. I believe it is due to the possibility of
> a huge interval being chosen at the end of the test and GC arriving before
> checking the samples.
>
> This fix should help alleviate it by reducing the interval to 100k and
> also checking the garbage collected objects, could someone review it please?
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208059/webrev.00/
>
>
>
> test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorTest.java
>     No comments.
>
> Thumbs up. We'll need this fix in both JDK11 and JDK12.
>
>
> Okay, I've changed the 'Fix Version' to 11.
>
> Thanks,
> Serguei
>
>
> Dan
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208059
>
> As always, thanks for your help!
> Jc
>
>
>
>

-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Wed Jul 25 17:54:12 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 10:54:12 -0700
Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException:
 Can't assign double[][][] to double[][][]
In-Reply-To: <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com>
References: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default>
 <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com>
Message-ID: <96be102f-67c4-462c-89ba-a728b8b81ddc@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180725/81f79467/attachment.html>

From alexey.menkov at oracle.com  Wed Jul 25 18:00:01 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 25 Jul 2018 11:00:01 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
 <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
 <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>
 <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com>
 <fb82a78a-c1eb-b0a9-0059-91bdc7a79a6d@oracle.com>
 <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com>
Message-ID: <112afd52-cd22-4d43-173f-08dd924eb7cc@oracle.com>

Looks good to me

--alex

On 07/24/2018 16:23, Chris Plummer wrote:
> Thanks, Serguei.
> 
> I could use one more reviewer.
> 
> thanks,
> 
> Chris
> 
> On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote:
>> Chris,
>>
>> Thank you for the explanations.
>> I'm Okay with this webrev as it is.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/24/18 13:55, Chris Plummer wrote:
>>> On 7/24/18 1:46 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html
>>>> - log.complain("Redefinition not started. Maybe running with -Xcomp. Test ignored.");
>>>> + log.complain("Redefinition not started. May need more time for -Xcomp.");
>>>> + status = Consts.TEST_FAILED;
>>>>           return false;
>>>>       }
>>>> . . .
>>>> - log.complain("Redefinition not completed.");
>>>> + log.complain("Redefinition not completed. May need more time for -Xcomp.");
>>>> + status = Consts.TEST_FAILED;
>>>> + return false;
>>>>
>>>> The complaint is not fully correct if this can happen not only with
>>>> -Xcomp. Could this message be relaxed a little bit?
>>> I think it is relaxed. It says *may* need more time for -Xcomp. I'm 
>>> not sure how else to word it unless you want me to just say 
>>> "Redefinition not completed".
>>>> Also, just a side comment: The changes above are not that harmless. 
>>>> As the status now is set to TEST_FAILED there is a potential for the 
>>>> tests to start failing where they were passed before.
>>> Yes, that was intentional. It's still the case that you only need the 
>>> fire = 0 change to fix the bug, but having these methods properly 
>>> cause the test to fail is necessary if something were to ever go 
>>> wrong and the redef was not started or completed. Otherwise the test 
>>> would either silently pass (if redef was not started) or just produce 
>>> error messages like it has been when it checks for the proper redef 
>>> (if the redef never completed).
>>>
>>> thanks,
>>>
>>> Chris
>>>> Otherwise, looks good.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/24/18 13:22, Chris Plummer wrote:
>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01
>>>>>
>>>>> Since I removed the "else", there was no need for the "if", so 
>>>>> I removed it also. I had to re-indent the body of the "if" section 
>>>>> because of that. The webrev seems to not call out the whitespace 
>>>>> changes, although I also did a couple of other minor formatting 
>>>>> changes in the code that do show up.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/24/18 12:42 PM, Chris Plummer wrote:
>>>>>> Yes. I'm just retesting first.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/24/18 12:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>> Hi Chris,
>>>>>>>
>>>>>>> You have all my comments and I leave it up to you to decide 
>>>>>>> what approach to pick.
>>>>>>> Could you send an updated webrev, please?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Serguei
>>>>>>>
>>>>>>>
>>>>>>> On 7/24/18 09:27, Chris Plummer wrote:
>>>>>>>> On 7/24/18 12:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>> Hi Chris,
>>>>>>>>>
>>>>>>>>> I still feel, this fix adds more confusion and complexity.
>>>>>>>>> Let's look at some fragments.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html
>>>>>>>>>
>>>>>>>>>   116     if ((strcmp(name, expHSMethod) == 0) &&
>>>>>>>>>   117             (strcmp(sig, expHSSignature) == 0)) {
>>>>>>>>>   118         NSK_DISPLAY0("CompiledMethodLoad: a tested hotspot method found\n");
>>>>>>>>>   119
>>>>>>>>>   120         // CR 6604375: check whether "hot" method was entered
>>>>>>>>>   121         if (enteredHotMethod) {
>>>>>>>>>   122             hsMethodID = method;
>>>>>>>>>   123             fire = 1;
>>>>>>>>>   124         } else {
>>>>>>>>>   125             NSK_DISPLAY0("Compilation occured before method execution\n");
>>>>>>>>>   126             fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>>   127         }
>>>>>>>>>   128     }
>>>>>>>>>
>>>>>>>>> I think line #126 is not needed.
>>>>>>>>> It just creates confusion.
>>>>>>>>> fire == 0 from the beginning.
>>>>>>>>> Why do we need to set it to 0 again?
>>>>>>>> Yes, it can be removed. I just didn't give it much thought when 
>>>>>>>> changing the code from -1 to 0.
>>>>>>>>> Is it because it can be already set to 1?
>>>>>>>>> If so, I'm not sure I understand this code then.
>>>>>>>>>
>>>>>>>>>   187     } while(fire == 0);
>>>>>>>>>   188
>>>>>>>>>   189     NSK_DISPLAY0("agentProc: hotspot method compiled\n\n");
>>>>>>>>>   190
>>>>>>>>>   192     if (fire == 1) {
>>>>>>>>>   . . .
>>>>>>>>>   224     } else {
>>>>>>>>>   225         // fire == -1
>>>>>>>>>   226         // NOTE: This isn't suppose to happen anymore. Hot method should always end up being entered.
>>>>>>>>>   227         NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>>>>>>>   228     }
>>>>>>>>> I don't understand why we need the check at line #192.
>>>>>>>>> The variable fire can be only equal to 0 or 1.
>>>>>>>>> The only way out of the loop at the line #187 is if fire == 1.
>>>>>>>>>
>>>>>>>>> Then the else statement at the lines 224-228 confuses even more.
>>>>>>>> The else section can be removed. I left it in as sort of an 
>>>>>>>> assert, but I see now that it just causes confusion.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 7/23/18 20:19, Chris Plummer wrote:
>>>>>>>>>> On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 7/23/18 11:40, Chris Plummer wrote:
>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>
>>>>>>>>>>>> If the fix was complicated I would agree, but it really just 
>>>>>>>>>>>> boils down to this one line change:
>>>>>>>>>>>>
>>>>>>>>>>>> -            fire = -1;
>>>>>>>>>>>> +            fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>>>>
>>>>>>>>>>> It is not obvious that this will completely fix the problem.
>>>>>>>>>>> Is it possible that there will not be next compilation with 
>>>>>>>>>>> the -Xcomp?
>>>>>>>>>> It's only one method that we check for. I don't see why there 
>>>>>>>>>> would be 2nd -Xcomp compilation for it, but even if there was, 
>>>>>>>>>> the test will ignore it just like the first one. It will 
>>>>>>>>>> ignore compilations of the method until the flag has been set 
>>>>>>>>>> indicating the method has been executed once.
>>>>>>>>>
>>>>>>>>>> If for some reason the method is never compiled after being 
>>>>>>>>>> executed once, the test will give up waiting for it (I think 
>>>>>>>>>> after 30 seconds) and produce an error.
>>>>>>>>>
>>>>>>>>> I'm afraid that is what will always happen with -Xcomp.
>>>>>>>>> Then there is no point in wasting time waiting for the timeout, as 
>>>>>>>>> the test will successfully complete without testing anything.
>>>>>>>>> It does not seem to be worth this complexity.
>>>>>>>>>
>>>>>>>>> I guess, you would want some extra tracing though. :)
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Serguei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>> If it is possible then it is better to explicitly exclude 
>>>>>>>>>>> these tests for -Xcomp.
>>>>>>>>>>> Otherwise, consider this reviewed.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Given that, I see no reason not to increase our test 
>>>>>>>>>>>> coverage by supporting this test during -Xcomp runs.
>>>>>>>>>>>
>>>>>>>>>>> I'd agree if it is going to be stable.
>>>>>>>>>>>
>>>>>>>>>> If problems turn up in the future, we can reconsider disabling 
>>>>>>>>>> it.
>>>>>>>>>>
>>>>>>>>>> thanks,
>>>>>>>>>>
>>>>>>>>>> Chris
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Serguei
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Chris
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would it be simpler to avoid running these tests with 
>>>>>>>>>>>>> -Xcomp?
>>>>>>>>>>>>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/23/18 00:42, Chris Plummer wrote:
>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please review the following fix for JDK11:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It fixes the following 3 tests:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any of which could fail when run with -Xcomp with 
>>>>>>>>>>>>>> (followed by a bunch more errors):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Although lately we've only seen this with 
>>>>>>>>>>>>>> redefclass030.java on macosx.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> These 3 tests do redefinition of a "hot" method after 
>>>>>>>>>>>>>> triggering compilation for it. After the redef some 
>>>>>>>>>>>>>> testing is done to ensure that the redef was done 
>>>>>>>>>>>>>> correctly, but the issue these tests have actually comes 
>>>>>>>>>>>>>> before any redef is done.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The test attempts to trigger compilation by calling a hot 
>>>>>>>>>>>>>> method a lot. The agent detects compilation by receiving a 
>>>>>>>>>>>>>> CompiledMethodLoad event. There was an issue discovered 
>>>>>>>>>>>>>> long ago that when -Xcomp is used, the compilation happens 
>>>>>>>>>>>>>> before the "hot" method is ever called. Then the redef 
>>>>>>>>>>>>>> would happen before compilation, and this somehow messed 
>>>>>>>>>>>>>> up the test (I'm not exactly sure how). The fix was to 
>>>>>>>>>>>>>> basically abandon the redef attempt when this problem is 
>>>>>>>>>>>>>> detected, and then supposedly just let the test run to 
>>>>>>>>>>>>>> completion (skipping the actual testing of the redef). 
>>>>>>>>>>>>>> After this change, if you ran with -Xcomp it would pass, 
>>>>>>>>>>>>>> but if you looked in the log you would see:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, there was a bug in the logic to make the test run 
>>>>>>>>>>>>>> to completion, which also caused the above message not to 
>>>>>>>>>>>>>> appear. Instead the test would fail with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Followed by a bunch more error messages during the part of 
>>>>>>>>>>>>>> the test that checks if the redef was done properly.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If the CompiledMethodLoad event comes in before the hot 
>>>>>>>>>>>>>> method is ever called (which it does with -Xcomp), the 
>>>>>>>>>>>>>> test sets fire = -1. If the hot method was called, it is 
>>>>>>>>>>>>>> set to 1. The setting of fire = -1 was added to fix the 
>>>>>>>>>>>>>> -Xcomp problem mentioned above. The jvmti agent does the 
>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     do {
>>>>>>>>>>>>>>         THREAD_sleep(1);
>>>>>>>>>>>>>>         /* wait for compilation to happen */
>>>>>>>>>>>>>>     } while(fire == 0);
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     if (fire == 1) {
>>>>>>>>>>>>>>         /* do the redef here */
>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
>>>>>>>>>>>>>>     } else {
>>>>>>>>>>>>>>         // fire == -1
>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The agent then syncs with the debuggee, waiting for it to 
>>>>>>>>>>>>>> finish up. What the test expects is that 
>>>>>>>>>>>>>> waitForRedefinitionStarted() in the debuggee will time out 
>>>>>>>>>>>>>> after two seconds while waiting for fire == 1 (which it 
>>>>>>>>>>>>>> thinks will always happen because it was set to -1). 
>>>>>>>>>>>>>> When it times out, the test does appear to exit properly, 
>>>>>>>>>>>>>> but with the following in the log, which is intended:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  # ERROR: Redefinition not started. Maybe running with -Xcomp. Test ignored.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, sometimes before waitForRedefinitionStarted() 
>>>>>>>>>>>>>> times out, the hot method is called enough times to 
>>>>>>>>>>>>>> trigger compilation. So another CompiledMethodLoad event 
>>>>>>>>>>>>>> arrives, and this time fire is set to 1. Because of this, 
>>>>>>>>>>>>>> waitForRedefinitionStarted() doesn't time out and returns 
>>>>>>>>>>>>>> with an indication that the redef has started. After this 
>>>>>>>>>>>>>> waitForRedefinitionCompleted() is executed. It waits for 
>>>>>>>>>>>>>> the redef to complete, but it never does since the agent 
>>>>>>>>>>>>>> decided not to do the redef when it saw fire == -1. So 
>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out after 10 seconds 
>>>>>>>>>>>>>> and the test fails, with:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Actually the above error is not really what causes the 
>>>>>>>>>>>>>> failure. When the above error is detected, no error status 
>>>>>>>>>>>>>> is set and the test continues as if the redef had been 
>>>>>>>>>>>>>> done. So then the logic that detects if the redef was done 
>>>>>>>>>>>>>> properly ends up failing, and that's where the test 
>>>>>>>>>>>>>> actually indicates a failure status. You see a whole bunch 
>>>>>>>>>>>>>> of other errors in the log because of all the checks that 
>>>>>>>>>>>>>> fail.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The fix is to not abandon the test when the first 
>>>>>>>>>>>>>> CompiledMethodLoad event is before the hot method was 
>>>>>>>>>>>>>> called. Instead just leave fire==0 and wait for the next 
>>>>>>>>>>>>>> CompiledMethodLoad event that is triggered after the 
>>>>>>>>>>>>>> method is called enough times to be recompiled. I'm not 
>>>>>>>>>>>>>> sure why it was not originally done this way. Possibly the 
>>>>>>>>>>>>>> recompilation did not happen reliably, but I have not run 
>>>>>>>>>>>>>> into this problem. The other changes in redefclass030.c 
>>>>>>>>>>>>>> are just cleaning up debug tracing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Another fix was to properly set the error status when 
>>>>>>>>>>>>>> waitForRedefinitionStarted() or 
>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out, although this is 
>>>>>>>>>>>>>> just a safety net and I didn't run into any cases where 
>>>>>>>>>>>>>> this happened after fixing the CompiledMethodLoad event 
>>>>>>>>>>>>>> handling. So in general the changes in redefclass030.java 
>>>>>>>>>>>>>> were not needed, but provide better error handling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 
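
The flag handling discussed in the thread above can be sketched as a small
model. This is only an illustration in Java, not the actual C agent: the
class FireFlagSketch, its method names, and the replay() helper are
hypothetical. Only the fire values (0, 1, -1) and the enteredHotMethod
check are taken from the quoted code. The sketch replays the -Xcomp event
order described in the thread (compilation before the first call, then a
recompilation after the hot method has run) for the old and the fixed
handler.

```java
/**
 * Simplified, single-threaded model of the agent's "fire" flag.
 * All names are illustrative; the real logic lives in the C agent.
 */
final class FireFlagSketch {
    static final int WAITING = 0;    // no usable compilation seen yet
    static final int FIRE = 1;       // compiled after the hot method was entered
    static final int ABANDONED = -1; // old behavior only: skip the redefinition

    private final boolean ignoreEarlyCompilation; // true models the fix
    private int fire = WAITING;
    private boolean enteredHotMethod;

    FireFlagSketch(boolean ignoreEarlyCompilation) {
        this.ignoreEarlyCompilation = ignoreEarlyCompilation;
    }

    void hotMethodEntered() {
        enteredHotMethod = true;
    }

    /** Models the CompiledMethodLoad callback for the hot method. */
    void compiledMethodLoad() {
        if (enteredHotMethod) {
            fire = FIRE;
        } else if (!ignoreEarlyCompilation) {
            fire = ABANDONED; // old code: abandon the redefinition attempt
        }
        // With the fix, an early compilation leaves the flag at WAITING.
    }

    int fire() {
        return fire;
    }

    /**
     * Replays: early compilation, hot method entered, recompilation.
     * Returns { agentRedefines, debuggeeSeesStart }.
     */
    static boolean[] replay(boolean ignoreEarlyCompilation) {
        FireFlagSketch s = new FireFlagSketch(ignoreEarlyCompilation);
        s.compiledMethodLoad();      // -Xcomp: compiled before the first call
        int snapshot = s.fire();     // what the agent's polling loop sees now
        s.hotMethodEntered();
        s.compiledMethodLoad();      // recompiled after being called
        if (snapshot == WAITING) {
            snapshot = s.fire();     // loop was still polling; it exits only now
        }
        boolean agentRedefines = snapshot == FIRE;
        boolean debuggeeSeesStart = s.fire() == FIRE;
        return new boolean[] { agentRedefines, debuggeeSeesStart };
    }

    public static void main(String[] args) {
        System.out.println("old:   " + java.util.Arrays.toString(replay(false)));
        System.out.println("fixed: " + java.util.Arrays.toString(replay(true)));
    }
}
```

In this model the old handler ends with the agent skipping the redefinition
while the debuggee still observes fire == 1, which corresponds to the
"Redefinition not completed" timeout described above. With the fix, both
sides agree that the redefinition goes ahead.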

From serguei.spitsyn at oracle.com  Wed Jul 25 17:59:58 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 10:59:58 -0700
Subject: RFR (XS) 8208059: [TESTBUG] HeapMonitorInterpreterArrayTest.java
 fails
In-Reply-To: <CAF9BGBzA6VctuOareGffKD+ygeidMoN5rbqPZ1ANW_9ML5LuPg@mail.gmail.com>
References: <CAF9BGBzmrrWw6iYqhRt3ia8SCfkLAjjjkWv_wMJBkUXPv-t1KA@mail.gmail.com>
 <29ba77e4-8369-6bd3-a953-ed7d40d6dc05@oracle.com>
 <3c7f2aa7-b12e-3eb8-4843-12f422aaf022@oracle.com>
 <CAF9BGBzA6VctuOareGffKD+ygeidMoN5rbqPZ1ANW_9ML5LuPg@mail.gmail.com>
Message-ID: <87a12963-bbdc-1ba3-6476-63bdf3af34a0@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180725/c1db1b80/attachment.html>

From chris.plummer at oracle.com  Wed Jul 25 18:13:09 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Jul 2018 11:13:09 -0700
Subject: RFR(S): 8151259: [TESTBUG]
 nsk/jvmti/RedefineClasses/redefclass030 fails with "unexpected values of
 outer fields of the class" when running with -Xcomp
In-Reply-To: <112afd52-cd22-4d43-173f-08dd924eb7cc@oracle.com>
References: <b8efd7a1-f2fd-a156-f810-6bf99553762a@oracle.com>
 <640cbb9c-44f8-daf2-1033-60b6f46293f8@oracle.com>
 <304f8b38-69ea-84f3-a323-e5eccc3a936f@oracle.com>
 <cf39c5a9-d4f4-f7e7-1670-5fdbddd1b853@oracle.com>
 <f457a86e-491f-472a-b052-a85740e1a135@oracle.com>
 <057d40ba-c390-4057-bd8d-cf519ed23a6f@oracle.com>
 <34e14bc5-2266-4f4f-5548-5c26eda5a1cd@oracle.com>
 <ff45637f-9ad8-9e63-4484-91fff7ed6777@oracle.com>
 <60c8f68d-0775-9512-c2a2-6428e218ce57@oracle.com>
 <b75a822d-face-46e6-3b8b-103d9424b310@oracle.com>
 <0fde8ff3-cc73-6cef-3aea-364a46ac41ef@oracle.com>
 <68ef717f-7388-0dcb-be6b-7b8ed462fdf5@oracle.com>
 <fb82a78a-c1eb-b0a9-0059-91bdc7a79a6d@oracle.com>
 <539dc93d-8984-4d82-50eb-bd0395476247@oracle.com>
 <112afd52-cd22-4d43-173f-08dd924eb7cc@oracle.com>
Message-ID: <7dec2692-c3d0-4903-b9ae-8a22528539a7@oracle.com>

Thanks!

On 7/25/18 11:00 AM, Alex Menkov wrote:
> Looks good to me
>
> --alex
>
> On 07/24/2018 16:23, Chris Plummer wrote:
>> Thanks, Serguei.
>>
>> I could use one more reviewer.
>>
>> thanks,
>>
>> Chris
>>
>> On 7/24/18 3:00 PM, serguei.spitsyn at oracle.com wrote:
>>> Chris,
>>>
>>> Thank you for the explanations.
>>> I'm Okay with this webrev as it is.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/24/18 13:55, Chris Plummer wrote:
>>>> On 7/24/18 1:46 PM, serguei.spitsyn at oracle.com wrote:
>>>>>
>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java.frames.html 
>>>>>
>>>>> - log.complain("Redefinition not started. Maybe running with -Xcomp. Test ignored.");
>>>>> + log.complain("Redefinition not started. May need more time for -Xcomp.");
>>>>> + status = Consts.TEST_FAILED;
>>>>>           return false;
>>>>>       }
>>>>> . . .
>>>>> - log.complain("Redefinition not completed.");
>>>>> + log.complain("Redefinition not completed. May need more time for -Xcomp.");
>>>>> + status = Consts.TEST_FAILED;
>>>>> + return false;
>>>>>
>>>>> The complaint is not fully correct if this can happen not only with
>>>>> -Xcomp. Could this message be relaxed a little bit?
>>>> I think it is relaxed. It says *may* need more time for -Xcomp. I'm 
>>>> not sure how else to word it unless you want me to just say 
>>>> "Redefinition not completed".
>>>>> Also, just a side comment: The changes above are not that 
>>>>> harmless. As the status now is set to TEST_FAILED there is a 
>>>>> potential for the tests to start failing where they were passed 
>>>>> before.
>>>> Yes, that was intentional. It's still the case that you only need 
>>>> the fire = 0 change to fix the bug, but having these methods 
>>>> properly cause the test to fail is necessary if something were to 
>>>> ever go wrong and the redef was not started or completed. Otherwise 
>>>> the test would either silently pass (if redef was not started) or 
>>>> just produce error messages like it has been when it checks for the 
>>>> proper redef (if the redef never completed).
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>> Otherwise, looks good.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 7/24/18 13:22, Chris Plummer wrote:
>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.01
>>>>>>
>>>>>> Since I removed the "else", there was no need for the "if", 
>>>>>> so I removed it also. I had to re-indent the body of the "if" 
>>>>>> section because of that. The webrev seems to not call out the 
>>>>>> whitespace changes, although I also did a couple of other minor 
>>>>>> formatting changes in the code that do show up.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/24/18 12:42 PM, Chris Plummer wrote:
>>>>>>> Yes. I'm just retesting first.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/24/18 12:18 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> You have all my comments and I leave it up to you to decide 
>>>>>>>> what approach to pick.
>>>>>>>> Could you send an updated webrev, please?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Serguei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 7/24/18 09:27, Chris Plummer wrote:
>>>>>>>>> On 7/24/18 12:25 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>> Hi Chris,
>>>>>>>>>>
>>>>>>>>>> I still feel, this fix adds more confusion and complexity.
>>>>>>>>>> Let's look at some fragments.
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00/test/hotspot/jtreg/vmTestbase/nsk/jvmti/RedefineClasses/redefclass028/redefclass028.c.frames.html 
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   116     if ((strcmp(name, expHSMethod) == 0) &&
>>>>>>>>>>   117             (strcmp(sig, expHSSignature) == 0)) {
>>>>>>>>>>   118         NSK_DISPLAY0("CompiledMethodLoad: a tested hotspot method found\n");
>>>>>>>>>>   119
>>>>>>>>>>   120         // CR 6604375: check whether "hot" method was entered
>>>>>>>>>>   121         if (enteredHotMethod) {
>>>>>>>>>>   122             hsMethodID = method;
>>>>>>>>>>   123             fire = 1;
>>>>>>>>>>   124         } else {
>>>>>>>>>>   125             NSK_DISPLAY0("Compilation occured before method execution\n");
>>>>>>>>>>   126             fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>>>   127         }
>>>>>>>>>>   128     }
>>>>>>>>>>
>>>>>>>>>> I think line #126 is not needed.
>>>>>>>>>> It just creates confusion.
>>>>>>>>>> fire == 0 from the beginning.
>>>>>>>>>> Why do we need to set it to 0 again?
>>>>>>>>> Yes, it can be removed. I just didn't give it much thought 
>>>>>>>>> when changing the code from -1 to 0.
>>>>>>>>>> Is it because it can be already set to 1?
>>>>>>>>>> If so, I'm not sure I understand this code then.
>>>>>>>>>>
>>>>>>>>>>   187     } while(fire == 0);
>>>>>>>>>>   188
>>>>>>>>>>   189     NSK_DISPLAY0("agentProc: hotspot method compiled\n\n");
>>>>>>>>>>   190
>>>>>>>>>>   192     if (fire == 1) {
>>>>>>>>>>   . . .
>>>>>>>>>>   224     } else {
>>>>>>>>>>   225         // fire == -1
>>>>>>>>>>   226         // NOTE: This isn't suppose to happen anymore. Hot method should always end up being entered.
>>>>>>>>>>   227         NSK_COMPLAIN0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>>>>>>>>   228     }
>>>>>>>>>> I don't understand why we need the check at line #192.
>>>>>>>>>> The variable fire can be only equal to 0 or 1.
>>>>>>>>>> The only way out of the loop at the line #187 is if fire == 1.
>>>>>>>>>>
>>>>>>>>>> Then the else statement at the lines 224-228 confuses even more.
>>>>>>>>> The else section can be removed. I left it in as sort of an 
>>>>>>>>> assert, but I see now that it just causes confusion.
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 7/23/18 20:19, Chris Plummer wrote:
>>>>>>>>>>> On 7/23/18 5:22 PM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 7/23/18 11:40, Chris Plummer wrote:
>>>>>>>>>>>>> Hi Serguei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> If the fix was complicated I would agree, but it really 
>>>>>>>>>>>>> just boils down to this one line change:
>>>>>>>>>>>>>
>>>>>>>>>>>>> -            fire = -1;
>>>>>>>>>>>>> +            fire = 0; // Ignore this compilation. Wait for next one.
>>>>>>>>>>>>
>>>>>>>>>>>> It is not obvious that this will completely fix the problem.
>>>>>>>>>>>> Is it possible that there will not be next compilation with 
>>>>>>>>>>>> the -Xcomp?
>>>>>>>>>>> It's only one method that we check for. I don't see why 
>>>>>>>>>>> there would be 2nd -Xcomp compilation for it, but even if 
>>>>>>>>>>> there was, the test will ignore it just like the first one. 
>>>>>>>>>>> It will ignore compilations of the method until the flag has 
>>>>>>>>>>> been set indicating the method has been executed once.
>>>>>>>>>>
>>>>>>>>>>> If for some reason the method is never compiled after being 
>>>>>>>>>>> executed once, the test will give up waiting for it (I think 
>>>>>>>>>>> after 30 seconds) and produce an error.
>>>>>>>>>>
>>>>>>>>>> I'm afraid that is what will always happen with -Xcomp.
>>>>>>>>>> Then there is no point in wasting time waiting for the timeout, 
>>>>>>>>>> as the test will complete successfully without testing anything.
>>>>>>>>>> It does not seem to be worth this complexity.
>>>>>>>>>>
>>>>>>>>>> I guess, you would want some extra tracing though. :)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Serguei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>> If it is possible then it is better to explicitly exclude 
>>>>>>>>>>>> these tests for -Xcomp.
>>>>>>>>>>>> Otherwise, consider this reviewed.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Given that, I see no reason not to increase our test 
>>>>>>>>>>>>> coverage by supporting this test during -Xcomp runs.
>>>>>>>>>>>>
>>>>>>>>>>>> I'd agree if it is going to be stable.
>>>>>>>>>>>>
>>>>>>>>>>> If problems turn up in the future, we can reconsider 
>>>>>>>>>>> disabling it.
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Serguei
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 7/23/18 9:44 AM, serguei.spitsyn at oracle.com wrote:
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would it be simpler to avoid running these tests with 
>>>>>>>>>>>>>> -Xcomp?
>>>>>>>>>>>>>> I guess, this would work: @requires vm.compMode != "Xcomp"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Serguei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 7/23/18 00:42, Chris Plummer wrote:
>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please review the following fix for JDK11:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8151259
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8151259/webrev.00
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It fixes the following 3 tests:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass028.java
>>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass029.java
>>>>>>>>>>>>>>> vmTestbase/nsk/jvmti/RedefineClasses/redefclass030.java
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Any of which could fail when run with -Xcomp with 
>>>>>>>>>>>>>>> (followed by a bunch more errors):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Although lately we've only seen this with 
>>>>>>>>>>>>>>> redefclass030.java on macosx.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> These 3 tests do redefinition of a "hot" method after 
>>>>>>>>>>>>>>> triggering compilation for it. After the redef some 
>>>>>>>>>>>>>>> testing is done to ensure that the redef was done 
>>>>>>>>>>>>>>> correctly, but the issue these tests have actually comes 
>>>>>>>>>>>>>>> before any redef is done.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The test attempts to trigger compilation by calling a 
>>>>>>>>>>>>>>> hot method a lot. The agent detects compilation by 
>>>>>>>>>>>>>>> receiving a CompiledMethodLoad event. There was an issue 
>>>>>>>>>>>>>>> discovered long ago that when -Xcomp is used, the 
>>>>>>>>>>>>>>> compilation happens before the "hot" method is ever 
>>>>>>>>>>>>>>> called. Then the redef would happen before compilation, 
>>>>>>>>>>>>>>> and this somehow messed up the test (I'm not exactly 
>>>>>>>>>>>>>>> sure how). The fix was to basically abandon the redef 
>>>>>>>>>>>>>>> attempt when this problem is detected, and then 
>>>>>>>>>>>>>>> supposedly just let the test run to completion (skipping 
>>>>>>>>>>>>>>> the actual testing of the redef). After this change, if 
>>>>>>>>>>>>>>> you ran with -Xcomp it would pass, but if you looked in 
>>>>>>>>>>>>>>> the log you would see:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, there was a bug in the logic that makes the test 
>>>>>>>>>>>>>>> run to completion, which also causes the above message 
>>>>>>>>>>>>>>> not to appear. Instead the test would fail with:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Followed by a bunch more error messages during the part 
>>>>>>>>>>>>>>> of the test that checks if the redef was done properly.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If the CompiledMethodLoad event comes in before the hot 
>>>>>>>>>>>>>>> method is ever called (which it does with -Xcomp), the 
>>>>>>>>>>>>>>> test sets fire = -1. If the hot method was called, it is 
>>>>>>>>>>>>>>> set to 1. The setting of fire = -1 was added to fix the 
>>>>>>>>>>>>>>> -Xcomp problem mentioned above. The jvmti agent does the 
>>>>>>>>>>>>>>> following:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     do {
>>>>>>>>>>>>>>>         THREAD_sleep(1);
>>>>>>>>>>>>>>>         /* wait for compilation to happen */
>>>>>>>>>>>>>>>     } while(fire == 0);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     if (fire == 1) {
>>>>>>>>>>>>>>>         /* do the redef here */
>>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: <<<<<<<< RedefineClasses() is successfully done\n");
>>>>>>>>>>>>>>>     } else {
>>>>>>>>>>>>>>>         // fire == -1
>>>>>>>>>>>>>>>         NSK_DISPLAY0("agentProc: \"hot\" method wasn't executed. Don't perform redefinition\n");
>>>>>>>>>>>>>>>     }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The agent then syncs with the debuggee, waiting for it 
>>>>>>>>>>>>>>> to finish up. What the test expects is that 
>>>>>>>>>>>>>>> waitForRedefinitionStarted() in the debuggee will time 
>>>>>>>>>>>>>>> out after two seconds while waiting for fire == 1 (it 
>>>>>>>>>>>>>>> assumes this will always happen because fire was set to 
>>>>>>>>>>>>>>> -1). When it times out, the test does appear to exit 
>>>>>>>>>>>>>>> properly, but with the following in the log, which 
>>>>>>>>>>>>>>> is intended:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not started. Maybe running with 
>>>>>>>>>>>>>>> -Xcomp. Test ignored.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> However, sometimes before waitForRedefinitionStarted() 
>>>>>>>>>>>>>>> times out, the hot method is called enough times to 
>>>>>>>>>>>>>>> trigger compilation. So another CompiledMethodLoad event 
>>>>>>>>>>>>>>> arrives, and this time fire is set to 1. Because of 
>>>>>>>>>>>>>>> this, waitForRedefinitionStarted() doesn't time out and 
>>>>>>>>>>>>>>> returns with an indication that the redef has started. 
>>>>>>>>>>>>>>> After this waitForRedefinitionCompleted() is executed. 
>>>>>>>>>>>>>>> It waits for the redef to complete, but it never does 
>>>>>>>>>>>>>>> since the agent decided not to do the redef when it saw 
>>>>>>>>>>>>>>> fire == -1. So waitForRedefinitionCompleted() times out 
>>>>>>>>>>>>>>> after 10 seconds and the test fails, with:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # ERROR: Redefinition not completed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Actually the above error is not really what causes the 
>>>>>>>>>>>>>>> failure. When the above error is detected, no error 
>>>>>>>>>>>>>>> status is set and the test continues as if the redef had 
>>>>>>>>>>>>>>> been done. So then the logic that detects if the redef 
>>>>>>>>>>>>>>> was done properly ends up failing, and that's where the 
>>>>>>>>>>>>>>> test actually indicates a failure status. You see a 
>>>>>>>>>>>>>>> whole bunch of other errors in the log because of all 
>>>>>>>>>>>>>>> the checks that fail.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The fix is to not abandon the test when the first 
>>>>>>>>>>>>>>> CompiledMethodLoad event is before the hot method was 
>>>>>>>>>>>>>>> called. Instead just leave fire==0 and wait for the next 
>>>>>>>>>>>>>>> CompiledMethodLoad event that is triggered after the 
>>>>>>>>>>>>>>> method is called enough times to be recompiled. I'm not 
>>>>>>>>>>>>>>> sure why it was not originally done this way. Possibly 
>>>>>>>>>>>>>>> the recompilation did not happen reliably, but I have 
>>>>>>>>>>>>>>> not run into this problem. The other changes in 
>>>>>>>>>>>>>>> redefclass030.c are just cleaning up debug tracing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Another fix was to properly set the error status when 
>>>>>>>>>>>>>>> waitForRedefinitionStarted() or 
>>>>>>>>>>>>>>> waitForRedefinitionCompleted() times out, although this 
>>>>>>>>>>>>>>> is just a safety net and I didn't run into any cases 
>>>>>>>>>>>>>>> where this happened after fixing the CompiledMethodLoad 
>>>>>>>>>>>>>>> event handling. So in general the changes in 
>>>>>>>>>>>>>>> redefclass030.java were not needed, but provide better 
>>>>>>>>>>>>>>> error handling.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
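
For readers following the quoted thread, the event handling that the fix
describes (ignore a CompiledMethodLoad that arrives before the hot method
has run, and keep waiting for a later one) can be sketched roughly as below.
This is only an illustration under stated assumptions, not the code from
redefclass030.c: the agent_state struct, its field names, and the
replay_events() helper are hypothetical and exist only to make the state
transitions easy to follow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical agent state; the field names are illustrative only. */
typedef struct {
    bool hot_method_entered; /* true once the "hot" method has run at least once */
    int  fire;               /* 0 = keep waiting, 1 = proceed with the redefinition */
} agent_state;

/* The hot method was entered. */
void on_hot_method_entry(agent_state *s) {
    s->hot_method_entered = true;
}

/* A CompiledMethodLoad event arrived for the hot method. */
void on_compiled_method_load(agent_state *s) {
    if (s->hot_method_entered) {
        s->fire = 1; /* compiled after being executed: start the redefinition */
    }
    /* Otherwise the method was compiled before it ever ran (as with -Xcomp).
     * Leave fire at 0 so the agent keeps waiting for the next compilation. */
}

/* Replays a sequence of events ('E' = hot method entry,
 * 'C' = CompiledMethodLoad) and returns the final value of fire.
 * Any other character is ignored. */
int replay_events(const char *events) {
    assert(events != NULL);
    agent_state s = { false, 0 };
    for (const char *p = events; *p != '\0'; ++p) {
        if (*p == 'E') {
            on_hot_method_entry(&s);
        } else if (*p == 'C') {
            on_compiled_method_load(&s);
        }
    }
    return s.fire;
}
```

With this shape fire never takes the value -1, so the agent's wait loop can
only exit with fire == 1, which is the point raised in the review about the
now-redundant else branch.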


From alexey.menkov at oracle.com  Wed Jul 25 18:22:55 2018
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 25 Jul 2018 11:22:55 -0700
Subject: RFR: JDK-8199155 : Accessibility issues in jdk.jdi
Message-ID: <3320a618-4522-d442-eeb9-9f1a676b7a00@oracle.com>

Hi,

please review the following fix for
https://bugs.openjdk.java.net/browse/JDK-8199155

webrev:
http://cr.openjdk.java.net/~amenkov/accessibility/webrev/

The fix adds standard "banner", "navigation", "main" regions
and fixes "<dl> without <dt>" issue.
For <dl> and <dd>, the styles used by most browsers are applied:
dl { margin-top: 1em; margin-bottom: 1em; }
dd { margin-left: 40px; }

--alex

From daniil.x.titov at oracle.com  Wed Jul 25 18:47:52 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 25 Jul 2018 11:47:52 -0700
Subject: JDK-8199155 : Accessibility issues in jdk.jdi
In-Reply-To: <E2099E4F-9A0C-4F5F-886C-87859F2A8D7D@oracle.com>
References: <E2099E4F-9A0C-4F5F-886C-87859F2A8D7D@oracle.com>
Message-ID: <A4DA7893-49AA-4EBE-864D-1B9DD4DC9A2B@oracle.com>

Looks good to me.

--Daniil

On 7/25/18, 11:23 AM, "serviceability-dev on behalf of Alex Menkov" <serviceability-dev-bounces at openjdk.java.net on behalf of alexey.menkov at oracle.com> wrote:

    Hi,
    
    please review the following fix for
    https://bugs.openjdk.java.net/browse/JDK-8199155
    
    webrev:
    http://cr.openjdk.java.net/~amenkov/accessibility/webrev/
    
    The fix adds standard "banner", "navigation", "main" regions
    and fixes "<dl> without <dt>" issue.
    For <dl> and <dd>, the styles used by most browsers are applied:
    dl { margin-top: 1em; margin-bottom: 1em; }
    dd { margin-left: 40px; }
    
    --alex
    
    
From daniel.daugherty at oracle.com  Wed Jul 25 19:03:35 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 15:03:35 -0400
Subject: RFR(XS): 8208205: ProblemList tests that fail due to 'Error attaching
 to process: Can't create thread_db agent!'
Message-ID: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>

Greetings,

I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
I need a single (R)eviewer for the following fix:

   JDK-8208205 ProblemList tests that fail due to 'Error attaching to
               process: Can't create thread_db agent!'
   https://bugs.openjdk.java.net/browse/JDK-8208205

Here's the diff:

$ hg diff test/hotspot/jtreg/ProblemList.txt
diff -r 628718bf8970 test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 12:32:06 2018 -0400
+++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 14:47:58 2018 -0400
@@ -74,14 +74,43 @@
  # :hotspot_runtime

  runtime/CompressedOops/UseCompressedOops.java 8079353 generic-all
+runtime/SharedArchiveFile/SASymbolTableTest.java 8193639 solaris

  #############################################################################

  # :hotspot_serviceability

-serviceability/sa/ClhsdbCDSCore.java                 8207832 linux-x64
-serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
-serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
+serviceability/sa/ClhsdbAttach.java 8193639 solaris
+serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
+serviceability/sa/ClhsdbField.java 8193639 solaris
+serviceability/sa/ClhsdbFindPC.java 8193639 solaris
+serviceability/sa/ClhsdbInspect.java 8193639 solaris
+serviceability/sa/ClhsdbJdis.java 8193639 solaris
+serviceability/sa/ClhsdbJhisto.java 8193639 solaris
+serviceability/sa/ClhsdbJstack.java 8193639 solaris
+serviceability/sa/ClhsdbLongConstant.java 8193639 solaris
+serviceability/sa/ClhsdbPmap.java 8193639 solaris
+serviceability/sa/ClhsdbPrintAll.java 8193639 solaris
+serviceability/sa/ClhsdbPrintAs.java 8193639 solaris
+serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris
+serviceability/sa/ClhsdbPstack.java 8193639 solaris
+serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris
+serviceability/sa/ClhsdbScanOops.java 8193639 solaris
+serviceability/sa/ClhsdbSource.java 8193639 solaris
+serviceability/sa/ClhsdbSymbol.java 8193639 solaris
+serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris
+serviceability/sa/ClhsdbThread.java 8193639 solaris
+serviceability/sa/ClhsdbWhere.java 8193639 solaris
+serviceability/sa/DeadlockDetectionTest.java 8193639 solaris
+serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris
+serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
+serviceability/sa/TestClassDump.java 8193639 solaris
+serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris
+serviceability/sa/TestDefaultMethods.java 8193639 solaris
+serviceability/sa/TestG1HeapRegion.java 8193639 solaris
+serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
+serviceability/sa/TestType.java 8193639 solaris
+serviceability/sa/TestUniverse.java 8193639 solaris

  #############################################################################


In the above diff, it looks like I deleted these entries:

-serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
-serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
-serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all

What I really did was delete the spaces (like most of the other entries
in the hotspot ProblemList):

+serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
+serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
+serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all

I also put those entries in sort order which is why a 'diff -w'
wasn't used...
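
To double-check that a removed and an added ProblemList line really differ
only in spacing, the comparison can be sketched as below. This is a minimal
illustration, not part of the change under review: collapse_whitespace() and
same_entry() are hypothetical helpers, and the 512-byte buffer limit is an
arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Collapses every run of spaces or tabs in `line` to one space, in place,
 * and drops leading and trailing whitespace. Returns `line`. */
char *collapse_whitespace(char *line) {
    assert(line != NULL);
    char *dst = line;
    int pending_gap = 0;
    for (const char *src = line; *src != '\0'; ++src) {
        if (*src == ' ' || *src == '\t') {
            pending_gap = 1;        /* remember the gap, emit it lazily */
        } else {
            if (pending_gap && dst != line) {
                *dst++ = ' ';       /* one space between fields */
            }
            pending_gap = 0;
            *dst++ = *src;
        }
    }
    *dst = '\0';
    return line;
}

/* Returns 1 if the two entries are identical once spacing is normalized. */
int same_entry(const char *a, const char *b) {
    char x[512], y[512];
    if (strlen(a) >= sizeof x || strlen(b) >= sizeof y) {
        return 0; /* too long for this sketch; treat as different */
    }
    strcpy(x, a);
    strcpy(y, b);
    return strcmp(collapse_whitespace(x), collapse_whitespace(y)) == 0;
}
```

Applied to the three entries above, each "-" line should compare equal to
its "+" counterpart, while genuinely new entries compare as different.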


Thanks, in advance, for any questions, comments or suggestions.

Dan

From chris.plummer at oracle.com  Wed Jul 25 19:32:50 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Jul 2018 12:32:50 -0700
Subject: RFR(XS): 8208205: ProblemList tests that fail due to 'Error
 attaching to process: Can't create thread_db agent!'
In-Reply-To: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>
References: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>
Message-ID: <7b268608-36c2-8243-9e21-fd06b44f0dab@oracle.com>

Hi Dan,

Looks good to me. Thanks for cleaning up the noise.

Chris

On 7/25/18 12:03 PM, Daniel D. Daugherty wrote:
> Greetings,
>
> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
> I need a single (R)eviewer for the following fix:
>
>   JDK-8208205 ProblemList tests that fail due to 'Error attaching to
>               process: Can't create thread_db agent!'
>   https://bugs.openjdk.java.net/browse/JDK-8208205
>
> Here's the diff:
>
> $ hg diff test/hotspot/jtreg/ProblemList.txt
> diff -r 628718bf8970 test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 12:32:06 2018 -0400
> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 14:47:58 2018 -0400
> @@ -74,14 +74,43 @@
>  # :hotspot_runtime
>
>  runtime/CompressedOops/UseCompressedOops.java 8079353 generic-all
> +runtime/SharedArchiveFile/SASymbolTableTest.java 8193639 solaris
>
>  #############################################################################
>
>  # :hotspot_serviceability
>
> -serviceability/sa/ClhsdbCDSCore.java                 8207832 linux-x64
> -serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
> -serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
> +serviceability/sa/ClhsdbAttach.java 8193639 solaris
> +serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
> +serviceability/sa/ClhsdbField.java 8193639 solaris
> +serviceability/sa/ClhsdbFindPC.java 8193639 solaris
> +serviceability/sa/ClhsdbInspect.java 8193639 solaris
> +serviceability/sa/ClhsdbJdis.java 8193639 solaris
> +serviceability/sa/ClhsdbJhisto.java 8193639 solaris
> +serviceability/sa/ClhsdbJstack.java 8193639 solaris
> +serviceability/sa/ClhsdbLongConstant.java 8193639 solaris
> +serviceability/sa/ClhsdbPmap.java 8193639 solaris
> +serviceability/sa/ClhsdbPrintAll.java 8193639 solaris
> +serviceability/sa/ClhsdbPrintAs.java 8193639 solaris
> +serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris
> +serviceability/sa/ClhsdbPstack.java 8193639 solaris
> +serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris
> +serviceability/sa/ClhsdbScanOops.java 8193639 solaris
> +serviceability/sa/ClhsdbSource.java 8193639 solaris
> +serviceability/sa/ClhsdbSymbol.java 8193639 solaris
> +serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris
> +serviceability/sa/ClhsdbThread.java 8193639 solaris
> +serviceability/sa/ClhsdbWhere.java 8193639 solaris
> +serviceability/sa/DeadlockDetectionTest.java 8193639 solaris
> +serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris
> +serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
> +serviceability/sa/TestClassDump.java 8193639 solaris
> +serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris
> +serviceability/sa/TestDefaultMethods.java 8193639 solaris
> +serviceability/sa/TestG1HeapRegion.java 8193639 solaris
> +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
> +serviceability/sa/TestType.java 8193639 solaris
> +serviceability/sa/TestUniverse.java 8193639 solaris
>
>  #############################################################################
>
>
>
> In the above diff, it looks like I deleted these entries:
>
> -serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
> -serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
> -serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
>
> What I really did was delete the spaces (like most of the other entries
> in the hotspot ProblemList):
>
> +serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
> +serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
> +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>
> I also put those entries in sort order which is why a 'diff -w'
> wasn't used...
>
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan


From daniel.daugherty at oracle.com  Wed Jul 25 19:33:31 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 15:33:31 -0400
Subject: RFR(XS): 8208205: ProblemList tests that fail due to 'Error
 attaching to process: Can't create thread_db agent!'
In-Reply-To: <7b268608-36c2-8243-9e21-fd06b44f0dab@oracle.com>
References: <8e4359d7-7433-2d93-c7a0-d9b2c1b4b548@oracle.com>
 <7b268608-36c2-8243-9e21-fd06b44f0dab@oracle.com>
Message-ID: <99ca6ff2-d806-825f-e10b-a7f1354755c7@oracle.com>

Chris,

Thanks for the quick review!

Dan


On 7/25/18 3:32 PM, Chris Plummer wrote:
> Hi Dan,
>
> Looks good to me. Thanks for cleaning up the noise.
>
> Chris
>
> On 7/25/18 12:03 PM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
>> I need a single (R)eviewer for the following fix:
>>
>>   JDK-8208205 ProblemList tests that fail due to 'Error attaching to
>>               process: Can't create thread_db agent!'
>>   https://bugs.openjdk.java.net/browse/JDK-8208205
>>
>> Here's the diff:
>>
>> $ hg diff test/hotspot/jtreg/ProblemList.txt
>> diff -r 628718bf8970 test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 12:32:06 2018 -0400
>> +++ b/test/hotspot/jtreg/ProblemList.txt    Wed Jul 25 14:47:58 2018 -0400
>> @@ -74,14 +74,43 @@
>>  # :hotspot_runtime
>>
>>  runtime/CompressedOops/UseCompressedOops.java 8079353 generic-all
>> +runtime/SharedArchiveFile/SASymbolTableTest.java 8193639 solaris
>>
>>  #############################################################################
>>
>>  # :hotspot_serviceability
>>
>> -serviceability/sa/ClhsdbCDSCore.java                 8207832 linux-x64
>> -serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>> -serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
>> +serviceability/sa/ClhsdbAttach.java 8193639 solaris
>> +serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
>> +serviceability/sa/ClhsdbField.java 8193639 solaris
>> +serviceability/sa/ClhsdbFindPC.java 8193639 solaris
>> +serviceability/sa/ClhsdbInspect.java 8193639 solaris
>> +serviceability/sa/ClhsdbJdis.java 8193639 solaris
>> +serviceability/sa/ClhsdbJhisto.java 8193639 solaris
>> +serviceability/sa/ClhsdbJstack.java 8193639 solaris
>> +serviceability/sa/ClhsdbLongConstant.java 8193639 solaris
>> +serviceability/sa/ClhsdbPmap.java 8193639 solaris
>> +serviceability/sa/ClhsdbPrintAll.java 8193639 solaris
>> +serviceability/sa/ClhsdbPrintAs.java 8193639 solaris
>> +serviceability/sa/ClhsdbPrintStatics.java 8193639 solaris
>> +serviceability/sa/ClhsdbPstack.java 8193639 solaris
>> +serviceability/sa/ClhsdbRegionDetailsScanOopsForG1.java 8193639 solaris
>> +serviceability/sa/ClhsdbScanOops.java 8193639 solaris
>> +serviceability/sa/ClhsdbSource.java 8193639 solaris
>> +serviceability/sa/ClhsdbSymbol.java 8193639 solaris
>> +serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris
>> +serviceability/sa/ClhsdbThread.java 8193639 solaris
>> +serviceability/sa/ClhsdbWhere.java 8193639 solaris
>> +serviceability/sa/DeadlockDetectionTest.java 8193639 solaris
>> +serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris
>> +serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
>> +serviceability/sa/TestClassDump.java 8193639 solaris
>> +serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris
>> +serviceability/sa/TestDefaultMethods.java 8193639 solaris
>> +serviceability/sa/TestG1HeapRegion.java 8193639 solaris
>> +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>> +serviceability/sa/TestType.java 8193639 solaris
>> +serviceability/sa/TestUniverse.java 8193639 solaris
>>
>>  #############################################################################
>>
>>
>>
>> In the above diff, it looks like I deleted these entries:
>>
>> -serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
>> -serviceability/sa/TestRevPtrsForInvokeDynamic.java   8191270 generic-all
>> -serviceability/sa/sadebugd/SADebugDTest.java         8163805 generic-all
>>
>> What I really did was delete the spaces (like most of the other entries
>> in the hotspot ProblemList):
>>
>> +serviceability/sa/ClhsdbCDSCore.java 8207832 linux-x64
>> +serviceability/sa/sadebugd/SADebugDTest.java 8163805 generic-all
>> +serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>>
>> I also put those entries in sort order which is why a 'diff -w'
>> wasn't used...
>>
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>
>
>


From serguei.spitsyn at oracle.com  Wed Jul 25 20:31:28 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 13:31:28 -0700
Subject: JDK-8199155 : Accessibility issues in jdk.jdi
In-Reply-To: <A4DA7893-49AA-4EBE-864D-1B9DD4DC9A2B@oracle.com>
References: <E2099E4F-9A0C-4F5F-886C-87859F2A8D7D@oracle.com>
 <A4DA7893-49AA-4EBE-864D-1B9DD4DC9A2B@oracle.com>
Message-ID: <fa70ded4-6a96-2c72-796d-340e11a3b9a2@oracle.com>

Hi Alex,

+1

Thanks,
Serguei


On 7/25/18 11:47, Daniil Titov wrote:
> Looks good to me.
>
> --Daniil
>
> On 7/25/18, 11:23 AM, "serviceability-dev on behalf of Alex Menkov" <serviceability-dev-bounces at openjdk.java.net on behalf of alexey.menkov at oracle.com> wrote:
>
>      Hi,
>      
>      please review the following fix for
>      https://bugs.openjdk.java.net/browse/JDK-8199155
>      
>      webrev:
>      http://cr.openjdk.java.net/~amenkov/accessibility/webrev/
>      
>      The fix adds standard "banner", "navigation", "main" regions
>      and fixes "<dl> without <dt>" issue.
>      For <dl> and <dd>, the styles used by most browsers are applied:
>      dl { margin-top: 1em; margin-bottom: 1em; }
>      dd { margin-left: 40px; }
>      
>      --alex
>      
>      
>
>


From daniel.daugherty at oracle.com  Wed Jul 25 20:50:34 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 16:50:34 -0400
Subject: RFR(XXS): 8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
Message-ID: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>

Greetings,

I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
I need a single (R)eviewer for the following fix:

   JDK-8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
   https://bugs.openjdk.java.net/browse/JDK-8208226

Here's the diff:

$ hg diff
diff -r ec6d5843068a test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt Wed Jul 25 15:38:37 2018 -0400
+++ b/test/jdk/ProblemList.txt Wed Jul 25 16:44:33 2018 -0400
@@ -834,6 +834,8 @@

  # jdk_jdi

+com/sun/jdi/BasicJDWPConnectionTest.java 8195703 generic-all
+
  com/sun/jdi/RedefineImplementor.sh 8004127 generic-all

  com/sun/jdi/JdbExprTest.sh 8203393 solaris-all


Thanks, in advance, for any questions, comments or suggestions.

Dan


From serguei.spitsyn at oracle.com  Wed Jul 25 21:03:16 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 14:03:16 -0700
Subject: RFR(XXS): 8208226 ProblemList
 com/sun/jdi/BasicJDWPConnectionTest.java
In-Reply-To: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>
References: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>
Message-ID: <5ba15e38-e3bc-1a75-2f5b-0c1d4806a206@oracle.com>

Hi Dan,

Looks good.

Thanks,
Serguei


On 7/25/18 13:50, Daniel D. Daugherty wrote:
> Greetings,
>
> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
> I need a single (R)eviewer for the following fix:
>
>   JDK-8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
>   https://bugs.openjdk.java.net/browse/JDK-8208226
>
> Here's the diff:
>
> $ hg diff
> diff -r ec6d5843068a test/jdk/ProblemList.txt
> --- a/test/jdk/ProblemList.txt Wed Jul 25 15:38:37 2018 -0400
> +++ b/test/jdk/ProblemList.txt Wed Jul 25 16:44:33 2018 -0400
> @@ -834,6 +834,8 @@
>
>  # jdk_jdi
>
> +com/sun/jdi/BasicJDWPConnectionTest.java 8195703 generic-all
> +
>  com/sun/jdi/RedefineImplementor.sh 8004127 generic-all
>
>  com/sun/jdi/JdbExprTest.sh 8203393 solaris-all
>
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>


From daniel.daugherty at oracle.com  Wed Jul 25 21:16:16 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 25 Jul 2018 17:16:16 -0400
Subject: RFR(XXS): 8208226 ProblemList
 com/sun/jdi/BasicJDWPConnectionTest.java
In-Reply-To: <5ba15e38-e3bc-1a75-2f5b-0c1d4806a206@oracle.com>
References: <6e585aa2-0bf1-bcaa-7c1b-848ab0493a7d@oracle.com>
 <5ba15e38-e3bc-1a75-2f5b-0c1d4806a206@oracle.com>
Message-ID: <ab802cad-54ec-c8f8-68d4-7cc9567c6e05@oracle.com>

Serguei,

Thanks for the very fast review!

Dan


On 7/25/18 5:03 PM, serguei.spitsyn at oracle.com wrote:
> Hi Dan,
>
> Looks good.
>
> Thanks,
> Serguei
>
>
> On 7/25/18 13:50, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
>> I need a single (R)eviewer for the following fix:
>>
>>   JDK-8208226 ProblemList com/sun/jdi/BasicJDWPConnectionTest.java
>>   https://bugs.openjdk.java.net/browse/JDK-8208226
>>
>> Here's the diff:
>>
>> $ hg diff
>> diff -r ec6d5843068a test/jdk/ProblemList.txt
>> --- a/test/jdk/ProblemList.txt Wed Jul 25 15:38:37 2018 -0400
>> +++ b/test/jdk/ProblemList.txt Wed Jul 25 16:44:33 2018 -0400
>> @@ -834,6 +834,8 @@
>>
>>  # jdk_jdi
>>
>> +com/sun/jdi/BasicJDWPConnectionTest.java 8195703 generic-all
>> +
>>  com/sun/jdi/RedefineImplementor.sh 8004127 generic-all
>>
>>  com/sun/jdi/JdbExprTest.sh 8203393 solaris-all
>>
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>
>


From daniil.x.titov at oracle.com  Wed Jul 25 23:11:57 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 25 Jul 2018 16:11:57 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
Message-ID: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>

Hello,

Please review the change that fixes the test issue. The fix increases the metaspace size and corrects the path to the class files.

Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
Issue: https://bugs.openjdk.java.net/browse/JDK-8207364

Thanks!

Best regards,
Daniil


From serguei.spitsyn at oracle.com  Wed Jul 25 23:32:07 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 16:32:07 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
Message-ID: <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>

Hi Daniil,

It looks good to me.
What is the need to increase the metaspace size?

Thanks,
Serguei


On 7/25/18 16:11, Daniil Titov wrote:
> Hello,
>
> Please review the change that fixes the test issue. The fix increases the metaspace size and corrects the path to the class files.
>
> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
>
> Thanks!
>
> Best regards,
> Daniil
>
>
>


From daniil.x.titov at oracle.com  Thu Jul 26 00:38:13 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Wed, 25 Jul 2018 17:38:13 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
Message-ID: <7076F356-EC65-41C4-A8AD-E1C9D53223AC@oracle.com>

Hi Serguei,

On 64-bit machines Java fails to initialize the VM and prints the "MaxMetaspaceSize is too small." diagnostic if the max metaspace size is set to 8MB or less (java -XX:MaxMetaspaceSize=8m).

Per open/src/hotspot/share/memory/metaspace.cpp (line 1140) and open/src/hotspot/share/runtime/globals.hpp (line 1059), MaxMetaspaceSize on 64-bit machines should be greater than 8MB. Compared to Java 8, where the metaspace size only had to be greater than 4MB, it seems these limits have been increased.
  
cat -n open/src/hotspot/share/memory/metaspace.cpp

  880	
  881	#define VIRTUALSPACEMULTIPLIER 2
  882	

  1135	  // Initial virtual space size will be calculated at global_initialize()
  1136	  size_t min_metaspace_sz =
  1137	      VIRTUALSPACEMULTIPLIER * InitialBootClassLoaderMetaspaceSize;
  1138	  if (UseCompressedClassPointers) {
  1139	    if ((min_metaspace_sz + CompressedClassSpaceSize) >  MaxMetaspaceSize) {
  1140	      if (min_metaspace_sz >= MaxMetaspaceSize) {
  1141	        vm_exit_during_initialization("MaxMetaspaceSize is too small.");
  1142	      } else {
  1143	        FLAG_SET_ERGO(size_t, CompressedClassSpaceSize,
  1144	                      MaxMetaspaceSize - min_metaspace_sz);
  1145	      }
  1146	    }

cat -n open/src/hotspot/share/runtime/globals.hpp

  1058	  product(size_t, InitialBootClassLoaderMetaspaceSize,                      \
  1059	          NOT_LP64(2200*K) LP64_ONLY(4*M),                                  \
  1060	          "Initial size of the boot class loader data metaspace")           \
  1061	          range(30*K, max_uintx/BytesPerWord)                               \
  1062	          constraint(InitialBootClassLoaderMetaspaceSizeConstraintFunc, AfterErgo)\
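
The arithmetic behind the two listings above can be sketched in Java as follows. This is only an illustration, not HotSpot code: the class and method names are made up, the constants are copied from the listings, and the check mirrors the "min_metaspace_sz >= MaxMetaspaceSize" branch that applies when UseCompressedClassPointers is on.

```java
// Illustrative sketch only; names are hypothetical, constants come from the listings above.
final class MetaspaceMinimumSketch {
    static final long K = 1024;
    static final long M = 1024 * K;

    // #define VIRTUALSPACEMULTIPLIER 2 (metaspace.cpp)
    static final long VIRTUALSPACEMULTIPLIER = 2;

    // InitialBootClassLoaderMetaspaceSize defaults (globals.hpp)
    static final long INITIAL_BOOT_METASPACE_LP64 = 4 * M;
    static final long INITIAL_BOOT_METASPACE_32BIT = 2200 * K;

    // Mirrors the branch that ends in vm_exit_during_initialization().
    static boolean maxMetaspaceSizeTooSmall(long maxMetaspaceSize, long initialBootSize) {
        long minMetaspaceSize = VIRTUALSPACEMULTIPLIER * initialBootSize;
        return minMetaspaceSize >= maxMetaspaceSize;
    }

    public static void main(String[] args) {
        // 2 * 4MB = 8MB, so 8MB is rejected and 9MB is accepted on 64 bit.
        System.out.println(maxMetaspaceSizeTooSmall(8 * M, INITIAL_BOOT_METASPACE_LP64));
        System.out.println(maxMetaspaceSizeTooSmall(9 * M, INITIAL_BOOT_METASPACE_LP64));
    }
}
```

With the 64-bit default the minimum is 2 * 4MB = 8MB, which is why -XX:MaxMetaspaceSize=8m is rejected while any larger value passes this particular check.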


Best regards,
Daniil

On 7/25/18, 4:32 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    Hi Daniil,
    
    It looks good to me.
    What is the need to increase the metaspace size?
    
    Thanks,
    Serguei
    
    
    On 7/25/18 16:11, Daniil Titov wrote:
    > Hello,
    >
    > Please review the change that fix the test issue. The fix increases the  metaspace size and corrects the path to the class files.
    >
    > Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
    > Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
    >
    > Thanks!
    >
    > Best regards,
    > Daniil
    >
    >
    >
    
    
From serguei.spitsyn at oracle.com  Thu Jul 26 01:38:02 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 25 Jul 2018 18:38:02 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <7076F356-EC65-41C4-A8AD-E1C9D53223AC@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
 <7076F356-EC65-41C4-A8AD-E1C9D53223AC@oracle.com>
Message-ID: <c265e3e6-a713-7e57-2e63-1fce8d833cd1@oracle.com>

Daniil,

Thank you for the explanation.

Thanks,
Serguei


On 7/25/18 17:38, Daniil Titov wrote:
> Hi Serguei,
>
> On 64 bit machines Java fails to initialize a VM and prints " MaxMetaspaceSize is too small."  diagnostic  if the max metaspace size set to 8MB or less (java -XX:MaxMetaspaceSize=8m)
>
> Per  open/src/hotspot/share/memory/metaspace.cpp (line 1140) and open/src/hotspot/share/runtime/globals.hpp (line 1059)  MaxMetaspaceSize  on 64 bit machines should be greater than 8MB.  Comparing it to the behavior of Java 8 it seems as these settings were increased since Java 8 where the metaspace size should be greater than 4MB only.
>    
> cat -n open/src/hotspot/share/memory/metaspace.cpp
>
>    880	
>    881	#define VIRTUALSPACEMULTIPLIER 2
>    882	
>
>    1135	  // Initial virtual space size will be calculated at global_initialize()
>    1136	  size_t min_metaspace_sz =
>    1137	      VIRTUALSPACEMULTIPLIER * InitialBootClassLoaderMetaspaceSize;
>    1138	  if (UseCompressedClassPointers) {
>    1139	    if ((min_metaspace_sz + CompressedClassSpaceSize) >  MaxMetaspaceSize) {
>    1140	      if (min_metaspace_sz >= MaxMetaspaceSize) {
>    1141	        vm_exit_during_initialization("MaxMetaspaceSize is too small.");
>    1142	      } else {
>    1143	        FLAG_SET_ERGO(size_t, CompressedClassSpaceSize,
>    1144	                      MaxMetaspaceSize - min_metaspace_sz);
>    1145	      }
>    1146	    }
>
> cat -n open/src/hotspot/share/runtime/globals.hpp
>
> 1058	  product(size_t, InitialBootClassLoaderMetaspaceSize,                      \
>    1059	          NOT_LP64(2200*K) LP64_ONLY(4*M),                                  \
>    1060	          "Initial size of the boot class loader data metaspace")           \
>    1061	          range(30*K, max_uintx/BytesPerWord)                               \
>    1062	          constraint(InitialBootClassLoaderMetaspaceSizeConstraintFunc, AfterErgo)\
>
>
> Best regards,
> Daniil
>
> On 7/25/18, 4:32 PM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:
>
>      Hi Daniil,
>      
>      It looks good to me.
>      What is the need to increase the metaspace size?
>      
>      Thanks,
>      Serguei
>      
>      
>      On 7/25/18 16:11, Daniil Titov wrote:
>      > Hello,
>      >
>      > Please review the change that fix the test issue. The fix increases the  metaspace size and corrects the path to the class files.
>      >
>      > Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
>      > Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
>      >
>      > Thanks!
>      >
>      > Best regards,
>      > Daniil
>      >
>      >
>      >
>      
>      
>
>


From chris.plummer at oracle.com  Thu Jul 26 04:09:14 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 25 Jul 2018 21:09:14 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
Message-ID: <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>

Hi Daniil,

After reading some old comments I added to JDK-6606767, I wonder if 
bumping the metaspace size all the way up to 16m is the right thing to 
do. It seems the test wants to exhaust the metaspace, so maybe it should 
be set to the smallest allowed size. Is the test still exhausting the 
metaspace even when it is 16M? Is there a smaller size that will also work?

Also, regarding the class path, what impact was this bug having on the test?

thanks,

Chris

On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
> Hi Daniil,
>
> It looks good to me.
> What is the need to increase the metaspace size?
>
> Thanks,
> Serguei
>
>
> On 7/25/18 16:11, Daniil Titov wrote:
>> Hello,
>>
>> Please review the change that fix the test issue. The fix increases 
>> the metaspace size and corrects the path to the class files.
>>
>> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
>>
>> Thanks!
>>
>> Best regards,
>> Daniil
>>
>>
>>
>


From fairoz.matte at oracle.com  Thu Jul 26 04:20:43 2018
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Wed, 25 Jul 2018 21:20:43 -0700 (PDT)
Subject: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException:
 Can't assign double[][][] to double[][][]
In-Reply-To: <96be102f-67c4-462c-89ba-a728b8b81ddc@oracle.com>
References: <5be949d0-70ba-4a9d-9541-907b37d3d0fb@default>
 <07fea5b7-6922-19d0-d58a-8aa5fb95b69d@oracle.com>
 <96be102f-67c4-462c-89ba-a728b8b81ddc@oracle.com>
Message-ID: <309fc6ca-8099-4ebc-8502-a4e88e899b8e@default>

Hi Chris and Serguei,

Thanks for the review, I will add the appropriate noreg label.

Thanks,
Fairoz


From: Serguei Spitsyn 
Sent: Wednesday, July 25, 2018 11:24 PM
To: Chris Plummer <chris.plummer at oracle.com>; Fairoz Matte <fairoz.matte at oracle.com>; serviceability-dev at openjdk.java.net
Subject: Re: [8u-backport] RFR: JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]

Hi Fairoz,

Looks good to me too.
Thank you for taking care of this backport!

On 7/25/18 10:31, Chris Plummer wrote:
Hi Fairoz, 

The changes look good. I'm not sure what the policy is when part of the (full) backport contains test changes that aren't directly applicable to 8u. You might need some sort of noreg label on the backport CR.

The test test/hotspot/jtreg/vmTestbase/nsk/jdb/eval/eval001 is located in the VM testbase which is a separate repository for jdk 8.
I agree with Chris, noreg label on the backport CR is probably needed.

Thanks,
Serguei


thanks, 

Chris 

On 7/25/18 1:23 AM, Fairoz Matte wrote: 

Hi, 

Kindly review the backport of "JDK-8191948: jdb error: InvalidTypeException: Can't assign double[][][] to double[][][]" to 8u 

Webrev - http://cr.openjdk.java.net/~fmatte/8191948/webrev.00/ 

JDK 11 bug - https://bugs.openjdk.java.net/browse/JDK-8191948 

JDK 11 changeset - http://hg.openjdk.java.net/jdk/jdk11/rev/73c769e0486a 

Review thread - http://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024405.html

Thanks, 
Fairoz 


From sgehwolf at redhat.com  Thu Jul 26 07:33:32 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 26 Jul 2018 09:33:32 +0200
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
Message-ID: <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>

On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> Hi,
> 
> Could I please get a review of this one-liner change related to jhsdb
> --mixed when attaching to a running Java process? The issue arises when
> threads are in native code and that native code has frame pointers not
> properly preserved. In such a case the SA performs a simple frame
> pointer validity check: ebp >= esp
> 
> However, the code retrieving the value for esp is incorrect inasmuch
> as it is not in sync with the native code with regard to the register
> index:
> 
> native code => X86ThreadContext.SP
> Java code   => X86ThreadContext.ESP
> 
> X86ThreadContext.ESP is never being set by the native code. Since
> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
> returns null, ebp.lessThan(esp) wrongly returns false causing the
> issue. This webrev fixes it by using SP as index on the Java side.
> Thoughts?
> 
> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
> bug: https://bugs.openjdk.java.net/browse/JDK-8208091

Anyone willing to review this one-liner?

Thanks,
Severin

> Thanks,
> Severin
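
The index mismatch described in the quoted request can be illustrated with a small self-contained Java sketch. It does not use the real SA classes: the register map, the index values, and the null-handling in lessThan() are stand-ins chosen to match the behavior described above (a register slot that native code never fills reads back as null, and the comparison against null yields false).

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in for the SA thread context; not the real X86ThreadContext.
final class RegisterIndexSketch {
    // Placeholder index values, assumed for this sketch only.
    static final int FP = 6;
    static final int SP = 7;
    static final int ESP = 17;

    // The "native side" fills FP and SP, but never ESP.
    static Map<Integer, Long> contextFromNative(long fp, long sp) {
        Map<Integer, Long> ctx = new HashMap<>();
        ctx.put(FP, fp);
        ctx.put(SP, sp);
        return ctx;
    }

    // Assumed semantics: comparing against a missing (null) address returns false.
    static boolean lessThan(Long a, Long b) {
        if (b == null) {
            return false;
        }
        return Long.compareUnsigned(a, b) < 0;
    }

    // Frame pointer sanity check: ebp >= esp.
    static boolean framePointerLooksValid(Map<Integer, Long> ctx, int spIndex) {
        Long ebp = ctx.get(FP);
        Long esp = ctx.get(spIndex); // null when spIndex was never set
        return !lessThan(ebp, esp);
    }

    public static void main(String[] args) {
        // A clobbered frame pointer that lies below the stack pointer.
        Map<Integer, Long> ctx = contextFromNative(0x1000L, 0x2000L);
        System.out.println("using ESP index: " + framePointerLooksValid(ctx, ESP));
        System.out.println("using SP index:  " + framePointerLooksValid(ctx, SP));
    }
}
```

In this sketch the bogus frame pointer passes the check when the unset ESP slot is read and is rejected once the SP slot is used, which is the effect the one-line change is after.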


From volker.simonis at gmail.com  Thu Jul 26 09:36:15 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 26 Jul 2018 11:36:15 +0200
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
 <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>
 <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>
 <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
 <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>
Message-ID: <CA+3eh11v3Cus6V8H9d-NHCM_iikQv6mwxw2wSZV1KkY7Hx007Q@mail.gmail.com>

Hi Sergey,

thanks for your help, but I've just pushed the fix now.

@Thomas: sorry, I really apologize, but I've just realized that I
forgot to add you as a Reviewer :( I promise to look more carefully
next time.

Regards,
Volker


On Tue, Jul 24, 2018 at 6:01 PM, serguei.spitsyn at oracle.com
<serguei.spitsyn at oracle.com> wrote:
> Hi Ralf,
>
> I think, you have to consider it reviewed.
> Sorry, I was not clear no new webrev is needed.
>
> Do you need a sponsor for the push?
>
> Thanks,
> Serguei
>
>
>
> On 7/24/18 06:32, Schmelter, Ralf wrote:
>>
>> Hi all,
>>
>> here is the updated webrev with the fixed copyright:
>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/
>>
>> Best regards,
>> Ralf
>>
>> -----Original Message-----
>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>> Sent: Freitag, 20. Juli 2018 23:04
>> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf
>> <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe,
>> Thomas <thomas.stuefe at sap.com>
>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
>> prevent quadratic runtime behavior
>>
>> On 7/20/18 13:44, Chris Plummer wrote:
>>>
>>> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>>>>
>>>> Hi Ralf,
>>>>
>>>>
>>>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>>>>
>>>>> Hi Sergue,
>>>>>
>>>>> I've updated the webrev:
>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>>>
>>>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not
>>>> 2008.
>>>>
>>>>
>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html
>>>>
>>>>
>>>>    72             if (newDepth == -1_000) {
>>>>    73                 // Pop some frames so there is room on the stack
>>>> for the
>>>>    74                 // call (including println()).
>>>>    75                 notifyRecursionEnded();
>>>>    76             }
>>>>
>>>>    I have a concern on potential issue mentioned in the comment above.
>>>>    Should a StackOverflowError be expected here?
>>>>
>>>>    79         } catch (StackOverflowError e) {
>>>>    80             // Use negative depth to indicate the recursion has
>>>> ended.
>>>>    81             return -1;
>>>>    82         }
>>>>
>>>>    What is going to happen if the StackOverflowError was really caught
>>>> above?
>>>
>>> The SOE is really caught in the above code. It returns -1, and starts
>>> the unwinding of the stack. After 1000 frames have been popped via
>>> returns, notifyRecursionEnded() will be called. The pops are so
>>> notifyRecursionEnded() can be called without worry of another SOE.
>>
>> Got it, thanks Chris.
>>
>> So, I'm Okay with the fix assuming the copyright year is fixed.
>>
>> Thanks,
>> Serguei
>
>
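
The unwinding trick discussed in the quoted review (catch the StackOverflowError, return -1, and only call out again after 1000 frames have been popped) can be sketched as below. This is a reconstruction from the quoted fragments, not the actual Frames2Test source: the class name, the counter, and the exact shape of the recursive method are assumptions.

```java
// Sketch reconstructed from the quoted fragments; not the real Frames2Test code.
final class RecursionUnwindSketch {
    static int notifications = 0;

    // Stand-in for the test's callback; it needs stack space of its own.
    static void notifyRecursionEnded() {
        notifications++;
    }

    // Recurses until the stack overflows, then counts down while unwinding.
    static int recurse(int depth) {
        try {
            int newDepth = recurse(depth + 1);
            if (newDepth == -1_000) {
                // 1000 frames have been popped, so there is room for the call.
                notifyRecursionEnded();
            }
            return newDepth - 1;
        } catch (StackOverflowError e) {
            // Use negative depth to indicate the recursion has ended.
            return -1;
        }
    }

    public static void main(String[] args) {
        int result = recurse(0);
        System.out.println("unwound " + (-result) + " frames, notifications=" + notifications);
    }
}
```

The negative return value doubles as a counter of popped frames, so the callback is made exactly once and only at a point where another overflow is not a concern.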

From volker.simonis at gmail.com  Thu Jul 26 09:43:34 2018
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 26 Jul 2018 11:43:34 +0200
Subject: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
 prevent quadratic runtime behavior
In-Reply-To: <CA+3eh11v3Cus6V8H9d-NHCM_iikQv6mwxw2wSZV1KkY7Hx007Q@mail.gmail.com>
References: <709161f438f848b0af5fb079c9c0242a@sap.com>
 <7e2e9834-22ce-19ce-8837-e4e83a0c0099@oracle.com>
 <21e17c666ac04930a0e4bb4869e989da@sap.com>
 <a7e3536694e84e859bb14e3cb19f292c@sap.com>
 <5981fb37-1465-9bad-7af2-653e7c72bd38@oracle.com>
 <f96a16917c934a539523e078f902b880@sap.com>
 <9c828d50-98f3-b13e-cf9e-8dd48c427e7d@oracle.com>
 <60ea7c00-70ed-94cb-acab-34b501cc1069@oracle.com>
 <6de6362944f84740b80abb22cbbea872@sap.com>
 <8f3dbf24-9236-1226-75da-034fe146a4f4@oracle.com>
 <eac7c9ba-1d94-3efe-a5ac-1b54bf6303e9@oracle.com>
 <343bce9f-7072-17fd-9351-1c62f67a12de@oracle.com>
 <92dcce7000a94cf89ae2169cb1f843f2@sap.com>
 <76229193-7f46-a17a-7ebb-bddbd3d698b9@oracle.com>
 <CA+3eh11v3Cus6V8H9d-NHCM_iikQv6mwxw2wSZV1KkY7Hx007Q@mail.gmail.com>
Message-ID: <CA+3eh10uNj_oS39-oDOhMDupVKw3Mddm8NpwWVtMgxBwjUbNSg@mail.gmail.com>

Oh my God!

And I also forgot to add Ralf as a Contributor :(:(:(

I really desperately need a vacation!

Sorry Ralf,
Volker


On Thu, Jul 26, 2018 at 11:36 AM, Volker Simonis
<volker.simonis at gmail.com> wrote:
> Hi Sergey,
>
> thanks for your help, but I've just pushed the fix now.
>
> @Thomas: sorry, I really apologize, but I've just realized that I
> forgot to add you as a Reviewer :( I promise to look more carefully
> next time.
>
> Regards,
> Volker
>
>
> On Tue, Jul 24, 2018 at 6:01 PM, serguei.spitsyn at oracle.com
> <serguei.spitsyn at oracle.com> wrote:
>> Hi Ralf,
>>
>> I think, you have to consider it reviewed.
>> Sorry, I was not clear no new webrev is needed.
>>
>> Do you need a sponsor for the push?
>>
>> Thanks,
>> Serguei
>>
>>
>>
>> On 7/24/18 06:32, Schmelter, Ralf wrote:
>>>
>>> Hi all,
>>>
>>> here is the updated webrev with the fixed copyright:
>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v5/
>>>
>>> Best regards,
>>> Ralf
>>>
>>> -----Original Message-----
>>> From: serguei.spitsyn at oracle.com [mailto:serguei.spitsyn at oracle.com]
>>> Sent: Freitag, 20. Juli 2018 23:04
>>> To: Chris Plummer <chris.plummer at oracle.com>; Schmelter, Ralf
>>> <ralf.schmelter at sap.com>; serviceability-dev at openjdk.java.net; Stuefe,
>>> Thomas <thomas.stuefe at sap.com>
>>> Subject: Re: RFR (S) 8205608: Fix 'frames()' in ThreadReferenceImpl.c to
>>> prevent quadratic runtime behavior
>>>
>>> On 7/20/18 13:44, Chris Plummer wrote:
>>>>
>>>> On 7/20/18 1:40 PM, serguei.spitsyn at oracle.com wrote:
>>>>>
>>>>> Hi Ralf,
>>>>>
>>>>>
>>>>> On 7/20/18 07:28, Schmelter, Ralf wrote:
>>>>>>
>>>>>> Hi Sergue,
>>>>>>
>>>>>> I've updated the webrev:
>>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/
>>>>>
>>>>> The copyright year in ThreadReferenceImpl.c still has to be 2018, not
>>>>> 2008.
>>>>>
>>>>>
>>>>> http://cr.openjdk.java.net/~simonis/webrevs/2018/8205608.v4/test/jdk/com/sun/jdi/Frames2Test.java.html
>>>>>
>>>>>
>>>>>    72             if (newDepth == -1_000) {
>>>>>    73                 // Pop some frames so there is room on the stack
>>>>> for the
>>>>>    74                 // call (including println()).
>>>>>    75                 notifyRecursionEnded();
>>>>>    76             }
>>>>>
>>>>>    I have a concern on potential issue mentioned in the comment above.
>>>>>    Should a StackOverflowError be expected here?
>>>>>
>>>>>    79         } catch (StackOverflowError e) {
>>>>>    80             // Use negative depth to indicate the recursion has
>>>>> ended.
>>>>>    81             return -1;
>>>>>    82         }
>>>>>
>>>>>    What is going to happen if the StackOverflowError was really caught
>>>>> above?
>>>>
>>>> The SOE is really caught in the above code. It returns -1, and starts
>>>> the unwinding of the stack. After 1000 frames have been popped via
>>>> returns, notifyRecursionEnded() will be called. The pops are so
>>>> notifyRecursionEnded() can be called without worry of another SOE.
>>>
>>> Got it, thanks Chris.
>>>
>>> So, I'm Okay with the fix assuming the copyright year is fixed.
>>>
>>> Thanks,
>>> Serguei
>>
>>

From yasuenag at gmail.com  Thu Jul 26 12:30:44 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 26 Jul 2018 21:30:44 +0900
Subject: PING: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is
 working
In-Reply-To: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com>
References: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com>
Message-ID: <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com>

PING: Could you review it?

>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/


Yasumasa


On 2018/07/19 23:03, Yasumasa Suenaga wrote:
> Hi all,
> 
> Please review this webrev.
> 
>      JBS: https://bugs.openjdk.java.net/browse/JDK-8207843
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/
> 
> I encountered AssertionFailure when I attached HSDB to the process which is working with ZGC as below:
> 
> sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap
>      at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32)
>      at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448)
>      at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173)
>      at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741)
>      at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70)
>      at java.base/java.lang.Thread.run(Thread.java:832)
> 
> ObjectHeap#collectLiveRegions() branches by instance type of CollectedHeap. However it does not support ZCollectedHeap.
> So I add ZCollectedHeap to it and add some methods to iterate ZPageTable.
> 
> 
> Thanks,
> 
> Yasumasa
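
The failure mode quoted above (a method that branches on the concrete heap type and asserts on anything it does not recognize) can be sketched generically in Java. The types below are hypothetical stand-ins, not the SA classes, and the boolean switch only exists so both the failing and the extended behavior can be shown side by side.

```java
// Hypothetical stand-ins; not the real sun.jvm.hotspot classes.
final class HeapDispatchSketch {
    interface CollectedHeap {}
    static final class GenHeap implements CollectedHeap {}
    static final class ZHeap implements CollectedHeap {}

    // Branches on the instance type of the heap, as described in the quoted mail.
    static String collectLiveRegions(CollectedHeap heap, boolean zSupported) {
        if (heap instanceof GenHeap) {
            return "iterate generations";
        } else if (zSupported && heap instanceof ZHeap) {
            // The added branch: walk the page table instead of generations.
            return "iterate page table";
        }
        throw new IllegalStateException(
            "Unexpected CollectedHeap type: " + heap.getClass().getName());
    }

    public static void main(String[] args) {
        System.out.println(collectLiveRegions(new ZHeap(), true));
        try {
            collectLiveRegions(new ZHeap(), false);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Any new collector needs its own branch in such a dispatch, otherwise tools built on top of it fail as soon as they meet the new heap type.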

From yasuenag at gmail.com  Thu Jul 26 13:52:10 2018
From: yasuenag at gmail.com (Yasumasa Suenaga)
Date: Thu, 26 Jul 2018 22:52:10 +0900
Subject: ZGC: RFR: 8207843: HSDB cannot show Object Histogram when ZGC is
 working
In-Reply-To: <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com>
References: <0e8ce2e6-e043-2320-e660-a2f1f4df820e@gmail.com>
 <3d4f8faf-e26a-0e6b-6df1-73b6600ee5a0@gmail.com>
Message-ID: <06ceb864-bca5-d89c-c54e-fbfce3585066@gmail.com>

CC'ing to hotspot-gc-dev


On 2018/07/26 21:30, Yasumasa Suenaga wrote:
> PING: Could you review it?
> 
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/
> 
> 
> Yasumasa
> 
> 
> On 2018/07/19 23:03, Yasumasa Suenaga wrote:
>> Hi all,
>>
>> Please review this webrev.
>>
>>       JBS: https://bugs.openjdk.java.net/browse/JDK-8207843
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8207843/webrev.00/
>>
>> I encountered AssertionFailure when I attached HSDB to the process which is working with ZGC as below:
>>
>> sun.jvm.hotspot.utilities.AssertionFailure: Unexpected CollectedHeap type: sun.jvm.hotspot.gc.z.ZCollectedHeap
>>      at jdk.hotspot.agent/sun.jvm.hotspot.utilities.Assert.that(Assert.java:32)
>>      at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.collectLiveRegions(ObjectHeap.java:448)
>>      at jdk.hotspot.agent/sun.jvm.hotspot.oops.ObjectHeap.iterate(ObjectHeap.java:173)
>>      at jdk.hotspot.agent/sun.jvm.hotspot.HSDB$VisitHeap.run(HSDB.java:1741)
>>      at jdk.hotspot.agent/sun.jvm.hotspot.utilities.WorkerThread$MainLoop.run(WorkerThread.java:70)
>>      at java.base/java.lang.Thread.run(Thread.java:832)
>>
>> ObjectHeap#collectLiveRegions() branches by instance type of CollectedHeap. However it does not support ZCollectedHeap.
>> So I add ZCollectedHeap to it and add some methods to iterate ZPageTable.
>>
>>
>> Thanks,
>>
>> Yasumasa

From thomas.schatzl at oracle.com  Thu Jul 26 14:06:42 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 26 Jul 2018 16:06:42 +0200
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>
References: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>
Message-ID: <e06a126c624e3b4aa836dfeed385882e85261a43.camel@oracle.com>

Hi Paul,

  Erik may not have time in the next few months to review such a large
change. But it would also be better to do the changes in steps for
other reviewers. Also see below.

On Mon, 2018-07-23 at 21:33 +0000, Hohensee, Paul wrote: 
> Please review.
>  
> Bug: https://bugs.openjdk.java.net/browse/JDK-8196989

I may have missed this in the previous discussion (which has been a
while), but has there been any discussion about a "Free (Region) Space"
for the committed but free regions?

It seems a bit random to assign free regions to the "old space",
seemingly just a repeat of the current behavior (where everything has
been put into "old gen").

Also, imho the second survivor space should preferably be dropped
completely. :)

> CSR: https://bugs.openjdk.java.net/browse/JDK-8196991
> Webrev: http://cr.openjdk.java.net/~phh/8196989/webrev.00
>  
> This webrev is marked 'L' because it's a behavioral change (CSR in
> draft state, may I have a review of that too please?) and because the
> test change fanout is large. The actual code changes are 'M'.
>  
> Passes the submit repo, Hotspot tier1, the JFR gc event tests and any
> other test set with 'gc' or 'serviceability' in the test directory
> name. I found it difficult to verify the accuracy of the reported
> values other than manually, since they can vary from run to run of
> the same program. I'd appreciate suggestions for how to go about
> writing accuracy tests.
>  
> I set out originally to revamp only the MXBeans, but decided it would
> be incomplete if I didn't include the jstat counters and the output
> of the GC.heap_info jcmd option. I can separate the latter two into
> their own RFEs, but I find it easier understand it all in a single
> webrev and hope the reviewers will too.
>  
> The basic approach is to add the new memory pools and collectors, the
> new jstat counters, and an archive region counter that stands in for
> an actual archive region set. HeapRegionSets are disjoint, so 

One option would be to add a HeapRegionSet tailored for archives that
does not check the disjoint-criteria (it is superficially used for
verification only anyway) - we already have special classes/flags for
different kinds of regions (humongous/free/old) in the HeapRegionSet
hierarchy.

> initially I tried to create a first-class archive region set (on the
> same level as the humongous region set), but that idea foundered on
> the fact that there?s too much code I don?t fully understand that
> depends on archive regions being in the existing old region set. 

Probably to simplify the implementation of archive regions :)

This is another option, and does not look too bad actually, we only
need to check and change all HeapRegion::is_old() or
HeapRegion::is_old_or_humongous() checks.

Now we only need a good name for is_old_or_archive_or_humongous()
because that one is a bit lengthy :)

> Externally (i.e., in the MXBeans and the jstat counters), however,
> the old region set doesn't include archive regions (unless running in
> legacy mode).
>  
> I used CMS's TraceCMSMemoryManagerStats class as the model for
> TraceConcMemoryManagerStats, which latter collects statistics on
> concurrent cycles. There are two STW pauses in each concurrent cycle:
> they are recorded separately and count as two sun.gc.collector.2
> events.

I would like to move serviceability code out of
G1CollectedHeap.h/cpp as much as possible; e.g. it would be very nice
to make G1MonitoringSupport the owner of all the serviceability related
data. Also the _use_legacy_monitoring member should probably move there
too.

> The humongous and archive space committed and used values are always
> identical,

This is because, for some reason, G1 counts the memory filled with
filler objects as "used". Other collectors don't.

> hence they are always 100% used.

You may have noticed that just recently we got a bug
(https://bugs.openjdk.java.net/browse/JDK-8207200) filed against the G1
MXBeans because of races in code that was specifically written to be
non-racy.

The reason is the really weird calculation of used/committed for eden
space/survivor space/old gen and that the precondition written down in
G1MonitoringSupport::recalculate_sizes() does not hold.

G1 MemoryMXBeans basically fabricate some numbers as you might have
noticed :), so in addition to fixing that issue with the race I am
still working on improving the accuracy of the used values.
Also, in the course of this change I am considering removing some other
backwards-bending in returned values for G1 (the mentioned and e.g.
funky stuff like assuming that adding together max-capacities of the
pools gives you total heap size).

I also have a preliminary webrev for that at
http://cr.openjdk.java.net/~tschatzl/8207200/webrev/ which unfortunately
clashes a lot with your changes. The reason why it is a single webrev is
because I am not finished yet - I tend to split it up in parts for much
better reviewing at the very end only.

Could we work together on first refactoring the code before adding new
kinds of spaces to the MXBeans?

Looking at this change and mine roughly the following issues would need
to be resolved first:
- find a solution for archive regions as suggested above :) At the
moment, without doing the change, I would tend to make archive regions
separate from old regions.
- move serviceability stuff as much as possible to g1MonitoringSupport
- clean up MemoryPool, remove duplicate information
- provide and return sane memory pool used/committed values to the
MXBeans
- clean up G1MonitoringSupport, e.g. avoid "*used/*committed" variables
for every single memory pool. Use MemoryUsage structs for them. Make
reading of memory pool information atomic wrt to its readers (note that
I think it is currently just impossible to get consistent output for
other statistics like jstat) - that's JDK-8207200.
- add whatever serviceability stuff for the new pools/jstat/* in steps.
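
To make the "MemoryUsage structs, read atomically" item above a bit more concrete, here is a rough Java analogy. It is not a proposal for the HotSpot C++ code: the class name and the publish/read split are assumptions, and it only uses the public java.lang.management.MemoryUsage value class to show how one immutable snapshot per pool avoids readers seeing a "used" from one update and a "committed" from another.

```java
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicReference;

// Java analogy only; the real change would live in G1MonitoringSupport (C++).
final class PoolUsageSnapshotSketch {
    // One immutable snapshot per pool instead of separate used/committed fields.
    private final AtomicReference<MemoryUsage> usage =
        new AtomicReference<>(new MemoryUsage(0, 0, 0, -1));

    // Writer side: recalculate all values, then publish them in one step.
    void publish(long init, long used, long committed, long max) {
        usage.set(new MemoryUsage(init, used, committed, max));
    }

    // Reader side: every value comes from the same snapshot.
    MemoryUsage read() {
        return usage.get();
    }

    public static void main(String[] args) {
        PoolUsageSnapshotSketch eden = new PoolUsageSnapshotSketch();
        eden.publish(0, 10, 20, 100);
        MemoryUsage u = eden.read();
        System.out.println(u.getUsed() + " used of " + u.getCommitted() + " committed");
    }
}
```

The MemoryUsage constructor also rejects inconsistent combinations such as used > committed, which gives a cheap sanity check at the point where the values are published.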

> The revised output of jcmd GC.heap_info is in
> G1CollectedHeap::print_on().
> I fixed a typo in src/hotspot/share/gc/g1/g1Policy.hpp by changing
> the result type of young_list_target_length() from size_t to uint,
> which latter is the type of the _young_list_target_length member.
> I updated the copyright date in
> src/hotspot/share/services/memoryService.hpp to 2018, as I neglected
> to do so in a previous push.

Comments?

Thanks,
  Thomas


From jcbeyler at google.com  Thu Jul 26 16:53:48 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 26 Jul 2018 09:53:48 -0700
Subject: RFR (XS) 8208251:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
 fails intermittently on Linux-X64
Message-ID: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>

Hi all,

As we fixed the HeapMonitorTest to not fail from time to time, there seems
to be the same issue and risk in HeapMonitorGCTest. Could someone review
the similar fix:

Webrev: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208251

The risk is that the last interval is too big and no sampled object is live
after the allocation method. If a GC happens before the check for sample
code, it is possible no live objects still exist.

The solution is to reduce the sampling interval to make it highly unlikely
for no samples to happen in any allocation iteration, keeping at least one
sampled object live. But also check the GC'd objects in the system in case
they did actually all already get GC'd.

Thanks,
Jc

From daniil.x.titov at oracle.com  Thu Jul 26 16:56:08 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 26 Jul 2018 09:56:08 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
 <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>
Message-ID: <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com>

Hi Chris,

The smallest allowed metaspace size for the test is 9MB. In both cases (when the metaspace size is set to 9MB and to 16MB) the expected OutOfMemoryError is thrown and the test passes.

I updated the patch to use the smallest setting.

Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02


The test uses a custom class loader to load a class from a byte array read from a predefined class file. The incorrect path passed to the test made it fail to read this class file.
 

java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class'
	at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74)
	at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89)
	at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory)
	at java.base/java.io.FileInputStream.open0(Native Method)
	at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
	at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
	at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64)
	... 8 more
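
For illustration, here is a minimal sketch of the pattern described above: read a class file into a byte array, then define a class from those bytes in a custom class loader. The class and method names (HelperLoaderSketch, loadFromFile) are hypothetical and not taken from resexhausted003; only the shape of the failure (a RuntimeException wrapping a FileNotFoundException) follows the trace above.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the test's loader; names are illustrative only.
class HelperLoaderSketch extends ClassLoader {

    // Read the whole class file into a byte array. I/O errors are wrapped in
    // a RuntimeException, mirroring the shape of the stack trace above.
    static byte[] fileBytes(String fileName) {
        try (InputStream in = new FileInputStream(fileName)) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new RuntimeException(
                "Exception when reading file '" + fileName + "'", e);
        }
    }

    // Define a class directly from the bytes read from disk.
    Class<?> loadFromFile(String className, String fileName) {
        byte[] bytes = fileBytes(fileName);
        return defineClass(className, bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        // The path is whatever the caller passes in; a wrong path fails here,
        // before any class is defined.
        String path = args.length > 0 ? args[0] : "./Helper.class";
        try {
            byte[] bytes = fileBytes(path);
            System.out.println("read " + bytes.length + " bytes from " + path);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage() + " <- " + e.getCause());
        }
    }
}
```

With a wrong path the failure happens in fileBytes, so no class is ever defined and the test cannot start filling the metaspace.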

Best regards,
Daniil

On 7/25/18, 9:09 PM, "Chris Plummer" <chris.plummer at oracle.com> wrote:

    Hi Daniil,
    
    After reading some old comments I added to JDK-6606767, I wonder if 
    bumping the metaspace size all the way up to 16m is the right thing to 
    do. It seems the test wants to exhaust the metaspace, so maybe it should 
    be set to the smallest allowed size. Is the test still exhausting the 
    metaspace even when it is 16M? Is there a smaller size that will also work?
    
    Also, regarding the class path, what impact was this bug having on the test?
    
    thanks,
    
    Chris
    
    On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
    > Hi Daniil,
    >
    > It looks good to me.
    > What is the need to increase the metaspace size?
    >
    > Thanks,
    > Serguei
    >
    >
    > On 7/25/18 16:11, Daniil Titov wrote:
    >> Hello,
    >>
    >> Please review the change that fixes the test issue. The fix increases 
    >> the metaspace size and corrects the path to the class files.
    >>
    >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
    >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
    >>
    >> Thanks!
    >>
    >> Best regards,
    >> Daniil
    >>
    >>
    >>
    >
    
    
From jcbeyler at google.com  Thu Jul 26 16:58:24 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 26 Jul 2018 09:58:24 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
Message-ID: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>

Hi all,

The tests in the HeapMonitor subsystem make a lot of JNI calls. We need to
verify that nothing in the JNI subsystem failed unexpectedly.

Here is a webrev that tracks whether any JNI call fails and makes the tests
fail if one does.

Could I have a few reviews please for:
Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208303

Thanks,
Jc

From chris.plummer at oracle.com  Thu Jul 26 16:59:34 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Jul 2018 09:59:34 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
 <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>
 <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com>
Message-ID: <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com>

Thanks for the explanation. Update looks good.

Chris

On 7/26/18 9:56 AM, Daniil Titov wrote:
> Hi Chris,
>
> The smallest allowed metaspace size for the test is 9 MB. In both cases (with the metaspace size set to 9 MB and to 16 MB) the expected OutOfMemoryError is thrown and the test passes.
>
> I updated the patch to use the smallest setting.
>
> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02
>
>
> The test uses a custom class loader to load a class from a byte array read from a predefined class file. The incorrect path passed to the test caused it to fail to read this class file.
>   
>
> java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class'
> 	at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74)
> 	at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89)
> 	at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
> 	at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory)
> 	at java.base/java.io.FileInputStream.open0(Native Method)
> 	at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
> 	at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
> 	at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64)
> 	... 8 more
>
> Best regards,
> Daniil
>
> On 7/25/18, 9:09 PM, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>
>      Hi Daniil,
>      
>      After reading some old comments I added to JDK-6606767, I wonder if
>      bumping the metaspace size all the way up to 16m is the right thing to
>      do. It seems the test wants to exhaust the metaspace, so maybe it should
>      be set to the smallest allowed size. Is the test still exhausting the
>      metaspace even when it is 16M? Is there a smaller size that will also work?
>      
>      Also, regarding the class path, what impact was this bug having on the test?
>      
>      thanks,
>      
>      Chris
>      
>      On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
>      > Hi Daniil,
>      >
>      > It looks good to me.
>      > What is the need to increase the metaspace size?
>      >
>      > Thanks,
>      > Serguei
>      >
>      >
>      > On 7/25/18 16:11, Daniil Titov wrote:
>      >> Hello,
>      >>
>      >> Please review the change that fixes the test issue. The fix increases
>      >> the metaspace size and corrects the path to the class files.
>      >>
>      >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
>      >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
>      >>
>      >> Thanks!
>      >>
>      >> Best regards,
>      >> Daniil
>      >>
>      >>
>      >>
>      >
>      
>      
>      
>
>


From serguei.spitsyn at oracle.com  Thu Jul 26 17:00:36 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 10:00:36 -0700
Subject: RFR (XS) 8208251:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
 intermittently on Linux-X64
In-Reply-To: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>
References: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>
Message-ID: <e7697a92-0a7a-800f-0a71-7d289b971f89@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/3b2fb26e/attachment.html>

From serguei.spitsyn at oracle.com  Thu Jul 26 17:01:03 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 10:01:03 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
 <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>
 <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com>
 <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com>
Message-ID: <eed1b2d5-29c8-d5ab-e04d-1d3326e5e31b@oracle.com>

+1

Thanks,
Serguei


On 7/26/18 09:59, Chris Plummer wrote:
> Thanks for the explanation. Update looks good.
>
> Chris
>
> On 7/26/18 9:56 AM, Daniil Titov wrote:
>> Hi Chris,
>>
>> The smallest allowed metaspace size for the test is 9 MB. In both 
>> cases (with the metaspace size set to 9 MB and to 16 MB) the 
>> expected OutOfMemoryError is thrown and the test passes.
>>
>> I updated the patch to use the smallest setting.
>>
>> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02
>>
>>
>> The test uses a custom class loader to load a class from a byte 
>> array read from a predefined class file. The incorrect path passed 
>> to the test caused it to fail to read this class file.
>>
>> java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class'
>> 	at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74)
>> 	at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89)
>> 	at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129)
>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>> 	at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
>> 	at java.base/java.lang.Thread.run(Thread.java:834)
>> Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory)
>> 	at java.base/java.io.FileInputStream.open0(Native Method)
>> 	at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
>> 	at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
>> 	at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64)
>> 	... 8 more
>>
>> Best regards,
>> Daniil
>>
>> On 7/25/18, 9:09 PM, "Chris Plummer" <chris.plummer at oracle.com> wrote:
>>
>>      Hi Daniil,
>>
>>      After reading some old comments I added to JDK-6606767, I wonder if
>>      bumping the metaspace size all the way up to 16m is the right thing to
>>      do. It seems the test wants to exhaust the metaspace, so maybe it should
>>      be set to the smallest allowed size. Is the test still exhausting the
>>      metaspace even when it is 16M? Is there a smaller size that will also work?
>>
>>      Also, regarding the class path, what impact was this bug having on the test?
>>
>>      thanks,
>>
>>      Chris
>>
>>      On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
>>      > Hi Daniil,
>>      >
>>      > It looks good to me.
>>      > What is the need to increase the metaspace size?
>>      >
>>      > Thanks,
>>      > Serguei
>>      >
>>      >
>>      > On 7/25/18 16:11, Daniil Titov wrote:
>>      >> Hello,
>>      >>
>>      >> Please review the change that fixes the test issue. The fix increases
>>      >> the metaspace size and corrects the path to the class files.
>>      >>
>>      >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
>>      >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
>>      >>
>>      >> Thanks!
>>      >>
>>      >> Best regards,
>>      >> Daniil
>>      >>
>>      >>
>>      >>
>>      >
>>
>>
>
>


From sharath.ballal at oracle.com  Thu Jul 26 17:04:39 2018
From: sharath.ballal at oracle.com (Sharath Ballal)
Date: Thu, 26 Jul 2018 10:04:39 -0700 (PDT)
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
Message-ID: <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>

Changes look good, Severin.

I am not a reviewer though, so you still need a Reviewer to review.

Thanks,
Sharath


-----Original Message-----
From: Severin Gehwolf [mailto:sgehwolf at redhat.com] 
Sent: Thursday, July 26, 2018 1:04 PM
To: serviceability-dev
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> Hi,
> 
> Could I please get a review of this one-liner change related to jhsdb 
> --mixed when attaching to a running Java process? The issue arises 
> when threads are in native code and that native code has frame 
> pointers not properly preserved. In such a case the SA performs a 
> simple frame pointer validity check: ebp >= esp
> 
> However, the code retrieving the value for esp is incorrect inasmuch 
> as it is not in sync with the native code with regard to the register
> index:
> 
> native code => X86ThreadContext.SP
> Java code   => X86ThreadContext.ESP
> 
> X86ThreadContext.ESP is never being set by the native code. Since
> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then 
> returns null, ebp.lessThan(esp) wrongly returns false causing the 
> issue. This webrev fixes it by using SP as index on the Java side.
> Thoughts?
> 
> webrev: 
> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
> bug: https://bugs.openjdk.java.net/browse/JDK-8208091

Anyone willing to review this one-liner?

Thanks,
Severin

> Thanks,
> Severin
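
To make the index mismatch described in the quoted mail concrete, here is a small model of it. This is only a sketch: the index values, the array-backed context and the helper names are hypothetical and are not taken from the SA sources. It assumes, as the mail states, that a read of a never-set register yields null and that lessThan returns false for a null argument.

```java
// Illustrative model only: index values and helper names are hypothetical
// and are not taken from the SA sources.
class RegisterIndexSketch {
    static final int FP  = 0;  // slot the native side fills with the frame pointer
    static final int SP  = 1;  // slot the native side fills with the stack pointer
    static final int ESP = 2;  // slot the native side never fills
    static final int NUM_REGS = 3;

    private final Long[] regs = new Long[NUM_REGS]; // null means "never set"

    void setRegister(int index, long value) { regs[index] = value; }

    Long getRegisterAsAddress(int index) { return regs[index]; }

    // Models the comparison described in the mail: a null argument yields false.
    static boolean lessThan(long a, Long b) {
        return b != null && Long.compareUnsigned(a, b) < 0;
    }

    // The validity check "ebp >= esp", written as !(ebp < esp).
    static boolean framePointerLooksValid(long ebp, Long esp) {
        return !lessThan(ebp, esp);
    }

    public static void main(String[] args) {
        RegisterIndexSketch ctx = new RegisterIndexSketch();
        ctx.setRegister(FP, 0x1000L); // clobbered frame pointer, below the stack pointer
        ctx.setRegister(SP, 0x2000L);
        long ebp = ctx.getRegisterAsAddress(FP);

        // Reading through the unset index lets the bogus frame pointer pass.
        System.out.println("via ESP: "
            + framePointerLooksValid(ebp, ctx.getRegisterAsAddress(ESP)));
        // Reading through the index the native side fills rejects it.
        System.out.println("via SP:  "
            + framePointerLooksValid(ebp, ctx.getRegisterAsAddress(SP)));
    }
}
```

In this model the check passes through ESP and fails through SP, which is why switching the Java side to the index the native code actually sets restores the intended validation.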


From serguei.spitsyn at oracle.com  Thu Jul 26 17:04:55 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 10:04:55 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
Message-ID: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/242168b9/attachment.html>

From sgehwolf at redhat.com  Thu Jul 26 17:11:30 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Thu, 26 Jul 2018 19:11:30 +0200
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
 <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
Message-ID: <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>

On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
> Changes look good, Severin.

Thanks for the review, Sharath!

> I am not a reviewer though, so you still need a Reviewer to review.

Anyone?

Thanks,
Severin

> -----Original Message-----
> From: Severin Gehwolf [mailto:sgehwolf at redhat.com] 
> Sent: Thursday, July 26, 2018 1:04 PM
> To: serviceability-dev
> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
> 
> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> > Hi,
> > 
> > Could I please get a review of this one-liner change related to jhsdb 
> > --mixed when attaching to a running Java process? The issue arises 
> > when threads are in native code and that native code has frame 
> > pointers not properly preserved. In such a case the SA performs a 
> > simple frame pointer validity check: ebp >= esp
> > 
> > However, the code retrieving the value for esp is incorrect inasmuch 
> > as it is not in sync with the native code with regard to the register
> > index:
> > 
> > native code => X86ThreadContext.SP
> > Java code   => X86ThreadContext.ESP
> > 
> > X86ThreadContext.ESP is never being set by the native code. Since
> > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then 
> > returns null, ebp.lessThan(esp) wrongly returns false causing the 
> > issue. This webrev fixes it by using SP as index on the Java side.
> > Thoughts?
> > 
> > webrev: 
> > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
> > bug: https://bugs.openjdk.java.net/browse/JDK-8208091
> 
> Anyone willing to review this one-liner?
> 
> Thanks,
> Severin
> 
> > Thanks,
> > Severin
> 
> 


From jcbeyler at google.com  Thu Jul 26 17:40:09 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 26 Jul 2018 10:40:09 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
Message-ID: <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>

Hi Serguei,

I was looking at another test bug (
https://bugs.openjdk.java.net/browse/JDK-8191519). The proposal for that
bug is to have a JNI call to FatalError to provoke a failure.

If we went down that route, this webrev is simpler, no? Instead of setting
failure_status and checking it later; just fail fatally and be done with
it, no? That way, the tests in Java land don't have to be changed actually,
no?

What would we prefer for tests? Remember there was a failure and test it
later or fail fast via JNI's FatalError?

Thanks,
Jc
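
The two options above can be sketched as follows. This is a Java-side analogy only: the change under review works in the native test code, and the names here (DeferredFailure, failFast) are hypothetical. Throwing an exception stands in for JNI's FatalError, which does not return to the caller.

```java
// Java-side analogy only; the actual change works in native test code.
class JniFailureHandlingSketch {

    // Option 1: remember that something failed and let the test check later.
    static final class DeferredFailure {
        private boolean failureStatus = false;

        void record(boolean callSucceeded) {
            if (!callSucceeded) {
                failureStatus = true;
            }
        }

        boolean failed() {
            return failureStatus;
        }
    }

    // Option 2: stop at the first failed call. The exception stands in for
    // JNI's FatalError, which does not return.
    static void failFast(boolean callSucceeded, String what) {
        if (!callSucceeded) {
            throw new IllegalStateException("JNI call failed: " + what);
        }
    }

    public static void main(String[] args) {
        DeferredFailure status = new DeferredFailure();
        status.record(true);
        status.record(false);   // a later successful call does not clear this
        status.record(true);
        System.out.println("deferred check sees failure: " + status.failed());

        try {
            failFast(false, "hypothetical call");
        } catch (IllegalStateException e) {
            System.out.println("fail fast: " + e.getMessage());
        }
    }
}
```

With the deferred variant every test has to remember to query the flag; with the fail-fast variant the Java side of the tests needs no change, which is the simplification the mail points at.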


On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> It looks good to me.
>
> Thanks,
> Serguei
>
>
> On 7/26/18 09:58, JC Beyler wrote:
>
> Hi all,
>
> The tests in the HeapMonitor subsystem make a lot of JNI calls. We need to
> verify that nothing in the JNI subsystem failed unexpectedly.
>
> Here is a webrev that tracks whether any JNI call fails and makes the tests
> fail if one does.
>
> Could I have a few reviews please for:
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>
> Thanks,
> Jc
>
>
>

-- 

Thanks,
Jc

From serguei.spitsyn at oracle.com  Thu Jul 26 17:45:56 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 10:45:56 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
Message-ID: <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/133b91e3/attachment.html>

From daniel.daugherty at oracle.com  Thu Jul 26 18:05:27 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 26 Jul 2018 14:05:27 -0400
Subject: RFR (XS) 8208251:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
 intermittently on Linux-X64
In-Reply-To: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>
References: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>
Message-ID: <f3822da5-6afb-06f7-feec-67070aa55c1d@oracle.com>

On 7/26/18 12:53 PM, JC Beyler wrote:
> Hi all,
>
> As we fixed the HeapMonitorTest to not fail from time to time, there 
> seems to be the same issue and risk in HeapMonitorGCTest. Could 
> someone review the similar fix:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.00/ 
> <http://cr.openjdk.java.net/%7Ejcbeyler/8208251/webrev.00/>

test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCTest.java
    No comments.

Thumbs up!

Perhaps consider filing a bug to refactor HeapMonitorTest and
HeapMonitorGCTest.java so that they share code... then we won't
have to fix the same bug in two places...

Dan


> Bug: https://bugs.openjdk.java.net/browse/JDK-8208251
>
> The risk is that the last interval is too big and no sampled object is 
> live after the allocation method. If a GC happens before the check for 
> sample code, it is possible no live objects still exist.
>
> The solution is to reduce the sampling interval to make it highly 
> unlikely for no samples to happen in any allocation iteration, keeping 
> at least one sampled object live. But also check the GC'd objects in 
> the system in case they did actually all already get GC'd.
>
> Thanks,
> Jc


From serguei.spitsyn at oracle.com  Thu Jul 26 18:11:37 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 11:11:37 -0700
Subject: RFR (XS) 8208251:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java fails
 intermittently on Linux-X64
In-Reply-To: <f3822da5-6afb-06f7-feec-67070aa55c1d@oracle.com>
References: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>
 <f3822da5-6afb-06f7-feec-67070aa55c1d@oracle.com>
Message-ID: <92e82eb8-02fb-a6a3-5bd9-f5f8d85591c6@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/39ee61d5/attachment.html>

From jcbeyler at google.com  Thu Jul 26 18:57:10 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 26 Jul 2018 11:57:10 -0700
Subject: RFR (XS) 8208251:
 serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCCMSTest.java
 fails intermittently on Linux-X64
In-Reply-To: <92e82eb8-02fb-a6a3-5bd9-f5f8d85591c6@oracle.com>
References: <CAF9BGByesvmqE0xa5nmfVqRWxUw7Xv4OP7Cr3cAQmDZRtz-k7Q@mail.gmail.com>
 <f3822da5-6afb-06f7-feec-67070aa55c1d@oracle.com>
 <92e82eb8-02fb-a6a3-5bd9-f5f8d85591c6@oracle.com>
Message-ID: <CAF9BGBxdPUHa0A6uxsYS8hJP+_QFP9jkp_yf3DLN+tSmQg8+og@mail.gmail.com>

Here you are Serguei:

http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.01/

Thanks for the push!
Jc

Ps: @Daniel I created the issue and assigned it to me (
https://bugs.openjdk.java.net/browse/JDK-8208352)

On Thu, Jul 26, 2018 at 11:11 AM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Jc,
>
> Could you send me a patch?
> I'll sponsor the push.
>
> Thanks,
> Serguei
>
>
> On 7/26/18 11:05, Daniel D. Daugherty wrote:
>
> On 7/26/18 12:53 PM, JC Beyler wrote:
>
> Hi all,
>
> As we fixed the HeapMonitorTest to not fail from time to time, there seems
> to be the same issue and risk in HeapMonitorGCTest. Could someone review
> the similar fix:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208251/webrev.00/
>
>
>
> test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorGCTest.java
>     No comments.
>
> Thumbs up!
>
> Perhaps consider filing a bug to refactor HeapMonitorTest and
> HeapMonitorGCTest.java so that they share code... then we won't
> have to fix the same bug in two places...
>
> Dan
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208251
>
> The risk is that the last interval is too big and no sampled object is
> live after the allocation method. If a GC happens before the check for
> sample code, it is possible no live objects still exist.
>
> The solution is to reduce the sampling interval to make it highly unlikely
> for no samples to happen in any allocation iteration, keeping at least one
> sampled object live. But also check the GC'd objects in the system in case
> they did actually all already get GC'd.
>
> Thanks,
> Jc
>
>
>
>

-- 

Thanks,
Jc

From jcbeyler at google.com  Thu Jul 26 19:03:37 2018
From: jcbeyler at google.com (JC Beyler)
Date: Thu, 26 Jul 2018 12:03:37 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
Message-ID: <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>

Hi all,

With the FatalError idea, here is the webrev to consider. Note that it no
longer changes the tests. If a JNI call fails, then we call FatalError.

Let me know what you think:

Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208303

Thanks!
Jc

On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com <
serguei.spitsyn at oracle.com> wrote:

> Hi Jc,
>
> Good idea.
> I was thinking about something like this.
>
> Thanks,
> Serguei
>
>
> On 7/26/18 10:40, JC Beyler wrote:
>
> Hi Serguei,
>
> As I was looking at another test bug (
> https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal for that
> bug is to have a JNI call to FatalError to provoke a failure.
>
> If we went down that route, this webrev is simpler, no? Instead of setting
> failure_status and checking it later; just fail fatally and be done with
> it, no? That way, the tests in Java land don't have to be changed actually,
> no?
>
> What would we prefer for tests? Remember there was a failure and test it
> later or fail fast via JNI's FatalError?
>
> Thanks,
> Jc
>
>
> On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com <
> serguei.spitsyn at oracle.com> wrote:
>
>> Hi Jc,
>>
>> It looks good to me.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/26/18 09:58, JC Beyler wrote:
>>
>> Hi all,
>>
>> The tests in the HeapMonitor subsystem make a lot of JNI calls. We need to
>> verify that nothing in the JNI subsystem failed unexpectedly.
>>
>> Here is a webrev that tracks whether any JNI call fails and makes the tests
>> fail if one does.
>>
>> Could I have a few reviews please for:
>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>>
>> Thanks,
>> Jc
>>
>>
>>
>
> --
>
> Thanks,
> Jc
>
>
>

-- 

Thanks,
Jc

From daniel.daugherty at oracle.com  Thu Jul 26 19:08:02 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 26 Jul 2018 15:08:02 -0400
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
 <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
Message-ID: <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>

Please make sure this fix is well tested in Mach5 prior to pushing.
In particular, I'm focused on reducing the noise in Mach5 tier[1-3]
so adding any new failures there will make me grumpy :-)

Dan


On 7/26/18 3:03 PM, JC Beyler wrote:
> Hi all,
>
> With the FatalError idea, here is the webrev to consider. Note that it no 
> longer changes the tests. If a JNI call fails, then we call FatalError.
>
> Let me know what you think:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/ 
> <http://cr.openjdk.java.net/%7Ejcbeyler/8208303/webrev.01/>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>
> Thanks!
> Jc
>
> On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com 
> <mailto:serguei.spitsyn at oracle.com>> wrote:
>
>     Hi Jc,
>
>     Good idea.
>     I was thinking about something like this.
>
>     Thanks,
>     Serguei
>
>
>     On 7/26/18 10:40, JC Beyler wrote:
>>     Hi Serguei,
>>
>>     As I was looking at another test bug
>>     (https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal
>>     for that bug is to have a JNI call to FatalError to provoke a
>>     failure.
>>
>>     If we went down that route, this webrev is simpler, no? Instead
>>     of setting failure_status and checking it later; just fail
>>     fatally and be done with it, no? That way, the tests in Java land
>>     don't have to be changed actually, no?
>>
>>     What would we prefer for tests? Remember there was a failure and
>>     test it later or fail fast via JNI's FatalError?
>>
>>     Thanks,
>>     Jc
>>
>>
>>     On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com> <serguei.spitsyn at oracle.com
>>     <mailto:serguei.spitsyn at oracle.com>> wrote:
>>
>>         Hi Jc,
>>
>>         It looks good to me.
>>
>>         Thanks,
>>         Serguei
>>
>>
>>         On 7/26/18 09:58, JC Beyler wrote:
>>>         Hi all,
>>>
>>>         The tests in the HeapMonitor subsystem make a lot of JNI
>>>         calls. We need to verify that nothing in the JNI subsystem
>>>         failed unexpectedly.
>>>
>>>         Here is a webrev that tracks whether any JNI call fails and
>>>         makes the tests fail if one does.
>>>
>>>         Could I have a few reviews please for:
>>>         Webrev:
>>>         http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/
>>>         <http://cr.openjdk.java.net/%7Ejcbeyler/8208303/webrev.00/>
>>>         Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>>>
>>>         Thanks,
>>>         Jc
>>
>>
>>
>>     -- 
>>
>>     Thanks,
>>     Jc
>
>
>
> -- 
>
> Thanks,
> Jc


From serguei.spitsyn at oracle.com  Thu Jul 26 19:14:03 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 12:14:03 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
 <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
 <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>
Message-ID: <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/7f41b4d2/attachment.html>

From daniel.daugherty at oracle.com  Thu Jul 26 19:15:08 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Thu, 26 Jul 2018 15:15:08 -0400
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
 <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
 <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>
 <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com>
Message-ID: <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com>

We entered RDP2 today (07.26), so only P1 and P2 bug fixes are allowed.

Dan


On 7/26/18 3:14 PM, serguei.spitsyn at oracle.com wrote:
> Yes, of course it has to be well tested before the push.
> Does it make sense to plan to push it to 11 (after the testing is done)?
>
> Thanks,
> Serguei
>
>
> On 7/26/18 12:08, Daniel D. Daugherty wrote:
>> Please make sure this fix is well tested in Mach5 prior to pushing.
>> In particular, I'm focused on reducing the noise in Mach5 tier[1-3]
>> so adding any new failures there will make me grumpy :-)
>>
>> Dan
>>
>>
>> On 7/26/18 3:03 PM, JC Beyler wrote:
>>> Hi all,
>>>
>>> With the FatalError idea, here is the webrev to consider, note it no 
>>> longer changes the tests. If a JNI call fails, then we call FatalError.
>>>
>>> Let me know what you think:
>>>
>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>>>
>>> Thanks!
>>> Jc
>>>
>>> On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com wrote:
>>>
>>>     Hi Jc,
>>>
>>>     Good idea.
>>>     I was thinking about something like this.
>>>
>>>     Thanks,
>>>     Serguei
>>>
>>>
>>>     On 7/26/18 10:40, JC Beyler wrote:
>>>>     Hi Serguei,
>>>>
>>>>     As I was looking at another test bug
>>>>     (https://bugs.openjdk.java.net/browse/JDK-8191519); the
>>>>     proposal for that bug is to have a JNI call to FatalError to
>>>>     provoke a failure.
>>>>
>>>>     If we went down that route, this webrev is simpler, no? Instead
>>>>     of setting failure_status and checking it later; just fail
>>>>     fatally and be done with it, no? That way, the tests in Java
>>>>     land don't have to be changed actually, no?
>>>>
>>>>     What would we prefer for tests? Remember there was a failure
>>>>     and test it later or fail fast via JNI's FatalError?
>>>>
>>>>     Thanks,
>>>>     Jc
>>>>
>>>>
>>>>     On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com wrote:
>>>>
>>>>         Hi Jc,
>>>>
>>>>         It looks good to me.
>>>>
>>>>         Thanks,
>>>>         Serguei
>>>>
>>>>
>>>>         On 7/26/18 09:58, JC Beyler wrote:
>>>>>         Hi all,
>>>>>
>>>>>         The tests in the HeapMonitor subsystem have a lot of JNI
>>>>>         calls. There is a need to verify whether
>>>>>         anything in the JNI subsystem failed unexpectedly.
>>>>>
>>>>>         Here is a webrev that tracks if a JNI call does fail and
>>>>>         the tests will fail if any JNI call does fail.
>>>>>
>>>>>         Could I have a few reviews please for:
>>>>>         Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/
>>>>>         Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>>>>>
>>>>>         Thanks,
>>>>>         Jc
>>>>
>>>>
>>>>
>>>>     -- 
>>>>
>>>>     Thanks,
>>>>     Jc
>>>
>>>
>>>
>>> -- 
>>>
>>> Thanks,
>>> Jc
>>
>


From serguei.spitsyn at oracle.com  Thu Jul 26 19:17:15 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Thu, 26 Jul 2018 12:17:15 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
 <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
 <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>
 <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com>
 <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com>
Message-ID: <e0c927a3-79fe-b147-bdbb-788c74567947@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/b8a61d3f/attachment-0001.html>

From chris.plummer at oracle.com  Thu Jul 26 19:52:30 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Jul 2018 12:52:30 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
 <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
 <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>
 <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com>
 <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com>
Message-ID: <3a2d66d6-e4f8-51ad-3552-dfe3748dff89@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180726/877f24b8/attachment.html>

From chris.plummer at oracle.com  Thu Jul 26 21:07:00 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 26 Jul 2018 14:07:00 -0700
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
 <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
 <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>
Message-ID: <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com>

Hi Severin,

I had looked at this review when it came out, but was hesitant to ok it 
because I really don't know this code at all. If you can get another 
reviewer who does know the code, then I'll approve it. This only impacts 
32-bit, right? If so, keep in mind that it won't get tested by Oracle 
testing, including the submit repo, so make sure you do thorough testing.

Also, why is there any code being executed that was not compiled with 
-fno-omit-frame-pointer? The description in the CR just shows a simple 
java program reproducing this, so all the mixed stack traces belong to 
the JVM and libs, and I thought we made sure to compile all of them with 
-fno-omit-frame-pointer.

thanks,

Chris

On 7/26/18 10:11 AM, Severin Gehwolf wrote:
> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
>> Changes look good, Severin.
> Thanks for the review, Sharath!
>
>> I am not a reviewer though, so you still need a Reviewer to review.
> Anyone?
>
> Thanks,
> Severin
>
>> -----Original Message-----
>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
>> Sent: Thursday, July 26, 2018 1:04 PM
>> To: serviceability-dev
>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
>>
>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
>>> Hi,
>>>
>>> Could I please get a review of this one-liner change related to jhsdb
>>> --mixed when attaching to a running Java process? The issue arises
>>> when threads are in native code and that native code has frame
>>> pointers not properly preserved. In such a case the SA performs a
>>> simple frame pointer validity check: ebp >= esp
>>>
>>> However, the code retrieving the value for esp is incorrect inasmuch
>>> as it's not in sync with the native code with regard to the register
>>> index:
>>>
>>> native code => X86ThreadContext.SP
>>> Java code   => X86ThreadContext.ESP
>>>
>>> X86ThreadContext.ESP is never being set by the native code. Since
>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
>>> returns null, ebp.lessThan(esp) wrongly returns false causing the
>>> issue. This webrev fixes it by using SP as index on the Java side.
>>> Thoughts?
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091
>> Anyone willing to review this one-liner?
>>
>> Thanks,
>> Severin
>>
>>> Thanks,
>>> Severin
>>


From daniil.x.titov at oracle.com  Thu Jul 26 23:24:48 2018
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 26 Jul 2018 16:24:48 -0700
Subject: RFR 8207364: nsk/jvmti/ResourceExhausted/resexhausted003 fails to
 start
In-Reply-To: <eed1b2d5-29c8-d5ab-e04d-1d3326e5e31b@oracle.com>
References: <562E6735-1C3B-4675-A036-DC86E8BB3527@oracle.com>
 <932f1748-3902-60b2-e3dc-29f32ee43e1e@oracle.com>
 <99433df0-16fe-6ca7-b0c7-b428d3d12f92@oracle.com>
 <3716AA08-0542-4500-AD55-B41EC34C56BB@oracle.com>
 <85219efa-ef16-add4-209d-96f7bf987ba4@oracle.com>
 <eed1b2d5-29c8-d5ab-e04d-1d3326e5e31b@oracle.com>
Message-ID: <0D426A78-5C8C-456B-BEEC-495B86622D0A@oracle.com>

Thank you Serguei and Chris for reviewing this change.

Best regards,
Daniil

On 7/26/18, 10:01 AM, "serguei.spitsyn at oracle.com" <serguei.spitsyn at oracle.com> wrote:

    +1
    
    Thanks,
    Serguei
    
    
    On 7/26/18 09:59, Chris Plummer wrote:
    > Thanks for the explanation. Update looks good.
    >
    > Chris
    >
    > On 7/26/18 9:56 AM, Daniil Titov wrote:
    >> Hi Chris,
    >>
    >> The smallest allowed metaspace size for the test is 9 MB. In both
    >> cases (when the metaspace size is set to 9 MB and to 16 MB) the
    >> expected OutOfMemoryError is thrown and the test passes.
    >>
    >> I did update the patch to use the smallest settings.
    >>
    >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.02
    >>
    >>
    >> The test uses a custom class loader to load a class from a byte
    >> array read from a predefined class file. The incorrect
    >> path passed to the test made the test fail to read this class file.
    >>
    >> java.lang.RuntimeException: Exception when reading file './bin/nsk/jvmti/ResourceExhausted/Helper.class'
    >>     at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:74)
    >>     at nsk.jvmti.ResourceExhausted.resexhausted003.run(resexhausted003.java:89)
    >>     at nsk.jvmti.ResourceExhausted.resexhausted003.main(resexhausted003.java:129)
    >>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    >>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    >>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    >>     at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    >>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:115)
    >>     at java.base/java.lang.Thread.run(Thread.java:834)
    >> Caused by: java.io.FileNotFoundException: ./bin/nsk/jvmti/ResourceExhausted/Helper.class (No such file or directory)
    >>     at java.base/java.io.FileInputStream.open0(Native Method)
    >>     at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    >>     at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    >>     at nsk.jvmti.ResourceExhausted.resexhausted003.fileBytes(resexhausted003.java:64)
    >>     ... 8 more
    >>
    >> Best regards,
    >> Daniil
    >>
    >> On 7/25/18, 9:09 PM, "Chris Plummer" <chris.plummer at oracle.com> wrote:
    >>
    >>      Hi Daniil,
    >>
    >>      After reading some old comments I added to JDK-6606767, I wonder if
    >>      bumping the metaspace size all the way up to 16m is the right thing to
    >>      do. It seems the test wants to exhaust the metaspace, so maybe it should
    >>      be set to the smallest allowed size. Is the test still exhausting the
    >>      metaspace even when it is 16M? Is there a smaller size that will also work?
    >>
    >>      Also, regarding the class path, what impact was this bug having on the test?
    >>
    >>      thanks,
    >>
    >>      Chris
    >>
    >>      On 7/25/18 4:32 PM, serguei.spitsyn at oracle.com wrote:
    >>      > Hi Daniil,
    >>      >
    >>      > It looks good to me.
    >>      > What is the need to increase the metaspace size?
    >>      >
    >>      > Thanks,
    >>      > Serguei
    >>      >
    >>      >
    >>      > On 7/25/18 16:11, Daniil Titov wrote:
    >>      >> Hello,
    >>      >>
    >>      >> Please review the change that fixes the test issue. The fix increases
    >>      >> the metaspace size and corrects the path to the class files.
    >>      >>
    >>      >> Webrev: http://cr.openjdk.java.net/~dtitov/8207364/webrev.01/
    >>      >> Issue: https://bugs.openjdk.java.net/browse/JDK-8207364
    >>      >>
    >>      >> Thanks!
    >>      >>
    >>      >> Best regards,
    >>      >> Daniil
    >>      >>
    >>      >>
    >>      >>
    >>      >
    >>
    >>
    >
    >
    
    
From chris.plummer at oracle.com  Fri Jul 27 23:27:45 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Fri, 27 Jul 2018 16:27:45 -0700
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
 boolVar with expected value: false
In-Reply-To: <5B5233DC.5040003@oracle.com>
References: <5B082D2E.7000408@oracle.com>
 <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
 <5B5233DC.5040003@oracle.com>
Message-ID: <853aba55-fafc-2797-ed44-818760bd5571@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180727/7388d649/attachment.html>

From jcbeyler at google.com  Fri Jul 27 23:36:41 2018
From: jcbeyler at google.com (JC Beyler)
Date: Fri, 27 Jul 2018 16:36:41 -0700
Subject: RFR 8208303: Track JNI failures and fail tests
In-Reply-To: <3a2d66d6-e4f8-51ad-3552-dfe3748dff89@oracle.com>
References: <CAF9BGBx_KuWuLdL3CD8ByazJk8ixohL48_ryZ618P3SYwbbZDQ@mail.gmail.com>
 <91a1cd81-98a1-676b-6745-5281a3caaac0@oracle.com>
 <CAF9BGBy9XDsO5+6f-RxEEQCAArbWWq0u_X3LjSAb6KDduz79-g@mail.gmail.com>
 <b3fdce55-805f-7760-2495-18315c1ed9f0@oracle.com>
 <CAF9BGBy50_xmZ9ppsLv1rV7yz6UYuOfm9LP=8ppOQVoqMRYo1Q@mail.gmail.com>
 <08c5c3e7-3789-40f3-9266-e63ec51ab6ae@oracle.com>
 <0650aa94-2e30-54e3-9d2d-dcd871455791@oracle.com>
 <84f55dc6-651f-4c33-46f9-97369932e6e9@oracle.com>
 <3a2d66d6-e4f8-51ad-3552-dfe3748dff89@oracle.com>
Message-ID: <CAF9BGBxpCDesppF8uzw5ycAkmNu_xNvq-gMAg_sRHAFqU1OTPw@mail.gmail.com>

Hi all,

I have prepared the new version, which calls FatalError if a JNI call
fails. This has the advantage of not complicating the Java tests at all,
while still adding the post-JNI call checks.
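
To make the two options discussed in this thread concrete, here is a rough
sketch of "remember the failure and check it later" versus "fail fast". It is
only a Java analogy: the actual change lives in the native test code and calls
JNI's FatalError. The class FailFastSketch, its method names, and the use of a
null result to stand for a failed call are all assumptions for illustration.

```java
public class FailFastSketch {
    // Deferred style: record that something failed; the test must check this later.
    static boolean failureStatus = false;

    // Hypothetical helper: a null result stands for a failed call.
    static <T> T checkDeferred(T result) {
        if (result == null) {
            failureStatus = true; // remembered, but execution continues
        }
        return result;
    }

    // Fail-fast style: stop at the first failed call, so callers need no extra checks.
    // The exception here only stands in for the native FatalError call.
    static <T> T checkFailFast(T result, String what) {
        if (result == null) {
            throw new IllegalStateException("call failed: " + what);
        }
        return result;
    }

    public static void main(String[] args) {
        checkDeferred(null);
        System.out.println("deferred: failureStatus=" + failureStatus);
        try {
            checkFailFast(null, "hypotheticalCall");
        } catch (IllegalStateException e) {
            System.out.println("fail fast: " + e.getMessage());
        }
    }
}
```

With the deferred style every test has to remember to inspect the flag; with
the fail-fast style the check sits in one place and the Java side stays as is.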

Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.03/
Bug: https://bugs.openjdk.java.net/browse/JDK-8208303

Thanks all!
Jc

On Thu, Jul 26, 2018 at 12:52 PM Chris Plummer <chris.plummer at oracle.com>
wrote:

> I'm pretty sure changes that only affect tests can be any priority. But
> still, be a lot more cautious the closer we get to release.
>
> Chris
>
> On 7/26/18 12:15 PM, Daniel D. Daugherty wrote:
>
> We entered RDP2 today (07.26). So only P1 and P2 bug fixes allowed.
>
> Dan
>
>
> On 7/26/18 3:14 PM, serguei.spitsyn at oracle.com wrote:
>
> Yes, of course it has to be well tested before the push.
> Does it make sense to plan to push it to 11 (after the testing is done)?
>
> Thanks,
> Serguei
>
>
> On 7/26/18 12:08, Daniel D. Daugherty wrote:
>
> Please make sure this fix is well tested in Mach5 prior to pushing.
> In particular, I'm focused on reducing the noise in Mach5 tier[1-3]
> so adding any new failures there will make me grumpy :-)
>
> Dan
>
>
> On 7/26/18 3:03 PM, JC Beyler wrote:
>
> Hi all,
>
> With the FatalError idea, here is the webrev to consider, note it no
> longer changes the tests. If a JNI call fails, then we call FatalError.
>
> Let me know what you think:
>
> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.01/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>
> Thanks!
> Jc
>
> On Thu, Jul 26, 2018 at 10:46 AM serguei.spitsyn at oracle.com <
> serguei.spitsyn at oracle.com> wrote:
>
>> Hi Jc,
>>
>> Good idea.
>> I was thinking about something like this.
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/26/18 10:40, JC Beyler wrote:
>>
>> Hi Serguei,
>>
>> As I was looking at another test bug (
>> https://bugs.openjdk.java.net/browse/JDK-8191519); the proposal for that
>> bug is to have a JNI call to FatalError to provoke a failure.
>>
>> If we went down that route, this webrev is simpler, no? Instead of
>> setting failure_status and checking it later; just fail fatally and be done
>> with it, no? That way, the tests in Java land don't have to be changed
>> actually, no?
>>
>> What would we prefer for tests? Remember there was a failure and test it
>> later or fail fast via JNI's FatalError?
>>
>> Thanks,
>> Jc
>>
>>
>> On Thu, Jul 26, 2018 at 10:04 AM serguei.spitsyn at oracle.com <
>> serguei.spitsyn at oracle.com> wrote:
>>
>>> Hi Jc,
>>>
>>> It looks good to me.
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/26/18 09:58, JC Beyler wrote:
>>>
>>> Hi all,
>>>
>>> The tests in the HeapMonitor subsystem have a lot of JNI calls. There is
>>> a need to verify whether anything in the JNI subsystem failed
>>> unexpectedly.
>>>
>>> Here is a webrev that tracks if a JNI call does fail and the tests will
>>> fail if any JNI call does fail.
>>>
>>> Could I have a few reviews please for:
>>> Webrev: http://cr.openjdk.java.net/~jcbeyler/8208303/webrev.00/
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8208303
>>>
>>> Thanks,
>>> Jc
>>>
>>>
>>>
>>
>> --
>>
>> Thanks,
>> Jc
>>
>>
>>
>
> --
>
> Thanks,
> Jc
>
>
>
>
>
>

-- 

Thanks,
Jc

From chris.plummer at oracle.com  Mon Jul 30 05:05:56 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Sun, 29 Jul 2018 22:05:56 -0700
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
 boolVar with expected value: false
In-Reply-To: <853aba55-fafc-2797-ed44-818760bd5571@oracle.com>
References: <5B082D2E.7000408@oracle.com>
 <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
 <5B5233DC.5040003@oracle.com>
 <853aba55-fafc-2797-ed44-818760bd5571@oracle.com>
Message-ID: <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180729/5717b812/attachment.html>

From serguei.spitsyn at oracle.com  Mon Jul 30 07:47:04 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 30 Jul 2018 00:47:04 -0700
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
 boolVar with expected value: false
In-Reply-To: <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com>
References: <5B082D2E.7000408@oracle.com>
 <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
 <5B5233DC.5040003@oracle.com>
 <853aba55-fafc-2797-ed44-818760bd5571@oracle.com>
 <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com>
Message-ID: <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180730/013f5033/attachment-0001.html>

From sgehwolf at redhat.com  Mon Jul 30 08:28:22 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Mon, 30 Jul 2018 10:28:22 +0200
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
 <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
 <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>
 <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com>
Message-ID: <ce303b00ffd5a439ce87a596efeb31e594d117af.camel@redhat.com>

Hi Chris,

On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
> I had looked at this review when it came out, but was hesitant to ok it 
> because I really don't know this code at all. If you can get another 
> reviewer who does know the code, then I'll approve it.

Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK
census. As to whether he knows the code, I don't know. He's on CC.

> This only impacts 32-bit, right? If so, keep in mind that it won't get tested by Oracle 
> testing, including the submit repo, so make sure you do thorough testing.

It only impacts 32-bit, yes. I understand that Oracle isn't testing
32-bit x86 any more. The change itself should be fairly low risk since
it only touches a file specific to 32-bit x86 Linux, and the native bits
don't seem to match what the Java code does[1]. REG_INDEX(reg) is
defined as:

#define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg

and is used as:

REG_INDEX(SP)

which expands to:

sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP

The Java code, however, uses:

sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
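
To illustrate the effect of that mismatch, here is a minimal Java sketch. It
is not the SA code: the class RegisterMismatchSketch, the slot numbers, and
the Long-based stand-in for an address are assumptions for illustration. It
only models what is described in this thread, namely that the native side
fills the SP slot, the Java side reads the ESP slot, and a comparison against
a missing (null) address returns false.

```java
public class RegisterMismatchSketch {
    // Hypothetical slot numbers; the real constants are defined in X86ThreadContext.
    static final int FP = 6;
    static final int SP = 7;
    static final int ESP = 17;
    static final int NREGS = 18;

    // Stand-in for a thread context: a null entry means "never set".
    static Long[] contextFilledByNativeSide(long ebp, long esp) {
        Long[] regs = new Long[NREGS];
        regs[FP] = ebp;
        regs[SP] = esp; // the native side writes the SP slot, never the ESP slot
        return regs;
    }

    // Models the behaviour described in the thread: comparing against a
    // missing (null) address yields false.
    static boolean lessThan(Long a, Long b) {
        if (b == null) {
            return false;
        }
        return Long.compareUnsigned(a, b) < 0;
    }

    // Returns true when the frame pointer is rejected, i.e. ebp < esp.
    static boolean framePointerRejected(Long[] regs, int spIndex) {
        return lessThan(regs[FP], regs[spIndex]);
    }

    public static void main(String[] args) {
        // A clobbered frame pointer: ebp is far below esp.
        Long[] regs = contextFilledByNativeSide(0x1L, 0xf7703accL);
        System.out.println("reading ESP slot: rejected=" + framePointerRejected(regs, ESP));
        System.out.println("reading SP slot:  rejected=" + framePointerRejected(regs, SP));
    }
}
```

In this model, reading the ESP slot lets the bogus frame pointer slip through
the check, while reading the SP slot rejects it.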

> Also, why is there any code being executed that was not compiled with 
> -fno-omit-frame-pointer? The description in the CR just shows a simple 
> java program reproducing this, so all the mixed stack traces belong to 
> the JVM and libs, and I thought we made sure to compile all of them with 
> -fno-omit-frame-pointer.

The JVM uses glibc, and that simple program is enough to catch some
thread's stack in a glibc function when getting a mixed stack trace.
We originally saw this in JDK 8 with jstack -m, and it was reported
in [2]. That comment has more details. The problem here isn't a JDK
lib being compiled without -fno-omit-frame-pointer. It's glibc not
being compiled with that option.

An example stack trace for a system where this happens looks like this:

Thread 7 (Thread 0xa3863b40 (LWP 834)):
#0  0xf771f430 in __kernel_vsyscall ()
#1  0xf7703acc in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
#2  do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
#3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
#4  0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522
#5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250

That is, frames 0-3 are foreign to the JDK. This bug will happen on all
systems that use any native library which isn't compiled with
-fno-omit-frame-pointer, be it glibc or some other library.

Thanks,
Severin

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4

> thanks,
> 
> Chris
> 
> On 7/26/18 10:11 AM, Severin Gehwolf wrote:
> > On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
> > > Changes look good, Severin.
> > 
> > Thanks for the review, Sharath!
> > 
> > > I am not a reviewer though, so you still need a Reviewer to review.
> > 
> > Anyone?
> > 
> > Thanks,
> > Severin
> > 
> > > -----Original Message-----
> > > From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
> > > Sent: Thursday, July 26, 2018 1:04 PM
> > > To: serviceability-dev
> > > Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
> > > 
> > > On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
> > > > Hi,
> > > > 
> > > > Could I please get a review of this one-liner change related to jhsdb
> > > > --mixed when attaching to a running Java process? The issue arises
> > > > when threads are in native code and that native code has frame
> > > > pointers not properly preserved. In such a case the SA performs a
> > > > simple frame pointer validity check: ebp >= esp
> > > > 
> > > > However, the code retrieving the value for esp is incorrect inasmuch
> > > > as it's not in sync with the native code with regard to the register
> > > > index:
> > > > 
> > > > native code => X86ThreadContext.SP
> > > > Java code   => X86ThreadContext.ESP
> > > > 
> > > > X86ThreadContext.ESP is never being set by the native code. Since
> > > > X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
> > > > returns null, ebp.lessThan(esp) wrongly returns false causing the
> > > > issue. This webrev fixes it by using SP as index on the Java side.
> > > > Thoughts?
> > > > 
> > > > webrev:
> > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
> > > > bug: https://bugs.openjdk.java.net/browse/JDK-8208091
> > > 
> > > Anyone willing to review this one-liner?
> > > 
> > > Thanks,
> > > Severin
> > > 
> > > > Thanks,
> > > > Severin
> 
> 


From thomas.schatzl at oracle.com  Mon Jul 30 13:03:20 2018
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 30 Jul 2018 15:03:20 +0200
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <e06a126c624e3b4aa836dfeed385882e85261a43.camel@oracle.com>
References: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>
 <e06a126c624e3b4aa836dfeed385882e85261a43.camel@oracle.com>
Message-ID: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>

Hi Paul,

I did some prototyping and wanted to show you the results and get your
input:

On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
> 
[...]
> Could we work together on first refactoring the code before adding
> new
> kinds of spaces to the MXBeans?
> 
> Looking at this change and mine roughly the following issues would
> need to be resolved first:
> - find a solution for archive regions as suggested above :) At the
> moment, without doing the change, I would tend to make archive
> regions separate from old regions.

I went with that and I am currently testing
https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at:
http://cr.openjdk.java.net/~tschatzl/8208498/webrev/

> - move serviceability stuff as much as possible to
> g1MonitoringSupport

Preliminary webrev:
http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/

I think this came out better than expected: while we maybe want to add
a ServiceabilitySupport interface that collects the
get_memory_manager/pools/* methods in the future, imho this is a lot
better than current code as it tightens the G1MonitoringSupport
interface quite a bit.

Particularly of note should be the G1MonitoringScope class that
collects both TraceCollectorStats and TraceMemoryManagerStats into a
single class. (Instead of the two bools passed to it, something
indicating the GC directly would probably be better too.)

It would be nice if something similar could be made for the concurrent
Trace*Stats.

> - clean up MemoryPool, remove duplicate information
> - provide and return sane memory pool used/committed values to the
> MXBeans
> - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
> variables
> for every single memory pool. Use MemoryUsage structs for them. Make
> reading of memory pool information atomic wrt to its readers (note
> that I think it is currently just impossible to get consistent output
> for other statistics like jstat) - that's JDK-8207200.
> - add whatever serviceability stuff for the new pools/jstat/* in
> steps.


Thanks,
  Thomas


From chris.plummer at oracle.com  Mon Jul 30 16:33:15 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Jul 2018 09:33:15 -0700
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <ce303b00ffd5a439ce87a596efeb31e594d117af.camel@redhat.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
 <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
 <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>
 <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com>
 <ce303b00ffd5a439ce87a596efeb31e594d117af.camel@redhat.com>
Message-ID: <bdf77413-51d4-574f-9ff9-0eba5bfdda79@oracle.com>

Hi Severin,

On 7/30/18 1:28 AM, Severin Gehwolf wrote:
> Hi Chris,
>
> On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
>> I had looked at this review when it came out, but was hesitant to ok it
>> because I really don't know this code at all. If you can get another
>> reviewer who does know the code, then I'll approve it.
> Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK
> census. As to whether he knows the code, I don't know. He's on CC.
Yes, but I was asking for a second reviewer (not counting me).
>
>> This only impacts 32-bit, right? If so, keep in mind that it won't get tested by Oracle
>> testing, including the submit repo, so make sure you do thorough testing.
> It only impacts 32-bit, yes. I understand that Oracle isn't testing
> 32-bit x86 any more. The change itself should be fairly low risk since
> it's changing only a 32-bit-x86-linux-only file and the native bits
> don't seem to match what the Java code does[1]. REG_INDEX(reg) being
> defined as:
>
> #define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg
>
> and being used as:
>
> REG_INDEX(SP)
>
> Thus, using
>
> sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP
>
> The Java code uses:
>
> sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
>
>> Also, why is there any code being executed that was not compiled with
>> -fno-omit-frame-pointer? The description in the CR just shows a simple
>> java program reproducing this, so all the mixed stack traces belong to
>> the JVM and libs, and I thought we made sure to compile all of them with
>> -fno-omit-frame-pointer.
> The JVM uses glibc, and that simple program is enough to catch some
> thread's stack in a glibc function when getting a mixed stack trace.
> We originally saw this in JDK 8 with jstack -m, and it was reported
> in [2]. That comment has more details. The problem here isn't a JDK
> lib being compiled without -fno-omit-frame-pointer. It's glibc not
> being compiled with that option.
>
> An example stack trace for a system where this happens looks like this:
>
> Thread 7 (Thread 0xa3863b40 (LWP 834)):
> #0  0xf771f430 in __kernel_vsyscall ()
> #1  0xf7703acc in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> #2  do_futex_wait (sem=0xf770f000, sem at entry=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
> #3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
> #4  0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522
> #5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250
>
> That is, frames 0-3 are foreign to the JDK. This bug will happen on all
> systems that use any native library which isn't compiled with
> -fno-omit-frame-pointer, be it glibc or some other library.
Ok. It looks like we don't even have a "jstack --mixed" test. Could you 
add one? It would be even better if the test included a JNI lib that 
wasn't compiled with -fno-omit-frame-pointer so you don't need to rely 
on glibc to reproduce this issue (or is glibc pretty much always 
compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a 
bug to have a test added.
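
As a rough starting point for such a test, the command line it would
drive can be sketched as below. The class and method names are made up
for illustration; only the "jhsdb jstack --mixed --pid" invocation is
taken from the tool itself. Launching the target process and checking
the output are left out.

```java
import java.nio.file.Paths;
import java.util.List;

public class MixedJstackCommand {
    /**
     * Builds the argument list for "jhsdb jstack --mixed --pid <pid>".
     * javaHome is the JDK under test; pid is the already running target.
     */
    public static List<String> command(String javaHome, long pid) {
        String jhsdb = Paths.get(javaHome, "bin", "jhsdb").toString();
        return List.of(jhsdb, "jstack", "--mixed", "--pid", Long.toString(pid));
    }

    public static void main(String[] args) {
        // Hypothetical pid; a real test would start a target JVM first.
        List<String> cmd = command(System.getProperty("java.home"), 12345L);
        System.out.println(String.join(" ", cmd));
        // The list can be handed to new ProcessBuilder(cmd) to run the tool.
    }
}
```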

thanks,

Chris
>
> Thanks,
> Severin
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4
>
>> thanks,
>>
>> Chris
>>
>> On 7/26/18 10:11 AM, Severin Gehwolf wrote:
>>> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
>>>> Changes looks good Severin.
>>> Thanks for the review, Sharath!
>>>
>>>> I am not a reviewer though, so you still need a Reviewer to review.
>>> Anyone?
>>>
>>> Thanks,
>>> Severin
>>>
>>>> -----Original Message-----
>>>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
>>>> Sent: Thursday, July 26, 2018 1:04 PM
>>>> To: serviceability-dev
>>>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686
>>>>
>>>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
>>>>> Hi,
>>>>>
>>>>> Could I please get a review of this one-liner change related to jhsdb
>>>>> --mixed when attaching to a running Java process? The issue arises
>>>>> when threads are in native code and that native code has frame
>>>>> pointers not properly preserved. In such a case the SA performs a
>>>>> simple frame pointer validity check: ebp >= esp
>>>>>
>>>>> However, the code retrieving the value for esp is incorrect in that
>>>>> it is not in sync with the native code with regard to the register
>>>>> index:
>>>>>
>>>>> native code => X86ThreadContext.SP
>>>>> Java code   => X86ThreadContext.ESP
>>>>>
>>>>> X86ThreadContext.ESP is never being set by the native code. Since
>>>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then
>>>>> returns null, ebp.lessThan(esp) wrongly returns false causing the
>>>>> issue. This webrev fixes it by using SP as index on the Java side.
>>>>> Thoughts?
>>>>>
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091
>>>> Anyone willing to review this one-liner?
>>>>
>>>> Thanks,
>>>> Severin
>>>>
>>>>> Thanks,
>>>>> Severin
>>
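
The index mismatch described in the original RFR can be modelled in a
few lines. This is a simplified stand-in, not the SA code: the slot
numbers, the Long[] register file and the helper names are all
assumptions made for the sketch. It only shows why reading a slot the
native side never fills makes the "ebp >= esp" check pass when it
should fail.

```java
public class RegisterIndexMismatch {
    // Hypothetical register slots; the real values live in X86ThreadContext.
    public static final int FP = 0;   // frame pointer slot, filled by native code
    public static final int SP = 1;   // stack pointer slot, filled by native code
    public static final int ESP = 2;  // separate slot that native code never fills

    /** Stand-in for the native side: it only populates FP and SP. */
    public static Long[] fillFromNative(long ebp, long esp) {
        Long[] regs = new Long[3];    // slots that are not set stay null
        regs[FP] = ebp;
        regs[SP] = esp;
        return regs;
    }

    /** Null-tolerant comparison: a missing operand never compares as smaller. */
    public static boolean lessThan(Long a, Long b) {
        return a != null && b != null && Long.compareUnsigned(a, b) < 0;
    }

    /** The validity check "ebp >= esp", reading esp from the given slot. */
    public static boolean framePointerLooksValid(Long[] regs, int spIndex) {
        return !lessThan(regs[FP], regs[spIndex]);
    }

    public static void main(String[] args) {
        // A clobbered frame pointer: ebp is far below esp.
        Long[] regs = fillFromNative(0x1L, 0xf770f000L);
        System.out.println("esp read via ESP: " + framePointerLooksValid(regs, ESP));
        System.out.println("esp read via SP:  " + framePointerLooksValid(regs, SP));
    }
}
```

With the ESP slot the bogus frame pointer is accepted, with the SP slot
it is rejected, which is the behaviour the one-liner is after.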


From chris.plummer at oracle.com  Mon Jul 30 16:46:35 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Jul 2018 09:46:35 -0700
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
 boolVar with expected value: false
In-Reply-To: <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com>
References: <5B082D2E.7000408@oracle.com>
 <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
 <5B5233DC.5040003@oracle.com>
 <853aba55-fafc-2797-ed44-818760bd5571@oracle.com>
 <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com>
 <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com>
Message-ID: <8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180730/7d6d5bca/attachment-0001.html>

From daniel.daugherty at oracle.com  Mon Jul 30 18:05:39 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 30 Jul 2018 14:05:39 -0400
Subject: RFR(XXS): 8208521 ProblemList more tests that fail due to 'Error
 attaching to process: Can't create thread_db agent!'
Message-ID: <cd20f2ae-0e55-dda7-5cec-f6ca7332cac9@oracle.com>

Greetings,

I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
I need a single (R)eviewer for the following fix:

   JDK-8208521 ProblemList more tests that fail due to 'Error attaching to
               process: Can't create thread_db agent!'
   https://bugs.openjdk.java.net/browse/JDK-8208521

Here's the diff:

$ hg diff
diff -r 24517a097dc1 test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt	Fri Jul 27 00:00:28 2018 -0700
+++ b/test/hotspot/jtreg/ProblemList.txt	Mon Jul 30 13:58:45 2018 -0400
@@ -101,6 +101,7 @@
 serviceability/sa/ClhsdbSymbol.java 8193639 solaris
 serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris
 serviceability/sa/ClhsdbThread.java 8193639 solaris
+serviceability/sa/ClhsdbVmStructsDump.java 8193639 solaris
 serviceability/sa/ClhsdbWhere.java 8193639 solaris
 serviceability/sa/DeadlockDetectionTest.java 8193639 solaris
 serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris
@@ -109,6 +110,7 @@
 serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris
 serviceability/sa/TestDefaultMethods.java 8193639 solaris
 serviceability/sa/TestG1HeapRegion.java 8193639 solaris
+serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris
 serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
 serviceability/sa/TestType.java 8193639 solaris
 serviceability/sa/TestUniverse.java 8193639 solaris


This is an add-on to the following fix that I pushed last week:

   JDK-8208205 ProblemList tests that fail due to 'Error attaching to
               process: Can't create thread_db agent!'
   https://bugs.openjdk.java.net/browse/JDK-8208205

The above two tests failed in last weekend's jdk-11+24 Thread-SMR
stress test run on Solaris-X64.

Thanks, in advance, for any questions, comments or suggestions.

Dan


From chris.plummer at oracle.com  Mon Jul 30 18:17:38 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Jul 2018 11:17:38 -0700
Subject: RFR(XXS): 8208521 ProblemList more tests that fail due to 'Error
 attaching to process: Can't create thread_db agent!'
In-Reply-To: <cd20f2ae-0e55-dda7-5cec-f6ca7332cac9@oracle.com>
References: <cd20f2ae-0e55-dda7-5cec-f6ca7332cac9@oracle.com>
Message-ID: <f62c9e97-37a8-79b4-7156-6020f8729f07@oracle.com>

Looks good.

Chris

On 7/30/18 11:05 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
> I need a single (R)eviewer for the following fix:
>
>   JDK-8208521 ProblemList more tests that fail due to 'Error attaching to
>               process: Can't create thread_db agent!'
>   https://bugs.openjdk.java.net/browse/JDK-8208521
>
> Here's the diff:
>
> $ hg diff
> diff -r 24517a097dc1 test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt	Fri Jul 27 00:00:28 2018 -0700
> +++ b/test/hotspot/jtreg/ProblemList.txt	Mon Jul 30 13:58:45 2018 -0400
> @@ -101,6 +101,7 @@
>  serviceability/sa/ClhsdbSymbol.java 8193639 solaris
>  serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris
>  serviceability/sa/ClhsdbThread.java 8193639 solaris
> +serviceability/sa/ClhsdbVmStructsDump.java 8193639 solaris
>  serviceability/sa/ClhsdbWhere.java 8193639 solaris
>  serviceability/sa/DeadlockDetectionTest.java 8193639 solaris
>  serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris
> @@ -109,6 +110,7 @@
>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris
>  serviceability/sa/TestDefaultMethods.java 8193639 solaris
>  serviceability/sa/TestG1HeapRegion.java 8193639 solaris
> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris
>  serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>  serviceability/sa/TestType.java 8193639 solaris
>  serviceability/sa/TestUniverse.java 8193639 solaris
>
>
> This is an add-on to the following fix that I pushed last week:
>
>   JDK-8208205 ProblemList tests that fail due to 'Error attaching to
>               process: Can't create thread_db agent!'
>   https://bugs.openjdk.java.net/browse/JDK-8208205
>
> The above two tests failed in last weekend's jdk-11+24 Thread-SMR
> stress test run on Solaris-X64.
>
> Thanks, in advance, for any questions, comments or suggestions.
>
> Dan
>


From daniel.daugherty at oracle.com  Mon Jul 30 18:19:10 2018
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Mon, 30 Jul 2018 14:19:10 -0400
Subject: RFR(XXS): 8208521 ProblemList more tests that fail due to 'Error
 attaching to process: Can't create thread_db agent!'
In-Reply-To: <f62c9e97-37a8-79b4-7156-6020f8729f07@oracle.com>
References: <cd20f2ae-0e55-dda7-5cec-f6ca7332cac9@oracle.com>
 <f62c9e97-37a8-79b4-7156-6020f8729f07@oracle.com>
Message-ID: <1414847c-9490-1b92-58f0-a0bcf42662ae@oracle.com>

Chris,

Thanks for the fast review!

Dan


On 7/30/18 2:17 PM, Chris Plummer wrote:
> Looks good.
>
> Chris
>
> On 7/30/18 11:05 AM, Daniel D. Daugherty wrote:
>> Greetings,
>>
>> I'm in the process of reducing the noise in the JDK11 and JDK12 CIs so
>> I need a single (R)eviewer for the following fix:
>>
>>   JDK-8208521 ProblemList more tests that fail due to 'Error attaching to
>>               process: Can't create thread_db agent!'
>>   https://bugs.openjdk.java.net/browse/JDK-8208521
>>
>> Here's the diff:
>>
>> $ hg diff
>> diff -r 24517a097dc1 test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt	Fri Jul 27 00:00:28 2018 -0700
>> +++ b/test/hotspot/jtreg/ProblemList.txt	Mon Jul 30 13:58:45 2018 -0400
>> @@ -101,6 +101,7 @@
>>  serviceability/sa/ClhsdbSymbol.java 8193639 solaris
>>  serviceability/sa/ClhsdbSymbolTable.java 8193639 solaris
>>  serviceability/sa/ClhsdbThread.java 8193639 solaris
>> +serviceability/sa/ClhsdbVmStructsDump.java 8193639 solaris
>>  serviceability/sa/ClhsdbWhere.java 8193639 solaris
>>  serviceability/sa/DeadlockDetectionTest.java 8193639 solaris
>>  serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris
>> @@ -109,6 +110,7 @@
>>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris
>>  serviceability/sa/TestDefaultMethods.java 8193639 solaris
>>  serviceability/sa/TestG1HeapRegion.java 8193639 solaris
>> +serviceability/sa/TestHeapDumpForInvokeDynamic.java 8193639 solaris
>>  serviceability/sa/TestRevPtrsForInvokeDynamic.java 8191270 generic-all
>>  serviceability/sa/TestType.java 8193639 solaris
>>  serviceability/sa/TestUniverse.java 8193639 solaris
>>
>>
>> This is an add-on to the following fix that I pushed last week:
>>
>>   JDK-8208205 ProblemList tests that fail due to 'Error attaching to
>>               process: Can't create thread_db agent!'
>>   https://bugs.openjdk.java.net/browse/JDK-8208205
>>
>> The above two tests failed in last weekend's jdk-11+24 Thread-SMR
>> stress test run on Solaris-X64.
>>
>> Thanks, in advance, for any questions, comments or suggestions.
>>
>> Dan
>>
>
>


From hohensee at amazon.com  Mon Jul 30 19:18:27 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 30 Jul 2018 19:18:27 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>
References: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>
 <e06a126c624e3b4aa836dfeed385882e85261a43.camel@oracle.com>
 <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>
Message-ID: <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>

At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones.

Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it.

I'd not have thought of making a G1MonitoringScope, looks good.

Thanks,

Paul

On 7/30/18, 6:04 AM, "Thomas Schatzl" <thomas.schatzl at oracle.com> wrote:

    Hi Paul,
    
      did some prototyping and wanted to show you the results and get your
    input:
    
    On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
    > 
    [...]
    > Could we work together on first refactoring the code before adding
    > new
    > kinds of spaces to the MXBeans?
    > 
    > Looking at this change and mine roughly the following issues would
    > need to be resolved first:
    > - find a solution for archive regions as suggested above :) At the
    > moment, without doing the change, I would tend to make archive
    > regions separate from old regions.
    
    I went with that and I am currently testing
    https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to
    look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/
    
    > - move serviceability stuff as much as possible to
    > g1MonitoringSupport
    
    Preliminary webrev:
    http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/
    
    I think this came out better than expected: while we maybe want to add
    a ServiceabilitySupport interface that collects the
    get_memory_manager/pools/* methods in the future, imho this is a lot
    better than current code as it tightens the G1MonitoringSupport
    interface quite a bit.
    
    Particularly of note should be the G1MonitoringScope class that
    collects both TraceCollectorStats and TraceMemoryManagerStats into a
    single class. (Instead of the two bools passed to it something
    indicating the GC directly would probably be better too).
    
    It would be nice if something similar could be made for the concurrent
    Trace*Stats.
    
    > - clean up MemoryPool, remove duplicate information
    > - provide and return sane memory pool used/committed values to the
    > MXBeans
    > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
    > variables
    > for every single memory pool. Use MemoryUsage structs for them. Make
    > reading of memory pool information atomic wrt to its readers (note
    > that I think it is currently just impossible to get consistent output
    > for other statistics like jstat) - that's JDK-8207200.
    > - add whatever serviceability stuff for the new pools/jstat/* in
    > steps.
    
    
    Thanks,
      Thomas
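
The "MemoryUsage structs, read atomically" item in the list above can
be illustrated on the Java side. This is an analogy only: the HotSpot
code in question is C++, and the class below (PoolUsageSnapshot and its
methods) is invented for the sketch. It uses
java.lang.management.MemoryUsage as the value type and publishes each
update as one immutable object, so a reader never sees "used" from one
update paired with "committed" from another.

```java
import java.lang.management.MemoryUsage;
import java.util.concurrent.atomic.AtomicReference;

public class PoolUsageSnapshot {
    // init = 0 and max = -1 (undefined) are placeholder values for the sketch.
    private final AtomicReference<MemoryUsage> usage =
        new AtomicReference<>(new MemoryUsage(0, 0, 0, -1));

    /** Writer: publish used and committed together as one immutable value. */
    public void update(long used, long committed) {
        usage.set(new MemoryUsage(0, used, committed, -1));
    }

    /** Reader: a single load returns a pair that came from the same update. */
    public MemoryUsage read() {
        return usage.get();
    }

    public static void main(String[] args) {
        PoolUsageSnapshot pool = new PoolUsageSnapshot();
        pool.update(512, 1024);
        MemoryUsage u = pool.read();
        System.out.println(u.getUsed() + " used of " + u.getCommitted() + " committed");
    }
}
```

Separate "used" and "committed" fields updated one after the other
would not give readers that guarantee.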
    
    
From coleen.phillimore at oracle.com  Mon Jul 30 20:49:57 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Mon, 30 Jul 2018 16:49:57 -0400
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
Message-ID: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>

Summary: fixed refactoring caused by JDK-8203820

open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
bug link https://bugs.openjdk.java.net/browse/JDK-8208074

Ran the test in mach5 on all Oracle supported platforms. Also took the 
test out of ProblemList.txt because JDK-8203820 fixes 
https://bugs.openjdk.java.net/browse/JDK-8202896.

Thanks,
Coleen

From david.holmes at oracle.com  Mon Jul 30 21:46:21 2018
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 31 Jul 2018 07:46:21 +1000
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
Message-ID: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com>

On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote:
> Summary: fixed refactoring caused by JDK-8203820
> 
> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8208074

For the sake of other readers who don't want to have to reverse engineer 
the actual cause of the problem: the original code had two Method.invoke 
sequences, one for a static method, which passed a null receiver, and one 
for a non-static method, which passed a non-null receiver. The 
refactoring extracted the invoke logic but always passed a null receiver, 
which was wrong for the non-static case. The fix always passes a 
non-null receiver, which fixes the non-static case and is ignored in 
the static case.
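
The receiver rule above can be checked with a small standalone program.
This is not the test's code; ReceiverDemo and its methods are made up
for illustration. It relies only on the documented behaviour of
java.lang.reflect.Method.invoke: the receiver argument is ignored for
static methods, and a null receiver for an instance method raises
NullPointerException.

```java
import java.lang.reflect.Method;

public class ReceiverDemo {
    public static String staticHello() { return "static"; }
    public String instanceHello() { return "instance"; }

    /** Shared invoke helper, in the spirit of the extracted invoke logic. */
    public static Object call(String name, Object receiver) throws Exception {
        Method m = ReceiverDemo.class.getMethod(name);
        return m.invoke(receiver);
    }

    public static void main(String[] args) throws Exception {
        ReceiverDemo receiver = new ReceiverDemo();
        System.out.println(call("staticHello", null));        // null is fine for static
        System.out.println(call("staticHello", receiver));    // receiver is ignored
        System.out.println(call("instanceHello", receiver));  // receiver is required
        try {
            call("instanceHello", null);                       // the broken case
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for null receiver");
        }
    }
}
```

Always passing a non-null receiver therefore covers both cases with one
helper.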

Reviewed. Trivial.

Thanks,
David

> Ran the test in mach5 on all Oracle supported platforms. Also took the 
> test out of ProblemList.txt because JDK-8203820 fixes 
> https://bugs.openjdk.java.net/browse/JDK-8202896.
> 
> Thanks,
> Coleen

From hohensee at amazon.com  Mon Jul 30 23:26:57 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Mon, 30 Jul 2018 23:26:57 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>
References: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>
 <e06a126c624e3b4aa836dfeed385882e85261a43.camel@oracle.com>
 <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>
 <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>
Message-ID: <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com>

A couple nits on http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/.

g1CollectedHeap.cpp: in initialize_serviceability(), memory_managers(), and memory_pools(), use g1mm() instead of _g1mm.

g1MonitoringSupport.cpp: there's an extra newline after ~G1MonitoringSupport().

Otherwise looks good.

Paul

On 7/30/18, 12:18 PM, "Hohensee, Paul" <hohensee at amazon.com> wrote:

    At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones.
    
    Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it.
    
    I'd not have thought of making a G1MonitoringScope, looks good.
    
    Thanks,
    
    Paul
    
    On 7/30/18, 6:04 AM, "Thomas Schatzl" <thomas.schatzl at oracle.com> wrote:
    
        Hi Paul,
        
          did some prototyping and wanted to show you the results and get your
        input:
        
        On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
        > 
        [...]
        > Could we work together on first refactoring the code before adding
        > new
        > kinds of spaces to the MXBeans?
        > 
        > Looking at this change and mine roughly the following issues would
        > need to be resolved first:
        > - find a solution for archive regions as suggested above :) At the
        > moment, without doing the change, I would tend to make archive
        > regions separate from old regions.
        
        I went with that and I am currently testing
        https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to
        look at: http://cr.openjdk.java.net/~tschatzl/8208498/webrev/
        
        > - move serviceability stuff as much as possible to
        > g1MonitoringSupport
        
        Preliminary webrev:
        http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/
        
        I think this came out better than expected: while we maybe want to add
        a ServiceabilitySupport interface that collects the
        get_memory_manager/pools/* methods in the future, imho this is a lot
        better than current code as it tightens the G1MonitoringSupport
        interface quite a bit.
        
        Particularly of note should be the G1MonitoringScope class that
        collects both TraceCollectorStats and TraceMemoryManagerStats into a
        single class. (Instead of the two bools passed to it something
        indicating the GC directly would probably be better too).
        
        It would be nice if something similar could be made for the concurrent
        Trace*Stats.
        
        > - clean up MemoryPool, remove duplicate information
        > - provide and return sane memory pool used/committed values to the
        > MXBeans
        > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
        > variables
        > for every single memory pool. Use MemoryUsage structs for them. Make
        > reading of memory pool information atomic wrt to its readers (note
        > that I think it is currently just impossible to get consistent output
        > for other statistics like jstat) - that's JDK-8207200.
        > - add whatever serviceability stuff for the new pools/jstat/* in
        > steps.
        
        
        Thanks,
          Thomas
        
        
From chris.plummer at oracle.com  Tue Jul 31 01:34:21 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 30 Jul 2018 18:34:21 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
Message-ID: <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>

Hi Coleen,

Now that this had been pushed, I assume JDK-8202896 should be closed as 
a dup. And what about JDK-8206076? Is it fixed by this change also?

thanks,

Chris

On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
> Summary: fixed refactoring caused by JDK-8203820
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>
> Ran the test in mach5 on all Oracle supported platforms. Also took 
> the test out of ProblemList.txt because JDK-8203820 fixes 
> https://bugs.openjdk.java.net/browse/JDK-8202896.
>
> Thanks,
> Coleen


From sharath.ballal at oracle.com  Tue Jul 31 04:23:46 2018
From: sharath.ballal at oracle.com (Sharath Ballal)
Date: Mon, 30 Jul 2018 21:23:46 -0700 (PDT)
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <bdf77413-51d4-574f-9ff9-0eba5bfdda79@oracle.com>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
 <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
 <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>
 <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com>
 <ce303b00ffd5a439ce87a596efeb31e594d117af.camel@redhat.com>
 <bdf77413-51d4-574f-9ff9-0eba5bfdda79@oracle.com>
Message-ID: <8cb164e3-0492-4c92-81d0-469f16158ff4@default>

> Ok. It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc  pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added.

That's a good suggestion. Severin, you can either write a test or open a bug for it.

Thanks,
Sharath


-----Original Message-----
From: Chris Plummer 
Sent: Monday, July 30, 2018 10:03 PM
To: Severin Gehwolf; Sharath Ballal; serviceability-dev
Subject: Re: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws UnmappedAddressException on i686

Hi Severin,

On 7/30/18 1:28 AM, Severin Gehwolf wrote:
> Hi Chris,
>
> On Thu, 2018-07-26 at 14:07 -0700, Chris Plummer wrote:
>> I had looked at this review when it came out, but was hesitant to ok 
>> it because I really don't know this code at all. If you can get 
>> another reviewer who does know the code, then I'll approve it.
> Sharath Ballal reviewed it, but he's not a Reviewer as per the OpenJDK 
> census. As to whether he knows the code, I don't know. He's on CC.
Yes, but I was asking for a second reviewer (not counting me).
>
>> This only impacts 32-bit, right? If so, keep in mind that it won't 
>> get tested by Oracle testing, including the submit repo, so make sure you do thorough testing.
> It only impacts 32-bit, yes. I understand that Oracle isn't testing 
> 32-bit x86 any more. The change itself should be fairly low risk 
> since it's changing only a 32-bit-x86-linux-only file and the native 
> bits don't seem to match what the Java code does[1]. REG_INDEX(reg) 
> being defined as:
>
> #define REG_INDEX(reg) sun_jvm_hotspot_debugger_x86_X86ThreadContext_##reg
>
> and being used as:
>
> REG_INDEX(SP)
>
> Thus, using
>
> sun_jvm_hotspot_debugger_x86_X86ThreadContext_SP
>
> The Java code uses:
>
> sun.jvm.hotspot.debugger.x86.X86ThreadContext.ESP
>
>> Also, why is there any code being executed that was not compiled with 
>> -fno-omit-frame-pointer? The description in the CR just shows a 
>> simple java program reproducing this, so all the mixed stack traces 
>> belong to the JVM and libs, and I thought we made sure to compile all 
>> of them with -fno-omit-frame-pointer.
> The JVM uses glibc and that simple program is enough to see some 
> thread's stack currently being in a glibc function when getting a 
> mixed stack trace. We originally saw this in JDK 8 with jstack -m, 
> and it was reported in [2]. That comment has more details. The problem 
> here isn't that it's a JDK lib which gets compiled without 
> -fno-omit-frame-pointer. It's glibc not being compiled with that option.
>
> An example stack trace for a system where this happens looks like this:
>
> Thread 7 (Thread 0xa3863b40 (LWP 834)):
> #0  0xf771f430 in __kernel_vsyscall ()
> #1  0xf7703acc in futex_abstimed_wait (cancel=true, private=<optimized out>, abstime=0x0, expected=1, futex=0xf770f000) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> #2  do_futex_wait (sem=0xf770f000, sem@entry=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:226
> #3  0xf7703bb7 in __new_sem_wait_slow (sem=0xf70ea854 <sig_sem>, abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:407
> #4  0xf6cc18d4 in check_pending_signals (wait=true) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/os/linux/vm/os_linux.cpp:2522
> #5  0xf6cbc632 in signal_thread_entry (thread=0xa37a4800, __the_thread__=0xa37a4800) at /usr/src/debug/java-1.8.0-openjdk-1.8.0.171-8.b10.el7_5.i386/openjdk/hotspot/src/share/vm/runtime/os.cpp:250
>
> That is, frames 0-3 are foreign to the JDK. This bug will happen on all 
> systems which use any native library which isn't compiled with 
> -fno-omit-frame-pointer, be it glibc or some other library.
Ok. It looks like we don't even have a "jstack --mixed" test. Could you add one? It would be even better if the test included a JNI lib that wasn't compiled with -fno-omit-frame-pointer so you don't need to rely on glibc to reproduce this issue (or is glibc pretty much always compiled without -fno-omit-frame-pointer)? Or if Sharath agrees, file a bug to have a test added.

thanks,

Chris
>
> Thanks,
> Severin
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c9
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1602008#c4
>
>> thanks,
>>
>> Chris
>>
>> On 7/26/18 10:11 AM, Severin Gehwolf wrote:
>>> On Thu, 2018-07-26 at 10:04 -0700, Sharath Ballal wrote:
>>>> Changes looks good Severin.
>>> Thanks for the review, Sharath!
>>>
>>>> I am not a reviewer though, so you still need a Reviewer to review.
>>> Anyone?
>>>
>>> Thanks,
>>> Severin
>>>
>>>> -----Original Message-----
>>>> From: Severin Gehwolf [mailto:sgehwolf at redhat.com]
>>>> Sent: Thursday, July 26, 2018 1:04 PM
>>>> To: serviceability-dev
>>>> Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws 
>>>> UnmappedAddressException on i686
>>>>
>>>> On Mon, 2018-07-23 at 18:27 +0200, Severin Gehwolf wrote:
>>>>> Hi,
>>>>>
>>>>> Could I please get a review of this one-liner change related to 
>>>>> jhsdb --mixed when attaching to a running Java process? The issue 
>>>>> arises when threads are in native code and that native code has 
>>>>> frame pointers not properly preserved. In such a case the SA 
>>>>> performs a simple frame pointer validity check: ebp >= esp
>>>>>
>>>>> However, the code retrieving the value for esp is incorrect in 
>>>>> that it is not in sync with the native code with regard to the 
>>>>> register index:
>>>>>
>>>>> native code => X86ThreadContext.SP
>>>>> Java code   => X86ThreadContext.ESP
>>>>>
>>>>> X86ThreadContext.ESP is never being set by the native code. Since
>>>>> X86ThreadContext.getRegisterAsAddress(X86ThreadContext.ESP) then 
>>>>> returns null, ebp.lessThan(esp) wrongly returns false causing the 
>>>>> issue. This webrev fixes it by using SP as index on the Java side.
>>>>> Thoughts?
>>>>>
>>>>> webrev:
>>>>> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8208091/webrev.01/
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8208091
>>>> Anyone willing to review this one-liner?
>>>>
>>>> Thanks,
>>>> Severin
>>>>
>>>>> Thanks,
>>>>> Severin
>>


From chris.plummer at oracle.com  Tue Jul 31 07:16:03 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Jul 2018 00:16:03 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
Message-ID: <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>

Sorry, I thought this had been pushed already, but it hasn't. But it 
still looks like JDK-8202896 should be closed as a dup, and it's unclear 
to me if JDK-8206076 has been fixed and this test can be removed from 
the problem list.

Chris

On 7/30/18 6:34 PM, Chris Plummer wrote:
> Hi Coleen,
>
> Now that this had been pushed, I assume JDK-8202896 should be closed 
> as a dup. And what about JDK-8206076? Is it fixed by this change also?
>
> thanks,
>
> Chris
>
> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>> Summary: fixed refactoring caused by JDK-8203820
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>
>> Ran the test in mach5 on all Oracle supported platforms. Also took 
>> the test out of ProblemList.txt because JDK-8203820 fixes 
>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>
>> Thanks,
>> Coleen
>
>
>


From serguei.spitsyn at oracle.com  Tue Jul 31 07:20:54 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Jul 2018 00:20:54 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com>
Message-ID: <af7d58a8-95bf-fd29-71eb-ee88b55ea79d@oracle.com>

Hi Coleen,

The explanation from David is very helpful - thanks!
So the fix looks good to me as well.

We still need to answer questions from Chris though.

Thanks,
Serguei


On 7/30/18 14:46, David Holmes wrote:
> On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote:
>> Summary: fixed refactoring caused by JDK-8203820
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>
> For the sake of other readers who don't want to have to reverse 
> engineer the actual cause of the problem, the original code has two 
> Method.invoke sequences: one for a static method and which passed a 
> null receiver; one for a non-static method which passed a non-null 
> receiver. The refactoring extracted the invoke logic but always passed 
> a null receiver - which was wrong for the non-static case. The fix 
> always passes a non-null receiver to fix the non-static case, and 
> which is ignored in the static case.
>
> Reviewed. Trivial.
>
> Thanks,
> David
>
>> Ran the test in mach5 on all Oracle supported platforms. Also took
>> the test out of ProblemList.txt because JDK-8203820 fixes 
>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>
>> Thanks,
>> Coleen


From serguei.spitsyn at oracle.com  Tue Jul 31 07:29:59 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Jul 2018 00:29:59 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
Message-ID: <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>

Hi Chris,

Good catch.
It is possible that this webrev does not fix JDK-8202896.
JDK-8202896 is about timeouts, which are normally intermittent (is that
right?).

There are two options here:
   A: close 8202896 as a dup of 8208074
   B: keep the test problem listed and labeled with 8202896

Let's wait for Coleen's answer.

Thanks,
Serguei


On 7/31/18 00:16, Chris Plummer wrote:
> Sorry, I thought this had been pushed already, but it hasn't. But it 
> still looks like JDK-8202896 should be closed as a dup, and it's 
> unclear to me if JDK-8206076 has been fixed and this test can be 
> removed from the problem list.
>
> Chris
>
> On 7/30/18 6:34 PM, Chris Plummer wrote:
>> Hi Coleen,
>>
>> Now that this had been pushed, I assume JDK-8202896 should be closed 
>> as a dup. And what about JDK-8206076? Is it fixed by this change also?
>>
>> thanks,
>>
>> Chris
>>
>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>> Summary: fixed refactoring caused by JDK-8203820
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>
>>> Ran the test in mach5 on all Oracle supported platforms. Also took
>>> the test out of ProblemList.txt because JDK-8203820 fixes 
>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>
>>> Thanks,
>>> Coleen
>>
>>
>>
>
>


From sgehwolf at redhat.com  Tue Jul 31 08:14:50 2018
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Tue, 31 Jul 2018 10:14:50 +0200
Subject: [PING] RFR(XS): 8208091: SA: jhsdb jstack --mixed throws
 UnmappedAddressException on i686
In-Reply-To: <8cb164e3-0492-4c92-81d0-469f16158ff4@default>
References: <3227ec7b6a46f8e24882057792c18c84eee934e6.camel@redhat.com>
 <4f6e6ddbcf04ac74e657e2bfa51fabe4219ce238.camel@redhat.com>
 <4ce2ad6f-cc85-40e9-9806-f391ac7671c7@default>
 <24722f6a12d78c2b7710e6072e86dd4ea77c2c6d.camel@redhat.com>
 <4106a672-066a-dc5a-32cc-2eb8eb1bfc4e@oracle.com>
 <ce303b00ffd5a439ce87a596efeb31e594d117af.camel@redhat.com>
 <bdf77413-51d4-574f-9ff9-0eba5bfdda79@oracle.com>
 <8cb164e3-0492-4c92-81d0-469f16158ff4@default>
Message-ID: <994a4b8b3ef5456404d83e3aad1a3ec9027fbc1e.camel@redhat.com>

On Mon, 2018-07-30 at 21:23 -0700, Sharath Ballal wrote:
> > Ok. It looks like we don't even have a "jstack --mixed" test. Could
> > you add one? It would be even better if the test included a JNI lib
> > that wasn't compiled with -fno-omit-frame-pointer so you don't need
> > to rely on glibc to reproduce this issue (or is glibc  pretty much
> > always compiled without -fno-omit-frame-pointer)? Or if Sharath
> > agrees, file a bug to have a test added.
> 
> That's a good suggestion.  Severin, you can either write a test or
> open a bug for it.

I'll write a test for it.

Thanks,
Severin


From serguei.spitsyn at oracle.com  Tue Jul 31 08:32:34 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Jul 2018 01:32:34 -0700
Subject: RFR: JDK-8169718: nsk/jdb/locals/locals002: ERROR: Cannot find
 boolVar with expected value: false
In-Reply-To: <8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com>
References: <5B082D2E.7000408@oracle.com>
 <5d58fa2b-7dda-db64-0280-1e19e791a86d@oracle.com>
 <5B5233DC.5040003@oracle.com>
 <853aba55-fafc-2797-ed44-818760bd5571@oracle.com>
 <352ccc2d-8e8a-4b43-45fd-64bed2bb56f1@oracle.com>
 <973a96aa-0533-1e7d-a6f7-e948c0ecc371@oracle.com>
 <8a256bfc-1ff1-da31-ce31-75099f850461@oracle.com>
Message-ID: <f0a3509a-8c37-7d8a-a6f1-d3a73c85b336@oracle.com>

An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20180731/723d5fcb/attachment-0001.html>

From coleen.phillimore at oracle.com  Tue Jul 31 11:56:15 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Jul 2018 07:56:15 -0400
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <08cdf2d7-975a-de61-9ae8-08c1f9bfbf7f@oracle.com>
Message-ID: <72b2618f-ec62-00ff-8af5-53dbc67156ef@oracle.com>


On 7/30/18 5:46 PM, David Holmes wrote:
> On 31/07/2018 6:49 AM, coleen.phillimore at oracle.com wrote:
>> Summary: fixed refactoring caused by JDK-8203820
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>
> For the sake of other readers who don't want to have to reverse 
> engineer the actual cause of the problem, the original code has two 
> Method.invoke sequences: one for a static method and which passed a 
> null receiver; one for a non-static method which passed a non-null
> receiver. The refactoring extracted the invoke logic but always passed 
> a null receiver - which was wrong for the non-static case. The fix 
> always passes a non-null receiver to fix the non-static case, and 
> which is ignored in the static case.
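
A minimal sketch of the receiver behaviour described above, for readers who
want to see it in isolation. The class and method names (ReceiverSketch,
Target, call) are hypothetical and are not taken from the test or the webrev;
only java.lang.reflect.Method.invoke is the real API. It shows that a single
extracted helper can serve both cases as long as it is handed a non-null
receiver, and that a null receiver fails only for the instance method.

```java
import java.lang.reflect.Method;

public class ReceiverSketch {
    // Hypothetical stand-in for the class whose methods the test invokes.
    public static class Target {
        public static String staticHello() { return "static"; }
        public String instanceHello() { return "instance"; }
    }

    // Extracted helper: one reflective invoke path shared by both kinds of method.
    public static Object call(Object receiver, String name) throws Exception {
        Method m = Target.class.getMethod(name);
        return m.invoke(receiver);
    }

    public static void main(String[] args) throws Exception {
        Target t = new Target();

        // A non-null receiver works for both: it is ignored for the static method.
        System.out.println(call(t, "staticHello"));
        System.out.println(call(t, "instanceHello"));

        // A null receiver is fine for the static method ...
        System.out.println(call(null, "staticHello"));

        // ... but Method.invoke throws NullPointerException for the instance method.
        try {
            call(null, "instanceHello");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException for null receiver");
        }
    }
}
```

Passing the instance unconditionally keeps the helper to one code path, at the
cost of creating an object that the static case never uses.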

Thank you David for summarizing the bug(s) and the review.

Coleen
>
> Reviewed. Trivial.
>
> Thanks,
> David
>
>> Ran the test in mach5 on all Oracle supported platforms. Also took
>> the test out of ProblemList.txt because JDK-8203820 fixes 
>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>
>> Thanks,
>> Coleen


From coleen.phillimore at oracle.com  Tue Jul 31 12:01:08 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Jul 2018 08:01:08 -0400
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
Message-ID: <2f6bf712-594c-d859-128d-cb30343ec591@oracle.com>


On 7/30/18 9:34 PM, Chris Plummer wrote:
> Hi Coleen,
>
> Now that this had been pushed, I assume JDK-8202896 should be closed 
> as a dup. And what about JDK-8206076? Is it fixed by this change also?

Yes, it should be closed also. I didn't see this bug. When I was
fixing the first one: https://bugs.openjdk.java.net/browse/JDK-8203820 ,
I looked for similar patterns in the vmTestbase tests and found this
test also. All of these tests were calling InMemoryJavaCompiler from
within a loop and from within multiple threads to get the same result.
I can imagine this easily timing out for -Xcomp.
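
Roughly, the pattern looks like the sketch below. This is an illustration
only, not the test code: expensiveCompile is a hypothetical stand-in for the
in-memory compile step, and the class name, thread count and iteration count
are made up. The point is that the result is computed once before the worker
threads start and is then shared, instead of being recomputed in every
iteration of every thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class CompileOnceSketch {
    // Counts how often the expensive step runs, so the effect is observable.
    public static final AtomicInteger compileCalls = new AtomicInteger();

    // Hypothetical stand-in for an expensive compile that always yields the same bytes.
    static byte[] expensiveCompile(String className) {
        compileCalls.incrementAndGet();
        return className.getBytes(StandardCharsets.UTF_8);
    }

    // Starts the workers and returns how many times the shared bytes were used.
    public static int runWorkers(int threads, int iterations) throws InterruptedException {
        compileCalls.set(0);

        // Hoisted out of the loop and out of the threads: done exactly once.
        final byte[] classBytes = expensiveCompile("Dummy");

        AtomicInteger uses = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < iterations; j++) {
                    // Each iteration only reads the shared, immutable result.
                    if (classBytes.length > 0) {
                        uses.incrementAndGet();
                    }
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return uses.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int uses = runWorkers(4, 100);
        System.out.println("uses=" + uses + " compiles=" + compileCalls.get());
    }
}
```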

I haven't pushed it yet. I was hoping you'd see this and comment on it,
since you had comments for the whole set of bugs.

Thanks!
Coleen

>
> thanks,
>
> Chris
>
> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>> Summary: fixed refactoring caused by JDK-8203820
>>
>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>
>> Ran the test in mach5 on all Oracle supported platforms. Also took
>> the test out of ProblemList.txt because JDK-8203820 fixes 
>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>
>> Thanks,
>> Coleen
>
>
>


From coleen.phillimore at oracle.com  Tue Jul 31 12:06:35 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Jul 2018 08:06:35 -0400
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
Message-ID: <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>


On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
> Hi Chris,
>
> Good catch.
> It is possible that this webrev does not fix the JDK-8202896.
> The JDK-8202896 is about timeouts which are normally intermittent (is 
> it right?).
>
> There are two options here:
>   A: close 8202896 as a dup of 8208074
>   B: keep the test problem listed and labeled with 8202896
>
> Let's wait for Coleen's answer.

I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts with 
-Xcomp)
as a duplicate of
https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
InMemoryCompiler out of the threads)
because that's where the attempted fix was.

I think
https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many open 
files intermittently)
should be closed as a duplicate too because it's the same root cause.

And this one:
https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
fixes my fix and will remove the test from the ProblemList.txt.

I believe it should be removed from the problem list because I don't
think it will time out or intermittently fail again for the same
reason. If it times out or fails for a different reason, we should file
a whole new bug, with that specific analysis.

Thanks,
Coleen

>
> Thanks,
> Serguei
>
>
> On 7/31/18 00:16, Chris Plummer wrote:
>> Sorry, I thought this had been pushed already, but it hasn't. But it 
>> still looks like JDK-8202896 should be closed as a dup, and it's 
>> unclear to me if JDK-8206076 has been fixed and this test can be 
>> removed from the problem list.
>>
>> Chris
>>
>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>> Hi Coleen,
>>>
>>> Now that this had been pushed, I assume JDK-8202896 should be closed 
>>> as a dup. And what about JDK-8206076? Is it fixed by this change also?
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>
>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>
>>>> Ran the test in mach5 on all Oracle supported platforms. Also took 
>>>> the test out of ProblemList.txt because JDK-8203820 fixes 
>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>>
>>>
>>
>>
>


From chris.plummer at oracle.com  Tue Jul 31 16:13:19 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Jul 2018 09:13:19 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
 <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
Message-ID: <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>

On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote:
>
>
> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
>> Hi Chris,
>>
>> Good catch.
>> It is possible that this webrev does not fix the JDK-8202896.
>> The JDK-8202896 is about timeouts which are normally intermittent (is 
>> it right?).
>>
>> There are two options here:
>>   A: close 8202896 as a dup of 8208074
>>   B: keep the test problem listed and labeled with 8202896
>>
>> Let's wait for Coleen's answer.
>
> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts 
> with -Xcomp)
> as a duplicate of
> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
> InMemoryCompiler out of the threads)
> because that's where the attempted fix was.
>
> I think
> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many 
> open files intermittently)
> should be closed as a duplicate too because it's the same root cause.
>
> And this one:
> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
> fixes my fix and will remove the test from the ProblemList.txt.
>
> I believe it should be removed from the problem list because I don't
> think it will time out or intermittently fail again for the same
> reason. If it times out or fails for a different reason, we should
> file a whole new bug, with that specific analysis.
>
> Thanks,
> Coleen

Hi Coleen,

That all sounds reasonable. Thanks for cleaning up the bug situation.

Chris
>
>>
>> Thanks,
>> Serguei
>>
>>
>> On 7/31/18 00:16, Chris Plummer wrote:
>>> Sorry, I thought this had been pushed already, but it hasn't. But it 
>>> still looks like JDK-8202896 should be closed as a dup, and it's 
>>> unclear to me if JDK-8206076 has been fixed and this test can be 
>>> removed from the problem list.
>>>
>>> Chris
>>>
>>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>>> Hi Coleen,
>>>>
>>>> Now that this had been pushed, I assume JDK-8202896 should be 
>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this 
>>>> change also?
>>>>
>>>> thanks,
>>>>
>>>> Chris
>>>>
>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>>
>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>>
>>>>> Ran the test in mach5 on all Oracle supported platforms. Also took 
>>>>> the test out of ProblemList.txt because JDK-8203820 fixes 
>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>
>>>>
>>>>
>>>
>>>
>>
>


From chris.plummer at oracle.com  Tue Jul 31 17:43:31 2018
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 31 Jul 2018 10:43:31 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
 <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
 <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
Message-ID: <bec6e59f-1df1-791e-f3e0-ed961c8715d1@oracle.com>

Hi Coleen,

I just realized that there is also 
https://bugs.openjdk.java.net/browse/JDK-8208234 filed for this test 
last week. It results in an OOME. I think it's the same issue, but I just
want to check with you. Please close it as a dup if you think it is the same.

thanks,

Chris

On 7/31/18 9:13 AM, Chris Plummer wrote:
> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>>
>>> Good catch.
>>> It is possible that this webrev does not fix the JDK-8202896.
>>> The JDK-8202896 is about timeouts which are normally intermittent 
>>> (is it right?).
>>>
>>> There are two options here:
>>>   A: close 8202896 as a dup of 8208074
>>>   B: keep the test problem listed and labeled with 8202896
>>>
>>> Let's wait for Coleen's answer.
>>
>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts 
>> with -Xcomp)
>> as a duplicate of
>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
>> InMemoryCompiler out of the threads)
>> because that's where the attempted fix was.
>>
>> I think
>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many 
>> open files intermittently)
>> should be closed as a duplicate too because it's the same root cause.
>>
>> And this one:
>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
>> fixes my fix and will remove the test from the ProblemList.txt.
>>
>> I believe it should be removed from the problem list because I don't
>> think it will time out or intermittently fail again for the same
>> reason. If it times out or fails for a different reason, we should
>> file a whole new bug, with that specific analysis.
>>
>> Thanks,
>> Coleen
>
> Hi Coleen,
>
> That all sounds reasonable. Thanks for cleaning up the bug situation.
>
> Chris
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/31/18 00:16, Chris Plummer wrote:
>>>> Sorry, I thought this had been pushed already, but it hasn't. But 
>>>> it still looks like JDK-8202896 should be closed as a dup, and it's 
>>>> unclear to me if JDK-8206076 has been fixed and this test can be 
>>>> removed from the problem list.
>>>>
>>>> Chris
>>>>
>>>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Now that this had been pushed, I assume JDK-8202896 should be 
>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this 
>>>>> change also?
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>>>
>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>>>
>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also 
>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>


From serguei.spitsyn at oracle.com  Tue Jul 31 18:07:35 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Jul 2018 11:07:35 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
 <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
 <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
Message-ID: <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com>

On 7/31/18 09:13, Chris Plummer wrote:
> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
>>> Hi Chris,
>>>
>>> Good catch.
>>> It is possible that this webrev does not fix the JDK-8202896.
>>> The JDK-8202896 is about timeouts which are normally intermittent 
>>> (is it right?).
>>>
>>> There are two options here:
>>>   A: close 8202896 as a dup of 8208074
>>>   B: keep the test problem listed and labeled with 8202896
>>>
>>> Let's wait for Coleen's answer.
>>
>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts 
>> with -Xcomp)
>> as a duplicate of
>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
>> InMemoryCompiler out of the threads)
>> because that's where the attempted fix was.
>>
>> I think
>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many 
>> open files intermittently)
>> should be closed as a duplicate too because it's the same root cause.
>>
>> And this one:
>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
>> fixes my fix and will remove the test from the ProblemList.txt.
>>
>> I believe it should be removed from the problem list because I don't
>> think it will time out or intermittently fail again for the same
>> reason. If it times out or fails for a different reason, we should
>> file a whole new bug, with that specific analysis.
>>
>> Thanks,
>> Coleen
>
> Hi Coleen,
>
> That all sounds reasonable. Thanks for cleaning up the bug situation.

+1

Thanks,
Serguei
>
> Chris
>>
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 7/31/18 00:16, Chris Plummer wrote:
>>>> Sorry, I thought this had been pushed already, but it hasn't. But 
>>>> it still looks like JDK-8202896 should be closed as a dup, and it's 
>>>> unclear to me if JDK-8206076 has been fixed and this test can be 
>>>> removed from the problem list.
>>>>
>>>> Chris
>>>>
>>>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>>>> Hi Coleen,
>>>>>
>>>>> Now that this had been pushed, I assume JDK-8202896 should be 
>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this 
>>>>> change also?
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>>>
>>>>>> open webrev at http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>>>
>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also 
>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes 
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>


From hohensee at amazon.com  Tue Jul 31 18:45:14 2018
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 31 Jul 2018 18:45:14 +0000
Subject: RFR(L): 8196989: Revamp G1 JMX MemoryPoolMXBean,
 GarbageCollectorMXBean, and jstat counter definitions
In-Reply-To: <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com>
References: <FCFCADFE-5CE0-42DE-8ED8-FBC57464207F@amazon.com>
 <e06a126c624e3b4aa836dfeed385882e85261a43.camel@oracle.com>
 <49ed1092212d75bc9f2df1250ebf9f1fdd115b32.camel@oracle.com>
 <22C41D6E-162C-4758-AE23-6856627250DC@amazon.com>
 <54CF80A5-A8C3-4C20-8D3D-045A8FA181AA@amazon.com>
Message-ID: <C524108B-A7A9-49F7-9C3E-D1C3B74512CA@amazon.com>

A few small things for http://cr.openjdk.java.net/~tschatzl/8208498/webrev/, otherwise looks good.

collectionSetChooser.cpp:
	Doesn't !r->is_old() include is_archive()?

g1CollectedHeap.hpp:
	Add archive_region_add(), archive_region_remove(), and old_set_bulk_remove().
	In non_young_capacity_bytes(), use old_regions_count(), humongous_regions_count(), and archive_regions_count().

g1CollectedHeap.cpp:
	Use old_set_add() and friends where possible.
	"// humongous regions set." -> "// humongous and archive region sets."

On 7/30/18, 4:27 PM, "Hohensee, Paul" <hohensee at amazon.com> wrote:

    A couple nits on http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/.
    
    g1CollectedHeap.cpp: in initialize_serviceability(), memory_managers(), and memory_pools(), use g1mm() instead of _g1mm.
    
    g1MonitoringSupport.cpp: there's an extra newline after ~G1MonitoringSupport().
    
    Otherwise looks good.
    
    Paul
    
    On 7/30/18, 12:18 PM, "Hohensee, Paul" <hohensee at amazon.com> wrote:
    
        At JVMLS, so can't look in depth this instant, but I'm fine with your approach, except I'd get the new JMX and jstat structure in place before fixing the data that gets reported. Imo it'll be easier to fit correct data into the new JMX/jstat setup than into the old one, and doing it the new way will give us a good idea of exactly what we should do for the legacy ones.
        
        Your archive region set webrev looks pretty much the same as what I wrote, but I got a trace trap when I tried to execute the resulting JVM. Not a clue why, so I abandoned it.
        
        I'd not have thought of making a G1MonitoringScope, looks good.
        
        Thanks,
        
        Paul
        
        On 7/30/18, 6:04 AM, "Thomas Schatzl" <thomas.schatzl at oracle.com> wrote:
        
            Hi Paul,
            
              did some prototyping and wanted to show you the results and get your
            input:
            
            On Thu, 2018-07-26 at 16:06 +0200, Thomas Schatzl wrote:
            > 
            [...]
            > Could we work together on first refactoring the code before adding
            > new
            > kinds of spaces to the MXBeans?
            > 
            > Looking at this change and mine roughly the following issues would
            > need to be resolved first:
            > - find a solution for archive regions as suggested above :) At the
            > moment, without doing the change, I would tend to make archive
            > regions separate from old regions.
            
            I went with that and I am currently testing
            https://bugs.openjdk.java.net/browse/JDK-8208498 ; here's a webrev to look at:
            http://cr.openjdk.java.net/~tschatzl/8208498/webrev/
            
            > - move serviceability stuff as much as possible to
            > g1MonitoringSupport
            
            Preliminary webrev:
            http://cr.openjdk.java.net/~tschatzl/move-serviceability-stuff/webrev/
            
            I think this came out better than expected: while we maybe want to add
            a ServiceabilitySupport interface that collects the
            get_memory_manager/pools/* methods in the future, imho this is a lot
            better than current code as it tightens the G1MonitoringSupport
            interface quite a bit.
            
            Particularly of note should be the G1MonitoringScope class that
            collects both TraceCollectorStats and TraceMemoryManagerStats into a
            single class. (Instead of the two bools passed to it something
            indicating the GC directly would probably be better too).
            
            It would be nice if something similar could be made for the concurrent
            Trace*Stats.
            
            > - clean up MemoryPool, remove duplicate information
            > - provide and return sane memory pool used/committed values to the
            > MXBeans
            > - clean up G1MonitoringSupport, e.g. avoid "*used/*committed"
            > variables
            > for every single memory pool. Use MemoryUsage structs for them. Make
            > reading of memory pool information atomic wrt to its readers (note
            > that I think it is currently just impossible to get consistent output
            > for other statistics like jstat) - that's JDK-8207200.
            > - add whatever serviceability stuff for the new pools/jstat/* in
            > steps.
            
            
            Thanks,
              Thomas
            
            
From coleen.phillimore at oracle.com  Tue Jul 31 20:07:25 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Jul 2018 16:07:25 -0400
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <bec6e59f-1df1-791e-f3e0-ed961c8715d1@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
 <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
 <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
 <bec6e59f-1df1-791e-f3e0-ed961c8715d1@oracle.com>
Message-ID: <c17f4fc6-33e2-f701-091c-0f393892c34f@oracle.com>


On 7/31/18 1:43 PM, Chris Plummer wrote:
> Hi Coleen,
>
> I just realized that there is also 
> https://bugs.openjdk.java.net/browse/JDK-8208234 filed for this test 
> last week. It results in an OOME. I think it's the same issue, but I
> just want to check with you. Please close it as a dup if you think it is
> the same.

Yes, I think this is the same thing. One call to InMemoryCompiler
shouldn't OOME but multiple concurrent calls could.
thanks,
Coleen
>
> thanks,
>
> Chris
>
> On 7/31/18 9:13 AM, Chris Plummer wrote:
>> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chris,
>>>>
>>>> Good catch.
>>>> It is possible that this webrev does not fix the JDK-8202896.
>>>> The JDK-8202896 is about timeouts which are normally intermittent 
>>>> (is it right?).
>>>>
>>>> There are two options here:
>>>>   A: close 8202896 as a dup of 8208074
>>>>   B: keep the test problem listed and labeled with 8202896
>>>>
>>>> Let's wait for Coleen's answer.
>>>
>>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts 
>>> with -Xcomp)
>>> as a duplicate of
>>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
>>> InMemoryCompiler out of the threads)
>>> because that's where the attempted fix was.
>>>
>>> I think
>>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many 
>>> open files intermittently)
>>> should be closed as a duplicate too because it's the same root cause.
>>>
>>> And this one:
>>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
>>> fixes my fix and will remove the test from the ProblemList.txt.
>>>
>>> I believe it should be removed from the problem list because I don't
>>> think it will time out or intermittently fail again for the same
>>> reason. If it times out or fails for a different reason, we should
>>> file a whole new bug, with that specific analysis.
>>>
>>> Thanks,
>>> Coleen
>>
>> Hi Coleen,
>>
>> That all sounds reasonable. Thanks for cleaning up the bug situation.
>>
>> Chris
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/31/18 00:16, Chris Plummer wrote:
>>>>> Sorry, I thought this had been pushed already, but it hasn't. But 
>>>>> it still looks like JDK-8202896 should be closed as a dup, and 
>>>>> it's unclear to me if JDK-8206076 has been fixed and this test can 
>>>>> be removed from the problem list.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> Now that this had been pushed, I assume JDK-8202896 should be 
>>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this 
>>>>>> change also?
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>>>>
>>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also 
>>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes 
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>


From coleen.phillimore at oracle.com  Tue Jul 31 20:09:20 2018
From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com)
Date: Tue, 31 Jul 2018 16:09:20 -0400
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
 <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
 <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
 <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com>
Message-ID: <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com>


On 7/31/18 2:07 PM, serguei.spitsyn at oracle.com wrote:
> On 7/31/18 09:13, Chris Plummer wrote:
>> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote:
>>>
>>>
>>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
>>>> Hi Chris,
>>>>
>>>> Good catch.
>>>> It is possible that this webrev does not fix the JDK-8202896.
>>>> The JDK-8202896 is about timeouts which are normally intermittent 
>>>> (is it right?).
>>>>
>>>> There are two options here:
>>>>   A: close 8202896 as a dup of 8208074
>>>>   B: keep the test problem listed and labeled with 8202896
>>>>
>>>> Let's wait for Coleen's answer.
>>>
>>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts 
>>> with -Xcomp)
>>> as a duplicate of
>>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
>>> InMemoryCompiler out of the threads)
>>> because that's where the attempted fix was.
>>>
>>> I think
>>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many 
>>> open files intermittently)
>>> should be closed as a duplicate too because it's the same root cause.
>>>
>>> And this one:
>>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
>>> fixes my fix and will remove the test from the ProblemList.txt.
>>>
>>> I believe it should be removed from the problem list because I don't 
>>> think it will time out or intermittently fail again for the same 
>>> reason.  If it times out or fails for a different reason, we should 
>>> file a whole new bug, with that specific analysis.
>>>
>>> Thanks,
>>> Coleen
>>
>> Hi Coleen,
>>
>> That all sounds reasonable. Thanks for cleaning up the bug situation.
>
> +1

Thanks Chris and Serguei for your discussion of this bug.  Hopefully 
this test becomes stable and useful now.

Coleen

>
> Thanks,
> Serguei
>>
>> Chris
>>>
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 7/31/18 00:16, Chris Plummer wrote:
>>>>> Sorry, I thought this had been pushed already, but it hasn't. But 
>>>>> it still looks like JDK-8202896 should be closed as a dup, and 
>>>>> it's unclear to me if JDK-8206076 has been fixed and this test can 
>>>>> be removed from the problem list.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>>>>> Hi Coleen,
>>>>>>
>>>>>> Now that this has been pushed, I assume JDK-8202896 should be 
>>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this 
>>>>>> change also?
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>>>>
>>>>>>> open webrev at 
>>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>>>>
>>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also 
>>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes 
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>


From serguei.spitsyn at oracle.com  Tue Jul 31 22:55:39 2018
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 31 Jul 2018 15:55:39 -0700
Subject: RFR (trivial) 8208074: [TESTBUG]
 vmTestbase/nsk/jvmti/RedefineClasses/StressRedefineWithoutBytecodeCorruption/TestDescription.java
 failed with NullPointerException
In-Reply-To: <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com>
References: <2083570b-46d1-bcf9-3dd6-e323d6821936@oracle.com>
 <f2623bd9-932a-865e-119f-c0bdaee63fec@oracle.com>
 <176d868d-ddcc-2970-2aa7-1e56812fff57@oracle.com>
 <0bfb1687-cf08-6740-40d0-ad1944f4edaf@oracle.com>
 <69a03f8b-8ad5-94ee-c801-7166cc46148b@oracle.com>
 <a9bb6f70-962a-3ac2-6ebd-edb48b999681@oracle.com>
 <60b8abbb-f9da-25b5-f488-b7c76896b62d@oracle.com>
 <9bdf6cc2-d0e6-dd55-88e6-f3bb45b322ec@oracle.com>
Message-ID: <306e1860-9066-34e3-036e-1ded191d0cd4@oracle.com>

On 7/31/18 13:09, coleen.phillimore at oracle.com wrote:
>
>
> On 7/31/18 2:07 PM, serguei.spitsyn at oracle.com wrote:
>> On 7/31/18 09:13, Chris Plummer wrote:
>>> On 7/31/18 5:06 AM, coleen.phillimore at oracle.com wrote:
>>>>
>>>>
>>>> On 7/31/18 3:29 AM, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Chris,
>>>>>
>>>>> Good catch.
>>>>> It is possible that this webrev does not fix the JDK-8202896.
>>>>> The JDK-8202896 is about timeouts which are normally intermittent 
>>>>> (is it right?).
>>>>>
>>>>> There are two options here:
>>>>>   A: close 8202896 as a dup of 8208074
>>>>>   B: keep the test problem listed and labeled with 8202896
>>>>>
>>>>> Let's wait for Coleen's answer.
>>>>
>>>> I closed https://bugs.openjdk.java.net/browse/JDK-8206076 (timeouts 
>>>> with -Xcomp)
>>>> as a duplicate of
>>>> https://bugs.openjdk.java.net/browse/JDK-8203820 (where I took 
>>>> InMemoryCompiler out of the threads)
>>>> because that's where the attempted fix was.
>>>>
>>>> I think
>>>> https://bugs.openjdk.java.net/browse/JDK-8202896 (getting Too many 
>>>> open files intermittently)
>>>> should be closed as a duplicate too because it's the same root cause.
>>>>
>>>> And this one:
>>>> https://bugs.openjdk.java.net/browse/JDK-8208074 (broken fix)
>>>> fixes my fix and will remove the test from the ProblemList.txt.
>>>>
>>>> I believe it should be removed from the problem list because I 
>>>> don't think it will time out or intermittently fail again for the 
>>>> same reason.  If it times out or fails for a different reason, we 
>>>> should file a whole new bug, with that specific analysis.
>>>>
>>>> Thanks,
>>>> Coleen
>>>
>>> Hi Coleen,
>>>
>>> That all sounds reasonable. Thanks for cleaning up the bug situation.
>>
>> +1
>
> Thanks Chris and Serguei for your discussion of this bug. Hopefully 
> this test becomes stable and useful now.

Thanks a lot for taking care of this issue, Coleen!

Thanks,
Serguei

> Coleen
>
>>
>> Thanks,
>> Serguei
>>>
>>> Chris
>>>>
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 7/31/18 00:16, Chris Plummer wrote:
>>>>>> Sorry, I thought this had been pushed already, but it hasn't. But 
>>>>>> it still looks like JDK-8202896 should be closed as a dup, and 
>>>>>> it's unclear to me if JDK-8206076 has been fixed and this test 
>>>>>> can be removed from the problem list.
>>>>>>
>>>>>> Chris
>>>>>>
>>>>>> On 7/30/18 6:34 PM, Chris Plummer wrote:
>>>>>>> Hi Coleen,
>>>>>>>
>>>>>>> Now that this has been pushed, I assume JDK-8202896 should be 
>>>>>>> closed as a dup. And what about JDK-8206076? Is it fixed by this 
>>>>>>> change also?
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 7/30/18 1:49 PM, coleen.phillimore at oracle.com wrote:
>>>>>>>> Summary: fixed refactoring caused by JDK-8203820
>>>>>>>>
>>>>>>>> open webrev at 
>>>>>>>> http://cr.openjdk.java.net/~coleenp/8208074.01/webrev
>>>>>>>> bug link https://bugs.openjdk.java.net/browse/JDK-8208074
>>>>>>>>
>>>>>>>> Ran the test in mach5 on all Oracle supported platforms. Also 
>>>>>>>> took the test out of ProblemList.txt because JDK-8203820 fixes 
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8202896.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Coleen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>