From jaroslav.bachorik at oracle.com Tue Jul 9 03:02:48 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 09 Jul 2013 12:02:48 +0200 Subject: jmx-dev RFR: 8010285 Enforce the requirement of Management Interfaces being public In-Reply-To: <51C0440B.9030601@oracle.com> References: <51A4BD98.1040704@oracle.com> <51A6171E.8040606@oracle.com> <51A6382F.3000204@oracle.com> <51A63E82.4050505@oracle.com> <51A70081.5050203@oracle.com> <51AD1955.2090109@oracle.com> <51AF060D.5070706@oracle.com> <51AF136C.8070806@oracle.com> <51AF3494.3070304@oracle.com> <51AF4368.1040403@oracle.com> <51AF4ABB.1080005@oracle.com> <51AF7B42.7020902@oracle.com> <51AF9A56.4090709@oracle.com> <51B0A937.90607@oracle.com> <51B0AAB7.7070802@oracle.com> <51B0B39C.1050305@oracle.com> <51B18803.7060406@oracle.com> <51B1A2E2.5030001@oracle.com> <51C03013.2020700@oracle.com> <51C0359F.8090201@oracle.com> <51C036AF.1030206@oracle.com> <51C0440B.9030601@oracle.com> Message-ID: <51DBDFC8.4090800@oracle.com> Please, review the final version of the changes: http://cr.openjdk.java.net/~jbachorik/8010285/webrev.07 It addresses all the concerns raised during the CCC process. I will need at least one official OpenJDK reviewer for the integration. 
Thanks, -JB- From mandy.chung at oracle.com Tue Jul 9 12:42:43 2013 From: mandy.chung at oracle.com (Mandy Chung) Date: Tue, 09 Jul 2013 12:42:43 -0700 Subject: jmx-dev RFR: 8010285 Enforce the requirement of Management Interfaces being public In-Reply-To: <51DBDFC8.4090800@oracle.com> References: <51A4BD98.1040704@oracle.com> <51A6171E.8040606@oracle.com> <51A6382F.3000204@oracle.com> <51A63E82.4050505@oracle.com> <51A70081.5050203@oracle.com> <51AD1955.2090109@oracle.com> <51AF060D.5070706@oracle.com> <51AF136C.8070806@oracle.com> <51AF3494.3070304@oracle.com> <51AF4368.1040403@oracle.com> <51AF4ABB.1080005@oracle.com> <51AF7B42.7020902@oracle.com> <51AF9A56.4090709@oracle.com> <51B0A937.90607@oracle.com> <51B0AAB7.7070802@oracle.com> <51B0B39C.1050305@oracle.com> <51B18803.7060406@oracle.com> <51B1A2E2.5030001@oracle.com> <51C03013.2020700@oracle.com> <51C0359F.8090201@oracle.com> <51C036AF.1030206@oracle.com> <51C0440B.9030601@oracle.com> <51DBDFC8.4090800@oracle.com> Message-ID: <51DC67B3.8060103@oracle.com> On 7/9/13 3:02 AM, Jaroslav Bachorik wrote: > Please, review the final version of the changes: > http://cr.openjdk.java.net/~jbachorik/8010285/webrev.07 > The change looks reasonable. In the class spec for MXBean, suggest to rename interface ThisIsNotMXBean{} to something more explicit interface NonPublicInterfaceNotMXBean{} You removed JMX.checkProxyInterface. I believe the checkPackageAccess method on the given mbean interface is called somewhere as part of the MBean validation - where is that check being done? Other than that, it's fine with me. Mandy > It addresses all the concerns raised during the CCC process. > > I will need at least one official OpenJDK reviewer for the integration. 
> > Thanks, > > -JB- From jaroslav.bachorik at oracle.com Wed Jul 10 01:33:17 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 10 Jul 2013 10:33:17 +0200 Subject: jmx-dev RFR: 8010285 Enforce the requirement of Management Interfaces being public In-Reply-To: <51DC67B3.8060103@oracle.com> References: <51A4BD98.1040704@oracle.com> <51A6171E.8040606@oracle.com> <51A6382F.3000204@oracle.com> <51A63E82.4050505@oracle.com> <51A70081.5050203@oracle.com> <51AD1955.2090109@oracle.com> <51AF060D.5070706@oracle.com> <51AF136C.8070806@oracle.com> <51AF3494.3070304@oracle.com> <51AF4368.1040403@oracle.com> <51AF4ABB.1080005@oracle.com> <51AF7B42.7020902@oracle.com> <51AF9A56.4090709@oracle.com> <51B0A937.90607@oracle.com> <51B0AAB7.7070802@oracle.com> <51B0B39C.1050305@oracle.com> <51B18803.7060406@oracle.com> <51B1A2E2.5030001@oracle.com> <51C03013.2020700@oracle.com> <51C0359F.8090201@oracle.com> <51C036AF.1030206@oracle.com> <51C0440B.9030601@oracle.com> <51DBDFC8.4090800@oracle.com> <51DC67B3.8060103@oracle.com> Message-ID: <51DD1C4D.2060601@oracle.com> On 07/09/2013 09:42 PM, Mandy Chung wrote: > On 7/9/13 3:02 AM, Jaroslav Bachorik wrote: >> Please, review the final version of the changes: >> http://cr.openjdk.java.net/~jbachorik/8010285/webrev.07 >> > > The change looks reasonable. In the class spec for MXBean, suggest to > rename > > interface ThisIsNotMXBean{} > > to something more explicit > > interface NonPublicInterfaceNotMXBean{} Since this was a part of the CCC review which was approved I am not sure if I am allowed to change the class spec. If it is allowed I have no objections against the proposal and will change the interface name. > > You removed JMX.checkProxyInterface. I believe the checkPackageAccess > method on the given mbean > interface is called somewhere as part of the MBean validation - where is > that check being done? com.sun.jmx.mbeanserver.MBeanIntrospector.getMethods() performs this check. 
It is not possible to construct an M(X)Bean proxy without consulting com.sun.jmx.mbeanserver.MBeanIntrospector.getMethods() first. This functionality is enforced by a closed vulnerability test. -JB- > > Other than that, it's fine with me. > > Mandy > >> It addresses all the concerns raised during the CCC process. >> >> I will need at least one official OpenJDK reviewer for the integration. >> >> Thanks, >> >> -JB- > From jaroslav.bachorik at oracle.com Wed Jul 10 02:10:52 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 10 Jul 2013 11:10:52 +0200 Subject: jmx-dev RFR: 8019826 Test com/sun/management/HotSpotDiagnosticMXBean/SetVMOption.java fails with NPE Message-ID: <51DD251C.5050009@oracle.com> Please, review this simple fix. http://cr.openjdk.java.net/~jbachorik/8019826/webrev.00 Firstly, the patch removes a conditional early exit which checks for a build 52 of an unspecified major JVM version - it is not needed any more. Basically, the condition just made the test a noop till the latest hotspot version. The second fix is correctly setting the "mbean" attribute - it was not properly initialized and because of this the test was going to fail with NPE. Thanks, -JB- From shanliang.jiang at oracle.com Wed Jul 10 09:48:34 2013 From: shanliang.jiang at oracle.com (shanliang) Date: Wed, 10 Jul 2013 18:48:34 +0200 Subject: jmx-dev RFR: 8019826 Test com/sun/management/HotSpotDiagnosticMXBean/SetVMOption.java fails with NPE In-Reply-To: <51DD251C.5050009@oracle.com> References: <51DD251C.5050009@oracle.com> Message-ID: <51DD9062.8020602@oracle.com> It looks fine to me. Shanliang Jaroslav Bachorik wrote: > Please, review this simple fix. > > http://cr.openjdk.java.net/~jbachorik/8019826/webrev.00 > > Firstly, the patch removes a conditional early exit which checks for a > build 52 of an unspecified major JVM version - it is not needed any > more. Basically, the condition just made the test a noop till the latest > hotspot version. 
> > The second fix is correctly setting the "mbean" attribute - it was not > properly initialized and because of this the test was going to fail with > NPE. > > Thanks, > > -JB- > From mandy.chung at oracle.com Wed Jul 10 17:52:28 2013 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 11 Jul 2013 08:52:28 +0800 Subject: jmx-dev RFR: 8010285 Enforce the requirement of Management Interfaces being public In-Reply-To: <51DD1C4D.2060601@oracle.com> References: <51A4BD98.1040704@oracle.com> <51A6171E.8040606@oracle.com> <51A6382F.3000204@oracle.com> <51A63E82.4050505@oracle.com> <51A70081.5050203@oracle.com> <51AD1955.2090109@oracle.com> <51AF060D.5070706@oracle.com> <51AF136C.8070806@oracle.com> <51AF3494.3070304@oracle.com> <51AF4368.1040403@oracle.com> <51AF4ABB.1080005@oracle.com> <51AF7B42.7020902@oracle.com> <51AF9A56.4090709@oracle.com> <51B0A937.90607@oracle.com> <51B0AAB7.7070802@oracle.com> <51B0B39C.1050305@oracle.com> <51B18803.7060406@oracle.com> <51B1A2E2.5030001@oracle.com> <51C03013.2020700@oracle.com> <51C0359F.8090201@oracle.com> <51C036AF.1030206@oracle.com> <51C0440B.9030601@oracle.com> <51DBDFC8.4090800@oracle.com> <51DC67B3.8060103@oracle.com> <51DD1C4D.2060601@oracle.com> Message-ID: <51DE01CC.7060402@oracle.com> On 7/10/2013 4:33 PM, Jaroslav Bachorik wrote: >> >The change looks reasonable. In the class spec for MXBean, suggest to >> >rename >> > >> > interface ThisIsNotMXBean{} >> > >> >to something more explicit >> > >> > interface NonPublicInterfaceNotMXBean{} > Since this was a part of the CCC review which was approved I am not sure > if I am allowed to change the class spec. If it is allowed I have no > objections against the proposal and will change the interface name. That is an example interface name and is non-normative (unless you see it differently). This can be revised. 
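For readers following the thread, the rule under review (only public interfaces qualify as MXBean interfaces) can be sketched roughly as below. This is an illustration, not code from the webrev: the interface names `CounterMXBean` and `HiddenMXBean` and the helper `isMXBean` are hypothetical, and the result for the non-public interface assumes a JDK that already contains this change, running with default settings.

```java
import javax.management.JMX;

public class MXBeanVisibilityCheck {
    // Hypothetical public management interface: public and named *MXBean.
    public interface CounterMXBean {
        int getCount();
    }

    // Hypothetical package-private interface: the name matches the MXBean
    // naming pattern, but the interface is not public.
    interface HiddenMXBean {
        int getCount();
    }

    // Thin wrapper around the standard JMX.isMXBeanInterface(Class) check.
    static boolean isMXBean(Class<?> candidate) {
        return JMX.isMXBeanInterface(candidate);
    }

    public static void main(String[] args) {
        System.out.println("CounterMXBean: " + isMXBean(CounterMXBean.class));
        System.out.println("HiddenMXBean:  " + isMXBean(HiddenMXBean.class));
    }
}
```

On a JDK with the enforcement in place, the second interface should be reported as not being an MXBean interface, which is the point the example name in the class spec is meant to convey.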
thanks Mandy From david.holmes at oracle.com Wed Jul 10 22:23:36 2013 From: david.holmes at oracle.com (David Holmes) Date: Thu, 11 Jul 2013 15:23:36 +1000 Subject: jmx-dev RFR: 8019826 Test com/sun/management/HotSpotDiagnosticMXBean/SetVMOption.java fails with NPE In-Reply-To: <51DD251C.5050009@oracle.com> References: <51DD251C.5050009@oracle.com> Message-ID: <51DE4158.7080503@oracle.com> On 10/07/2013 7:10 PM, Jaroslav Bachorik wrote: > Please, review this simple fix. > > http://cr.openjdk.java.net/~jbachorik/8019826/webrev.00 > > Firstly, the patch removes a conditional early exit which checks for a > build 52 of an unspecified major JVM version - it is not needed any > more. Basically, the condition just made the test a noop till the latest > hotspot version. > > The second fix is correctly setting the "mbean" attribute - it was not > properly initialized and because of this the test was going to fail with > NPE. Looks fine to me. David ----- > Thanks, > > -JB- > From mandy.chung at oracle.com Wed Jul 10 22:49:32 2013 From: mandy.chung at oracle.com (Mandy Chung) Date: Thu, 11 Jul 2013 13:49:32 +0800 Subject: jmx-dev RFR: 8019826 Test com/sun/management/HotSpotDiagnosticMXBean/SetVMOption.java fails with NPE In-Reply-To: <51DD251C.5050009@oracle.com> References: <51DD251C.5050009@oracle.com> Message-ID: <0B1D8082-0D31-4C40-80DB-2D9D2453F0B3@oracle.com> Looks good. Mandy On Jul 10, 2013, at 5:10 PM, Jaroslav Bachorik wrote: > Please, review this simple fix. > > http://cr.openjdk.java.net/~jbachorik/8019826/webrev.00 > > Firstly, the patch removes a conditional early exit which checks for a > build 52 of an unspecified major JVM version - it is not needed any > more. Basically, the condition just made the test a noop till the latest > hotspot version. > > The second fix is correctly setting the "mbean" attribute - it was not > properly initialized and because of this the test was going to fail with > NPE. 
> > Thanks, > > -JB- From jaroslav.bachorik at oracle.com Thu Jul 11 04:48:02 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 11 Jul 2013 13:48:02 +0200 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null Message-ID: <51DE9B72.5030308@oracle.com> Please, review the change. http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ The combination of the fix for JDK-8014085 and ObjectInputStream.readFields() not throwing CNFE when trying to deserialize an object graph containing references to non-available classes makes an InvalidObjectException being thrown instead of the CNFE when processing JMX notifications. The patch makes the ClientNotificationForwarder ready for InvalidObjectException - it will correctly report lost notifications but will not cause the notification processing loop to fail with unhandled exception. Thanks, -JB- From jaroslav.bachorik at oracle.com Mon Jul 15 01:41:10 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 15 Jul 2013 10:41:10 +0200 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null Message-ID: <51E3B5A6.4060301@oracle.com> Please, review the patch for https://jbs.oracle.com/bugs/browse/JDK-8019584 http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ The reason for the failure is that the ObjectInputStream.readFields() method does not throw CNFE as specified when encountering instances of unknown in the object graph to be deserialized. Instead, it leaves the fields in the default state which in this case is "null" and is not valid. Hence, the deserialization validation fails. 
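A minimal sketch of the kind of validation described above, assuming a class that reads its state through readFields() and rejects a null field. The class `ValidatingNote`, its `message` field, and the helper `survivesRoundTrip` are hypothetical and are not taken from the JMX sources; the null field is simulated here by serializing a null value rather than by a missing class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ValidatingNote implements Serializable {
    private static final long serialVersionUID = 1L;

    // Hypothetical field standing in for the notification payload.
    private String message;

    public ValidatingNote(String message) {
        this.message = message;
    }

    // Read the fields via readFields() and validate them before accepting the object.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        ObjectInputStream.GetField fields = in.readFields();
        String value = (String) fields.get("message", null);
        if (value == null) {
            // A field left in its default state is treated as invalid.
            throw new InvalidObjectException("Invalid notification: " + value);
        }
        this.message = value;
    }

    // Serializes and deserializes a note; returns false when validation rejects it.
    static boolean survivesRoundTrip(String message) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new ValidatingNote(message));
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return true;
            }
        } catch (InvalidObjectException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this shape, a caller sees an InvalidObjectException where it might have expected a ClassNotFoundException, which is the situation the notification-processing code has to cope with.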
Since the main cause is in the RMI code, has been there for very long time and changing the behaviour there might have disrupting effects on various 3rd party applications I decided to work around this problem in the JMX code. The workaround adds InvalidObjectException to the list of expected exceptions when processing JMX notifications. It is treated the same way as eg. CNFE - the exception is logged and the notification will be reported as missing. This will resolve the problem on the JMX side. Thanks, -JB- From daniel.fuchs at oracle.com Mon Jul 15 05:56:35 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Mon, 15 Jul 2013 14:56:35 +0200 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null In-Reply-To: <51DE9B72.5030308@oracle.com> References: <51DE9B72.5030308@oracle.com> Message-ID: <51E3F183.3080106@oracle.com> Hi Jaroslav, This looks reasonable. I assume you have run the JCK to verify that it doesn't break anything else? best regards, -- daniel On 7/11/13 1:48 PM, Jaroslav Bachorik wrote: > Please, review the change. > > http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ > > The combination of the fix for JDK-8014085 and > ObjectInputStream.readFields() not throwing CNFE when trying to > deserialize an object graph containing references to non-available > classes makes an InvalidObjectException being thrown instead of the CNFE > when processing JMX notifications. > > The patch makes the ClientNotificationForwarder ready for > InvalidObjectException - it will correctly report lost notifications but > will not cause the notification processing loop to fail with unhandled > exception. 
> > Thanks, > > -JB- > From david.holmes at oracle.com Mon Jul 15 18:01:09 2013 From: david.holmes at oracle.com (David Holmes) Date: Tue, 16 Jul 2013 11:01:09 +1000 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null In-Reply-To: <51E3B5A6.4060301@oracle.com> References: <51E3B5A6.4060301@oracle.com> Message-ID: <51E49B55.2060603@oracle.com> On 15/07/2013 6:41 PM, Jaroslav Bachorik wrote: > Please, review the patch for https://jbs.oracle.com/bugs/browse/JDK-8019584 > > http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ > > The reason for the failure is that the ObjectInputStream.readFields() > method does not throw CNFE as specified when encountering instances of > unknown in the object graph to be deserialized. Instead, it leaves the > fields in the default state which in this case is "null" and is not > valid. Hence, the deserialization validation fails. > > Since the main cause is in the RMI code, has been there for very long > time and changing the behaviour there might have disrupting effects on > various 3rd party applications I decided to work around this problem in > the JMX code. Can you pinpoint the code that actually fails to propagate the ClassNotFoundException - I don't see any issue in OIS.readFields itself so this comes from elsewhere. Failing to throw CNFE when deserializing seems like a major bug to me. Thanks, David > The workaround adds InvalidObjectException to the list of expected > exceptions when processing JMX notifications. It is treated the same way > as eg. CNFE - the exception is logged and the notification will be > reported as missing. This will resolve the problem on the JMX side. 
> > Thanks, > > -JB- > From shanliang.jiang at oracle.com Tue Jul 16 00:31:52 2013 From: shanliang.jiang at oracle.com (shanliang) Date: Tue, 16 Jul 2013 09:31:52 +0200 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null In-Reply-To: <51DE9B72.5030308@oracle.com> References: <51DE9B72.5030308@oracle.com> Message-ID: <51E4F6E8.7030200@oracle.com> Jaroslav, I am not sure that it is a good idea to add simply InvalidObjectException into the catching list. I remember that we carefully analyzed and tested the catching list, in order to avoid no-needed call of "fetchOneNotif", and to avoid fetching on a dead connection. Look at javax.management.remote.rmi.RMIConnector$RMINotifClient.fetchNotifs, we carefully retrieve an original exception wrapped in a UnmarshalException, with different protocol to allow ClientNotifForwarder to do right catching. What I am afraid is that InvalidObjectException would be thrown with other situations, like the connection was cut in the middle way of fetching, then the fix would make ClientNotifForwarder fail to stop fetching. Shanliang Jaroslav Bachorik wrote: > Please, review the change. > > http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ > > The combination of the fix for JDK-8014085 and > ObjectInputStream.readFields() not throwing CNFE when trying to > deserialize an object graph containing references to non-available > classes makes an InvalidObjectException being thrown instead of the CNFE > when processing JMX notifications. > > The patch makes the ClientNotificationForwarder ready for > InvalidObjectException - it will correctly report lost notifications but > will not cause the notification processing loop to fail with unhandled > exception. 
> > Thanks, > > -JB- > From jaroslav.bachorik at oracle.com Tue Jul 16 00:47:05 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 16 Jul 2013 09:47:05 +0200 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null In-Reply-To: <51E4F6E8.7030200@oracle.com> References: <51DE9B72.5030308@oracle.com> <51E4F6E8.7030200@oracle.com> Message-ID: <51E4FA79.3000307@oracle.com> According to the documentation InvalidObjectException "Indicates that one or more deserialized objects failed validation tests." meaning that the severed connection should generate a different type of exception. InvalidObjectException should be reserved for the cases when the deserialized data violate the validation rules. But I can't say I am certain; when a CNFE can disappear in the process, anything might be possible ... Anyway, if I can't catch the InvalidObjectException it will leave me with two options: 1. Just forget about the validation 2. Do the validation but live with the fact that even when the validation fails a potential attacker can get access to an instance with invalid fields (eg. using the finalizer trick) -JB- On Tue 16 Jul 2013 09:31:52 AM CEST, shanliang wrote: > Jaroslav, > > I am not sure that it is a good idea to add simply > InvalidObjectException into the catching list. I remember that we > carefully analyzed and tested the catching list, in order to avoid > no-needed call of "fetchOneNotif", and to avoid fetching on a dead > connection. Look at > javax.management.remote.rmi.RMIConnector$RMINotifClient.fetchNotifs, > we carefully retrieve an original exception wrapped in a > UnmarshalException, with different protocol to allow > ClientNotifForwarder to do right catching. 
> > What I am afraid is that InvalidObjectException would be thrown with > other situations, like the connection was cut in the middle way of > fetching, then the fix would make ClientNotifForwarder fail to stop > fetching. > > Shanliang > > Jaroslav Bachorik wrote: >> Please, review the change. >> >> http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ >> >> The combination of the fix for JDK-8014085 and >> ObjectInputStream.readFields() not throwing CNFE when trying to >> deserialize an object graph containing references to non-available >> classes makes an InvalidObjectException being thrown instead of the CNFE >> when processing JMX notifications. >> >> The patch makes the ClientNotificationForwarder ready for >> InvalidObjectException - it will correctly report lost notifications but >> will not cause the notification processing loop to fail with unhandled >> exception. >> >> Thanks, >> >> -JB- >> > From jaroslav.bachorik at oracle.com Tue Jul 16 00:49:21 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 16 Jul 2013 09:49:21 +0200 Subject: jmx-dev RFR: 8019584 javax/management/remote/mandatory/loading/MissingClassTest.java failed in nightly against jdk7u45: java.io.InvalidObjectException: Invalid notification: null In-Reply-To: <51E49B55.2060603@oracle.com> References: <51E3B5A6.4060301@oracle.com> <51E49B55.2060603@oracle.com> Message-ID: <51E4FB01.60403@oracle.com> On Tue 16 Jul 2013 03:01:09 AM CEST, David Holmes wrote: > On 15/07/2013 6:41 PM, Jaroslav Bachorik wrote: >> Please, review the patch for >> https://jbs.oracle.com/bugs/browse/JDK-8019584 >> >> http://cr.openjdk.java.net/~jbachorik/8019584/webrev.00/ >> >> The reason for the failure is that the ObjectInputStream.readFields() >> method does not throw CNFE as specified when encountering instances of >> unknown in the object graph to be deserialized. Instead, it leaves the >> fields in the default state which in this case is "null" and is not >> valid. 
Hence, the deserialization validation fails.
>>
>> Since the main cause is in the RMI code, has been there for very long
>> time and changing the behaviour there might have disrupting effects on
>> various 3rd party applications I decided to work around this problem in
>> the JMX code.
>
> Can you pinpoint the code that actually fails to propagate the
> ClassNotFoundException - I don't see any issue in OIS.readFields
> itself so this comes from elsewhere. Failing to throw CNFE when
> deserializing seems like a major bug to me.

Yes, I agree.

When you take a look at ObjectInputStream.defaultReadObject() you can see
that it rethrows any captured exception on lines 509-512

---
ClassNotFoundException ex = handles.lookupException(passHandle);
if (ex != null) {
    throw ex;
}
---

On the other hand, GetFieldImpl just nullifies the read field on lines
2137-2138

---
return (handles.lookupException(objHandle) == null) ?
    objVals[off] : null;
---

and ObjectInputStream.readFields() completely disregards the "handles"
map and basically swallows any exception discovered during the
deserialization of the fields, AFAIK.

-JB-

>
> Thanks,
> David
>
>
>> The workaround adds InvalidObjectException to the list of expected
>> exceptions when processing JMX notifications. It is treated the same way
>> as eg. CNFE - the exception is logged and the notification will be
>> reported as missing. This will resolve the problem on the JMX side.
>> >> Thanks, >> >> -JB- >> From jaroslav.bachorik at oracle.com Thu Jul 18 02:54:52 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 18 Jul 2013 11:54:52 +0200 Subject: jmx-dev RFR: 8002307 javax.management.modelmbean.ModelMBeanInfoSupport may expose internal representation by storing an externally mutable object In-Reply-To: <51A4DC90.7050809@oracle.com> References: <51A4AB45.4070100@oracle.com> <51A4DC90.7050809@oracle.com> Message-ID: <51E7BB6C.6030704@oracle.com> Hi, thanks for the comments. Here (http://cr.openjdk.java.net/~jbachorik/8002307/webrev.03/) is the updated webrev implementing suggestions from Daniel and Shanliang. -JB- From daniel.fuchs at oracle.com Thu Jul 18 05:11:37 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Thu, 18 Jul 2013 14:11:37 +0200 Subject: jmx-dev RFR: 8002307 javax.management.modelmbean.ModelMBeanInfoSupport may expose internal representation by storing an externally mutable object In-Reply-To: <51E7BB6C.6030704@oracle.com> References: <51A4AB45.4070100@oracle.com> <51A4DC90.7050809@oracle.com> <51E7BB6C.6030704@oracle.com> Message-ID: <51E7DB79.2010504@oracle.com> Hi Jaroslav, Looks good overall. Small nit: You should remove the comment lines 322-327 in ModelMBeanInfoSupport.java since your changes make it obsolete. Also the copyright year in ImmutableDataTest should be 2013 (not 2005). No need for another round of review. -- daniel On 7/18/13 11:54 AM, Jaroslav Bachorik wrote: > Hi, > > thanks for the comments. > > Here (http://cr.openjdk.java.net/~jbachorik/8002307/webrev.03/) is the > updated webrev implementing suggestions from Daniel and Shanliang. 
> > -JB- > > From jaroslav.bachorik at oracle.com Mon Jul 22 04:55:42 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 22 Jul 2013 13:55:42 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently Message-ID: <51ED1DBE.3030304@oracle.com> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test seems to be failing intermittently. The test checks the functionality of the j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by capturing the current value of "getPeakThreadCount()", starting a predefined number of the user threads, stopping them and resetting the stored peak value and making sure the new peak equals to the number of the actually running threads. The main problem is that it is not possible to prevent JVM to start/stop arbitrary system threads while executing the test. This might lead to small variations of the reported peak (a short-lived system thread is started while the batch of the user threads is running) or the expected number of running threads (again, a short-lived system thread is started at the moment the test asks for the number of running threads). The patch does not fix those shortcomings as it is not really possible to do given the nature of the JVM threading system. It rather tries to relax the conditions while still maintaining the ability to detect functional problems - eg. decreasing peak without explicitly resetting it and reporting false number of threads. 
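To make the idea of relaxed conditions concrete, here is a rough sketch of checks that stay valid even if the JVM starts or stops threads of its own during the test. It is not the code from the webrev; the class `RelaxedPeakCheck` and the method `check` are hypothetical, and the two conditions were chosen only because they should hold regardless of unrelated thread activity.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

public class RelaxedPeakCheck {
    private static final ThreadMXBean MBEAN = ManagementFactory.getThreadMXBean();

    // Starts 'count' user threads, then verifies two tolerant invariants.
    static boolean check(int count) {
        CountDownLatch started = new CountDownLatch(count);
        CountDownLatch release = new CountDownLatch(1);
        Thread[] threads = new Thread[count];
        try {
            for (int i = 0; i < count; i++) {
                threads[i] = new Thread(() -> {
                    started.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
                threads[i].start();
            }
            started.await(); // all user threads are alive at this point
            int peakWhileRunning = MBEAN.getPeakThreadCount();

            release.countDown();
            for (Thread t : threads) {
                t.join();
            }

            MBEAN.resetPeakThreadCount();
            int live = MBEAN.getThreadCount();               // read the live count first,
            int peakAfterReset = MBEAN.getPeakThreadCount(); // then the peak

            // Invariant 1: the peak covers at least the user threads we started.
            // Invariant 2: after a reset the peak never drops below a live count
            // observed earlier, whatever other threads come and go.
            return peakWhileRunning >= count && peakAfterReset >= live;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("relaxed checks hold: " + check(4));
    }
}
```

Exact equality between the peak and an expected thread count is deliberately avoided here, since a short-lived thread outside the test's control would break it.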
The webrev is at: http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 Thanks, -JB- From david.holmes at oracle.com Tue Jul 23 01:19:39 2013 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Jul 2013 18:19:39 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51ED1DBE.3030304@oracle.com> References: <51ED1DBE.3030304@oracle.com> Message-ID: <51EE3C9B.3050604@oracle.com> Hi Jaroslav, On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote: > The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test > seems to be failing intermittently. > > The test checks the functionality of the > j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by > capturing the current value of "getPeakThreadCount()", starting a > predefined number of the user threads, stopping them and resetting the > stored peak value and making sure the new peak equals to the number of > the actually running threads. > > The main problem is that it is not possible to prevent JVM to start/stop > arbitrary system threads while executing the test. This might lead to > small variations of the reported peak (a short-lived system thread is > started while the batch of the user threads is running) or the expected > number of running threads (again, a short-lived system thread is started > at the moment the test asks for the number of running threads). Do you know what "system threads" these are? I would not expect VM internal threads to be counted in getPeakThreadCount(), but even if they are I can't think of any short-lived threads that get created other than the Signal handling thread. > The patch does not fix those shortcomings as it is not really possible > to do given the nature of the JVM threading system. It rather tries to > relax the conditions while still maintaining the ability to detect > functional problems - eg. 
decreasing peak without explicitly resetting > it and reporting false number of threads. > > The webrev is at: > http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 Seems reasonable. David ----- > Thanks, > > -JB- > From daniel.fuchs at oracle.com Tue Jul 23 01:25:28 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Tue, 23 Jul 2013 10:25:28 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51ED1DBE.3030304@oracle.com> References: <51ED1DBE.3030304@oracle.com> Message-ID: <51EE3DF8.8060903@oracle.com> Hi Jaroslav, This looks like a tough problem as it is altogether possible that some of the VM daemon threads will terminate during the duration of the call - and if that's the case, the condition: new peak >= old peak + delta might not even be true. I am not a VM specialist so I don't know whether there can be such daemon threads that will be arbitrarily started and stopped by the VM - but if that happens I don't see how you could work around it. There seems to be something strange in the test though: line 209, you catch InterruptedException just to call Thread.currentThread().interrupt() and interrupt the thread again?? Did you mean maybe to call Thread.currentThread().interrupted() instead? There are other places that seems to be prone to failures in this test too for instance: startThreads(...) { while(mbean.getThreadCount() < (current + count)) { ... } } If the VM can start and stop arbitrary threads then this condition seems dubious. There's the same kind of logic in terminateThreads. Not sure you can/should do anything about it though - it's just to point out that these steps might need to be revisited if the test still fails sporadically... Also I'm not sure that using volatile for the 'live' array will work - the array itself is volatile - but does it extends to its elements? 
It might be better to declare the live array as static final and
use a synchronization block on the array itself when accessing it:

private static final boolean live[] = new boolean[ALL_THREADS];
private static boolean isAlive(int i) {
    synchronized(live) { return live[i]; }
}

...

synchronized(live) {
    live[i] = false;
}

...

while (isAlive(id)) {
    ...
}

...

best regards,

-- daniel

On 7/22/13 1:55 PM, Jaroslav Bachorik wrote:
> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test
> seems to be failing intermittently.
>
> The test checks the functionality of the
> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by
> capturing the current value of "getPeakThreadCount()", starting a
> predefined number of the user threads, stopping them and resetting the
> stored peak value and making sure the new peak equals to the number of
> the actually running threads.
>
> The main problem is that it is not possible to prevent JVM to start/stop
> arbitrary system threads while executing the test. This might lead to
> small variations of the reported peak (a short-lived system thread is
> started while the batch of the user threads is running) or the expected
> number of running threads (again, a short-lived system thread is started
> at the moment the test asks for the number of running threads).
>
> The patch does not fix those shortcomings as it is not really possible
> to do given the nature of the JVM threading system. It rather tries to
> relax the conditions while still maintaining the ability to detect
> functional problems - eg. decreasing peak without explicitly resetting
> it and reporting false number of threads.
>
> The webrev is at:
> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00
>
> Thanks,
>
> -JB-
>

From jaroslav.bachorik at oracle.com Tue Jul 23 01:29:22 2013
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 23 Jul 2013 10:29:22 +0200
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EE3C9B.3050604@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
Message-ID: <51EE3EE2.1000202@oracle.com>

On 07/23/2013 10:19 AM, David Holmes wrote:
> Hi Jaroslav,
>
> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote:
>> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test
>> seems to be failing intermittently.
>>
>> The test checks the functionality of the
>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by
>> capturing the current value of "getPeakThreadCount()", starting a
>> predefined number of the user threads, stopping them and resetting the
>> stored peak value and making sure the new peak equals to the number of
>> the actually running threads.
>>
>> The main problem is that it is not possible to prevent JVM to start/stop
>> arbitrary system threads while executing the test. This might lead to
>> small variations of the reported peak (a short-lived system thread is
>> started while the batch of the user threads is running) or the expected
>> number of running threads (again, a short-lived system thread is started
>> at the moment the test asks for the number of running threads).
>
> Do you know what "system threads" these are? I would not expect VM
> internal threads to be counted in getPeakThreadCount(), but even if they
> are I can't think of any short-lived threads that get created other than
> the Signal handling thread.

Unfortunately I don't. Capturing the thread dump at the moment of
discovering the discrepancy seems to be too late.
I tried monitoring the JVM under the test from external tools but it just brings more entropy to the result. I am completely relying on the JVM native thread accounting to be correct and accurate - that it reports the thread count peak based on the real data. -JB- > >> The patch does not fix those shortcomings as it is not really possible >> to do given the nature of the JVM threading system. It rather tries to >> relax the conditions while still maintaining the ability to detect >> functional problems - eg. decreasing peak without explicitly resetting >> it and reporting false number of threads. >> >> The webrev is at: >> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 > > Seems reasonable. > > David > ----- > >> Thanks, >> >> -JB- >> From david.holmes at oracle.com Tue Jul 23 02:15:07 2013 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Jul 2013 19:15:07 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE3DF8.8060903@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3DF8.8060903@oracle.com> Message-ID: <51EE499B.9060600@oracle.com> On 23/07/2013 6:25 PM, Daniel Fuchs wrote: > Hi Jaroslav, > > This looks like a tough problem as it is altogether possible that > some of the VM daemon threads will terminate during the duration > of the call - and if that's the case, the condition: > new peak >= old peak + delta > might not even be true. > I am not a VM specialist so I don't know whether there can be > such daemon threads that will be arbitrarily started and stopped > by the VM - but if that happens I don't see how you could work around > it. > > There seems to be something strange in the test though: line 209, > you catch InterruptedException just to call > Thread.currentThread().interrupt() and interrupt the thread again?? > Did you mean maybe to call Thread.currentThread().interrupted() instead? 
No but good catch as the way this is done is not quite right. The re-posting of the interrupt() needs to happen outside the loop otherwise the sleep() will simply rethrow the InterruptedException. The normal pattern is: boolean interrupted = false; while (...) { try { Thread.sleep(5); ... } catch (InterruptedException ie) { interrupted = true; } } if (interrupted) Thread.currentThread().interrupt(); // re-assert interrupt state Of course it is debatable whether there is any point continuing the loop if you do get interrupted (which should never happen anyway). > There are other places that seems to be prone to failures in this test > too for instance: > > startThreads(...) { > > while(mbean.getThreadCount() < (current + count)) { > ... > } > > } > > If the VM can start and stop arbitrary threads then this condition > seems dubious. There's the same kind of logic in terminateThreads. > Not sure you can/should do anything about it though - it's > just to point out that these steps might need to be revisited > if the test still fails sporadically... > > Also I'm not sure that using volatile for the 'live' array will > work - the array itself is volatile - but does it extends to its > elements? No it doesn't. David ----- > It might be better to declare the live array as static final and > use a synchronization block on the array itself when accessing it: > > private static final boolean live[] = new boolean[ALL_THREADS]; > private static boolean isAlive(int i) { > synchronized(live) { return live[i] }; > } > > ... > > synchronized(live) { > live[i] == false; > } > > ... > > while (isAlive[id]) { > ... > } > > ... > > best regards, > > -- daniel > > On 7/22/13 1:55 PM, Jaroslav Bachorik wrote: >> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test >> seems to be failing intermittently. >> >> The test checks the functionality of the >> j.l.m.ThreadMXBean.resetPeakThreadCount() method. 
It does so by >> capturing the current value of "getPeakThreadCount()", starting a >> predefined number of the user threads, stopping them and resetting the >> stored peak value and making sure the new peak equals to the number of >> the actually running threads. >> >> The main problem is that it is not possible to prevent JVM to start/stop >> arbitrary system threads while executing the test. This might lead to >> small variations of the reported peak (a short-lived system thread is >> started while the batch of the user threads is running) or the expected >> number of running threads (again, a short-lived system thread is started >> at the moment the test asks for the number of running threads). >> >> The patch does not fix those shortcomings as it is not really possible >> to do given the nature of the JVM threading system. It rather tries to >> relax the conditions while still maintaining the ability to detect >> functional problems - eg. decreasing peak without explicitly resetting >> it and reporting false number of threads. >> >> The webrev is at: >> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >> >> Thanks, >> >> -JB- >> > From david.holmes at oracle.com Tue Jul 23 02:19:13 2013 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Jul 2013 19:19:13 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE3EE2.1000202@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> Message-ID: <51EE4A91.3000305@oracle.com> On 23/07/2013 6:29 PM, Jaroslav Bachorik wrote: > On 07/23/2013 10:19 AM, David Holmes wrote: >> Hi Jaroslav, >> >> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote: >>> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test >>> seems to be failing intermittently. >>> >>> The test checks the functionality of the >>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. 
It does so by >>> capturing the current value of "getPeakThreadCount()", starting a >>> predefined number of the user threads, stopping them and resetting the >>> stored peak value and making sure the new peak equals to the number of >>> the actually running threads. >>> >>> The main problem is that it is not possible to prevent JVM to start/stop >>> arbitrary system threads while executing the test. This might lead to >>> small variations of the reported peak (a short-lived system thread is >>> started while the batch of the user threads is running) or the expected >>> number of running threads (again, a short-lived system thread is started >>> at the moment the test asks for the number of running threads). >> >> Do you know what "system threads" these are? I would not expect VM >> internal threads to be counted in getPeakThreadCount(), but even if they >> are I can't think of any short-lived threads that get created other than >> the Signal handling thread. > > Unfortunatelly I don't. Capturing the thread dump at the moment of > discovering the discrepancy seems to to be too late. I tried monitoring > the JVM under the test from external tools but it just brings more > entropy to the result. We'd need to instrument the thread creation logic to keep a separate record. Dtrace probes could probably do it - but the problem is getting the test to fail. > I am completely relying on the JVM native thread accounting to be > correct and accurate - that it reports the thread count peak based on > the real data. The spec isn't clear but I would only expect these counters to apply to Java threads not VM internal threads (compiler, gc etc). So I'd really like to know what thread is messing up this count. David > -JB- > >> >>> The patch does not fix those shortcomings as it is not really possible >>> to do given the nature of the JVM threading system. It rather tries to >>> relax the conditions while still maintaining the ability to detect >>> functional problems - eg. 
decreasing peak without explicitly resetting >>> it and reporting false number of threads. >>> >>> The webrev is at: >>> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >> >> Seems reasonable. >> >> David >> ----- >> >>> Thanks, >>> >>> -JB- >>> > From jaroslav.bachorik at oracle.com Tue Jul 23 02:25:59 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 23 Jul 2013 11:25:59 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE4A91.3000305@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> Message-ID: <51EE4C27.206@oracle.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Tue 23 Jul 2013 11:19:13 AM CEST, David Holmes wrote: > On 23/07/2013 6:29 PM, Jaroslav Bachorik wrote: >> On 07/23/2013 10:19 AM, David Holmes wrote: >>> Hi Jaroslav, >>> >>> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote: >>>> The >>>> java/lang/management/ThreadMXBean/ResetPeakThreadCount.java >>>> test seems to be failing intermittently. >>>> >>>> The test checks the functionality of the >>>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so >>>> by capturing the current value of "getPeakThreadCount()", >>>> starting a predefined number of the user threads, stopping >>>> them and resetting the stored peak value and making sure the >>>> new peak equals to the number of the actually running >>>> threads. >>>> >>>> The main problem is that it is not possible to prevent JVM >>>> to start/stop arbitrary system threads while executing the >>>> test. This might lead to small variations of the reported >>>> peak (a short-lived system thread is started while the batch >>>> of the user threads is running) or the expected number of >>>> running threads (again, a short-lived system thread is >>>> started at the moment the test asks for the number of running >>>> threads). 
>>> >>> Do you know what "system threads" these are? I would not expect >>> VM internal threads to be counted in getPeakThreadCount(), but >>> even if they are I can't think of any short-lived threads that >>> get created other than the Signal handling thread. >> >> Unfortunatelly I don't. Capturing the thread dump at the moment >> of discovering the discrepancy seems to to be too late. I tried >> monitoring the JVM under the test from external tools but it just >> brings more entropy to the result. > > We'd need to instrument the thread creation logic to keep a > separate record. Dtrace probes could probably do it - but the > problem is getting the test to fail. Well, while responding to the previous email I thought about yet another way to try to pinpoint the mysterious thread - I tried the NB profiler. It filters out its own threads and can do thread monitoring at the same time as tracking the call tree. The result is that the offender is the j.u.l.LogManager$Cleaner thread. > >> I am completely relying on the JVM native thread accounting to >> be correct and accurate - that it reports the thread count peak >> based on the real data. > > The spec isn't clear but I would only expect these counters to > apply to Java threads not VM internal threads (compiler, gc etc). > So I'd really like to know what thread is messing up this count. I hope my previous finding makes this clearer. - -JB- > > David > >> -JB- >> >>> >>>> The patch does not fix those shortcomings as it is not really >>>> possible to do given the nature of the JVM threading system. >>>> It rather tries to relax the conditions while still >>>> maintaining the ability to detect functional problems - eg. >>>> decreasing peak without explicitly resetting it and reporting >>>> false number of threads. >>>> >>>> The webrev is at: >>>> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >>> >>> Seems reasonable.
>>> >>> David ----- >>> >>>> Thanks, >>>> >>>> -JB- >>>> >> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJR7kwnAAoJELSZyqhGiB1MT1EH+wVuy+XhmDWRygxGnJCaSGwb B0RoeOovuhQa2y2AKKF8P1PRULNxxDQ5i+DG21Zd/xJA2WVBsm0h8Kkj0s3PJIOq 8EHZMY7Onw/kDrmoJMNlJrFf/wlSOXC6E/4lZeiSCqyzobZQRBzfLUOMPDXjYTEt 76+RYUDw5DON05ph5BbknIAr/JBy0iUoT7K39q8/b5Z6ZId8Z2pIguLUhDs49YOD xZSwHgZkJsJCQCDW3Fnth8qGOkQC4StnwE0X5vTCLCIurjIrAYiIciVBJVpjTOEZ zqo8JL7m5dFVl2NfK1on1XCV71phybgxB2qCpWGh4Z9mv+o9XNe4kY3cC1waIVs= =mSja -----END PGP SIGNATURE----- From jaroslav.bachorik at oracle.com Tue Jul 23 02:35:16 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 23 Jul 2013 11:35:16 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE3DF8.8060903@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3DF8.8060903@oracle.com> Message-ID: <51EE4E54.5040005@oracle.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 07/23/2013 10:25 AM, Daniel Fuchs wrote: > Hi Jaroslav, > > This looks like a tough problem as it is altogether possible that > some of the VM daemon threads will terminate during the duration of > the call - and if that's the case, the condition: new peak >= old > peak + delta might not even be true. I am not a VM specialist so I > don't know whether there can be such daemon threads that will be > arbitrarily started and stopped by the VM - but if that happens I > don't see how you could work around it. As I wrote in my reply to David the offending thread is j.u.l.LogManager$Cleaner which kicks in randomly. This would confirm my observations that the discrepancy is always at most one thread more than expected. > > There seems to be something strange in the test though: line 209, > you catch InterruptedException just to call > Thread.currentThread().interrupt() and interrupt the thread > again?? 
Did you mean maybe to call > Thread.currentThread().interrupted() instead? No, it checks whether the thread has been interrupted and clears the interrupted flag. > > There are other places that seems to be prone to failures in this > test too for instance: > > startThreads(...) { > > while(mbean.getThreadCount() < (current + count)) { ... } > > } > > If the VM can start and stop arbitrary threads then this condition > seems dubious. There's the same kind of logic in terminateThreads. > Not sure you can/should do anything about it though - it's just to > point out that these steps might need to be revisited if the test > still fails sporadically... > > Also I'm not sure that using volatile for the 'live' array will > work - the array itself is volatile - but does it extends to its > elements? No, it does not. But this code has been sitting there for some time. - -JB- > > It might be better to declare the live array as static final and > use a synchronization block on the array itself when accessing it: > > private static final boolean live[] = new boolean[ALL_THREADS]; > private static boolean isAlive(int i) { synchronized(live) { return > live[i] }; } > > ... > > synchronized(live) { live[i] == false; } > > ... > > while (isAlive[id]) { ... } > > ... > > best regards, > > -- daniel > > On 7/22/13 1:55 PM, Jaroslav Bachorik wrote: >> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java >> test seems to be failing intermittently. >> >> The test checks the functionality of the >> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by >> capturing the current value of "getPeakThreadCount()", starting >> a predefined number of the user threads, stopping them and >> resetting the stored peak value and making sure the new peak >> equals to the number of the actually running threads. >> >> The main problem is that it is not possible to prevent JVM to >> start/stop arbitrary system threads while executing the test.
>> This might lead to small variations of the reported peak (a >> short-lived system thread is started while the batch of the user >> threads is running) or the expected number of running threads >> (again, a short-lived system thread is started at the moment the >> test asks for the number of running threads). >> >> The patch does not fix those shortcomings as it is not really >> possible to do given the nature of the JVM threading system. It >> rather tries to relax the conditions while still maintaining the >> ability to detect functional problems - eg. decreasing peak >> without explicitly resetting it and reporting false number of >> threads. >> >> The webrev is at: >> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >> >> Thanks, >> >> -JB- >> > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJR7k5UAAoJELSZyqhGiB1MsFoH/Rm/Of3/3U0hxvnB/1/PixYJ z1fakf98Gepyp9eIyNKZ5sfNCu6Zy+A826Uqfp/Hve8nUA5D9pzEiTpNoB4Fzts1 CWwn+Gd8r4crXXTNKKEg1vTOUEMcmRkUujY356ndmrcdZElRMQJwdOvkwgg9Z+Tn l0ZJLPTDyaDUtuP5D32RZYSMxf1yXL6hXbXNiOEWm9VD4NgxPpl8b4vu0cMrRiHH A+anZ9nUiEhdBsTJIcqgU4bmHBM8eXEDDepBMpnK6LyM/2eDhPj3iTqQpav26Lsd cURgR1Tgqs36bdlUCU4Q3MqPtHfnBibTTPxphXbhzgfAGMUW5JFerYGJIvTvpAw= =d/Q+ -----END PGP SIGNATURE----- From david.holmes at oracle.com Tue Jul 23 02:39:44 2013 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Jul 2013 19:39:44 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE4A91.3000305@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> Message-ID: <51EE4F60.4000506@oracle.com> Sorry - I took a closer look at the full test rather than just that patch. 
We already have this code to try and help expose these intermittent failures: 213 // Nightly testing showed some intermittent failure. 214 // Check here to get diagnostic information if some strange 215 // behavior occurs. 216 checkThreadCount(expectedCount, current, 0); but the sleep loop you added means this check will rarely fail so we won't get to see this unexpected behaviour happening. So this block of code could be deleted in my view. Though it is preferable to determine exactly why we fail! Also looking at the sleep() used elsewhere you may as well follow the same pattern and abort on interrupt as it isn't expected. Finally with regard to Daniel's comment about the live array he is right that the volatile on the array is not sufficient in theory - a thread need never see the value of live[i] become false. There are a number of reasons why we are unlikely to see that in practice on hotspot. Using synchronized will fix that; or an alternative cancellation mechanism could be used. Cheers, David On 23/07/2013 7:19 PM, David Holmes wrote: > On 23/07/2013 6:29 PM, Jaroslav Bachorik wrote: >> On 07/23/2013 10:19 AM, David Holmes wrote: >>> Hi Jaroslav, >>> >>> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote: >>>> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test >>>> seems to be failing intermittently. >>>> >>>> The test checks the functionality of the >>>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by >>>> capturing the current value of "getPeakThreadCount()", starting a >>>> predefined number of the user threads, stopping them and resetting the >>>> stored peak value and making sure the new peak equals to the number of >>>> the actually running threads. >>>> >>>> The main problem is that it is not possible to prevent JVM to >>>> start/stop >>>> arbitrary system threads while executing the test. 
This might lead to >>>> small variations of the reported peak (a short-lived system thread is >>>> started while the batch of the user threads is running) or the expected >>>> number of running threads (again, a short-lived system thread is >>>> started >>>> at the moment the test asks for the number of running threads). >>> >>> Do you know what "system threads" these are? I would not expect VM >>> internal threads to be counted in getPeakThreadCount(), but even if they >>> are I can't think of any short-lived threads that get created other than >>> the Signal handling thread. >> >> Unfortunatelly I don't. Capturing the thread dump at the moment of >> discovering the discrepancy seems to to be too late. I tried monitoring >> the JVM under the test from external tools but it just brings more >> entropy to the result. > > We'd need to instrument the thread creation logic to keep a separate > record. Dtrace probes could probably do it - but the problem is getting > the test to fail. > >> I am completely relying on the JVM native thread accounting to be >> correct and accurate - that it reports the thread count peak based on >> the real data. > > The spec isn't clear but I would only expect these counters to apply to > Java threads not VM internal threads (compiler, gc etc). So I'd really > like to know what thread is messing up this count. > > David > >> -JB- >> >>> >>>> The patch does not fix those shortcomings as it is not really possible >>>> to do given the nature of the JVM threading system. It rather tries to >>>> relax the conditions while still maintaining the ability to detect >>>> functional problems - eg. decreasing peak without explicitly resetting >>>> it and reporting false number of threads. >>>> >>>> The webrev is at: >>>> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >>> >>> Seems reasonable. 
>>> >>> David >>> ----- >>> >>>> Thanks, >>>> >>>> -JB- >>>> >> From daniel.fuchs at oracle.com Tue Jul 23 02:44:50 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Tue, 23 Jul 2013 11:44:50 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE4E54.5040005@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3DF8.8060903@oracle.com> <51EE4E54.5040005@oracle.com> Message-ID: <51EE5092.4050100@oracle.com> On 7/23/13 11:35 AM, Jaroslav Bachorik wrote: > As I wrote in my reply to David the offending thread is > j.u.l.LogManager$Cleaner which kicks in randomly. Argh... Logging again :-) > This would confirm my observations that the discrepancy is always at > most one thread more than expected. What you could do then is call: Logger.getLogger("foo").info("Logging initialized"); first thing in the main(). This way the Cleaner thread will already be there and won't perturb the test. >> There seems to be something strange in the test though: line 209, >> you catch InterruptedException just to call >> Thread.currentThread().interrupt() and interrupt the thread >> again?? Did you mean maybe to call >> Thread.currentThread().interrupted() instead? > > No, it checks whether the thread has been interrupted and cleans the > interrupted flag. That's what interrupted() will do. But interrupt() will cause the next call to Thread.sleep() to throw InterruptedException - hence my question. >> Also I'm not sure that using volatile for the 'live' array will >> work - the array itself is volatile - but does it extends to its >> elements? > > No, it does not. But this code has been sitting there for some time. Well - I'll leave it to you - but personally I would fix it along the way, just to make sure the test doesn't fail because of it.
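For illustration only, the synchronized access to the 'live' array suggested in this thread could be sketched roughly as below. This is a minimal sketch and not the code from ResetPeakThreadCount.java: the class name LiveFlags and the methods setAlive, startWorker and stopWorker are hypothetical names introduced here. It assumes each worker simply polls its own flag and gives up if it is interrupted.

```java
/**
 * Sketch only: LiveFlags and its method names are hypothetical and are
 * not taken from ResetPeakThreadCount.java.
 */
final class LiveFlags {
    // The array reference is final; every element read and write is guarded
    // by the array's own monitor, so a write made by one thread is visible
    // to any thread that later reads the same element under that monitor.
    private final boolean[] live;

    LiveFlags(int count) {
        live = new boolean[count];
    }

    boolean isAlive(int i) {
        synchronized (live) {
            return live[i];
        }
    }

    void setAlive(int i, boolean value) {
        synchronized (live) {
            live[i] = value;
        }
    }

    /** Marks slot id as live and starts a worker that polls until the flag is cleared. */
    Thread startWorker(final int id) {
        setAlive(id, true);
        Thread t = new Thread(() -> {
            while (isAlive(id)) {
                try {
                    Thread.sleep(5);
                } catch (InterruptedException e) {
                    return; // interrupt is not expected here, so just stop
                }
            }
        }, "worker-" + id);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Clears the flag, waits up to timeoutMillis, and reports whether the worker exited. */
    boolean stopWorker(int id, Thread worker, long timeoutMillis) {
        setAlive(id, false);
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // re-assert interrupt state
        }
        return !worker.isAlive();
    }
}
```

A caller would start a worker with startWorker(id) and later call stopWorker(id, worker, timeout). Other cancellation mechanisms would serve the same purpose; the point of the sketch is only that the flag elements are never read or written outside the monitor.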
cheers, -- daniel > > - -JB- > >> >> It might be better to declare the live array as static final and >> use a synchronization block on the array itself when accessing it: >> >> private static final boolean live[] = new boolean[ALL_THREADS]; >> private static boolean isAlive(int i) { synchronized(live) { return >> live[i] }; } >> >> ... >> >> synchronized(live) { live[i] == false; } >> >> ... >> >> while (isAlive[id]) { ... } >> >> ... >> >> best regards, >> >> -- daniel >> >> On 7/22/13 1:55 PM, Jaroslav Bachorik wrote: >>> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java >>> test seems to be failing intermittently. >>> >>> The test checks the functionality of the >>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by >>> capturing the current value of "getPeakThreadCount()", starting >>> a predefined number of the user threads, stopping them and >>> resetting the stored peak value and making sure the new peak >>> equals to the number of the actually running threads. >>> >>> The main problem is that it is not possible to prevent JVM to >>> start/stop arbitrary system threads while executing the test. >>> This might lead to small variations of the reported peak (a >>> short-lived system thread is started while the batch of the user >>> threads is running) or the expected number of running threads >>> (again, a short-lived system thread is started at the moment the >>> test asks for the number of running threads). >>> >>> The patch does not fix those shortcomings as it is not really >>> possible to do given the nature of the JVM threading system. It >>> rather tries to relax the conditions while still maintaining the >>> ability to detect functional problems - eg. decreasing peak >>> without explicitly resetting it and reporting false number of >>> threads. 
>>> >>> The webrev is at: >>> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >>> >>> Thanks, >>> >>> -JB- >>> >> > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.12 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ > > iQEcBAEBAgAGBQJR7k5UAAoJELSZyqhGiB1MsFoH/Rm/Of3/3U0hxvnB/1/PixYJ > z1fakf98Gepyp9eIyNKZ5sfNCu6Zy+A826Uqfp/Hve8nUA5D9pzEiTpNoB4Fzts1 > CWwn+Gd8r4crXXTNKKEg1vTOUEMcmRkUujY356ndmrcdZElRMQJwdOvkwgg9Z+Tn > l0ZJLPTDyaDUtuP5D32RZYSMxf1yXL6hXbXNiOEWm9VD4NgxPpl8b4vu0cMrRiHH > A+anZ9nUiEhdBsTJIcqgU4bmHBM8eXEDDepBMpnK6LyM/2eDhPj3iTqQpav26Lsd > cURgR1Tgqs36bdlUCU4Q3MqPtHfnBibTTPxphXbhzgfAGMUW5JFerYGJIvTvpAw= > =d/Q+ > -----END PGP SIGNATURE----- > From david.holmes at oracle.com Tue Jul 23 02:45:24 2013 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Jul 2013 19:45:24 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE4BD6.7040707@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> Message-ID: <51EE50B4.8040000@oracle.com> On 23/07/2013 7:24 PM, Jaroslav Bachorik wrote: > The result is that the offender is j.u.l.LogManager$Cleaner thread. I > am attaching the profiler snapshot (can be opened in eg. jvisualvm) That doesn't quite make sense. The Cleaner thread is a shutdownhook, it should not be starting unless the VM is shutting down! David ----- > On Tue 23 Jul 2013 11:19:13 AM CEST, David Holmes wrote: >> On 23/07/2013 6:29 PM, Jaroslav Bachorik wrote: >>> On 07/23/2013 10:19 AM, David Holmes wrote: >>>> Hi Jaroslav, >>>> >>>> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote: >>>>> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test >>>>> seems to be failing intermittently. >>>>> >>>>> The test checks the functionality of the >>>>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. 
It does so by >>>>> capturing the current value of "getPeakThreadCount()", starting a >>>>> predefined number of the user threads, stopping them and resetting the >>>>> stored peak value and making sure the new peak equals to the number of >>>>> the actually running threads. >>>>> >>>>> The main problem is that it is not possible to prevent JVM to >>>>> start/stop >>>>> arbitrary system threads while executing the test. This might lead to >>>>> small variations of the reported peak (a short-lived system thread is >>>>> started while the batch of the user threads is running) or the >>>>> expected >>>>> number of running threads (again, a short-lived system thread is >>>>> started >>>>> at the moment the test asks for the number of running threads). >>>> >>>> Do you know what "system threads" these are? I would not expect VM >>>> internal threads to be counted in getPeakThreadCount(), but even if >>>> they >>>> are I can't think of any short-lived threads that get created other >>>> than >>>> the Signal handling thread. >>> >>> Unfortunatelly I don't. Capturing the thread dump at the moment of >>> discovering the discrepancy seems to to be too late. I tried monitoring >>> the JVM under the test from external tools but it just brings more >>> entropy to the result. >> >> We'd need to instrument the thread creation logic to keep a separate >> record. Dtrace probes could probably do it - but the problem is >> getting the test to fail. > > Well, while responding to the previous email I thought about yet > another way to try to pinpoint the mysterious thread - I've tried NB > profiler. It filters out it's own threads and can do thread monitoring > at the same time as tracking the call tree. > > The result is that the offender is j.u.l.LogManager$Cleaner thread. I > am attaching the profiler snapshot (can be opened in eg. 
jvisualvm) > >> >>> I am completely relying on the JVM native thread accounting to be >>> correct and accurate - that it reports the thread count peak based on >>> the real data. >> >> The spec isn't clear but I would only expect these counters to apply >> to Java threads not VM internal threads (compiler, gc etc). So I'd >> really like to know what thread is messing up this count. > > I hope my previous finding makes this clearer. > > -JB- > >> >> David >> >>> -JB- >>> >>>> >>>>> The patch does not fix those shortcomings as it is not really possible >>>>> to do given the nature of the JVM threading system. It rather tries to >>>>> relax the conditions while still maintaining the ability to detect >>>>> functional problems - eg. decreasing peak without explicitly resetting >>>>> it and reporting false number of threads. >>>>> >>>>> The webrev is at: >>>>> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.00 >>>> >>>> Seems reasonable. >>>> >>>> David >>>> ----- >>>> >>>>> Thanks, >>>>> >>>>> -JB- >>>>> >>> > From daniel.fuchs at oracle.com Tue Jul 23 02:53:19 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Tue, 23 Jul 2013 11:53:19 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EE50B4.8040000@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> Message-ID: <51EE528F.2050302@oracle.com> On 7/23/13 11:45 AM, David Holmes wrote: > On 23/07/2013 7:24 PM, Jaroslav Bachorik wrote: > > The result is that the offender is j.u.l.LogManager$Cleaner thread. I > > am attaching the profiler snapshot (can be opened in eg. jvisualvm) > > That doesn't quite make sense. The Cleaner thread is a shutdownhook, it > should not be starting unless the VM is shutting down! Hummm... 
Right: the javadoc says "Returns the peak live thread count since the Java virtual machine started or peak was reset." so the Cleaner thread should not be counted. If it is actually counted it might indicate a real problem in the implementation of the ThreadMXBean. -- daniel. > > David > ----- > >> On Tue 23 Jul 2013 11:19:13 AM CEST, David Holmes wrote: >>> On 23/07/2013 6:29 PM, Jaroslav Bachorik wrote: >>>> On 07/23/2013 10:19 AM, David Holmes wrote: >>>>> Hi Jaroslav, >>>>> >>>>> On 22/07/2013 9:55 PM, Jaroslav Bachorik wrote: >>>>>> The java/lang/management/ThreadMXBean/ResetPeakThreadCount.java test >>>>>> seems to be failing intermittently. >>>>>> >>>>>> The test checks the functionality of the >>>>>> j.l.m.ThreadMXBean.resetPeakThreadCount() method. It does so by >>>>>> capturing the current value of "getPeakThreadCount()", starting a >>>>>> predefined number of the user threads, stopping them and resetting >>>>>> the >>>>>> stored peak value and making sure the new peak equals to the >>>>>> number of >>>>>> the actually running threads. >>>>>> >>>>>> The main problem is that it is not possible to prevent JVM to >>>>>> start/stop >>>>>> arbitrary system threads while executing the test. This might lead to >>>>>> small variations of the reported peak (a short-lived system thread is >>>>>> started while the batch of the user threads is running) or the >>>>>> expected >>>>>> number of running threads (again, a short-lived system thread is >>>>>> started >>>>>> at the moment the test asks for the number of running threads). >>>>> >>>>> Do you know what "system threads" these are? I would not expect VM >>>>> internal threads to be counted in getPeakThreadCount(), but even if >>>>> they >>>>> are I can't think of any short-lived threads that get created other >>>>> than >>>>> the Signal handling thread. >>>> >>>> Unfortunatelly I don't. Capturing the thread dump at the moment of >>>> discovering the discrepancy seems to to be too late. 

From david.holmes at oracle.com Tue Jul 23 02:54:01 2013
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 23 Jul 2013 19:54:01 +1000
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EE528F.2050302@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
 <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com>
 <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com>
 <51EE528F.2050302@oracle.com>
Message-ID: <51EE52B9.6070506@oracle.com>

On 23/07/2013 7:53 PM, Daniel Fuchs wrote:
> On 7/23/13 11:45 AM, David Holmes wrote:
>> On 23/07/2013 7:24 PM, Jaroslav Bachorik wrote:
>> > The result is that the offender is j.u.l.LogManager$Cleaner thread. I
>> > am attaching the profiler snapshot (can be opened in eg. jvisualvm)
>>
>> That doesn't quite make sense. The Cleaner thread is a shutdownhook, it
>> should not be starting unless the VM is shutting down!
>
> Hummm... Right: the javadoc says "Returns the peak live thread count
> since the Java virtual machine started or peak was reset." so the
> Cleaner thread should not be counted.

Not sure why you say that. It is a live Java thread - if you happen to
query the MXBean during VM shutdown then it should be in the count.

> If it is actually counted it might indicate a real problem in the
> implementation of the ThreadMXBean.

My point is: why is the VM apparently shutting down while this test is
running???

David

> -- daniel.

From jaroslav.bachorik at oracle.com Tue Jul 23 03:23:38 2013
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 23 Jul 2013 12:23:38 +0200
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EE4F60.4000506@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
 <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com>
 <51EE4F60.4000506@oracle.com>
Message-ID: <51EE59AA.8010002@oracle.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 07/23/2013 11:39 AM, David Holmes wrote:
> Sorry - I took a closer look at the full test rather than just
> that patch. We already have this code to try and help expose these
> intermittent failures:
>
>     // Nightly testing showed some intermittent failure.
>     // Check here to get diagnostic information if some strange
>     // behavior occurs.
>     checkThreadCount(expectedCount, current, 0);

Unfortunately, this does not help to get any closer to the culprit.
Until the code gets to the point of making the thread dump the
offending thread is gone. So you only get the information that
something went wrong.

- -JB-

>
> but the sleep loop you added means this check will rarely fail so
> we won't get to see this unexpected behaviour happening. So this
> block of code could be deleted in my view. Though it is preferable
> to determine exactly why we fail!
>
> Also looking at the sleep() used elsewhere you may as well follow
> the same pattern and abort on interrupt as it isn't expected.
>
> Finally with regard to Daniel's comment about the live array he is
> right that the volatile on the array is not sufficient in theory -
> a thread need never see the value of live[i] become false. There
> are a number of reasons why we are unlikely to see that in practice
> on hotspot. Using synchronized will fix that; or an alternative
> cancellation mechanism could be used.
>
> Cheers, David
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJR7lmqAAoJELSZyqhGiB1MdS8IAJEgnUI83ZQNYP2Md6vMe4C+
kGRgls2ml9x9ljwqMHnreOjww7pzyXeDKoX1vR09OD6znDUIuHkvjIOD8QRjFnjz
/E0uBnoaIIhREuvbopq4dHFXU0wPPK9VnU6OgGUtTKU0aqk9256NMJwprO06CrXa
TZlmUljgk3rci7pE9ZA7Up4+3Qr0tWPn5EjLAVG/UmAvC5zNptsAZcYjf8i9yQ+1
9Hp+4xY68i9QffdE3bNEAWGTQGkNy2rF4HHwSorxnruUHgi3yTxxbykJ2pBgDgYl
3IwnbrwWxNOOPW3h5DLaqCjdromCBfzYbm4xmY6Tbcxfvh0LR8QWm5eCfE151Ss=
=MYqb
-----END PGP SIGNATURE-----

From shanliang.jiang at oracle.com Tue Jul 23 08:30:17 2013
From: shanliang.jiang at oracle.com (shanliang)
Date: Tue, 23 Jul 2013 17:30:17 +0200
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51ED1DBE.3030304@oracle.com>
References: <51ED1DBE.3030304@oracle.com>
Message-ID: <51EEA189.5000801@oracle.com>

If it is not possible to prevent the JVM from starting/stopping
arbitrary system threads, then the test may still fail even with the
fix, but I should say the fix improves the test.

Line 176:
    // assuming no system thread is added
so line 177 is still a potential failure, even if a very unlikely one.

To know a thread's status, it is better to call Thread.getState(). For
example, we can save all MyThread instances into a list, and then check
them one by one like:

    for (Thread t : list) {
        while (t.getState() != Thread.State.TERMINATED) {
            Thread.sleep(10);
        }
    }

(a max waiting time can be added here)

This is because it is possible that a MyThread is suspended after
calling:
    barrier.signal();
but before leaving the run() method, especially when stopping many
threads at the same time on a slow testing machine.

Shanliang

From mandy.chung at oracle.com Tue Jul 23 23:01:58 2013
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 24 Jul 2013 14:01:58 +0800
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EE52B9.6070506@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
 <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com>
 <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com>
 <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com>
Message-ID: <51EF6DD6.5060806@oracle.com>

On 7/23/2013 5:54 PM, David Holmes wrote:
> On 23/07/2013 7:53 PM, Daniel Fuchs wrote:
>> On 7/23/13 11:45 AM, David Holmes wrote:
>>> On 23/07/2013 7:24 PM, Jaroslav Bachorik wrote:
>>> > The result is that the offender is j.u.l.LogManager$Cleaner thread. I
>>> > am attaching the profiler snapshot (can be opened in eg. jvisualvm)
>>>
>>> That doesn't quite make sense. The Cleaner thread is a shutdownhook, it
>>> should not be starting unless the VM is shutting down!
>>
>> Hummm... Right: the javadoc says "Returns the peak live thread count
>> since the Java virtual machine started or peak was reset." so the
>> Cleaner thread should not be counted.
>
> Not sure why you say that. It is a live Java thread - if you happen to
> query the MXBean during VM shutdown then it should be in the count.
>

I am catching up on this thread....

The thread count counts Java threads that are not hidden. I believe all
VM internal threads are hidden from external API. This test runs in
othervm mode and AFAICT the thread count is expected to be
deterministic. I don't expect the VM will start and terminate any
thread any time.

I agree with David that we should diagnose why there is one additional
thread started before the reset. If it is the LogManager$Cleaner
thread, like David said, the VM is shutting down while the test is
still running which doesn't quite make sense.

Mandy

From daniel.fuchs at oracle.com Tue Jul 23 23:09:37 2013
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Wed, 24 Jul 2013 08:09:37 +0200
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EF6DD6.5060806@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
 <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com>
 <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com>
 <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com>
 <51EF6DD6.5060806@oracle.com>
Message-ID: <51EF6FA1.9000103@oracle.com>

On 7/24/13 8:01 AM, Mandy Chung wrote:
> On 7/23/2013 5:54 PM, David Holmes wrote:
>> On 23/07/2013 7:53 PM, Daniel Fuchs wrote:
>>> On 7/23/13 11:45 AM, David Holmes wrote:
>>>> On 23/07/2013 7:24 PM, Jaroslav Bachorik wrote:
>>>> > The result is that the offender is j.u.l.LogManager$Cleaner thread. I
>>>> > am attaching the profiler snapshot (can be opened in eg. jvisualvm)
>>>>
>>>> That doesn't quite make sense. The Cleaner thread is a shutdownhook,
>>>> it should not be starting unless the VM is shutting down!
>>>
>>> Hummm... Right: the javadoc says "Returns the peak live thread count
>>> since the Java virtual machine started or peak was reset." so the
>>> Cleaner thread should not be counted.
>>
>> Not sure why you say that. It is a live Java thread - if you happen
>> to query the MXBean during VM shutdown then it should be in the count.
>>
>
> I am catching up on this thread....
>
> The thread count counts Java threads that are not hidden. I believe
> all VM internal threads are hidden from external API. This test runs
> in othervm mode and AFAICT the thread count is expected to be
> deterministic. I don't expect the VM will start and terminate any
> thread any time.
>
> I agree with David that we should diagnose why there is one additional
> thread started before the reset. If it is the LogManager$Cleaner
> thread, like David said, the VM is shutting down while the test is
> still running which doesn't quite make sense.

I think that Shanliang's suspicion that a thread might be still alive
if unscheduled just after having called its barrier.signal() is a very
good suggestion. I would advise calling thread.join() on all threads in
terminateThreads, just to make sure they are all really dead and not in
some comatose state...

If Shanliang is right then the test would be failing because some of
the threads we think are dead are not actually dead yet - and not
because of some new VM thread that nobody can see :-)

-- daniel

From jaroslav.bachorik at oracle.com Tue Jul 23 23:18:12 2013
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 24 Jul 2013 08:18:12 +0200
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EF6DD6.5060806@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
 <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com>
 <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com>
 <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com>
 <51EF6DD6.5060806@oracle.com>
Message-ID: <51EF71A4.3090209@oracle.com>

Thanks everyone for taking the time to dig into this issue.

I've done more testing and it turns out that my initial analysis was
wrong. There are no threads magically appearing and disappearing (it
was all caused by the monitoring tools I used). It rather seems that
there is an issue with terminating the test threads - I've added a lot
of logging to the original test and was able to observe that sometimes
the new test threads were started before the terminating test threads
had disappeared.

So I've added a more rigorous check for thread termination - checking
the thread states instead of just comparing the thread counts. By doing
this I was able to decrease the chances of failing, but it still seems
that there is some discrepancy between the numbers reported by the
mbean and e.g. the result of Thread.getAllStackTraces(). I am logging
all the threads reported by Thread.getAllStackTraces() before and after
the call to mbean.getThreadCount(), and sometimes mbean.getThreadCount()
reports a thread count which is off by 1 with regard to both
Thread.getAllStackTraces() calls.

I will try the "thread.join()" suggestion from Daniel.
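
For reference, the join-based termination discussed here can be
sketched as follows. This is only an illustration under stated
assumptions: the class JoinBeforeCounting and its method are
hypothetical names, and the workers park on a CountDownLatch rather than
on the test's own barrier. It is not the code in
ResetPeakThreadCount.java.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper for illustration; not taken from ResetPeakThreadCount.java.
class JoinBeforeCounting {

    /** Starts n workers, asks them to stop, and joins every one of them. */
    static List<Thread> startAndTerminate(int n) {
        final CountDownLatch started = new CountDownLatch(n);
        final CountDownLatch stop = new CountDownLatch(1);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            Thread t = new Thread(() -> {
                started.countDown();
                try {
                    stop.await();                 // park until asked to exit
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "worker-" + i);
            workers.add(t);
            t.start();
        }
        try {
            started.await();                      // all workers are running
            stop.countDown();                     // ask all workers to exit
            for (Thread t : workers) {
                t.join();                         // returns once t has terminated
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return workers;
    }

    public static void main(String[] args) {
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        for (Thread t : startAndTerminate(10)) {
            System.out.println(t.getName() + " -> " + t.getState());
        }
        // Query the count only after every worker has been joined.
        System.out.println("live threads: " + mbean.getThreadCount());
    }
}
```

Joining each worker removes the window in which a thread has signalled
that it is finishing but has not yet left run(). Whether the mbean
count then matches is a separate question that this sketch does not
answer.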
-JB-

From mandy.chung at oracle.com Tue Jul 23 23:20:56 2013
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 24 Jul 2013 14:20:56 +0800
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51EF6FA1.9000103@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com>
 <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com>
 <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com>
 <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com>
 <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com>
Message-ID: <51EF7248.2070405@oracle.com>

On 7/24/2013 2:09 PM, Daniel Fuchs wrote:
> On 7/24/13 8:01 AM, Mandy Chung wrote:
>> I am catching up on this thread....
>>
>> The thread count counts Java threads that are not hidden. I believe
>> all VM internal threads are hidden from external API. This test runs
>> in othervm mode and AFAICT the thread count is expected to be
>> deterministic. I don't expect the VM will start and terminate any
>> thread any time.
>>
>> I agree with David that we should diagnose why there is one
>> additional thread started before the reset. If it is the
>> LogManager$Cleaner thread, like David said, the VM is shutting down
>> while the test is still running which doesn't quite make sense.
> I think that Shanliang's suspicion that a thread might be still alive
> if unscheduled just after having
> called its barrier.signal() is a very good suggestion. I would advise
> calling thread.join() on all threads in
> terminateThreads, just to make sure they are all really dead and not
> in some comatose state...
> If Shanliang is right then the test would be failing because some of
> the threads we think are dead are
> not actually dead yet - and not because of some new VM thread that
> nobody can see :-)

Thanks for pointing that out.

I agree that the test should be changed to call Thread.join().
There may be other java.lang.management tests that should also be fixed to call Thread.join. Mandy From jaroslav.bachorik at oracle.com Tue Jul 23 23:47:36 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Jul 2013 08:47:36 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF7248.2070405@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> Message-ID: <51EF7888.40100@oracle.com> On Wed 24 Jul 2013 08:20:56 AM CEST, Mandy Chung wrote: > > On 7/24/2013 2:09 PM, Daniel Fuchs wrote: >> On 7/24/13 8:01 AM, Mandy Chung wrote: >>> I am catching up on this thread.... >>> >>> The thread count counts Java threads that are not hidden. I believe >>> all VM internal threads are hidden from external API. This test runs >>> in othervm mode and AFAICT the thread count is expected to be >>> deterministic. I don't expect the VM will start and terminate any >>> thread any time. >>> >>> I agree with David that we should diagnose why there is one >>> additional thread started before the reset. If it is the >>> LogManager$Cleaner thread, like David said, the VM is shutting down >>> while the test is still running which doesn't quite make sense. >> I think that Shanliang's suspicion that a thread might be still alive >> if unscheduled just after having >> called its barrier.signal() is a very good suggestion. I would advise >> calling thread.join() on all threads in >> terminateThreads, just to make sure they are all really dead and not >> in some comatose state... 
>> If Shanliang is right then the test would be failing because some of >> the threads we think are dead are >> not actually dead yet - and not because of some new VM thread that >> nobody can see :-) > > Thanks for pointing that out. > > I agree that the test should be changed to call Thread.join(). There > may be other java.lang.management tests that should also be fixed to > call Thread.join.

I've tried using Thread.join() but I am still getting the thread count discrepancy. Specifically:

1. 10 worker threads have been successfully started - mbean.getThreadCount() reports 14 and Thread.getAllStackTraces() returns 14 items
---
Thread: Thread[Signal Dispatcher,9,system]
Thread: Thread[worker-5,5,main]
Thread: Thread[worker-7,5,main]
Thread: Thread[worker-9,5,main]
Thread: Thread[worker-12,5,main]
Thread: Thread[worker-11,5,main]
Thread: Thread[Reference Handler,10,system]
Thread: Thread[main,5,main]
Thread: Thread[worker-10,5,main]
Thread: Thread[worker-8,5,main]
Thread: Thread[Finalizer,8,system]
Thread: Thread[worker-6,5,main]
Thread: Thread[worker-13,5,main]
Thread: Thread[worker-4,5,main]
---
2. Terminating 8 threads
3. After the threads have been terminated (waiting on Thread.join() for them to die) - mbean.getThreadCount() reports 7 while Thread.getAllStackTraces() returns only 6 items
---
Thread: Thread[Signal Dispatcher,9,system]
Thread: Thread[Finalizer,8,system]
Thread: Thread[worker-12,5,main]
Thread: Thread[Reference Handler,10,system]
Thread: Thread[main,5,main]
Thread: Thread[worker-13,5,main]
---

This would almost point to mbean.getThreadCount() reporting a stale value. Is that possible?
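For anyone who wants to repeat the comparison described in the steps above outside the jtreg test, a minimal standalone sketch might look like the following. It is not taken from ResetPeakThreadCount.java: the class name ThreadCountProbe and its helper methods are hypothetical, and the latches merely stand in for whatever the real test uses to start and release its workers. It starts some workers, lets them finish, joins them, and then reads both counters back to back.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical probe, not part of the original test.
final class ThreadCountProbe {

    /** Live thread count as reported by ThreadMXBean. */
    static int mbeanCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Live thread count as seen through Thread.getAllStackTraces(). */
    static int stackTraceCount() {
        return Thread.getAllStackTraces().size();
    }

    /**
     * Starts the given number of workers, lets them all finish, joins
     * them, and returns { mbeanCount, stackTraceCount } read right after
     * the last join() has returned.
     */
    static int[] probe(int workers) {
        CountDownLatch started = new CountDownLatch(workers);
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "worker-" + i);
            t.start();
            threads.add(t);
        }
        try {
            started.await();        // all workers are running
            release.countDown();    // let them terminate
            for (Thread t : threads) {
                t.join();           // returns once each worker is no longer alive
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new int[] { mbeanCount(), stackTraceCount() };
    }

    public static void main(String[] args) {
        int[] counts = probe(8);
        System.out.println("ThreadMXBean.getThreadCount()     = " + counts[0]);
        System.out.println("Thread.getAllStackTraces().size() = " + counts[1]);
    }
}
```

The two numbers may or may not differ on a given run; the sketch only makes the two readings easy to compare and asserts nothing about which one is right.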
-JB- > > Mandy From shanliang.jiang at oracle.com Wed Jul 24 00:21:53 2013 From: shanliang.jiang at oracle.com (shanliang) Date: Wed, 24 Jul 2013 09:21:53 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF7888.40100@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> Message-ID: <51EF8091.9030603@oracle.com> Just as a test: after terminating the 8 threads and checking that they are dead by calling Thread.join() (which should agree with Thread.getState()), sleep for a while and then call mbean.getThreadCount(). If it then reports the right number, we may need to verify the mbean.getThreadCount() method. Shanliang Jaroslav Bachorik wrote: > On Wed 24 Jul 2013 08:20:56 AM CEST, Mandy Chung wrote: > >> On 7/24/2013 2:09 PM, Daniel Fuchs wrote: >> >>> On 7/24/13 8:01 AM, Mandy Chung wrote: >>> >>>> I am catching up on this thread.... >>>> >>>> The thread count counts Java threads that are not hidden. I believe >>>> all VM internal threads are hidden from external API. This test runs >>>> in othervm mode and AFAICT the thread count is expected to be >>>> deterministic. I don't expect the VM will start and terminate any >>>> thread any time. >>>> >>>> I agree with David that we should diagnose why there is one >>>> additional thread started before the reset. If it is the >>>> LogManager$Cleaner thread, like David said, the VM is shutting down >>>> while the test is still running which doesn't quite make sense. >>>> >>> I think that Shanliang's suspicion that a thread might be still alive >>> if unscheduled just after having >>> called its barrier.signal() is a very good suggestion.
I would advise >>> calling thread.join() on all threads in >>> terminateThreads, just to make sure they are all really dead and not >>> in some comatose state... >>> If Shanliang is right then the test would be failing because some of >>> the threads we think are dead are >>> not actually dead yet - and not because of some new VM thread that >>> nobody can see :-) >>> >> Thanks for pointing that out. >> >> I agree that the test should be changed to call Thread.join(). There >> may be other java.lang.management tests that should also be fixed to >> call Thread.join. >> > > I've tried using Thread.join() but I am still getting the thread count > discrepancy. > > Specifically: > 1. 10 worker threads have been successfully started - > mben.getThreadCount() reports 14 and Thread.getAllStackTraces() returns > 14 items > --- > Thread: Thread[Signal Dispatcher,9,system] > Thread: Thread[worker-5,5,main] > Thread: Thread[worker-7,5,main] > Thread: Thread[worker-9,5,main] > Thread: Thread[worker-12,5,main] > Thread: Thread[worker-11,5,main] > Thread: Thread[Reference Handler,10,system] > Thread: Thread[main,5,main] > Thread: Thread[worker-10,5,main] > Thread: Thread[worker-8,5,main] > Thread: Thread[Finalizer,8,system] > Thread: Thread[worker-6,5,main] > Thread: Thread[worker-13,5,main] > Thread: Thread[worker-4,5,main] > --- > 2. Terminating 8 threads > 3. After the threads have been terminated (waiting on Thread.join() for > them to die) - mbean.getThreadCount() reports 7 while > Thread.getAllStackTraces() returns only 6 items > --- > Thread: Thread[Signal Dispatcher,9,system] > Thread: Thread[Finalizer,8,system] > Thread: Thread[worker-12,5,main] > Thread: Thread[Reference Handler,10,system] > Thread: Thread[main,5,main] > Thread: Thread[worker-13,5,main] > --- > > This would almost point to mbean.getThreadCount() reporting a stale > value. Is that possible? > > -JB- > > >> Mandy >> > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: http://mail.openjdk.java.net/pipermail/jmx-dev/attachments/20130724/6c50234a/attachment.html From jaroslav.bachorik at oracle.com Wed Jul 24 00:35:02 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Jul 2013 09:35:02 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF8091.9030603@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> Message-ID: <51EF83A6.1040200@oracle.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 07/24/2013 09:21 AM, shanliang wrote: > Just to be a test, after terminated 8 threads and checked their > states by calling Thread.join() (must be same to > Thread.getState()), DO sleep sometime and then call > mbean.getThreadCount(), if it reports a right number, then we may > need to verify mbean.getThreadCount() method. 
The result is:

Thread: Thread[Signal Dispatcher,9,system]
Thread: Thread[Finalizer,8,system]
Thread: Thread[worker-12,5,main]
Thread: Thread[Reference Handler,10,system]
Thread: Thread[main,5,main]
Thread: Thread[worker-13,5,main]
---> MBean.getThreadCount() = 7
Thread(1): Thread[Signal Dispatcher,9,system]
Thread(1): Thread[Finalizer,8,system]
Thread(1): Thread[worker-12,5,main]
Thread(1): Thread[Reference Handler,10,system]
Thread(1): Thread[main,5,main]
Thread(1): Thread[worker-13,5,main]
---> MBean.getThreadCount() = 6

-JB-

> > Shanliang > > Jaroslav Bachorik wrote: >> On Wed 24 Jul 2013 08:20:56 AM CEST, Mandy Chung wrote: >> >>> On 7/24/2013 2:09 PM, Daniel Fuchs wrote: >>> >>>> On 7/24/13 8:01 AM, Mandy Chung wrote: >>>> >>>>> I am catching up on this thread.... >>>>> >>>>> The thread count counts Java threads that are not hidden. >>>>> I believe all VM internal threads are hidden from external >>>>> API. This test runs in othervm mode and AFAICT the thread >>>>> count is expected to be deterministic. I don't expect the >>>>> VM will start and terminate any thread any time. >>>>> >>>>> I agree with David that we should diagnose why there is >>>>> one additional thread started before the reset. If it is >>>>> the LogManager$Cleaner thread, like David said, the VM is >>>>> shutting down while the test is still running which doesn't >>>>> quite make sense. >>>>> >>>> I think that Shanliang's suspicion that a thread might be >>>> still alive if unscheduled just after having called its >>>> barrier.signal() is a very good suggestion. I would advise >>>> calling thread.join() on all threads in terminateThreads, >>>> just to make sure they are all really dead and not in some >>>> comatose state... If Shanliang is right then the test would >>>> be failing because some of the threads we think are dead are >>>> not actually dead yet - and not because of some new VM thread >>>> that nobody can see :-) >>>> >>> Thanks for pointing that out.
>>> >>> I agree that the test should be changed to call Thread.join(). >>> There may be other java.lang.management tests that should also >>> be fixed to call Thread.join. >>> >> >> I've tried using Thread.join() but I am still getting the thread >> count discrepancy. >> >> Specifically: 1. 10 worker threads have been successfully started >> - mben.getThreadCount() reports 14 and >> Thread.getAllStackTraces() returns 14 items --- Thread: >> Thread[Signal Dispatcher,9,system] Thread: >> Thread[worker-5,5,main] Thread: Thread[worker-7,5,main] Thread: >> Thread[worker-9,5,main] Thread: Thread[worker-12,5,main] Thread: >> Thread[worker-11,5,main] Thread: Thread[Reference >> Handler,10,system] Thread: Thread[main,5,main] Thread: >> Thread[worker-10,5,main] Thread: Thread[worker-8,5,main] Thread: >> Thread[Finalizer,8,system] Thread: Thread[worker-6,5,main] >> Thread: Thread[worker-13,5,main] Thread: Thread[worker-4,5,main] >> --- 2. Terminating 8 threads 3. After the threads have been >> terminated (waiting on Thread.join() for them to die) - >> mbean.getThreadCount() reports 7 while Thread.getAllStackTraces() >> returns only 6 items --- Thread: Thread[Signal >> Dispatcher,9,system] Thread: Thread[Finalizer,8,system] Thread: >> Thread[worker-12,5,main] Thread: Thread[Reference >> Handler,10,system] Thread: Thread[main,5,main] Thread: >> Thread[worker-13,5,main] --- >> >> This would almost point to mbean.getThreadCount() reporting a >> stale value. Is that possible? 
>> >> -JB- >> >> >>> Mandy >>> >> >> >> > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJR74OmAAoJELSZyqhGiB1M4k8H/2hM5o2vVe2lfhc374IBaR5R R8i9Z2n0prBRKIqg4bTkAcllq5pmdozxwFyaEBzJtAGh9vnL7Tmojn6ksg9K+MMl bSgWSeg+gSZyymS7aE8rTVqKigH8vNOpHOogePDrUOCZGeZgJIMpmY1QcVbLeq8k mkz5mPYxEE2E7gt8cjvcXknOWeQUTyZILWGIPBfx9FL0iwBtK5h0PnfasR7bCxcR DO48USIuTxe+aN687OkAlJq9bCR6HRzWQiaSdi4ROVyrx2xYtir4n9sZtNWJwokv 3p5TdX6S64jnVZZMjbPJgCENTYMvTeRCj/8GvCYlI9KQEa9x2zhU2wIp5Zw4ag4= =4vkV -----END PGP SIGNATURE----- From shanliang.jiang at oracle.com Wed Jul 24 01:38:02 2013 From: shanliang.jiang at oracle.com (shanliang) Date: Wed, 24 Jul 2013 10:38:02 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF83A6.1040200@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> Message-ID: <51EF926A.3060705@oracle.com> - ---> MBean.getThreadCount() = 7 ................ - ---> MBean.getThreadCount() = 6 I suppose that you added sleep between 2 calls, then there might be an issue with MBean.getThreadCount() Shanliang Jaroslav Bachorik wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 07/24/2013 09:21 AM, shanliang wrote: > >> Just to be a test, after terminated 8 threads and checked their >> states by calling Thread.join() (must be same to >> Thread.getState()), DO sleep sometime and then call >> mbean.getThreadCount(), if it reports a right number, then we may >> need to verify mbean.getThreadCount() method. 
>> > > The result is: > > Thread: Thread[Signal Dispatcher,9,system] > Thread: Thread[Finalizer,8,system] > Thread: Thread[worker-12,5,main] > Thread: Thread[Reference Handler,10,system] > Thread: Thread[main,5,main] > Thread: Thread[worker-13,5,main] > - ---> MBean.getThreadCount() = 7 > Thread(1): Thread[Signal Dispatcher,9,system] > Thread(1): Thread[Finalizer,8,system] > Thread(1): Thread[worker-12,5,main] > Thread(1): Thread[Reference Handler,10,system] > Thread(1): Thread[main,5,main] > Thread(1): Thread[worker-13,5,main] > - ---> MBean.getThreadCount() = 6 > > - -JB- > > > >> Shanliang >> >> Jaroslav Bachorik wrote: >> >>> On Wed 24 Jul 2013 08:20:56 AM CEST, Mandy Chung wrote: >>> >>> >>>> On 7/24/2013 2:09 PM, Daniel Fuchs wrote: >>>> >>>> >>>>> On 7/24/13 8:01 AM, Mandy Chung wrote: >>>>> >>>>> >>>>>> I am catching up on this thread.... >>>>>> >>>>>> The thread count counts Java threads that are not hidden. >>>>>> I believe all VM internal threads are hidden from external >>>>>> API. This test runs in othervm mode and AFAICT the thread >>>>>> count is expected to be deterministic. I don't expect the >>>>>> VM will start and terminate any thread any time. >>>>>> >>>>>> I agree with David that we should diagnose why there is >>>>>> one additional thread started before the reset. If it is >>>>>> the LogManager$Cleaner thread, like David said, the VM is >>>>>> shutting down while the test is still running which doesn't >>>>>> quite make sense. >>>>>> >>>>>> >>>>> I think that Shanliang's suspicion that a thread might be >>>>> still alive if unscheduled just after having called its >>>>> barrier.signal() is a very good suggestion. I would advise >>>>> calling thread.join() on all threads in terminateThreads, >>>>> just to make sure they are all really dead and not in some >>>>> comatose state... 
If Shanliang is right then the test would >>>>> be failing because some of the threads we think are dead are >>>>> not actually dead yet - and not because of some new VM thread >>>>> that nobody can see :-) >>>>> >>>>> >>>> Thanks for pointing that out. >>>> >>>> I agree that the test should be changed to call Thread.join(). >>>> There may be other java.lang.management tests that should also >>>> be fixed to call Thread.join. >>>> >>>> >>> I've tried using Thread.join() but I am still getting the thread >>> count discrepancy. >>> >>> Specifically: 1. 10 worker threads have been successfully started >>> - mben.getThreadCount() reports 14 and >>> Thread.getAllStackTraces() returns 14 items --- Thread: >>> Thread[Signal Dispatcher,9,system] Thread: >>> Thread[worker-5,5,main] Thread: Thread[worker-7,5,main] Thread: >>> Thread[worker-9,5,main] Thread: Thread[worker-12,5,main] Thread: >>> Thread[worker-11,5,main] Thread: Thread[Reference >>> Handler,10,system] Thread: Thread[main,5,main] Thread: >>> Thread[worker-10,5,main] Thread: Thread[worker-8,5,main] Thread: >>> Thread[Finalizer,8,system] Thread: Thread[worker-6,5,main] >>> Thread: Thread[worker-13,5,main] Thread: Thread[worker-4,5,main] >>> --- 2. Terminating 8 threads 3. After the threads have been >>> terminated (waiting on Thread.join() for them to die) - >>> mbean.getThreadCount() reports 7 while Thread.getAllStackTraces() >>> returns only 6 items --- Thread: Thread[Signal >>> Dispatcher,9,system] Thread: Thread[Finalizer,8,system] Thread: >>> Thread[worker-12,5,main] Thread: Thread[Reference >>> Handler,10,system] Thread: Thread[main,5,main] Thread: >>> Thread[worker-13,5,main] --- >>> >>> This would almost point to mbean.getThreadCount() reporting a >>> stale value. Is that possible? 
>>> >>> -JB- >>> >>> >>> >>>> Mandy >>>> >>>> >>> >>> >> > > From jaroslav.bachorik at oracle.com Wed Jul 24 01:40:46 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Jul 2013 10:40:46 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF926A.3060705@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> Message-ID: <51EF930E.4050507@oracle.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 07/24/2013 10:38 AM, shanliang wrote: > - ---> MBean.getThreadCount() = 7 > > ................ > > - ---> MBean.getThreadCount() = 6 > > I suppose that you added sleep between 2 calls, then there might be > an issue with MBean.getThreadCount() Actually I tried it with sleep for 10ms as well as without.
It seems that the natural latency between those 2 calls is enough to get the thread count updated to the actual value. - -JB- > > Shanliang > > Jaroslav Bachorik wrote: On 07/24/2013 09:21 AM, shanliang wrote: > >>>> Just to be a test, after terminated 8 threads and checked >>>> their states by calling Thread.join() (must be same to >>>> Thread.getState()), DO sleep sometime and then call >>>> mbean.getThreadCount(), if it reports a right number, then we >>>> may need to verify mbean.getThreadCount() method. >>>> > > The result is: > > Thread: Thread[Signal Dispatcher,9,system] Thread: > Thread[Finalizer,8,system] Thread: Thread[worker-12,5,main] Thread: > Thread[Reference Handler,10,system] Thread: Thread[main,5,main] > Thread: Thread[worker-13,5,main] ---> MBean.getThreadCount() = 7 > Thread(1): Thread[Signal Dispatcher,9,system] Thread(1): > Thread[Finalizer,8,system] Thread(1): Thread[worker-12,5,main] > Thread(1): Thread[Reference Handler,10,system] Thread(1): > Thread[main,5,main] Thread(1): Thread[worker-13,5,main] ---> > MBean.getThreadCount() = 6 > > -JB- > > > >>>> Shanliang >>>> >>>> Jaroslav Bachorik wrote: >>>> >>>>> On Wed 24 Jul 2013 08:20:56 AM CEST, Mandy Chung wrote: >>>>> >>>>> >>>>>> On 7/24/2013 2:09 PM, Daniel Fuchs wrote: >>>>>> >>>>>> >>>>>>> On 7/24/13 8:01 AM, Mandy Chung wrote: >>>>>>> >>>>>>> >>>>>>>> I am catching up on this thread.... >>>>>>>> >>>>>>>> The thread count counts Java threads that are not >>>>>>>> hidden. I believe all VM internal threads are hidden >>>>>>>> from external API. This test runs in othervm mode and >>>>>>>> AFAICT the thread count is expected to be >>>>>>>> deterministic. I don't expect the VM will start and >>>>>>>> terminate any thread any time. >>>>>>>> >>>>>>>> I agree with David that we should diagnose why there >>>>>>>> is one additional thread started before the reset. 
>>>>>>>> If it is the LogManager$Cleaner thread, like David >>>>>>>> said, the VM is shutting down while the test is still >>>>>>>> running which doesn't quite make sense. >>>>>>>> >>>>>>>> >>>>>>> I think that Shanliang's suspicion that a thread might >>>>>>> be still alive if unscheduled just after having called >>>>>>> its barrier.signal() is a very good suggestion. I would >>>>>>> advise calling thread.join() on all threads in >>>>>>> terminateThreads, just to make sure they are all really >>>>>>> dead and not in some comatose state... If Shanliang is >>>>>>> right then the test would be failing because some of >>>>>>> the threads we think are dead are not actually dead yet >>>>>>> - and not because of some new VM thread that nobody can >>>>>>> see :-) >>>>>>> >>>>>>> >>>>>> Thanks for pointing that out. >>>>>> >>>>>> I agree that the test should be changed to call >>>>>> Thread.join(). There may be other java.lang.management >>>>>> tests that should also be fixed to call Thread.join. >>>>>> >>>>>> >>>>> I've tried using Thread.join() but I am still getting the >>>>> thread count discrepancy. >>>>> >>>>> Specifically: 1. 10 worker threads have been successfully >>>>> started - mben.getThreadCount() reports 14 and >>>>> Thread.getAllStackTraces() returns 14 items --- Thread: >>>>> Thread[Signal Dispatcher,9,system] Thread: >>>>> Thread[worker-5,5,main] Thread: Thread[worker-7,5,main] >>>>> Thread: Thread[worker-9,5,main] Thread: >>>>> Thread[worker-12,5,main] Thread: Thread[worker-11,5,main] >>>>> Thread: Thread[Reference Handler,10,system] Thread: >>>>> Thread[main,5,main] Thread: Thread[worker-10,5,main] >>>>> Thread: Thread[worker-8,5,main] Thread: >>>>> Thread[Finalizer,8,system] Thread: Thread[worker-6,5,main] >>>>> Thread: Thread[worker-13,5,main] Thread: >>>>> Thread[worker-4,5,main] --- 2. Terminating 8 threads 3. 
>>>>> After the threads have been terminated (waiting on >>>>> Thread.join() for them to die) - mbean.getThreadCount() >>>>> reports 7 while Thread.getAllStackTraces() returns only 6 >>>>> items --- Thread: Thread[Signal Dispatcher,9,system] >>>>> Thread: Thread[Finalizer,8,system] Thread: >>>>> Thread[worker-12,5,main] Thread: Thread[Reference >>>>> Handler,10,system] Thread: Thread[main,5,main] Thread: >>>>> Thread[worker-13,5,main] --- >>>>> >>>>> This would almost point to mbean.getThreadCount() reporting >>>>> a stale value. Is that possible? >>>>> >>>>> -JB- >>>>> >>>>> >>>>> >>>>>> Mandy >>>>>> >>>>>> >>>>> >>>>> >>>> > >> > > -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBAgAGBQJR75MOAAoJELSZyqhGiB1M1CgIAKcMQMTZlZqM6qFsI5nCc53Y fHEFykf4792Qh/TgqDiyNbDCiTgY0TWoChUEJJEQvlho01TpJmKbkyqx5fNoNqjO l94p073f4GsUSHR4exGmDjJkg87DCzhbhX3bZdwjfsxJHxup8qrXxpz4c5lyBHDH ttoSasrcDIUh7cRoeqY7uWkIcnc8xI1cj7p3JlPUwB251eKzh15GZgMJhNKrn9N2 nhjpGywh3t/kwcsDVCibgBBOJ4ju55PRDZTyxH2R6o4fM+Twl80nZSaxUJiPUfEe yDNFUxMfPcNH+jRAhRlmKRZtfHfYV/nwaj/eqCL8CDtluzVR+lraII81pg7OU+c= =lqyg -----END PGP SIGNATURE----- From shanliang.jiang at oracle.com Wed Jul 24 01:50:26 2013 From: shanliang.jiang at oracle.com (shanliang) Date: Wed, 24 Jul 2013 10:50:26 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF930E.4050507@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> Message-ID: 
<51EF9552.1020901@oracle.com> Jaroslav Bachorik wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 07/24/2013 10:38 AM, shanliang wrote: > >> - ---> MBean.getThreadCount() = 7 >> >> ................ >> >> - ---> MBean.getThreadCount() = 6 >> >> I suppose that you added sleep between 2 calls, then there might be >> an issue with MBean.getThreadCount() >> > > Actually I tried it with sleep for 10ms as well as without. It seems > that the natural latency between those 2 calls is enough to get the > thread count updated to the actual value. >

So we have 2 kinds of issues here:
1) test-related issues, such as the thread state checking, which we can fix in the test
2) the MBean.getThreadCount() issue, for which we can file a bug to track it (add your test case to the bug) and add a workaround (sleep, or call it twice) in the test to make the test pass.

Mandy is the expert, so it would be better to get her opinion.

Shanliang

> - -JB- > > >> Shanliang >> >> Jaroslav Bachorik wrote: On 07/24/2013 09:21 AM, shanliang wrote: >> >> >>>>> Just to be a test, after terminated 8 threads and checked >>>>> their states by calling Thread.join() (must be same to >>>>> Thread.getState()), DO sleep sometime and then call >>>>> mbean.getThreadCount(), if it reports a right number, then we >>>>> may need to verify mbean.getThreadCount() method.
>>>>> >>>>> >> The result is: >> >> Thread: Thread[Signal Dispatcher,9,system] Thread: >> Thread[Finalizer,8,system] Thread: Thread[worker-12,5,main] Thread: >> Thread[Reference Handler,10,system] Thread: Thread[main,5,main] >> Thread: Thread[worker-13,5,main] ---> MBean.getThreadCount() = 7 >> Thread(1): Thread[Signal Dispatcher,9,system] Thread(1): >> Thread[Finalizer,8,system] Thread(1): Thread[worker-12,5,main] >> Thread(1): Thread[Reference Handler,10,system] Thread(1): >> Thread[main,5,main] Thread(1): Thread[worker-13,5,main] ---> >> MBean.getThreadCount() = 6 >> >> -JB- >> >> >> >> >>>>> Shanliang >>>>> >>>>> Jaroslav Bachorik wrote: >>>>> >>>>> >>>>>> On Wed 24 Jul 2013 08:20:56 AM CEST, Mandy Chung wrote: >>>>>> >>>>>> >>>>>> >>>>>>> On 7/24/2013 2:09 PM, Daniel Fuchs wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 7/24/13 8:01 AM, Mandy Chung wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> I am catching up on this thread.... >>>>>>>>> >>>>>>>>> The thread count counts Java threads that are not >>>>>>>>> hidden. I believe all VM internal threads are hidden >>>>>>>>> from external API. This test runs in othervm mode and >>>>>>>>> AFAICT the thread count is expected to be >>>>>>>>> deterministic. I don't expect the VM will start and >>>>>>>>> terminate any thread any time. >>>>>>>>> >>>>>>>>> I agree with David that we should diagnose why there >>>>>>>>> is one additional thread started before the reset. >>>>>>>>> If it is the LogManager$Cleaner thread, like David >>>>>>>>> said, the VM is shutting down while the test is still >>>>>>>>> running which doesn't quite make sense. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> I think that Shanliang's suspicion that a thread might >>>>>>>> be still alive if unscheduled just after having called >>>>>>>> its barrier.signal() is a very good suggestion. I would >>>>>>>> advise calling thread.join() on all threads in >>>>>>>> terminateThreads, just to make sure they are all really >>>>>>>> dead and not in some comatose state... 
If Shanliang is >>>>>>>> right then the test would be failing because some of >>>>>>>> the threads we think are dead are not actually dead yet >>>>>>>> - and not because of some new VM thread that nobody can >>>>>>>> see :-) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> Thanks for pointing that out. >>>>>>> >>>>>>> I agree that the test should be changed to call >>>>>>> Thread.join(). There may be other java.lang.management >>>>>>> tests that should also be fixed to call Thread.join. >>>>>>> >>>>>>> >>>>>>> >>>>>> I've tried using Thread.join() but I am still getting the >>>>>> thread count discrepancy. >>>>>> >>>>>> Specifically: 1. 10 worker threads have been successfully >>>>>> started - mben.getThreadCount() reports 14 and >>>>>> Thread.getAllStackTraces() returns 14 items --- Thread: >>>>>> Thread[Signal Dispatcher,9,system] Thread: >>>>>> Thread[worker-5,5,main] Thread: Thread[worker-7,5,main] >>>>>> Thread: Thread[worker-9,5,main] Thread: >>>>>> Thread[worker-12,5,main] Thread: Thread[worker-11,5,main] >>>>>> Thread: Thread[Reference Handler,10,system] Thread: >>>>>> Thread[main,5,main] Thread: Thread[worker-10,5,main] >>>>>> Thread: Thread[worker-8,5,main] Thread: >>>>>> Thread[Finalizer,8,system] Thread: Thread[worker-6,5,main] >>>>>> Thread: Thread[worker-13,5,main] Thread: >>>>>> Thread[worker-4,5,main] --- 2. Terminating 8 threads 3. >>>>>> After the threads have been terminated (waiting on >>>>>> Thread.join() for them to die) - mbean.getThreadCount() >>>>>> reports 7 while Thread.getAllStackTraces() returns only 6 >>>>>> items --- Thread: Thread[Signal Dispatcher,9,system] >>>>>> Thread: Thread[Finalizer,8,system] Thread: >>>>>> Thread[worker-12,5,main] Thread: Thread[Reference >>>>>> Handler,10,system] Thread: Thread[main,5,main] Thread: >>>>>> Thread[worker-13,5,main] --- >>>>>> >>>>>> This would almost point to mbean.getThreadCount() reporting >>>>>> a stale value. Is that possible? 
>>>>>> >>>>>> -JB- >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> Mandy >>>>>>> >>>>>>> >>>>>>> >>>>>> >> > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.12 (GNU/Linux) > Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ > > iQEcBAEBAgAGBQJR75MOAAoJELSZyqhGiB1M1CgIAKcMQMTZlZqM6qFsI5nCc53Y > fHEFykf4792Qh/TgqDiyNbDCiTgY0TWoChUEJJEQvlho01TpJmKbkyqx5fNoNqjO > l94p073f4GsUSHR4exGmDjJkg87DCzhbhX3bZdwjfsxJHxup8qrXxpz4c5lyBHDH > ttoSasrcDIUh7cRoeqY7uWkIcnc8xI1cj7p3JlPUwB251eKzh15GZgMJhNKrn9N2 > nhjpGywh3t/kwcsDVCibgBBOJ4ju55PRDZTyxH2R6o4fM+Twl80nZSaxUJiPUfEe > yDNFUxMfPcNH+jRAhRlmKRZtfHfYV/nwaj/eqCL8CDtluzVR+lraII81pg7OU+c= > =lqyg > -----END PGP SIGNATURE----- > From mandy.chung at oracle.com Wed Jul 24 02:31:57 2013 From: mandy.chung at oracle.com (Mandy Chung) Date: Wed, 24 Jul 2013 17:31:57 +0800 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF9552.1020901@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> Message-ID: <51EF9F0D.7040709@oracle.com> On 7/24/2013 4:50 PM, shanliang wrote: > So we have 2 kinds of issues here: > 1) the test related, like Thread state checking, we can fix them in > the test > 2) MBean.getThreadCount() issue, we can create a bug to trace it (add > your test case to the bug), and add a workaround (sleep or call 2 > times) in the test to make the test pass. Mandy is the expert and > better to get her opinion. 
It's probably a race in the VM implementation in determining the thread count. You will need to diagnose the VM implementation and compare the thread list and the implementation of getting the thread count (check hotspot/src/share/vm/services/threadService.cpp) Mandy From david.holmes at oracle.com Wed Jul 24 04:21:27 2013 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Jul 2013 21:21:27 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EF9F0D.7040709@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> Message-ID: <51EFB8B7.6030204@oracle.com> On 24/07/2013 7:31 PM, Mandy Chung wrote: > > On 7/24/2013 4:50 PM, shanliang wrote: >> So we have 2 kinds of issues here: >> 1) the test related, like Thread state checking, we can fix them in >> the test >> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >> your test case to the bug), and add a workaround (sleep or call 2 >> times) in the test to make the test pass. Mandy is the expert and >> better to get her opinion. > > It's probably a race in the VM implementation in determining the thread > count. 
You will need to diagnose the VM implementation and compare the > thread list and the implementation of getting the thread count (check > hotspot/src/share/vm/services/threadService.cpp) There is a considerable code path between the point where a terminating thread causes Thread.join() to be allowed to return, and the point where the live thread count gets decremented. So using join() does not help here. Arguably JVMTI should have based its counts around the lifecycle of the Java thread not the underlying native thread. David ----- > Mandy From david.holmes at oracle.com Wed Jul 24 04:58:32 2013 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Jul 2013 21:58:32 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFBFA3.90608@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFBFA3.90608@oracle.com> Message-ID: <51EFC168.6030205@oracle.com> Aside: it is really annoying that jmx-dev mangles the subject such that cross-posts end up creating two different email threads :( On 24/07/2013 9:50 PM, Jaroslav Bachorik wrote: > On 07/24/2013 01:21 PM, David Holmes wrote: >> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>> >>> On 7/24/2013 4:50 PM, shanliang wrote: >>>> So we have 2 kinds of issues here: >>>> 1) the test related, like Thread state checking, we can fix them in >>>> the test >>>> 2) MBean.getThreadCount() issue, we can create a bug to 
trace it (add >>>> your test case to the bug), and add a workaround (sleep or call 2 >>>> times) in the test to make the test pass. Mandy is the expert and >>>> better to get her opinion. >>> >>> It's probably a race in the VM implementation in determining the thread >>> count. You will need to diagnose the VM implementation and compare the >>> thread list and the implementation of getting the thread count (check >>> hotspot/src/share/vm/services/threadService.cpp) >> >> There is a considerable code path between the point where a terminating >> thread causes Thread.join() to be allowed to return, and the point where >> the live thread count gets decremented. So using join() does not help >> here. Arguably JVMTI should have based its counts around the lifecycle >> of the Java thread not the underlying native thread. > > So, if I understand it correctly, it is not possible to get 100% > accuracy of the thread related counters in situations when you create > and terminate a number of threads rapidly. Correct. > In that case this test could be fixed with a small waiting period after > all the joined threads were terminated - just to make sure that all the > exiting threads were properly collected. Yes. > The only question remains whether a bug should be filed for the > discrepancy between the thread counters obtained from ThreadMXBean and > the ones coming from different paths. I'm unclear what the "different paths" are. 
David ----- > -JB- > >> >> David >> ----- >> >>> Mandy > From jaroslav.bachorik at oracle.com Wed Jul 24 05:02:13 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Jul 2013 14:02:13 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFC168.6030205@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFBFA3.90608@oracle.com> <51EFC168.6030205@oracle.com> Message-ID: <51EFC245.7070805@oracle.com> On 07/24/2013 01:58 PM, David Holmes wrote: > Aside: it is really annoying that jmx-dev mangles the subject such that > cross-posts end up creating two different email threads :( > > On 24/07/2013 9:50 PM, Jaroslav Bachorik wrote: >> On 07/24/2013 01:21 PM, David Holmes wrote: >>> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>>> >>>> On 7/24/2013 4:50 PM, shanliang wrote: >>>>> So we have 2 kinds of issues here: >>>>> 1) the test related, like Thread state checking, we can fix them in >>>>> the test >>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>>>> your test case to the bug), and add a workaround (sleep or call 2 >>>>> times) in the test to make the test pass. Mandy is the expert and >>>>> better to get her opinion. >>>> >>>> It's probably a race in the VM implementation in determining the thread >>>> count. 
You will need to diagnose the VM implementation and compare >>>> the >>>> thread list and the implementation of getting the thread count (check >>>> hotspot/src/share/vm/services/threadService.cpp) >>> >>> There is a considerable code path between the point where a terminating >>> thread causes Thread.join() to be allowed to return, and the point where >>> the live thread count gets decremented. So using join() does not help >>> here. Arguably JVMTI should have based its counts around the lifecycle >>> of the Java thread not the underlying native thread. >> >> So, if I understand it correctly, it is not possible to get 100% >> accuracy of the thread related counters in situations when you create >> and terminate a number of threads rapidly. > > Correct. > >> In that case this test could be fixed with a small waiting period after >> all the joined threads were terminated - just to make sure that all the >> exiting threads were properly collected. > > Yes. > >> The only question remains whether a bug should be filed for the >> discrepancy between the thread counters obtained from ThreadMXBean and >> the ones coming from different paths. > > I'm unclear what the "different paths" are. 
Hm, there might be only one "different path" in Java - Thread.dumpStack() and Thread.getAllStackTraces() -JB- > > David > ----- > >> -JB- >> >>> >>> David >>> ----- >>> >>>> Mandy >> From jaroslav.bachorik at oracle.com Wed Jul 24 05:49:34 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Jul 2013 14:49:34 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFC951.70704@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> Message-ID: <51EFCD5E.3090007@oracle.com> On 07/24/2013 02:32 PM, Chris Hegarty wrote: > On 24/07/2013 12:21, David Holmes wrote: >> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>> >>> On 7/24/2013 4:50 PM, shanliang wrote: >>>> So we have 2 kinds of issues here: >>>> 1) the test related, like Thread state checking, we can fix them in >>>> the test >>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>>> your test case to the bug), and add a workaround (sleep or call 2 >>>> times) in the test to make the test pass. Mandy is the expert and >>>> better to get her opinion. >>> >>> It's probably a race in the VM implementation in determining the thread >>> count. 
You will need to diagnose the VM implementation and compare the >>> thread list and the implementation of getting the thread count (check >>> hotspot/src/share/vm/services/threadService.cpp) >> >> There is a considerable code path between the point where a terminating >> thread causes Thread.join() to be allowed to return, and the point where >> the live thread count gets decremented. So using join() does not help >> here. Arguably JVMTI should have based its counts around the lifecycle >> of the Java thread not the underlying native thread. > > It appears, from my reading of the code, that this situation ( a thread > exiting ) should be handled. Or maybe I'm looking at the wrong interface. > > JavaThread::exit(...) { > ... > ThreadService::current_thread_exiting(this); > ... > ensure_join(..) > ... > } > > So the exiting thread should be removed from the live thread count > before Thread.join returns. Unfortunately, ensure_join(...) is called on line 1860 but Threads::remove(this), which does the actual cleanup of the live threads counter, is called only on line 1919, leaving at least a few ns window when the thread is reported as terminated in java but the counters haven't been updated yet. -JB- > > -Chris. 
> >> >> David >> ----- >> >>> Mandy From jaroslav.bachorik at oracle.com Wed Jul 24 07:08:02 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Jul 2013 16:08:02 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFD3F5.3060209@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> Message-ID: <51EFDFC2.20503@oracle.com> On 07/24/2013 03:17 PM, Chris Hegarty wrote: > On 24/07/2013 13:49, Jaroslav Bachorik wrote: >> On 07/24/2013 02:32 PM, Chris Hegarty wrote: >>> On 24/07/2013 12:21, David Holmes wrote: >>>> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>>>> >>>>> On 7/24/2013 4:50 PM, shanliang wrote: >>>>>> So we have 2 kinds of issues here: >>>>>> 1) the test related, like Thread state checking, we can fix them in >>>>>> the test >>>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>>>>> your test case to the bug), and add a workaround (sleep or call 2 >>>>>> times) in the test to make the test pass. Mandy is the expert and >>>>>> better to get her opinion. >>>>> >>>>> It's probably a race in the VM implementation in determining the >>>>> thread >>>>> count. 
You will need to diagnose the VM implementation and compare the >>>>> thread list and the implementation of getting the thread count (check >>>>> hotspot/src/share/vm/services/threadService.cpp) >>>> >>>> There is a considerable code path between the point where a terminating >>>> thread causes Thread.join() to be allowed to return, and the point >>>> where >>>> the live thread count gets decremented. So using join() does not help >>>> here. Arguably JVMTI should have based its counts around the lifecycle >>>> of the Java thread not the underlying native thread. >>> >>> It appears, from my reading of the code, that this situation ( a thread >>> exiting ) should be handled. Or maybe I'm looking at the wrong >>> interface. >>> >>> JavaThread::exit(...) { >>> ... >>> ThreadService::current_thread_exiting(this); >>> ... >>> ensure_join(..) >>> ... >>> } >>> >>> So the exiting thread should be removed from the live thread count >>> before Thread.join returns. >> >> Unfortunately, ensure_join(...) is called on line 1860 but >> Threads::remove(this), which does the actual cleanup of the live threads >> counter, is called only on line 1919, leaving at least a few ns window >> when the thread is reported as terminated in java but the counters >> haven't been updated yet. > > Again, maybe I'm missing something but, > > static jlong get_live_thread_count() { return > _live_threads_count->get_value() - _exiting_threads_count; } > > ... and current_thread_exiting(..) increments _exiting_threads_count, no? Well, apparently it does. I am a complete stranger to the concurrency issues in the hotspot - would it be possible that in ThreadService::remove_thread(..) the _exiting_threads_count is decremented but _live_threads_count hasn't been updated yet when someone calls the get_live_thread_count() function? -JB- > > -Chris. > >> >> -JB- >> >>> >>> -Chris. 
>>> >>>> >>>> David >>>> ----- >>>> >>>>> Mandy >> From david.holmes at oracle.com Wed Jul 24 22:07:01 2013 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Jul 2013 15:07:01 +1000 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFDFC2.20503@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> <51EFDFC2.20503@oracle.com> Message-ID: <51F0B275.4060906@oracle.com> On 25/07/2013 12:08 AM, Jaroslav Bachorik wrote: > On 07/24/2013 03:17 PM, Chris Hegarty wrote: >> On 24/07/2013 13:49, Jaroslav Bachorik wrote: >>> On 07/24/2013 02:32 PM, Chris Hegarty wrote: >>>> On 24/07/2013 12:21, David Holmes wrote: >>>>> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>>>>> >>>>>> On 7/24/2013 4:50 PM, shanliang wrote: >>>>>>> So we have 2 kinds of issues here: >>>>>>> 1) the test related, like Thread state checking, we can fix them in >>>>>>> the test >>>>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>>>>>> your test case to the bug), and add a workaround (sleep or call 2 >>>>>>> times) in the test to make the test pass. Mandy is the expert and >>>>>>> better to get her opinion. >>>>>> >>>>>> It's probably a race in the VM implementation in determining the >>>>>> thread >>>>>> count. 
You will need to diagnose the VM implementation and compare the >>>>>> thread list and the implementation of getting the thread count (check >>>>>> hotspot/src/share/vm/services/threadService.cpp) >>>>> >>>>> There is a considerable code path between the point where a terminating >>>>> thread causes Thread.join() to be allowed to return, and the point >>>>> where >>>>> the live thread count gets decremented. So using join() does not help >>>>> here. Arguably JVMTI should have based its counts around the lifecycle >>>>> of the Java thread not the underlying native thread. >>>> >>>> It appears, from my reading of the code, that this situation ( a thread >>>> exiting ) should be handled. Or maybe I'm looking at the wrong >>>> interface. >>>> >>>> JavaThread::exit(...) { >>>> ... >>>> ThreadService::current_thread_exiting(this); >>>> ... >>>> ensure_join(..) >>>> ... >>>> } >>>> >>>> So the exiting thread should be removed from the live thread count >>>> before Thread.join returns. >>> >>> Unfortunately, ensure_join(...) is called on line 1860 but >>> Threads::remove(this), which does the actual cleanup of the live threads >>> counter, is called only on line 1919, leaving at least a few ns window >>> when the thread is reported as terminated in java but the counters >>> haven't been updated yet. >> >> Again, maybe I'm missing something but, >> >> static jlong get_live_thread_count() { return >> _live_threads_count->get_value() - _exiting_threads_count; } >> >> ... and current_thread_exiting(..) increments _exiting_threads_count, no? > > Well, apparently it does. Yes. Thanks Chris I completely missed the use of the _exiting_threads_count to address this very issue. > I am a complete stranger to the concurrency issues in the hotspot - > would it be possible that in ThreadService::remove_thread(..) the > _exiting_threads_count is decremented but _live_threads_count hasn't > been updated yet when someone calls the get_live_thread_count() function? Yes. 
Updates are guarded by acquiring the Threads_lock, but reads are not. So it is indeed possible to request the live count between the decrement of the exiting count and the decrement of the live count itself. Mind you that is an extremely small window of opportunity in terms of this bug manifesting as often as it does. Because get_live_thread_count returns the sum of two variables it has to use the same synchronization as is used to update those variables to ensure it returns a valid value. We can't grab the Threads_lock directly in get_live_thread_count as it is already called from code that holds the lock. So we would have to push this out to management.cpp's get_long_attribute. David ----- > -JB- > >> >> -Chris. >> >>> >>> -JB- >>> >>>> >>>> -Chris. >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>>> Mandy >>> > From jaroslav.bachorik at oracle.com Thu Jul 25 05:28:02 2013 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 25 Jul 2013 14:28:02 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51F0B275.4060906@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> <51EFDFC2.20503@oracle.com> <51F0B275.4060906@oracle.com> Message-ID: <51F119D2.4080602@oracle.com> On 07/25/2013 07:07 AM, David Holmes wrote: > On 25/07/2013 12:08 AM, Jaroslav Bachorik wrote: >> 
On 07/24/2013 03:17 PM, Chris Hegarty wrote: >>> On 24/07/2013 13:49, Jaroslav Bachorik wrote: >>>> On 07/24/2013 02:32 PM, Chris Hegarty wrote: >>>>> On 24/07/2013 12:21, David Holmes wrote: >>>>>> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>>>>>> >>>>>>> On 7/24/2013 4:50 PM, shanliang wrote: >>>>>>>> So we have 2 kinds of issues here: >>>>>>>> 1) the test related, like Thread state checking, we can fix them in >>>>>>>> the test >>>>>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it >>>>>>>> (add >>>>>>>> your test case to the bug), and add a workaround (sleep or call 2 >>>>>>>> times) in the test to make the test pass. Mandy is the expert and >>>>>>>> better to get her opinion. >>>>>>> >>>>>>> It's probably a race in the VM implementation in determining the >>>>>>> thread >>>>>>> count. You will need to diagnose the VM implementation and >>>>>>> compare the >>>>>>> thread list and the implementation of getting the thread count >>>>>>> (check >>>>>>> hotspot/src/share/vm/services/threadService.cpp) >>>>>> >>>>>> There is a considerable code path between the point where a >>>>>> terminating >>>>>> thread causes Thread.join() to be allowed to return, and the point >>>>>> where >>>>>> the live thread count gets decremented. So using join() does not help >>>>>> here. Arguably JVMTI should have based its counts around the >>>>>> lifecycle >>>>>> of the Java thread not the underlying native thread. >>>>> >>>>> It appears, from my reading of the code, that this situation ( a >>>>> thread >>>>> exiting ) should be handled. Or maybe I'm looking at the wrong >>>>> interface. >>>>> >>>>> JavaThread::exit(...) { >>>>> ... >>>>> ThreadService::current_thread_exiting(this); >>>>> ... >>>>> ensure_join(..) >>>>> ... >>>>> } >>>>> >>>>> So the exiting thread should be removed from the live thread count >>>>> before Thread.join returns. >>>> >>>> Unfortunately, ensure_join(...) 
is called on line 1860 but >>>> Threads::remove(this), which does the actual cleanup of the live >>>> threads >>>> counter, is called only on line 1919, leaving at least a few ns window >>>> when the thread is reported as terminated in java but the counters >>>> haven't been updated yet. >>> >>> Again, maybe I'm missing something but, >>> >>> static jlong get_live_thread_count() { return >>> _live_threads_count->get_value() - _exiting_threads_count; } >>> >>> ... and current_thread_exiting(..) increments >>> _exiting_threads_count, no? >> >> Well, apparently it does. > > Yes. Thanks Chris I completely missed the use of the > _exiting_threads_count to address this very issue. > >> I am a complete stranger to the concurrency issues in the hotspot - >> would it be possible that in ThreadService::remove_thread(..) the >> _exiting_threads_count is decremented but _live_threads_count hasn't >> been updated yet when someone calls the get_live_thread_count() function? > > Yes. Updates are guarded by acquiring the Threads_lock, but reads are > not. So it is indeed possible to request the live count between the > decrement of the exiting count and the decrement of the live count > itself. Mind you that is an extremely small window of opportunity in > terms of this bug manifesting as often as it does. > > Because get_live_thread_count returns the sum of two variables it has to > use the same synchronization as is used to update those variables to > ensure it returns a valid value. We can't grab the Threads_lock directly > in get_live_thread_count as it is already called from code that holds > the lock. So we would have to push this out to management.cpp's > get_long_attribute. I have filed a separate issue for hotspot/svc (JDK-8021335) For the time being I propose modifying the test to be less race-prone in java and adding a timeout of 500ms after terminating a number of threads. 
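A minimal sketch of what such a wait could look like, assuming a bounded poll of the counter instead of a fixed 500ms sleep. This is not the code in the webrev; the class name ThreadCountWait and the helper awaitCount are hypothetical, and the 10ms poll interval is an arbitrary choice. Only ManagementFactory.getThreadMXBean() and ThreadMXBean.getThreadCount() are real JDK API here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.IntSupplier;

public class ThreadCountWait {

    /**
     * Hypothetical helper: polls the counter until it reports the expected
     * value or the timeout elapses, and returns the last value observed.
     */
    public static int awaitCount(IntSupplier counter, int expected, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        int observed = counter.getAsInt();
        while (observed != expected && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(10); // arbitrary poll interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt status
                return observed;
            }
            observed = counter.getAsInt();
        }
        return observed;
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        int before = mbean.getThreadCount();

        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> { });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // join() alone does not guarantee the counter has caught up
        }

        // Give the VM up to 500ms to finish removing the exited threads.
        int after = awaitCount(mbean::getThreadCount, before, 500);
        System.out.println("before=" + before + " after=" + after);
    }
}
```

The poll returns as soon as the counter settles, so the common case costs far less than 500ms. The VM may start or stop its own threads at any time, so the two values printed by main are not guaranteed to match.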
The test modifications are at http://cr.openjdk.java.net/~jbachorik/8020875/webrev.02 Thanks, -JB- > > David > ----- > >> -JB- >> >>> >>> -Chris. >>> >>>> >>>> -JB- >>>> >>>>> >>>>> -Chris. >>>>> >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>> Mandy >>>> >> From daniel.fuchs at oracle.com Thu Jul 25 05:37:47 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Thu, 25 Jul 2013 14:37:47 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51F119D2.4080602@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> <51EFDFC2.20503@oracle.com> <51F0B275.4060906@oracle.com> <51F119D2.4080602@oracle.com> Message-ID: <51F11C1B.5090601@oracle.com> On 7/25/13 2:28 PM, Jaroslav Bachorik wrote: > > For the time being I propose modifying the test to be less race-prone in > java and adding a timeout of 500ms after terminating a number of threads. > > The test modifications are at > http://cr.openjdk.java.net/~jbachorik/8020875/webrev.02 > > Thanks, Hi Jaroslav, This looks good!
-- daniel From daniel.fuchs at oracle.com Thu Jul 25 05:41:49 2013 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Thu, 25 Jul 2013 14:41:49 +0200 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51F11C1B.5090601@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> <51EFDFC2.20503@oracle.com> <51F0B275.4060906@oracle.com> <51F119D2.4080602@oracle.com> <51F11C1B.5090601@oracle.com> Message-ID: <51F11D0D.7060902@oracle.com> BTW - I wonder if you should add 8021335 in the @bug line. -- daniel On 7/25/13 2:37 PM, Daniel Fuchs wrote: > On 7/25/13 2:28 PM, Jaroslav Bachorik wrote: >> >> For the time being I propose modifying the test to be less race-prone in >> java and adding a timeout of 500ms after terminating a number of threads. >> >> The test modifications are at >> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.02 >> >> Thanks, > > Hi Jaroslav, > > This looks good!
> > -- daniel > From chris.hegarty at oracle.com Wed Jul 24 05:32:17 2013 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Wed, 24 Jul 2013 13:32:17 +0100 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFB8B7.6030204@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> Message-ID: <51EFC951.70704@oracle.com> On 24/07/2013 12:21, David Holmes wrote: > On 24/07/2013 7:31 PM, Mandy Chung wrote: >> >> On 7/24/2013 4:50 PM, shanliang wrote: >>> So we have 2 kinds of issues here: >>> 1) the test related, like Thread state checking, we can fix them in >>> the test >>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>> your test case to the bug), and add a workaround (sleep or call 2 >>> times) in the test to make the test pass. Mandy is the expert and >>> better to get her opinion. >> >> It's probably a race in the VM implementation in determining the thread >> count. You will need to diagnose the VM implementation and compare the >> thread list and the implementation of getting the thread count (check >> hotspot/src/share/vm/services/threadService.cpp) > > There is a considerable code path between the point where a terminating > thread causes Thread.join() to be allowed to return, and the point where > the live thread count gets decremented. So using join() does not help > here. 
Arguably JVMTI should have based its counts around the lifecycle > of the Java thread not the underlying native thread. It appears, from my reading of the code, that this situation ( a thread exiting ) should be handled. Or maybe I'm looking at the wrong interface. JavaThread::exit(...) { ... ThreadService::current_thread_exiting(this); ... ensure_join(..) ... } So the exiting thread should be removed from the live thread count before Thread.join returns. -Chris. > > David > ----- > >> Mandy From chris.hegarty at oracle.com Wed Jul 24 06:17:41 2013 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Wed, 24 Jul 2013 14:17:41 +0100 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFCD5E.3090007@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> Message-ID: <51EFD3F5.3060209@oracle.com> On 24/07/2013 13:49, Jaroslav Bachorik wrote: > On 07/24/2013 02:32 PM, Chris Hegarty wrote: >> On 24/07/2013 12:21, David Holmes wrote: >>> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>>> >>>> On 7/24/2013 4:50 PM, shanliang wrote: >>>>> So we have 2 kinds of issues here: >>>>> 1) the test related, like Thread state checking, we can fix them in >>>>> the test >>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>>>> your test case to the bug), and add a workaround (sleep or call 2 >>>>> times) 
in the test to make the test pass. Mandy is the expert and >>>>> better to get her opinion. >>>> >>>> It's probably a race in the VM implementation in determining the thread >>>> count. You will need to diagnose the VM implementation and compare the >>>> thread list and the implementation of getting the thread count (check >>>> hotspot/src/share/vm/services/threadService.cpp) >>> >>> There is a considerable code path between the point where a terminating >>> thread causes Thread.join() to be allowed to return, and the point where >>> the live thread count gets decremented. So using join() does not help >>> here. Arguably JVMTI should have based its counts around the lifecycle >>> of the Java thread not the underlying native thread. >> >> It appears, from my reading of the code, that this situation ( a thread >> exiting ) should be handled. Or maybe I'm looking at the wrong interface. >> >> JavaThread::exit(...) { >> ... >> ThreadService::current_thread_exiting(this); >> ... >> ensure_join(..) >> ... >> } >> >> So the exiting thread should be removed from the live thread count >> before Thread.join returns. > > Unfortunately, ensure_join(...) is called on line 1860 but > Threads::remove(this), which does the actual cleanup of the live threads > counter, is called only on line 1919, leaving at least a few ns window > when the thread is reported as terminated in java but the counters > haven't been updated yet. Again, maybe I'm missing something but, static jlong get_live_thread_count() { return _live_threads_count->get_value() - _exiting_threads_count; } ... and current_thread_exiting(..) increments _exiting_threads_count, no? -Chris. > > -JB- > >> >> -Chris. 
>> >>> >>> David >>> ----- >>> >>>> Mandy > From chris.hegarty at oracle.com Wed Jul 24 07:20:22 2013 From: chris.hegarty at oracle.com (Chris Hegarty) Date: Wed, 24 Jul 2013 15:20:22 +0100 Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently In-Reply-To: <51EFDFC2.20503@oracle.com> References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> <51EFDFC2.20503@oracle.com> Message-ID: <51EFE2A6.8010103@oracle.com> On 24/07/2013 15:08, Jaroslav Bachorik wrote: > On 07/24/2013 03:17 PM, Chris Hegarty wrote: >> On 24/07/2013 13:49, Jaroslav Bachorik wrote: >>> On 07/24/2013 02:32 PM, Chris Hegarty wrote: >>>> On 24/07/2013 12:21, David Holmes wrote: >>>>> On 24/07/2013 7:31 PM, Mandy Chung wrote: >>>>>> >>>>>> On 7/24/2013 4:50 PM, shanliang wrote: >>>>>>> So we have 2 kinds of issues here: >>>>>>> 1) the test related, like Thread state checking, we can fix them in >>>>>>> the test >>>>>>> 2) MBean.getThreadCount() issue, we can create a bug to trace it (add >>>>>>> your test case to the bug), and add a workaround (sleep or call 2 >>>>>>> times) in the test to make the test pass. Mandy is the expert and >>>>>>> better to get her opinion. >>>>>> >>>>>> It's probably a race in the VM implementation in determining the >>>>>> thread >>>>>> count. 
You will need to diagnose the VM implementation and compare the
>>>>>> thread list and the implementation of getting the thread count (check
>>>>>> hotspot/src/share/vm/services/threadService.cpp)
>>>>>
>>>>> There is a considerable code path between the point where a terminating
>>>>> thread causes Thread.join() to be allowed to return, and the point
>>>>> where
>>>>> the live thread count gets decremented. So using join() does not help
>>>>> here. Arguably JVMTI should have based its counts around the lifecycle
>>>>> of the Java thread not the underlying native thread.
>>>>
>>>> It appears, from my reading of the code, that this situation ( a thread
>>>> exiting ) should be handled. Or maybe I'm looking at the wrong
>>>> interface.
>>>>
>>>> JavaThread::exit(...) {
>>>> ...
>>>> ThreadService::current_thread_exiting(this);
>>>> ...
>>>> ensure_join(..)
>>>> ...
>>>> }
>>>>
>>>> So the exiting thread should be removed from the live thread count
>>>> before Thread.join returns.
>>>
>>> Unfortunately, ensure_join(...) is called on line 1860 but
>>> Threads::remove(this), which does the actual cleanup of the live threads
>>> counter, is called only on line 1919, leaving at least a few ns window
>>> when the thread is reported as terminated in java but the counters
>>> haven't been updated yet.
>>
>> Again, maybe I'm missing something but,
>>
>> static jlong get_live_thread_count() { return
>> _live_threads_count->get_value() - _exiting_threads_count; }
>>
>> ... and current_thread_exiting(..) increments _exiting_threads_count, no?
>
> Well, apparently it does.
>
> I am a complete stranger to the concurrency issues in the hotspot -
> would it be possible that in ThreadService::remove_thread(..) the
> _exiting_threads_count is decremented but _live_threads_count hasn't
> been updated yet when someone calls the get_live_thread_count() function?
I am not familiar with the intricate workings of this code, but as a
casual observer I would say that this must be a bug in the VM. It
appears that the original authors did take into account exiting threads,
and went to some lengths to provide accurate diagnostic information. If
this is not producing the correct results, then I can only imagine there
is a bug here.

To your specific question, then yes this would appear possible. I am not
sure what synchronization, if any, protects this code.

-Chris.

>
> -JB-
>
>>
>> -Chris.
>>
>>>
>>> -JB-
>>>
>>>>
>>>> -Chris.
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>> Mandy
>>>
>

From chris.hegarty at oracle.com Thu Jul 25 05:53:08 2013
From: chris.hegarty at oracle.com (Chris Hegarty)
Date: Thu, 25 Jul 2013 13:53:08 +0100
Subject: jmx-dev RFR: 8020875 java/lang/management/ThreadMXBean/ResetPeakThreadCount.java fails intermittently
In-Reply-To: <51F119D2.4080602@oracle.com>
References: <51ED1DBE.3030304@oracle.com> <51EE3C9B.3050604@oracle.com> <51EE3EE2.1000202@oracle.com> <51EE4A91.3000305@oracle.com> <51EE4BD6.7040707@oracle.com> <51EE50B4.8040000@oracle.com> <51EE528F.2050302@oracle.com> <51EE52B9.6070506@oracle.com> <51EF6DD6.5060806@oracle.com> <51EF6FA1.9000103@oracle.com> <51EF7248.2070405@oracle.com> <51EF7888.40100@oracle.com> <51EF8091.9030603@oracle.com> <51EF83A6.1040200@oracle.com> <51EF926A.3060705@oracle.com> <51EF930E.4050507@oracle.com> <51EF9552.1020901@oracle.com> <51EF9F0D.7040709@oracle.com> <51EFB8B7.6030204@oracle.com> <51EFC951.70704@oracle.com> <51EFCD5E.3090007@oracle.com> <51EFD3F5.3060209@oracle.com> <51EFDFC2.20503@oracle.com> <51F0B275.4060906@oracle.com> <51F119D2.4080602@oracle.com>
Message-ID: <51F11FB4.7070200@oracle.com>

On 07/25/2013 01:28 PM, Jaroslav Bachorik wrote:
> ......
>
> I have filed a separate issue for hotspot/svc (JDK-8021335)

Yes, this is probably a separate, and more involved, issue.
> For the time being I propose modifying the test to be less race-prone in
> java and adding a timeout of 500ms after terminating a number of threads.

Sounds reasonable.

> The test modifications are at
> http://cr.openjdk.java.net/~jbachorik/8020875/webrev.02

Looks fine. Trivially, testFailed should be volatile ( if you still need
it ). I don't like that MyThread is not interruptible, but being run in
othervm this might not be such an issue.

-Chris.

>
> Thanks,
>
> -JB-
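
For readers following the thread, here is a minimal sketch of the kind of test-side workaround discussed above: after joining the worker threads, wait a bounded time (500ms, as proposed) for the reported live thread count to settle, and keep the failure flag volatile so a write from another thread is visible. This is not the code in webrev.02. The class name ThreadCountPoll and the helper awaitValue are hypothetical, and polling (rather than a single fixed sleep) is an assumption made for the illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.function.IntSupplier;

// Hypothetical helper for illustration only; not taken from the webrev.
final class ThreadCountPoll {
    // volatile: a failure recorded on one thread must be visible to the others.
    static volatile boolean testFailed = false;

    // Polls 'current' until it reports 'expected' or 'timeoutMs' elapses.
    // Returns true if the expected value was observed, false on timeout
    // or if the calling thread was interrupted while waiting.
    static boolean awaitValue(IntSupplier current, int expected, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (current.getAsInt() == expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mbean = ManagementFactory.getThreadMXBean();
        int before = mbean.getThreadCount();

        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> { /* no work; exits immediately */ });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join(); // join() returning does not guarantee the counter is updated yet
        }

        // Bounded wait for the live thread count to return to its earlier value.
        if (!awaitValue(mbean::getThreadCount, before, 500)) {
            testFailed = true;
        }
        System.out.println("testFailed = " + testFailed);
    }
}
```

The bounded wait only papers over the window between Thread.join() returning and the counter update; the underlying behaviour is what the separate hotspot/svc issue (JDK-8021335) is meant to track.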