From staffan.larsen at oracle.com Tue Aug 12 11:17:19 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Tue, 12 Aug 2014 13:17:19 +0200
Subject: jmx-dev RFR 8052961: Test "com/sun/tools/attach/StartManagementAgent.java" failing intermittently
In-Reply-To: <53D7A607.5030500@oracle.com>
References: <53D7A607.5030500@oracle.com>
Message-ID: <56D299BC-6757-4518-8B07-0EAA7B64D159@oracle.com>

Looks good!

Thanks,
/Staffan

On 29 jul 2014, at 15:47, Jaroslav Bachorik wrote:

> Please, review this (hopefully last) change to StartManagementAgent test
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8052961
> Webrev: http://cr.openjdk.java.net/~jbachorik/8052961/webrev.00
>
> The test needs a properly registered service for the ServiceLoader - but it fails to compile the provider class explicitly leading to intermittent failures. The fix is to add the "SimpleProvider" to "@run build ..." line.
>
> Thanks,
>
> -JB-

From jaroslav.bachorik at oracle.com Thu Aug 21 12:44:48 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 21 Aug 2014 14:44:48 +0200
Subject: jmx-dev RFR 8040692: [TESTBUG] sun/management/jmxremote/bootstrap/JvmstatCountersTest.java requires -XX:+UsePerfData option to pass on embedded platforms
Message-ID: <53F5E9C0.6050906@oracle.com>

Please, review this simple fix.

Issue : https://bugs.openjdk.java.net/browse/JDK-8040692
Webrev: http://cr.openjdk.java.net/~jbachorik/8040692/webrev.00

On embedded platforms it is necessary to provide "-XX:+UsePerfData" flag in order to make the performance counters accessible. This fix does this for the tests which need to access the performance counters.
Thanks,

-JB-

From staffan.larsen at oracle.com Thu Aug 21 12:52:18 2014
From: staffan.larsen at oracle.com (Staffan Larsen)
Date: Thu, 21 Aug 2014 14:52:18 +0200
Subject: jmx-dev RFR 8040692: [TESTBUG] sun/management/jmxremote/bootstrap/JvmstatCountersTest.java requires -XX:+UsePerfData option to pass on embedded platforms
In-Reply-To: <53F5E9C0.6050906@oracle.com>
References: <53F5E9C0.6050906@oracle.com>
Message-ID: <62C456B6-2045-4B2F-9DDD-462C3DCD8D24@oracle.com>

Looks good, except I don't think you wanted to add the test/sources.list file?

/Staffan

On 21 aug 2014, at 14:44, Jaroslav Bachorik wrote:

> Please, review this simple fix.
>
> Issue : https://bugs.openjdk.java.net/browse/JDK-8040692
> Webrev: http://cr.openjdk.java.net/~jbachorik/8040692/webrev.00
>
> On embedded platforms it is necessary to provide "-XX:+UsePerfData" flag in order to make the performance counters accessible. This fix does this for the tests which need to access the performance counters.
>
> Thanks,
>
> -JB-

From jaroslav.bachorik at oracle.com Thu Aug 21 12:58:21 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 21 Aug 2014 14:58:21 +0200
Subject: jmx-dev RFR 8040692: [TESTBUG] sun/management/jmxremote/bootstrap/JvmstatCountersTest.java requires -XX:+UsePerfData option to pass on embedded platforms
In-Reply-To: <62C456B6-2045-4B2F-9DDD-462C3DCD8D24@oracle.com>
References: <53F5E9C0.6050906@oracle.com> <62C456B6-2045-4B2F-9DDD-462C3DCD8D24@oracle.com>
Message-ID: <53F5ECED.2000005@oracle.com>

On 08/21/2014 02:52 PM, Staffan Larsen wrote:
> Looks good, except I don't think you wanted to add the test/sources.list file?

Nope. I've just realized that my review script includes all the applied MQ patches instead of only the last one.

-JB-

>
> /Staffan
>
> On 21 aug 2014, at 14:44, Jaroslav Bachorik wrote:
>
>> Please, review this simple fix.
>>
>> Issue : https://bugs.openjdk.java.net/browse/JDK-8040692
>> Webrev: http://cr.openjdk.java.net/~jbachorik/8040692/webrev.00
>>
>> On embedded platforms it is necessary to provide "-XX:+UsePerfData" flag in order to make the performance counters accessible. This fix does this for the tests which need to access the performance counters.
>>
>> Thanks,
>>
>> -JB-
>

From jaroslav.bachorik at oracle.com Thu Aug 21 13:20:37 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 21 Aug 2014 15:20:37 +0200
Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22
Message-ID: <53F5F225.3050105@oracle.com>

Please, review the following test change.

Issue : https://bugs.openjdk.java.net/browse/JDK-7132590
Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00

Currently, the test waits for an arbitrary time until it gives up on receiving the notifications. This leads to intermittent failures in situations when the execution is slower than anticipated (running against a debug build etc.).

The solution is to block the test until all the expected notification had been delivered or the test is timed out by the harness.

Thanks,

-JB-

From jaroslav.bachorik at oracle.com Thu Aug 21 13:21:36 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 21 Aug 2014 15:21:36 +0200
Subject: jmx-dev RFR 8040692: [TESTBUG] sun/management/jmxremote/bootstrap/JvmstatCountersTest.java requires -XX:+UsePerfData option to pass on embedded platforms
In-Reply-To: <62C456B6-2045-4B2F-9DDD-462C3DCD8D24@oracle.com>
References: <53F5E9C0.6050906@oracle.com> <62C456B6-2045-4B2F-9DDD-462C3DCD8D24@oracle.com>
Message-ID: <53F5F260.2030000@oracle.com>

On 08/21/2014 02:52 PM, Staffan Larsen wrote:
> Looks good, except I don't think you wanted to add the test/sources.list file?

Fixed the webrev - http://cr.openjdk.java.net/~jbachorik/8040692/webrev.01 - not to include test/sources.list file.
Thanks for the review. -JB- > > /Staffan > > On 21 aug 2014, at 14:44, Jaroslav Bachorik wrote: > >> Please, review this simple fix. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-8040692 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8040692/webrev.00 >> >> On embedded platforms it is necessary to provide "-XX:+UsePerfData" flag in order to make the performance counters accessible. This fix does this for the tests which need to access the performance counters. >> >> Thanks, >> >> -JB- > From shanliang.jiang at oracle.com Thu Aug 21 13:55:11 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 21 Aug 2014 15:55:11 +0200 Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22 In-Reply-To: <53F5F225.3050105@oracle.com> References: <53F5F225.3050105@oracle.com> Message-ID: <53F5FA3F.4040202@oracle.com> Jaroslav, The fix should be good to fix the failure. It makes me think a special case, suppose that the test waits 2 notifications, but the test might receive one unexpected notification with some more waiting, for example, with the old version, 2 expected notifications arrive within the first second, and the unexpected arrives in the second second, but with your fix the test might end before the unexpected notification arrives. Not sure that we should take care of this case. Thanks, Shanliang Jaroslav Bachorik wrote: > Please, review the following test change. > > Issue : https://bugs.openjdk.java.net/browse/JDK-7132590 > Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00 > > Currently, the test waits for an arbitrary time until it gives up on > receiving the notifications. This leads to intermittent failures in > situations when the execution is slower than anticipated (running > against a debug build etc.). > > The solution is to block the test until all the expected notification > had been delivered or the test is timed out by the harness. 
> > Thanks, > > -JB- From jaroslav.bachorik at oracle.com Thu Aug 21 15:13:21 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 21 Aug 2014 17:13:21 +0200 Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22 In-Reply-To: <53F5FA3F.4040202@oracle.com> References: <53F5F225.3050105@oracle.com> <53F5FA3F.4040202@oracle.com> Message-ID: <53F60C91.7060109@oracle.com> On 08/21/2014 03:55 PM, shanliang wrote: > Jaroslav, > > The fix should be good to fix the failure. > > It makes me think a special case, suppose that the test waits 2 > notifications, but the test might receive one unexpected notification > with some more waiting, for example, with the old version, 2 expected > notifications arrive within the first second, and the unexpected arrives > in the second second, but with your fix the test might end before the > unexpected notification arrives. Hm, you mean providing a proof that extraneous notifications are not emitted. I'm not really sure you can create such a proof for the existing implementation - even if everything is fine within a certain time window it does not imply that in the next n seconds an unexpected notification wouldn't be delivered. > > Not sure that we should take care of this case. Probably not in this test. This test just asserts that all the expected notifications have been emitted. -JB- > > Thanks, > Shanliang > > Jaroslav Bachorik wrote: >> Please, review the following test change. >> >> Issue : https://bugs.openjdk.java.net/browse/JDK-7132590 >> Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00 >> >> Currently, the test waits for an arbitrary time until it gives up on >> receiving the notifications. This leads to intermittent failures in >> situations when the execution is slower than anticipated (running >> against a debug build etc.). 
>> >> The solution is to block the test until all the expected notification >> had been delivered or the test is timed out by the harness. >> >> Thanks, >> >> -JB- > From shanliang.jiang at oracle.com Thu Aug 21 15:34:51 2014 From: shanliang.jiang at oracle.com (shanliang) Date: Thu, 21 Aug 2014 17:34:51 +0200 Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22 In-Reply-To: <53F60C91.7060109@oracle.com> References: <53F5F225.3050105@oracle.com> <53F5FA3F.4040202@oracle.com> <53F60C91.7060109@oracle.com> Message-ID: <53F6119B.8070006@oracle.com> Jaroslav Bachorik wrote: > On 08/21/2014 03:55 PM, shanliang wrote: >> Jaroslav, >> >> The fix should be good to fix the failure. >> >> It makes me think a special case, suppose that the test waits 2 >> notifications, but the test might receive one unexpected notification >> with some more waiting, for example, with the old version, 2 expected >> notifications arrive within the first second, and the unexpected arrives >> in the second second, but with your fix the test might end before the >> unexpected notification arrives. > > Hm, you mean providing a proof that extraneous notifications are not > emitted. I'm not really sure you can create such a proof for the > existing implementation - even if everything is fine within a certain > time window it does not imply that in the next n seconds an unexpected > notification wouldn't be delivered. Indeed, it is very difficult to make sure no unexpected notification, but the old version could by chance to get an unexpected because it waited always certain time. > >> >> Not sure that we should take care of this case. > > Probably not in this test. This test just asserts that all the > expected notifications have been emitted. No objection. This is a general issue for many other notification tests too. 
Shanliang > > -JB- > >> >> Thanks, >> Shanliang >> >> Jaroslav Bachorik wrote: >>> Please, review the following test change. >>> >>> Issue : https://bugs.openjdk.java.net/browse/JDK-7132590 >>> Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00 >>> >>> Currently, the test waits for an arbitrary time until it gives up on >>> receiving the notifications. This leads to intermittent failures in >>> situations when the execution is slower than anticipated (running >>> against a debug build etc.). >>> >>> The solution is to block the test until all the expected notification >>> had been delivered or the test is timed out by the harness. >>> >>> Thanks, >>> >>> -JB- >> > From david.holmes at oracle.com Tue Aug 26 04:03:41 2014 From: david.holmes at oracle.com (David Holmes) Date: Tue, 26 Aug 2014 14:03:41 +1000 Subject: jmx-dev RFR 8040692: [TESTBUG] sun/management/jmxremote/bootstrap/JvmstatCountersTest.java requires -XX:+UsePerfData option to pass on embedded platforms In-Reply-To: <53F5E9C0.6050906@oracle.com> References: <53F5E9C0.6050906@oracle.com> Message-ID: <53FC071D.20100@oracle.com> On 21/08/2014 10:44 PM, Jaroslav Bachorik wrote: > Please, review this simple fix. > > Issue : https://bugs.openjdk.java.net/browse/JDK-8040692 > Webrev: http://cr.openjdk.java.net/~jbachorik/8040692/webrev.00 > > On embedded platforms it is necessary to provide "-XX:+UsePerfData" flag > in order to make the performance counters accessible. This fix does this > for the tests which need to access the performance counters. Looks fine to me - except for the sources.list file ?? 
:) Thanks, David > Thanks, > > -JB- From jaroslav.bachorik at oracle.com Tue Aug 26 08:44:01 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 26 Aug 2014 10:44:01 +0200 Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22 In-Reply-To: <53F6119B.8070006@oracle.com> References: <53F5F225.3050105@oracle.com> <53F5FA3F.4040202@oracle.com> <53F60C91.7060109@oracle.com> <53F6119B.8070006@oracle.com> Message-ID: <53FC48D1.6040403@oracle.com> On 08/21/2014 05:34 PM, shanliang wrote: > Jaroslav Bachorik wrote: >> On 08/21/2014 03:55 PM, shanliang wrote: >>> Jaroslav, >>> >>> The fix should be good to fix the failure. >>> >>> It makes me think a special case, suppose that the test waits 2 >>> notifications, but the test might receive one unexpected notification >>> with some more waiting, for example, with the old version, 2 expected >>> notifications arrive within the first second, and the unexpected arrives >>> in the second second, but with your fix the test might end before the >>> unexpected notification arrives. >> >> Hm, you mean providing a proof that extraneous notifications are not >> emitted. I'm not really sure you can create such a proof for the >> existing implementation - even if everything is fine within a certain >> time window it does not imply that in the next n seconds an unexpected >> notification wouldn't be delivered. > Indeed, it is very difficult to make sure no unexpected notification, > but the old version could by chance to get an unexpected because it > waited always certain time. IMO, getting something right by chance is even worse than simply stating that the test won't test for such eventuality. >> >>> >>> Not sure that we should take care of this case. >> >> Probably not in this test. This test just asserts that all the >> expected notifications have been emitted. > No objection. This is a general issue for many other notification tests > too. 
In general it is impossible to test for extraneous notifications since there are no events cleanly defining the time boundaries for receiving a particular notification. The result is, that even though an extraneous notification hasn't been received in n seconds we can't be certain that one wouldn't arrive in n+m (m is a real number and m > 0) seconds.

Could I have a (R)eviewer to take a look at this patch, please?

-JB-

>
> Shanliang
>>
>> -JB-
>>
>>>
>>> Thanks,
>>> Shanliang
>>>
>>> Jaroslav Bachorik wrote:
>>>> Please, review the following test change.
>>>>
>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-7132590
>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00
>>>>
>>>> Currently, the test waits for an arbitrary time until it gives up on
>>>> receiving the notifications. This leads to intermittent failures in
>>>> situations when the execution is slower than anticipated (running
>>>> against a debug build etc.).
>>>>
>>>> The solution is to block the test until all the expected notification
>>>> had been delivered or the test is timed out by the harness.
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>
>>
>

From daniel.fuchs at oracle.com Tue Aug 26 10:08:01 2014
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Tue, 26 Aug 2014 12:08:01 +0200
Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22
In-Reply-To: <53FC48D1.6040403@oracle.com>
References: <53F5F225.3050105@oracle.com> <53F5FA3F.4040202@oracle.com> <53F60C91.7060109@oracle.com> <53F6119B.8070006@oracle.com> <53FC48D1.6040403@oracle.com>
Message-ID: <53FC5C81.3020903@oracle.com>

Hi Jaroslav,

line 143, notifs should be final, and should be of a type
that supports concurrent access - something like
CopyOnWriteArrayList or Collections.synchronizedList().

Otherwise looks good!
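
[Illustration] The two ideas discussed in this thread (block until every expected notification has arrived, and keep the received notifications in a final, concurrency-safe list) could be sketched roughly as below. This is only a sketch under stated assumptions, not the code from the webrev: the class names `NotificationLatchSketch` and `CollectingListener`, the notification type string, and the use of `CountDownLatch` are all hypothetical choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import javax.management.Notification;
import javax.management.NotificationListener;

// Hypothetical names throughout; this is not the test from the webrev.
final class NotificationLatchSketch {

    static final class CollectingListener implements NotificationListener {
        // final and safe for concurrent access: a connector thread may add
        // entries while the test thread reads them.
        private final List<Notification> notifs = new CopyOnWriteArrayList<>();
        private final CountDownLatch remaining;

        CollectingListener(int expected) {
            this.remaining = new CountDownLatch(expected);
        }

        @Override
        public void handleNotification(Notification n, Object handback) {
            notifs.add(n);          // record first, then release the waiter
            remaining.countDown();  // no-op once the count has reached zero
        }

        // Blocks until the expected number of notifications has arrived.
        // There is deliberately no timeout here; a hang is left to the
        // test harness timeout.
        List<Notification> awaitAll() throws InterruptedException {
            remaining.await();
            return new ArrayList<>(notifs); // snapshot for the caller
        }
    }

    public static void main(String[] args) throws Exception {
        CollectingListener listener = new CollectingListener(2);
        // Stand-in for the thread that would deliver JMX notifications.
        Thread emitter = new Thread(() -> {
            for (long seq = 1; seq <= 2; seq++) {
                listener.handleNotification(
                        new Notification("sketch.type", "sketch-source", seq), null);
            }
        });
        emitter.start();
        List<Notification> received = listener.awaitAll();
        emitter.join();
        System.out.println("received " + received.size() + " notifications");
    }
}
```

As noted in the discussion, a listener of this shape only asserts that the expected notifications were delivered; it makes no claim about extraneous notifications that might arrive later.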
best regards, -- daniel On 8/26/14 10:44 AM, Jaroslav Bachorik wrote: > On 08/21/2014 05:34 PM, shanliang wrote: >> Jaroslav Bachorik wrote: >>> On 08/21/2014 03:55 PM, shanliang wrote: >>>> Jaroslav, >>>> >>>> The fix should be good to fix the failure. >>>> >>>> It makes me think a special case, suppose that the test waits 2 >>>> notifications, but the test might receive one unexpected notification >>>> with some more waiting, for example, with the old version, 2 expected >>>> notifications arrive within the first second, and the unexpected >>>> arrives >>>> in the second second, but with your fix the test might end before the >>>> unexpected notification arrives. >>> >>> Hm, you mean providing a proof that extraneous notifications are not >>> emitted. I'm not really sure you can create such a proof for the >>> existing implementation - even if everything is fine within a certain >>> time window it does not imply that in the next n seconds an unexpected >>> notification wouldn't be delivered. >> Indeed, it is very difficult to make sure no unexpected notification, >> but the old version could by chance to get an unexpected because it >> waited always certain time. > > IMO, getting something right by chance is even worse than simply stating > that the test won't test for such eventuality. > >>> >>>> >>>> Not sure that we should take care of this case. >>> >>> Probably not in this test. This test just asserts that all the >>> expected notifications have been emitted. >> No objection. This is a general issue for many other notification tests >> too. > > In general it is impossible to test for extraneous notifications since > there are no events cleanly defining the time boundaries for receiving a > particular notification. The result is, that even though an extraneous > notification hasn't been received in n seconds we can't be certain that > one wouldn't arrive in n+m (m is a real number and m > 0) seconds. 
>
>
> Could I have a (R)eviewer to take a look at this patch, please?
>
> -JB-
>
>>
>> Shanliang
>>>
>>> -JB-
>>>
>>>>
>>>> Thanks,
>>>> Shanliang
>>>>
>>>> Jaroslav Bachorik wrote:
>>>>> Please, review the following test change.
>>>>>
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-7132590
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00
>>>>>
>>>>> Currently, the test waits for an arbitrary time until it gives up on
>>>>> receiving the notifications. This leads to intermittent failures in
>>>>> situations when the execution is slower than anticipated (running
>>>>> against a debug build etc.).
>>>>>
>>>>> The solution is to block the test until all the expected notification
>>>>> had been delivered or the test is timed out by the harness.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>>
>>>
>>
>

From poonam.bajaj at oracle.com Tue Aug 26 10:27:13 2014
From: poonam.bajaj at oracle.com (Poonam Bajaj)
Date: Tue, 26 Aug 2014 15:57:13 +0530
Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty
In-Reply-To: <53FBDB1A.2070501@oracle.com>
References: <53FBDB1A.2070501@oracle.com>
Message-ID: <53FC6101.4020501@oracle.com>

Sending the review request to serviceability-dev list as well...
-----------

Could I have reviews for this change:

Bug: https://bugs.openjdk.java.net/browse/JDK-8049303
Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/

Problem and fix:
By default the JMX client side notification fetch timeout (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default server connection timeout (jmx.remote.x.server.connection.timeout) is 2 minutes.

If the client side connector thread makes a notification fetch request to the server, but a transient network problem prevents the server response from reaching the client, the client side connector will wait for a response until the timeout period (1 minute) has expired before throwing an IOException.
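
[Illustration] For readers who want to see where the two timeouts named above would be set, here is a minimal sketch. It assumes the connector reads both properties from the environment map handed to it (for example via `JMXConnectorFactory.connect(url, env)` on the client); that assumption is not confirmed by this message. The class `JmxTimeoutSketch`, its helper methods, and the `FailureFlag` listener are hypothetical names used only for the example, and the notification fed to the listener is synthetic so that the code runs without a network.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.remote.JMXConnectionNotification;

// Hypothetical sketch; not part of the change under review.
final class JmxTimeoutSketch {
    // Property names as quoted in the message above.
    static final String FETCH_TIMEOUT = "jmx.remote.x.notification.fetch.timeout";
    static final String SERVER_TIMEOUT = "jmx.remote.x.server.connection.timeout";

    // Assumption: the client connector picks this value up from its env map.
    static Map<String, Object> clientEnv(long fetchTimeoutMillis) {
        Map<String, Object> env = new HashMap<>();
        env.put(FETCH_TIMEOUT, fetchTimeoutMillis);
        return env;
    }

    // Assumption: the connector server picks this value up from its env map.
    static Map<String, Object> serverEnv(long connectionTimeoutMillis) {
        Map<String, Object> env = new HashMap<>();
        env.put(SERVER_TIMEOUT, connectionTimeoutMillis);
        return env;
    }

    // A client would register a listener like this with
    // JMXConnector.addConnectionNotificationListener(listener, null, null)
    // to learn that the connection has failed.
    static final class FailureFlag implements NotificationListener {
        private final AtomicBoolean failed = new AtomicBoolean();

        @Override
        public void handleNotification(Notification n, Object handback) {
            if (JMXConnectionNotification.FAILED.equals(n.getType())) {
                failed.set(true);
            }
        }

        boolean hasFailed() {
            return failed.get();
        }
    }

    public static void main(String[] args) {
        // The defaults stated in the message: 1 minute and 2 minutes.
        Map<String, Object> client = clientEnv(60_000L);
        Map<String, Object> server = serverEnv(120_000L);

        FailureFlag flag = new FailureFlag();
        // Synthetic notification, only to exercise the listener locally.
        flag.handleNotification(new JMXConnectionNotification(
                JMXConnectionNotification.FAILED, "sketch-source",
                "sketch-connection-id", 1L, "simulated failure", null), null);

        System.out.println(client + " " + server + " failed=" + flag.hasFailed());
    }
}
```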
The client side RMIConnector implementation handles the IOException, by re-checking the connection status to understand whether or not it is broken. If the connection is not available at that moment, the connector fails by re-throwing the initial IOException. The problem is that this re-check of the connection passes because the server side of the connection doesn't time out until 2 minutes has passed (by default), so the NotifFetcher thread dies without posting a failed notification, and the client application does not get a chance to recover.

The fix is to forward the exception on the JMX client side before checking the connection status.

Testing:
All the jdk_jmx and jdk_management regression tests passed.

The fix applies cleanly to 8u and 7u repos.

Thanks,
Poonam

From jaroslav.bachorik at oracle.com Tue Aug 26 10:36:05 2014
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 26 Aug 2014 12:36:05 +0200
Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty
In-Reply-To: <53FC6101.4020501@oracle.com>
References: <53FBDB1A.2070501@oracle.com> <53FC6101.4020501@oracle.com>
Message-ID: <53FC6315.8080700@oracle.com>

Hi Poonam,

On 08/26/2014 12:27 PM, Poonam Bajaj wrote:
> Sending the review request to serviceability-dev list as well...
> -----------
>
> Could I have reviews for this change:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8049303
> Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/

L1499-1504 can be completely removed. They serve no purpose now.

Please, adjust the indentation to fit the original one.

Thanks,

-JB-

>
> Problem and fix:
> By default the JMX client side notification fetch timeout
> (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default
> server connection timeout (jmx.remote.x.server.connection.timeout) is 2
> minutes.
> > If the client side connector thread makes a notification fetch request > to the server, but a transient network problem prevents the server > response from reaching the client, the client side connector will wait > for a response until the timeout period (1 minute) has expired before > throwing an IOException. > > The client side RMIConnector implementation handles the IOException, by > re-checking the connection status to understand whether or not it is > broken. If the connection is not available at that moment, the connector > fails by re-throwing the initial IOException. The problem is that this > re-check of the connection passes because the server side of the > connection doesn't time out until 2 minutes has passed (by default), so > the NotifFetcher thread > dies without posting a failed notification, and the client application > does not get a chance to recover. > > The fix is to forward the exception on the JMX client side before > checking the connection status. > > Testing: > All the jdk_jmx and jdk_management regression tests passed. > > The fix applies cleanly to 8u and 7u repos. > > > Thanks, > Poonam > > From jaroslav.bachorik at oracle.com Tue Aug 26 10:41:16 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 26 Aug 2014 12:41:16 +0200 Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22 In-Reply-To: <53FC5C81.3020903@oracle.com> References: <53F5F225.3050105@oracle.com> <53F5FA3F.4040202@oracle.com> <53F60C91.7060109@oracle.com> <53F6119B.8070006@oracle.com> <53FC48D1.6040403@oracle.com> <53FC5C81.3020903@oracle.com> Message-ID: <53FC644C.4050008@oracle.com> On 08/26/2014 12:08 PM, Daniel Fuchs wrote: > Hi Jaroslav, > > line 143, notifs should be final, and should be of a type > that supports concurrent access - something like > CopyOnWriteArrayList or Collections.synchronizedList(). Taken care of. 
I also took the liberty to get rid off few compiler warnings. http://cr.openjdk.java.net/~jbachorik/7132590/webrev.01 -JB- > > Otherwise looks good! > > best regards, > > -- daniel > > On 8/26/14 10:44 AM, Jaroslav Bachorik wrote: >> On 08/21/2014 05:34 PM, shanliang wrote: >>> Jaroslav Bachorik wrote: >>>> On 08/21/2014 03:55 PM, shanliang wrote: >>>>> Jaroslav, >>>>> >>>>> The fix should be good to fix the failure. >>>>> >>>>> It makes me think a special case, suppose that the test waits 2 >>>>> notifications, but the test might receive one unexpected notification >>>>> with some more waiting, for example, with the old version, 2 expected >>>>> notifications arrive within the first second, and the unexpected >>>>> arrives >>>>> in the second second, but with your fix the test might end before the >>>>> unexpected notification arrives. >>>> >>>> Hm, you mean providing a proof that extraneous notifications are not >>>> emitted. I'm not really sure you can create such a proof for the >>>> existing implementation - even if everything is fine within a certain >>>> time window it does not imply that in the next n seconds an unexpected >>>> notification wouldn't be delivered. >>> Indeed, it is very difficult to make sure no unexpected notification, >>> but the old version could by chance to get an unexpected because it >>> waited always certain time. >> >> IMO, getting something right by chance is even worse than simply stating >> that the test won't test for such eventuality. >> >>>> >>>>> >>>>> Not sure that we should take care of this case. >>>> >>>> Probably not in this test. This test just asserts that all the >>>> expected notifications have been emitted. >>> No objection. This is a general issue for many other notification tests >>> too. >> >> In general it is impossible to test for extraneous notifications since >> there are no events cleanly defining the time boundaries for receiving a >> particular notification. 
The result is, that even though an extraneous >> notification hasn't been received in n seconds we can't be certain that >> one wouldn't arrive in n+m (m is a real number and m > 0) seconds. >> >> >> Could I have a (R)eviewer to take a look at this patch, please? >> >> -JB- >> >>> >>> Shanliang >>>> >>>> -JB- >>>> >>>>> >>>>> Thanks, >>>>> Shanliang >>>>> >>>>> Jaroslav Bachorik wrote: >>>>>> Please, review the following test change. >>>>>> >>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-7132590 >>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00 >>>>>> >>>>>> Currently, the test waits for an arbitrary time until it gives up on >>>>>> receiving the notifications. This leads to intermittent failures in >>>>>> situations when the execution is slower than anticipated (running >>>>>> against a debug build etc.). >>>>>> >>>>>> The solution is to block the test until all the expected notification >>>>>> had been delivered or the test is timed out by the harness. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -JB- >>>>> >>>> >>> >> > From daniel.fuchs at oracle.com Tue Aug 26 10:48:12 2014 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Tue, 26 Aug 2014 12:48:12 +0200 Subject: jmx-dev RFR 7132590: javax/management/remote/mandatory/notif/NotificationAccessControllerTest.java fails in JDK8-B22 In-Reply-To: <53FC644C.4050008@oracle.com> References: <53F5F225.3050105@oracle.com> <53F5FA3F.4040202@oracle.com> <53F60C91.7060109@oracle.com> <53F6119B.8070006@oracle.com> <53FC48D1.6040403@oracle.com> <53FC5C81.3020903@oracle.com> <53FC644C.4050008@oracle.com> Message-ID: <53FC65EC.2080709@oracle.com> On 8/26/14 12:41 PM, Jaroslav Bachorik wrote: > On 08/26/2014 12:08 PM, Daniel Fuchs wrote: >> Hi Jaroslav, >> >> line 143, notifs should be final, and should be of a type >> that supports concurrent access - something like >> CopyOnWriteArrayList or Collections.synchronizedList(). > > Taken care of. 
I also took the liberty to get rid off few compiler > warnings. > > http://cr.openjdk.java.net/~jbachorik/7132590/webrev.01 Looks good! -- daniel > > -JB- > >> >> Otherwise looks good! >> >> best regards, >> >> -- daniel >> >> On 8/26/14 10:44 AM, Jaroslav Bachorik wrote: >>> On 08/21/2014 05:34 PM, shanliang wrote: >>>> Jaroslav Bachorik wrote: >>>>> On 08/21/2014 03:55 PM, shanliang wrote: >>>>>> Jaroslav, >>>>>> >>>>>> The fix should be good to fix the failure. >>>>>> >>>>>> It makes me think a special case, suppose that the test waits 2 >>>>>> notifications, but the test might receive one unexpected notification >>>>>> with some more waiting, for example, with the old version, 2 expected >>>>>> notifications arrive within the first second, and the unexpected >>>>>> arrives >>>>>> in the second second, but with your fix the test might end before the >>>>>> unexpected notification arrives. >>>>> >>>>> Hm, you mean providing a proof that extraneous notifications are not >>>>> emitted. I'm not really sure you can create such a proof for the >>>>> existing implementation - even if everything is fine within a certain >>>>> time window it does not imply that in the next n seconds an unexpected >>>>> notification wouldn't be delivered. >>>> Indeed, it is very difficult to make sure no unexpected notification, >>>> but the old version could by chance to get an unexpected because it >>>> waited always certain time. >>> >>> IMO, getting something right by chance is even worse than simply stating >>> that the test won't test for such eventuality. >>> >>>>> >>>>>> >>>>>> Not sure that we should take care of this case. >>>>> >>>>> Probably not in this test. This test just asserts that all the >>>>> expected notifications have been emitted. >>>> No objection. This is a general issue for many other notification tests >>>> too. 
>>> >>> In general it is impossible to test for extraneous notifications since >>> there are no events cleanly defining the time boundaries for receiving a >>> particular notification. The result is, that even though an extraneous >>> notification hasn't been received in n seconds we can't be certain that >>> one wouldn't arrive in n+m (m is a real number and m > 0) seconds. >>> >>> >>> Could I have a (R)eviewer to take a look at this patch, please? >>> >>> -JB- >>> >>>> >>>> Shanliang >>>>> >>>>> -JB- >>>>> >>>>>> >>>>>> Thanks, >>>>>> Shanliang >>>>>> >>>>>> Jaroslav Bachorik wrote: >>>>>>> Please, review the following test change. >>>>>>> >>>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-7132590 >>>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/7132590/webrev.00 >>>>>>> >>>>>>> Currently, the test waits for an arbitrary time until it gives up on >>>>>>> receiving the notifications. This leads to intermittent failures in >>>>>>> situations when the execution is slower than anticipated (running >>>>>>> against a debug build etc.). >>>>>>> >>>>>>> The solution is to block the test until all the expected >>>>>>> notification >>>>>>> had been delivered or the test is timed out by the harness. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -JB- >>>>>> >>>>> >>>> >>> >> > From poonam.bajaj at oracle.com Tue Aug 26 11:20:21 2014 From: poonam.bajaj at oracle.com (Poonam Bajaj) Date: Tue, 26 Aug 2014 16:50:21 +0530 Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty In-Reply-To: <53FC6315.8080700@oracle.com> References: <53FBDB1A.2070501@oracle.com> <53FC6101.4020501@oracle.com> <53FC6315.8080700@oracle.com> Message-ID: <53FC6D75.90702@oracle.com> Hi Jaroslav, On 8/26/2014 4:06 PM, Jaroslav Bachorik wrote: > Hi Poonam, > > On 08/26/2014 12:27 PM, Poonam Bajaj wrote: >> Sending the review request to serviceability-dev list as well... 
>> ----------- >> >> Could I have reviews for this change: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8049303 >> Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/ > > L1499-1504 can be completely removed. They serve no purpose now. > Removed this piece of code. > Please, adjust the indentation to fit the original one. Corrected the indentation. Updated webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.01/ Thanks, Poonam > > Thanks, > > -JB- > >> >> Problem and fix: >> By default the JMX client side notification fetch timeout >> (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default >> server connection timeout (jmx.remote.x.server.connection.timeout) is 2 >> minutes. >> >> If the client side connector thread makes a notification fetch request >> to the server, but a transient network problem prevents the server >> response from reaching the client, the client side connector will wait >> for a response until the timeout period (1 minute) has expired before >> throwing an IOException. >> >> The client side RMIConnector implementation handles the IOException, by >> re-checking the connection status to understand whether or not it is >> broken. If the connection is not available at that moment, the connector >> fails by re-throwing the initial IOException. The problem is that this >> re-check of the connection passes because the server side of the >> connection doesn't time out until 2 minutes has passed (by default), so >> the NotifFetcher thread >> dies without posting a failed notification, and the client application >> does not get a chance to recover. >> >> The fix is to forward the exception on the JMX client side before >> checking the connection status. >> >> Testing: >> All the jdk_jmx and jdk_management regression tests passed. >> >> The fix applies cleanly to 8u and 7u repos. 
>> >> >> Thanks, >> Poonam >> >> > From jaroslav.bachorik at oracle.com Tue Aug 26 11:41:49 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 26 Aug 2014 13:41:49 +0200 Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty In-Reply-To: <53FC6D75.90702@oracle.com> References: <53FBDB1A.2070501@oracle.com> <53FC6101.4020501@oracle.com> <53FC6315.8080700@oracle.com> <53FC6D75.90702@oracle.com> Message-ID: <53FC727D.6070405@oracle.com> On 08/26/2014 01:20 PM, Poonam Bajaj wrote: > Hi Jaroslav, > > On 8/26/2014 4:06 PM, Jaroslav Bachorik wrote: >> Hi Poonam, >> >> On 08/26/2014 12:27 PM, Poonam Bajaj wrote: >>> Sending the review request to serviceability-dev list as well... >>> ----------- >>> >>> Could I have reviews for this change: >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8049303 >>> Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/ >> >> L1499-1504 can be completely removed. They serve no purpose now. >> > Removed this piece of code. > >> Please, adjust the indentation to fit the original one. > > Corrected the indentation. > > Updated webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.01/ Looks good! -JB- > > Thanks, > Poonam > >> >> Thanks, >> >> -JB- >> >>> >>> Problem and fix: >>> By default the JMX client side notification fetch timeout >>> (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default >>> server connection timeout (jmx.remote.x.server.connection.timeout) is 2 >>> minutes. >>> >>> If the client side connector thread makes a notification fetch request >>> to the server, but a transient network problem prevents the server >>> response from reaching the client, the client side connector will wait >>> for a response until the timeout period (1 minute) has expired before >>> throwing an IOException. 
>>> >>> The client side RMIConnector implementation handles the IOException, by >>> re-checking the connection status to understand whether or not it is >>> broken. If the connection is not available at that moment, the connector >>> fails by re-throwing the initial IOException. The problem is that this >>> re-check of the connection passes because the server side of the >>> connection doesn't time out until 2 minutes has passed (by default), so >>> the NotifFetcher thread >>> dies without posting a failed notification, and the client application >>> does not get a chance to recover. >>> >>> The fix is to forward the exception on the JMX client side before >>> checking the connection status. >>> >>> Testing: >>> All the jdk_jmx and jdk_management regression tests passed. >>> >>> The fix applies cleanly to 8u and 7u repos. >>> >>> >>> Thanks, >>> Poonam >>> >>> >> From poonam.bajaj at oracle.com Tue Aug 26 00:55:54 2014 From: poonam.bajaj at oracle.com (Poonam Bajaj) Date: Tue, 26 Aug 2014 06:25:54 +0530 Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty Message-ID: <53FBDB1A.2070501@oracle.com> Could I have reviews for this change: Bug: https://bugs.openjdk.java.net/browse/JDK-8049303 Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/ Problem and fix: By default the JMX client side notification fetch timeout (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default server connection timeout (jmx.remote.x.server.connection.timeout) is 2 minutes. If the client side connector thread makes a notification fetch request to the server, but a transient network problem prevents the server response from reaching the client, the client side connector will wait for a response until the timeout period (1 minute) has expired before throwing an IOException. The client side RMIConnector implementation handles the IOException, by re-checking the connection status to understand whether or not it is broken. 
If the connection is not available at that moment, the connector fails by re-throwing the initial IOException. The problem is that this re-check of the connection passes because the server side of the connection doesn't time out until 2 minutes has passed (by default), so the NotifFetcher thread dies without posting a failed notification, and the client application does not get a chance to recover. The fix is to forward the exception on the JMX client side before checking the connection status. Testing: All the jdk_jmx and jdk_management regression tests passed. The fix applies cleanly to 8u and 7u repos. Thanks, Poonam From jaroslav.bachorik at oracle.com Thu Aug 28 15:57:43 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 28 Aug 2014 17:57:43 +0200 Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty In-Reply-To: <53FBDB1A.2070501@oracle.com> References: <53FBDB1A.2070501@oracle.com> Message-ID: <53FF5177.7000700@oracle.com> I have taken over this issue from Poonam since she will be unavailable for the next month or so. Could I have reviews for this change: Bug: https://bugs.openjdk.java.net/browse/JDK-8049303 Webrev: http://cr.openjdk.java.net/~jbachorik/8049303/webrev.00 Problem and fix: By default the JMX client side notification fetch timeout (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default server connection timeout (jmx.remote.x.server.connection.timeout) is 2 minutes. If the client side connector thread makes a notification fetch request to the server, but a transient network problem prevents the server response from reaching the client, the client side connector will wait for a response until the timeout period (1 minute) has expired before throwing an IOException. The client side RMIConnector implementation handles the IOException, by re-checking the connection status to understand whether or not it is broken. 
If the connection is not available at that moment, the connector fails by re-throwing the initial IOException. The problem is that this re-check of the connection passes because the server side of the connection doesn't time out until 2 minutes has passed (by default), so the NotifFetcher thread dies without posting a failed notification, and the client application does not get a chance to recover. The fix is to forward the non-connection-related exceptions on the JMX client side instead of checking the connection status. The connection-related exceptions will close the session, as an unsuccessful connection check would have done. Testing: All the jdk_jmx and jdk_management regression tests passed. All the related JCK tests passed. The fix applies cleanly to 8u and 7u repos. Thanks, -JB- From daniel.fuchs at oracle.com Fri Aug 29 09:25:51 2014 From: daniel.fuchs at oracle.com (Daniel Fuchs) Date: Fri, 29 Aug 2014 11:25:51 +0200 Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty In-Reply-To: <53FF5177.7000700@oracle.com> References: <53FBDB1A.2070501@oracle.com> <53FF5177.7000700@oracle.com> Message-ID: <5400471F.3050802@oracle.com> Hi Jaroslav, I am not sure I understand how this solves the problem. The old code first checked the connection, and if that failed, sent the FAILED notification, closed the connector, and rethrew the exception. The new code directly throws the exception without checking the connection, and therefore without closing the connection and sending the FAILED notification. So is the fix a change of behavior by which the RMIConnector will - in some cases - not try to autoclose the connection but instead simply wait for the caller to explicitly call close()? I'd be interested to hear what Shanliang has to say... best regards, -- daniel On 8/28/14 5:57 PM, Jaroslav Bachorik wrote: > I have taken over this issue from Poonam since she will be unavailable > for the next month or so.
> > Could I have reviews for this change: > > Bug: https://bugs.openjdk.java.net/browse/JDK-8049303 > Webrev: http://cr.openjdk.java.net/~jbachorik/8049303/webrev.00 > > Problem and fix: > By default the JMX client side notification fetch timeout > (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default > server connection timeout (jmx.remote.x.server.connection.timeout) is 2 > minutes. > > If the client side connector thread makes a notification fetch request > to the server, but a transient network problem prevents the server > response from reaching the client, the client side connector will wait > for a response until the timeout period (1 minute) has expired before > throwing an IOException. > > The client side RMIConnector implementation handles the IOException, by > re-checking the connection status to understand whether or not it is > broken. If the connection is not available at that moment, the connector > fails by re-throwing the initial IOException. The problem is that this > re-check of the connection passes because the server side of the > connection doesn't time out until 2 minutes has passed (by default), so > the NotifFetcher thread > dies without posting a failed notification, and the client application > does not get a chance to recover. > > The fix is to forward the non connection-related exceptions on the JMX > client side instead of checking the connection status. The > connection-related exceptions will cause closing the session as an > unsuccessful connection check would have done. > > Testing: > All the jdk_jmx and jdk_management regression tests passed. > All the related JCK tests passed. > > The fix applies cleanly to 8u and 7u repos. 
> > > Thanks, > -JB- > > From jaroslav.bachorik at oracle.com Fri Aug 29 09:41:30 2014 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 29 Aug 2014 11:41:30 +0200 Subject: jmx-dev Review request: 8049303: Transient network problems cause JMX thread to fail silenty In-Reply-To: <5400471F.3050802@oracle.com> References: <53FBDB1A.2070501@oracle.com> <53FF5177.7000700@oracle.com> <5400471F.3050802@oracle.com> Message-ID: <54004ACA.4040802@oracle.com> On 08/29/2014 11:25 AM, Daniel Fuchs wrote: > Hi Jaroslav, > > I am not sure I understand how this solves the problem. > The old code first checked the connection, and if that failed, > sent the FAILED notification, closed the connector, and rethrew > the exception. This problem seems to have something to do with the way RMI works - the customer had problems with one set of ties/stubs while the other set of ties/stubs worked just fine. It seems that in cases of transient network failures the connection check was not reliable. > > The new code directly throws the exception without > checking the connection, and therefore without closing > the connection and sending the FAILED notification. It only does so for the cases where the connection itself is not the culprit - an error while executing the method on the server, marshalling problems, etc. > > So is the fix a change of behavior by which the RMIConnector > will - in some cases - not try to autoclose the connection but > instead simply wait for the caller to explicitly call close()? Not really - the change is to rely on RMI to report whether the connection is still usable or not. The code didn't autoclose the connection when "connection.getDefaultDomain(null)" didn't throw IOException either. > > I'd be interested to hear what Shanliang has to say... Yep. The code does a lot of things at once, and without any spec for handling failures and recovery we can only rely on the tests.
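To make the distinction in the reply above concrete, here is a rough sketch of splitting fetch failures into "connection-related" and "not connection-related" by looking at the java.rmi exception type. This is NOT the logic from the webrev. The class FetchFailureClassifier and the exact split are assumptions made for illustration; in particular, treating every other IOException (including a plain timeout) as connection-related is my guess at the intent.

```java
import java.io.IOException;
import java.rmi.MarshalException;
import java.rmi.ServerError;
import java.rmi.ServerException;
import java.rmi.UnmarshalException;

// Hypothetical classifier for illustration only; not the code under review.
class FetchFailureClassifier {

    // Assumption: marshalling problems and failures raised while the call was
    // executing on the server say nothing about the health of the connection.
    public static boolean isConnectionRelated(IOException e) {
        boolean serverSideOrMarshalling =
                e instanceof MarshalException
                || e instanceof UnmarshalException
                || e instanceof ServerException
                || e instanceof ServerError;
        // Everything else (connect failures, timeouts, ...) is treated as a
        // sign that the connection itself may be unusable.
        return !serverSideOrMarshalling;
    }
}
```

With such a split, a caller would forward the exception unchanged when isConnectionRelated returns false, and otherwise take the "connection failed" path (post the FAILED notification and close), as described in the review request.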
-JB- > > best regards, > > -- daniel > > > On 8/28/14 5:57 PM, Jaroslav Bachorik wrote: >> I have taken over this issue from Poonam since she will be unavailable >> for the next month or so. >> >> Could I have reviews for this change: >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8049303 >> Webrev: http://cr.openjdk.java.net/~jbachorik/8049303/webrev.00 >> >> Problem and fix: >> By default the JMX client side notification fetch timeout >> (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default >> server connection timeout (jmx.remote.x.server.connection.timeout) is 2 >> minutes. >> >> If the client side connector thread makes a notification fetch request >> to the server, but a transient network problem prevents the server >> response from reaching the client, the client side connector will wait >> for a response until the timeout period (1 minute) has expired before >> throwing an IOException. >> >> The client side RMIConnector implementation handles the IOException, by >> re-checking the connection status to understand whether or not it is >> broken. If the connection is not available at that moment, the connector >> fails by re-throwing the initial IOException. The problem is that this >> re-check of the connection passes because the server side of the >> connection doesn't time out until 2 minutes has passed (by default), so >> the NotifFetcher thread >> dies without posting a failed notification, and the client application >> does not get a chance to recover. >> >> The fix is to forward the non connection-related exceptions on the JMX >> client side instead of checking the connection status. The >> connection-related exceptions will cause closing the session as an >> unsuccessful connection check would have done. >> >> Testing: >> All the jdk_jmx and jdk_management regression tests passed. >> All the related JCK tests passed. >> >> The fix applies cleanly to 8u and 7u repos. >> >> >> Thanks, >> -JB- >> >> >
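As a side note for client authors following this thread: the client application can only recover if it actually sees the FAILED connection notification. A minimal sketch of a listener that watches for it, plus a client environment that sets the fetch timeout named in the review request, might look like the following. The class ConnectionFailureWatcher is made up for this example, and I am assuming the timeout value is given in milliseconds.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.remote.JMXConnectionNotification;

// Hypothetical client-side helper for illustration only.
class ConnectionFailureWatcher implements NotificationListener {
    private final AtomicInteger failures = new AtomicInteger();

    @Override
    public void handleNotification(Notification notification, Object handback) {
        // Count only "connection failed" notifications; ignore opened/closed etc.
        if (JMXConnectionNotification.FAILED.equals(notification.getType())) {
            failures.incrementAndGet();
        }
    }

    public int failureCount() {
        return failures.get();
    }

    // Builds a client environment map with the notification fetch timeout
    // (property name taken from the review request; milliseconds assumed).
    public static Map<String, Object> clientEnv(long fetchTimeoutMillis) {
        Map<String, Object> env = new HashMap<>();
        env.put("jmx.remote.x.notification.fetch.timeout", fetchTimeoutMillis);
        return env;
    }
}
```

In a real client the map would be passed to JMXConnectorFactory.connect(url, env) and the watcher registered with JMXConnector.addConnectionNotificationListener(watcher, null, null). The failure scenario described in this thread is the one where that listener is never called.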