From jaroslav.bachorik at oracle.com Tue Oct 2 01:33:13 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Tue, 02 Oct 2012 10:33:13 +0200 Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently In-Reply-To: <505C46EB.60800@oracle.com> References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> Message-ID: <506AA6C9.30909@oracle.com> On Fri 21 Sep 2012 12:52:27 PM CEST, Alan Bateman wrote: > On 20/09/2012 17:02, Eamonn McManus wrote: >> Changing the generated RMI/IIOP code >> so that it no longer causes this exception, or so that it catches it >> and rethrows a RemoteException, sounds as if it ought to be fairly >> straightforward, and that's probably what I would do if it were up to >> me. > I think this is what I would do to, even though it means going into > the corba repository as that is there the stub generator is. I should > say that I don't violently object to Jaroslav's patch, it's just that > it is an ugly workaround. The generated TIE class is inherently thread-unsafe. The internal state (the target field) can be manipulated without any enforced synchronization - eg. it is valid to set the target field to null by calling the deactivate() method after the _invoke() method has been entered from a different thread. This will lead to the NPE we can observe. Given this example one should make critical sections out of deactivate() and _invoke() methods to prevent this situation. However, this simplistic approach might lead to deadlocks in the existing code as the _invoke() method body might be blocking (it is a 3rd party code) and thus preventing execution of the deactivate() method indefinitely. Also, it is not really possible to solve this problem outside of the generated TIE class - it is caused by the concurrent change of the TIE's internal state. 
So, the solution would be either caching the target attribute at the beginning of the invoke() operation in a synchronized block and use the cached version afterwards (and throwing a remote exception if it is null - the TIE was deactivated effectively before entering the invoke() operation) or postponing deactivation when the invoke() method is detected as being in progress. -JB- > > >> Disabling this test for the IIOP case, and probably other failing >> JMX tests that involve IIOP, is an option if it is judged that nobody >> uses the RMI/IIOP connector any more so it is all right to let it rot. >> That judgement is a non-technical one that I don't have an informed >> opinion on. >> > I don't know if the rmi-iiop connector was ever used much but since it > seems to be required by the JMX Remote API spec then I think we should > continue to test it. As I think I mentioned in another mail recently, > I think we have to look at making this transport optional as it's > painful to have the CORBA tie/stub classes in > javax.management.remote.rmi. I don't what to hijack Jaroslav's thread > to discuss that, that's a topic for another thread. > > -Alan > From Alan.Bateman at oracle.com Tue Oct 2 06:38:13 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Tue, 02 Oct 2012 14:38:13 +0100 Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently In-Reply-To: <506AA6C9.30909@oracle.com> References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com> Message-ID: <506AEE45.7020403@oracle.com> On 02/10/2012 09:33, Jaroslav Bachorik wrote: > : > The generated TIE class is inherently thread-unsafe. The internal state > (the target field) can be manipulated without any enforced > synchronization - eg. 
it is valid to set the target field to null by
> calling the deactivate() method after the _invoke() method has been
> entered from a different thread. This will lead to the NPE we can
> observe. Given this example one should make critical sections out of
> deactivate() and _invoke() methods to prevent this situation. However,
> this simplistic approach might lead to deadlocks in the existing code
> as the _invoke() method body might be blocking (it is a 3rd party code)
> and thus preventing execution of the deactivate() method indefinitely.
>
> Also, it is not really possible to solve this problem outside of the
> generated TIE class - it is caused by the concurrent change of the
> TIE's internal state. So, the solution would be either caching the
> target attribute at the beginning of the invoke() operation in a
> synchronized block and use the cached version afterwards (and throwing
> a remote exception if it is null - the TIE was deactivated effectively
> before entering the invoke() operation) or postponing deactivation when
> the invoke() method is detected as being in progress.
>
> -JB-
>

Jaroslav and I chatted on IM about this today. Jaroslav is going to have a go at changing the stub generator and will send a follow-up mail with an updated webrev (this time for the corba repo, as that is where the stub generator lives).
-Alan From jaroslav.bachorik at oracle.com Thu Oct 4 08:28:05 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 04 Oct 2012 17:28:05 +0200 Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently In-Reply-To: <506AEE45.7020403@oracle.com> References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com> <506AEE45.7020403@oracle.com> Message-ID: <506DAB05.4050500@oracle.com> On Tue 02 Oct 2012 03:38:13 PM CEST, Alan Bateman wrote: > On 02/10/2012 09:33, Jaroslav Bachorik wrote: >> : >> The generated TIE class is inherently thread-unsafe. The internal state >> (the target field) can be manipulated without any enforced >> synchronization - eg. it is valid to set the target field to null by >> calling the deactivate() method after the _invoke() method has been >> entered from a different thread. This will lead to the NPE we can >> observe. Given this example one should make critical sections out of >> deactivate() and _invoke() methods to prevent this situation. However, >> this simplistic approach might lead to deadlocks in the existing code >> as the _invoke() method body might be blocking (it is a 3rd party code) >> and thus preventing execution of the deactivate() method indefinitely. >> >> Also, it is not really possible to solve this problem outside of the >> generated TIE class - it is caused by the concurrent change of the >> TIE's internal state. So, the solution would be either caching the >> target attribute at the beginning of the invoke() operation in a >> synchronized block and use the cached version afterwards (and throwing >> a remote exception if it is null - the TIE was deactivated effectively >> before entering the invoke() operation) or postponing deactivation when >> the invoke() method is detected as being in progress. 
>> >> -JB- >> > Jaroslav and I chatted on IM about this today. Jaroslav is going to > have a go at changing the stub generator and will send a follow-up > mail with an updated webrev (this time for the corba repo as the that > is where the stub generator lives). This is a follow-up. I've prepared the patch and put it on github - https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779 I wonder who else should be included in the review process since I am changing the IIOP generator code. Also, I didn't find any tests in the corba repository. Which test suite is appropriate to run after changing the corba related code? -JB- > > -Alan From Alan.Bateman at oracle.com Thu Oct 4 08:42:15 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Thu, 04 Oct 2012 16:42:15 +0100 Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently In-Reply-To: <506DAB05.4050500@oracle.com> References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com> <506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com> Message-ID: <506DAE57.3020300@oracle.com> On 04/10/2012 16:28, Jaroslav Bachorik wrote: > : > This is a follow-up. I've prepared the patch and put it on github - > https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779 > > I wonder who else should be included in the review process since I am > changing the IIOP generator code. Also, I didn't find any tests in the > corba repository. Which test suite is appropriate to run after changing > the corba related code? > > -JB- > I don't mind being reviewer and sponsor for this. Also cc'ing Sean as he is one of the maintainers of the corba code. I don't think the corba tests are in OpenJDK, at least I don't think Oracle has contributed its tests for this area. 
I think your change looks okay and I assume you've at least run the JMX tests that use RMI-IIOP to verify that the intermittent NPE is gone and those tests now pass reliably. Minor comment but if I were doing this myself then I probably would have added this instead: p.pln(getName(theType) + " target = this.target;"); You'll see lots of examples of this in the core libs and j.u.c. Also as target is now volatile then I'm not sure why you synchronized around target=null, perhaps there is other code generated in the tie class that I don't see? Otherwise it's great to get issue finally resolved. -Alan. From jaroslav.bachorik at oracle.com Thu Oct 4 08:56:10 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 04 Oct 2012 17:56:10 +0200 Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently In-Reply-To: <506DAE57.3020300@oracle.com> References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com> <506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com> <506DAE57.3020300@oracle.com> Message-ID: <506DB19A.7070604@oracle.com> On Thu 04 Oct 2012 05:42:15 PM CEST, Alan Bateman wrote: > On 04/10/2012 16:28, Jaroslav Bachorik wrote: >> : >> This is a follow-up. I've prepared the patch and put it on github - >> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779 >> >> I wonder who else should be included in the review process since I am >> changing the IIOP generator code. Also, I didn't find any tests in the >> corba repository. Which test suite is appropriate to run after changing >> the corba related code? >> >> -JB- >> > I don't mind being reviewer and sponsor for this. Also cc'ing Sean as > he is one of the maintainers of the corba code. I don't think the Thanks. 
> corba tests are in OpenJDK, at least I don't think Oracle has
> contributed its tests for this area.
>
> I think your change looks okay and I assume you've at least run the
> JMX tests that use RMI-IIOP to verify that the intermittent NPE is
> gone and those tests now pass reliably.

Yes, I ran those, especially on the test machine that previously yielded the biggest ratio of failed tests. Now they all pass.

>
> Minor comment but if I were doing this myself then I probably would
> have added this instead:
> p.pln(getName(theType) + " target = this.target;");
>
> You'll see lots of examples of this in the core libs and j.u.c.

No problem. If it is a convention I will stick to it (I used a different variable name to prevent confusion about what is a field and what is a local variable).

>
> Also as target is now volatile then I'm not sure why you synchronized
> around target=null, perhaps there is other code generated in the tie
> class that I don't see?

The synchronization should not be there. It just escaped my purging when I exchanged the synchronized access for volatile.

-JB-

>
> Otherwise it's great to get issue finally resolved.
>
> -Alan.
>
>

From jaroslav.bachorik at oracle.com Fri Oct 5 01:10:31 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 05 Oct 2012 10:10:31 +0200
Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently
In-Reply-To: <506DB19A.7070604@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com> <506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com> <506DAE57.3020300@oracle.com> <506DB19A.7070604@oracle.com>
Message-ID: <506E95F7.7000304@oracle.com>

I have updated the patch to reflect Alan's remarks. The webrev is at the same location - github takes care of versioning ...
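
For readers following the thread, here is a minimal sketch of the pattern discussed above. It assumes a hypothetical `Tie` class in place of the generated tie; the names `TieSketch`, `Tie`, `Target` and `handle` are illustrative only and are not taken from the patch. The volatile `target` field is read once into a local of the same name, so a concurrent `deactivate()` cannot slip in between the null check and the call.

```java
import java.rmi.RemoteException;

public class TieSketch {
    /** Hypothetical stand-in for the remote object a tie delegates to. */
    public interface Target {
        String handle(String request);
    }

    /** Hypothetical stand-in for a generated tie class (assumption, not the real generated code). */
    public static final class Tie {
        // volatile: the write in deactivate() becomes visible to invoking threads
        private volatile Target target;

        public Tie(Target target) {
            this.target = target;
        }

        public void deactivate() {
            // A single volatile write; no synchronized block is needed around it
            target = null;
        }

        public String invoke(String request) throws RemoteException {
            // Read the field exactly once; the local deliberately shadows the field
            Target target = this.target;
            if (target == null) {
                // The tie was deactivated before this invocation started
                throw new RemoteException("tie has been deactivated");
            }
            // From here on only the local is used, so a concurrent deactivate() cannot cause an NPE
            return target.handle(request);
        }
    }

    public static void main(String[] args) {
        Tie tie = new Tie(request -> "echo:" + request);
        try {
            System.out.println(tie.invoke("ping"));
            tie.deactivate();
            tie.invoke("ping");
        } catch (RemoteException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

An invocation that has already taken its local copy simply completes against the old target, which is why this approach avoids the deadlock risk of making `deactivate()` and `_invoke()` mutually exclusive.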
-JB- On 10/04/2012 05:56 PM, Jaroslav Bachorik wrote: > On Thu 04 Oct 2012 05:42:15 PM CEST, Alan Bateman wrote: >> On 04/10/2012 16:28, Jaroslav Bachorik wrote: >>> : >>> This is a follow-up. I've prepared the patch and put it on github - >>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779 >>> >>> I wonder who else should be included in the review process since I am >>> changing the IIOP generator code. Also, I didn't find any tests in the >>> corba repository. Which test suite is appropriate to run after changing >>> the corba related code? >>> >>> -JB- >>> >> I don't mind being reviewer and sponsor for this. Also cc'ing Sean as >> he is one of the maintainers of the corba code. I don't think the > > Thanks. > >> corba tests are in OpenJDK, at least I don't think Oracle has >> contributed its tests for this area. >> >> I think your change looks okay and I assume you've at least run the >> JMX tests that use RMI-IIOP to verify that the intermittent NPE is >> gone and those tests now pass reliably. > > Yes, I ran those. Especially on the test machine yielding the biggest > ration of failed tests previously. Now they all pass. > >> >> Minor comment but if I were doing this myself then I probably would >> have added this instead: >> p.pln(getName(theType) + " target = this.target;"); >> >> You'll see lots of examples of this in the core libs and j.u.c. > > No problem. If it is a convention I will stick to it (I've used a > different variable name to prevent confusion about what is a field and > what is a local variable). > >> >> Also as target is now volatile then I'm not sure why you synchronized >> around target=null, perhaps there is other code generated in the tie >> class that I don't see? > > The synchronization should not be there. It just escaped my purging > when I've exchanged the synchronized access for volatile. > > -JB- > >> >> Otherwise it's great to get issue finally resolved. >> >> -Alan. 
>> >> > > From Alan.Bateman at oracle.com Fri Oct 5 01:59:51 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Fri, 05 Oct 2012 09:59:51 +0100 Subject: jmx-dev Review Request: 7195779 javax/management/remote/mandatory/threads/ExecutorTest.java fail intermittently In-Reply-To: <506E95F7.7000304@oracle.com> References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com> <5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com> <505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com> <506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com> <506DAE57.3020300@oracle.com> <506DB19A.7070604@oracle.com> <506E95F7.7000304@oracle.com> Message-ID: <506EA187.6040008@oracle.com> On 05/10/2012 09:10, Jaroslav Bachorik wrote: > I have updated the patch to reflect Alan's remarks. The webrev is at the > same location - github takes care of versioning ... > > -JB- > Thanks Jaroslav, I grabbed it from here: https://raw.github.com/jbachorik/openjdk-patches/master/webrevs/7195779/corba.patch I'll push this into jdk8/tl shortly, listing you as the contributor. -Alan. From jaroslav.bachorik at oracle.com Wed Oct 10 01:40:01 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 10 Oct 2012 10:40:01 +0200 Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject Message-ID: <50753461.4070009@oracle.com> I am looking for a review and a sponsor for this fix. The issue is about an empty array of descriptors being written as a part of the serialization process but not read when deserializing an MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream skips all unread custom written fields it is not a behaviour required by the specification and may cause problems. The patch makes the array to be read in all cases - even when it is known to be an empty one. That way all that has been written as a part of serialization is read back. 
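
The symmetry rule described above can be sketched as follows. This is an illustration under stated assumptions, not the actual MBeanInfo/MBeanFeatureInfo code: `SymmetricSerialization`, `Info`, `names` and `values` are hypothetical names. The point is only that `readObject` consumes every object that `writeObject` produced, including an array that is known to be empty.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SymmetricSerialization {
    /** Hypothetical class with hand-written serialization (not the real MBeanFeatureInfo). */
    public static final class Info implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient String[] names;
        private transient Object[] values;

        public Info(String[] names, Object[] values) {
            this.names = names.clone();
            this.values = values.clone();
        }

        public int nameCount() { return names.length; }
        public int valueCount() { return values.length; }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeObject(names);
            out.writeObject(values); // written even when both arrays are empty
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            names = (String[]) in.readObject();
            // Read unconditionally, mirroring writeObject, instead of skipping
            // the second array when the first one turns out to be empty
            values = (Object[]) in.readObject();
        }
    }

    /** Serializes and deserializes an Info in memory. */
    public static Info roundTrip(Info info) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(info);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Info) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Info empty = roundTrip(new Info(new String[0], new Object[0]));
        System.out.println(empty.nameCount() + " names, " + empty.valueCount() + " values");
    }
}
```

Keeping the two methods mirror images of each other means correctness no longer depends on the stream implementation skipping unread custom data.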
The webrev with the fix and test is available @ https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290 -JB- From jaroslav.bachorik at oracle.com Wed Oct 10 07:17:03 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 10 Oct 2012 16:17:03 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer Message-ID: <5075835F.3050604@oracle.com> I am looking for a review and a sponsor. The issue is about some javax.management.timer.Timer notifications not being received by the listeners if the notifications are generated rapidly. The problem is caused by ConcurrentModificationException being thrown - the exception itself is ignored but the dispatcher logic is skipped. Therefore the currently processed notification gets lost. The CME is thrown due to the Timer.timerTable being iterated over while other threads try to remove some of its elements. Fix consists of replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap which handles such situations with grace. The patch webrev is available @ https://jbs.oracle.com/bugs/browse/JDK-6809322 Thanks, -JB- From jaroslav.bachorik at oracle.com Wed Oct 10 07:39:27 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 10 Oct 2012 16:39:27 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <5075835F.3050604@oracle.com> References: <5075835F.3050604@oracle.com> Message-ID: <5075889F.1070808@oracle.com> I am sorry for the webrev URL - a stale clipboard :( The correct webrev URL is https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6809322 -JB- On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: > I am looking for a review and a sponsor. > > The issue is about some javax.management.timer.Timer notifications not > being received by the listeners if the notifications are generated rapidly. 
>
> The problem is caused by ConcurrentModificationException being thrown -
> the exception itself is ignored but the dispatcher logic is skipped.
> Therefore the currently processed notification gets lost.
>
> The CME is thrown due to the Timer.timerTable being iterated over while
> other threads try to remove some of its elements. Fix consists of
> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
> which handles such situations with grace.
>
> The patch webrev is available @
> https://jbs.oracle.com/bugs/browse/JDK-6809322
>
> Thanks,
>
> -JB-
>

From eamonn at mcmanus.net Wed Oct 10 08:49:11 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Wed, 10 Oct 2012 08:49:11 -0700
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject
In-Reply-To: <50753461.4070009@oracle.com>
References: <50753461.4070009@oracle.com>
Message-ID:

Hi Jaroslav,

The patch looks correct and the test is ingenious.

I do not understand why the previous SerializationTest needs to be deleted. It doesn't seem that the new test is covering the same things.

Reviewed-by: emcmanus

Incidentally I was not able to find a way to see the patch with the usual webrev browser UI. Is there a link for that?

Regards,
Éamonn

2012/10/10 Jaroslav Bachorik :
> I am looking for a review and a sponsor for this fix.
>
> The issue is about an empty array of descriptors being written as a part
> of the serialization process but not read when deserializing an
> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
> skips all unread custom written fields it is not a behaviour required by
> the specification and may cause problems.
>
> The patch makes the array to be read in all cases - even when it is
> known to be an empty one. That way all that has been written as a part
> of serialization is read back.
> > The webrev with the fix and test is available @ > https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290 > > -JB- From jaroslav.bachorik at oracle.com Wed Oct 10 11:52:55 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 10 Oct 2012 20:52:55 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <5075BF37.2020400@oracle.com> References: <5075835F.3050604@oracle.com> <5075889F.1070808@oracle.com> <50759773.7080602@oracle.com> <5075BDAF.6070703@oracle.com> <5075BF37.2020400@oracle.com> Message-ID: <5075C407.9060105@oracle.com> Thanks, could you update with this webrev? I've fixed the problem with hgmq <-> webrev where the file copy is mistakenly marked as a file move. Thanks, -JB- On 10/10/2012 08:32 PM, Dmitry Samersoff wrote: > Jaroslav, > > http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322/ > > -Dmitry > > > On 2012-10-10 22:25, Jaroslav Bachorik wrote: >> On Wed 10 Oct 2012 05:42:43 PM CEST, Dmitry Samersoff wrote: >>> Jaroslav, >>> >>> Not able to open it as a webrev - only list of files. >>> >>> E-mail me the webrev and I'll put it to >>> file:///opt/src/jdks/openjdk-patches/webrevs/JDK-6809322.zip >> >>> cr.openjdk.net/~dsamersoff/sponsorship/jbachorik/NNNNN >> >> Attaching... Thanks. I didn't want to put the webrevs on rapidshare or >> the likes and github seemed like a nice choice. Unfortunately it is not >> possible to view raw files :( >> >> -JB- >> >>> >>> -Dmitry >>> >>> >>> On 2012-10-10 18:39, Jaroslav Bachorik wrote: >>>> I am sorry for the webrev URL - a stale clipboard :( >>>> >>>> The correct webrev URL is >>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6809322 >>>> >>>> -JB- >>>> >>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>> I am looking for a review and a sponsor. 
>>>>> >>>>> The issue is about some javax.management.timer.Timer notifications not >>>>> being received by the listeners if the notifications are generated rapidly. >>>>> >>>>> The problem is caused by ConcurrentModificationException being thrown - >>>>> the exception itself is ignored but the dispatcher logic is skipped. >>>>> Therefore the currently processed notification gets lost. >>>>> >>>>> The CME is thrown due to the Timer.timerTable being iterated over while >>>>> other threads try to remove some of its elements. Fix consists of >>>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap >>>>> which handles such situations with grace. >>>>> >>>>> The patch webrev is available @ >>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>> >>>>> Thanks, >>>>> >>>>> -JB- >>>>> >>>> >>> >>> >> >> > > From jaroslav.bachorik at oracle.com Thu Oct 11 01:07:54 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 11 Oct 2012 10:07:54 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <5075835F.3050604@oracle.com> References: <5075835F.3050604@oracle.com> Message-ID: <50767E5A.7060908@oracle.com> Dmitry has put the webrev on the public CR - http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ Thanks! -JB- On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: > I am looking for a review and a sponsor. > > The issue is about some javax.management.timer.Timer notifications not > being received by the listeners if the notifications are generated rapidly. > > The problem is caused by ConcurrentModificationException being thrown - > the exception itself is ignored but the dispatcher logic is skipped. > Therefore the currently processed notification gets lost. > > The CME is thrown due to the Timer.timerTable being iterated over while > other threads try to remove some of its elements. 
Fix consists of > replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap > which handles such situations with grace. > > The patch webrev is available @ > https://jbs.oracle.com/bugs/browse/JDK-6809322 > > Thanks, > > -JB- > From jaroslav.bachorik at oracle.com Thu Oct 11 10:26:19 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 11 Oct 2012 19:26:19 +0200 Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject In-Reply-To: <5075C26C.50309@oracle.com> References: <50753461.4070009@oracle.com> <5075C26C.50309@oracle.com> Message-ID: <5077013B.9040801@oracle.com> Just to keep it clear - here is the webrev hosted at CR - http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/ -JB- On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote: > Hi, > > On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote: >> Hi Jaroslav, >> >> The patch looks correct and the test is ingenious. >> >> I do not understand why the previous SerializationTest needs to be >> deleted. It doesn't seem that the new test is covering the same >> things. > > I need to check that. I copied the SerializationTest.java to > SerializationTest1.java - apparently the cooperation of the webrev and > mercurial queues has its glitches :( > > I am attaching the corrected webrev. > > -JB- > >> >> Reviewed-by: emcmanus >> >> Incidentally I was not able to find a way to see the patch with the >> usual webrev browser UI. Is there a link for that? >> >> Regards, >> ?amonn >> >> >> 2012/10/10 Jaroslav Bachorik : >>> I am looking for a review and a sponsor for this fix. >>> >>> The issue is about an empty array of descriptors being written as a part >>> of the serialization process but not read when deserializing an >>> MBeanInfo/MBeanFeatureInfo instance. 
While the current ObjectInputStream
>>> skips all unread custom written fields it is not a behaviour required by
>>> the specification and may cause problems.
>>>
>>> The patch makes the array to be read in all cases - even when it is
>>> known to be an empty one. That way all that has been written as a part
>>> of serialization is read back.
>>>
>>> The webrev with the fix and test is available @
>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290
>>>
>>> -JB-
>
>

From eamonn at mcmanus.net Thu Oct 11 10:42:24 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Thu, 11 Oct 2012 10:42:24 -0700
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject
In-Reply-To: <5077013B.9040801@oracle.com>
References: <50753461.4070009@oracle.com> <5075C26C.50309@oracle.com> <5077013B.9040801@oracle.com>
Message-ID:

Looks good. A couple of minor nits about the test: there is a stray IDE template comment on line 74, and the copyright date is wrong.

Éamonn

2012/10/11 Jaroslav Bachorik :
> Just to keep it clear - here is the webrev hosted at CR -
> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/
>
> -JB-
>
> On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote:
>> Hi,
>>
>> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote:
>>> Hi Jaroslav,
>>>
>>> The patch looks correct and the test is ingenious.
>>>
>>> I do not understand why the previous SerializationTest needs to be
>>> deleted. It doesn't seem that the new test is covering the same
>>> things.
>>
>> I need to check that. I copied the SerializationTest.java to
>> SerializationTest1.java - apparently the cooperation of the webrev and
>> mercurial queues has its glitches :(
>>
>> I am attaching the corrected webrev.
>>
>> -JB-
>>
>>>
>>> Reviewed-by: emcmanus
>>>
>>> Incidentally I was not able to find a way to see the patch with the
>>> usual webrev browser UI. Is there a link for that?
>>> >>> Regards, >>> ?amonn >>> >>> >>> 2012/10/10 Jaroslav Bachorik : >>>> I am looking for a review and a sponsor for this fix. >>>> >>>> The issue is about an empty array of descriptors being written as a part >>>> of the serialization process but not read when deserializing an >>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream >>>> skips all unread custom written fields it is not a behaviour required by >>>> the specification and may cause problems. >>>> >>>> The patch makes the array to be read in all cases - even when it is >>>> known to be an empty one. That way all that has been written as a part >>>> of serialization is read back. >>>> >>>> The webrev with the fix and test is available @ >>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290 >>>> >>>> -JB- >> >> > > From david.holmes at oracle.com Thu Oct 11 19:44:31 2012 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Oct 2012 12:44:31 +1000 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <50767E5A.7060908@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> Message-ID: <5077840F.6050601@oracle.com> Hi Jaroslav, On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: > Dmitry has put the webrev on the public CR - > http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ > > Thanks! > > -JB- > > On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >> I am looking for a review and a sponsor. >> >> The issue is about some javax.management.timer.Timer notifications not >> being received by the listeners if the notifications are generated rapidly. >> >> The problem is caused by ConcurrentModificationException being thrown - >> the exception itself is ignored but the dispatcher logic is skipped. >> Therefore the currently processed notification gets lost. Can you point out where exactly in the code the exception is thrown and caught. 
I'd like to understand the problem better. >> >> The CME is thrown due to the Timer.timerTable being iterated over while >> other threads try to remove some of its elements. Fix consists of >> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap >> which handles such situations with grace. Be aware that it may also give surprising results as removal is no longer synchronized at all with processing. So it could now appear that a notification is processed after a listener has been removed. David ----- >> The patch webrev is available @ >> https://jbs.oracle.com/bugs/browse/JDK-6809322 >> >> Thanks, >> >> -JB- >> > From jaroslav.bachorik at oracle.com Fri Oct 12 00:47:21 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 12 Oct 2012 09:47:21 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <5077840F.6050601@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> Message-ID: <5077CB09.7010005@oracle.com> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: > Hi Jaroslav, > > > On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >> Dmitry has put the webrev on the public CR - >> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >> >> >> Thanks! >> >> -JB- >> >> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>> I am looking for a review and a sponsor. >>> >>> The issue is about some javax.management.timer.Timer notifications not >>> being received by the listeners if the notifications are generated >>> rapidly. >>> >>> The problem is caused by ConcurrentModificationException being thrown - >>> the exception itself is ignored but the dispatcher logic is skipped. >>> Therefore the currently processed notification gets lost. > > Can you point out where exactly in the code the exception is thrown > and caught. I'd like to understand the problem better. 
The CME is thrown in Timer.notifyAlarmClock() method in this case - but may happen in other places as well. Actually, in some places the access to the timerTable map is synchronized while in others it isn't. While switching the Hashtable for ConcurrentHashMap resolves this particular issue it might be beneficial to correct the partial synchronization instead. > >>> >>> The CME is thrown due to the Timer.timerTable being iterated over while >>> other threads try to remove some of its elements. Fix consists of >>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap >>> which handles such situations with grace. > > Be aware that it may also give surprising results as removal is no > longer synchronized at all with processing. So it could now appear > that a notification is processed after a listener has been removed. Indeed, the CME is the symptom of the out-of-order processing - the removal method is synchronized on (Timer.this) while the notifyAlarmClock() method, processing the notifications, runs unsynchronized. Thanks for pointing this out. I will have something to think about. 
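A minimal sketch of the failure mode discussed above. It is not the JDK's Timer code, only a standalone demo, and the names CmeDemo and countVisited are hypothetical. It shows that removing an entry while iterating over a Hashtable's key set makes the fail-fast iterator throw ConcurrentModificationException, whereas a ConcurrentHashMap iterator tolerates the removal. A single thread is enough to trigger it here; in the Timer case the removal would come from another thread.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CmeDemo {

    /**
     * Fills the map with ten entries, then removes each key while iterating
     * over the key set. Returns the number of keys visited, or -1 if the
     * iterator threw ConcurrentModificationException.
     */
    static int countVisited(Map<Integer, String> table) {
        for (int i = 0; i < 10; i++) {
            table.put(i, "timer-" + i);
        }
        int visited = 0;
        try {
            for (Integer id : table.keySet()) {
                visited++;
                // Structural change behind the iterator's back.
                table.remove(id);
            }
        } catch (ConcurrentModificationException e) {
            return -1;
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hashtable's key-set iterator is fail-fast.
        System.out.println("Hashtable: " + countVisited(new Hashtable<>()));
        // ConcurrentHashMap's iterator is weakly consistent and never throws CME.
        System.out.println("ConcurrentHashMap: " + countVisited(new ConcurrentHashMap<>()));
    }
}
```

Note that the absence of the exception with ConcurrentHashMap says nothing about ordering between removal and processing, which is the concern David raises.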
-JB- > > David > ----- > >>> The patch webrev is available @ >>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>> >>> Thanks, >>> >>> -JB- >>> >> From jaroslav.bachorik at oracle.com Fri Oct 12 06:14:47 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 12 Oct 2012 15:14:47 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <5077CB09.7010005@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> Message-ID: <507817C7.9000703@oracle.com> The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/ I am sorry for this chaos with webrev locations but it's not that easy to work efficiently without an OpenJDK username :/ -JB- On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: > On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >> Hi Jaroslav, >> >> >> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>> Dmitry has put the webrev on the public CR - >>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>> >>> >>> Thanks! >>> >>> -JB- >>> >>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>> I am looking for a review and a sponsor. >>>> >>>> The issue is about some javax.management.timer.Timer notifications not >>>> being received by the listeners if the notifications are generated >>>> rapidly. >>>> >>>> The problem is caused by ConcurrentModificationException being thrown - >>>> the exception itself is ignored but the dispatcher logic is skipped. >>>> Therefore the currently processed notification gets lost. >> >> Can you point out where exactly in the code the exception is thrown >> and caught. I'd like to understand the problem better. > > The CME is thrown in Timer.notifyAlarmClock() method in this case - but > may happen in other places as well.
> > Actually, in some places the access to the timerTable map is > synchronized while in others it isn't. While switching the Hashtable > for ConcurrentHashMap resolves this particular issue it might be > beneficial to correct the partial synchronization instead. > >> >>>> >>>> The CME is thrown due to the Timer.timerTable being iterated over while >>>> other threads try to remove some of its elements. Fix consists of >>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap >>>> which handles such situations with grace. >> >> Be aware that it may also give surprising results as removal is no >> longer synchronized at all with processing. So it could now appear >> that a notification is processed after a listener has been removed. > > Indeed, the CME is the symptom of the out-of-order processing - the > removal method is synchronized on (Timer.this) while the > notifyAlarmClock() method, processing the notifications, runs > unsynchronized. > > Thanks for pointing this out. I will have something to think about. > > -JB- > >> >> David >> ----- >> >>>> The patch webrev is available @ >>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>> >>>> Thanks, >>>> >>>> -JB- >>>> >>> > > From jaroslav.bachorik at oracle.com Fri Oct 12 06:16:46 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Fri, 12 Oct 2012 15:16:46 +0200 Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject In-Reply-To: References: <50753461.4070009@oracle.com> <5075C26C.50309@oracle.com> <5077013B.9040801@oracle.com> Message-ID: <5078183E.7060106@oracle.com> Thanks. Minor nits picked ... http://btrace.kenai.com/webrevs/JDK-6783290/webrev.v3/ -JB- On 10/11/2012 07:42 PM, Eamonn McManus wrote: > Looks good. A couple of minor nits about the test: there is a stray > IDE template comment on line 74, and the copyright date is wrong. 
> > Éamonn > > > 2012/10/11 Jaroslav Bachorik : >> Just to keep it clear - here is the webrev hosted at CR - >> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/ >> >> -JB- >> >> On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote: >>> Hi, >>> >>> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote: >>>> Hi Jaroslav, >>>> >>>> The patch looks correct and the test is ingenious. >>>> >>>> I do not understand why the previous SerializationTest needs to be >>>> deleted. It doesn't seem that the new test is covering the same >>>> things. >>> >>> I need to check that. I copied the SerializationTest.java to >>> SerializationTest1.java - apparently the cooperation of the webrev and >>> mercurial queues has its glitches :( >>> >>> I am attaching the corrected webrev. >>> >>> -JB- >>> >>>> >>>> Reviewed-by: emcmanus >>>> >>>> Incidentally I was not able to find a way to see the patch with the >>>> usual webrev browser UI. Is there a link for that? >>>> >>>> Regards, >>>> Éamonn >>>> >>>> >>>> 2012/10/10 Jaroslav Bachorik : >>>>> I am looking for a review and a sponsor for this fix. >>>>> >>>>> The issue is about an empty array of descriptors being written as a part >>>>> of the serialization process but not read when deserializing an >>>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream >>>>> skips all unread custom written fields it is not a behaviour required by >>>>> the specification and may cause problems. >>>>> >>>>> The patch makes the array to be read in all cases - even when it is >>>>> known to be an empty one. That way all that has been written as a part >>>>> of serialization is read back.
>>>>> >>>>> The webrev with the fix and test is available @ >>>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290 >>>>> >>>>> -JB- >>> >>> >> >> From eamonn at mcmanus.net Fri Oct 12 09:02:14 2012 From: eamonn at mcmanus.net (Eamonn McManus) Date: Fri, 12 Oct 2012 09:02:14 -0700 Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject In-Reply-To: <5078183E.7060106@oracle.com> References: <50753461.4070009@oracle.com> <5075C26C.50309@oracle.com> <5077013B.9040801@oracle.com> <5078183E.7060106@oracle.com> Message-ID: Looks good to me (emcmanus). Éamonn 2012/10/12 Jaroslav Bachorik > Thanks. Minor nits picked ... > > http://btrace.kenai.com/webrevs/JDK-6783290/webrev.v3/ > > -JB- > > On 10/11/2012 07:42 PM, Eamonn McManus wrote: > > Looks good. A couple of minor nits about the test: there is a stray > > IDE template comment on line 74, and the copyright date is wrong. > > > > Éamonn > > > > > > 2012/10/11 Jaroslav Bachorik : > >> Just to keep it clear - here is the webrev hosted at CR - > >> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/ > >> > >> -JB- > >> > >> On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote: > >>> Hi, > >>> > >>> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote: > >>>> Hi Jaroslav, > >>>> > >>>> The patch looks correct and the test is ingenious. > >>>> > >>>> I do not understand why the previous SerializationTest needs to be > >>>> deleted. It doesn't seem that the new test is covering the same > >>>> things. > >>> > >>> I need to check that. I copied the SerializationTest.java to > >>> SerializationTest1.java - apparently the cooperation of the webrev and > >>> mercurial queues has its glitches :( > >>> > >>> I am attaching the corrected webrev. > >>> > >>> -JB- > >>> > >>>> > >>>> Reviewed-by: emcmanus > >>>> > >>>> Incidentally I was not able to find a way to see the patch with the > >>>> usual webrev browser UI.
Is there a link for that? > >>>> > >>>> Regards, > >>>> Éamonn > >>>> > >>>> > >>>> 2012/10/10 Jaroslav Bachorik : > >>>>> I am looking for a review and a sponsor for this fix. > >>>>> > >>>>> The issue is about an empty array of descriptors being written as a part > >>>>> of the serialization process but not read when deserializing an > >>>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream > >>>>> skips all unread custom written fields it is not a behaviour required by > >>>>> the specification and may cause problems. > >>>>> > >>>>> The patch makes the array to be read in all cases - even when it is > >>>>> known to be an empty one. That way all that has been written as a part > >>>>> of serialization is read back. > >>>>> > >>>>> The webrev with the fix and test is available @ > >>>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290 > >>>>> > >>>>> -JB- > >>> > >>> > >> > >> > > From david.holmes at oracle.com Sun Oct 14 19:19:49 2012 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Oct 2012 12:19:49 +1000 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <507817C7.9000703@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> <507817C7.9000703@oracle.com> Message-ID: <507B72C5.4060807@oracle.com> Hi Jaroslav, I think your changes now go further than needed. The original code uses a dual synchronization scheme: a) it synchronizes most of the Timer methods b) it also uses a thread-safe Hashtable This means that not all of the Timer methods need to be synchronized because the only thread-safe action needed is the actual access to the Hashtable in some methods.
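As a rough illustration of the dual scheme described above, and not the actual javax.management.timer.Timer source, consider the sketch below. The class and method names (TimerTableSketch, add, remove, getType, snapshotIds) are hypothetical. Mutators are synchronized on the owning object, single lookups rely on Hashtable's own per-call locking, and the one compound operation, iteration, holds the table's monitor for its whole duration because Hashtable only locks per call.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

public class TimerTableSketch {
    // Hashtable synchronizes each individual call on the table itself.
    private final Map<Integer, String> table = new Hashtable<>();

    // Only touched inside synchronized methods, so no volatile is needed.
    private int counterID = 0;

    /** Mutator synchronized on the owning object. */
    public synchronized Integer add(String type) {
        Integer id = ++counterID;
        table.put(id, type);
        return id;
    }

    /** Mutator synchronized on the owning object. */
    public synchronized void remove(Integer id) {
        table.remove(id);
    }

    /** A single lookup is already atomic, so no extra locking here. */
    public String getType(Integer id) {
        return table.get(id);
    }

    /** Iteration is a compound action: hold the table's monitor throughout. */
    public List<Integer> snapshotIds() {
        List<Integer> ids = new ArrayList<>();
        synchronized (table) {
            for (Integer id : table.keySet()) {
                ids.add(id);
            }
        }
        return ids;
    }
}
```

Because put and remove lock the same Hashtable monitor internally, no other thread can modify the table while snapshotIds is inside its synchronized block.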
The flaw with the original code was simply that the iteration of the Hashtable in notifyAlarmClock was not done in a thread-safe manner. I believe this could be fixed simply by synchronizing on the Hashtable here: 1186 synchronized(timerTable) { with no need to change the type of the timerTable, nor the synchronization on other Timer methods. You could alternatively synchronize on the Timer itself - as you now do - provided all methods of the Timer that mutate the Hashtable are themselves synchronized on the timer. What you have is not incorrect though, and may remove unnecessary synchronization in some cases (but increases the size of critical sections in others). Also here: 165 volatile private int counterID = 0; there is no need to add volatile as counterID is only accessed within synchronized methods. David ----- On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote: > The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/ > > I am sorry for this chaos with webrev locations but it's not that easy to > work efficiently without an OpenJDK username :/ > > -JB- > > On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: >> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >>> Hi Jaroslav, >>> >>> >>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>>> Dmitry has put the webrev on the public CR - >>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>>> >>>> >>>> Thanks! >>>> >>>> -JB- >>>> >>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>> I am looking for a review and a sponsor. >>>>> >>>>> The issue is about some javax.management.timer.Timer notifications not >>>>> being received by the listeners if the notifications are generated >>>>> rapidly. >>>>> >>>>> The problem is caused by ConcurrentModificationException being thrown - >>>>> the exception itself is ignored but the dispatcher logic is skipped. >>>>> Therefore the currently processed notification gets lost.
>>> >>> Can you point out where exactly in the code the exception is thrown >>> and caught. I'd like to understand the problem better. >> >> The CME is thrown in Timer.notifyAlarmClock() method in this case - but >> may happen in other places as well. >> >> Actually, in some places the access to the timerTable map is >> synchronized while in others it isn't. While switching the Hashtable >> for ConcurrentHashMap resolves this particular issue it might be >> beneficial to correct the partial synchronization instead. >> >>> >>>>> >>>>> The CME is thrown due to the Timer.timerTable being iterated over while >>>>> other threads try to remove some of its elements. Fix consists of >>>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap >>>>> which handles such situations with grace. >>> >>> Be aware that it may also give surprising results as removal is no >>> longer synchronized at all with processing. So it could now appear >>> that a notification is processed after a listener has been removed. >> >> Indeed, the CME is the symptom of the out-of-order processing - the >> removal method is synchronized on (Timer.this) while the >> notifyAlarmClock() method, processing the notifications, runs >> unsynchronized. >> >> Thanks for pointing this out. I will have something to think about. 
>> >> -JB- >> >>> >>> David >>> ----- >>> >>>>> The patch webrev is available @ >>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>> >>>>> Thanks, >>>>> >>>>> -JB- >>>>> >>>> >> >> > From jaroslav.bachorik at oracle.com Mon Oct 15 03:08:30 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 15 Oct 2012 12:08:30 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <507B72C5.4060807@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> <507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com> Message-ID: <507BE09E.5090702@oracle.com> Thanks David, On 10/15/2012 04:19 AM, David Holmes wrote: > Hi Jaroslav, > > I think your changes now go further than needed. The original code uses > a dual synchronization scheme: > > a) it synchronizes most of the Timer methods > b) it also uses a thread-safe Hashtable > > This means that not all of the Timer methods need to be synchronized > because the only thread-safe action needed is the actual access to the > Hashtable in some methods. > > The flaw with the original code was simply that the iteration of the > Hashtable in notifyAlaramClock was not done in a thread-safe manner. I > believe this could be fixed simply by synchronizing on the Hashtable here: > > 1186 synchronized(timerTable) { > > with no need to change the type of the timerTable, nor the > synchronization on other Timer methods. You could alternatively > synchronize on the Timer itself - as you now do - provided all methods > of the Timer that mutate the Hashtable are themselves synchronized on > the timer. > > What you have is not incorrect though, and may remove unnecessary > synchronization in some cases (but increases the size of critical > sections in others). 
> > Also here: > > 165 volatile private int counterID = 0; > > there is no need to add volatile as counterID is only accessed within > synchronized methods. Yes, I see your point. I just want to ask - in cases of fixing issues like this the preferred way is to introduce minimal changes even if it means leaving parts of the code sub-optimal? IMO, having a dual synchronization scheme might be considered sub-optimal as it makes it more difficult to see the author's intentions. But I am fine with leaving the Hashtable intact and just synchronizing the iteration part correctly - it resolves the issue. The updated webrev is available at http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4 Regards, -JB- > > David > ----- > > On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote: >> The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/ >> >> I am sorry for this chaos with webrev locations but it's not that easy to >> work efficiently without an OpenJDK username :/ >> >> -JB- >> >> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: >>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >>>> Hi Jaroslav, >>>> >>>> >>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>>>> Dmitry has put the webrev on the public CR - >>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>>>> >>>>> >>>>> >>>>> Thanks! >>>>> >>>>> -JB- >>>>> >>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>>> I am looking for a review and a sponsor. >>>>>> >>>>>> The issue is about some javax.management.timer.Timer notifications >>>>>> not >>>>>> being received by the listeners if the notifications are generated >>>>>> rapidly. >>>>>> >>>>>> The problem is caused by ConcurrentModificationException being >>>>>> thrown - >>>>>> the exception itself is ignored but the dispatcher logic is skipped. >>>>>> Therefore the currently processed notification gets lost. >>>> >>>> Can you point out where exactly in the code the exception is thrown >>>> and caught.
I'd like to understand the problem better. >>> >>> The CME is thrown in Timer.notifyAlarmClock() method in this case - but >>> may happen in other places as well. >>> >>> Actually, in some places the access to the timerTable map is >>> synchronized while in others it isn't. While switching the Hashtable >>> for ConcurrentHashMap resolves this particular issue it might be >>> beneficial to correct the partial synchronization instead. >>> >>>> >>>>>> >>>>>> The CME is thrown due to the Timer.timerTable being iterated over >>>>>> while >>>>>> other threads try to remove some of its elements. Fix consists of >>>>>> replacing the Hashtable used for Timer.timerTable by >>>>>> ConcurrentHashMap >>>>>> which handles such situations with grace. >>>> >>>> Be aware that it may also give surprising results as removal is no >>>> longer synchronized at all with processing. So it could now appear >>>> that a notification is processed after a listener has been removed. >>> >>> Indeed, the CME is the symptom of the out-of-order processing - the >>> removal method is synchronized on (Timer.this) while the >>> notifyAlarmClock() method, processing the notifications, runs >>> unsynchronized. >>> >>> Thanks for pointing this out. I will have something to think about. 
>>> >>> -JB- >>> >>>> >>>> David >>>> ----- >>>> >>>>>> The patch webrev is available @ >>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>>> >>>>>> Thanks, >>>>>> >>>>>> -JB- >>>>>> >>>>> >>> >>> >> From david.holmes at oracle.com Mon Oct 15 04:45:37 2012 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Oct 2012 21:45:37 +1000 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <507BE09E.5090702@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> <507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com> <507BE09E.5090702@oracle.com> Message-ID: <507BF761.8040500@oracle.com> On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote: > On 10/15/2012 04:19 AM, David Holmes wrote: >> I think your changes now go further than needed. The original code uses >> a dual synchronization scheme: >> >> a) it synchronizes most of the Timer methods >> b) it also uses a thread-safe Hashtable >> >> This means that not all of the Timer methods need to be synchronized >> because the only thread-safe action needed is the actual access to the >> Hashtable in some methods. >> >> The flaw with the original code was simply that the iteration of the >> Hashtable in notifyAlaramClock was not done in a thread-safe manner. I >> believe this could be fixed simply by synchronizing on the Hashtable here: >> >> 1186 synchronized(timerTable) { >> >> with no need to change the type of the timerTable, nor the >> synchronization on other Timer methods. You could alternatively >> synchronize on the Timer itself - as you now do - provided all methods >> of the Timer that mutate the Hashtable are themselves synchronized on >> the timer. >> >> What you have is not incorrect though, and may remove unnecessary >> synchronization in some cases (but increases the size of critical >> sections in others). 
>> >> Also here: >> >> 165 volatile private int counterID = 0; >> >> there is no need to add volatile as counterID is only accessed within >> synchronized methods. > > Yes, I see your point. I just want to ask - in cases of fixing issues > like this the preferred way is to introduce minimal changes even if it > means leaving parts of the code sub-optimal? IMO, having a dual > synchronization scheme might be considered sub-optimal as it makes it > more difficult to see the author's intentions. Optimal depends on your evaluation criteria. The original design may have been done with performance in mind and a view to minimising critical sections. Without knowing what the original design criteria were, and unless you are fixing a problem caused by key aspects of that design, minimal changes should be favoured. > But I am fine with leaving the Hashtable intact and just synchronizing > the iteration part correctly - it resolves the issue. > > The updated webrev is available at > http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4 I'm not sure the comment is needed in that form. Hashtable is synchronized internally but you need to use external synchronization when iterating through it. David > Regards, > > -JB- > >> >> David >> ----- >> >> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote: >>> The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/ >>> >>> I am sorry for this chaos with webrev locations but it's not that easy to >>> work efficiently without an OpenJDK username :/ >>> >>> -JB- >>> >>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: >>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >>>>> Hi Jaroslav, >>>>> >>>>> >>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>>>>> Dmitry has put the webrev on the public CR - >>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>>>>> >>>>>> >>>>>> >>>>>> Thanks!
>>>>>> >>>>>> -JB- >>>>>> >>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>>>> I am looking for a review and a sponsor. >>>>>>> >>>>>>> The issue is about some javax.management.timer.Timer notifications >>>>>>> not >>>>>>> being received by the listeners if the notifications are generated >>>>>>> rapidly. >>>>>>> >>>>>>> The problem is caused by ConcurrentModificationException being >>>>>>> thrown - >>>>>>> the exception itself is ignored but the dispatcher logic is skipped. >>>>>>> Therefore the currently processed notification gets lost. >>>>> >>>>> Can you point out where exactly in the code the exception is thrown >>>>> and caught. I'd like to understand the problem better. >>>> >>>> The CME is thrown in Timer.notifyAlarmClock() method in this case - but >>>> may happen in other places as well. >>>> >>>> Actually, in some places the access to the timerTable map is >>>> synchronized while in others it isn't. While switching the Hashtable >>>> for ConcurrentHashMap resolves this particular issue it might be >>>> beneficial to correct the partial synchronization instead. >>>> >>>>> >>>>>>> >>>>>>> The CME is thrown due to the Timer.timerTable being iterated over >>>>>>> while >>>>>>> other threads try to remove some of its elements. Fix consists of >>>>>>> replacing the Hashtable used for Timer.timerTable by >>>>>>> ConcurrentHashMap >>>>>>> which handles such situations with grace. >>>>> >>>>> Be aware that it may also give surprising results as removal is no >>>>> longer synchronized at all with processing. So it could now appear >>>>> that a notification is processed after a listener has been removed. >>>> >>>> Indeed, the CME is the symptom of the out-of-order processing - the >>>> removal method is synchronized on (Timer.this) while the >>>> notifyAlarmClock() method, processing the notifications, runs >>>> unsynchronized. >>>> >>>> Thanks for pointing this out. I will have something to think about. 
>>>> >>>> -JB- >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>>>> The patch webrev is available @ >>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -JB- >>>>>>> >>>>>> >>>> >>>> >>> > From jaroslav.bachorik at oracle.com Mon Oct 15 05:14:58 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 15 Oct 2012 14:14:58 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <507BF761.8040500@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> <507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com> <507BE09E.5090702@oracle.com> <507BF761.8040500@oracle.com> Message-ID: <507BFE42.1050200@oracle.com> On 10/15/2012 01:45 PM, David Holmes wrote: > On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote: >> On 10/15/2012 04:19 AM, David Holmes wrote: >>> I think your changes now go further than needed. The original code uses >>> a dual synchronization scheme: >>> >>> a) it synchronizes most of the Timer methods >>> b) it also uses a thread-safe Hashtable >>> >>> This means that not all of the Timer methods need to be synchronized >>> because the only thread-safe action needed is the actual access to the >>> Hashtable in some methods. >>> >>> The flaw with the original code was simply that the iteration of the >>> Hashtable in notifyAlaramClock was not done in a thread-safe manner. I >>> believe this could be fixed simply by synchronizing on the Hashtable >>> here: >>> >>> 1186 synchronized(timerTable) { >>> >>> with no need to change the type of the timerTable, nor the >>> synchronization on other Timer methods. You could alternatively >>> synchronize on the Timer itself - as you now do - provided all methods >>> of the Timer that mutate the Hashtable are themselves synchronized on >>> the timer. 
>>> >>> What you have is not incorrect though, and may remove unnecessary >>> synchronization in some cases (but increases the size of critical >>> sections in others). >>> >>> Also here: >>> >>> 165 volatile private int counterID = 0; >>> >>> there is no need to add volatile as counterID is only accessed within >>> synchronized methods. >> >> Yes, I see your point. I just want to ask - in cases of fixing issues >> like this the preferred way is to introduce minimal changes even if it >> means leaving the parts of the code sub-optimal? IMO, having dual >> synchronization scheme might be considered as sub-optimal as it makes it >> more difficult to see the author's intentions. > > Optimal depends on your evaluation criteria. The original design may > have been done with performance in mind and a view to minimising > critical sections. Without knowing what the original design criteria > was, and unless you are fixing a problem caused by key aspects of that > design, then minimal changes should be favoured. > >> But I am fine with leaving the Hashtable intact and just synchronizing >> the iteration part correctly - it resolves the issue. >> >> The update webrev is available at >> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4 > > I'm not sure the comment is needed in that form. Hashtable is > snchronized internally but you need to use external synchronization when > iterating through it. Ok, it's gone. 
http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v5/ > > David > >> Regards, >> >> -JB- >> >>> >>> David >>> ----- >>> >>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote: >>>> The updated webrev is now at >>>> http://btrace.kenai.com/webrevs/JDK-6809322/ >>>> >>>> I am sorry for this chaos with webrev locations but its not that >>>> easy to >>>> work efficiently without an OpenJDK username :/ >>>> >>>> -JB- >>>> >>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: >>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >>>>>> Hi Jaroslav, >>>>>> >>>>>> >>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>>>>>> Dmitry has put the webrev on the public CR - >>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks! >>>>>>> >>>>>>> -JB- >>>>>>> >>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>>>>> I am looking for a review and a sponsor. >>>>>>>> >>>>>>>> The issue is about some javax.management.timer.Timer notifications >>>>>>>> not >>>>>>>> being received by the listeners if the notifications are generated >>>>>>>> rapidly. >>>>>>>> >>>>>>>> The problem is caused by ConcurrentModificationException being >>>>>>>> thrown - >>>>>>>> the exception itself is ignored but the dispatcher logic is >>>>>>>> skipped. >>>>>>>> Therefore the currently processed notification gets lost. >>>>>> >>>>>> Can you point out where exactly in the code the exception is thrown >>>>>> and caught. I'd like to understand the problem better. >>>>> >>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case - >>>>> but >>>>> may happen in other places as well. >>>>> >>>>> Actually, in some places the access to the timerTable map is >>>>> synchronized while in others it isn't. While switching the Hashtable >>>>> for ConcurrentHashMap resolves this particular issue it might be >>>>> beneficial to correct the partial synchronization instead. 
>>>>> >>>>>> >>>>>>>> >>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over >>>>>>>> while >>>>>>>> other threads try to remove some of its elements. Fix consists of >>>>>>>> replacing the Hashtable used for Timer.timerTable by >>>>>>>> ConcurrentHashMap >>>>>>>> which handles such situations with grace. >>>>>> >>>>>> Be aware that it may also give surprising results as removal is no >>>>>> longer synchronized at all with processing. So it could now appear >>>>>> that a notification is processed after a listener has been removed. >>>>> >>>>> Indeed, the CME is the symptom of the out-of-order processing - the >>>>> removal method is synchronized on (Timer.this) while the >>>>> notifyAlarmClock() method, processing the notifications, runs >>>>> unsynchronized. >>>>> >>>>> Thanks for pointing this out. I will have something to think about. >>>>> >>>>> -JB- >>>>> >>>>>> >>>>>> David >>>>>> ----- >>>>>> >>>>>>>> The patch webrev is available @ >>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>> >>>>> >>>>> >>>> >> From david.holmes at oracle.com Mon Oct 15 05:18:25 2012 From: david.holmes at oracle.com (David Holmes) Date: Mon, 15 Oct 2012 22:18:25 +1000 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <507BFE42.1050200@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> <507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com> <507BE09E.5090702@oracle.com> <507BF761.8040500@oracle.com> <507BFE42.1050200@oracle.com> Message-ID: <507BFF11.1080404@oracle.com> Looks good to me. 
Hopefully someone else will chime in too :) Thanks, David On 15/10/2012 10:14 PM, Jaroslav Bachorik wrote: > On 10/15/2012 01:45 PM, David Holmes wrote: >> On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote: >>> On 10/15/2012 04:19 AM, David Holmes wrote: >>>> I think your changes now go further than needed. The original code uses >>>> a dual synchronization scheme: >>>> >>>> a) it synchronizes most of the Timer methods >>>> b) it also uses a thread-safe Hashtable >>>> >>>> This means that not all of the Timer methods need to be synchronized >>>> because the only thread-safe action needed is the actual access to the >>>> Hashtable in some methods. >>>> >>>> The flaw with the original code was simply that the iteration of the >>>> Hashtable in notifyAlarmClock was not done in a thread-safe manner. I >>>> believe this could be fixed simply by synchronizing on the Hashtable >>>> here: >>>> >>>> 1186 synchronized(timerTable) { >>>> >>>> with no need to change the type of the timerTable, nor the >>>> synchronization on other Timer methods. You could alternatively >>>> synchronize on the Timer itself - as you now do - provided all methods >>>> of the Timer that mutate the Hashtable are themselves synchronized on >>>> the timer. >>>> >>>> What you have is not incorrect though, and may remove unnecessary >>>> synchronization in some cases (but increases the size of critical >>>> sections in others). >>>> >>>> Also here: >>>> >>>> 165 volatile private int counterID = 0; >>>> >>>> there is no need to add volatile as counterID is only accessed within >>>> synchronized methods. >>> >>> Yes, I see your point. I just want to ask - in cases of fixing issues >>> like this the preferred way is to introduce minimal changes even if it >>> means leaving the parts of the code sub-optimal? IMO, having a dual >>> synchronization scheme might be considered as sub-optimal as it makes it >>> more difficult to see the author's intentions. >> >> Optimal depends on your evaluation criteria.
The original design may >> have been done with performance in mind and a view to minimising >> critical sections. Without knowing what the original design criteria >> were, and unless you are fixing a problem caused by key aspects of that >> design, then minimal changes should be favoured. >> >>> But I am fine with leaving the Hashtable intact and just synchronizing >>> the iteration part correctly - it resolves the issue. >>> >>> The updated webrev is available at >>> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4 >> >> I'm not sure the comment is needed in that form. Hashtable is >> synchronized internally but you need to use external synchronization when >> iterating through it. > > Ok, it's gone. http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v5/ > >> >> David >> >>> Regards, >>> >>> -JB- >>> >>>> >>>> David >>>> ----- >>>> >>>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote: >>>>> The updated webrev is now at >>>>> http://btrace.kenai.com/webrevs/JDK-6809322/ >>>>> >>>>> I am sorry for this chaos with webrev locations but it's not that >>>>> easy to >>>>> work efficiently without an OpenJDK username :/ >>>>> >>>>> -JB- >>>>> >>>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: >>>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >>>>>>> Hi Jaroslav, >>>>>>> >>>>>>> >>>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>>>>>>> Dmitry has put the webrev on the public CR - >>>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> -JB- >>>>>>>> >>>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>>>>>> I am looking for a review and a sponsor. >>>>>>>>> >>>>>>>>> The issue is about some javax.management.timer.Timer notifications >>>>>>>>> not >>>>>>>>> being received by the listeners if the notifications are generated >>>>>>>>> rapidly.
>>>>>>>>> >>>>>>>>> The problem is caused by ConcurrentModificationException being >>>>>>>>> thrown - >>>>>>>>> the exception itself is ignored but the dispatcher logic is >>>>>>>>> skipped. >>>>>>>>> Therefore the currently processed notification gets lost. >>>>>>> >>>>>>> Can you point out where exactly in the code the exception is thrown >>>>>>> and caught. I'd like to understand the problem better. >>>>>> >>>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case - >>>>>> but >>>>>> may happen in other places as well. >>>>>> >>>>>> Actually, in some places the access to the timerTable map is >>>>>> synchronized while in others it isn't. While switching the Hashtable >>>>>> for ConcurrentHashMap resolves this particular issue it might be >>>>>> beneficial to correct the partial synchronization instead. >>>>>> >>>>>>> >>>>>>>>> >>>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over >>>>>>>>> while >>>>>>>>> other threads try to remove some of its elements. Fix consists of >>>>>>>>> replacing the Hashtable used for Timer.timerTable by >>>>>>>>> ConcurrentHashMap >>>>>>>>> which handles such situations with grace. >>>>>>> >>>>>>> Be aware that it may also give surprising results as removal is no >>>>>>> longer synchronized at all with processing. So it could now appear >>>>>>> that a notification is processed after a listener has been removed. >>>>>> >>>>>> Indeed, the CME is the symptom of the out-of-order processing - the >>>>>> removal method is synchronized on (Timer.this) while the >>>>>> notifyAlarmClock() method, processing the notifications, runs >>>>>> unsynchronized. >>>>>> >>>>>> Thanks for pointing this out. I will have something to think about. 
>>>>>> >>>>>> -JB- >>>>>> >>>>>>> >>>>>>> David >>>>>>> ----- >>>>>>> >>>>>>>>> The patch webrev is available @ >>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> -JB- >>>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>> >>> > From jaroslav.bachorik at oracle.com Mon Oct 15 09:17:16 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 15 Oct 2012 18:17:16 +0200 Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from javax.management.timer.Timer In-Reply-To: <12F6F059-25F8-4448-8D1E-25C25DAE87E9@oracle.com> References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com> <5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com> <507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com> <507BE09E.5090702@oracle.com> <507BF761.8040500@oracle.com> <507BFE42.1050200@oracle.com> <12F6F059-25F8-4448-8D1E-25C25DAE87E9@oracle.com> Message-ID: <507C370C.8040906@oracle.com> Thank you guys for the reviews! I would like to kindly ask someone with committer rights to sponsor this fix and push the change. Thanks! -JB- On 10/15/2012 04:15 PM, Rickard Bäckman wrote: > Looks good! > > /R > > On Oct 15, 2012, at 2:14 PM, Jaroslav Bachorik wrote: > >> On 10/15/2012 01:45 PM, David Holmes wrote: >>> On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote: >>>> On 10/15/2012 04:19 AM, David Holmes wrote: >>>>> I think your changes now go further than needed. The original code uses >>>>> a dual synchronization scheme: >>>>> >>>>> a) it synchronizes most of the Timer methods >>>>> b) it also uses a thread-safe Hashtable >>>>> >>>>> This means that not all of the Timer methods need to be synchronized >>>>> because the only thread-safe action needed is the actual access to the >>>>> Hashtable in some methods. >>>>> >>>>> The flaw with the original code was simply that the iteration of the >>>>> Hashtable in notifyAlarmClock was not done in a thread-safe manner.
I >>>>> believe this could be fixed simply by synchronizing on the Hashtable >>>>> here: >>>>> >>>>> 1186 synchronized(timerTable) { >>>>> >>>>> with no need to change the type of the timerTable, nor the >>>>> synchronization on other Timer methods. You could alternatively >>>>> synchronize on the Timer itself - as you now do - provided all methods >>>>> of the Timer that mutate the Hashtable are themselves synchronized on >>>>> the timer. >>>>> >>>>> What you have is not incorrect though, and may remove unnecessary >>>>> synchronization in some cases (but increases the size of critical >>>>> sections in others). >>>>> >>>>> Also here: >>>>> >>>>> 165 volatile private int counterID = 0; >>>>> >>>>> there is no need to add volatile as counterID is only accessed within >>>>> synchronized methods. >>>> >>>> Yes, I see your point. I just want to ask - in cases of fixing issues >>>> like this the preferred way is to introduce minimal changes even if it >>>> means leaving the parts of the code sub-optimal? IMO, having a dual >>>> synchronization scheme might be considered as sub-optimal as it makes it >>>> more difficult to see the author's intentions. >>> >>> Optimal depends on your evaluation criteria. The original design may >>> have been done with performance in mind and a view to minimising >>> critical sections. Without knowing what the original design criteria >>> were, and unless you are fixing a problem caused by key aspects of that >>> design, then minimal changes should be favoured. >>> >>>> But I am fine with leaving the Hashtable intact and just synchronizing >>>> the iteration part correctly - it resolves the issue. >>>> >>>> The updated webrev is available at >>>> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4 >>> >>> I'm not sure the comment is needed in that form. Hashtable is >>> synchronized internally but you need to use external synchronization when >>> iterating through it. >> >> Ok, it's gone.
http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v5/ >> >>> >>> David >>> >>>> Regards, >>>> >>>> -JB- >>>> >>>>> >>>>> David >>>>> ----- >>>>> >>>>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote: >>>>>> The updated webrev is now at >>>>>> http://btrace.kenai.com/webrevs/JDK-6809322/ >>>>>> >>>>>> I am sorry for this chaos with webrev locations but its not that >>>>>> easy to >>>>>> work efficiently without an OpenJDK username :/ >>>>>> >>>>>> -JB- >>>>>> >>>>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote: >>>>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote: >>>>>>>> Hi Jaroslav, >>>>>>>> >>>>>>>> >>>>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote: >>>>>>>>> Dmitry has put the webrev on the public CR - >>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Thanks! >>>>>>>>> >>>>>>>>> -JB- >>>>>>>>> >>>>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote: >>>>>>>>>> I am looking for a review and a sponsor. >>>>>>>>>> >>>>>>>>>> The issue is about some javax.management.timer.Timer notifications >>>>>>>>>> not >>>>>>>>>> being received by the listeners if the notifications are generated >>>>>>>>>> rapidly. >>>>>>>>>> >>>>>>>>>> The problem is caused by ConcurrentModificationException being >>>>>>>>>> thrown - >>>>>>>>>> the exception itself is ignored but the dispatcher logic is >>>>>>>>>> skipped. >>>>>>>>>> Therefore the currently processed notification gets lost. >>>>>>>> >>>>>>>> Can you point out where exactly in the code the exception is thrown >>>>>>>> and caught. I'd like to understand the problem better. >>>>>>> >>>>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case - >>>>>>> but >>>>>>> may happen in other places as well. >>>>>>> >>>>>>> Actually, in some places the access to the timerTable map is >>>>>>> synchronized while in others it isn't. 
While switching the Hashtable >>>>>>> for ConcurrentHashMap resolves this particular issue it might be >>>>>>> beneficial to correct the partial synchronization instead. >>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over >>>>>>>>>> while >>>>>>>>>> other threads try to remove some of its elements. Fix consists of >>>>>>>>>> replacing the Hashtable used for Timer.timerTable by >>>>>>>>>> ConcurrentHashMap >>>>>>>>>> which handles such situations with grace. >>>>>>>> >>>>>>>> Be aware that it may also give surprising results as removal is no >>>>>>>> longer synchronized at all with processing. So it could now appear >>>>>>>> that a notification is processed after a listener has been removed. >>>>>>> >>>>>>> Indeed, the CME is the symptom of the out-of-order processing - the >>>>>>> removal method is synchronized on (Timer.this) while the >>>>>>> notifyAlarmClock() method, processing the notifications, runs >>>>>>> unsynchronized. >>>>>>> >>>>>>> Thanks for pointing this out. I will have something to think about. >>>>>>> >>>>>>> -JB- >>>>>>> >>>>>>>> >>>>>>>> David >>>>>>>> ----- >>>>>>>> >>>>>>>>>> The patch webrev is available @ >>>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322 >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> -JB- >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>> >> > From jaroslav.bachorik at oracle.com Wed Oct 24 06:45:26 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Oct 2012 15:45:26 +0200 Subject: jmx-dev [PATCH] JDK-7009998: JMX synchronization during connection restart is faulty Message-ID: <5087F0F6.40902@oracle.com> I am looking for a review and patch sponsor. Webrev available at http://cr.openjdk.java.net/~jbachorik/JDK-7009998/webrev.00 The issue is about a possible race condition in the ClientCommunicatorAdmin when the reconnection process may be initiated by more than one thread (eg. 3). 
The main reason is that the re-connection routine logic is split into two synchronized blocks and it relies on the state staying consistent when moving from one synchronized block to the other. The race condition is described by the reporter as: "In reading the code there is a scenario where the synchronization does the wrong thing if 3 threads attempt to go through the code at the same time. Consider the code in com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart, the first thread will set the state to RE_CONNECTING and then leave the synchronization block, the second thread will find the state to be RE_CONNECTING and wait on the lock. When thread 1 finishes and sets the state to CONNECTED, then thread 2 can leave the synchronization block - but fails to set the state to RE_CONNECTING because that code is incorrectly in the else branch. Thus thread 2 starts the reconnecting and thread 3 wakes to find the state not RE_CONNECTING so it believes it can safely start the reconnect and it also starts reconnecting. The bad mode is discovered in the preReconnection method." The fix adds a return statement at the end of the first synchronized block for the case where the admin has already been successfully re-connected by another thread. The test in "test/com/sun/jmx/remote/CCAdminReconnectTest.java" exercises the fix. Changes in "make/netbeans/jmx/build.properties" are there for the NetBeans project to recognize the newly added test. Thanks, -JB- From jaroslav.bachorik at oracle.com Wed Oct 24 06:50:28 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Oct 2012 15:50:28 +0200 Subject: jmx-dev [PATCH] JDK-6976971: TEST: javax/management/remote/mandatory/URLTest.java should be re-integrated Message-ID: <5087F224.5010603@oracle.com> I am looking for a review and a sponsor. Webrev is available at http://cr.openjdk.java.net/~jbachorik/JDK-6976971/webrev.00 This is a simple fix - just adding back the test that used to fail due to changes in the URI spec.
The changes were rolled back before Mustang and the test has no reason to fail. Thanks, -JB- From jaroslav.bachorik at oracle.com Wed Oct 24 07:03:27 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Oct 2012 16:03:27 +0200 Subject: jmx-dev [PATCH] JDK-6937053: RMI unmarshalling errors in ClientNotifForwarder cause silent failure Message-ID: <5087F52F.8070809@oracle.com> I am looking for a review and a sponsor. Webrev available at http://cr.openjdk.java.net/~jbachorik/JDK-6937053/webrev.00/ The RMI unmarshalling process may throw java.rmi.UnmarshalException, e.g. in cases of incompatible changes in enums. The bad thing is that ClientNotifForwarder chooses to silently die instead of reporting the problem. The fix consists of adding support for handling java.rmi.UnmarshalException the same way as java.io.NotSerializableException and appropriate changes in the javadoc. Thanks, -JB- From jaroslav.bachorik at oracle.com Wed Oct 24 07:09:58 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Oct 2012 16:09:58 +0200 Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject Message-ID: <5087F6B6.3050708@oracle.com> I am looking for a review and a sponsor. Webrev is available at http://cr.openjdk.java.net/~jbachorik/JDK-6783290/webrev.01/ The serialization of javax.management.MBeanInfo and javax.management.MBeanFeatureInfo instances is asymmetrical in cases with no attached descriptor. The descriptor is serialized as an empty array but when deserializing the descriptor is not read back at all. Currently for RMI this does not pose any problem but the specification does not explicitly allow this kind of behaviour and it may eventually cause trouble. The patch just reads back the empty array to keep the serialization/deserialization symmetric.
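The symmetry argument above can be illustrated with a minimal sketch. It is not the JDK source and not the webrev: the class Info and its fieldNames array are hypothetical stand-ins for an MBeanInfo-like class and its descriptor data. The only point shown is that readObject consumes exactly what writeObject emits, including the empty array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SymmetricDescriptorDemo {

    // Hypothetical stand-in for an MBeanInfo-like class; not the JDK source.
    static class Info implements Serializable {
        private static final long serialVersionUID = 1L;

        final String name;
        // Handled by the custom methods below, so excluded from default serialization.
        transient String[] fieldNames;

        Info(String name, String... fieldNames) {
            this.name = name;
            this.fieldNames = fieldNames.clone();
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            // Written unconditionally: an object without descriptor fields emits an empty array.
            out.writeObject(fieldNames);
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            // Read unconditionally, mirroring writeObject, even when the array is empty.
            fieldNames = (String[]) in.readObject();
        }
    }

    // Serializes and deserializes in memory; checked exceptions are wrapped for brevity.
    static Info roundTrip(Info info) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(info);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Info) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Info copy = roundTrip(new Info("attr"));
        System.out.println(copy.name + " has " + copy.fieldNames.length + " descriptor fields");
    }
}
```

Reading the array back unconditionally means any later addition to the stream format starts from a known position, which is presumably the kind of trouble the patch wants to rule out.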
Thanks, -JB- From jaroslav.bachorik at oracle.com Wed Oct 24 07:15:34 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 24 Oct 2012 16:15:34 +0200 Subject: jmx-dev [PATCH] JDK-6705499: Bad JMXConnectorProvider class name prevents all connections - even with standard RMI connector Message-ID: <5087F806.40408@oracle.com> I am looking for review and a sponsor. Webrev is available at http://cr.openjdk.java.net/~jbachorik/JDK-6705499/webrev.00/ The issue is caused by the way the java.util.ServiceLoader treats the service registration with incorrect class names. Such a service registration causes java.util.ServiceConfigurationError to be thrown and the JMXConnector(Server)Factory is not ready for this. Thanks to the exception all the other, potentially valid, service registrations are ignored. The patch makes JMXConnector(Server)Factory class ready for java.util.ServiceConfigurationError and when such an exception is caught the factory just proceeds to the next registration. If the only available registration causes the exception it will be rethrown at the end. Thanks, -JB- From Alan.Bateman at oracle.com Wed Oct 24 07:28:17 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 24 Oct 2012 15:28:17 +0100 Subject: jmx-dev [PATCH] JDK-6705499: Bad JMXConnectorProvider class name prevents all connections - even with standard RMI connector In-Reply-To: <5087F806.40408@oracle.com> References: <5087F806.40408@oracle.com> Message-ID: <5087FB01.6010701@oracle.com> On 24/10/2012 15:15, Jaroslav Bachorik wrote: > I am looking for review and a sponsor. > > Webrev is available at > http://cr.openjdk.java.net/~jbachorik/JDK-6705499/webrev.00/ > > The issue is caused by the way the java.util.ServiceLoader treats the > service registration with incorrect class names. Such a service > registration causes java.util.ServiceConfigurationError to be thrown and > the JMXConnector(Server)Factory is not ready for this. 
Thanks to the > exception all the other, potentially valid, service registrations are > ignored. > > The patch makes JMXConnector(Server)Factory class ready for > java.util.ServiceConfigurationError and when such an exception is caught > the factory just proceeds to the next registration. If the only > available registration causes the exception it will be rethrown at the end. > > Thanks, > > -JB- I'm not so sure this is the right thing to do. When SCE is thrown then there is no guarantee that you can continue and there isn't enough information in the error to know whether it makes sense to attempt to continue or not. We have this same issue in many areas of the platform and I think it requires future work in ServiceLoader to help users of the API decide whether to continue or not. Once we move to modules then many of the reasons for SCE will go away because the list of service providers is precomputed so there is no scanning of class paths or parsing of configuration files at runtime. So if this one is not urgent then it may be something to come back to again in the future. -Alan. From Alan.Bateman at oracle.com Wed Oct 24 07:29:20 2012 From: Alan.Bateman at oracle.com (Alan Bateman) Date: Wed, 24 Oct 2012 15:29:20 +0100 Subject: jmx-dev [PATCH] JDK-6976971: TEST: javax/management/remote/mandatory/URLTest.java should be re-integrated In-Reply-To: <5087F224.5010603@oracle.com> References: <5087F224.5010603@oracle.com> Message-ID: <5087FB40.9040905@oracle.com> On 24/10/2012 14:50, Jaroslav Bachorik wrote: > I am looking for review and sponsor. > > Webrev is available at > http://cr.openjdk.java.net/~jbachorik/JDK-6976971/webrev.00 > > This is a simple fix - just adding back the test that used to fail due > to changes in URI spec. The changes were rolled back before mustang and > the test has no reason to fail. > > Thanks, > > -JB- This looks fine to me, I guess it was just missed when the URI work was rolled back.
-Alan From eamonn at mcmanus.net Wed Oct 24 08:49:32 2012 From: eamonn at mcmanus.net (Eamonn McManus) Date: Wed, 24 Oct 2012 08:49:32 -0700 Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has inconsistent readObject/writeObject In-Reply-To: <5087F6B6.3050708@oracle.com> References: <5087F6B6.3050708@oracle.com> Message-ID: This is already Reviewed-by: emcmanus, but I'm afraid I can't sponsor it. Éamonn 2012/10/24 Jaroslav Bachorik > I am looking for review and a sponsor. > > Webrev is available at > http://cr.openjdk.java.net/~jbachorik/JDK-6783290/webrev.01/ > > The serialization of javax.management.MBeanInfo and > javax.management.MBeanFeatureInfo instances is asymmetrical in cases > with no attached descriptor. The descriptor is serialized as an empty > array but when deserializing the descriptor is not read back at all. > Currently for RMI this does not pose any problem but the specification > does not explicitly allow this kind of behaviour and it may cause > troubles eventually. > > The patch just reads back the empty array to keep the > serialization/deserialization symmetric. > > Thanks, > > -JB- > From jaroslav.bachorik at oracle.com Thu Oct 25 04:55:47 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Thu, 25 Oct 2012 13:55:47 +0200 Subject: jmx-dev [PATCH] JDK-6705499: Bad JMXConnectorProvider class name prevents all connections - even with standard RMI connector In-Reply-To: <5087FB01.6010701@oracle.com> References: <5087F806.40408@oracle.com> <5087FB01.6010701@oracle.com> Message-ID: <508928C3.6000208@oracle.com> On 10/24/2012 04:28 PM, Alan Bateman wrote: > On 24/10/2012 15:15, Jaroslav Bachorik wrote: >> I am looking for review and a sponsor.
>> >> Webrev is available at >> http://cr.openjdk.java.net/~jbachorik/JDK-6705499/webrev.00/ >> >> The issue is caused by the way the java.util.ServiceLoader treats the >> service registration with incorrect class names. Such a service >> registration causes java.util.ServiceConfigurationError to be thrown and >> the JMXConnector(Server)Factory is not ready for this. Thanks to the >> exception all the other, potentially valid, service registrations are >> ignored. >> >> The patch makes JMXConnector(Server)Factory class ready for >> java.util.ServiceConfigurationError and when such an exception is caught >> the factory just proceeds to the next registration. If the only >> available registration causes the exception it will be rethrown at the >> end. >> >> Thanks, >> >> -JB- > I'm not so sure this is the right thing to do. When SCE is thrown then > there is no guarantee that you can continue and there isn't enough > information in the error to know whether it makes sense to attempt to > continue or not. We have this same issue in many areas of the platform Shouldn't this be indicated by the "hasNext()" method of the iterator returned by ServiceLoader? I mean - whether you can continue enumerating the providers or not. I agree that the fact that SCE is an Error subclass alone makes catching and handling it rather dubious but it seems a bit harsh to throw a (supposedly) unrecoverable exception only because one entry in the service configuration file is invalid. > and I think it requires future work in ServiceLoader to help users of > the API decide whether to continue or not. Once we move to modules then Yes, a proper fix would be in the ServiceLoader but it will most probably involve API changes. Eg. instead of generating an Error a checked exception should be thrown indicating a problem with the particular service configuration line. 
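The behaviour under discussion (skip a broken registration, move on to the next one, and rethrow only if nothing usable was found) could look roughly like the sketch below. It is an illustration, not the webrev: TolerantProviderLookup and loadUsable are hypothetical names. Note that the ServiceLoader documentation only promises a best effort to locate the next provider after a ServiceConfigurationError, so this loop assumes the iterator advances past the failing entry.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;
import javax.management.remote.JMXConnectorProvider;

class TolerantProviderLookup {

    // Collects every provider that loads cleanly and records the broken registrations.
    // Assumption: after an error the iterator moves on to the next entry.
    static <S> List<S> loadUsable(Iterator<S> providers,
                                  List<ServiceConfigurationError> failures) {
        List<S> usable = new ArrayList<>();
        while (true) {
            try {
                if (!providers.hasNext()) {
                    break;
                }
                usable.add(providers.next());
            } catch (ServiceConfigurationError e) {
                // One bad registration should not hide the valid ones.
                failures.add(e);
            }
        }
        if (usable.isEmpty() && !failures.isEmpty()) {
            // Nothing usable at all: surface the first problem to the caller.
            throw failures.get(0);
        }
        return usable;
    }

    public static void main(String[] args) {
        List<ServiceConfigurationError> failures = new ArrayList<>();
        List<JMXConnectorProvider> found =
            loadUsable(ServiceLoader.load(JMXConnectorProvider.class).iterator(), failures);
        System.out.println(found.size() + " usable, " + failures.size() + " broken");
    }
}
```

Taking an Iterator rather than a ServiceLoader keeps the loop easy to exercise with a fake iterator, without having to craft a broken META-INF/services file.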
> many of the reasons for SCE will go away because the list of service > providers is precomputed so there is no scanning of class paths or > parsing of configuration files at runtime. So if this one is not urgent > then it may be something to come back to again in the future. Unfortunately, it will take some time till we have modules :( Until then you can completely disable the JMX subsystem simply by placing a poison jar on the classpath (probably other ServiceLoader-based services as well but this issue is about JMX). Even though the issue is P3 it has been sitting there unresolved for 4 years and keeping it for another 3 (till JDK9) would look quite strange, IMO. What about filing an issue for ServiceLoader (if there is none yet) and then pushing this workaround with a comment that it should be revisited once the modules are in place? -JB- > > -Alan. > From jaroslav.bachorik at oracle.com Mon Oct 29 07:15:21 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Mon, 29 Oct 2012 15:15:21 +0100 Subject: jmx-dev [PATCH] JDK-7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently Message-ID: <508E8F79.60909@oracle.com> I am looking for a sponsor and reviewers. The webrev is available at http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03 As explained in the issue the failure is caused by the RMI connection heart-beat thread racing against the thread executing the MBean operation and encountering the IOException. The heart-beat thread sets the admin state to "terminated" but does not send the failure notifications. On the other side the operation thread determines the state to be already terminated and skips the notifications as well. The fix adds the call to handle an IOException, including sending the failure notifications, to the heart-beat connection failure handler.
Also it widens the synchronized block since the whole code block checking for the connection failure and recovering must be run atomically. Thanks, -JB- From eamonn at mcmanus.net Tue Oct 30 09:10:21 2012 From: eamonn at mcmanus.net (Eamonn McManus) Date: Tue, 30 Oct 2012 09:10:21 -0700 Subject: jmx-dev [PATCH] JDK-7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently In-Reply-To: <508E8F79.60909@oracle.com> References: <508E8F79.60909@oracle.com> Message-ID: This area has historically caused a lot of problems and I am not surprised to see that there are more. While I don't know what the best way to fix the issue at hand is, I don't think this proposed change is it. The reason is that the checkConnection and gotIOException methods do blocking operations, and it is generally not a good idea to do blocking operations in a synchronized block. Is there a way to avoid the race condition without that? Éamonn 2012/10/29 Jaroslav Bachorik : > I am looking for a sponsor and reviewers. > > The webrev is available at > http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03 > > As explained in the issue the failure is caused by the RMI connection > heart-beat thread racing against the thread executing the MBean > operation and encountering the IOException. The heart-beat thread sets > the admin state to "terminated" but does not send the failure > notifications. On the other side the operation thread determines the > state to be already terminated and skips the notifications as well. > > The fix adds the call to handle an IOException, including sending the > failure notifications, to the heart-beat connection failure handler.
Also > it widens the synchronized block since the whole code block checking for > the connection failure and recovering must be run atomically. > > > Thanks, > > -JB- From jaroslav.bachorik at oracle.com Wed Oct 31 05:59:28 2012 From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik) Date: Wed, 31 Oct 2012 13:59:28 +0100 Subject: jmx-dev [PATCH] JDK-7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently In-Reply-To: References: <508E8F79.60909@oracle.com> Message-ID: <509120B0.6040703@oracle.com> On 10/30/2012 05:10 PM, Eamonn McManus wrote: > This area has historically caused a lot of problems and I am not > surprised to see that there are more. While I don't know what the best > way to fix the issue at hand is, I don't think this proposed change is > it. The reason is that the checkConnection and gotIOException methods > do blocking operations, and it is generally not a good idea to do > blocking operations in a synchronized block. Is there a way to avoid > the race condition without that? The important part is calling the gotIOException() method even from the heart-beat checker. I've tried restoring the synchronized block to its original state and the test passes with a check period of 10ms, which pushes the probability of data races rather high. It seems that the worst that can happen would be one additional checkConnection() call - if the state gets set to TERMINATED by another thread right after it has been checked in the synchronized block, the loop condition might still evaluate to true because the new state value has not been flushed yet. I could change the "state" variable to be volatile but I am not sure whether it's worth the hassle. -JB- > > Éamonn > > > 2012/10/29 Jaroslav Bachorik : >> I am looking for a sponsor and reviewers.
>> >> The webrev is available at >> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03 >> >> As explained in the issue the failure is caused by the RMI connection >> heart-beat thread racing against the thread executing the MBean >> operation and encountering the IOException. The heart-beat thread sets >> the admin state to "terminated" but does not send the failure >> notifications. On the other side the operation thread determines the >> state to be already terminated and skips the notifications as well. >> >> The fix adds the call to handle an IOException, including sending the >> failure notifications, to the heart-beat connection failure handler. Also >> it widens the synchronized block since the whole code block checking for >> the connection failure and recovering must be run atomically. >> >> >> Thanks, >> >> -JB-
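One general pattern for the concern raised in this thread (no blocking calls inside a synchronized block, yet the failure notification must not be lost when two threads race) is to decide the state transition under the lock and do the blocking work outside it. The sketch below only illustrates that pattern; it is not the change in the webrev, and TerminationGuard, markTerminated and onConnectionFailure are hypothetical names.

```java
class TerminationGuard {

    // Hypothetical, simplified state set; the real admin class has more states.
    enum State { CONNECTED, TERMINATED }

    private State state = State.CONNECTED;

    // Short critical section: returns true only for the one thread that flips the state.
    private synchronized boolean markTerminated() {
        if (state == State.TERMINATED) {
            return false;
        }
        state = State.TERMINATED;
        return true;
    }

    // Called by either the heart-beat thread or an operation thread on failure.
    // The potentially blocking notification runs outside the lock, exactly once.
    boolean onConnectionFailure(Runnable sendFailureNotification) {
        boolean firstToTerminate = markTerminated();
        if (firstToTerminate) {
            sendFailureNotification.run();
        }
        return firstToTerminate;
    }

    synchronized State state() {
        return state;
    }

    public static void main(String[] args) {
        TerminationGuard guard = new TerminationGuard();
        Runnable notify = () -> System.out.println("failure notification sent");
        guard.onConnectionFailure(notify); // e.g. the heart-beat thread
        guard.onConnectionFailure(notify); // e.g. the operation thread; sends nothing
    }
}
```

Because the thread that wins the transition is also the one that notifies, neither the heart-beat thread nor the operation thread can assume the other one has already reported the failure.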