From jaroslav.bachorik at oracle.com  Tue Oct  2 01:33:13 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Tue, 02 Oct 2012 10:33:13 +0200
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <505C46EB.60800@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com>
Message-ID: <506AA6C9.30909@oracle.com>

On Fri 21 Sep 2012 12:52:27 PM CEST, Alan Bateman wrote:
> On 20/09/2012 17:02, Eamonn McManus wrote:
>> Changing the generated RMI/IIOP code
>> so that it no longer causes this exception, or so that it catches it
>> and rethrows a RemoteException, sounds as if it ought to be fairly
>> straightforward, and that's probably what I would do if it were up to
>> me.
> I think this is what I would do too, even though it means going into
> the corba repository as that is where the stub generator is. I should
> say that I don't violently object to Jaroslav's patch, it's just that
> it is an ugly workaround.

The generated TIE class is inherently thread-unsafe. The internal state 
(the target field) can be manipulated without any enforced 
synchronization - e.g. it is valid to set the target field to null by 
calling the deactivate() method after the _invoke() method has been 
entered from a different thread. This will lead to the NPE we can 
observe. Given this example one should make critical sections out of 
deactivate() and _invoke() methods to prevent this situation. However, 
this simplistic approach might lead to deadlocks in the existing code 
as the _invoke() method body might be blocking (it is third-party code) 
and thus preventing execution of the deactivate() method indefinitely.

Also, it is not really possible to solve this problem outside of the 
generated TIE class - it is caused by the concurrent change of the 
TIE's internal state. So, the solution would be either caching the 
target attribute at the beginning of the invoke() operation in a 
synchronized block and using the cached version afterwards (and throwing 
a remote exception if it is null - the TIE was deactivated effectively 
before entering the invoke() operation) or postponing deactivation when 
the invoke() method is detected as being in progress.
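
As a rough illustration of the first option (not taken from the actual
generated code), the sketch below snapshots the target while holding a
lock and then works only with the local copy. The class SnapshotTie,
the Target interface and the tryInvoke() helper are hypothetical names
made up for this example.

```java
import java.rmi.RemoteException;

// Hypothetical stand-in for a generated tie; all names are illustrative.
class SnapshotTie {
    interface Target { String call(String arg); }

    private final Object lock = new Object();
    private Target target;   // guarded by lock

    SnapshotTie(Target target) { this.target = target; }

    void deactivate() {
        synchronized (lock) { target = null; }
    }

    String invoke(String arg) throws RemoteException {
        Target snapshot;
        // The lock is held only for the read, never during the call itself.
        synchronized (lock) { snapshot = target; }
        if (snapshot == null) {
            throw new RemoteException("tie was deactivated before invoke()");
        }
        // A concurrent deactivate() can no longer turn this into an NPE.
        return snapshot.call(arg);
    }

    // Small helper so callers in this sketch need no checked-exception handling.
    static String tryInvoke(SnapshotTie tie, String arg) {
        try {
            return tie.invoke(arg);
        } catch (RemoteException e) {
            return "RemoteException";
        }
    }

    public static void main(String[] args) {
        SnapshotTie tie = new SnapshotTie(s -> "echo:" + s);
        System.out.println(tryInvoke(tie, "a"));
        tie.deactivate();
        System.out.println(tryInvoke(tie, "b"));
    }
}
```

Because the potentially blocking call runs outside the lock, a slow
target cannot hold up deactivate(), which is the deadlock concern
raised above.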

-JB-

>
>
>> Disabling this test for the IIOP case, and probably other failing
>> JMX tests that involve IIOP, is an option if it is judged that nobody
>> uses the RMI/IIOP connector any more so it is all right to let it rot.
>> That judgement is a non-technical one that I don't have an informed
>> opinion on.
>>
> I don't know if the rmi-iiop connector was ever used much but since it
> seems to be required by the JMX Remote API spec then I think we should
> continue to test it. As I think I mentioned in another mail recently,
> I think we have to look at making this transport optional as it's
> painful to have the CORBA tie/stub classes in
> javax.management.remote.rmi. I don't want to hijack Jaroslav's thread
> to discuss that, that's a topic for another thread.
>
> -Alan
>


From Alan.Bateman at oracle.com  Tue Oct  2 06:38:13 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Tue, 02 Oct 2012 14:38:13 +0100
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <506AA6C9.30909@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com>
Message-ID: <506AEE45.7020403@oracle.com>

On 02/10/2012 09:33, Jaroslav Bachorik wrote:
> :
> The generated TIE class is inherently thread-unsafe. The internal state
> (the target field) can be manipulated without any enforced
> synchronization - eg. it is valid to set the target field to null by
> calling the deactivate() method after the _invoke() method has been
> entered from a different thread. This will lead to the NPE we can
> observe. Given this example one should make critical sections out of
> deactivate() and _invoke() methods to prevent this situation. However,
> this simplistic approach might lead to deadlocks in the existing code
> as the _invoke() method body might be blocking (it is a 3rd party code)
> and thus preventing execution of the deactivate() method indefinitely.
>
> Also, it is not really possible to solve this problem outside of the
> generated TIE class - it is caused by the concurrent change of the
> TIE's internal state. So, the solution would be either caching the
> target attribute at the beginning of the invoke() operation in a
> synchronized block and use the cached version afterwards (and throwing
> a remote exception if it is null - the TIE was deactivated effectively
> before entering the invoke() operation) or postponing deactivation when
> the invoke() method is detected as being in progress.
>
> -JB-
>
Jaroslav and I chatted on IM about this today. Jaroslav is going to have 
a go at changing the stub generator and will send a follow-up mail with 
an updated webrev (this time for the corba repo as that is where the 
stub generator lives).

-Alan

From jaroslav.bachorik at oracle.com  Thu Oct  4 08:28:05 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 04 Oct 2012 17:28:05 +0200
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <506AEE45.7020403@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com>
	<506AEE45.7020403@oracle.com>
Message-ID: <506DAB05.4050500@oracle.com>

On Tue 02 Oct 2012 03:38:13 PM CEST, Alan Bateman wrote:
> On 02/10/2012 09:33, Jaroslav Bachorik wrote:
>> :
>> The generated TIE class is inherently thread-unsafe. The internal state
>> (the target field) can be manipulated without any enforced
>> synchronization - eg. it is valid to set the target field to null by
>> calling the deactivate() method after the _invoke() method has been
>> entered from a different thread. This will lead to the NPE we can
>> observe. Given this example one should make critical sections out of
>> deactivate() and _invoke() methods to prevent this situation. However,
>> this simplistic approach might lead to deadlocks in the existing code
>> as the _invoke() method body might be blocking (it is a 3rd party code)
>> and thus preventing execution of the deactivate() method indefinitely.
>>
>> Also, it is not really possible to solve this problem outside of the
>> generated TIE class - it is caused by the concurrent change of the
>> TIE's internal state. So, the solution would be either caching the
>> target attribute at the beginning of the invoke() operation in a
>> synchronized block and use the cached version afterwards (and throwing
>> a remote exception if it is null - the TIE was deactivated effectively
>> before entering the invoke() operation) or postponing deactivation when
>> the invoke() method is detected as being in progress.
>>
>> -JB-
>>
> Jaroslav and I chatted on IM about this today. Jaroslav is going to
> have a go at changing the stub generator and will send a follow-up
> mail with an updated webrev (this time for the corba repo as the that
> is where the stub generator lives).

This is a follow-up. I've prepared the patch and put it on github - 
https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779

I wonder who else should be included in the review process since I am 
changing the IIOP generator code. Also, I didn't find any tests in the 
corba repository. Which test suite is appropriate to run after changing 
the corba related code?

-JB-


>
> -Alan


From Alan.Bateman at oracle.com  Thu Oct  4 08:42:15 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 04 Oct 2012 16:42:15 +0100
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <506DAB05.4050500@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com>
	<506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com>
Message-ID: <506DAE57.3020300@oracle.com>

On 04/10/2012 16:28, Jaroslav Bachorik wrote:
> :
> This is a follow-up. I've prepared the patch and put it on github -
> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779
>
> I wonder who else should be included in the review process since I am
> changing the IIOP generator code. Also, I didn't find any tests in the
> corba repository. Which test suite is appropriate to run after changing
> the corba related code?
>
> -JB-
>
I don't mind being reviewer and sponsor for this. Also cc'ing Sean as he 
is one of the maintainers of the corba code. I don't think the corba 
tests are in OpenJDK, at least I don't think Oracle has contributed its 
tests for this area.

I think your change looks okay and I assume you've at least run the JMX 
tests that use RMI-IIOP to verify that the intermittent NPE is gone and 
those tests now pass reliably.

Minor comment but if I were doing this myself then I probably would have 
added this instead:
p.pln(getName(theType) + " target = this.target;");

You'll see lots of examples of this in the core libs and j.u.c.

Also as target is now volatile then I'm not sure why you synchronized 
around target=null, perhaps there is other code generated in the tie 
class that I don't see?
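
For readers without the webrev at hand, here is a minimal sketch of the
shape being discussed: a volatile field read once into a local of the
same name, and a plain (unsynchronized) write in deactivate(). It is an
illustration only, not the code the generator actually emits; the class
VolatileTie, the Target interface, the tryInvoke() helper and the choice
of RemoteException for the null case are assumptions.

```java
import java.rmi.RemoteException;

// Hypothetical stand-in for a generated tie; all names are illustrative.
class VolatileTie {
    interface Target { String call(String arg); }

    private volatile Target target;

    VolatileTie(Target target) { this.target = target; }

    // A single volatile write needs no additional synchronization.
    void deactivate() { target = null; }

    String invoke(String arg) throws RemoteException {
        // One volatile read; the local shadows the field for the rest of the method.
        Target target = this.target;
        if (target == null) {
            throw new RemoteException("tie is deactivated");
        }
        return target.call(arg);
    }

    // Small helper so callers in this sketch need no checked-exception handling.
    static String tryInvoke(VolatileTie tie, String arg) {
        try {
            return tie.invoke(arg);
        } catch (RemoteException e) {
            return "RemoteException";
        }
    }

    public static void main(String[] args) {
        VolatileTie tie = new VolatileTie(s -> s.toUpperCase());
        System.out.println(tryInvoke(tie, "ping"));
        tie.deactivate();
        System.out.println(tryInvoke(tie, "ping"));
    }
}
```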

Otherwise it's great to get this issue finally resolved.

-Alan.


From jaroslav.bachorik at oracle.com  Thu Oct  4 08:56:10 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 04 Oct 2012 17:56:10 +0200
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <506DAE57.3020300@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com>
	<506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com>
	<506DAE57.3020300@oracle.com>
Message-ID: <506DB19A.7070604@oracle.com>

On Thu 04 Oct 2012 05:42:15 PM CEST, Alan Bateman wrote:
> On 04/10/2012 16:28, Jaroslav Bachorik wrote:
>> :
>> This is a follow-up. I've prepared the patch and put it on github -
>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779
>>
>> I wonder who else should be included in the review process since I am
>> changing the IIOP generator code. Also, I didn't find any tests in the
>> corba repository. Which test suite is appropriate to run after changing
>> the corba related code?
>>
>> -JB-
>>
> I don't mind being reviewer and sponsor for this. Also cc'ing Sean as
> he is one of the maintainers of the corba code. I don't think the

Thanks.

> corba tests are in OpenJDK, at least I don't think Oracle has
> contributed its tests for this area.
>
> I think your change looks okay and I assume you've at least run the
> JMX tests that use RMI-IIOP to verify that the intermittent NPE is
> gone and those tests now pass reliably.

Yes, I ran those. Especially on the test machine yielding the biggest 
ratio of failed tests previously. Now they all pass.

>
> Minor comment but if I were doing this myself then I probably would
> have added this instead:
> p.pln(getName(theType) + " target = this.target;");
>
> You'll see lots of examples of this in the core libs and j.u.c.

No problem. If it is a convention I will stick to it (I've used a 
different variable name to prevent confusion about what is a field and 
what is a local variable).

>
> Also as target is now volatile then I'm not sure why you synchronized
> around target=null, perhaps there is other code generated in the tie
> class that I don't see?

The synchronization should not be there. It just escaped my purging 
when I exchanged the synchronized access for volatile.

-JB-

>
> Otherwise it's great to get issue finally resolved.
>
> -Alan.
>
>


From jaroslav.bachorik at oracle.com  Fri Oct  5 01:10:31 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 05 Oct 2012 10:10:31 +0200
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <506DB19A.7070604@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com>
	<506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com>
	<506DAE57.3020300@oracle.com> <506DB19A.7070604@oracle.com>
Message-ID: <506E95F7.7000304@oracle.com>

I have updated the patch to reflect Alan's remarks. The webrev is at the
same location - github takes care of versioning ...

-JB-

On 10/04/2012 05:56 PM, Jaroslav Bachorik wrote:
> On Thu 04 Oct 2012 05:42:15 PM CEST, Alan Bateman wrote:
>> On 04/10/2012 16:28, Jaroslav Bachorik wrote:
>>> :
>>> This is a follow-up. I've prepared the patch and put it on github -
>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/7195779
>>>
>>> I wonder who else should be included in the review process since I am
>>> changing the IIOP generator code. Also, I didn't find any tests in the
>>> corba repository. Which test suite is appropriate to run after changing
>>> the corba related code?
>>>
>>> -JB-
>>>
>> I don't mind being reviewer and sponsor for this. Also cc'ing Sean as
>> he is one of the maintainers of the corba code. I don't think the
> 
> Thanks.
> 
>> corba tests are in OpenJDK, at least I don't think Oracle has
>> contributed its tests for this area.
>>
>> I think your change looks okay and I assume you've at least run the
>> JMX tests that use RMI-IIOP to verify that the intermittent NPE is
>> gone and those tests now pass reliably.
> 
> Yes, I ran those. Especially on the test machine yielding the biggest 
> ration of failed tests previously. Now they all pass.
> 
>>
>> Minor comment but if I were doing this myself then I probably would
>> have added this instead:
>> p.pln(getName(theType) + " target = this.target;");
>>
>> You'll see lots of examples of this in the core libs and j.u.c.
> 
> No problem. If it is a convention I will stick to it (I've used a 
> different variable name to prevent confusion about what is a field and 
> what is a local variable).
> 
>>
>> Also as target is now volatile then I'm not sure why you synchronized
>> around target=null, perhaps there is other code generated in the tie
>> class that I don't see?
> 
> The synchronization should not be there. It just escaped my purging 
> when I've exchanged the synchronized access for volatile.
> 
> -JB-
> 
>>
>> Otherwise it's great to get issue finally resolved.
>>
>> -Alan.
>>
>>
> 
> 


From Alan.Bateman at oracle.com  Fri Oct  5 01:59:51 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Fri, 05 Oct 2012 09:59:51 +0100
Subject: jmx-dev Review Request: 7195779
 javax/management/remote/mandatory/threads/ExecutorTest.java fail
 intermittently
In-Reply-To: <506E95F7.7000304@oracle.com>
References: <5059BD10.40801@oracle.com> <5059C33D.7@oracle.com>
	<5059C5B3.6020808@oracle.com> <5059CB04.30107@oracle.com>
	<CACBEn46uchSNDPpoTjNViEq0vpFO4HZAGsoBG+ZpF3jQbpLVRg@mail.gmail.com>
	<505C46EB.60800@oracle.com> <506AA6C9.30909@oracle.com>
	<506AEE45.7020403@oracle.com> <506DAB05.4050500@oracle.com>
	<506DAE57.3020300@oracle.com> <506DB19A.7070604@oracle.com>
	<506E95F7.7000304@oracle.com>
Message-ID: <506EA187.6040008@oracle.com>

On 05/10/2012 09:10, Jaroslav Bachorik wrote:
> I have updated the patch to reflect Alan's remarks. The webrev is at the
> same location - github takes care of versioning ...
>
> -JB-
>
Thanks Jaroslav, I grabbed it from here:

https://raw.github.com/jbachorik/openjdk-patches/master/webrevs/7195779/corba.patch

I'll push this into jdk8/tl shortly, listing you as the contributor.

-Alan.

From jaroslav.bachorik at oracle.com  Wed Oct 10 01:40:01 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 10 Oct 2012 10:40:01 +0200
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
Message-ID: <50753461.4070009@oracle.com>

I am looking for a review and a sponsor for this fix.

The issue is about an empty array of descriptors being written as a part
of the serialization process but not read when deserializing an
MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
skips all unread custom written fields it is not a behaviour required by
the specification and may cause problems.

The patch ensures the array is read in all cases - even when it is
known to be an empty one. That way all that has been written as a part
of serialization is read back.
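
To illustrate the principle (this is not the MBeanFeatureInfo code), the
sketch below shows a class whose readObject() reads back everything its
writeObject() wrote, including an empty array. FeatureInfoSketch, its
fields and the roundTrip() helper are hypothetical names used only for
this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; it only demonstrates symmetric custom serialization.
class FeatureInfoSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    transient String[] fieldNames;   // written and read by hand below

    FeatureInfoSketch(String name, String... fieldNames) {
        this.name = name;
        this.fieldNames = fieldNames;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeObject(fieldNames);   // written even when the array is empty
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Always consume what writeObject() produced, empty or not.
        fieldNames = (String[]) in.readObject();
    }

    // Serializes and deserializes the given instance in memory.
    static FeatureInfoSketch roundTrip(FeatureInfoSketch info) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(info);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FeatureInfoSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        FeatureInfoSketch copy = roundTrip(new FeatureInfoSketch("attr"));
        System.out.println(copy.name + " " + copy.fieldNames.length);
    }
}
```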

The webrev with the fix and test is available @
https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290

-JB-

From jaroslav.bachorik at oracle.com  Wed Oct 10 07:17:03 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 10 Oct 2012 16:17:03 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
Message-ID: <5075835F.3050604@oracle.com>

I am looking for a review and a sponsor.

The issue is about some javax.management.timer.Timer notifications not
being received by the listeners if the notifications are generated rapidly.

The problem is caused by ConcurrentModificationException being thrown -
the exception itself is ignored but the dispatcher logic is skipped.
Therefore the currently processed notification gets lost.

The CME is thrown due to the Timer.timerTable being iterated over while
other threads try to remove some of its elements. The fix consists of
replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
which handles such situations with grace.
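
The difference between the two map types can be reproduced in a single
thread. The sketch below is not the Timer code; TimerTableSketch and
its methods are hypothetical, and the removal inside the loop merely
stands in for another thread removing a timer while the table is being
iterated.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the map stands in for Timer.timerTable.
class TimerTableSketch {

    static Map<Integer, String> fill(Map<Integer, String> table) {
        for (int i = 0; i < 10; i++) {
            table.put(i, "timer-" + i);
        }
        return table;
    }

    // Returns true if iteration completed, false if a CME interrupted it.
    static boolean survivesRemovalDuringIteration(Map<Integer, String> table) {
        try {
            for (Integer id : table.keySet()) {
                // Simulates a different timer being removed mid-iteration.
                table.remove((id + 1) % 10);
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Hashtable: "
                + survivesRemovalDuringIteration(fill(new Hashtable<>())));
        System.out.println("ConcurrentHashMap: "
                + survivesRemovalDuringIteration(fill(new ConcurrentHashMap<>())));
    }
}
```

The iterators over Hashtable's collection views are fail-fast, whereas
ConcurrentHashMap iterators are weakly consistent and never throw
ConcurrentModificationException.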

The patch webrev is available @
https://jbs.oracle.com/bugs/browse/JDK-6809322

Thanks,

-JB-

From jaroslav.bachorik at oracle.com  Wed Oct 10 07:39:27 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 10 Oct 2012 16:39:27 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <5075835F.3050604@oracle.com>
References: <5075835F.3050604@oracle.com>
Message-ID: <5075889F.1070808@oracle.com>

I am sorry for the webrev URL - a stale clipboard :(

The correct webrev URL is
https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6809322

-JB-

On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
> I am looking for a review and a sponsor.
> 
> The issue is about some javax.management.timer.Timer notifications not
> being received by the listeners if the notifications are generated rapidly.
> 
> The problem is caused by ConcurrentModificationException being thrown -
> the exception itself is ignored but the dispatcher logic is skipped.
> Therefore the currently processed notification gets lost.
> 
> The CME is thrown due to the Timer.timerTable being iterated over while
> other threads try to remove some of its elements. Fix consists of
> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
> which handles such situations with grace.
> 
> The patch webrev is available @
> https://jbs.oracle.com/bugs/browse/JDK-6809322
> 
> Thanks,
> 
> -JB-
> 


From eamonn at mcmanus.net  Wed Oct 10 08:49:11 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Wed, 10 Oct 2012 08:49:11 -0700
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
In-Reply-To: <50753461.4070009@oracle.com>
References: <50753461.4070009@oracle.com>
Message-ID: <CACBEn44-BgZQbi1PWYc-w9Vj0a91AGBMEpLErsw++-9xhG+HAg@mail.gmail.com>

Hi Jaroslav,

The patch looks correct and the test is ingenious.

I do not understand why the previous SerializationTest needs to be
deleted. It doesn't seem that the new test is covering the same
things.

Reviewed-by: emcmanus

Incidentally I was not able to find a way to see the patch with the
usual webrev browser UI. Is there a link for that?

Regards,
Éamonn


2012/10/10 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
> I am looking for a review and a sponsor for this fix.
>
> The issue is about an empty array of descriptors being written as a part
> of the serialization process but not read when deserializing an
> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
> skips all unread custom written fields it is not a behaviour required by
> the specification and may cause problems.
>
> The patch makes the array to be read in all cases - even when it is
> known to be an empty one. That way all that has been written as a part
> of serialization is read back.
>
> The webrev with the fix and test is available @
> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290
>
> -JB-

From jaroslav.bachorik at oracle.com  Wed Oct 10 11:52:55 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 10 Oct 2012 20:52:55 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <5075BF37.2020400@oracle.com>
References: <5075835F.3050604@oracle.com> <5075889F.1070808@oracle.com>
	<50759773.7080602@oracle.com> <5075BDAF.6070703@oracle.com>
	<5075BF37.2020400@oracle.com>
Message-ID: <5075C407.9060105@oracle.com>

Thanks, could you update with this webrev? I've fixed the problem with
hgmq <-> webrev where the file copy is mistakenly marked as a file move.

Thanks,

-JB-

On 10/10/2012 08:32 PM, Dmitry Samersoff wrote:
> Jaroslav,
> 
> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322/
> 
> -Dmitry
> 
> 
> On 2012-10-10 22:25, Jaroslav Bachorik wrote:
>> On Wed 10 Oct 2012 05:42:43 PM CEST, Dmitry Samersoff wrote:
>>> Jaroslav,
>>>
>>> Not able to open it as a webrev - only list of files.
>>>
>>> E-mail me the webrev and I'll put it to
>>> file:///opt/src/jdks/openjdk-patches/webrevs/JDK-6809322.zip
>>
>>> cr.openjdk.net/~dsamersoff/sponsorship/jbachorik/NNNNN
>>
>> Attaching... Thanks. I didn't want to put the webrevs on rapidshare or 
>> the likes and github seemed like a nice choice. Unfortunately it is not 
>> possible to view raw files :(
>>
>> -JB-
>>
>>>
>>> -Dmitry
>>>
>>>
>>> On 2012-10-10 18:39, Jaroslav Bachorik wrote:
>>>> I am sorry for the webrev URL - a stale clipboard :(
>>>>
>>>> The correct webrev URL is
>>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6809322
>>>>
>>>> -JB-
>>>>
>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>> I am looking for a review and a sponsor.
>>>>>
>>>>> The issue is about some javax.management.timer.Timer notifications not
>>>>> being received by the listeners if the notifications are generated rapidly.
>>>>>
>>>>> The problem is caused by ConcurrentModificationException being thrown -
>>>>> the exception itself is ignored but the dispatcher logic is skipped.
>>>>> Therefore the currently processed notification gets lost.
>>>>>
>>>>> The CME is thrown due to the Timer.timerTable being iterated over while
>>>>> other threads try to remove some of its elements. Fix consists of
>>>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
>>>>> which handles such situations with grace.
>>>>>
>>>>> The patch webrev is available @
>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>>>
>>>>
>>>
>>>
>>
>>
> 
> 


From jaroslav.bachorik at oracle.com  Thu Oct 11 01:07:54 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 11 Oct 2012 10:07:54 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <5075835F.3050604@oracle.com>
References: <5075835F.3050604@oracle.com>
Message-ID: <50767E5A.7060908@oracle.com>

Dmitry has put the webrev on the public CR -
http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/

Thanks!

-JB-

On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
> I am looking for a review and a sponsor.
> 
> The issue is about some javax.management.timer.Timer notifications not
> being received by the listeners if the notifications are generated rapidly.
> 
> The problem is caused by ConcurrentModificationException being thrown -
> the exception itself is ignored but the dispatcher logic is skipped.
> Therefore the currently processed notification gets lost.
> 
> The CME is thrown due to the Timer.timerTable being iterated over while
> other threads try to remove some of its elements. Fix consists of
> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
> which handles such situations with grace.
> 
> The patch webrev is available @
> https://jbs.oracle.com/bugs/browse/JDK-6809322
> 
> Thanks,
> 
> -JB-
> 


From jaroslav.bachorik at oracle.com  Thu Oct 11 10:26:19 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 11 Oct 2012 19:26:19 +0200
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
In-Reply-To: <5075C26C.50309@oracle.com>
References: <50753461.4070009@oracle.com>
	<CACBEn44-BgZQbi1PWYc-w9Vj0a91AGBMEpLErsw++-9xhG+HAg@mail.gmail.com>
	<5075C26C.50309@oracle.com>
Message-ID: <5077013B.9040801@oracle.com>

Just to keep it clear - here is the webrev hosted at CR - 
http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/

-JB-

On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote:
> Hi,
>
> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote:
>> Hi Jaroslav,
>>
>> The patch looks correct and the test is ingenious.
>>
>> I do not understand why the previous SerializationTest needs to be
>> deleted. It doesn't seem that the new test is covering the same
>> things.
>
> I need to check that. I copied the SerializationTest.java to
> SerializationTest1.java - apparently the cooperation of the webrev and
> mercurial queues has its glitches :(
>
> I am attaching the corrected webrev.
>
> -JB-
>
>>
>> Reviewed-by: emcmanus
>>
>> Incidentally I was not able to find a way to see the patch with the
>> usual webrev browser UI. Is there a link for that?
>>
>> Regards,
>> Éamonn
>>
>>
>> 2012/10/10 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>>> I am looking for a review and a sponsor for this fix.
>>>
>>> The issue is about an empty array of descriptors being written as a part
>>> of the serialization process but not read when deserializing an
>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
>>> skips all unread custom written fields it is not a behaviour required by
>>> the specification and may cause problems.
>>>
>>> The patch makes the array to be read in all cases - even when it is
>>> known to be an empty one. That way all that has been written as a part
>>> of serialization is read back.
>>>
>>> The webrev with the fix and test is available @
>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290
>>>
>>> -JB-
>
>


From eamonn at mcmanus.net  Thu Oct 11 10:42:24 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Thu, 11 Oct 2012 10:42:24 -0700
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
In-Reply-To: <5077013B.9040801@oracle.com>
References: <50753461.4070009@oracle.com>
	<CACBEn44-BgZQbi1PWYc-w9Vj0a91AGBMEpLErsw++-9xhG+HAg@mail.gmail.com>
	<5075C26C.50309@oracle.com> <5077013B.9040801@oracle.com>
Message-ID: <CACBEn46XZt=paHR3Vj=TdzSCgVqSGzCT4TVjh9=wf1Byg984wQ@mail.gmail.com>

Looks good. A couple of minor nits about the test: there is a stray
IDE template comment on line 74, and the copyright date is wrong.

Éamonn


2012/10/11 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
> Just to keep it clear - here is the webrev hosted at CR -
> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/
>
> -JB-
>
> On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote:
>> Hi,
>>
>> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote:
>>> Hi Jaroslav,
>>>
>>> The patch looks correct and the test is ingenious.
>>>
>>> I do not understand why the previous SerializationTest needs to be
>>> deleted. It doesn't seem that the new test is covering the same
>>> things.
>>
>> I need to check that. I copied the SerializationTest.java to
>> SerializationTest1.java - apparently the cooperation of the webrev and
>> mercurial queues has its glitches :(
>>
>> I am attaching the corrected webrev.
>>
>> -JB-
>>
>>>
>>> Reviewed-by: emcmanus
>>>
>>> Incidentally I was not able to find a way to see the patch with the
>>> usual webrev browser UI. Is there a link for that?
>>>
>>> Regards,
>>> Éamonn
>>>
>>>
>>> 2012/10/10 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>>>> I am looking for a review and a sponsor for this fix.
>>>>
>>>> The issue is about an empty array of descriptors being written as a part
>>>> of the serialization process but not read when deserializing an
>>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
>>>> skips all unread custom written fields it is not a behaviour required by
>>>> the specification and may cause problems.
>>>>
>>>> The patch makes the array to be read in all cases - even when it is
>>>> known to be an empty one. That way all that has been written as a part
>>>> of serialization is read back.
>>>>
>>>> The webrev with the fix and test is available @
>>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290
>>>>
>>>> -JB-
>>
>>
>
>

From david.holmes at oracle.com  Thu Oct 11 19:44:31 2012
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 12 Oct 2012 12:44:31 +1000
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <50767E5A.7060908@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
Message-ID: <5077840F.6050601@oracle.com>

Hi Jaroslav,


On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
> Dmitry has put the webrev on the public CR -
> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>
> Thanks!
>
> -JB-
>
> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>> I am looking for a review and a sponsor.
>>
>> The issue is about some javax.management.timer.Timer notifications not
>> being received by the listeners if the notifications are generated rapidly.
>>
>> The problem is caused by ConcurrentModificationException being thrown -
>> the exception itself is ignored but the dispatcher logic is skipped.
>> Therefore the currently processed notification gets lost.

Can you point out where exactly in the code the exception is thrown and 
caught. I'd like to understand the problem better.

>>
>> The CME is thrown due to the Timer.timerTable being iterated over while
>> other threads try to remove some of its elements. Fix consists of
>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
>> which handles such situations with grace.

Be aware that it may also give surprising results as removal is no 
longer synchronized at all with processing. So it could now appear that 
a notification is processed after a listener has been removed.

David
-----

>> The patch webrev is available @
>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>
>> Thanks,
>>
>> -JB-
>>
>

From jaroslav.bachorik at oracle.com  Fri Oct 12 00:47:21 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 12 Oct 2012 09:47:21 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <5077840F.6050601@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com>
Message-ID: <5077CB09.7010005@oracle.com>

On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
> Hi Jaroslav,
>
>
> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>> Dmitry has put the webrev on the public CR -
>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>
>>
>> Thanks!
>>
>> -JB-
>>
>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>> I am looking for a review and a sponsor.
>>>
>>> The issue is about some javax.management.timer.Timer notifications not
>>> being received by the listeners if the notifications are generated
>>> rapidly.
>>>
>>> The problem is caused by ConcurrentModificationException being thrown -
>>> the exception itself is ignored but the dispatcher logic is skipped.
>>> Therefore the currently processed notification gets lost.
>
> Can you point out where exactly in the code the exception is thrown
> and caught. I'd like to understand the problem better.

The CME is thrown in the Timer.notifyAlarmClock() method in this case, 
but it may happen in other places as well.

Actually, access to the timerTable map is synchronized in some places 
but not in others. While switching the Hashtable for ConcurrentHashMap 
resolves this particular issue, it might be beneficial to correct the 
partial synchronization instead.
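
The failure mode can be reproduced in isolation. The following is a 
minimal sketch, not the actual Timer code: the class name and the 
single-threaded setup are assumptions chosen to make the behaviour 
deterministic. It shows that Hashtable's iterators are fail-fast even 
though its individual methods are synchronized.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical illustration; not the javax.management.timer.Timer code.
class TimerTableCmeSketch {

    // Returns true if removing an entry through the table while iterating
    // over it triggers a ConcurrentModificationException.
    static boolean removalDuringIterationFails() {
        Map<Integer, String> table = new Hashtable<>();
        for (int i = 0; i < 10; i++) {
            table.put(i, "notification-" + i);
        }
        try {
            for (Integer id : table.keySet()) {
                // Structural modification behind the iterator's back:
                // remove an entry other than the current one.
                table.remove((id + 1) % 10);
            }
        } catch (ConcurrentModificationException e) {
            // The iterator notices the modification on its next step.
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("CME thrown: " + removalDuringIterationFails());
    }
}
```

In the Timer case the removal comes from another thread rather than 
from inside the loop, but the iterator check that fails is the same.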

>
>>>
>>> The CME is thrown due to the Timer.timerTable being iterated over while
>>> other threads try to remove some of its elements. Fix consists of
>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
>>> which handles such situations with grace.
>
> Be aware that it may also give surprising results as removal is no
> longer synchronized at all with processing. So it could now appear
> that a notification is processed after a listener has been removed.

Indeed, the CME is the symptom of the out-of-order processing - the 
removal method is synchronized on (Timer.this) while the 
notifyAlarmClock() method, processing the notifications, runs 
unsynchronized.

Thanks for pointing this out. I will have something to think about.

-JB-

>
> David
> -----
>
>>> The patch webrev is available @
>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>
>>> Thanks,
>>>
>>> -JB-
>>>
>>


From jaroslav.bachorik at oracle.com  Fri Oct 12 06:14:47 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 12 Oct 2012 15:14:47 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <5077CB09.7010005@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
Message-ID: <507817C7.9000703@oracle.com>

The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/

I am sorry for this chaos with webrev locations but it's not that easy to
work efficiently without an OpenJDK username :/

-JB-

On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>> Hi Jaroslav,
>>
>>
>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>> Dmitry has put the webrev on the public CR -
>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>
>>>
>>> Thanks!
>>>
>>> -JB-
>>>
>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>> I am looking for a review and a sponsor.
>>>>
>>>> The issue is about some javax.management.timer.Timer notifications not
>>>> being received by the listeners if the notifications are generated
>>>> rapidly.
>>>>
>>>> The problem is caused by ConcurrentModificationException being thrown -
>>>> the exception itself is ignored but the dispatcher logic is skipped.
>>>> Therefore the currently processed notification gets lost.
>>
>> Can you point out where exactly in the code the exception is thrown
>> and caught. I'd like to understand the problem better.
> 
> The CME is thrown in Timer.notifyAlarmClock() method in this case - but 
> may happen in other places as well.
> 
> Actually, in some places the access to the timerTable map is 
> synchronized while in others it isn't. While switching the Hashtable 
> for ConcurrentHashMap resolves this particular issue it might be 
> beneficial to correct the partial synchronization instead.
> 
>>
>>>>
>>>> The CME is thrown due to the Timer.timerTable being iterated over while
>>>> other threads try to remove some of its elements. Fix consists of
>>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
>>>> which handles such situations with grace.
>>
>> Be aware that it may also give surprising results as removal is no
>> longer synchronized at all with processing. So it could now appear
>> that a notification is processed after a listener has been removed.
> 
> Indeed, the CME is the symptom of the out-of-order processing - the 
> removal method is synchronized on (Timer.this) while the 
> notifyAlarmClock() method, processing the notifications, runs 
> unsynchronized.
> 
> Thanks for pointing this out. I will have something to think about.
> 
> -JB-
> 
>>
>> David
>> -----
>>
>>>> The patch webrev is available @
>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>>
>>>
> 
> 


From jaroslav.bachorik at oracle.com  Fri Oct 12 06:16:46 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Fri, 12 Oct 2012 15:16:46 +0200
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
In-Reply-To: <CACBEn46XZt=paHR3Vj=TdzSCgVqSGzCT4TVjh9=wf1Byg984wQ@mail.gmail.com>
References: <50753461.4070009@oracle.com>
	<CACBEn44-BgZQbi1PWYc-w9Vj0a91AGBMEpLErsw++-9xhG+HAg@mail.gmail.com>
	<5075C26C.50309@oracle.com> <5077013B.9040801@oracle.com>
	<CACBEn46XZt=paHR3Vj=TdzSCgVqSGzCT4TVjh9=wf1Byg984wQ@mail.gmail.com>
Message-ID: <5078183E.7060106@oracle.com>

Thanks. Minor nits picked ...

http://btrace.kenai.com/webrevs/JDK-6783290/webrev.v3/

-JB-

On 10/11/2012 07:42 PM, Eamonn McManus wrote:
> Looks good. A couple of minor nits about the test: there is a stray
> IDE template comment on line 74, and the copyright date is wrong.
> 
> Éamonn
> 
> 
> 2012/10/11 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>> Just to keep it clear - here is the webrev hosted at CR -
>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/
>>
>> -JB-
>>
>> On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote:
>>> Hi,
>>>
>>> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote:
>>>> Hi Jaroslav,
>>>>
>>>> The patch looks correct and the test is ingenious.
>>>>
>>>> I do not understand why the previous SerializationTest needs to be
>>>> deleted. It doesn't seem that the new test is covering the same
>>>> things.
>>>
>>> I need to check that. I copied the SerializationTest.java to
>>> SerializationTest1.java - apparently the cooperation of the webrev and
>>> mercurial queues has its glitches :(
>>>
>>> I am attaching the corrected webrev.
>>>
>>> -JB-
>>>
>>>>
>>>> Reviewed-by: emcmanus
>>>>
>>>> Incidentally I was not able to find a way to see the patch with the
>>>> usual webrev browser UI. Is there a link for that?
>>>>
>>>> Regards,
>>>> Éamonn
>>>>
>>>>
>>>> 2012/10/10 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>>>>> I am looking for a review and a sponsor for this fix.
>>>>>
>>>>> The issue is about an empty array of descriptors being written as a part
>>>>> of the serialization process but not read when deserializing an
>>>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
>>>>> skips all unread custom written fields it is not a behaviour required by
>>>>> the specification and may cause problems.
>>>>>
>>>>> The patch makes the array to be read in all cases - even when it is
>>>>> known to be an empty one. That way all that has been written as a part
>>>>> of serialization is read back.
>>>>>
>>>>> The webrev with the fix and test is available @
>>>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290
>>>>>
>>>>> -JB-
>>>
>>>
>>
>>


From eamonn at mcmanus.net  Fri Oct 12 09:02:14 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Fri, 12 Oct 2012 09:02:14 -0700
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
In-Reply-To: <5078183E.7060106@oracle.com>
References: <50753461.4070009@oracle.com>
	<CACBEn44-BgZQbi1PWYc-w9Vj0a91AGBMEpLErsw++-9xhG+HAg@mail.gmail.com>
	<5075C26C.50309@oracle.com> <5077013B.9040801@oracle.com>
	<CACBEn46XZt=paHR3Vj=TdzSCgVqSGzCT4TVjh9=wf1Byg984wQ@mail.gmail.com>
	<5078183E.7060106@oracle.com>
Message-ID: <CACBEn46EBu2qHRyeaM-uNhMD0R89whX+zvAQ3seFJGvevrJBjA@mail.gmail.com>

Looks good to me (emcmanus).

Éamonn


2012/10/12 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>

> Thanks. Minor nits picked ...
>
> http://btrace.kenai.com/webrevs/JDK-6783290/webrev.v3/
>
> -JB-
>
> On 10/11/2012 07:42 PM, Eamonn McManus wrote:
> > Looks good. A couple of minor nits about the test: there is a stray
> > IDE template comment on line 74, and the copyright date is wrong.
> >
> > Éamonn
> >
> >
> > 2012/10/11 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
> >> Just to keep it clear - here is the webrev hosted at CR -
> >> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6783290-v1/
> >>
> >> -JB-
> >>
> >> On Wed 10 Oct 2012 08:46:04 PM CEST, Jaroslav Bachorik wrote:
> >>> Hi,
> >>>
> >>> On Wed 10 Oct 2012 05:49:11 PM CEST, Eamonn McManus wrote:
> >>>> Hi Jaroslav,
> >>>>
> >>>> The patch looks correct and the test is ingenious.
> >>>>
> >>>> I do not understand why the previous SerializationTest needs to be
> >>>> deleted. It doesn't seem that the new test is covering the same
> >>>> things.
> >>>
> >>> I need to check that. I copied the SerializationTest.java to
> >>> SerializationTest1.java - apparently the cooperation of the webrev and
> >>> mercurial queues has its glitches :(
> >>>
> >>> I am attaching the corrected webrev.
> >>>
> >>> -JB-
> >>>
> >>>>
> >>>> Reviewed-by: emcmanus
> >>>>
> >>>> Incidentally I was not able to find a way to see the patch with the
> >>>> usual webrev browser UI. Is there a link for that?
> >>>>
> >>>> Regards,
> >>>> Éamonn
> >>>>
> >>>>
> >>>> 2012/10/10 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
> >>>>> I am looking for a review and a sponsor for this fix.
> >>>>>
> >>>>> The issue is about an empty array of descriptors being written as a part
> >>>>> of the serialization process but not read when deserializing an
> >>>>> MBeanInfo/MBeanFeatureInfo instance. While the current ObjectInputStream
> >>>>> skips all unread custom written fields it is not a behaviour required by
> >>>>> the specification and may cause problems.
> >>>>>
> >>>>> The patch makes the array to be read in all cases - even when it is
> >>>>> known to be an empty one. That way all that has been written as a part
> >>>>> of serialization is read back.
> >>>>>
> >>>>> The webrev with the fix and test is available @
> >>>>> https://github.com/jbachorik/openjdk-patches/tree/master/webrevs/JDK-6783290
> >>>>>
> >>>>> -JB-
> >>>
> >>>
> >>
> >>
>
>

From david.holmes at oracle.com  Sun Oct 14 19:19:49 2012
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 15 Oct 2012 12:19:49 +1000
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <507817C7.9000703@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
	<507817C7.9000703@oracle.com>
Message-ID: <507B72C5.4060807@oracle.com>

Hi Jaroslav,

I think your changes now go further than needed. The original code uses 
a dual synchronization scheme:

a) it synchronizes most of the Timer methods
b) it also uses a thread-safe Hashtable

This means that not all of the Timer methods need to be synchronized 
because the only thread-safe action needed is the actual access to the 
Hashtable in some methods.

The flaw with the original code was simply that the iteration of the 
Hashtable in notifyAlarmClock was not done in a thread-safe manner. I 
believe this could be fixed simply by synchronizing on the Hashtable here:

1186         synchronized(timerTable) {

with no need to change the type of the timerTable, nor the 
synchronization on other Timer methods. You could alternatively 
synchronize on the Timer itself - as you now do - provided all methods 
of the Timer that mutate the Hashtable are themselves synchronized on 
the timer.
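
As a rough sketch of that first option (hypothetical class and method 
names, not the Timer source): the iterating thread holds the 
Hashtable's own monitor for the whole loop, and since Hashtable.remove 
is itself synchronized on the table, a concurrent remover simply waits 
until the loop has finished.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;

// Hypothetical illustration; not the javax.management.timer.Timer code.
class SynchronizedIterationSketch {
    private final Hashtable<Integer, String> timerTable = new Hashtable<>();

    SynchronizedIterationSketch(int entries) {
        for (int i = 0; i < entries; i++) {
            timerTable.put(i, "notification-" + i);
        }
    }

    // Hashtable.remove locks the table, so it cannot interleave with
    // a loop that holds the same monitor.
    void removeNotification(int id) {
        timerTable.remove(id);
    }

    // Iterates under the table's monitor; returns the entries visited.
    int dispatchAll() {
        int dispatched = 0;
        synchronized (timerTable) {
            for (String notification : timerTable.values()) {
                dispatched++;
            }
        }
        return dispatched;
    }

    // Runs a remover thread against repeated dispatch loops and reports
    // whether the loops completed without a CME.
    static boolean runOnce() {
        final SynchronizedIterationSketch sketch =
                new SynchronizedIterationSketch(1000);
        Thread remover = new Thread(() -> {
            for (int id = 0; id < 1000; id++) {
                sketch.removeNotification(id);
            }
        });
        remover.start();
        boolean ok = true;
        try {
            for (int round = 0; round < 100; round++) {
                sketch.dispatchAll();
            }
        } catch (ConcurrentModificationException e) {
            ok = false;
        }
        try {
            remover.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            ok = false;
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("completed without CME: " + runOnce());
    }
}
```

The trade-off is that whatever happens inside the loop runs with the 
table locked, so the critical section is as long as the dispatch work.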

What you have is not incorrect though, and may remove unnecessary 
synchronization in some cases (but increases the size of critical 
sections in others).

Also here:

165     volatile private int counterID = 0;

there is no need to add volatile as counterID is only accessed within 
synchronized methods.
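
A small sketch of why the plain field is enough, under the stated 
assumption that every read and write happens inside a method 
synchronized on the same object (the class and method names here are 
made up for illustration):

```java
// Hypothetical illustration; not the javax.management.timer.Timer code.
class CounterSketch {
    // No volatile: entering and leaving the synchronized methods below
    // already gives the visibility and atomicity the field needs.
    private int counterID = 0;

    synchronized int nextId() {
        return ++counterID;
    }

    synchronized int current() {
        return counterID;
    }

    // Increments the counter from several threads and returns the total.
    static int hammer(int threads, int perThread) {
        final CounterSketch counter = new CounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.nextId();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.current();
    }

    public static void main(String[] args) {
        System.out.println("final count: " + hammer(4, 10000));
    }
}
```

If any access to the field were to move outside a synchronized method, 
this argument would no longer hold.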

David
-----

On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote:
> The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/
>
> I am sorry for this chaos with webrev locations but it's not that easy to
> work efficiently without an OpenJDK username :/
>
> -JB-
>
> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>>> Hi Jaroslav,
>>>
>>>
>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>>> Dmitry has put the webrev on the public CR -
>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>>
>>>>
>>>> Thanks!
>>>>
>>>> -JB-
>>>>
>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>> I am looking for a review and a sponsor.
>>>>>
>>>>> The issue is about some javax.management.timer.Timer notifications not
>>>>> being received by the listeners if the notifications are generated
>>>>> rapidly.
>>>>>
>>>>> The problem is caused by ConcurrentModificationException being thrown -
>>>>> the exception itself is ignored but the dispatcher logic is skipped.
>>>>> Therefore the currently processed notification gets lost.
>>>
>>> Can you point out where exactly in the code the exception is thrown
>>> and caught. I'd like to understand the problem better.
>>
>> The CME is thrown in Timer.notifyAlarmClock() method in this case - but
>> may happen in other places as well.
>>
>> Actually, in some places the access to the timerTable map is
>> synchronized while in others it isn't. While switching the Hashtable
>> for ConcurrentHashMap resolves this particular issue it might be
>> beneficial to correct the partial synchronization instead.
>>
>>>
>>>>>
>>>>> The CME is thrown due to the Timer.timerTable being iterated over while
>>>>> other threads try to remove some of its elements. Fix consists of
>>>>> replacing the Hashtable used for Timer.timerTable by ConcurrentHashMap
>>>>> which handles such situations with grace.
>>>
>>> Be aware that it may also give surprising results as removal is no
>>> longer synchronized at all with processing. So it could now appear
>>> that a notification is processed after a listener has been removed.
>>
>> Indeed, the CME is the symptom of the out-of-order processing - the
>> removal method is synchronized on (Timer.this) while the
>> notifyAlarmClock() method, processing the notifications, runs
>> unsynchronized.
>>
>> Thanks for pointing this out. I will have something to think about.
>>
>> -JB-
>>
>>>
>>> David
>>> -----
>>>
>>>>> The patch webrev is available @
>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>
>>>>> Thanks,
>>>>>
>>>>> -JB-
>>>>>
>>>>
>>
>>
>

From jaroslav.bachorik at oracle.com  Mon Oct 15 03:08:30 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 15 Oct 2012 12:08:30 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <507B72C5.4060807@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
	<507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com>
Message-ID: <507BE09E.5090702@oracle.com>

Thanks David,

On 10/15/2012 04:19 AM, David Holmes wrote:
> Hi Jaroslav,
> 
> I think your changes now go further than needed. The original code uses
> a dual synchronization scheme:
> 
> a) it synchronizes most of the Timer methods
> b) it also uses a thread-safe Hashtable
> 
> This means that not all of the Timer methods need to be synchronized
> because the only thread-safe action needed is the actual access to the
> Hashtable in some methods.
> 
> The flaw with the original code was simply that the iteration of the
> Hashtable in notifyAlarmClock was not done in a thread-safe manner. I
> believe this could be fixed simply by synchronizing on the Hashtable here:
> 
> 1186         synchronized(timerTable) {
> 
> with no need to change the type of the timerTable, nor the
> synchronization on other Timer methods. You could alternatively
> synchronize on the Timer itself - as you now do - provided all methods
> of the Timer that mutate the Hashtable are themselves synchronized on
> the timer.
> 
> What you have is not incorrect though, and may remove unnecessary
> synchronization in some cases (but increases the size of critical
> sections in others).
> 
> Also here:
> 
> 165     volatile private int counterID = 0;
> 
> there is no need to add volatile as counterID is only accessed within
> synchronized methods.

Yes, I see your point. I just want to ask: when fixing issues like
this, is the preferred way to introduce minimal changes even if it
means leaving parts of the code sub-optimal? IMO, having a dual
synchronization scheme might be considered sub-optimal as it makes it
more difficult to see the author's intentions.

But I am fine with leaving the Hashtable intact and just synchronizing
the iteration part correctly - it resolves the issue.

The updated webrev is available at
http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4

Regards,

-JB-

> 
> David
> -----
> 
> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote:
>> The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/
>>
>> I am sorry for this chaos with webrev locations but it's not that easy to
>> work efficiently without an OpenJDK username :/
>>
>> -JB-
>>
>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>>>> Hi Jaroslav,
>>>>
>>>>
>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>>>> Dmitry has put the webrev on the public CR -
>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>>>
>>>>>
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -JB-
>>>>>
>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>>> I am looking for a review and a sponsor.
>>>>>>
>>>>>> The issue is about some javax.management.timer.Timer notifications
>>>>>> not
>>>>>> being received by the listeners if the notifications are generated
>>>>>> rapidly.
>>>>>>
>>>>>> The problem is caused by ConcurrentModificationException being
>>>>>> thrown -
>>>>>> the exception itself is ignored but the dispatcher logic is skipped.
>>>>>> Therefore the currently processed notification gets lost.
>>>>
>>>> Can you point out where exactly in the code the exception is thrown
>>>> and caught. I'd like to understand the problem better.
>>>
>>> The CME is thrown in Timer.notifyAlarmClock() method in this case - but
>>> may happen in other places as well.
>>>
>>> Actually, in some places the access to the timerTable map is
>>> synchronized while in others it isn't. While switching the Hashtable
>>> for ConcurrentHashMap resolves this particular issue it might be
>>> beneficial to correct the partial synchronization instead.
>>>
>>>>
>>>>>>
>>>>>> The CME is thrown due to the Timer.timerTable being iterated over
>>>>>> while
>>>>>> other threads try to remove some of its elements. Fix consists of
>>>>>> replacing the Hashtable used for Timer.timerTable by
>>>>>> ConcurrentHashMap
>>>>>> which handles such situations with grace.
>>>>
>>>> Be aware that it may also give surprising results as removal is no
>>>> longer synchronized at all with processing. So it could now appear
>>>> that a notification is processed after a listener has been removed.
>>>
>>> Indeed, the CME is the symptom of the out-of-order processing - the
>>> removal method is synchronized on (Timer.this) while the
>>> notifyAlarmClock() method, processing the notifications, runs
>>> unsynchronized.
>>>
>>> Thanks for pointing this out. I will have something to think about.
>>>
>>> -JB-
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>>>> The patch webrev is available @
>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>
>>>
>>>
>>


From david.holmes at oracle.com  Mon Oct 15 04:45:37 2012
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 15 Oct 2012 21:45:37 +1000
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <507BE09E.5090702@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
	<507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com>
	<507BE09E.5090702@oracle.com>
Message-ID: <507BF761.8040500@oracle.com>

On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote:
> On 10/15/2012 04:19 AM, David Holmes wrote:
>> I think your changes now go further than needed. The original code uses
>> a dual synchronization scheme:
>>
>> a) it synchronizes most of the Timer methods
>> b) it also uses a thread-safe Hashtable
>>
>> This means that not all of the Timer methods need to be synchronized
>> because the only thread-safe action needed is the actual access to the
>> Hashtable in some methods.
>>
>> The flaw with the original code was simply that the iteration of the
>> Hashtable in notifyAlarmClock was not done in a thread-safe manner. I
>> believe this could be fixed simply by synchronizing on the Hashtable here:
>>
>> 1186         synchronized(timerTable) {
>>
>> with no need to change the type of the timerTable, nor the
>> synchronization on other Timer methods. You could alternatively
>> synchronize on the Timer itself - as you now do - provided all methods
>> of the Timer that mutate the Hashtable are themselves synchronized on
>> the timer.
>>
>> What you have is not incorrect though, and may remove unnecessary
>> synchronization in some cases (but increases the size of critical
>> sections in others).
>>
>> Also here:
>>
>> 165     volatile private int counterID = 0;
>>
>> there is no need to add volatile as counterID is only accessed within
>> synchronized methods.
>
> Yes, I see your point. I just want to ask - in cases of fixing issues
> like this the preferred way is to introduce minimal changes even if it
> means leaving the parts of the code sub-optimal? IMO, having dual
> synchronization scheme might be considered as sub-optimal as it makes it
> more difficult to see the author's intentions.

Optimal depends on your evaluation criteria. The original design may 
have been done with performance in mind and a view to minimising 
critical sections. Without knowing what the original design criteria 
were, and unless you are fixing a problem caused by key aspects of that 
design, minimal changes should be favoured.

> But I am fine with leaving the Hashtable intact and just synchronizing
> the iteration part correctly - it resolves the issue.
>
> The update webrev is available at
> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4

I'm not sure the comment is needed in that form. Hashtable is 
synchronized internally but you need to use external synchronization when 
iterating through it.

David

> Regards,
>
> -JB-
>
>>
>> David
>> -----
>>
>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote:
>>> The updated webrev is now at http://btrace.kenai.com/webrevs/JDK-6809322/
>>>
>>> I am sorry for this chaos with webrev locations but it's not that easy to
>>> work efficiently without an OpenJDK username :/
>>>
>>> -JB-
>>>
>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>>>>> Hi Jaroslav,
>>>>>
>>>>>
>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>>>>> Dmitry has put the webrev on the public CR -
>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>>>> I am looking for a review and a sponsor.
>>>>>>>
>>>>>>> The issue is about some javax.management.timer.Timer notifications
>>>>>>> not
>>>>>>> being received by the listeners if the notifications are generated
>>>>>>> rapidly.
>>>>>>>
>>>>>>> The problem is caused by ConcurrentModificationException being
>>>>>>> thrown -
>>>>>>> the exception itself is ignored but the dispatcher logic is skipped.
>>>>>>> Therefore the currently processed notification gets lost.
>>>>>
>>>>> Can you point out where exactly in the code the exception is thrown
>>>>> and caught. I'd like to understand the problem better.
>>>>
>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case - but
>>>> may happen in other places as well.
>>>>
>>>> Actually, in some places the access to the timerTable map is
>>>> synchronized while in others it isn't. While switching the Hashtable
>>>> for ConcurrentHashMap resolves this particular issue it might be
>>>> beneficial to correct the partial synchronization instead.
>>>>
>>>>>
>>>>>>>
>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over
>>>>>>> while
>>>>>>> other threads try to remove some of its elements. Fix consists of
>>>>>>> replacing the Hashtable used for Timer.timerTable by
>>>>>>> ConcurrentHashMap
>>>>>>> which handles such situations with grace.
>>>>>
>>>>> Be aware that it may also give surprising results as removal is no
>>>>> longer synchronized at all with processing. So it could now appear
>>>>> that a notification is processed after a listener has been removed.
>>>>
>>>> Indeed, the CME is the symptom of the out-of-order processing - the
>>>> removal method is synchronized on (Timer.this) while the
>>>> notifyAlarmClock() method, processing the notifications, runs
>>>> unsynchronized.
>>>>
>>>> Thanks for pointing this out. I will have something to think about.
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>>>> The patch webrev is available @
>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>
>>>>
>>>>
>>>
>

From jaroslav.bachorik at oracle.com  Mon Oct 15 05:14:58 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 15 Oct 2012 14:14:58 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <507BF761.8040500@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
	<507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com>
	<507BE09E.5090702@oracle.com> <507BF761.8040500@oracle.com>
Message-ID: <507BFE42.1050200@oracle.com>

On 10/15/2012 01:45 PM, David Holmes wrote:
> On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote:
>> On 10/15/2012 04:19 AM, David Holmes wrote:
>>> I think your changes now go further than needed. The original code uses
>>> a dual synchronization scheme:
>>>
>>> a) it synchronizes most of the Timer methods
>>> b) it also uses a thread-safe Hashtable
>>>
>>> This means that not all of the Timer methods need to be synchronized
>>> because the only thread-safe action needed is the actual access to the
>>> Hashtable in some methods.
>>>
>>> The flaw with the original code was simply that the iteration of the
>>> Hashtable in notifyAlarmClock was not done in a thread-safe manner. I
>>> believe this could be fixed simply by synchronizing on the Hashtable
>>> here:
>>>
>>> 1186         synchronized(timerTable) {
>>>
>>> with no need to change the type of the timerTable, nor the
>>> synchronization on other Timer methods. You could alternatively
>>> synchronize on the Timer itself - as you now do - provided all methods
>>> of the Timer that mutate the Hashtable are themselves synchronized on
>>> the timer.
>>>
>>> What you have is not incorrect though, and may remove unnecessary
>>> synchronization in some cases (but increases the size of critical
>>> sections in others).
>>>
>>> Also here:
>>>
>>> 165     volatile private int counterID = 0;
>>>
>>> there is no need to add volatile as counterID is only accessed within
>>> synchronized methods.
>>
>> Yes, I see your point. I just want to ask - in cases of fixing issues
>> like this the preferred way is to introduce minimal changes even if it
>> means leaving the parts of the code sub-optimal? IMO, having dual
>> synchronization scheme might be considered as sub-optimal as it makes it
>> more difficult to see the author's intentions.
> 
> Optimal depends on your evaluation criteria. The original design may
> have been done with performance in mind and a view to minimising
> critical sections. Without knowing what the original design criteria
> was, and unless you are fixing a problem caused by key aspects of that
> design, then minimal changes should be favoured.
> 
>> But I am fine with leaving the Hashtable intact and just synchronizing
>> the iteration part correctly - it resolves the issue.
>>
>> The update webrev is available at
>> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4
> 
> I'm not sure the comment is needed in that form. Hashtable is
> synchronized internally but you need to use external synchronization when
> iterating through it.

Ok, it's gone. http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v5/

> 
> David
> 
>> Regards,
>>
>> -JB-
>>
>>>
>>> David
>>> -----
>>>
>>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote:
>>>> The updated webrev is now at
>>>> http://btrace.kenai.com/webrevs/JDK-6809322/
>>>>
>>>> I am sorry for this chaos with webrev locations but it's not that
>>>> easy to
>>>> work efficiently without an OpenJDK username :/
>>>>
>>>> -JB-
>>>>
>>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
>>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>>>>>> Hi Jaroslav,
>>>>>>
>>>>>>
>>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>>>>>> Dmitry has put the webrev on the public CR -
>>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>>>>> I am looking for a review and a sponsor.
>>>>>>>>
>>>>>>>> The issue is about some javax.management.timer.Timer notifications
>>>>>>>> not
>>>>>>>> being received by the listeners if the notifications are generated
>>>>>>>> rapidly.
>>>>>>>>
>>>>>>>> The problem is caused by ConcurrentModificationException being
>>>>>>>> thrown -
>>>>>>>> the exception itself is ignored but the dispatcher logic is
>>>>>>>> skipped.
>>>>>>>> Therefore the currently processed notification gets lost.
>>>>>>
>>>>>> Can you point out where exactly in the code the exception is thrown
>>>>>> and caught. I'd like to understand the problem better.
>>>>>
>>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case -
>>>>> but
>>>>> may happen in other places as well.
>>>>>
>>>>> Actually, in some places the access to the timerTable map is
>>>>> synchronized while in others it isn't. While switching the Hashtable
>>>>> for ConcurrentHashMap resolves this particular issue it might be
>>>>> beneficial to correct the partial synchronization instead.
>>>>>
>>>>>>
>>>>>>>>
>>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over
>>>>>>>> while
>>>>>>>> other threads try to remove some of its elements. Fix consists of
>>>>>>>> replacing the Hashtable used for Timer.timerTable by
>>>>>>>> ConcurrentHashMap
>>>>>>>> which handles such situations with grace.
>>>>>>
>>>>>> Be aware that it may also give surprising results as removal is no
>>>>>> longer synchronized at all with processing. So it could now appear
>>>>>> that a notification is processed after a listener has been removed.
>>>>>
>>>>> Indeed, the CME is the symptom of the out-of-order processing - the
>>>>> removal method is synchronized on (Timer.this) while the
>>>>> notifyAlarmClock() method, processing the notifications, runs
>>>>> unsynchronized.
>>>>>
>>>>> Thanks for pointing this out. I will have something to think about.
>>>>>
>>>>> -JB-
>>>>>
>>>>>>
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>>> The patch webrev is available @
>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>


From david.holmes at oracle.com  Mon Oct 15 05:18:25 2012
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 15 Oct 2012 22:18:25 +1000
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <507BFE42.1050200@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
	<507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com>
	<507BE09E.5090702@oracle.com> <507BF761.8040500@oracle.com>
	<507BFE42.1050200@oracle.com>
Message-ID: <507BFF11.1080404@oracle.com>

Looks good to me.

Hopefully someone else will chime in too :)
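
For anyone joining the thread late, the approach we converged on can be
sketched roughly as below. This is only an illustration under my own
assumptions: the class, field types and method names are hypothetical
stand-ins, not the actual Timer source. The point is that single
Hashtable calls are already thread-safe, while iteration is a compound
action and needs the table's own monitor held for the whole loop.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Timer's table handling; not the JDK source.
class TimerTableSketch {
    private final Hashtable<Integer, String> timerTable = new Hashtable<>();

    void add(int id, String type) {
        timerTable.put(id, type);   // single calls are already thread-safe
    }

    void remove(int id) {
        timerTable.remove(id);      // Hashtable locks on itself here
    }

    // Iteration is a compound action, so hold the table's own monitor for
    // the whole loop. Concurrent put/remove calls then block until the loop
    // finishes instead of causing a ConcurrentModificationException.
    List<String> snapshotTypes() {
        List<String> result = new ArrayList<>();
        synchronized (timerTable) {
            for (Map.Entry<Integer, String> e : timerTable.entrySet()) {
                result.add(e.getValue());
            }
        }
        return result;
    }
}
```

Copying the values out under the lock and dispatching afterwards keeps
the critical section short, which matters if the listeners are slow.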

Thanks,
David

On 15/10/2012 10:14 PM, Jaroslav Bachorik wrote:
> On 10/15/2012 01:45 PM, David Holmes wrote:
>> On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote:
>>> On 10/15/2012 04:19 AM, David Holmes wrote:
>>>> I think your changes now go further than needed. The original code uses
>>>> a dual synchronization scheme:
>>>>
>>>> a) it synchronizes most of the Timer methods
>>>> b) it also uses a thread-safe Hashtable
>>>>
>>>> This means that not all of the Timer methods need to be synchronized
>>>> because the only thread-safe action needed is the actual access to the
>>>> Hashtable in some methods.
>>>>
>>>> The flaw with the original code was simply that the iteration of the
>>>> Hashtable in notifyAlarmClock was not done in a thread-safe manner. I
>>>> believe this could be fixed simply by synchronizing on the Hashtable
>>>> here:
>>>>
>>>> 1186         synchronized(timerTable) {
>>>>
>>>> with no need to change the type of the timerTable, nor the
>>>> synchronization on other Timer methods. You could alternatively
>>>> synchronize on the Timer itself - as you now do - provided all methods
>>>> of the Timer that mutate the Hashtable are themselves synchronized on
>>>> the timer.
>>>>
>>>> What you have is not incorrect though, and may remove unnecessary
>>>> synchronization in some cases (but increases the size of critical
>>>> sections in others).
>>>>
>>>> Also here:
>>>>
>>>> 165     volatile private int counterID = 0;
>>>>
>>>> there is no need to add volatile as counterID is only accessed within
>>>> synchronized methods.
>>>
>>> Yes, I see your point. I just want to ask - in cases of fixing issues
>>> like this the preferred way is to introduce minimal changes even if it
>>> means leaving the parts of the code sub-optimal? IMO, having dual
>>> synchronization scheme might be considered as sub-optimal as it makes it
>>> more difficult to see the author's intentions.
>>
>> Optimal depends on your evaluation criteria. The original design may
>> have been done with performance in mind and a view to minimising
>> critical sections. Without knowing what the original design criteria
>> was, and unless you are fixing a problem caused by key aspects of that
>> design, then minimal changes should be favoured.
>>
>>> But I am fine with leaving the Hashtable intact and just synchronizing
>>> the iteration part correctly - it resolves the issue.
>>>
>>> The update webrev is available at
>>> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4
>>
>> I'm not sure the comment is needed in that form. Hashtable is
>> synchronized internally but you need to use external synchronization when
>> iterating through it.
>
> Ok, it's gone. http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v5/
>
>>
>> David
>>
>>> Regards,
>>>
>>> -JB-
>>>
>>>>
>>>> David
>>>> -----
>>>>
>>>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote:
>>>>> The updated webrev is now at
>>>>> http://btrace.kenai.com/webrevs/JDK-6809322/
>>>>>
>>>>> I am sorry for this chaos with webrev locations but its not that
>>>>> easy to
>>>>> work efficiently without an OpenJDK username :/
>>>>>
>>>>> -JB-
>>>>>
>>>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
>>>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>>>>>>> Hi Jaroslav,
>>>>>>>
>>>>>>>
>>>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>>>>>>> Dmitry has put the webrev on the public CR -
>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> -JB-
>>>>>>>>
>>>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>>>>>> I am looking for a review and a sponsor.
>>>>>>>>>
>>>>>>>>> The issue is about some javax.management.timer.Timer notifications
>>>>>>>>> not
>>>>>>>>> being received by the listeners if the notifications are generated
>>>>>>>>> rapidly.
>>>>>>>>>
>>>>>>>>> The problem is caused by ConcurrentModificationException being
>>>>>>>>> thrown -
>>>>>>>>> the exception itself is ignored but the dispatcher logic is
>>>>>>>>> skipped.
>>>>>>>>> Therefore the currently processed notification gets lost.
>>>>>>>
>>>>>>> Can you point out where exactly in the code the exception is thrown
>>>>>>> and caught. I'd like to understand the problem better.
>>>>>>
>>>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case -
>>>>>> but
>>>>>> may happen in other places as well.
>>>>>>
>>>>>> Actually, in some places the access to the timerTable map is
>>>>>> synchronized while in others it isn't. While switching the Hashtable
>>>>>> for ConcurrentHashMap resolves this particular issue it might be
>>>>>> beneficial to correct the partial synchronization instead.
>>>>>>
>>>>>>>
>>>>>>>>>
>>>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over
>>>>>>>>> while
>>>>>>>>> other threads try to remove some of its elements. Fix consists of
>>>>>>>>> replacing the Hashtable used for Timer.timerTable by
>>>>>>>>> ConcurrentHashMap
>>>>>>>>> which handles such situations with grace.
>>>>>>>
>>>>>>> Be aware that it may also give surprising results as removal is no
>>>>>>> longer synchronized at all with processing. So it could now appear
>>>>>>> that a notification is processed after a listener has been removed.
>>>>>>
>>>>>> Indeed, the CME is the symptom of the out-of-order processing - the
>>>>>> removal method is synchronized on (Timer.this) while the
>>>>>> notifyAlarmClock() method, processing the notifications, runs
>>>>>> unsynchronized.
>>>>>>
>>>>>> Thanks for pointing this out. I will have something to think about.
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>>>
>>>>>>> David
>>>>>>> -----
>>>>>>>
>>>>>>>>> The patch webrev is available @
>>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>

From jaroslav.bachorik at oracle.com  Mon Oct 15 09:17:16 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 15 Oct 2012 18:17:16 +0200
Subject: jmx-dev [PATCH] JDK-6809322: Missing notifications from
	javax.management.timer.Timer
In-Reply-To: <12F6F059-25F8-4448-8D1E-25C25DAE87E9@oracle.com>
References: <5075835F.3050604@oracle.com> <50767E5A.7060908@oracle.com>
	<5077840F.6050601@oracle.com> <5077CB09.7010005@oracle.com>
	<507817C7.9000703@oracle.com> <507B72C5.4060807@oracle.com>
	<507BE09E.5090702@oracle.com> <507BF761.8040500@oracle.com>
	<507BFE42.1050200@oracle.com>
	<12F6F059-25F8-4448-8D1E-25C25DAE87E9@oracle.com>
Message-ID: <507C370C.8040906@oracle.com>

Thank you guys for the reviews! I would like to kindly ask someone with
the committer rights to sponsor this fix and push the change.


Thanks!

-JB-

On 10/15/2012 04:15 PM, Rickard Bäckman wrote:
> Looks good!
> 
> /R
> 
> On Oct 15, 2012, at 2:14 PM, Jaroslav Bachorik wrote:
> 
>> On 10/15/2012 01:45 PM, David Holmes wrote:
>>> On 15/10/2012 8:08 PM, Jaroslav Bachorik wrote:
>>>> On 10/15/2012 04:19 AM, David Holmes wrote:
>>>>> I think your changes now go further than needed. The original code uses
>>>>> a dual synchronization scheme:
>>>>>
>>>>> a) it synchronizes most of the Timer methods
>>>>> b) it also uses a thread-safe Hashtable
>>>>>
>>>>> This means that not all of the Timer methods need to be synchronized
>>>>> because the only thread-safe action needed is the actual access to the
>>>>> Hashtable in some methods.
>>>>>
>>>>> The flaw with the original code was simply that the iteration of the
>>>>> Hashtable in notifyAlarmClock was not done in a thread-safe manner. I
>>>>> believe this could be fixed simply by synchronizing on the Hashtable
>>>>> here:
>>>>>
>>>>> 1186         synchronized(timerTable) {
>>>>>
>>>>> with no need to change the type of the timerTable, nor the
>>>>> synchronization on other Timer methods. You could alternatively
>>>>> synchronize on the Timer itself - as you now do - provided all methods
>>>>> of the Timer that mutate the Hashtable are themselves synchronized on
>>>>> the timer.
>>>>>
>>>>> What you have is not incorrect though, and may remove unnecessary
>>>>> synchronization in some cases (but increases the size of critical
>>>>> sections in others).
>>>>>
>>>>> Also here:
>>>>>
>>>>> 165     volatile private int counterID = 0;
>>>>>
>>>>> there is no need to add volatile as counterID is only accessed within
>>>>> synchronized methods.
>>>>
>>>> Yes, I see your point. I just want to ask - in cases of fixing issues
>>>> like this the preferred way is to introduce minimal changes even if it
>>>> means leaving the parts of the code sub-optimal? IMO, having dual
>>>> synchronization scheme might be considered as sub-optimal as it makes it
>>>> more difficult to see the author's intentions.
>>>
>>> Optimal depends on your evaluation criteria. The original design may
>>> have been done with performance in mind and a view to minimising
>>> critical sections. Without knowing what the original design criteria
>>> was, and unless you are fixing a problem caused by key aspects of that
>>> design, then minimal changes should be favoured.
>>>
>>>> But I am fine with leaving the Hashtable intact and just synchronizing
>>>> the iteration part correctly - it resolves the issue.
>>>>
>>>> The update webrev is available at
>>>> http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v4
>>>
>>> I'm not sure the comment is needed in that form. Hashtable is
>>> synchronized internally but you need to use external synchronization when
>>> iterating through it.
>>
>> Ok, it's gone. http://btrace.kenai.com/webrevs/JDK-6809322/webrev.v5/
>>
>>>
>>> David
>>>
>>>> Regards,
>>>>
>>>> -JB-
>>>>
>>>>>
>>>>> David
>>>>> -----
>>>>>
>>>>> On 12/10/2012 11:14 PM, Jaroslav Bachorik wrote:
>>>>>> The updated webrev is now at
>>>>>> http://btrace.kenai.com/webrevs/JDK-6809322/
>>>>>>
>>>>>> I am sorry for this chaos with webrev locations but its not that
>>>>>> easy to
>>>>>> work efficiently without an OpenJDK username :/
>>>>>>
>>>>>> -JB-
>>>>>>
>>>>>> On 10/12/2012 09:47 AM, Jaroslav Bachorik wrote:
>>>>>>> On Fri 12 Oct 2012 04:44:31 AM CEST, David Holmes wrote:
>>>>>>>> Hi Jaroslav,
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/10/2012 6:07 PM, Jaroslav Bachorik wrote:
>>>>>>>>> Dmitry has put the webrev on the public CR -
>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/sponsorship/jbachorik/JDK-6809322-v2/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> -JB-
>>>>>>>>>
>>>>>>>>> On 10/10/2012 04:17 PM, Jaroslav Bachorik wrote:
>>>>>>>>>> I am looking for a review and a sponsor.
>>>>>>>>>>
>>>>>>>>>> The issue is about some javax.management.timer.Timer notifications
>>>>>>>>>> not
>>>>>>>>>> being received by the listeners if the notifications are generated
>>>>>>>>>> rapidly.
>>>>>>>>>>
>>>>>>>>>> The problem is caused by ConcurrentModificationException being
>>>>>>>>>> thrown -
>>>>>>>>>> the exception itself is ignored but the dispatcher logic is
>>>>>>>>>> skipped.
>>>>>>>>>> Therefore the currently processed notification gets lost.
>>>>>>>>
>>>>>>>> Can you point out where exactly in the code the exception is thrown
>>>>>>>> and caught. I'd like to understand the problem better.
>>>>>>>
>>>>>>> The CME is thrown in Timer.notifyAlarmClock() method in this case -
>>>>>>> but
>>>>>>> may happen in other places as well.
>>>>>>>
>>>>>>> Actually, in some places the access to the timerTable map is
>>>>>>> synchronized while in others it isn't. While switching the Hashtable
>>>>>>> for ConcurrentHashMap resolves this particular issue it might be
>>>>>>> beneficial to correct the partial synchronization instead.
>>>>>>>
>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The CME is thrown due to the Timer.timerTable being iterated over
>>>>>>>>>> while
>>>>>>>>>> other threads try to remove some of its elements. Fix consists of
>>>>>>>>>> replacing the Hashtable used for Timer.timerTable by
>>>>>>>>>> ConcurrentHashMap
>>>>>>>>>> which handles such situations with grace.
>>>>>>>>
>>>>>>>> Be aware that it may also give surprising results as removal is no
>>>>>>>> longer synchronized at all with processing. So it could now appear
>>>>>>>> that a notification is processed after a listener has been removed.
>>>>>>>
>>>>>>> Indeed, the CME is the symptom of the out-of-order processing - the
>>>>>>> removal method is synchronized on (Timer.this) while the
>>>>>>> notifyAlarmClock() method, processing the notifications, runs
>>>>>>> unsynchronized.
>>>>>>>
>>>>>>> Thanks for pointing this out. I will have something to think about.
>>>>>>>
>>>>>>> -JB-
>>>>>>>
>>>>>>>>
>>>>>>>> David
>>>>>>>> -----
>>>>>>>>
>>>>>>>>>> The patch webrev is available @
>>>>>>>>>> https://jbs.oracle.com/bugs/browse/JDK-6809322
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> -JB-
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>
> 


From jaroslav.bachorik at oracle.com  Wed Oct 24 06:45:26 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 24 Oct 2012 15:45:26 +0200
Subject: jmx-dev [PATCH] JDK-7009998: JMX synchronization during connection
 restart is faulty
Message-ID: <5087F0F6.40902@oracle.com>

I am looking for a review and patch sponsor.

Webrev available at
http://cr.openjdk.java.net/~jbachorik/JDK-7009998/webrev.00

The issue is a possible race condition in ClientCommunicatorAdmin when
the reconnection process is initiated by more than one thread (e.g.
three). The root cause is that the reconnection logic is split into two
synchronized blocks and relies on the state staying consistent between
leaving the first synchronized block and entering the second.

The race condition is described by the reporter as:
"In reading the code there is a scenario where the synchronization does
the wrong thing if 3 threads attempt to go through the code at the same
time. Consider the code in
com.sun.jmx.remote.internal.ClientCommunicatorAdmin.restart, the first
thread will set the state to RE_CONNECTING and then leave the
synchronization block, the second thread will find the state to be
RE_CONNECTING and wait on the lock. When thread 1 finishes and sets the
state to CONNECTED, then thread 2 can leave the synchronization block -
but fails to set the state to RE_CONNECTING because that code is
incorrectly in the else branch. Thus thread 2 starts the reconnecting
and thread 3 wakes to find the state not RE_CONNECTING so it believes it
can safely start the reconnect and it also starts reconnecting. The bad
mode is discovered in the preReconnection method."

The fix adds a return statement at the end of the first synchronized
block for the case where the admin has already been successfully
re-connected by another thread.
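
To make the control flow easier to follow, here is a simplified model of
the guard. It is a sketch under my own assumptions, not the code from
the webrev: the class and method names are hypothetical, and it throws
IllegalStateException where the real code deals with IOException. A
waiting thread that wakes up and finds the state CONNECTED returns
early instead of falling through and starting a second reconnect.

```java
// Hypothetical, simplified model of the restart guard; not the JDK class.
class RestartGuardSketch {
    enum State { CONNECTED, RE_CONNECTING, FAILED, TERMINATED }

    private final Object lock = new Object();
    private State state = State.CONNECTED;

    /**
     * Returns true if the caller must perform the reconnect itself,
     * false if another thread has already reconnected successfully.
     */
    boolean beginRestart() {
        synchronized (lock) {
            if (state == State.TERMINATED || state == State.FAILED) {
                throw new IllegalStateException("cannot restart in state " + state);
            }
            if (state == State.RE_CONNECTING) {
                // Another thread is reconnecting; wait for it to finish.
                boolean interrupted = false;
                while (state == State.RE_CONNECTING) {
                    try {
                        lock.wait();
                    } catch (InterruptedException e) {
                        interrupted = true;
                    }
                }
                if (interrupted) {
                    Thread.currentThread().interrupt();
                }
                if (state != State.CONNECTED) {
                    throw new IllegalStateException("restart failed in another thread");
                }
                // The early return: the connection is back, so do not
                // fall through and start a second reconnect.
                return false;
            }
            state = State.RE_CONNECTING;
            return true;
        }
    }

    // Called by the thread that did the reconnect, outside the first block.
    void finishRestart(boolean success) {
        synchronized (lock) {
            state = success ? State.CONNECTED : State.FAILED;
            lock.notifyAll();
        }
    }

    State state() {
        synchronized (lock) {
            return state;
        }
    }
}
```

With this shape only the thread that got `true` from `beginRestart()`
ever runs the reconnect, regardless of how many threads are waiting.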

Test in "test/com/sun/jmx/remote/CCAdminReconnectTest.java" tests the
fix. Changes in "make/netbeans/jmx/build.properties" are there for the
NetBeans project to recognize the newly added test.


Thanks,

-JB-


From jaroslav.bachorik at oracle.com  Wed Oct 24 06:50:28 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 24 Oct 2012 15:50:28 +0200
Subject: jmx-dev [PATCH] JDK-6976971: TEST:
 javax/management/remote/mandatory/URLTest.java should be re-integrated
Message-ID: <5087F224.5010603@oracle.com>

I am looking for review and sponsor.

Webrev is available at
http://cr.openjdk.java.net/~jbachorik/JDK-6976971/webrev.00

This is a simple fix - just adding back the test that used to fail due
to changes in the URI spec. The changes were rolled back before Mustang
and the test has no reason to fail.

Thanks,

-JB-

From jaroslav.bachorik at oracle.com  Wed Oct 24 07:03:27 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 24 Oct 2012 16:03:27 +0200
Subject: jmx-dev [PATCH] JDK-6937053: RMI unmarshalling errors in
 ClientNotifForwarder cause silent failure
Message-ID: <5087F52F.8070809@oracle.com>

I am looking for review and a sponsor.

Webrev available at
http://cr.openjdk.java.net/~jbachorik/JDK-6937053/webrev.00/

The RMI unmarshalling process may throw java.rmi.UnmarshalException, e.g.
in cases of incompatible changes in enums. The bad thing is that
ClientNotifForwarder chooses to silently die instead of reporting the
problem.

The fix consists of adding support for handling
java.rmi.UnmarshalException the same way as
java.io.NotSerializableException and appropriate changes in the javadoc.

Thanks,

-JB-

From jaroslav.bachorik at oracle.com  Wed Oct 24 07:09:58 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 24 Oct 2012 16:09:58 +0200
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
 inconsistent readObject/writeObject
Message-ID: <5087F6B6.3050708@oracle.com>

I am looking for review and a sponsor.

Webrev is available at
http://cr.openjdk.java.net/~jbachorik/JDK-6783290/webrev.01/

The serialization of javax.management.MBeanInfo and
javax.management.MBeanFeatureInfo instances is asymmetrical in cases
with no attached descriptor. The descriptor is serialized as an empty
array but when deserializing the descriptor is not read back at all.
Currently for RMI this does not pose any problem but the specification
does not explicitly allow this kind of behaviour and it may cause
trouble eventually.

The patch just reads back the empty array to keep the
serialization/deserialization symmetric.
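
The general idea can be illustrated with a small self-contained sketch.
Everything in it is an assumption made for illustration: FeatureSketch
and its fields are hypothetical and the wire format is not the one used
by MBeanInfo. The readObject method consumes exactly what writeObject
produced, including the empty array written when there is nothing
attached.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; the String[] stands in for the descriptor.
class FeatureSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String name;
    private transient String[] fields;

    FeatureSketch(String name, String[] fields) {
        this.name = name;
        this.fields = fields.clone();
    }

    String name() { return name; }
    int fieldCount() { return fields.length; }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeBoolean(fields.length > 0);
        out.writeObject(fields);          // always written, even when empty
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        boolean hasFields = in.readBoolean();
        // Always consume what writeObject produced, including the empty array.
        String[] read = (String[]) in.readObject();
        fields = hasFields ? read : new String[0];
    }

    // Serializes and deserializes in memory; wraps checked exceptions.
    static FeatureSketch roundTrip(FeatureSketch original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (FeatureSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```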

Thanks,

-JB-

From jaroslav.bachorik at oracle.com  Wed Oct 24 07:15:34 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 24 Oct 2012 16:15:34 +0200
Subject: jmx-dev [PATCH] JDK-6705499: Bad JMXConnectorProvider class name
 prevents all connections - even with standard RMI connector
Message-ID: <5087F806.40408@oracle.com>

I am looking for review and a sponsor.

Webrev is available at
http://cr.openjdk.java.net/~jbachorik/JDK-6705499/webrev.00/

The issue is caused by the way the java.util.ServiceLoader treats the
service registration with incorrect class names. Such a service
registration causes java.util.ServiceConfigurationError to be thrown and
the JMXConnector(Server)Factory is not ready for this. Because of the
exception all the other, potentially valid, service registrations are
ignored.

The patch makes JMXConnector(Server)Factory class ready for
java.util.ServiceConfigurationError and when such an exception is caught
the factory just proceeds to the next registration. If the only
available registration causes the exception it will be rethrown at the end.
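
The idea, as a rough sketch only: TolerantProviders, collect, load and
the maxErrors bound are names and choices I made up for this example
and are not taken from the webrev. Each provider is fetched in its own
try block, a ServiceConfigurationError is remembered and skipped, and
the first error is rethrown only if no usable provider was found.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ServiceConfigurationError;
import java.util.ServiceLoader;

// Hypothetical helper; not the JMXConnectorFactory code.
class TolerantProviders {

    // Collects every provider that can be instantiated. Broken
    // registrations are skipped; maxErrors bounds the number of skips so
    // that an iterator which cannot make progress does not loop forever.
    static <T> List<T> collect(Iterator<T> providers, int maxErrors) {
        List<T> usable = new ArrayList<>();
        ServiceConfigurationError firstError = null;
        int errors = 0;
        while (true) {
            try {
                if (!providers.hasNext()) {
                    break;
                }
                usable.add(providers.next());
            } catch (ServiceConfigurationError e) {
                if (firstError == null) {
                    firstError = e;
                }
                if (++errors >= maxErrors) {
                    break;
                }
            }
        }
        // Rethrow only when nothing usable was found at all.
        if (usable.isEmpty() && firstError != null) {
            throw firstError;
        }
        return usable;
    }

    static <T> List<T> load(Class<T> service) {
        return collect(ServiceLoader.load(service).iterator(), 16);
    }
}
```

The bound on skipped errors is there because nothing guarantees that
the iterator advances after an error.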

Thanks,

-JB-

From Alan.Bateman at oracle.com  Wed Oct 24 07:28:17 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 24 Oct 2012 15:28:17 +0100
Subject: jmx-dev [PATCH] JDK-6705499: Bad JMXConnectorProvider class
 name prevents all connections - even with standard RMI connector
In-Reply-To: <5087F806.40408@oracle.com>
References: <5087F806.40408@oracle.com>
Message-ID: <5087FB01.6010701@oracle.com>

On 24/10/2012 15:15, Jaroslav Bachorik wrote:
> I am looking for review and a sponsor.
>
> Webrev is available at
> http://cr.openjdk.java.net/~jbachorik/JDK-6705499/webrev.00/
>
> The issue is caused by the way the java.util.ServiceLoader treats the
> service registration with incorrect class names. Such a service
> registration causes java.util.ServiceConfigurationError to be thrown and
> the JMXConnector(Server)Factory is not ready for this. Thanks to the
> exception all the other, potentially valid, service registrations are
> ignored.
>
> The patch makes JMXConnector(Server)Factory class ready for
> java.util.ServiceConfigurationError and when such an exception is caught
> the factory just proceeds to the next registration. If the only
> available registration causes the exception it will be rethrown at the end.
>
> Thanks,
>
> -JB-
I'm not so sure this is the right thing to do. When SCE is thrown then 
there is no guarantee that you can continue and there isn't enough 
information in the error to know whether it makes sense to attempt to 
continue or not. We have this same issue in many areas of the platform 
and I think it requires future work in ServiceLoader to help users of 
the API decide whether to continue or not. Once we move to modules then 
many of the reasons for SCE will go away because the list of service 
providers is precomputed so there is no scanning of class paths or 
parsing of configuration files at runtime. So if this one is not urgent 
then it may be something to come back to again in the future.

-Alan.


From Alan.Bateman at oracle.com  Wed Oct 24 07:29:20 2012
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Wed, 24 Oct 2012 15:29:20 +0100
Subject: jmx-dev [PATCH] JDK-6976971: TEST:
 javax/management/remote/mandatory/URLTest.java should be re-integrated
In-Reply-To: <5087F224.5010603@oracle.com>
References: <5087F224.5010603@oracle.com>
Message-ID: <5087FB40.9040905@oracle.com>

On 24/10/2012 14:50, Jaroslav Bachorik wrote:
> I am looking for review and sponsor.
>
> Webrev is available at
> http://cr.openjdk.java.net/~jbachorik/JDK-6976971/webrev.00
>
> This is a simple fix - just adding back the test that used to fail due
> to changes in URI spec. The changes were rolled back before mustang and
> the test has no reason to fail.
>
> Thanks,
>
> -JB-
This looks fine to me, I guess it was just missed when the URI work was 
rolled back.

-Alan

From eamonn at mcmanus.net  Wed Oct 24 08:49:32 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Wed, 24 Oct 2012 08:49:32 -0700
Subject: jmx-dev [PATCH] JDK-6783290: MBeanInfo/MBeanFeatureInfo has
	inconsistent readObject/writeObject
In-Reply-To: <5087F6B6.3050708@oracle.com>
References: <5087F6B6.3050708@oracle.com>
Message-ID: <CACBEn44kZsryb5ckYMHzDq6VxPt5GTsNpEQFRv_EvgEniM=OAg@mail.gmail.com>

This is already Reviewed-by: emcmanus, but I'm afraid I can't sponsor it.

Éamonn


2012/10/24 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>

> I am looking for review and a sponsor.
>
> Webrev is available at
> http://cr.openjdk.java.net/~jbachorik/JDK-6783290/webrev.01/
>
> The serialization of javax.management.MBeanInfo and
> javax.management.MBeanFeatureInfo instances is asymmetrical in cases
> with no attached descriptor. The descriptor is serialized as an empty
> array but when deserializing the descriptor is not read back at all.
> Currently for RMI this does not pose any problem but the specification
> does not explicitly allow this kind of behaviour and it may cause
> troubles eventually.
>
> The patch just reads back the empty array to keep the
> serialization/deserialization symmetric.
>
> Thanks,
>
> -JB-
>

From jaroslav.bachorik at oracle.com  Thu Oct 25 04:55:47 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Thu, 25 Oct 2012 13:55:47 +0200
Subject: jmx-dev [PATCH] JDK-6705499: Bad JMXConnectorProvider class
 name prevents all connections - even with standard RMI connector
In-Reply-To: <5087FB01.6010701@oracle.com>
References: <5087F806.40408@oracle.com> <5087FB01.6010701@oracle.com>
Message-ID: <508928C3.6000208@oracle.com>

On 10/24/2012 04:28 PM, Alan Bateman wrote:
> On 24/10/2012 15:15, Jaroslav Bachorik wrote:
>> I am looking for review and a sponsor.
>>
>> Webrev is available at
>> http://cr.openjdk.java.net/~jbachorik/JDK-6705499/webrev.00/
>>
>> The issue is caused by the way the java.util.ServiceLoader treats the
>> service registration with incorrect class names. Such a service
>> registration causes java.util.ServiceConfigurationError to be thrown and
>> the JMXConnector(Server)Factory is not ready for this. Thanks to the
>> exception all the other, potentially valid, service registrations are
>> ignored.
>>
>> The patch makes JMXConnector(Server)Factory class ready for
>> java.util.ServiceConfigurationError and when such an exception is caught
>> the factory just proceeds to the next registration. If the only
>> available registration causes the exception it will be rethrown at the
>> end.
>>
>> Thanks,
>>
>> -JB-
> I'm not so sure this is the right thing to do. When SCE is thrown then
> there is no guarantee that you can continue and there isn't enough
> information in the error to know whether it makes sense to attempt to
> continue or not. We have this same issue in many areas of the platform

Shouldn't this be indicated by the "hasNext()" method of the iterator
returned by ServiceLoader? I mean - whether you can continue enumerating
the providers or not.

I agree that the fact that SCE is an Error subclass alone makes catching
and handling it rather dubious but it seems a bit harsh to throw a
(supposedly) unrecoverable exception only because one entry in the
service configuration file is invalid.

> and I think it requires future work in ServiceLoader to help users of
> the API decide whether to continue or not. Once we move to modules then

Yes, a proper fix would be in the ServiceLoader but it will most
probably involve API changes. E.g. instead of generating an Error, a
checked exception should be thrown indicating a problem with the
particular service configuration line.

> many of the reasons for SCE will go away because the list of service
> provider is precomputed so there is no scanning of class paths or
> parsing of configuration files at runtime. So if this one is not urgent
> they it may be something to come back to again in the future.

Unfortunately, it will take some time till we have modules :( Until then
you can completely disable the JMX subsystem simply by placing a poison
jar on the classpath (probably other ServiceLoader-based services as
well, but this issue is about JMX).

Even though the issue is P3 it has been there sitting unresolved for 4
years and keeping it for another 3 (till JDK9) would look quite strange,
IMO.

What about filing an issue for ServiceLoader (if there is none yet) and
then pushing this workaround with comment that it should be revisited
once the modules are in place?

-JB-

> 
> -Alan.
> 


From jaroslav.bachorik at oracle.com  Mon Oct 29 07:15:21 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Mon, 29 Oct 2012 15:15:21 +0100
Subject: jmx-dev [PATCH] JDK-7146162:
 javax/management/remote/mandatory/connection/BrokenConnectionTest.java
 failing intermittently
Message-ID: <508E8F79.60909@oracle.com>

I am looking for a sponsor and reviewers.

The webrev is available at
http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03

As explained in the issue, the failure is caused by the RMI connection
heart-beat thread racing against the thread executing the MBean
operation and encountering the IOException. The heart-beat thread sets
the admin state to "terminated" but does not send the failure
notifications. On the other side, the operation thread determines the
state to be already terminated and skips the notifications as well.

The fix adds the call to handle an IOException, including sending the
failure notifications, to the heart-beat connection failure handler. It
also widens the synchronized block since the whole code block checking
for the connection failure and recovering must be run atomically.


Thanks,

-JB-

From eamonn at mcmanus.net  Tue Oct 30 09:10:21 2012
From: eamonn at mcmanus.net (Eamonn McManus)
Date: Tue, 30 Oct 2012 09:10:21 -0700
Subject: jmx-dev [PATCH] JDK-7146162:
 javax/management/remote/mandatory/connection/BrokenConnectionTest.java
 failing intermittently
In-Reply-To: <508E8F79.60909@oracle.com>
References: <508E8F79.60909@oracle.com>
Message-ID: <CACBEn45F15=8JLT3nnTgELM3Z6WwSMq8gZEp22SZPFNJxDNe7g@mail.gmail.com>

This area has historically caused a lot of problems and I am not
surprised to see that there are more. While I don't know what the best
way to fix the issue at hand is, I don't think this proposed change is
it. The reason is that the checkConnection and gotIOException methods
do blocking operations, and it is generally not a good idea to do
blocking operations in a synchronized block. Is there a way to avoid
the race condition without that?

Éamonn


2012/10/29 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
> I am looking for a sponsor and reviewers.
>
> The webrev is available at
> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03
>
> As explained in the issue the failure is caused by the RMI connection
> heart-beat thread racing against the thread executing the MBean
> operation and encountering the IOException. The heart beat thread sets
> the the admin state to "terminated" but does not send the failure
> notifications. On the other side the operation thread determines the
> state to be already terminated and skips the notifications as well.
>
> The fix adds the call to handle an ioexception, including sending the
> failure notifications, to the hear-beat connection failure handler. Also
> it widens the synchronized block since the whole code block checking for
> the connection failure and recovering must be run atomically,
>
>
> Thanks,
>
> -JB-

From jaroslav.bachorik at oracle.com  Wed Oct 31 05:59:28 2012
From: jaroslav.bachorik at oracle.com (Jaroslav Bachorik)
Date: Wed, 31 Oct 2012 13:59:28 +0100
Subject: jmx-dev [PATCH] JDK-7146162:
 javax/management/remote/mandatory/connection/BrokenConnectionTest.java
 failing intermittently
In-Reply-To: <CACBEn45F15=8JLT3nnTgELM3Z6WwSMq8gZEp22SZPFNJxDNe7g@mail.gmail.com>
References: <508E8F79.60909@oracle.com>
	<CACBEn45F15=8JLT3nnTgELM3Z6WwSMq8gZEp22SZPFNJxDNe7g@mail.gmail.com>
Message-ID: <509120B0.6040703@oracle.com>

On 10/30/2012 05:10 PM, Eamonn McManus wrote:
> This area has historically caused a lot of problems and I am not
> surprised to see that there are more. While I don't know what the best
> way to fix the issue at hand is, I don't think this proposed change is
> it. The reason is that the checkConnection and gotIOException methods
> do blocking operations, and it is generally not a good idea to do
> blocking operations in a synchronized block. Is there a way to avoid
> the race condition without that?

The important part is calling the gotIOException() method from the
heart-beat checker as well. I've tried restoring the synchronized block
to its original state and the test passes with a check period of 10ms,
which pushes the probability of data races rather high.

It seems that the worst that can happen would be one additional
checkConnection() call - in case when the state gets set to TERMINATED
by another thread right after it has been checked in the synchronized
block the loop condition might evaluate to true if the state value has
not been flushed yet.

I could change the "state" variable to be volatile but I am not sure
whether it's worth the hassle.

-JB-

> 
> Éamonn
> 
> 
> 2012/10/29 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>> I am looking for a sponsor and reviewers.
>>
>> The webrev is available at
>> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03
>>
>> As explained in the issue the failure is caused by the RMI connection
>> heart-beat thread racing against the thread executing the MBean
>> operation and encountering the IOException. The heart beat thread sets
>> the the admin state to "terminated" but does not send the failure
>> notifications. On the other side the operation thread determines the
>> state to be already terminated and skips the notifications as well.
>>
>> The fix adds the call to handle an ioexception, including sending the
>> failure notifications, to the hear-beat connection failure handler. Also
>> it widens the synchronized block since the whole code block checking for
>> the connection failure and recovering must be run atomically,
>>
>>
>> Thanks,
>>
>> -JB-