[PATCH] JDK-7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently
Jaroslav Bachorik
jaroslav.bachorik at oracle.com
Tue Dec 18 07:28:11 PST 2012
I've updated the webrev - no additional synchronisation is required.
Please, review: http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.04
-JB-
On 10/31/2012 01:59 PM, Jaroslav Bachorik wrote:
> On 10/30/2012 05:10 PM, Eamonn McManus wrote:
>> This area has historically caused a lot of problems and I am not
>> surprised to see that there are more. While I don't know what the best
>> way to fix the issue at hand is, I don't think this proposed change is
>> it. The reason is that the checkConnection and gotIOException methods
>> do blocking operations, and it is generally not a good idea to do
>> blocking operations in a synchronized block. Is there a way to avoid
>> the race condition without that?
>
> The important part is calling the gotIOException() method even from the
> heart-beat checker. I've tried to return the synchronization block back
> to the original state and the test passes with the check period of 10ms
> which pushes the probability of data races rather high.
>
> It seems that the worst that can happen would be one additional
> checkConnection() call - in case when the state gets set to TERMINATED
> by another thread right after it has been checked in the synchronized
> block the loop condition might evaluate to true if the state value has
> not been flushed yet.
>
> I could change the "state" variable to be volatile but I am not sure
> whether it's worth the hassle.
>
> -JB-
>
>>
>> Éamonn
>>
>>
>> 2012/10/29 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>>> I am looking for a sponsor and reviewers.
>>>
>>> The webrev is available at
>>> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03
>>>
>>> As explained in the issue the failure is caused by the RMI connection
>>> heart-beat thread racing against the thread executing the MBean
>>> operation and encountering the IOException. The heart beat thread sets
>>> the the admin state to "terminated" but does not send the failure
>>> notifications. On the other side the operation thread determines the
>>> state to be already terminated and skips the notifications as well.
>>>
>>> The fix adds the call to handle an ioexception, including sending the
>>> failure notifications, to the hear-beat connection failure handler. Also
>>> it widens the synchronized block since the whole code block checking for
>>> the connection failure and recovering must be run atomically,
>>>
>>>
>>> Thanks,
>>>
>>> -JB-
>
More information about the serviceability-dev
mailing list