jmx-dev [PATCH] JDK-7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently

shanliang shanliang.jiang at oracle.com
Tue Dec 18 11:21:13 PST 2012


It is good to call
    gotIOException((IOException)e);

but no need to call in next
    restart((IOException)e);

the method "gotIOException" should call "restart" if necessary, 
otherwise "restart" may be called 2 times and generated 2 times 
"Reconnection notification".

Shanliang

   
Jaroslav Bachorik wrote:
> I've updated the webrev - no additional synchronisation is required.
> Please, review: http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.04
>
> -JB-
>
> On 10/31/2012 01:59 PM, Jaroslav Bachorik wrote:
>   
>> On 10/30/2012 05:10 PM, Eamonn McManus wrote:
>>     
>>> This area has historically caused a lot of problems and I am not
>>> surprised to see that there are more. While I don't know what the best
>>> way to fix the issue at hand is, I don't think this proposed change is
>>> it. The reason is that the checkConnection and gotIOException methods
>>> do blocking operations, and it is generally not a good idea to do
>>> blocking operations in a synchronized block. Is there a way to avoid
>>> the race condition without that?
>>>       
>> The important part is calling the gotIOException() method even from the
>> heart-beat checker. I've tried to return the synchronization block back
>> to the original state and the test passes with the check period of 10ms
>> which pushes the probability of data races rather high.
>>
>> It seems that the worst that can happen would be one additional
>> checkConnection() call - in case when the state gets set to TERMINATED
>> by another thread right after it has been checked in the synchronized
>> block the loop condition might evaluate to true if the state value has
>> not been flushed yet.
>>
>> I could change the "state" variable to be volatile but I am not sure
>> whether it's worth the hassle.
>>
>> -JB-
>>
>>     
>>> Éamonn
>>>
>>>
>>> 2012/10/29 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>>>       
>>>> I am looking for a sponsor and reviewers.
>>>>
>>>> The webrev is available at
>>>> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03
>>>>
>>>> As explained in the issue the failure is caused by the RMI connection
>>>> heart-beat thread racing against the thread executing the MBean
>>>> operation and encountering the IOException. The heart beat thread sets
>>>> the the admin state to "terminated" but does not send the failure
>>>> notifications. On the other side the operation thread determines the
>>>> state to be already terminated and skips the notifications as well.
>>>>
>>>> The fix adds the call to handle an ioexception, including sending the
>>>> failure notifications, to the hear-beat connection failure handler. Also
>>>> it widens the synchronized block since the whole code block checking for
>>>> the connection failure and recovering must be run atomically,
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> -JB-
>>>>         
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20121218/42b26420/attachment.html 


More information about the serviceability-dev mailing list