jmx-dev [PATCH] JDK-7146162: javax/management/remote/mandatory/connection/BrokenConnectionTest.java failing intermittently

Wed Dec 19 02:53:07 PST 2012

The fix is OK for me.

Shanliang

Jaroslav Bachorik wrote:
> On 12/18/2012 08:21 PM, shanliang wrote:
>   
>> It is good to call
>>    gotIOException((IOException)e);
>>
>> but no need to call in next
>>    restart((IOException)e);
>>
>> the method "gotIOException" should call "restart" if necessary,
>> otherwise "restart" may be called 2 times and generated 2 times
>> "Reconnection notification".
>>     
>
> Yes, it seems that the restart() will be called exactly twice or not at
> all. I will remove the restart() call at the line 199.
>
> Updated webrev to
> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.05
>
> -JB-
>
>
>
>   
>> Shanliang
>>
>>   Jaroslav Bachorik wrote:
>>     
>>> I've updated the webrev - no additional synchronisation is required.
>>> Please, review:
>>> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.04
>>>
>>> -JB-
>>>
>>> On 10/31/2012 01:59 PM, Jaroslav Bachorik wrote:
>>>  
>>>       
>>>> On 10/30/2012 05:10 PM, Eamonn McManus wrote:
>>>>    
>>>>         
>>>>> This area has historically caused a lot of problems and I am not
>>>>> surprised to see that there are more. While I don't know what the best
>>>>> way to fix the issue at hand is, I don't think this proposed change is
>>>>> it. The reason is that the checkConnection and gotIOException methods
>>>>> do blocking operations, and it is generally not a good idea to do
>>>>> blocking operations in a synchronized block. Is there a way to avoid
>>>>> the race condition without that?
>>>>>       
>>>>>           
>>>> The important part is calling the gotIOException() method even from the
>>>> heart-beat checker. I've tried to return the synchronization block back
>>>> to the original state and the test passes with the check period of 10ms
>>>> which pushes the probability of data races rather high.
>>>>
>>>> It seems that the worst that can happen would be one additional
>>>> checkConnection() call - in case when the state gets set to TERMINATED
>>>> by another thread right after it has been checked in the synchronized
>>>> block the loop condition might evaluate to true if the state value has
>>>> not been flushed yet.
>>>>
>>>> I could change the "state" variable to be volatile but I am not sure
>>>> whether it's worth the hassle.
>>>>
>>>> -JB-
>>>>
>>>>    
>>>>         
>>>>> Éamonn
>>>>>
>>>>>
>>>>> 2012/10/29 Jaroslav Bachorik <jaroslav.bachorik at oracle.com>:
>>>>>      
>>>>>           
>>>>>> I am looking for a sponsor and reviewers.
>>>>>>
>>>>>> The webrev is available at
>>>>>> http://cr.openjdk.java.net/~jbachorik/JDK-7146162/webrev.03
>>>>>>
>>>>>> As explained in the issue the failure is caused by the RMI connection
>>>>>> heart-beat thread racing against the thread executing the MBean
>>>>>> operation and encountering the IOException. The heart beat thread sets
>>>>>> the the admin state to "terminated" but does not send the failure
>>>>>> notifications. On the other side the operation thread determines the
>>>>>> state to be already terminated and skips the notifications as well.
>>>>>>
>>>>>> The fix adds the call to handle an ioexception, including sending the
>>>>>> failure notifications, to the hear-beat connection failure handler.
>>>>>> Also
>>>>>> it widens the synchronized block since the whole code block
>>>>>> checking for
>>>>>> the connection failure and recovering must be run atomically,
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -JB-
>>>>>>         
>>>>>>             
>>>   
>>>       
>>     
>
>   

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.openjdk.java.net/pipermail/serviceability-dev/attachments/20121219/c16d00f0/attachment.html