Review request: 8049303: Transient network problems cause JMX thread to fail silently

Jaroslav Bachorik jaroslav.bachorik at oracle.com
Tue Aug 26 11:41:49 UTC 2014


On 08/26/2014 01:20 PM, Poonam Bajaj wrote:
> Hi Jaroslav,
>
> On 8/26/2014 4:06 PM, Jaroslav Bachorik wrote:
>> Hi Poonam,
>>
>> On 08/26/2014 12:27 PM, Poonam Bajaj wrote:
>>> Sending the review request to serviceability-dev list as well...
>>> -----------
>>>
>>> Could I have reviews for this change:
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8049303
>>> Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/
>>
>> L1499-1504 can be completely removed. They serve no purpose now.
>>
> Removed this piece of code.
>
>> Please, adjust the indentation to fit the original one.
>
> Corrected the indentation.
>
> Updated webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.01/

Looks good!

-JB-

>
> Thanks,
> Poonam
>
>>
>> Thanks,
>>
>> -JB-
>>
>>>
>>> Problem and fix:
>>> By default the JMX client side notification fetch timeout
>>> (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default
>>> server connection timeout (jmx.remote.x.server.connection.timeout) is 2
>>> minutes.
>>>
>>> If the client side connector thread makes a notification fetch request
>>> to the server, but a transient network problem prevents the server
>>> response from reaching the client, the client side connector will wait
>>> for a response until the timeout period (1 minute) has expired before
>>> throwing an IOException.
>>>
>>> The client side RMIConnector implementation handles the IOException by
>>> re-checking the connection status to determine whether or not it is
>>> broken. If the connection is not available at that moment, the connector
>>> fails by re-throwing the initial IOException. The problem is that this
>>> re-check of the connection passes because the server side of the
>>> connection doesn't time out until 2 minutes have passed (by default), so
>>> the NotifFetcher thread dies without posting a failed notification, and
>>> the client application does not get a chance to recover.
>>>
>>> The fix is to forward the exception on the JMX client side before
>>> checking the connection status.
>>>
>>> Testing:
>>> All the jdk_jmx and jdk_management regression tests passed.
>>>
>>> The fix applies cleanly to 8u and 7u repos.
>>>
>>>
>>> Thanks,
>>> Poonam
>>>
>>>
>>
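
For readers following the quoted problem description, here is a minimal
sketch of the client-facing side of it: the two timeout properties named
above, and a listener that reacts to a failed-connection notification.
The property keys come from the quoted message; the values are assumed to
be milliseconds and mirror the stated defaults. The class name, method
names and recovery hook are hypothetical, and main() only simulates a
notification rather than reproducing the network fault.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.remote.JMXConnectionNotification;

// Hypothetical illustration class; not part of the webrev under review.
public class JmxFailureSketch {

    // Client-side environment: notification fetch timeout (assumed ms, 1 minute).
    static Map<String, Object> clientEnv() {
        Map<String, Object> env = new HashMap<>();
        env.put("jmx.remote.x.notification.fetch.timeout", 60_000L);
        return env;
    }

    // Server-side environment: connection timeout (assumed ms, 2 minutes).
    static Map<String, Object> serverEnv() {
        Map<String, Object> env = new HashMap<>();
        env.put("jmx.remote.x.server.connection.timeout", 120_000L);
        return env;
    }

    // Listener that runs a recovery hook only for FAILED connection notifications.
    static NotificationListener failureListener(final Runnable onFailure) {
        return new NotificationListener() {
            @Override
            public void handleNotification(Notification n, Object handback) {
                if (JMXConnectionNotification.FAILED.equals(n.getType())) {
                    onFailure.run();
                }
            }
        };
    }

    public static void main(String[] args) {
        final AtomicBoolean recovered = new AtomicBoolean(false);
        NotificationListener listener = failureListener(new Runnable() {
            @Override
            public void run() {
                recovered.set(true);
            }
        });

        // Simulated notification; a real one would be emitted by the connector.
        Notification failed = new JMXConnectionNotification(
                JMXConnectionNotification.FAILED,
                "sketch-source", "sketch-connection-id", 1L,
                "simulated failure", null);
        listener.handleNotification(failed, null);

        System.out.println("recovery hook ran: " + recovered.get());
    }
}
```

In a real client, the map from clientEnv() would be passed to
JMXConnectorFactory.connect(url, env), and the listener registered with
JMXConnector.addConnectionNotificationListener(listener, null, null).
The recovery hook can only run if the connector actually posts the
failed notification, which is the behaviour this bug is about.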


