Review request: 8049303: Transient network problems cause JMX thread to fail silently

Jaroslav Bachorik jaroslav.bachorik at oracle.com
Tue Aug 26 10:36:05 UTC 2014


Hi Poonam,

On 08/26/2014 12:27 PM, Poonam Bajaj wrote:
> Sending the review request to serviceability-dev list as well...
> -----------
>
> Could I have reviews for this change:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8049303
> Webrev: http://cr.openjdk.java.net/~poonam/8049303/webrev.00/

L1499-1504 can be completely removed. They serve no purpose now.

Please adjust the indentation to match the original.

Thanks,

-JB-

>
> Problem and fix:
> By default the JMX client side notification fetch timeout
> (jmx.remote.x.notification.fetch.timeout) is 1 minute and the default
> server connection timeout (jmx.remote.x.server.connection.timeout) is 2
> minutes.
>
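
For readers who have not set these two properties before, here is a minimal sketch of how they are usually passed in the connector environment map. The class and method names (JmxTimeoutEnv, clientEnv, serverEnv) are made up for illustration, and the values are assumed to be in milliseconds. Only the property names come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the JDK or of the webrev.
final class JmxTimeoutEnv {
    // Property names as quoted in the problem description.
    static final String FETCH_TIMEOUT = "jmx.remote.x.notification.fetch.timeout";
    static final String SERVER_TIMEOUT = "jmx.remote.x.server.connection.timeout";

    // Environment intended for the client side,
    // e.g. JMXConnectorFactory.connect(url, env).
    static Map<String, Object> clientEnv(long fetchTimeoutMillis) {
        Map<String, Object> env = new HashMap<>();
        env.put(FETCH_TIMEOUT, fetchTimeoutMillis); // assumed to be milliseconds
        return env;
    }

    // Environment intended for the server side,
    // e.g. JMXConnectorServerFactory.newJMXConnectorServer(url, env, mbs).
    static Map<String, Object> serverEnv(long connectionTimeoutMillis) {
        Map<String, Object> env = new HashMap<>();
        env.put(SERVER_TIMEOUT, connectionTimeoutMillis); // assumed to be milliseconds
        return env;
    }

    public static void main(String[] args) {
        // The defaults mentioned above: 1 minute on the client, 2 minutes on the server.
        System.out.println(clientEnv(60_000L));
        System.out.println(serverEnv(120_000L));
    }
}
```

The sketch only builds the maps; it does not open a connection.
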
> If the client side connector thread makes a notification fetch request
> to the server, but a transient network problem prevents the server
> response from reaching the client, the client side connector will wait
> for a response until the timeout period (1 minute) has expired before
> throwing an IOException.
>
> The client side RMIConnector implementation handles the IOException by
> re-checking the connection status to determine whether the connection is
> broken. If the connection is not available at that moment, the connector
> fails by re-throwing the initial IOException. The problem is that this
> re-check of the connection passes, because the server side of the
> connection doesn't time out until 2 minutes have passed (by default). As
> a result, the NotifFetcher thread dies without posting a failed
> notification, and the client application does not get a chance to
> recover.
>
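
To make the "failed notification" concrete, this is roughly what a client application would register to get its chance to recover. It is a sketch only: FailureWatcher is a hypothetical class, and the recovery action is reduced to setting a flag. JMXConnectionNotification.FAILED and NotificationListener are the standard javax.management API.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import javax.management.Notification;
import javax.management.NotificationListener;
import javax.management.remote.JMXConnectionNotification;

// Hypothetical listener that records whether the connector reported a failure.
final class FailureWatcher implements NotificationListener {
    private final AtomicBoolean failed = new AtomicBoolean(false);

    @Override
    public void handleNotification(Notification n, Object handback) {
        // Only connection notifications of type FAILED are of interest here.
        if (n instanceof JMXConnectionNotification
                && JMXConnectionNotification.FAILED.equals(n.getType())) {
            failed.set(true); // a real client would reconnect or alert here
        }
    }

    boolean hasFailed() {
        return failed.get();
    }

    public static void main(String[] args) {
        FailureWatcher watcher = new FailureWatcher();
        // Simulated notification; source and connection id are placeholder values.
        watcher.handleNotification(new JMXConnectionNotification(
                JMXConnectionNotification.FAILED, "demo-source", "demo-id",
                1L, "simulated failure", null), null);
        System.out.println("failed = " + watcher.hasFailed());
    }
}
```

On a real connector the listener would be registered with connector.addConnectionNotificationListener(watcher, null, null). In the scenario described above it would never be called, because no failed notification is posted.
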
> The fix is to forward the exception on the JMX client side before
> checking the connection status.
>
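
The change in ordering can be pictured with a toy model. This is not the RMIConnector code from the webrev; every name below is hypothetical. It only contrasts "check the connection, then report" with "report, then check the connection", as the description puts it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy model of the two orderings; all names are illustrative assumptions.
final class FetchFailureOrdering {
    interface ConnectionCheck {
        boolean isAlive();
    }

    // Old order as described: the failure is reported only when the re-check
    // finds the connection broken. Returns true if the connection is broken.
    static boolean checkThenForward(IOException e, ConnectionCheck check,
                                    List<IOException> reported) {
        if (!check.isAlive()) {
            reported.add(e);
            return true;
        }
        return false; // re-check passed: nothing was reported
    }

    // New order as described: the exception is forwarded first, then the
    // connection is re-checked. Returns true if the connection is broken.
    static boolean forwardThenCheck(IOException e, ConnectionCheck check,
                                    List<IOException> reported) {
        reported.add(e);
        return !check.isAlive();
    }

    public static void main(String[] args) {
        IOException timeout = new IOException("fetch timed out");
        ConnectionCheck stillAlive = () -> true; // server has not timed out yet
        List<IOException> oldReports = new ArrayList<>();
        List<IOException> newReports = new ArrayList<>();
        checkThenForward(timeout, stillAlive, oldReports);
        forwardThenCheck(timeout, stillAlive, newReports);
        System.out.println("old order reported: " + oldReports.size());
        System.out.println("new order reported: " + newReports.size());
    }
}
```

With a re-check that still passes, the old order reports nothing, while the new order always hands the exception on.
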
> Testing:
> All the jdk_jmx and jdk_management regression tests passed.
>
> The fix applies cleanly to 8u and 7u repos.
>
>
> Thanks,
> Poonam
>
>


