RFR 8066708: JMXStartStopTest fails to connect to port 38112
olivier.lagneau at oracle.com
Thu Dec 11 15:09:41 UTC 2014
Hi Dmitry,
On 11/12/2014 15:43, Dmitry Samersoff wrote:
> Jaroslav,
>
> You can set SO_LINGER to zero, in this case socket will be closed
> immediately without waiting in TIME_WAIT
SO_LINGER did not help either in my case (see my previous mail to Jaroslav).
We ended up using another hard-coded (supposedly free) port.
Note that this was before the RMI tests used randomly allocated ports.
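
For reference, a minimal sketch of what "SO_LINGER set to zero" looks like in Java. The class and method names are made up for illustration, and this is not the code from the test. Note that setSoLinger() exists on Socket, not on ServerSocket, so the option can only be applied to a connected (here: accepted) socket.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical illustration class; not part of the JMX or RMI tests.
class LingerZeroSketch {

    /**
     * Opens a loopback connection, enables SO_LINGER with a zero timeout
     * on the accepted side, and returns the linger value read back.
     */
    static int closeWithLingerZero() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Resources are closed in reverse order: accepted, client, server.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // on = true, linger = 0: close() is expected to reset the
            // connection instead of going through the normal shutdown.
            accepted.setSoLinger(true, 0);
            return accepted.getSoLinger();
        }
    }
}
```

As said above, this did not make the port reliably reusable in my case, so take it only as an illustration of the option itself.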
> But there is no reliable way to predict whether you can take this port
> or not after you close it.
This is what I observed in my case.
>
> So the only valid solution is to try to connect to a random port and if
> this attempt fails try another random port. Everything else will cause
> more or less frequent intermittent failures.
IIRC, this is what is currently done in the RMI tests.
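
A rough sketch of the retry-on-another-random-port idea, under my own assumptions: the class name, the port range constants and the attempt limit are hypothetical, and this is not the actual code of the RMI tests.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.ServerSocket;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration class; not taken from the RMI tests.
class RandomPortRetrySketch {
    // Assumed range: the dynamic/private port range.
    static final int MIN_PORT = 49152;
    static final int MAX_PORT = 65535;

    /**
     * Tries to bind a ServerSocket to a random port, picking another
     * random port whenever the bind fails. The caller must close the
     * returned socket.
     */
    static ServerSocket bindWithRetry(int maxAttempts) throws IOException {
        IOException last = null;
        for (int i = 0; i < maxAttempts; i++) {
            int port = ThreadLocalRandom.current().nextInt(MIN_PORT, MAX_PORT + 1);
            try {
                return new ServerSocket(port);
            } catch (BindException e) {
                last = e; // port is taken: try a different one
            }
        }
        throw new IOException("no free port after " + maxAttempts + " attempts", last);
    }
}
```

The point of this design is that the caller keeps the socket that was actually bound, instead of probing a port, closing it, and hoping it is still free afterwards.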
Olivier.
>
> On 2014-12-11 17:06, Jaroslav Bachorik wrote:
>> On 12/09/2014 01:25 PM, Jaroslav Bachorik wrote:
>>> On 12/09/2014 01:39 AM, Stuart Marks wrote:
>>>> On 12/8/14 12:35 PM, Jaroslav Bachorik wrote:
>>>>> Please, review the following test change
>>>>>
>>>>> Issue : https://bugs.openjdk.java.net/browse/JDK-8066708
>>>>> Webrev: http://cr.openjdk.java.net/~jbachorik/8066708/webrev.00
>>>>>
>>>>> The test fails very intermittently when RMI registry is trying to bind
>>>>> to a port
>>>>> previously used in the test (via ServerSocket).
>>>>>
>>>>> This seems to be caused by the sockets created via `new
>>>>> ServerSocket(0)` and
>>>>> being in reusable mode. The fix attempts to prevent this by explicitly
>>>>> forbidding the reusable mode.
>>>> Hi Jaroslav,
>>>>
>>>> I happened to see this fly by, and there are (I think) some similar
>>>> issues going on in the RMI tests.
>>>>
>>>> But first I'll note that I don't think setReuseAddress() will have the
>>>> effect that you want. Typically it's set to true before binding a
>>>> socket, so that a subsequent bind operation will succeed even if the
>>>> address/port is already in use. ServerSockets created with new
>>>> ServerSocket(0) are already bound, and I'm not sure what calling
>>>> setReuseAddress(false) will do on such sockets. The spec says behavior
>>>> is undefined, but my bet is that it does nothing.
>>>>
>>>> I guess it doesn't hurt to try this out to see if it makes a difference,
>>>> but I don't have much confidence it will help.
>>>>
>>>> The potential similarity to the RMI tests is exemplified by JDK-8049202
>>>> (sorry, this bug report isn't open) but briefly this tests the RMI
>>>> registry as follows:
>>>>
>>>> 1. Opens port 1099 using new ServerSocket(1099) [1099 is the default
>>>> RMI registry port] in order to ensure that 1099 isn't in use by
>>>> something else already;
>>>>
>>>> 2. If this succeeds, it immediately closes the ServerSocket.
>>>>
>>>> 3. Then it creates a new RMI registry on port 1099.
>>>>
>>>> In principle, this should succeed, yet it fails around 10% of the time
>>>> on some systems. The error is "port already in use". My best theory is
>>>> that even though the socket has just been closed by a user program, the
>>>> kernel has to run the socket through some of the socket states such as
>>>> FIN_WAIT_1, FIN_WAIT_2, or CLOSING before the socket is actually closed
>>>> and is available for reuse. If a program -- even the same one --
>>>> attempts to open a socket on the same port before the socket has reached
>>>> its final state, it will get an "already in use" error.
>>>>
>>>> If this is true I don't believe that setting SO_REUSEADDR will work if
>>>> the socket is in one of these final states. (I remember reading this
>>>> somewhere but I'm not sure where at the moment. I can try to dig it up
>>>> if there is interest.)
>>>>
>>>> I admit this is just a theory and I'm open to alternatives, and I'm also
>>>> open to hearing about ways to deal with this problem.
>>>>
>>>> Could something similar be going on with this JMX test?
>>> Hm, this is exactly what happened with this test :(
>>>
>>> The problem is that the port is reported as available while it is still
>>> occupied and RMI registry attempts to start using that port.
>>>
>>> If setting SO_REUSEADDR does not work then the only solution would be to
>>> retry the test case when this exception occurs.
>> Further investigation shows that the problem was rather the client
>> connecting to a socket being shut down.
>>
>> It sounds like setting SO_REUSEADDR to false should prevent this failure.
>>
>> From the ServerSocket javadoc:
>> "When a TCP connection is closed the connection may remain in a timeout
>> state for a period of time after the connection is closed (typically
>> known as the TIME_WAIT state or 2MSL wait state). For applications using
>> a well known socket address or port it may not be possible to bind a
>> socket to the required SocketAddress if there is a connection in the
>> timeout state involving the socket address or port."
>>
>> It also turns out that the test does not close the server sockets
>> properly so there might be several sockets being opened or timed out
>> dangling around.
>>
>> I've updated the test so it is setting SO_REUSEADDR for all the new
>> ServerSocket instances + introduced the mechanism to run the test code
>> while properly cleaning up any allocated ports.
>>
>> http://cr.openjdk.java.net/~jbachorik/8066708/webrev.01/
>>
>> -JB-
>>
>>> -JB-
>>>
>>>> s'marks
>
More information about the serviceability-dev
mailing list