[9] RFR: 8162484: javax/net/ssl/Stapling/SSLSocketWithStapling.java test fails intermittently with "Address already in use" error

Artem Smotrakov artem.smotrakov at oracle.com
Mon Aug 15 17:31:19 UTC 2016


Hi Xuelei,

Makes sense to me, I removed those 1 second delays

http://cr.openjdk.java.net/~asmotrak/8162484/webrev.03/

Artem


On 08/12/2016 06:17 PM, Xuelei Fan wrote:
> It's a nice find of the port reuse issues.
>
> This update will turn into expected connection failure into
> reading/writing interruption as the server simulate the failure by
> closing the incoming connections.  It's fine for this test,  I think.
>
> For lines like:
>   288         intOcsp.rejectConnections();
>   289         rootOcsp.rejectConnections();
>   290         Thread.sleep(1000);
>
> I was wondering as the server does not need to bootup again, is the
> delay still needed?
>
> Otherwise, looks fine to me.
>
> Thanks,
> Xuelei
>
> On 8/13/2016 6:25 AM, Artem Smotrakov wrote:
>> Thank you for review Jamil.
>>
>> Xuelei,
>>
>> Could you please take a look?
>>
>> Artem
>>
>>
>> On 08/12/2016 02:38 PM, Jamil Nimeh wrote:
>>> Thank you Artem.  The fix looks good.  You just need a +1 from an
>>> official reviewer.
>>>
>>>
>>>
>>> --Jamil
>>>
>>> -------- Original message --------
>>> From: Artem Smotrakov <artem.smotrakov at oracle.com>
>>> Date: 8/12/16 1:07 PM (GMT-08:00)
>>> To: Jamil Nimeh <jamil.j.nimeh at oracle.com>, Security Dev OpenJDK
>>> <security-dev at openjdk.java.net>
>>> Subject: Re: [9] RFR: 8162484:
>>> javax/net/ssl/Stapling/SSLSocketWithStapling.java test fails
>>> intermittently with "Address already in use" error
>>>
>>> No problem.
>>>
>>> http://cr.openjdk.java.net/~asmotrak/8162484/webrev.02/
>>>
>>> Artem
>>>
>>>
>>> On 08/12/2016 12:02 PM, Jamil Nimeh wrote:
>>>> For the tests as we use them today we don't intend the server to
>>>> restart.  The intent of SimpleOCSPServer was to be of use for a
>>>> variety of testing purposes.  I don't know that we can say for all
>>>> intended uses that we'll *never* need to restart it. That's why I'd
>>>> like to keep the unbound socket/set sockopt/bind/listen behavior.  I
>>>> don't think ServerSocket(0) achieves that.
>>>>
>>>> --Jamil
>>>>
>>>> On 8/12/2016 11:30 AM, Artem Smotrakov wrote:
>>>>> Hi Jamil,
>>>>>
>>>>> There was no any specific reason to remove ServerSocket.bind() call.
>>>>> ServerSocket(0) constructor creates a server socket, automatically
>>>>> bound to a random free port. If I am not missing something, it
>>>>> doesn't look necessary to set the SO_REUSEADDR socket options if the
>>>>> server is not going to restart. The code is just shorter if we use
>>>>> ServerSocket(0) constructor to open a server socket, but I can revert
>>>>> it to use bind() with 0 port number if you think it's better.
>>>>>
>>>>> Artem
>>>>>
>>>>>
>>>>> On 08/12/2016 09:13 AM, Jamil Nimeh wrote:
>>>>>> Hi Artem, more comments in-line
>>>>>>
>>>>>>
>>>>>> On 8/11/2016 11:46 AM, Artem Smotrakov wrote:
>>>>>>> Hi Jamil,
>>>>>>>
>>>>>>> Thank you for review. Please see inline.
>>>>>>>
>>>>>>>
>>>>>>> On 08/10/2016 04:16 PM, Jamil Nimeh wrote:
>>>>>>>> Hi Artem,
>>>>>>>>
>>>>>>>> I'm not an official reviewer but the solution for making the
>>>>>>>> servers reject connections rather than stop and start looks pretty
>>>>>>>> fair to me and seems like a nice way to simulate a downed OCSP
>>>>>>>> responder instead of having to bounce it.  A couple
>>>>>>>> comments/questions:
>>>>>>>>
>>>>>>>> I'm a bit surprised that you get the "Address already in use"
>>>>>>>> error though.
>>>>>>> Well, to be honest, I was not able to reproduce this failure
>>>>>>> locally. I was running the test in a loop for a couple of days, and
>>>>>>> it didn't fail. But the test has been observed to fail in other
>>>>>>> test runs (jprt, CI, etc).
>>>>>>>
>>>>>>> I am not an expert in networking, and I would appreciate if someone
>>>>>>> more knowledgeable gives an advise how these intermittent failures
>>>>>>> can be avoided.
>>>>>>>
>>>>>>>> Isn't servSocket.setReuseAddress(true) on line 214 supposed to set
>>>>>>>> the SO_REUSEADDR at the system call level and prevent EADDRINUSE
>>>>>>>> when listening or binding?
>>>>>>> If I am not missing something, the test has been observed to fails
>>>>>>> while re-binding. I am wondering if it's possible that the port
>>>>>>> becomes busy after the server socket was closed, but before bind()
>>>>>>> is called again. The probability of this situation seems to be very
>>>>>>> low which has been actually seen - the test fails very rare.
>>>>>>>
>>>>>>> If this is the case, it seems like servSocket.setReuseAddress(true)
>>>>>>> doesn't help because the port is taken by another process (I am not
>>>>>>> sure that SO_REUSEADDR prevents from this). Again, this is only my
>>>>>>> guess, and I may be wrong.
>>>>>> You know, I hadn't thought of that.  I've never been able to
>>>>>> reproduce that problem either, but I'm running on a local virtual
>>>>>> box VM on a laptop, and usually the tests are running sequentially.
>>>>>> I could definitely see the case where other processes are soaking up
>>>>>> the OCSP responder's port.  With those tests, I kind of need the
>>>>>> port to remain the same since I'm putting that server and port in
>>>>>> the AIA extensions of the certs for which it answers.  Given this
>>>>>> particular case, it seems like your solution of keeping the server
>>>>>> bound but just chopping connections off is the best way to go.
>>>>>>>> When you create the new ServerSocket on line 212, you're now
>>>>>>>> binding it to the port now where originally it started as an
>>>>>>>> unbound socket.  By doing so, the behavior of setReuseAddress() on
>>>>>>>> line 214 is now undefined.
>>>>>>> This setReuseAddress() call looks unnecessary now. I'll update the
>>>>>>> test.
>>>>>>>> While this test no longer stops/starts the server, other tests may
>>>>>>>> wish to do so and their behavior may not be consistent (though
>>>>>>>> apparently it wasn't consistent even in the old scheme where the
>>>>>>>> socket was unbound, then setReuseAddress() was called...)
>>>>>>> Correct. I checked other code which depend on SimpleOCSPServer
>>>>>>>
>>>>>>> javax/net/ssl/Stapling/HttpsUrlConnClient.java
>>>>>>> javax/net/ssl/Stapling/SSLEngineWithStapling.java
>>>>>>> javax/net/ssl/Stapling/SSLSocketWithStapling.java
>>>>>>> javax/net/ssl/Stapling/StapleEnableProps.java
>>>>>>>
>>> sun/security/ssl/StatusStapling/java.base/sun/security/ssl/StatusReqSelection.java
>>>
>>>>>>>
>>> sun/security/ssl/StatusStapling/java.base/sun/security/ssl/StatusResponseManagerTests.java
>>>
>>>>>>> artem at artem-laptop:~/ws/jdk/jdk9_dev_stapling_test/jdk/test$ kate
>>>>>>> javax/net/ssl/Stapling/HttpsUrlConnClient.java
>>>>>>> javax/net/ssl/Stapling/SSLEngineWithStapling.java
>>>>>>> javax/net/ssl/Stapling/SSLSocketWithStapling.java
>>>>>>> javax/net/ssl/Stapling/StapleEnableProps.java
>>>>>>>
>>> sun/security/ssl/StatusStapling/java.base/sun/security/ssl/StatusReqSelection.java
>>>
>>> sun/security/ssl/StatusStapling/java.base/sun/security/ssl/StatusResponseManagerTests.java
>>>
>>>>>>>
>>>>>>> These tests call stop() only once when actual testcases are done.
>>>>>>> Actually, some of them don't even call stop(), but it seems to work
>>>>>>> fine. As an enhancement, I would add stop() calls to finally
>>>>>>> blocks, but it seems to work fine without it anyway.
>>>>>> I liked your solution with the stop() calls in finally blocks and I
>>>>>> agree that they should have them.  I think we get away with it
>>>>>> because in most if not all of those cases they are running as
>>>>>> othervm tests (because we have properties that we set specific to
>>>>>> the tests).  So when the JVM exits resources like sockets are closed
>>>>>> by the OS.  Still, it's better to have the try/finally guards and
>>>>>> explicitly and gracefully shutdown the OCSP responders.
>>>>>>> Here is an updated webrev:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~asmotrak/8162484/webrev.01/
>>>>>> I realize that in many of these test cases we're going to move away
>>>>>> from a start/stop approach to your accept/reject one, but in general
>>>>>> sockets designed to be listening should start unbound, set the
>>>>>> SO_REUSEADDR sockopt, then bind and listen. Was there a specific
>>>>>> reason to change that code, or was it just to streamline it?  Aside
>>>>>> from fewer lines of code, I'm not sure what it buys us.
>>>>>>> Artem
>>>>>>>
>>>>>>>
>>>>>>>> --Jamil
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/10/2016 03:44 PM, Artem Smotrakov wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Please review this update for OCSP stapling tests.
>>>>>>>>>
>>>>>>>>> The tests use
>>>>>>>>> test/java/security/testlibrary/SimpleOCSPServer.java which try to
>>>>>>>>> re-use a server port if the server restarted. Looks like
>>>>>>>>> sometimes it may cause "Address already in use" error.
>>>>>>>>>
>>>>>>>>> The patch updates OCSP stapling tests with the following:
>>>>>>>>> - updated SSLSocketWithStapling.java test not to restart OCSP
>>>>>>>>> responders
>>>>>>>>> - updated SimpleOCSPServer to be able to reject incoming
>>> connections
>>>>>>>>> - updated SimpleOCSPServer to be able to reproduce a delay
>>>>>>>>> without restarting
>>>>>>>>>
>>>>>>>>> Jamil,
>>>>>>>>>
>>>>>>>>> Could you please take a look at this update, and confirm if this
>>>>>>>>> update doesn't break the original test scenarios?
>>>>>>>>>
>>>>>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8162484
>>>>>>>>> Webrev: http://cr.openjdk.java.net/~asmotrak/8162484/webrev.00/
>>>>>>>>>
>>>>>>>>> Artem




More information about the security-dev mailing list