8198928: (so) SocketChannel connect may deadlock if closed at around same time that connect fails

Hamlin Li huaming.li at oracle.com
Tue Mar 6 08:39:59 UTC 2018


Hi Alan,

I have minor comment about CloseDuringConnect.java, is it better to move 
following code from main() to test()?

SocketAddress remote = new InetSocketAddress("127.0.0.1", Utils.getFreePort());

It will reduce the chance to interfere with other tests which run 
concurrently and also create a listening socket at THE free port same as 
this test.


Thank you

-Hamlin


On 05/03/2018 4:22 AM, Alan Bateman wrote:
>
> This is a deadlock found with one of the HTTP client tests. When 
> establishing a connection with a SocketChannel then the channel is 
> specified to be closed if the connection cannot be established. If the 
> SocketChannel is configured non-blocking and connect (or 
> finishConnect) is closing the channel due to a connect failure then it 
> can potentially deadlock with another thread that attempts to close 
> the channel at the same time. The deadlock arises because 
> connect/finishConnect invoke close (closeLock) while holding the read 
> and write locks. Another thread closing the channel acquires the 
> clockLock, the actual close implementation then attempts to acquire 
> the read and write locks in order to coordinate with threads doing 
> non-blocking operations on the change.
>
> Historical note: The implementation of connect/finishConnect has 
> always called close while holding the read and write locks but it 
> hasn't been problem in the past. The recent changes to to separate the 
> blocking and non-blocking code paths have changed the locking during 
> close which is why we are only seeing this now.
>
> The proposed change is to connect/finishConnect so that these methods 
> don't close the channel until after they've released the read and 
> write locks. The changes mean there is a window where some other 
> thread might call finishConnect or some other operation so a few other 
> small changes are needed to deal with that. It would be possible to 
> introduce a special state for this case this adds complexity so I've 
> decided against that.
>
> The webrev with the proposed changes and test are here:
>   http://cr.openjdk.java.net/~alanb/8198928/webrev/index.html
>
> In terms of behavior changes then an observer polling 
> isConnectionPending will now observe the connection is pending even if 
> the channel is configured blocking. Previously, isConnectionPending 
> would never return true if connect was called when configured blocking.
>
> Another observer polling with getRemoteAddress will now observe the 
> remote address while the connection is pending. This was always true 
> with socket channels configured non-blocking, not so with channels 
> configured blocking. The spec for getRemoteAddress is that the remote 
> address is only returned when the channel is connected but it can be 
> useful to observe it while a connection is established. I will create 
> a separate JIRA issue to track this as it's a long standing spec vs. 
> implementation difference.
>
> As I mentioned, if the connection cannot be established, there is a 
> small window for other threads to call finishConnect or some other 
> operation before the channel is closed. This is mostly benign except 
> on platforms where the connection error isn't available after it has 
> been obtained. To deal with this requires checkConnect to check for 
> POLLHUP, which is should have done anyway. While in the area I did 
> some cleanup on the Windows implementation of checkConnect.
>
> All tests are passing with these changes, including the HTTP client 
> tests where this issue was initially observed as a test timing out 
> intermittently.
>
> -Alan
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20180306/d1c18d11/attachment.html>


More information about the nio-dev mailing list