8198928: (so) SocketChannel connect may deadlock if closed at around same time that connect fails
Hamlin Li
huaming.li at oracle.com
Tue Mar 6 08:39:59 UTC 2018
Hi Alan,
I have minor comment about CloseDuringConnect.java, is it better to move
following code from main() to test()?
SocketAddress remote = new InetSocketAddress("127.0.0.1", Utils.getFreePort());
It will reduce the chance to interfere with other tests which run
concurrently and also create a listening socket at THE free port same as
this test.
Thank you
-Hamlin
On 05/03/2018 4:22 AM, Alan Bateman wrote:
>
> This is a deadlock found with one of the HTTP client tests. When
> establishing a connection with a SocketChannel then the channel is
> specified to be closed if the connection cannot be established. If the
> SocketChannel is configured non-blocking and connect (or
> finishConnect) is closing the channel due to a connect failure then it
> can potentially deadlock with another thread that attempts to close
> the channel at the same time. The deadlock arises because
> connect/finishConnect invoke close (closeLock) while holding the read
> and write locks. Another thread closing the channel acquires the
> clockLock, the actual close implementation then attempts to acquire
> the read and write locks in order to coordinate with threads doing
> non-blocking operations on the change.
>
> Historical note: The implementation of connect/finishConnect has
> always called close while holding the read and write locks but it
> hasn't been problem in the past. The recent changes to to separate the
> blocking and non-blocking code paths have changed the locking during
> close which is why we are only seeing this now.
>
> The proposed change is to connect/finishConnect so that these methods
> don't close the channel until after they've released the read and
> write locks. The changes mean there is a window where some other
> thread might call finishConnect or some other operation so a few other
> small changes are needed to deal with that. It would be possible to
> introduce a special state for this case this adds complexity so I've
> decided against that.
>
> The webrev with the proposed changes and test are here:
> http://cr.openjdk.java.net/~alanb/8198928/webrev/index.html
>
> In terms of behavior changes then an observer polling
> isConnectionPending will now observe the connection is pending even if
> the channel is configured blocking. Previously, isConnectionPending
> would never return true if connect was called when configured blocking.
>
> Another observer polling with getRemoteAddress will now observe the
> remote address while the connection is pending. This was always true
> with socket channels configured non-blocking, not so with channels
> configured blocking. The spec for getRemoteAddress is that the remote
> address is only returned when the channel is connected but it can be
> useful to observe it while a connection is established. I will create
> a separate JIRA issue to track this as it's a long standing spec vs.
> implementation difference.
>
> As I mentioned, if the connection cannot be established, there is a
> small window for other threads to call finishConnect or some other
> operation before the channel is closed. This is mostly benign except
> on platforms where the connection error isn't available after it has
> been obtained. To deal with this requires checkConnect to check for
> POLLHUP, which is should have done anyway. While in the area I did
> some cleanup on the Windows implementation of checkConnect.
>
> All tests are passing with these changes, including the HTTP client
> tests where this issue was initially observed as a test timing out
> intermittently.
>
> -Alan
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/nio-dev/attachments/20180306/d1c18d11/attachment.html>
More information about the nio-dev
mailing list