RFR 8006395: Race in async socket close on Linux

Chris Hegarty chris.hegarty at oracle.com
Wed Jan 30 13:36:07 PST 2013


There is a very small, and very old, window for a race in the socket 
asynchronous close mechanism on Linux. This can lead to blocking socket 
operations never returning even after the socket has been closed.

This issue appears to have existed since the creation of linux_close.c back 
in 1.4, but because the window for the race is tiny, it seems to have gone 
unnoticed until now. It was originally diagnosed through code 
inspection, but I have since created and added a small test that 
reproduces the issue about once in every 10 - 20 runs, with jdk8, on 
Ubuntu 12.04, with 2x 2.33GHz Intel Xeon E5345 (2x quad-core, 1 thread 
per core => 8 threads).

closefd first interrupts (sends a wakeup signal to) all the threads 
blocked on the fd, then it closes/dup2's the fd. However, the signal may 
arrive at its target thread before that thread has entered the blocking 
system call, and before close/dup2. In this case, the target thread will 
simply enter the blocking system call and never return.

Solution
---------
If closefd were to close/dup2 the fd before issuing the wakeup, then any 
thread not yet blocked in a system call would see that the fd is closed 
on entry; any thread already blocked would be woken up by the signal.

While there is an equivalent closefd in bsd_close.c (mac/bsd specific 
code), I have not been able to reproduce this issue after many test runs 
on mac. Also, making similar changes to closefd in bsd_close.c runs into a 
problem with dup2; dup2 will hang if another thread is doing a blocking 
operation. I believe this issue is similar to 7133499. So, as far as this 
issue is concerned, changes will only be made to the Linux version of 
closefd.

Webrev
-------

http://cr.openjdk.java.net/~chegar/8006395/webrev.00/webrev/

-Chris.


