RFR: 8334719: (se) Deferred close of SelectableChannel may result in a Selector doing the final close before concurrent I/O on channel has completed [v2]

Alan Bateman alanb at openjdk.org
Tue Jun 25 15:22:15 UTC 2024


On Tue, 25 Jun 2024 13:55:47 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:

>> Can I please get a review of this change which proposes to fix the issue noted in https://bugs.openjdk.org/browse/JDK-8334719?
>> 
>> Alan's comment in that issue summarizes what this issue is about https://bugs.openjdk.org/browse/JDK-8334719?focusedId=14684071&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14684071. As noted there, the deferred close implementation in several of the `SelectableChannel` implementations has a bug where it doesn't check for in-progress operations on the channel when closing/releasing the underlying resources of the channel. We started noticing this with `DatagramChannel` consistently, but the issue is applicable to other channel implementations as well.
>> 
>> The fix for the issue in this PR has been provided by Alan and I've run the necessary tests several thousand times to verify that it fixes the original issue. 
>> 
>> A new jtreg test has been introduced to reproduce the bug and verify the fix. Without the fix, the new test consistently fails for all test methods (except for ServerSocketChannel, where it isn't easy to trigger a race). With the fix, all tests consistently pass.
>> 
>> A successful test run of the new test takes around 60 seconds. So I've intentionally set a timeout of 4 minutes on the test to allow for some leeway on slow CI systems.
>> 
>> Another round of tier testing with these changes is currently in progress.
>
> Jaikiran Pai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   don't inject NIO_ACCESS instead patch InetSocketAddress to introduce crafted delays

test/jdk/java/nio/channels/Selector/java.base/java/net/InetSocketAddress.java line 39:

> 37:             return false;
> 38:         }
> 39:     };

A simpler way to do this is to just add a setDelay method, or even a constructor that allows the delay to be specified when creating the object. That will allow the ThreadLocal to go away and make the test a bit simpler.
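
As a rough illustration of the suggestion above, here is a minimal sketch of an address object whose delay is fixed at construction time instead of being looked up from a ThreadLocal. This is not the code from the PR: the class name `DelayedAddress`, its constructor parameters, and the choice of `isUnresolved()` as the delayed method are all assumptions made for the example. The actual test patches `java.net.InetSocketAddress` itself.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.time.Duration;

// Hypothetical stand-in for the patched test class; names are illustrative only.
final class DelayedAddress {
    private final InetSocketAddress address;
    // The delay is per-instance state set by the constructor,
    // so no ThreadLocal flag is needed to decide whether to sleep.
    private final Duration delay;

    DelayedAddress(int port, Duration delay) {
        // Loopback address avoids any name lookup.
        this.address = new InetSocketAddress(InetAddress.getLoopbackAddress(), port);
        this.delay = delay;
    }

    // Assumed hook: sleeps for the configured delay before answering,
    // which widens the window in which a concurrent close can race with I/O.
    boolean isUnresolved() {
        try {
            Thread.sleep(delay.toMillis());
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
        return address.isUnresolved();
    }
}
```

With this shape, a test that needs the crafted delay would create `new DelayedAddress(port, Duration.ofMillis(50))`, while code that does not need it passes `Duration.ZERO`. The decision is then visible at the point where the object is created.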

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19879#discussion_r1653031412

