RFR: 8334719: (se) Deferred close of SelectableChannel may result in a Selector doing the final close before concurrent I/O on channel has completed

Jaikiran Pai jpai at openjdk.org
Tue Jun 25 13:58:10 UTC 2024


On Tue, 25 Jun 2024 11:31:53 GMT, Jaikiran Pai <jpai at openjdk.org> wrote:

> Can I please get a review of this change which proposes to fix the issue noted in https://bugs.openjdk.org/browse/JDK-8334719?
> 
> Alan's comment in that issue summarizes what this issue is about: https://bugs.openjdk.org/browse/JDK-8334719?focusedId=14684071&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14684071. As noted there, the deferred close implementation in several of the `SelectableChannel` implementations has a bug: it doesn't check for in-progress operations on the channel when closing/releasing the channel's underlying resources. We started noticing this consistently with `DatagramChannel`, but the issue applies to other channel implementations as well.
> 
> The fix for the issue in this PR has been provided by Alan and I've run the necessary tests several thousand times to verify that it fixes the original issue. 
> 
> A new jtreg test has been introduced to reproduce the bug and verify the fix. Without the fix, the new test consistently fails for all test methods (except for ServerSocketChannel, where it isn't easy to trigger a race). With the fix, all tests consistently pass.
> 
> A successful test run of the new test takes around 60 seconds. So I've intentionally set a timeout of 4 minutes on the test to allow for some leeway on slow CI systems.
> 
> Another round of tier testing with these changes is currently in progress.

Alan suggested that this new test use a different class/API, one that isn't subject to frequent changes the way `JavaNIOAccess` is.

Based on Alan's suggestion of patching `InetSocketAddress` in this test, I was able to introduce delays at certain places within the patched `InetSocketAddress`. The updated test continues to consistently reproduce the failures in DatagramChannel, without the fix proposed in this PR. The test then passes with the proposed fix.
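
To make the race described in the quoted text concrete, here is a minimal sketch of the deferred close idea. It is not the JDK code and not the fix in this PR; the class `DeferredCloseSketch` and its methods (`begin`, `end`, `selectorKill`, `isReleased`) are hypothetical names used only for illustration. The assumption is that the underlying resource may be released only once the channel is closing, the selector has flushed its key, and no I/O operation is still in progress. The bug corresponds to leaving out the last of these checks on the selector's path.

```java
// Illustrative model only: hypothetical names, not the JDK implementation.
final class DeferredCloseSketch {
    private final Object stateLock = new Object();
    private int inProgress;             // I/O operations currently running
    private boolean registered = true;  // key still registered with a selector
    private boolean closing;            // close() has been called
    private boolean released;           // underlying resource has been released

    /** Marks the start of an I/O operation; returns false if the channel is closing. */
    boolean begin() {
        synchronized (stateLock) {
            if (closing) {
                return false;
            }
            inProgress++;
            return true;
        }
    }

    /** Marks the end of an I/O operation; the last one out may do the final release. */
    void end() {
        synchronized (stateLock) {
            inProgress--;
            tryRelease();
        }
    }

    /** Requests a close; the release is deferred if the conditions are not yet met. */
    void close() {
        synchronized (stateLock) {
            closing = true;
            tryRelease();
        }
    }

    /** Models the selector flushing the cancelled key of a closed channel. */
    void selectorKill() {
        synchronized (stateLock) {
            registered = false;
            tryRelease();
        }
    }

    // Release only when closing, deregistered AND idle. Omitting the
    // inProgress check here is the kind of race discussed above.
    private void tryRelease() {
        if (closing && !registered && inProgress == 0) {
            released = true;
        }
    }

    boolean isReleased() {
        synchronized (stateLock) {
            return released;
        }
    }

    public static void main(String[] args) {
        DeferredCloseSketch ch = new DeferredCloseSketch();
        ch.begin();          // an I/O operation starts
        ch.close();          // close is requested while it runs
        ch.selectorKill();   // the selector processes the cancelled key
        System.out.println(ch.isReleased()); // false: I/O is still in progress
        ch.end();            // the I/O operation completes
        System.out.println(ch.isReleased()); // true: the last operation released
    }
}
```

In this model, whichever of `end()`, `close()` or `selectorKill()` runs last performs the release, so the selector can never release the resource underneath a concurrent operation.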

-------------

PR Comment: https://git.openjdk.org/jdk/pull/19879#issuecomment-2189031001

