RFR: 8357639: DigestEchoClient fails intermittently due to: java.io.IOException: Data received while in pool [v3]
Daniel Fuchs
dfuchs at openjdk.org
Mon Jun 9 11:33:55 UTC 2025
On Mon, 9 Jun 2025 11:17:19 GMT, Daniel Fuchs <dfuchs at openjdk.org> wrote:
>> The java/net/httpclient/DigestEchoClient.java fails intemittently with IOException: HTTP/1.1 header parser received no bytes.
>>
>> Analysis shows that this is caused by the CleanupTrigger which receives data after the reused connection has been taken out of the HTTP/1.1 clear pool (Caused by: java.io.IOException: Data received while in pool). This should not happen.
>>
>> The fix for [JDK-8338569](https://bugs.openjdk.org/browse/JDK-8338569) has improved the situation but apparently didn't fix the issue completely.
>> The issue is that the write publisher manages to come active before the read subscriber has been switched, which opens a window in which the old read subscriber receives the data that should have been passed to the new one.
>>
>> To fix that, this change proposes to pause reading at the selector level before connecting the flows with the new publisher /subscriber pair. This should ensure that data doesn't reach the "old" subscriber, if the new write publisher manage to send data before the new read subscriber is subscribed.
>
> Daniel Fuchs has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
> - A better fix
> - Merge branch 'master' into DigestEchoClient-cleanup-8357639
> - Access pendingSubscriptions only from InternalReadSubscriptionImpl - and synchronize offer/handle to avoid concurrent removal from the queue
> - Merge branch 'master' into DigestEchoClient-cleanup-8357639
> - Avoid unnecessary volatile reads
> - 8357639
@djelinski managed to find the exact place where to put a Thread.sleep() to reproduce the issue more often. Thanks to that I was able to come with a better fix. The issue no longer reproduces. The fix is too call `handle` in the read subscribe event, instead of `readScheduler.readOrSchedule`. Indeed the `read` loop calls `handle` and is supposed to return after handling pending subscriptions. But if the pending subscription has been already handled it could proceed to read and end up reading from the socket at the wrong time. I also changed the `pendingSubscriptions` from queue to `AtomicReference` since with the additional synchronization we are now certain that there can be only one pending subscriber in the queue at any given time.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/25416#issuecomment-2955521761
More information about the net-dev
mailing list