NIO Socket.read() may not respect the socket timeout
Alexei Olkhovskii
alexei.olkhovskii at servicenow.com
Wed May 28 17:56:20 UTC 2025
Hi Alan, thanks for the response.
> Would it be possible to do some digging to say for sure if this happens after StackOverflowError (SOE)?
We saw a thread throw an SOE while holding a DB connection. Later, another thread got stuck while trying to clean up a connection. Unfortunately, we didn’t collect thread or memory dumps, so we can’t state it with 100% confidence. But everything we see in the logs and the code points to the scenario described below.
> Can you confirm you created this to demonstrate your point and isn't representative of what the code is doing?
The code demonstrates the point, but it comes from real code. It starts with our DB connection pool checking that a connection has no stale data in its socket buffer before handing it out. If it detects stale data (which was the case here), it concludes the connection can’t be trusted and kills it by calling the most brutal API method available: abort(). In the MariaDB JDBC driver, abort() calls the internal method closeSocket():
https://github.com/mariadb-corporation/mariadb-connector-j/blob/2.3.0/src/main/java/org/mariadb/jdbc/internal/protocol/AbstractConnectProtocol.java#L275
Now, for some reason closeSocket() tries to ensure the socket buffer is empty by attempting to read a byte from it. To protect against hangs, the driver sets a 3-millisecond timeout on the socket:
https://github.com/mariadb-corporation/mariadb-connector-j/blob/2.3.0/src/main/java/org/mariadb/jdbc/internal/protocol/AbstractConnectProtocol.java#L203
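For illustration, the pattern described above (set a short SO_TIMEOUT, then attempt a read) can be sketched roughly as follows. This is a paraphrase under stated assumptions, not the driver's actual code: the class name DrainWithTimeoutSketch and the helper readTimesOut() are hypothetical, and a silent loopback peer stands in for the database server. In the normal case, with no other thread involved, the read gives up once the timeout expires.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class DrainWithTimeoutSketch {

    // Hypothetical helper: set SO_TIMEOUT, try to read one byte, and report
    // whether the read gave up with a SocketTimeoutException.
    static boolean readTimesOut(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        InputStream in = socket.getInputStream();
        try {
            in.read();
            return false; // a byte or end-of-stream arrived before the timeout
        } catch (SocketTimeoutException e) {
            return true;  // nothing arrived within timeoutMillis
        }
    }

    // Connects to a loopback peer that never sends anything, then probes it.
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            return readTimesOut(client, 3);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + demo());
    }
}
```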
NIO’s read() first protects the operation by acquiring the read lock, which makes total sense. The gap is that it acquires the lock without any timeout. Since the lock is already held by the compromised thread, our thread hangs as well.
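A minimal sketch of how one might probe this behavior with two readers on the same socket. It is not the test attached to the original report; the class name SecondReaderSketch is hypothetical, the 200 ms sleep is a crude way to let the first reader enter read(), and the outcome depends on the JDK in use. On JDKs where NioSocketImpl is the default socket implementation, the second reader is expected to stay blocked well past its 3 ms timeout.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SecondReaderSketch {

    // Starts a reader without a timeout, then a second reader with a 3 ms
    // timeout, and reports whether the second one came back within a second.
    static String probe() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {

            // First reader: no timeout is set yet, so it blocks indefinitely.
            // It stands in for the thread that never leaves read().
            Thread first = new Thread(() -> {
                try {
                    client.getInputStream().read();
                } catch (IOException ignored) {
                    // expected once the sockets are closed
                }
            });
            first.setDaemon(true);
            first.start();
            Thread.sleep(200); // crude: give the first reader time to block

            // Second reader: a 3 ms timeout should bound the read.
            client.setSoTimeout(3);
            CountDownLatch done = new CountDownLatch(1);
            Thread second = new Thread(() -> {
                try {
                    client.getInputStream().read();
                } catch (IOException ignored) {
                    // SocketTimeoutException if the timeout was honored
                } finally {
                    done.countDown();
                }
            });
            second.setDaemon(true);
            second.start();

            boolean returned = done.await(1, TimeUnit.SECONDS);
            return returned
                    ? "second read returned within 1000 ms"
                    : "second read still blocked after 1000 ms";
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(probe());
    }
}
```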
As you’ve pointed out, this isn’t a normal or intended scenario. Moreover, NIO has every right and obligation to protect multi-threaded access to the socket.
The only ask is to respect the timeout while doing that. Feels like this can be easily achieved by using tryLock() similarly to accept().
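To make the suggestion concrete, here is a minimal sketch of a read path that bounds lock acquisition with the socket timeout. It uses a plain ReentrantLock and is not a patch against NioSocketImpl: the class TimedReadLockSketch, its read(int) method, and the placeholder return value are all hypothetical. A real change would presumably also deduct the time spent waiting for the lock from the time left for the read itself, and would need its own interrupt handling.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TimedReadLockSketch {
    private final ReentrantLock readLock = new ReentrantLock();

    // Hypothetical guarded read: with a positive timeout the lock wait is
    // bounded; with a zero timeout it waits indefinitely, as before.
    int read(int timeoutMillis) throws SocketTimeoutException, InterruptedException {
        if (timeoutMillis > 0) {
            if (!readLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new SocketTimeoutException("Read timed out");
            }
        } else {
            readLock.lock();
        }
        try {
            return -1; // placeholder for the actual socket I/O
        } finally {
            readLock.unlock();
        }
    }

    // One thread grabs the lock and never releases it; a second caller with
    // a 3 ms timeout should get a SocketTimeoutException instead of hanging.
    static boolean demo() {
        TimedReadLockSketch sketch = new TimedReadLockSketch();
        CountDownLatch held = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            sketch.readLock.lock();
            held.countDown();
            try {
                Thread.sleep(Long.MAX_VALUE); // simulate a thread that never returns
            } catch (InterruptedException ignored) {
                // daemon thread, nothing to clean up
            }
        });
        holder.setDaemon(true);
        holder.start();
        try {
            held.await();
            sketch.read(3);
            return false; // lock was acquired, which is not expected here
        } catch (SocketTimeoutException e) {
            return true;  // lock wait was bounded by the timeout
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out instead of hanging: " + demo());
    }
}
```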
--
Regards, Alexei (team: Data Scale/Dev-Persistence, dept: Platform Engineering, Loc: San Diego, Mgr: Venkata Koya, Teams/MM: alexei.olkhovskii)
From: Alan Bateman <alan.bateman at oracle.com>
Date: Tuesday, May 27, 2025 at 22:59
To: Alexei Olkhovskii <alexei.olkhovskii at servicenow.com>, nio-dev at openjdk.org <nio-dev at openjdk.org>
Subject: Re: NIO Socket.read() may not respect the socket timeout
On 27/05/2025 23:50, Alexei Olkhovskii wrote:
> Hello,
> We’ve run into a corner-case issue with socket read(). In short, it doesn’t honor the socket timeout when trying to acquire the read lock:
> https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/sun/nio/ch/NioSocketImpl.java#L336
>     readLock.lock();
> In our case one thread obtained a database connection and got stuck in read() (we think because of a StackOverflowError).
Would it be possible to do some digging to say for sure if this happens after StackOverflowError (SOE)? There have been a few reports of issues that appear to be lock "corruption" when continuing after SOE and it may be that the reserved stack area for critical sections needs to be re-visited.
As regards the test: can you confirm you created this to demonstrate your point and isn't representative of what the code is doing? For a TCP/stream connection, having several threads read from the same connection would require coordination at a higher level. If it does arise that one thread is blocked reading and another thread attempts to read from the same connection, then the second thread is blocked until the first finishes reading. It would be a strange scenario to arise, and it is different from the accept case, where it's okay to have threads concurrently attempt to accept connections because they don't interfere.
-Alan