RFR: 8339356: Test javax/net/ssl/SSLSocket/Tls13PacketSize.java failed with java.net.SocketException: An established connection was aborted by the software in your host machine [v2]

Daniel Jeliński djelinski at openjdk.org
Fri Dec 13 11:03:38 UTC 2024


On Thu, 12 Dec 2024 16:54:07 GMT, Matthew Donovan <mdonovan at openjdk.org> wrote:

>> test/jdk/javax/net/ssl/SSLSocket/Tls13PacketSize.java line 75:
>> 
>>> 73:         sslOS.write(appData);
>>> 74:         sslOS.flush();
>>> 75:         sslIS.read();
>> 
>> The failure is caused by closing the socket before all the data sent by the client is read. In order to read all the data, you need something like:
>> Suggestion:
>> 
>>         sslIS.read(appData, 1, appData.length-1);
>> 
>> The failure is hard to reproduce. You might need to repeat the test a few hundred times to see if it's gone. And it only affects Windows.
>
> I ran the test many hundreds of times and didn't see the error. I updated the code as you suggested and ran it 100s of times again without reproducing it.

FWIW, here's a diff that reliably reproduces the issue:

diff --git a/src/java.base/share/classes/sun/security/ssl/SSLSocketOutputRecord.java b/src/java.base/share/classes/sun/security/ssl/SSLSocketOutputRecord.java
index a7809754ed0..b57f1665408 100644
--- a/src/java.base/share/classes/sun/security/ssl/SSLSocketOutputRecord.java
+++ b/src/java.base/share/classes/sun/security/ssl/SSLSocketOutputRecord.java
@@ -358,6 +358,13 @@ void deliver(byte[] source, int offset, int length) throws IOException {
                 }

                 offset += fragLen;
+                if (offset < limit) {
+                    try {
+                        Thread.sleep(100);
+                    } catch (InterruptedException e) {
+                        //throw new RuntimeException(e);
+                    }
+                }
             }
         } finally {
             recordLock.unlock();

This is a pretty typical way to reproduce a race condition: it adds a delay on the right code path.

I just verified that the PR doesn't fix the failure. That is probably because the call to `read` returns whatever data is immediately available and doesn't wait for more data to arrive.
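
To illustrate the `read` behavior, here is a minimal standalone sketch. It is not the test's code and not necessarily the right fix for it. `ReadFullyDemo`, `TrickleInputStream` and `readFully` are hypothetical names made up for this example; the trickle stream stands in for a socket stream that delivers the payload in pieces, the way the sleep in the diff above makes the records arrive one at a time.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullyDemo {

    /** Hypothetical stand-in for a socket stream: returns at most maxChunk bytes per bulk read. */
    static final class TrickleInputStream extends InputStream {
        private final InputStream in;
        private final int maxChunk;

        TrickleInputStream(byte[] data, int maxChunk) {
            this.in = new ByteArrayInputStream(data);
            this.maxChunk = maxChunk;
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Like a socket, hand back only what is "available" right now.
            return in.read(b, off, Math.min(len, maxChunk));
        }
    }

    /** Calls read repeatedly until len bytes have arrived or the stream ends; returns the count read. */
    static int readFully(InputStream in, byte[] buf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n < 0) {
                break; // end of stream before len bytes arrived
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] appData = new byte[16 * 1024];

        // A single read call can return after the first chunk.
        InputStream once = new TrickleInputStream(appData, 1000);
        int single = once.read(new byte[appData.length], 0, appData.length);

        // The loop keeps reading until everything the peer sent has been consumed.
        InputStream looped = new TrickleInputStream(appData, 1000);
        int all = readFully(looped, new byte[appData.length], 0, appData.length);

        System.out.println("single read: " + single + ", read loop: " + all);
    }
}
```

On JDK 9 and later, `InputStream.readNBytes(byte[], int, int)` should do the same job as the hand-written loop; whether either is the right change for the test itself is a separate question.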

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22591#discussion_r1883759949

