RFR(s): Improving performance of Windows socket connect on the loopback adapter
Nikola Grcevski
Nikola.Grcevski at microsoft.com
Tue Jul 28 14:03:02 UTC 2020
Hi Alan,
Thanks again for testing this change. I dug deep into the issue yesterday and got some answers from the Windows Networking team.
The issue is that the flag TCP_INITIAL_RTO_NO_SYN_RETRANSMISSIONS, which we passed in to completely eliminate the network delay, isn't defined (or checked) on Windows 10 versions prior to RS3 (Redstone 3). On those versions the flag was interpreted as TCP_INITIAL_RTO_DEFAULT_MAX_SYN_RETRANSMISSIONS, causing 255 retries of 500ms each. As a result, each individual connect was delayed by roughly 128 seconds in total.
I was advised to change the code to perform a runtime check on the exact Windows version: unless it's Windows 10 RS3 or later, we should set the retransmission count to 1. Strangely enough, we can't set it to 0, which is a special value interpreted as "use the default". With a retransmission count of 1, we speed up localhost connects on older versions of Windows by a factor of 2.
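To illustrate the selection logic described above, here is a minimal sketch in portable C. It is not the code from the webrev. The function names are hypothetical, the RS3 build number (16299, Windows 10 version 1709) is my assumption rather than something stated in this thread, and SKETCH_NO_SYN_RETRANSMISSIONS is only a stand-in for the real TCP_INITIAL_RTO_NO_SYN_RETRANSMISSIONS constant from the SDK headers.

```c
#include <stdbool.h>

/* Assumption: Windows 10 RS3 (version 1709) reports version 10.0, build 16299. */
#define SKETCH_RS3_MAJOR 10u
#define SKETCH_RS3_MINOR 0u
#define SKETCH_RS3_BUILD 16299u

/* Hypothetical stand-in for TCP_INITIAL_RTO_NO_SYN_RETRANSMISSIONS;
 * real code would take the constant from the Windows SDK headers. */
#define SKETCH_NO_SYN_RETRANSMISSIONS 0xFEu

/* Returns true when (major, minor, build) is at or above the wanted version.
 * Fields are compared in order of significance: major, then minor, then build. */
static bool sketch_is_version_or_greater(unsigned major, unsigned minor, unsigned build,
                                         unsigned want_major, unsigned want_minor,
                                         unsigned want_build)
{
    if (major != want_major) {
        return major > want_major;
    }
    if (minor != want_minor) {
        return minor > want_minor;
    }
    return build >= want_build;
}

/* Picks the SYN retransmission setting for loopback connects:
 * the "no retransmissions" flag on RS3 or later, otherwise a count of 1.
 * A count of 0 is avoided because it means "use the default". */
static unsigned sketch_choose_max_syn_retransmissions(unsigned major, unsigned minor,
                                                      unsigned build)
{
    if (sketch_is_version_or_greater(major, minor, build, SKETCH_RS3_MAJOR,
                                     SKETCH_RS3_MINOR, SKETCH_RS3_BUILD)) {
        return SKETCH_NO_SYN_RETRANSMISSIONS;
    }
    return 1u;
}
```

In a real implementation the version triple would come from the OS at runtime and the chosen value would be passed to the socket through the initial RTO ioctl. Both steps are left out here so the sketch stays platform independent.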
I have prepared a new webrev with the runtime check for review here:
http://cr.openjdk.java.net/~adityam/nikola/fast_connect_loopback_4/
For the Windows version check function I followed the naming standards the SDK uses in:
https://docs.microsoft.com/en-us/windows/win32/api/versionhelpers/
If it's not a suitable function name, please let me know. Microsoft has added this helper function for .NET 4.8, but it's not there yet for Win32. Hopefully it will be provided by Microsoft in a future SDK update and we can then remove our helper. I attempted to use IsWindowsVersionOrGreater, but unfortunately that API doesn't allow me to specify the build number needed to detect RS3.
Thanks,
Nikola
-----Original Message-----
From: Alan Bateman <Alan.Bateman at oracle.com>
Sent: July 26, 2020 6:45 AM
To: Nikola Grcevski <Nikola.Grcevski at microsoft.com>; net-dev at openjdk.java.net
Subject: Re: RFR(s): Improving performance of Windows socket connect on the loopback adapter
On 24/07/2020 16:20, Nikola Grcevski wrote:
> Thanks Alan, yes I'll need a sponsor for the patch.
>
>
I tried the patch in our CI and test/jdk/java/net/Socket/Timeouts.java is consistently failing on Windows Server 2016 systems, specifically testTimedConnect2, which expects a "connection refused" within 10s of attempting to connect to a port on the loopback that doesn't have any service running. SIO_TCP_INITIAL_RTO seems to be intended for desktop systems, so I'm wondering if you can find out whether there are any issues with using it on Windows Server editions.
-Alan