JEP-353 - Socket.connect now throws NoRouteToHostException as against ConnectException previously

Alan Bateman Alan.Bateman at oracle.com
Sat Aug 21 16:26:52 UTC 2021


On 21/08/2021 12:40, Jaikiran Pai wrote:
> I was able to reproduce this on macOS. However, the continuous 
> integration setup project for Quarkus projects runs these tests 
> against Linux and Windows setups and they have run into this issue at 
> least on the Linux OS jobs (I will need to go and check if Windows 
> jobs had failed too). I can get the specific OS versions if necessary, 
> but I don't think that will be needed (due to the reproducer I explain 
> below).
>
> :
>
> From what I see in the output of this program, the resolution of 
> microprofile.io returns 4 IP addresses. 2 of them are of type IPv4 and 
> 2 are of type IPv6. Across all Java versions, for IPv4 addresses, the 
> connection attempts fail with the same 
> "java.net.SocketTimeoutException: Connect timed out". However, for the 
> IPv6 addresses, in Java 11, the connection attempts fail with 
> "java.net.ConnectException: No route to host (connect failed)" whereas 
> in Java 16, 17 and upstream latest, the connection attempts against 
> the IPv6 addresses fail with "java.net.NoRouteToHostException: No 
> route to host".

Thanks for the additional information, I think I understand the issue now.

If you extend your test to include a connect without a timeout then 
you'll see that old and new implementations throw NoRouteToHostException 
when the underlying error is EHOSTUNREACH "No route to host".

However, for the connect with timeout case on Linux/macOS/Unix the old 
implementation doesn't correctly handle network errors when they are 
reported immediately. It throws ConnectException for all errors, 
including EHOSTUNREACH "No route to host", whereas it should map the 
error to a specific exception as it does for the untimed case. It's 
possible that this bug has existed for a long time.
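
The timed versus untimed difference can be observed with a small probe. 
What follows is only a minimal sketch, not the reproducer from the 
original report: the class name ConnectProbe, the probe helper, the 
default port 443 and the 5 second timeout are assumptions made for 
illustration. It attempts one timed and one untimed connect to each 
resolved address and prints the class of the exception, if any.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

public final class ConnectProbe {

    // Hypothetical helper: returns "connected" or the class name of the
    // exception thrown by Socket.connect. A timeout of 0 means "untimed".
    static String probe(InetSocketAddress target, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            if (timeoutMillis > 0) {
                socket.connect(target, timeoutMillis); // timed connect
            } else {
                socket.connect(target);                // untimed connect
            }
            return "connected";
        } catch (IOException e) {
            return e.getClass().getName();
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        String host = args.length > 0 ? args[0] : "microprofile.io";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 443; // assumed port
        for (InetAddress address : InetAddress.getAllByName(host)) {
            InetSocketAddress target = new InetSocketAddress(address, port);
            System.out.println(address + " timed:   " + probe(target, 5000));
            System.out.println(address + " untimed: " + probe(target, 0));
        }
    }
}
```

Note that an untimed connect to an address that silently drops packets 
can block until the operating system gives up, so the untimed probes may 
take a while to return.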

So while there is indeed a behavior change between the old and new 
implementation for the timed case where the connect fails immediately, I 
don't think we should attempt to change the new implementation to have 
this buggy behavior.

Do you have connections to the Apache HTTP client library and the retry 
code that is looking for specific exceptions? From a distance it seems 
very fragile and dependent on very implementation-specific behavior. I 
wonder if it has ever been tested on Windows or with an untimed connect.
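
As an illustration of a less fragile check, retry code could treat 
ConnectException and NoRouteToHostException alike, since both are 
subclasses of SocketException and which one is thrown can depend on the 
JDK version and on whether the connect is timed. This is only a sketch 
under that assumption: the class ConnectFailures and the method 
isConnectFailure are hypothetical names, and this is not the Apache HTTP 
client's actual retry logic.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.net.SocketTimeoutException;

public final class ConnectFailures {

    // Hypothetical helper: true when the exception indicates that the
    // connection could not be established, regardless of which of the two
    // sibling exception types the JDK chose to throw.
    static boolean isConnectFailure(IOException e) {
        return e instanceof ConnectException
                || e instanceof NoRouteToHostException;
    }

    public static void main(String[] args) {
        System.out.println(isConnectFailure(new ConnectException("No route to host (connect failed)")));
        System.out.println(isConnectFailure(new NoRouteToHostException("No route to host")));
        System.out.println(isConnectFailure(new SocketTimeoutException("Connect timed out")));
    }
}
```

With a check like this, the change from ConnectException to 
NoRouteToHostException in the timed case would not alter the retry 
decision.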


-Alan

