InetAddress API extension

Sergey Chernyshev serge.chernyshev at bell-sw.com
Thu Mar 28 19:09:45 UTC 2024


Hi Net Dev team,

I posted this earlier to the core-libs-dev mailing list, which was not 
the proper place to discuss networking issues. Thanks to Alan Bateman 
for directing me to the right place.

I would like to propose a PR to extend the InetAddress API in JDK 23, 
namely to provide an interface for constructing InetAddress objects from 
literal addresses in POSIX/BSD form (please see the discussion [1]). It 
is intended for applications that need to mimic the behavior of the 
POSIX network APIs (|inet_addr|) used by standard network utilities such 
as netcat/curl/wget and the majority of web browsers. At present, there 
is no way to construct an |InetAddress| object from such literal 
addresses, because the new API |InetAddress.ofLiteral()| and 
|Inet4Address.ofLiteral()| will consume an octal address and 
successfully parse it as decimal, ignoring the octal prefix. Hence, the 
resulting object will point to a different IP address than expected. 
There is also no direct way to create an InetAddress from a literal 
address with hexadecimal segments, although such addresses do occur on 
certain systems.
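
To make the discrepancy concrete, here is a small sketch of how the same 
segment text maps to different numbers under the two readings. It does 
not call InetAddress at all; the class and method names are made up for 
illustration. Integer.decode() is used only as a convenient stand-in for 
C-style base detection (it also accepts signs and a "#" prefix, so it is 
not an exact equivalent of strtoul).

```java
final class SegmentRadixDemo {

    // Decimal reading: a leading zero is simply ignored.
    static int asDecimal(String segment) {
        return Integer.parseInt(segment, 10);
    }

    // C-style reading: "0" prefix means octal, "0x" prefix means hexadecimal.
    static int asCStyle(String segment) {
        return Integer.decode(segment);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"010", "0177"}) {
            System.out.println(s + " -> decimal reading " + asDecimal(s)
                    + ", C-style reading " + asCStyle(s));
        }
        // Hexadecimal segments only have a C-style reading.
        System.out.println("0x7f -> C-style reading " + asCStyle("0x7f"));
    }
}
```

So a literal such as 010.0.0.1 denotes 10.0.0.1 under the decimal 
reading and 8.0.0.1 under the POSIX/BSD reading.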

Historically, |InetAddress.getByName()| and |.getAllByName()| were the 
only way to convert a literal address into an InetAddress object. The 
|getAllByName()| API relies on POSIX |getaddrinfo| / |inet_addr|, which 
parses IP address segments with |strtoul| (accepting octal and 
hexadecimal bases). The fallback to |getaddrinfo| is undesirable 
because, if |inet_aton| rejects the input literal address, it may end up 
issuing blocking network queries. The Java standard explicitly says that

|"If a literal IP address is supplied, only the validity of the address 
format is checked." |

Aleksei Efimov contributed JDK-8272215 [2], which adds new factory 
methods |.ofLiteral()| to the |InetAddress| classes. Although the new 
API is not affected by the |getaddrinfo| fallback issue, it is not 
sufficient for an app that wants to mimic the behavior of BSD/POSIX 
network utilities. This applies in particular to Java apps that parse or 
interpret the parameters of the standard tools, as well as their 
configuration and environment.

I suggest adding a new factory method such as |.ofPosixLiteral()| to the 
|Inet4Address| class to fill this gap. This won't introduce ambiguity 
into the API and won't break the long-standing behavior. As a new 
method, it will affect neither Java utilities such as HttpClient nor 
existing Java applications. At the same time, the new method will help 
resolve the confusion between the BSD and Java standards.

The parsing algorithm was added as part of JDK-8277608 [3]. It requires 
a minor modification to produce 4 bytes of output (at present it 
produces no output). The algorithm allows up to 4 segments separated by 
dots (.). If there is more than one segment, each leading segment must 
not exceed 255, and the trailing segment must not exceed 
256^(5 - numberOfSegments) - 1. Per octet, the algorithm rejects numbers 
greater than 0xFF hex, 0377 octal, or 255 decimal. This differs from 
.ofLiteral(), where the limit is simply 255 per octet regardless of 
leading 0s (the total length must not exceed 15). In .ofPosixLiteral() 
there would be no limit on the number of leading 0s, which is also the 
case with inet_addr(). The corner cases are numbers that both methods 
accept but that produce different outputs, such as octal numbers between 
010 and 0255. .ofLiteral() rejects 0256 and above, as well as all 
hexadecimal numbers. Zero-prefixed decimal numbers such as 0239 should 
be rejected by .ofPosixLiteral().
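
The rules described above can be sketched roughly as follows. This is 
neither the JDK code from [3] nor the proposed implementation, only an 
illustration of the segment limits. The class PosixLiteralSketch and its 
methods are hypothetical names, and rejecting a bare "0x" is an 
assumption of this sketch.

```java
final class PosixLiteralSketch {

    // Parses one segment with C-style base detection:
    // "0x"/"0X" prefix = hexadecimal, other leading "0" = octal, else decimal.
    static long parseSegment(String s) {
        int radix = 10;
        String digits = s;
        if (s.startsWith("0x") || s.startsWith("0X")) {
            radix = 16;
            digits = s.substring(2);
        } else if (s.length() > 1 && s.charAt(0) == '0') {
            radix = 8;
            digits = s.substring(1);
        }
        if (digits.isEmpty()) {
            // Covers empty segments and a bare "0x" (assumption: reject both).
            throw new IllegalArgumentException("empty segment: '" + s + "'");
        }
        long value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            int d = c < 128 ? Character.digit(c, radix) : -1; // ASCII digits only
            if (d < 0) {
                throw new IllegalArgumentException("bad digit in '" + s + "'");
            }
            value = value * radix + d;
            if (value > 0xFFFFFFFFL) {
                throw new IllegalArgumentException("segment too large: '" + s + "'");
            }
        }
        return value;
    }

    // Converts a literal with 1 to 4 dot-separated segments into 4 address bytes.
    static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length > 4) {
            throw new IllegalArgumentException("more than 4 segments");
        }
        int n = parts.length;
        long address = 0;
        // Every leading segment fills exactly one byte, so it must not exceed 255.
        for (int i = 0; i < n - 1; i++) {
            long v = parseSegment(parts[i]);
            if (v > 255) {
                throw new IllegalArgumentException("leading segment > 255");
            }
            address = (address << 8) | v;
        }
        // The trailing segment fills the remaining 5 - n bytes,
        // so its maximum is 256^(5 - n) - 1.
        int tailBytes = 5 - n;
        long max = (1L << (8 * tailBytes)) - 1;
        long last = parseSegment(parts[n - 1]);
        if (last > max) {
            throw new IllegalArgumentException("trailing segment too large");
        }
        address = (address << (8 * tailBytes)) | last;
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    // Renders 4 address bytes in plain dotted-decimal form.
    static String format(byte[] b) {
        return (b[0] & 0xFF) + "." + (b[1] & 0xFF) + "."
                + (b[2] & 0xFF) + "." + (b[3] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(format(parse("010.0.0.1")));  // prints 8.0.0.1
        System.out.println(format(parse("0x7f.1")));     // prints 127.0.0.1
        System.out.println(format(parse("2130706433"))); // prints 127.0.0.1
    }
}
```

With this sketch, 0239.0.0.1 is rejected because 9 is not an octal 
digit, and 0400.0.0.1 is rejected because octal 0400 is 256.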

There could be a slight discrepancy in how different standard tools 
behave on different operating systems. For example, on macOS wget and nc 
disregard the octal prefix (0) while allowing the hexadecimal prefix 
(0x), whereas curl and ping process both prefixes. On Ubuntu Server 
22.04 both prefixes are processed, but they are not allowed in the 
/etc/hosts file, while on macOS it is legal to use 0x there. Despite the 
deviations in how and where the BSD standard is implemented, there are 
two distinct approaches. I don't see why Java shouldn't provide two 
different independent APIs. It would give future apps the flexibility to 
decide which standard to rely on and the ability to see the full picture.

Please share your thoughts on whether such a change might be desirable 
in JDK 23. Thank you for your help!

Best regards
Sergey Chernyshev

[1] https://bugs.openjdk.org/browse/JDK-8315767
[2] https://bugs.openjdk.org/browse/JDK-8272215
[3] https://github.com/openjdk/jdk/commit/cdc1582