InetAddress API extension
Sergey Chernyshev
serge.chernyshev at bell-sw.com
Thu Mar 28 19:09:45 UTC 2024
Hi Net Dev team,
I posted this earlier to the core-libs-dev mailing list, which was not
the proper place to discuss networking issues. Thanks to Alan Bateman
for directing me to the right place.
I would like to propose a PR to extend the InetAddress API in JDK 23,
namely to provide an interface for constructing InetAddress objects from
literal addresses in POSIX/BSD form (please see the discussion [1]). It
is intended for apps that need to mimic the behavior of the POSIX
network API (|inet_addr|) used by standard network utilities such as
netcat/curl/wget and the majority of web browsers. At present, there is
no way to construct an |InetAddress| object from such literal addresses,
because the new API |InetAddress.ofLiteral()| and
|Inet4Address.ofLiteral()| will consume an octal address and
successfully parse it as decimal, ignoring the octal prefix. Hence, the
resulting object will point to a different IP address than expected.
There's also no direct way to create an InetAddress from a literal
address with hexadecimal segments, although such literals do occur on
certain systems.
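To illustrate the discrepancy, here is a minimal Java sketch. It does
not call the JDK factory methods; the class and method names
(OctalPrefixDemo, decimalReading, posixReading) are made up for this
example. It only contrasts reading each dotted segment as plain decimal
with reading it the way |strtoul| with base 0 would, using
|Integer.decode| as a rough stand-in for the prefix handling.

```java
final class OctalPrefixDemo {

    // Reads every segment as plain decimal, so a leading 0 is ignored.
    static String decimalReading(String literal) {
        StringBuilder sb = new StringBuilder();
        for (String segment : literal.split("\\.")) {
            if (sb.length() > 0) sb.append('.');
            sb.append(Integer.parseInt(segment, 10));
        }
        return sb.toString();
    }

    // Honors the 0 (octal) and 0x (hexadecimal) prefixes.
    // Integer.decode is only an approximation of strtoul(s, NULL, 0).
    static String posixReading(String literal) {
        StringBuilder sb = new StringBuilder();
        for (String segment : literal.split("\\.")) {
            if (sb.length() > 0) sb.append('.');
            sb.append(Integer.decode(segment));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "010.0.0.1";
        System.out.println(decimalReading(literal)); // 10.0.0.1
        System.out.println(posixReading(literal));   // 8.0.0.1
    }
}
```

The same input string therefore names two different hosts depending on
which convention the parser follows.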
Historically |InetAddress.getByName()/.getAllByName()| were the only way
to convert a literal address into an InetAddress object.
The |getAllByName()| API relies on POSIX |getaddrinfo| / |inet_addr|,
which parses IP address segments with |strtoul| (accepts octal and
hexadecimal bases). The fallback to |getaddrinfo| is undesirable, as it
may end up issuing blocking network queries if |inet_aton| rejects the
input literal address. The Java standard explicitly says that
|"If a literal IP address is supplied, only the validity of the address
format is checked." |
Aleksei Efimov contributed JDK-8272215 [2] that adds new factory methods
|.ofLiteral()| to |InetAddress| classes. Although the new API is not
affected by the |getaddrinfo| fallback issue, it is not sufficient for
an app that wants to mimic the behavior of BSD/POSIX network utilities,
in particular for Java apps that parse or interpret the parameters of
the standard tools as well as their configuration / environment.
It is suggested to add a new factory method such as |.ofPosixLiteral()|
to the |Inet4Address| class to fill this gap. This won't introduce
ambiguity into the API and won't break the long-standing behavior. As a
new method, it will affect neither Java utilities such as HttpClient nor
existing Java applications. At the same time, the new method will help
deal with the confusion between the BSD and Java standards.
The parsing algorithm was added as part of JDK-8277608 [3]. It requires
a minor modification to produce 4 bytes of output (at present it
produces no output). The algorithm allows up to 4 segments separated by
dots (.). If there is more than one segment, the leading segment(s) must
not exceed 255; the trailing segment must not exceed
256^(5 - numberOfSegments) - 1. The algorithm rejects numbers greater
than 0xFF hex, 0377 octal, 255 decimal per octet. This differs from
.ofLiteral(), where the limit is simply 255 per octet, regardless of
leading 0s (the total length must not exceed 15). In .ofPosixLiteral()
there would be no limit on the number of leading 0s, which is also the
case with inet_addr(). The corner cases for both methods are numbers
that are accepted by both but produce different outputs, such as octal
numbers between 010 and 0255. ofLiteral() rejects 0256 and above, as
well as all hexadecimal numbers. Zero-prefixed decimal numbers such as
0239 should be rejected by ofPosixLiteral().
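The rules above can be sketched in Java roughly as follows. This is an
illustration written for this message, not the code from JDK-8277608 and
not the proposed implementation; the class name PosixLiteralSketch and
its methods are hypothetical, and returning null on rejection is just a
convenience for the example. Edge cases beyond those described above
(for instance a bare "0x") are handled by assumption.

```java
import java.util.Arrays;

final class PosixLiteralSketch {

    // Returns the 4 address bytes, or null if the literal is rejected.
    static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1); // -1 keeps empty segments
        int n = parts.length;
        if (n > 4) return null;

        long addr = 0;
        for (int i = 0; i < n; i++) {
            long v = parseSegment(parts[i]);
            if (v < 0) return null;
            boolean last = (i == n - 1);
            // Leading segments: at most 255.
            // Trailing segment: at most 256^(5 - n) - 1.
            long max = last ? (1L << (8 * (5 - n))) - 1 : 0xFFL;
            if (v > max) return null;
            addr |= last ? v : v << (8 * (3 - i));
        }
        return new byte[] {
            (byte) (addr >>> 24), (byte) (addr >>> 16),
            (byte) (addr >>> 8), (byte) addr
        };
    }

    // Parses one segment with strtoul-like base detection:
    // 0x/0X prefix = hex, leading 0 = octal, otherwise decimal.
    // Returns -1 for an empty, malformed, or too large segment.
    static long parseSegment(String s) {
        int radix = 10;
        String digits = s;
        if (s.startsWith("0x") || s.startsWith("0X")) {
            radix = 16;
            digits = s.substring(2);
        } else if (s.length() > 1 && s.startsWith("0")) {
            radix = 8;
            digits = s.substring(1);
        }
        if (digits.isEmpty()) return -1;
        long v = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            int d = (c > 0x7F) ? -1 : Character.digit(c, radix); // ASCII only
            if (d < 0) return -1;            // e.g. '9' in an octal segment
            v = v * radix + d;
            if (v > 0xFFFFFFFFL) return -1;  // cannot fit in 32 bits
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("0177.0.0.1"))); // [127, 0, 0, 1]
        System.out.println(Arrays.toString(parse("0x7f.1")));     // [127, 0, 0, 1]
        System.out.println(Arrays.toString(parse("0239.0.0.1"))); // null
    }
}
```

Since the value is checked after base detection, any number of leading
0s is accepted, while 0239 is rejected because 9 is not an octal digit.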
There could be slight discrepancies in how different standard tools
behave under different operating systems. For example, on macOS wget and
nc disregard the octal prefix (0) while allowing the hexadecimal prefix
(0x), whereas curl and ping process both prefixes. On Ubuntu Server
22.04 both prefixes are processed, but they are not allowed in the
/etc/hosts file, while on macOS it's legal to use 0x. Despite the
deviations in how and where the BSD standard is implemented, there are
two distinct approaches. I don't see why Java shouldn't provide two
different, independent APIs. It would give future apps the flexibility
to decide which standard to rely on and the ability to see the full
picture.
Please share your thoughts on whether such a change might be desirable
in JDK 23. Thank you for your help!
Best regards
Sergey Chernyshev
[1] https://bugs.openjdk.org/browse/JDK-8315767
[2] https://bugs.openjdk.org/browse/JDK-8272215
[3] https://github.com/openjdk/jdk/commit/cdc1582