InetAddress API extension
Michael McMahon
michael.x.mcmahon at oracle.com
Fri Mar 29 11:35:56 UTC 2024
Hi,
I think this could be a useful addition. The apidoc would need to be
clear about the differences from the existing literal parsing method,
and it should spell out any implications of the ambiguity between the
octal and decimal formats.
- Michael
On 28/03/2024 19:09, Sergey Chernyshev wrote:
>
> Hi Net Dev team,
>
> I posted this earlier to the core-libs-dev mailing list, which was not
> the proper place to discuss networking issues. Thanks to Alan Bateman
> for directing me to the right place.
>
> I would like to propose a PR to extend the InetAddress API in JDK 23,
> namely to provide an interface for constructing InetAddress objects
> from literal addresses in POSIX/BSD form (please see the discussion
> [1]), for apps that need to mimic the behavior of the POSIX network
> APIs (|inet_addr|) used by standard network utilities such as
> netcat/curl/wget and the majority of web browsers. At present,
> there's no way to construct an |InetAddress| object from such literal
> addresses, because the new API |InetAddress.ofLiteral()| and
> |Inet4Address.ofLiteral()| will consume an octal address and
> successfully parse it as decimal, ignoring the octal prefix. Hence,
> the resulting object will point to a different IP address than
> expected. There's also no direct way to create an InetAddress from a
> literal address with hexadecimal segments, although some systems
> accept such literals.
>
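
To make the octal/decimal divergence concrete, here is a minimal sketch
that reads the same segment text both ways. It does not call
|ofLiteral()| or |inet_addr|; it only shows the arithmetic, using
Integer.parseInt for the "ignore the prefix" reading and Integer.decode
as a stand-in for C-style prefix detection. The class and method names
are hypothetical, and Integer.decode also accepts signs and a "#"
prefix, so it is an illustration rather than a validator.

```java
final class OctalVersusDecimal {

    // Reads the segment as plain decimal, so a leading zero is ignored.
    static int asDecimal(String segment) {
        return Integer.parseInt(segment, 10);
    }

    // Reads the segment with prefix detection: a leading "0" selects
    // octal and "0x" selects hexadecimal (stand-in for strtoul base 0).
    static int asPosixStyle(String segment) {
        return Integer.decode(segment);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"010", "0377", "10"}) {
            System.out.println(s + " -> decimal " + asDecimal(s)
                    + ", POSIX-style " + asPosixStyle(s));
        }
    }
}
```

For "010" the decimal reading gives 10 and the prefix-aware reading
gives 8, so a literal such as 010.0.0.1 would name two different hosts
depending on which parser sees it.
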
> Historically |InetAddress.getByName()/.getAllByName()| were the only
> way to convert a literal address into an InetAddress object.
> |getAllByName()| API relies on POSIX |getaddrinfo| / |inet_addr| which
> parses IP address segments with |strtoul| (accepts octal and
> hexadecimal bases). The fallback to |getaddrinfo| is undesirable, as
> it may result in blocking network queries if |inet_aton| rejects the
> input literal address. The Java standard explicitly says that
>
> |"If a literal IP address is supplied, only the validity of the
> address format is checked." |
>
> Aleksei Efimov contributed JDK-8272215 [2] that adds new factory
> methods |.ofLiteral()| to |InetAddress| classes. Although the new API
> is not affected by the |getaddrinfo| fallback issue, it is not
> sufficient for an app that wants to mimic the behavior of BSD/POSIX
> network utilities, in particular Java apps that parse or interpret
> the parameters of the standard tools as well as their
> configuration / environment.
>
> It is suggested to add a new factory method such as
> |.ofPosixLiteral()| to the |Inet4Address| class to fill this gap. This
> won't introduce ambiguity into the API and won't break the
> long-standing behavior. As a new method, it will not affect Java
> utilities such as HttpClient, nor existing Java applications. At the
> same time, the new method will help deal with the confusion between
> the BSD and Java standards.
>
> The parsing algorithm was added as part of JDK-8277608 [3]. It
> requires a minor modification to produce a 4-byte output (at present
> it produces no output). The algorithm allows up to 4 segments
> separated by dots (.); if there is more than one segment, the leading
> segment(s) must not exceed 255, and the trailing segment must not
> exceed 256^(5 - numberOfSegments) - 1. The algorithm rejects numbers
> greater than 0xFF hex, 0377 octal, 255 decimal per octet. This differs
> from .ofLiteral(), where the limit is simply 255 per octet, regardless
> of leading 0s (the total length must not exceed 15). In
> .ofPosixLiteral() there would be no limit on the number of leading 0s,
> which is also the case with inet_addr(). The corner cases for both
> methods are numbers that are accepted by both but produce different
> outputs, such as octal numbers between 010 and 0255. 0256 and above
> are rejected by ofLiteral(), as are all hexadecimal numbers.
> Zero-prefixed decimal numbers such as 0239 should be rejected by
> ofPosixLiteral().
>
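
The rules in the paragraph above can be sketched as follows. This is not
the JDK-8277608 code and not the proposed |.ofPosixLiteral()|; it is a
standalone illustration written from the description only. The class
name, the method names and the choice to return null on rejection are
all assumptions made for the example.

```java
final class PosixLiteralSketch {

    // Parses one segment with C-style base detection: "0x"/"0X" means
    // hexadecimal, a leading "0" means octal, anything else is decimal.
    // Returns -1 for an empty segment, an invalid digit, or a value
    // that does not fit in 32 bits.
    static long parseSegment(String s) {
        if (s.isEmpty()) {
            return -1;
        }
        int radix = 10;
        int start = 0;
        if (s.length() > 2 && (s.startsWith("0x") || s.startsWith("0X"))) {
            radix = 16;
            start = 2;
        } else if (s.length() > 1 && s.charAt(0) == '0') {
            radix = 8;
            start = 1;
        }
        long value = 0;
        for (int i = start; i < s.length(); i++) {
            char c = s.charAt(i);
            // Restrict to ASCII so that other Unicode digits are rejected.
            int digit = c < 128 ? Character.digit(c, radix) : -1;
            if (digit < 0) {
                return -1;
            }
            value = value * radix + digit;
            if (value > 0xFFFFFFFFL) {
                return -1;
            }
        }
        return value;
    }

    // Returns the 4 address bytes, or null if the literal is rejected
    // (returning null is a choice made for this sketch only).
    static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length > 4) {
            return null;
        }
        long address = 0;
        for (int i = 0; i < parts.length; i++) {
            boolean last = (i == parts.length - 1);
            // Leading segments are single octets; the trailing segment
            // may fill the remaining bytes: 256^(5 - n) - 1.
            long max = last ? (1L << (8 * (5 - parts.length))) - 1 : 255;
            long v = parseSegment(parts[i]);
            if (v < 0 || v > max) {
                return null;
            }
            address |= last ? v : v << (8 * (3 - i));
        }
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    // Formats 4 address bytes in dotted-decimal form.
    static String toDotted(byte[] b) {
        return (b[0] & 0xFF) + "." + (b[1] & 0xFF) + "."
                + (b[2] & 0xFF) + "." + (b[3] & 0xFF);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"010.0.0.1", "0x7f.1", "0239.0.0.1"}) {
            byte[] b = parse(s);
            System.out.println(s + " -> " + (b == null ? "rejected" : toDotted(b)));
        }
    }
}
```

With this sketch, 010.0.0.1 comes out as 8.0.0.1, 0x7f.1 as 127.0.0.1,
and 0239.0.0.1 is rejected because 9 is not an octal digit. The four
bytes could then be handed to the existing
InetAddress.getByAddress(byte[]) to obtain an address object without
any lookup.
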
> There could be slight discrepancies in how the standard tools behave
> on different operating systems. For example, on macOS wget & nc
> disregard the octal prefix (0) while allowing the hexadecimal prefix
> (0x), whereas curl & ping process both prefixes. On Ubuntu Server
> 22.04 both prefixes are processed, but they are not allowed in the
> /etc/hosts file, while on macOS it's legal to use 0x there. Despite
> the deviations in how and where the BSD standard is implemented, there
> are two distinct approaches. I don't see why Java shouldn't provide
> two different independent APIs. It would give future apps the
> flexibility to decide which standard to rely on and the ability to see
> the full picture.
>
> Please share your thoughts on whether such a change might be desirable
> in JDK 23. Thank you for your help!
>
> Best regards
> Sergey Chernyshev
>
> [1] https://bugs.openjdk.org/browse/JDK-8315767
> [2] https://bugs.openjdk.org/browse/JDK-8272215
> [3] https://github.com/openjdk/jdk/commit/cdc1582
More information about the net-dev mailing list