InetAddress API extension
David Lloyd
david.lloyd at redhat.com
Thu Mar 28 20:09:04 UTC 2024
On Thu, Mar 28, 2024 at 2:10 PM Sergey Chernyshev <
serge.chernyshev at bell-sw.com> wrote:
> Hi Net Dev team,
>
> I posted this earlier to core-libs-dev mailing list, which was not a
> proper place to discuss the net issues. Thanks Alan Bateman for directing
> me to the right place.
>
> I would like to propose a PR to extend the InetAddress API in JDK 23,
> namely to provide an interface for constructing InetAddress objects from
> literal addresses in POSIX/BSD form (please see the discussion [1]), for
> apps that need to mimic the behavior of the POSIX network API (inet_addr)
> used by standard network utilities such as netcat/curl/wget and the
> majority of web browsers. At present, there is no way to construct an
> InetAddress object from such literal addresses, because the new API
> InetAddress.ofLiteral() and Inet4Address.ofLiteral() will consume an
> octal address and successfully parse it as decimal, ignoring the octal
> prefix. Hence, the resulting object will point to a different IP address
> than it is expected to point to. There's also no direct way to create an
> InetAddress from a literal address with hexadecimal segments, although this
> can be the case in certain systems.
>
Would this proposal be unique to IPv4 addresses, or is there an equivalent
for IPv6? (I would suspect that there isn't, given that the parsing rules
for IPv6 are a bit more well-defined...)
> Aleksei Efimov contributed JDK-8272215 [2] that adds new factory methods
> .ofLiteral() to InetAddress classes. Although the new API is not affected
> by the getaddrinfo fallback issue, it is not sufficient for an app that
> wants to mimic the behavior of BSD/POSIX network utilities, in particular
> Java apps that parse or interpret the parameters of the
> standard tools as well as their configuration / environment.
> It is suggested to add a new factory method such as .ofPosixLiteral() to
> Inet4Address class to fill this gap. This won't introduce ambiguity into
> the API and won't break the long standing behavior. As a new method, it
> will not affect Java utilities such as HttpClient, nor the existing Java
> applications. At the same time, the new method will help in dealing with
> the confusion between BSD and Java standards.
>
I would suggest normatively calling this behavior "POSIX standard" parsing
(not BSD or POSIX/BSD), since it (at least nominally) comes from a
standards body [1]. Bear in mind that `inet_pton` follows different rules
though [2]. RFC 6943 [3] has a bit more to say about so-called "loose" vs
"strict" IP address parsing rules.
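
To make the distinction concrete, here is a rough Java sketch of a "strict"
parser in the spirit of the inet_pton text in [2]: exactly four parts, each
one to three decimal digits, each at most 255. The class and method names are
made up for illustration and are not JDK API. Reading a leading zero as
decimal is an assumption on my part; some implementations reject it instead.

```java
final class StrictDottedDecimalSketch {

    /** Returns the 4 address bytes, or null if the text is not strict dotted-decimal. */
    static byte[] parseStrict(String text) {
        // Limit -1 keeps trailing empty strings, so "1.2.3." is seen as 4 parts.
        String[] parts = text.split("\\.", -1);
        if (parts.length != 4) {
            return null; // shorthand forms such as "127.1" are not accepted here
        }
        byte[] out = new byte[4];
        for (int i = 0; i < 4; i++) {
            String p = parts[i];
            if (p.isEmpty() || p.length() > 3) {
                return null;
            }
            int v = 0;
            for (int j = 0; j < p.length(); j++) {
                char c = p.charAt(j);
                if (c < '0' || c > '9') {
                    return null; // rejects hex prefixes, signs, whitespace
                }
                v = v * 10 + (c - '0');
            }
            if (v > 255) {
                return null;
            }
            out[i] = (byte) v;
        }
        return out;
    }
}
```

Under these assumptions "010.0.0.1" comes out as 10.0.0.1, whereas an
inet_addr-style parser would read the first part as octal and produce 8.0.0.1.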
> The parsing algorithm was added as part of JDK-8277608 [3]. It requires
> minor modification to produce 4 bytes output (now it doesn't produce any
> output). The algorithm allows up to 4 segments separated by dots (.), the
> leading segment(s) must not exceed 255 if there is more than one segment,
> the trailing segment must not exceed 256^(5 - numberOfSegments) - 1. The
> algorithm rejects numbers greater than 0xFF hex, 0377 octal, 255 decimal
> per octet. It is different to .ofLiteral() where it is simply 255 per
> octet, regardless of leading 0s (the total length must not exceed 15). In
> .ofPosixLiteral() there'd be no limit of the number of leading 0s, which
> is also the case with inet_addr(). The corner cases for both methods are
> numbers that are accepted in both, but produce different outputs such as
> octal numbers between 010 and 0255. 0256 and above are rejected by
> ofLiteral() as well as all hexadecimal numbers. Zero prefixed decimal
> numbers such as 0239 should be rejected by ofPosixLiteral().
>
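
For reference, the rules quoted above could be sketched roughly as follows in
Java. This is only my reading of the description, not the JDK-8277608 code:
the names PosixLiteralSketch, parseSegment and parsePosixLiteral are invented
for the example, and returning null on rejection (rather than throwing) is an
arbitrary choice.

```java
final class PosixLiteralSketch {

    /** Parses one segment as hex (0x...), octal (0...) or decimal; returns -1 if malformed. */
    static long parseSegment(String s) {
        int radix = 10;
        String digits = s;
        if (s.startsWith("0x") || s.startsWith("0X")) {
            radix = 16;
            digits = s.substring(2);
        } else if (s.length() > 1 && s.startsWith("0")) {
            radix = 8;
            digits = s.substring(1);
        }
        if (digits.isEmpty()) {
            return -1;
        }
        long value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            // ASCII only; Character.digit would otherwise accept other Unicode digits.
            int d = c < 128 ? Character.digit(c, radix) : -1;
            if (d < 0) {
                return -1; // e.g. the '9' in "0239" is not an octal digit
            }
            value = value * radix + d;
            if (value > 0xFFFFFFFFL) {
                return -1; // cannot fit in 32 bits; leading zeros never trigger this
            }
        }
        return value;
    }

    /** Returns the 4 address bytes, or null if the literal is rejected. */
    static byte[] parsePosixLiteral(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length > 4) {
            return null;
        }
        int n = parts.length;
        long address = 0;
        for (int i = 0; i < n; i++) {
            long v = parseSegment(parts[i]);
            if (v < 0) {
                return null;
            }
            boolean last = (i == n - 1);
            int bits = last ? 8 * (5 - n) : 8; // trailing segment fills the remaining bytes
            long max = (1L << bits) - 1;       // 256^(5 - n) - 1 for the trailing segment
            if (v > max) {
                return null;
            }
            address = (address << bits) | v;
        }
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    static String format(byte[] b) {
        return (b[0] & 0xFF) + "." + (b[1] & 0xFF) + "." + (b[2] & 0xFF) + "." + (b[3] & 0xFF);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0177.0.0.1", "0x7f.1", "010.0.0.1", "0239.0.0.1"}) {
            byte[] b = parsePosixLiteral(s);
            System.out.println(s + " -> " + (b == null ? "rejected" : format(b)));
        }
    }
}
```

With this reading, "010.0.0.1" yields 8.0.0.1 (a decimal-per-octet parser would
give 10.0.0.1), "0x7f.1" yields 127.0.0.1, and "0239.0.0.1" is rejected.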
> There could be a slight discrepancy in how the standard tools behave
> under different operating systems. For example, on macOS wget & nc
> disregard the octal prefix (0) while allowing the hexadecimal prefix (0x);
> at the same time curl & ping process both prefixes. On Ubuntu Server 22.04
> both prefixes are processed, but they are not allowed in the /etc/hosts
> file, while on macOS it's legal to use 0x there. Despite the deviations in
> how and where the BSD standard is implemented, there are two distinct
> approaches. I don't see why Java shouldn't provide two different
> independent APIs. It would give future apps the flexibility to decide
> which standard to rely on, and the ability to see the full picture.
>
> Please share your thoughts on whether such a change might be desirable in
> JDK 23. Thank you for your help!
>
I guess it could be useful when the need arises to interoperate with
tooling that supports this kind of syntax, and if it was done, I would
agree that a separate method would be the way to go. But, I don't have any
comment as to whether the potential use cases are sufficient to justify the
API surface and additional implementation complexity (whatever that may be).
As another random data point: the projects I've been working on have
relegated such extra-JDK IP address handling tasks to a utility library
[4]. We don't have a parser for this particular syntax though.
[1] https://pubs.opengroup.org/onlinepubs/009695399/functions/inet_addr.html
[2] https://pubs.opengroup.org/onlinepubs/009695399/functions/inet_pton.html
[3] https://datatracker.ietf.org/doc/html/rfc6943#section-3.1.1
[4] https://github.com/smallrye/smallrye-common/blob/main/net/src/main/java/io/smallrye/common/net/Inet.java
--
- DML • he/him