<div dir="ltr"><div dir="ltr"></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Mar 28, 2024 at 2:10 PM Sergey Chernyshev <<a href="mailto:serge.chernyshev@bell-sw.com">serge.chernyshev@bell-sw.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><u></u>
<div>
<p>Hi Net Dev team, </p>
<p>I posted this earlier to the core-libs-dev mailing list, which was
not the proper place to discuss networking issues. Thanks to Alan Bateman
for directing me to the right place.<br>
</p>
<p>I would like to propose a PR to extend the InetAddress API in JDK
23, namely to provide an interface for constructing InetAddress
objects from literal addresses in POSIX/BSD form (please see the
discussion [1]), for apps that need to mimic the behavior of
POSIX network APIs (<code class="gmail-notranslate">inet_addr</code>)
used by standard network utilities such as netcat/curl/wget and
the majority of web browsers. At present, there's no way to
construct an <code class="gmail-notranslate">InetAddress</code> object
from such literal addresses because the new API methods <code class="gmail-notranslate">InetAddress.ofLiteral()</code> and <code class="gmail-notranslate">Inet4Address.ofLiteral()</code> will consume
an octal address and successfully parse it as decimal, ignoring
the octal prefix. Hence, the resulting object will point to a
different IP address than expected. There's also
no direct way to create an InetAddress from a literal address with
hexadecimal segments, although this can be the case in certain
systems.<br></p></div></blockquote><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">Would this proposal be unique to IPv4 addresses, or is there an equivalent for IPv6? (I would suspect that there isn't, given that the parsing rules for IPv6 are a bit more well-defined...)</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p dir="auto">Aleksei Efimov contributed JDK-8272215 [2] that adds
new factory methods <code class="gmail-notranslate">.ofLiteral()</code>
to <code class="gmail-notranslate">InetAddress</code> classes. Although
the new API is not affected by the <code class="gmail-notranslate">getaddrinfo</code>
fallback issue, it is not sufficient for an app that wants to
mimic the behavior of BSD/POSIX network utilities, in particular
Java apps that parse or interpret the parameters of
the standard tools as well as their configuration / environment.<br></p>
It is suggested to add a new factory method such as <code class="gmail-notranslate">.ofPosixLiteral()</code> to the <code class="gmail-notranslate">Inet4Address</code> class to fill this gap.
This won't introduce ambiguity into the API and won't break the
long-standing behavior. As a new method, it will not affect Java
utilities such as HttpClient, nor the existing Java applications. At
the same time, the new method will help deal with the confusion
between BSD and Java standards.</div></blockquote><div><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">I would suggest normatively calling this behavior "POSIX standard" parsing (not BSD or POSIX/BSD), since it (at least nominally) comes from a standards body [1]. Bear in mind that `inet_pton` follows different rules though [2]. RFC 6943 [3] has a bit more to say about so called "loose" vs "strict" IP address parsing rules.</div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p dir="auto">The parsing algorithm was added as part of JDK-8277608
[3]. It requires a minor modification to produce 4-byte output
(currently it doesn't produce any output). The algorithm allows up to 4
segments separated by dots (.); the leading segment(s) must not
exceed 255 if there is more than 1 segment, and the trailing segment
must not exceed <font face="monospace">256^(5 - numberOfSegments)
- 1</font>. The algorithm rejects numbers greater than 0xFF hex,
0377 octal, 255 decimal per octet. This differs from <font face="monospace">.ofLiteral()</font>, where the limit is simply 255 per
octet, regardless of leading 0s (the total length must not exceed
15). In <font face="monospace">.ofPosixLiteral()</font> there'd
be no limit on the number of leading 0s, which is also the case
with <font face="monospace">inet_addr()</font>. The corner cases
for both methods are numbers that are accepted by both but
produce different outputs, such as octal numbers between 010 and
0255. ofLiteral() rejects 0256 and above, as well as all
hexadecimal numbers. Zero-prefixed decimal numbers such as 0239
should be rejected by ofPosixLiteral().<br>
</p>
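<p>The rules described above can be illustrated with a minimal sketch. The class name <code class="gmail-notranslate">PosixLiteralSketch</code> and its methods are hypothetical and are not the JDK-8277608 code or any proposed JDK API; the sketch only assumes the prefix and range rules as stated in this message (0x for hexadecimal, a leading 0 for octal, leading segments at most 255, trailing segment at most 256^(5 - numberOfSegments) - 1).</p>

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch; not the JDK's implementation of any ofPosixLiteral() method.
class PosixLiteralSketch {

    // Parses one segment: "0x"/"0X" prefix is hex, a leading "0" is octal, otherwise decimal.
    static long parseSegment(String s) {
        int radix = 10;
        String digits = s;
        if (s.length() > 2 && (s.startsWith("0x") || s.startsWith("0X"))) {
            radix = 16;
            digits = s.substring(2);
        } else if (s.length() > 1 && s.charAt(0) == '0') {
            radix = 8;
            digits = s.substring(1);
        }
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("empty segment");
        }
        long value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            // Restrict to ASCII so that other Unicode digits are not accepted.
            int d = c < 128 ? Character.digit(c, radix) : -1;
            if (d < 0) {
                throw new IllegalArgumentException("bad digit in segment: " + s);
            }
            value = value * radix + d;
            if (value > 0xFFFFFFFFL) {
                throw new IllegalArgumentException("segment too large: " + s);
            }
        }
        return value;
    }

    // Converts a literal with 1 to 4 dot-separated segments into 4 address bytes.
    static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1);
        int n = parts.length;
        if (n > 4) {
            throw new IllegalArgumentException("too many segments: " + literal);
        }
        long address = 0;
        // Leading segments each fill exactly one octet and must not exceed 255.
        for (int i = 0; i < n - 1; i++) {
            long v = parseSegment(parts[i]);
            if (v > 255) {
                throw new IllegalArgumentException("leading segment exceeds 255: " + parts[i]);
            }
            address |= v << (8 * (3 - i));
        }
        // The trailing segment fills the remaining octets: at most 256^(5 - n) - 1.
        long last = parseSegment(parts[n - 1]);
        long maxLast = (1L << (8 * (5 - n))) - 1;
        if (last > maxLast) {
            throw new IllegalArgumentException("trailing segment too large: " + parts[n - 1]);
        }
        address |= last;
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    public static void main(String[] args) throws UnknownHostException {
        for (String s : new String[] {"010.0.0.1", "0x7f.1", "2130706433"}) {
            System.out.println(s + " -> " + InetAddress.getByAddress(parse(s)).getHostAddress());
        }
    }
}
```

<p>Under these rules 010.0.0.1 maps to 8.0.0.1, whereas a parser that ignores the octal prefix reads the same literal as 10.0.0.1; 0239.0.0.1 is rejected because 9 is not an octal digit.</p>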
<p>There could be a slight discrepancy
in how the standard tools behave under
different operating systems. For example, on macOS, wget and nc disregard the octal
prefix (0) while allowing the hexadecimal prefix (0x), whereas
curl and ping process both prefixes. In Ubuntu Server 22.04
both prefixes are processed, but they are not allowed in the
/etc/hosts file, while on macOS it's legal to use 0x there. Despite the
deviations in how and where the BSD standard is implemented, there
are two distinct approaches. I don't see why Java shouldn't provide
two different independent APIs. It would give future apps the
flexibility to decide which standard to rely on and the ability to see
the full picture.<br>
</p>
<p dir="auto">Please share your thoughts on whether such a change
might be desirable in JDK 23. Thank you for your help!</p></div></blockquote><div><span style="font-family:arial,helvetica,sans-serif">I guess it could be useful when the need arises to interoperate with tooling that supports this kind of syntax, and if it was done, I would agree that a separate method would be the way to go. But, I don't have any comment as to whether the potential use cases are sufficient to justify the API surface and additional implementation complexity (whatever that may be).</span><br></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">As another random data point: the projects I've been working on have relegated such extra-JDK IP address handling tasks to a utility library [4]. We don't have a parser for this particular syntax though.</div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"><br></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">[1] <a href="https://pubs.opengroup.org/onlinepubs/009695399/functions/inet_addr.html">https://pubs.opengroup.org/onlinepubs/009695399/functions/inet_addr.html</a><br></div></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">[2] <a href="https://pubs.opengroup.org/onlinepubs/009695399/functions/inet_pton.html">https://pubs.opengroup.org/onlinepubs/009695399/functions/inet_pton.html</a></div></div><div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">[3] <a href="https://datatracker.ietf.org/doc/html/rfc6943#section-3.1.1">https://datatracker.ietf.org/doc/html/rfc6943#section-3.1.1</a><br class="gmail-Apple-interchange-newline"></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif">[4] <a 
href="https://github.com/smallrye/smallrye-common/blob/main/net/src/main/java/io/smallrye/common/net/Inet.java">https://github.com/smallrye/smallrye-common/blob/main/net/src/main/java/io/smallrye/common/net/Inet.java</a></div><div class="gmail_default" style="font-family:arial,helvetica,sans-serif"></div></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr">- DML • he/him<br></div></div></div>