<!DOCTYPE html>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>
<body>
<p>Hi Net Dev team, </p>
<p>I posted this earlier to the core-libs-dev mailing list, which was
not the proper place to discuss net issues. Thanks to Alan Bateman
for directing me to the right place.<br>
</p>
<p>I would like to propose a PR to extend the InetAddress API in JDK
23, namely to provide an interface for constructing InetAddress
objects from literal addresses in POSIX/BSD form (please see the
discussion [1]). This is intended for apps that need to mimic the
behavior of POSIX network APIs (<code class="notranslate">inet_addr</code>)
used by standard network utilities such as netcat/curl/wget and
the majority of web browsers. At present, there is no way to
construct an <code class="notranslate">InetAddress</code> object
from such literal addresses, because the new API methods <code
class="notranslate">InetAddress.ofLiteral()</code> and <code
class="notranslate">Inet4Address.ofLiteral()</code> accept
an octal address and parse it as decimal, ignoring
the octal prefix. Hence, the resulting object points to a
different IP address than expected. There is also
no direct way to create an InetAddress from a literal address with
hexadecimal segments, although such literals are accepted on certain
systems.<br>
</p>
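<p>To make the mismatch concrete, here is a minimal sketch of the two readings of a single address segment. It does not call <code class="notranslate">InetAddress</code> at all; it only uses <code class="notranslate">Integer.parseInt</code> and <code class="notranslate">Integer.decode</code> as stand-ins for the "decimal, ignore leading zeros" reading and the C-style prefix reading. The class and method names are hypothetical and chosen for illustration only.</p>

```java
public final class SegmentReadings {
    private SegmentReadings() {}

    // Decimal reading: a leading zero carries no meaning, so "0177" is 177.
    public static int asDecimal(String segment) {
        return Integer.parseInt(segment, 10);
    }

    // C-style reading: a "0" prefix selects octal and "0x" selects hexadecimal,
    // so "0177" is 127. Integer.decode is only a stand-in here (assumption):
    // it also accepts signs and a "#" prefix, which inet_addr does not.
    public static int asCStyle(String segment) {
        return Integer.decode(segment);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0177", "010"}) {
            System.out.println(s + ": decimal reading = " + asDecimal(s)
                    + ", C-style reading = " + asCStyle(s));
        }
    }
}
```

<p>With these stand-ins, the first segment of 0177.0.0.1 reads as 177 under the decimal reading and as 127 under the C-style reading, which is the kind of divergence described above.</p>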
<p dir="auto">Historically, <code class="notranslate">InetAddress.getByName()/.getAllByName()</code>
were the only way to convert a literal address into an InetAddress
object. The <code class="notranslate">getAllByName()</code> API
relies on POSIX <code class="notranslate">getaddrinfo</code> / <code
class="notranslate">inet_addr</code>, which parse IP address
segments with <code class="notranslate">strtoul</code> (accepting
octal and hexadecimal bases). The fallback to <code
class="notranslate">getaddrinfo</code> is undesirable because, if <code
class="notranslate">inet_aton</code> rejects the input literal
address, it may end up issuing blocking network queries. The Java
API specification explicitly says that</p>
<div
class="snippet-clipboard-content notranslate position-relative overflow-auto">
<pre class="notranslate"><code class="notranslate">"If a literal IP address is supplied, only the validity of the address format is checked."
</code></pre>
</div>
<p dir="auto">Aleksei Efimov contributed JDK-8272215 [2], which adds
new factory methods <code class="notranslate">.ofLiteral()</code>
to the <code class="notranslate">InetAddress</code> classes. Although
the new API is not affected by the <code class="notranslate">getaddrinfo</code>
fallback issue, it is not sufficient for an app that wants to
mimic the behavior of BSD/POSIX network utilities, in particular
for Java apps that parse or interpret the parameters of
the standard tools as well as their configuration / environment.<br>
</p>
I suggest adding a new factory method such as <code
class="notranslate">.ofPosixLiteral()</code> to the <code
class="notranslate">Inet4Address</code> class to fill this gap.
This won't introduce ambiguity into the API and won't break the
long-standing behavior. As a new method, it will not affect Java
utilities such as HttpClient, nor existing Java applications. At
the same time, the new method will help resolve the confusion
between the BSD and Java standards.
<p dir="auto">The parsing algorithm was added as part of JDK-8277608
[3]. It requires a minor modification to produce a 4-byte output
(currently it doesn't produce any output). The algorithm allows up to 4
segments separated by dots (.). If there is more than one segment, the
leading segment(s) must not exceed 255, and the trailing segment
must not exceed <font face="monospace">256^(5 - numberOfSegments)
- 1</font>. The algorithm rejects numbers greater than 0xFF hex,
0377 octal, or 255 decimal per octet. This differs from <font
face="monospace">.ofLiteral()</font>, where the limit is simply 255 per
octet, regardless of leading 0s (the total length must not exceed
15). In <font face="monospace">.ofPosixLiteral()</font> there would
be no limit on the number of leading 0s, which is also the case
with <font face="monospace">inet_addr()</font>. The corner cases
for both methods are numbers that are accepted by both but
produce different outputs, such as octal numbers between 010 and
0255. Numbers 0256 and above are rejected by ofLiteral(), as are all
hexadecimal numbers. Zero-prefixed decimal numbers such as 0239
should be rejected by ofPosixLiteral().<br>
</p>
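<p>The rules above can be sketched roughly as follows. This is not the JDK-8277608 code and not the proposed <code class="notranslate">ofPosixLiteral()</code> implementation; the class <code class="notranslate">PosixIpv4Literal</code> and its methods are hypothetical names used only to illustrate the segment limits and the 4-byte output. Rejecting a bare "0x" with no digits is an assumption of this sketch.</p>

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public final class PosixIpv4Literal {
    private PosixIpv4Literal() {}

    // Parses one segment: "0x"/"0X" prefix means hexadecimal, a leading "0"
    // means octal, anything else is decimal. A digit that is invalid for the
    // selected base (e.g. the 9 in "0239") is rejected.
    public static long parseSegment(String s) {
        int radix = 10;
        String digits = s;
        if (s.startsWith("0x") || s.startsWith("0X")) {
            radix = 16;
            digits = s.substring(2);
        } else if (s.length() > 1 && s.charAt(0) == '0') {
            radix = 8;
            digits = s.substring(1);
        }
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("empty segment: '" + s + "'");
        }
        long value = 0;
        for (int i = 0; i < digits.length(); i++) {
            char c = digits.charAt(i);
            // Only ASCII digits are accepted.
            int d = c < 128 ? Character.digit(c, radix) : -1;
            if (d < 0) {
                throw new IllegalArgumentException("bad digit in segment: " + s);
            }
            value = value * radix + d;
            if (value > 0xFFFFFFFFL) {
                throw new IllegalArgumentException("segment too large: " + s);
            }
        }
        return value;
    }

    // Converts a literal with 1 to 4 segments into 4 address bytes.
    public static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1);
        int n = parts.length;
        if (n > 4) {
            throw new IllegalArgumentException("too many segments: " + literal);
        }
        long address = 0;
        // Leading segments: one byte each, so at most 255.
        for (int i = 0; i < n - 1; i++) {
            long v = parseSegment(parts[i]);
            if (v > 255) {
                throw new IllegalArgumentException("segment exceeds 255: " + parts[i]);
            }
            address = (address << 8) | v;
        }
        // Trailing segment fills the remaining (5 - n) bytes,
        // so its limit is 256^(5 - n) - 1.
        int tailBytes = 5 - n;
        long max = (1L << (8 * tailBytes)) - 1;
        long last = parseSegment(parts[n - 1]);
        if (last > max) {
            throw new IllegalArgumentException("trailing segment too large: " + parts[n - 1]);
        }
        address = (address << (8 * tailBytes)) | last;
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    public static void main(String[] args) throws UnknownHostException {
        // getByAddress performs no lookup; it only wraps the raw bytes.
        InetAddress a = InetAddress.getByAddress(parse("0177.0.0.1"));
        System.out.println(a.getHostAddress());
    }
}
```

<p>In this sketch, 0177.0.0.1, 0x7f.1 and 2130706433 all yield the bytes of 127.0.0.1, while 0239.0.0.1 and 256.0.0.1 are rejected with an exception.</p>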
<p>There could be a slight discrepancy
in how different standard tools behave under
different operating systems. For example, on macOS wget &amp; nc disregard the octal
prefix (0) while allowing the hexadecimal prefix (0x), whereas
curl &amp; ping process both prefixes. On Ubuntu Server 22.04
both prefixes are processed, but they are not allowed in the
/etc/hosts file, while on macOS it is legal to use 0x there. Despite the
deviations in how and where the BSD standard is implemented, there
are two distinct approaches. I don't see why Java shouldn't provide
two different independent APIs. It would give future apps the
flexibility to decide which standard to rely on and the ability to see
the full picture.<br>
</p>
<p dir="auto">Please share your thoughts on whether such a change
might be desirable in JDK 23. Thank you for your help! </p>
<p dir="auto">Best regards<br>
Sergey Chernyshev</p>
[1] <span style=""><a class="moz-txt-link-freetext"
href="https://bugs.openjdk.org/browse/JDK-8315767">https://bugs.openjdk.org/browse/JDK-8315767</a></span><br>
[2] <span style=""><a class="moz-txt-link-freetext"
href="https://bugs.openjdk.org/browse/JDK-8272215">https://bugs.openjdk.org/browse/JDK-8272215</a><br>
</span>[3] <a class="moz-txt-link-freetext"
href="https://github.com/openjdk/jdk/commit/cdc1582">https://github.com/openjdk/jdk/commit/cdc1582</a>
<p></p>
</body>
</html>