InetAddress API extension

Michael McMahon michael.x.mcmahon at oracle.com
Fri Mar 29 11:35:56 UTC 2024


Hi,

I think this could be a useful addition. The apidoc would need to be
clear about the differences from the existing literal parsing method,
and any implications of the ambiguity between octal and decimal formats
would need to be spelled out.

- Michael

On 28/03/2024 19:09, Sergey Chernyshev wrote:
>
> Hi Net Dev team,
>
> I posted this earlier to core-libs-dev mailing list, which was not a 
> proper place to discuss the net issues. Thanks Alan Bateman for 
> directing me to the right place.
>
> I would like to propose a PR to extend the InetAddress API in JDK 23,
> namely to provide an interface for constructing InetAddress objects
> from literal addresses in POSIX/BSD form (please see the discussion
> [1]), for apps that need to mimic the behavior of the POSIX network
> APIs (|inet_addr|) used by standard network utilities such as
> netcat/curl/wget and the majority of web browsers. At present, there
> is no way to construct an |InetAddress| object from such literal
> addresses, because the new APIs |InetAddress.ofLiteral()| and
> |Inet4Address.ofLiteral()| will consume an octal address and
> successfully parse it as decimal, ignoring the octal prefix. Hence,
> the resulting object will point to a different IP address than
> expected. There is also no direct way to create an InetAddress from a
> literal address with hexadecimal segments, although such addresses do
> occur on certain systems.
>
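
To make the octal/decimal mismatch described above concrete, here is a
minimal sketch that uses only Integer.parseInt and Integer.decode from
java.lang. It does not call ofLiteral() or inet_addr; it only shows the
two ways a single segment such as 0177 can be read. The class and
method names are made up for this illustration, and Integer.decode is
only a stand-in for C-style prefix handling (it also accepts signs and
"#", which strtoul-based parsers treat differently).

```java
public class SegmentBases {
    // Decimal reading: leading zeros carry no meaning
    // (the behavior the quoted text attributes to ofLiteral()).
    public static int asDecimal(String segment) {
        return Integer.parseInt(segment, 10);
    }

    // C-style reading: "0x"/"0X" selects hex, a leading "0" selects octal.
    public static int asCStyle(String segment) {
        return Integer.decode(segment);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0177", "010", "255"}) {
            System.out.println(s + " -> decimal " + asDecimal(s)
                    + ", C-style " + asCStyle(s));
        }
    }
}
```

The same text "0177" yields 177 under the decimal reading and 127 under
the C-style reading, which is why the two parsers can return different
addresses for one input.
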
> Historically, |InetAddress.getByName()/.getAllByName()| were the only
> way to convert a literal address into an InetAddress object. The
> |getAllByName()| API relies on POSIX |getaddrinfo| / |inet_addr|,
> which parses IP address segments with |strtoul| (accepting octal and
> hexadecimal bases). The fallback to |getaddrinfo| is undesirable, as
> it may end up issuing blocking network queries if |inet_aton| rejects
> the input literal address. The Java standard explicitly says that
>
> |"If a literal IP address is supplied, only the validity of the 
> address format is checked." |
>
> Aleksei Efimov contributed JDK-8272215 [2], which adds new factory
> methods |.ofLiteral()| to the |InetAddress| classes. Although the new
> API is not affected by the |getaddrinfo| fallback issue, it is not
> sufficient for an app that wants to mimic the behavior of BSD/POSIX
> network utilities. In particular, this affects Java apps that parse
> or interpret the parameters of the standard tools as well as their
> configuration / environment.
>
> It is suggested to add a new factory method such as
> |.ofPosixLiteral()| to the |Inet4Address| class to fill this gap.
> This won't introduce ambiguity into the API and won't break the
> long-standing behavior. As a new method, it will not affect Java
> utilities such as HttpClient, nor existing Java applications. At the
> same time, the new method will help deal with the confusion between
> the BSD and Java standards.
>
> The parsing algorithm was added as part of JDK-8277608 [3]. It
> requires a minor modification to produce 4 bytes of output (at present
> it doesn't produce any output). The algorithm allows up to 4 segments
> separated by dots (.); the leading segment(s) must not exceed 255 if
> there is more than one segment, and the trailing segment must not
> exceed 256^(5 - numberOfSegments) - 1. The algorithm rejects numbers
> greater than 0xFF hex, 0377 octal, 255 decimal per octet. This differs
> from .ofLiteral(), where the limit is simply 255 per octet, regardless
> of leading 0s (the total length must not exceed 15). In
> .ofPosixLiteral() there would be no limit on the number of leading 0s,
> which is also the case with inet_addr(). The corner cases for both
> methods are numbers that are accepted by both but produce different
> outputs, such as octal numbers between 010 and 0255. 0256 and above
> are rejected by ofLiteral(), as are all hexadecimal numbers.
> Zero-prefixed decimal numbers such as 0239 should be rejected by
> ofPosixLiteral().
>
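
The rules in the paragraph above can be sketched roughly as follows.
This is not the JDK-8277608 code and not the proposed ofPosixLiteral()
implementation; the class name PosixLiteralSketch and its methods are
hypothetical, and details the text does not state (for example,
rejecting empty segments and a bare "0x") are assumptions.

```java
import java.util.Arrays;

public class PosixLiteralSketch {
    // Parses one segment: "0x"/"0X" means hex, a leading "0" means octal,
    // anything else is decimal. Leading zeros are not limited in number.
    private static long parseSegment(String s) {
        int radix = 10;
        String digits = s;
        if (s.startsWith("0x") || s.startsWith("0X")) {
            radix = 16;
            digits = s.substring(2);
        } else if (s.length() > 1 && s.startsWith("0")) {
            radix = 8;
            digits = s.substring(1);
        }
        if (digits.isEmpty()) {
            throw new IllegalArgumentException("empty segment: '" + s + "'");
        }
        long value = 0;
        for (char c : digits.toCharArray()) {
            // ASCII only; Character.digit returns -1 for digits outside the radix.
            int d = c < 128 ? Character.digit(c, radix) : -1;
            if (d < 0) {
                throw new IllegalArgumentException("bad digit in '" + s + "'");
            }
            value = value * radix + d;
            if (value > 0xFFFFFFFFL) {
                throw new IllegalArgumentException("segment too large: '" + s + "'");
            }
        }
        return value;
    }

    // Returns the 4 address bytes, or throws IllegalArgumentException.
    public static byte[] parse(String literal) {
        String[] parts = literal.split("\\.", -1);
        if (parts.length > 4) {
            throw new IllegalArgumentException("too many segments: " + literal);
        }
        long address = 0;
        for (int i = 0; i < parts.length; i++) {
            long value = parseSegment(parts[i]);
            boolean last = (i == parts.length - 1);
            // Leading segments: at most 255.
            // Trailing segment: at most 256^(5 - numberOfSegments) - 1.
            long max = last ? (1L << (8 * (5 - parts.length))) - 1 : 255;
            if (value > max) {
                throw new IllegalArgumentException("segment out of range: " + parts[i]);
            }
            address |= last ? value : value << (24 - 8 * i);
        }
        return new byte[] {
            (byte) (address >>> 24), (byte) (address >>> 16),
            (byte) (address >>> 8), (byte) address
        };
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0177.0.0.1", "0x7f.1", "2130706433"}) {
            System.out.println(s + " -> " + Arrays.toString(parse(s)));
        }
    }
}
```

With this sketch, 0177.0.0.1, 0x7f.1 and 2130706433 all map to the
bytes of 127.0.0.1, while 0239.0.0.1 is rejected because 9 is not an
octal digit.
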
> There could be slight discrepancies in how different standard tools
> work under different operating systems. For example, in MacOS wget &
> nc disregard the octal prefix (0) while allowing the hexadecimal
> prefix (0x); at the same time curl & ping process both prefixes. In
> Ubuntu Server 22.04 both prefixes are processed, but they are not
> allowed in the /etc/hosts file, while in MacOS it's legal to use 0x.
> Despite the deviations in how and where the BSD standard is
> implemented, there are two distinct approaches. I don't see why Java
> shouldn't provide two different independent APIs. It would give
> future apps the flexibility to decide which standard to rely on, and
> the ability to see the full picture.
>
> Please share your thoughts on whether such a change might be desirable 
> in JDK 23. Thank you for your help!
>
> Best regards
> Sergey Chernyshev
>
> [1] https://bugs.openjdk.org/browse/JDK-8315767
> [2] https://bugs.openjdk.org/browse/JDK-8272215
> [3] https://github.com/openjdk/jdk/commit/cdc1582 