RFR: 8315767: InetAddress: constructing objects from BSD literal addresses

Sergey Chernyshev schernyshev at openjdk.org
Tue Apr 9 00:47:20 UTC 2024


There are two distinct approaches to parsing IPv4 literal addresses. One is the Java baseline "strict" syntax (the all-decimal d.d.d.d family of forms); the other is the "loose" syntax of RFC 6943 section 3.1.1 [1] (POSIX `inet_addr`, which also allows octal and hexadecimal forms [2]). The goal of this PR is to provide an interface for constructing `InetAddress` objects from literal addresses in POSIX form, for applications that need to mimic the behavior of `inet_addr` as used by standard network utilities such as netcat/curl/wget and the majority of web browsers. At present there is no way to construct an `InetAddress` object from such literal addresses, because the existing APIs such as `InetAddress.getByName()`, `InetAddress#ofLiteral()` and `Inet4Address#ofLiteral()` will consume the address and successfully parse it as decimal, regardless of the octal prefix. Hence the resulting object will point to a different IP address.
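To make the ambiguity concrete, here is a minimal sketch of the two readings of a single-number literal. It does not call the JDK address APIs; the class and method names (`LiteralBaseDemo`, `parseStrict`, `parseLoose`, `toDottedQuad`) are made up for illustration, and `Long.decode` is used only as a convenient stand-in for prefix-aware parsing (it also accepts signs and `#`, which `inet_addr` does not).

```java
public class LiteralBaseDemo {
    // Formats the low 32 bits of a value as a dotted-quad string.
    static String toDottedQuad(long value) {
        return ((value >> 24) & 0xFF) + "." + ((value >> 16) & 0xFF) + "."
                + ((value >> 8) & 0xFF) + "." + (value & 0xFF);
    }

    // "Strict" reading: always base 10, so a leading zero is just padding.
    static long parseStrict(String literal) {
        return Long.parseLong(literal, 10);
    }

    // "Loose" reading: a leading "0" selects octal, "0x"/"0X" selects hex.
    static long parseLoose(String literal) {
        return Long.decode(literal);
    }

    public static void main(String[] args) {
        String literal = "02130706433";
        System.out.println(toDottedQuad(parseStrict(literal))); // 127.0.0.1
        System.out.println(toDottedQuad(parseLoose(literal)));  // 17.99.141.27
    }
}
```

The same string therefore names the loopback address under the decimal reading and an unrelated host under the octal reading.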

Historically, `InetAddress.getByName()`/`.getAllByName()` were the only way to convert a literal address into an `InetAddress` object. The `getAllByName()` API relies on POSIX `getaddrinfo` / `inet_addr`, which parses IP address segments with `strtoul` (accepting octal and hexadecimal bases).

The fallback to `getaddrinfo` is undesirable because it may result in blocking network queries if `inet_addr` rejects the input literal address. The Java standard explicitly says that

"If a literal IP address is supplied, only the validity of the address format is checked."

@AlekseiEfimov contributed JDK-8272215 [3], which adds new `.ofLiteral()` factory methods to the `InetAddress` classes. Although the new API is not affected by the `getaddrinfo` fallback issue, it is not sufficient for an application that needs to interoperate with external tooling that follows the POSIX standard. In the current state, `InetAddress#ofLiteral()` and `Inet4Address#ofLiteral()` will consume the input literal address and (regardless of the octal prefix) parse it as decimal numbers. Hence it is not possible to reliably construct an `InetAddress` object from a literal address in POSIX form that would point to the desired host.

It is proposed to extend the factory methods with `Inet4Address#ofPosixLiteral()`, which allows parsing literal IPv4 addresses in the "loose" syntax, compatible with the POSIX `inet_addr` API. The implementation is based on the `.isBsdParsableV4()` method added along with JDK-8277608 [4]. The changes to the original algorithm are as follows:

- the `IPAddressUtil#parseBsdLiteralV4()` method is extracted from `.isBsdParsableV4()`
- an additional check is added for an empty input string
- `null` is returned whenever the original algorithm fails
- a condition is added to the parser loop that stores the IPv4 address segment value when it fits in 1 byte (0 <= x < 256)
- an additional check is added to verify that the last field value is non-negative
- when the last field value is multi-byte (the number of fields is less than 4), it is written to the last (4-(fieldNumber-1)) octets
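The steps in this list can be sketched roughly as follows. This is not the code in `IPAddressUtil`; it is a simplified, self-contained illustration written under the assumptions stated in the comments, and the names `BsdLiteralSketch`, `parse`, `parseField` and `format` are hypothetical.

```java
public class BsdLiteralSketch {
    // Returns the 4 address bytes, or null when the literal cannot be parsed.
    static byte[] parse(String input) {
        if (input == null || input.isEmpty()) {
            return null;                         // empty-input check
        }
        String[] fields = input.split("\\.", -1);
        if (fields.length > 4) {
            return null;
        }
        byte[] out = new byte[4];
        for (int i = 0; i < fields.length; i++) {
            long v = parseField(fields[i]);
            if (v < 0) {
                return null;                     // malformed field
            }
            if (i < fields.length - 1) {
                if (v > 0xFF) {
                    return null;                 // non-last fields must fit in 1 byte
                }
                out[i] = (byte) v;
            } else {
                int octets = 4 - i;              // last field fills the remaining octets
                if (v >= (1L << (8 * octets))) {
                    return null;
                }
                for (int j = 3; j >= i; j--) {
                    out[j] = (byte) (v & 0xFF);
                    v >>= 8;
                }
            }
        }
        return out;
    }

    // Parses one field with strtoul-like base selection; returns -1 on error.
    // Assumption: ASCII digits only, no signs, value capped at 32 bits.
    static long parseField(String f) {
        int radix = 10;
        String digits = f;
        if (f.length() > 1 && f.charAt(0) == '0') {
            boolean hex = f.charAt(1) == 'x' || f.charAt(1) == 'X';
            radix = hex ? 16 : 8;
            digits = f.substring(hex ? 2 : 1);
        }
        if (digits.isEmpty()) {
            return -1;
        }
        long v = 0;
        for (char c : digits.toCharArray()) {
            int d = c < 128 ? Character.digit(c, radix) : -1;
            if (d < 0) {
                return -1;                       // e.g. '8' in an octal field
            }
            v = v * radix + d;
            if (v > 0xFFFFFFFFL) {
                return -1;                       // does not fit in 32 bits
            }
        }
        return v;
    }

    // Formats parsed bytes as a dotted quad, or "null" for a rejected literal.
    static String format(byte[] a) {
        return a == null ? "null"
                : (a[0] & 0xFF) + "." + (a[1] & 0xFF) + "." + (a[2] & 0xFF) + "." + (a[3] & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(format(parse("02130706433"))); // 17.99.141.27
        System.out.println(format(parse("127.1")));       // 127.0.0.1
        System.out.println(format(parse("02130706438"))); // null
    }
}
```

Returning `null` rather than throwing keeps the helper usable both as a parser and as a validity check, leaving the caller to decide how to report the failure.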

The new method hasn't been added to the `InetAddress` superclass because the change relates only to IPv4 addressing. This reduces the chance that client code will call the wrong factory method.

`test/jdk/java/net/InetAddress/OfLiteralTest.java` was updated to include `.ofPosixLiteral()` tests.

Javadocs in `Inet4Address` were updated accordingly.

The new method can be used as follows:

import java.net.InetAddress;
import java.net.Inet4Address;

public class Test {
    public static void main(String[] args) throws Throwable {
        if (args.length < 1) {
            System.err.println("USAGE: java Test <host>");
            return;
        }
        InetAddress ia = Inet4Address.ofPosixLiteral(args[0]);
        System.out.println(ia.toString());
    }
}

The output would be:

$ ./build/images/jdk/bin/java Test 2130706433
/127.0.0.1
$ ./build/images/jdk/bin/java Test 02130706433
/17.99.141.27
$ ./build/images/jdk/bin/java Test 2130706438
/127.0.0.6
$ ./build/images/jdk/bin/java Test 02130706438
Exception in thread "main" java.lang.IllegalArgumentException: Invalid IP address literal: 02130706438
        at java.base/sun.net.util.IPAddressUtil.invalidIpAddressLiteral(IPAddressUtil.java:169)
        at java.base/java.net.Inet4Address.parseAddressStringPosix(Inet4Address.java:302)
        at java.base/java.net.Inet4Address.ofPosixLiteral(Inet4Address.java:239)
        at Test.main(Test.java:10)


[1] https://www.ietf.org/rfc/rfc6943.html#section-3.1.1
[2] https://pubs.opengroup.org/onlinepubs/9699919799/functions/inet_addr.html
[3] https://bugs.openjdk.org/browse/JDK-8272215
[4] https://github.com/openjdk/jdk/commit/cdc1582d1d7629c2077f6cd19786d23323111018

-------------

Commit messages:
 - updated specification for java.net.Inet4Address, javadoc for sun.net.util.IPAddressUtil
 - removed trailing whitespace
 - updated javadocs, added apinotes, tests for corner cases
 - handle empty strings
 - 8315767: InetAddress.getByName() accepts ambiguous addresses

Changes: https://git.openjdk.org/jdk/pull/18493/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18493&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8315767
  Stats: 256 lines in 3 files changed: 241 ins; 1 del; 14 mod
  Patch: https://git.openjdk.org/jdk/pull/18493.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18493/head:pull/18493

PR: https://git.openjdk.org/jdk/pull/18493

