RFR: 8352728: InternalError loading java.security due to Windows parent folder permissions
Francisco Ferrari Bihurriet
fferrari at openjdk.org
Tue Apr 15 02:46:45 UTC 2025
On Thu, 10 Apr 2025 05:52:17 GMT, Alan Bateman <alanb at openjdk.org> wrote:
>> Hi, this is a proposal to fix 8352728.
>>
>> The main idea is to replace [`java.nio.file.Path::toRealPath`](https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/nio/file/Path.html#toRealPath(java.nio.file.LinkOption...)) by [`java.io.File::getCanonicalPath`](https://docs.oracle.com/en/java/javase/24/docs/api/java.base/java/io/File.html#getCanonicalPath()) for path canonicalization purposes. The rationale behind this decision is the following:
>>
>> 1. In _Windows_, `File::getCanonicalPath` handles restricted permissions in parent directories. Contrarily, `Path::toRealPath` fails with `AccessDeniedException`.
>> 2. In _Linux_, `File::getCanonicalPath` handles non-regular files (e.g. `/dev/stdin`). Contrarily, `Path::toRealPath` fails with `NoSuchFileException`.
>>
>> #### Windows Case
>>
>> @martinuy and I tracked down the `File::getCanonicalPath` vs `Path::toRealPath` behaviour differences in _Windows_. Both methods end up calling the [`FindFirstFileW`](https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-findfirstfilew) API inside a loop for each parent directory in the path, until they include the leaf:
>>
>> * [`File::getCanonicalPath`](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/share/classes/java/io/File.java#L618 "src/java.base/share/classes/java/io/File.java:618") goes through the following stack into `FindFirstFileW`:
>> * [`WinNTFileSystem::canonicalize`](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/windows/classes/java/io/WinNTFileSystem.java#L473 "src/java.base/windows/classes/java/io/WinNTFileSystem.java:473")
>> * [`WinNTFileSystem::canonicalize0`](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/windows/native/libjava/WinNTFileSystem_md.c#L288 "src/java.base/windows/native/libjava/WinNTFileSystem_md.c:288")
>> * [`wcanonicalize`](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/windows/native/libjava/canonicalize_md.c#L233 "src/java.base/windows/native/libjava/canonicalize_md.c:233") (here is the loop)
>> * If `FindFirstFileW` fails with `ERROR_ACCESS_DENIED`, `lastErrorReportable` is consulted, the error is [considered non-reportable](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/windows/native/libjava/canonicalize_md.c#L139 "src/java.base/windows/native/libjava/canonicalize_md.c:139") and the iteration is stopped [here](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/windows/native/libjava/canonicalize_md.c#L246-L250 "src/ja...
>
> I don't think this change should be integrated before more investigation. I think start by finding out why this code is using toRealPath. For the two cases listed, it looks like toRealPath is correctly failing for the first, but for /dev/stdin then please bring it to nio-dev to discuss how special devices should be handled by that method.
Hi again @AlanBateman,
I've been doing some research on _Linux_, debugging the following sample:
### `GetContentAndRealPath.java`
import java.nio.file.Files;
import java.nio.file.Path;
import static java.nio.charset.StandardCharsets.UTF_8;
public final class GetContentAndRealPath {
public static void main(String[] args) throws Exception {
if (args.length != 1) {
throw new Exception("Please specify a file path");
}
Path path = Path.of(args[0]);
System.out.println("%(java) Content:");
System.out.println("%(java) => " +
new String(Files.readAllBytes(path), UTF_8).trim());
System.out.println("%(java)");
System.out.println("%(java) File::getCanonicalPath:");
System.out.println("%(java) => " +
path.toFile().getCanonicalFile().toPath());
System.out.println("%(java)");
System.out.println("%(java) Path::toRealPath:");
System.out.println("%(java) => " + path.toRealPath());
}
}
The first thing to note is that `/dev/stdin` is not a problem per se, the actual problem is when it is provided by an anonymous pipe. So the following works fine:
$ java GetContentAndRealPath.java /dev/stdin </dev/null
%(java) Content:
%(java) =>
%(java)
%(java) File::getCanonicalPath:
%(java) => /dev/null
%(java)
%(java) Path::toRealPath:
%(java) => /dev/null
In Bash, there are various ways to provide `stdin` through an anonymous pipe:
# https://www.gnu.org/software/bash/manual/bash.html#Pipelines
echo Pipelines | java GetContentAndRealPath.java /dev/stdin
# https://www.gnu.org/software/bash/manual/bash.html#Here-Strings
java GetContentAndRealPath.java /dev/./stdin <<<Here-Strings
# https://www.gnu.org/software/bash/manual/bash.html#Here-Documents
java GetContentAndRealPath.java /etc/../dev/stdin <<EOF
Here-Documents
EOF
# https://www.gnu.org/software/bash/manual/bash.html#Process-Substitution
java GetContentAndRealPath.java <(echo Process-Substitution)
Here-Documents example:
$ java GetContentAndRealPath.java /etc/../dev/stdin <<EOF
Here-Documents
EOF
%(java) Content:
%(java) => Here-Documents
%(java)
%(java) File::getCanonicalPath:
%(java) => /dev/stdin
%(java)
%(java) Path::toRealPath:
Exception in thread "main" java.nio.file.NoSuchFileException: /etc/../dev/stdin
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:834)
at GetContentAndRealPath.main(GetContentAndRealPath.java:24)
Please note how `File::getCanonicalPath` resolves `/etc/../dev/stdin` → `/dev/stdin` while `Path::toRealPath` fails with `NoSuchFileException`.
### _glibc_'s `realpath()`
I traced this behaviour inside _glibc_'s `realpath()`, with the following GDB script:
gdb -q --nx java <<'EOF' 2>&1 | grep --color=never '^%(\w*)|^Exception|^ at'
# Settings
set debuginfod enabled on
set breakpoint pending on
handle SIGSEGV nostop noprint pass
# Break once at JDK_Canonicalize(), if the path starts with /dev/fd/
tbreak JDK_Canonicalize if ((int) strncmp(orig, "/dev/fd/", 8)) == 0
# Start java with a Process-Substitution anonymous pipe
run GetContentAndRealPath.java <(echo Process-Substitution)
# Stopped at JDK_Canonicalize() add a trace for each glibc's realpath() call
# https://github.com/bminor/glibc/blob/glibc-2.39/stdlib/canonicalize.c#L431
break canonicalize.c:431
commands
python begin_realpath()
continue
end
# Inside glibc's realpath() implementation, also trace each readlink() result
# https://github.com/bminor/glibc/blob/glibc-2.39/stdlib/canonicalize.c#L311
break canonicalize.c:311
commands
python after_readlink()
continue
end
############################################################
python
from errno import errorcode
def begin_realpath():
name = gdb.parse_and_eval('name').string('utf-8')
print(f'%(glibc) realpath("{name}")')
def after_readlink():
rname = gdb.parse_and_eval('rname').string('utf-8')
print(f'%(glibc) readlink("{rname}") -> ', end='')
n = int(gdb.parse_and_eval('n'))
if n >= 0:
buf = gdb.parse_and_eval('buf').string('utf-8')[:n]
print(f'"{buf}"')
else:
errno = int(gdb.parse_and_eval('errno'))
print(f'{errorcode[errno]} ({errno})')
end
############################################################
# Resume execution from JDK_Canonicalize()
continue
EOF
It produces the following output:
%(java) Content:
%(java) => Process-Substitution
%(java)
%(java) File::getCanonicalPath:
%(glibc) realpath("/dev/fd/63")
%(glibc) readlink("/dev") -> EINVAL (22)
%(glibc) readlink("/dev/fd") -> "/proc/self/fd"
%(glibc) readlink("/proc") -> EINVAL (22)
%(glibc) readlink("/proc/self") -> "1390261"
%(glibc) readlink("/proc/1390261") -> EINVAL (22)
%(glibc) readlink("/proc/1390261/fd") -> EINVAL (22)
%(glibc) readlink("/proc/1390261/fd/63") -> "pipe:[18671624]"
%(glibc) readlink("/proc/1390261/fd/pipe:[18671624]") -> ENOENT (2)
%(glibc) realpath("/dev/fd")
%(glibc) readlink("/dev") -> EINVAL (22)
%(glibc) readlink("/dev/fd") -> "/proc/self/fd"
%(glibc) readlink("/proc") -> EINVAL (22)
%(glibc) readlink("/proc/self") -> "1390261"
%(glibc) readlink("/proc/1390261") -> EINVAL (22)
%(glibc) readlink("/proc/1390261/fd") -> EINVAL (22)
%(java) => /proc/1390261/fd/63
%(java)
%(java) Path::toRealPath:
%(glibc) realpath("/dev/fd/63")
%(glibc) readlink("/dev") -> EINVAL (22)
%(glibc) readlink("/dev/fd") -> "/proc/self/fd"
%(glibc) readlink("/proc") -> EINVAL (22)
%(glibc) readlink("/proc/self") -> "1390261"
%(glibc) readlink("/proc/1390261") -> EINVAL (22)
%(glibc) readlink("/proc/1390261/fd") -> EINVAL (22)
%(glibc) readlink("/proc/1390261/fd/63") -> "pipe:[18671624]"
%(glibc) readlink("/proc/1390261/fd/pipe:[18671624]") -> ENOENT (2)
Exception in thread "main" java.nio.file.NoSuchFileException: /dev/fd/63
at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
at java.base/sun.nio.fs.UnixPath.toRealPath(UnixPath.java:834)
at GetContentAndRealPath.main(GetContentAndRealPath.java:24)
We can see that `realpath()` invokes `readlink()` several times for the partially normalized path. Anonymous pipes live in the [**PipeFS** virtual filesystem](https://okc1.github.io/blog/2015/07/23/how-is-pipe-implemented-unix), which isn't mounted in userspace. However, anonymous pipes do have a `dentry` which gives them a `dname` with the `pipe:[<i_ino>]` format (Code: [`readlink` syscall](https://github.com/torvalds/linux/blob/v6.12/fs/stat.c#L574) → [`do_readlinkat`](https://github.com/torvalds/linux/blob/v6.12/fs/stat.c#L551) → [`vfs_readlink`](https://github.com/torvalds/linux/blob/v6.12/fs/namei.c#L5255) → [`proc_pid_readlink`](https://github.com/torvalds/linux/blob/v6.12/fs/proc/base.c#L1856) → [`do_proc_readlink`](https://github.com/torvalds/linux/blob/v6.12/fs/proc/base.c#L1827) → [`d_path`](https://github.com/torvalds/linux/blob/v6.12/fs/d_path.c#L270-L283) → [`pipefs_dname`](https://github.com/torvalds/linux/blob/v6.12/fs/pipe.c#L867-L874). Also, there are som
e higher level explanations [here](https://unix.stackexchange.com/a/124977) and [here](https://narkive.com/QM2vnkZu.5)).
When `readlink()` resolves `/proc/<pid>/fd/63`, it returns the `pipe:[18543635]` link target, which makes the symlink look "broken" (from the userspace perspective) and this ultimately makes `realpath()` fail with `ENOENT (2)`.
However, the unresolved `/proc/<pid>/fd/0` link works, as the _Linux Kernel_ can access the anonymous pipe behind it:
fferrari at vmhost:~$ echo TEST | sleep 20 &>/dev/null &disown
[1] 1386980
fferrari at vmhost:~$ cat $(realpath /proc/$(pgrep sleep)/fd/0)
cat: '/proc/1386980/fd/pipe:[18629664]': No such file or directory
fferrari at vmhost:~$ cat /proc/$(pgrep sleep)/fd/0
TEST
### _OpenJDK_ APIs difference
We can see in [`UnixNativeDispatcher::realpath0`](https://github.com/openjdk/jdk/blob/jdk-25+17/src/java.base/unix/native/libnio/fs/UnixNativeDispatcher.c#L1105 "src/java.base/unix/native/libnio/fs/UnixNativeDispatcher.c:1105") that `Path::toRealPath` translates the _glibc_'s `realpath()` failure immediately.
On the other hand `File::getCanonicalPath` goes through [`UnixFileSystem::canonicalize0`](https://github.com/openjdk/jdk/blob/jdk-25+17/src/java.base/unix/native/libjava/UnixFileSystem_md.c#L86-L87 "src/java.base/unix/native/libjava/UnixFileSystem_md.c:86") and `JDK_Canonicalize`, [which retries `realpath()` until some subpath works](https://github.com/openjdk/jdk/blob/jdk-25+17/src/java.base/unix/native/libjava/canonicalize_md.c#L67-L86 "src/java.base/unix/native/libjava/canonicalize_md.c:86"). This aligns with the previous GDB experiment, which showed `File::getCanonicalPath` is doing an additional `realpath("/dev/fd")` call when `realpath("/dev/fd/63")` fails.
Given it just follows the _glibc_'s behaviour, I don't think a `Path::toRealPath` change is justifiable for an `nio-dev` request. For example, [an equivalent discussion has been raised for Rust](https://github.com/rust-lang/rust/issues/95239), and the developers aren't considering making any change.
`File::getCanonicalPath` seems to take the best-effort approach (both in _Linux_ and _Windows_), whereas `Path::toRealPath` is stricter.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24465#issuecomment-2803609222
More information about the security-dev
mailing list