java fails under a path with 4-byte UTF-8 character & JDK-8258246
Jaikiran Pai
jai.forums2013 at gmail.com
Sat Nov 23 15:38:25 UTC 2024
Hello Fabian,
There are two issues here. The first is that if "java" is launched with a
classpath containing a path which has an emoji character in it, then the
launch fails with an exception.
For example, if you have a "Foo.class" in /tmp/ascii-dir, you are
currently in "<dir-with-emoji-char>", and you run:
<dir-with-emoji-char> $> java -cp .:/tmp/ascii-dir Foo
Then that leads to the following exception:
Error: A JNI error has occurred, please check your installation and try again
Exception in thread "main" java.lang.IllegalArgumentException: Error decoding percent encoded characters
    at java.base/sun.net.www.ParseUtil.decode(ParseUtil.java:221)
    at java.base/jdk.internal.loader.URLClassPath$FileLoader.<init>(URLClassPath.java:1030)
    at java.base/jdk.internal.loader.URLClassPath$3.run(URLClassPath.java:483)
    at java.base/jdk.internal.loader.URLClassPath$3.run(URLClassPath.java:477)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
    at java.base/jdk.internal.loader.URLClassPath.getLoader(URLClassPath.java:476)
    at java.base/jdk.internal.loader.URLClassPath.getLoader(URLClassPath.java:444)
    at java.base/jdk.internal.loader.URLClassPath.getResource(URLClassPath.java:315)
    at java.base/jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:757)
    ...
The second issue is that, due to an internal implementation detail,
jdk.internal.loader.URLClassPath cannot handle paths that have an
emoji character in them. That implies it cannot serve any
resources from such paths.
The first issue, which causes the launch to fail with an exception, is a
straightforward bug that needs fixing. A URLClassPath can have multiple
URLs as the classpath, and it is implemented so that it does not throw
exceptions if any of those URLs are unusable. In this specific case we
do have a bug: the unusable URL causes an exception to propagate out of
the URLClassPath code. That should be addressed through
https://bugs.openjdk.org/browse/JDK-8344908.
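
To illustrate the intended behaviour, here is a rough sketch of skipping an unusable classpath entry instead of failing. This is not the actual URLClassPath code: the class TolerantClassPath, its methods and the "%zz-broken" entry are all made up for illustration, and java.net.URI is used only as a stand-in for something that rejects a badly encoded entry with an IllegalArgumentException.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class TolerantClassPath {

    // Hypothetical stand-in for per-entry loader creation: decodes the
    // path of one classpath URL. URI.create throws IllegalArgumentException
    // when the percent-encoding is malformed.
    static String openEntry(String encodedUrl) {
        return URI.create(encodedUrl).getPath();
    }

    // Walks the entries in order and skips any that cannot be opened,
    // instead of letting the exception propagate to the caller.
    static List<String> usableEntries(List<String> encodedUrls) {
        List<String> usable = new ArrayList<>();
        for (String url : encodedUrls) {
            try {
                usable.add(openEntry(url));
            } catch (IllegalArgumentException e) {
                // Unusable entry: ignore it and move on to the next one.
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        System.out.println(usableEntries(List.of(
                "file:/tmp/%zz-broken/",   // malformed escape, made up for this sketch
                "file:/tmp/ascii-dir/")));
    }
}
```

With this shape, the class in /tmp/ascii-dir would still be found even though the other entry is unusable.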
The second issue, enhancing URLClassPath so that it can serve
resources from paths containing emojis (for example), is more involved
and ties into https://bugs.openjdk.org/browse/JDK-8258246. I will have
to refresh my memory of that issue and check with others on what can be
done for it (if anything).
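
For reference, a small sketch of what such a character looks like to Java and how it round-trips through the public java.net.URI API. It does not touch ParseUtil or URLClassPath and makes no claim about how they encode or decode paths. The class name EmojiPathEncoding, its helper methods and the directory name /tmp/dir-<emoji> are assumptions made up for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

public class EmojiPathEncoding {

    // U+1F600, written with escapes so the source file encoding does not matter.
    static final String EMOJI = "\uD83D\uDE00";

    // Builds a file: URI string for an absolute path, percent-encoding
    // every non-ASCII character as its UTF-8 bytes.
    static String toEncodedFileUri(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decodes the percent-encoded path back into a plain string.
    static String decodePath(String encodedUri) {
        return URI.create(encodedUri).getPath();
    }

    public static void main(String[] args) {
        String path = "/tmp/dir-" + EMOJI; // made-up directory name

        // One code point, but two Java chars (a surrogate pair)
        // and four bytes in UTF-8.
        System.out.println(EMOJI.length());
        System.out.println(EMOJI.getBytes(StandardCharsets.UTF_8).length);

        String encoded = toEncodedFileUri(path);
        System.out.println(encoded);
        System.out.println(decodePath(encoded).equals(path));
    }
}
```

The point of the sketch is only that the character is a surrogate pair on the Java side and a single 4-byte sequence once percent-encoded, so encoding and decoding have to agree on treating the pair as one unit.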
-Jaikiran
On 22/11/24 11:57 pm, Fabian Meumertzheim wrote:
> I just observed that the java launcher fails to run (any) Java program
> on macOS if executed from a working directory that contains a
> character with a 4-byte UTF-8 encoding (say an emoji).
>
> While it does seem related to JDK-8258246, which was considered to be
> caused by more general issues with legacy URL/URI conversion, I
> believe that my current issue comes down to just
>
> sun.net.www.ParseUtil.decode(sun.net.www.ParseUtil.fileToEncodedURL(f))
>
> failing for any File f with such a character in its path.
>
> Since sun.net.www.ParseUtil.fileToEncodedURL seems to have its single
> call site in URLClassPath, which is the source of the issue, I wonder
> whether this could be fixed in a more targeted manner without
> incurring the high risks alluded to in JDK-8258246. For example,
> replacing the URL encoding logic with f.toPath().toUri().toURL() does
> seem to resolve the issue.
>
> Is the latest comment on JDK-8258246 still the plan or would a
> separate bug report for this more targeted issue be welcome?
>
> Fabian