RFR: 8300819: -Dfile.encoding=Cp943C option does not work as expected since jdk18
Alan Bateman
alanb at openjdk.org
Sun Jan 22 19:25:02 UTC 2023
On Sun, 22 Jan 2023 09:18:37 GMT, Ichiroh Takiguchi <itakiguchi at openjdk.org> wrote:
> On jdk17, the following testcase works fine on the Linux platform.
>
> Testcase
>
> $ cat cstest1.java
> import java.nio.charset.*;
>
> public class cstest1 {
>     public static void main(String[] args) throws Exception {
>         Charset cs = Charset.defaultCharset();
>         System.out.println(cs + ", " + cs.getClass() + ", " + cs.getClass().getModule());
>     }
> }
>
>
> $ ~/jdk-17.0.6+10/bin/java -Dfile.encoding=Cp943C -showversion cstest1
> openjdk version "17.0.6" 2023-01-17
> OpenJDK Runtime Environment Temurin-17.0.6+10 (build 17.0.6+10)
> OpenJDK 64-Bit Server VM Temurin-17.0.6+10 (build 17.0.6+10, mixed mode, sharing)
> x-IBM943C, class sun.nio.cs.ext.IBM943C, module jdk.charsets
>
>
> But it does not work as expected on jdk18 and jdk21b06
>
> $ ~/jdk-18.0.2.1+1/bin/java -Dfile.encoding=Cp943C -showversion cstest1
> openjdk version "18.0.2.1" 2022-08-18
> OpenJDK Runtime Environment Temurin-18.0.2.1+1 (build 18.0.2.1+1)
> OpenJDK 64-Bit Server VM Temurin-18.0.2.1+1 (build 18.0.2.1+1, mixed mode, sharing)
> UTF-8, class sun.nio.cs.UTF_8, module java.base
> $ ~/jdk-21/bin/java -Dfile.encoding=Cp943C -showversion cstest1
> openjdk version "21-ea" 2023-09-19
> OpenJDK Runtime Environment (build 21-ea+6-365)
> OpenJDK 64-Bit Server VM (build 21-ea+6-365, mixed mode, sharing)
> UTF-8, class sun.nio.cs.UTF_8, module java.base
>
>
> Fixed result is as follows:
>
> $ java -Dfile.encoding=Cp943C -showversion PrintDefaultCharset
> openjdk version "21-internal" 2023-09-19
> OpenJDK Runtime Environment (build 21-internal-adhoc.jdktest.jdk)
> OpenJDK 64-Bit Server VM (build 21-internal-adhoc.jdktest.jdk, mixed mode, sharing)
> x-IBM943C
I don't think this is the right thing to do.
In terms of setup, there wasn't a documented/supported way to change the default charset in JDK 17 or earlier. JDK 18 was the first release where it was documented (via an implNote) that it is possible to start with -Dfile.encoding=COMPAT to derive the default charset from native.encoding (which in turn depends on the locale and charset of the underlying operating system).
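To see how these settings relate on a given runtime, a small probe like the following can help. This is only an illustrative sketch: the class name ShowEncodingProperties and the describe() helper are made up for this example, and native.encoding may be null on releases that predate that property.

```java
import java.nio.charset.Charset;

public class ShowEncodingProperties {
    // Builds a one-line summary of the encoding-related settings.
    // (Hypothetical helper, for illustration only.)
    static String describe() {
        String fileEncoding = System.getProperty("file.encoding");
        // May be null on older releases that do not define this property.
        String nativeEncoding = System.getProperty("native.encoding");
        return "file.encoding=" + fileEncoding
                + ", native.encoding=" + nativeEncoding
                + ", default=" + Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with no options and once with -Dfile.encoding=COMPAT should make the difference visible on JDK 18 or later.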
The second point is that it is too fragile to assume that Charset.defaultCharset won't be called until after phase 2 of startup. For the system to be reliable, the default charset must be in java.base. The proposed test may work in some cases but it won't work in cases where there are command line options and Java code executed at startup that indirectly uses Charset.defaultCharset.
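The module distinction can be checked from Java code. The sketch below is illustrative only (CharsetModuleCheck and providingModule are names invented here); it reports which module supplies a charset, or that the charset is not available in the current runtime image.

```java
import java.nio.charset.Charset;

public class CharsetModuleCheck {
    // Returns the name of the module that provides the named charset,
    // or a marker string if the charset is not available at all.
    // (Hypothetical helper, for illustration only.)
    static String providingModule(String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            return "(not available)";
        }
        return Charset.forName(charsetName).getClass().getModule().getName();
    }

    public static void main(String[] args) {
        // UTF-8 is implemented in java.base, so it is usable from the
        // earliest point of startup.
        System.out.println("UTF-8     -> " + providingModule("UTF-8"));
        // x-IBM943C lives in jdk.charsets in the outputs quoted above;
        // the result depends on whether that module is in the image.
        System.out.println("x-IBM943C -> " + providingModule("x-IBM943C"));
    }
}
```

A charset reported outside java.base cannot be relied on if Charset.defaultCharset is reached before the rest of the module system is up.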
That said, the change does highlight an issue in StaticProperty.<clinit> where it calls Charset.defaultCharset(), which in turn will read StaticProperty.FILE_ENCODING before that class is fully initialized. This has nothing to do with trying to run with -Dfile.encoding=Cp943C of course, but this part of the initialization will need to be re-examined to avoid issues.
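The general hazard is easy to reproduce in isolation. The following is a simplified analogue, not the actual JDK code: Props and Helper are hypothetical stand-ins for a class whose static initializer calls out to code that reads one of its own static fields before that field has been assigned.

```java
public class InitCycleDemo {
    // Hypothetical stand-in for a class holding startup properties.
    static class Props {
        // Assigned first; the call below runs while Props.<clinit>
        // is still in progress.
        static final String DEFAULT = Helper.computeDefault();
        // Assigned second, so it is still null during the call above.
        static final String ENCODING = readEncoding();

        static String readEncoding() {
            return "UTF-8";
        }
    }

    // Hypothetical stand-in for code invoked during initialization.
    static class Helper {
        static String computeDefault() {
            // Re-entrant access from the initializing thread does not
            // block; it simply sees the field's default value (null).
            return String.valueOf(Props.ENCODING);
        }
    }

    // Exposes both values so the effect can be inspected.
    static String[] observed() {
        return new String[] { Props.DEFAULT, Props.ENCODING };
    }

    public static void main(String[] args) {
        String[] v = observed();
        System.out.println("DEFAULT=" + v[0] + ", ENCODING=" + v[1]);
        // prints "DEFAULT=null, ENCODING=UTF-8"
    }
}
```

No exception is thrown; the early reader just observes a stale null, which is why this kind of ordering problem can go unnoticed.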
-------------
PR: https://git.openjdk.org/jdk/pull/12132