Allowing apps to force sun.jnu.encoding = "UTF-8" on Windows
Fabian Meumertzheim
fabian at buildbuddy.io
Mon Nov 4 09:38:11 UTC 2024
On Thu, Oct 31, 2024 at 11:26 PM Naoto Sato <naoto.sato at oracle.com> wrote:
> >> This has been discussed when we did JEP 400: UTF-8 by Default and
> >> decided not to do it, mainly because it affects filename/path encoding.
> >> Changing `sun.jnu.encoding` apart from Windows system encoding will make
> >> apps not being able to access those files/directories (e.g. home
> >> directory) if the path/name contains characters with different encodings.
> >
> > Based on grepping the source, it looks like the JDK (almost?)
> > exclusively uses the -W Windows APIs to interface with the file
> > system, with the active code page only being relevant for the internal
> > conversion between Java strings and platform UTF-16 strings through
> > `MultiByteToWideChar` and `WideCharToMultiByte` (via `CP_ACP`).
>
> I don't believe this assumption is correct. Java runtime is implicitly
> using Win32 ANSI calls, as it is not entirely compiled with `UNICODE`
> flag. Those calls would fail if ANSI code page differs from UTF-8.

I see, thank you for making me aware of this.

Would contributions in this area be welcome? If it is possible to get
to a state where this assumption does hold via incremental,
behavior-preserving changes to the native Windows parts of the Java
runtime, that would potentially allow the code page to be made
configurable from the java.exe command line in the future.

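
To make the concern concrete, here is a minimal sketch (not JDK code) that
prints the encoding-related system properties and shows what happens when a
file name passes through a charset that cannot represent it. The class name
`EncodingProbe` and the helper `roundTrip` are hypothetical, and ISO-8859-1
is only a stand-in for a legacy ANSI code page; `sun.jnu.encoding` is an
internal property and `native.encoding` only exists on JDK 17 and later, so
either may print as null.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingProbe {
    // Encodes a name with the given charset and decodes it again, roughly
    // mimicking a Java string -> platform bytes -> Java string round trip.
    // Characters the charset cannot represent are replaced (typically by '?').
    public static String roundTrip(String name, Charset cs) {
        return new String(name.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Internal JDK property discussed in this thread; may be null.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
        // Introduced in JDK 17; null on older releases.
        System.out.println("native.encoding  = " + System.getProperty("native.encoding"));

        // A file name containing CJK characters, written with Unicode escapes
        // so the source file's own encoding does not matter.
        String name = "\u65e5\u672c\u8a9e.txt";

        // UTF-8 can represent every character, so the name survives.
        System.out.println("UTF-8 round trip intact:      "
                + name.equals(roundTrip(name, StandardCharsets.UTF_8)));

        // ISO-8859-1 (assumed stand-in for an ANSI code page) cannot, so the
        // name is damaged and would no longer identify the original file.
        System.out.println("ISO-8859-1 round trip intact: "
                + name.equals(roundTrip(name, StandardCharsets.ISO_8859_1)));
    }
}
```

A lossy round trip like the second one is the failure mode that arises when
the encoding used for path conversion does not match what the native ANSI
calls expect.
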
Fabian