JEP proposed to target JDK 18: 400: UTF-8 by Default
Naoto Sato
naoto.sato at oracle.com
Wed Aug 18 21:44:06 UTC 2021
On 8/18/21 2:03 PM, Simon Nash wrote:
> I am the developer of a fairly large application that uses file I/O
> extensively. In most cases, the charset should be UTF-8 and I have used
> an explicit charset parameter on all method invocations where this
> applies. In some cases, the charset needs to be the platform default
> charset to produce output that is readable by other programs or by a
> user (for example, Windows-1252 on some versions of Windows).
>
> In the cases that need the platform default charset, I have omitted the
> charset (intentionally, not carelessly or accidentally). If the
> behaviour changes in these cases, it will produce unexpected results for
> users.
In preparation for JEP 400, we have provided a new system property in
JDK 17 that returns the native encoding name:
https://bugs.openjdk.java.net/browse/JDK-8265989
Apps that have the luxury of making code base changes can use the
property to retrieve the native encoding, then replace the no-arg I/O
constructors with their explicit-charset equivalents.
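
A minimal sketch of that replacement, not a definitive implementation.
It assumes the property is `native.encoding` (the name introduced by
the issue above). The class name `NativeCharset` and its methods are
hypothetical, for illustration only. On releases before JDK 17 the
property is absent, so the sketch falls back to
`Charset.defaultCharset()`, which on those releases is still derived
from the platform encoding unless `file.encoding` was overridden.

```java
import java.nio.charset.Charset;

// Hypothetical helper: resolves the platform (native) charset in a way
// that is intended to behave the same on JDK 8 through JDK 18.
final class NativeCharset {

    private NativeCharset() {}

    // Returns the charset named by nativeEncodingName, or fallback when
    // the name is missing, malformed, or not supported by this JVM.
    static Charset resolve(String nativeEncodingName, Charset fallback) {
        if (nativeEncodingName == null || nativeEncodingName.isEmpty()) {
            return fallback;
        }
        try {
            return Charset.forName(nativeEncodingName);
        } catch (IllegalArgumentException e) {
            // Covers both IllegalCharsetNameException and
            // UnsupportedCharsetException.
            return fallback;
        }
    }

    // "native.encoding" exists from JDK 17 on; older releases return
    // null here and so use the pre-JEP 400 default charset.
    static Charset get() {
        return resolve(System.getProperty("native.encoding"),
                       Charset.defaultCharset());
    }
}
```

A call such as `new FileWriter(file)` could then become
`new OutputStreamWriter(new FileOutputStream(file), NativeCharset.get())`,
which compiles on JDK 8 as well.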
>
> I could try to find all the method invocations that currently use the
> implicit default charset, although I have no idea how to do this other
> than reading every line of code. The problems with this are 1) that I
> would almost certainly miss some invocations that need to be changed and
> 2) more seriously, from what I have seen in the JEP I don't think there
> is a way to update these method invocations that works exactly as at
> present on all versions of Java back to JDK 8 and provides the same
> behaviour on JDK 18. This is because (as far as I can tell) there is no
> API call that returns the "old-style" platform default charset and can
> be used on all JDK versions from JDK 8 to JDK 18. Adding a -D option
> when the application is started isn't possible in some contexts such as
> launching the application from a Windows executable jar file association.
>
> If I could make a single API call when the application is first started
> to force backward-compatible behaviour in all cases, this would solve
> the problem. This feels very much like a "hack" and I would much prefer
> a clean solution but it would be better than nothing.
I am reluctant to provide a single API call with the same effect as
`COMPAT`, as it will become less meaningful once UTF-8 as the default
sinks in.
Naoto