Insufficiencies in JEP: 400: UTF-8 by Default

Sun Mar 14 12:01:54 UTC 2021

On 14/03/2021 11:00, Marco wrote:
> :
>
> IMO Charset should provide standardized getters for the OS charset and the
> console charset. The latter being different has been a long standing issue on
> Windows where the codepage differs between its CLI and regular environments.
> OpenJDK has the necessary data already available in its custom system
> properties.
>
> The console charset is currently hidden behind PrintStream not exposing the
> underlying OSWriter and not offering getEncoding() itself. The OS charset
> would be hidden in the future by Charset.getDefaultCharset()'s specification
> change in JEP 400.
The intention is that there will be little or no impact to the console 
streams. This means that java.io.Console reader/writer methods should 
continue to return a Reader/PrintWriter that uses the platform encoding 
(or code page on Windows). Same thing for the System.out/System.err 
print streams. We need to make this clearer in the JEP.
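
To illustrate why the charset behind a console stream matters, here is a
minimal sketch (not anything from the JEP). The class and helper names are
made up for the example, and the two charsets are arbitrary stand-ins for
"a platform code page" and "UTF-8". It only shows that the same string
produces different bytes depending on the charset a PrintStream was
created with.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ConsoleEncodingSketch {
    // Hypothetical helper: returns the bytes a PrintStream emits for `text`
    // when the stream is constructed with the given charset.
    static byte[] encodeViaPrintStream(String text, Charset cs) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // PrintStream(OutputStream, boolean, Charset) requires Java 10 or later.
        try (PrintStream out = new PrintStream(sink, true, cs)) {
            out.print(text);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        // ISO-8859-1 stands in for a legacy platform encoding (assumption).
        System.out.println(encodeViaPrintStream(text, StandardCharsets.ISO_8859_1).length); // 1
        System.out.println(encodeViaPrintStream(text, StandardCharsets.UTF_8).length);      // 2
    }
}
```

A terminal that expects one of these byte sequences will show garbage if it
is sent the other, which is the reason for leaving the console streams on
the platform encoding.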

There has been discussion on this mailing list about adding a 
Console::charset method but it didn't come to a consensus. Naoto Sato 
and I have been chatting about it again recently as there may be a need 
to add an API in advance of proposing to target the JEP.

One case that we are still mulling over is code that creates an 
InputStreamReader on System.in without specifying the charset. This 
might be older code that pre-dates java.io.Console or maybe code that 
wasn't tested on a wide range of platforms. Options range from a spec 
change to doing nothing. Doing nothing would mean either running with 
file.encoding set to "COMPAT" or migrating the code to use the 2-arg 
constructor, as the default charset is not the right choice for 
System.in.
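
A rough sketch of what migrating to the 2-arg constructor could look like.
The class and helper names are hypothetical, and ISO-8859-1 is only an
assumed stand-in for whatever encoding the input really uses; how to
discover the right charset for System.in is exactly the open question. A
ByteArrayInputStream plays the role of System.in so the example is
self-contained.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class StdinReaderSketch {
    // Hypothetical helper: reads the first line from `in`, decoding with an
    // explicitly supplied charset instead of relying on the default charset.
    static String firstLine(InputStream in, Charset cs) {
        try {
            // The 2-arg constructor: InputStreamReader(InputStream, Charset).
            BufferedReader reader = new BufferedReader(new InputStreamReader(in, cs));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for System.in: the single byte 0xE9 followed by a newline.
        InputStream fakeStdin = new ByteArrayInputStream(new byte[] {(byte) 0xE9, '\n'});
        // Assumption: the input is known to be ISO-8859-1.
        String line = firstLine(fakeStdin, StandardCharsets.ISO_8859_1);
        System.out.println(line.equals("\u00e9")); // true
    }
}
```

In real code the first argument would be System.in. The 1-arg form,
new InputStreamReader(System.in), is the one whose behavior follows the
default charset.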

-Alan