A bug in filesystem bootstrap (unix/ linux) prevents

Xueming Shen xueming.shen at oracle.com
Thu Jul 5 18:02:11 UTC 2012


On 07/05/2012 01:40 AM, Dawid Weiss wrote:
>> export LC_ALL=en_US.UTF-8; java Foo
> Not really, the shell won't let you use a multibyte locale (because of
> issues with null-terminated strings). And multibyte (with BOM) is most
> fun when you're trying to find buggy code ;)

Encodings for the Chinese, Japanese, and Korean locales are all
"multibyte". UTF-8 is a multibyte encoding as well, and most recent
Unix/Linux platforms should have no problem working with a "multibyte"
locale. In fact, UTF-16 is normally not categorized as multibyte (mb)
but as wide char (wc). There are reasons why you (normally) only see
UTF-8 locales and no UTF-16 locales on Unix/Linux-based platforms, and
why you have a "W" version and an "A" version of APIs (and even a "T"
version for some APIs) on the Windows platform.
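
A minimal Java sketch of the multibyte vs. wide-char distinction. The
class name MultibyteDemo, the helper countZeroBytes, and the sample
string are assumptions made up for illustration. One commonly cited
reason UTF-16 does not fit a char*-based locale is that its encoding of
plain ASCII text contains zero bytes, which null-terminated string APIs
would treat as end-of-string; UTF-8 avoids that.

```java
import java.nio.charset.StandardCharsets;

public class MultibyteDemo {
    // Counts how many bytes in the encoded form are 0x00.
    static int countZeroBytes(byte[] bytes) {
        int n = 0;
        for (byte b : bytes) {
            if (b == 0) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Hypothetical sample: three ASCII letters plus one CJK character.
        String s = "Foo\u4e2d";

        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        byte[] utf16 = s.getBytes(StandardCharsets.UTF_16BE);

        // UTF-8: variable length per character, no zero bytes for this input.
        System.out.println("UTF-8    bytes=" + utf8.length
                + " zeroBytes=" + countZeroBytes(utf8));

        // UTF-16BE: two bytes per char here, ASCII letters carry a 0x00 high byte.
        System.out.println("UTF-16BE bytes=" + utf16.length
                + " zeroBytes=" + countZeroBytes(utf16));
    }
}
```

For this input the UTF-8 form is 6 bytes with no zero bytes, while the
UTF-16BE form is 8 bytes, 3 of which are zero.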

I agree it might be helpful to have a mechanism for changing the
"default charset" used by various Java APIs, similar to what you do
with Locale.setDefault(). With the introduction of sun.jnu.encoding
(which takes over responsibility for the encoding the JVM uses to talk
to the underlying OS APIs), it might be possible to reduce the scope of
the system property file.encoding to only the default encoding of the
"file content" and do something here, but it is not on the priority
list for now.
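
A small Java sketch of the asymmetry described above, under stated
assumptions: the class name DefaultCharsetDemo and the helper roundTrip
are hypothetical. It shows that the default locale can be changed at
runtime, that the default charset can only be queried through the
public API, and that passing an explicit Charset sidesteps the default
entirely.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class DefaultCharsetDemo {
    // Encodes and decodes with an explicit charset, so the result
    // does not depend on the platform default (file.encoding).
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // The default locale has a public setter.
        Locale.setDefault(Locale.JAPAN);
        System.out.println("default locale:  " + Locale.getDefault());

        // The default charset has no public setter; it can only be read.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        // An explicit charset round-trips the text regardless of the default.
        String sample = "\u65e5\u672c\u8a9e";
        System.out.println("explicit UTF-8 round trip ok: "
                + sample.equals(roundTrip(sample, StandardCharsets.UTF_8)));
    }
}
```

Code that always passes a Charset to getBytes, new String, and the
reader/writer constructors is unaffected by whatever the default
happens to be.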

-Sherman

Btw, I need to make it clear here that sun.jnu.encoding is purely an
implementation detail; applications are not supposed to use it for any
purpose.

>>   I would assume there is no en_US.UTF-16 locale there :-)
> I wish there were. It'd make people care more ;)
>
> Dawid




More information about the core-libs-dev mailing list