A bug in filesystem bootstrap (unix/ linux) prevents
Xueming Shen
xueming.shen at oracle.com
Thu Jul 5 18:02:11 UTC 2012
On 07/05/2012 01:40 AM, Dawid Weiss wrote:
>> export LC_ALL=en_US.UTF-8; java Foo
> Not really, the shell won't let you use a multibyte locale (because of
> issues with null-terminated strings). And multibyte (with BOM) is most
> fun when you're trying to find buggy code ;)
Encodings for the Chinese, Japanese, and Korean locales are all
"multibyte". UTF-8 is a multibyte encoding too, and most recent
Unix/Linux platforms should have no problem working with a "multibyte"
locale. In fact, UTF-16 is normally not categorized as multibyte (mb)
but as wide char (wc). There are reasons why you (normally) only see
UTF-8 locales but no UTF-16 locales on Unix/Linux-based platforms, and
why you have "W" versions of APIs and "A" versions of APIs (and even
"T" versions for some APIs) on the Windows platform.
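
To make the multibyte vs. wide-char distinction concrete, here is a small
sketch (the class and method names are made up for illustration). It
compares the encoded size of an ASCII letter and a CJK character in UTF-8
and UTF-16, and checks for 0x00 bytes, which is one reason a UTF-16 locale
does not fit APIs built around NUL-terminated char strings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingWidths {
    // Bytes needed to encode the text as UTF-8 (1 to 4 bytes per code point).
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed to encode the text as big-endian UTF-16 without a BOM
    // (2 bytes per 16-bit code unit).
    static int utf16Length(String text) {
        return text.getBytes(StandardCharsets.UTF_16BE).length;
    }

    // True if the encoded form contains a 0x00 byte, which a C API that
    // expects NUL-terminated char strings would read as end of string.
    static boolean hasZeroByte(String text, Charset charset) {
        for (byte b : text.getBytes(charset)) {
            if (b == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String ascii = "A";
        String cjk = "\u4e2d"; // a single CJK ideograph, U+4E2D

        System.out.println("UTF-8  bytes: A=" + utf8Length(ascii)
                + ", U+4E2D=" + utf8Length(cjk));
        System.out.println("UTF-16 bytes: A=" + utf16Length(ascii)
                + ", U+4E2D=" + utf16Length(cjk));
        System.out.println("UTF-8  'A' has a zero byte: "
                + hasZeroByte(ascii, StandardCharsets.UTF_8));
        System.out.println("UTF-16 'A' has a zero byte: "
                + hasZeroByte(ascii, StandardCharsets.UTF_16BE));
    }
}
```

UTF-8 keeps ASCII at one byte per character and never produces a zero byte
for non-NUL text, so it passes through char-based APIs unchanged. UTF-16
puts a zero byte in every ASCII character.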
I agree it might be helpful to have a mechanism for changing the
"default charset" used by various Java APIs, similar to what you do
with Locale.setDefault(). With the introduction of sun.jnu.encoding
(which takes over responsibility for the encoding the JVM uses to talk
to the underlying OS APIs), it might be possible to reduce the scope of
the system property file.encoding to just the default encoding of "file
content" and do something here, but it is not on the priority list for
now.
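
As a rough illustration of the asymmetry (a sketch only; the class and
method names are hypothetical): the default locale can be swapped at
runtime, while the default charset is fixed during JVM startup, so code
that must not depend on it has to name a charset explicitly.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class DefaultCharsetDemo {
    // Encode and decode with an explicitly named charset, so the result
    // does not depend on the platform default.
    static String roundTrip(String text, Charset charset) {
        byte[] encoded = text.getBytes(charset);
        return new String(encoded, charset);
    }

    // Same round trip through whatever default charset the JVM picked at
    // startup; the result varies with the environment.
    static String roundTripWithDefault(String text) {
        return new String(text.getBytes());
    }

    public static void main(String[] args) {
        // The default locale has a public setter...
        Locale original = Locale.getDefault();
        Locale.setDefault(Locale.FRANCE);
        System.out.println("default locale now: " + Locale.getDefault());
        Locale.setDefault(original);

        // ...but the default charset can only be read.
        System.out.println("file.encoding   = "
                + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());

        String cjk = "\u4e2d"; // U+4E2D
        System.out.println("explicit UTF-8 round trip intact: "
                + cjk.equals(roundTrip(cjk, StandardCharsets.UTF_8)));
        System.out.println("default charset round trip intact: "
                + cjk.equals(roundTripWithDefault(cjk)));
    }
}
```

The last line prints true or false depending on the locale the JVM was
started in, which is exactly the environment dependence discussed here.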
-Sherman
Btw, I need to make it clear here that sun.jnu.encoding is purely an
implementation detail; applications are not supposed to use it for any
purpose.
>> I would assume there is no en_US.UTF-16 locale there :-)
> I wish there were. It'd make people care more ;)
>
> Dawid