java encoding charset suggestion

Martin Buchholz martinrb at google.com
Mon Mar 18 18:24:21 UTC 2013


It *would* be nice if the world agreed on using UTF-8 as a universal
encoding for all text.  However:

The POSIX standard says:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html

"""If the LANG environment variable is not set or is set to the empty
string, the implementation-defined default locale shall be used."""

But I think the operating system should set the default, not the
application.  On my Ubuntu system, with the locale variables unset, I
see the traditional POSIX (ASCII, English) default:

$ (unset LC_ALL LC_COLLATE LANG LANGUAGE GDM_LANG; locale)
LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
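
To see how this plays out on the Java side, here is a minimal sketch (not
part of the original exchange) that prints the charset the JVM settled on
and shows the usual workaround of naming the charset explicitly. The class
and method names are hypothetical; the value printed for the default
depends on the JDK version and on the environment the JVM was started in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
public class DefaultCharsetCheck {

    // Reports the charset the JVM chose at startup.
    static String describeDefault() {
        return "default charset: " + Charset.defaultCharset().name();
    }

    // Encodes with an explicit charset, so the result does not depend
    // on LANG, LC_ALL, or any other locale setting.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(describeDefault());
        // getenv returns null when the variable is unset.
        System.out.println("LANG=" + System.getenv("LANG"));
        // U+00E9 (e with acute accent) takes two bytes in UTF-8.
        System.out.println(encodeUtf8("\u00e9").length + " bytes");
    }
}
```

Running it once normally and once as `(unset LC_ALL LANG; java DefaultCharsetCheck)`
lets you compare what the JVM picks in each environment. Code that passes
the charset explicitly, as encodeUtf8 does, behaves the same either way.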


On Mon, Mar 18, 2013 at 11:09 AM, Helio Frota <heliofrota at gmail.com> wrote:

>
> I would suggest taking en_US.UTF-8 as default when the LANG variable is not
> set to avoid problems with encoding.
>



More information about the core-libs-dev mailing list