System properties
Xueming Shen
xueming.shen at oracle.com
Sun May 27 18:29:17 PDT 2012
On 5/27/2012 3:46 PM, Michael Hall wrote:
>> I mean if you run the JVM in the C or POSIX locale, file.encoding is set to US-ASCII by Oracle 7u4/7u6, which
>> is expected.
>>
>> -Sherman
> OK, since locale has been indicated as the determiner.
>
> Consider….
>
> public class LocaleTester {
>
> public static void main(String[] args) {
> System.out.println("java.version="+System.getProperty("java.version"));
> System.out.println("file.encoding="+System.getProperty("file.encoding"));
> System.out.println(rtexec(new String[] { "locale" }));
> }
>
> private static String rtexec(String[] args) {
> try {
> StringBuilder execout = new StringBuilder();
> Process proc = Runtime.getRuntime().exec(args);
> java.io.InputStream inout = proc.getInputStream();
> java.io.InputStream inerr = proc.getErrorStream();
> byte[] buffer = new byte[256];
> // Read the output before waiting; waiting first can hang if the child fills its pipe.
> while (true) {
> int stderrLen = inerr.read(buffer, 0, buffer.length);
> if (stderrLen > 0) {
> execout.append(new String(buffer, 0, stderrLen));
> }
> int stdoutLen = inout.read(buffer, 0, buffer.length);
> if (stdoutLen > 0) {
> execout.append(new String(buffer, 0, stdoutLen));
> }
> if (stderrLen < 0 && stdoutLen < 0)
> break;
> }
> proc.waitFor();
> return execout.toString();
> }
> catch(Throwable tossed) { tossed.printStackTrace(); }
> return "-";
> }
> }
>
> Run Terminal 1.7 (Should be an Oracle build at the moment)
> /usr/libexec/java_home -v 1.7 --exec java LocaleTester
> java.version=1.7.0_04
> file.encoding=UTF-8
> LANG="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_CTYPE="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_ALL=
>
> Run Terminal 1.6
> /usr/libexec/java_home -v 1.6 --exec java LocaleTester
> java.version=1.6.0_31
> file.encoding=MacRoman
> LANG="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_CTYPE="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_ALL=
>
> Run using built-in functionality from my application (Again, this should be an Oracle 1.7 build)
> set java.version
> java.version=1.7.0_04
> set file.encoding
> file.encoding=US-ASCII
> exec locale
> LANG=
> LC_COLLATE="C"
> LC_CTYPE="C"
> LC_MESSAGES="C"
> LC_MONETARY="C"
> LC_NUMERIC="C"
> LC_TIME="C"
> LC_ALL=
>
> Apple 1.6 never seems to give UTF-8, contrary to what Andrew said. It is always MacRoman. Either he was not accurate in what he said, or he is changing the property to UTF-8 somewhere he isn't aware of.
>
> 1.7 gives UTF-8 or US-ASCII depending on how it is run. From the Terminal command line you get UTF-8.
> From an application, for some reason, you get the C/POSIX locale and US-ASCII for the encoding.
>
> According to Xueming Shen, who I'm guessing has some Java responsibility involving these things, all of this is correct for the given locales, except as I recall he agreed the prior MacRoman might be problematic. Andrew Thompson's other points may of course be valid and worth addressing.
> I still have no problems or issues, just a little curiosity as to why command-line Java and application Java are run differently, with different locales, and end up with different properties.
>
Yes, file.encoding is set based on the locale the JVM is running in. If
you run your Java VM in an xxxx.UTF-8 locale, file.encoding should be
UTF-8; setting it to MacRoman would be problematic. So I believe
Oracle/OpenJDK 7 behaves correctly. That said, as Andrew Thompson pointed
out, if all previous Apple JDK releases used MacRoman as the file.encoding
for the English/UTF-8 locale, there is a "compatibility" concern here. It
might be worth putting something in the release notes to give
Oracle/OpenJDK Mac OS users a heads-up.

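To check what a given JVM actually picked up, a minimal sketch along these
lines may help. The class name EncodingReport and its describe() method are
made up for illustration; System.getProperty, System.getenv and
Charset.defaultCharset() are standard Java APIs. Running it with different
LANG/LC_ALL settings lets you compare the results.

```java
import java.nio.charset.Charset;

public class EncodingReport {

    // Hypothetical helper: summarizes the encoding-related state of this JVM.
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + " defaultCharset=" + Charset.defaultCharset().name()
             + " LANG=" + System.getenv("LANG")
             + " LC_ALL=" + System.getenv("LC_ALL");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```
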
sun.jnu.encoding should be the same as file.encoding on Unix-like
platforms. That "internal" encoding was introduced in JDK 6 (? if I
remember correctly) to address the problem of differing system locale and
user locale settings on Windows: when the two are set to different values,
the encoding for file "content" is not the same as the encoding used by
Windows' "A" version APIs (the ANSI version, vs. the "W"/Unicode version).
Again, this is an internal property/implementation detail; never ever try
to use it in your app :-)

I don't know how your "application java" works, but on a Unix platform it
is NOT unusual to run system scripts in the C locale; actually I think most
of them run in the C locale. This is something "configurable" (the default
system locale). My guess is that either your "application" is configured to
run in the C locale, or your system is configured to have the C locale as
its default system locale and your "application" runs in that default
locale (again, this is USUAL if your "application" is some "service
provider" kind of app, such as an httpd). Therefore any process (including
Java) started from your "application" inherits C as its running locale.
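
If the application controls how it launches child processes, one possible
way to avoid inheriting the C locale is to set the locale in the child's
environment explicitly. The sketch below assumes a POSIX system where the
en_US.UTF-8 locale is installed and the "locale" command is available; the
ChildLocale class and withLocale helper are hypothetical names, while
ProcessBuilder.environment() and inheritIO() are standard Java 7 APIs.

```java
import java.io.IOException;
import java.util.Map;

public class ChildLocale {

    // Hypothetical helper: overrides the locale the child process will see.
    // LC_ALL takes precedence over LANG and the other LC_* variables.
    public static ProcessBuilder withLocale(ProcessBuilder pb, String locale) {
        Map<String, String> env = pb.environment();
        env.put("LC_ALL", locale);
        env.put("LANG", locale);
        return pb;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Assumption: "locale" is on the PATH, as on Mac OS X and Linux.
        ProcessBuilder pb = withLocale(new ProcessBuilder("locale"), "en_US.UTF-8");
        pb.inheritIO(); // send the child's stdout/stderr straight to ours
        int exit = pb.start().waitFor();
        System.out.println("exit=" + exit);
    }
}
```

Passing a builder for "java LocaleTester" instead of "locale" would be one
way to see whether file.encoding in the child JVM follows the overridden
locale.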
-Sherman