Java 8 RFR 8011194: Apps launched via double-clicked .jars have file.encoding value of US-ASCII on Mac OS X

Mike Swingler swingler at
Tue Jul 30 22:53:30 UTC 2013

Apple is highly unlikely to change the behavior of nl_langinfo().

There is already code in the JDK that calls into JRSCopyPrimaryLanguage(), JRSCopyCanonicalLanguageForPrimaryLanguage(), and JRSSetDefaultLocalization() for exactly this purpose.

Please proceed with setting the encoding to UTF-8. It is the de-facto standard for every Cocoa application I have ever seen. US-ASCII is always the wrong choice for a graphical app on OS X.

Mike Swingler
Apple Inc.

On Jul 30, 2013, at 9:05 AM, Francis Devereux <francis at> wrote:

> I suspect that Apple might be unlikely to change the value that nl_langinfo returns when LANG is unset.
> However, it might be possible to fix this issue without second-guessing the character set reported by the OS by calling [NSLocale currentLocale] (or the CFLocale equivalent) instead of nl_langinfo. I think (although I haven't checked) that [NSLocale currentLocale] determines the current locale using a mechanism other than environment variables, because LANG is usually unset for GUI apps on OS X.
> On 30 Jul 2013, at 15:56, Scott Palmer <swpalmer at> wrote:
>> Then shouldn't you be complaining to Apple that the value returned by
>> nl_langinfo needs to be changed?
>> David's point seems to be that second guessing the character set reported
>> by the OS is likely to cause a different set of problems.
>> Scott
>> On Tue, Jul 30, 2013 at 10:14 AM, Johannes Schindelin <
>> Johannes.Schindelin at> wrote:
>>> Hi,
>>> On Tue, 30 Jul 2013, David Holmes wrote:
>>>> On 30/07/2013 5:54 AM, Brent Christian wrote:
>>>>> On 7/28/13 10:13 PM, David Holmes wrote:
>>>>>> On 27/07/2013 3:53 AM, Brent Christian wrote:
>>>>>>> Please review my fix for 8011194 : "Apps launched via double-clicked
>>>>>>> .jars have file.encoding value of US-ASCII on Mac OS X"
>>>>>>> In most cases of launching a Java app on Mac (from the cmdline, or
>>>>>>> from a native .app bundle), reading and displaying UTF-8
>>>>>>> characters beyond the standard ASCII range works fine.
>>>>>>> A notable exception is the launching of an app by double-clicking
>>>>>>> a .jar file.  In this case, file.encoding defaults to US-ASCII,
>>>>>>> and characters outside of the ASCII range show up as garbage.
>>>>>> Why does this occur? What sets the encoding to US-ASCII?
>>>>> "US-ASCII" is the answer we get from nl_langinfo(CODESET) because no
>>>>> values for LANG/LC* are set in the environment when double-clicking a
>>>>> .jar.
>>>>> We get "UTF-8" when launching from the command line because the
>>>>> default setup on Mac will set up LANG for you (to
>>>>> "en_US.UTF-8" in the US).
>>>> Sounds like a user environment error to me. This isn't my area but I'm
>>>> not convinced we should be second guessing what we think the encoding
>>>> should be.
>>> Except that that is not the case here, of course. The user did *not* set
>>> any environment variable in this case.
>>> So we are not talking about "second guessing" or "user environment error"
>>> but about a sensible default.
>>> As to US-ASCII, sorry to say: the seventies called and want their
>>> character set back.
>>> There can be no question that UTF-8 is the best default character
>>> encoding, or are you even going to question *that*?
>>>> What if someone intends for it to be US-ASCII?
>>> Then LANG would not be unset, would it.
>>> Hth,
>>> Johannes
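
For readers following the thread, the default being argued for can be sketched in a few lines of Java. This is only an illustration of the rule discussed above (fall back to UTF-8 when the OS reports US-ASCII and none of the locale variables are set), not the change under review. The class name, the method chooseDefaultEncoding, and the choice to check LC_ALL, LC_CTYPE and LANG are assumptions made for this example.

```java
import java.util.Map;

public class DefaultEncodingSketch {

    // Hypothetical helper, not part of the JDK.
    // reportedCodeset stands in for the answer from nl_langinfo(CODESET);
    // env stands in for the process environment.
    static String chooseDefaultEncoding(String reportedCodeset, Map<String, String> env) {
        boolean localeConfigured = false;
        // Assumption: these three variables are the ones that count as
        // "the user asked for a specific locale".
        for (String name : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            String value = env.get(name);
            if (value != null && !value.isEmpty()) {
                localeConfigured = true;
            }
        }
        // Override only the unconfigured case, so that someone who
        // deliberately sets LANG to get US-ASCII still gets US-ASCII.
        if (!localeConfigured && "US-ASCII".equals(reportedCodeset)) {
            return "UTF-8";
        }
        return reportedCodeset;
    }

    public static void main(String[] args) {
        // The result depends on the environment the JVM was launched from.
        System.out.println(chooseDefaultEncoding("US-ASCII", System.getenv()));
    }
}
```

Passing the environment in as a map keeps the rule easy to exercise for both launch paths described in the thread: an empty map models a double-clicked .jar, and a map containing LANG models a launch from the command line.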

More information about the core-libs-dev mailing list