Java 8 RFR 8011194: Apps launched via double-clicked .jars have file.encoding value of US-ASCII on Mac OS X

Francis Devereux francis at
Tue Jul 30 16:05:16 UTC 2013

I suspect that Apple is unlikely to change the value that nl_langinfo returns when LANG is unset.

However, it might be possible to fix this issue without second-guessing the character set reported by the OS by calling [NSLocale currentLocale] (or the CFLocale equivalent) instead of nl_langinfo. I think (although I haven't checked) that [NSLocale currentLocale] determines the current locale using a mechanism other than environment variables, because LANG is usually unset for GUI apps on OS X.
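To make the behaviour under discussion concrete, here is a rough C sketch of the lookup described in the quoted thread below: nl_langinfo(CODESET) is queried after adopting the environment's locale, and a fallback to UTF-8 is applied only when none of the locale variables is set. The function names (reported_codeset, locale_env_is_unset, choose_encoding) are made up for illustration, and the fallback policy is an assumption about what a fix could look like, not the code under review. The comparison against "US-ASCII" relies on the name the thread reports for Mac OS X; other C libraries may use a different name for the same codeset.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Returns the codeset name the C library reports for the environment's
 * locale. The returned string may be overwritten by later calls to
 * setlocale() or nl_langinfo(). */
const char *reported_codeset(void) {
    setlocale(LC_ALL, "");          /* adopt LANG / LC_* from the environment */
    return nl_langinfo(CODESET);
}

/* Returns 1 when none of LC_ALL, LC_CTYPE or LANG has a non-empty value,
 * which is the situation described for double-clicked .jar files. */
int locale_env_is_unset(void) {
    static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
    for (size_t i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *value = getenv(vars[i]);
        if (value != NULL && value[0] != '\0') {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical policy: treat "US-ASCII" as "nobody chose an encoding" only
 * when the locale environment is unset; otherwise trust the reported value. */
const char *choose_encoding(const char *reported, int env_unset) {
    if (env_unset && reported != NULL && strcmp(reported, "US-ASCII") == 0) {
        return "UTF-8";
    }
    return reported;
}
```

A caller would combine these as choose_encoding(reported_codeset(), locale_env_is_unset()). Someone who explicitly sets LANG to an ASCII locale would still get US-ASCII under this policy, since the fallback only applies when the environment is silent.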

On 30 Jul 2013, at 15:56, Scott Palmer <swpalmer at> wrote:

> Then shouldn't you be complaining to Apple that the value returned by
> nl_langinfo needs to be changed?
> David's point seems to be that second guessing the character set reported
> by the OS is likely to cause a different set of problems.
> Scott
> On Tue, Jul 30, 2013 at 10:14 AM, Johannes Schindelin <
> Johannes.Schindelin at> wrote:
>> Hi,
>> On Tue, 30 Jul 2013, David Holmes wrote:
>>> On 30/07/2013 5:54 AM, Brent Christian wrote:
>>>> On 7/28/13 10:13 PM, David Holmes wrote:
>>>>> On 27/07/2013 3:53 AM, Brent Christian wrote:
>>>>>> Please review my fix for 8011194 : "Apps launched via double-clicked
>>>>>> .jars have file.encoding value of US-ASCII on Mac OS X"
>>>>>> In most cases of launching a Java app on Mac (from the cmdline, or
>>>>>> from a native .app bundle), reading and displaying UTF-8
>>>>>> characters beyond the standard ASCII range works fine.
>>>>>> A notable exception is the launching of an app by double-clicking
>>>>>> a .jar file.  In this case, file.encoding defaults to US-ASCII,
>>>>>> and characters outside of the ASCII range show up as garbage.
>>>>> Why does this occur? What sets the encoding to US-ASCII?
>>>> "US-ASCII" is the answer we get from nl_langinfo(CODESET) because no
>>>> values for LANG/LC* are set in the environment when double-clicking a
>>>> .jar.
>>>> We get "UTF-8" when launching from the command line because the
>>>> default setup on Mac will set up LANG for you (to
>>>> "en_US.UTF-8" in the US).
>>> Sounds like a user environment error to me. This isn't my area but I'm
>>> not convinced we should be second guessing what we think the encoding
>>> should be.
>> Except that that is not the case here, of course. The user did *not* set
>> any environment variable in this case.
>> So we are not talking about "second guessing" or "user environment error"
>> but about a sensible default.
>> As to US-ASCII, sorry to say: the seventies called and want their
>> character set back.
>> There can be no question that UTF-8 is the best default character
>> encoding, or are you even going to question *that*?
>>> What if someone intends for it to be US-ASCII?
>> Then LANG would not be unset, would it.
>> Hth,
>> Johannes
