Java 8 RFR 8011194: Apps launched via double-clicked .jars have file.encoding value of US-ASCII on Mac OS X

Scott Palmer swpalmer at gmail.com
Tue Jul 30 14:56:59 UTC 2013


Then shouldn't you be complaining to Apple that the value returned by
nl_langinfo needs to be changed?
David's point seems to be that second guessing the character set reported
by the OS is likely to cause a different set of problems.

Scott


On Tue, Jul 30, 2013 at 10:14 AM, Johannes Schindelin <
Johannes.Schindelin at gmx.de> wrote:

> Hi,
>
> On Tue, 30 Jul 2013, David Holmes wrote:
>
> > On 30/07/2013 5:54 AM, Brent Christian wrote:
> > > On 7/28/13 10:13 PM, David Holmes wrote:
> > > > On 27/07/2013 3:53 AM, Brent Christian wrote:
> > > > > > Please review my fix for 8011194 : "Apps launched via
> > > > > > double-clicked .jars have file.encoding value of US-ASCII on
> > > > > > Mac OS X"
> > > > >
> > > > > http://bugs.sun.com/view_bug.do?bug_id=8011194
> > > > >
> > > > > In most cases of launching a Java app on Mac (from the cmdline, or
> > > > > from a native .app bundle), reading and displaying UTF-8
> > > > > characters beyond the standard ASCII range works fine.
> > > > >
> > > > > A notable exception is the launching of an app by double-clicking
> > > > > a .jar file.  In this case, file.encoding defaults to US-ASCII,
> > > > > and characters outside of the ASCII range show up as garbage.
> > > >
> > > > Why does this occur? What sets the encoding to US-ASCII?
> > >
> > > "US-ASCII" is the answer we get from nl_langinfo(CODESET) because no
> > > values for LANG/LC* are set in the environment when double-clicking a
> > > .jar.
> > >
> > > We get "UTF-8" when launching from the command line because the
> > > > default Terminal.app setup on Mac will set up LANG for you (to
> > > "en_US.UTF-8" in the US).
> >
> > Sounds like a user environment error to me. This isn't my area but I'm
> > not convinced we should be second guessing what we think the encoding
> > should be.
>
> Except that that is not the case here, of course. The user did *not* set
> any environment variable in this case.
>
> So we are not talking about "second guessing" or "user environment error"
> but about a sensible default.
>
> As to US-ASCII, sorry to say: the seventies called and want their
> character set back.
>
> There can be no question that UTF-8 is the best default character
> encoding, or are you even going to question *that*?
>
> > What if someone intends for it to be US-ASCII?
>
> Then LANG would not be unset, would it?
>
> Hth,
> Johannes
>
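
For reference, the fallback being argued over in the quoted thread can be
sketched roughly as below. This is only an illustration of the idea (use
UTF-8 when the OS reports US-ASCII and none of LC_ALL, LC_CTYPE or LANG is
set), not the actual change under review. The class name
DefaultEncodingSketch and the method chooseEncoding are hypothetical, and
the real decision would be made in the JDK's native startup code around
nl_langinfo(CODESET) rather than in Java.

```java
import java.util.Map;

public class DefaultEncodingSketch {

    /**
     * Hypothetical helper (not part of the JDK): decide which encoding to
     * use given the codeset reported by the OS and the process environment.
     */
    public static String chooseEncoding(String osCodeset, Map<String, String> env) {
        // The locale counts as "unset" only if none of the usual variables
        // carries a value, which is the double-clicked .jar situation.
        boolean localeUnset = isBlank(env.get("LC_ALL"))
                && isBlank(env.get("LC_CTYPE"))
                && isBlank(env.get("LANG"));

        // Override only the "nothing configured, OS says US-ASCII" case.
        if (localeUnset && "US-ASCII".equalsIgnoreCase(osCodeset)) {
            return "UTF-8";
        }
        // Otherwise trust what the OS reported, including an explicit
        // request for US-ASCII (for example LANG=C).
        return osCodeset;
    }

    private static boolean isBlank(String s) {
        return s == null || s.isEmpty();
    }

    public static void main(String[] args) {
        String reported = System.getProperty("file.encoding");
        System.out.println("file.encoding reported: " + reported);
        System.out.println("sketch would choose:    "
                + chooseEncoding(reported, System.getenv()));
    }
}
```

Under this sketch a user who deliberately sets LANG=C still gets US-ASCII,
since the override applies only when no locale variable is set at all.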


