file.encoding vs. sun.jnu.encoding(?) on OS X

Alan Bateman Alan.Bateman at oracle.com
Fri Nov 9 05:12:21 PST 2012


On 08/11/2012 19:25, Scott Kovatch wrote:
> Hello,
>
> I want to bring up something that is causing a lot of confusion, and is generating a lot of bugs on OS X.
>
> What is the relationship between path names and file.encoding? Or, maybe more correctly, _why_ is there some relationship between path names and file.encoding? On OS X filenames are ALWAYS in UTF-8, so the current locale should never come into play.
>
> I was about to launch into a discussion (rant) about our use of nl_langinfo(CODESET) for file.encoding, but the more I look into it, I don't think that's the problem, though you can also make a case that all text files on OS X are UTF-8 by default as well. I'm wondering if this has something to do with sun.jnu.encoding being set to the same value as file.encoding.
>
> -- Scott K.
>
I've seen several mails on macosx-port-dev about this, although I think 
several issues have been conflated, which makes for confusing reading.

One of the issues is that HFS normalizes to a variant of NFD and the 
changes that came via the Mac port weren't right. Sherman has 
re-implemented this via 7130915 in jdk8 and it has been back-ported to 
7u for 7u12. We've had confirmation from several people that this 
resolves the issues that they were seeing.
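
To illustrate why normalization matters for file names, here is a minimal sketch using java.text.Normalizer. It is not the code from 7130915. It uses standard Unicode NFD, which only approximates the variant HFS applies, and the class and method names (NfdDemo, toNfd, toNfc) are made up for this example.

```java
import java.text.Normalizer;

public class NfdDemo {
    // Decompose to standard Unicode NFD. This is an approximation only:
    // the file system uses its own variant of NFD.
    static String toNfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Recompose to NFC, the form most Java string literals and user input use.
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";        // e-acute as a single code point
        String decomposed = toNfd(composed);  // 'e' followed by U+0301

        System.out.println(composed.length());                    // 4
        System.out.println(decomposed.length());                  // 5
        System.out.println(composed.equals(decomposed));          // false
        System.out.println(composed.equals(toNfc(decomposed)));   // true
    }
}
```

A name handed to the file system in composed form can therefore come back from a directory listing in decomposed form, so a plain String.equals comparison between the two fails unless both sides are normalized first.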

The other issue, and I think the issue that you are trying to get at, is 
that sun.jnu.encoding is being set based on the locale whereas you are 
saying that it should always be UTF-8 on Mac. I think we need to create 
a bug on that and it would be great if you can get technical references 
so that we know this is the right thing to do. There are at least two 
places in the property initialization that would need to be updated to 
do this. I don't think we should change file.encoding as that would 
change the default encoding for the file contents whereas the issues all 
seem to be related to the encoding/decoding of file names.
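
As a rough way to see the two properties side by side, the sketch below prints them along with the default charset, and shows that the same file name encodes to different bytes under different charsets. The class and method names (EncodingProps, describe, encodedLength) are hypothetical, and sun.jnu.encoding is an internal property, so it may be absent or behave differently on other JDK builds.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingProps {
    // Report the properties discussed above. sun.jnu.encoding is internal
    // and unsupported; getProperty returns null if it is not set.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + " sun.jnu.encoding=" + System.getProperty("sun.jnu.encoding")
             + " defaultCharset=" + Charset.defaultCharset().name();
    }

    // Number of bytes a name occupies when encoded with the given charset.
    static int encodedLength(String name, Charset cs) {
        return name.getBytes(cs).length;
    }

    public static void main(String[] args) {
        System.out.println(describe());

        String name = "caf\u00e9";
        // The same name yields different byte sequences per charset, which
        // is why the charset used for path names has to match what the
        // platform expects.
        System.out.println(encodedLength(name, StandardCharsets.UTF_8));      // 5
        System.out.println(encodedLength(name, StandardCharsets.ISO_8859_1)); // 4
    }
}
```

Running this under different locale settings (for example with LANG or LC_ALL changed) is one way to check whether sun.jnu.encoding follows the locale on a given build.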

The final issue is just consistent use of sun.jnu.encoding. This 
property was originally only used for Windows but now we have cases 
where it may differ from file.encoding on other platforms. Sherman 
brought up 7050570, which is addressing something different again but 
part of it does fix up the new file system API to use sun.jnu.encoding. 
I haven't seen any mails on macosx-port-dev that look like this issue 
but we should get it in anyway (Sherman - you asked why I hadn't pushed 
that several months ago, sorry, it's been on my list as a low priority 
item and low priority items have been starved of cycles).

-Alan.

