MacOS file system changes between 7u10 and 7u40?

Xueming Shen xueming.shen at oracle.com
Sun Oct 6 10:58:49 PDT 2013


On 10/6/13 10:27 AM, Philippe Marschall wrote:
> On Fri, Oct 4, 2013 at 5:30 PM, Martin Buchholz <martinrb at google.com> wrote:
>> It is already the case that you cannot access all possible Unix file names
>> from Java because by design, file names are represented by Java strings
>> (UTF-16), but at the OS level filenames are actually arbitrary byte
>> sequences with no concept of encoding.
> While file names are Strings and java.io.File is String based,
> sun.nio.fs.UnixPath is actually byte[] based. This means you can
> access files whose names are not valid in the respective encoding,
> provided you can get hold of the path. The "easy way" is through
> DirectoryStream, the other is through the package-private constructor.

The byte[] representation is really an internal implementation detail,
there to give much better performance when the path is pushed back and
forth between the Java level and the native level. Any String-level file
name access to the nio Path still involves the charset's encoding and
decoding, which normally does not handle NFC/NFD at all, with the
exception of the UTF-based encodings, which can do a code point to code
point mapping.

While we had put a lot of thought into the encoding/decoding issue for
the byte[] <=> String conversion, the NFC/NFD issue only kicked in
"recently", after the MacOS filesystem started to gain traction, with
its NFD internal representation (and an interesting flip-floppable case
sensitivity). Again, the idea here is to try to keep the file name
representation consistent at the Java level, which I personally feel is
more important. So maybe the question here is whether we want to see the
NFD file name at the Java level, which means the developer/end user will
probably be forced to handle NFD/NFC themselves, and the file name
hashing/equality operations will be way more "expensive". And then there
is the "visual" representation... do you want to see the file names
displayed as NFD or NFC...
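
To make the equality point concrete, here is a minimal sketch of what
"handling NFC/NFD yourself" would look like at the Java level. It is not
taken from the JDK sources; it only uses java.text.Normalizer, and the
class name NfcNfdDemo and the helper sameName are made up for
illustration.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class NfcNfdDemo {

    // Hypothetical helper: treat two file names as equal if they match
    // after both are normalized to NFC. Every comparison pays for two
    // normalization passes, which is the extra cost discussed above.
    static boolean sameName(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFC);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFC);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        // Precomposed form: U+00E9 (e with acute) followed by ".txt".
        String nfc = "\u00e9.txt";
        // Decomposed form: 'e' + U+0301 (combining acute) + ".txt".
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);

        // Plain String equality sees two different names.
        System.out.println(nfc.equals(nfd));   // false

        // The two forms differ in char count and in UTF-8 byte count.
        System.out.println(nfc.length() + " vs " + nfd.length());   // 5 vs 6
        System.out.println(nfc.getBytes(StandardCharsets.UTF_8).length
                + " vs " + nfd.getBytes(StandardCharsets.UTF_8).length);   // 6 vs 7

        // Only a normalizing comparison treats them as the same name.
        System.out.println(sameName(nfc, nfd));   // true
    }
}
```

Both strings would typically be displayed identically, yet equals() (and
therefore any hash based lookup) treats them as different names unless
one side is normalized first.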

-Sherman

>
> (Yes I did some checking on a Linux box with file names that are invalid UTF-8).
>
> Cheers
> Philippe


