Unix paths as bytes

Martin Buchholz martinrb at google.com
Mon May 4 15:41:52 PDT 2009


On Mon, May 4, 2009 at 00:20, Philip Jenvey <pjenvey at underboss.org> wrote:
>
> On May 3, 2009, at 5:02 PM, Martin Buchholz wrote:
>
>> The python proposal is interesting,
>> but also does not provide real access to the underlying bytes,
>> and appears to have round-trip preservation problems.
>
> Python does provide direct access to paths as bytes via different APIs. Byte
> versions of the environment and the command line args have been discussed
> and may happen in the future, even with PEP 383.
>
> I mention this new PEP because it's made for the general case of working
> with strings and expecting strings back from these APIs. Our UNIX APIs will
> encode these paths back to their original bytes via the filesystem's
> encoding + the PEP's new encoder error handler, and Python code can also
> encode them back to bytes in the same way. There are no round-trip
> preservation issues.

I believe that no implementation based on error handlers can work,
because an error handler is never invoked when two different byte
inputs decode, without error, to the same char sequence.  In that case
the original byte sequence cannot be reliably re-created.
What am I missing?
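
A minimal Java sketch of the concern above. It is only an illustration
under stated assumptions: UTF-16 is used because every Java platform
ships it, not because it is a plausible file system encoding, and the
class and method names (RoundTripDemo, decodeStrict, encode) are
hypothetical. Stateful encodings with redundant escape sequences would
be the more realistic case. Both inputs decode in strict mode, so no
error handler ever gets a chance to record which bytes were seen.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Strict decode: malformed or unmappable input throws instead of
    // being silently replaced, so a successful return means "no error".
    static String decodeStrict(byte[] bytes) {
        try {
            return StandardCharsets.UTF_16.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encode back with the same charset, as a round trip would.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // Two different byte sequences: "A" with and without a byte-order mark.
        byte[] withBom = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};
        byte[] withoutBom = {0x00, 0x41};

        String a = decodeStrict(withBom);
        String b = decodeStrict(withoutBom);

        // Both decode cleanly to the same string.
        System.out.println(a.equals(b));

        // Re-encoding can reproduce at most one of the two inputs.
        System.out.println(Arrays.equals(encode(a), withBom));
        System.out.println(Arrays.equals(encode(b), withoutBom));
    }
}
```

Once the two inputs have collapsed into one string, nothing in that
string says which byte sequence it came from, which is the case an
error-handler scheme would have to address.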

>> The Paths API seems to be parallel to the environment variable API
>> in that it catches most of the places where file names would be
>> corrupted by round-trip encoding/decoding, but it is easy to
>> construct sample code where the abstraction is leaky,
>> E.g. if you try to construct a file name from the concatenation of
>> an existing file name and a suffix defined in Java code as a string.
>> (Correct me if I'm wrong)
>
> This example does work for paths as long as you're concatenating via Path
> objects (and the value of suffix is valid according to file.encoding).

I don't see any place in the Paths API that supports manipulating a single
Path component.  E.g. how would an Emacs implemented in Java append the
"~" character to a file name to create its backup file?

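For concreteness, here is a sketch of the backup-name operation written
against the java.nio.file API as it later shipped in JDK 7, which may
differ from the draft under discussion. The class and method names
(BackupName, backupPath) are hypothetical. Note that the file name
passes through a String on the way, which is exactly the step the
question is about; the sketch makes no claim about whether undecodable
bytes survive that step.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class BackupName {
    // Build "<name>~" next to the given file, Emacs-style.
    static Path backupPath(Path file) {
        Path name = file.getFileName();
        if (name == null) {
            // A root such as "/" has no file name component to extend.
            throw new IllegalArgumentException("path has no file name: " + file);
        }
        // The component is converted to a String, extended, then
        // resolved against the original file's parent directory.
        return file.resolveSibling(name.toString() + "~");
    }

    public static void main(String[] args) {
        System.out.println(backupPath(Paths.get("dir", "notes.txt")));
    }
}
```
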
Martin



More information about the nio-dev mailing list