Unix paths as bytes

Alan Bateman Alan.Bateman at Sun.COM
Sat May 2 04:21:22 PDT 2009


Philip Jenvey wrote:
> UnixPath solves the issue of java.io.File treating unix paths as 
> Strings (e.g. http://bugs.sun.com/view_bug.do?bug_id=4899439 ) -- but 
> AFAICT not for all situations on the JVM.
>
> For example in Jython, paths are represented by Strings, not wrapper 
> objects (JRuby has wrappers but e.g. their Dir.entries() similarly 
> returns paths as Strings). Without access to the underlying unix path 
> name as bytes we are stuck with the same old problem of garbage names 
> -- UnixPaths translate their byte representation to Strings by munging 
> invalid characters to the 0xFFFD replacement character.
>
> FYI Python 3 will deal with these invalid characters by representing 
> them with half surrogates (detailed in PEP 383 
> http://www.python.org/dev/peps/pep-0383/ ) -- this allows 
> roundtripping those invalid characters back to bytes.
>
> Can we allow access to UnixPath's byte representation of path names 
> and the reverse: the ability to create a Path object from said bytes?
The only way currently to "export" or "import" a path as bytes is via URIs. 
When a path is encoded as a URI, the platform representation is used and 
characters that aren't legal in the URI path component are escaped. This 
gives you the round-trip, but it isn't exactly what you want, in that the 
String is a URI rather than a path. I'm not familiar with the Python 
proposal but I will examine it - thanks for forwarding.
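
As a rough sketch of the URI route described above (not a definitive
recipe): the code below assumes the java.nio.file API in the shape of
Path.toUri() and Paths.get(URI), which may differ from the draft API under
discussion. The class name UriPathBytes and the helper rawPathBytes are
hypothetical, introduced only for illustration. The helper undoes the
percent-escapes in the URI's raw path and assumes every unescaped character
is ASCII. The sample path /tmp/a b is likewise just an example.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UriPathBytes {

    /** "Export": encode the path as a file: URI; illegal characters are percent-escaped. */
    public static URI exportAsUri(Path path) {
        return path.toUri();
    }

    /** "Import": create a Path from a file: URI. */
    public static Path importFromUri(URI uri) {
        return Paths.get(uri);
    }

    /**
     * Hypothetical helper: recover bytes from the URI's raw path by undoing
     * each %XX escape. Assumes all unescaped characters are ASCII.
     */
    public static byte[] rawPathBytes(URI uri) {
        String raw = uri.getRawPath();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%') {
                // Two hex digits follow the percent sign.
                out.write(Integer.parseInt(raw.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Path original = Paths.get("/tmp/a b");   // illustrative path only
        URI uri = exportAsUri(original);
        Path restored = importFromUri(uri);
        System.out.println(uri);
        System.out.println(restored);
        System.out.println(rawPathBytes(uri).length + " bytes in raw path");
    }
}
```

Note that the URI is always absolute and carries the "file" scheme, so what
you get back from the raw path is the escaped form of the absolute path, not
the name as originally given. That is the sense in which the String is a URI
rather than a path.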

-Alan.



More information about the nio-dev mailing list