JDK 1.8.0 33/40, diacritics and file problems

Fabrizio Giudici Fabrizio.Giudici at tidalwave.it
Sun May 10 11:08:43 UTC 2015


OK, I thought this was solved, but it isn't yet. Many problematic file
names are now handled correctly with explicit normalisation, but I just
got:

Caused by: java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: M?sica Antigua & Eduardo Paniagua/La Vida de Mar?a - Cantigas de las Fiestas de Santa Mar?a/1-01 Estrella Del Dia.m4a
         at sun.nio.fs.UnixPath.encode(UnixPath.java:147) ~[na:1.8.0_33]
         at sun.nio.fs.UnixPath.<init>(UnixPath.java:71) ~[na:1.8.0_33]
         at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281) ~[na:1.8.0_33]
         at java.nio.file.Paths.get(Paths.java:84) ~[na:1.8.0_33]

The stored path looks no different from the previous case, which works:

[localhost:~/Library/Application Support/blueMarine2] fritz% grep Paniagua repository.n3 | grep Estrella | od -c -t x1
0000000   \t   b   m   m   o   :   p   a   t   h       "   M   ú  **   s
            09  62  6d  6d  6f  3a  70  61  74  68  20  22  4d  c3  ba  73

[localhost:~/Library/Application Support/blueMarine2] fritz% grep Englou repository.n3 | od -c -t x1
0000140    a       C   a   t   h   é  **   d   r   a   l   e       E   n
            61  20  43  61  74  68  c3  a9  64  72  61  6c  65  20  45  6e

Previously I had c3 a9 for é; now I have c3 ba for ú. Both are the
precomposed (single code point) UTF-8 forms, so why do I now get this
InvalidPathException?
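
To double-check the byte values above, here is a minimal sketch that prints the UTF-8 bytes of the composed and decomposed forms of ú. The class and method names (DiacriticsCheck, utf8Hex) are made up for illustration; only java.text.Normalizer and the standard charsets are used.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class DiacriticsCheck {
    // Hex dump of the UTF-8 bytes of s, comparable to the "od -t x1" output.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String composed = "\u00fa"; // ú as a single code point (NFC)
        // NFD splits it into 'u' followed by U+0301 COMBINING ACUTE ACCENT.
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        System.out.println(utf8Hex(composed));   // c3ba
        System.out.println(utf8Hex(decomposed)); // 75cc81
        System.out.println(Normalizer.isNormalized(composed, Normalizer.Form.NFC)); // true
    }
}
```

So c3 ba in repository.n3 means the ú is stored in composed form, just as c3 a9 was for é.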

Thanks.




http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/sun/nio/fs/UnixPath.java/

118    // encodes the given path-string into a sequence of bytes
119    private static byte[] encode(UnixFileSystem fs, String input) {
120        SoftReference<CharsetEncoder> ref = encoder.get();
121        CharsetEncoder ce = (ref != null) ? ref.get() : null;
122        if (ce == null) {
123            ce = Util.jnuEncoding().newEncoder()
124                .onMalformedInput(CodingErrorAction.REPORT)
125                .onUnmappableCharacter(CodingErrorAction.REPORT);
126            encoder.set(new SoftReference<CharsetEncoder>(ce));
127        }
128
129        char[] ca = fs.normalizeNativePath(input.toCharArray());
130
131        // size output buffer for worse-case size
132        byte[] ba = new byte[(int)(ca.length * (double)ce.maxBytesPerChar())];
133
134        // encode
135        ByteBuffer bb = ByteBuffer.wrap(ba);
136        CharBuffer cb = CharBuffer.wrap(ca);
137        ce.reset();
138        CoderResult cr = ce.encode(cb, bb, true);
139        boolean error;
140        if (!cr.isUnderflow()) {
141            error = true;
142        } else {
143            cr = ce.flush(bb);
144            error = !cr.isUnderflow();
145        }
146        if (error) {
147            throw new InvalidPathException(input,
148                "Malformed input or input contains unmappable characters");
149        }
150
151        // trim result to actual length if required
152        int len = bb.position();
153        if (len != ba.length)
154            ba = Arrays.copyOf(ba, len);
155
156        return ba;
157    }
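
The listing shows that the exception is thrown whenever the encoder returned by Util.jnuEncoding() cannot map a character. A minimal sketch to probe this outside the JDK follows. It assumes, without proof from the listing, that Util.jnuEncoding() resolves to the charset named by the sun.jnu.encoding system property; the class and method names (EncodeProbe, encodes) are hypothetical.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodeProbe {
    // Returns true if every character of input can be encoded in cs,
    // using the same REPORT error actions as UnixPath.encode().
    static boolean encodes(Charset cs, String input) {
        CharsetEncoder ce = cs.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ce.encode(CharBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "M\u00fasica Antigua";

        // Assumption: these properties reflect what the running JVM uses for paths.
        System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
        System.out.println("file.encoding    = " + System.getProperty("file.encoding"));

        System.out.println(encodes(StandardCharsets.UTF_8, name));    // true
        System.out.println(encodes(StandardCharsets.US_ASCII, name)); // false
    }
}
```

If the first property printed is not UTF-8 in the failing process, that would be one possible explanation for the exception, though it would not by itself explain why the é case works.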




-- 
Fabrizio Giudici - Java Architect @ Tidalwave s.a.s.
"We make Java work. Everywhere."
http://tidalwave.it/fabrizio/blog - fabrizio.giudici at tidalwave.it

