Strange behaviour of java.nio.charset.StandardCharsets.UTF_8.newDecoder()

Martin Buchholz martinrb at google.com
Fri Mar 18 18:31:26 UTC 2016


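It looks like you're trying to decode one byte at a time, which cannot
work as written. A UTF-8 sequence can be up to 4 bytes long, and when
decode() sees an incomplete sequence it leaves those bytes unconsumed in
the input buffer, so you have to carry them over into the next call.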
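
To make the point about incomplete input concrete, here is a minimal sketch
of byte-wise decoding that keeps undecoded bytes around between calls. The
class name IncrementalUtf8 and the helper decodeIncrementally are made up for
this illustration. It assumes a 4-byte staging buffer is enough, since no
UTF-8 sequence is longer than that, and that the output never needs more
chars than there are input bytes.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class IncrementalUtf8 {

    // Hypothetical helper: feeds `data` to the decoder one byte at a time,
    // carrying the bytes of an incomplete sequence over to the next call.
    static String decodeIncrementally(byte[] data) {
        CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
        // UTF-8 never decodes to more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(data.length);
        // Staging buffer; assumed large enough for the longest UTF-8 sequence.
        ByteBuffer in = ByteBuffer.allocate(4);

        for (byte b : data) {
            in.put(b);
            in.flip();                       // switch to reading for decode()
            CoderResult r = d.decode(in, out, false);
            if (r.isError()) {
                throw new IllegalArgumentException("bad input: " + r);
            }
            // Move any unconsumed bytes to the front and switch back to
            // writing, so the next byte is appended after them.
            in.compact();
        }

        in.flip();
        // Signal end of input; leftover bytes are now reported as malformed.
        CoderResult r = d.decode(in, out, true);
        if (r.isError()) {
            throw new IllegalArgumentException("truncated input: " + r);
        }
        r = d.flush(out);
        if (r.isOverflow()) {
            throw new IllegalStateException("output buffer too small");
        }
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] data = "Hello-\u00b5@\u00df\u00f6\u00e4\u00fc-UTF-8!!"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeIncrementally(data));
    }
}
```

The difference from wrapping each byte in a fresh ByteBuffer is the call to
compact(): the lead byte of a multi-byte sequence stays in the buffer until
its continuation bytes arrive, instead of being dropped by the caller.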

On Fri, Mar 18, 2016 at 11:23 AM, Pavel Rappo <pavel.rappo at oracle.com> wrote:
> Hello,
>
> As far as I understand, one of the major use cases of j.n.c.CharsetDecoder is
> incremental decoding, that is, the case where the complete data becomes
> available in chunks.
> If that is true, is the following behaviour expected? Namely, byte-wise
> decoding fails with an error, while bulk (and complete) decoding of the same
> data completes successfully.
>
> Thanks.
>
>     public static void main(String[] args) {
>         String hex = "48656c6c6f2dc2b540c39fc3b6c3a4c3bcc3a0c3a12d5554462d382121";
>         Matcher m = Pattern.compile("\\p{XDigit}{2}").matcher(hex);
>         List<Integer> ints = new ArrayList<>();
>         while (m.find()) {
>             Integer i = Integer.parseInt(m.group(0), 16);
>             ints.add(i);
>         }
>         if (ints.isEmpty()) {
>             throw new AssertionError();
>         }
>
> //        decodeByteWise(ints);
>         decodeWholesale1(ints);
>         decodeWholesale2(ints);
>     }
>
>     private static void decodeByteWise(List<Integer> ints) {
>         CoderResult r = null;
>         CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
>         CharBuffer out = CharBuffer.allocate(ints.size());
>         for (Integer i : ints) {
>             r = d.decode(ByteBuffer.wrap(new byte[]{i.byteValue()}), out, false);
>             if (r != CoderResult.UNDERFLOW) {
>                 throw new AssertionError();
>             }
>         }
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.decode(ByteBuffer.allocate(0), out, true);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.flush(out);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         out.flip();
>         System.out.println(out);
>     }
>
>     private static void decodeWholesale1(List<Integer> ints) {
>         CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
>         CharBuffer out = CharBuffer.allocate(ints.size());
>         ByteBuffer in = ByteBuffer.allocate(ints.size());
>         for (Integer i : ints) {
>             in.put(i.byteValue());
>         }
>         in.flip();
>         CoderResult r = d.decode(in, out, false);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.decode(in, out, true);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.flush(out);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         out.flip();
>         System.out.println(out);
>     }
>
>     private static void decodeWholesale2(List<Integer> ints) {
>         CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
>         CharBuffer out = CharBuffer.allocate(ints.size());
>         ByteBuffer in = ByteBuffer.allocate(ints.size());
>         for (Integer i : ints) {
>             in.put(i.byteValue());
>         }
>         in.flip();
>         CoderResult r = d.decode(in, out, true);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.flush(out);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         out.flip();
>         System.out.println(out);
>     }
>
>
>

