Strange behaviour of java.nio.charset.StandardCharsets.UTF8.newDecoder()
Martin Buchholz
martinrb at google.com
Fri Mar 18 18:31:26 UTC 2016
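It looks like you're trying to decode one byte at a time, wrapping each
byte in a fresh buffer, which cannot work. A UTF-8 sequence can be up to
4 bytes long, so the input buffer has to be able to hold at least 4 bytes,
and you have to handle incomplete decoding: bytes the decoder leaves
unconsumed must be passed in again together with the next chunk.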
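
For illustration, here is a minimal sketch of byte-wise decoding that carries unconsumed bytes over between calls. The class name IncrementalUtf8 and the 4-byte carry buffer are assumptions made for this example, not part of the original test program; the sketch relies on decode() leaving the bytes of an incomplete sequence in the input buffer when it returns UNDERFLOW.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, introduced only for this sketch.
final class IncrementalUtf8 {

    // Feeds the decoder one byte at a time, keeping undecoded bytes
    // in a small carry buffer between calls.
    static String decodeByteWise(byte[] data) {
        CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(data.length);
        // Room for the longest UTF-8 sequence (4 bytes).
        ByteBuffer in = ByteBuffer.allocate(4);

        for (byte b : data) {
            in.put(b);   // append the new byte after any leftover bytes
            in.flip();   // switch the buffer to reading mode
            CoderResult r = d.decode(in, out, false);
            if (r != CoderResult.UNDERFLOW) {
                throw new AssertionError(r);
            }
            in.compact(); // keep the bytes of an incomplete sequence
        }

        in.flip();
        // Signal end of input; leftover bytes here would be malformed.
        CoderResult r = d.decode(in, out, true);
        if (r != CoderResult.UNDERFLOW) {
            throw new AssertionError(r);
        }
        r = d.flush(out);
        if (r != CoderResult.UNDERFLOW) {
            throw new AssertionError(r);
        }
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) {
        // "Hi" followed by the two-byte sequence C2 B5.
        byte[] data = {0x48, 0x69, (byte) 0xC2, (byte) 0xB5};
        System.out.println(decodeByteWise(data));
    }
}
```

The only difference from the failing byte-wise variant is that the same input buffer is reused and compacted after every call, so a lead byte such as 0xC3 is still there when its continuation byte arrives.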
On Fri, Mar 18, 2016 at 11:23 AM, Pavel Rappo <pavel.rappo at oracle.com> wrote:
> Hello,
>
> As far as I understand, one of the major use cases of j.n.c.CharsetDecoder is
> incremental decoding, that is, the case where the complete data becomes
> available in chunks.
> If the above is true, would the following behaviour be considered expected?
> Namely, byte-wise decoding fails with an error, while bulk (and complete)
> decoding of the same data completes successfully.
>
> Thanks.
>
>     public static void main(String[] args) {
>         String hex = "48656c6c6f2dc2b540c39fc3b6c3a4c3bcc3a0c3a12d5554462d382121";
>         Matcher m = Pattern.compile("\\p{XDigit}{2}").matcher(hex);
>         List<Integer> ints = new ArrayList<>();
>         while (m.find()) {
>             Integer i = Integer.parseInt(m.group(0), 16);
>             ints.add(i);
>         }
>         if (ints.isEmpty()) {
>             throw new AssertionError();
>         }
>
>         // decodeByteWise(ints);
>         decodeWholesale1(ints);
>         decodeWholesale2(ints);
>     }
>
>     private static void decodeByteWise(List<Integer> ints) {
>         CoderResult r = null;
>         CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
>         CharBuffer out = CharBuffer.allocate(ints.size());
>         for (Integer i : ints) {
>             r = d.decode(ByteBuffer.wrap(new byte[]{i.byteValue()}), out, false);
>             if (r != CoderResult.UNDERFLOW) {
>                 throw new AssertionError();
>             }
>         }
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.decode(ByteBuffer.allocate(0), out, true);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.flush(out);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         out.flip();
>         System.out.println(out);
>     }
>
>     private static void decodeWholesale1(List<Integer> ints) {
>         CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
>         CharBuffer out = CharBuffer.allocate(ints.size());
>         ByteBuffer in = ByteBuffer.allocate(ints.size());
>         for (Integer i : ints) {
>             in.put(i.byteValue());
>         }
>         in.flip();
>         CoderResult r = d.decode(in, out, false);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.decode(in, out, true);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.flush(out);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         out.flip();
>         System.out.println(out);
>     }
>
>     private static void decodeWholesale2(List<Integer> ints) {
>         CharsetDecoder d = StandardCharsets.UTF_8.newDecoder();
>         CharBuffer out = CharBuffer.allocate(ints.size());
>         ByteBuffer in = ByteBuffer.allocate(ints.size());
>         for (Integer i : ints) {
>             in.put(i.byteValue());
>         }
>         in.flip();
>         CoderResult r = d.decode(in, out, true);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         r = d.flush(out);
>         if (r != CoderResult.UNDERFLOW) { throw new AssertionError(); }
>         out.flip();
>         System.out.println(out);
>     }
>
>
>