RFR [8058875]: CharsetEncoder.maxBytesPerChar() should return 4 for UTF-8
Ivan Gerasimov
ivan.gerasimov at oracle.com
Tue Sep 23 07:37:07 UTC 2014
Martin, Sherman, thanks for the clarification!
Closing the bug as not a bug.
> The "character" in the nio Charset and CharDe/Encoder is specified as
> "sixteen-bit Unicode code unit", so it is reasonable to interpret the
> "character" in the "maximum number of bytes that will be produced for
> each character of input" to be the Java "char" as well. In case of
> UTF8, each 4-byte form supplementary character is always coded into 2
> surrogate chars, it's "2 byte per char".
> Do we have a real escalation that complains about this?
>
Yes, the link is on the bug page:
https://bugs.openjdk.java.net/browse/JDK-8058875
I'm going to try to explain what I've just realized about this function :-)
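
A minimal sketch of the point in the quoted text, assuming the bound is
counted per Java char (UTF-16 code unit) rather than per code point. The
class and method names (MaxBytesPerCharDemo, utf8Length, withinBound) are
made up for illustration and are not part of any JDK API:

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class MaxBytesPerCharDemo {

    // Number of UTF-8 bytes produced for the whole string.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the encoded size stays within
    // maxBytesPerChar() * (number of Java chars in the string).
    static boolean withinBound(String s) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        return utf8Length(s) <= enc.maxBytesPerChar() * s.length();
    }

    public static void main(String[] args) {
        // A BMP code point: one Java char.
        String euro = "\u20AC";
        // A supplementary code point: two Java chars (a surrogate pair).
        String emoji = new String(Character.toChars(0x1F600));

        System.out.println(euro.length() + " char(s), " + utf8Length(euro) + " byte(s)");
        System.out.println(emoji.length() + " char(s), " + utf8Length(emoji) + " byte(s)");
        System.out.println("within bound: " + withinBound(euro) + ", " + withinBound(emoji));
    }
}
```

The supplementary character takes 4 bytes but occupies 2 chars, so it
averages 2 bytes per char, and the worst case per char comes from the
3-byte BMP forms.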
Sincerely yours,
Ivan