RFR [8058875]: CharsetEncoder.maxBytesPerChar() should return 4 for UTF-8

Ivan Gerasimov ivan.gerasimov at oracle.com
Tue Sep 23 07:37:07 UTC 2014


Martin, Sherman, thanks for the clarification!

Closing the bug as not a bug.

> The "character" in the nio Charset and CharDe/Encoder is specified as
> "sixteen-bit Unicode code unit", so it is reasonable to interpret the
> "character" in "maximum number of bytes that will be produced for each
> character of input" to be the Java "char" as well. In the case of UTF-8,
> each 4-byte supplementary character is always coded into 2 surrogate
> chars, so it's "2 bytes per char".
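
For illustration, here is a minimal sketch of that arithmetic. The class
name Utf8BytesPerChar and the helper bytesPerChar are hypothetical names
made up for this example; only the java.nio.charset calls are JDK API.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class Utf8BytesPerChar {
    // UTF-8 bytes produced per Java char (UTF-16 code unit) of the input.
    // Hypothetical helper for this example, not part of the JDK.
    static double bytesPerChar(String s) {
        return (double) s.getBytes(StandardCharsets.UTF_8).length / s.length();
    }

    public static void main(String[] args) {
        CharsetEncoder enc = StandardCharsets.UTF_8.newEncoder();
        System.out.println(enc.maxBytesPerChar());            // 3.0

        // U+0041: 1 char, 1 byte
        System.out.println(bytesPerChar("A"));                // 1.0
        // U+20AC (euro sign): 1 char, 3 bytes, the worst case per char
        System.out.println(bytesPerChar("\u20AC"));           // 3.0
        // U+1F600 (supplementary): 2 surrogate chars, 4 bytes
        System.out.println(bytesPerChar("\uD83D\uDE00"));     // 2.0
    }
}
```

So a buffer of maxBytesPerChar() * length bytes, with length counted in
chars, is enough for supplementary characters too.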

> Do we have a real escalation that complains about this?
>
Yes, the link is on the bug page:
https://bugs.openjdk.java.net/browse/JDK-8058875
I'm going to try to explain what I've just realized about this function :-)

Sincerely yours,
Ivan

