CharsetEncoder.maxBytesPerChar()

mark.reinhold at oracle.com
Thu Sep 26 21:23:36 UTC 2019


2019/9/24 13:00:21 -0700, ulf.zibis at cosoco.de:
> On 21.09.19 at 00:03, mark.reinhold at oracle.com wrote:
>> To avoid this confusion, a more verbose specification might read:
>>     * Returns the maximum number of $otype$s that will be produced for each
>>     * $itype$ of input.  This value may be used to compute the worst-case size
>>     * of the output buffer required for a given input sequence. This value
>>     * accounts for any necessary content-independent prefix or suffix
>> #if[encoder]
>>     * $otype$s, such as byte-order marks.
>> #end[encoder]
>> #if[decoder]
>>     * $otype$s.
>> #end[decoder]
> 
> Wouldn't it be clearer to use "char" or even "{@code char}" instead of
> "character" as the replacement for the $xtype$ parameters?

The specifications of the Charset{De,En}coder classes make it clear
up front that “character” means “sixteen-bit Unicode character,” so
I don’t think changing “character” everywhere to “{@code char}” is
necessary.
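
For example, here is a rough sketch of the worst-case sizing that the
proposed text above describes, using the JDK's built-in UTF-16 charset,
whose encoder reports a maxBytesPerChar() of 4.0 rather than 2.0 because
the value also covers the byte-order mark it writes up front:

    import java.nio.ByteBuffer;
    import java.nio.CharBuffer;
    import java.nio.charset.Charset;
    import java.nio.charset.CharsetEncoder;

    public class WorstCaseSizing {
        public static void main(String[] args) {
            CharsetEncoder enc = Charset.forName("UTF-16").newEncoder();

            // Worst-case output size for a given input sequence:
            // input length in chars times maxBytesPerChar().
            String input = "hello";
            int worstCase =
                (int) Math.ceil(input.length() * enc.maxBytesPerChar());
            ByteBuffer out = ByteBuffer.allocate(worstCase);  // 5 * 4.0 = 20

            enc.encode(CharBuffer.wrap(input), out, true);
            enc.flush(out);

            System.out.println(enc.maxBytesPerChar()); // 4.0, covers the BOM
            System.out.println(out.position());        // 12 = 2-byte BOM + 5*2
        }
    }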

This usage of “character” is common throughout the API specification.
With the introduction of 32-bit Unicode characters we started calling
those “code points,” but kept on calling sixteen-bit characters just
“characters.”  (I don’t think the official term “Unicode code unit”
ever caught on, and it’s a bit of a mouthful anyway.)
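
A tiny illustration of that distinction, for the record: a supplementary
character such as U+1F600 is one code point but two sixteen-bit chars:

    public class CodePointVsChar {
        public static void main(String[] args) {
            // U+1F600 is a single code point but is stored as a
            // surrogate pair, i.e. two sixteen-bit chars.
            String s = new String(Character.toChars(0x1F600));
            System.out.println(s.length());                       // 2
            System.out.println(s.codePointCount(0, s.length()));  // 1
        }
    }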

- Mark
