RFR: 8372353: API to compute the byte length of a String encoded in a given Charset [v5]
Liam Miller-Cushon
cushon at openjdk.org
Thu Jan 15 17:05:57 UTC 2026
On Thu, 15 Jan 2026 16:10:57 GMT, Volkan Yazici <vyazici at openjdk.org> wrote:
>> Liam Miller-Cushon has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Update tests
>
> src/java.base/share/classes/java/lang/String.java line 1585:
>
>> 1583:
>> 1584: // This follows the implementation of encodeUTF8_UTF16
>> 1585: private static int encodedLengthUTF8_UTF16(byte[] val) {
>
> Doesn't this duplicate the `computeSizeUTF8_UTF16`?
>
> AFAICS, `computeSizeUTF8_UTF16` is missing the ASCII fast loop, but we can enhance it.
>
> FWIW, if we decide to reuse `computeSizeUTF8_UTF16`, it might be better to rename it to `encodedLengthUTF8_UTF16`, which will be in line with the introduced `encodedLength*` method family.
Thanks for the catch, good point. I will look at switching to `computeSizeUTF8_UTF16`.

`computeSizeUTF8_UTF16` returns `long`, which raises the question of what to do when the encoded length exceeds `Integer.MAX_VALUE`. The return type of `getBytesLength` could be `long`, which would allow computing the encoded length of strings that wouldn't fit into an array if they were encoded. Alternatively, it could throw an exception in that case, similar to `getBytes`, and have an `int` return type.
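
To make the two options concrete, here is a rough sketch written against public APIs only. It is not the code in the PR: `EncodedLengthSketch`, `utf8Length` and `utf8LengthExact` are hypothetical names, and the use of `Math.toIntExact` (which throws `ArithmeticException`) for the `int` variant is an assumption for illustration, not a statement about what `getBytes` throws.

```java
public final class EncodedLengthSketch {
    private EncodedLengthSketch() {}

    // Option 1 (hypothetical): return long, so the result is representable
    // even when the encoded form would not fit in a byte[].
    static long utf8Length(CharSequence s) {
        long count = 0;
        int len = s.length();
        int i = 0;
        while (i < len) {
            char c = s.charAt(i++);
            if (c < 0x80) {
                count += 1;                       // ASCII
            } else if (c < 0x800) {
                count += 2;                       // two-byte sequence
            } else if (Character.isHighSurrogate(c) && i < len
                    && Character.isLowSurrogate(s.charAt(i))) {
                count += 4;                       // valid surrogate pair
                i++;                              // consume the low surrogate
            } else if (Character.isSurrogate(c)) {
                count += 1;                       // unpaired surrogate, encoded as '?'
            } else {
                count += 3;                       // rest of the BMP
            }
        }
        return count;
    }

    // Option 2 (hypothetical): keep an int return type and fail on overflow.
    // The exception type here is an assumption made for this sketch.
    static int utf8LengthExact(CharSequence s) {
        return Math.toIntExact(utf8Length(s));
    }
}
```

With the `long` variant the caller decides how to handle a length above `Integer.MAX_VALUE`; with the `int` variant the method fails in the same situations where encoding into an array could not succeed anyway.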
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695225769