RFR: 8372353: API to compute the byte length of a String encoded in a given Charset [v5]

Thu Jan 15 17:05:57 UTC 2026

On Thu, 15 Jan 2026 16:10:57 GMT, Volkan Yazici <vyazici at openjdk.org> wrote:

>> Liam Miller-Cushon has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Update tests
>
> src/java.base/share/classes/java/lang/String.java line 1585:
> 
>> 1583: 
>> 1584:     // This follows the implementation of encodeUTF8_UTF16
>> 1585:     private static int encodedLengthUTF8_UTF16(byte[] val) {
> 
> Doesn't this duplicate the `computeSizeUTF8_UTF16`?
> 
> AFAICS, `computeSizeUTF8_UTF16` is missing the ASCII fast loop, but we can enhance it.
> 
> FWIW, if we decide to reuse `computeSizeUTF8_UTF16`, it might be better to rename it to `encodedLengthUTF8_UTF16`, which will be in line with the introduced `encodedLength*` method family.

Thanks for the catch, good point. I will look at switching to `computeSizeUTF8_UTF16`.

`computeSizeUTF8_UTF16` returns `long`, which raises the question of what to do when the encoded length does not fit in an `int`. `getBytesLength` could return `long`, which would allow computing the encoded length of strings that wouldn't fit into an array if they were encoded. Or it could keep an `int` return type and throw an exception in that case, similar to `getBytes`.
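
To make the two options concrete, here is a rough sketch of the trade-off. It is not the code in the PR: the class and method names (`Utf8LengthSketch`, `utf8LengthAsLong`, `utf8LengthAsInt`) are hypothetical, it works on a `CharSequence` rather than the internal `byte[]`, and the choice of `ArithmeticException` via `Math.toIntExact` is only an assumption for illustration.

```java
public final class Utf8LengthSketch {
    private Utf8LengthSketch() {}

    // Option 1 (hypothetical): return long, so the result is always
    // representable, even if the encoded form would not fit in a byte[].
    public static long utf8LengthAsLong(CharSequence s) {
        long length = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                length += 1;                 // ASCII
            } else if (c < 0x800) {
                length += 2;                 // two-byte sequence
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                length += 4;                 // surrogate pair, one 4-byte sequence
                i++;                         // skip the low surrogate
            } else if (Character.isSurrogate(c)) {
                length += 1;                 // unpaired surrogate, encoded as '?'
            } else {
                length += 3;                 // rest of the BMP
            }
        }
        return length;
    }

    // Option 2 (hypothetical): keep an int return type and fail when the
    // length does not fit. The exception type here is an assumption.
    public static int utf8LengthAsInt(CharSequence s) {
        return Math.toIntExact(utf8LengthAsLong(s));
    }
}
```

With the `long` variant the caller decides how to handle an oversized result. With the `int` variant the method fails on its own, which is closer to what a caller would see from `getBytes` on the same string.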

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2695225769