RFR: 8372353: API to compute the byte length of a String encoded in a given Charset [v2]

Roger Riggs rriggs at openjdk.org
Thu Jan 15 13:41:11 UTC 2026


On Thu, 15 Jan 2026 09:04:09 GMT, Liam Miller-Cushon <cushon at openjdk.org> wrote:

>> src/java.base/share/classes/java/lang/String.java line 1080:
>> 
>>> 1078:             return value.length;
>>> 1079:         }
>>> 1080:         int len = value.length >> 1;
>> 
>> I don't think I understand what's being done and what Charset encoder it is mimicking.
>> It probably needs to document the assumptions about unmappable characters and malformed surrogates.
>> (Likely it is correct since the test of US_ASCII passes, but could use an explanation).
>
> I added some `//` comments documenting which methods the `encodedLength*` methods are mimicking. The logic here should be identical to `encodeASCII` (except that it isn't allocating and writing to a destination array).
> 
> The handling of unmappable characters and malformed surrogates should match `encodeASCII`.

Thanks for the doc update. Nearly duplicating the code is an unfortunate side effect.
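
For readers following along, here is a rough sketch of the idea under discussion: counting the bytes a US_ASCII encoding would produce without allocating the destination array. It is not the code from the PR. The class and method names are hypothetical, and it uses only public `String` and `Character` APIs instead of the internal `value` array. It assumes the behaviour of `String.getBytes(StandardCharsets.US_ASCII)`, where each unmappable char or lone surrogate becomes a single `?` and a well-formed surrogate pair also becomes a single `?`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the JDK implementation.
final class AsciiLengthSketch {

    // Counts the bytes getBytes(US_ASCII) would produce, without allocating.
    static int asciiEncodedLength(String s) {
        int len = 0;
        int n = s.length();
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            // A well-formed surrogate pair is replaced by one '?',
            // so consume the low surrogate together with the high one.
            if (Character.isHighSurrogate(c)
                    && i + 1 < n
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++;
            }
            // ASCII chars, unmappable chars, and lone surrogates
            // each contribute exactly one byte.
            len++;
        }
        return len;
    }

    public static void main(String[] args) {
        String[] samples = { "abc", "caf\u00e9", "a\uD83D\uDE00b", "\uD83D", "" };
        for (String s : samples) {
            int expected = s.getBytes(StandardCharsets.US_ASCII).length;
            System.out.println(asciiEncodedLength(s) == expected);
        }
    }
}
```

The loop has the same shape as an encoder loop minus the writes, which is where the near-duplication comes from.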

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2694424323

