RFR: 8372353: API to compute the byte length of a String encoded in a given Charset [v2]
Roger Riggs
rriggs at openjdk.org
Thu Jan 15 13:41:11 UTC 2026
On Thu, 15 Jan 2026 09:04:09 GMT, Liam Miller-Cushon <cushon at openjdk.org> wrote:
>> src/java.base/share/classes/java/lang/String.java line 1080:
>>
>>> 1078: return value.length;
>>> 1079: }
>>> 1080: int len = value.length >> 1;
>>
>> I don't think I understand what's being done and what Charset encoder it is mimicking.
>> It probably needs to document the assumptions about unmappable characters and malformed surrogates.
>> (Likely it is correct since the test of US_ASCII passes, but could use an explanation).
>
> I added some `//` comments documenting which methods the `encodedLength*` methods are mimicking. The logic here should be identical to `encodeASCII` (except that it isn't allocating and writing to a destination array).
>
> The handling of unmappable characters and malformed surrogates should match `encodeASCII`.
Thanks for the doc update. Duplicating code (almost) is an unfortunate side-effect.
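
For illustration only, here is a minimal sketch of what "same logic as `encodeASCII`, without allocating a destination array" can look like for a UTF-16 string. It is not the code from the PR: the class `AsciiLengthSketch` and the method `encodedLengthAscii` are hypothetical names, and it works through the public `String` API rather than the internal `byte[] value`. It assumes the replacement behavior of `String.getBytes(US_ASCII)`: each non-ASCII char becomes one `?`, and a well-formed surrogate pair becomes a single `?`.

```java
import java.nio.charset.StandardCharsets;

final class AsciiLengthSketch {
    // Hypothetical helper (not the PR's implementation): counts the bytes that
    // String.getBytes(US_ASCII) would produce, without allocating the byte[].
    static int encodedLengthAscii(String s) {
        int len = s.length();
        int count = 0;
        for (int i = 0; i < len; i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)
                    && i + 1 < len
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                i++; // a well-formed surrogate pair is replaced by a single '?'
            }
            count++; // one byte: the ASCII char itself, or one '?' replacement
        }
        return count;
    }

    public static void main(String[] args) {
        // Cross-check against the allocating path for a few edge cases,
        // including unpaired and reversed surrogates.
        String[] samples = { "", "abc", "caf\u00e9", "a\uD83D\uDE00b", "\uD83D", "\uDE00\uD83D" };
        for (String s : samples) {
            int expected = s.getBytes(StandardCharsets.US_ASCII).length;
            if (encodedLengthAscii(s) != expected) {
                throw new AssertionError("mismatch for sample of length " + s.length());
            }
        }
    }
}
```

The loop has the same shape as an encoder loop with the array writes removed, which is why the duplication noted above is hard to avoid without sharing the loop body.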
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28454#discussion_r2694424323