RFR: 8311220: Optimization for StringLatin UpperLower

Glavo duke at openjdk.org
Mon Jul 3 09:24:54 UTC 2023


On Mon, 3 Jul 2023 09:17:29 GMT, 温绍锦 <duke at openjdk.org> wrote:

>> src/java.base/share/classes/java/lang/StringLatin1.java line 442:
>> 
>>> 440:             if (!canEncode(cp)) {                      // not a latin1 character
>>> 441:                 return toLowerCaseEx(str, value, first, locale, false);
>>> 442:             }
>> 
>> Suggestion:
>>
>> I think this branch is unnecessary. All lowercase forms of Latin-1 characters can be encoded using Latin-1.
>
> ```java
> String str0 = new String(new byte[]{-75}, StandardCharsets.ISO_8859_1);
> String str1 = str0.toUpperCase();
> ```
> 
> str0.coder is LATIN1
> str1.coder is UTF16

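This happens only with `toUpperCase`, not with `toLowerCase`. The byte -75 is 0xB5 (the micro sign, U+00B5), whose uppercase form is U+039C, outside Latin-1. No Latin-1 character has a lowercase form outside Latin-1, so the `canEncode` check in the `toLowerCase` path can never fail.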
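
The claim can be checked by brute force over all 256 Latin-1 code points. The following is only a minimal sketch, not code from the PR: the class name `Latin1CaseCheck` and its helper methods are made up for illustration, and it looks only at the locale-independent one-to-one mappings of `Character.toLowerCase(int)` and `Character.toUpperCase(int)`. Locale-specific rules and one-to-many expansions are outside its scope.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK or of the PR.
public class Latin1CaseCheck {

    /** Latin-1 code points whose simple lowercase mapping is above 0xFF. */
    static List<Integer> lowerEscapes() {
        List<Integer> out = new ArrayList<>();
        for (int cp = 0; cp <= 0xFF; cp++) {
            if (Character.toLowerCase(cp) > 0xFF) {
                out.add(cp);
            }
        }
        return out;
    }

    /** Latin-1 code points whose simple uppercase mapping is above 0xFF. */
    static List<Integer> upperEscapes() {
        List<Integer> out = new ArrayList<>();
        for (int cp = 0; cp <= 0xFF; cp++) {
            if (Character.toUpperCase(cp) > 0xFF) {
                out.add(cp);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("toLowerCase leaves Latin-1 for: " + lowerEscapes());
        for (int cp : upperEscapes()) {
            // Print each offending code point together with its uppercase mapping.
            System.out.printf("toUpperCase: U+%04X -> U+%04X%n",
                    cp, Character.toUpperCase(cp));
        }
    }
}
```

The lowercase list should come back empty, while the uppercase list should contain 0xB5 and 0xFF, the two characters whose uppercase forms need UTF16.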

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14751#discussion_r1250545802

