RFR: 8365675: Add String Unicode Case-Folding Support [v11]

Xueming Shen sherman at openjdk.org
Tue Dec 2 18:38:29 UTC 2025


On Mon, 1 Dec 2025 23:44:52 GMT, Naoto Sato <naoto at openjdk.org> wrote:

>> Xueming Shen has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   minor doc formatting update
>
> src/java.base/share/classes/jdk/internal/lang/CaseFolding.java.template line 69:
> 
>> 67:     *  | 1:2 mapping |  0002  |   0000  |  xxxx  |  xxxx  |  FB02 => 0066 006C
>> 68:     *  +---+---------+--------+---------+--------+--------+
>> 69:     *  | 1:3 mapping |  0003  |   xxxx  |  xxxx  |  xxxx  |  FB03 => 0066 0066 0069
> 
> What if 1:2/3 mappings included non-BMP case folded forms?

1:2 should be fine; we still have enough bits available. 1:3 will be more challenging, but in theory 21 bits x 3 = 63, so we would still have the MSB to indicate that it's three non-BMP code points. That said, I assume we might simply fall back to the char/int[] mode when the 'flag' byte indicates, for example, 0004 for 1:2 non-BMP or 0006 for 1:3 non-BMP. I don't think we need to worry much about the performance of those 'special' cases, if the standard does add such mappings.
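
To make the 21 bits x 3 = 63 arithmetic concrete, here is a minimal sketch of how three code points could share one long with the MSB as the marker. The class name, method names, and bit layout are assumptions for illustration only, not what CaseFolding.java.template does.

```java
// Hypothetical sketch: not the layout used by CaseFolding.java.template.
final class PackedFoldingSketch {
    // Assumed marker: bit 63 says "three full 21-bit code points follow".
    static final long NON_BMP_TRIPLE = 1L << 63;
    // A Unicode code point (max U+10FFFF) fits in 21 bits.
    static final long CP_MASK = 0x1FFFFFL;

    // Pack three code points into bits 62..42, 41..21 and 20..0.
    static long pack(int cp1, int cp2, int cp3) {
        return NON_BMP_TRIPLE
             | ((cp1 & CP_MASK) << 42)
             | ((cp2 & CP_MASK) << 21)
             | (cp3 & CP_MASK);
    }

    // Recover the three code points from a value produced by pack().
    static int[] unpack(long packed) {
        if ((packed & NON_BMP_TRIPLE) == 0) {
            throw new IllegalArgumentException("not a packed non-BMP triple");
        }
        return new int[] {
            (int) ((packed >>> 42) & CP_MASK),
            (int) ((packed >>> 21) & CP_MASK),
            (int) (packed & CP_MASK)
        };
    }

    public static void main(String[] args) {
        // Arbitrary supplementary code points, chosen only to exercise the bits.
        long packed = pack(0x1F600, 0x10FFFF, 0x10000);
        for (int cp : unpack(packed)) {
            System.out.println(Integer.toHexString(cp));
        }
    }
}
```

All 64 bits are consumed in this layout, which is why falling back to the char/int[] mode looks simpler if such mappings ever appear.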

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27628#discussion_r2582390889

