RFR: 8365675: Add String Unicode Case-Folding Support [v11]
Xueming Shen
sherman at openjdk.org
Tue Dec 2 18:38:29 UTC 2025
On Mon, 1 Dec 2025 23:44:52 GMT, Naoto Sato <naoto at openjdk.org> wrote:
>> Xueming Shen has updated the pull request incrementally with one additional commit since the last revision:
>>
>> minor doc formatting update
>
> src/java.base/share/classes/jdk/internal/lang/CaseFolding.java.template line 69:
>
>> 67: * | 1:2 mapping | 0002 | 0000 | xxxx | xxxx | FB02 => 0066 006C
>> 68: * +---+---------+--------+---------+--------+--------+
>> 69: * | 1:3 mapping | 0003 | xxxx | xxxx | xxxx | FB03 => 0066 0066 0069
>
> What if 1:2/3 mappings included non-BMP case folded forms?
1:2 should be fine; we still have enough bits available. 1:3 will be more challenging, but in theory 21 bits x 3 = 63, so we still have the MSB to indicate that it is three non-BMP code points. That said, I assume we might simply fall back to the char/int[] mode when the 'flag' byte indicates, for example, 0004 for 1:2 non-BMP or 0006 for 1:3 non-BMP. I don't think we need to worry much about performance for those 'special' cases, if the standard ever does add such mappings.
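
To make the 21-bit x 3 = 63 arithmetic concrete, here is a rough sketch of how three full code points could share one long with the MSB as the marker. This is not the encoding in CaseFolding.java.template, which reserves the high 16 bits for the flag; the class name, the method names, and the choice of bit 63 as the marker are all assumptions made only for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch only; not the layout used by the JDK.
public class PackedFoldSketch {
    // Every Unicode code point (max U+10FFFF) fits in 21 bits.
    private static final int CP_BITS = 21;
    private static final long CP_MASK = (1L << CP_BITS) - 1;
    // Assumed marker: the sign bit says "three packed 21-bit code points".
    private static final long PACKED3_FLAG = 1L << 63;

    // Packs three code points into bits 62..42, 41..21 and 20..0.
    public static long pack3(int cp0, int cp1, int cp2) {
        for (int cp : new int[] {cp0, cp1, cp2}) {
            if (!Character.isValidCodePoint(cp)) {
                throw new IllegalArgumentException("not a code point: " + cp);
            }
        }
        return PACKED3_FLAG
                | ((long) cp0 << (2 * CP_BITS))
                | ((long) cp1 << CP_BITS)
                | (long) cp2;
    }

    // The marker is the sign bit, so a negative entry is a packed triple.
    public static boolean isPacked3(long entry) {
        return entry < 0;
    }

    // Recovers the three code points; the mask drops the marker bit.
    public static int[] unpack3(long entry) {
        return new int[] {
            (int) ((entry >>> (2 * CP_BITS)) & CP_MASK),
            (int) ((entry >>> CP_BITS) & CP_MASK),
            (int) (entry & CP_MASK)
        };
    }

    public static void main(String[] args) {
        // Arbitrary supplementary code points, not a real folding.
        long entry = pack3(0x10428, 0x1F600, 0x10FFFF);
        System.out.println(isPacked3(entry));
        System.out.println(Arrays.toString(unpack3(entry)));
    }
}
```

The same arithmetic shows why 1:2 is the easy case: two code points need only 42 bits, so a 16-bit flag still fits above them. With three code points, only one bit is left for any flag.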
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27628#discussion_r2582390889