<i18n dev> RFR: 8365675: Add String Unicode Case-Folding Support [v7]

Roger Riggs rriggs at openjdk.org
Thu Oct 30 17:59:42 UTC 2025


On Thu, 30 Oct 2025 03:00:47 GMT, Xueming Shen <sherman at openjdk.org> wrote:

>>> The performance is slightly better, but not as good as I would have expected. Accessing the code point from the long looks a little clumsy, but the logic looks smooth. It needs more work. Opinions?
>> It does look cleaner without the array indexing in the loops.
>> Can the counting of characters (fcnt1, fcnt2) be eliminated by encoding 3 20-bit characters into the long and then checking `f1 != 0` to indicate there are more characters? It's a bit of an odd mix of 16-bit characters vs a single 20-bit char. Are there any 20-bit chars from or to folded replacements in the folding mappings?
>
> Good idea. After removing the fcnt the implementation looks much cleaner and more straightforward. The 1:m folding implementation is also faster. Maybe this is good enough to go :-)
> 
> The latest numbers
> 
> 
> Benchmark                                    Mode  Cnt   Score   Error  Units
> StringCompareToFoldCase.asciiLower           avgt   15  15.874 ± 1.276  ns/op
> StringCompareToFoldCase.asciiLowerEQ         avgt   15   9.915 ± 0.242  ns/op
> StringCompareToFoldCase.asciiLowerEQFC       avgt   15  10.751 ± 0.219  ns/op
> StringCompareToFoldCase.asciiLowerFC         avgt   15  10.277 ± 0.126  ns/op
> StringCompareToFoldCase.asciiUpperLower      avgt   15  12.121 ± 0.699  ns/op
> StringCompareToFoldCase.asciiUpperLowerEQ    avgt   15  10.836 ± 0.746  ns/op
> StringCompareToFoldCase.asciiUpperLowerEQFC  avgt   15   9.091 ± 0.273  ns/op
> StringCompareToFoldCase.asciiUpperLowerFC    avgt   15   9.207 ± 0.255  ns/op
> StringCompareToFoldCase.asciiWithDFFC        avgt   15  38.322 ± 0.975  ns/op
> StringCompareToFoldCase.greekLower           avgt   15  39.746 ± 0.127  ns/op
> StringCompareToFoldCase.greekLowerEQ         avgt   15  39.303 ± 0.063  ns/op
> StringCompareToFoldCase.greekLowerEQFC       avgt   15  20.470 ± 0.329  ns/op
> StringCompareToFoldCase.greekLowerFC         avgt   15  19.734 ± 0.295  ns/op
> StringCompareToFoldCase.greekUpperLower      avgt   15   7.084 ± 0.085  ns/op
> StringCompareToFoldCase.greekUpperLowerEQ    avgt   15   7.472 ± 0.115  ns/op
> StringCompareToFoldCase.greekUpperLowerEQFC  avgt   15   6.608 ± 0.248  ns/op
> StringCompareToFoldCase.greekUpperLowerFC    avgt   15   6.573 ± 0.189  ns/op
> StringCompareToFoldCase.latin1UTF16          avgt   15  24.407 ± 2.157  ns/op
> StringCompareToFoldCase.latin1UTF16EQ        avgt   15  22.632 ± 0.131  ns/op
> StringCompareToFoldCase.latin1UTF16EQFC      avgt   15  29.564 ± 0.655  ns/op
> StringCompareToFoldCase.latin1UTF16FC        avgt   15  29.273 ± 0.324  ns/op
> StringCompareToFoldCase.supLower             avgt   15  54.145 ± 0.075  ns/op
> StringCompareToFoldCase.supLowerEQ           avgt   15  55.545 ± 0.042  ns/op
> StringCompareToFoldCase.supLowerEQFC         avgt   15  24.788 ± 0.180  ns/op
> StringCompareToFoldCase.supLowerFC           avgt   15  24.515 ± 0.025  ns/op
> StringCompareToFoldCase.supUpperLower        avgt   15  14.437 ± 0.127  ns/op
> StringCompareToFoldCase.supUpperLowerEQ      avgt   15  15.253 ± 0.728  ns/op
> StringCompareToFoldCase.supUpperLowerEQFC    avgt   15   9.820 ± 0.104  ns/op
> StringCompareToFoldCase.supUpp...

Looking pretty good.
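
For readers following the thread, here is a minimal sketch of the packing idea discussed above: up to three folded code points stored in one `long`, 20 bits each, with `f != 0` standing in for the removed count. It is not the code from the PR. The class name `FoldPack` and the methods `pack` and `unpack` are hypothetical, and the sketch assumes every folded code point is nonzero and fits in 20 bits (at most U+FFFFF); the full Unicode range up to U+10FFFF would need 21-bit fields, which would still fit three to a `long`.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only; not taken from the PR.
final class FoldPack {
    private static final int BITS = 20;                 // width of one field
    private static final long MASK = (1L << BITS) - 1;  // 0xFFFFF

    // Packs one to three nonzero code points; the first one lands in the lowest bits.
    static long pack(int... cps) {
        if (cps.length == 0 || cps.length > 3) {
            throw new IllegalArgumentException("expected 1 to 3 code points");
        }
        long packed = 0;
        for (int i = cps.length - 1; i >= 0; i--) {
            int cp = cps[i];
            if (cp <= 0 || cp > MASK) {
                throw new IllegalArgumentException("code point must be in 1..0xFFFFF");
            }
            packed = (packed << BITS) | cp;
        }
        return packed;
    }

    // Reads code points back out; a zero remainder means nothing is left,
    // so no separate count is needed.
    static int[] unpack(long f) {
        int[] out = new int[3];
        int n = 0;
        while (f != 0 && n < 3) {
            out[n++] = (int) (f & MASK);
            f >>>= BITS;
        }
        return Arrays.copyOf(out, n);
    }
}
```

A comparison loop could consume `f & MASK` and shift until the value reaches zero, without materializing an array as `unpack` does here. That mirrors the `f1 != 0` check suggested above.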

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/27628#discussion_r2479020631

