<i18n dev> RFR: 8365675: Add String Unicode Case-Folding Support [v7]
Roger Riggs
rriggs at openjdk.org
Thu Oct 30 17:59:42 UTC 2025
On Thu, 30 Oct 2025 03:00:47 GMT, Xueming Shen <sherman at openjdk.org> wrote:
>>> The performance is slightly better, but not as good as I would have expected. The access to the code point from the long looks a little clumsy, but the logic looks smooth. It needs more work. Opinions?
>> It does look cleaner without the array indexing in the loops.
>> Can the counting of characters (fcnt1, fcnt2) be eliminated by encoding 3 20-bit characters into the long and then checking `f1 != 0` to indicate there are more characters? It's a bit of an odd mix of 16-bit characters vs. a single 20-bit char. Are there any 20-bit chars from or to folded replacements in the folding mappings?
>
> Good idea. After removing the fcnt the implementation looks much cleaner and more straightforward. The 1:m folding implementation is also faster. Maybe this is good enough to go :-)
>
> The latest numbers
>
>
> Benchmark                                    Mode  Cnt   Score   Error  Units
> StringCompareToFoldCase.asciiLower           avgt   15  15.874 ± 1.276  ns/op
> StringCompareToFoldCase.asciiLowerEQ         avgt   15   9.915 ± 0.242  ns/op
> StringCompareToFoldCase.asciiLowerEQFC       avgt   15  10.751 ± 0.219  ns/op
> StringCompareToFoldCase.asciiLowerFC         avgt   15  10.277 ± 0.126  ns/op
> StringCompareToFoldCase.asciiUpperLower      avgt   15  12.121 ± 0.699  ns/op
> StringCompareToFoldCase.asciiUpperLowerEQ    avgt   15  10.836 ± 0.746  ns/op
> StringCompareToFoldCase.asciiUpperLowerEQFC  avgt   15   9.091 ± 0.273  ns/op
> StringCompareToFoldCase.asciiUpperLowerFC    avgt   15   9.207 ± 0.255  ns/op
> StringCompareToFoldCase.asciiWithDFFC        avgt   15  38.322 ± 0.975  ns/op
> StringCompareToFoldCase.greekLower           avgt   15  39.746 ± 0.127  ns/op
> StringCompareToFoldCase.greekLowerEQ         avgt   15  39.303 ± 0.063  ns/op
> StringCompareToFoldCase.greekLowerEQFC       avgt   15  20.470 ± 0.329  ns/op
> StringCompareToFoldCase.greekLowerFC         avgt   15  19.734 ± 0.295  ns/op
> StringCompareToFoldCase.greekUpperLower      avgt   15   7.084 ± 0.085  ns/op
> StringCompareToFoldCase.greekUpperLowerEQ    avgt   15   7.472 ± 0.115  ns/op
> StringCompareToFoldCase.greekUpperLowerEQFC  avgt   15   6.608 ± 0.248  ns/op
> StringCompareToFoldCase.greekUpperLowerFC    avgt   15   6.573 ± 0.189  ns/op
> StringCompareToFoldCase.latin1UTF16          avgt   15  24.407 ± 2.157  ns/op
> StringCompareToFoldCase.latin1UTF16EQ        avgt   15  22.632 ± 0.131  ns/op
> StringCompareToFoldCase.latin1UTF16EQFC      avgt   15  29.564 ± 0.655  ns/op
> StringCompareToFoldCase.latin1UTF16FC        avgt   15  29.273 ± 0.324  ns/op
> StringCompareToFoldCase.supLower             avgt   15  54.145 ± 0.075  ns/op
> StringCompareToFoldCase.supLowerEQ           avgt   15  55.545 ± 0.042  ns/op
> StringCompareToFoldCase.supLowerEQFC         avgt   15  24.788 ± 0.180  ns/op
> StringCompareToFoldCase.supLowerFC           avgt   15  24.515 ± 0.025  ns/op
> StringCompareToFoldCase.supUpperLower        avgt   15  14.437 ± 0.127  ns/op
> StringCompareToFoldCase.supUpperLowerEQ      avgt   15  15.253 ± 0.728  ns/op
> StringCompareToFoldCase.supUpperLowerEQFC    avgt   15   9.820 ± 0.104  ns/op
> StringCompareToFoldCase.supUpp...
Looking pretty good.
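
For readers following the thread, the packing idea quoted above (up to three 20-bit code points in one long, with a non-zero check standing in for the fcnt counters) could look roughly like the sketch below. This is only an illustration, not the code in the PR: the class name `PackedFolding`, the methods `pack` and `unpack`, and the low-bits-first slot order are all hypothetical, and it assumes every folded code point is non-zero and fits in 20 bits.

```java
// Hypothetical sketch; names and bit layout are assumptions, not the PR's code.
final class PackedFolding {
    static final int BITS = 20;                 // slot width assumed from the discussion
    static final long MASK = (1L << BITS) - 1;  // 0xFFFFF

    // Packs 1 to 3 code points into a long; the first code point sits in the low bits.
    static long pack(int... cps) {
        if (cps.length == 0 || cps.length > 3) {
            throw new IllegalArgumentException("expected 1 to 3 code points");
        }
        long packed = 0;
        for (int i = cps.length - 1; i >= 0; i--) {
            int cp = cps[i];
            // Zero is reserved as the "no more characters" marker.
            if (cp <= 0 || cp > MASK) {
                throw new IllegalArgumentException("code point does not fit in a slot");
            }
            packed = (packed << BITS) | cp;
        }
        return packed;
    }

    // Reads code points back out; f != 0 means there are more characters,
    // so no separate count has to be carried alongside the value.
    static String unpack(long f) {
        StringBuilder sb = new StringBuilder();
        while (f != 0) {
            sb.appendCodePoint((int) (f & MASK));
            f >>>= BITS;
        }
        return sb.toString();
    }
}
```

For example, a 1:m folding such as U+00DF to "ss" would be stored as `PackedFolding.pack(0x73, 0x73)`, and a comparison loop would consume one slot per iteration until the value reaches zero.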
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/27628#discussion_r2479020631