<i18n dev> RFR: 8302871: Speed up StringLatin1.regionMatchesCI [v4]

Claes Redestad redestad at openjdk.org
Mon Feb 20 16:26:26 UTC 2023


On Mon, 20 Feb 2023 16:16:45 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:

>> src/java.base/share/classes/java/lang/CharacterDataLatin1.java.template line 170:
>> 
>>> 168:      * @return true if the two bytes are considered equals ignoring case in latin1
>>> 169:      */
>>> 170:      static boolean equalsIgnoreCase(byte b1, byte b2) {
>> 
>> Perhaps put this in `CharacterDataLatin1`, keeping it close to toLowerCase/toUpperCase that you're changing to use similar logic with #12623 
>> 
>> If you apply #12623 first - how much difference does this make on the micro you're adding with this PR?
>
> Is it not already in CharacterDataLatin1?
> 
> Here is a comparison of relying on improvements in `CharacterDataLatin1.toUpperCase/toLowerCase` only vs. using `CharacterDataLatin1.equalsIgnoreCase`:
> 
> Character.toUpperCase/toLowerCase only:
> 
> 
> Benchmark                                  (codePoints)  (size)  Mode  Cnt     Score    Error  Units
> RegionMatchesIC.Latin1.regionMatchesIC      ascii-match    1024  avgt   15  1310.582 ± 84.777  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch    1024  avgt   15     4.547 ±  0.545  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC     number-match    1024  avgt   15   686.947 ± 11.850  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch    1024  avgt   15     3.836 ±  0.634  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC       lat1-match    1024  avgt   15  2107.219 ± 17.662  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC    lat1-mismatch    1024  avgt   15     4.924 ±  0.829  ns/op
> 
> 
> CharacterDataLatin1.equalsIgnoreCase:
> 
> 
> Benchmark                                  (codePoints)  (size)  Mode  Cnt     Score    Error  Units
> RegionMatchesIC.Latin1.regionMatchesIC      ascii-match    1024  avgt   15   742.467 ± 34.490  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC   ascii-mismatch    1024  avgt   15     3.960 ±  0.046  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC     number-match    1024  avgt   15   361.158 ± 37.096  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC  number-mismatch    1024  avgt   15     4.039 ±  0.521  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC       lat1-match    1024  avgt   15  1158.091 ± 41.617  ns/op
> RegionMatchesIC.Latin1.regionMatchesIC    lat1-mismatch    1024  avgt   15     4.358 ±  0.123  ns/op

Oops, I lost context and thought this was in `StringLatin1`.

Thanks for running the numbers with #12623. It looks like `equalsIgnoreCase` still gives a big enough improvement on top of that change.
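
For readers without the PR open, here is a minimal sketch of how a byte-wise, case-insensitive Latin-1 comparison of this kind could look. It is not the code under review: the class name `Latin1CaseSketch`, the method bodies, and the `regionMatchesCI` signature are assumptions made for illustration. The sketch relies only on the ISO 8859-1 layout, where upper- and lowercase letters differ in the 0x20 bit, and where 0xD7 (multiplication sign) and 0xF7 (division sign) sit inside the letter ranges without being letters.

```java
import java.nio.charset.StandardCharsets;

public class Latin1CaseSketch {

    // Hypothetical helper: true if the two Latin-1 bytes are equal ignoring case.
    static boolean equalsIgnoreCase(byte b1, byte b2) {
        if (b1 == b2) {
            return true;
        }
        // Clearing bit 0x20 maps a-z onto A-Z and 0xE0-0xFE onto 0xC0-0xDE.
        int upper = b1 & 0xDF;
        if (upper < 'A') {
            return false; // digits, punctuation, control characters
        }
        boolean isLetter = upper <= 'Z'
                || (upper >= 0xC0 && upper <= 0xDE && upper != 0xD7);
        // Only letters may match through the cleared bit.
        return isLetter && upper == (b2 & 0xDF);
    }

    // Hypothetical loop over two Latin-1 byte arrays; callers are assumed
    // to have validated the offsets and length already.
    static boolean regionMatchesCI(byte[] value, int toffset,
                                   byte[] other, int ooffset, int len) {
        for (int i = 0; i < len; i++) {
            if (!equalsIgnoreCase(value[toffset + i], other[ooffset + i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] a = "Hello \u00c6r\u00f8".getBytes(StandardCharsets.ISO_8859_1);
        byte[] b = "hELLO \u00e6R\u00d8".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(regionMatchesCI(a, 0, b, 0, a.length));
        System.out.println(equalsIgnoreCase((byte) 0xD7, (byte) 0xF7));
    }
}
```

The point of the design is that a matching pair costs one mask and a few range checks per byte, instead of separate upper- and lowercase conversions for each side.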

-------------

PR: https://git.openjdk.org/jdk/pull/12632

