RFR: 8285255: refine StringLatin1.regionMatchesCI_UTF16 [v2]

Wed Apr 20 21:00:25 UTC 2022

On Wed, 20 Apr 2022 20:09:06 GMT, XenoAmess <duke at openjdk.java.net> wrote:

>> some thoughts after watching 8285001: Simplify StringLatin1.regionMatches  https://github.com/openjdk/jdk/pull/8292/
>> 
>>             if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
>>                 continue;
>>             }
>> 
>> should be changed to 
>> 
>>             if (((u1 == c1) ? CharacterDataLatin1.instance.toLowerCase(c1) : c1) == Character.toLowerCase(u2)) {
>>                 continue;
>>             }
>> 
>> as:
>> 
>> 1. c1 is LATIN1, so CharacterDataLatin1.instance.toLowerCase seems faster.
>> 2. because c1 is LATIN1, so if u1 != c1, then c1 is already lowercase, and don't need a lowercase cauculation.
>
> XenoAmess has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add jmh test

Thanks for the JMH tests and data.

The only part needed from the JMH run is the last 9 lines.  The rest is noise.
If it was formatted as a literal it would be easier to read.

What I see is that the run with == is quite a bit slower.

With the == check:

StringOther.regionMatchesU1024LL  avgt   15   187.258 ±   1.038  ns/op
StringOther.regionMatchesU1024LU  avgt   15  2589.833 ±   8.823  ns/op
StringOther.regionMatchesU1024UL  avgt   15  2379.645 ±   6.481  ns/op
StringOther.regionMatchesU1024UU  avgt   15   191.587 ±   7.069  ns/op

Without the == check:

StringOther.regionMatchesU1024LL  avgt   15   187.732 ±   1.914  ns/op
StringOther.regionMatchesU1024LU  avgt   15  1324.156 ±  11.761  ns/op
StringOther.regionMatchesU1024UL  avgt   15  1331.857 ±  22.509  ns/op
StringOther.regionMatchesU1024UU  avgt   15   188.872 ±   2.396  ns/op

In the JMH cases, does the long prefix of latin1 characters distort the timings?

-------------

PR: https://git.openjdk.java.net/jdk/pull/8308