Possible optimization in StringLatin1.regionMatchesCI

Mon May 25 21:13:48 UTC 2020

Hi,

I've recently looked through the StringLatin1 code - specifically regionMatchesCI.

I think I have an optimization, but would need someone with more domain knowledge to verify if I'm missing nothing.

Currently, the code does a conversion to uppercase and if that doesn't match it does an additional comparison of the lowercase characters.
What's confusing to me is that there are actually both upper- and lowercase checks needed, but that might be explained by the comment in the UTF-16 version about the Georgian alphabet.

Assuming that the additional lowercase check is needed, I was wondering if this must be on the uppercase variant. Wouldn't it be faster on the character itself to avoid potentially converting a lowercase character to an uppercase character and back?

I think code is actually better explaining what I'm suggesting:

--- a/src/java.base/share/classes/java/lang/StringLatin1.java   Wed May 13 16:18:16 2020 +0200
+++ b/src/java.base/share/classes/java/lang/StringLatin1.java   Mon May 25 22:59:13 2020 +0200
@@ -396,7 +396,7 @@
             if (u1 == u2) {
                 continue;
             }
-            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
+            if (Character.toLowerCase(c1) == Character.toLowerCase(c2)) {
                 continue;
             }
             return false;


And indeed the newer version seems to be faster if I use the following benchmark:

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark {

    @State(Scope.Benchmark)
    public static class ThreadState {
        private String test1 = "test";
        private String test2 = "best";
    }

    @Benchmark
    public boolean test(ThreadState threadState) {
        return threadState.test1.equalsIgnoreCase(threadState.test2);
    }

}

Benchmark                                      Mode  Cnt   Score    Error   Units
MyBenchmark.testOld                  avgt   10   8,843 ±  0,274   ns/op
MyBenchmark.testPatched          avgt   10   7,067 ±  0,177   ns/op

Does this make sense? Do I miss something here? I would appreciate if someone can either explain the shortcomings of the solution above or - in case there aren't any - can maybe sponsor it.

Cheers,
Christoph