Possible optimization in StringLatin1.regionMatchesCI
Christoph Dreis
christoph.dreis at freenet.de
Mon May 25 21:13:48 UTC 2020
Hi,
I've recently looked through the StringLatin1 code - specifically regionMatchesCI.
I think I have found an optimization, but I would need someone with more domain knowledge to verify that I'm not missing anything.
Currently, the code converts both characters to uppercase and, if those don't match, additionally compares the lowercase forms of the uppercased characters.
What confuses me is that both an uppercase and a lowercase check are needed at all, but that might be explained by the comment in the UTF-16 version about the Georgian alphabet.
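To illustrate why a lowercase check can matter outside Latin-1, here is a minimal sketch of a character pair whose uppercase forms differ while the lowercase forms of those uppercase characters agree. It uses U+0130 rather than a Georgian letter, so it is only an analogy to the case that comment mentions; the class and method names are made up for this example.

```java
public class CaseFoldExample {
    // First step of the comparison: compare the uppercase forms.
    static boolean upperMatches(char a, char b) {
        return Character.toUpperCase(a) == Character.toUpperCase(b);
    }

    // Second step: compare the lowercase forms of the uppercased characters.
    static boolean lowerOfUpperMatches(char a, char b) {
        return Character.toLowerCase(Character.toUpperCase(a))
            == Character.toLowerCase(Character.toUpperCase(b));
    }

    public static void main(String[] args) {
        char dottedCapitalI = '\u0130'; // LATIN CAPITAL LETTER I WITH DOT ABOVE
        // The uppercase forms are 'I' and U+0130, which differ.
        System.out.println(upperMatches('i', dottedCapitalI));
        // Both uppercase forms lowercase to 'i'.
        System.out.println(lowerOfUpperMatches('i', dottedCapitalI));
    }
}
```

Neither character pair of this kind fits in a single Latin-1 byte on both sides, which is why it is unclear whether the second step ever fires in StringLatin1.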
Assuming that the additional lowercase check is needed, I was wondering whether it has to be done on the uppercase variant. Wouldn't it be faster to do it on the original character, to avoid potentially converting a lowercase character to uppercase and back?
I think the code explains what I'm suggesting better:
--- a/src/java.base/share/classes/java/lang/StringLatin1.java Wed May 13 16:18:16 2020 +0200
+++ b/src/java.base/share/classes/java/lang/StringLatin1.java Mon May 25 22:59:13 2020 +0200
@@ -396,7 +396,7 @@
             if (u1 == u2) {
                 continue;
             }
-            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
+            if (Character.toLowerCase(c1) == Character.toLowerCase(c2)) {
                 continue;
             }
             return false;
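One way to probe whether the two variants can ever disagree is to compare them exhaustively over all pairs of Latin-1 characters. The following is a minimal sketch under the assumption that the per-character logic is exactly what the diff shows; the class and method names are hypothetical and not part of the JDK.

```java
public class RegionMatchesCheck {
    // Per-character comparison as in the current code:
    // lowercase check on the uppercased characters.
    static boolean matchesOld(char c1, char c2) {
        if (c1 == c2) {
            return true;
        }
        char u1 = Character.toUpperCase(c1);
        char u2 = Character.toUpperCase(c2);
        if (u1 == u2) {
            return true;
        }
        return Character.toLowerCase(u1) == Character.toLowerCase(u2);
    }

    // Per-character comparison as in the proposed patch:
    // lowercase check on the original characters.
    static boolean matchesPatched(char c1, char c2) {
        if (c1 == c2) {
            return true;
        }
        if (Character.toUpperCase(c1) == Character.toUpperCase(c2)) {
            return true;
        }
        return Character.toLowerCase(c1) == Character.toLowerCase(c2);
    }

    // Counts the Latin-1 pairs (0x00..0xFF) on which the two variants differ.
    static int countDisagreements() {
        int disagreements = 0;
        for (int a = 0; a <= 0xFF; a++) {
            for (int b = 0; b <= 0xFF; b++) {
                if (matchesOld((char) a, (char) b) != matchesPatched((char) a, (char) b)) {
                    disagreements++;
                }
            }
        }
        return disagreements;
    }

    public static void main(String[] args) {
        System.out.println("Disagreeing pairs: " + countDisagreements());
    }
}
```

A count of zero would suggest the change is behavior-preserving for Latin-1 input, although it says nothing about the UTF-16 code path.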
The patched version does seem to be faster when I use the following benchmark:
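import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark {
    // Same-length strings that differ in the first character,
    // so the comparison reaches the case-insensitive checks.
    @State(Scope.Benchmark)
    public static class ThreadState {
        private String test1 = "test";
        private String test2 = "best";
    }

    @Benchmark
    public boolean test(ThreadState threadState) {
        return threadState.test1.equalsIgnoreCase(threadState.test2);
    }
}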
Benchmark                Mode  Cnt  Score    Error  Units
MyBenchmark.testOld      avgt   10  8,843 ± 0,274  ns/op
MyBenchmark.testPatched  avgt   10  7,067 ± 0,177  ns/op
Does this make sense? Am I missing something here? I would appreciate it if someone could either explain the shortcomings of the solution above or, in case there aren't any, maybe sponsor it.
Cheers,
Christoph