String.equalsIgnoreCase(...) optimization

Aleksey Shipilev aleksey.shipilev at oracle.com
Wed Apr 27 11:37:54 UTC 2016


Hi Andrey,

On 04/27/2016 12:57 PM, Andrey wrote:
> I published my JMH benchmark on GitHub: https://github.com/volodin-aa/openjdk-benchmark

Please note that you really should compete with JDK 9 String, not with
JDK 8. String.equalsIgnoreCase is different in JDK 9, and the obvious
improvement one can do there is:

diff -r 5a6df35b0f97 src/java.base/share/classes/java/lang/StringLatin1.java
--- a/src/java.base/share/classes/java/lang/StringLatin1.java	Wed Apr 27 09:13:51 2016 +0200
+++ b/src/java.base/share/classes/java/lang/StringLatin1.java	Wed Apr 27 14:16:14 2016 +0300
@@ -315,11 +315,14 @@
                                            byte[] other, int ooffset, int len) {
         int last = toffset + len;
         while (toffset < last) {
-            char c1 = (char)(value[toffset++] & 0xff);
-            char c2 = (char)(other[ooffset++] & 0xff);
-            if (c1 == c2) {
+            byte b1 = value[toffset++];
+            byte b2 = other[ooffset++];
+            if (b1 == b2) {
                 continue;
             }
+            char c1 = (char)(b1 & 0xff);
+            char c2 = (char)(b2 & 0xff);
+
             char u1 = Character.toUpperCase(c1);
             char u2 = Character.toUpperCase(c2);
             if (u1 == u2) {

...which improves performance even on short Strings. Maybe specializing
Character.toUpperCase for Latin1 strings would pay off more. Maybe
specializing regionMatches for complete Strings would be worth the
increased complexity too, but that should be tried on JDK code first.
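
To try the idea outside the JDK sources, here is a minimal standalone
sketch of the patched loop. The class name Latin1RegionMatchSketch and
the latin1() helper are made up for illustration. The tail of the loop
(the toLowerCase check and the final return) is not shown in the diff
above; it is an assumption that follows the documented
String.equalsIgnoreCase definition.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical standalone class; not the actual java.lang.StringLatin1.
final class Latin1RegionMatchSketch {

    // Case-insensitive comparison of two regions of Latin-1 encoded bytes.
    static boolean regionMatchesCI(byte[] value, int toffset,
                                   byte[] other, int ooffset, int len) {
        int last = toffset + len;
        while (toffset < last) {
            byte b1 = value[toffset++];
            byte b2 = other[ooffset++];
            if (b1 == b2) {
                continue; // fast path: identical bytes, no widening to char
            }
            // Slow path: widen to char only when the raw bytes differ.
            char c1 = (char) (b1 & 0xff);
            char c2 = (char) (b2 & 0xff);

            char u1 = Character.toUpperCase(c1);
            char u2 = Character.toUpperCase(c2);
            if (u1 == u2) {
                continue;
            }
            // Assumed tail, following the String.equalsIgnoreCase contract.
            if (Character.toLowerCase(u1) == Character.toLowerCase(u2)) {
                continue;
            }
            return false;
        }
        return true;
    }

    // Illustration-only helper: encode a String as Latin-1 bytes.
    static byte[] latin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

For example, regionMatchesCI(latin1("Hello"), 0, latin1("hELLO"), 0, 5)
takes the slow path on every character and still reports a match, while
two identical inputs never leave the byte comparison.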

To make a good justification for the change, the benchmark should really
test:
 a) Different lengths;
 b) Different starting mismatch offset;
 c) Different Latin1/UTF16 pairs.
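
The three axes above could be covered by generating inputs along these
lines. This is only a sketch: the class and method names are
hypothetical, the lengths are arbitrary, and it is plain Java that
would still need to be wired into JMH (for example via @Param fields).
It relies on JDK 9 compact strings storing a String as UTF16 once it
contains any char above 0xFF, and on U+00FF and U+0178 being a
lower/upper case pair, so a Latin1 string and a UTF16 string can still
be equal ignoring case.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical input generator; all names here are illustrative.
final class EqualsIgnoreCaseInputs {

    // Builds a string of the given length: 'a' everywhere, 'b' at
    // mismatchAt (or no mismatch if mismatchAt < 0), and a final char
    // that selects the expected internal coder.
    static String make(int length, boolean utf16, int mismatchAt) {
        if (length < 1 || mismatchAt >= length - 1) {
            throw new IllegalArgumentException("bad length/mismatch offset");
        }
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length - 1; i++) {
            sb.append(i == mismatchAt ? 'b' : 'a');
        }
        // U+00FF fits in Latin1; its upper case U+0178 does not.
        sb.append(utf16 ? '\u0178' : '\u00ff');
        return sb.toString();
    }

    // Cross product of lengths, mismatch offsets, and coder pairs.
    // Each element is {left, right}.
    static List<String[]> pairs(int[] lengths) {
        List<String[]> out = new ArrayList<>();
        boolean[] coders = {false, true};
        for (int len : lengths) {
            Set<Integer> offsets = new LinkedHashSet<>();
            offsets.add(-1);                 // no mismatch at all
            if (len >= 2) {
                offsets.add(0);              // mismatch at the start
                offsets.add((len - 2) / 2);  // mismatch in the middle
                offsets.add(len - 2);        // mismatch near the end
            }
            for (int off : offsets) {
                for (boolean leftUtf16 : coders) {
                    for (boolean rightUtf16 : coders) {
                        out.add(new String[] {
                            make(len, leftUtf16, -1),
                            make(len, rightUtf16, off)
                        });
                    }
                }
            }
        }
        return out;
    }
}
```

Both sides use the same letters outside the mismatch position, so the
equal-bytes fast path is what gets exercised; a variant with opposite
case on one side would be needed to measure the slow path.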

Note well: these optimizations would require studying the generated code
to look for compiler quirks, and as such would require much more
time than anyone thinks it would take (even after taking Hofstadter's
Law into account).

Thanks,
-Aleksey



