RFR: 8189103 - AARCH64: optimize String indexOf intrinsic

Tue Apr 24 17:14:53 UTC 2018

On 04/20/2018 11:30 AM, Dmitrij Pochepko wrote:
> On 19.04.2018 20:08, Andrew Haley wrote:
>> On 04/18/2018 05:09 PM, Dmitrij Pochepko wrote:
>>> 2) large update of Boyer Moore algorithm implementation by increasing
>>> search table size from 128 to 256 to remove few branches for Latin1
>>> encoding cases and also improve performance for cases when Latin1
>>> symbols with values >128 are met. This patch also significantly improves
>>> search of Latin1 string inside UTF string by upgrading algorithm
>>> logic(the idea is to skip <pattern length> symbols in case pure UTF
>>> symbol is met in source string).
>> I don't see a change to the comment which describes the algorithm.
>>
>> Does that comment need to be changed?
>>
> Do you mean C pseudo-code which describes algorithm in general? Well, 
> algorithm main idea wasn't changed, so, I left pseudo-code untouched to 
> keep it more readable(my patch adds few "if" operators depending on 
> LL/UU/UL cases).

It does a bunch of other things differently.

Pseudocode must correspond to the actual code, otherwise it is
misleading, and it costs the maintenance programmer because they waste
time having to figure out where the code is different from the
pseudocode.

> I can update this comment to reflect exact changes it if you think
> it'll make things clearer.

Thanks.  Your version is closer to the "classic" Boyer-Moore-Horspool
I've seen.

The really important differences are

 //       /* Preprocessing */
 //       for (i = 0; i < ASIZE; ++i)
-//          bc[i] = 0;
+//          bc[i] = m;

and

-//          if (c < ASIZE) bc[c] = i;
+//          // c < 256 for Latin1 string, so, no need for branch
+//          #ifdef PATTERN_STRING_IS_LATIN1
+//          bc[c] = m - i;
+//          #else
+//          if (c < ASIZE) bc[c] = m - i;
+//          #endif

and

+//          #ifdef PATTERN_IS_LATIN1_AND_SOURCE_IS_UTF
+//          // UL case: need if (c<ASIZE) check. Skip <pattern length> if not.
 //          if (c < ASIZE)
-//            j = j - bc[y[j+m-1]] + m;
+//            j += bc[y[j+m-1]];
 //          else
-//            j += 1; // Advance by 1 only if char >= ASIZE
+//            j += m
+//          #endif

I'll grant you that the effect is the same, but without this
explanation it's very hard to understand what is going on.

I modified your benchmark to compare ASCII-only strings.  This patch
shows what I did.

Please show another spreadsheet run of tests so that we can see if
there are ASCII-only regressions.  Thank you

--- ./src/main/java/com/bellsw/simple/IndexOfBench.java	2018-04-17 21:13:37.000000000 +0100
+++ ./src/main/java/com/bellsw/simple/IndexOfBench2.java	2018-04-24 18:11:02.886810345 +0100
@@ -21,10 +21,12 @@
 @OutputTimeUnit(TimeUnit.NANOSECONDS)
 @Fork(3)
 @State(Scope.Thread)
-public class IndexOfBench {
+public class IndexOfBench2 {
     String PATTERN_L;
     String PATTERN_U;
-    static final int ALL = 1000000; // more than L1
+
+    @Param({"1000000"})
+    int ALL;

     @Param({"8", "16", "256", "10000"})
     int sourceSize;
@@ -45,7 +47,7 @@
         char[] patternL = new char[patternSize];
         char[] patternU = new char[patternSize];
         for (int i = 0; i < patternSize; i++) {
-          char c = (char)(255 - i % 255);
+            char c = (char)(127 - i % 127);
           patternL[i] = c;
           // UTF pattern is a mix of UTF/Latin1 characters(even non
           // latin-alphabet-based text has spaces, commas, points e.t.c.)
@@ -61,7 +63,7 @@
         char[] sourceL = new char[sourceSize - patternSize];
         char[] sourceU = new char[sourceSize - patternSize];
         for (int i = 0; i < sourceSize - patternSize; i++) {
-            char c = (char)(2 + i % 254);
+            char c = (char)(2 + i % 126);
             sourceL[i] = c;
             // UTF string is a mix of UTF/Latin1 characters
             sourceU[i] =  ((i & 1) == 0) ? (char)(0xCAFE + (i % 255)) : c;


-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671