RFR: 8189103 - AARCH64: optimize String indexOf intrinsic
Dmitrij Pochepko
dmitrij.pochepko at bell-sw.com
Tue Apr 24 18:22:17 UTC 2018
Hi,
I measured modified benchmark version(LL case only, because we're
talking about latin1-related encoding)
I reduced execution matrix to save time by removing large string cases,
because Boyer-Moore-Horspool is used for strings < 256 only
http://cr.openjdk.java.net/~dpochepk/8189103/str_indexof_modified_result.xls
Summary: results seems good too
Thanks,
Dmitrij
On 24.04.2018 20:14, Andrew Haley wrote:
> On 04/20/2018 11:30 AM, Dmitrij Pochepko wrote:
>> On 19.04.2018 20:08, Andrew Haley wrote:
>>> On 04/18/2018 05:09 PM, Dmitrij Pochepko wrote:
>>>> 2) large update of Boyer Moore algorithm implementation by increasing
>>>> search table size from 128 to 256 to remove few branches for Latin1
>>>> encoding cases and also improve performance for cases when Latin1
>>>> symbols with values >128 are met. This patch also significantly improves
>>>> search of Latin1 string inside UTF string by upgrading algorithm
>>>> logic(the idea is to skip <pattern length> symbols in case pure UTF
>>>> symbol is met in source string).
>>> I don't see a change to the comment which describes the algorithm.
>>>
>>> Does that comment need to be changed?
>>>
>> Do you mean C pseudo-code which describes algorithm in general? Well,
>> algorithm main idea wasn't changed, so, I left pseudo-code untouched to
>> keep it more readable(my patch adds few "if" operators depending on
>> LL/UU/UL cases).
> It does a bunch of other things differently.
>
> Pseudocode must correspond to the actual code, otherwise it is
> misleading, and it costs the maintenance programmer because they waste
> time having to figure out where the code is different from the
> pseudocode.
>
>> I can update this comment to reflect exact changes it if you think
>> it'll make things clearer.
> Thanks. Your version is closer to the "classic" Boyer-Moore-Horspool
> I've seen.
>
> The really important differences are
>
> // /* Preprocessing */
> // for (i = 0; i < ASIZE; ++i)
> -// bc[i] = 0;
> +// bc[i] = m;
>
> and
>
> -// if (c < ASIZE) bc[c] = i;
> +// // c < 256 for Latin1 string, so, no need for branch
> +// #ifdef PATTERN_STRING_IS_LATIN1
> +// bc[c] = m - i;
> +// #else
> +// if (c < ASIZE) bc[c] = m - i;
> +// #endif
>
> and
>
> +// #ifdef PATTERN_IS_LATIN1_AND_SOURCE_IS_UTF
> +// // UL case: need if (c<ASIZE) check. Skip <pattern length> if not.
> // if (c < ASIZE)
> -// j = j - bc[y[j+m-1]] + m;
> +// j += bc[y[j+m-1]];
> // else
> -// j += 1; // Advance by 1 only if char >= ASIZE
> +// j += m
> +// #endif
>
> I'll grant you that the effect is the same, but without this
> explanation it's very hard to understand what is going on.
>
> I modified your benchmark to compare ASCII-only strings. This patch
> shows what I did.
>
> Please show another spreadsheet run of tests so that we can see if
> there are ASCII-only regressions. Thank you
>
>
> --- ./src/main/java/com/bellsw/simple/IndexOfBench.java 2018-04-17 21:13:37.000000000 +0100
> +++ ./src/main/java/com/bellsw/simple/IndexOfBench2.java 2018-04-24 18:11:02.886810345 +0100
> @@ -21,10 +21,12 @@
> @OutputTimeUnit(TimeUnit.NANOSECONDS)
> @Fork(3)
> @State(Scope.Thread)
> -public class IndexOfBench {
> +public class IndexOfBench2 {
> String PATTERN_L;
> String PATTERN_U;
> - static final int ALL = 1000000; // more than L1
> +
> + @Param({"1000000"})
> + int ALL;
>
> @Param({"8", "16", "256", "10000"})
> int sourceSize;
> @@ -45,7 +47,7 @@
> char[] patternL = new char[patternSize];
> char[] patternU = new char[patternSize];
> for (int i = 0; i < patternSize; i++) {
> - char c = (char)(255 - i % 255);
> + char c = (char)(127 - i % 127);
> patternL[i] = c;
> // UTF pattern is a mix of UTF/Latin1 characters(even non
> // latin-alphabet-based text has spaces, commas, points e.t.c.)
> @@ -61,7 +63,7 @@
> char[] sourceL = new char[sourceSize - patternSize];
> char[] sourceU = new char[sourceSize - patternSize];
> for (int i = 0; i < sourceSize - patternSize; i++) {
> - char c = (char)(2 + i % 254);
> + char c = (char)(2 + i % 126);
> sourceL[i] = c;
> // UTF string is a mix of UTF/Latin1 characters
> sourceU[i] = ((i & 1) == 0) ? (char)(0xCAFE + (i % 255)) : c;
>
>
More information about the hotspot-compiler-dev
mailing list