RFR: 8254807: Optimize startsWith() for String.substring()
Xin Liu
xliu at openjdk.java.net
Mon Nov 2 07:35:56 UTC 2020
On Sat, 31 Oct 2020 14:59:16 GMT, Claes Redestad <redestad at openjdk.org> wrote:
>> The optimization transforms code from s=substring(base, beg, end); s.startsWith(prefix)
>> to substring(base, beg, end) | base.startsWith(prefix, beg).
>>
>> It reduces the uses of the substring. Hopefully the C2 optimizer can then remove the substring once it is unused.
>
> Some comments and nits on the microbenchmark.
>
> A general comment is that I think it would be good to add variants exercising UTF16 Strings: one where `sample` has some UTF-16 chars, and one where both `sample` and `prefix` do (latin-1 `sample` and UTF-16 `prefix` could be interesting too, to ensure this variant shortcuts quickly).
>
> Should the `prefix` be something a bit more complex than a single char string? `startsWith("a", off)` is a case that'd be tempting to optimize down to `charAt(off) == 'a'` and then this micro might no longer do what it intends to do.
@cl4es
Thank you for taking the time to review this. I understand you would like to see more variants, such as UTF-16 strings and different prefixes.
This API-level substitution doesn't actually depend on the underlying representation of the string or of the prefix passed to startsWith; it works the same way in either case. The purpose of this microbenchmark is to show that substring() is avoidable in a certain pattern: JIT compilers can achieve performance similar to that of the hand-crafted code. Right now, I have only a single variable, which is the length of the substring.
The results show that throughput is independent of the substring length.
My concern is that the results would become hard to interpret if we introduce more than one variable. Or should I write a group of benchmarks?
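For reference, the pattern under discussion can be sketched in plain Java roughly as follows. This is only an illustration of the source-level equivalence, not the C2 change itself; the class and method names (`SubstringStartsWithSketch`, `viaSubstring`, `viaOffset`) are made up for this sketch, and the explicit bounds check is an assumption added so that both forms throw on the same invalid ranges.

```java
// Illustrative sketch only: names are hypothetical and not part of the PR.
class SubstringStartsWithSketch {

    // Original pattern: materialize the substring, then test its prefix.
    static boolean viaSubstring(String base, int beg, int end, String prefix) {
        return base.substring(beg, end).startsWith(prefix);
    }

    // Hand-written equivalent that never allocates the substring.
    static boolean viaOffset(String base, int beg, int end, String prefix) {
        // Assumption: mirror substring()'s range check so invalid ranges still throw.
        if (beg < 0 || end > base.length() || beg > end) {
            throw new StringIndexOutOfBoundsException(
                "begin " + beg + ", end " + end + ", length " + base.length());
        }
        // The prefix must fit inside the [beg, end) window.
        if (prefix.length() > end - beg) {
            return false;
        }
        // Compare against the base string directly, starting at offset beg.
        return base.startsWith(prefix, beg);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"hello world", "wor"},          // Latin-1 base, Latin-1 prefix
            {"ab\u4f60\u597dcd", "\u4f60"},  // UTF-16 base, UTF-16 prefix
            {"abcdef", "\u4f60"},            // Latin-1 base, UTF-16 prefix
        };
        for (String[] c : cases) {
            String base = c[0];
            String prefix = c[1];
            // Try every valid [beg, end) window and check that both forms agree.
            for (int beg = 0; beg <= base.length(); beg++) {
                for (int end = beg; end <= base.length(); end++) {
                    boolean a = viaSubstring(base, beg, end, prefix);
                    boolean b = viaOffset(base, beg, end, prefix);
                    if (a != b) {
                        throw new AssertionError(base + " [" + beg + "," + end + ") " + prefix);
                    }
                }
            }
        }
        System.out.println("both forms agree on all sampled windows");
    }
}
```

The sampled cases include Latin-1 and UTF-16 strings only to illustrate that the equivalence holds at the API level regardless of the internal encoding; they say nothing about relative performance.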
> test/micro/org/openjdk/bench/vm/compiler/SubstringAndStartsWith.java line 68:
>
>> 66: // compare prefix length with the length of substring
>> 67: if (prefix.length() > substrLength) return false;
>> 68: return sample.startsWith(prefix, substrLength); // substrLength here is actually the beginIdex of substring
>
> Suggestion:
>
> return sample.startsWith(prefix, substrLength); // substrLength here is actually the beginIndex of substring
Thanks. I will fix it in the next revision.
-------------
PR: https://git.openjdk.java.net/jdk/pull/974
More information about the hotspot-compiler-dev mailing list