RFR: 8254807: Optimize startsWith() for String.substring() [v2]
Xin Liu
xliu at openjdk.java.net
Wed Nov 11 05:37:56 UTC 2020
On Mon, 2 Nov 2020 07:32:40 GMT, Xin Liu <xliu at openjdk.org> wrote:
>> Some comments and nits on the microbenchmark.
>>
>> A general comment is that I think it would be good to add variants exercising UTF16 Strings: one where `sample` has some UTF-16 chars, and one where both `sample` and `prefix` do (latin-1 `sample` and UTF-16 `prefix` could be interesting too, to ensure this variant shortcuts quickly).
>>
>> Should the `prefix` be something a bit more complex than a single char string? `startsWith("a", off)` is a case that'd be tempting to optimize down to `charAt(off) == 'a'` and then this micro might no longer do what it intends to do.
>
> @cl4es
>
> Thank you for taking time to review this. I understand you would like to see more variants, such as UTF16 strings and different prefixes.
>
> This api-level substitution actually doesn't care the underlying representation of string and prefix of startsWith. it works in the same way. The purpose of this microbench is to prove that substring() is not inevitable in a certain pattern. JIT compilers can archive similar performance of the hand-craft code. Right now, I have only a single variable, which is the length of substring.
> The result shows that the throughput is irrelevant of the lengths of substrings.
>
> My concern is that we would make results discernible if we introduce more than one variable. or I should write a group of benchmarks?
here is the result of updated micro benchmark.
DoubeleBytes group is the utf-16 string. SingleByte group the ascii string.
without the optimization, the throughput is just 1/4~1/5 than hand-crafted code.
With the optimization, the throughput can reach over 80% of hand-crafted code. The reason it can't reach 100% throughput of hand-crafted equivalence is that it has to generate stricter boundary checks.
// without OptimizeSubstring
Benchmark (substrLength) Mode Cnt Score Error Units
SubstringStartsWith.substr2StartsWith_doubleBytes 4 thrpt 125 40847.644 ? 1269.274 ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes 24 thrpt 125 39284.961 ? 155.109 ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes 256 thrpt 125 12370.385 ? 622.280 ops/ms
SubstringStartsWith.substr2StartsWith_singleByte 4 thrpt 125 60633.544 ? 1989.512 ops/ms
SubstringStartsWith.substr2StartsWith_singleByte 24 thrpt 125 60613.490 ? 1024.846 ops/ms
SubstringStartsWith.substr2StartsWith_singleByte 256 thrpt 125 30212.033 ? 1106.752 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes 4 thrpt 125 154666.457 ? 7.132 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes 24 thrpt 125 154659.583 ? 7.663 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes 256 thrpt 125 154665.357 ? 6.414 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte 4 thrpt 125 162833.972 ? 8.170 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte 24 thrpt 125 162834.059 ? 7.862 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte 256 thrpt 125 162456.046 ? 217.304 ops/ms
// with OptimizeSubstring
Benchmark (substrLength) Mode Cnt Score Error Units
SubstringStartsWith.substr2StartsWith_doubleBytes 4 thrpt 125 119789.181 ? 3374.123 ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes 24 thrpt 125 123740.637 ? 31.982 ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes 256 thrpt 125 123701.525 ? 68.741 ops/ms
SubstringStartsWith.substr2StartsWith_singleByte 4 thrpt 125 134529.257 ? 6.331 ops/ms
SubstringStartsWith.substr2StartsWith_singleByte 24 thrpt 125 134517.222 ? 6.373 ops/ms
SubstringStartsWith.substr2StartsWith_singleByte 256 thrpt 125 134527.929 ? 4.660 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes 4 thrpt 125 154668.784 ? 6.990 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes 24 thrpt 125 154567.971 ? 69.286 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes 256 thrpt 125 154630.115 ? 36.157 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte 4 thrpt 125 162852.894 ? 5.933 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte 24 thrpt 125 162850.630 ? 5.990 ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte 256 thrpt 125 162841.513 ? 6.441 ops/ms
-------------
PR: https://git.openjdk.java.net/jdk/pull/974
More information about the hotspot-compiler-dev
mailing list