RFR: 8254807: Optimize startsWith() for String.substring() [v2]

Xin Liu xliu at openjdk.java.net
Wed Nov 11 05:37:56 UTC 2020


On Mon, 2 Nov 2020 07:32:40 GMT, Xin Liu <xliu at openjdk.org> wrote:

>> Some comments and nits on the microbenchmark.
>> 
>> A general comment is that I think it would be good to add variants exercising UTF16 Strings: one where `sample` has some UTF-16 chars, and one where both `sample` and `prefix` do (latin-1 `sample` and UTF-16 `prefix` could be interesting too, to ensure this variant shortcuts quickly).
>> 
>> Should the `prefix` be something a bit more complex than a single char string? `startsWith("a", off)` is a case that'd be tempting to optimize down to `charAt(off) == 'a'` and then this micro might no longer do what it intends to do.
>
> @cl4es 
> 
> Thank you for taking time to review this.  I understand you would like to see more variants, such as UTF16 strings and different prefixes.  
> 
> This api-level substitution actually doesn't care the underlying representation of string and prefix of startsWith. it works in the same way. The purpose of this microbench is to prove that substring() is not inevitable in a certain pattern.  JIT compilers can archive similar performance of the hand-craft code.  Right now, I have only a single variable, which is the length of substring. 
> The result shows that the throughput is irrelevant of the lengths of substrings.  
> 
> My concern is that we would make results discernible if we introduce more than one variable. or I should write a group of benchmarks?

here is the result of updated micro benchmark. 
DoubeleBytes group is the utf-16 string.  SingleByte group the ascii string.
without the optimization,  the throughput is just 1/4~1/5 than hand-crafted code. 

With the optimization, the throughput can reach over 80%  of hand-crafted code.   The reason it can't reach 100% throughput of hand-crafted equivalence is that it has to generate stricter boundary checks.

// without OptimizeSubstring
Benchmark                                                  (substrLength)   Mode  Cnt       Score      Error   Units
SubstringStartsWith.substr2StartsWith_doubleBytes                       4  thrpt  125   40847.644 ? 1269.274  ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes                      24  thrpt  125   39284.961 ?  155.109  ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes                     256  thrpt  125   12370.385 ?  622.280  ops/ms
SubstringStartsWith.substr2StartsWith_singleByte                        4  thrpt  125   60633.544 ? 1989.512  ops/ms
SubstringStartsWith.substr2StartsWith_singleByte                       24  thrpt  125   60613.490 ? 1024.846  ops/ms
SubstringStartsWith.substr2StartsWith_singleByte                      256  thrpt  125   30212.033 ? 1106.752  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes               4  thrpt  125  154666.457 ?    7.132  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes              24  thrpt  125  154659.583 ?    7.663  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes             256  thrpt  125  154665.357 ?    6.414  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte                4  thrpt  125  162833.972 ?    8.170  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte               24  thrpt  125  162834.059 ?    7.862  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte              256  thrpt  125  162456.046 ?  217.304  ops/ms


// with OptimizeSubstring
Benchmark                                                  (substrLength)   Mode  Cnt       Score      Error   Units
SubstringStartsWith.substr2StartsWith_doubleBytes                       4  thrpt  125  119789.181 ? 3374.123  ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes                      24  thrpt  125  123740.637 ?   31.982  ops/ms
SubstringStartsWith.substr2StartsWith_doubleBytes                     256  thrpt  125  123701.525 ?   68.741  ops/ms
SubstringStartsWith.substr2StartsWith_singleByte                        4  thrpt  125  134529.257 ?    6.331  ops/ms
SubstringStartsWith.substr2StartsWith_singleByte                       24  thrpt  125  134517.222 ?    6.373  ops/ms
SubstringStartsWith.substr2StartsWith_singleByte                      256  thrpt  125  134527.929 ?    4.660  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes               4  thrpt  125  154668.784 ?    6.990  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes              24  thrpt  125  154567.971 ?   69.286  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_doubleBytes             256  thrpt  125  154630.115 ?   36.157  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte                4  thrpt  125  162852.894 ?    5.933  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte               24  thrpt  125  162850.630 ?    5.990  ops/ms
SubstringStartsWith.substr2StartsWith_noalloc_singleByte              256  thrpt  125  162841.513 ?    6.441  ops/ms

-------------

PR: https://git.openjdk.java.net/jdk/pull/974


More information about the hotspot-compiler-dev mailing list