RFR: 8334554: RISC-V: verify & fix perf of string comparison

Hamlin Li mli at openjdk.org
Tue Jun 25 08:50:11 UTC 2024


On Tue, 25 Jun 2024 06:01:15 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Seem it gains benefit on compareToUU, but introduces regression on compareToLL.
>
>> Seems the `Error` column is huge for tests `compareToLL`.
> 
> I think the reason is not enough warmup. Increasing the warmup iterations to 10, I have:
> Before:
> 
> Benchmark                                   (delta)  (size)  Mode  Cnt      Score     Error  Units
> StringCompareToDifferentLength.compareToLL        2      24  avgt    9   4130.834 ±  32.876  us/op
> StringCompareToDifferentLength.compareToLL        2      36  avgt    9   4194.660 ±  50.024  us/op
> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   5632.843 ±  39.958  us/op
> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   5537.939 ± 102.826  us/op
> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   8410.254 ±  48.978  us/op
> StringCompareToDifferentLength.compareToLL        2     512  avgt    9  14190.077 ±  58.298  us/op
> StringCompareToDifferentLength.compareToLU        2      24  avgt    9   4746.320 ±  26.752  us/op
> StringCompareToDifferentLength.compareToLU        2      36  avgt    9   4745.934 ±  29.010  us/op
> StringCompareToDifferentLength.compareToLU        2      72  avgt    9   7010.726 ±  34.604  us/op
> StringCompareToDifferentLength.compareToLU        2     128  avgt    9   6932.810 ± 116.194  us/op
> StringCompareToDifferentLength.compareToLU        2     256  avgt    9  11299.320 ±  71.107  us/op
> StringCompareToDifferentLength.compareToLU        2     512  avgt    9  20284.136 ± 518.531  us/op
> StringCompareToDifferentLength.compareToUL        2      24  avgt    9   4909.746 ±  62.347  us/op
> StringCompareToDifferentLength.compareToUL        2      36  avgt    9   4931.501 ±  21.065  us/op
> StringCompareToDifferentLength.compareToUL        2      72  avgt    9   7120.069 ± 121.244  us/op
> StringCompareToDifferentLength.compareToUL        2     128  avgt    9   7082.143 ±  37.576  us/op
> StringCompareToDifferentLength.compareToUL        2     256  avgt    9  11519.615 ± 159.860  us/op
> StringCompareToDifferentLength.compareToUL        2     512  avgt    9  20657.453 ± 366.615  us/op
> StringCompareToDifferentLength.compareToUU        2      24  avgt    9   4239.266 ±  18.867  us/op
> StringCompareToDifferentLength.compareToUU        2      36  avgt    9   5809.216 ±  32.901  us/op
> StringCompareToDifferentLength.compareToUU        2      72  avgt    9   7388.057 ±  49.952  us/op
> StringCompareToDifferentLength.compareToUU        2     128  avgt    9   9027.548 ±  45.262  us/op
> StringCompareToDifferentLength.compareToUU        2     256  avg...

Thanks for testing!
I merged your test results above into the table below. For the LL/UU tests, the patch brings a regression at small sizes (size==24 and 36 for LL, size==24 for UU) and improves performance at larger sizes; the improvement holds steadily as the size grows.
My previous test results showed a similar pattern; the only difference is that the regression appeared only at size==24.

Benchmark | (delta) | (size) | Mode | Cnt | Score (+rvv), before | Error (before) | Units | Score (+rvv), after | Before/after (higher is better)
-- | -- | -- | -- | -- | -- | -- | -- | -- | --
StringCompareToDifferentLength.compareToLL | 2 | 24 | avgt | 9 | 4130.834 | 32.876 | us/op | 4738.613 | 0.872
StringCompareToDifferentLength.compareToLL | 2 | 36 | avgt | 9 | 4194.66 | 50.024 | us/op | 4791.263 | 0.875
StringCompareToDifferentLength.compareToLL | 2 | 72 | avgt | 9 | 5632.843 | 39.958 | us/op | 4746.75 | 1.187
StringCompareToDifferentLength.compareToLL | 2 | 128 | avgt | 9 | 5537.939 | 102.826 | us/op | 4745.569 | 1.167
StringCompareToDifferentLength.compareToLL | 2 | 256 | avgt | 9 | 8410.254 | 48.978 | us/op | 6770.867 | 1.242
StringCompareToDifferentLength.compareToLL | 2 | 512 | avgt | 9 | 14190.077 | 58.298 | us/op | 10931.753 | 1.298
StringCompareToDifferentLength.compareToLU | 2 | 24 | avgt | 9 | 4746.32 | 26.752 | us/op | 4747.007 | 1
StringCompareToDifferentLength.compareToLU | 2 | 36 | avgt | 9 | 4745.934 | 29.01 | us/op | 4742.046 | 1.001
StringCompareToDifferentLength.compareToLU | 2 | 72 | avgt | 9 | 7010.726 | 34.604 | us/op | 7013.791 | 1
StringCompareToDifferentLength.compareToLU | 2 | 128 | avgt | 9 | 6932.81 | 116.194 | us/op | 6935.089 | 1
StringCompareToDifferentLength.compareToLU | 2 | 256 | avgt | 9 | 11299.32 | 71.107 | us/op | 11467.796 | 0.985
StringCompareToDifferentLength.compareToLU | 2 | 512 | avgt | 9 | 20284.136 | 518.531 | us/op | 20280.133 | 1
StringCompareToDifferentLength.compareToUL | 2 | 24 | avgt | 9 | 4909.746 | 62.347 | us/op | 4918.766 | 0.998
StringCompareToDifferentLength.compareToUL | 2 | 36 | avgt | 9 | 4931.501 | 21.065 | us/op | 4927.695 | 1.001
StringCompareToDifferentLength.compareToUL | 2 | 72 | avgt | 9 | 7120.069 | 121.244 | us/op | 7159.904 | 0.994
StringCompareToDifferentLength.compareToUL | 2 | 128 | avgt | 9 | 7082.143 | 37.576 | us/op | 7097.52 | 0.998
StringCompareToDifferentLength.compareToUL | 2 | 256 | avgt | 9 | 11519.615 | 159.86 | us/op | 11734.633 | 0.982
StringCompareToDifferentLength.compareToUL | 2 | 512 | avgt | 9 | 20657.453 | 366.615 | us/op | 20435.038 | 1.011
StringCompareToDifferentLength.compareToUU | 2 | 24 | avgt | 9 | 4239.266 | 18.867 | us/op | 4864.055 | 0.872
StringCompareToDifferentLength.compareToUU | 2 | 36 | avgt | 9 | 5809.216 | 32.901 | us/op | 4864.965 | 1.194
StringCompareToDifferentLength.compareToUU | 2 | 72 | avgt | 9 | 7388.057 | 49.952 | us/op | 7079.59 | 1.044
StringCompareToDifferentLength.compareToUU | 2 | 128 | avgt | 9 | 9027.548 | 45.262 | us/op | 7260.679 | 1.243
StringCompareToDifferentLength.compareToUU | 2 | 256 | avgt | 9 | 15462.502 | 352.451 | us/op | 11482.104 | 1.347
StringCompareToDifferentLength.compareToUU | 2 | 512 | avgt | 9 | 27770.511 | 54.271 | us/op | 20338.735 | 1.365
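
For reference, the last column is the before score divided by the after score. Since the scores are average times (us/op), a ratio above 1 means the patched build is faster. Below is a minimal sketch of that arithmetic using the compareToLL rows from the table; the class and method names (`SpeedupRatio`, `ratio`) are only illustrative and are not part of the patch or the benchmark.

```java
import java.util.Locale;

public class SpeedupRatio {
    // Before/after ratio for an average-time score (us/op).
    // A value above 1.0 means the "after" run took less time per operation.
    static double ratio(double beforeUsPerOp, double afterUsPerOp) {
        return beforeUsPerOp / afterUsPerOp;
    }

    public static void main(String[] args) {
        // {size, before, after} copied from the compareToLL rows of the table.
        double[][] rows = {
            {24, 4130.834, 4738.613},
            {36, 4194.660, 4791.263},
            {72, 5632.843, 4746.750},
            {128, 5537.939, 4745.569},
            {256, 8410.254, 6770.867},
            {512, 14190.077, 10931.753},
        };
        for (double[] r : rows) {
            // Locale.ROOT keeps the decimal separator a '.' regardless of the host locale.
            System.out.printf(Locale.ROOT, "size=%d before/after=%.3f%n",
                    (int) r[0], ratio(r[1], r[2]));
        }
    }
}
```

The ratio ignores the Error column, so values very close to 1 (as in the LU/UL rows) should be read as within noise.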

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19825#discussion_r1652284050


More information about the hotspot-compiler-dev mailing list