RFR: 8352316: More MergeStoreBench [v7]
Shaojin Wen
swen at openjdk.org
Sat Mar 29 07:47:08 UTC 2025
On Sat, 29 Mar 2025 07:27:24 GMT, Shaojin Wen <swen at openjdk.org> wrote:
>> Added performance tests related to String.getBytes/String.getChars/StringBuilder.append/System.arraycopy in constant scenarios to verify whether MergeStore works
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> add StringBuilderUnsafePut
I added a new scenario, `StringBuilderUnsafePut`, which uses Unsafe to write directly into the StringBuilder in order to append constants.
The performance numbers below show that ArraySetConst, StringBuilderUnsafePut and UnsafePut perform better than the other scenarios.
These numbers suggest that arraycopy on a Stable Value has significant optimization potential and is worth further optimization work in C2.
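To illustrate the kind of difference being measured, here is a minimal sketch of three ways to write the same four constant bytes. It is not the benchmark code from the PR: the class name `MergeStoreSketch`, the constant `SRC`, and the method names are hypothetical, and the single 32-bit store uses a `VarHandle` byte-array view as a supported stand-in for the Unsafe put.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

public class MergeStoreSketch {
    // Hypothetical 4-byte constant; the benchmark's real inputs may differ.
    static final byte[] SRC = {'n', 'u', 'l', 'l'};

    // Views a byte[] as little-endian ints (stand-in for an Unsafe int put).
    static final VarHandle INT_LE =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.LITTLE_ENDIAN);

    // Four adjacent constant byte stores: the shape MergeStore is meant
    // to combine into one wider store.
    static void arraySetConst(byte[] dst, int off) {
        dst[off]     = 'n';
        dst[off + 1] = 'u';
        dst[off + 2] = 'l';
        dst[off + 3] = 'l';
    }

    // The same bytes copied from a constant source array.
    static void arraycopy(byte[] dst, int off) {
        System.arraycopy(SRC, 0, dst, off, 4);
    }

    // One explicit 32-bit store: 'n'=0x6e, 'u'=0x75, 'l'=0x6c, 'l'=0x6c,
    // packed so the lowest byte lands at the lowest address.
    static void singleIntStore(byte[] dst, int off) {
        INT_LE.set(dst, off, 0x6c6c756e);
    }

    public static void main(String[] args) {
        byte[] a = new byte[4], b = new byte[4], c = new byte[4];
        arraySetConst(a, 0);
        arraycopy(b, 0);
        singleIntStore(c, 0);
        // All three variants should leave identical contents in the array.
        System.out.println(Arrays.equals(a, b) && Arrays.equals(b, c));
    }
}
```

All three methods produce the same bytes; the question the benchmark asks is whether C2 compiles the first two down to something as cheap as the third.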
# 1. Script
git remote add wenshao git@github.com:wenshao/jdk.git
git fetch wenshao
git checkout cd1d8fb3b137a741446c894d1893e7180535ce8f
make test TEST="micro:vm.compiler.MergeStoreBench.str"
# 2. aliyun_ecs_c8a_x64 (CPU AMD EPYC™ Genoa)
Benchmark Mode Cnt Score Error Units
MergeStoreBench.str4ArraySetConst avgt 5 1338.414 ± 3.209 ns/op
MergeStoreBench.str4Arraycopy avgt 5 7271.203 ± 19.400 ns/op
MergeStoreBench.str4GetBytes avgt 5 6154.684 ± 9.910 ns/op
MergeStoreBench.str4GetChars avgt 5 14078.790 ± 59.175 ns/op
MergeStoreBench.str4StringBuilder avgt 5 15766.528 ± 4634.119 ns/op
MergeStoreBench.str4StringBuilderAppendChar avgt 5 41388.364 ± 9871.409 ns/op
MergeStoreBench.str4StringBuilderUnsafePut avgt 5 1575.792 ± 4.102 ns/op
MergeStoreBench.str4UnsafePut avgt 5 1326.499 ± 2.400 ns/op
MergeStoreBench.str4Utf16ArrayCopy avgt 5 13949.307 ± 1045.255 ns/op
MergeStoreBench.str4Utf16ArraySetConst avgt 5 1511.967 ± 5.250 ns/op
MergeStoreBench.str4Utf16StringBuilder avgt 5 18030.261 ± 1656.463 ns/op
MergeStoreBench.str4Utf16StringBuilderAppendChar avgt 5 35047.855 ± 16674.635 ns/op
MergeStoreBench.str4Utf16StringBuilderUnsafePut avgt 5 2785.792 ± 5.571 ns/op
MergeStoreBench.str4Utf16UnsafePut avgt 5 1613.812 ± 1.249 ns/op
MergeStoreBench.str5ArraySetConst avgt 5 2599.310 ± 8.667 ns/op
MergeStoreBench.str5Arraycopy avgt 5 9487.926 ± 29.234 ns/op
MergeStoreBench.str5GetBytes avgt 5 5972.453 ± 16.035 ns/op
MergeStoreBench.str5GetChars avgt 5 13516.943 ± 10.978 ns/op
MergeStoreBench.str5StringBuilder avgt 5 16539.070 ± 3097.339 ns/op
MergeStoreBench.str5StringBuilderAppendChar avgt 5 50506.770 ± 11536.414 ns/op
MergeStoreBench.str5StringBuilderUnsafePut avgt 5 2653.493 ± 7.397 ns/op
MergeStoreBench.str5UnsafePut avgt 5 2431.003 ± 10.690 ns/op
MergeStoreBench.str5Utf16ArrayCopy avgt 5 20949.585 ± 1128.737 ns/op
MergeStoreBench.str5Utf16ArraySetConst avgt 5 2933.045 ± 5.864 ns/op
MergeStoreBench.str5Utf16StringBuilder avgt 5 21769.670 ± 4910.378 ns/op
MergeStoreBench.str5Utf16StringBuilderAppendChar avgt 5 47491.137 ± 15262.349 ns/op
MergeStoreBench.str5Utf16StringBuilderUnsafePut avgt 5 2652.690 ± 5.348 ns/op
MergeStoreBench.str5Utf16UnsafePut avgt 5 2871.860 ± 5.845 ns/op
MergeStoreBench.str7ArraySetConst avgt 5 3583.059 ± 22.359 ns/op
MergeStoreBench.str7Arraycopy avgt 5 12289.685 ± 14.769 ns/op
MergeStoreBench.str7GetBytes avgt 5 8968.316 ± 34.194 ns/op
MergeStoreBench.str7GetChars avgt 5 16792.196 ± 72.787 ns/op
MergeStoreBench.str7StringBuilder avgt 5 25231.342 ± 2851.998 ns/op
MergeStoreBench.str7StringBuilderAppendChar avgt 5 67351.162 ± 51.074 ns/op
MergeStoreBench.str7StringBuilderUnsafePut avgt 5 3397.856 ± 7.576 ns/op
MergeStoreBench.str7UnsafePut avgt 5 3578.465 ± 3.344 ns/op
MergeStoreBench.str7Utf16ArrayCopy avgt 5 21314.607 ± 117.545 ns/op
MergeStoreBench.str7Utf16ArraySetConst avgt 5 3915.540 ± 7.042 ns/op
MergeStoreBench.str7Utf16StringBuilder avgt 5 21113.390 ± 1452.353 ns/op
MergeStoreBench.str7Utf16StringBuilderAppendChar avgt 5 79597.044 ± 176.197 ns/op
MergeStoreBench.str7Utf16StringBuilderUnsafePut avgt 5 6413.179 ± 11.302 ns/op
MergeStoreBench.str7Utf16UnsafePut avgt 5 4180.867 ± 7.475 ns/op
# 3. aliyun_ecs_c8i_x64 (CPU Intel® Xeon® Emerald Rapids)
Benchmark Mode Cnt Score Error Units
MergeStoreBench.str4ArraySetConst avgt 5 1558.502 ± 2.989 ns/op
MergeStoreBench.str4Arraycopy avgt 5 5855.148 ± 10.116 ns/op
MergeStoreBench.str4GetBytes avgt 5 5874.873 ± 3.767 ns/op
MergeStoreBench.str4GetChars avgt 5 12674.479 ± 103.618 ns/op
MergeStoreBench.str4StringBuilder avgt 5 16564.323 ± 229.666 ns/op
MergeStoreBench.str4StringBuilderAppendChar avgt 5 39590.870 ± 14968.244 ns/op
MergeStoreBench.str4StringBuilderUnsafePut avgt 5 1797.398 ± 3.972 ns/op
MergeStoreBench.str4UnsafePut avgt 5 1547.226 ± 1.950 ns/op
MergeStoreBench.str4Utf16ArrayCopy avgt 5 13984.076 ± 332.735 ns/op
MergeStoreBench.str4Utf16ArraySetConst avgt 5 2592.408 ± 5.338 ns/op
MergeStoreBench.str4Utf16StringBuilder avgt 5 18244.127 ± 2436.822 ns/op
MergeStoreBench.str4Utf16StringBuilderAppendChar avgt 5 36861.665 ± 10735.884 ns/op
MergeStoreBench.str4Utf16StringBuilderUnsafePut avgt 5 3103.648 ± 0.809 ns/op
MergeStoreBench.str4Utf16UnsafePut avgt 5 2539.181 ± 11.556 ns/op
MergeStoreBench.str5ArraySetConst avgt 5 3006.719 ± 4.606 ns/op
MergeStoreBench.str5Arraycopy avgt 5 7152.151 ± 27.593 ns/op
MergeStoreBench.str5GetBytes avgt 5 5572.568 ± 9.664 ns/op
MergeStoreBench.str5GetChars avgt 5 14478.429 ± 597.483 ns/op
MergeStoreBench.str5StringBuilder avgt 5 18249.007 ± 359.685 ns/op
MergeStoreBench.str5StringBuilderAppendChar avgt 5 48156.310 ± 21354.806 ns/op
MergeStoreBench.str5StringBuilderUnsafePut avgt 5 3039.131 ± 5.040 ns/op
MergeStoreBench.str5UnsafePut avgt 5 2885.440 ± 4.323 ns/op
MergeStoreBench.str5Utf16ArrayCopy avgt 5 4648.957 ± 115.805 ns/op
MergeStoreBench.str5Utf16ArraySetConst avgt 5 3862.566 ± 3.036 ns/op
MergeStoreBench.str5Utf16StringBuilder avgt 5 24592.386 ± 6936.461 ns/op
MergeStoreBench.str5Utf16StringBuilderAppendChar avgt 5 44162.880 ± 36224.171 ns/op
MergeStoreBench.str5Utf16StringBuilderUnsafePut avgt 5 3042.734 ± 9.256 ns/op
MergeStoreBench.str5Utf16UnsafePut avgt 5 3858.479 ± 2.273 ns/op
MergeStoreBench.str7ArraySetConst avgt 5 4656.166 ± 3.053 ns/op
MergeStoreBench.str7Arraycopy avgt 5 12139.304 ± 10.065 ns/op
MergeStoreBench.str7GetBytes avgt 5 11909.980 ± 14.371 ns/op
MergeStoreBench.str7GetChars avgt 5 20885.722 ± 3159.820 ns/op
MergeStoreBench.str7StringBuilder avgt 5 14813.587 ± 354.177 ns/op
MergeStoreBench.str7StringBuilderAppendChar avgt 5 61647.309 ± 153.877 ns/op
MergeStoreBench.str7StringBuilderUnsafePut avgt 5 4256.645 ± 1.095 ns/op
MergeStoreBench.str7UnsafePut avgt 5 4662.482 ± 2.893 ns/op
MergeStoreBench.str7Utf16ArrayCopy avgt 5 4939.354 ± 12.117 ns/op
MergeStoreBench.str7Utf16ArraySetConst avgt 5 5401.214 ± 5.342 ns/op
MergeStoreBench.str7Utf16StringBuilder avgt 5 25070.599 ± 8313.323 ns/op
MergeStoreBench.str7Utf16StringBuilderAppendChar avgt 5 84853.104 ± 210.843 ns/op
MergeStoreBench.str7Utf16StringBuilderUnsafePut avgt 5 5290.793 ± 21.012 ns/op
MergeStoreBench.str7Utf16UnsafePut avgt 5 5502.576 ± 11.820 ns/op
# 4. aliyun_ecs_c8y_aarch64 (CPU Aliyun Yitian 710)
Benchmark Mode Cnt Score Error Units
MergeStoreBench.str4ArraySetConst avgt 5 2229.455 ± 2.024 ns/op
MergeStoreBench.str4Arraycopy avgt 5 8323.527 ± 60.470 ns/op
MergeStoreBench.str4GetBytes avgt 5 7008.143 ± 6.658 ns/op
MergeStoreBench.str4GetChars avgt 5 12343.528 ± 6.584 ns/op
MergeStoreBench.str4StringBuilder avgt 5 21238.814 ± 1410.339 ns/op
MergeStoreBench.str4StringBuilderAppendChar avgt 5 68667.406 ± 720.511 ns/op
MergeStoreBench.str4StringBuilderUnsafePut avgt 5 2281.267 ± 1.324 ns/op
MergeStoreBench.str4UnsafePut avgt 5 2230.367 ± 0.626 ns/op
MergeStoreBench.str4Utf16ArrayCopy avgt 5 16338.896 ± 74.446 ns/op
MergeStoreBench.str4Utf16ArraySetConst avgt 5 3098.749 ± 35.606 ns/op
MergeStoreBench.str4Utf16StringBuilder avgt 5 21491.710 ± 2598.145 ns/op
MergeStoreBench.str4Utf16StringBuilderAppendChar avgt 5 67748.629 ± 2224.953 ns/op
MergeStoreBench.str4Utf16StringBuilderUnsafePut avgt 5 3840.268 ± 2.786 ns/op
MergeStoreBench.str4Utf16UnsafePut avgt 5 2858.839 ± 46.434 ns/op
MergeStoreBench.str5ArraySetConst avgt 5 3769.990 ± 2.877 ns/op
MergeStoreBench.str5Arraycopy avgt 5 10604.229 ± 85.266 ns/op
MergeStoreBench.str5GetBytes avgt 5 6604.073 ± 4.599 ns/op
MergeStoreBench.str5GetChars avgt 5 15499.577 ± 166.819 ns/op
MergeStoreBench.str5StringBuilder avgt 5 22817.332 ± 1330.696 ns/op
MergeStoreBench.str5StringBuilderAppendChar avgt 5 86993.698 ± 419.806 ns/op
MergeStoreBench.str5StringBuilderUnsafePut avgt 5 3803.737 ± 0.974 ns/op
MergeStoreBench.str5UnsafePut avgt 5 3765.698 ± 1.774 ns/op
MergeStoreBench.str5Utf16ArrayCopy avgt 5 5691.730 ± 4.200 ns/op
MergeStoreBench.str5Utf16ArraySetConst avgt 5 4620.050 ± 73.237 ns/op
MergeStoreBench.str5Utf16StringBuilder avgt 5 26974.200 ± 9799.822 ns/op
MergeStoreBench.str5Utf16StringBuilderAppendChar avgt 5 84214.630 ± 1770.595 ns/op
MergeStoreBench.str5Utf16StringBuilderUnsafePut avgt 5 3803.749 ± 2.164 ns/op
MergeStoreBench.str5Utf16UnsafePut avgt 5 4463.146 ± 94.255 ns/op
MergeStoreBench.str7ArraySetConst avgt 5 5905.221 ± 17.324 ns/op
MergeStoreBench.str7Arraycopy avgt 5 14400.712 ± 68.866 ns/op
MergeStoreBench.str7GetBytes avgt 5 11693.448 ± 11.413 ns/op
MergeStoreBench.str7GetChars avgt 5 21262.620 ± 393.963 ns/op
MergeStoreBench.str7StringBuilder avgt 5 21559.944 ± 97.469 ns/op
MergeStoreBench.str7StringBuilderAppendChar avgt 5 120774.017 ± 927.175 ns/op
MergeStoreBench.str7StringBuilderUnsafePut avgt 5 5520.405 ± 5.431 ns/op
MergeStoreBench.str7UnsafePut avgt 5 5918.814 ± 8.237 ns/op
MergeStoreBench.str7Utf16ArrayCopy avgt 5 6348.146 ± 2.766 ns/op
MergeStoreBench.str7Utf16ArraySetConst avgt 5 4333.009 ± 1.980 ns/op
MergeStoreBench.str7Utf16StringBuilder avgt 5 29406.714 ± 9703.134 ns/op
MergeStoreBench.str7Utf16StringBuilderAppendChar avgt 5 117801.880 ± 811.216 ns/op
MergeStoreBench.str7Utf16StringBuilderUnsafePut avgt 5 6684.164 ± 16.496 ns/op
MergeStoreBench.str7Utf16UnsafePut avgt 5 6286.796 ± 316.658 ns/op
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24108#issuecomment-2763215404