RFR: 8352316: More MergeStoreBench [v7]

Shaojin Wen swen at openjdk.org
Sat Mar 29 07:47:08 UTC 2025


On Sat, 29 Mar 2025 07:27:24 GMT, Shaojin Wen <swen at openjdk.org> wrote:

>> Added performance tests related to String.getBytes/String.getChars/StringBuilder.append/System.arraycopy in constant scenarios to verify whether MergeStore works
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add StringBuilderUnsafePut

I added a new scenario, `StringBuilderUnsafePut`, which uses Unsafe to write directly into the StringBuilder in order to append constants.

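The performance numbers below show that ArraySetConst/StringBuilderUnsafePut/UnsafePut generally perform much better than the other variants. On the AMD machine, for example, str4ArraySetConst takes 1338 ns/op against 7271 ns/op for str4Arraycopy.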
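
To put a rough number on the gap, the ratios can be computed directly from the AMD EPYC rows reported in section 2. This is only a small arithmetic sketch; the class and method names are illustrative and not part of the benchmark.

```java
// Illustrative helper, not part of MergeStoreBench: computes how many times
// slower one variant is than another, using scores copied from section 2.
class SpeedupSketch {
    static double ratio(double slowerNsPerOp, double fasterNsPerOp) {
        return slowerNsPerOp / fasterNsPerOp;
    }

    public static void main(String[] args) {
        // str4Arraycopy vs. str4ArraySetConst (AMD EPYC Genoa)
        System.out.printf("Arraycopy / ArraySetConst: %.1fx%n",
                ratio(7271.203, 1338.414));
        // str4StringBuilder vs. str4StringBuilderUnsafePut (AMD EPYC Genoa)
        System.out.printf("StringBuilder / StringBuilderUnsafePut: %.1fx%n",
                ratio(15766.528, 1575.792));
    }
}
```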

These numbers suggest that arraycopy of a Stable Value has substantial optimization potential and deserves further optimization work in C2.
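
For readers unfamiliar with the scenarios, the following is a minimal sketch of three ways to write the same constant bytes into a `byte[]`. It is not the benchmark's code: the class name, method names, and the choice of "null" as the 4-character constant are assumptions made for illustration. The first method issues adjacent constant stores, the pattern MergeStore is meant to combine into a wider store; the other two copy from a constant source.

```java
import java.util.Arrays;

// Hypothetical sketch, not taken from MergeStoreBench.
class ConstStoreSketch {
    // Constant source array used by the arraycopy variant.
    static final byte[] NULL_BYTES = {'n', 'u', 'l', 'l'};

    // Four adjacent constant byte stores (ArraySetConst-style).
    static void setConst(byte[] dst, int off) {
        dst[off]     = 'n';
        dst[off + 1] = 'u';
        dst[off + 2] = 'l';
        dst[off + 3] = 'l';
    }

    // Copy from a constant array (Arraycopy-style).
    static void arraycopy(byte[] dst, int off) {
        System.arraycopy(NULL_BYTES, 0, dst, off, 4);
    }

    // Copy from a constant String (GetBytes-style); this overload is
    // deprecated but still present in java.lang.String.
    @SuppressWarnings("deprecation")
    static void getBytes(byte[] dst, int off) {
        "null".getBytes(0, 4, dst, off);
    }

    public static void main(String[] args) {
        byte[] a = new byte[8], b = new byte[8], c = new byte[8];
        setConst(a, 2);
        arraycopy(b, 2);
        getBytes(c, 2);
        // All three variants should produce identical contents.
        System.out.println(Arrays.equals(a, b) && Arrays.equals(b, c));
    }
}
```

All three methods leave the destination in the same state, so any difference in the measured scores comes from how the JIT compiles each form rather than from the work requested.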

# 1. Script

git remote add wenshao git@github.com:wenshao/jdk.git
git fetch wenshao
git checkout cd1d8fb3b137a741446c894d1893e7180535ce8f
make test TEST="micro:vm.compiler.MergeStoreBench.str"


# 2. aliyun_ecs_c8a_x64 (CPU AMD EPYC™ Genoa)

Benchmark                                         Mode  Cnt      Score       Error  Units
MergeStoreBench.str4ArraySetConst                 avgt    5   1338.414 ±     3.209  ns/op
MergeStoreBench.str4Arraycopy                     avgt    5   7271.203 ±    19.400  ns/op
MergeStoreBench.str4GetBytes                      avgt    5   6154.684 ±     9.910  ns/op
MergeStoreBench.str4GetChars                      avgt    5  14078.790 ±    59.175  ns/op
MergeStoreBench.str4StringBuilder                 avgt    5  15766.528 ±  4634.119  ns/op
MergeStoreBench.str4StringBuilderAppendChar       avgt    5  41388.364 ±  9871.409  ns/op
MergeStoreBench.str4StringBuilderUnsafePut        avgt    5   1575.792 ±     4.102  ns/op
MergeStoreBench.str4UnsafePut                     avgt    5   1326.499 ±     2.400  ns/op
MergeStoreBench.str4Utf16ArrayCopy                avgt    5  13949.307 ±  1045.255  ns/op
MergeStoreBench.str4Utf16ArraySetConst            avgt    5   1511.967 ±     5.250  ns/op
MergeStoreBench.str4Utf16StringBuilder            avgt    5  18030.261 ±  1656.463  ns/op
MergeStoreBench.str4Utf16StringBuilderAppendChar  avgt    5  35047.855 ± 16674.635  ns/op
MergeStoreBench.str4Utf16StringBuilderUnsafePut   avgt    5   2785.792 ±     5.571  ns/op
MergeStoreBench.str4Utf16UnsafePut                avgt    5   1613.812 ±     1.249  ns/op
MergeStoreBench.str5ArraySetConst                 avgt    5   2599.310 ±     8.667  ns/op
MergeStoreBench.str5Arraycopy                     avgt    5   9487.926 ±    29.234  ns/op
MergeStoreBench.str5GetBytes                      avgt    5   5972.453 ±    16.035  ns/op
MergeStoreBench.str5GetChars                      avgt    5  13516.943 ±    10.978  ns/op
MergeStoreBench.str5StringBuilder                 avgt    5  16539.070 ±  3097.339  ns/op
MergeStoreBench.str5StringBuilderAppendChar       avgt    5  50506.770 ± 11536.414  ns/op
MergeStoreBench.str5StringBuilderUnsafePut        avgt    5   2653.493 ±     7.397  ns/op
MergeStoreBench.str5UnsafePut                     avgt    5   2431.003 ±    10.690  ns/op
MergeStoreBench.str5Utf16ArrayCopy                avgt    5  20949.585 ±  1128.737  ns/op
MergeStoreBench.str5Utf16ArraySetConst            avgt    5   2933.045 ±     5.864  ns/op
MergeStoreBench.str5Utf16StringBuilder            avgt    5  21769.670 ±  4910.378  ns/op
MergeStoreBench.str5Utf16StringBuilderAppendChar  avgt    5  47491.137 ± 15262.349  ns/op
MergeStoreBench.str5Utf16StringBuilderUnsafePut   avgt    5   2652.690 ±     5.348  ns/op
MergeStoreBench.str5Utf16UnsafePut                avgt    5   2871.860 ±     5.845  ns/op
MergeStoreBench.str7ArraySetConst                 avgt    5   3583.059 ±    22.359  ns/op
MergeStoreBench.str7Arraycopy                     avgt    5  12289.685 ±    14.769  ns/op
MergeStoreBench.str7GetBytes                      avgt    5   8968.316 ±    34.194  ns/op
MergeStoreBench.str7GetChars                      avgt    5  16792.196 ±    72.787  ns/op
MergeStoreBench.str7StringBuilder                 avgt    5  25231.342 ±  2851.998  ns/op
MergeStoreBench.str7StringBuilderAppendChar       avgt    5  67351.162 ±    51.074  ns/op
MergeStoreBench.str7StringBuilderUnsafePut        avgt    5   3397.856 ±     7.576  ns/op
MergeStoreBench.str7UnsafePut                     avgt    5   3578.465 ±     3.344  ns/op
MergeStoreBench.str7Utf16ArrayCopy                avgt    5  21314.607 ±   117.545  ns/op
MergeStoreBench.str7Utf16ArraySetConst            avgt    5   3915.540 ±     7.042  ns/op
MergeStoreBench.str7Utf16StringBuilder            avgt    5  21113.390 ±  1452.353  ns/op
MergeStoreBench.str7Utf16StringBuilderAppendChar  avgt    5  79597.044 ±   176.197  ns/op
MergeStoreBench.str7Utf16StringBuilderUnsafePut   avgt    5   6413.179 ±    11.302  ns/op
MergeStoreBench.str7Utf16UnsafePut                avgt    5   4180.867 ±     7.475  ns/op


# 3. aliyun_ecs_c8i_x64 (CPU Intel® Xeon® Emerald Rapids)

Benchmark                                         Mode  Cnt      Score       Error  Units
MergeStoreBench.str4ArraySetConst                 avgt    5   1558.502 ±     2.989  ns/op
MergeStoreBench.str4Arraycopy                     avgt    5   5855.148 ±    10.116  ns/op
MergeStoreBench.str4GetBytes                      avgt    5   5874.873 ±     3.767  ns/op
MergeStoreBench.str4GetChars                      avgt    5  12674.479 ±   103.618  ns/op
MergeStoreBench.str4StringBuilder                 avgt    5  16564.323 ±   229.666  ns/op
MergeStoreBench.str4StringBuilderAppendChar       avgt    5  39590.870 ± 14968.244  ns/op
MergeStoreBench.str4StringBuilderUnsafePut        avgt    5   1797.398 ±     3.972  ns/op
MergeStoreBench.str4UnsafePut                     avgt    5   1547.226 ±     1.950  ns/op
MergeStoreBench.str4Utf16ArrayCopy                avgt    5  13984.076 ±   332.735  ns/op
MergeStoreBench.str4Utf16ArraySetConst            avgt    5   2592.408 ±     5.338  ns/op
MergeStoreBench.str4Utf16StringBuilder            avgt    5  18244.127 ±  2436.822  ns/op
MergeStoreBench.str4Utf16StringBuilderAppendChar  avgt    5  36861.665 ± 10735.884  ns/op
MergeStoreBench.str4Utf16StringBuilderUnsafePut   avgt    5   3103.648 ±     0.809  ns/op
MergeStoreBench.str4Utf16UnsafePut                avgt    5   2539.181 ±    11.556  ns/op
MergeStoreBench.str5ArraySetConst                 avgt    5   3006.719 ±     4.606  ns/op
MergeStoreBench.str5Arraycopy                     avgt    5   7152.151 ±    27.593  ns/op
MergeStoreBench.str5GetBytes                      avgt    5   5572.568 ±     9.664  ns/op
MergeStoreBench.str5GetChars                      avgt    5  14478.429 ±   597.483  ns/op
MergeStoreBench.str5StringBuilder                 avgt    5  18249.007 ±   359.685  ns/op
MergeStoreBench.str5StringBuilderAppendChar       avgt    5  48156.310 ± 21354.806  ns/op
MergeStoreBench.str5StringBuilderUnsafePut        avgt    5   3039.131 ±     5.040  ns/op
MergeStoreBench.str5UnsafePut                     avgt    5   2885.440 ±     4.323  ns/op
MergeStoreBench.str5Utf16ArrayCopy                avgt    5   4648.957 ±   115.805  ns/op
MergeStoreBench.str5Utf16ArraySetConst            avgt    5   3862.566 ±     3.036  ns/op
MergeStoreBench.str5Utf16StringBuilder            avgt    5  24592.386 ±  6936.461  ns/op
MergeStoreBench.str5Utf16StringBuilderAppendChar  avgt    5  44162.880 ± 36224.171  ns/op
MergeStoreBench.str5Utf16StringBuilderUnsafePut   avgt    5   3042.734 ±     9.256  ns/op
MergeStoreBench.str5Utf16UnsafePut                avgt    5   3858.479 ±     2.273  ns/op
MergeStoreBench.str7ArraySetConst                 avgt    5   4656.166 ±     3.053  ns/op
MergeStoreBench.str7Arraycopy                     avgt    5  12139.304 ±    10.065  ns/op
MergeStoreBench.str7GetBytes                      avgt    5  11909.980 ±    14.371  ns/op
MergeStoreBench.str7GetChars                      avgt    5  20885.722 ±  3159.820  ns/op
MergeStoreBench.str7StringBuilder                 avgt    5  14813.587 ±   354.177  ns/op
MergeStoreBench.str7StringBuilderAppendChar       avgt    5  61647.309 ±   153.877  ns/op
MergeStoreBench.str7StringBuilderUnsafePut        avgt    5   4256.645 ±     1.095  ns/op
MergeStoreBench.str7UnsafePut                     avgt    5   4662.482 ±     2.893  ns/op
MergeStoreBench.str7Utf16ArrayCopy                avgt    5   4939.354 ±    12.117  ns/op
MergeStoreBench.str7Utf16ArraySetConst            avgt    5   5401.214 ±     5.342  ns/op
MergeStoreBench.str7Utf16StringBuilder            avgt    5  25070.599 ±  8313.323  ns/op
MergeStoreBench.str7Utf16StringBuilderAppendChar  avgt    5  84853.104 ±   210.843  ns/op
MergeStoreBench.str7Utf16StringBuilderUnsafePut   avgt    5   5290.793 ±    21.012  ns/op
MergeStoreBench.str7Utf16UnsafePut                avgt    5   5502.576 ±    11.820  ns/op


# 4. aliyun_ecs_c8y_aarch64 (CPU Aliyun Yitian 710)

Benchmark                                         Mode  Cnt       Score      Error  Units
MergeStoreBench.str4ArraySetConst                 avgt    5    2229.455 ±    2.024  ns/op
MergeStoreBench.str4Arraycopy                     avgt    5    8323.527 ±   60.470  ns/op
MergeStoreBench.str4GetBytes                      avgt    5    7008.143 ±    6.658  ns/op
MergeStoreBench.str4GetChars                      avgt    5   12343.528 ±    6.584  ns/op
MergeStoreBench.str4StringBuilder                 avgt    5   21238.814 ± 1410.339  ns/op
MergeStoreBench.str4StringBuilderAppendChar       avgt    5   68667.406 ±  720.511  ns/op
MergeStoreBench.str4StringBuilderUnsafePut        avgt    5    2281.267 ±    1.324  ns/op
MergeStoreBench.str4UnsafePut                     avgt    5    2230.367 ±    0.626  ns/op
MergeStoreBench.str4Utf16ArrayCopy                avgt    5   16338.896 ±   74.446  ns/op
MergeStoreBench.str4Utf16ArraySetConst            avgt    5    3098.749 ±   35.606  ns/op
MergeStoreBench.str4Utf16StringBuilder            avgt    5   21491.710 ± 2598.145  ns/op
MergeStoreBench.str4Utf16StringBuilderAppendChar  avgt    5   67748.629 ± 2224.953  ns/op
MergeStoreBench.str4Utf16StringBuilderUnsafePut   avgt    5    3840.268 ±    2.786  ns/op
MergeStoreBench.str4Utf16UnsafePut                avgt    5    2858.839 ±   46.434  ns/op
MergeStoreBench.str5ArraySetConst                 avgt    5    3769.990 ±    2.877  ns/op
MergeStoreBench.str5Arraycopy                     avgt    5   10604.229 ±   85.266  ns/op
MergeStoreBench.str5GetBytes                      avgt    5    6604.073 ±    4.599  ns/op
MergeStoreBench.str5GetChars                      avgt    5   15499.577 ±  166.819  ns/op
MergeStoreBench.str5StringBuilder                 avgt    5   22817.332 ± 1330.696  ns/op
MergeStoreBench.str5StringBuilderAppendChar       avgt    5   86993.698 ±  419.806  ns/op
MergeStoreBench.str5StringBuilderUnsafePut        avgt    5    3803.737 ±    0.974  ns/op
MergeStoreBench.str5UnsafePut                     avgt    5    3765.698 ±    1.774  ns/op
MergeStoreBench.str5Utf16ArrayCopy                avgt    5    5691.730 ±    4.200  ns/op
MergeStoreBench.str5Utf16ArraySetConst            avgt    5    4620.050 ±   73.237  ns/op
MergeStoreBench.str5Utf16StringBuilder            avgt    5   26974.200 ± 9799.822  ns/op
MergeStoreBench.str5Utf16StringBuilderAppendChar  avgt    5   84214.630 ± 1770.595  ns/op
MergeStoreBench.str5Utf16StringBuilderUnsafePut   avgt    5    3803.749 ±    2.164  ns/op
MergeStoreBench.str5Utf16UnsafePut                avgt    5    4463.146 ±   94.255  ns/op
MergeStoreBench.str7ArraySetConst                 avgt    5    5905.221 ±   17.324  ns/op
MergeStoreBench.str7Arraycopy                     avgt    5   14400.712 ±   68.866  ns/op
MergeStoreBench.str7GetBytes                      avgt    5   11693.448 ±   11.413  ns/op
MergeStoreBench.str7GetChars                      avgt    5   21262.620 ±  393.963  ns/op
MergeStoreBench.str7StringBuilder                 avgt    5   21559.944 ±   97.469  ns/op
MergeStoreBench.str7StringBuilderAppendChar       avgt    5  120774.017 ±  927.175  ns/op
MergeStoreBench.str7StringBuilderUnsafePut        avgt    5    5520.405 ±    5.431  ns/op
MergeStoreBench.str7UnsafePut                     avgt    5    5918.814 ±    8.237  ns/op
MergeStoreBench.str7Utf16ArrayCopy                avgt    5    6348.146 ±    2.766  ns/op
MergeStoreBench.str7Utf16ArraySetConst            avgt    5    4333.009 ±    1.980  ns/op
MergeStoreBench.str7Utf16StringBuilder            avgt    5   29406.714 ± 9703.134  ns/op
MergeStoreBench.str7Utf16StringBuilderAppendChar  avgt    5  117801.880 ±  811.216  ns/op
MergeStoreBench.str7Utf16StringBuilderUnsafePut   avgt    5    6684.164 ±   16.496  ns/op
MergeStoreBench.str7Utf16UnsafePut                avgt    5    6286.796 ±  316.658  ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24108#issuecomment-2763215404


More information about the hotspot-compiler-dev mailing list