RFR: 8343629: More MergeStore benchmark
Shaojin Wen
swen at openjdk.org
Wed Nov 6 00:31:28 UTC 2024
On Wed, 23 Oct 2024 07:03:33 GMT, Shaojin Wen <swen at openjdk.org> wrote:
> 1. Added the putBytes4 benchmark, which corresponds to StringBuilder appendNull
> 2. Optimized the putChars4/setInt/setLong series of benchmarks to reduce extra overhead and more accurately reflect performance differences.
@eme64
Below are the performance numbers from a run on AMD EPYC™ Genoa (x64). The putBytes4GetBytes scenario is:
"null".getBytes(0, 4, bytes4, off);
Is it possible to do MergeStore in this scenario?
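For readers without the benchmark source at hand, here is a minimal sketch of the two shapes being compared. It assumes that putBytes4 performs four adjacent single-byte constant stores; that body is a guess, since the benchmark source is not quoted in this message. The class name PutBytes4Sketch and the offset used in main are hypothetical. The putBytes4GetBytes body uses the call quoted above, which is the deprecated String.getBytes(int, int, byte[], int) overload.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; not the actual MergeStoreBench code.
class PutBytes4Sketch {
    // Assumed shape of putBytes4: four adjacent constant byte stores,
    // the pattern that a store-merging optimization could combine into one 4-byte store.
    static void putBytes4(byte[] dst, int off) {
        dst[off]     = 'n';
        dst[off + 1] = 'u';
        dst[off + 2] = 'l';
        dst[off + 3] = 'l';
    }

    // The scenario from this message: copy the low bytes of the four chars
    // of "null" into dst through the deprecated String.getBytes overload.
    @SuppressWarnings("deprecation")
    static void putBytes4GetBytes(byte[] dst, int off) {
        "null".getBytes(0, 4, dst, off);
    }

    public static void main(String[] args) {
        byte[] a = new byte[8];
        byte[] b = new byte[8];
        putBytes4(a, 2);
        putBytes4GetBytes(b, 2);
        // Both variants should leave the same bytes in the array.
        System.out.println(Arrays.equals(a, b));
    }
}
```

Both methods write the same four bytes, so the difference in the table between putBytes4 and putBytes4GetBytes comes from how the stores are expressed and compiled, not from what they write.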
Benchmark Mode Cnt Score Error Units
MergeStoreBench.getCharB avgt 5 6038.532 ± 533.982 ns/op
MergeStoreBench.getCharBU avgt 5 4923.182 ± 163.872 ns/op
MergeStoreBench.getCharBV avgt 5 3111.268 ± 84.077 ns/op
MergeStoreBench.getCharC avgt 5 2245.270 ± 33.559 ns/op
MergeStoreBench.getCharL avgt 5 6109.519 ± 249.512 ns/op
MergeStoreBench.getCharLU avgt 5 4552.425 ± 161.933 ns/op
MergeStoreBench.getCharLV avgt 5 2239.866 ± 91.853 ns/op
MergeStoreBench.getIntB avgt 5 8163.035 ± 137.565 ns/op
MergeStoreBench.getIntBU avgt 5 9136.199 ± 259.491 ns/op
MergeStoreBench.getIntBV avgt 5 314.123 ± 4.510 ns/op
MergeStoreBench.getIntL avgt 5 7879.011 ± 10.759 ns/op
MergeStoreBench.getIntLU avgt 5 8968.715 ± 268.414 ns/op
MergeStoreBench.getIntLV avgt 5 2228.228 ± 1.510 ns/op
MergeStoreBench.getIntRB avgt 5 8618.141 ± 22.545 ns/op
MergeStoreBench.getIntRBU avgt 5 11239.977 ± 447.754 ns/op
MergeStoreBench.getIntRL avgt 5 9060.754 ± 236.147 ns/op
MergeStoreBench.getIntRLU avgt 5 9365.050 ± 154.357 ns/op
MergeStoreBench.getIntRU avgt 5 2540.704 ± 75.198 ns/op
MergeStoreBench.getIntU avgt 5 2508.954 ± 74.999 ns/op
MergeStoreBench.getLongB avgt 5 24940.668 ± 16857.311 ns/op
MergeStoreBench.getLongBU avgt 5 14126.468 ± 329.241 ns/op
MergeStoreBench.getLongBV avgt 5 607.128 ± 23.775 ns/op
MergeStoreBench.getLongL avgt 5 25519.679 ± 15393.727 ns/op
MergeStoreBench.getLongLU avgt 5 14598.271 ± 481.158 ns/op
MergeStoreBench.getLongLV avgt 5 2227.659 ± 16.334 ns/op
MergeStoreBench.getLongRB avgt 5 25158.839 ± 18209.451 ns/op
MergeStoreBench.getLongRBU avgt 5 14005.082 ± 208.154 ns/op
MergeStoreBench.getLongRL avgt 5 25303.319 ± 14775.524 ns/op
MergeStoreBench.getLongRLU avgt 5 14481.847 ± 309.623 ns/op
MergeStoreBench.getLongRU avgt 5 3065.744 ± 15.405 ns/op
MergeStoreBench.getLongU avgt 5 3048.522 ± 0.704 ns/op
MergeStoreBench.putBytes4 avgt 5 933.283 ± 6.197 ns/op
MergeStoreBench.putBytes4GetBytes avgt 5 5917.932 ± 199.901 ns/op
MergeStoreBench.putBytes4U avgt 5 944.097 ± 25.902 ns/op
MergeStoreBench.putBytes4X avgt 5 944.714 ± 18.924 ns/op
MergeStoreBench.putChars4B avgt 5 5679.262 ± 154.030 ns/op
MergeStoreBench.putChars4BU avgt 5 1143.133 ± 4.250 ns/op
MergeStoreBench.putChars4BV avgt 5 4530.941 ± 124.318 ns/op
MergeStoreBench.putChars4C avgt 5 1138.541 ± 27.843 ns/op
MergeStoreBench.putChars4L avgt 5 5647.885 ± 112.363 ns/op
MergeStoreBench.putChars4LU avgt 5 1142.501 ± 4.400 ns/op
MergeStoreBench.putChars4LV avgt 5 1143.770 ± 3.435 ns/op
MergeStoreBench.putChars4S avgt 5 1141.919 ± 36.528 ns/op
MergeStoreBench.setCharBS avgt 5 6114.143 ± 144.826 ns/op
MergeStoreBench.setCharBV avgt 5 3607.599 ± 87.720 ns/op
MergeStoreBench.setCharC avgt 5 4510.196 ± 5.445 ns/op
MergeStoreBench.setCharLS avgt 5 5641.424 ± 195.167 ns/op
MergeStoreBench.setCharLV avgt 5 2267.712 ± 40.752 ns/op
MergeStoreBench.setIntB avgt 5 8049.368 ± 233.618 ns/op
MergeStoreBench.setIntBU avgt 5 18052.279 ± 2428.567 ns/op
MergeStoreBench.setIntBV avgt 5 3287.905 ± 63.375 ns/op
MergeStoreBench.setIntL avgt 5 2135.887 ± 62.601 ns/op
MergeStoreBench.setIntLU avgt 5 4795.636 ± 74.974 ns/op
MergeStoreBench.setIntLV avgt 5 2154.363 ± 81.324 ns/op
MergeStoreBench.setIntRB avgt 5 13895.941 ± 7981.782 ns/op
MergeStoreBench.setIntRBU avgt 5 14756.267 ± 1585.571 ns/op
MergeStoreBench.setIntRL avgt 5 3284.792 ± 37.939 ns/op
MergeStoreBench.setIntRLU avgt 5 5958.555 ± 27.404 ns/op
MergeStoreBench.setIntRU avgt 5 5983.119 ± 79.627 ns/op
MergeStoreBench.setIntU avgt 5 4848.655 ± 168.466 ns/op
MergeStoreBench.setLongB avgt 5 31871.401 ± 1233.822 ns/op
MergeStoreBench.setLongBU avgt 5 25704.975 ± 5105.792 ns/op
MergeStoreBench.setLongBV avgt 5 2199.367 ± 69.511 ns/op
MergeStoreBench.setLongL avgt 5 5486.926 ± 30.874 ns/op
MergeStoreBench.setLongLU avgt 5 4503.212 ± 81.635 ns/op
MergeStoreBench.setLongLV avgt 5 2144.943 ± 38.944 ns/op
MergeStoreBench.setLongRB avgt 5 30338.353 ± 1631.512 ns/op
MergeStoreBench.setLongRBU avgt 5 25025.442 ± 2690.138 ns/op
MergeStoreBench.setLongRL avgt 5 4553.245 ± 128.721 ns/op
MergeStoreBench.setLongRLU avgt 5 4793.427 ± 1.474 ns/op
MergeStoreBench.setLongRU avgt 5 4803.963 ± 74.017 ns/op
MergeStoreBench.setLongU avgt 5 4564.326 ± 146.283 ns/op
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21659#issuecomment-2458465745