RFR: 8343629: More MergeStore benchmark [v4]
Shaojin Wen
swen at openjdk.org
Thu Nov 14 00:05:43 UTC 2024
On Wed, 13 Nov 2024 17:06:12 GMT, Shaojin Wen <swen at openjdk.org> wrote:
>> 1. Added the putBytes4 benchmark, which corresponds to StringBuilder appendNull
>> 2. Optimized the putChars4/setInt/setLong series of benchmarks to reduce extra overhead and more accurately reflect performance differences.
>
> Shaojin Wen has updated the pull request incrementally with one additional commit since the last revision:
>
> fix jvmArgs
With jvmArgs set correctly, the benchmarks show a significant performance difference between -MergeStores and +MergeStores.
## List of tests with significant performance improvements
On x64, the putBytes series and the LittleEndian-based tests are significantly faster with +MergeStores.
| Benchmark | -MergeStores | +MergeStores | delta |
| --- | --- | --- | --- |
| MergeStoreBench.putBytes4 | 4475.891 | 929.327 | 381.63% |
| MergeStoreBench.putBytes4U | 4479.133 | 928.502 | 382.40% |
| MergeStoreBench.putBytes4X | 4477.133 | 929.183 | 381.84% |
| MergeStoreBench.putChars4B | 9008.350 | 5638.550 | 59.76% |
| MergeStoreBench.putChars4BU | 8961.671 | 1144.479 | 683.03% |
| MergeStoreBench.putChars4C | 4485.308 | 1133.473 | 295.71% |
| MergeStoreBench.putChars4L | 9013.570 | 5640.893 | 59.79% |
| MergeStoreBench.putChars4LU | 8957.625 | 1142.796 | 683.83% |
| MergeStoreBench.putChars4LV | 4488.698 | 1134.303 | 295.72% |
| MergeStoreBench.putChars4S | 4485.836 | 1133.430 | 295.78% |
| MergeStoreBench.setIntL | 15430.183 | 2113.544 | 630.06% |
| MergeStoreBench.setIntLU | 17361.730 | 4783.040 | 262.99% |
| MergeStoreBench.setIntRL | 16525.068 | 3244.126 | 409.38% |
| MergeStoreBench.setIntRLU | 14401.071 | 5930.149 | 142.85% |
| MergeStoreBench.setLongL | 31353.713 | 5405.189 | 480.07% |
| MergeStoreBench.setLongLU | 26113.756 | 4287.166 | 509.11% |
| MergeStoreBench.setLongRL | 27232.898 | 4523.658 | 502.01% |
| MergeStoreBench.setLongRLU | 26196.973 | 4798.177 | 445.98% |
| MergeStoreBench.setLongU | 4500.659 | 4271.225 | 5.37% |
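As an illustration of the kind of code these results concern, here is a minimal sketch of adjacent byte stores written in little-endian order. The class and method names (`MergeStorePattern`, `putIntLE`, `putNull`) are hypothetical and are not taken from MergeStoreBench; whether C2 actually merges a given shape into one wider store depends on the JDK build and the platform.

```java
import java.util.Arrays;

// Hypothetical illustration, not the code from MergeStoreBench.
final class MergeStorePattern {
    // Writes v into buf[off..off+3], least significant byte first.
    // Four adjacent byte stores like these are the shape a store-merging
    // optimization could combine into a single 4-byte store.
    static void putIntLE(byte[] buf, int off, int v) {
        buf[off]     = (byte) v;
        buf[off + 1] = (byte) (v >> 8);
        buf[off + 2] = (byte) (v >> 16);
        buf[off + 3] = (byte) (v >> 24);
    }

    // Four constant byte stores, similar in spirit to appending "null"
    // to a byte-backed buffer.
    static void putNull(byte[] buf, int off) {
        buf[off]     = (byte) 'n';
        buf[off + 1] = (byte) 'u';
        buf[off + 2] = (byte) 'l';
        buf[off + 3] = (byte) 'l';
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        putIntLE(buf, 0, 0x11223344);
        System.out.println(Arrays.toString(buf));
    }
}
```

The methods only show the store pattern; measuring any effect would still require a harness such as the benchmark in this PR.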
## List of tests that were not improved
On x64 machines, none of the get tests improve significantly, and BigEndian put performance does not improve either. Writing BigEndian byte content on little-endian machines is very common; for example, most network protocols are big-endian. It would be good if MergeStores supported this case as well.
| Benchmark | -MergeStores | +MergeStores | delta |
| --- | --- | --- | --- |
| MergeStoreBench.getCharB | 5908.544 | 5903.216 | 0.09% |
| MergeStoreBench.getCharBU | 4853.054 | 4861.850 | -0.18% |
| MergeStoreBench.getCharBV | 3080.971 | 3081.138 | -0.01% |
| MergeStoreBench.getCharC | 2235.832 | 2235.306 | 0.02% |
| MergeStoreBench.getCharL | 6046.201 | 6034.378 | 0.20% |
| MergeStoreBench.getCharLU | 4934.757 | 4494.743 | 9.79% |
| MergeStoreBench.getCharLV | 2221.754 | 2222.086 | -0.01% |
| MergeStoreBench.getIntB | 8002.830 | 8008.578 | -0.07% |
| MergeStoreBench.getIntBU | 9054.151 | 9048.937 | 0.06% |
| MergeStoreBench.getIntBV | 308.274 | 308.438 | -0.05% |
| MergeStoreBench.getIntL | 7885.680 | 7875.204 | 0.13% |
| MergeStoreBench.getIntLU | 8863.323 | 8866.561 | -0.04% |
| MergeStoreBench.getIntLV | 2228.348 | 2228.067 | 0.01% |
| MergeStoreBench.getIntRB | 8636.679 | 8633.762 | 0.03% |
| MergeStoreBench.getIntRBU | 11102.938 | 11105.491 | -0.02% |
| MergeStoreBench.getIntRL | 8975.416 | 8962.822 | 0.14% |
| MergeStoreBench.getIntRLU | 9249.430 | 9258.589 | -0.10% |
| MergeStoreBench.getIntRU | 2510.359 | 2505.505 | 0.19% |
| MergeStoreBench.getIntU | 2493.932 | 2494.808 | -0.04% |
| MergeStoreBench.getLongB | 24811.283 | 24804.034 | 0.03% |
| MergeStoreBench.getLongBU | 14024.209 | 14013.247 | 0.08% |
| MergeStoreBench.getLongBV | 601.852 | 602.426 | -0.10% |
| MergeStoreBench.getLongL | 25073.219 | 25115.247 | -0.17% |
| MergeStoreBench.getLongLU | 14483.618 | 14497.662 | -0.10% |
| MergeStoreBench.getLongLV | 2225.597 | 2225.810 | -0.01% |
| MergeStoreBench.getLongRB | 24832.411 | 24801.799 | 0.12% |
| MergeStoreBench.getLongRBU | 14027.084 | 14026.284 | 0.01% |
| MergeStoreBench.getLongRL | 25008.679 | 25113.927 | -0.42% |
| MergeStoreBench.getLongRLU | 14425.883 | 14493.830 | -0.47% |
| MergeStoreBench.getLongRU | 3059.614 | 3058.726 | 0.03% |
| MergeStoreBench.getLongU | 3049.682 | 3048.266 | 0.05% |
| MergeStoreBench.putBytes4GetBytes | 5880.164 | 5883.995 | -0.07% |
| MergeStoreBench.putChars4BV | 4488.270 | 4486.457 | 0.04% |
| MergeStoreBench.setCharBS | 6088.826 | 6085.857 | 0.05% |
| MergeStoreBench.setCharBV | 3596.210 | 3595.236 | 0.03% |
| MergeStoreBench.setCharC | 4519.981 | 4471.174 | 1.09% |
| MergeStoreBench.setCharLS | 5619.414 | 5618.239 | 0.02% |
| MergeStoreBench.setCharLV | 2248.493 | 2245.939 | 0.11% |
| MergeStoreBench.setIntB | 8039.705 | 8045.113 | -0.07% |
| MergeStoreBench.setIntBU | 17884.223 | 17764.347 | 0.67% |
| MergeStoreBench.setIntBV | 3239.985 | 3227.997 | 0.37% |
| MergeStoreBench.setIntLV | 2128.975 | 2126.187 | 0.13% |
| MergeStoreBench.setIntRB | 13786.186 | 13815.759 | -0.21% |
| MergeStoreBench.setIntRBU | 14747.463 | 14771.017 | -0.16% |
| MergeStoreBench.setIntRU | 5898.169 | 5875.589 | 0.38% |
| MergeStoreBench.setIntU | 4805.170 | 4784.162 | 0.44% |
| MergeStoreBench.setLongB | 31674.058 | 31662.483 | 0.04% |
| MergeStoreBench.setLongBU | 25696.702 | 25674.394 | 0.09% |
| MergeStoreBench.setLongBV | 2168.387 | 2165.313 | 0.14% |
| MergeStoreBench.setLongLV | 2048.737 | 2116.054 | -3.18% |
| MergeStoreBench.setLongRB | 29901.778 | 29909.501 | -0.03% |
| MergeStoreBench.setLongRBU | 24945.914 | 25005.171 | -0.24% |
| MergeStoreBench.setLongRU | 4797.817 | 4795.018 | 0.06% |
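To make the BigEndian case concrete, the sketch below writes an int most significant byte first, as a network protocol encoder typically would, and shows an equivalent single wide store through a `VarHandle`. The class and method names (`BigEndianStore`, `putIntBE`, `putIntBEVarHandle`) are hypothetical; the sketch is not taken from MergeStoreBench and does not claim how any particular benchmark in the table is implemented.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration, not the code from MergeStoreBench.
final class BigEndianStore {
    // View of a byte[] as ints in big-endian byte order.
    static final VarHandle INT_BE =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.BIG_ENDIAN);

    // Four adjacent byte stores, most significant byte first.
    // On a little-endian machine, merging these into one 4-byte store
    // would additionally require a byte swap of the value.
    static void putIntBE(byte[] buf, int off, int v) {
        buf[off]     = (byte) (v >> 24);
        buf[off + 1] = (byte) (v >> 16);
        buf[off + 2] = (byte) (v >> 8);
        buf[off + 3] = (byte) v;
    }

    // The same write expressed explicitly as one wide store.
    static void putIntBEVarHandle(byte[] buf, int off, int v) {
        INT_BE.set(buf, off, v);
    }

    public static void main(String[] args) {
        byte[] a = new byte[4];
        byte[] b = new byte[4];
        putIntBE(a, 0, 0x11223344);
        putIntBEVarHandle(b, 0, 0x11223344);
        System.out.println(Arrays.equals(a, b));
    }
}
```

Both methods produce the same bytes; the difference is only in whether the wide store is written by hand or left for the compiler to discover.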
## Full tests
| Benchmark | -MergeStores | +MergeStores | delta |
| --- | --- | --- | --- |
| MergeStoreBench.getCharB | 5908.544 | 5903.216 | 0.09% |
| MergeStoreBench.getCharBU | 4853.054 | 4861.850 | -0.18% |
| MergeStoreBench.getCharBV | 3080.971 | 3081.138 | -0.01% |
| MergeStoreBench.getCharC | 2235.832 | 2235.306 | 0.02% |
| MergeStoreBench.getCharL | 6046.201 | 6034.378 | 0.20% |
| MergeStoreBench.getCharLU | 4934.757 | 4494.743 | 9.79% |
| MergeStoreBench.getCharLV | 2221.754 | 2222.086 | -0.01% |
| MergeStoreBench.getIntB | 8002.830 | 8008.578 | -0.07% |
| MergeStoreBench.getIntBU | 9054.151 | 9048.937 | 0.06% |
| MergeStoreBench.getIntBV | 308.274 | 308.438 | -0.05% |
| MergeStoreBench.getIntL | 7885.680 | 7875.204 | 0.13% |
| MergeStoreBench.getIntLU | 8863.323 | 8866.561 | -0.04% |
| MergeStoreBench.getIntLV | 2228.348 | 2228.067 | 0.01% |
| MergeStoreBench.getIntRB | 8636.679 | 8633.762 | 0.03% |
| MergeStoreBench.getIntRBU | 11102.938 | 11105.491 | -0.02% |
| MergeStoreBench.getIntRL | 8975.416 | 8962.822 | 0.14% |
| MergeStoreBench.getIntRLU | 9249.430 | 9258.589 | -0.10% |
| MergeStoreBench.getIntRU | 2510.359 | 2505.505 | 0.19% |
| MergeStoreBench.getIntU | 2493.932 | 2494.808 | -0.04% |
| MergeStoreBench.getLongB | 24811.283 | 24804.034 | 0.03% |
| MergeStoreBench.getLongBU | 14024.209 | 14013.247 | 0.08% |
| MergeStoreBench.getLongBV | 601.852 | 602.426 | -0.10% |
| MergeStoreBench.getLongL | 25073.219 | 25115.247 | -0.17% |
| MergeStoreBench.getLongLU | 14483.618 | 14497.662 | -0.10% |
| MergeStoreBench.getLongLV | 2225.597 | 2225.810 | -0.01% |
| MergeStoreBench.getLongRB | 24832.411 | 24801.799 | 0.12% |
| MergeStoreBench.getLongRBU | 14027.084 | 14026.284 | 0.01% |
| MergeStoreBench.getLongRL | 25008.679 | 25113.927 | -0.42% |
| MergeStoreBench.getLongRLU | 14425.883 | 14493.830 | -0.47% |
| MergeStoreBench.getLongRU | 3059.614 | 3058.726 | 0.03% |
| MergeStoreBench.getLongU | 3049.682 | 3048.266 | 0.05% |
| MergeStoreBench.putBytes4 | 4475.891 | 929.327 | 381.63% |
| MergeStoreBench.putBytes4GetBytes | 5880.164 | 5883.995 | -0.07% |
| MergeStoreBench.putBytes4U | 4479.133 | 928.502 | 382.40% |
| MergeStoreBench.putBytes4X | 4477.133 | 929.183 | 381.84% |
| MergeStoreBench.putChars4B | 9008.350 | 5638.550 | 59.76% |
| MergeStoreBench.putChars4BU | 8961.671 | 1144.479 | 683.03% |
| MergeStoreBench.putChars4BV | 4488.270 | 4486.457 | 0.04% |
| MergeStoreBench.putChars4C | 4485.308 | 1133.473 | 295.71% |
| MergeStoreBench.putChars4L | 9013.570 | 5640.893 | 59.79% |
| MergeStoreBench.putChars4LU | 8957.625 | 1142.796 | 683.83% |
| MergeStoreBench.putChars4LV | 4488.698 | 1134.303 | 295.72% |
| MergeStoreBench.putChars4S | 4485.836 | 1133.430 | 295.78% |
| MergeStoreBench.setCharBS | 6088.826 | 6085.857 | 0.05% |
| MergeStoreBench.setCharBV | 3596.210 | 3595.236 | 0.03% |
| MergeStoreBench.setCharC | 4519.981 | 4471.174 | 1.09% |
| MergeStoreBench.setCharLS | 5619.414 | 5618.239 | 0.02% |
| MergeStoreBench.setCharLV | 2248.493 | 2245.939 | 0.11% |
| MergeStoreBench.setIntB | 8039.705 | 8045.113 | -0.07% |
| MergeStoreBench.setIntBU | 17884.223 | 17764.347 | 0.67% |
| MergeStoreBench.setIntBV | 3239.985 | 3227.997 | 0.37% |
| MergeStoreBench.setIntL | 15430.183 | 2113.544 | 630.06% |
| MergeStoreBench.setIntLU | 17361.730 | 4783.040 | 262.99% |
| MergeStoreBench.setIntLV | 2128.975 | 2126.187 | 0.13% |
| MergeStoreBench.setIntRB | 13786.186 | 13815.759 | -0.21% |
| MergeStoreBench.setIntRBU | 14747.463 | 14771.017 | -0.16% |
| MergeStoreBench.setIntRL | 16525.068 | 3244.126 | 409.38% |
| MergeStoreBench.setIntRLU | 14401.071 | 5930.149 | 142.85% |
| MergeStoreBench.setIntRU | 5898.169 | 5875.589 | 0.38% |
| MergeStoreBench.setIntU | 4805.170 | 4784.162 | 0.44% |
| MergeStoreBench.setLongB | 31674.058 | 31662.483 | 0.04% |
| MergeStoreBench.setLongBU | 25696.702 | 25674.394 | 0.09% |
| MergeStoreBench.setLongBV | 2168.387 | 2165.313 | 0.14% |
| MergeStoreBench.setLongL | 31353.713 | 5405.189 | 480.07% |
| MergeStoreBench.setLongLU | 26113.756 | 4287.166 | 509.11% |
| MergeStoreBench.setLongLV | 2048.737 | 2116.054 | -3.18% |
| MergeStoreBench.setLongRB | 29901.778 | 29909.501 | -0.03% |
| MergeStoreBench.setLongRBU | 24945.914 | 25005.171 | -0.24% |
| MergeStoreBench.setLongRL | 27232.898 | 4523.658 | 502.01% |
| MergeStoreBench.setLongRLU | 26196.973 | 4798.177 | 445.98% |
| MergeStoreBench.setLongRU | 4797.817 | 4795.018 | 0.06% |
| MergeStoreBench.setLongU | 4500.659 | 4271.225 | 5.37% |
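The delta column in these tables is consistent with the ratio of the -MergeStores score to the +MergeStores score, minus one, expressed as a percentage, so a positive delta means the +MergeStores score is lower. A minimal sketch of that calculation follows; the class and method names (`DeltaColumn`, `deltaPercent`) are hypothetical, and the formula is inferred from the table values rather than stated in the original comment.

```java
// Hypothetical helper; the formula is inferred from the table values.
final class DeltaColumn {
    // Relative difference of the -MergeStores score over the +MergeStores score, in percent.
    static double deltaPercent(double minusMergeStores, double plusMergeStores) {
        return (minusMergeStores / plusMergeStores - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // putBytes4 row: 4475.891 vs 929.327
        System.out.printf("%.2f%%%n", deltaPercent(4475.891, 929.327));
        // setLongLV row: 2048.737 vs 2116.054
        System.out.printf("%.2f%%%n", deltaPercent(2048.737, 2116.054));
    }
}
```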
-------------
PR Comment: https://git.openjdk.org/jdk/pull/21659#issuecomment-2475069346