RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10]
Yuri Gaevsky
duke at openjdk.org
Wed Dec 6 22:09:40 UTC 2023
On Wed, 6 Dec 2023 21:58:55 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:
>> Hello All,
>>
>> Please review these changes to support the _vectorizedHashCode intrinsic on
>> the RISC-V platform. The patch adds "scalar" code for the intrinsic that uses
>> no RVV instructions but manually unrolls the relevant loop. Code that uses
>> RVV instructions could be added as a follow-up to this patch or
>> independently.
>>
>> Thanks,
>> -Yuri Gaevsky
>>
>> P.S. My OCA has been accepted recently (ygaevsky).
>>
>> ### Correctness checks
>>
>> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:                  without intrinsic          with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
> Added two temp registers for loads; all loads in the wide loop have been moved to the start of the loop.
The results of commit 99f91d0 are below.
SiFive Unmatched:
Benchmark                   (size)  Mode  Cnt      Score     Error      Score     Error  Units
ArraysHashCode.bytes            10  avgt   15     65.190 ±   0.954     45.527 ±   1.771  ns/op
ArraysHashCode.bytes           100  avgt   15    321.443 ±   5.586    258.807 ±   4.922  ns/op
ArraysHashCode.bytes          1000  avgt   15   2878.206 ±   9.105   2347.219 ±   8.947  ns/op
ArraysHashCode.bytes         10000  avgt   15  28421.840 ±  35.467  23160.425 ±  30.340  ns/op
ArraysHashCode.chars            10  avgt   15     64.544 ±   1.713     50.808 ±   2.629  ns/op
ArraysHashCode.chars           100  avgt   15    338.919 ±   1.623    265.971 ±   4.874  ns/op
ArraysHashCode.chars          1000  avgt   15   2986.972 ±   4.009   2336.699 ±   2.537  ns/op
ArraysHashCode.chars         10000  avgt   15  29474.441 ±  14.634  23161.582 ±  29.067  ns/op
ArraysHashCode.ints             10  avgt   15     57.104 ±   2.517     46.034 ±   0.602  ns/op
ArraysHashCode.ints            100  avgt   15    330.264 ±   4.543    258.327 ±   1.517  ns/op
ArraysHashCode.ints           1000  avgt   15   2995.208 ±   3.188   2339.664 ±   6.849  ns/op
ArraysHashCode.ints          10000  avgt   15  33855.312 ± 115.319  27836.954 ±  27.304  ns/op
ArraysHashCode.multibytes       10  avgt   15     34.378 ±   0.230     27.076 ±   0.108  ns/op
ArraysHashCode.multibytes      100  avgt   15    193.131 ±   0.370    141.907 ±   0.244  ns/op
ArraysHashCode.multibytes     1000  avgt   15   1651.909 ±   7.812   1377.842 ±  10.299  ns/op
ArraysHashCode.multibytes    10000  avgt   15  16620.685 ±  37.854  13960.556 ±  43.473  ns/op
ArraysHashCode.multichars       10  avgt   15     35.104 ±   0.195     26.308 ±   0.127  ns/op
ArraysHashCode.multichars      100  avgt   15    204.391 ±   0.233    144.662 ±   0.337  ns/op
ArraysHashCode.multichars     1000  avgt   15   1902.088 ±   6.922   1579.549 ±   7.266  ns/op
ArraysHashCode.multichars    10000  avgt   15  18905.923 ±  79.263  15952.155 ±  68.664  ns/op
ArraysHashCode.multiints        10  avgt   15     35.111 ±   0.093     26.551 ±   0.264  ns/op
ArraysHashCode.multiints       100  avgt   15    211.251 ±   0.550    153.683 ±   0.208  ns/op
ArraysHashCode.multiints      1000  avgt   15   2223.176 ±   8.982   1927.689 ±   7.075  ns/op
ArraysHashCode.multiints     10000  avgt   15  31567.767 ± 249.609  29463.762 ± 186.245  ns/op
ArraysHashCode.multishorts      10  avgt   15     35.311 ±   0.313     26.372 ±   0.116  ns/op
ArraysHashCode.multishorts     100  avgt   15    203.294 ±   0.241    144.988 ±   0.494  ns/op
ArraysHashCode.multishorts    1000  avgt   15   1898.485 ±   6.704   1579.381 ±   5.600  ns/op
ArraysHashCode.multishorts   10000  avgt   15  18855.850 ±  66.545  15718.005 ±  75.154  ns/op
ArraysHashCode.shorts           10  avgt   15     56.418 ±   0.186     47.488 ±   2.261  ns/op
ArraysHashCode.shorts          100  avgt   15    337.844 ±   1.202    256.671 ±   0.761  ns/op
ArraysHashCode.shorts         1000  avgt   15   2988.457 ±   6.158   2337.570 ±   2.510  ns/op
ArraysHashCode.shorts        10000  avgt   15  29506.107 ±  41.616  23148.772 ±  40.625  ns/op
T-Head RVB-ICE:
Benchmark                   (size)  Mode  Cnt      Score     Error      Score     Error  Units
ArraysHashCode.bytes            10  avgt   15     53.463 ±   0.274     46.625 ±   0.247  ns/op
ArraysHashCode.bytes           100  avgt   15    280.976 ±   1.478    225.197 ±   1.141  ns/op
ArraysHashCode.bytes          1000  avgt   15   2553.393 ±   4.925   1818.613 ±   3.789  ns/op
ArraysHashCode.bytes         10000  avgt   15  25138.794 ±  39.992  16787.514 ±  59.261  ns/op
ArraysHashCode.chars            10  avgt   15     52.075 ±   0.246     45.924 ±   0.561  ns/op
ArraysHashCode.chars           100  avgt   15    283.441 ±   0.743    237.660 ±   1.074  ns/op
ArraysHashCode.chars          1000  avgt   15   2562.833 ±   3.370   1915.665 ±   4.166  ns/op
ArraysHashCode.chars         10000  avgt   15  25168.219 ±  94.226  18843.917 ±  51.859  ns/op
ArraysHashCode.ints             10  avgt   15     52.126 ±   0.382     46.739 ±   0.366  ns/op
ArraysHashCode.ints            100  avgt   15    283.643 ±   0.901    242.191 ±   0.776  ns/op
ArraysHashCode.ints           1000  avgt   15   2556.508 ±   6.937   1913.271 ±   2.920  ns/op
ArraysHashCode.ints          10000  avgt   15  25171.578 ±  51.725  18835.638 ±  49.785  ns/op
ArraysHashCode.multibytes       10  avgt   15     26.432 ±   0.157     18.762 ±   0.184  ns/op
ArraysHashCode.multibytes      100  avgt   15    160.788 ±   0.484    117.339 ±   0.285  ns/op
ArraysHashCode.multibytes     1000  avgt   15   1366.697 ±   9.217    923.814 ±   4.709  ns/op
ArraysHashCode.multibytes    10000  avgt   15  13360.445 ±  22.830   9350.136 ±  18.251  ns/op
ArraysHashCode.multichars       10  avgt   15     26.732 ±   0.181     19.234 ±   0.136  ns/op
ArraysHashCode.multichars      100  avgt   15    164.043 ±   0.310    117.900 ±   0.386  ns/op
ArraysHashCode.multichars     1000  avgt   15   1398.259 ±   2.765   1030.563 ±   2.701  ns/op
ArraysHashCode.multichars    10000  avgt   15  13331.460 ±  21.356   9749.817 ±  23.566  ns/op
ArraysHashCode.multiints        10  avgt   15     25.972 ±   0.135     18.745 ±   0.155  ns/op
ArraysHashCode.multiints       100  avgt   15    169.487 ±   0.357    125.620 ±   0.330  ns/op
ArraysHashCode.multiints      1000  avgt   15   1399.977 ±   9.000   1036.132 ±   3.237  ns/op
ArraysHashCode.multiints     10000  avgt   15  13760.907 ±  23.137  10324.485 ±  18.437  ns/op
ArraysHashCode.multishorts      10  avgt   15     26.541 ±   0.223     19.389 ±   0.151  ns/op
ArraysHashCode.multishorts     100  avgt   15    163.990 ±   0.301    117.797 ±   0.419  ns/op
ArraysHashCode.multishorts    1000  avgt   15   1402.545 ±   3.285   1031.649 ±   7.023  ns/op
ArraysHashCode.multishorts   10000  avgt   15  13349.611 ±  25.599   9778.011 ±  19.135  ns/op
ArraysHashCode.shorts           10  avgt   15     52.037 ±   0.265     46.881 ±   0.636  ns/op
ArraysHashCode.shorts          100  avgt   15    285.775 ±   0.702    244.200 ±   1.012  ns/op
ArraysHashCode.shorts         1000  avgt   15   2553.894 ±   5.309   1926.098 ±   3.496  ns/op
ArraysHashCode.shorts        10000  avgt   15  25201.063 ±  95.129  18843.485 ±  73.870  ns/op
StarFive JH7110:
Benchmark                   (size)  Mode  Cnt      Score     Error      Score     Error  Units
ArraysHashCode.bytes            10  avgt   15     41.093 ±   0.541     34.051 ±   0.032  ns/op
ArraysHashCode.bytes           100  avgt   15    250.250 ±   0.846    201.460 ±   0.631  ns/op
ArraysHashCode.bytes          1000  avgt   15   2283.792 ±   0.293   1855.048 ±   0.337  ns/op
ArraysHashCode.bytes         10000  avgt   15  22613.649 ±  85.647  18454.512 ±  93.310  ns/op
ArraysHashCode.chars            10  avgt   15     45.441 ±   0.108     34.747 ±   0.008  ns/op
ArraysHashCode.chars           100  avgt   15    261.762 ±   1.081    203.169 ±   0.118  ns/op
ArraysHashCode.chars          1000  avgt   15   2372.976 ±   1.541   1856.964 ±   4.764  ns/op
ArraysHashCode.chars         10000  avgt   15  23429.722 ±   6.530  18390.679 ±   2.956  ns/op
ArraysHashCode.ints             10  avgt   15     45.530 ±   0.284     34.744 ±   0.005  ns/op
ArraysHashCode.ints            100  avgt   15    261.117 ±   0.721    203.332 ±   0.218  ns/op
ArraysHashCode.ints           1000  avgt   15   2373.573 ±   3.175   1856.836 ±   0.223  ns/op
ArraysHashCode.ints          10000  avgt   15  29624.472 ±  44.767  24626.598 ±  54.767  ns/op
ArraysHashCode.multibytes       10  avgt   15     26.975 ±   0.259     19.854 ±   0.114  ns/op
ArraysHashCode.multibytes      100  avgt   15    156.220 ±   0.247    113.744 ±   0.366  ns/op
ArraysHashCode.multibytes     1000  avgt   15   1296.236 ±   7.541   1073.224 ±   4.383  ns/op
ArraysHashCode.multibytes    10000  avgt   15  12779.460 ±   2.007  10593.835 ±   5.531  ns/op
ArraysHashCode.multichars       10  avgt   15     27.520 ±   0.102     19.992 ±   0.054  ns/op
ArraysHashCode.multichars      100  avgt   15    166.026 ±   0.695    117.982 ±   0.639  ns/op
ArraysHashCode.multichars     1000  avgt   15   1430.447 ±   1.517   1165.180 ±   5.783  ns/op
ArraysHashCode.multichars    10000  avgt   15  14134.839 ±   6.270  11499.764 ±  37.546  ns/op
ArraysHashCode.multiints        10  avgt   15     26.872 ±   0.066     20.127 ±   0.083  ns/op
ArraysHashCode.multiints       100  avgt   15    178.919 ±   0.245    132.377 ±   0.484  ns/op
ArraysHashCode.multiints      1000  avgt   15   1607.719 ±   2.903   1339.118 ±   8.704  ns/op
ArraysHashCode.multiints     10000  avgt   15  16390.804 ±  49.820  13706.741 ±  11.994  ns/op
ArraysHashCode.multishorts      10  avgt   15     27.749 ±   0.165     20.011 ±   0.096  ns/op
ArraysHashCode.multishorts     100  avgt   15    166.625 ±   0.592    119.115 ±   0.324  ns/op
ArraysHashCode.multishorts    1000  avgt   15   1429.682 ±   1.607   1165.839 ±   6.013  ns/op
ArraysHashCode.multishorts   10000  avgt   15  14199.682 ±   6.483  11493.880 ±   6.484  ns/op
ArraysHashCode.shorts           10  avgt   15     45.878 ±   0.348     34.768 ±   0.116  ns/op
ArraysHashCode.shorts          100  avgt   15    260.598 ±   0.079    203.937 ±   0.078  ns/op
ArraysHashCode.shorts         1000  avgt   15   2374.712 ±   7.961   1857.542 ±   0.248  ns/op
ArraysHashCode.shorts        10000  avgt   15  23428.899 ±   4.859  18433.195 ±  34.224  ns/op
The improvements on SiFive/StarFive came after moving all memory loads up to the start of the loop. Could this be down to differences between out-of-order and in-order CPUs?
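
For anyone following the thread without the diff open, here is a rough Java model of the loop shape being discussed: the 31-based hash recurrence unrolled by hand, with all element loads issued at the top of each wide iteration before any multiply-add work. This is only a sketch under assumptions. The patch itself emits RISC-V code from C2, not Java; the unroll factor of 4 and the names `UnrolledHashSketch`, `scalarHash` and `unrolledHash` are hypothetical and chosen for illustration.

```java
public class UnrolledHashSketch {
    // Reference recurrence: h = 31 * h + a[i], starting from 1.
    // This matches java.util.Arrays.hashCode(int[]) for non-null arrays.
    public static int scalarHash(int[] a) {
        int h = 1;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Hand-unrolled variant. The unroll factor (4) is an assumption made
    // for this sketch, not the factor used by the patch.
    public static int unrolledHash(int[] a) {
        final int p2 = 31 * 31;
        final int p3 = p2 * 31;
        final int p4 = p3 * 31;
        int h = 1;
        int i = 0;
        final int limit = a.length - 3;
        for (; i < limit; i += 4) {
            // All loads first, so they do not wait on the arithmetic below.
            int v0 = a[i];
            int v1 = a[i + 1];
            int v2 = a[i + 2];
            int v3 = a[i + 3];
            // Four steps of the recurrence folded into one expression;
            // int overflow wraps the same way as in the scalar loop.
            h = p4 * h + p3 * v0 + p2 * v1 + 31 * v2 + v3;
        }
        // Tail loop for the remaining 0..3 elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
        System.out.println(unrolledHash(data) == java.util.Arrays.hashCode(data));
    }
}
```

Grouping the loads ahead of the arithmetic gives an in-order core several independent loads to issue back to back, whereas an out-of-order core can often find that overlap on its own. Whether that explains the board-to-board differences above is still an open question.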
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1843766166