RFR: 8318217: RISC-V: C2 VectorizedHashCode [v9]
Fei Yang
fyang at openjdk.org
Wed Dec 6 09:27:46 UTC 2023
On Tue, 5 Dec 2023 12:57:05 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:
>> Hello All,
>>
>> Please review these changes to support the _vectorizedHashCode intrinsic on
>> the RISC-V platform. The patch adds the "scalar" code for the intrinsic without
>> using any RVV instructions, but manually unrolls the relevant loop. Code that
>> uses RVV instructions could be added as a follow-up to this patch or
>> independently.
>>
>> Thanks,
>> -Yuri Gaevsky
>>
>> P.S. My OCA has been accepted recently (ygaevsky).
>>
>> ### Correctness checks
>>
>> Testing: tier1 tests passed on a RISC-V StarFive JH7110 board running Linux.
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:                 without intrinsic      with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
> Changed lb-->lbu for T_BOOLEAN and iRegINoSp-->iRegLNoSp for tmp2/tmp3.
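For readers following along, the "manual unrolling" of the scalar loop mentioned in the quoted description can be pictured roughly as below. This is only a Java-level sketch and not the assembly from the patch: the class and method names are made up for illustration, and the unroll factor of 4 is an assumption rather than something taken from the PR. It relies only on the documented definition of Arrays.hashCode(int[]), i.e. h = 31 * h + a[i] starting from 1.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the JDK or of the patch.
final class UnrolledHashSketch {

    // Reference loop: same polynomial as Arrays.hashCode(int[]) for non-null input.
    static int simpleHash(int[] a) {
        int h = 1;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Same polynomial, four elements per iteration (unroll factor is an assumption).
    // Int overflow wraps, so regrouping the multiplications does not change the result.
    static int unrolledHash(int[] a) {
        final int p1 = 31;
        final int p2 = 31 * 31;
        final int p3 = 31 * 31 * 31;
        final int p4 = 31 * 31 * 31 * 31;
        int h = 1;
        int i = 0;
        int limit = a.length - (a.length % 4);
        for (; i < limit; i += 4) {
            h = p4 * h + p3 * a[i] + p2 * a[i + 1] + p1 * a[i + 2] + a[i + 3];
        }
        // Tail: remaining 0..3 elements, one at a time.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, -1, 4, 1, -5, 9, 2, 6, 5, 3};
        // Compare the unrolled variant against the library implementation.
        System.out.println(unrolledHash(data) == Arrays.hashCode(data));
    }
}
```

The point of the regrouping is that the four multiply-adds in one iteration no longer form a single dependency chain through h, which is what gives an in-order core a chance to overlap them.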
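So I tried this on a SiFive Unmatched board. Unfortunately, I see some performance regressions with this change, most visibly at size 10000 (for example, ArraysHashCode.bytes goes from 28380.470 to 31506.478 ns/op).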
Before:
Benchmark                   (size)  Mode  Cnt      Score      Error  Units
ArraysHashCode.bytes             1  avgt   15     19.737 ±    5.405  ns/op
ArraysHashCode.bytes            10  avgt   15     56.102 ±    3.191  ns/op
ArraysHashCode.bytes           100  avgt   15    317.126 ±    3.452  ns/op
ArraysHashCode.bytes         10000  avgt   15  28380.470 ±   20.709  ns/op
ArraysHashCode.chars             1  avgt   15     15.532 ±    2.623  ns/op
ArraysHashCode.chars            10  avgt   15     59.603 ±    2.440  ns/op
ArraysHashCode.chars           100  avgt   15    333.995 ±    3.834  ns/op
ArraysHashCode.chars         10000  avgt   15  29464.768 ±   16.751  ns/op
ArraysHashCode.ints              1  avgt   15     16.031 ±    2.820  ns/op
ArraysHashCode.ints             10  avgt   15     59.506 ±    3.980  ns/op
ArraysHashCode.ints            100  avgt   15    335.514 ±    4.695  ns/op
ArraysHashCode.ints          10000  avgt   15  33966.175 ±  929.859  ns/op
ArraysHashCode.multibytes        1  avgt   15      7.840 ±    0.110  ns/op
ArraysHashCode.multibytes       10  avgt   15     34.727 ±    0.547  ns/op
ArraysHashCode.multibytes      100  avgt   15    193.085 ±    0.814  ns/op
ArraysHashCode.multibytes    10000  avgt   15  16610.239 ±   27.290  ns/op
ArraysHashCode.multichars        1  avgt   15      7.853 ±    0.092  ns/op
ArraysHashCode.multichars       10  avgt   15     35.059 ±    0.241  ns/op
ArraysHashCode.multichars      100  avgt   15    203.483 ±    0.413  ns/op
ArraysHashCode.multichars    10000  avgt   15  18819.804 ±   75.487  ns/op
ArraysHashCode.multiints         1  avgt   15      7.878 ±    0.104  ns/op
ArraysHashCode.multiints        10  avgt   15     35.232 ±    0.196  ns/op
ArraysHashCode.multiints       100  avgt   15    211.087 ±    1.914  ns/op
ArraysHashCode.multiints     10000  avgt   15  30172.693 ± 1447.757  ns/op
ArraysHashCode.multishorts       1  avgt   15      7.788 ±    0.046  ns/op
ArraysHashCode.multishorts      10  avgt   15     35.504 ±    0.465  ns/op
ArraysHashCode.multishorts     100  avgt   15    203.530 ±    0.342  ns/op
ArraysHashCode.multishorts   10000  avgt   15  18801.799 ±   77.159  ns/op
ArraysHashCode.shorts            1  avgt   15     19.685 ±    5.413  ns/op
ArraysHashCode.shorts           10  avgt   15     59.583 ±    4.684  ns/op
ArraysHashCode.shorts          100  avgt   15    333.170 ±    5.367  ns/op
ArraysHashCode.shorts        10000  avgt   15  29455.665 ±   13.302  ns/op
After:
Benchmark                   (size)  Mode  Cnt      Score      Error  Units
ArraysHashCode.bytes             1  avgt   15     18.575 ±    3.780  ns/op
ArraysHashCode.bytes            10  avgt   15     55.394 ±    4.610  ns/op
ArraysHashCode.bytes           100  avgt   15    340.807 ±    3.387  ns/op
ArraysHashCode.bytes         10000  avgt   15  31506.478 ±   27.694  ns/op
ArraysHashCode.chars             1  avgt   15     15.966 ±    2.291  ns/op
ArraysHashCode.chars            10  avgt   15     56.524 ±    4.301  ns/op
ArraysHashCode.chars           100  avgt   15    343.389 ±    3.272  ns/op
ArraysHashCode.chars         10000  avgt   15  31520.717 ±   13.290  ns/op
ArraysHashCode.ints              1  avgt   15     16.078 ±    3.977  ns/op
ArraysHashCode.ints             10  avgt   15     55.467 ±    2.845  ns/op
ArraysHashCode.ints            100  avgt   15    344.500 ±    3.531  ns/op
ArraysHashCode.ints          10000  avgt   15  36234.542 ±   39.191  ns/op
ArraysHashCode.multibytes        1  avgt   15      7.816 ±    0.072  ns/op
ArraysHashCode.multibytes       10  avgt   15     29.617 ±    0.257  ns/op
ArraysHashCode.multibytes      100  avgt   15    183.986 ±    0.236  ns/op
ArraysHashCode.multibytes    10000  avgt   15  18349.268 ±   28.711  ns/op
ArraysHashCode.multichars        1  avgt   15      7.821 ±    0.050  ns/op
ArraysHashCode.multichars       10  avgt   15     29.293 ±    0.273  ns/op
ArraysHashCode.multichars      100  avgt   15    186.538 ±    0.404  ns/op
ArraysHashCode.multichars    10000  avgt   15  20149.487 ±   87.300  ns/op
ArraysHashCode.multiints         1  avgt   15      7.847 ±    0.044  ns/op
ArraysHashCode.multiints        10  avgt   15     29.765 ±    1.082  ns/op
ArraysHashCode.multiints       100  avgt   15    193.887 ±    0.360  ns/op
ArraysHashCode.multiints     10000  avgt   15  30997.145 ±  420.328  ns/op
ArraysHashCode.multishorts       1  avgt   15      7.856 ±    0.128  ns/op
ArraysHashCode.multishorts      10  avgt   15     29.231 ±    0.434  ns/op
ArraysHashCode.multishorts     100  avgt   15    187.044 ±    0.289  ns/op
ArraysHashCode.multishorts   10000  avgt   15  20146.327 ±   89.985  ns/op
ArraysHashCode.shorts            1  avgt   15     15.162 ±    4.191  ns/op
ArraysHashCode.shorts           10  avgt   15     54.279 ±    2.661  ns/op
ArraysHashCode.shorts          100  avgt   15    343.085 ±    4.204  ns/op
ArraysHashCode.shorts        10000  avgt   15  31536.455 ±   23.874  ns/op
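To make the two tables easier to compare, the relative change per row can be computed as (after - before) / before. A small sketch in Java using three rows copied from the tables above; the class and method names are hypothetical, and the error columns are ignored here, so rows with large error bars (such as the size-1 cases) should still be read with care.

```java
// Hypothetical helper for reading the Before/After tables; not part of the PR.
final class RelativeChangeSketch {

    // Percentage change of an ns/op score; positive means slower after the change.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Benchmark name, size, before score, after score (ns/op), copied from the tables.
        String[] names = {
            "ArraysHashCode.bytes 10000",
            "ArraysHashCode.multibytes 10",
            "ArraysHashCode.multibytes 10000"
        };
        double[][] scores = {
            {28380.470, 31506.478},
            {34.727, 29.617},
            {16610.239, 18349.268}
        };
        for (int i = 0; i < names.length; i++) {
            double change = percentChange(scores[i][0], scores[i][1]);
            System.out.printf("%-32s %+.1f%%%n", names[i], change);
        }
    }
}
```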
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1842496746