RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v3]

Yuri Gaevsky duke at openjdk.org
Fri Apr 25 13:10:09 UTC 2025


On Thu, 24 Apr 2025 14:27:50 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

>> The patch makes it possible to use RVV instructions for faster vectorizedHashCode calculations on hardware that supports RVV v1.0.0.
>> 
>> Testing: hotspot/jtreg/compiler/ under QEMU-8.1 with RVV v1.0.0.
>
> Yuri Gaevsky has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits:
> 
>  - Merge master
>  - num_8b_elems_in_vec --> nof_vec_elems
>  - Removed checks for (MaxVectorSize >= 16) per @RealFYang suggestion.
>  - 8322174: RISC-V: C2 VectorizedHashCode RVV Version
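For context, a minimal sketch of the computation being vectorized, assuming the intrinsic covers the usual 31-based polynomial hash that Arrays.hashCode specifies. The class and method names are hypothetical, and this is plain Java rather than the RVV code in the patch; the `lanes` parameter only mimics handling one vector's worth of elements per step.

```java
import java.util.Arrays;

public class PolyHashSketch {
    // Scalar reference: h = 31 * h + a[i], one element at a time.
    static int scalarHash(int[] a, int initial) {
        int h = initial;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Blocked form: each step consumes `lanes` elements at once using
    // precomputed powers of 31. Int overflow wraps, so both forms agree.
    static int blockedHash(int[] a, int initial, int lanes) {
        int[] pow = new int[lanes + 1];
        pow[0] = 1;
        for (int k = 1; k <= lanes; k++) {
            pow[k] = 31 * pow[k - 1];
        }
        int h = initial;
        int i = 0;
        for (; i + lanes <= a.length; i += lanes) {
            int sum = 0;
            for (int j = 0; j < lanes; j++) {
                // Element j of the block is weighted by 31^(lanes-1-j).
                sum += a[i + j] * pow[lanes - 1 - j];
            }
            h = h * pow[lanes] + sum;
        }
        // Scalar tail for the elements that do not fill a whole block.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = new int[103];
        for (int i = 0; i < data.length; i++) {
            data[i] = i * 7919 - 13;
        }
        int expected = Arrays.hashCode(data); // initial value 1
        System.out.println(scalarHash(data, 1) == expected);
        System.out.println(blockedHash(data, 1, 8) == expected);
    }
}
```

The multiply-accumulate inside each block has no loop-carried dependency, which is the property a vector unit can exploit; the scalar tail handles array lengths that are not a multiple of the block size.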

JFTR: the numbers after the above merge, measured on real RVV 1.0 hardware (BPI-F3 box, 16 GB), are below:

Legend: UseVHI ==> UseVectorizedHashCodeIntrinsic
------------------------------------------------------------------------------------
                                        (baseline)              (patch)
------------------------------------------------------------------------------------
                               |-XX:-UseVHI -XX:+UseRVV|-XX:+UseVHI -XX:+UseRVV|
------------------------------------------------------------------------------------
Benchmark    (size)  Mode  Cnt |     Score      Error  |    Score      Error |Units|
------------------------------------------------------------------------------------
bytes             1  avgt   10      11.281 ±    0.005      11.279 ±    0.001  ns/op
bytes            10  avgt   10      35.096 ±    0.027      35.730 ±    0.032  ns/op
bytes           100  avgt   10     246.627 ±    0.144     132.879 ±    0.150  ns/op
bytes          1000  avgt   10    2368.472 ±    2.174     914.207 ±    0.931  ns/op
bytes         10000  avgt   10   23548.070 ±    3.285    8707.273 ±    5.666  ns/op
bytes        100000  avgt   10  236725.770 ±  591.357   86590.456 ±  173.544  ns/op
chars             1  avgt   10      11.282 ±    0.006      11.281 ±    0.005  ns/op
chars            10  avgt   10      35.726 ±    0.013      36.978 ±    0.015  ns/op
chars           100  avgt   10     246.913 ±    0.152     134.704 ±    0.036  ns/op
chars          1000  avgt   10    2370.329 ±   10.804     935.244 ±    0.385  ns/op
chars         10000  avgt   10   23596.177 ±   19.305    9495.412 ±    6.368  ns/op
chars        100000  avgt   10  384796.824 ± 3353.051  155220.554 ± 1753.764  ns/op
ints              1  avgt   10      11.280 ±    0.002      11.280 ±    0.002  ns/op
ints             10  avgt   10      35.774 ±    0.220      36.357 ±    0.014  ns/op
ints            100  avgt   10     246.935 ±    0.112     126.494 ±    0.159  ns/op
ints           1000  avgt   10    2372.602 ±    0.753     818.846 ±    0.253  ns/op
ints          10000  avgt   10   25309.538 ±  106.280    8942.238 ±   65.537  ns/op
ints         100000  avgt   10  409074.598 ± 4049.489   87796.390 ±  545.247  ns/op
multibytes        1  avgt   10       5.137 ±    0.006       5.138 ±    0.003  ns/op
multibytes       10  avgt   10      18.361 ±    0.022      19.618 ±    0.006  ns/op
multibytes      100  avgt   10     132.814 ±    0.543      96.236 ±    0.236  ns/op
multibytes     1000  avgt   10    2160.723 ±   22.749     596.236 ±    1.166  ns/op
multibytes    10000  avgt   10   22195.062 ±  300.592    5749.928 ±    5.748  ns/op
multibytes   100000  avgt   10  205825.738 ± 1340.919   47757.644 ±   80.729  ns/op
multichars        1  avgt   10       4.995 ±    0.003       5.003 ±    0.002  ns/op
multichars       10  avgt   10      18.512 ±    0.015      19.430 ±    0.011  ns/op
multichars      100  avgt   10     230.563 ±    1.320     101.515 ±    0.353  ns/op
multichars     1000  avgt   10    1396.042 ±   16.038     634.955 ±    0.824  ns/op
multichars    10000  avgt   10   13445.146 ±    8.403    5838.638 ±    9.851  ns/op
multichars   100000  avgt   10  127475.457 ±   81.919   50308.336 ±   26.640  ns/op
multiints         1  avgt   10       6.980 ±    2.561       5.017 ±    0.007  ns/op
multiints        10  avgt   10      29.743 ±    6.021      19.479 ±    0.008  ns/op
multiints       100  avgt   10     149.720 ±    0.516     110.728 ±    0.280  ns/op
multiints      1000  avgt   10    1442.903 ±   30.199    1023.673 ±   16.614  ns/op
multiints     10000  avgt   10   22702.792 ±  286.336    5941.205 ±   30.079  ns/op
multiints    100000  avgt   10  127134.718 ±  117.502   48745.440 ±   69.036  ns/op
multishorts       1  avgt   10       5.145 ±    0.009       5.140 ±    0.004  ns/op
multishorts      10  avgt   10      18.506 ±    0.006      19.419 ±    0.006  ns/op
multishorts     100  avgt   10     232.937 ±    2.433     100.298 ±    0.318  ns/op
multishorts    1000  avgt   10    1388.111 ±   16.740     657.008 ±    4.362  ns/op
multishorts   10000  avgt   10   13458.090 ±   10.975    5860.546 ±    8.239  ns/op
multishorts  100000  avgt   10  127463.240 ±  102.736   50534.548 ±   34.661  ns/op
shorts            1  avgt   10      11.280 ±    0.007      11.280 ±    0.004  ns/op
shorts           10  avgt   10      35.721 ±    0.011      62.661 ±    0.533  ns/op
shorts          100  avgt   10     246.942 ±    0.165     135.960 ±    0.029  ns/op
shorts         1000  avgt   10    2368.908 ±    0.955     935.607 ±    0.678  ns/op
shorts        10000  avgt   10   23608.074 ±   22.901    8913.395 ±    5.318  ns/op
shorts       100000  avgt   10  237055.625 ±  532.713   94719.177 ±  352.058  ns/op
------------------------------------------------------------------------------------
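To read the table as speedups, a small sketch that divides the baseline score by the patched score for a few rows. The class and method names are hypothetical; the numbers are copied from the table above, and error margins are ignored.

```java
public class SpeedupSketch {
    // Scores are ns/op (lower is better), so baseline / patch > 1
    // means the patched build is faster.
    static double speedup(double baselineNs, double patchNs) {
        return baselineNs / patchNs;
    }

    public static void main(String[] args) {
        String[] labels   = {"bytes/100000", "ints/100000", "multibytes/100000", "shorts/10"};
        double[] baseline = {236725.770, 409074.598, 205825.738, 35.721};
        double[] patch    = { 86590.456,  87796.390,  47757.644, 62.661};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-18s %.2fx%n", labels[i], speedup(baseline[i], patch[i]));
        }
    }
}
```

On these rows the large arrays come out well above 1 (roughly 2.7x for bytes and 4.7x for ints at size 100000), while shorts at size 10 comes out below 1, since the patched score there (62.661 ns/op) is higher than the baseline (35.721 ns/op).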

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-2830382940


More information about the hotspot-compiler-dev mailing list