RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version

Yuri Gaevsky duke at openjdk.org
Sat Jan 13 09:27:26 UTC 2024


On Sat, 13 Jan 2024 09:21:37 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

> The patch adds the possibility to use RVV instructions for faster vectorizedHashCode calculations on RVV v1.0.0-capable hardware.
> 
> Testing: hotspot/jtreg/compiler/ under QEMU-8.1 with RVV v1.0.0.
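
For readers unfamiliar with the computation being accelerated, here is a minimal
Java sketch of the hash recurrence (h = 31*h + a[i], as in java.util.Arrays.hashCode)
and of a blocked reformulation that exposes independent per-lane multiplies. This is
an illustration only, not the code from the patch: the class name, the method names
and the LANES constant are hypothetical, and a real RVV implementation would work
with hardware vector registers rather than a Java loop.

```java
import java.util.Arrays;

class HashSketch {
    // Hypothetical block width, standing in for the number of vector lanes.
    static final int LANES = 4;

    // Scalar reference: the same recurrence as Arrays.hashCode(int[]) for non-null input.
    static int scalarHash(int[] a) {
        int h = 1;
        for (int x : a) {
            h = 31 * h + x;
        }
        return h;
    }

    // Blocked form: h' = h * 31^LANES + sum(a[i + l] * 31^(LANES - 1 - l)).
    // The per-lane products are independent of each other, which is what
    // makes the loop body a candidate for vector instructions.
    static int blockedHash(int[] a) {
        int[] pow = new int[LANES + 1];
        pow[0] = 1;
        for (int k = 1; k <= LANES; k++) {
            pow[k] = 31 * pow[k - 1];
        }
        int h = 1;
        int i = 0;
        for (; i + LANES <= a.length; i += LANES) {
            int sum = 0;
            for (int l = 0; l < LANES; l++) {
                sum += a[i + l] * pow[LANES - 1 - l];
            }
            h = h * pow[LANES] + sum;
        }
        // Scalar tail for the elements that do not fill a whole block.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5};
        System.out.println(scalarHash(data) == Arrays.hashCode(data));
        System.out.println(blockedHash(data) == scalarHash(data));
    }
}
```

Both forms agree because int arithmetic wraps modulo 2^32, so the multiplications
and additions can be regrouped freely.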

NB: I have no access to RVV v1.0.0 hardware, so to estimate the performance improvement
I adapted the patch to the RVV v0.7.1 ISA under OpenJDK-21 and ran the JMH test
org.openjdk.bench.java.lang.ArraysHashCode on a LicheePi-4A (TH1520), which does support
RVV v0.7.1.

The results are below. Hopefully they will be similar on RVV v1.0.0 hardware.

Legend: UseVHI ==> UseVectorizedHashCodeIntrinsic

----------------------------------------------------------------------------------------------------------------------------------------------
                                [-XX:-UseVHI -XX:-UseRVV] [-XX:-UseVHI -XX:+UseRVV] [-XX:+UseVHI -XX:-UseRVV] [-XX:+UseVHI -XX:+UseRVV]
----------------------------------------------------------------------------------------------------------------------------------------------
Benchmark    (size)  Mode  Cnt |       Score      Error  |       Score      Error  |       Score      Error  |       Score      Error  |Units|
----------------------------------------------------------------------------------------------------------------------------------------------
bytes             1  avgt   10 |      20.292 ±    0.524  |      20.693 ±    1.706  |      20.458 ±    0.718  |      20.276 ±    0.525  |ns/op|
bytes            10  avgt   10 |      35.107 ±    0.180  |      35.054 ±    0.029  |      30.898 ±    0.109  |      31.033 ±    0.132  |ns/op|
bytes           100  avgt   10 |     188.190 ±    4.192  |     188.805 ±    4.345  |     152.324 ±    2.205  |      97.673 ±    3.145  |ns/op|
bytes          1000  avgt   10 |    1664.569 ±    1.662  |    1663.711 ±    2.229  |    1184.224 ±    0.731  |     656.340 ±    1.908  |ns/op|
bytes         10000  avgt   10 |   16419.434 ±   68.995  |   16407.357 ±   43.737  |   11599.876 ±   23.574  |    6171.500 ±   16.633  |ns/op|
bytes        100000  avgt   10 |  167738.927 ± 3313.255  |  166577.887 ± 1552.963  |  119475.413 ± 1358.363  |   62061.873 ±  130.268  |ns/op|
chars             1  avgt   10 |      20.420 ±    1.031  |      20.294 ±    0.527  |      20.402 ±    0.992  |      21.267 ±    0.027  |ns/op|
chars            10  avgt   10 |      35.800 ±    0.032  |      35.778 ±    0.049  |      31.170 ±    0.199  |      31.744 ±    0.169  |ns/op|
chars           100  avgt   10 |     185.715 ±    0.674  |     184.531 ±    1.152  |     143.918 ±    1.147  |      90.613 ±    0.092  |ns/op|
chars          1000  avgt   10 |    1683.711 ±   46.493  |    1668.926 ±    6.850  |    1120.730 ±    3.017  |     652.677 ±    2.026  |ns/op|
chars         10000  avgt   10 |   16402.007 ±   16.654  |   16468.497 ±  136.411  |   10939.505 ±   72.647  |    6174.555 ±   28.879  |ns/op|
chars        100000  avgt   10 |  164826.072 ±  381.240  |  165807.663 ± 4328.908  |  114787.826 ± 4217.557  |   61724.436 ±   45.819  |ns/op|
ints              1  avgt   10 |      20.730 ±    2.375  |      20.506 ±    1.458  |      20.277 ±    0.517  |      20.169 ±    0.015  |ns/op|
ints             10  avgt   10 |      36.878 ±    0.059  |      36.162 ±    1.033  |      31.338 ±    0.243  |      32.511 ±    0.165  |ns/op|
ints            100  avgt   10 |     184.288 ±    0.790  |     184.939 ±    0.624  |     143.794 ±    0.708  |      80.406 ±    6.987  |ns/op|
ints           1000  avgt   10 |    1669.219 ±    3.559  |    1670.992 ±   13.830  |    1118.856 ±    1.086  |     486.305 ±    4.471  |ns/op|
ints          10000  avgt   10 |   16432.730 ±   62.326  |   16710.540 ±   68.028  |   11128.766 ±   57.448  |    5232.062 ±  291.835  |ns/op|
ints         100000  avgt   10 |  165387.705 ±  431.814  |  165597.050 ±  278.567  |  115605.648 ± 8245.853  |   45468.032 ± 1793.979  |ns/op|
multibytes        1  avgt   10 |       3.459 ±    0.020  |       3.473 ±    0.055  |       3.477 ±    0.145  |       3.480 ±    0.043  |ns/op|
multibytes       10  avgt   10 |      16.983 ±    0.264  |      17.526 ±    0.375  |      12.325 ±    0.117  |      13.415 ±    0.136  |ns/op|
multibytes      100  avgt   10 |     105.251 ±    0.250  |     105.032 ±    0.180  |      78.795 ±    0.260  |      53.210 ±    1.024  |ns/op|
multibytes     1000  avgt   10 |     948.171 ±    5.950  |     957.757 ±   12.117  |     700.407 ±    1.928  |     440.352 ±    2.248  |ns/op|
multibytes    10000  avgt   10 |    8829.949 ±   64.161  |    9007.879 ±  510.217  |    6406.776 ±   17.982  |    3430.480 ±   35.108  |ns/op|
multibytes   100000  avgt   10 |   89545.793 ± 6151.064  |   88335.319 ±   51.310  |   64236.061 ±   46.572  |   33380.485 ±   56.708  |ns/op|
multichars        1  avgt   10 |       3.475 ±    0.054  |       3.453 ±    0.066  |       3.492 ±    0.122  |       3.495 ±    0.047  |ns/op|
multichars       10  avgt   10 |      17.719 ±    0.645  |      17.201 ±    0.152  |      12.318 ±    0.141  |      13.093 ±    0.147  |ns/op|
multichars      100  avgt   10 |     106.735 ±    0.283  |     106.625 ±    0.177  |      77.695 ±    0.212  |      51.495 ±    0.166  |ns/op|
multichars     1000  avgt   10 |     927.573 ±    6.839  |     932.211 ±    3.445  |     696.374 ±    1.757  |     471.226 ±    1.499  |ns/op|
multichars    10000  avgt   10 |    9846.872 ±   20.840  |    9909.611 ±  188.165  |    6392.901 ±    4.849  |    3978.730 ±  180.130  |ns/op|
multichars   100000  avgt   10 |   88110.303 ±   41.764  |   88892.543 ± 2534.299  |   60615.033 ±   94.002  |   33956.859 ±  199.178  |ns/op|
multiints         1  avgt   10 |       3.450 ±    0.328  |       3.382 ±    0.150  |       3.345 ±    0.024  |       3.380 ±    0.040  |ns/op|
multiints        10  avgt   10 |      18.265 ±    0.424  |      18.644 ±    1.433  |      12.036 ±    0.041  |      13.773 ±    0.114  |ns/op|
multiints       100  avgt   10 |     107.500 ±    0.636  |     107.318 ±    0.466  |      77.971 ±    0.296  |      47.700 ±    0.408  |ns/op|
multiints      1000  avgt   10 |     924.920 ±    9.106  |     937.609 ±   44.303  |     695.427 ±    2.075  |     449.475 ±    2.061  |ns/op|
multiints     10000  avgt   10 |    9322.880 ±   49.589  |    9277.425 ±   91.828  |    7009.704 ±  297.983  |    6196.819 ±  367.531  |ns/op|
multiints    100000  avgt   10 |   88154.281 ±  279.258  |   88272.818 ±  103.608  |   64118.963 ± 6445.702  |   55317.212 ±  916.179  |ns/op|
multishorts       1  avgt   10 |       3.488 ±    0.034  |       3.531 ±    0.227  |       3.521 ±    0.051  |       3.512 ±    0.054  |ns/op|
multishorts      10  avgt   10 |      17.907 ±    0.380  |      17.408 ±    0.659  |      12.252 ±    0.110  |      13.445 ±    0.102  |ns/op|
multishorts     100  avgt   10 |     106.588 ±    0.188  |     107.500 ±    0.531  |      79.630 ±    0.428  |      53.886 ±    3.243  |ns/op|
multishorts    1000  avgt   10 |     931.732 ±    6.891  |     923.814 ±   11.836  |     701.534 ±    1.742  |     470.312 ±    2.117  |ns/op|
multishorts   10000  avgt   10 |    9663.105 ± 1017.387  |    9859.034 ±   66.672  |    6422.864 ±    7.486  |    3785.710 ±   37.656  |ns/op|
multishorts  100000  avgt   10 |   88799.262 ± 2363.672  |   88015.545 ±   52.795  |   60541.966 ±  155.521  |   33888.677 ±  127.071  |ns/op|
shorts            1  avgt   10 |      20.199 ±    0.083  |      20.190 ±    0.027  |      21.389 ±    0.600  |      21.250 ±    0.024  |ns/op|
shorts           10  avgt   10 |      35.842 ±    0.189  |      35.806 ±    0.167  |      30.960 ±    0.186  |      31.451 ±    0.182  |ns/op|
shorts          100  avgt   10 |     184.323 ±    0.488  |     185.318 ±    0.776  |     143.652 ±    1.057  |      90.657 ±    0.052  |ns/op|
shorts         1000  avgt   10 |    1664.583 ±    2.016  |    1666.803 ±    3.100  |    1118.623 ±    0.661  |     652.112 ±    0.346  |ns/op|
shorts        10000  avgt   10 |   16395.042 ±   39.388  |   16426.231 ±   75.461  |   10933.090 ±   16.165  |    6200.135 ±  116.218  |ns/op|
shorts       100000  avgt   10 |  165037.332 ±  226.003  |  167782.156 ± 8844.288  |  114329.012 ± 4326.851  |   61693.056 ±   93.278  |ns/op|
----------------------------------------------------------------------------------------------------------------------------------------------
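
To read the table as speedups rather than raw times, one can divide the baseline
score (first column, both flags off) by the score of another configuration. The
sketch below does this for three rows copied from the table above. The class and
method names are hypothetical, and the error margins are ignored, so the ratios
are only rough indications.

```java
class Speedup {
    // Ratio of baseline time to candidate time; values above 1.0 mean the candidate is faster.
    static double speedup(double baselineNsPerOp, double candidateNsPerOp) {
        return baselineNsPerOp / candidateNsPerOp;
    }

    public static void main(String[] args) {
        // Columns: -VHI -RVV (baseline), +VHI -RVV, +VHI +RVV; scores in ns/op from the table.
        String[] names = {"bytes 10000", "ints 100000", "multiints 100000"};
        double[][] scores = {
            {16419.434, 11599.876, 6171.500},
            {165387.705, 115605.648, 45468.032},
            {88154.281, 64118.963, 55317.212},
        };
        for (int i = 0; i < names.length; i++) {
            double scalarIntrinsic = speedup(scores[i][0], scores[i][1]);
            double rvvIntrinsic = speedup(scores[i][0], scores[i][2]);
            System.out.printf("%-18s +VHI -RVV: %.2fx   +VHI +RVV: %.2fx%n",
                    names[i], scalarIntrinsic, rvvIntrinsic);
        }
    }
}
```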

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17413#issuecomment-1890392431


More information about the hotspot-compiler-dev mailing list