RFR: 8318217: RISC-V: C2 VectorizedHashCode [v9]

Fei Yang fyang at openjdk.org
Wed Dec 6 09:27:46 UTC 2023


On Tue, 5 Dec 2023 12:57:05 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

>> Hello All,
>> 
>> Please review these changes to support the _vectorizedHashCode intrinsic on
>> the RISC-V platform. The patch adds "scalar" code for the intrinsic that uses
>> no RVV instructions but manually unrolls the relevant loop. Code that uses
>> RVV instructions could be added as a follow-up to this patch or
>> independently.
>> 
>> Thanks,
>> -Yuri Gaevsky
>> 
>> P.S. My OCA has been accepted recently (ygaevsky).
>> 
>> ### Correctness checks
>> 
>> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>> 
>> ### Performance results (the numbers for non-ints are similar)
>> 
>> #### StarFive JH7110 board:
>> 
>> 
>> ArraysHashCode:              without intrinsic      with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Changed lb-->lbu for T_BOOLEAN and iRegINoSp-->iRegLNoSp for tmp2/tmp3.
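
For context, the manual unrolling described in the quoted text can be sketched in plain Java. This is only an illustration of the technique, not the C2 assembly from the PR: the unroll factor of 4 and the names `UnrolledHashSketch`, `hashSimple` and `hashUnrolled` are assumptions made for the example.

```java
public final class UnrolledHashSketch {
    // Straightforward polynomial hash: h = 31 * h + a[i], starting from 1.
    static int hashSimple(int[] a) {
        int h = 1;
        for (int v : a) {
            h = 31 * h + v;
        }
        return h;
    }

    // Same hash, four elements per iteration (unroll factor is an assumption).
    // Expanding four steps of h = 31 * h + v gives
    // h' = 31^4 * h + 31^3 * a0 + 31^2 * a1 + 31 * a2 + a3.
    static int hashUnrolled(int[] a) {
        final int p2 = 31 * 31;   // 961
        final int p3 = p2 * 31;   // 29791
        final int p4 = p3 * 31;   // 923521
        int h = 1;
        int i = 0;
        for (; i + 3 < a.length; i += 4) {
            h = p4 * h + p3 * a[i] + p2 * a[i + 1] + 31 * a[i + 2] + a[i + 3];
        }
        // Tail loop for the remaining 0..3 elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7};
        System.out.println(hashSimple(data) == hashUnrolled(data));
    }
}
```

Because int arithmetic wraps modulo 2^32, the unrolled form yields the same value as the simple loop. The independent multiplications within one iteration are what give an in-order core room to overlap work, and the extra setup and tail handling is one place where small sizes can lose time.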

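I tried this on a SiFive Unmatched board. Unfortunately, I see some performance regressions with this change, most visibly at size 10000 (for example, ArraysHashCode.bytes goes from 28380.470 to 31506.478 ns/op).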
Before:

Benchmark                   (size)  Mode  Cnt      Score      Error  Units
ArraysHashCode.bytes             1  avgt   15     19.737 ±    5.405  ns/op
ArraysHashCode.bytes            10  avgt   15     56.102 ±    3.191  ns/op
ArraysHashCode.bytes           100  avgt   15    317.126 ±    3.452  ns/op
ArraysHashCode.bytes         10000  avgt   15  28380.470 ±   20.709  ns/op
ArraysHashCode.chars             1  avgt   15     15.532 ±    2.623  ns/op
ArraysHashCode.chars            10  avgt   15     59.603 ±    2.440  ns/op
ArraysHashCode.chars           100  avgt   15    333.995 ±    3.834  ns/op
ArraysHashCode.chars         10000  avgt   15  29464.768 ±   16.751  ns/op
ArraysHashCode.ints              1  avgt   15     16.031 ±    2.820  ns/op
ArraysHashCode.ints             10  avgt   15     59.506 ±    3.980  ns/op
ArraysHashCode.ints            100  avgt   15    335.514 ±    4.695  ns/op
ArraysHashCode.ints          10000  avgt   15  33966.175 ±  929.859  ns/op
ArraysHashCode.multibytes        1  avgt   15      7.840 ±    0.110  ns/op
ArraysHashCode.multibytes       10  avgt   15     34.727 ±    0.547  ns/op
ArraysHashCode.multibytes      100  avgt   15    193.085 ±    0.814  ns/op
ArraysHashCode.multibytes    10000  avgt   15  16610.239 ±   27.290  ns/op
ArraysHashCode.multichars        1  avgt   15      7.853 ±    0.092  ns/op
ArraysHashCode.multichars       10  avgt   15     35.059 ±    0.241  ns/op
ArraysHashCode.multichars      100  avgt   15    203.483 ±    0.413  ns/op
ArraysHashCode.multichars    10000  avgt   15  18819.804 ±   75.487  ns/op
ArraysHashCode.multiints         1  avgt   15      7.878 ±    0.104  ns/op
ArraysHashCode.multiints        10  avgt   15     35.232 ±    0.196  ns/op
ArraysHashCode.multiints       100  avgt   15    211.087 ±    1.914  ns/op
ArraysHashCode.multiints     10000  avgt   15  30172.693 ± 1447.757  ns/op
ArraysHashCode.multishorts       1  avgt   15      7.788 ±    0.046  ns/op
ArraysHashCode.multishorts      10  avgt   15     35.504 ±    0.465  ns/op
ArraysHashCode.multishorts     100  avgt   15    203.530 ±    0.342  ns/op
ArraysHashCode.multishorts   10000  avgt   15  18801.799 ±   77.159  ns/op
ArraysHashCode.shorts            1  avgt   15     19.685 ±    5.413  ns/op
ArraysHashCode.shorts           10  avgt   15     59.583 ±    4.684  ns/op
ArraysHashCode.shorts          100  avgt   15    333.170 ±    5.367  ns/op
ArraysHashCode.shorts        10000  avgt   15  29455.665 ±   13.302  ns/op


After:

Benchmark                   (size)  Mode  Cnt      Score     Error  Units
ArraysHashCode.bytes             1  avgt   15     18.575 ±   3.780  ns/op
ArraysHashCode.bytes            10  avgt   15     55.394 ±   4.610  ns/op
ArraysHashCode.bytes           100  avgt   15    340.807 ±   3.387  ns/op
ArraysHashCode.bytes         10000  avgt   15  31506.478 ±  27.694  ns/op
ArraysHashCode.chars             1  avgt   15     15.966 ±   2.291  ns/op
ArraysHashCode.chars            10  avgt   15     56.524 ±   4.301  ns/op
ArraysHashCode.chars           100  avgt   15    343.389 ±   3.272  ns/op
ArraysHashCode.chars         10000  avgt   15  31520.717 ±  13.290  ns/op
ArraysHashCode.ints              1  avgt   15     16.078 ±   3.977  ns/op
ArraysHashCode.ints             10  avgt   15     55.467 ±   2.845  ns/op
ArraysHashCode.ints            100  avgt   15    344.500 ±   3.531  ns/op
ArraysHashCode.ints          10000  avgt   15  36234.542 ±  39.191  ns/op
ArraysHashCode.multibytes        1  avgt   15      7.816 ±   0.072  ns/op
ArraysHashCode.multibytes       10  avgt   15     29.617 ±   0.257  ns/op
ArraysHashCode.multibytes      100  avgt   15    183.986 ±   0.236  ns/op
ArraysHashCode.multibytes    10000  avgt   15  18349.268 ±  28.711  ns/op
ArraysHashCode.multichars        1  avgt   15      7.821 ±   0.050  ns/op
ArraysHashCode.multichars       10  avgt   15     29.293 ±   0.273  ns/op
ArraysHashCode.multichars      100  avgt   15    186.538 ±   0.404  ns/op
ArraysHashCode.multichars    10000  avgt   15  20149.487 ±  87.300  ns/op
ArraysHashCode.multiints         1  avgt   15      7.847 ±   0.044  ns/op
ArraysHashCode.multiints        10  avgt   15     29.765 ±   1.082  ns/op
ArraysHashCode.multiints       100  avgt   15    193.887 ±   0.360  ns/op
ArraysHashCode.multiints     10000  avgt   15  30997.145 ± 420.328  ns/op
ArraysHashCode.multishorts       1  avgt   15      7.856 ±   0.128  ns/op
ArraysHashCode.multishorts      10  avgt   15     29.231 ±   0.434  ns/op
ArraysHashCode.multishorts     100  avgt   15    187.044 ±   0.289  ns/op
ArraysHashCode.multishorts   10000  avgt   15  20146.327 ±  89.985  ns/op
ArraysHashCode.shorts            1  avgt   15     15.162 ±   4.191  ns/op
ArraysHashCode.shorts           10  avgt   15     54.279 ±   2.661  ns/op
ArraysHashCode.shorts          100  avgt   15    343.085 ±   4.204  ns/op
ArraysHashCode.shorts        10000  avgt   15  31536.455 ±  23.874  ns/op
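
To make the before/after comparison easier to read, the relative change can be computed from the scores above. The following is a small sketch only; the class name `RelativeChange` and the helper `percentChange` are made up for this example, and the three rows are copied from the two tables.

```java
public final class RelativeChange {
    // Percent change of the "after" score relative to "before".
    // Scores are ns/op, so a positive result means slower (a regression).
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the Before/After tables (ns/op).
        String[] names = {"bytes 10000", "multibytes 10", "multibytes 10000"};
        double[] before = {28380.470, 34.727, 16610.239};
        double[] after = {31506.478, 29.617, 18349.268};
        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-18s %+.1f%%%n", names[i],
                    percentChange(before[i], after[i]));
        }
    }
}
```

For these three rows the result is roughly +11%, -15% and +10%, so the change helps the small multi* cases but costs about 10% at size 10000.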

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1842496746

