RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic [v5]

Yuri Gaevsky duke at openjdk.org
Tue Aug 12 12:35:13 UTC 2025


On Tue, 12 Aug 2025 05:58:16 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

>> Hello All,
>> 
>> Please review these changes to enable the _vectorizedMismatch intrinsic on RISC-V platforms that support RVV instructions.
>> 
>> Thank you,
>> -Yuri Gaevsky
>> 
>> **Correctness checks:**
>>   hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4.
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
> 
>   - moved vector register declarations into vectorized_mismatch() body.
>   - added initial support for ArrayOperationPartialInlineSize.

Changeset `6341020` (BPI-F3 16g):

=============================================================================================================================
Benchmark          Size  Mode  Cnt|  -UseRVV       |  +UseRVV (m1)  |  +UseRVV (m2)  |  +UseRVV (m4)  |  +UseRVV (m8) |
=============================================================================================================================
Int.mismatchEnd       5  avgt   30  13.176 ± 0.016   13.154 ± 0.003   13.185 ± 0.018   13.157 ± 0.005   13.155 ± 0.003  ns/op
Int.mismatchEnd       8  avgt   30  29.410 ± 0.272   32.821 ± 0.362   35.331 ± 0.383   41.645 ± 0.153   55.595 ± 1.066  ns/op
Int.mismatchEnd      16  avgt   30  37.623 ± 0.421   46.344 ± 0.006   35.368 ± 0.358   41.687 ± 0.148   54.483 ± 0.007  ns/op
Int.mismatchEnd      32  avgt   30  57.399 ± 0.444   72.661 ± 0.024   50.619 ± 0.808   41.764 ± 0.192   54.265 ± 0.161  ns/op
Int.mismatchEnd      40  avgt   30  66.177 ± 0.279   85.869 ± 0.048   64.599 ± 1.164   62.919 ± 0.702   53.939 ± 0.356  ns/op
Int.mismatchEnd      50  avgt   30  75.398 ± 0.198   95.191 ± 0.014   65.218 ± 0.077   62.627 ± 0.013   53.862 ± 0.008  ns/op
Int.mismatchEnd      64  avgt   30  95.176 ± 0.558  121.951 ± 3.239   81.017 ± 1.581   63.682 ± 0.310   53.703 ± 0.379  ns/op
-----------------------------------------------------------------------------------------------------------------------------
Int.mismatchMid       5  avgt   30  28.893 ± 0.079   33.316 ± 0.075   36.333 ± 0.009   42.867 ± 0.963   59.605 ± 0.106  ns/op
Int.mismatchMid       8  avgt   30  25.275 ± 0.154   33.226 ± 0.026   36.343 ± 0.011   43.961 ± 0.040   58.947 ± 0.645  ns/op
Int.mismatchMid      16  avgt   30  34.135 ± 0.560   46.345 ± 0.007   34.921 ± 0.447   41.443 ± 0.307   55.132 ± 1.252  ns/op
Int.mismatchMid      32  avgt   30  42.832 ± 0.156   59.564 ± 0.064   51.378 ± 0.008   42.631 ± 0.646   54.488 ± 0.012  ns/op
Int.mismatchMid      40  avgt   30  47.237 ± 0.233   58.246 ± 1.199   50.526 ± 0.809   41.827 ± 0.127   54.509 ± 0.026  ns/op
Int.mismatchMid      50  avgt   30  52.497 ± 0.208   72.730 ± 0.065   50.539 ± 0.803   41.027 ± 0.228   54.484 ± 0.010  ns/op
Int.mismatchMid      64  avgt   30  62.040 ± 0.316   85.900 ± 0.070   65.797 ± 1.179   63.050 ± 0.796   54.285 ± 0.187  ns/op
-----------------------------------------------------------------------------------------------------------------------------
Int.mismatchStart     5  avgt   30  29.101 ± 0.176   32.602 ± 0.019   35.125 ± 0.031   42.606 ± 0.612   58.884 ± 0.031  ns/op
Int.mismatchStart     8  avgt   30  28.884 ± 0.071   33.223 ± 0.018   35.740 ± 0.033   43.837 ± 0.006   58.016 ± 1.691  ns/op
Int.mismatchStart    16  avgt   30  28.845 ± 0.044   33.205 ± 0.010   36.400 ± 0.040   42.792 ± 1.004   59.635 ± 0.100  ns/op
Int.mismatchStart    32  avgt   30  28.852 ± 0.082   32.790 ± 0.397   35.706 ± 0.010   41.960 ± 0.006   54.334 ± 0.144  ns/op
Int.mismatchStart    40  avgt   30  29.137 ± 0.295   32.787 ± 0.392   35.294 ± 0.404   41.956 ± 0.011   55.855 ± 1.753  ns/op
Int.mismatchStart    50  avgt   30  29.543 ± 0.359   32.611 ± 0.034   35.144 ± 0.076   41.391 ± 0.059   55.545 ± 1.604  ns/op
Int.mismatchStart    64  avgt   30  28.988 ± 0.112   32.786 ± 0.403   35.748 ± 0.053   43.089 ± 0.773   54.114 ± 0.192  ns/op
=============================================================================================================================
Byte.mismatchEnd      5  avgt   30  17.567 ± 0.043   18.193 ± 0.027   17.959 ± 0.199   18.221 ± 0.035   17.870 ± 0.184  ns/op
Byte.mismatchEnd      8  avgt   30  22.356 ± 0.390   35.551 ± 1.248   38.292 ± 1.557   44.816 ± 1.372   63.172 ± 0.025  ns/op
Byte.mismatchEnd     16  avgt   30  29.471 ± 0.396   35.766 ± 1.125   38.635 ± 1.098   47.061 ± 1.024   63.978 ± 0.115  ns/op
Byte.mismatchEnd     32  avgt   30  35.606 ± 0.352   35.006 ± 1.269   38.867 ± 1.013   44.779 ± 1.354   58.762 ± 0.061  ns/op
Byte.mismatchEnd     40  avgt   30  36.299 ± 0.233   49.543 ± 0.068   39.384 ± 0.069   45.629 ± 0.052   58.155 ± 0.058  ns/op
Byte.mismatchEnd     50  avgt   30  40.831 ± 0.400   48.221 ± 1.205   38.515 ± 1.186   44.252 ± 1.380   59.907 ± 1.547  ns/op
Byte.mismatchEnd     64  avgt   30  50.740 ± 0.269   49.471 ± 1.201   38.981 ± 1.240   43.970 ± 1.056   57.533 ± 1.517  ns/op
-----------------------------------------------------------------------------------------------------------------------------
Byte.mismatchMid      5  avgt   30  30.346 ± 1.820   28.527 ± 1.364   28.992 ± 1.322   28.991 ± 1.782   29.945 ± 1.755  ns/op
Byte.mismatchMid      8  avgt   30  21.933 ± 0.023   36.217 ± 1.245   37.457 ± 1.231   48.120 ± 0.046   63.750 ± 0.021  ns/op
Byte.mismatchMid     16  avgt   30  29.273 ± 0.146   35.789 ± 1.262   38.637 ± 1.210   44.467 ± 1.936   60.595 ± 3.065  ns/op
Byte.mismatchMid     32  avgt   30  27.632 ± 0.558   36.155 ± 1.253   39.978 ± 0.541   46.168 ± 1.093   60.817 ± 1.509  ns/op
Byte.mismatchMid     40  avgt   30  28.596 ± 0.400   37.000 ± 0.404   38.030 ± 1.664   47.003 ± 1.031   62.470 ± 1.401  ns/op
Byte.mismatchMid     50  avgt   30  35.171 ± 0.468   37.034 ± 0.428   39.550 ± 0.404   45.030 ± 0.958   57.389 ± 0.989  ns/op
Byte.mismatchMid     64  avgt   30  35.879 ± 0.672   49.512 ± 1.219   39.994 ± 0.047   44.351 ± 1.392   57.130 ± 2.729  ns/op
-----------------------------------------------------------------------------------------------------------------------------
Byte.mismatchStart    5  avgt   30  29.944 ± 1.876   30.086 ± 1.849   31.777 ± 0.808   30.560 ± 2.040   29.034 ± 1.821  ns/op
Byte.mismatchStart    8  avgt   30  22.159 ± 0.219   34.091 ± 1.294   38.037 ± 1.270   44.722 ± 1.341   58.877 ± 2.451  ns/op
Byte.mismatchStart   16  avgt   30  21.942 ± 0.021   34.797 ± 1.349   38.725 ± 1.277   46.050 ± 0.997   58.949 ± 0.664  ns/op
Byte.mismatchStart   32  avgt   30  22.166 ± 0.188   33.611 ± 1.230   40.017 ± 0.084   47.512 ± 0.643   63.875 ± 0.039  ns/op
Byte.mismatchStart   40  avgt   30  21.936 ± 0.013   34.460 ± 1.151   38.701 ± 1.280   46.868 ± 0.605   63.365 ± 0.120  ns/op
Byte.mismatchStart   50  avgt   30  22.136 ± 0.198   34.251 ± 1.299   37.158 ± 1.409   44.893 ± 1.654   61.724 ± 1.344  ns/op
Byte.mismatchStart   64  avgt   30  21.953 ± 0.037   34.875 ± 1.244   39.218 ± 1.389   48.289 ± 0.062   61.671 ± 2.012  ns/op
=============================================================================================================================
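
For readers unfamiliar with the benchmark names, the sketch below illustrates what the Start/Mid/End variants presumably exercise: `Arrays.mismatch` (the public API backed by the _vectorizedMismatch intrinsic) called on two arrays whose first difference sits at the start, middle, or end. This reading is an assumption based on the names and on the flat mismatchStart timings; it is not the actual JMH benchmark code, and the class `MismatchPositions` and helper `differAt` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the JDK or of the JMH benchmark.
class MismatchPositions {
    // Returns a copy of `base` whose element at `index` differs from the original.
    static int[] differAt(int[] base, int index) {
        int[] copy = base.clone();
        copy[index] = ~copy[index]; // flipping all bits guarantees a different value
        return copy;
    }

    public static void main(String[] args) {
        int size = 64; // one of the sizes used in the table
        int[] a = new int[size];
        Arrays.fill(a, 42);

        // Arrays.mismatch returns the index of the first differing element,
        // or -1 when the arrays are equal.
        System.out.println(Arrays.mismatch(a, differAt(a, 0)));        // 0  (start)
        System.out.println(Arrays.mismatch(a, differAt(a, size / 2))); // 32 (middle)
        System.out.println(Arrays.mismatch(a, differAt(a, size - 1))); // 63 (end)
        System.out.println(Arrays.mismatch(a, a.clone()));             // -1 (no mismatch)
    }
}
```

Under this reading, a mismatch at the start returns after the first comparison regardless of array size, while a mismatch at the end has to scan the whole array, so the End rows are where larger vector groupings have the most room to help.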

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17750#issuecomment-3179138714

