RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic [v6]

Yuri Gaevsky duke at openjdk.org
Thu Aug 14 18:55:13 UTC 2025


On Thu, 14 Aug 2025 08:19:00 GMT, Fei Yang <fyang at openjdk.org> wrote:

> Not sure if I understand it correctly, but from your JMH numbers seems we are using more `ns` for each `op` with UseRVV?

That's true - the LMUL change to `m2` is an attempt to improve performance for small sizes as much as possible. I've just measured larger sizes, and the results are also not good:

m2:
========================================================================================
                                                    -XX:-UseRVV      -XX:+UseRVV
========================================================================================
Benchmark                     (size)  Mode  Cnt    Score   Error    Score   Error  Units
Int.differentSubrangeMatches     100  avgt   10  137.172 ± 0.054   98.497 ± 0.310  ns/op
Int.differentSubrangeMatches     200  avgt   10  156.312 ± 0.281  140.852 ± 0.361  ns/op
Int.differentSubrangeMatches     300  avgt   10  327.659 ± 0.317  191.959 ± 0.440  ns/op
Int.differentSubrangeMatches     400  avgt   10  240.912 ± 0.429  230.264 ± 0.164  ns/op
Int.differentSubrangeMatches     500  avgt   10  523.581 ± 0.292  286.112 ± 0.307  ns/op
Int.differentSubrangeMatches     600  avgt   10  352.296 ± 0.480  322.362 ± 0.924  ns/op
Int.differentSubrangeMatches     700  avgt   10  725.652 ± 0.555  382.037 ± 0.434  ns/op
Int.differentSubrangeMatches     800  avgt   10  455.651 ± 1.003  412.241 ± 0.411  ns/op
----------------------------------------------------------------------------------------
Int.matches                      100  avgt   10  143.116 ± 0.627  128.433 ± 0.057  ns/op
Int.matches                      200  avgt   10  227.868 ± 0.190  231.481 ± 0.343  ns/op
Int.matches                      300  avgt   10  336.983 ± 0.094  301.416 ± 0.279  ns/op
Int.matches                      400  avgt   10  440.492 ± 0.503  389.587 ± 0.752  ns/op
Int.matches                      500  avgt   10  524.292 ± 0.828  490.197 ± 1.283  ns/op
Int.matches                      600  avgt   10  627.717 ± 0.880  577.573 ± 0.764  ns/op
Int.matches                      700  avgt   10  730.503 ± 0.281  719.430 ± 0.278  ns/op
Int.matches                      800  avgt   10  831.331 ± 0.446  810.678 ± 0.482  ns/op
----------------------------------------------------------------------------------------
Int.mismatchEnd                  100  avgt   10  133.878 ± 0.434  106.791 ± 0.056  ns/op
Int.mismatchEnd                  200  avgt   10  220.972 ± 1.055  223.622 ± 0.110  ns/op
Int.mismatchEnd                  300  avgt   10  326.363 ± 0.069  294.368 ± 0.076  ns/op
Int.mismatchEnd                  400  avgt   10  432.284 ± 0.311  380.235 ± 0.096  ns/op
Int.mismatchEnd                  500  avgt   10  512.964 ± 0.139  466.615 ± 0.135  ns/op
Int.mismatchEnd                  600  avgt   10  613.120 ± 0.291  568.137 ± 0.120  ns/op
Int.mismatchEnd                  700  avgt   10  716.861 ± 0.667  709.291 ± 0.571  ns/op
Int.mismatchEnd                  800  avgt   10  821.902 ± 0.564  740.929 ± 0.241  ns/op
----------------------------------------------------------------------------------------
Int.mismatchMid                  100  avgt   10   84.289 ± 0.221   77.660 ± 0.018  ns/op
Int.mismatchMid                  200  avgt   10  142.339 ± 0.228  120.884 ± 0.037  ns/op
Int.mismatchMid                  300  avgt   10  170.238 ± 0.248  164.259 ± 0.457  ns/op
Int.mismatchMid                  400  avgt   10  221.964 ± 0.555  207.503 ± 0.126  ns/op
Int.mismatchMid                  500  avgt   10  275.343 ± 0.848  252.248 ± 0.395  ns/op
Int.mismatchMid                  600  avgt   10  322.031 ± 0.173  317.887 ± 0.314  ns/op
Int.mismatchMid                  700  avgt   10  371.653 ± 0.259  337.068 ± 0.069  ns/op
Int.mismatchMid                  800  avgt   10  419.094 ± 0.087  394.663 ± 0.231  ns/op
----------------------------------------------------------------------------------------
Int.mismatchStart                100  avgt   10   28.920 ± 0.179   34.449 ± 0.015  ns/op
Int.mismatchStart                200  avgt   10   28.845 ± 0.051   35.706 ± 0.022  ns/op
Int.mismatchStart                300  avgt   10   28.928 ± 0.051   34.444 ± 0.008  ns/op
Int.mismatchStart                400  avgt   10   29.369 ± 0.127   35.698 ± 0.008  ns/op
Int.mismatchStart                500  avgt   10   29.953 ± 0.595   34.488 ± 0.045  ns/op
Int.mismatchStart                600  avgt   10   28.809 ± 0.008   34.459 ± 0.011  ns/op
Int.mismatchStart                700  avgt   10   28.930 ± 0.124   35.702 ± 0.009  ns/op
Int.mismatchStart                800  avgt   10   28.814 ± 0.017   35.697 ± 0.009  ns/op
========================================================================================
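
To make the table easier to read, here is a minimal sketch that turns a few of the rows above into a -UseRVV / +UseRVV ratio. The three rows are copied verbatim from the m2 table. The `speedup` helper and the `scores` dictionary are names made up for this illustration and are not part of the patch or of the JMH harness. Since the mode is `avgt` (ns/op, lower is better), a ratio above 1 means +UseRVV is faster and a ratio below 1 means it is slower.

```python
# Scores (ns/op) copied from the m2 table above.
# Key: (benchmark, size); value: (-XX:-UseRVV score, -XX:+UseRVV score).
# Only three sample rows are included; this is not the full table.
scores = {
    ("Int.differentSubrangeMatches", 100): (137.172, 98.497),
    ("Int.matches", 800): (831.331, 810.678),
    ("Int.mismatchStart", 100): (28.920, 34.449),
}


def speedup(no_rvv_ns, rvv_ns):
    """Hypothetical helper: ratio of -UseRVV time to +UseRVV time.

    A value above 1 means the +UseRVV run took fewer ns/op.
    """
    return no_rvv_ns / rvv_ns


for (name, size), (no_rvv, rvv) in scores.items():
    print(f"{name:30s} {size:4d}  {speedup(no_rvv, rvv):.2f}x")
```

On these sample rows the ratio is above 1 for `differentSubrangeMatches` and `matches`, and below 1 for `mismatchStart`, where the mismatch is found immediately and the fixed vector setup cost presumably dominates.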

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17750#issuecomment-3189538950

