RFR: 8324124: RISC-V: implement _vectorizedMismatch intrinsic [v2]

Yuri Gaevsky duke at openjdk.org
Sat Apr 26 10:29:54 UTC 2025


On Thu, 24 Apr 2025 17:27:39 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

>> Hello All,
>> 
>> Please review these changes to enable the _vectorizedMismatch intrinsic on RISC-V platforms that support RVV instructions.
>> 
>> Thank you,
>> -Yuri Gaevsky
>> 
>> **Correctness checks:**
>>   hotspot/jtreg/compiler/{intrinsic/c1/c2}/ under QEMU-8.1 with RVV v1.0.0 and -XX:TieredStopAtLevel=1/2/3/4.
>
> Yuri Gaevsky has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits:
> 
>  - Merge master
>  - 8324124: RISC-V: implement _vectorizedMismatch intrinsic

JFTR: the numbers after the above merge, measured on real RVV-1.0 hardware (a BPI-F3 box with 16 GB of RAM), are below:

Legend: UseVMI ==> UseVectorizedMismatchIntrinsic
--------------------------------------------------------------------------------------------
                                                    (baseline)              (patch)
--------------------------------------------------------------------------------------------
                                           |-XX:-UseVMI -XX:+UseRVV|-XX:+UseVMI -XX:+UseRVV|
--------------------------------------------------------------------------------------------
Benchmark                      (size)  Mode  Cnt     Score    Error    Score    Error  Units
Byte.differentSubrangeMatches      90  avgt   50    90.835 ± 10.252   68.210 ±  0.229  ns/op
Byte.differentSubrangeMatches     800  avgt   50   242.182 ±  0.746  116.399 ± 13.814  ns/op
Byte.matches                       90  avgt   50    73.180 ±  2.713   58.416 ±  0.127  ns/op
Byte.matches                      800  avgt   50   272.341 ± 34.117  158.017 ±  0.084  ns/op
Byte.mismatchEnd                   90  avgt   50    59.989 ±  7.329   56.029 ±  0.172  ns/op
Byte.mismatchEnd                  800  avgt   50   266.516 ± 31.592  157.103 ±  1.889  ns/op
Byte.mismatchMid                   90  avgt   50    48.171 ±  6.174   61.952 ±  7.008  ns/op
Byte.mismatchMid                  800  avgt   50   147.665 ±  0.307   91.287 ±  0.622  ns/op
Byte.mismatchStart                 90  avgt   50    24.798 ±  2.390   63.256 ±  7.751  ns/op
Byte.mismatchStart                800  avgt   50    25.437 ±  2.515   66.168 ±  8.645  ns/op
--------------------------------------------------------------------------------------------
Short.differentSubrangeMatches      90  avgt   50   90.105 ± 10.843   90.357 ± 12.323  ns/op
Short.differentSubrangeMatches     800  avgt   50  454.344 ± 56.206  193.659 ± 22.036  ns/op
Short.matches                       90  avgt   50   84.396 ±  0.408   60.820 ±  0.012  ns/op
Short.matches                      800  avgt   50  443.555 ±  0.863  263.479 ±  2.496  ns/op
Short.mismatchEnd                   90  avgt   50   76.201 ±  0.312   55.378 ±  0.144  ns/op
Short.mismatchEnd                  800  avgt   50  454.414 ± 38.085  259.003 ±  1.868  ns/op
Short.mismatchMid                   90  avgt   50   48.644 ±  0.286   62.407 ±  6.994  ns/op
Short.mismatchMid                  800  avgt   50  220.501 ±  0.369  174.774 ± 20.577  ns/op
Short.mismatchStart                 90  avgt   50   23.942 ±  0.400   57.576 ±  4.415  ns/op
Short.mismatchStart                800  avgt   50   24.404 ±  0.045   75.023 ±  9.965  ns/op
--------------------------------------------------------------------------------------------
Int.differentSubrangeMatches      90  avgt   50    94.477 ±   0.242   72.728 ±  0.956  ns/op
Int.differentSubrangeMatches     800  avgt   50   452.339 ±   0.254  274.710 ±  0.736  ns/op
Int.matches                       90  avgt   50   129.260 ±   0.284   97.297 ±  0.467  ns/op
Int.matches                      800  avgt   50  1060.039 ± 137.858  481.137 ± 32.131  ns/op
Int.mismatchEnd                   90  avgt   50   125.489 ±   0.252   87.385 ±  0.160  ns/op
Int.mismatchEnd                  800  avgt   50   858.671 ±  67.981  456.869 ±  0.131  ns/op
Int.mismatchMid                   90  avgt   50    78.305 ±   0.583   55.822 ±  1.331  ns/op
Int.mismatchMid                  800  avgt   50   418.998 ±   0.246  252.832 ±  0.161  ns/op
Int.mismatchStart                 90  avgt   50    32.955 ±   3.942   56.859 ±  3.755  ns/op
Int.mismatchStart                800  avgt   50    31.276 ±   3.301   54.804 ±  0.200  ns/op
--------------------------------------------------------------------------------------------
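
For readers who do not have the benchmark source at hand, here is a minimal sketch of the operation being measured. It assumes that the benchmark names refer to where the first differing element sits (none, start, middle, end); that mapping is my reading of the names, not something taken from the benchmark code. MismatchDemo and mismatchAt are hypothetical names used only for illustration. As far as I know, java.util.Arrays.mismatch is the public entry point that ends up in the vectorizedMismatch code for sufficiently long arrays.

```java
import java.util.Arrays;

public class MismatchDemo {
    // Hypothetical helper: builds two equal byte arrays of the given size,
    // flips one bit of the copy at index `at` (no change when at < 0),
    // and returns the index of the first mismatch (-1 if the arrays match).
    static int mismatchAt(int size, int at) {
        byte[] a = new byte[size];
        for (int i = 0; i < size; i++) {
            a[i] = (byte) i;
        }
        byte[] b = a.clone();
        if (at >= 0) {
            b[at] ^= 1; // guarantees b[at] != a[at]
        }
        return Arrays.mismatch(a, b);
    }

    public static void main(String[] args) {
        int size = 800; // one of the two sizes used in the table
        System.out.println(mismatchAt(size, -1));       // arrays match: -1
        System.out.println(mismatchAt(size, 0));        // mismatch at start: 0
        System.out.println(mismatchAt(size, size / 2)); // mismatch in the middle: 400
        System.out.println(mismatchAt(size, size - 1)); // mismatch at end: 799
    }
}
```

Under that reading, the mismatchStart rows are the case where the answer is found at index 0, so any fixed setup cost of the vector path is not amortized over a long scan.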

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17750#issuecomment-2832019831

