RFR: 8355667: RISC-V: Add backend implementation for unsigned vector Min / Max operations
Gui Cao
gcao at openjdk.org
Mon Apr 28 06:06:44 UTC 2025
On Mon, 28 Apr 2025 02:09:35 GMT, Fei Yang <fyang at openjdk.org> wrote:
> Hi, please review this change.
> https://bugs.openjdk.org/browse/JDK-8338021 proposed new vector operators including Unsigned Vector Min / Max.
> This intrinsify Unsigned Vector Min / Max operations with RVV extension for RISC-V backend to improve performance.
> This also enables some extra IR tests in file `test/hotspot/jtreg/compiler/vectorapi/VectorCommutativeOperSharingTest.java`.
>
> Testing:
> - [x] make test TEST="jdk_vector" (QEMU / fastdebug)
>
> JMH tested on BPI-F3 SBC for reference:
> Before:
>
> ByteMaxVector.UMAX 1024 thrpt 5 58.657 ± 17.216 ops/ms
> ByteMaxVector.UMAXMasked 1024 thrpt 5 45.581 ± 18.164 ops/ms
> ByteMaxVector.UMIN 1024 thrpt 5 55.275 ± 15.863 ops/ms
> ByteMaxVector.UMINMasked 1024 thrpt 5 44.651 ± 29.209 ops/ms
> ShortMaxVector.UMAX 1024 thrpt 5 24.146 ± 7.570 ops/ms
> ShortMaxVector.UMAXMasked 1024 thrpt 5 21.506 ± 0.430 ops/ms
> ShortMaxVector.UMIN 1024 thrpt 5 24.261 ± 6.993 ops/ms
> ShortMaxVector.UMINMasked 1024 thrpt 5 20.980 ± 1.622 ops/ms
> IntMaxVector.UMAX 1024 thrpt 5 10.780 ± 0.812 ops/ms
> IntMaxVector.UMAXMasked 1024 thrpt 5 10.609 ± 0.851 ops/ms
> IntMaxVector.UMIN 1024 thrpt 5 10.845 ± 0.578 ops/ms
> IntMaxVector.UMINMasked 1024 thrpt 5 10.705 ± 0.562 ops/ms
> LongMaxVector.UMAX 1024 thrpt 5 5.445 ± 0.439 ops/ms
> LongMaxVector.UMAXMasked 1024 thrpt 5 5.387 ± 0.285 ops/ms
> LongMaxVector.UMIN 1024 thrpt 5 5.379 ± 0.407 ops/ms
> LongMaxVector.UMINMasked 1024 thrpt 5 5.373 ± 0.236 ops/ms
>
>
> After:
>
> ByteMaxVector.UMAX 1024 thrpt 5 2552.161 ± 121.213 ops/ms
> ByteMaxVector.UMAXMasked 1024 thrpt 5 2444.001 ± 105.139 ops/ms
> ByteMaxVector.UMIN 1024 thrpt 5 2616.963 ± 3.065 ops/ms
> ByteMaxVector.UMINMasked 1024 thrpt 5 2367.968 ± 1057.028 ops/ms
> ShortMaxVector.UMAX 1024 thrpt 5 1363.676 ± 9.294 ops/ms
> ShortMaxVector.UMAXMasked 1024 thrpt 5 1321.759 ± 121.471 ops/ms
> ShortMaxVector.UMIN 1024 thrpt 5 1368.598 ± 5.251 ops/ms
> ShortMaxVector.UMINMasked 1024 thrpt 5 1337.044 ± 1.434 ops/ms
> IntMaxVector.UMAX 1024 thrpt 5 566.509 ± 0.658 ops/ms
> IntMaxVector.UMAXMasked 1024 thrpt 5 559.456 ± 0.491 ops/ms
> IntMaxVector.UMIN 1024 thrpt 5 569.238 ± 1.309 ops/ms
> IntMaxVector.UMINMasked 1024 thrpt 5 535.359 ± 15.765 ops/ms
> LongMaxVector.UMAX ...
LGTM.
-------------
Marked as reviewed by gcao (Author).
PR Review: https://git.openjdk.org/jdk/pull/24909#pullrequestreview-2798062768
More information about the hotspot-compiler-dev
mailing list