RFR: 8307609: RISC-V: Added support for Extract, Compress, Expand and other nodes for Vector API [v6]

Fei Yang fyang at openjdk.org
Wed May 17 03:05:58 UTC 2023


On Tue, 16 May 2023 12:44:36 GMT, Dingli Zhang <dzhang at openjdk.org> wrote:

>> Hi all,
>> 
>> We have added support for Extract, Compress, Expand and other nodes for the
>> Vector API, implemented with reference to RVV v1.0 [1]. Please take a look
>> and review. Thanks a lot.
>> 
>> In this PR, we will support these new nodes:
>> 
>> CompressM/CompressV/ExpandV
>> LoadVectorGather/StoreVectorScatter/LoadVectorGatherMasked/StoreVectorScatterMasked
>> Extract
>> VectorLongToMask/VectorMaskToLong
>> PopulateIndex
>> VectorMaskTrueCount/VectorMaskFirstTrue
>> VectorInsert
>> 
>> 
>> At the same time, we refactored methods such as
>> `match_rule_supported_vector_mask`. All implemented vector nodes support mask
>> operations by default now, so we also added mask nodes for all implemented
>> nodes. 
>> 
>> By the way, we will implement the VectorTest node in the next PR.
>> 
>> We can use the tests under `test/jdk/jdk/incubator/vector` to print the 
>> compilation log for most of the new nodes. And we can use the following 
>> command to print the compilation log of a jtreg test case:
>> 
>> 
>> $ jtreg \
>> -v:default \
>> -concurrency:16 -timeout:50 \
>> -javaoption:-XX:+UnlockExperimentalVMOptions \
>> -javaoption:-XX:+UseRVV \
>> -javaoption:-XX:+PrintOptoAssembly \
>> -javaoption:-XX:LogFile=log_name.log \
>> -jdk:build/linux-riscv64-server-fastdebug/jdk \
>> -compilejdk:build/linux-x86_64-server-release/images/jdk \
>> <test-case>
>> 
>> 
>> 
>> 
>> ### CompressM/CompressV/ExpandV
>> 
>> RVV provides no inverse of vcompress (no vdecompress), since that operation
>> can be readily synthesized using iota and a masked vrgather. `ExpandV` is
>> implemented this way.
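
For readers less familiar with this idiom, here is a minimal scalar sketch of how an expand can be built from an iota step followed by a masked gather. It is only an illustrative model, not the HotSpot code in this PR: the class `ExpandSketch` and the methods `iota` and `expand` are hypothetical names, and it assumes that inactive lanes of the result are zero.

```java
import java.util.Arrays;

// Scalar model of "expand = iota + masked gather". All names here are
// hypothetical and exist only for illustration.
class ExpandSketch {
    // Models the iota step: out[i] is the number of set mask bits
    // at indices strictly below i (an exclusive prefix count).
    static int[] iota(boolean[] mask) {
        int[] out = new int[mask.length];
        int count = 0;
        for (int i = 0; i < mask.length; i++) {
            out[i] = count;
            if (mask[i]) {
                count++;
            }
        }
        return out;
    }

    // Models the masked gather: each active lane i reads src[idx[i]];
    // inactive lanes keep the zero they were initialized with (assumption).
    static int[] expand(int[] src, boolean[] mask) {
        int[] idx = iota(mask);
        int[] dst = new int[src.length]; // zero-filled by Java
        for (int i = 0; i < src.length; i++) {
            if (mask[i]) {
                dst[i] = src[idx[i]];
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40};
        boolean[] mask = {false, true, false, true};
        // Prints [0, 10, 0, 20]
        System.out.println(Arrays.toString(expand(src, mask)));
    }
}
```

The leading elements of `src` are spread out to the lanes selected by the mask, which is the inverse of what a compress does with the same mask.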
>> 
>> We can use `test/jdk/jdk/incubator/vector/Float256VectorTests.java` to emit
>> these nodes and the compilation log is as follows:
>> 
>> 
>> ## CompressM
>> 2aa     addi  R29, R10, #16	# ptr, #@addP_reg_imm
>> 2ae     mcompress V0, V30	# KILL R30
>> 2c2     vstoremask V2, V0
>> 2ce     storeV [R7], V2	# vector (rvv)
>> 2d6     bgeu  R29, R28, B47	#@cmpP_branch  P=0.000100 C=-1.000000
>> 
>> ## CompressV
>> 0ee     addi  R29, R10, #16	# ptr, #@addP_reg_imm
>> 0f2     vcompress V1, V2, V0
>> 0fe     storeV [R7], V1	# vector (rvv)
>> 106     bgeu  R29, R28, B10	#@cmpP_branch  P=0.000100 C=-1.000000
>> 
>> ## ExpandV
>> 0ee     addi  R29, R10, #16	# ptr, #@addP_reg_imm
>> 0f2     vexpand V3, V2, V0
>> 102     storeV [R7], V3	# vector (rvv)
>> 10a     bgeu  R29, R28, B10	#@cmpP_branch  P=0.000100 C=-1.000000
>> 
>> 
>> 
>> 
>> ### LoadVectorGather/StoreVectorScatter/LoadVectorGatherMasked/StoreVectorScatterMasked
>> 
>> We use the vs...
>
> Dingli Zhang has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Remove trailing whitespace

Thanks for the update. Would you mind a few more tweaks?

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1647:

> 1645: void C2_MacroAssembler::minmax_fp_masked_v(VectorRegister dst, VectorRegister src1, VectorRegister src2,
> 1646:                                            VectorRegister vmask, int vector_length, VectorRegister tmp1,
> 1647:                                            VectorRegister tmp2, bool is_double, bool is_min) {

Suggestion: make `vector_length` the last parameter, so that the signature is consistent in style with its sibling `C2_MacroAssembler::minmax_fp_v`.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1683:

> 1681: 
> 1682:   is_min ? vfredmin_vs(tmp1, src2, tmp2, vm)
> 1683:          : vfredmax_vs(tmp1, src2, tmp2, vm);

Suggestion: move the result of the reduction into `dst` with `vfmv_f_s(dst, tmp1)` here, which saves the `j(L_done_check)` at line 1695.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1717:

> 1715: void C2_MacroAssembler::reduce_integral_v(Register dst, VectorRegister tmp,
> 1716:                                           Register src1, VectorRegister src2,
> 1717:                                           BasicType bt, int opc, int vector_length, VectorMask vm) {

Suggested parameter order: dst, src1, src2, tmp, opc, bt, vector_length, vm

-------------

Changes requested by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/13862#pullrequestreview-1429683625
PR Review Comment: https://git.openjdk.org/jdk/pull/13862#discussion_r1195852453
PR Review Comment: https://git.openjdk.org/jdk/pull/13862#discussion_r1195876189
PR Review Comment: https://git.openjdk.org/jdk/pull/13862#discussion_r1195876780


More information about the hotspot-compiler-dev mailing list