RFR: 8307609: RISC-V: Added support for Extract, Compress, Expand and other nodes for Vector API
Fei Yang
fyang at openjdk.org
Mon May 15 02:02:54 UTC 2023
On Mon, 8 May 2023 11:04:09 GMT, Dingli Zhang <dzhang at openjdk.org> wrote:
> Hi all,
>
> We have added support for Extract, Compress, Expand and other nodes for Vector
> API. It was implemented by referring to RVV v1.0 [1]. Please take a look and
> review. Thanks a lot.
>
> In this PR, we will support these new nodes:
>
> CompressM/CompressV/ExpandV
> LoadVectorGather/StoreVectorScatter/LoadVectorGatherMasked/StoreVectorScatterMasked
> Extract
> VectorLongToMask/VectorMaskToLong
> PopulateIndex
> VectorMaskTrueCount/VectorMaskFirstTrue
> VectorInsert
>
>
> At the same time, we refactored methods such as
> `match_rule_supported_vector_mask`. All implemented vector nodes support mask
> operations by default now, so we also added mask nodes for all implemented
> nodes.
>
> By the way, we will implement the VectorTest node in the next PR.
>
> We can use the tests under `test/jdk/jdk/incubator/vector` to print the
> compilation log for most of the new nodes. And we can use the following
> command to print the compilation log of a jtreg test case:
>
>
> $ jtreg \
> -v:default \
> -concurrency:16 -timeout:50 \
> -javaoption:-XX:+UnlockExperimentalVMOptions \
> -javaoption:-XX:+UseRVV \
> -javaoption:-XX:+PrintOptoAssembly \
> -javaoption:-XX:LogFile=log_name.log \
> -jdk:build/linux-riscv64-server-fastdebug/jdk \
> -compilejdk:build/linux-x86_64-server-release/images/jdk \
> <test-case>
>
>
>
>
> ### CompressM/CompressV/ExpandV
>
> There is no inverse vdecompress provided in RVV, as this operation can be
> readily synthesized using iota and a masked vrgather in `ExpandV`.
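For readers following along, the iota plus masked-gather synthesis mentioned above can be modelled in scalar C++. This is only a minimal sketch and not the code from the PR: the names `iota_model` and `expand_model` are hypothetical, the mask and source are assumed to have the same length, and inactive lanes are assumed to be zeroed, as the Vector API `expand` operation specifies.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar model of viota.m: element i receives the number of
// set mask bits strictly below position i (an exclusive prefix count).
std::vector<uint32_t> iota_model(const std::vector<bool>& mask) {
    std::vector<uint32_t> out(mask.size());
    uint32_t count = 0;
    for (std::size_t i = 0; i < mask.size(); ++i) {
        out[i] = count;
        if (mask[i]) {
            ++count;
        }
    }
    return out;
}

// Hypothetical scalar model of expand: zero the destination, then do a
// masked gather that reads src[idx[i]] only for lanes where the mask is set.
// Assumes src.size() == mask.size().
std::vector<int32_t> expand_model(const std::vector<int32_t>& src,
                                  const std::vector<bool>& mask) {
    std::vector<uint32_t> idx = iota_model(mask);
    std::vector<int32_t> dst(src.size(), 0);   // inactive lanes stay zero
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (mask[i]) {
            dst[i] = src[idx[i]];              // idx[i] <= i, so always in range
        }
    }
    return dst;
}
```

With `src = {10, 20, 30, 40}` and `mask = {0, 1, 0, 1}`, the index vector is `{0, 0, 1, 1}`, so the first two source elements land in lanes 1 and 3 and the result is `{0, 10, 0, 20}`.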
>
> We can use `test/jdk/jdk/incubator/vector/Float256VectorTests.java` to emit
> these nodes and the compilation log is as follows:
>
>
> ## CompressM
> 2aa addi R29, R10, #16 # ptr, #@addP_reg_imm
> 2ae mcompress V0, V30 # KILL R30
> 2c2 vstoremask V2, V0
> 2ce storeV [R7], V2 # vector (rvv)
> 2d6 bgeu R29, R28, B47 #@cmpP_branch P=0.000100 C=-1.000000
>
> ## CompressV
> 0ee addi R29, R10, #16 # ptr, #@addP_reg_imm
> 0f2 vcompress V1, V2, V0
> 0fe storeV [R7], V1 # vector (rvv)
> 106 bgeu R29, R28, B10 #@cmpP_branch P=0.000100 C=-1.000000
>
> ## ExpandV
> 0ee addi R29, R10, #16 # ptr, #@addP_reg_imm
> 0f2 vexpand V3, V2, V0
> 102 storeV [R7], V3 # vector (rvv)
> 10a bgeu R29, R28, B10 #@cmpP_branch P=0.000100 C=-1.000000
>
>
>
>
> ### LoadVectorGather/StoreVectorScatter/LoadVectorGatherMasked/StoreVectorScatterMasked
>
> We use the vsoxei32_v instruction regardless of what sew is set to. The
> indexMap in fromArr...
Some initial comments from a cursory look.
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1635:
> 1633:
> 1634: // Set dst to NaN if any NaN input.
> 1635: void C2_MacroAssembler::minmax_fp_masked_v(VectorRegister dst_src1, VectorRegister src2,
Better to break down `dst_src1` into two separate operands, i.e., `dst` and `src1`.
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1644:
> 1642: // Check vector elements of src1 and src2 for quiet and signaling NaN.
> 1643: vfclass_v(tmp1, dst_src1);
> 1644: vfclass_v(tmp2, src2);
As discussed offline, a better way to find NaN elements in a vector is the `vmfne` instruction, e.g. `vmfne.vv v0, va, va`. vmfne writes 1 to the destination mask element when the corresponding element of `va` is NaN.
src/hotspot/cpu/riscv/riscv_v.ad line 4134:
> 4132: __ vsetvli_helper(bt, Matcher::vector_length(this));
> 4133: __ vid_v(as_VectorRegister($v0$$reg));
> 4134: __ mv($tmp1$$Register, (int)($idx$$constant));
Suggestion: make `idx` a register input operand. That eliminates this `mv` instruction and possibly the reserved `tmp1` register as well.
-------------
Changes requested by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13862#pullrequestreview-1425664153
PR Review Comment: https://git.openjdk.org/jdk/pull/13862#discussion_r1193257710
PR Review Comment: https://git.openjdk.org/jdk/pull/13862#discussion_r1193261133
PR Review Comment: https://git.openjdk.org/jdk/pull/13862#discussion_r1193256663