RFR: 8306966: RISC-V: Support vector cast node for Vector API [v8]
Fei Yang
fyang at openjdk.org
Sat May 6 03:08:18 UTC 2023
On Fri, 5 May 2023 06:18:23 GMT, Gui Cao <gcao at openjdk.org> wrote:
>> Hi,
>>
>> We have added some implementations related to vector cast, written with reference to RVV v1.0 [1]. Please take a look and review. Thanks a lot.
>>
>> We can use VectorReshapeTests.java [2] to print the compilation log and to verify and observe the generated nodes.
>>
>> For example, we can use the following command to print the compilation log of a jtreg test case:
>>
>> ```
>> /home/zifeihan/jdk-tools/jtreg/bin/jtreg \
>> -v:default \
>> -concurrency:16 -timeout:50 \
>> -javaoption:-XX:+UnlockExperimentalVMOptions \
>> -javaoption:-XX:+UseRVV \
>> -javaoption:-XX:+PrintOptoAssembly \
>> -javaoption:-XX:LogFile=/home/zifeihan/jdk-rvv/VectorReshapeTests_PrintOptoAssembly_20230426.log \
>> -jdk:/home/zifeihan/jdk-rvv/build/linux-riscv64-server-fastdebug/jdk \
>> -compilejdk:/home/zifeihan/jdk-rvv/build/linux-x86_64-server-release/images/jdk \
>> /home/zifeihan/jdk/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> ```
>>
>> #### VectorCast/VectorCastB2X/VectorCastD2X/VectorCastF2X/VectorCastI2X/VectorCastL2X/VectorCastS2X
>> There are too many nodes to show them all, so the following shows only the log for the `VectorCastB2X` node:
>>
>> ```
>> 1ba0 ld R28, [R23, #280] # ptr, #@loadP
>> 1ba4 addi R29, R7, #32 # ptr, #@addP_reg_imm
>> 1ba8 reinterpretResize V1, V5
>> 1bb0 vcvtBtoX V4, V1
>> 1bb8 far_bgeu R29, R28, B465 #@far_cmpP_branch P=0.000100 C=-1.000000
>> ```
>>
>> #### VectorRearrange
>>
>> When the original vector is converted to the target vector, if the actual number of elements of the original vector is greater than the number of elements of the target vector, a slicing action is performed to provide data for subsequent cast nodes. The slicing action depends on the VectorRearrange node.
>>
>> The compilation log for the `VectorRearrange` node:
>>
>> ```
>> 1f6 spill R7 -> [sp, #320] # spill size = 64
>> 1f8 spill [sp, #128] -> V1 # vector spill size = 256
>> 200 spill [sp, #160] -> V2 # vector spill size = 256
>> 208 rearrange V3, V1, V2
>> 210 spill V3 -> [sp, #96] # vector spill size = 256
>> 218 li R11, #4 # int, #@loadConI
>> ```
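A minimal scalar sketch of the slicing described above, in plain Java rather than the Vector API or the actual C2 node logic. It assumes a byte-to-int cast where the source holds more lanes than the target, so one group of source lanes is selected first and then widened. The class name, the method name `castB2I`, and the `part` parameter are hypothetical and only serve the illustration.

```java
import java.util.Arrays;

// Hypothetical scalar model: slice first, then cast.
final class SliceThenCastModel {
    // Selects the part-th group of dstLanes source lanes (the slicing step)
    // and sign-extends each selected byte to int (the cast step).
    // Lanes that fall past the end of src are filled with zero.
    static int[] castB2I(byte[] src, int dstLanes, int part) {
        int origin = part * dstLanes;
        int[] dst = new int[dstLanes];
        for (int i = 0; i < dstLanes; i++) {
            int j = origin + i;
            dst[i] = (j < src.length) ? src[j] : 0;
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = {1, -2, 3, 4, 5, 6, 7, 8};
        // 8 source lanes, 4 target lanes: two possible slices.
        System.out.println(Arrays.toString(castB2I(src, 4, 0)));
        System.out.println(Arrays.toString(castB2I(src, 4, 1)));
    }
}
```

In this model the slice is a plain index offset; in the compiled code the lane selection is what the `rearrange` instruction in the log above provides to the following cast.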
>>
>> #### VectorReinterpret
>> If `num_elem_from` and `num_elem_to` are not equal, a VectorReinterpret node is needed to reset the element count to the correct value.
>> https://github.com/openjdk/jdk/blob/3554e7a3ffb879c7e5ef7547eb053e484d09d12b/src/hotspot/share/opto/vectorIntrinsics.cpp#L2374-L2376
>> The compilation log for the `VectorReinterpret` node:
>>
>> ```
>> 1218 spill [sp, #32] -> V4 # vector spill size = 256
>> 1220 spill [sp, #176] -> V3 # vector spill size = 256
>> 1228 rearrange V2, V4, V3
>> 1230 spill [sp, #72] -> V0 # vmask spill size = 32
>> 123c vmerge_vvm V1, V1, V2, v0 #@vector blend
>> 1244 reinterpretResize V2, V1
>> 124c vcvtStoX_extend V5, V2
>> 1254 bgeu R28, R7, B169 #@cmpP_branch P=0.000100 C=-1.000000
>> ```
>>
>> #### LShiftCntV/RShiftCntV
>>
>> We have merged the `LShiftCntV` and `RShiftCntV` nodes and added support for boolean types.
>>
>> The compilation log for the `LShiftCntV`/`RShiftCntV` nodes:
>>
>> ```
>> 24c vasrB V3, V1, V2
>> 260 storeV [R19], V3 # vector (rvv)
>> 268 lbu R19, [R29, #48] # byte, #@loadUB
>> 26c andi R19, R19, #7 #@andI_reg_imm
>> 270 loadV V1, [R25] # vector (rvv)
>> 278 vshiftcnt V2, R19
>> 280 vasrB V3, V1, V2
>> 294 storeV [R26], V3 # vector (rvv)
>> 29c lbu R19, [R29, #80] # byte, #@loadUB
>> 2a0 andi R19, R19, #7 #@andI_reg_imm
>> 2a4 loadV V1, [R22] # vector (rvv)
>> 2ac vshiftcnt V2, R19
>> ```
>>
>> [1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> Testing:
>> qemu with UseRVV:
>>
>> - [x] Tier1 tests (release)
>> - [x] Tier2 tests (release)
>> - [x] Tier3 tests (release)
>> - [x] test/jdk/jdk/incubator/vector (fastdebug)
>
> Gui Cao has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision:
>
> - Merge remote-tracking branch 'upstream/master' into JDK-8306966
> - rename rvv_vsetvli to vsetvli_helper
> - Fix round mode and optimize widen/narrow vcast
> - Small refactoring of rvv_vsetvli
> - Fix VectorCastF2X
> - During the conversion, specify the number of vectors
> - Use zr register instead of x0
> - 8306966: RISC-V: Support vector cast node for Vector API
Thanks for the update. Would you mind a few more tweaks?
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1823:
> 1821: // High part of dst vector will be filled with zero.
> 1822: void C2_MacroAssembler::integer_narrow_v(VectorRegister dst, BasicType dst_bt, int vector_length,
> 1823: VectorRegister src, BasicType src_bt, VectorRegister tmp) {
If you allocate different vector registers for 'dst' and 'src' at the call site, then we should be able to eliminate the 'tmp' register parameter for this function. That is, save the intermediate result in 'dst' instead.
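To illustrate the idea, here is a rough scalar sketch in plain Java, not the actual RVV assembler code. It assumes the narrowing is done in halving steps (for example 32 -> 16 -> 8 bits) and models each lane as a `long`; the names `narrowStep` and `narrow` are hypothetical. The first step writes from 'src' into 'dst', and every later step works on 'dst' in place, so no separate 'tmp' is involved and 'src' is left untouched.

```java
// Hypothetical scalar model of multi-step narrowing that keeps
// the intermediate result in dst instead of a separate tmp.
final class NarrowViaDstModel {
    // One halving step: keep only the low toBits bits of each lane.
    // dst and src may be the same array.
    static void narrowStep(long[] dst, long[] src, int toBits) {
        long mask = (1L << toBits) - 1;
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i] & mask;
        }
    }

    // Narrows lanes from srcBits to dstBits (assumes srcBits > dstBits,
    // both powers of two). src is never written.
    static long[] narrow(long[] src, int srcBits, int dstBits) {
        long[] dst = new long[src.length];
        // First step: src -> dst.
        narrowStep(dst, src, srcBits / 2);
        // Remaining steps: dst -> dst, the intermediate lives in dst.
        for (int bits = srcBits / 4; bits >= dstBits; bits /= 2) {
            narrowStep(dst, dst, bits);
        }
        return dst;
    }

    public static void main(String[] args) {
        long[] src = {0x12345678L, 0xFFFFFF80L};
        long[] dst = narrow(src, 32, 8);
        System.out.println(Long.toHexString(dst[0]) + " " + Long.toHexString(dst[1]));
    }
}
```

The model only shows the data flow; whether the in-place steps are valid for a given register grouping on real hardware still has to be checked against the RVV overlap rules.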
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.hpp line 239:
> 237: VectorRegister src, BasicType src_bt, VectorRegister tmp);
> 238:
> 239: void vfcvt_rtz_xu_f_v_safe(VectorRegister dst, VectorRegister src);
I don't think we need the unsigned version. Could you please remove them?
-------------
Changes requested by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/13684#pullrequestreview-1415641580
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1186604752
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1186605732