RFR: 8306966: RISC-V: Support vector cast node for Vector API [v6]
Feilong Jiang
fjiang at openjdk.org
Thu May 4 12:41:21 UTC 2023
On Wed, 3 May 2023 12:46:15 GMT, Gui Cao <gcao at openjdk.org> wrote:
>> Hi,
>>
>> We have added implementations for the vector cast nodes, following the RVV v1.0 specification [1]. Please take a look and review. Thanks a lot.
>>
>> VectorReshapeTests.java [2] can be used to print the compilation log and verify that the expected nodes are generated.
>>
>> For example, we can use the following command to print the compilation log of a jtreg test case:
>>
>> ```
>> /home/zifeihan/jdk-tools/jtreg/bin/jtreg \
>> -v:default \
>> -concurrency:16 -timeout:50 \
>> -javaoption:-XX:+UnlockExperimentalVMOptions \
>> -javaoption:-XX:+UseRVV \
>> -javaoption:-XX:+PrintOptoAssembly \
>> -javaoption:-XX:LogFile=/home/zifeihan/jdk-rvv/VectorReshapeTests_PrintOptoAssembly_20230426.log \
>> -jdk:/home/zifeihan/jdk-rvv/build/linux-riscv64-server-fastdebug/jdk \
>> -compilejdk:/home/zifeihan/jdk-rvv/build/linux-x86_64-server-release/images/jdk \
>> /home/zifeihan/jdk/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> ```
>>
>> #### VectorCast/VectorCastB2X/VectorCastD2X/VectorCastF2X/VectorCastI2X/VectorCastL2X/VectorCastS2X
>> There are too many nodes to show them all, so the following shows only the log for the `VectorCastB2X` node:
>>
>> ```
>> 1ba0 ld R28, [R23, #280] # ptr, #@loadP
>> 1ba4 addi R29, R7, #32 # ptr, #@addP_reg_imm
>> 1ba8 reinterpretResize V1, V5
>> 1bb0 vcvtBtoX V4, V1
>> 1bb8 far_bgeu R29, R28, B465 #@far_cmpP_branch P=0.000100 C=-1.000000
>> ```
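For readers less familiar with these nodes, here is a minimal scalar sketch of what a byte-lane cast is expected to compute per lane, assuming ordinary Java conversion semantics (sign extension when widening, truncation to the low bits when narrowing). The class and method names are hypothetical and are not part of HotSpot or of this patch.

```java
// Hypothetical scalar model of lane-wise vector casts; not HotSpot code.
final class CastModel {
    // Widening byte -> int: each lane is sign-extended, like a Java (int) cast.
    static int[] castB2I(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Narrowing int -> byte: each lane keeps only its low 8 bits.
    static byte[] castI2B(int[] src) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) src[i];
        }
        return dst;
    }
}
```

A generated `vcvtBtoX` sequence should agree with such a model lane by lane, which is roughly what the jtreg test checks against scalar results.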
>>
>> #### VectorRearrange
>>
>> When a source vector is cast to a target vector and the source holds more elements than the target, the source is sliced first to provide data for the subsequent cast nodes. The slicing depends on the VectorRearrange node.
>>
>> The compilation log for the `VectorRearrange` node:
>>
>> ```
>> 1f6 spill R7 -> [sp, #320] # spill size = 64
>> 1f8 spill [sp, #128] -> V1 # vector spill size = 256
>> 200 spill [sp, #160] -> V2 # vector spill size = 256
>> 208 rearrange V3, V1, V2
>> 210 spill V3 -> [sp, #96] # vector spill size = 256
>> 218 li R11, #4 # int, #@loadConI
>> ```
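The slicing step described above can be pictured with a small scalar sketch. It assumes that rearrange places `src[indices[i]]` into lane `i`, and that a slice selects a contiguous run of source lanes; the index computation shown is my assumption for illustration, not taken from the JDK sources, and all names are hypothetical.

```java
// Hypothetical scalar model of slicing via rearrange; not HotSpot code.
final class RearrangeModel {
    // Lane i of the result takes the source lane selected by indices[i].
    static int[] rearrange(int[] src, int[] indices) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[indices[i]];
        }
        return dst;
    }

    // Moves the lanes of the requested part to the front, then keeps only
    // the first targetLanes lanes for the following cast.
    static int[] slicePart(int[] src, int part, int targetLanes) {
        int[] indices = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            indices[i] = (part * targetLanes + i) % src.length;
        }
        return java.util.Arrays.copyOf(rearrange(src, indices), targetLanes);
    }
}
```

For example, slicing part 1 out of an 8-lane source with 4 target lanes would select source lanes 4 through 7.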
>>
>> #### VectorReinterpret
>> If `num_elem_from` and `num_elem_to` are not equal, a Reinterpret is needed to resize the vector to the correct number of elements.
>> https://github.com/openjdk/jdk/blob/3554e7a3ffb879c7e5ef7547eb053e484d09d12b/src/hotspot/share/opto/vectorIntrinsics.cpp#L2374-L2376
>> The compilation log for the `VectorReinterpret` node:
>>
>> ```
>> 1218 spill [sp, #32] -> V4 # vector spill size = 256
>> 1220 spill [sp, #176] -> V3 # vector spill size = 256
>> 1228 rearrange V2, V4, V3
>> 1230 spill [sp, #72] -> V0 # vmask spill size = 32
>> 123c vmerge_vvm V1, V1, V2, v0 #@vector blend
>> 1244 reinterpretResize V2, V1
>> 124c vcvtStoX_extend V5, V2
>> 1254 bgeu R28, R7, B169 #@cmpP_branch P=0.000100 C=-1.000000
>> ```
>>
>> #### LShiftCntV/RShiftCntV
>>
>> We have merged the `LShiftCntV` and `RShiftCntV` nodes and added support for boolean types.
>>
>> The compilation log for the LShiftCntV/RShiftCntV node:
>>
>> ```
>> 24c vasrB V3, V1, V2
>> 260 storeV [R19], V3 # vector (rvv)
>> 268 lbu R19, [R29, #48] # byte, #@loadUB
>> 26c andi R19, R19, #7 #@andI_reg_imm
>> 270 loadV V1, [R25] # vector (rvv)
>> 278 vshiftcnt V2, R19
>> 280 vasrB V3, V1, V2
>> 294 storeV [R26], V3 # vector (rvv)
>> 29c lbu R19, [R29, #80] # byte, #@loadUB
>> 2a0 andi R19, R19, #7 #@andI_reg_imm
>> 2a4 loadV V1, [R22] # vector (rvv)
>> 2ac vshiftcnt V2, R19
>> ```
>>
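As I read the log above, the scalar count is first masked to the byte lane width (`andi ..., #7`), then broadcast (`vshiftcnt`), then applied per lane (`vasrB`). A minimal scalar sketch of that sequence follows; the mapping of each statement to an instruction is my interpretation, and the names are hypothetical.

```java
// Hypothetical scalar model of the masked, broadcast shift count; not HotSpot code.
final class ShiftModel {
    static byte[] ashrBytes(byte[] src, int count) {
        // Mask the count to 0..7, the valid range for 8-bit lanes.
        int masked = count & 7;
        // Broadcast the masked count to every lane.
        int[] counts = new int[src.length];
        java.util.Arrays.fill(counts, masked);
        // Arithmetic (sign-preserving) right shift per lane.
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) (src[i] >> counts[i]);
        }
        return dst;
    }
}
```

With this model, a count of 9 behaves like a count of 1 for byte lanes.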
>> [1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> Testing:
>> qemu with UseRVV:
>>
>> - [ ] Tier1 tests (release)
>> - [ ] Tier2 tests (release)
>> - [ ] Tier3 tests (release)
>> - [x] test/jdk/jdk/incubator/vector (fastdebug)
>
> Gui Cao has updated the pull request incrementally with one additional commit since the last revision:
>
> Fix round mode and optimize widen/narrow vcast
Overall looks good, with some suggestions:
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1673:
> 1671: }
> 1672:
> 1673: void C2_MacroAssembler::rvv_reduce_integral(Register dst, VectorRegister tmp,
Could you please rename this to `reduce_integral_v`? We already use the `xxxxx_v` naming style.
Suggestion:
void C2_MacroAssembler::reduce_integral_v(Register dst, VectorRegister tmp,
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1711:
> 1709: // Set vl and vtype for full and partial vector operations.
> 1710: // (vma = mu, vta = tu, vill = false)
> 1711: void C2_MacroAssembler::rvv_vsetvli(BasicType bt, int vector_length, LMUL vlmul, Register tmp) {
Same here:
Suggestion:
void C2_MacroAssembler::vsetvli_v(BasicType bt, int vector_length, LMUL vlmul, Register tmp) {
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1784:
> 1782: }
> 1783:
> 1784: void C2_MacroAssembler::vector_integer_extend(VectorRegister dst, BasicType dst_bt, int vector_length,
Suggestion:
void C2_MacroAssembler::integer_extend_v(VectorRegister dst, BasicType dst_bt, int vector_length,
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1822:
> 1820: // Vector narrow from src to dst with specified element sizes.
> 1821: // High part of dst vector will be filled with zero.
> 1822: void C2_MacroAssembler::vector_integer_narrow(VectorRegister dst, BasicType dst_bt, int vector_length,
Suggestion:
void C2_MacroAssembler::integer_narrow_v(VectorRegister dst, BasicType dst_bt, int vector_length,
-------------
Changes requested by fjiang (Author).
PR Review: https://git.openjdk.org/jdk/pull/13684#pullrequestreview-1412980702
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184948465
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184948943
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184945401
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184945681