RFR: 8306966: RISC-V: Support vector cast node for Vector API [v6]

Thu May 4 12:41:21 UTC 2023

On Wed, 3 May 2023 12:46:15 GMT, Gui Cao <gcao at openjdk.org> wrote:

>> Hi, 
>> 
>> We have added implementations for the vector cast nodes, following RVV v1.0 [1]. Please take a look and review. Thanks a lot.
>> 
>> We can use VectorReshapeTests.java [2] to print the compilation log and to verify and observe the generated nodes.
>> 
>> For example, we can use the following command to print the compilation log of a jtreg test case:
>> 
>>    ```
>> /home/zifeihan/jdk-tools/jtreg/bin/jtreg \
>> -v:default \
>> -concurrency:16 -timeout:50 \
>> -javaoption:-XX:+UnlockExperimentalVMOptions \
>> -javaoption:-XX:+UseRVV \
>> -javaoption:-XX:+PrintOptoAssembly \
>> -javaoption:-XX:LogFile=/home/zifeihan/jdk-rvv/VectorReshapeTests_PrintOptoAssembly_20230426.log \
>> -jdk:/home/zifeihan/jdk-rvv/build/linux-riscv64-server-fastdebug/jdk \
>> -compilejdk:/home/zifeihan/jdk-rvv/build/linux-x86_64-server-release/images/jdk \
>> /home/zifeihan/jdk/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>>    ```
>> 
>> #### VectorCast/VectorCastB2X/VectorCastD2X/VectorCastF2X/VectorCastI2X/VectorCastL2X/VectorCastS2X
>> There are too many nodes to list them all, so only the log for the `VectorCastB2X` node is shown:
>> 
>>    ```
>>    1ba0    ld  R28, [R23, #280]	# ptr, #@loadP
>>    1ba4    addi  R29, R7, #32	# ptr, #@addP_reg_imm
>>    1ba8    reinterpretResize V1, V5
>>    1bb0    vcvtBtoX V4, V1
>>    1bb8    far_bgeu  R29, R28, B465	#@far_cmpP_branch  P=0.000100 C=-1.000000
>>    ```
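
For readers following the log, here is a scalar sketch of what a lane-wise byte cast has to compute. It only models the Vector API semantics as I understand them (each byte lane is sign-extended to the wider type); it is not the RVV code from the patch, and the class and method names are made up for illustration.

```java
import java.util.Arrays;

// Hypothetical scalar model of a lane-wise byte -> int vector cast.
final class CastB2XSketch {
    // Each byte lane is sign-extended into the corresponding int lane.
    static int[] castByteToInt(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i]; // implicit widening conversion sign-extends
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] lanes = {-1, 0, 127, -128};
        System.out.println(Arrays.toString(castByteToInt(lanes)));
    }
}
```

A generated `vcvtBtoX` sequence can be checked against a model like this one lane at a time.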
>> 
>> #### VectorRearrange
>> 
>> When a source vector is cast to a target vector and the source holds more elements than the target can hold, a slice is taken first to supply the data for the subsequent cast node. The slice relies on the VectorRearrange node.
>> 
>> The compilation log for the `VectorRearrange` node:
>> 
>>    ```
>> 1f6     spill R7 -> [sp, #320]	# spill size = 64
>> 1f8     spill [sp, #128] -> V1	# vector spill size = 256
>> 200     spill [sp, #160] -> V2	# vector spill size = 256
>> 208     rearrange V3, V1, V2
>> 210     spill V3 -> [sp, #96]	# vector spill size = 256
>> 218     li R11, #4	# int, #@loadConI
>>    ```
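
The slice-then-cast step described above can be sketched in scalar form. This is only an illustration under my own assumptions: the rearrange is modelled as a rotating index shuffle, the "part" parameter and all names are hypothetical, and the patch may build its shuffle differently.

```java
// Hypothetical scalar model: slice with a rearrange, then widen byte -> int.
final class SliceThenCastSketch {
    // Model of a rearrange: lane i of the result takes lane indexes[i] of src.
    static byte[] rearrange(byte[] src, int[] indexes) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[indexes[i]];
        }
        return dst;
    }

    // Moves the selected block of dstLanes source lanes to the low lanes,
    // then casts only those low lanes to int.
    static int[] castPart(byte[] src, int dstLanes, int part) {
        int origin = part * dstLanes;
        if (part < 0 || dstLanes <= 0 || origin + dstLanes > src.length) {
            throw new IllegalArgumentException("part out of range");
        }
        int[] indexes = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            indexes[i] = (origin + i) % src.length; // rotate left by origin lanes
        }
        byte[] sliced = rearrange(src, indexes);
        int[] dst = new int[dstLanes];
        for (int i = 0; i < dstLanes; i++) {
            dst[i] = sliced[i]; // sign-extending widen of the low lanes
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = new byte[32];
        for (int i = 0; i < src.length; i++) {
            src[i] = (byte) i;
        }
        System.out.println(java.util.Arrays.toString(castPart(src, 8, 1)));
    }
}
```

With 32 byte lanes and 8 int lanes, part 1 selects source lanes 8 through 15.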
>> 
>> #### VectorReinterpret
>> If num_elem_from and num_elem_to are not equal, a VectorReinterpret node is needed to set the correct element count.
>> https://github.com/openjdk/jdk/blob/3554e7a3ffb879c7e5ef7547eb053e484d09d12b/src/hotspot/share/opto/vectorIntrinsics.cpp#L2374-L2376
>> The compilation log for the `VectorReinterpret` node:
>> 
>>    ```
>> 1218    spill [sp, #32] -> V4	# vector spill size = 256
>> 1220    spill [sp, #176] -> V3	# vector spill size = 256
>> 1228    rearrange V2, V4, V3
>> 1230    spill [sp, #72] -> V0	# vmask spill size = 32
>> 123c    vmerge_vvm V1, V1, V2, v0	#@vector blend
>> 1244    reinterpretResize V2, V1
>> 124c    vcvtStoX_extend V5, V2
>> 1254    bgeu  R28, R7, B169	#@cmpP_branch  P=0.000100 C=-1.000000
>>    ```
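
As a rough model of the `reinterpretResize` step, the sketch below treats a vector as a raw byte payload that is truncated or zero-padded to the new size and then viewed with a different lane type. This follows my reading of the Vector API reshape semantics and assumes a little-endian lane layout; the names are hypothetical and nothing here is taken from the patch.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical scalar model of a resizing reinterpret.
final class ReinterpretResizeSketch {
    // Keeps the low bytes; truncates when shrinking, zero-pads when growing.
    static byte[] resize(byte[] srcBits, int dstBytes) {
        return Arrays.copyOf(srcBits, dstBytes);
    }

    // Views the payload as short lanes (little-endian is an assumption).
    static short[] asShorts(byte[] bits) {
        short[] lanes = new short[bits.length / 2];
        ByteBuffer.wrap(bits).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer().get(lanes);
        return lanes;
    }

    public static void main(String[] args) {
        byte[] payload = {1, 0, -1, -1};
        System.out.println(Arrays.toString(resize(payload, 8)));
        System.out.println(Arrays.toString(asShorts(resize(payload, 2))));
    }
}
```

The bit pattern is unchanged by the resize; only the number of lanes visible to the following cast node differs.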
>> 
>> ####  LShiftCntV/RShiftCntV
>> 
>> We have merged the `LShiftCntV` and `RShiftCntV` nodes and added support for boolean types.
>> 
>> The compilation log for the LShiftCntV/RShiftCntV node:
>> 
>>    ```
>> 24c     vasrB V3, V1, V2
>> 260     storeV [R19], V3	# vector (rvv)
>> 268     lbu  R19, [R29, #48]	# byte, #@loadUB
>> 26c     andi  R19, R19, #7	#@andI_reg_imm
>> 270     loadV V1, [R25]	# vector (rvv)
>> 278     vshiftcnt V2, R19
>> 280     vasrB V3, V1, V2
>> 294     storeV [R26], V3	# vector (rvv)
>> 29c     lbu  R19, [R29, #80]	# byte, #@loadUB
>> 2a0     andi  R19, R19, #7	#@andI_reg_imm
>> 2a4     loadV V1, [R22]	# vector (rvv)
>> 2ac     vshiftcnt V2, R19
>>    ```
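
The `andi ..., #7` before each `vshiftcnt` in the log matches the Vector API rule that a shift count on byte lanes is reduced to its low three bits. A scalar sketch of the arithmetic right shift, with hypothetical names and no claim about how the patch implements it:

```java
import java.util.Arrays;

// Hypothetical scalar model of an arithmetic right shift on byte lanes.
final class ByteShiftSketch {
    static byte[] ashr(byte[] src, int count) {
        int masked = count & 7; // byte lanes use only the low 3 bits of the count
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) (src[i] >> masked); // >> keeps the sign bit
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] lanes = {-128, 64, -1};
        System.out.println(Arrays.toString(ashr(lanes, 9)));
    }
}
```

A count of 9 therefore behaves like a count of 1 for byte lanes.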
>> 
>> [1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> Testing:
>> qemu with UseRVV:
>> 
>> - [ ] Tier1 tests (release)
>> - [ ] Tier2 tests (release)
>> - [ ] Tier3 tests (release)
>> - [x] test/jdk/jdk/incubator/vector (fastdebug)
>
> Gui Cao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix round mode and optimize widen/narrow vcast

Overall looks good, with some suggestions:

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1673:

> 1671: }
> 1672: 
> 1673: void C2_MacroAssembler::rvv_reduce_integral(Register dst, VectorRegister tmp,

Could you please rename this to `reduce_integral_v`? We already use the `xxxxx_v` naming style.
Suggestion:

void C2_MacroAssembler::reduce_integral_v(Register dst, VectorRegister tmp,

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1711:

> 1709: // Set vl and vtype for full and partial vector operations.
> 1710: // (vma = mu, vta = tu, vill = false)
> 1711: void C2_MacroAssembler::rvv_vsetvli(BasicType bt, int vector_length, LMUL vlmul, Register tmp) {

Same here:
Suggestion:

void C2_MacroAssembler::vsetvli_v(BasicType bt, int vector_length, LMUL vlmul, Register tmp) {

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1784:

> 1782: }
> 1783: 
> 1784: void C2_MacroAssembler::vector_integer_extend(VectorRegister dst, BasicType dst_bt, int vector_length,

Suggestion:

void C2_MacroAssembler::integer_extend_v(VectorRegister dst, BasicType dst_bt, int vector_length,

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1822:

> 1820: // Vector narrow from src to dst with specified element sizes.
> 1821: // High part of dst vector will be filled with zero.
> 1822: void C2_MacroAssembler::vector_integer_narrow(VectorRegister dst, BasicType dst_bt, int vector_length,

Suggestion:

void C2_MacroAssembler::integer_narrow_v(VectorRegister dst, BasicType dst_bt, int vector_length,

-------------

Changes requested by fjiang (Author).

PR Review: https://git.openjdk.org/jdk/pull/13684#pullrequestreview-1412980702
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184948465
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184948943
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184945401
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1184945681