RFR: 8306966: RISC-V: Support vector cast node for Vector API [v3]

Fei Yang fyang at openjdk.org
Fri Apr 28 02:44:53 UTC 2023


On Thu, 27 Apr 2023 14:03:58 GMT, Gui Cao <gcao at openjdk.org> wrote:

>> Hi, 
>> 
>> We have added some implementations related to vector cast, written with reference to RVV v1.0 [1]. Please take a look and review. Thanks a lot.
>> 
>> We can use VectorReshapeTests.java [2] to print the compilation log and to verify and observe the generated nodes.
>> 
>> For example, we can use the following command to print the compilation log of a jtreg test case:
>> 
>> 
>> /home/zifeihan/jdk-tools/jtreg/bin/jtreg \
>> -v:default \
>> -concurrency:16 -timeout:50 \
>> -javaoption:-XX:+UnlockExperimentalVMOptions \
>> -javaoption:-XX:+UseRVV \
>> -javaoption:-XX:+PrintOptoAssembly \
>> -javaoption:-XX:LogFile=/home/zifeihan/jdk-rvv/VectorReshapeTests_PrintOptoAssembly_20230426.log \
>> -jdk:/home/zifeihan/jdk-rvv/build/linux-riscv64-server-fastdebug/jdk \
>> -compilejdk:/home/zifeihan/jdk-rvv/build/linux-x86_64-server-release/images/jdk \
>> /home/zifeihan/jdk/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> 
>> 
>> #### VectorCast/VectorCastB2X/VectorCastD2X/VectorCastF2X/VectorCastI2X/VectorCastL2X/VectorCastS2X
>> There are too many nodes to list here; the following shows the log for the `VectorCastB2X` node:
>> 
>>    ```
>>    1ba0    ld  R28, [R23, #280]	# ptr, #@loadP
>>    1ba4    addi  R29, R7, #32	# ptr, #@addP_reg_imm
>>    1ba8    reinterpretResize V1, V5
>>    1bb0    vcvtBtoX V4, V1
>>    1bb8    far_bgeu  R29, R28, B465	#@far_cmpP_branch  P=0.000100 C=-1.000000
>>    ```
>> 
>> #### VectorRearrange/VectorReinterpret
>> 
>> When the original vector is converted to the target vector and the original vector has more elements than the target vector, a slice operation is performed to provide data for the subsequent cast nodes. The slice operation depends on the `VectorRearrange` and `VectorReinterpret` nodes.
>> 
>> The compilation log for the `VectorRearrange` node:
>> 
>>    ```
>> 1f6     spill R7 -> [sp, #320]	# spill size = 64
>> 1f8     spill [sp, #128] -> V1	# vector spill size = 256
>> 200     spill [sp, #160] -> V2	# vector spill size = 256
>> 208     rearrange V3, V1, V2
>> 210     spill V3 -> [sp, #96]	# vector spill size = 256
>> 218     li R11, #4	# int, #@loadConI
>>    ```
>> 
>> The compilation log for the `VectorReinterpret` node:
>> 
>> 
>> 1218    spill [sp, #32] -> V4	# vector spill size = 256
>> 1220    spill [sp, #176] -> V3	# vector spill size = 256
>> 1228    rearrange V2, V4, V3
>> 1230    spill [sp, #72] -> V0	# vmask spill size = 32
>> 123c    vmerge_vvm V1, V1, V2, v0	#@vector blend
>> 1244    reinterpretResize V2, V1
>> 124c    vcvtStoX_extend V5, V2
>> 1254    bgeu  R28, R7, B169	#@cmpP_branch  P=0.000100 C=-1.000000
>> 
>> 
>> #### LShiftCntV/RShiftCntV/MaskAll
>> 
>> We have merged the `LShiftCntV` and `RShiftCntV` nodes and added support for boolean types.
>> 
>> The compilation log for the LShiftCntV/RShiftCntV node:
>> 
>> 
>> 24c     vasrB V3, V1, V2
>> 260     storeV [R19], V3	# vector (rvv)
>> 268     lbu  R19, [R29, #48]	# byte, #@loadUB
>> 26c     andi  R19, R19, #7	#@andI_reg_imm
>> 270     loadV V1, [R25]	# vector (rvv)
>> 278     vshiftcnt V2, R19
>> 280     vasrB V3, V1, V2
>> 294     storeV [R26], V3	# vector (rvv)
>> 29c     lbu  R19, [R29, #80]	# byte, #@loadUB
>> 2a0     andi  R19, R19, #7	#@andI_reg_imm
>> 2a4     loadV V1, [R22]	# vector (rvv)
>> 2ac     vshiftcnt V2, R19
>> 
>> 
>> By the way, the mask version of MaskAll is supported.
>> 
>> [1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
>> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/VectorReshapeTests.java
>> 
>> Testing:
>> QEMU with UseRVV:
>> 
>> - [ ] Tier1 tests (release)
>> - [ ] Tier2 tests (release)
>> - [ ] Tier3 tests (release)
>> - [x] test/jdk/jdk/incubator/vector (fastdebug)
>
> Gui Cao has updated the pull request incrementally with one additional commit since the last revision:
> 
>   During the conversion, specify the number of vectors

Changes requested by fyang (Reviewer).

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1797:

> 1795:   assert_different_registers(dst, src);
> 1796: 
> 1797:   rvv_vsetvli(dst_bt, length_in_bytes);

I think we should use the actual AVL instead of 'length_in_bytes' for rvv_vsetvli?
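
For reference, AVL in the RVV spec counts elements, not bytes, so a byte length only matches the AVL when the elements are one byte wide. A minimal standalone sketch of the difference, assuming a hypothetical helper `avl_in_elements` (it is not part of the patch or of HotSpot) and a byte length that is a multiple of the element size:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only (not HotSpot code):
// derive the element count (AVL) from a vector length in bytes
// and the size of one element in bytes.
// Assumes length_in_bytes is a multiple of element_size_in_bytes.
constexpr std::size_t avl_in_elements(std::size_t length_in_bytes,
                                      std::size_t element_size_in_bytes) {
  return length_in_bytes / element_size_in_bytes;
}

// The same 32-byte vector holds a different number of elements
// depending on the element type.
static_assert(avl_in_elements(32, 1) == 32, "32 byte-sized elements");
static_assert(avl_in_elements(32, 4) == 8,  "8 int-sized elements");
static_assert(avl_in_elements(32, 8) == 4,  "4 long-sized elements");
```

In a cast the source and destination element sizes differ, so the byte length of one side and the element count shared by both sides are not interchangeable.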

src/hotspot/cpu/riscv/riscv_v.ad line 2837:

> 2835:     if (bt == T_LONG) {
> 2836:       __ vector_integer_extend(as_VectorRegister($dst$$reg), T_LONG,
> 2837:                                Matcher::vector_length_in_bytes(this), as_VectorRegister($dst$$reg), T_INT);

Will this work? I see you are asserting that the 'dst' and 'src' vector registers are different in vector_integer_extend, but the same vector register is passed for these two parameters here.
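
To illustrate the concern with a standalone sketch: the names `VReg`, `registers_differ` and `extend_precondition_holds` below are hypothetical stand-ins, not HotSpot code, and only model the "dst and src must differ" check quoted above. Passing the same register for both arguments fails that check:

```cpp
#include <cassert>

// Hypothetical stand-in for a vector register; only the encoding matters here.
struct VReg {
  int encoding;
};

// Hypothetical predicate modelling the intent of an
// "assert different registers" check on two operands.
inline bool registers_differ(VReg a, VReg b) {
  return a.encoding != b.encoding;
}

// Hypothetical model of the precondition of an extend helper that
// requires distinct destination and source registers.
inline bool extend_precondition_holds(VReg dst, VReg src) {
  return registers_differ(dst, src);
}
```

With this model, `extend_precondition_holds(VReg{4}, VReg{1})` is true, while `extend_precondition_holds(VReg{4}, VReg{4})` is false, which corresponds to the call site above where `$dst$$reg` is used for both operands. In a debug build, a failed check of this kind would be expected to stop the VM rather than emit code.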

src/hotspot/cpu/riscv/riscv_v.ad line 2885:

> 2883: %}
> 2884: 
> 2885: instruct vcvtDtoF(vReg dst_src1, vReg tmp) %{

Why not break down 'dst_src1' into two separate 'dst' and 'src' inputs, as you do for 'vcvtFtoD'?

-------------

PR Review: https://git.openjdk.org/jdk/pull/13684#pullrequestreview-1405146050
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1179873004
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1179875372
PR Review Comment: https://git.openjdk.org/jdk/pull/13684#discussion_r1179873881


More information about the hotspot-compiler-dev mailing list