RFR: 8333006: RISC-V: C2: Support vector-scalar and vector-immediate arithmetic instructions [v2]
Fei Yang
fyang at openjdk.org
Wed May 29 02:27:02 UTC 2024
On Tue, 28 May 2024 10:49:26 GMT, Gui Cao <gcao at openjdk.org> wrote:
>> Hi, this change adds support for vector-scalar and vector-immediate arithmetic instructions, implemented with reference to RVV v1.0 [1]. Please take a look and review. Thanks a lot.
>> Byte256VectorTests.java [2] can be used to print the Opto JIT code and to verify that the new nodes are generated.
>>
>> For example, the following command prints the Opto JIT code of a jtreg test case:
>>
>>
>> /home/zifeihan/jtreg/bin/jtreg \
>> -v:default \
>> -concurrency:16 -timeout:50 \
>> -javaoption:-XX:+UnlockExperimentalVMOptions \
>> -javaoption:-XX:+UseRVV \
>> -javaoption:-XX:+PrintOptoAssembly \
>> -javaoption:-XX:LogFile=/home/zifeihan/jdk/Byte256VectorTests_PrintOptoAssembly.log \
>> -jdk:/home/zifeihan/jdk/build/linux-riscv64-server-fastdebug/jdk \
>> /home/zifeihan/jdk/test/jdk/jdk/incubator/vector/Byte256VectorTests.java
>>
>>
>>
>> The resulting compilation log, `Byte256VectorTests_PrintOptoAssembly.log`, contains the vector-scalar and vector-immediate arithmetic instructions implemented by this PR.
>>
>> vadd_immI Node
>>
>> 16c addw R11, R10, zr #@convI2L_reg_reg
>> 170 add R9, R31, R11 # ptr, #@addP_reg_reg
>> 174 addi R9, R9, #16 # ptr, #@addP_reg_imm
>> 176 loadV V1, [R9] # vector (rvv)
>> 17e vadd_immI V1, V1, #7
>> 186 add R11, R15, R11 # ptr, #@addP_reg_reg
>> 188 addi R11, R11, #16 # ptr, #@addP_reg_imm
>> 18a storeV [R11], V1 # vector (rvv)
>>
>>
>> vadd_immI_masked Node
>>
>> 1e8 B31: # out( B37 B32 ) <- in( B30 ) Freq: 76.2281
>> 1e8 loadV V2, [R31] # vector (rvv)
>> 1f0 vloadmask V0, V1
>> 1f8 vadd_immI_masked V2, V2, #7
>> 200 addi R31, R10, #48 # ptr, #@addP_reg_imm
>> 204 bgeu R30, R7, B37 #@cmpU_branch P=0.000001 C=-1.000000
>>
>>
>> vadd_regI Node
>>
>> 0c4 B4: # out( B9 B5 ) <- in( B8 B3 ) Freq: 1
>> 0c4 vloadcon V1 # generate iota indices
>> 0cc spill [sp, #4] -> R30 # spill size = 32
>> 0ce vmul_regI V1, V1, R30
>> 0d6 spill [sp, #0] -> R29 # spill size = 32
>> 0d8 vadd_regI V1, V1, R29
>>
>>
>> vadd_regI_masked Node
>>
>> 244 B36: # out( B33 B37 ) <- in( B35 ) Freq: 7427.81
>> 244 # castII of R30, #@castII
>> 244 addw R31, R30, zr #@convI2L_reg_reg
>> 248 spill [sp, #32] -> R10 # spill size = 64
>> 24a add R10, R10, R31 # ptr, #@addP_reg_reg
>> 24c addi R10, R10, #16 # ptr, #@addP_reg_imm
>> 24e loadV V2, [R10] # vector (rvv)
>> 256 vloadmask V0, V1
>> 25e vadd_regI_masked V2, V2, R29
>>
>>
>> ...
>
> Gui Cao has updated the pull request incrementally with one additional commit since the last revision:
>
> Code Format
Updated change looks good. Thanks.
-------------
Marked as reviewed by fyang (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/19415#pullrequestreview-2084137672
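
For readers following the quoted node listings, here is a rough scalar model of what the vadd_immI and vadd_regI_masked nodes compute per lane. It is a sketch in plain Java, not the Vector API and not the C2 implementation. The class and method names (LaneModel, addImmediate, addScalarMasked) are hypothetical, and the model assumes byte lanes with wrap-around arithmetic and that lanes with the mask bit cleared keep their source value.

```java
import java.util.Arrays;

// Hypothetical scalar model of lane-wise byte addition; for illustration only.
class LaneModel {
    // Unmasked vector-immediate add: every lane becomes src[i] + imm,
    // truncated to 8 bits (so the addition wraps on overflow).
    static byte[] addImmediate(byte[] src, int imm) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) (src[i] + imm);
        }
        return dst;
    }

    // Masked vector-scalar add: only lanes whose mask bit is set are updated.
    // Assumption: lanes with the mask bit cleared keep the source value.
    static byte[] addScalarMasked(byte[] src, int scalar, boolean[] mask) {
        byte[] dst = src.clone();
        for (int i = 0; i < src.length; i++) {
            if (mask[i]) {
                dst[i] = (byte) (src[i] + scalar);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] v = {1, 2, 3, 127};
        boolean[] m = {true, false, true, false};
        // prints [8, 9, 10, -122] (127 + 7 wraps around in 8 bits)
        System.out.println(Arrays.toString(addImmediate(v, 7)));
        // prints [6, 2, 8, 127] (lanes 1 and 3 are masked off)
        System.out.println(Arrays.toString(addScalarMasked(v, 5, m)));
    }
}
```

The immediate 7 mirrors the `#7` operand in the quoted vadd_immI listing; the scalar 5 and the mask pattern are arbitrary values chosen for the example.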