RFR: 8331281: RISC-V: C2: Support vector-scalar and vector-immediate bitwise logic instructions

Gui Cao gcao at openjdk.org
Tue Apr 30 13:03:16 UTC 2024


Hi, we want to support vector-scalar and vector-immediate bitwise logic instructions in C2 on RISC-V. The implementation follows RVV v1.0 [1]. Please take a look and review. Thanks a lot.
Int256VectorTests.java [2] can be used to print the compilation log and to verify that the new nodes are generated.

For example, we can use the following command to print the compilation log of a jtreg test case:


/home/zifeihan/jdk-tools/jtreg/bin/jtreg \
-v:default \
-concurrency:16 -timeout:50 \
-javaoption:-XX:+UnlockExperimentalVMOptions \
-javaoption:-XX:+UseRVV \
-javaoption:-XX:+PrintOptoAssembly \
-javaoption:-XX:LogFile=/home/zifeihan/jdk/Int256VectorTests_PrintOptoAssembly.log \
-jdk:/home/zifeihan/jdk/build/linux-riscv64-server-fastdebug/jdk \
/home/zifeihan/jdk/test/jdk/jdk/incubator/vector/Int256VectorTests.java



The resulting compilation log, `Int256VectorTests_PrintOptoAssembly.log`, contains the vector-scalar and vector-immediate bitwise logic nodes implemented by this PR.
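To check the log without reading it by hand, a small helper can count how often each node name appears. The following is only a sketch: the class name `OptoNodeCounter` is hypothetical, the node list is copied from the listings in this mail, and it assumes the node name appears as a whitespace-delimited token on each line, as in the excerpts here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the PR: counts node names in a PrintOptoAssembly log.
final class OptoNodeCounter {
    // Node names as they appear in the Int256VectorTests listings of this mail.
    static final List<String> NODES = List.of(
        "vand_immI", "vor_regI", "vxor_regI",
        "vand_regI_masked", "vor_regI_masked", "vxor_regI_masked",
        "vnotI", "vnotI_masked");

    // Counts exact token matches, so "vor_regI" does not also match "vor_regI_masked".
    static Map<String, Integer> count(List<String> lines, List<String> nodes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String node : nodes) {
            counts.put(node, 0);
        }
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (counts.containsKey(token)) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path given to -XX:LogFile in the jtreg command.
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        count(lines, NODES).forEach((node, n) -> System.out.println(node + " " + n));
    }
}
```

A count of zero for a node suggests that the corresponding match rule did not fire in that run, which may simply mean the relevant test method was not compiled by C2.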

vand_immI Node


0b4     vloadcon V3	# generate iota indices
0bc     vmla V2, V2, V3, V1
0c4     vand_immI V2, V2, #7
0cc     addi  R7, R30, #16	# ptr, #@addP_reg_imm
0d0     storeV [R7], V2	# vector (rvv)
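The immediate `#7` above can be encoded directly because the RVV v1.0 `.vi` forms of vand, vor and vxor carry a 5-bit sign-extended immediate, i.e. a value in the range -16 to 15. The range check can be sketched as below; the class and method names are hypothetical and this is not the operand predicate used in the patch.

```java
// Hypothetical sketch of the simm5 range check for vector-immediate forms.
final class Simm5 {
    // A 5-bit two's-complement immediate covers -16..15 inclusive.
    static boolean fitsSimm5(long value) {
        return value >= -16 && value <= 15;
    }

    public static void main(String[] args) {
        long[] samples = {7, -3, 15, 16, -16, -17};
        for (long s : samples) {
            System.out.println(s + " fits simm5: " + fitsSimm5(s));
        }
    }
}
```

Constants outside this range cannot use the immediate form and would presumably go through a scalar register instead, as in the vor_regI and vxor_regI listings.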


vor_regI Node


180     vor_regI V1, V1, R30
188     add R31, R14, R31	# ptr, #@addP_reg_reg
18a     addi  R31, R31, #16	# ptr, #@addP_reg_imm
18c     storeV [R31], V1	# vector (rvv)
194     addiw  R11, R11, #8	#@addI_reg_imm
196     blt  R11, R13, B17	#@cmpI_loop  P=0.500000 C=30564.000000


vxor_regI Node

198     vxor_regI V1, V1, R30
1a0     add R14, R16, R14	# ptr, #@addP_reg_reg
1a2     addi  R14, R14, #16	# ptr, #@addP_reg_imm
1a4     storeV [R14], V1	# vector (rvv)
1ac     addiw  R11, R11, #8	#@addI_reg_imm
1ae     blt  R11, R13, B21	#@cmpI_loop  P=0.500000 C=30564.000000


vand_regI_masked Node

234     B31: #	out( B40 B32 ) <- in( B30 )  Freq: 78.5481
234     loadV V2, [R15]	# vector (rvv)
23c     vand_regI_masked V2, V2, R11
244     storeV [R9], V2	# vector (rvv)
24c     mv R10, #8	# int, #@loadConI
24e     ble  R7, R10, B40	#@cmpI_branch  P=0.000001 C=-1.000000
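For readers less familiar with the masked forms, the expected lane-wise result can be modelled in plain Java. This is a scalar reference model under the assumption that the node implements the Vector API's masked binary semantics, where lanes with an unset mask bit keep the value of the first input; the class name `MaskedVectorScalarModel` is hypothetical and the code does not use the Vector API itself.

```java
import java.util.Arrays;
import java.util.function.IntBinaryOperator;

// Hypothetical scalar model of a masked vector-scalar bitwise operation.
final class MaskedVectorScalarModel {
    // Applies op(v[i], scalar) where mask[i] is set; other lanes keep v[i].
    static int[] lanewise(int[] v, int scalar, boolean[] mask, IntBinaryOperator op) {
        int[] result = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            result[i] = mask[i] ? op.applyAsInt(v[i], scalar) : v[i];
        }
        return result;
    }

    public static void main(String[] args) {
        int[] v = {0xFF, 0xFF, 0x10, 0x01};
        boolean[] mask = {true, false, true, false};
        // Models the AND case: the scalar 0x0F is applied to every active lane.
        int[] r = lanewise(v, 0x0F, mask, (a, b) -> a & b);
        System.out.println(Arrays.toString(r));
    }
}
```

Passing `(a, b) -> a | b` or `(a, b) -> a ^ b` gives the corresponding model for the masked OR and XOR nodes.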


vor_regI_masked Node

1ee     B32: #	out( B38 B33 ) <- in( B31 )  Freq: 75.8475
1ee     loadV V1, [R11]	# vector (rvv)
1f6     vor_regI_masked V1, V1, R31
1fe     addi  R11, R13, #32	# ptr, #@addP_reg_imm
202     bgeu  R29, R10, B38	#@cmpU_branch  P=0.000001 C=-1.000000

vxor_regI_masked Node

1ee     B32: #	out( B38 B33 ) <- in( B31 )  Freq: 75.8475
1ee     loadV V1, [R11]	# vector (rvv)
1f6     vxor_regI_masked V1, V1, R31
1fe     addi  R11, R13, #32	# ptr, #@addP_reg_imm
202     bgeu  R29, R10, B38	#@cmpU_branch  P=0.000001 C=-1.000000


vnotI Node


13c     B23: #	out( B52 B24 ) <- in( B22 )  Freq: 75.1106
13c     loadV V2, [R16]	# vector (rvv)
144     vnotI V2, V2
14c     vand V1, V1, V2
154     bgeu  R9, R12, B52	#@cmpU_branch  P=0.000001 C=-1.000000
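In RVV v1.0, `vnot.v` is a pseudoinstruction for `vxor.vi` with an immediate of -1, so a vector NOT is a special case of the vector-immediate XOR. The identity behind this can be checked in plain Java; the class below is a hypothetical illustration of the arithmetic, not code from the patch.

```java
// Hypothetical check of the identity used by vnot: x ^ -1 == ~x.
final class VnotModel {
    // 32-bit lanes (the vnotI case).
    static int notViaXor(int x) {
        return x ^ -1;
    }

    // 64-bit lanes (the vnotL case).
    static long notViaXor(long x) {
        return x ^ -1L;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -3, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int s : samples) {
            System.out.println(s + ": " + (notViaXor(s) == ~s));
        }
    }
}
```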


vnotI_masked Node

14a     B19: #	out( B22 ) <- in( B18 )  Freq: 0.99999
14a     replicate_imm5 V1, #-3
152     vnotI_masked V1, V1, V0
15a      -- 	// R23=Thread::current(), empty, #@tlsLoadP
15a     mv R31, #0	# int, #@loadConI
15c     j  B22	#@branch

We can test test/jdk/jdk/incubator/vector/Long256VectorTests.java in the same way; the Opto logs then show the corresponding nodes, such as vand_regL, vor_regL, vxor_regL, and vnotL.

vand_regL Node

180     vand_regL V1, V1, R22
188     add R30, R17, R30	# ptr, #@addP_reg_reg
18a     addi  R30, R30, #16	# ptr, #@addP_reg_imm
18c     storeV [R30], V1	# vector (rvv)
194     addiw  R20, R20, #2	#@addI_reg_imm
196     blt  R20, R15, B17	#@cmpI_loop  P=0.500000 C=30564.000000


vor_regL Node

178     loadV V1, [R12]	# vector (rvv)
180     vor_regL V1, V1, R22
188     add R30, R17, R30	# ptr, #@addP_reg_reg
18a     addi  R30, R30, #16	# ptr, #@addP_reg_imm
18c     storeV [R30], V1	# vector (rvv)
194     addiw  R20, R20, #2	#@addI_reg_imm
196     blt  R20, R15, B17	#@cmpI_loop  P=0.500000 C=30564.000000




vxor_regL Node

178     loadV V1, [R12]	# vector (rvv)
180     vxor_regL V1, V1, R22
188     add R30, R17, R30	# ptr, #@addP_reg_reg
18a     addi  R30, R30, #16	# ptr, #@addP_reg_imm
18c     storeV [R30], V1	# vector (rvv)
194     addiw  R20, R20, #2	#@addI_reg_imm
196     blt  R20, R15, B17	#@cmpI_loop  P=0.500000 C=30564.000000


vand_regL_masked Node

1da     B31: #	out( B37 B32 ) <- in( B30 )  Freq: 75.8503
1da     loadV V1, [R31]	# vector (rvv)
1e2     vand_regL_masked V1, V1, R11
1ea     addi  R31, R10, #32	# ptr, #@addP_reg_imm
1ee     bgeu  R30, R29, B37	#@cmpU_branch  P=0.000001 C=-1.000000


vor_regL_masked Node

1da     B31: #	out( B37 B32 ) <- in( B30 )  Freq: 75.8503
1da     loadV V1, [R31]	# vector (rvv)
1e2     vor_regL_masked V1, V1, R11
1ea     addi  R31, R10, #32	# ptr, #@addP_reg_imm
1ee     bgeu  R30, R29, B37	#@cmpU_branch  P=0.000001 C=-1.000000


vxor_regL_masked Node

1da     B31: #	out( B37 B32 ) <- in( B30 )  Freq: 75.8503
1da     loadV V1, [R31]	# vector (rvv)
1e2     vxor_regL_masked V1, V1, R11
1ea     addi  R31, R10, #32	# ptr, #@addP_reg_imm
1ee     bgeu  R30, R29, B37	#@cmpU_branch  P=0.000001 C=-1.000000


vnotL Node

0f4     B17: #	out( B38 B18 ) <- in( B16 )  Freq: 76.238
0f4     # castII of R19, #@castII
0f4     addw  R30, R19, zr	#@convI2L_reg_reg
0f8     slli  R30, R30, (#3 & 0x3f)	#@lShiftL_reg_imm
0fa     add R13, R31, R30	# ptr, #@addP_reg_reg
0fe     addi  R13, R13, #16	# ptr, #@addP_reg_imm
100     loadV V1, [R13]	# vector (rvv)
108     vnotL V1, V1
110     bgeu  R19, R12, B38	#@cmpU_branch  P=0.000001 C=-1.000000


[1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
[2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/Int256VectorTests.java

### Testing
- [x] Run tier1-3 tests on SOPHON SG2042 (release)
- [x] test/jdk/jdk/incubator/vector (fastdebug) qemu 8.1.50 with UseRVV

-------------

Commit messages:
 - Polishing Code comment
 - Add vand/vor/vxor predicated Node
 - Polishing Code Comment
 - 8331281: RISC-V: C2: Support vector-scalar and vector-immediate bitwise logic instructions

Changes: https://git.openjdk.org/jdk/pull/18999/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=18999&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8331281
  Stats: 471 lines in 2 files changed: 469 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/18999.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/18999/head:pull/18999

PR: https://git.openjdk.org/jdk/pull/18999

