RFR: 8302908: RISC-V: Support masked vector arithmetic instructions for Vector API [v6]
Dingli Zhang
dzhang at openjdk.org
Thu Mar 16 13:53:53 UTC 2023
> Hi,
>
> We have added support for masked vector arithmetic instructions. Please take a look and review. Thanks a lot.
>
> This patch adds support for the masked versions of vector add/sub/mul/div. The implementation follows the RVV v1.0 specification [1].
> `VectorLoadMask`, `VectorMaskCmp`, and `VectorStoreMask` implement the datapath through which the mask is passed.
>
> The compilation log for `jdk/incubator/vector/Byte128VectorTests.java` shows how the mask is passed:
>
> 21c loadV V1, [R7] # vector (rvv)
> 224 vloadmask V30, V1 # KILL cr
> 22c vmaskcmp_rvv_masked V30, V4, V5, V30, #0 # KILL cr
> 240
> 240 MEMBAR-store-store #@membar_storestore
> 244 # checkcastPP of R8, #@checkCastPP
> 244 vstoremask V1, V30
>
>
> The corresponding generated JIT assembly:
>
> # loadV
> 0x000000400c8fce5c: vsetivli t0,16,e8,m1,tu,mu
> 0x000000400c8fce60: vle8.v v1,(t2)
>
> # vloadmask
> 0x000000400c8fce64: vsetivli t0,16,e8,m1,tu,mu
> 0x000000400c8fce68: vmsne.vx v30,v1,zero
>
> # vmaskcmp_rvv_masked
> 0x000000400c8fce6c: vsetvli t0,zero,e8,m1,tu,mu
> 0x000000400c8fce70: vmmv.m v0,v30
> 0x000000400c8fce74: vsetivli t0,16,e8,m1,tu,mu
> 0x000000400c8fce78: vmclr.m v30
> 0x000000400c8fce7c: vmseq.vv v30,v4,v5,v0.t
>
> # vstoremask
> 0x000000400c8fce84: vsetvli t0,zero,e8,m1,tu,mu
> 0x000000400c8fce88: vmv.v.i v1,0
> 0x000000400c8fce8c: vsetvli t0,zero,e8,m1,tu,mu
> 0x000000400c8fce90: vmmv.m v0,v30
> 0x000000400c8fce94: vmerge.vim v1,v1,1,v0
>
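As a reading aid, the three steps in the listing above can be modeled in scalar Java. This is only a sketch of the lane-wise behavior as it appears in the assembly, not the C2 implementation; the class and method names (`MaskDatapathModel`, `loadMask`, `maskedCompareEq`, `storeMask`) are hypothetical and do not come from the patch.

```java
import java.util.Arrays;

// Hypothetical scalar model of the mask datapath; all names are illustrative only.
final class MaskDatapathModel {

    // vloadmask (vmsne.vx vd, vs, zero): a lane is set when its boolean byte is non-zero.
    static boolean[] loadMask(byte[] bools) {
        boolean[] mask = new boolean[bools.length];
        for (int i = 0; i < bools.length; i++) {
            mask[i] = bools[i] != 0;
        }
        return mask;
    }

    // vmaskcmp_rvv_masked (vmclr.m, then vmseq.vv ..., v0.t): the result is cleared
    // first, so only lanes that are active in the governing mask can become set.
    static boolean[] maskedCompareEq(byte[] a, byte[] b, boolean[] governing) {
        boolean[] result = new boolean[a.length]; // all false, like vmclr.m
        for (int i = 0; i < a.length; i++) {
            if (governing[i]) {
                result[i] = a[i] == b[i];
            }
        }
        return result;
    }

    // vstoremask (vmv.v.i 0, then vmerge.vim 1): set lanes become 1, clear lanes stay 0.
    static byte[] storeMask(boolean[] mask) {
        byte[] bools = new byte[mask.length];
        for (int i = 0; i < mask.length; i++) {
            bools[i] = (byte) (mask[i] ? 1 : 0);
        }
        return bools;
    }

    public static void main(String[] args) {
        boolean[] governing = loadMask(new byte[] {1, 0, 1, 1});
        boolean[] cmp = maskedCompareEq(new byte[] {5, 5, 7, 8}, new byte[] {5, 5, 9, 8}, governing);
        System.out.println(Arrays.toString(storeMask(cmp))); // prints [1, 0, 0, 1]
    }
}
```

Lane 1 compares equal but is inactive in the governing mask, so it stays clear; that is the effect of clearing the destination before the masked compare.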
>
> `AndVMask`, `OrVMask`, and `XorVMask` are used by operations such as division.
> The current implementation of `VectorMaskCast` covers only the case where the source and destination element widths are equal; other cases depend on the subsequent cast node.
>
> AddMaskTestMerge case:
>
>
> import jdk.incubator.vector.IntVector;
> import jdk.incubator.vector.VectorMask;
> import jdk.incubator.vector.VectorOperators;
> import jdk.incubator.vector.VectorSpecies;
>
> public class AddMaskTestMerge {
>
>     static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_128;
>     static final int SIZE = 1024;
>     static int[] a = new int[SIZE];
>     static int[] b = new int[SIZE];
>     static int[] r = new int[SIZE];
>     static boolean[] c = new boolean[]{true,false,true,false,true,false,true,false};
>     static {
>         for (int i = 0; i < SIZE; i++) {
>             a[i] = i;
>             b[i] = i;
>         }
>     }
>
>     static void workload(int idx) {
>         VectorMask<Integer> vmask = VectorMask.fromArray(SPECIES, c, 0);
>         IntVector av = IntVector.fromArray(SPECIES, a, idx);
>         IntVector bv = IntVector.fromArray(SPECIES, b, idx);
>         av.lanewise(VectorOperators.ADD, bv, vmask).intoArray(r, idx);
>     }
>
>     public static void main(String[] args) {
>         for (int i = 0; i < 30_0000; i++) {
>             for (int j = 0; j < SIZE; j += SPECIES.length()) {
>                 workload(j);
>             }
>         }
>     }
> }
>
>
> This test case is reduced from the existing jtreg vector test Int128VectorTests.java [2]. It exercises the masked vector add instruction; the other masked instructions behave similarly.
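For checking results by hand, a scalar reference for the masked add in `workload` could look like the sketch below. It assumes the Vector API behavior that lanes with an unset mask bit keep the value of the first operand (`av`); the class name `MaskedAddReference`, the method `maskedAdd`, and the `lanes` parameter are hypothetical helpers, not part of the test, and the array length is assumed to be a multiple of `lanes`.

```java
import java.util.Arrays;

// Hypothetical scalar reference for the masked ADD in AddMaskTestMerge.workload.
final class MaskedAddReference {

    // For each block of `lanes` elements, add b to a where the mask lane is set,
    // and keep a unchanged where it is not. The mask is re-applied from lane 0
    // for every block, as the test loads it from offset 0 each time.
    static int[] maskedAdd(int[] a, int[] b, boolean[] mask, int lanes) {
        int[] r = new int[a.length];
        for (int idx = 0; idx < a.length; idx += lanes) {
            for (int lane = 0; lane < lanes; lane++) {
                int i = idx + lane;
                r[i] = mask[lane] ? a[i] + b[i] : a[i];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {0, 1, 2, 3, 4, 5, 6, 7};
        int[] b = {0, 1, 2, 3, 4, 5, 6, 7};
        boolean[] mask = {true, false, true, false};
        // Four int lanes, matching IntVector.SPECIES_128.
        System.out.println(Arrays.toString(maskedAdd(a, b, mask, 4)));
        // prints [0, 1, 4, 3, 8, 5, 12, 7]
    }
}
```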
>
> Before this patch, the compilation log contained no RVV-related instructions. With the patch, the compilation log is as follows:
>
>
> 0ae B10: # out( B25 B11 ) <- in( B9 ) Freq: 0.999991
> 0ae loadV V1, [R31] # vector (rvv)
> 0b6 vloadmask V0, V2 # KILL cr
> 0be vadd.vv V3, V1, V0 #@vaddI_masked
> 0c6 lwu R28, [R7, #124] # loadN, compressed ptr, #@loadN ! Field: AddMaskTestMerge.r
> 0ca decode_heap_oop R28, R28 #@decodeHeapOop
> 0cc lwu R7, [R28, #12] # range, #@loadRange
> 0d0 NullCheck R28
>
>
> The JIT code is as follows:
>
>
> 0x000000400c8109ae: vsetivli t0,4,e32,m1,tu,mu
> 0x000000400c8109b2: vle32.v v1,(t6) ;*invokestatic store {reexecute=0 rethrow=0 return_oop=0}
> ; - jdk.incubator.vector.IntVector::intoArray at 43 (line 3228)
> ; - AddMaskTestMerge::workload at 46 (line 25)
> 0x000000400c8109b6: vsetivli t0,4,e8,m1,tu,mu
> 0x000000400c8109ba: vmsne.vx v0,v2,zero ;*invokestatic load {reexecute=0 rethrow=0 return_oop=0}
> ; - jdk.incubator.vector.VectorMask::fromArray at 47 (line 208)
> ; - AddMaskTestMerge::workload at 7 (line 22)
> 0x000000400c8109be: vsetivli t0,4,e32,m1,tu,mu
> 0x000000400c8109c2: vadd.vv v3,v3,v1,v0.t ;*invokestatic binaryOp {reexecute=0 rethrow=0 return_oop=0}
> ; - jdk.incubator.vector.IntVector::lanewiseTemplate at 192 (line 834)
> ; - jdk.incubator.vector.Int128Vector::lanewise at 9 (line 291)
> ; - jdk.incubator.vector.Int128Vector::lanewise at 4 (line 41)
> ; - AddMaskTestMerge::workload at 39 (line 25)
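The `vadd.vv v3,v3,v1,v0.t` above runs under a `mu` (mask-undisturbed) setting, and its destination is also its first source. A minimal scalar sketch of why that combination yields the merging behavior is shown below; it assumes the RVV v1.0 rule that masked-off destination elements keep their previous value, and the names `MaskUndisturbedModel` and `vaddMasked` are hypothetical.

```java
import java.util.Arrays;

// Hypothetical scalar model of `vadd.vv vd, vs2, vs1, v0.t` with a mask-undisturbed policy.
final class MaskUndisturbedModel {

    // Active lanes receive vs2 + vs1; inactive lanes of vd are left untouched.
    static void vaddMasked(int[] vd, int[] vs2, int[] vs1, boolean[] v0) {
        for (int i = 0; i < vd.length; i++) {
            if (v0[i]) {
                vd[i] = vs2[i] + vs1[i];
            }
        }
    }

    public static void main(String[] args) {
        int[] v3 = {10, 20, 30, 40};
        int[] v1 = {1, 2, 3, 4};
        boolean[] v0 = {true, false, true, false};
        // Destination aliases the first source, as in `vadd.vv v3,v3,v1,v0.t`.
        vaddMasked(v3, v3, v1, v0);
        System.out.println(Arrays.toString(v3)); // prints [11, 20, 33, 40]
    }
}
```

Because the destination already holds the first operand, the inactive lanes end up with the first operand's values without a separate blend step.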
>
>
>
>
> [1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/Int128VectorTests.java
>
> ### Testing:
>
> QEMU with UseRVV:
> - [x] Tier1 tests (release)
> - [x] Tier2 tests (release)
> - [x] test/jdk/jdk/incubator/vector (release/fastdebug)
>
> Unmatched:
> - [x] Tier1 tests (release)
> - [x] Tier2 tests (release)
Dingli Zhang has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision:
RISC-V: Support vector add mask instructions for Vector API
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/12682/files
- new: https://git.openjdk.org/jdk/pull/12682/files/59a15d59..c2aa9997
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=12682&range=05
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=12682&range=04-05
Stats: 134181 lines in 1133 files changed: 98159 ins; 21395 del; 14627 mod
Patch: https://git.openjdk.org/jdk/pull/12682.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/12682/head:pull/12682
PR: https://git.openjdk.org/jdk/pull/12682