RFR: 8302908: RISC-V: Support masked vector arithmetic instructions for Vector API [v8]

Dingli Zhang dzhang at openjdk.org
Fri Mar 17 02:01:54 UTC 2023


> Hi,
> 
> We have added support for masked vector arithmetic instructions. Please take a look and review. Thanks a lot.
> 
> This patch adds masked versions of the vector add/sub/mul/div instructions, implemented with reference to RVV v1.0 [1].
> `VectorLoadMask`, `VectorMaskCmp` and `VectorStoreMask` implement the datapath through which the mask is passed.
> 
> The compilation log for `jdk/incubator/vector/Byte128VectorTests.java` shows where the mask data is passed:
> 
> 21c     loadV V1, [R7]	# vector (rvv)
> 224     vloadmask V30, V1	# KILL cr
> 22c     vmaskcmp_rvv_masked V30, V4, V5, V30, #0	# KILL cr
> 240     
> 240     MEMBAR-store-store	#@membar_storestore
> 244     # checkcastPP of R8, #@checkCastPP
> 244     vstoremask V1, V30
> 
> 
> The corresponding generated JIT assembly:
> 
> # loadV
> 0x000000400c8fce5c:   vsetivli	t0,16,e8,m1,tu,mu
> 0x000000400c8fce60:   vle8.v	v1,(t2)
> 
> # vloadmask
> 0x000000400c8fce64:   vsetivli	t0,16,e8,m1,tu,mu
> 0x000000400c8fce68:   vmsne.vx	v30,v1,zero
> 
> # vmaskcmp_rvv_masked
> 0x000000400c8fce6c:   vsetvli	t0,zero,e8,m1,tu,mu
> 0x000000400c8fce70:   vmmv.m	v0,v30
> 0x000000400c8fce74:   vsetivli	t0,16,e8,m1,tu,mu
> 0x000000400c8fce78:   vmclr.m	v30
> 0x000000400c8fce7c:   vmseq.vv	v30,v4,v5,v0.t
> 
> # vstoremask
> 0x000000400c8fce84:   vsetvli	t0,zero,e8,m1,tu,mu
> 0x000000400c8fce88:   vmv.v.i	v1,0
> 0x000000400c8fce8c:   vsetvli	t0,zero,e8,m1,tu,mu
> 0x000000400c8fce90:   vmmv.m	v0,v30
> 0x000000400c8fce94:   vmerge.vim	v1,v1,1,v0
> 
> 
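
The mask datapath in the listing above can be read as three scalar steps. The following is only a scalar model of what the quoted instructions compute, not code from the patch; the class and method names are hypothetical, and the `vmaskcmp` step assumes the equality condition shown in the listing (`vmseq.vv`).

```java
import java.util.Arrays;

// Scalar model (hypothetical names) of the mask datapath in the listing above.
class MaskDatapathModel {

    // vloadmask: "vmsne.vx v30, v1, zero" sets a mask lane when its byte is non-zero.
    static boolean[] loadMask(byte[] bytes) {
        boolean[] m = new boolean[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            m[i] = bytes[i] != 0;
        }
        return m;
    }

    // vmaskcmp_rvv_masked: "vmclr.m" clears the destination, then
    // "vmseq.vv ..., v0.t" writes only the active lanes, so inactive
    // lanes stay false (mask-undisturbed policy).
    static boolean[] maskedEq(byte[] a, byte[] b, boolean[] mask) {
        boolean[] r = new boolean[a.length]; // all false, like vmclr.m
        for (int i = 0; i < a.length; i++) {
            if (mask[i]) {
                r[i] = a[i] == b[i];
            }
        }
        return r;
    }

    // vstoremask: "vmv.v.i v1, 0" followed by "vmerge.vim v1, v1, 1, v0"
    // produces one 0/1 byte per mask lane.
    static byte[] storeMask(boolean[] mask) {
        byte[] out = new byte[mask.length];
        for (int i = 0; i < mask.length; i++) {
            out[i] = (byte) (mask[i] ? 1 : 0);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {1, 0, 1, 1};
        byte[] a = {5, 5, 7, 9};
        byte[] b = {5, 5, 8, 9};
        byte[] stored = storeMask(maskedEq(a, b, loadMask(raw)));
        System.out.println(Arrays.toString(stored)); // prints [1, 0, 0, 1]
    }
}
```

Lane 1 is inactive, so it stays 0 even though the two inputs are equal there.
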
> `AndVMask`, `OrVMask` and `XorVMask` will be used for operations such as division.
> The current implementation of `VectorMaskCast` only covers the case where the source and destination element widths are equal; other cases depend on the subsequent cast node.
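
One plausible place where a mask AND appears around division is the divide-by-zero check: a zero divisor should only raise an exception in lanes that the mask actually selects. The sketch below is a scalar model under that assumption; it is not taken from the patch, and the class and method names are hypothetical.

```java
// Scalar model (hypothetical names) of a masked integer division that
// combines a "divisor is zero" mask with the user mask via a logical AND.
class MaskedDivModel {

    // Lane-wise AND of two masks, the scalar analogue of an AndVMask node.
    static boolean[] and(boolean[] x, boolean[] y) {
        boolean[] r = new boolean[x.length];
        for (int i = 0; i < x.length; i++) {
            r[i] = x[i] && y[i];
        }
        return r;
    }

    static boolean anyTrue(boolean[] m) {
        for (boolean bit : m) {
            if (bit) {
                return true;
            }
        }
        return false;
    }

    // Active lanes compute a[i] / b[i]; inactive lanes keep a[i].
    // A zero divisor only matters when its lane is active.
    static int[] maskedDiv(int[] a, int[] b, boolean[] mask) {
        boolean[] isZero = new boolean[b.length];
        for (int i = 0; i < b.length; i++) {
            isZero[i] = b[i] == 0;
        }
        if (anyTrue(and(isZero, mask))) {
            throw new ArithmeticException("/ by zero");
        }
        int[] r = a.clone();
        for (int i = 0; i < a.length; i++) {
            if (mask[i]) {
                r[i] = a[i] / b[i];
            }
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {8, 9, 10, 11};
        int[] b = {2, 0, 5, 0};
        boolean[] mask = {true, false, true, false};
        // The zero divisors sit in inactive lanes, so no exception is thrown.
        System.out.println(java.util.Arrays.toString(maskedDiv(a, b, mask)));
    }
}
```
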
> 
> AddMaskTestMerge case:
> 
> 
> import jdk.incubator.vector.IntVector;
> import jdk.incubator.vector.VectorMask;
> import jdk.incubator.vector.VectorOperators;
> import jdk.incubator.vector.VectorSpecies;
> 
> public class AddMaskTestMerge {
> 
>     static final VectorSpecies<Integer> SPECIES = IntVector.SPECIES_128;
>     static final int SIZE = 1024;
>     static int[] a = new int[SIZE];
>     static int[] b = new int[SIZE];
>     static int[] r = new int[SIZE];
>     static boolean[] c = new boolean[]{true,false,true,false,true,false,true,false};
>     static {
>         for (int i = 0; i < SIZE; i++) {
>             a[i] = i;
>             b[i] = i;
>         }
>     }
> 
>     static void workload(int idx) {
>         VectorMask<Integer> vmask = VectorMask.fromArray(SPECIES, c, 0);
>         IntVector av = IntVector.fromArray(SPECIES, a, idx);
>         IntVector bv = IntVector.fromArray(SPECIES, b, idx);
>         av.lanewise(VectorOperators.ADD, bv, vmask).intoArray(r, idx);
>     }
> 
>     public static void main(String[] args) {
>         for (int i = 0; i < 300_000; i++) {
>             for (int j = 0; j < SIZE; j += SPECIES.length()) {
>                 workload(j);
>             }
>         }
>     }
> }
> 
> 
> This test case is reduced from the existing jtreg vector test Int128VectorTests.java [2]. It exercises the masked vector add instruction; the other masked instructions are handled similarly.
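
To check the output of `AddMaskTestMerge` by hand, a scalar reference can be useful. The sketch below assumes the Vector API rule that a masked `lanewise` operation keeps the first operand in unset lanes, and that `SPECIES_128` holds four `int` lanes, so only the first four entries of `c` are used. The class and method names are hypothetical.

```java
// Scalar reference (hypothetical names) for the masked add in AddMaskTestMerge.
class AddMaskScalarReference {

    // Applies the same per-vector mask to every group of `lanes` elements:
    // set lanes get a[i] + b[i], unset lanes keep a[i].
    // Assumes a.length is a multiple of `lanes`.
    static int[] maskedAdd(int[] a, int[] b, boolean[] mask, int lanes) {
        int[] r = new int[a.length];
        for (int base = 0; base < a.length; base += lanes) {
            for (int lane = 0; lane < lanes; lane++) {
                int i = base + lane;
                r[i] = mask[lane] ? a[i] + b[i] : a[i];
            }
        }
        return r;
    }

    // Rebuilds the inputs of the test case: a[i] = b[i] = i, alternating mask.
    static int[] expected(int size) {
        int[] a = new int[size];
        int[] b = new int[size];
        for (int i = 0; i < size; i++) {
            a[i] = i;
            b[i] = i;
        }
        boolean[] c = {true, false, true, false, true, false, true, false};
        return maskedAdd(a, b, c, 4); // 4 int lanes in a 128-bit vector
    }

    public static void main(String[] args) {
        int[] r = expected(1024);
        // Even indices are doubled, odd indices are unchanged.
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]); // prints 0 1 4 3
    }
}
```
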
> 
> Before this patch, the compilation log did not print any RVV-related instructions. Now the compilation log is as follows:
> 
> 
> 0ae     B10: #	out( B25 B11 ) <- in( B9 )  Freq: 0.999991
> 0ae     loadV V1, [R31]	# vector (rvv)
> 0b6     vloadmask V0, V2	# KILL cr
> 0be     vadd.vv V3, V1, V0	#@vaddI_masked
> 0c6     lwu  R28, [R7, #124]	# loadN, compressed ptr, #@loadN ! Field: AddMaskTestMerge.r
> 0ca     decode_heap_oop  R28, R28	#@decodeHeapOop
> 0cc     lwu  R7, [R28, #12]	# range, #@loadRange
> 0d0     NullCheck R28
> 
> 
> And the JIT code is as follows:
> 
> 
>   0x000000400c8109ae:   vsetivli        t0,4,e32,m1,tu,mu
>   0x000000400c8109b2:   vle32.v v1,(t6)                     ;*invokestatic store {reexecute=0 rethrow=0 return_oop=0}
>                                                             ; - jdk.incubator.vector.IntVector::intoArray at 43 (line 3228)
>                                                             ; - AddMaskTestMerge::workload at 46 (line 25)
>   0x000000400c8109b6:   vsetivli        t0,4,e8,m1,tu,mu
>   0x000000400c8109ba:   vmsne.vx        v0,v2,zero          ;*invokestatic load {reexecute=0 rethrow=0 return_oop=0}
>                                                             ; - jdk.incubator.vector.VectorMask::fromArray at 47 (line 208)
>                                                             ; - AddMaskTestMerge::workload at 7 (line 22)
>   0x000000400c8109be:   vsetivli        t0,4,e32,m1,tu,mu
>   0x000000400c8109c2:   vadd.vv         v3,v3,v1,v0.t       ;*invokestatic binaryOp {reexecute=0 rethrow=0 return_oop=0}
>                                                             ; - jdk.incubator.vector.IntVector::lanewiseTemplate at 192 (line 834)
>                                                             ; - jdk.incubator.vector.Int128Vector::lanewise at 9 (line 291)
>                                                             ; - jdk.incubator.vector.Int128Vector::lanewise at 4 (line 41)
>                                                             ; - AddMaskTestMerge::workload at 39 (line 25)
> 
> 
> 
> 
> [1] https://github.com/riscv/riscv-v-spec/blob/v1.0/v-spec.adoc
> [2] https://github.com/openjdk/jdk/blob/master/test/jdk/jdk/incubator/vector/Int128VectorTests.java
> 
> ### Testing:
> 
> qemu with UseRVV:
> - [x] Tier1 tests (release)
> - [x] Tier2 tests (release)
> - [x] test/jdk/jdk/incubator/vector (release/fastdebug)
> 
> Unmatched:
> - [x] Tier1 tests (release)
> - [x] Tier2 tests (release)

Dingli Zhang has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:

  Fix useage of iRegIorL2I and remove nouse cr

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/12682/files
  - new: https://git.openjdk.org/jdk/pull/12682/files/60052751..2f7b7c06

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=12682&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12682&range=06-07

  Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/12682.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/12682/head:pull/12682

PR: https://git.openjdk.org/jdk/pull/12682

