RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version

Fei Yang fyang at openjdk.org
Wed Jan 17 08:06:56 UTC 2024


On Sat, 13 Jan 2024 09:21:37 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

> The patch adds the possibility of using RVV instructions for faster vectorizedHashCode calculations on hardware that supports RVV v1.0.0.
> 
> Testing: hotspot/jtreg/compiler/ under QEMU-8.1 with RVV v1.0.0.

Some initial comments from a brief look.

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1603:

> 1601:   la(pows31, ExternalAddress(adr_pows31));
> 1602:   mv(t1, num_8b_elems_in_vec);
> 1603:   vsetvli(t0, t1, Assembler::e32, Assembler::m4);

I wonder if the scalar code for handling `WIDE_TAIL` could be eliminated by using RVV's built-in support for the stripmining approach [1]? It looks like the current code doesn't make use of this approach, as the new vl returned by `vsetvli` is neither checked nor used.

[1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#sec-vector-config

One of the common approaches to handling a large number of elements is "stripmining" where each iteration of a loop handles some number of elements, and the iterations continue until all elements have been processed. The RISC-V vector specification provides direct, portable support for this approach. The application specifies the total number of elements to be processed (the application vector length or AVL) as a candidate value for vl, and the hardware responds via a general-purpose register with the (frequently smaller) number of elements that the hardware will handle per iteration (stored in vl), based on the microarchitectural implementation and the vtype setting. A straightforward loop structure, shown in [Example of stripmining and changes to SEW](https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#example-stripmine-sew), depicts the ease with which the code keeps track of the remaining number of elements and the amount per iteration handled by hardware.
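
To make the suggestion concrete, here is a minimal scalar sketch in plain C++ (not HotSpot `MacroAssembler` code, and not the code from the patch) of how a stripmined loop could cover the tail without a separate scalar path. Every name in it (`simulated_vsetvli`, `scalar_hash`, `stripmined_hash`, `vlmax`) is hypothetical. `simulated_vsetvli` is an assumption that simply returns `min(AVL, VLMAX)`; real hardware may choose a different vl when AVL lies between VLMAX and 2*VLMAX. The inner loop stands in for one vector multiply plus reduction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for `vsetvli`: given the number of remaining
// elements (AVL), return how many this iteration will process (vl).
// Assumption: vl = min(AVL, VLMAX); real hardware has more freedom.
static std::size_t simulated_vsetvli(std::size_t avl, std::size_t vlmax) {
  return std::min(avl, vlmax);
}

// Reference polynomial hash: h = 31 * h + a[i], wrapping at 32 bits.
std::int32_t scalar_hash(const std::vector<std::int32_t>& a,
                         std::int32_t initial) {
  std::uint32_t h = static_cast<std::uint32_t>(initial);
  for (std::int32_t v : a) {
    h = 31u * h + static_cast<std::uint32_t>(v);
  }
  return static_cast<std::int32_t>(h);
}

// Stripmined variant: each iteration asks for all remaining elements and
// processes only the vl that is granted, so the last (short) chunk is
// handled by the same loop body instead of a scalar tail.
std::int32_t stripmined_hash(const std::vector<std::int32_t>& a,
                             std::int32_t initial, std::size_t vlmax) {
  assert(vlmax > 0);

  // pows[k] = 31^k modulo 2^32, for k in [0, vlmax].
  std::vector<std::uint32_t> pows(vlmax + 1, 1u);
  for (std::size_t k = 1; k <= vlmax; ++k) {
    pows[k] = pows[k - 1] * 31u;
  }

  std::uint32_t h = static_cast<std::uint32_t>(initial);
  std::size_t i = 0;
  std::size_t remaining = a.size();
  while (remaining > 0) {
    const std::size_t vl = simulated_vsetvli(remaining, vlmax);

    // Models one vector multiply + reduction over vl lanes:
    // chunk = sum over j < vl of a[i + j] * 31^(vl - 1 - j).
    std::uint32_t chunk = 0;
    for (std::size_t j = 0; j < vl; ++j) {
      chunk += static_cast<std::uint32_t>(a[i + j]) * pows[vl - 1 - j];
    }

    // Scale the running hash by 31^vl, not by a fixed 31^VLMAX, so a
    // short final chunk is folded in correctly.
    h = h * pows[vl] + chunk;

    i += vl;
    remaining -= vl;
  }
  return static_cast<std::int32_t>(h);
}
```

The point of the sketch is that the multiplier applied to the running hash depends on the vl actually granted in each iteration, which is what allows the final partial chunk to go through the same loop. Whether this maps well onto the powers-of-31 table used in the patch is a separate question.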

src/hotspot/cpu/riscv/riscv_v.ad line 2681:

> 2679:                           iRegLNoSp tmp4, iRegLNoSp tmp5, iRegLNoSp tmp6, rFlagsReg cr)
> 2680: %{
> 2681:   predicate(UseRVV && (MaxVectorSize >= 16));

Similar here: the `MaxVectorSize >= 16` condition is already checked and ensured on JVM startup when the RVV extension is available.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5266:

> 5264:     }
> 5265: 
> 5266:     if (UseVectorizedHashCodeIntrinsic && UseRVV && (MaxVectorSize >= 16)) {

I think the `MaxVectorSize >= 16` condition is already checked and ensured on JVM startup when the RVV extension is available.

-------------

PR Review: https://git.openjdk.org/jdk/pull/17413#pullrequestreview-1826634240
PR Review Comment: https://git.openjdk.org/jdk/pull/17413#discussion_r1454866513
PR Review Comment: https://git.openjdk.org/jdk/pull/17413#discussion_r1454805091
PR Review Comment: https://git.openjdk.org/jdk/pull/17413#discussion_r1454799065