RFR: 8322174: RISC-V: C2 VectorizedHashCode RVV Version [v8]

Mon May 5 14:15:50 UTC 2025

On Mon, 5 May 2025 10:17:27 GMT, Yuri Gaevsky <duke at openjdk.org> wrote:

>> The patch adds possibility to use RVV instructions for faster vectorizedHashCode calculations on RVV v1.0.0 capable hardware.
>> 
>> Testing: hotspot/jtreg/compiler/ under QEMU-8.1 with RVV v1.0.0.
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
> 
>   change slli+add sequence to shadd

I think what @RealFYang is saying is:

You don't need to know the vector size, i.e.:

  const int nof_vec_elems = MaxVectorSize;
....
  mv(t1, nof_vec_elems);
  vsetvli(t0, t1, Assembler::e32, Assembler::m4);

You can pass cnt, rounded down to a multiple of 4 bytes, to vsetvli
and let vsetvli process as much as it can per iteration.
It will never process more than vlen per iteration, so the last iteration may process as few as 4 bytes.

Here is an example of a memcpy:
https://github.com/riscvarchive/riscv-v-spec/blob/master/example/memcpy.s

This means the main loop is vector register length agnostic.

That leaves 3 or fewer bytes to process with normal scalar ops.
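
To make the loop shape concrete, here is a rough scalar model in plain C++. It is not the code from the patch and it is not RVV. The names model_vsetvli, scalar_hash and strip_mined_hash are made up for illustration, the initial hash value of 0 is an assumption, and model_vsetvli simply returns min(avl, vlmax), which is a simplification of what real hardware is allowed to return.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for vsetvli: how many elements this iteration handles.
// Simplified to min(avl, vlmax); real hardware may choose differently.
static std::size_t model_vsetvli(std::size_t avl, std::size_t vlmax) {
  assert(vlmax > 0);
  return std::min(avl, vlmax);
}

// Reference: h = 31 * h + a[i] over the whole array, starting from 0.
static std::int32_t scalar_hash(const std::vector<std::int8_t>& a) {
  std::uint32_t h = 0;
  for (std::int8_t v : a) h = 31u * h + static_cast<std::uint32_t>(v);
  return static_cast<std::int32_t>(h);
}

// Strip-mined version: the loop never needs to know vlmax up front,
// it just uses whatever vl comes back each iteration.
static std::int32_t strip_mined_hash(const std::vector<std::int8_t>& a,
                                     std::size_t vlmax) {
  std::uint32_t h = 0;
  std::size_t i = 0;
  std::size_t cnt = a.size() & ~static_cast<std::size_t>(3);  // round down to 4

  while (cnt > 0) {
    std::size_t vl = model_vsetvli(cnt, vlmax);
    // Chunk step: h = h * 31^vl + sum_j a[i+j] * 31^(vl-1-j)
    std::uint32_t chunk = 0, pow = 1;
    for (std::size_t j = vl; j-- > 0;) {
      chunk += static_cast<std::uint32_t>(a[i + j]) * pow;
      pow *= 31u;
    }
    h = h * pow + chunk;  // pow is now 31^vl
    i += vl;
    cnt -= vl;
  }

  // Scalar tail: at most 3 bytes left.
  for (; i < a.size(); ++i) h = 31u * h + static_cast<std::uint32_t>(a[i]);
  return static_cast<std::int32_t>(h);
}
```

Calling strip_mined_hash with different vlmax values (say 4, 16 or 64) should give the same result as scalar_hash, which is the point: only the number of iterations changes with the vector length.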

-------------

PR Review: https://git.openjdk.org/jdk/pull/17413#pullrequestreview-2814965363