RFR: 8339738: RISC-V: Vectorize crc32 intrinsic [v4]

Hamlin Li mli at openjdk.org
Wed Sep 11 07:46:04 UTC 2024


On Wed, 11 Sep 2024 07:23:27 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   fix perf regression
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1584:
> 
>> 1582:     sub(tmp1, len, tmp_limit);
>> 1583:     bge(tmp1, zr, L_vector_entry);
>> 1584:   }
> 
> Hi Hamlin, I think maybe we should introduce another assembler routine for the vector code? Let's say `kernel_crc32_using_vector` and delegate the work to it under `UseRVV`. That seems cleaner to me and avoids the "offset is too large" issue. I will take a look at the vector code later. BTW: Should `single_talbe_size` be `single_table_size`?

I'm not sure I understand your suggestion correctly. Do you mean something like the code below?

address generate_updateBytesCRC32() {
  // Pick the CRC32 kernel once, at stub generation time.
  if (UseRVV) { kernel_crc32_using_vector(...); }
  else { kernel_crc32(...); }
}

But kernel_crc32_using_vector reuses the code in kernel_crc32, and even with UseRVV, under some conditions (when the size is not large enough) we still need to fall back to L_unroll_loop_entry.
Or maybe I misunderstood what you mean?
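
To make the concern concrete, here is a rough standalone model in plain C++ (not MacroAssembler code) of the control flow I have in mind. Every name in it (`crc32_update`, `crc32_block_path`, `crc32_scalar`, `kVectorThreshold`) is hypothetical and only for illustration; the threshold value is an assumption, and the "block" path simply calls the scalar loop, since the point is the shared fallback rather than the vector arithmetic itself.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical cutoff standing in for the length limit checked before the
// vector entry; the real value in the patch may differ.
static const size_t kVectorThreshold = 64;

// One byte of bitwise CRC-32 (reflected polynomial 0xEDB88320).
static uint32_t crc32_byte(uint32_t crc, uint8_t b) {
  crc ^= b;
  for (int i = 0; i < 8; i++) {
    // Conditionally xor the polynomial when the low bit is set.
    crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return crc;
}

// Scalar path: plays the role of the existing unrolled and byte loops.
static uint32_t crc32_scalar(uint32_t crc, const uint8_t* buf, size_t len) {
  for (size_t i = 0; i < len; i++) {
    crc = crc32_byte(crc, buf[i]);
  }
  return crc;
}

// Stand-in for the vector path. It only accepts whole blocks; in this model
// it reuses the scalar loop, which a real RVV implementation would not do.
static uint32_t crc32_block_path(uint32_t crc, const uint8_t* buf, size_t len) {
  return crc32_scalar(crc, buf, len);
}

// Single entry point: even with use_rvv set, short inputs and the tail of
// long inputs still go through the scalar code.
uint32_t crc32_update(uint32_t crc, const uint8_t* buf, size_t len, bool use_rvv) {
  crc = ~crc;
  if (use_rvv && len >= kVectorThreshold) {
    size_t blocks = (len / kVectorThreshold) * kVectorThreshold;
    crc = crc32_block_path(crc, buf, blocks);
    buf += blocks;
    len -= blocks;
  }
  // Fallback and tail handling share the same scalar code.
  crc = crc32_scalar(crc, buf, len);
  return ~crc;
}
```

In this model, a call such as `crc32_update(0, buf, len, true)` with `len` below the threshold never reaches the block path, which is why a separate vector-only routine would still need a way back into the scalar code.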

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20910#discussion_r1753436559

