RFR: 8339738: RISC-V: Vectorize crc32 intrinsic [v4]
Hamlin Li
mli at openjdk.org
Wed Sep 11 07:46:04 UTC 2024
On Wed, 11 Sep 2024 07:23:27 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fix perf regression
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1584:
>
>> 1582: sub(tmp1, len, tmp_limit);
>> 1583: bge(tmp1, zr, L_vector_entry);
>> 1584: }
>
> Hi Hamlin, I think maybe we should introduce another assembler routine for the vector code? Let's say `kernel_crc32_using_vector` and delegate the work to it under `UseRVV`. That seems cleaner to me and avoids the "offset is too large" issue. I will take a look at the vector code later. BTW: Should `single_talbe_size` be `single_table_size`?
I'm not sure I understand your suggestion correctly. Do you mean something like the following?
address generate_updateBytesCRC32() {
  if (UseRVV) {
    kernel_crc32_using_vector(...);
  } else {
    kernel_crc32(...);
  }
}
But kernel_crc32_using_vector reuses code in kernel_crc32, and even with UseRVV we still need to fall back to L_unroll_loop_entry under some conditions (when the size is not large enough).
Or maybe I have misunderstood what you mean?
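
To make the concern concrete, here is a minimal standalone C++ sketch of the control flow described above. It is not the HotSpot code: `crc32_update`, `crc32_scalar`, `crc32_vector_blocks` and `kVectorMinLen` are hypothetical names, the threshold value is an assumption, and the "vector" routine is only a stand-in that calls the scalar loop. It only illustrates that the vector path needs the scalar code both for short inputs and for the tail.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical threshold; the real limit used in the patch is not shown in this thread.
constexpr std::size_t kVectorMinLen = 64;

// Bitwise CRC-32 (reflected polynomial 0xEDB88320) over a byte range.
// Stands in for the scalar loops of kernel_crc32; expects an already inverted crc.
inline std::uint32_t crc32_scalar(std::uint32_t crc, const unsigned char* buf, std::size_t len) {
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= buf[i];
    for (int k = 0; k < 8; ++k) {
      // Shift right by one bit and xor in the polynomial if the low bit was set.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return crc;
}

// Stand-in for the vector code: consumes whole blocks of kVectorMinLen bytes.
// Assumption: a real version would use RVV instructions; this one reuses the scalar loop.
inline std::uint32_t crc32_vector_blocks(std::uint32_t crc, const unsigned char* buf, std::size_t nblocks) {
  return crc32_scalar(crc, buf, nblocks * kVectorMinLen);
}

// Dispatch: the vector path is taken only for large inputs, and the scalar
// path is still needed for short inputs and for the leftover tail bytes.
inline std::uint32_t crc32_update(std::uint32_t crc, const unsigned char* buf, std::size_t len, bool use_rvv) {
  crc = ~crc;
  if (use_rvv && len >= kVectorMinLen) {
    std::size_t nblocks = len / kVectorMinLen;
    crc = crc32_vector_blocks(crc, buf, nblocks);
    buf += nblocks * kVectorMinLen;
    len -= nblocks * kVectorMinLen;
  }
  crc = crc32_scalar(crc, buf, len);  // shared fallback and tail handling
  return ~crc;
}
```

In this sketch the scalar routine is reached on every call, which is why a strict either/or split between two separate routines would require duplicating or otherwise sharing that code.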
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20910#discussion_r1753436559