RFR: 8339738: RISC-V: Vectorize crc32 intrinsic [v6]
Hamlin Li
mli at openjdk.org
Wed Sep 11 16:42:09 UTC 2024
On Wed, 11 Sep 2024 14:29:47 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision:
>>
>> remove redundant jump
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1583:
>
>> 1581: const int64_t tmp_limit = MaxVectorSize >= 32 ? unroll_words*2 : unroll_words*4;
>> 1582: sub(tmp1, len, tmp_limit);
>> 1583: bge(tmp1, zr, L_vector_entry);
>
> I don't quite understand this compare of `len` with `tmp_limit` here as I see `len` has already been updated on entry with `subw(len, len, unroll_words)`. Should we compare with the original `len` before the update? (And maybe remove the `addi(len, len, unroll_words)` in `vector_update_crc32` at the same time?).
I'm not sure how you feel, when I red the original scalar version of crc32, the `len` is bit confusing to, it's added and subtracted here and there, it's bit hard to understand.
When I implement the vector version, at first I'd like to refactor it and add vector version, but finally I gave up, I think it's better to separate the refactor and vector version, and in this pr I try hard to not touch original.
So if you have the similar feeling, I would suggest to we do refactoring in a specific pr after this one, in which I will try to make the code straight and clear. And I'm going to implement crc in carry-less multiplication, so before carry-less one I would prefer to refactoring first anyway.
To your specific question above: we could do it as you suggested, but seems it will not make code clear if we don't do a refactoring first, as in the first `if (UseRVV)` block, it could also jump to `L_unroll_loop_entry` as a fallback, in this way the code will be more complicated rather than clear.
In a summary, my solution is to do a specific refactoring pr which will also address your comments (including the `s/bgt/bgtz` suggestion above, as `bgt` is the original style of this kernel_crc32 method).
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20910#discussion_r1755150447
More information about the hotspot-dev
mailing list