RFR: 8317720: RISC-V: Implement Adler32 intrinsic [v17]
ArsenyBochkarev
duke at openjdk.org
Tue Jul 16 11:11:10 UTC 2024
On Tue, 16 Jul 2024 06:49:12 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> ArsenyBochkarev has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - Use t2 instead of count as scratch register
>> - Remove blt after by16_loop_unroll
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5048:
>
>> 5046: VectorRegister vzero, VectorRegister vbytes, VectorRegister vs1acc, VectorRegister vs2acc,
>> 5047: Register temp0, Register temp1, Register temp2, Register temp3,
>> 5048: VectorRegister vtemp1, VectorRegister vtemp2, int step, Assembler::LMUL LMUL) {
>
> Better to use lowercase `lmul` and `lmulx2` for `LMUL` and `LMULx2` respectively, to be consistent in naming style.
Done
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5277:
>
>> 5275: __ sub(len, len, step_64);
>> 5276: // By now the count should still be 64
>> 5277: __ bge(len, count, L_by16_loop_unroll);
>
> The code comment and `count` here need updating as well when you change to `t2`.
Done, with one minor note: I changed my mind on `t2` and used `temp3` instead (as it is a known alias).
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 5290:
>
>> 5288: __ bltz(len, L_do_mod);
>> 5289:
>> 5290: __ bind(L_by1_loop);
>
> The loop bodies of `L_by1_loop` and `L_simple_by1_loop` look the same except for the branch at the end. Could we eliminate `L_simple_by1_loop` and jump to `L_by1_loop` instead? It seems the only thing needed is to subtract one from `len` before the jump.
Done!
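
For illustration only, here is a rough plain C++ model of why the merge suggested above works. It is not the stub code: the `AdlerState` struct and both function names are hypothetical, and it assumes the two loops differ only in whether the closing branch tests `len > 0` or `len >= 0`. The modulo reduction is left out, since the stub handles it separately.

```cpp
#include <cstdint>

// Hypothetical holder for the two running Adler-32 sums (no modulo applied here).
struct AdlerState {
  uint32_t s1;
  uint32_t s2;
};

// Models a byte-at-a-time loop whose closing branch is "repeat while len > 0".
// Precondition: len >= 1, because the body runs before the test.
AdlerState by1_branch_gtz(const uint8_t* buff, long len, AdlerState st) {
  do {
    st.s1 += *buff++;   // load one byte, advance the pointer, add to s1
    st.s2 += st.s1;     // accumulate s1 into s2
    len -= 1;
  } while (len > 0);
  return st;
}

// Models the same body with a closing branch of "repeat while len >= 0".
// The caller passes len - 1, i.e. it subtracts one before jumping in.
AdlerState by1_branch_gez(const uint8_t* buff, long len_minus_one, AdlerState st) {
  long len = len_minus_one;
  do {
    st.s1 += *buff++;
    st.s2 += st.s1;
    len -= 1;
  } while (len >= 0);
  return st;
}
```

Under these assumptions, `by1_branch_gez(buf, n - 1, st)` processes the same `n` bytes as `by1_branch_gtz(buf, n, st)` for any `n >= 1`, so a single loop body is enough once `len` is decremented before the jump.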
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18382#discussion_r1679204213
PR Review Comment: https://git.openjdk.org/jdk/pull/18382#discussion_r1679204177
PR Review Comment: https://git.openjdk.org/jdk/pull/18382#discussion_r1679204163