RFR: 8316592: RISC-V: implement poly1305 intrinsic [v3]
null
duke at openjdk.org
Thu Nov 2 13:48:37 UTC 2023
On Wed, 1 Nov 2023 16:07:58 GMT, Hamlin Li <mli at openjdk.org> wrote:
>> null has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Replace slli/add sequence with shadd instruction
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4450:
>
>> 4448: __ srli(t3, t3, 24);
>> 4449: if (dest2->is_valid()) {
>> 4450: __ add(dest2, zr, t3); // 2 bits in dest2
>
> could be replaced with mv(dest2, t3)?
Replaced.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4486:
>
>> 4484: // computation.
>> 4485:
>> 4486: address generate_poly1305_processBlocks() {
>
> Seems to me, the explicitl usage of tempaory register t0/1/2 can be reduced as just t1/t2.
> And leave t0 as the default implicit temporary register. Maybe this way can make the code more clear to read. Can you have a try?
Done.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4506:
>
>> 4504: pack_26(R_0, R_1, r_start);
>> 4505:
>> 4506: const Register tmp1 = t0;
>
> Seems renaming tx to tmpx does not bring much benefit.
> And some instructions use t0 internally as default tmp register, naming t0 to tmp1 make it a bit more confusing to understand.
> And here 1 start index is used, and for other naming it's all 0-start index.
Renamed.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4506:
>
>> 4504: pack_26(R_0, R_1, r_start);
>> 4505:
>> 4506: const Register tmp1 = t0;
>
> seems these tmpx does not bring much more help than built-in name tx?
Renamed.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4524:
>
>> 4522: Label DONE, LOOP;
>> 4523:
>> 4524: __ addi(tmp1, zr, checked_cast<u1>(BLOCK_LENGTH));
>
> Is this the same as `mv(tmp1, checked_cast<u1>(BLOCK_LENGTH))`?
It is. Replaced.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4535:
>
>> 4533: __ cad(S_0, S_0, U_0, tmp1); // Add U_0 to S_0 with carry output to tmp1
>> 4534: __ cadc(S_1, S_1, U_1, tmp1); // Add U_1 with carry to S_1 with carry output to tmp1
>> 4535: __ adc(S_2, zr, U_2, tmp1);
>
> Is this the same as `add(S_2, U_2, tmp1)`? If positive, it should be add(...) instead, as adc takes 2 instruction.
Fixed.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4567:
>
>> 4565: __ shadd(tmp1, tmp1, tmp1, tmp2, 2); // tmp1 is impossible to overflow since two leftmost bits are zero'ed in 'srli(tmp1, U_2, 2)'
>> 4566: __ cad(U_0, U_0, tmp1, tmp3); // Add tmp1 (= (U_2 >> 2) * 5) to U_0 with carry output to tmp3
>> 4567: __ cadc(U_1, U_1, zr, tmp3); // Add carry to U_1 with carry output to tmp3
>
> should this be simplied as `cad(U_1, U_1, tmp3, tmp3);`
> and cad use one less instruction than cadc.
Fixed.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4572:
>
>> 4570: __ sub(length, length, checked_cast<u1>(BLOCK_LENGTH));
>> 4571: __ addi(input_start, input_start, 2 * wordSize);
>> 4572: __ addi(tmp1, zr, checked_cast<u1>(BLOCK_LENGTH));
>
> Is this the same as mv(tmp1, checked_cast<u1>(BLOCK_LENGTH))? If positive, then better to use mv.
It is. Replaced.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4580:
>
>> 4578: __ shadd(tmp1, tmp1, tmp1, tmp2, 2); // tmp1 = U_2 * 5
>> 4579: __ cad(U_0, U_0, tmp1, tmp3); // U_0 += U_2 * 5 with carry output to tmp3
>> 4580: __ cadc(U_1, U_1, zr, tmp3); // Add carry to U_1 with carry output to tmp3
>
> should this be simplied as `cad(U_1, U_1, tmp3, tmp3);`
> and cad use one less instruction than cadc.
Fixed.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166608
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166505
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166348
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380170415
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166194
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166568
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166249
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166406
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166310
More information about the hotspot-compiler-dev
mailing list