RFR: 8316592: RISC-V: implement poly1305 intrinsic [v3]

null duke at openjdk.org
Thu Nov 2 13:48:37 UTC 2023


On Wed, 1 Nov 2023 16:07:58 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> null has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Replace slli/add sequence with shadd instruction
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4450:
> 
>> 4448:     __ srli(t3, t3, 24);
>> 4449:     if (dest2->is_valid()) {
>> 4450:       __ add(dest2, zr, t3);        // 2 bits in dest2
> 
> could be replaced with mv(dest2, t3)?

Replaced.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4486:
> 
>> 4484:   // computation.
>> 4485: 
>> 4486:   address generate_poly1305_processBlocks() {
> 
> Seems to me, the explicitl usage of tempaory register t0/1/2 can be reduced as just t1/t2.
> And leave t0 as the default implicit temporary register. Maybe this way can make the code more clear to read. Can you have a try?

Done.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4506:
> 
>> 4504:     pack_26(R_0, R_1, r_start);
>> 4505: 
>> 4506:     const Register tmp1 = t0;
> 
> Seems renaming tx to tmpx does not bring much benefit.
> And some instructions use t0 internally as default tmp register, naming t0 to tmp1 make it a bit more confusing to understand.
> And here 1 start index is used, and for other naming it's all 0-start index.

Renamed.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4506:
> 
>> 4504:     pack_26(R_0, R_1, r_start);
>> 4505: 
>> 4506:     const Register tmp1 = t0;
> 
> seems these tmpx does not bring much more help than built-in name tx?

Renamed.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4524:
> 
>> 4522:     Label DONE, LOOP;
>> 4523: 
>> 4524:     __ addi(tmp1, zr, checked_cast<u1>(BLOCK_LENGTH));
> 
> Is this the same as `mv(tmp1, checked_cast<u1>(BLOCK_LENGTH))`?

It is. Replaced.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4535:
> 
>> 4533:       __ cad(S_0, S_0, U_0, tmp1); // Add U_0 to S_0 with carry output to tmp1
>> 4534:       __ cadc(S_1, S_1, U_1, tmp1); // Add U_1 with carry to S_1 with carry output to tmp1
>> 4535:       __ adc(S_2, zr, U_2, tmp1);
> 
> Is this the same as `add(S_2, U_2, tmp1)`? If positive, it should be add(...) instead, as adc takes 2 instruction.

Fixed.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4567:
> 
>> 4565:       __ shadd(tmp1, tmp1, tmp1, tmp2, 2); // tmp1 is impossible to overflow since two leftmost bits are zero'ed in 'srli(tmp1, U_2, 2)'
>> 4566:       __ cad(U_0, U_0, tmp1, tmp3); // Add tmp1 (= (U_2 >> 2) * 5) to U_0 with carry output to tmp3
>> 4567:       __ cadc(U_1, U_1, zr, tmp3); // Add carry to U_1 with carry output to tmp3
> 
> should this be simplied as `cad(U_1, U_1, tmp3, tmp3);`
> and cad use one less instruction than cadc.

Fixed.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4572:
> 
>> 4570:       __ sub(length, length, checked_cast<u1>(BLOCK_LENGTH));
>> 4571:       __ addi(input_start, input_start, 2 * wordSize);
>> 4572:       __ addi(tmp1, zr, checked_cast<u1>(BLOCK_LENGTH));
> 
> Is this the same as mv(tmp1, checked_cast<u1>(BLOCK_LENGTH))? If positive, then better to use mv.

It is. Replaced.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4580:
> 
>> 4578:     __ shadd(tmp1, tmp1, tmp1, tmp2, 2); // tmp1 = U_2 * 5
>> 4579:     __ cad(U_0, U_0, tmp1, tmp3); // U_0 += U_2 * 5 with carry output to tmp3
>> 4580:     __ cadc(U_1, U_1, zr, tmp3); // Add carry to U_1 with carry output to tmp3
> 
> should this be simplied as `cad(U_1, U_1, tmp3, tmp3);`
> and cad use one less instruction than cadc.

Fixed.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166608
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166505
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166348
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380170415
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166194
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166568
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166249
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166406
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1380166310


More information about the hotspot-compiler-dev mailing list