RFR: 8316592: RISC-V: implement poly1305 intrinsic [v3]

null duke at openjdk.org
Wed Nov 1 18:51:36 UTC 2023


On Wed, 1 Nov 2023 03:56:23 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> null has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Replace slli/add sequence with shadd instruction
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4448:
> 
>> 4446:     __ add(dest1, dest1, t4);       // dest1 is full
>> 4447: 
>> 4448:     __ srli(t3, t3, 24);
> 
> Maybe move this `srli` instruction into the if-else blocks thus making it conditional?

Good catch! Moved.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4491:
> 
>> 4489:     address start = __ pc();
>> 4490:     Label here;
>> 4491: 
> 
> I think we should start a new frame with enter() on entry and leave() on exit for proper stackwalking.

Done.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4587:
> 
>> 4585:     const int64_t bits26 = right_n_bits(26);
>> 4586:     // First 26 bits is the first limb
>> 4587:     __ andi(tmp1, U_0, bits26); // Take lowest 26 bits
> 
> `andi` won't help save instructions in the case of a big bitmask. So I would suggest emit a pair of `slli` and `srli`.
> Consider:
> 
> __ slli(tmp1, U_0, 38); // Take lowest 26 bits
> __ srli(tmp1, tmp1, 38);

Done.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4593:
> 
>> 4591:     __ srli(tmp2, U_0, 26);
>> 4592:     __ andi(tmp2, tmp2, bits26); // Take next 27-52 bits
>> 4593:     __ sd(tmp2, Address(acc_start, sizeof (jlong))); // Second 26-bit limb
> 
> Consider replacing lines L4591-L4593 with:
> 
> __ slli(tmp1, U_0, 12); // Take next 27-52 bits
> __ srli(tmp1, tmp1, 38);
> __ sd(tmp1, Address(acc_start, sizeof (jlong))); // Second 26-bit limb

Done.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4600:
> 
>> 4598:     __ addw(tmp1, tmp1, tmp2);
>> 4599:     __ andi(tmp3, tmp1, bits26, tmp2); // Take remaining bits of tmp1
>> 4600:     __ sd(tmp3, Address(acc_start, 2 * sizeof (jlong))); // Third 26-bit limb
> 
> Consider replacing lines L4596-L4600 with:
> 
>     __ srli(tmp1, U_0, 52);
>     __ slli(tmp2, U_1, 50);
>     __ srli(tmp2, tmp2, 38);
>     __ add(tmp1, tmp1, tmp2);
>     __ sd(tmp1, Address(acc_start, 2 * sizeof (jlong))); // Third 26-bit limb

Replaced.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4604:
> 
>> 4602:     // Storing 15-40 bits of U_1
>> 4603:     __ srli(tmp3, U_1, 14); // Already used up 14 bits
>> 4604:     __ andi(tmp1, tmp3, bits26); // Clear all other bits from tmp3
> 
> Consider replacing lines L4603-L4604 with:
> 
> __ slli(tmp1, U_1, 24); // Already used up 14 bits
> __ srli(tmp1, tmp1, 38);

Replaced.

> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4611:
> 
>> 4609:     __ andi(tmp2, U_2, 3); // Clear all bits in U_2 except for first 2
>> 4610:     __ slli(tmp2, tmp2, 24);
>> 4611:     __ addw(tmp1, tmp1, tmp2);
> 
> Suggestion: __ add(tmp1, tmp1, tmp2);

Done.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178250
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178365
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178506
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178692
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178644
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178593
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178745


More information about the hotspot-compiler-dev mailing list