RFR: 8316592: RISC-V: implement poly1305 intrinsic [v3]
null
duke at openjdk.org
Wed Nov 1 18:51:36 UTC 2023
On Wed, 1 Nov 2023 03:56:23 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> null has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Replace slli/add sequence with shadd instruction
>
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4448:
>
>> 4446: __ add(dest1, dest1, t4); // dest1 is full
>> 4447:
>> 4448: __ srli(t3, t3, 24);
>
> Maybe move this `srli` instruction into the if-else blocks thus making it conditional?
Good catch! Moved.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4491:
>
>> 4489: address start = __ pc();
>> 4490: Label here;
>> 4491:
>
> I think we should start a new frame with enter() on entry and leave() on exit for proper stackwalking.
Done.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4587:
>
>> 4585: const int64_t bits26 = right_n_bits(26);
>> 4586: // First 26 bits is the first limb
>> 4587: __ andi(tmp1, U_0, bits26); // Take lowest 26 bits
>
> `andi` won't help save instructions in the case of a big bitmask. So I would suggest emit a pair of `slli` and `srli`.
> Consider:
>
> __ slli(tmp1, U_0, 38); // Take lowest 26 bits
> __ srli(tmp1, tmp1, 38);
Done.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4593:
>
>> 4591: __ srli(tmp2, U_0, 26);
>> 4592: __ andi(tmp2, tmp2, bits26); // Take next 27-52 bits
>> 4593: __ sd(tmp2, Address(acc_start, sizeof (jlong))); // Second 26-bit limb
>
> Consider replacing lines L4591-L4593 with:
>
> __ slli(tmp1, U_0, 12); // Take next 27-52 bits
> __ srli(tmp1, tmp1, 38);
> __ sd(tmp1, Address(acc_start, sizeof (jlong))); // Second 26-bit limb
Done.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4600:
>
>> 4598: __ addw(tmp1, tmp1, tmp2);
>> 4599: __ andi(tmp3, tmp1, bits26, tmp2); // Take remaining bits of tmp1
>> 4600: __ sd(tmp3, Address(acc_start, 2 * sizeof (jlong))); // Third 26-bit limb
>
> Consider replacing lines L4596-L4600 with:
>
> __ srli(tmp1, U_0, 52);
> __ slli(tmp2, U_1, 50);
> __ srli(tmp2, tmp2, 38);
> __ add(tmp1, tmp1, tmp2);
> __ sd(tmp1, Address(acc_start, 2 * sizeof (jlong))); // Third 26-bit limb
Replaced.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4604:
>
>> 4602: // Storing 15-40 bits of U_1
>> 4603: __ srli(tmp3, U_1, 14); // Already used up 14 bits
>> 4604: __ andi(tmp1, tmp3, bits26); // Clear all other bits from tmp3
>
> Consider replacing lines L4603-L4604 with:
>
> __ slli(tmp1, U_1, 24); // Already used up 14 bits
> __ srli(tmp1, tmp1, 38);
Replaced.
> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4611:
>
>> 4609: __ andi(tmp2, U_2, 3); // Clear all bits in U_2 except for first 2
>> 4610: __ slli(tmp2, tmp2, 24);
>> 4611: __ addw(tmp1, tmp1, tmp2);
>
> Suggestion: __ add(tmp1, tmp1, tmp2);
Done.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178250
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178365
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178506
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178692
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178644
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178593
PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1379178745
More information about the hotspot-compiler-dev
mailing list