RFR: 8316592: RISC-V: implement poly1305 intrinsic [v9]

ArsenyBochkarev duke at openjdk.org
Tue Nov 14 06:52:35 UTC 2023


On Mon, 13 Nov 2023 14:37:25 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Please correct me if I'm wrong:
>> 
>> 1. In the first loop iteration, after `poly1305_pack_26` and before the loop, `U_2` is `0b11` max. After any loop iteration the max for `U_2` is `0b100` (as we found out earlier);
>> 2. After `Add U_0 to S_0`: the carry is `0b1` max, plus the mandatory `addi` of `0b1` to `S_2` --> `S_2` is `0b110` max;
>> 3. After the `wide_mul`s and `wide_madd`s: `U_2` is again `0b11` max due to `andi(U_2, R_0, bits2)` (and even `0b0` in the case of `0b100` in the first step). NB: inside the `wide_*` functions `S_2` is unchanged;
>> 4. `mul(U_2, S_2, U_2)`: `U_2` is `0b1111` max --> in `poly1305_reduce`, `tmp1` is `0b11` max, which is safe?
>
> Did the above derivation process consider adc(U_2, U_2, U_1HI, t1); at L4564?

Oh, I missed it, thanks! However, as far as I can see, it doesn't change much:
in `adc(U_2, U_2, U_1HI, t1)` the register `t1` is `0b1` max, so the max for `U_2` becomes `0b10000` --> in `poly1305_reduce` the max for `tmp1` becomes `0b100`. Multiplying that by `0b101` gives `0b10100`, which is still OK. Or am I missing something here?
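
To make the arithmetic above easy to check, here is a minimal sketch that only replays the quoted bounds. It is not code from the patch: the names `kU2AfterMul`, `kMaxCarry`, `high_part` and `folded` are hypothetical, and it assumes that `tmp1` in `poly1305_reduce` is `U_2 >> 2` (the bits at position 130 and above) and that this value is multiplied by 5 because 2^130 is congruent to 5 modulo 2^130 - 5. The `0b1111` bound and the single carry bit are taken as given from the discussion.

```cpp
#include <cstdint>

// Bound assumed in the thread for U_2 right after mul(U_2, S_2, U_2).
constexpr std::uint64_t kU2AfterMul = 0b1111;
// Largest carry the thread assumes t1 can hold in adc(U_2, U_2, U_1HI, t1).
constexpr std::uint64_t kMaxCarry = 0b1;

// Assumption: tmp1 is the part of U_2 at bit 130 and above, i.e. U_2 >> 2,
// with U_2 holding bits 128 and up of the accumulator.
constexpr std::uint64_t high_part(std::uint64_t u2) { return u2 >> 2; }

// Assumption: the high part is folded back multiplied by 5 (0b101).
constexpr std::uint64_t folded(std::uint64_t u2) { return high_part(u2) * 0b101; }

// Replay of the chain: 0b1111 + 0b1 -> 0b10000 -> tmp1 = 0b100 -> 0b10100.
static_assert(kU2AfterMul + kMaxCarry == 0b10000, "U_2 bound after adc");
static_assert(high_part(kU2AfterMul + kMaxCarry) == 0b100, "tmp1 bound");
static_assert(folded(kU2AfterMul + kMaxCarry) == 0b10100, "tmp1 * 5 bound");
```

If the bounds hold as stated, the folded value is 20 at most, so adding it back in cannot come anywhere near overflowing a 64-bit register.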

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1392063975

