RFR: 8316592: RISC-V: implement poly1305 intrinsic [v9]

Tue Nov 14 17:47:38 UTC 2023

On Tue, 14 Nov 2023 16:13:36 GMT, Hamlin Li <mli at openjdk.org> wrote:

>> Done.
>
> Thanks for updating.
> 
> 
>> If the above issue is not a false one, and the implemention was tested with existing jdk test suite, then I'm afraid we need to add more test to catch this issue.
> 
> As we talked above, seems current test did not catch this overlow issue? Could you try to expand the current tests to catch this overflow issue?

Hmm, I looked at the code once again and it seems to me that the overflow is actually impossible. Consider the case of 8-bit registers (for simplicity).

1. Above `wide_mul`'s and `wide_madd`'s:
> we know that the top four bits of R_0 and R_1 are zero

  Therefore,`R_1` and `R_0`'s max value (in our case of 8-bit registers) is `0b1111`. Then max value for `RR_1` is `(R_1 >> 2) * 5` = `0b1000`

2. As we figured out earlier: the max value for `S_2` is `0b110`;

Using these facts we can deduce the restrictions for `U_1HI`:
1. `wide_mul(U_1, U_1HI, S_0, R_1)`: let `S_0 = 0b11111111`. Then the `U_1HI` here is `0b1110`;

2. `wide_madd(U_1, U_1HI, S_1, R_0, t1, t2)`: let `S_1 = 0b11111111`, carry is `0b1`, then the `U_1HI` = `U_1HI` + `0b1110` + `0b1` = `0b11101`;

3. `wide_madd(U_1, U_1HI, S_2, RR_1, t1, t2)`: let `S_2 = 0b11111111` carry is `0b1`, then the `U_1HI` = `U_1HI` + `0b111` (we're taking high 8 bits due to `mulhu` instruction) + `0b1` = `0b1000`;

4. `mul(U_2, S_2, U_2)`: max value for `U_2` is `0b10010`;

5. `adc(U_2, U_2, U_1HI, t1)`: max value for `U_2` is `0b10010` + `0b1000` + `0b1` = `0b10100010`;

6. `poly1305_reduce`: max value for `tmp1` is `0b11001000`

Am I missing something once again or is it alright?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16417#discussion_r1392997747