RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v2]

Andrew Haley aph at openjdk.org
Wed May 24 11:13:16 UTC 2023


On Wed, 24 May 2023 10:18:39 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7097:
>> 
>>> 7095:       // together partial products without any risk of needing to
>>> 7096:       // propagate a carry out.
>>> 7097:       wide_mul(U_0, U_0HI, S_0, R_0);  wide_madd(U_0, U_0HI, S_1, RR_1); wide_madd(U_0, U_0HI, S_2, RR_0);
>> 
>> What is `r` corresponding to here? This asserts that 'the top four bits of each 32-bit subword of "r" are zero'. If `r` is `R_0...R_2` it would seem broken since we're packing 26-bit values into `R_0...R_2` above in a way that would break this invariant?
>
> No, it doesn't break the invariants.
> 
> R is the randomly-chosen 128-bit key. It is generated from an initial 128-bit-long string of random bits, then
> `r &= 0x0ffffffc0ffffffc0ffffffc0fffffff`
> 
> This 128-bit-long string is split into 26-bit limbs before the intrinsic is called. The zero bits remain zero.
> When we repack R into two 64-bit registers those zero bits are still zero.

See https://loup-vaillant.fr/tutorials/poly1305-design for a fuller explanation.
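
To make the argument above concrete, here is a minimal standalone sketch, not the HotSpot stub code. It clamps a 128-bit value held as two 64-bit halves, splits it into five 26-bit limbs, repacks the limbs into two 64-bit words, and checks that the top four bits of each 32-bit subword are still zero. The names `Key128`, `Limbs26`, `clamp`, `split26`, `repack`, `top_nibbles_clear` and `check_invariant` are hypothetical and exist only for this illustration; the little-endian limb layout is an assumption based on the usual Poly1305 description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types for illustration only; not taken from stubGenerator_aarch64.cpp.
struct Key128  { uint64_t lo, hi; };    // 128-bit r as two little-endian halves
struct Limbs26 { uint64_t limb[5]; };   // five 26-bit limbs, least significant first

constexpr uint64_t MASK26 = (1ULL << 26) - 1;

// r &= 0x0ffffffc0ffffffc0ffffffc0fffffff, applied to each 64-bit half.
inline Key128 clamp(Key128 r) {
  return Key128{ r.lo & 0x0ffffffc0fffffffULL, r.hi & 0x0ffffffc0ffffffcULL };
}

// Split the 128-bit value into 26-bit limbs (the top limb holds only 24 bits).
inline Limbs26 split26(Key128 r) {
  Limbs26 l;
  l.limb[0] = r.lo & MASK26;
  l.limb[1] = (r.lo >> 26) & MASK26;
  l.limb[2] = ((r.lo >> 52) | (r.hi << 12)) & MASK26;  // straddles both halves
  l.limb[3] = (r.hi >> 14) & MASK26;
  l.limb[4] = (r.hi >> 40) & MASK26;
  return l;
}

// Repack the limbs into two 64-bit words; this is the exact inverse of split26.
inline Key128 repack(const Limbs26& l) {
  Key128 r;
  r.lo = l.limb[0] | (l.limb[1] << 26) | (l.limb[2] << 52);  // high bits of limb 2 fall off
  r.hi = (l.limb[2] >> 12) | (l.limb[3] << 14) | (l.limb[4] << 40);
  return r;
}

// True if the top four bits of every 32-bit subword are zero.
inline bool top_nibbles_clear(Key128 r) {
  const uint64_t top = 0xf0000000f0000000ULL;
  return ((r.lo | r.hi) & top) == 0;
}

// Check the claim for one raw (unclamped) 128-bit value.
inline void check_invariant(Key128 raw) {
  Key128 r = clamp(raw);
  Key128 p = repack(split26(r));
  assert(p.lo == r.lo && p.hi == r.hi);  // split + repack moves bits, changes none
  assert(top_nibbles_clear(p));          // so the clamped-off bits are still zero
}
```

Calling `check_invariant(Key128{~0ULL, ~0ULL})` exercises the worst case, where every bit not cleared by the mask is set. Splitting and repacking is a pure permutation of bit positions back to where they started, which is why the zero bits survive.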

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1203912143

