RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v9]

Volodymyr Paprotski duke at openjdk.org
Tue Feb 27 19:13:56 UTC 2024


On Fri, 23 Feb 2024 22:11:15 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:

>> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs.
>> 
>> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm)
>> 
>> This PR shows up to 19x speedup on buffer sizes of 1MB.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add cpuinfo test for avx_ifma

src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 1053:

> 1051: 
> 1052:   Let R be the 16-byte secret key used for polynomial evaluation.
> 1053:   Let K be the 16-byte secret AES key.

Poly1305 is not restricted to AES, so the comment should not mention it. Just remove 'AES' (call K the 'second part of the secret key' or something similar).

Per RFC 7539, section 2.5 (https://datatracker.ietf.org/doc/html/rfc7539#section-2.5):

Poly1305 takes a 32-byte one-time key and a message and produces a 16-byte tag.
...
Regardless of how the key is generated, the key is partitioned into two parts, called "r" and "s".
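
To illustrate the partition the RFC describes, here is a minimal sketch in Java. It is not code from this PR: the class and method names (`Poly1305KeySplit`, `clampedR`, `sPart`) are hypothetical, and the only assumptions are the RFC 7539 section 2.5 layout (first 16 bytes are "r", last 16 bytes are "s") and the RFC's clamping of "r". Nothing in it depends on AES.

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only (not part of the PR or the JDK).
class Poly1305KeySplit {

    // Returns a clamped copy of "r": the first 16 bytes of the 32-byte one-time key.
    static byte[] clampedR(byte[] key) {
        if (key.length != 32) {
            throw new IllegalArgumentException("Poly1305 one-time key must be 32 bytes");
        }
        byte[] r = Arrays.copyOfRange(key, 0, 16);
        // RFC 7539 clamping: clear the top four bits of r[3], r[7], r[11], r[15] ...
        r[3] &= 0x0F;
        r[7] &= 0x0F;
        r[11] &= 0x0F;
        r[15] &= 0x0F;
        // ... and the bottom two bits of r[4], r[8], r[12].
        r[4] &= 0xFC;
        r[8] &= 0xFC;
        r[12] &= 0xFC;
        return r;
    }

    // Returns a copy of "s": the last 16 bytes of the 32-byte one-time key.
    static byte[] sPart(byte[] key) {
        if (key.length != 32) {
            throw new IllegalArgumentException("Poly1305 one-time key must be 32 bytes");
        }
        return Arrays.copyOfRange(key, 16, 32);
    }

    private static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example key with every bit set, so the effect of clamping is visible.
        byte[] key = new byte[32];
        Arrays.fill(key, (byte) 0xFF);
        System.out.println("r = " + hex(clampedR(key)));
        System.out.println("s = " + hex(sPart(key)));
    }
}
```

How the 32-byte key is produced (AES-based, ChaCha20-based, or otherwise) is outside this split, which is why a neutral name for the second half fits the comment better.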

test/lib-test/jdk/test/whitebox/CPUInfoTest.java line 68:

> 66:                     "hv",           "fsrm",             "avx512_bitalg",     "gfni",
> 67:                     "f16c",         "pku",              "ospke",             "cet_ibt",
> 68:                     "cet_ss",       "avx512_ifma",      "serialize",         "avx_ifma"

Wouldn't `avx2_ifma` be a better name since we already have `avx512_ifma`?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1504737750
PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1504730588


More information about the hotspot-compiler-dev mailing list