RFR: 8351412: Add AVX-512 intrinsics for ML-KEM [v6]

Fri May 16 00:32:53 UTC 2025

On Thu, 15 May 2025 13:33:42 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the AVX-512 vector registers the speed of the computation of the ML-KEM algorithms (key generation, encapsulation, decapsulation) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Response to review comment + loading constants with broadcast op.

src/hotspot/cpu/x86/stubGenerator_x86_64_kyber.cpp line 250:

> 248: static void montmul(int outputRegs[], int inputRegs1[], int inputRegs2[],
> 249:              int scratchRegs1[], int scratchRegs2[], MacroAssembler *_masm) {
> 250:    for (int i = 0; i < 4; i++) {

The montMul intrinsic behaves as if MONT_R_BITS were 16 and MONT_Q_INV_MOD_R were 0xF301, whereas the Java code uses MONT_R_BITS = 20 and MONT_Q_INV_MOD_R = 0x8F301. Are these equivalent?
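
For reference, here is a small standalone sketch for sanity-checking the two parameter sets. It is not the JDK code: the class and method names are made up for illustration, montReduce is a generic textbook signed Montgomery reduction, and Q = 3329 is the ML-KEM modulus. It checks whether each constant is the inverse of q modulo its own R, whether 0xF301 is just the low 16 bits of 0x8F301, and whether the two reductions of the same product differ by a factor of 2^4 mod q.

```java
// Hypothetical helper class, not part of the JDK sources.
class MontConstCheck {
    static final int Q = 3329; // ML-KEM modulus

    // Returns (Q * qInvModR) mod 2^rBits; this is 1 when qInvModR is q^-1 mod R.
    static long qTimesInvModR(int qInvModR, int rBits) {
        return ((long) Q * qInvModR) & ((1L << rBits) - 1);
    }

    // Generic Montgomery reduction: returns t with t * 2^rBits == a (mod Q).
    static int montReduce(long a, int rBits, int qInvModR) {
        long mask = (1L << rBits) - 1;
        long m = ((a & mask) * qInvModR) & mask; // m = a * q^-1 mod R
        return (int) ((a - m * Q) >> rBits);     // a - m*Q is divisible by R
    }

    // True when the R = 2^16 result equals 2^4 times the R = 2^20 result, mod Q.
    static boolean differByFactor16(int b, int c) {
        long a = (long) b * c;
        int r16 = montReduce(a, 16, 0xF301);
        int r20 = montReduce(a, 20, 0x8F301);
        return Math.floorMod(r16 - 16L * r20, (long) Q) == 0;
    }

    public static void main(String[] args) {
        System.out.println("q * 0xF301  mod 2^16 = " + qTimesInvModR(0xF301, 16));
        System.out.println("q * 0x8F301 mod 2^20 = " + qTimesInvModR(0x8F301, 20));
        System.out.println("low 16 bits match: " + ((0x8F301 & 0xFFFF) == 0xF301));
        boolean all = true;
        for (int b = -Q; b <= Q; b += 37) {
            for (int c = -Q; c <= Q; c += 41) {
                all &= differByFactor16(b, c);
            }
        }
        System.out.println("results differ only by 2^4 mod q: " + all);
    }
}
```

If these checks pass, the constants are consistent with each other, but the two montMul variants still return different residues for the same inputs (a*b*2^-16 versus a*b*2^-20 mod q). Whether the intrinsic matches the Java code would then depend on how any precomputed Montgomery-form constants passed to it are scaled.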

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/24953#discussion_r2092137164