RFR: 8360934: Add AVX-512 intrinsics for ML-KEM - enhancement on AVX512_VBMI and AVX512_VBMI2 [v2]

Volodymyr Paprotski vpaprotski at openjdk.org
Wed Jan 7 18:04:14 UTC 2026


On Wed, 7 Jan 2026 17:47:59 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> I wasn't as clear in my question. The asm is indeed processing the bytes in the increment. What I was trying to convince myself about is: how come we are not reading past the end of the array? Or are we?
>> 
>> On one hand, this is exactly what the existing asm code does, so I will assume that it's correct. However, on the Java side/version of this code, I could only convince myself about processing ~two AVX512 vectors at a time, not four.
>> 
>> So either I can't count, or there are some further (implicit) restrictions on the callers of `twelve2Sixteen`.
>
> In ML_KEM.java there is this assert (and this is the only call to implKyber12To16()):
> 
>         assert ((remainder == 0) || (remainder == 48)) &&
>                 (index + i * 96 <= condensed.length);
>         implKyber12To16(condensed, index, parsed, parsedLength);
> 
> and one can check how the callers of twelve2Sixteen() make sure that this is the case.

Yep, that's exactly the assert I was looking at as well. It looks to me like it divides the 'expanded-short-array-length' by 64 and ensures the remainder is zero (ignoring the 48 case and the condensed-length check for now).

For simplicity: the 'expanded' array length should then be a multiple of 64 shorts, i.e. 128 bytes. But we stride the expanded array by 256 bytes (and parsedLength by 128 shorts)?

I haven't checked the callers of `twelve2Sixteen`, but I suspect that the length of the expanded array is always a multiple of 256 bytes (128 shorts), in which case the assert is 'incomplete'?
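
To make the arithmetic concrete, here is a minimal sketch of the mismatch I mean. Everything in it is hypothetical: `StrideCheck`, its constants and its helpers are not from ML_KEM.java or the stub; they just encode the numbers above, and the sketch assumes the loop keeps going until `parsedLength` is used up, taking a full 128-short step each time.

```java
// Hypothetical illustration only; not code from ML_KEM.java or the intrinsic.
class StrideCheck {
    // Granule the assert appears to check (in shorts), as read above.
    static final int ASSERT_GRANULE_SHORTS = 64;
    // Stride of the vectorized loop (128 shorts = 256 bytes), as read above.
    static final int LOOP_STRIDE_SHORTS = 128;

    // Remainder part of the assert, ignoring the 48 case and the
    // condensed-length check.
    static boolean passesRemainderCheck(int parsedLength) {
        return parsedLength % ASSERT_GRANULE_SHORTS == 0;
    }

    // Shorts touched by a loop that always takes a full stride
    // (assumption: it runs ceil(parsedLength / stride) iterations).
    static int shortsTouched(int parsedLength) {
        int iterations =
            (parsedLength + LOOP_STRIDE_SHORTS - 1) / LOOP_STRIDE_SHORTS;
        return iterations * LOOP_STRIDE_SHORTS;
    }

    // Shorts accessed past the end of an array of exactly parsedLength shorts.
    static int overrun(int parsedLength) {
        return shortsTouched(parsedLength) - parsedLength;
    }

    public static void main(String[] args) {
        for (int len : new int[] {64, 128, 192, 256}) {
            System.out.println(len + " shorts: passes assert = "
                + passesRemainderCheck(len) + ", overrun = " + overrun(len));
        }
    }
}
```

Under these assumptions, lengths of 64 and 192 shorts pass the remainder check but would overrun by 64 shorts, while 128 and 256 are fine. So if the callers only ever pass multiples of 128 shorts, the code is safe, but the assert does not say so.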

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/28815#discussion_r2669535184

