RFR: 8360934: Add AVX-512 intrinsics for ML-KEM - enhancement on AVX512_VBMI [v4]

Shawn M Emery duke at openjdk.org
Thu Jan 8 17:59:35 UTC 2026


> This change allows use of the AVX512_VBMI instruction set to further optimize decompression/parsing of polynomial coefficients for ML-KEM. The speedup gained in the ML-KEM benchmarks is 0.3 to 0.6% for key generation, 0.4 to 1.7% for encapsulation, and 0.3 to 1.9% for decapsulation.
> 
> Thank you to @sviswa7 and @ferakocz for their help in working through the early stages of this code with me.

Shawn M Emery has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision:

 - Merge with mainline
 - 8360934: Add AVX-512 intrinsics for ML-KEM - enhancement on AVX512_VBMI
   Change Swap to Dup named function/variable
   Check for only VBMI support (not VBMI2)
 - Update copyright year
 - Merge with mainline
 - Swap parameter operation with source
 - Remove wrong mask from evpsrlvw
 - Reverse ordering for vpermb and vpsrlvw instructions
 - Switch from vpshldvw to vpsrlvw
 - Fix whitespaces
 - 8360934: Add AVX-512 intrinsics for ML-KEM - enhancement on AVX512_VBMI and AVX512_VBMI2
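
For readers who want a picture of what a byte permute followed by a per-lane variable shift (the vpermb and vpsrlvw mentioned in the commits above) can accomplish when parsing packed coefficients, here is a scalar Java sketch. It is not the code in this pull request: it assumes the 12-bit little-endian packing that ML-KEM uses for encoded polynomials, and the class and method names (Parse12Sketch, unpack12) are made up for illustration. The real intrinsic is emitted by the HotSpot stub generator and processes many lanes per instruction.

```java
// Hypothetical illustration only; names are not taken from the JDK sources.
final class Parse12Sketch {

    // Unpacks little-endian 12-bit values: every 3 input bytes yield 2 coefficients.
    static short[] unpack12(byte[] in) {
        if (in.length % 3 != 0) {
            throw new IllegalArgumentException("input length must be a multiple of 3");
        }
        short[] out = new short[in.length / 3 * 2];
        for (int i = 0, j = 0; i < in.length; i += 3, j += 2) {
            int b0 = in[i] & 0xFF;
            int b1 = in[i + 1] & 0xFF;
            int b2 = in[i + 2] & 0xFF;

            // Step 1 (what a byte permute would do across a whole vector):
            // place the two bytes that contain each coefficient into one 16-bit lane.
            int lane0 = b0 | (b1 << 8);
            int lane1 = b1 | (b2 << 8);

            // Step 2 (what a per-lane variable right shift would do):
            // even lanes shift by 0, odd lanes shift by 4.
            int shifted0 = lane0 >>> 0;
            int shifted1 = lane1 >>> 4;

            // Step 3: keep the low 12 bits of each lane.
            out[j] = (short) (shifted0 & 0xFFF);
            out[j + 1] = (short) (shifted1 & 0xFFF);
        }
        return out;
    }
}
```

In a vectorized version, step 1 and step 2 would each be a single instruction over all lanes, with the permutation indices and shift counts held in constant vectors; that layout is an assumption of this sketch rather than a description of the patch.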

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/28815/files
  - new: https://git.openjdk.org/jdk/pull/28815/files/4af75963..373b1339

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=28815&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=28815&range=02-03

  Stats: 26668 lines in 2610 files changed: 7287 ins; 4136 del; 15245 mod
  Patch: https://git.openjdk.org/jdk/pull/28815.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/28815/head:pull/28815

PR: https://git.openjdk.org/jdk/pull/28815

