RFR: 8371259: ML-DSA AVX2 and AVX512 intrinsics and improvements

Mark Powers mpowers at openjdk.org
Sun Nov 16 17:24:15 UTC 2025


On Tue, 4 Nov 2025 16:38:49 GMT, Volodymyr Paprotski <vpaprotski at openjdk.org> wrote:

> - New AVX2 intrinsics are 1.6x-6.9x faster than Java baseline 
>    - `SignatureBench.MLDSA` is 1.2x-2.2x faster
> - Note: there are no AVX2-SHA3 intrinsics yet (being reviewed at https://github.com/vpaprotsk/jdk/pull/7)
> - AVX512 intrinsic improvements are 1.24x-1.5x faster than the current version 
>   - `SignatureBench.MLDSA` is up to 5% faster, never slower
> 
> Notes on the intrinsics:
> - The emitted (existing) AVX512 assembler was not "significantly" changed; mostly more efficient instruction selection and tighter register allocation, which allowed removal of the NTT loop and stack spill.
> - Code was refactored to allow reuse of the same assembler (where possible) for AVX512 and AVX2
> 
> Tests and benchmarks:
> - Added a fuzz test to ensure the Java code and the intrinsics produce exactly the same result
> - Added a benchmark to measure the performance of the intrinsics themselves
> 
> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java"
> make test TEST="test/jdk/sun/security/provider/acvp/Launcher.java test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java" JTREG="JAVA_OPTIONS=-XX:UseAVX=2"
> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:+UseDilithiumIntrinsics;FORK=1"
> make test TEST="micro:org.openjdk.bench.javax.crypto.full.SignatureBench.MLDSA" MICRO="JAVA_OPTIONS=-XX:+UnlockDiagnosticVMOptions -XX:-UseDilithiumIntrinsics;FORK=1"

You might want to have @kuksenko or @ericcaspole look at MLDSABench.java.

test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 29:

> 27: import java.lang.invoke.MethodHandle;
> 28: import java.lang.invoke.MethodHandles;
> 29: import java.lang.reflect.Field;

unused import statement

test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 31:

> 29: import java.lang.reflect.Field;
> 30: import java.lang.reflect.Method;
> 31: import java.lang.reflect.Constructor;

unused import

test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 123:

> 121:         try {
> 122:             for (int i = 0; i < repeat; i++) {
> 123:                 // seed = rnd.nextLong();

2 lines commented out
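
If those lines were there to pin a seed while reproducing a failure (an assumption on my part), one alternative is to take the seed as an optional argument and always print it. Below is a minimal, self-contained sketch of that pattern. Every name in it is hypothetical and nothing is taken from the PR; `reference` and `candidate` are simple stand-ins for the Java code path and the intrinsic path.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of a seeded differential fuzz test.
// None of these names come from ML_DSA_Intrinsic_Test.java.
class DifferentialFuzzSketch {
    static final int N = 256; // coefficients per ML-DSA polynomial

    // Stand-in for the plain Java implementation.
    static int[] reference(int[] a, int[] b) {
        int[] r = new int[N];
        for (int i = 0; i < N; i++) {
            r[i] = a[i] + b[i];
        }
        return r;
    }

    // Stand-in for the optimized path (the intrinsic, in a real test).
    // Here it is just the same addition unrolled by four.
    static int[] candidate(int[] a, int[] b) {
        int[] r = new int[N];
        for (int i = 0; i < N; i += 4) {
            r[i] = a[i] + b[i];
            r[i + 1] = a[i + 1] + b[i + 1];
            r[i + 2] = a[i + 2] + b[i + 2];
            r[i + 3] = a[i + 3] + b[i + 3];
        }
        return r;
    }

    // Returns the index of the first failing iteration, or -1 if all agree.
    static int firstMismatch(long seed, int repeat) {
        Random rnd = new Random(seed);
        for (int i = 0; i < repeat; i++) {
            int[] a = rnd.ints(N).toArray();
            int[] b = rnd.ints(N).toArray();
            if (!Arrays.equals(reference(a, b), candidate(a, b))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A seed passed on the command line replays an earlier run;
        // otherwise a fresh one is drawn. It is printed either way.
        long seed = args.length > 0 ? Long.parseLong(args[0])
                                    : new Random().nextLong();
        System.out.println("seed = " + seed);
        int bad = firstMismatch(seed, 1000);
        if (bad >= 0) {
            throw new AssertionError("mismatch at iteration " + bad
                    + ", seed " + seed);
        }
        System.out.println("no mismatches");
    }
}
```

With this shape a failing run can be replayed by passing the printed seed back in, so nothing needs to be commented in or out.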

test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 219:

> 217:         int[] coeffs3 = new int[ML_DSA_N];
> 218:         for (int j = 0; j<ML_DSA_N; j++) {
> 219:             coeffs3[j] =

`coeffs3` is written to but never read

test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java line 517:

> 515:     };
> 516: }
> 517: // java --add-opens java.base/sun.security.provider=ALL-UNNAMED  -XX:+UseDilithiumIntrinsics test/jdk/sun/security/provider/acvp/ML_DSA_Intrinsic_Test.java

This line is useful. Not sure I would hide it at the bottom of the file.

test/micro/org/openjdk/bench/javax/crypto/full/MLDSABench.java line 2:

> 1: /*
> 2:  * Copyright (c) 2015, 2018, Oracle and/or its affiliates. All rights reserved.

The copyright years look stale and should be updated.

-------------

Marked as reviewed by mpowers (Committer).

PR Review: https://git.openjdk.org/jdk/pull/28136#pullrequestreview-3470287661
PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532070492
PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532071025
PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532075447
PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532074544
PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532078122
PR Review Comment: https://git.openjdk.org/jdk/pull/28136#discussion_r2532078790

