RFR: 8351034: Add AVX-512 intrinsics for ML-DSA [v3]

Ferenc Rakoczi duke at openjdk.org
Wed Mar 5 18:30:03 UTC 2025


On Wed, 5 Mar 2025 14:03:00 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Hi, @jatin-bhateja, thanks for the suggestion. I have added __ align(OptoLoopAlignment); before all loop entries.
>
> Hi @ferakocz,
> 
> Thanks! For efficient utilization of the Decode ICache (please refer to Intel SDM section 3.4.2.5), code blocks should be aligned to 32-byte boundaries; a 64-byte-aligned address also satisfies both 16-byte and 32-byte alignment and matches the cache line size. However, I noticed that we have been using OptoLoopAlignment in places in AES-GCM as well.
> 
> I introduced some errors into the generate_dilithiumAlmostInverseNtt_avx512 implementation, expecting the existing ML_DSA_Tests under
> test/jdk/sun/security/provider/acvp to catch them.
> 
> But all the tests passed for me.
> `java  -jar /home/jatinbha/sandboxes/jtreg/build/images/jtreg/lib/jtreg.jar -jdk:$JAVA_HOME -Djdk.test.lib.artifacts.ACVP-Server=/home/jatinbha/softwares/v1.1.0.38.zip -va -timeout:4 Launcher.java`
> 
> Can you please point out a test I need to use for validation?

I think the easiest is to put a for (int i = 0; i < 1000; i++) loop around the switch statement in the run() method of the ML_DSA_Test class (test/jdk/sun/security/provider/acvp/ML_DSA_Test.java). (This is because the intrinsics kick in only after a few thousand calls of the method, so without the loop the test finishes before the intrinsic code is ever used and the errors you introduced are never exercised.)
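
A minimal sketch of the shape of that change, not the actual test code: the class name, the mode strings, and the counters below are hypothetical stand-ins, and the real run() method calls into the ML-DSA implementation where this sketch only counts iterations.

```java
public class WarmupLoopSketch {
    // Hypothetical counters standing in for the real test work.
    static int keyGenRuns = 0;
    static int sigGenRuns = 0;
    static int sigVerRuns = 0;

    // Hypothetical stand-in for run(): the existing switch statement is
    // wrapped in a loop so that each case body executes 1000 times.
    static void run(String mode) {
        for (int i = 0; i < 1000; i++) {
            switch (mode) {
                case "keyGen":
                    keyGenRuns++;   // the real test would generate a key pair here
                    break;
                case "sigGen":
                    sigGenRuns++;   // the real test would produce a signature here
                    break;
                case "sigVer":
                    sigVerRuns++;   // the real test would verify a signature here
                    break;
                default:
                    throw new IllegalArgumentException("unknown mode: " + mode);
            }
        }
    }

    public static void main(String[] args) {
        run("keyGen");
        run("sigGen");
        run("sigVer");
        System.out.println(keyGenRuns + " " + sigGenRuns + " " + sigVerRuns);
    }
}
```

Whether 1000 iterations is enough depends on how often each iteration calls the method in question and on the JIT compilation thresholds in effect, so the count may need adjusting.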

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23860#discussion_r1981945490

