RFR: 8348561: Add aarch64 intrinsics for ML-DSA [v2]

Wed Feb 5 10:35:10 UTC 2025

On Tue, 4 Feb 2025 18:57:28 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>>> @ferakocz I'm afraid you lucked out on getting your change committed before my reorganization of the stub generation code. If you are unsure of how to do the merge so your new stub is declared and generated following the new model (see the doc comments in stubDeclarations.hpp for details) let me know and I'll be happy to help you sort it out.
>> 
>> @adinn I think I managed to figure it out. Please take a look at the PR and let me know if I should have done anything differently.
>
>> @ferakocz Yes, the stub declaration part of it looks to be correct.
>> 
>> The rest of the patch will need at least two reviewers (@theRealAph? @martinuy? @franferrax) and may take some time to review, given that they will probably need to read up on the maths and algorithms. As an aid for reviewers and maintainers it would be good to insert a comment into the generator file linking the implementations to the relevant maths and algorithm. I found the FIPS-204 spec and the CRYSTALS-Dilithium Algorithm Specifications and Supporting Documentation paper, Shi Bai, Léo Ducas et al, 2021 - are they the best ones to look at?
> 
> The Java implementation of ML-DSA is based on the FIPS-204 standard and the intrinsics' implementations are based on the corresponding Java methods, except that the montMul() calls in them are inlined. The rest of the transformation from Java code to intrinsic code is pretty straightforward, so a reviewer need not necessarily understand the whole mathematics of the ML-DSA algorithms, just that the Java and the corresponding intrinsic code do the same thing.

@ferakocz
> The Java implementation of ML-DSA is based on the FIPS-204 standard and the intrinsics' implementations are based on the corresponding Java methods, except that the montMul() calls in them are inlined. The rest of the transformation from Java code to intrinsic code is pretty straightforward, so a reviewer need not necessarily understand the whole mathematics of the ML-DSA algorithms, just that the Java and the corresponding intrinsic code do the same thing.

Yes, I located the relevant Java implementations in SHA3.java (keccak) and ML_DSA.java (dilithiumXXX), plus SHA3Parallel.java (doubleKeccak). The first file does at least mention FIPS-202. The second does not include any reference; in particular, it does not mention FIPS-204.

I still think it would be helpful for reviewers and maintainers if you were to add a comment in front of the generator routines that 1) notes that these routines are based on the relevant Java sources and 2) mentions that the Java code is in turn based on the FIPS-202 and FIPS-204 standards.

While I agree that a reviewer or maintainer could simply check the generated code against the Java code, I believe access to the underlying theory will help when it comes to understanding what each variant is doing and verifying the equivalence of the two.

That's why I'd also prefer to have two reviews: I want to be sure that more than one of us who may be tasked with maintaining this code is satisfied that we understand, at least, the equivalence in question.
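
To illustrate the kind of equivalence check discussed above, here is a minimal Java sketch of a signed Montgomery multiplication modulo the ML-DSA prime, checked against plain 64-bit modular arithmetic. Only the name montMul() comes from this thread. The class name, constant names, helper method and sample values are assumptions made for illustration, and the code is not copied from ML_DSA.java or from the aarch64 generator.

```java
// Illustrative sketch only. Class, constant and helper names are assumptions,
// not copied from ML_DSA.java or the aarch64 stub generator.
final class MontMulSketch {
    static final int Q = 8380417;       // ML-DSA modulus (FIPS 204)
    static final int R_BITS = 32;       // Montgomery radix R = 2^32
    static final int Q_INV = 58728449;  // Q^-1 mod 2^32

    // Returns r with r * 2^32 == b * c (mod Q) and -Q < r < Q,
    // provided |b * c| < 2^31 * Q.
    static int montMul(int b, int c) {
        long a = (long) b * c;            // full 64-bit product
        int aLow = (int) a;               // low 32 bits of the product
        int aHigh = (int) (a >> R_BITS);  // high 32 bits (arithmetic shift)
        int m = Q_INV * aLow;             // wraps mod 2^32, so m * Q == aLow (mod 2^32)
        int mqHigh = (int) (((long) m * Q) >> R_BITS);
        return aHigh - mqHigh;            // equals (a - m * Q) / 2^32 exactly
    }

    // Reference check of the defining congruence using plain long arithmetic.
    static boolean agreesWithReference(int b, int c) {
        long rModQ = (1L << R_BITS) % Q;  // 2^32 mod Q
        int r = montMul(b, c);
        return Math.floorMod(r * rModQ, (long) Q)
                == Math.floorMod((long) b * c, (long) Q);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 1753, Q - 1, -(Q - 1), 4193792};
        for (int b : samples) {
            for (int c : samples) {
                if (!agreesWithReference(b, c)) {
                    throw new AssertionError(b + " * " + c);
                }
            }
        }
        System.out.println("montMul agrees with the reference on all samples");
    }
}
```

A reviewer comparing an intrinsic against its Java counterpart could apply the same idea at a larger scale: feed identical inputs to both and compare the outputs, using the standard to explain why the result is the expected one.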

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23300#issuecomment-2636346476