RFR: 8348561: Add aarch64 intrinsics for ML-DSA [v6]

Mon Feb 24 11:50:55 UTC 2025

On Thu, 20 Feb 2025 17:33:18 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> By using the aarch64 vector registers the speed of the computation of the ML-DSA algorithms (key generation, document signing, signature verification) can be approximately doubled.
>
> Ferenc Rakoczi has updated the pull request incrementally with four additional commits since the last revision:
> 
>  - Accepting suggested change from Andrew Dinn
>  - Added comments suggested by Andrew Dinn
>  - Fixed copyright years
>  - renaming a couple of functions

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 4593:

> 4591:   // chunks of) vector registers v30 and v31, resp.
> 4592:   // The inputs are in v0-v7 and v16-v23 and the results go to v16-v23,
> 4593:   // four 32-bit values in each register

Suggestion:

Once again, it would be good to annotate the lines in this code with comments that relate the generated code back to the original Java code.

In the header comment you should refer to the relevant Java class and the variable names used there:

    // computes (in parallel across 8 x 4S vectors)
    //    a = b * c * 2^-32 mod MONT_Q
    // where
    //    inputs b and c are in v0, ..., v7 and v16, ..., v23,
    //    scratch registers v24, ... v27 are clobbered
    //    output a is written back into v16, ... v23
    //    constants q and q_inv are in v30, v31
    //
    // See the equivalent Java code in method ML_DSA.montMul
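
For readers following along, here is a minimal scalar sketch of the signed Montgomery multiplication that the header comment describes, i.e. what each 32-bit lane would compute. The class name MontMulSketch is hypothetical, and the constants are the usual ML-DSA modulus and its inverse mod 2^32. This illustrates the formula only; it is not necessarily the exact body of ML_DSA.montMul.

```java
// Hypothetical illustration class; not part of the JDK sources.
final class MontMulSketch {
    static final int MONT_R_BITS = 32;            // R = 2^32
    static final int MONT_Q = 8380417;            // ML-DSA modulus q
    static final int MONT_Q_INV_MOD_R = 58728449; // q^-1 mod 2^32 (assumed constant)

    // Returns a value a with a * 2^32 == b * c (mod MONT_Q).
    // The result is not fully reduced; it may be negative.
    public static int montMul(int b, int c) {
        long prod = (long) b * (long) c;          // full 64-bit product
        int high = (int) (prod >> MONT_R_BITS);   // signed high half
        int low = (int) prod;                     // low half
        int m = MONT_Q_INV_MOD_R * low;           // wraps mod 2^32 on purpose
        // m * q has the same low 32 bits as prod, so subtracting the
        // high halves gives (prod - m * q) / 2^32 exactly.
        int mqHigh = (int) (((long) m * MONT_Q) >> MONT_R_BITS);
        return high - mqHigh;
    }
}
```

In the vectorized stub, each of these scalar steps would correspond to one or two instructions applied across the 4S lanes, which is the mapping the per-line comments should make explicit.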

Then comment the generation lines as shown below

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23300#discussion_r1967490923