RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]

Volodymyr Paprotski duke at openjdk.org
Mon Apr 15 22:12:31 UTC 2024


On Fri, 5 Apr 2024 07:19:28 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   remove use of jdk.crypto.ec
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 39:
> 
>> 37: };
>> 38: static address modulus_p256() {
>> 39:   return (address)MODULUS_P256;
> 
> Long constants should have UL suffix.

Strictly speaking it should be `ULL`, but good point; fixed.
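
For illustration, a minimal sketch of why the suffix matters. The constant and helper names are hypothetical, and the value is not the actual `MODULUS_P256` table. `unsigned long` is only 32 bits on LLP64 targets such as 64-bit Windows, while `unsigned long long` is at least 64 bits everywhere, so `ULL` keeps 64-bit expressions portable:

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical 64-bit limb constant (illustrative value only).
static const uint64_t kExampleLimb = 0x000FFFFFFFFFFFFFULL;

// The language guarantees unsigned long long is at least 64 bits wide;
// no such guarantee exists for unsigned long.
static_assert(sizeof(unsigned long long) * CHAR_BIT >= 64,
              "unsigned long long must hold 64 bits");

inline uint64_t example_limb() { return kExampleLimb; }

// 1ULL << n is well defined for n < 64 on every platform.
// 1UL << n would be undefined for n >= 32 where unsigned long is 32 bits.
inline uint64_t shifted_one(unsigned n) {
  assert(n < 64);
  return 1ULL << n;
}
```

A plain hex literal that does not fit in `unsigned long` is promoted to a wider type anyway, so the suffix matters most inside expressions such as the shift above.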

> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 386:
> 
>> 384:   __ jcc(Assembler::equal, L_Length19);
>> 385: 
>> 386:   // Default copy loop
> 
> Please add appropriate loop entry alignment.

This is actually the 'default' case of a switch statement. It should never be reached (see the "Known Length" comment on line 335), but I added it because the Java code has the same behavior (e.g. in the unlikely case that NIST adds a new elliptic curve to the existing standard).
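
Roughly, the structure looks like the C++ sketch below. This is an illustration under assumptions, not the stub itself: the function names are hypothetical, and only the length-19 case is taken from the quoted code (`L_Length19`); the other known lengths are omitted.

```cpp
#include <cstdint>

// Copy with a compile-time trip count, which a compiler can fully unroll.
template <int N>
inline void copy_fixed(const uint64_t* a, uint64_t* b) {
  for (int i = 0; i < N; i++) {
    b[i] = a[i];
  }
}

// Returns true when a specialized (known-length) path was taken.
inline bool copy_limbs(const uint64_t* a, uint64_t* b, int length) {
  switch (length) {
    case 19:  // one of the known lengths (L_Length19 in the quoted code)
      copy_fixed<19>(a, b);
      return true;
    default:  // not expected in practice; generic loop kept for parity with Java
      for (int i = 0; i < length; i++) {
        b[i] = a[i];
      }
      return false;
  }
}
```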

> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 394:
> 
>> 392:   __ lea(aLimbs, Address(aLimbs,8));
>> 393:   __ lea(bLimbs, Address(bLimbs,8));
>> 394:   __ jmp(L_DefaultLoop);
> 
> Both sub and cmp are flag affecting instructions and are macro-fusible. 
> By doing a loop rotation i.e. moving the length <= 0 check outside the loop and pushing the loop exit check at bottom you can save additional compare checks.

As noted above, this is the (`UNLIKELY`) fallback of a switch statement. I can still add alignment and loop rotation, but since it is only a fallback, I figured it was more important to keep it small and readable.
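
For reference, the loop rotation being suggested can be sketched in C++ as follows. This is a simplified, hypothetical copy loop rather than the generated stub: the top-tested form checks the length on every iteration and jumps back unconditionally, while the rotated form checks once up front and folds the exit test into the decrement at the bottom.

```cpp
#include <cstdint>

// Top-tested form: compare and conditional exit at the top,
// unconditional jump back at the bottom.
inline void copy_top_tested(const uint64_t* a, uint64_t* b, long length) {
  while (length > 0) {
    *b++ = *a++;
    --length;
  }
}

// Rotated form: the length <= 0 check is hoisted out of the loop and the
// exit test sits at the bottom, so each iteration has a single
// decrement-and-branch instead of a compare plus two jumps.
inline void copy_rotated(const uint64_t* a, uint64_t* b, long length) {
  if (length <= 0) {
    return;
  }
  do {
    *b++ = *a++;
  } while (--length > 0);
}
```

Both forms copy the same elements; the rotated one trades a few extra lines for one fewer branch per iteration, which is the readability trade-off mentioned above.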

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486768
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486717
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486848

