RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v2]
Volodymyr Paprotski
duke at openjdk.org
Mon Apr 15 22:12:31 UTC 2024
On Fri, 5 Apr 2024 07:19:28 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> Volodymyr Paprotski has updated the pull request incrementally with one additional commit since the last revision:
>>
>> remove use of jdk.crypto.ec
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 39:
>
>> 37: };
>> 38: static address modulus_p256() {
>> 39: return (address)MODULUS_P256;
>
> Long constants should have UL suffix.
Properly ULL, but good point, fixed
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 386:
>
>> 384: __ jcc(Assembler::equal, L_Length19);
>> 385:
>> 386: // Default copy loop
>
> Please add appropriate loop entry alignment.
This is actually a 'switch statement default'. The default should never happen (See "Known Length comment on line 335"), but added because java code has that behavior. (i.e. in the unlikely case NIST adds a new elliptic curve to the existing standard?)
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly_mont.cpp line 394:
>
>> 392: __ lea(aLimbs, Address(aLimbs,8));
>> 393: __ lea(bLimbs, Address(bLimbs,8));
>> 394: __ jmp(L_DefaultLoop);
>
> Both sub and cmp are flag affecting instructions and are macro-fusible.
> By doing a loop rotation i.e. moving the length <= 0 check outside the loop and pushing the loop exit check at bottom you can save additional compare checks.
Per-above, this is a switch statement (`UNLIKELY`) fallback. I can still add alignment and loop rotation, but being a fallback figured its more important to keep it small&readable...
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486768
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486717
PR Review Comment: https://git.openjdk.org/jdk/pull/18583#discussion_r1566486848
More information about the core-libs-dev
mailing list