RFR: 8341527: AVX-512 intrinsic for SHA3 [v4]
Ferenc Rakoczi
duke at openjdk.org
Tue Oct 15 15:54:15 UTC 2024
On Thu, 10 Oct 2024 16:08:01 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 148:
>>
>>> 146: __ addl(rax, 8);
>>> 147: __ kmovbl(k4, rax);
>>> 148: __ addl(rax, 16);
>>
>> Since you need k5 soonest, you could save a few cycles by removing the dependency chain on rax and loading each immediate directly.
>>
>> (If you really want to get clever,
>>
>> ```
>> KRegister masks[] = {k1, k2, k3, k4, k5};
>> for (int i = 0; i < 5; i++) {
>>   __ mov64(rax, (1L << (i + 1)) - 1); // 0x01, 0x03, 0x07, 0x0f, 0x1f
>>   __ kmovbl(masks[i], rax);
>> }
>> ```
>> Highly debatable if it's actually any more readable, so up to you.)
>
> Another alternative that is closer to the structure of your code (and uses smaller instructions):
>
> - Start from the end: load the constant `0x1f` into `k5`
> - Shift the constant right by one bit and load it into the next KRegister
> - (This could still be done with a loop, but you decide what you find more readable.)
>
> This way k5 is available immediately for the `evmovdquq`.
Changed to start loading the mask registers from k5.
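For illustration only, here is a minimal standalone C++ sketch of the constants that ordering produces. It assumes k1 through k5 hold 0x01 through 0x1f, as the loop suggested above implies. It does not use the HotSpot assembler, and `mask_values_from_k5` is a hypothetical helper, not code from the PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR): returns the mask constants in the
// order they would be loaded, starting with k5 (0x1f) and ending with k1 (0x01).
std::array<std::uint8_t, 5> mask_values_from_k5() {
    std::array<std::uint8_t, 5> values{};
    std::uint8_t mask = 0x1f;  // k5: low five bits set
    for (std::size_t n = 0; n < values.size(); ++n) {
        values[n] = mask;      // n = 0 corresponds to k5, n = 4 to k1
        // Shift right by one bit to get the constant for the next register.
        mask = static_cast<std::uint8_t>(mask >> 1);
    }
    return values;
}
```

The first value produced is the one for k5, so the widest mask is ready before any of the others are computed.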
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1801483636