RFR: 8341527: AVX-512 intrinsic for SHA3 [v4]

Ferenc Rakoczi duke at openjdk.org
Tue Oct 15 15:54:15 UTC 2024


On Thu, 10 Oct 2024 16:08:01 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 148:
>> 
>>> 146:   __ addl(rax, 8);
>>> 147:   __ kmovbl(k4, rax);
>>> 148:   __ addl(rax, 16);
>> 
>> Since you need k5 soonest, you could save a few cycles by removing the dependency chain through rax and loading each immediate directly.
>> 
>> (If you really want to get clever, you could use a loop:
>> 
>>   ```
>>   KRegister masks[] = {k1, k2, k3, k4, k5};
>>   // masks[i] receives the low (i + 1) bits: 0x01, 0x03, 0x07, 0x0f, 0x1f
>>   for (int i = 0; i < 5; i++) {
>>     __ mov64(rax, (2 << i) - 1);
>>     __ kmovbl(masks[i], rax);
>>   }
>>   ```
>>   Highly debatable if it's actually any more readable, so up to you.)
>
> Another alternative that is closer to the structure of your code (and uses smaller instructions):
> 
> - Start from the end, with `k5`, and load the `0x1f` constant
> - Shift the constant right by one bit and load it into the next KRegister
> - (This could still be done with a loop, but you decide what you find more readable.)
> 
> This way k5 is available immediately for the `evmovdquq`.

Changed to start loading the mask registers from k5.
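
As a rough illustration of the ordering discussed above, here is a standalone C++ model of the constants involved. It is only a sketch, not the stub generator code from the PR: `mask_values_from_k5` is a hypothetical helper, and the mask values (0x1f for k5 down to 0x01 for k1) are assumed from the review comments.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Standalone model, not HotSpot code: computes the constants that would be
// moved into k5, k4, ..., k1 when loading starts from k5.
// Index 0 of the result corresponds to k5, index 4 to k1 (assumed mapping).
std::array<uint32_t, 5> mask_values_from_k5() {
  std::array<uint32_t, 5> values{};
  uint32_t constant = 0x1f;   // k5: five low bits set
  for (auto& v : values) {
    v = constant;             // models moving the constant into the next k register
    constant >>= 1;           // the next register gets one bit fewer
  }
  assert(constant == 0);      // all five bits have been shifted out
  return values;
}
```

Because k5 takes the first constant, it is ready before any of the other mask registers have been written.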

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1801483636

