RFR: 8341527: AVX-512 intrinsic for SHA3 [v6]

Tue Oct 22 11:44:16 UTC 2024

On Mon, 21 Oct 2024 22:22:17 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:

>> Ferenc Rakoczi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>> 
>>  - fix mismerge
>>  - Merge master
>>  - accepting review suggestions from Volodymyr and Vladimir
>>  - Merge branch 'master' into sha3-avx512-intrinsic
>>  - fix windows build
>>  - fix debug build
>>  - 8341527: AVX-512 intrinsic for SHA3
>
> src/hotspot/cpu/x86/assembler_x86.cpp line 8547:
> 
>> 8545:         (vector_len == AVX_256bit ? VM_Version::supports_avx2() : VM_Version::supports_evex()), "");
>> 8546:   // TODO check what legacy_mode needs to be set to
>> 8547:   InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
> 
> If you notice W needs to be set to 1 for evex and is ignored for avx encoding.  To be consistent with other similar instruction definition (e.g. addsd), here vex_w could be set as VM_Version::supports_evex(). Also the attributes definition need to be followed by attributes.set_rex_vex_w_reverted(). So this should look like as below:
> 
> 
> InstructionAttr attributes(vector_len, /* vex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
> attributes.set_rex_vex_w_reverted();

Changed as suggested.

> src/hotspot/cpu/x86/assembler_x86.cpp line 16228:
> 
>> 16226: 
>> 16227: void Assembler::evpermt2d(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
>> 16228:   assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");
> 
> The evpermt2d instruction is foundational instruction and doesn't need avx512bw() so the assert could be changed to:
> assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");

Changed as suggested.

> src/hotspot/cpu/x86/assembler_x86.cpp line 16236:
> 
>> 16234: 
>> 16235: void Assembler::evpermt2q(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
>> 16236:   assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");
> 
> The evpermt2q instruction is foundational instruction and doesn't need avx512bw() so the assert could be changed to:
> assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");

Changed as suggested.

> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 138:
> 
>> 136: 
>> 137:   // set up the masks
>> 138:   __ mov64(rax, 0x1F);
> 
> This could just be a movl or movd.

Changed as suggested.

> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 147:
> 
>> 145:   __ kmovwl(k2, rax);
>> 146:   __ shrl(rax, 1);
>> 147:   __ kmovwl(k1, rax);
> 
> The same could be achieved by:
> __ kshiftrwl(k4, k5, 1);
> __ kshiftrwl(k3, k5, 2);
> __ kshiftrwl(k2, k5, 3);
> __ kshiftrwl(k1, k5, 4);

Changed as suggested.

> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 288:
> 
>> 286:     __ movq(rax, ofs); // return ofs
>> 287:   } else {
>> 288:     __ mov64(rax, 0);
> 
> This could be  xorq(rax, rax).

Changed as suggested.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810562847
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810562945
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563063
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563149
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563236
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563309