RFR: 8341527: AVX-512 intrinsic for SHA3 [v6]
Ferenc Rakoczi
duke at openjdk.org
Tue Oct 22 11:44:16 UTC 2024
On Mon, 21 Oct 2024 22:22:17 GMT, Sandhya Viswanathan <sviswanathan at openjdk.org> wrote:
>> Ferenc Rakoczi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>>
>> - fix mismerge
>> - Merge master
>> - accepting review suggestions from Volodymyr and Vladimir
>> - Merge branch 'master' into sha3-avx512-intrinsic
>> - fix windows build
>> - fix debug build
>> - 8341527: AVX-512 intrinsic for SHA3
>
> src/hotspot/cpu/x86/assembler_x86.cpp line 8547:
>
>> 8545: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : VM_Version::supports_evex()), "");
>> 8546: // TODO check what legacy_mode needs to be set to
>> 8547: InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
>
> If you notice W needs to be set to 1 for evex and is ignored for avx encoding. To be consistent with other similar instruction definition (e.g. addsd), here vex_w could be set as VM_Version::supports_evex(). Also the attributes definition need to be followed by attributes.set_rex_vex_w_reverted(). So this should look like as below:
>
>
> InstructionAttr attributes(vector_len, /* vex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
> attributes.set_rex_vex_w_reverted();
Changed as suggested.
> src/hotspot/cpu/x86/assembler_x86.cpp line 16228:
>
>> 16226:
>> 16227: void Assembler::evpermt2d(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
>> 16228: assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");
>
> The evpermt2d instruction is foundational instruction and doesn't need avx512bw() so the assert could be changed to:
> assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");
Changed as suggested.
> src/hotspot/cpu/x86/assembler_x86.cpp line 16236:
>
>> 16234:
>> 16235: void Assembler::evpermt2q(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
>> 16236: assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");
>
> The evpermt2q instruction is foundational instruction and doesn't need avx512bw() so the assert could be changed to:
> assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");
Changed as suggested.
> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 138:
>
>> 136:
>> 137: // set up the masks
>> 138: __ mov64(rax, 0x1F);
>
> This could just be a movl or movd.
Changed as suggested.
> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 147:
>
>> 145: __ kmovwl(k2, rax);
>> 146: __ shrl(rax, 1);
>> 147: __ kmovwl(k1, rax);
>
> The same could be achieved by:
> __ kshiftrwl(k4, k5, 1);
> __ kshiftrwl(k3, k5, 2);
> __ kshiftrwl(k2, k5, 3);
> __ kshiftrwl(k1, k5, 4);
Changed as suggested.
> src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 288:
>
>> 286: __ movq(rax, ofs); // return ofs
>> 287: } else {
>> 288: __ mov64(rax, 0);
>
> This could be xorq(rax, rax).
Changed as suggested.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810562847
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810562945
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563063
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563149
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563236
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1810563309
More information about the hotspot-dev
mailing list