RFR: 8341527: AVX-512 intrinsic for SHA3 [v6]
Sandhya Viswanathan
sviswanathan at openjdk.org
Mon Oct 21 23:00:17 UTC 2024
On Mon, 21 Oct 2024 19:46:41 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:
>> There is already an intrinsic for SHA-3 for aarch64, which gives a significant speed improvement on that architecture, so this pull request brings a similar improvement to the x64 family of systems that have the AVX-512 extension. Rudimentary measurements show that a 30-40% speed improvement can be achieved.
>
> Ferenc Rakoczi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
>
> - fix mismerge
> - Merge master
> - accepting review suggestions from Volodymyr and Vladimir
> - Merge branch 'master' into sha3-avx512-intrinsic
> - fix windows build
> - fix debug build
> - 8341527: AVX-512 intrinsic for SHA3
src/hotspot/cpu/x86/assembler_x86.cpp line 8547:
> 8545: (vector_len == AVX_256bit ? VM_Version::supports_avx2() : VM_Version::supports_evex()), "");
> 8546: // TODO check what legacy_mode needs to be set to
> 8547: InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
Note that W needs to be set to 1 for the EVEX encoding and is ignored for the AVX encoding. To be consistent with other similar instruction definitions (e.g. addsd), vex_w could be set to VM_Version::supports_evex() here. The attributes definition also needs to be followed by attributes.set_rex_vex_w_reverted(). So this should look like the following:
InstructionAttr attributes(vector_len, /* vex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
src/hotspot/cpu/x86/assembler_x86.cpp line 16228:
> 16226:
> 16227: void Assembler::evpermt2d(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
> 16228: assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");
The evpermt2d instruction is a foundation (AVX512F) instruction and doesn't need avx512bw(), so the assert could be changed to:
assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");
src/hotspot/cpu/x86/assembler_x86.cpp line 16236:
> 16234:
> 16235: void Assembler::evpermt2q(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
> 16236: assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");
The evpermt2q instruction is a foundation (AVX512F) instruction and doesn't need avx512bw(), so the assert could be changed to:
assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");
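To illustrate why the two asserts differ, here is a minimal standalone sketch of both predicates. It is not HotSpot code: CpuFeatures, original_check and suggested_check are hypothetical names, the enum values are stand-ins for the Assembler vector-length constants, and the sketch assumes that supports_evex() corresponds to AVX512F being available and that supports_avx512vlbw() means EVEX, VL and BW are all present.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Assembler vector-length constants.
enum VectorLen { AVX_128bit = 0, AVX_256bit = 1, AVX_512bit = 2 };

// Hypothetical, simplified model of the CPU feature queries.
struct CpuFeatures {
  bool avx512f;   // assumed to be what supports_evex() reports
  bool avx512vl;  // vector-length extension (128/256-bit EVEX forms)
  bool avx512bw;  // byte/word extension
};

// Check as written in the patch under review: requires BW at every length.
inline bool original_check(const CpuFeatures& cpu, int vector_len) {
  bool vlbw = cpu.avx512f && cpu.avx512vl && cpu.avx512bw;
  bool bw   = cpu.avx512f && cpu.avx512bw;
  return vector_len <= AVX_256bit ? vlbw : bw;
}

// Check suggested in the review: EVEX always, VL only below 512 bits.
inline bool suggested_check(const CpuFeatures& cpu, int vector_len) {
  return cpu.avx512f && (vector_len == AVX_512bit || cpu.avx512vl);
}
```

Under these assumptions, a CPU with AVX512F and AVX512VL but without AVX512BW fails original_check at every vector length, while suggested_check accepts it. A CPU with only AVX512F passes suggested_check for AVX_512bit and is rejected for the shorter lengths, which is the intended behavior for a foundation instruction.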
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1809606398
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1809618137
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1809619156