RFR: 8341527: AVX-512 intrinsic for SHA3 [v6]

Sandhya Viswanathan sviswanathan at openjdk.org
Mon Oct 21 23:00:17 UTC 2024


On Mon, 21 Oct 2024 19:46:41 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> There is already an intrinsic for SHA-3 for aarch64, which gives a significant speed improvement on that architecture, so this pull request brings a similar improvement to the x64 family of systems that have the AVX-512 extension. Rudimentary measurements show that a 30-40% speed improvement can be achieved.
>
> Ferenc Rakoczi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:
> 
>  - fix mismerge
>  - Merge master
>  - accepting review suggestions from Volodymyr and Vladimir
>  - Merge branch 'master' into sha3-avx512-intrinsic
>  - fix windows build
>  - fix debug build
>  - 8341527: AVX-512 intrinsic for SHA3

src/hotspot/cpu/x86/assembler_x86.cpp line 8547:

> 8545:         (vector_len == AVX_256bit ? VM_Version::supports_avx2() : VM_Version::supports_evex()), "");
> 8546:   // TODO check what legacy_mode needs to be set to
> 8547:   InstructionAttr attributes(vector_len, /* vex_w */ true, /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);

Note that W needs to be set to 1 for the EVEX encoding and is ignored for the AVX (VEX) encoding. To be consistent with other similar instruction definitions (e.g. addsd), vex_w could be set to VM_Version::supports_evex() here. The attributes definition also needs to be followed by attributes.set_rex_vex_w_reverted(). So this should look like the following:


InstructionAttr attributes(vector_len, /* vex_w */ VM_Version::supports_evex(), /* legacy_mode */ false, /* no_mask_reg */ true, /* uses_vl */ true);
attributes.set_rex_vex_w_reverted();
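
To illustrate the intent, here is a toy model and not HotSpot code. ToyAttr, Encoding, make_attr and emitted_w are hypothetical names, and the fallback rule (W is cleared when a "reverted" instruction ends up VEX-encoded) is my assumption about what the flag achieves:

```cpp
// Toy model of the W-bit choice; none of these types exist in HotSpot.
enum class Encoding { VEX, EVEX };

struct ToyAttr {
  bool vex_w;       // W as requested when the attributes are built
  bool w_reverted;  // stand-in for set_rex_vex_w_reverted()
};

// Build attributes the way the suggestion does: W follows EVEX support,
// and the "reverted" flag is always set.
ToyAttr make_attr(bool supports_evex) {
  ToyAttr a{supports_evex, true};
  return a;
}

// W bit that would be emitted under this model: a reverted instruction
// that is VEX-encoded drops W, otherwise the requested W is used.
bool emitted_w(const ToyAttr& a, Encoding enc) {
  if (enc == Encoding::VEX && a.w_reverted) {
    return false;
  }
  return a.vex_w;
}
```

Under this model W is 1 only when the instruction is actually EVEX-encoded, which matches the requirement stated above.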

src/hotspot/cpu/x86/assembler_x86.cpp line 16228:

> 16226: 
> 16227: void Assembler::evpermt2d(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
> 16228:   assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");

The evpermt2d instruction is a foundational (AVX512F) instruction and doesn't need avx512bw(), so the assert could be changed to:
assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");

src/hotspot/cpu/x86/assembler_x86.cpp line 16236:

> 16234: 
> 16235: void Assembler::evpermt2q(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) {
> 16236:   assert(vector_len <= AVX_256bit ? VM_Version::supports_avx512vlbw() : VM_Version::supports_avx512bw(), "");

The evpermt2q instruction is a foundational (AVX512F) instruction and doesn't need avx512bw(), so the assert could be changed to:
assert(VM_Version::supports_evex() && (vector_len == Assembler::AVX_512bit || VM_Version::supports_avx512vl()), "");
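
The difference between the two conditions, for both evpermt2d and evpermt2q, can be sketched with a standalone model. CpuFeatures, old_check and suggested_check are hypothetical names, the boolean fields stand in for the VM_Version queries, and I am assuming supports_avx512vlbw() means EVEX plus AVX512BW plus AVX512VL:

```cpp
// Standalone model of the two assert conditions; not HotSpot code.
// The enum values mirror the names used in the assert and are assumed here.
enum AvxVectorLen { AVX_128bit = 0, AVX_256bit = 1, AVX_512bit = 2 };

struct CpuFeatures {
  bool evex;      // stand-in for VM_Version::supports_evex()
  bool avx512vl;  // stand-in for VM_Version::supports_avx512vl()
  bool avx512bw;  // stand-in for VM_Version::supports_avx512bw()
};

// Condition currently in the patch: BW is required at every vector length.
bool old_check(const CpuFeatures& cpu, int vector_len) {
  bool vlbw = cpu.evex && cpu.avx512bw && cpu.avx512vl;  // assumed meaning
  return vector_len <= AVX_256bit ? vlbw : cpu.avx512bw;
}

// Suggested condition: EVEX always, VL only for 128-bit and 256-bit vectors.
bool suggested_check(const CpuFeatures& cpu, int vector_len) {
  return cpu.evex && (vector_len == AVX_512bit || cpu.avx512vl);
}
```

For a CPU modelled as {evex = true, avx512vl = false, avx512bw = false}, old_check rejects the 512-bit form while suggested_check accepts it; the 256-bit form is rejected by both because VL is missing.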

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1809606398
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1809618137
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1809619156

