RFR: 8341527: AVX-512 intrinsic for SHA3 [v8]

Tue Oct 29 15:59:22 UTC 2024

On Fri, 25 Oct 2024 14:16:40 GMT, Ferenc Rakoczi <duke at openjdk.org> wrote:

>> There is already an intrinsic for SHA-3 on aarch64, which gives a significant speed improvement on that architecture, so this pull request brings a similar improvement to the x64 family of systems that have the AVX-512 extension. Rudimentary measurements show that a 30-40% speed improvement can be achieved.
>
> Ferenc Rakoczi has updated the pull request incrementally with one additional commit since the last revision:
> 
>   added comments

I think this broke the x86 assembler. I'm getting multiple build failures that look like the following:

/home/runner/work/jdk/jdk/src/hotspot/cpu/x86/assembler_x86.cpp:3646:6: error: redefinition of ‘void Assembler::evmovdquw(XMMRegister, KRegister, XMMRegister, bool, int)’
 3646 | void Assembler::evmovdquw(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) {
      |      ^~~~~~~~~
/home/runner/work/jdk/jdk/src/hotspot/cpu/x86/assembler_x86.cpp:3593:6: note: ‘void Assembler::evmovdquw(XMMRegister, KRegister, XMMRegister, bool, int)’ previously defined here
 3593 | void Assembler::evmovdquw(XMMRegister dst, KRegister mask, XMMRegister src, bool merge, int vector_len) {
      |      ^~~~~~~~~

src/hotspot/cpu/x86/stubGenerator_x86_64_sha3.cpp line 54:

> 52:     };
> 53: 
> 54: ATTRIBUTE_ALIGNED(64) static const uint64_t permsAndRots[] = {

It's too late for this now, but this should have used alignas directly instead of ATTRIBUTE_ALIGNED. No worries though, since the macro ultimately expands to alignas.
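
To illustrate the equivalence, here is a minimal standalone sketch. EXAMPLE_ALIGNED is a hypothetical stand-in for the macro (assumed to expand to alignas, as described above), and the table values and function names are placeholders, not the real permsAndRots contents or anything from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a macro like ATTRIBUTE_ALIGNED; assumed here
// to expand to the standard C++11 alignas specifier.
#define EXAMPLE_ALIGNED(x) alignas(x)

// Placeholder tables, each eight 64-bit words (64 bytes, one 512-bit vector).
EXAMPLE_ALIGNED(64) static const std::uint64_t viaMacro[] = { 0, 1, 2, 3, 4, 5, 6, 7 };
alignas(64) static const std::uint64_t viaAlignas[] = { 0, 1, 2, 3, 4, 5, 6, 7 };

// Returns true when p sits on a 64-byte boundary.
static bool isAligned64(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 64 == 0;
}

// Both spellings should yield the same 64-byte alignment.
static void checkExampleTables() {
    assert(isAligned64(viaMacro));
    assert(isAligned64(viaAlignas));
}
```

Under that assumption, both declarations should end up with identical alignment, so the difference is purely one of style.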

-------------

PR Comment: https://git.openjdk.org/jdk/pull/21352#issuecomment-2444688417
PR Review Comment: https://git.openjdk.org/jdk/pull/21352#discussion_r1821116001