RFR: 8325991: Accelerate Poly1305 on x86_64 using AVX2 instructions [v13]
Tom Shull
duke at openjdk.org
Tue Mar 12 16:02:23 UTC 2024
On Fri, 8 Mar 2024 18:56:07 GMT, Srinivas Vamsi Parasa <duke at openjdk.org> wrote:
>> The goal of this PR is to accelerate the Poly1305 algorithm using AVX2 instructions (including IFMA) for x86_64 CPUs.
>>
>> This implementation is directly based on the AVX2 Poly1305 hash computation as implemented in Intel(R) Multi-Buffer Crypto for IPsec Library (url: https://github.com/intel/intel-ipsec-mb/blob/main/lib/avx2_t3/poly_fma_avx2.asm)
>>
>> This PR shows up to a 19x speedup on buffer sizes of 1MB.
>
> Srinivas Vamsi Parasa has updated the pull request incrementally with one additional commit since the last revision:
>
> make vpmadd52l/hq generic
src/hotspot/cpu/x86/vm_version_x86.cpp line 312:
> 310: __ lea(rsi, Address(rbp, in_bytes(VM_Version::sef_cpuid7_ecx1_offset())));
> 311: __ movl(Address(rsi, 0), rax);
> 312: __ movl(Address(rsi, 4), rbx);
Hi @vamsi-parasa. I believe this code has a bug in it. Here you are copying back all four registers; however, in https://github.com/openjdk/jdk/blob/782206bc97dc6ae953b0c3ce01f8b6edab4ad30b/src/hotspot/cpu/x86/vm_version_x86.hpp#L468 you only created one field, so every store after the first writes past that field.
Can you please open up a JBS issue to fix this?
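
To illustrate the concern, here is a minimal standalone C++ sketch. It is not HotSpot code: the struct, its field names, and the helper `store_four_registers` are hypothetical stand-ins, and the stores at offsets 8 and 12 are assumed from "all four registers" rather than taken from the quoted lines. It only shows what happens when four 32-bit values are stored starting at the address of a single 32-bit field.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the CPUID info block; names are illustrative,
// not the actual fields in vm_version_x86.hpp.
struct CpuidInfoSketch {
    uint32_t sef_cpuid7_ecx1_eax;  // the only field reserved for this leaf
    uint32_t neighbour_a;          // unrelated data that happens to follow
    uint32_t neighbour_b;
    uint32_t neighbour_c;
};
static_assert(sizeof(CpuidInfoSketch) == 16, "sketch assumes no padding");

// Mimics the stub's stores: four 32-bit registers written at byte offsets
// 0, 4, 8 and 12 from the address of the single field (8 and 12 are assumed).
inline CpuidInfoSketch store_four_registers(CpuidInfoSketch info,
                                            uint32_t eax, uint32_t ebx,
                                            uint32_t ecx, uint32_t edx) {
    unsigned char bytes[sizeof(CpuidInfoSketch)];
    std::memcpy(bytes, &info, sizeof(info));

    const size_t base = offsetof(CpuidInfoSketch, sef_cpuid7_ecx1_eax);
    const uint32_t regs[4] = {eax, ebx, ecx, edx};
    for (size_t i = 0; i < 4; ++i) {
        // Only i == 0 stays inside the declared field; the rest overwrite neighbours.
        std::memcpy(bytes + base + 4 * i, &regs[i], sizeof(uint32_t));
    }

    std::memcpy(&info, bytes, sizeof(info));
    return info;
}
```

In this sketch the three neighbouring fields end up holding the ebx, ecx and edx values. In the real structure, whatever follows the single field would presumably be overwritten the same way, unless separate fields are declared for each register.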
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17881#discussion_r1521736242