RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v16]
Volodymyr Paprotski
duke at openjdk.org
Wed Nov 16 21:12:26 UTC 2022
On Tue, 15 Nov 2022 19:30:23 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:
>> Volodymyr Paprotski has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits:
>>
>> - Merge remote-tracking branch 'origin/master' into avx512-poly
>> - Vladimir's review
>> - live review with Sandhya
>> - jcheck
>> - Sandhya's review
>> - fix windows and 32b linux builds
>> - add getLimbs to interface and reviews
>> - fix 32-bit build
>> - make UsePolyIntrinsics option diagnostic
>> - Merge remote-tracking branch 'origin/master' into avx512-poly
>> - ... and 13 more: https://git.openjdk.org/jdk/compare/e269dc03...a26ac7db
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 370:
>
>> 368: // Middle 44-bit limbs of new blocks
>> 369: __ vpsrlq(L1, L0, 44, Assembler::AVX_512bit);
>> 370: __ vpsllq(TMP2, TMP1, 20, Assembler::AVX_512bit);
>
> Any particular reason to use `TMP2` here? Can you just update `TMP1` instead (w/ `vpsllq(TMP1, TMP1, 20, Assembler::AVX_512bit);`)?
Thanks for the catch. Removed `TMP2`. (Several refactors ago, `D[01]` and `L[0-2]` shared the same registers because I was running out of them. I likely forgot to clean up after I removed two-thirds of the optimizations and redid the register allocation.)
done
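
For readers following along, here is a rough scalar sketch of what the two quoted shifts compute in a single 64-bit lane. It is not the stub code: `Limbs44`, `kMask44` and `split_block` are hypothetical names, and it assumes `L0` holds the low 64 bits and `TMP1` the high 64 bits of a full 16-byte block, which the quoted snippet suggests but does not show.

```cpp
#include <cstdint>

// Hypothetical scalar model of one 64-bit lane; all names are illustrative only.
struct Limbs44 {
    std::uint64_t l0;  // bits 0..43 of the block
    std::uint64_t l1;  // bits 44..87 of the block (the "middle" limb)
    std::uint64_t l2;  // bits 88..127, plus the 2^128 padding bit of a full block
};

constexpr std::uint64_t kMask44 = (std::uint64_t{1} << 44) - 1;

// lo and hi are the low and high 64-bit halves of a 16-byte block (assumed layout).
constexpr Limbs44 split_block(std::uint64_t lo, std::uint64_t hi) {
    std::uint64_t l0 = lo & kMask44;
    // Take the high limb first, because tmp1 is overwritten below.
    std::uint64_t l2 = (hi >> 24) | (std::uint64_t{1} << 40);
    std::uint64_t l1 = lo >> 44;  // analogous to vpsrlq(L1, L0, 44)
    std::uint64_t tmp1 = hi;
    tmp1 = tmp1 << 20;            // analogous to vpsllq(TMP1, TMP1, 20): updated in place
    l1 = (l1 | tmp1) & kMask44;   // merge both parts and keep 44 bits
    return Limbs44{l0, l1, l2};
}
```

In this sketch the in-place shift is safe only because nothing reads the original high half afterwards. Whether the same holds in the stub depends on the surrounding instructions, which are not quoted here.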
-------------
PR: https://git.openjdk.org/jdk/pull/10582