RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v16]

Volodymyr Paprotski duke at openjdk.org
Wed Nov 16 21:12:26 UTC 2022


On Tue, 15 Nov 2022 19:30:23 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> Volodymyr Paprotski has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits:
>> 
>>  - Merge remote-tracking branch 'origin/master' into avx512-poly
>>  - Vladimir's review
>>  - live review with Sandhya
>>  - jcheck
>>  - Sandhya's review
>>  - fix windows and 32b linux builds
>>  - add getLimbs to interface and reviews
>>  - fix 32-bit build
>>  - make UsePolyIntrinsics option diagnostic
>>  - Merge remote-tracking branch 'origin/master' into avx512-poly
>>  - ... and 13 more: https://git.openjdk.org/jdk/compare/e269dc03...a26ac7db
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_poly.cpp line 370:
> 
>> 368:   // Middle 44-bit limbs of new blocks
>> 369:   __ vpsrlq(L1, L0, 44, Assembler::AVX_512bit);
>> 370:   __ vpsllq(TMP2, TMP1, 20, Assembler::AVX_512bit);
> 
> Any particular reason to use `TMP2` here? Can you just update `TMP1` instead (w/ `vpsllq(TMP1, TMP1, 20, Assembler::AVX_512bit);`)?

Thanks for the catch. Removed `TMP2`. (Several refactors ago, `D[01]` and `L[0-2]` shared the same registers because I was running out. I likely forgot to clean up after I removed 2/3 of the optimizations and redid the register allocation.)

done
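
For readers following the thread, here is a scalar sketch of the limb split that the quoted lines appear to perform in each 64-bit lane. It assumes that `L0` holds the low 64 bits and `TMP1` the high 64 bits of a 16-byte block; that is a reading of the "Middle 44-bit limbs" comment, not something the quoted snippet confirms. The names `Limbs44`, `split_limbs` and `check_roundtrip` are hypothetical and do not come from the stub generator. The point is only that the left shift can overwrite its own input, so no second temporary is needed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one lane; not the HotSpot stub code.
struct Limbs44 {
    uint64_t l0;  // bits 0..43
    uint64_t l1;  // bits 44..87
    uint64_t l2;  // bits 88..127
};

constexpr uint64_t MASK44 = (1ULL << 44) - 1;

// lo = bits 0..63 and hi = bits 64..127 of a 16-byte block (assumed layout).
inline Limbs44 split_limbs(uint64_t lo, uint64_t hi) {
    Limbs44 r;
    r.l0 = lo & MASK44;

    // Middle limb: the top 20 bits of lo joined with the low 24 bits of hi.
    uint64_t tmp = hi;
    tmp = tmp << 20;  // updated in place, mirroring vpsllq(TMP1, TMP1, 20, ...)
    r.l1 = ((lo >> 44) | tmp) & MASK44;

    // High limb: the remaining 40 bits of hi (padding bit not modeled here).
    r.l2 = hi >> 24;
    return r;
}

// Sanity check: the three limbs reassemble into the original 128 bits.
inline void check_roundtrip(uint64_t lo, uint64_t hi) {
    Limbs44 r = split_limbs(lo, hi);
    assert((r.l0 | (r.l1 << 44)) == lo);
    assert(((r.l1 >> 20) | (r.l2 << 24)) == hi);
}
```

The in-place update is safe in this sketch because the unshifted high word is read into `r.l2` from `hi`, not from `tmp`. Whether the real stub still needs the unshifted `TMP1` afterwards depends on code outside the quoted lines.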

-------------

PR: https://git.openjdk.org/jdk/pull/10582


