RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v16]

Volodymyr Paprotski duke at openjdk.org
Wed Nov 16 21:34:22 UTC 2022


On Tue, 15 Nov 2022 19:38:56 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:

>>> On other hand, there are functions like poly1305_multiply8_avx512 and poly1305_process_blocks_avx512 that use a lot of temp registers. I think it makes sense to keep those as 'function-header declarations'.
>> 
>> I agree with you on `poly1305_process_blocks_avx512`, but `poly1305_multiply8_avx512` already takes 8 arguments. Putting 8 more arguments for temps doesn't look prohibitive. 
>> 
>>> I think it makes sense to keep those as 'function-header declarations'.
>> 
>> IMO it's not enough. Ideally, if there are any implicit usages, those should be clearly spelled out at every call site.
>
> Changed just the three `*limbs*` functions.

Lifted pretty much all of the temp register declarations up into just `poly1305_process_blocks_avx512` and `generate_poly1305_processBlocks` (i.e. two register maps).

It took some time to make it 'reasonable' again, but I think it makes sense. (Then again, the true test will be whether it still makes sense to me a month from now, or whether it makes sense to others.)

I had to clean up the names; the 'local' names could all have been a play on `tmp`, but the register reuse is much clearer from the 'global' names.
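To illustrate the shape of the change for anyone skimming the thread, here is a rough, self-contained sketch. It is not the code from the PR: `Reg`, `Emitter`, `multiply_add`, `process_blocks` and the register names `A0`/`R0`/`M0`/`T0` are all hypothetical stand-ins. The idea it shows is one register map owned by the top-level routine, with every temp passed explicitly so each call site spells out what gets clobbered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an assembler register (not a HotSpot type).
struct Reg {
    int id;
};

// Hypothetical stand-in for an assembler: it only records what was "emitted".
struct Emitter {
    std::vector<std::string> log;

    void mul(Reg dst, Reg a, Reg b) {
        log.push_back("mul r" + std::to_string(dst.id) + ", r" +
                      std::to_string(a.id) + ", r" + std::to_string(b.id));
    }

    void add(Reg dst, Reg a, Reg b) {
        log.push_back("add r" + std::to_string(dst.id) + ", r" +
                      std::to_string(a.id) + ", r" + std::to_string(b.id));
    }
};

// Helper that needs one scratch register. The temp is a parameter rather
// than a hidden local choice, so the caller can see which register is clobbered.
void multiply_add(Emitter& e, Reg acc, Reg x, Reg y, Reg tmp) {
    // The scratch register must not alias any live operand.
    assert(tmp.id != acc.id && tmp.id != x.id && tmp.id != y.id);
    e.mul(tmp, x, y);      // tmp = x * y
    e.add(acc, acc, tmp);  // acc = acc + tmp
}

// Top-level routine: the only place where registers are given names.
void process_blocks(Emitter& e) {
    // The single "register map" (names are made up for this sketch).
    const Reg A0{0};  // accumulator
    const Reg R0{1};  // key limb
    const Reg M0{2};  // message limb
    const Reg T0{3};  // scratch, reused by both calls below

    multiply_add(e, A0, R0, M0, T0);
    multiply_add(e, A0, M0, M0, T0);  // reuse of T0 is visible right here
}
```

With 'global' names like these, the reuse of `T0` across the two calls is visible at the call sites, which is harder to see when each helper picks its own `tmp` internally.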

-------------

PR: https://git.openjdk.org/jdk/pull/10582


