RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v6]

vpaprotsk duke at openjdk.org
Wed Nov 2 02:38:26 UTC 2022


On Tue, 1 Nov 2022 23:49:17 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2002:
>> 
>>> 2000: }
>>> 2001: 
>>> 2002: address StubGenerator::generate_poly1305_masksCP() {
>> 
>> I suggest to turn it into a C++ literal constant and move the declaration next to `poly1305_process_blocks_avx512` where they are used. As an example, here's how it is handled in GHASH stubs:
>>   https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/stubGenerator_x86_64_ghash.cpp#L35
>> 
>> That would allow to avoid to simplify the code a bit (no need in `StubRoutines::x86::_poly1305_mask_addr`/`poly1305_mask_addr()` and no need to generate the constants during VM startup).
>> 
>> You could split it into 3 constants, but then using a single base register (`polyCP`) won't work anymore.
>> Thinking more about it, I'm not sure why you can't just do the split and use address literals instead  to access individual constants (and repurpose `r13` to be used as a scratch register when RIP-relative addressing mode doesn't work).
>
> The case of AES stubs may be even a better fit here:
> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp#L47
> 
> It doesn't use/introduce any shared constants, so declaring a constant and a local accessor (to save on pointer to address casts at use sites) is enough.

I wonder if I can remove that function completely now..

Originally I kept those in memory, because I was rather tight on zmm registers (actually, all registers), and I could use the `Address` version of instructions to save a register.. But I had done a mayor cleanup on register allocation before pushing the PR, maybe there is room now. (But if we do want to bring back any of the optimizations I kept back, we would need those registers again.. but will see)

PS: I am trying to address 10% degradation @jnimeh and I discussed above, will take a few days to implement the latest round. Apologies for the delay and appreciate the review!

-------------

PR: https://git.openjdk.org/jdk/pull/10582


More information about the hotspot-dev mailing list