RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v6]
Volodymyr Paprotski
duke at openjdk.org
Tue Nov 8 22:03:20 UTC 2022
On Tue, 1 Nov 2022 23:49:17 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2002:
>>
>>> 2000: }
>>> 2001:
>>> 2002: address StubGenerator::generate_poly1305_masksCP() {
>>
>> I suggest to turn it into a C++ literal constant and move the declaration next to `poly1305_process_blocks_avx512` where they are used. As an example, here's how it is handled in GHASH stubs:
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/stubGenerator_x86_64_ghash.cpp#L35
>>
>> That would allow to avoid to simplify the code a bit (no need in `StubRoutines::x86::_poly1305_mask_addr`/`poly1305_mask_addr()` and no need to generate the constants during VM startup).
>>
>> You could split it into 3 constants, but then using a single base register (`polyCP`) won't work anymore.
>> Thinking more about it, I'm not sure why you can't just do the split and use address literals instead to access individual constants (and repurpose `r13` to be used as a scratch register when RIP-relative addressing mode doesn't work).
>
> The case of AES stubs may be even a better fit here:
> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/stubGenerator_x86_64_aes.cpp#L47
>
> It doesn't use/introduce any shared constants, so declaring a constant and a local accessor (to save on pointer to address casts at use sites) is enough.
@iwanowww moved to StubGenerator as suggested.. moving functions to the stubGenerator_x86_64.hpp header doesn't seem 'clean' but I think that's the pattern.
The constant pool.. stared at it for a while and ended up keeping it mostly intact (its now a static function, not a member function; header bit cleaner; followed AES pattern).
Did not split it up into individual constants. The main 'problem' is that `Address` and `ExternalAddress` are not compatible. Most instructions do not take `AddressLiteral`, so can't use `ExternalAddress` to refer to those constants. (If I did get the instructions I use to take `AddressLiteral`, I think we would end up with more `lea(rscratch)`s generated; but that's more of a silver-lining)
I also thought of loading constants at run-time, (load and replicate for vector.. what I mentioned in my comment above) but that seems needlessly complicated in hindsight..
-------------
PR: https://git.openjdk.org/jdk/pull/10582
More information about the hotspot-compiler-dev
mailing list