RFR: 8288047: Accelerate Poly1305 on x86_64 using AVX512 instructions [v6]

Volodymyr Paprotski duke at openjdk.org
Wed Nov 9 02:22:01 UTC 2022


On Wed, 9 Nov 2022 00:38:45 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:

>> @iwanowww moved to StubGenerator as suggested.. moving functions to the stubGenerator_x86_64.hpp header doesn't seem 'clean' but I think that's the pattern.
>> 
>> The constant pool.. stared at it for a while and ended up keeping it mostly intact (its now a static function, not a member function; header bit cleaner; followed AES pattern). 
>> 
>> Did not split it up into individual constants. The main 'problem' is that `Address` and `ExternalAddress` are not compatible. Most instructions do not take `AddressLiteral`, so can't use `ExternalAddress` to refer to those constants. (If I did get the instructions I use to take `AddressLiteral`, I think we would end up with more `lea(rscratch)`s generated; but that's more of a silver-lining)
>> 
>> I also thought of loading constants at run-time, (load and replicate for vector.. what I mentioned in my comment above) but that seems needlessly complicated in hindsight..
>
>> Did not split it up into individual constants. The main 'problem' is that Address and ExternalAddress are not compatible. 
> 
> There's a reason for that and it's because RIP-relative addressing doesn't always work, so additional register may be needed.
> 
>> Most instructions do not take AddressLiteral, so can't use ExternalAddress to refer to those constants. 
> 
> I counted 4 instructions accessing the constants (`evpandq`, `andq`, `evporq`, and `vpternlogq`) in your patch. 
> 
> `macroAssembler_x86.hpp` is the place for `AddressLiteral`-related overloads (there are already numerous cases present) and it's trivial to add new ones. 
> 
>> (If I did get the instructions I use to take AddressLiteral, I think we would end up with more lea(rscratch)s generated; but that's more of a silver-lining)
> 
> It depends on memory layout. If constants end up placed close enough in the address space, there'll be no additional instructions generated.
> 
> Anyway, it doesn't look like something important from throughput perspective. Overall, I find it clearer when the code refers to individual constants through `AddressLiteral`s, but I'm also fine with it as it is now.

Makes sense to me, that would indeed be cleaner, will add a couple more overloads. (Still getting used to what is 'clean' in this code base).

-------------

PR: https://git.openjdk.org/jdk/pull/10582


More information about the hotspot-compiler-dev mailing list