RFR: 8337666: AArch64: SHA3 GPR intrinsic
Andrew Haley
aph at openjdk.org
Fri Apr 25 08:27:00 UTC 2025
On Thu, 24 Apr 2025 20:23:54 GMT, Dmitry Chuyko <dchuyko at openjdk.org> wrote:
> Even on M1 8252204 is 28-32% faster than this one. They seem to have 4 execution blocks per core for the accelerator instructions (unlike servers that may provide just 1 unit).
>
> It would be great if C2 could allocate scratch registers in such methods but that would complicate the entire port.
To begin with, please isolate `keccak_round()` in its own function, to make it more similar to the other implementation.
Is it possible to define GPR macro-instructions for instructions like `eor3` and `rax1`? This would make it a lot easier to understand what is going on, and so make maintenance easier.
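
To illustrate the idea, here is a minimal sketch in plain C++ rather than HotSpot `MacroAssembler` code. The helper names `eor3`, `rol64`, `rax1` and `keccak_theta` are hypothetical and are not taken from the PR; they only mirror the per-lane semantics of the SHA3 instructions (`EOR3` is a three-way XOR, `RAX1` XORs the first operand with the second rotated left by one) to show how the theta step reads once those operations have names.

```cpp
#include <cstdint>

// Hypothetical scalar stand-ins for the SHA3 vector instructions.
// EOR3: three-way exclusive OR.
static inline uint64_t eor3(uint64_t a, uint64_t b, uint64_t c) {
  return a ^ b ^ c;
}

// Rotate left by n bits; only valid for 1 <= n <= 63.
static inline uint64_t rol64(uint64_t x, unsigned n) {
  return (x << n) | (x >> (64 - n));
}

// RAX1: a XOR (b rotated left by 1).
static inline uint64_t rax1(uint64_t a, uint64_t b) {
  return a ^ rol64(b, 1);
}

// Theta step of Keccak-f[1600]; lane (x, y) is stored at a[x + 5 * y].
static inline void keccak_theta(uint64_t a[25]) {
  uint64_t c[5];
  for (int x = 0; x < 5; x++) {
    // Column parity: XOR of the five lanes in column x.
    c[x] = eor3(a[x], a[x + 5], a[x + 10]) ^ a[x + 15] ^ a[x + 20];
  }
  for (int x = 0; x < 5; x++) {
    // D[x] = C[x-1] ^ rol(C[x+1], 1), with indices taken mod 5.
    uint64_t d = rax1(c[(x + 4) % 5], c[(x + 1) % 5]);
    for (int y = 0; y < 25; y += 5) {
      a[x + y] ^= d;
    }
  }
}
```

In a GPR version of the intrinsic, each helper would presumably expand to one or two integer instructions, but the round body could keep the same shape as the vector implementation.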
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24260#issuecomment-2829719599
More information about the hotspot-dev mailing list