RFR: 8337666: AArch64: SHA3 GPR intrinsic

Andrew Haley aph at openjdk.org
Fri Apr 25 08:27:00 UTC 2025


On Thu, 24 Apr 2025 20:23:54 GMT, Dmitry Chuyko <dchuyko at openjdk.org> wrote:

> Even on M1 8252204 is 28-32% faster than this one. They seem to have 4 execution blocks per core for the accelerator instructions (unlike servers that may provide just 1 unit).
> 
> It would be great if C2 could allocate scratch registers in such methods but that would complicate the entire port.

To begin with, please isolate `keccak_round()` in its own function, to make it more similar to the other implementation.

Is it possible to define GPR macro-instructions for instructions like `eor3` and `rax1`? That would make the code much easier to follow, and therefore easier to maintain.
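
For reference, a minimal sketch of the semantics such macro-instructions would have to reproduce on general-purpose registers, written as plain C++ rather than as MacroAssembler code. The helper names `rol64`, `eor3_gpr` and `rax1_gpr` are hypothetical and do not come from the PR. EOR3 is a three-way exclusive OR, and RAX1 XORs its first operand with its second operand rotated left by one bit.

```cpp
#include <cstdint>

// Rotate a 64-bit value left by n bits. Valid for 0 < n < 64.
static inline uint64_t rol64(uint64_t x, unsigned n) {
    return (x << n) | (x >> (64 - n));
}

// Hypothetical scalar stand-in for EOR3: a ^ b ^ c.
// On GPRs this would take two EOR instructions.
static inline uint64_t eor3_gpr(uint64_t a, uint64_t b, uint64_t c) {
    return a ^ b ^ c;
}

// Hypothetical scalar stand-in for RAX1: a ^ rol(b, 1).
// Rotating left by 1 equals rotating right by 63, so on GPRs this
// could be a single EOR with a ROR #63 shifted-register operand.
static inline uint64_t rax1_gpr(uint64_t a, uint64_t b) {
    return a ^ rol64(b, 1);
}
```

In the Keccak theta step, an `eor3`-style helper would typically be chained to form each column parity, and a `rax1`-style helper would combine neighbouring parities. How that maps onto the register allocation in this patch is not shown here.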

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24260#issuecomment-2829719599


More information about the hotspot-dev mailing list