RFR: 8347917: AArch64: Enable upper GPR registers in C1 [v4]
Dmitry Chuyko
dchuyko at openjdk.org
Thu Feb 13 09:35:13 UTC 2025
On Thu, 13 Feb 2025 08:49:40 GMT, Andrew Haley <aph at openjdk.org> wrote:
> On 2/12/25 22:25, Dmitry Chuyko wrote:
> > Just a few things to keep here:
> > 1. Even for aarch64 just reversing allocation order is not enough (callee preserved regs are saved in a caller).
> > 2. Register saving overhead for runtime calls is there, but making a call without saving is still expensive.
> I don't quite understand what you're saying here. In the first sentence you seem to imply that callee preserved regs are still saved in the caller, unnecessarily.
Yes. There is some other place that needs to be changed.
> In the second sentence you say "saving overhead for runtime calls is there," which seems to imply that there is some advantage to using a callee-saved register for runtime calls. Clearly this issue only applies to runtime calls, because Java has no callee preserved regs. What conclusion do you make from the benchmark you presented? That the overhead of making a call from C1-compiled code is great, especially when there are many spills?
Speculatively, it looks like the call itself costs ~4 ns/op, and preserving unnecessary values costs an extra ~1 ns/op. Preserving unnecessary values also costs a lot of instructions.
This is definitely a subject for a separate study. I only checked whether we can observe any difference in benchmarks (yes) and whether reversing the allocation order is currently enough to help (no).
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23152#issuecomment-2656018287
More information about the hotspot-compiler-dev mailing list