RFR: 8347917: AArch64: Enable upper GPR registers in C1
Dmitry Chuyko
dchuyko at openjdk.org
Wed Feb 12 22:28:13 UTC 2025
On Sun, 26 Jan 2025 16:16:59 GMT, Andrew Haley <aph at openjdk.org> wrote:
> > > As for the different allocation order (to prefer platform callee-saved registers), do you think something simple like last->first order will work for all platforms?
> >
> >
> > It might. It's certainly an interesting thing to try. I'm particularly interested because it potentially reduces the overhead for type checks.
>
> Let's do this in a separate patch.
Just a few things to note here:
1. Even on AArch64, just reversing the allocation order is not enough (callee-preserved registers are saved in the caller).
2. The register-saving overhead for runtime calls is real, but making a call without saving is still expensive.
Consider a benchmark that keeps a few values alive and performs a runtime call:
    long[] arr;

    @Setup
    public void setup() {
        arr = new long[8];
    }

    @Benchmark
    public void test(Blackhole bh) {
        long v0 = arr[0]; long v1 = arr[1]; long v2 = arr[2]; long v3 = arr[3];
        long v4 = arr[4]; long v5 = arr[5]; long v6 = arr[6]; long v7 = arr[7];
        v1 += v0; v2 += v1; v3 += v2; v4 += v3; v5 += v4; v6 += v5; v7 += v6; v0 += v7;
        v1 *= v0; v2 *= v1; v3 *= v2; v4 *= v3; v5 *= v4; v6 *= v5; v7 *= v6; v0 *= v7;
        double d0 = Double.longBitsToDouble(v0);
        d0 = Math.sin(d0); // dsin is a C1 runtime call; v1..v7 stay live across it
        v0 = Double.doubleToRawLongBits(d0);
        v1 += v0; v2 += v1; v3 += v2; v4 += v3; v5 += v4; v6 += v5; v7 += v6; v0 += v7;
        v1 *= v0; v2 *= v1; v3 *= v2; v4 *= v3; v5 *= v4; v6 *= v5; v7 *= v6; v0 *= v7;
        bh.consume(v0); bh.consume(v1); bh.consume(v2); bh.consume(v3);
        bh.consume(v4); bh.consume(v5); bh.consume(v6); bh.consume(v7);
    }
In '-XX:TieredStopAtLevel=1' mode I observe results like 28.337 ± 0.803 ns/op. If dsin is calculated and consumed at the end of the method, the result is about 27.039 ± 0.182 ns/op. Without the call it's 22.595 ± 0.853 ns/op.
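For illustration, here is a minimal sketch of what the "dsin at the end" variant could look like. The exact code of that variant is not shown in this message, so the shape below is an assumption: both arithmetic rounds run back to back, v0..v7 are consumed, and only then is Math.sin called, so at the source level no long value has to stay live across the call. The class name, the `sink` field and the `consume` helper are hypothetical stand-ins for JMH's Blackhole so that the sketch runs without JMH.

```java
public class CallAtEndSketch {
    // Hypothetical stand-in for JMH's Blackhole, so the sketch runs without JMH.
    static long sink;

    static void consume(long v) {
        sink ^= v;
    }

    // Assumed shape of the variant: the runtime call is moved after all
    // long arithmetic and after v0..v7 have been consumed.
    static void callAtEnd(long[] arr) {
        long v0 = arr[0]; long v1 = arr[1]; long v2 = arr[2]; long v3 = arr[3];
        long v4 = arr[4]; long v5 = arr[5]; long v6 = arr[6]; long v7 = arr[7];
        v1 += v0; v2 += v1; v3 += v2; v4 += v3; v5 += v4; v6 += v5; v7 += v6; v0 += v7;
        v1 *= v0; v2 *= v1; v3 *= v2; v4 *= v3; v5 *= v4; v6 *= v5; v7 *= v6; v0 *= v7;
        v1 += v0; v2 += v1; v3 += v2; v4 += v3; v5 += v4; v6 += v5; v7 += v6; v0 += v7;
        v1 *= v0; v2 *= v1; v3 *= v2; v4 *= v3; v5 *= v4; v6 *= v5; v7 *= v6; v0 *= v7;
        consume(v0); consume(v1); consume(v2); consume(v3);
        consume(v4); consume(v5); consume(v6); consume(v7);
        double d0 = Double.longBitsToDouble(v0);
        d0 = Math.sin(d0); // no long value is used after this call
        consume(Double.doubleToRawLongBits(d0));
    }

    public static void main(String[] args) {
        callAtEnd(new long[8]);
        System.out.println(sink);
    }
}
```

Note that this changes the data flow compared to the benchmark above (the result of sin no longer feeds the second arithmetic round), so it isolates the cost of the call itself rather than reproducing the same computation.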
With the call, the hottest methods are distributed as follows:
89.12% c1, level 1 org.openjdk.bench.vm.compiler.jmh_generated.VMCall_baseline_jmhTest::baseline_avgt_jmhStub, version 2, compile id 798
10.69% runtime stub StubRoutines::libmDsin
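As a rough reading of these numbers, the differences between the three scores can be worked out directly. The sketch below only does that arithmetic; the class and method names are hypothetical, and attributing each difference to a single cause is an assumption. The error bars (up to ± 0.853 ns/op) are of a similar size to the smallest difference, so the split is indicative at best.

```java
public class CallOverheadEstimate {
    // Mean scores in ns/op, copied from the results quoted in this message.
    static final double CALL_IN_MIDDLE = 28.337; // values live across dsin
    static final double CALL_AT_END = 27.039;    // dsin computed and consumed at the end
    static final double NO_CALL = 22.595;        // no runtime call at all
    // Share of samples attributed to StubRoutines::libmDsin in the profile.
    static final double STUB_SHARE = 0.1069;

    // Total cost of having the call in the middle of the method.
    static double totalCallCost() {
        return CALL_IN_MIDDLE - NO_CALL;
    }

    // Cost of the call when nothing has to be kept live across it.
    static double callWithoutSaving() {
        return CALL_AT_END - NO_CALL;
    }

    // Extra cost assumed to come from keeping values live across the call.
    static double liveValueCost() {
        return CALL_IN_MIDDLE - CALL_AT_END;
    }

    // Time per op spent inside the stub, estimated from the profile share.
    static double stubTime() {
        return STUB_SHARE * CALL_IN_MIDDLE;
    }

    public static void main(String[] args) {
        System.out.printf("total call cost:      %.3f ns/op%n", totalCallCost());
        System.out.printf("call without saving:  %.3f ns/op%n", callWithoutSaving());
        System.out.printf("live-value cost:      %.3f ns/op%n", liveValueCost());
        System.out.printf("time in libmDsin:     %.3f ns/op%n", stubTime());
    }
}
```

Under these assumptions the call costs roughly 5.7 ns/op in total, of which about 1.3 ns/op goes away when nothing is live across it, while about 4.4 ns/op remains even without saving.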
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23152#issuecomment-2654975579