RFR: 8340241: RISC-V: Returns mispredicted

Ludovic Henry luhenry at openjdk.org
Wed Oct 9 08:40:59 UTC 2024


On Tue, 8 Oct 2024 12:54:35 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

> Hi, please consider.
> 
> RISC-V don't have dedicated call/ret instructions.
> Instead the registers used in the jal/jalr instructions determine if this is a JUMP or CALL/RET.
> The cpu have a return-address stack where it stores return addresses for prediction.
> There are two possible calling conventions: x1 and x5 (or using both for co-routines).
> This stack is updated according this table (from unpriv manual, 2.5.1. Unconditional Jumps) for JALR:
> 
> | rd is x1/x5 | rs1 is x1/x5 | rd=rs1 | RAS action
> | ------------- | ------------- | ------------- |------------- |
> |No | No | — | None|
> |No | Yes | — | Pop|
> |Yes | No | — | Push|
> |Yes | Yes | No | Pop, then push|
> |Yes | Yes | Yes | Push|
> 
> And additionally:
> "A JAL instruction should push the return address onto a return-address stack (RAS) only when rd is 'x1' or x5."
> 
> As the JDK is using x5/(t0) as main scratch all plains jumps are actually calls and calls are co-routine calls (push and pop).
> This causes performance issues as the predictions is often wrong.
> 
> Average time for 10 best iterations (VF2):
> | Benchmark | Baseline (ms) | RAS fixed (ms) | Diff |
> |-- | -- | -- | -- |
> |future-genetic | 22126.6 | 20461.8 | -7.52%|
> |akka-uct | 97119.6 | 97498 | 0.39%|
> |movie-lens | 82359.3 | 81009.2 | -1.64%|
> |scala-doku | 29246.1 | 24518.6 | -16.16%|
> |chi-square | 10207.3 | 10624.9 | 4.09%|
> |fj-kmeans | 55127.9 | 56169.1 | 1.89%|
> |finagle-http | 24845 | 24891.9 | 0.19%|
> |reactors | 97473.9 | 96655.5 | -0.84%|
> |dec-tree | 8322.99 | 8243.11 | -0.96%|
> |naive-bayes | 79249.1 | 76851.9 | -3.02%|
> |als | 52678 | 51245.9 | -2.72%|
> |par-mnemonics | 52237.4 | 53149.8 | 1.75%|
> |scala-kmeans | 2990.88 | 2992.14 | 0.04%|
> |philosophers | 9156.9 | 7754.5 | -15.32%|
> |log-regression | 7621.65 | 7540.85 | -1.06%|
> |gauss-mix | 9835.7 | 9396.25 | -4.47%|
> |mnemonics | 73087.3 | 69426.6 | -5.01%|
> |dotty | 10970.9 | 10719.1 | -2.30%|
> |finagle-chirper | 23386.1 | 23630.3 | 1.04%|
> |recursive fibonacci | 7338.56 | 5369.83 | **-26.83%**|
> 
> For some of workloads, e.g. call to small function in a loop, it really matters.
> 
> This patch blacklist x5(/t0) for JAL/JALR as we only use x1 calling convention.
> And changes all jumps to use x6(/t1) instead of x5(/t0).
> This patch was incrementally done, i.e. the first change removed the default t0.
> I visited all places makings jumps, to make sure t1 was available.
> Then changed to default t1 and removed argument in many cases.
> 
> Other approaches was tested, e.g. completely switch t0 <-> t1.
> This was much harder and more intr...

src/hotspot/cpu/riscv/assembler_riscv.hpp line 2895:

> 2893:   // All calls and jumps must go via MASM.
> 2894:   void jalr(Register Rd, Register Rs, const int32_t offset) {
> 2895:     assert(Rd != x5 && Rs != x5, "Register x5 not used for calls/jumps.");

Suggestion:

    assert(Rd != x5 && Rs != x5, "Register x5 should not be used for calls/jumps.");

src/hotspot/cpu/riscv/assembler_riscv.hpp line 2910:

> 2908: 
> 2909:   void jal(Register Rd, const int32_t offset) {
> 2910:     assert(Rd != x5, "Register x5 not used for calls/jumps.");

Suggestion:

    assert(Rd != x5, "Register x5 should not be used for calls/jumps.");

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 944:

> 942: void MacroAssembler::load_link_jump(const address source, Register temp) {
> 943:   assert(temp != noreg && temp != x0, "expecting a register");
> 944:   assert(temp != x5, "Register x5 not used for calls.");

Suggestion:

  assert(temp != x5 && temp != x1, "Register x5/x1 should not be used for jumps.");

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 973:

> 971:     Assembler::jal(x0, distance);
> 972:   } else {
> 973:     assert(temp != x5 && temp != x1, "Register x5/x1 not used for jumps.");

Suggestion:

    assert(temp != noreg && temp != x0, "expecting a register");
    assert(temp != x5 && temp != x1, "Register x5/x1 should not be used for jumps.");

To keep the order consistent with https://github.com/openjdk/jdk/pull/21406/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR943

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1012:

> 1010: void MacroAssembler::jr(Register Rd, int32_t offset) {
> 1011:   assert(Rd != noreg, "expecting a register");
> 1012:   assert(Rd != x5, "Register x5 not used for jumps.");

Same as above:
Suggestion:

  assert(temp != x5 && temp != x1, "Register x5/x1 should not be used for jumps.");

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1020:

> 1018:   assert_cond(dest != nullptr);
> 1019:   assert(temp != noreg, "expecting a register");
> 1020:   assert(temp != x5, "Register x5 not used for jumps.");

Same as above:
Suggestion:

  assert(temp != x5 && temp != x1, "Register x5/x1 should not be used for jumps.");

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1029:

> 1027: void MacroAssembler::jalr(Register Rs, int32_t offset) {
> 1028:   assert(Rs != noreg, "expecting a register");
> 1029:   assert(Rs != x1, "expecting a register");

Same as above:
Suggestion:

  assert(temp != x5 && temp != x1, "Register x5/x1 should not be used for jumps.");

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 1035:

> 1033: 
> 1034: void MacroAssembler::rt_call(address dest, Register tmp) {
> 1035:   assert(tmp != x5, "Register x5 not used for jumps.");

Same as above:
Suggestion:

  assert(temp != x5 && temp != x1, "Register x5/x1 should not be used for jumps.");

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793100714
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793100993
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793105370
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793107051
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793107792
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793107865
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793107965
PR Review Comment: https://git.openjdk.org/jdk/pull/21406#discussion_r1793108193


More information about the hotspot-dev mailing list