RFR: 8326306: RISC-V: Re-structure MASM calls and jumps
Fei Yang
fyang at openjdk.org
Fri Apr 26 06:42:31 UTC 2024
On Thu, 25 Apr 2024 07:17:07 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
> Hi, please consider.
>
> We have code that uses the assembler (asm) directly for calls/jumps instead of going through MASM.
> Our MASM has somewhat odd naming, and we don't use the 'proper' pseudoinstructions/mnemonics
> suggested by the [riscv-asm-manual](https://github.com/riscv-non-isa/riscv-asm-manual/tree/master):
>
> Pseudoinstruction   Base instruction(s)                                      Meaning
> j offset            jal x0, offset                                           Jump
> jal offset          jal x1, offset                                           Jump and link
> jr rs               jalr x0, rs, 0                                           Jump register
> jalr rs             jalr x1, rs, 0                                           Jump and link register
> ret                 jalr x0, x1, 0                                           Return from subroutine
> call offset         auipc x1, offset[31:12]; jalr x1, x1, offset[11:0]       Call far-away subroutine
> tail offset         auipc x6, offset[31:12]; jalr x0, x6, offset[11:0]       Tail call far-away subroutine
>
> But these can only be implemented like this if your application is small enough.
> The fallback for these is to use the GOT (your C compiler should place a copy of the GOT every 2G so it's always reachable).
> We don't have a GOT; instead we materialize the address, so there are still differences between these and ours.
>
> This patch:
> - Tries to follow these suggested mappings as well as we can.
> - Makes sure all jumps/calls go through MASM (so we get control and can easily make changes for sites using a certain calling convention).
> - To avoid confusion between MASM public/private methods, ASM methods and the mnemonics, there is some renaming.
> E.g. the mnemonic jal means call offset; as we can't use it that way, there is no 'jal'.
> - I enabled c.j, but right now we never generate it.
> - As always, the macros do no good and are a legacy from when the code base did not use templates (the x-macros also screw up my IDE (vim+rtags)).
>
> I started down this path because I have a follow-up patch on top of this which removes trampolines in favor of load-n-jump.
> (WIP: https://github.com/robehn/jdk/compare/jal-fixes...robehn:jdk:load-n-link?expand=1)
> Our calls were a bit confusing when I looked into them; this helps.
>
> I have done a couple of t1-3 runs on slightly different versions of this patch, and as part of the follow-up, with no issues found (VF2, qemu, LP4).
> Re-running tests, as I had some last-minute changes.
>
> Thanks, Robbin
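For the `call`/`tail` rows quoted above, the offset has to be split between `auipc` and `jalr`. Because `jalr` sign-extends its 12-bit immediate, the upper part must be rounded to compensate. The following is only a minimal sketch of that arithmetic; `FarOffset`, `split_offset` and `rejoin` are hypothetical names for illustration and are not taken from the patch or from HotSpot, and the sketch does not check that the upper part fits the `auipc` immediate.

```cpp
#include <cstdint>

// Hypothetical helper type: the two immediates of an auipc + jalr pair.
struct FarOffset {
  int32_t hi20;  // immediate for auipc (added to pc after shifting left by 12)
  int32_t lo12;  // signed immediate for jalr, always in [-2048, 2047]
};

// Split a signed 32-bit pc-relative offset so that
// (hi20 * 4096) + lo12 == offset, with lo12 treated as a signed 12-bit value.
inline FarOffset split_offset(int32_t offset) {
  int64_t wide = offset;
  // Sign-extend the low 12 bits: 0x800..0xfff become -2048..-1.
  int32_t lo12 = static_cast<int32_t>(((wide & 0xfff) ^ 0x800) - 0x800);
  // What remains is an exact multiple of 4096, so the division is exact.
  int32_t hi20 = static_cast<int32_t>((wide - lo12) / 4096);
  return {hi20, lo12};
}

// Recombine the pair the way the hardware would: auipc adds hi20 << 12,
// jalr then adds the sign-extended lo12.
inline int64_t rejoin(FarOffset p) {
  return static_cast<int64_t>(p.hi20) * 4096 + p.lo12;
}
```

For example, an offset of `0x800` splits into `hi20 = 1` and `lo12 = -2048`, since `4096 - 2048 = 2048`.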
src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp line 303:
> 301: target = CAST_FROM_FN_PTR(address, ShenandoahRuntime::load_reference_barrier_weak);
> 302: }
> 303: __ rt_call(target);
Question: does it make sense to replace `call` with `rt_call` when we are invoking VM code (C++ code)? Here is the difference I see between the two: `rt_call` emits either an `auipc` or a `movptr` sequence depending on whether the destination is found in the code cache, while `call` depends on `is_32bit_offset_from_codecache`. So `call` can still emit the short `auipc` sequence when the target is not far away, even if the target is not in the code cache, as in this case. `rt_call`, however, will always emit the long `movptr` sequence for this case, which I think is bad for performance.
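To make the difference concrete, here is a toy model of the two selection rules as I described them above. It is a sketch under assumptions, not the MASM code: `CodeCacheModel`, `rt_call_model`, `call_model` and the address values are all made up for illustration, and the reachability test is a simplified stand-in for `is_32bit_offset_from_codecache`.

```cpp
#include <cstdint>

// Which instruction sequence a call site would get in this toy model.
enum class Seq { AuipcJalr, MovptrJalr };

// Hypothetical stand-in for the code cache: the address range [low, high).
struct CodeCacheModel {
  int64_t low;
  int64_t high;

  bool contains(int64_t addr) const { return addr >= low && addr < high; }

  static bool fits_simm32(int64_t d) {
    return d >= INT32_MIN && d <= INT32_MAX;
  }

  // Simplified assumption: the target is reachable with a signed 32-bit
  // pc-relative offset from both ends of the code cache.
  bool within_32bit_offset(int64_t addr) const {
    return fits_simm32(addr - low) && fits_simm32(addr - high);
  }
};

// Model of the rt_call rule: short sequence only for code cache targets.
inline Seq rt_call_model(const CodeCacheModel& cc, int64_t target) {
  return cc.contains(target) ? Seq::AuipcJalr : Seq::MovptrJalr;
}

// Model of the call rule: short sequence whenever the target is close enough.
inline Seq call_model(const CodeCacheModel& cc, int64_t target) {
  return cc.within_32bit_offset(target) ? Seq::AuipcJalr : Seq::MovptrJalr;
}
```

With a code cache at `[0x40000000, 0x48000000)` and a VM entry point at `0x50000000`, `call_model` picks `Seq::AuipcJalr` while `rt_call_model` picks `Seq::MovptrJalr`, which is the case I am asking about.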
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18942#discussion_r1580543868