[riscv-port] RFR: 8278322: riscv: Support RVC: compressed instructions

Wed Dec 8 01:44:45 UTC 2021

On Tue, 7 Dec 2021 04:46:10 GMT, Xiaolin Zheng <xlinzheng at openjdk.org> wrote:

> Hi team,
>
> This patch support RISC-V RVC extension. It can introduce:
> * 21% code size reduction in template interpreter generated code
> * 20%~25% code size reduction in C1 generated code, evaluated by a common SpringBoot program
> * 15%~20% code size reduction in C2 generated code, evaluated by a common SpringBoot program
>
> In my observation, the code size footprint could be reduced to nearly a level of the AArch64 back-end. About the performance, there seems a stable ~0.7% performance improvement on SPECjbb2015 on one HiFive Unleashed board, considering the code density increase. I think the performance aspect might be a speculative behavior on different hardware implementations because C910's performance might be better than that, but HiFive Unleashed may be more general.
>
> Things about this patch:
> * If an instruction is compressible, then we will implicitly emit a 16-bit compressed instruction instead of the 32-bit instruction in Assembler.
> * About the `_nc` postfix of some of Assembler instructions: we know a bunch of places should be reserved for patching, where we cannot change them into compressed instructions. `_nc` is short for `not compressible` - with this, those instructions should keep their origin 4-byte form and remain uncompressed.
> * There are things not easy to compress like MachBranchNodes. Please see the comments in the code - currently this patch does not support this. We will support this improvement in patches coming afterward.
>
> The macros after their expansion might be like:
>
> void andr(Register Rd, Register Rs1, Register Rs2) {
> +  {
> +    Register src = noreg;
> +    if (UseRVC && Rs1->is_compressed_valid() && Rs2->is_compressed_valid() &&
> +        ((src = Rs1, Rs2 == Rd) || (src = Rs2, Rs1 == Rd))) {
> +      and_c(Rd, src);
> +      return;
> +    }
> +  }
>   unsigned insn = 0;
>   patch((address)&insn, 6, 0, 0b0110011);
>   patch((address)&insn, 14, 12, 0b111);
>   patch((address)&insn, 31, 25, 0b0000000);
>   patch_reg((address)&insn, 7, Rd);
>   patch_reg((address)&insn, 15, Rs1);
>   patch_reg((address)&insn, 20, Rs2);
>   emit(insn);
> };
>
>
> For further information, please see comments in `assembler_riscv_cext.hpp`.
>
> --
>
> This patch may need some time to acquire a review. I have been polishing this patch for quite a long time and it might seem stable under a bunch of full-tier tests and some tier1 jdk/hotspot jtreg tests. But I think we might mark it `experimental` at first, though it is turned on by default. [The original patch](https://github.com/riscv-collab/riscv-openjdk/pull/7), just for reference, might be quite different from the current one.
>
> I am pleased to receive any suggestion.
>
> Thanks,
> Xiaolin

Well, I don't think it's a good idea for assembler to emit 16-bit compressed instructions by default.
It's quite error prone and will make the code base hard to maintain.
I would suggest the opposite way, i.e., default to 32-bit normal instructions and use compressed instructions when it is absolutly safe.
I see some places where we could make use of compressed instructions.
One scenario would be C2 architecture description file. (C1 should be similar in theory)

Let's say,
6305 instruct addI_reg_reg(iRegINoSp dst, iRegIorL2I src1, iRegIorL2I src2) %{
6306   match(Set dst (AddI src1 src2));
6307
6308   ins_cost(ALU_COST);
6309   format %{ "addw  $dst, $src1, $src2\t#@addI_reg_reg" %}
6310
6311   ins_encode %{
6312     __ addw(as_Register($dst$$reg),
6313             as_Register($src1$$reg),
6314             as_Register($src2$$reg));
6315   %}
6316
6317   ins_pipe(ialu_reg_reg);
6318 %}

In this case, I think it should be safe to emit a compressed C.ADD when possible here for this instruct.
Then I think the "addw" assembler function would need one extra argument, say "compress", defaulting to false.
Then for the above instruct, we would call "addw" passing UseRVC as the "compress" argument.
We can then carry out the work incrementally if we go this way and this thus will also make code review easier.
I agree that we will miss some opportunities compared with the current solution, but that shouldn't be a big issue for now.

-------------

PR: https://git.openjdk.java.net/riscv-port/pull/24