RFR: 8320069: RISC-V: Add Zcb instructions

Tue Dec 19 10:01:39 UTC 2023

On Tue, 19 Dec 2023 09:25:19 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:

> > > We already have "macros" for loads and stores in macroAssembler_riscv.hpp. What's the reason for making the compression decision in assembler_riscv.hpp instead (not saying it's wrong)?
> > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880
> > 
> > 
> > No, you are correct; I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed instructions transparently. But to fully utilize the C instructions we should favor registers x8 to x15; we often don't get a compressed instruction because, e.g., BCP is in x22. I think that to better utilize C we can't keep it this transparent.
> > So here I just try to follow the current code; see how lw is changed to c_lw.
> 
> Not exactly related to this PR, but I also saw strange behaviour from MacroAssembler's lwu: it was generating lw + and (a kind of lwu emulation) instead of lwu.
> 
> An example:
> 
> ```
>   0.44%  ?  0x0000003fa46a86c8:   slli    t3,t3,0x20
>    0.48%  ?  0x0000003fa46a86ca:   addi    t3,t3,-1
>   ....
>    3.11%  ?  0x0000003fa46a86dc:   lw    a0,0(t1)
>    5.34%  ?  0x0000003fa46a86e0:   and    a0,a0,t3
> ```
> 
> Using Assembler::lwu directly resulted in a correctly generated lwu.

Yes, I have seen similar things.

  0x00002aaabc9464fc:   addiw   ra,ra,-1365 # 0x00000000000aaaab
  0x00002aaabc946500:   slli    ra,ra,0xd
  0x00002aaabc946502:   addi    ra,ra,-929
  0x00002aaabc946506:   slli    ra,ra,0xd
  0x00002aaabc946508:   addi    ra,ra,456
  0x00002aaabc94650c:   jalr    ra

Since the final addend 456 (binary "111001000") fits in jalr's signed 12-bit immediate, the last addi could be folded into the jalr, so I think this is sub-optimal.
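
To illustrate the point, here is a minimal sketch in plain C++ (not HotSpot code). It models the address arithmetic of the listing above and checks that moving the last addend into the jalr offset gives the same target. The helper names (fits_simm12, upper_from_listing, target_with_addi, target_folded) are hypothetical, and the starting value 0xaaaab is taken from the disassembler comment on the addiw line.

```cpp
#include <cstdint>

// True if v is representable as a signed 12-bit immediate,
// which is the size of the offset field of jalr.
constexpr bool fits_simm12(int64_t v) {
  return v >= -2048 && v <= 2047;
}

// Value of ra just before the final "addi ra,ra,456" in the listing.
// The starting value is assumed from the disassembler comment.
constexpr uint64_t upper_from_listing() {
  uint64_t ra = 0xaaaab;                        // after addiw ra,ra,-1365
  ra <<= 13;                                    // slli ra,ra,0xd
  ra += static_cast<uint64_t>(int64_t{-929});   // addi ra,ra,-929
  ra <<= 13;                                    // slli ra,ra,0xd
  return ra;
}

// Sequence as emitted: a separate addi, then jalr with offset 0.
// jalr clears bit 0 of the computed target.
constexpr uint64_t target_with_addi(uint64_t upper, int64_t low) {
  uint64_t ra = upper + static_cast<uint64_t>(low);  // addi ra,ra,low
  return ra & ~uint64_t{1};                          // jalr ra
}

// Folded sequence: no addi, the low part becomes the jalr offset.
// Only valid when fits_simm12(low) holds.
constexpr uint64_t target_folded(uint64_t upper, int64_t low) {
  return (upper + static_cast<uint64_t>(low)) & ~uint64_t{1};  // jalr low(ra)
}

static_assert(fits_simm12(456), "456 fits in the jalr offset field");
static_assert(target_with_addi(upper_from_listing(), 456) ==
              target_folded(upper_from_listing(), 456),
              "folding the addi into jalr keeps the same target");
```

Under these assumptions the folded form saves one instruction per call site of this shape. Whether the actual code generator can always apply it depends on details this sketch does not model.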

I can go over these and fix them; I'll create a JIRA issue.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1862454161