RFR: 8344306: RISC-V: Add zicond

Fei Yang fyang at openjdk.org
Wed Nov 27 02:07:38 UTC 2024


On Tue, 26 Nov 2024 09:00:34 GMT, Robbin Ehn <rehn at openjdk.org> wrote:

> Hi, please consider.
> 
> In cpu models we save ~1 cycle per removed branch.
> This patch removes ~0.1% of branches in generic C2 generated code.
> We should probably investigate if we can improve/add peephole optimization to remove more branches.
> 
> As the C1 cmov code is a bit tricky I left that as a followup.
> 
> I added gtests for the cmovs.
> (we should add coverage for more of masm in this gtest suit)
> Pro tip, invoke the gtestLauncher directly (you only need to build exploded):
> `gtestLauncher -jdk build/linux-riscv64-server-fastdebug/jdk/ --gtest_break_on_failure --gtest_filter="*RiscV*" -XX:+UnlockDiagnosticVMOptions -XX:+UseZicond -XX:ParallelGCThreads=1 -XX:ConcGCThreads=1 -XX:CICompilerCount=2`
> 
> Tested on Spacemit X60, gtests and tier1.
> 
> Thanks, Robbin

Hi, thanks for this work! I once tried to find this extension but I think I missed it :-)
Seems that we need two special AD match rules so that `eq`/`ne` zero cases could be further optimized.
(Say, we use `t0` for `rtmp` in this code sequence)

Conditional select, if zero
 rd = (rc == 0) ? rs1 : rs2
=>
 czero.nez  rd, rs1, rc
 czero.eqz  rtmp, rs2, rc
 or         rd, rd, rtmp



Conditional select, if non-zero
 rd = (rc != 0) ? rs1 : rs2
=>
czero.eqz  rd, rs1, rc
czero.nez  rtmp, rs2, rc
or         rd, rd, rtmp

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2006:

> 2004: 
> 2005: void C2_MacroAssembler::enc_cmove(int cmpFlag, Register op1, Register op2, Register dst, Register src) {
> 2006:   bool is_unsigned = (cmpFlag & unsigned_branch_mask) == unsigned_branch_mask ? true : false;

Maybe: 
`bool is_unsigned = (cmpFlag & unsigned_branch_mask) ? true : false;`

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2007:

> 2005: void C2_MacroAssembler::enc_cmove(int cmpFlag, Register op1, Register op2, Register dst, Register src) {
> 2006:   Label L;
> 2007:   cmp_branch(cmpFlag ^ (1 << neg_cond_bits), op1, op2, L);

The global `neg_cond_bits` is unused after this change. I think we should remove it.

-------------

PR Review: https://git.openjdk.org/jdk/pull/22386#pullrequestreview-2463256160
PR Review Comment: https://git.openjdk.org/jdk/pull/22386#discussion_r1859556828
PR Review Comment: https://git.openjdk.org/jdk/pull/22386#discussion_r1859637031


More information about the hotspot-dev mailing list