RFR: 8344306: RISC-V: Add zicond
Fei Yang
fyang at openjdk.org
Wed Nov 27 02:07:38 UTC 2024
On Tue, 26 Nov 2024 09:00:34 GMT, Robbin Ehn <rehn at openjdk.org> wrote:
> Hi, please consider.
>
> In cpu models we save ~1 cycle per removed branch.
> This patch removes ~0.1% of branches in generic C2 generated code.
> We should probably investigate if we can improve/add peephole optimization to remove more branches.
>
> As the C1 cmov code is a bit tricky I left that as a followup.
>
> I added gtests for the cmovs.
> (we should add coverage for more of masm in this gtest suit)
> Pro tip, invoke the gtestLauncher directly (you only need to build exploded):
> `gtestLauncher -jdk build/linux-riscv64-server-fastdebug/jdk/ --gtest_break_on_failure --gtest_filter="*RiscV*" -XX:+UnlockDiagnosticVMOptions -XX:+UseZicond -XX:ParallelGCThreads=1 -XX:ConcGCThreads=1 -XX:CICompilerCount=2`
>
> Tested on Spacemit X60, gtests and tier1.
>
> Thanks, Robbin
Hi, thanks for this work! I once tried to find this extension but I think I missed it :-)
Seems that we need two special AD match rules so that `eq`/`ne` zero cases could be further optimized.
(Say, we use `t0` for `rtmp` in this code sequence)
Conditional select, if zero
rd = (rc == 0) ? rs1 : rs2
=>
czero.nez rd, rs1, rc
czero.eqz rtmp, rs2, rc
or rd, rd, rtmp
Conditional select, if non-zero
rd = (rc != 0) ? rs1 : rs2
=>
czero.eqz rd, rs1, rc
czero.nez rtmp, rs2, rc
or rd, rd, rtmp
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2006:
> 2004:
> 2005: void C2_MacroAssembler::enc_cmove(int cmpFlag, Register op1, Register op2, Register dst, Register src) {
> 2006: bool is_unsigned = (cmpFlag & unsigned_branch_mask) == unsigned_branch_mask ? true : false;
Maybe:
`bool is_unsigned = (cmpFlag & unsigned_branch_mask) ? true : false;`
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 2007:
> 2005: void C2_MacroAssembler::enc_cmove(int cmpFlag, Register op1, Register op2, Register dst, Register src) {
> 2006: Label L;
> 2007: cmp_branch(cmpFlag ^ (1 << neg_cond_bits), op1, op2, L);
The global `neg_cond_bits` is unused after this change. I think we should remove it.
-------------
PR Review: https://git.openjdk.org/jdk/pull/22386#pullrequestreview-2463256160
PR Review Comment: https://git.openjdk.org/jdk/pull/22386#discussion_r1859556828
PR Review Comment: https://git.openjdk.org/jdk/pull/22386#discussion_r1859637031
More information about the hotspot-dev
mailing list