RFR: 8352504: RISC-V: implement and enable CMoveI/L

Hamlin Li mli at openjdk.org
Tue Apr 8 12:34:22 UTC 2025


On Mon, 7 Apr 2025 14:23:52 GMT, Hamlin Li <mli at openjdk.org> wrote:

> Hi,
> Can you help to review this patch?
> On riscv, CMoveI/L already were implemented, but there are some gap:
> 1. CMoveI/L does not support comparison with float/double, corresponding tests are not turn on either.
> 2. Some optimization of C2 is not turned on, e.g. `Phi -> CMove -> min_max`.
> 3. lack of some corresponding performance tests.
> 
> Also there are some issue with current Zicond:
> 1. UseZicond is turned on automatically by hwprobe, but jmh tests show that it's not always bring benefit, in some situation it even brings regression, the reason is the generated code by Zicond is much larger than branch version, in particular when it's in a loop and unrolled.
> 
> This patch on riscv is to:
> 1. add CMoveI/L comparing float/double, and corresponding tests,
> 2. enable more C2 optimization,
> 3. add more benchmark tests,
> 4. turn off UseZicond by default.
> 
> Thanks!
> 
> ## Performance
> 
> ### MinMax
> test data
> <google-sheets-html-origin style="caret-color: rgb(0, 0, 0); color: rgb(0, 0, 0); font-style: normal; font-variant-caps: normal; font-weight: 400; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; text-decoration: none;">
> Benchmark | Mode | Cnt | Score - master | Score - master+UseZbb | Score - -master+UseZicond | Score - master+UseZicond+UseZbb | Score - cmovei | Score - cmovei+UseZbb | Score - cmovei+UseZicond | Score - cmovei+UseZicond+UseZbb | Error | Units | Opt (master/cmovei)
> -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
> o.o.b.vm.compiler.IfMinMax.testReductionInt | avgt | 40 | 17152.075 | 17216.592 | 17272.493 | 17296.89 | 17127.844 | 17036.605 | 17299.333 | 17250.566 | 73.179 | ns/op | 1.001
> o.o.b.vm.compiler.IfMinMax.testReductionLong | avgt | 40 | 19770.828 | 19967.578 | 20268.905 | 20166.165 | 20065.552 | 20059.095 | 20161.914 | 20151.295 | 131.428 | ns/op | 0.985
> o.o.b.vm.compiler.IfMinMax.testSingleInt | avgt | 40 | 114.734 | 114.402 | 114.887 | 114.384 | 114.4 | 110.631 | 112.162 | 110.915 | 0.333 | ns/op | 1.003
> o.o.b.vm.compiler.IfMinMax.testSingleLong | avgt | 40 | 121.53 | 121.711 | 120.91 | 121.665 | 121.309 | 120.57 | 118.639 | 119.373 | 0.451 | ns/op | 1.002
> o.o.b.vm.compiler.IfMinMax.testVectorInt | avgt | 40 | 60130.165 | 60062.303 | 61839.776 | 61895.194 | 15887.398 | 15924.502 | 15874.835 | 15667.936 | 101.94 | ns/op | 3.785
> o.o.b.vm.compiler.IfMinMax.testVectorLong | avgt | 40 | 63855.379 | 6309...

because zicond code is bigger, e.g.

void MacroAssembler::cmov_eq(Register cmp1, Register cmp2, Register dst, Register src) {
  if (UseZicond) {
    xorr(t0, cmp1, cmp2);
    czero_eqz(dst, dst, t0);
    czero_nez(t1 , src, t0);
    orr(dst, dst, t1);
    return;
  }
  Label no_set;
  bne(cmp1, cmp2, no_set);
  mv(dst, src);
  bind(no_set);
}

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24490#issuecomment-2786284949


More information about the core-libs-dev mailing list