RFR: 8318723: RISC-V: C2 UDivL

Fei Yang fyang at openjdk.org
Thu Oct 26 06:23:32 UTC 2023


On Tue, 24 Oct 2023 15:21:08 GMT, Hamlin Li <mli at openjdk.org> wrote:

> Hi,
> Can you review this change, which adds intrinsics for UDivI and UDivL?
> Thanks!
> 
> 
> ## Tests
> 
> ### Functionality
> Successfully ran the tests found via `grep -nr test/jdk/ -we divideUnsigned` and `grep -nr test/hotspot/ -we divideUnsigned`.
> 
> ### Performance
> (NOTE: there are 2 other related issues: https://bugs.openjdk.org/browse/JDK-8318225 and https://bugs.openjdk.org/browse/JDK-8318226. PRs for them will be sent out after this one is finished.)
> 
> #### Long
> **Before**
> 
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt   10  19704.317 ± 64.078  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt   10  28856.859 ± 14.901  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt   10   6364.974 ±  2.465  ns/op
> 
> 
> **After v1**
> (This is a simpler version; see the diff of v1 from v2 below. **This version has a huge regression for negative values!!!**)
> 
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt   10  22668.228 ± 74.161  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt   10  15966.320 ± 14.985  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt   10  29518.033 ± 49.056  ns/op
> 
> 
> **After v2**
> (This is the current patch.)
> 
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt   10  11432.738 ±  95.785  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt   10  15969.044 ±  19.492  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt   10   6376.674 ±  16.869  ns/op
> 
> 
> ##### Diff of v1 from v2
> 
> diff --git a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
> index b96f7611133..dfb40e171e7 100644
> --- a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
> +++ b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
> @@ -2432,16 +2432,7 @@ int MacroAssembler::corrected_idivq(Register result, Register rs1, Register rs2,
>      if (is_signed) {
>        div(result, rs1, rs2);
>      } else {
> -      Label Lltz, Ldone;
> -      bltz(rs2, Lltz);
>        divu(result, rs1, rs2);
> -      j(Ldone);
> -      bind(Lltz); // For the algorithm details, check j.l.Long::divideUnsigned
> -      sub(result, rs1, rs2);
> -      notr(result, result);
> -      andr(result, result, rs1);
> -      srli(result, result, 63);
> -      bind(Ldone);
>      }
>    } else {
>      rem(result, rs1, rs2); // result = rs1 % rs2;
> 
> 
> 
> #### Integer
> **B...

So I tried this on a HiFive Unmatched board. Unfortunately, the JMH test shows a small regression for the LongDivMod.testDivideUnsigned `negative` case (6720.904 -> 7168.625 ns/op, about 6.7% slower; see the marked lines).

Before:

LongDivMod.testDivideUnsigned                    1024          mixed  avgt   15  24909.748 ±  17.915  ns/op
LongDivMod.testDivideUnsigned                    1024       positive  avgt   15  36257.181 ±  33.615  ns/op
LongDivMod.testDivideUnsigned                    1024       negative  avgt   15   6720.904 ±   8.522  ns/op        <====


After:

LongDivMod.testDivideUnsigned                    1024          mixed  avgt   15  13650.002 ±  52.788  ns/op
LongDivMod.testDivideUnsigned                    1024       positive  avgt   15  18784.942 ±  18.258  ns/op
LongDivMod.testDivideUnsigned                    1024       negative  avgt   15   7168.625 ±  17.019  ns/op        <====
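
For reference, the `bltz` branch in the quoted diff points to j.l.Long::divideUnsigned for its algorithm. The following is a minimal Java sketch of that negative-divisor shortcut, written to mirror the four instructions in the diff (sub, notr, andr, srli). The class and method names are hypothetical and are not part of the patch, and `Long.divideUnsigned` merely stands in for the `divu` instruction on the non-negative path.

```java
public class UnsignedDivSketch {
    // When the divisor has its sign bit set it is >= 2^63 as an unsigned value,
    // so the unsigned quotient can only be 0 or 1. It is 1 exactly when
    // dividend >= divisor in the unsigned sense.
    public static long divideUnsignedNegativeDivisor(long dividend, long divisor) {
        long t = dividend - divisor; // sub  result, rs1, rs2
        t = ~t;                      // notr result, result
        t &= dividend;               // andr result, result, rs1
        return t >>> 63;             // srli result, result, 63
    }

    // Mirrors the control flow of the v2 code path in the diff:
    // take the shortcut for a negative divisor, otherwise divide normally.
    public static long udivSketch(long dividend, long divisor) {
        if (divisor < 0) {
            return divideUnsignedNegativeDivisor(dividend, divisor);
        }
        return Long.divideUnsigned(dividend, divisor); // stands in for divu
    }

    public static void main(String[] args) {
        long[] dividends = {0L, 1L, Long.MAX_VALUE, Long.MIN_VALUE, -2L, -1L};
        long[] divisors = {Long.MIN_VALUE, -2L, -1L, 1L, 3L};
        for (long n : dividends) {
            for (long d : divisors) {
                if (udivSketch(n, d) != Long.divideUnsigned(n, d)) {
                    throw new AssertionError("mismatch for " + n + " / " + d);
                }
            }
        }
        System.out.println("all cases match Long.divideUnsigned");
    }
}
```

The sketch only illustrates why the negative-divisor path needs no divide instruction; it says nothing about the cost of the extra branch, which is what the `negative` numbers above measure.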

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16346#issuecomment-1780482379

