RFR: 8318723: RISC-V: C2 UDivL
Fei Yang
fyang at openjdk.org
Thu Oct 26 06:23:32 UTC 2023
On Tue, 24 Oct 2023 15:21:08 GMT, Hamlin Li <mli at openjdk.org> wrote:
> Hi,
> Can you review the change to add intrinsic for UDivI and UDivL?
> Thanks!
>
>
> ## Tests
>
> ### Functionality
> Successfully ran the tests found via `grep -nr test/jdk/ -we divideUnsigned` and `grep -nr test/hotspot/ -we divideUnsigned`
>
> ### Performance
> (NOTE: there are 2 other related issues: https://bugs.openjdk.org/browse/JDK-8318225, https://bugs.openjdk.org/browse/JDK-8318226; their PRs will be sent out after this one is finished.)
>
> #### Long
> **Before**
>
> LongDivMod.testDivideUnsigned 1024 mixed avgt 10 19704.317 ± 64.078 ns/op
> LongDivMod.testDivideUnsigned 1024 positive avgt 10 28856.859 ± 14.901 ns/op
> LongDivMod.testDivideUnsigned 1024 negative avgt 10 6364.974 ± 2.465 ns/op
>
>
> **After v1**
> (This is a simpler version, please check the diff from `After v2` below)
>
> LongDivMod.testDivideUnsigned 1024 mixed avgt 10 22668.228 ± 74.161 ns/op
> LongDivMod.testDivideUnsigned 1024 positive avgt 10 15966.320 ± 14.985 ns/op
> LongDivMod.testDivideUnsigned 1024 negative avgt 10 29518.033 ± 49.056 ns/op
>
>
> **After v2**
> (This is the current patch, **This version has a huge regression for negative values!!!**)
>
> LongDivMod.testDivideUnsigned 1024 mixed avgt 10 11432.738 ± 95.785 ns/op
> LongDivMod.testDivideUnsigned 1024 positive avgt 10 15969.044 ± 19.492 ns/op
> LongDivMod.testDivideUnsigned 1024 negative avgt 10 6376.674 ± 16.869 ns/op
>
>
> ##### Diff of v1 from v2
>
> diff --git a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
> index b96f7611133..dfb40e171e7 100644
> --- a/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
> +++ b/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp
> @@ -2432,16 +2432,7 @@ int MacroAssembler::corrected_idivq(Register result, Register rs1, Register rs2,
> if (is_signed) {
> div(result, rs1, rs2);
> } else {
> - Label Lltz, Ldone;
> - bltz(rs2, Lltz);
> divu(result, rs1, rs2);
> - j(Ldone);
> - bind(Lltz); // For the algorithm details, check j.l.Long::divideUnsigned
> - sub(result, rs1, rs2);
> - notr(result, result);
> - andr(result, result, rs1);
> - srli(result, result, 63);
> - bind(Ldone);
> }
> } else {
> rem(result, rs1, rs2); // result = rs1 % rs2;
>
>
>
> #### Integer
> **B...
So I tried this on a HiFive Unmatched board. Unfortunately, the JMH test shows some regression for the LongDivMod.testDivideUnsigned `negative` case.
Before:
LongDivMod.testDivideUnsigned 1024 mixed avgt 15 24909.748 ± 17.915 ns/op
LongDivMod.testDivideUnsigned 1024 positive avgt 15 36257.181 ± 33.615 ns/op
LongDivMod.testDivideUnsigned 1024 negative avgt 15 6720.904 ± 8.522 ns/op <====
After:
LongDivMod.testDivideUnsigned 1024 mixed avgt 15 13650.002 ± 52.788 ns/op
LongDivMod.testDivideUnsigned 1024 positive avgt 15 18784.942 ± 18.258 ns/op
LongDivMod.testDivideUnsigned 1024 negative avgt 15 7168.625 ± 17.019 ns/op <====
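
For reference, the negative-divisor path in the quoted diff (the sub / notr / andr / srli sequence) follows the branch-free formula that the diff comment attributes to j.l.Long::divideUnsigned: when the divisor has its top bit set, the unsigned quotient can only be 0 or 1. A minimal Java sketch of that path is below. The class and method names are hypothetical and used only for illustration; this is not the patch itself, just a way to check the formula against `Long.divideUnsigned`.

```java
public class UDivNegativeDivisor {
    // Hypothetical helper mirroring the quoted assembly sequence:
    //   sub  result, rs1, rs2      -> dividend - divisor
    //   notr result, result        -> ~(dividend - divisor)
    //   andr result, result, rs1   -> dividend & ~(dividend - divisor)
    //   srli result, result, 63    -> keep only the top bit
    // Only valid when divisor < 0 as a signed value (top bit set).
    static long divideUnsignedNegativeDivisor(long dividend, long divisor) {
        return (dividend & ~(dividend - divisor)) >>> 63;
    }

    public static void main(String[] args) {
        long[] dividends = {0L, 1L, -1L, -2L, Long.MIN_VALUE, Long.MAX_VALUE};
        long[] divisors = {-1L, -2L, -12345L, Long.MIN_VALUE};
        for (long dividend : dividends) {
            for (long divisor : divisors) {
                long expected = Long.divideUnsigned(dividend, divisor);
                long actual = divideUnsignedNegativeDivisor(dividend, divisor);
                if (expected != actual) {
                    throw new AssertionError(
                        "mismatch for " + dividend + " / " + divisor);
                }
            }
        }
        System.out.println("all negative-divisor cases match Long.divideUnsigned");
    }
}
```

The sketch only covers the case behind the `bltz(rs2, Lltz)` branch; non-negative divisors still go through the plain `divu` instruction in the quoted diff.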
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16346#issuecomment-1780482379