RFR: 8318723: RISC-V: C2 UDivL

Wed Oct 25 07:37:28 UTC 2023

On Wed, 25 Oct 2023 06:55:18 GMT, Fei Yang <fyang at openjdk.org> wrote:

>> Hi,
>> Could you review this change, which adds intrinsics for UDivI and UDivL?
>> Thanks!
>> 
>> 
>> ## Tests
>> 
>> ### Functionality
>> Successfully ran the tests found via `grep -nr test/jdk/ -we divideUnsigned` and `grep -nr test/hotspot/ -we divideUnsigned`.
>> 
>> ### Performance
>> NOTE: there are two other related issues: https://bugs.openjdk.org/browse/JDK-8318225 and https://bugs.openjdk.org/browse/JDK-8318226. PRs for them will be sent out after this one is finished.
>> 
>> #### Long
>> ** Before **
>> 
>> LongDivMod.testDivideUnsigned                    1024          mixed  avgt    2  19852.277          ns/op
>> LongDivMod.testDivideUnsigned                    1024       positive  avgt    2  29155.681          ns/op
>> LongDivMod.testDivideUnsigned                    1024       negative  avgt    2   6385.280          ns/op
>> 
>> 
>> ** After **
>> 
>> LongDivMod.testDivideUnsigned                    1024          mixed  avgt    2  11776.806          ns/op
>> LongDivMod.testDivideUnsigned                    1024       positive  avgt    2  16101.940          ns/op
>> LongDivMod.testDivideUnsigned                    1024       negative  avgt    2   6433.223          ns/op
>> 
>> 
>> #### Integer
>> ** Before **
>> 
>> IntegerDivMod.testDivideUnsigned                    1024          mixed  avgt    2  23498.570          ns/op
>> IntegerDivMod.testDivideUnsigned                    1024       positive  avgt    2  16875.614          ns/op
>> IntegerDivMod.testDivideUnsigned                    1024       negative  avgt    2  30310.243          ns/op
>> 
>> 
>> ** After **
>> 
>> IntegerDivMod.testDivideUnsigned                    1024          mixed  avgt    2  23327.997          ns/op
>> IntegerDivMod.testDivideUnsigned                    1024       positive  avgt    2  16708.209          ns/op
>> IntegerDivMod.testDivideUnsigned                    1024       negative  avgt    2  30162.153          ns/op
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2436:
> 
>> 2434:     } else {
>> 2435:       Label Lltz, Ldone;
>> 2436:       bltz(rs2, Lltz);
> 
> I am not quite sure what this `bltz` branch is for. Is it a minor performance tuning? If so, how does it make a difference? I didn't see much difference in the LongDivMod.testDivideUnsigned `negative` JMH test result.

+1. It's also the only test case where the JMH numbers show a regression, or at least no clear improvement (before: 6385.280, after: 6433.223).
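
For context, here is a minimal Java sketch of what a branch on a negative divisor could buy in unsigned 64-bit division. It is an assumption about the intent of the `bltz(rs2, Lltz)` branch, not a description of the code in the PR: when the divisor has its sign bit set, its unsigned value is at least 2^63, so the quotient can only be 0 or 1 and a single unsigned compare can stand in for the divide. The class and method names are hypothetical.

```java
public class UnsignedDivSketch {
    // Hypothetical helper, not the PR's code: unsigned 64-bit division with a
    // fast path for divisors whose sign bit is set (unsigned value >= 2^63).
    static long divideUnsignedSketch(long dividend, long divisor) {
        if (divisor < 0) {
            // Divisor is >= 2^63 when viewed as unsigned, so the quotient is
            // 1 if dividend >= divisor (unsigned compare), otherwise 0.
            return Long.compareUnsigned(dividend, divisor) >= 0 ? 1L : 0L;
        }
        // Otherwise fall back to the regular unsigned divide.
        return Long.divideUnsigned(dividend, divisor);
    }

    public static void main(String[] args) {
        long[] samples = {0L, 1L, 42L, -1L, -42L, Long.MIN_VALUE, Long.MAX_VALUE};
        for (long a : samples) {
            for (long b : samples) {
                if (b == 0) {
                    continue; // division by zero is out of scope for this sketch
                }
                if (divideUnsignedSketch(a, b) != Long.divideUnsigned(a, b)) {
                    throw new AssertionError("mismatch for " + a + " / " + b);
                }
            }
        }
        System.out.println("fast path agrees with Long.divideUnsigned");
    }
}
```

If the branch is meant as such a shortcut, the `negative` benchmark is where it would be expected to show up, which is why the flat result there stands out.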

Regarding your JMH numbers, how many iterations did you run for each benchmark? I don't see the standard deviation, which would help gauge the noise.
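
Without an error column, the only thing that can be derived from the posted scores is the raw relative change. The sketch below computes it for the three LongDivMod rows quoted above. The class and method names are hypothetical, and the result says nothing about whether a difference is statistically significant.

```java
public class RelativeChangeSketch {
    // Hypothetical helper: relative change of `after` versus `before`, in
    // percent. Negative means fewer ns/op (faster), positive means slower.
    static double relativeChangePercent(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // Scores (ns/op) copied from the LongDivMod.testDivideUnsigned rows.
        String[] labels = {"mixed", "positive", "negative"};
        double[] before = {19852.277, 29155.681, 6385.280};
        double[] after = {11776.806, 16101.940, 6433.223};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-8s %+.2f%%%n", labels[i],
                    relativeChangePercent(before[i], after[i]));
        }
    }
}
```

The `negative` row changes by less than one percent, which could easily be noise; that is hard to judge without more iterations and the reported error.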

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16346#discussion_r1371283500