RFR: 8318723: RISC-V: C2 UDivL
Ludovic Henry
luhenry at openjdk.org
Wed Oct 25 07:37:28 UTC 2023
On Wed, 25 Oct 2023 06:55:18 GMT, Fei Yang <fyang at openjdk.org> wrote:
>> Hi,
>> Can you review the change to add intrinsic for UDivI and UDivL?
>> Thanks!
>>
>>
>> ## Tests
>>
>> ### Functionality
>> Run tests successfully found via `grep -nr test/jdk/ -we divideUnsigned` and `grep -nr test/hotspot/ -we divideUnsigned`
>>
>> ### Performance
>> NOTE: there are another 2 related issues: https://bugs.openjdk.org/browse/JDK-8318225, https://bugs.openjdk.org/browse/JDK-8318226, the pr of which will be subseqently sent out after this one finished.
>>
>> #### Long
>> ** Before **
>>
>> LongDivMod.testDivideUnsigned 1024 mixed avgt 2 19852.277 ns/op
>> LongDivMod.testDivideUnsigned 1024 positive avgt 2 29155.681 ns/op
>> LongDivMod.testDivideUnsigned 1024 negative avgt 2 6385.280 ns/op
>>
>>
>> ** After **
>>
>> LongDivMod.testDivideUnsigned 1024 mixed avgt 2 11776.806 ns/op
>> LongDivMod.testDivideUnsigned 1024 positive avgt 2 16101.940 ns/op
>> LongDivMod.testDivideUnsigned 1024 negative avgt 2 6433.223 ns/op
>>
>>
>> #### Integer
>> ** Before **
>>
>> IntegerDivMod.testDivideUnsigned 1024 mixed avgt 2 23498.570 ns/op
>> IntegerDivMod.testDivideUnsigned 1024 positive avgt 2 16875.614 ns/op
>> IntegerDivMod.testDivideUnsigned 1024 negative avgt 2 30310.243 ns/op
>>
>>
>> ** After **
>>
>> IntegerDivMod.testDivideUnsigned 1024 mixed avgt 2 23327.997 ns/op
>> IntegerDivMod.testDivideUnsigned 1024 positive avgt 2 16708.209 ns/op
>> IntegerDivMod.testDivideUnsigned 1024 negative avgt 2 30162.153 ns/op
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2436:
>
>> 2434: } else {
>> 2435: Label Lltz, Ldone;
>> 2436: bltz(rs2, Lltz);
>
> I am not quite sure what this `bltz` branch is for. Is this a minor performance tunning here? And How would this make a difference then if that's true? I didn't see much difference from the LongDivMod.testDivideUnsigned `negative` jmh test result.
+1. It's also the only test case where there is a regression on the JMH numbers, or at least not a clear improvement (before: 6385.280, after: 6433.223)
On your JMH numbers, how many iterations have you run for each benchmark? I don't see the standard deviation which would be useful to better understand noise.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16346#discussion_r1371283500
More information about the hotspot-dev
mailing list