RFR: 8331558: AArch64: optimize integer remainder [v3]
Andrew Haley
aph at openjdk.org
Wed May 8 09:17:55 UTC 2024
On Wed, 8 May 2024 01:04:37 GMT, Jin Guojie <duke at openjdk.org> wrote:
>> 8331558: AArch64: optimize integer remainder
>> On some Arm processors, a separate multiply/subtract is actually faster than the combined instruction.
>>
>> 8331556: AArch64: CPU_Model support for Neoverse N1/N2/V1/V2
>> Add full platform coverage for Neoverse variants in vm_version.?pp
>>
>> The following tests pass and show a clear performance improvement.
>>
>> make test TEST="micro:java.lang.IntegerDivMod"
>> make test TEST="micro:java.lang.LongDivMod"
>>
>> * IntegerDivMod.testDivideRemainderUnsigned
>> baseline(ns/ops) 2223
>> with this patch(ns/ops) 1885
>> improvement(%) 17.93%
>>
>> * IntegerDivMod.testRemainderUnsigned
>> baseline(ns/ops) 2225
>> with this patch(ns/ops) 1885
>> improvement(%) 18.03%
>>
>> * LongDivMod.testDivideRemainderUnsigned
>> baseline(ns/ops) 2231
>> with this patch(ns/ops) 1894
>> improvement(%) 17.79%
>>
>> * LongDivMod.testRemainderUnsigned
>> baseline(ns/ops) 2232
>> with this patch(ns/ops) 1891
>> improvement(%) 18.03%
>
> Jin Guojie has updated the pull request incrementally with one additional commit since the last revision:
>
> Applicable platforms expanded to the entire neoverse family
>
> Even on the V series (V1 and V2), both the sdiv/udiv and msub instructions execute on the M0 unit (integer multi-cycle), so the change should benefit the V series as well.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 465:
> 463: /* On Neoverse, MSUB uses the same ALU with SDIV.
> 464: * The combination of MUL/SUB can utilize multiple ALUs,
> 465: * and is much faster than MSUB. */
Suggestion:
* The combination of MUL/SUB can utilize multiple ALUs,
* and can be somewhat faster than MSUB. */
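
For readers following along, here is a minimal C++ sketch of the arithmetic the comment refers to: the remainder is computed as n - (n / d) * d, so the multiply and subtract can be issued either as one fused MSUB or as a separate MUL and SUB. The function names are hypothetical and are not taken from the patch; which instructions a compiler actually emits for this source is up to the compiler.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, for illustration only (not HotSpot code).

// Unsigned remainder via quotient: r = n - (n / d) * d.
// On AArch64 this maps to UDIV followed by either MSUB or MUL + SUB.
// Precondition: d != 0.
static inline uint32_t urem_via_div(uint32_t n, uint32_t d) {
    uint32_t q = n / d;   // quotient (UDIV)
    uint32_t p = q * d;   // product (MUL, or folded into MSUB)
    return n - p;         // remainder (SUB, or folded into MSUB)
}

// Signed 64-bit variant; division truncates toward zero, so the
// remainder takes the sign of the dividend.
// Preconditions: d != 0, and not (n == INT64_MIN && d == -1).
static inline int64_t srem_via_div(int64_t n, int64_t d) {
    int64_t q = n / d;    // quotient (SDIV)
    return n - q * d;     // remainder
}

// Small self-check comparing against the built-in % operator.
static inline void check_examples() {
    assert(urem_via_div(17u, 5u) == 17u % 5u);
    assert(srem_via_div(-7, 2) == -7 % 2);
}
```

Both forms compute the same value; the difference under discussion is only in how the multiply and subtract are scheduled on the core.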
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19093#discussion_r1593709793