RFR: 8331558: AArch64: optimize integer remainder [v2]
Andrew Haley
aph at openjdk.org
Thu May 30 08:35:03 UTC 2024
On Thu, 30 May 2024 07:12:31 GMT, Jin Guojie <duke at openjdk.org> wrote:
>> On some Arm processors, a separate multiply/subtract is actually faster than the combined instruction.
>>
>> (1) The following test has passed, which shows performance improvement.
>>
>> make test TEST="micro:java.lang.IntegerDivMod"
>> make test TEST="micro:java.lang.LongDivMod"
>>
>> * IntegerDivMod.testDivideRemainderUnsigned baseline(ns/ops) 2223 with this pacth(ns/ops) 1885 improvement(%) 17.93%
>>
>> * IntegerDivMod.testRemainderUnsigned baseline(ns/ops) 2225 with this pacth(ns/ops) 1885 improvement(%) 18.03%
>>
>> * LongDivMod.testDivideRemainderUnsigned baseline(ns/ops) 2231 with this pacth(ns/ops) 1894 improvement(%) 17.79%
>>
>> * LongDivMod.testRemainderUnsigned baseline(ns/ops) 2232 with this pacth(ns/ops) 1891 improvement(%) 18.03%
>>
>> (2) jtreg test has passed
>>
>> make run-test TEST=tier1
>
> Jin Guojie has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
>
> - Merge branch 'openjdk:master' into dev0530
> - 8331558: AArch64: optimize integer remainder
>
> On some Arm processors, a separate multiply/subtract is actually faster than the combined instruction.
>
> (1) The following test has passed, which shows performance improvement.
>
> make test TEST="micro:java.lang.IntegerDivMod"
> make test TEST="micro:java.lang.LongDivMod"
>
> * IntegerDivMod.testDivideRemainderUnsigned
> baseline(ns/ops) 2223
> with this pacth(ns/ops) 1885
> improvement(%) 17.93%
>
> * IntegerDivMod.testRemainderUnsigned
> baseline(ns/ops) 2225
> with this pacth(ns/ops) 1885
> improvement(%) 18.03%
>
> * LongDivMod.testDivideRemainderUnsigned
> baseline(ns/ops) 2231
> with this pacth(ns/ops) 1894
> improvement(%) 17.79%
>
> * LongDivMod.testRemainderUnsigned
> baseline(ns/ops) 2232
> with this pacth(ns/ops) 1891
> improvement(%) 18.03%
>
> (2) jtreg test has passed
>
> make run-test TEST=tier1
src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2299:
> 2297: if (VM_Version::is_neoverse()) {
> 2298: mul(rscratch2, Rn, Rm);
> 2299: sub(Rd, Ra, rscratch2);
It's too risky to use `rscratch2` here. Instead, please make another version of `msub` that take a scratch register as an argument. We'll then use the new `msub` in places that we know are safe, such as compiler-generated code, and it won't cause any future maintenance surprises.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19471#discussion_r1620253887
More information about the hotspot-dev
mailing list