RFR: 8331558: AArch64: optimize integer remainder [v3]
Andrew Haley
aph at openjdk.org
Thu May 30 10:19:04 UTC 2024
On Thu, 30 May 2024 09:23:13 GMT, Jin Guojie <duke at openjdk.org> wrote:
>> On some Arm processors, a separate multiply/subtract is actually faster than the combined instruction.
>>
>> (1) The following microbenchmarks pass and show a performance improvement.
>>
>> make test TEST="micro:java.lang.IntegerDivMod"
>> make test TEST="micro:java.lang.LongDivMod"
>>
>> * IntegerDivMod.testDivideRemainderUnsigned: baseline 2223 ns/op, with this patch 1885 ns/op, improvement 17.93%
>>
>> * IntegerDivMod.testRemainderUnsigned: baseline 2225 ns/op, with this patch 1885 ns/op, improvement 18.03%
>>
>> * LongDivMod.testDivideRemainderUnsigned: baseline 2231 ns/op, with this patch 1894 ns/op, improvement 17.79%
>>
>> * LongDivMod.testRemainderUnsigned: baseline 2232 ns/op, with this patch 1891 ns/op, improvement 18.03%
>>
>> (2) The jtreg tests have passed.
>>
>> make run-test TEST=tier1
>
> Jin Guojie has updated the pull request incrementally with two additional commits since the last revision:
>
> - Merge branch 'dev0530' of https://github.com/jinguojie-alibaba/jdk into dev0530
> - MacroAssembler::msub() takes a scratch register as an argument
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 446:
> 444:
> 445: void msub(Register Rd, Register Rn, Register Rm, Register Ra, Register tmp = rscratch2);
> 446: void msubw(Register Rd, Register Rn, Register Rm, Register Ra, Register tmp = rscratch2);
Please delete these two methods that use rscratch2 as a default tmp register.
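
To illustrate the concern, here is a minimal toy model, not HotSpot code. It assumes the split form computes Rd = Ra - Rn * Rm as a multiply into tmp followed by a subtract. `ToyAssembler`, the `Reg` enum, and `remainder_unsigned` are hypothetical names made up for this sketch. With no default for tmp, every call site has to name the register that gets clobbered.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical toy register file; the names only mimic AArch64 conventions.
enum Reg { r0, r1, r2, r3, rscratch1, rscratch2, NumRegs };

struct ToyAssembler {
    std::array<uint64_t, NumRegs> regs{};

    // Rd = Ra - Rn * Rm, done as a separate multiply and subtract.
    // tmp is overwritten, so it has no default: the caller must choose it.
    void msub(Reg Rd, Reg Rn, Reg Rm, Reg Ra, Reg tmp) {
        assert(tmp != Ra && "tmp must not alias Ra");
        regs[tmp] = regs[Rn] * regs[Rm];   // mul tmp, Rn, Rm
        regs[Rd]  = regs[Ra] - regs[tmp];  // sub Rd, Ra, tmp
    }
};

// Unsigned remainder as n - (n / d) * d, with the scratch register made explicit.
uint64_t remainder_unsigned(ToyAssembler& a, uint64_t n, uint64_t d, Reg tmp) {
    a.regs[r0] = n;
    a.regs[r1] = d;
    a.regs[r2] = a.regs[r0] / a.regs[r1];  // udiv r2, r0, r1
    a.msub(r3, r2, r1, r0, tmp);           // r3 = r0 - r2 * r1
    return a.regs[r3];
}
```

A call such as `remainder_unsigned(a, 17, 5, rscratch1)` makes it visible at the call site that rscratch1 is clobbered, whereas a defaulted `tmp = rscratch2` would hide that from anyone reading the caller.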
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19471#discussion_r1620421085