RFR: 8331558: AArch64: optimize integer remainder [v3]

Andrew Haley aph at openjdk.org
Thu May 30 10:19:04 UTC 2024


On Thu, 30 May 2024 09:23:13 GMT, Jin Guojie <duke at openjdk.org> wrote:

>> On some Arm processors, a separate multiply followed by a subtract is actually faster than the combined multiply-subtract (msub) instruction.
>> 
>> (1) The following microbenchmarks pass and show a performance improvement.
>> 
>> make test TEST="micro:java.lang.IntegerDivMod"
>> make test TEST="micro:java.lang.LongDivMod"
>> 
>> * IntegerDivMod.testDivideRemainderUnsigned: baseline 2223 ns/op, with this patch 1885 ns/op, improvement 17.93%
>> 
>> * IntegerDivMod.testRemainderUnsigned: baseline 2225 ns/op, with this patch 1885 ns/op, improvement 18.03%
>> 
>> * LongDivMod.testDivideRemainderUnsigned: baseline 2231 ns/op, with this patch 1894 ns/op, improvement 17.79%
>> 
>> * LongDivMod.testRemainderUnsigned: baseline 2232 ns/op, with this patch 1891 ns/op, improvement 18.03%
>> 
>> (2) The jtreg tier1 tests pass.
>> 
>> make run-test TEST=tier1
>
> Jin Guojie has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Merge branch 'dev0530' of https://github.com/jinguojie-alibaba/jdk into dev0530
>  - MacroAssembler::msub() takes a scratch register as an argument

src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 446:

> 444: 
> 445:   void msub(Register Rd, Register Rn, Register Rm, Register Ra, Register tmp = rscratch2);
> 446:   void msubw(Register Rd, Register Rn, Register Rm, Register Ra, Register tmp = rscratch2);

Please delete these two methods that use rscratch2 as a default tmp register.
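
For readers following the thread, here is a minimal Java sketch of the arithmetic under discussion. It assumes the remainder is formed as n - (n / d) * d, which is what a divide followed by AArch64 msub (Rd = Ra - Rn * Rm) computes, and it writes the multiply and the subtract as two separate steps. The class and method names are hypothetical and chosen for illustration only; this is not the HotSpot MacroAssembler code and says nothing about which registers the patch uses.

```java
public class RemainderSketch {
    // Unsigned 32-bit remainder as "divide, multiply, subtract".
    // Integer wrap-around in the multiply and subtract is harmless:
    // the result is exact modulo 2^32 and the true remainder is < d.
    static int remU32(int n, int d) {
        int q = Integer.divideUnsigned(n, d); // corresponds to udiv
        int p = q * d;                        // separate multiply
        return n - p;                         // separate subtract
    }

    // Same identity for unsigned 64-bit operands.
    static long remU64(long n, long d) {
        long q = Long.divideUnsigned(n, d);
        long p = q * d;
        return n - p;
    }

    public static void main(String[] args) {
        int n = -7;  // 4294967289 when read as unsigned
        int d = 10;
        // Both lines print the same value.
        System.out.println(Integer.toUnsignedString(remU32(n, d)));
        System.out.println(Integer.toUnsignedString(Integer.remainderUnsigned(n, d)));
    }
}
```

The sketch only checks that the split form agrees with Integer.remainderUnsigned and Long.remainderUnsigned; whether the split is faster depends on the processor, as the benchmark numbers quoted above suggest.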

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19471#discussion_r1620421085

