RFR: 8331558: AArch64: optimize integer remainder [v2]

Andrew Haley aph at openjdk.org
Thu May 30 08:35:03 UTC 2024


On Thu, 30 May 2024 07:12:31 GMT, Jin Guojie <duke at openjdk.org> wrote:

>> On some Arm processors, a separate multiply/subtract is actually faster than the combined instruction.
>> 
>> (1) The following test has passed, which shows performance improvement.
>> 
>> make test TEST="micro:java.lang.IntegerDivMod"
>> make test TEST="micro:java.lang.LongDivMod"
>> 
>> * IntegerDivMod.testDivideRemainderUnsigned baseline(ns/ops) 2223 with this pacth(ns/ops) 1885 improvement(%) 17.93%
>> 
>> * IntegerDivMod.testRemainderUnsigned baseline(ns/ops) 2225 with this pacth(ns/ops) 1885 improvement(%) 18.03%
>> 
>> * LongDivMod.testDivideRemainderUnsigned baseline(ns/ops) 2231 with this pacth(ns/ops) 1894 improvement(%) 17.79%
>> 
>> * LongDivMod.testRemainderUnsigned baseline(ns/ops) 2232 with this pacth(ns/ops) 1891 improvement(%) 18.03%
>> 
>> (2) jtreg test has passed
>> 
>> make run-test  TEST=tier1
>
> Jin Guojie has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
> 
>  - Merge branch 'openjdk:master' into dev0530
>  - 8331558: AArch64: optimize integer remainder
>    
>    On some Arm processors, a separate multiply/subtract is actually faster than the combined instruction.
>    
>    (1) The following test has passed, which shows performance improvement.
>    
>    make test TEST="micro:java.lang.IntegerDivMod"
>    make test TEST="micro:java.lang.LongDivMod"
>    
>    * IntegerDivMod.testDivideRemainderUnsigned
>      baseline(ns/ops) 2223
>      with this pacth(ns/ops) 1885
>      improvement(%) 17.93%
>    
>    * IntegerDivMod.testRemainderUnsigned
>      baseline(ns/ops) 2225
>      with this pacth(ns/ops) 1885
>      improvement(%) 18.03%
>    
>    * LongDivMod.testDivideRemainderUnsigned
>      baseline(ns/ops) 2231
>      with this pacth(ns/ops) 1894
>      improvement(%) 17.79%
>    
>    * LongDivMod.testRemainderUnsigned
>      baseline(ns/ops) 2232
>      with this pacth(ns/ops) 1891
>      improvement(%) 18.03%
>    
>    (2) jtreg test has passed
>    
>    make run-test  TEST=tier1

src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2299:

> 2297:   if (VM_Version::is_neoverse()) {
> 2298:     mul(rscratch2, Rn, Rm);
> 2299:     sub(Rd, Ra, rscratch2);

It's too risky to use `rscratch2` here. Instead, please make another version of `msub` that take a scratch register as an argument. We'll then use the new `msub` in places that we know are safe, such as compiler-generated code, and it won't cause any future maintenance surprises.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19471#discussion_r1620253887


More information about the hotspot-dev mailing list