RFR: 8374513: AArch64: Improve receiver type profiling reliability

Aleksey Shipilev shade at openjdk.org
Tue Jan 20 10:14:31 UTC 2026


On Tue, 20 Jan 2026 09:50:40 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> [JDK-8374513](https://bugs.openjdk.org/browse/JDK-8374513) is the AArch64 port of [JDK-8357258](https://bugs.openjdk.org/browse/JDK-8357258). See the bug report for more detailed information.
>> 
>> This PR executes the plan outlined in the bug:
>> - Common the receiver type profiling code in interpreter and C1
>> - Rewrite receiver type profiling code to only do atomic receiver slot installations
>> - Trim C1OptimizeVirtualCallProfiling to only claim slots when receiver is installed
>> 
>> This PR does not make the counter updates themselves atomic, as that may have much wider performance implications, including regressions. This PR should be at least performance neutral.
>> 
>> Functional Testing: Linux AArch64 fastdebug Tier1-4
>> 
>> Performance Testing (to show there is no performance regression):
>> 
>> # Baseline
>> Benchmark                        (randomized)  Mode  Cnt   Score   Error  Units
>> InterfaceCalls.test1stInt2Types         false  avgt   12   1.940 ± 0.002  ns/op
>> InterfaceCalls.test1stInt2Types          true  avgt   12   7.714 ± 0.011  ns/op
>> InterfaceCalls.test1stInt3Types         false  avgt   12   6.747 ± 0.066  ns/op
>> InterfaceCalls.test1stInt3Types          true  avgt   12  18.733 ± 0.175  ns/op
>> InterfaceCalls.test1stInt5Types         false  avgt   12   6.690 ± 0.083  ns/op
>> InterfaceCalls.test1stInt5Types          true  avgt   12  21.299 ± 0.016  ns/op
>> InterfaceCalls.test2ndInt2Types         false  avgt   12   1.997 ± 0.004  ns/op
>> InterfaceCalls.test2ndInt2Types          true  avgt   12   8.027 ± 0.016  ns/op
>> InterfaceCalls.test2ndInt3Types         false  avgt   12   7.915 ± 0.114  ns/op
>> InterfaceCalls.test2ndInt3Types          true  avgt   12  16.988 ± 0.065  ns/op
>> InterfaceCalls.test2ndInt5Types         false  avgt   12   9.422 ± 0.017  ns/op
>> InterfaceCalls.test2ndInt5Types          true  avgt   12  22.736 ± 0.022  ns/op
>> InterfaceCalls.testIfaceCall            false  avgt   12   5.860 ± 0.046  ns/op
>> InterfaceCalls.testIfaceCall             true  avgt   12   5.794 ± 0.026  ns/op
>> InterfaceCalls.testIfaceExtCall         false  avgt   12   6.310 ± 0.067  ns/op
>> InterfaceCalls.testIfaceExtCall          true  avgt   12   6.239 ± 0.017  ns/op
>> InterfaceCalls.testMonomorphic          false  avgt   12   1.146 ± 0.034  ns/op
>> InterfaceCalls.testMonomorphic           true  avgt   12   1.131 ± 0.012  ns/op
>> 
>> # With PR
>> Benchmark                        (randomized)  Mode  Cnt   Score   Error  Units
>> InterfaceCalls.test1stInt2Type...
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2253:
> 
>> 2251:   lea(rscratch2, Address(mdp, offset));
>> 2252:   cmpxchg(/*addr*/ rscratch2, /*expected*/ zr, /*new*/ recv, Assembler::xword,
>> 2253:           /*acquire*/ false, /*release*/ false, /*weak*/ false, noreg);
> 
> Suggestion:
> 
>           /*acquire*/ false, /*release*/ false, /*weak*/ true, noreg);
> 
> If we fail a weak CAS here we should retry the whole loop: it probably means someone else has claimed the slot.

True. Since we restart the scan from the beginning anyway, there is no point in potentially looping inside the strong CAS implementation behind this macro.
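
To illustrate the reasoning, here is a minimal C++ sketch of the "weak CAS, then rescan" shape. It is not the HotSpot code: `ReceiverProfile`, `kRows`, `bump` and `profile_receiver` are hypothetical names, the two-row layout is an assumption, and `std::atomic` stands in for the `cmpxchg` macro. A failed weak CAS, whether spurious or because another thread claimed the slot, simply restarts the scan, so nothing is gained from a strong CAS that loops internally.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the receiver rows of a profile; not HotSpot's layout.
constexpr std::size_t kRows = 2;

struct ReceiverProfile {
  std::atomic<std::uintptr_t> receiver[kRows] = {};  // 0 marks an empty slot
  std::atomic<std::uint64_t>  count[kRows] = {};
  std::atomic<std::uint64_t>  polymorphic{0};
};

// Counter bump as a separate load and store: deliberately not an atomic
// read-modify-write, mirroring the note that counter updates stay non-atomic.
static void bump(std::atomic<std::uint64_t>& c) {
  c.store(c.load(std::memory_order_relaxed) + 1, std::memory_order_relaxed);
}

void profile_receiver(ReceiverProfile& p, std::uintptr_t recv) {
  assert(recv != 0 && "0 is reserved for empty slots in this sketch");
  for (;;) {
    // Pass 1: is this receiver already installed in some slot?
    for (std::size_t i = 0; i < kRows; i++) {
      if (p.receiver[i].load(std::memory_order_relaxed) == recv) {
        bump(p.count[i]);
        return;
      }
    }
    // Pass 2: try to claim the first empty slot with a weak CAS.
    bool rescan = false;
    for (std::size_t i = 0; i < kRows; i++) {
      if (p.receiver[i].load(std::memory_order_relaxed) != 0) continue;
      std::uintptr_t expected = 0;
      if (p.receiver[i].compare_exchange_weak(expected, recv,
                                              std::memory_order_relaxed,
                                              std::memory_order_relaxed)) {
        bump(p.count[i]);
        return;
      }
      // Spurious failure or a lost race: either way, start over from pass 1.
      rescan = true;
      break;
    }
    if (!rescan) {
      // Every slot holds some other receiver: record a polymorphic hit.
      bump(p.polymorphic);
      return;
    }
  }
}
```

Rescanning from pass 1 after a failed CAS also covers the case where the competing thread installed the same receiver: the next iteration finds it and only bumps its counter.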

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29283#discussion_r2707666285

