RFR: 8374513: AArch64: Improve receiver type profiling reliability
Aleksey Shipilev
shade at openjdk.org
Tue Jan 20 10:14:31 UTC 2026
On Tue, 20 Jan 2026 09:50:40 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> [JDK-8374513](https://bugs.openjdk.org/browse/JDK-8374513) is the AArch64 port of [JDK-8357258](https://bugs.openjdk.org/browse/JDK-8357258). See the bug report for more detailed information.
>>
>> This PR executes the plan outlined in the bug:
>> - Common the receiver type profiling code in interpreter and C1
>> - Rewrite receiver type profiling code to only do atomic receiver slot installations
>> - Trim C1OptimizeVirtualCallProfiling to only claim slots when receiver is installed
>>
>> This PR does not do atomic counter updates themselves, as it may have much wider performance implications, including regressions. This PR should be at least performance neutral.
>>
>> Functional Testing: Linux AArch64 fastdebug Tier1-4
>>
>> Performance Testing (to show there is no performance regression):
>>
>> # Baseline
>> Benchmark                        (randomized)  Mode  Cnt   Score    Error  Units
>> InterfaceCalls.test1stInt2Types         false  avgt   12   1.940 ± 0.002  ns/op
>> InterfaceCalls.test1stInt2Types          true  avgt   12   7.714 ± 0.011  ns/op
>> InterfaceCalls.test1stInt3Types         false  avgt   12   6.747 ± 0.066  ns/op
>> InterfaceCalls.test1stInt3Types          true  avgt   12  18.733 ± 0.175  ns/op
>> InterfaceCalls.test1stInt5Types         false  avgt   12   6.690 ± 0.083  ns/op
>> InterfaceCalls.test1stInt5Types          true  avgt   12  21.299 ± 0.016  ns/op
>> InterfaceCalls.test2ndInt2Types         false  avgt   12   1.997 ± 0.004  ns/op
>> InterfaceCalls.test2ndInt2Types          true  avgt   12   8.027 ± 0.016  ns/op
>> InterfaceCalls.test2ndInt3Types         false  avgt   12   7.915 ± 0.114  ns/op
>> InterfaceCalls.test2ndInt3Types          true  avgt   12  16.988 ± 0.065  ns/op
>> InterfaceCalls.test2ndInt5Types         false  avgt   12   9.422 ± 0.017  ns/op
>> InterfaceCalls.test2ndInt5Types          true  avgt   12  22.736 ± 0.022  ns/op
>> InterfaceCalls.testIfaceCall            false  avgt   12   5.860 ± 0.046  ns/op
>> InterfaceCalls.testIfaceCall             true  avgt   12   5.794 ± 0.026  ns/op
>> InterfaceCalls.testIfaceExtCall         false  avgt   12   6.310 ± 0.067  ns/op
>> InterfaceCalls.testIfaceExtCall          true  avgt   12   6.239 ± 0.017  ns/op
>> InterfaceCalls.testMonomorphic          false  avgt   12   1.146 ± 0.034  ns/op
>> InterfaceCalls.testMonomorphic           true  avgt   12   1.131 ± 0.012  ns/op
>>
>> # With PR
>> Benchmark (randomized) Mode Cnt Score Error Units
>> InterfaceCalls.test1stInt2Type...
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2253:
>
>> 2251: lea(rscratch2, Address(mdp, offset));
>> 2252: cmpxchg(/*addr*/ rscratch2, /*expected*/ zr, /*new*/ recv, Assembler::xword,
>> 2253: /*acquire*/ false, /*release*/ false, /*weak*/ false, noreg);
>
> Suggestion:
>
> /*acquire*/ false, /*release*/ false, /*weak*/ true, noreg);
>
> If we fail a weak CAS here we should retry the whole loop: it probably means someone else has claimed the slot.
True. Since a failed CAS makes us restart the scan from the beginning anyway, there is no point in potentially looping inside the strong CAS implementation behind this macro.
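
To illustrate the reasoning, here is a rough portable C++ sketch of the "weak CAS, rescan on failure" shape. It is not the HotSpot code: `ReceiverProfile`, `kSlots`, `profile_receiver`, and the two-pass scan are hypothetical stand-ins for the profile rows and the macro assembler loop. The counters use a relaxed `fetch_add` only to keep the C++ free of data races; the PR itself deliberately leaves counter updates non-atomic.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the receiver rows of a profile; not HotSpot's layout.
constexpr std::size_t kSlots = 2;

struct ReceiverProfile {
    std::atomic<std::uintptr_t> receiver[kSlots];  // 0 means "slot is free"
    std::atomic<std::uint64_t>  count[kSlots];     // per-receiver hit counters
    std::atomic<std::uint64_t>  polymorphic;       // bumped when no slot matches or is free
};

// Records one call with receiver type `recv` (assumed non-zero).
void profile_receiver(ReceiverProfile& p, std::uintptr_t recv) {
    for (;;) {
        // Pass 1: is this receiver already installed in some slot?
        for (std::size_t i = 0; i < kSlots; i++) {
            if (p.receiver[i].load(std::memory_order_relaxed) == recv) {
                p.count[i].fetch_add(1, std::memory_order_relaxed);
                return;
            }
        }

        // Pass 2: try to claim the first free slot.
        bool rescan = false;
        for (std::size_t i = 0; i < kSlots; i++) {
            if (p.receiver[i].load(std::memory_order_relaxed) == 0) {
                std::uintptr_t expected = 0;
                // A weak CAS may fail spuriously or because another thread won
                // the slot. Either way we do not spin on this slot: we restart
                // the whole scan, which also notices if the winner installed
                // the very same receiver.
                if (p.receiver[i].compare_exchange_weak(
                        expected, recv,
                        std::memory_order_relaxed, std::memory_order_relaxed)) {
                    p.count[i].fetch_add(1, std::memory_order_relaxed);
                    return;
                }
                rescan = true;
                break;
            }
        }

        if (!rescan) {
            // No matching slot and no free slot left.
            p.polymorphic.fetch_add(1, std::memory_order_relaxed);
            return;
        }
    }
}
```

Because the outer loop already handles every failure by rescanning, a strong CAS would only add an inner retry loop on targets where CAS is built from load-exclusive/store-exclusive pairs, without changing the result.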
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/29283#discussion_r2707666285