RFR: 8374513: AArch64: Improve receiver type profiling reliability
Aleksey Shipilev
shade at openjdk.org
Mon Jan 19 21:10:15 UTC 2026
On Sat, 17 Jan 2026 23:40:45 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> [JDK-8374513](https://bugs.openjdk.org/browse/JDK-8374513) is the AArch64 port of [JDK-8357258](https://bugs.openjdk.org/browse/JDK-8357258). See the bug report for more detailed information.
>>
>> This PR executes the plan outlined in the bug:
>> - Common the receiver type profiling code in interpreter and C1
>> - Rewrite receiver type profiling code to only do atomic receiver slot installations
>> - Trim C1OptimizeVirtualCallProfiling to only claim slots when receiver is installed
>>
>> This PR does not do atomic counter updates themselves, as it may have much wider performance implications, including regressions. This PR should be at least performance neutral.
>>
>> Functional Testing: Linux AArch64 fastdebug Tier1-4
>>
>> Performance Testing (to show there is no performance regression):
>>
>> # Baseline
>> Benchmark (randomized) Mode Cnt Score Error Units
>> InterfaceCalls.test1stInt2Types false avgt 12 1.940 ± 0.002 ns/op
>> InterfaceCalls.test1stInt2Types true avgt 12 7.714 ± 0.011 ns/op
>> InterfaceCalls.test1stInt3Types false avgt 12 6.747 ± 0.066 ns/op
>> InterfaceCalls.test1stInt3Types true avgt 12 18.733 ± 0.175 ns/op
>> InterfaceCalls.test1stInt5Types false avgt 12 6.690 ± 0.083 ns/op
>> InterfaceCalls.test1stInt5Types true avgt 12 21.299 ± 0.016 ns/op
>> InterfaceCalls.test2ndInt2Types false avgt 12 1.997 ± 0.004 ns/op
>> InterfaceCalls.test2ndInt2Types true avgt 12 8.027 ± 0.016 ns/op
>> InterfaceCalls.test2ndInt3Types false avgt 12 7.915 ± 0.114 ns/op
>> InterfaceCalls.test2ndInt3Types true avgt 12 16.988 ± 0.065 ns/op
>> InterfaceCalls.test2ndInt5Types false avgt 12 9.422 ± 0.017 ns/op
>> InterfaceCalls.test2ndInt5Types true avgt 12 22.736 ± 0.022 ns/op
>> InterfaceCalls.testIfaceCall false avgt 12 5.860 ± 0.046 ns/op
>> InterfaceCalls.testIfaceCall true avgt 12 5.794 ± 0.026 ns/op
>> InterfaceCalls.testIfaceExtCall false avgt 12 6.310 ± 0.067 ns/op
>> InterfaceCalls.testIfaceExtCall true avgt 12 6.239 ± 0.017 ns/op
>> InterfaceCalls.testMonomorphic false avgt 12 1.146 ± 0.034 ns/op
>> InterfaceCalls.testMonomorphic true avgt 12 1.131 ± 0.012 ns/op
>>
>> # With PR
>> Benchmark (randomized) Mode Cnt Score Error Units
>> InterfaceCalls.test1stInt2Type...
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2256:
>
>> 2254: lea(rscratch2, Address(mdp, offset));
>> 2255: cmpxchg(/*addr*/ rscratch2, /*expected*/ zr, /*new*/ recv, Assembler::xword,
>> 2256: /*acquire*/ true, /*release*/ false, /*weak*/ false, noreg);
>
> /*acquire*/ true, /*release*/ false, /*weak*/ false, noreg);
>
> I'm curious about the acquire. For it to make sense here, I would have thought it must be linked with an earlier release in another thread.
We can use `acquire = false`, `release = false` here. This CAS only claims the empty slot for the new receiver type. Nothing else depends on its memory-ordering effects, so the CAS needs atomicity only, and any additional ordering is superfluous.
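
To illustrate the point in portable terms, here is a minimal C++ sketch of a slot claim that relies on atomicity alone. It is not the HotSpot code: `ReceiverRow`, `kRows`, and `profile_receiver` are hypothetical names, and `std::memory_order_relaxed` stands in for `acquire = false`, `release = false`.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one receiver row in a profile; not HotSpot's layout.
struct ReceiverRow {
    std::atomic<std::uintptr_t> receiver{0};  // 0 means "slot is empty"
};

constexpr std::size_t kRows = 2;  // assumed table width, for illustration only

// Returns the index of the row that holds `recv`, or -1 if every row is
// already taken by a different receiver.
int profile_receiver(ReceiverRow (&rows)[kRows], std::uintptr_t recv) {
    for (std::size_t i = 0; i < kRows; ++i) {
        std::uintptr_t seen = rows[i].receiver.load(std::memory_order_relaxed);
        if (seen == 0) {
            // Claim the empty slot. Only atomicity matters: no other data is
            // published through this store, so relaxed ordering is enough.
            std::uintptr_t expected = 0;
            if (rows[i].receiver.compare_exchange_strong(
                    expected, recv,
                    std::memory_order_relaxed, std::memory_order_relaxed)) {
                seen = recv;
            } else {
                // Lost the race: `expected` now holds the winner's value.
                seen = expected;
            }
        }
        if (seen == recv) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The strong (non-weak) CAS mirrors `/*weak*/ false` in the quoted code: a failure then always means another thread installed a receiver, never a spurious failure.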
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/29283#discussion_r2703640346