RFR: 8374513: AArch64: Improve receiver type profiling reliability

Aleksey Shipilev shade at openjdk.org
Mon Jan 19 21:10:15 UTC 2026


On Sat, 17 Jan 2026 23:40:45 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> [JDK-8374513](https://bugs.openjdk.org/browse/JDK-8374513) is the AArch64 port of [JDK-8357258](https://bugs.openjdk.org/browse/JDK-8357258). See the bug report for more detailed information.
>> 
>> This PR executes the plan outlined in the bug:
>> - Common the receiver type profiling code in interpreter and C1
>> - Rewrite receiver type profiling code to only do atomic receiver slot installations
>> - Trim C1OptimizeVirtualCallProfiling to only claim slots when receiver is installed
>> 
>> This PR does not make the counter updates themselves atomic, as that may have much wider performance implications, including regressions. This PR should be at least performance neutral.
>> 
>> Functional Testing: Linux AArch64 fastdebug Tier1-4
>> 
>> Performance Testing (to show there is no performance regression):
>> 
>> # Baseline
>> Benchmark                        (randomized)  Mode  Cnt   Score   Error  Units
>> InterfaceCalls.test1stInt2Types         false  avgt   12   1.940 ± 0.002  ns/op
>> InterfaceCalls.test1stInt2Types          true  avgt   12   7.714 ± 0.011  ns/op
>> InterfaceCalls.test1stInt3Types         false  avgt   12   6.747 ± 0.066  ns/op
>> InterfaceCalls.test1stInt3Types          true  avgt   12  18.733 ± 0.175  ns/op
>> InterfaceCalls.test1stInt5Types         false  avgt   12   6.690 ± 0.083  ns/op
>> InterfaceCalls.test1stInt5Types          true  avgt   12  21.299 ± 0.016  ns/op
>> InterfaceCalls.test2ndInt2Types         false  avgt   12   1.997 ± 0.004  ns/op
>> InterfaceCalls.test2ndInt2Types          true  avgt   12   8.027 ± 0.016  ns/op
>> InterfaceCalls.test2ndInt3Types         false  avgt   12   7.915 ± 0.114  ns/op
>> InterfaceCalls.test2ndInt3Types          true  avgt   12  16.988 ± 0.065  ns/op
>> InterfaceCalls.test2ndInt5Types         false  avgt   12   9.422 ± 0.017  ns/op
>> InterfaceCalls.test2ndInt5Types          true  avgt   12  22.736 ± 0.022  ns/op
>> InterfaceCalls.testIfaceCall            false  avgt   12   5.860 ± 0.046  ns/op
>> InterfaceCalls.testIfaceCall             true  avgt   12   5.794 ± 0.026  ns/op
>> InterfaceCalls.testIfaceExtCall         false  avgt   12   6.310 ± 0.067  ns/op
>> InterfaceCalls.testIfaceExtCall          true  avgt   12   6.239 ± 0.017  ns/op
>> InterfaceCalls.testMonomorphic          false  avgt   12   1.146 ± 0.034  ns/op
>> InterfaceCalls.testMonomorphic           true  avgt   12   1.131 ± 0.012  ns/op
>> 
>> # With PR
>> Benchmark                        (randomized)  Mode  Cnt   Score   Error  Units
>> InterfaceCalls.test1stInt2Type...
>
> src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 2256:
> 
>> 2254:   lea(rscratch2, Address(mdp, offset));
>> 2255:   cmpxchg(/*addr*/ rscratch2, /*expected*/ zr, /*new*/ recv, Assembler::xword,
>> 2256:           /*acquire*/ true, /*release*/ false, /*weak*/ false, noreg);
> 
> /*acquire*/ true, /*release*/ false, /*weak*/ false, noreg);
> 
> I'm curious about the acquire. For it to make sense here, I would have thought it must be linked with an earlier release in another thread.

We can use `acquire = false`, `release = false` here. This CAS only claims the empty slot for the new receiver type, and nothing else depends on its memory-ordering effects. So the CAS needs atomicity only; any additional ordering is superfluous.
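
For illustration only, here is roughly the same idea in portable C++ rather than in the AArch64 macro assembler. This is a minimal sketch, not the code in the PR: `ReceiverRow`, `kSlots`, and `find_or_claim_slot` are hypothetical names, the row layout is not the real MDO layout, and `std::memory_order_relaxed` stands in for `acquire = false`, `release = false`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a receiver type profile row.
// Each slot holds a type id; 0 means "empty". Not the real MDO layout.
constexpr std::size_t kSlots = 2;

struct ReceiverRow {
  std::atomic<std::uintptr_t> receiver[kSlots];

  ReceiverRow() {
    for (auto& r : receiver) {
      r.store(0, std::memory_order_relaxed);
    }
  }
};

// Returns the index of the slot that holds `recv`, claiming an empty slot
// if needed. Returns -1 when all slots are taken by other types.
int find_or_claim_slot(ReceiverRow& row, std::uintptr_t recv) {
  assert(recv != 0 && "0 is reserved for the empty slot");
  for (std::size_t i = 0; i < kSlots; i++) {
    std::uintptr_t seen = row.receiver[i].load(std::memory_order_relaxed);
    if (seen == 0) {
      // Claim the empty slot. Only atomicity matters here: no other data
      // is published through this store, so relaxed ordering is enough.
      std::uintptr_t expected = 0;
      if (row.receiver[i].compare_exchange_strong(
              expected, recv,
              std::memory_order_relaxed, std::memory_order_relaxed)) {
        return static_cast<int>(i);
      }
      // Lost the race: `expected` now holds the value the winner installed.
      seen = expected;
    }
    if (seen == recv) {
      return static_cast<int>(i);
    }
  }
  return -1;
}
```

The strong (non-weak) CAS mirrors `weak = false` in the snippet under review: a failure then always means another thread installed a receiver first, so the loser only needs to check whether that receiver is its own.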

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/29283#discussion_r2703640346

