RFR: 8357258: x86: Improve receiver type profiling reliability [v5]

Aleksey Shipilev shade at openjdk.org
Mon Dec 1 09:09:23 UTC 2025


On Fri, 28 Nov 2025 15:21:47 GMT, Andrew Haley <aph at openjdk.org> wrote:

> I'm seeing minor performance regressions in `InterfaceCalls.test2ndInt5Types`, before and after this PR:

Reproduced locally too:


Benchmark                                                (randomized)  Mode  Cnt    Score   Error      Units

# Baseline
InterfaceCalls.test2ndInt5Types                                 false  avgt   12   16.945 ±  0.079      ns/op
InterfaceCalls.test2ndInt5Types:L1-dcache-load-misses           false  avgt    3    0.076 ±  2.187       #/op
InterfaceCalls.test2ndInt5Types:L1-dcache-loads                 false  avgt    3   88.738 ±  0.416       #/op
InterfaceCalls.test2ndInt5Types:branch-misses                   false  avgt    3    0.007 ±  0.003       #/op
InterfaceCalls.test2ndInt5Types:branches                        false  avgt    3   49.122 ±  0.353       #/op
InterfaceCalls.test2ndInt5Types:cycles                          false  avgt    3   57.147 ±  1.698       #/op
InterfaceCalls.test2ndInt5Types:instructions                    false  avgt    3  247.443 ±  1.531       #/op


# Current PR
InterfaceCalls.test2ndInt5Types                                 false  avgt   12   22.513 ±  0.208      ns/op
InterfaceCalls.test2ndInt5Types:L1-dcache-load-misses           false  avgt    3    0.012 ±  0.072       #/op
InterfaceCalls.test2ndInt5Types:L1-dcache-loads                 false  avgt    3  108.446 ± 13.975       #/op  ; +20 loads
InterfaceCalls.test2ndInt5Types:branch-misses                   false  avgt    3    0.407 ±  0.010       #/op
InterfaceCalls.test2ndInt5Types:branches                        false  avgt    3   54.102 ±  0.403       #/op  ; +5 branches
InterfaceCalls.test2ndInt5Types:cycles                          false  avgt    3   75.938 ±  5.043       #/op
InterfaceCalls.test2ndInt5Types:instructions                    false  avgt    3  280.194 ±  5.758       #/op  ; +32 instructions
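
For anyone who wants to re-derive the annotations above, here is a small sketch that recomputes the per-op deltas from the two tables. The numbers are copied from the output above; the class and method names (`CounterDelta`, `delta`, `percent`) are made up for this illustration and are not part of JMH or the PR.

```java
import java.util.Locale;

/** Hypothetical helper: recomputes baseline-vs-PR deltas from the tables above. */
final class CounterDelta {
    /** Absolute change, in the counter's own unit. */
    static double delta(double baseline, double current) {
        return current - baseline;
    }

    /** Relative change, in percent of the baseline. */
    static double percent(double baseline, double current) {
        return 100.0 * (current - baseline) / baseline;
    }

    public static void main(String[] args) {
        // Values copied verbatim from the "# Baseline" and "# Current PR" tables.
        String[] names = {"time (ns/op)", "L1-dcache-loads", "branch-misses",
                          "branches", "cycles", "instructions"};
        double[] base  = {16.945,  88.738, 0.007, 49.122, 57.147, 247.443};
        double[] pr    = {22.513, 108.446, 0.407, 54.102, 75.938, 280.194};

        for (int i = 0; i < names.length; i++) {
            System.out.printf(Locale.ROOT, "%-16s %+9.3f  (%+.1f%%)%n",
                    names[i], delta(base[i], pr[i]), percent(base[i], pr[i]));
        }
    }
}
```

The error bars on the 3-sample counters are wide in places (for example ± 13.975 on the PR's loads), so the deltas are best read as rough magnitudes rather than exact counts.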


I looked at perfasm, and there are no gross problems there. I also think reliability trumps this minor performance bump. But I suspect it is caused by the second loop re-walking the table looking for (empty) slots; this is where the extra loads are coming from. I believe the code can reasonably track the first non-null slot and start the walk from there. Let me see if that is simple to do without complicating the code too much.
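
To illustrate where the extra loads could come from, here is a toy model in Java. It is not the HotSpot code: the real profiling is emitted as x86 assembly and has to deal with concurrent updates, which this sketch ignores entirely. The class `ReceiverProfileSketch`, its fields, and the `slotLoads` counter are all hypothetical. The single-pass variant is only one possible reading of the idea above (it remembers the first empty slot seen during the matching walk); the exact bookkeeping meant in the comment is not spelled out.

```java
/** Hypothetical model of a fixed-size receiver type profile table; not HotSpot code. */
final class ReceiverProfileSketch {
    final Object[] receivers;   // null marks an empty slot
    final long[] counts;        // per-slot hit counters
    long polymorphicCount;      // bumped when the table is full and nothing matched
    long slotLoads;             // instrumentation: how many receiver slots were read

    ReceiverProfileSketch(int rows) {
        receivers = new Object[rows];
        counts = new long[rows];
    }

    /** Two walks: first look for a match, then re-walk from slot 0 for an empty slot. */
    void recordTwoPass(Object recv) {
        for (int i = 0; i < receivers.length; i++) {
            slotLoads++;
            if (receivers[i] == recv) { counts[i]++; return; }
        }
        for (int i = 0; i < receivers.length; i++) {
            slotLoads++;
            if (receivers[i] == null) { receivers[i] = recv; counts[i] = 1; return; }
        }
        polymorphicCount++;
    }

    /** One walk: remember the first empty slot while looking for a match. */
    void recordSinglePass(Object recv) {
        int firstEmpty = -1;
        for (int i = 0; i < receivers.length; i++) {
            slotLoads++;
            Object r = receivers[i];
            if (r == recv) { counts[i]++; return; }
            if (r == null && firstEmpty < 0) firstEmpty = i;
        }
        if (firstEmpty >= 0) {
            receivers[firstEmpty] = recv;
            counts[firstEmpty] = 1;
        } else {
            polymorphicCount++;
        }
    }

    public static void main(String[] args) {
        // Five receiver types against a two-row table (row count chosen for illustration).
        Class<?>[] types = {String.class, Integer.class, Long.class, Double.class, Byte.class};
        ReceiverProfileSketch twoPass = new ReceiverProfileSketch(2);
        ReceiverProfileSketch onePass = new ReceiverProfileSketch(2);
        for (Class<?> t : types) {
            twoPass.recordTwoPass(t);
            onePass.recordSinglePass(t);
        }
        System.out.println("two-pass slot loads:    " + twoPass.slotLoads);
        System.out.println("single-pass slot loads: " + onePass.slotLoads);
    }
}
```

In this model, once the table is full, every receiver that misses pays for a second complete walk in the two-pass version, which is the shape of overhead the extra loads and branches would be consistent with. Whether the same trade-off holds in the generated code, with its atomic slot installation, is something only the actual patch can show.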

-------------

PR Comment: https://git.openjdk.org/jdk/pull/25305#issuecomment-3595393093


More information about the hotspot-dev mailing list