RFR: 8343789: Move mutable nmethod data out of CodeCache [v9]

Andrew Haley aph-open at littlepinkcloud.com
Sat Feb 8 10:38:23 UTC 2025


On 2/7/25 18:33, Boris Ulasevich wrote:
> I think ADRP+MOVK is better both in terms of performance and code density.

Good work, you may be right.

Neoverse N1 has a fairly narrow (4 wide) decoder, so I guess it's more likely
to be limited by instruction count.

That benchmark isn't valid for GCC on my machine: the outputs of the asm aren't
used, so GCC doesn't generate any code for it.
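
For illustration, here is a minimal sketch of a benchmark kernel that GCC cannot
throw away. It is not the benchmark under discussion: the names materialize()
and run(), and the constant 0x00ab12345678, are made up for the example. It
assumes GCC-style extended asm. The asm is marked volatile, which the GCC
documentation says stops the optimizers from discarding it or hoisting it out of
the loop, and its outputs are also summed so that they are really used.

```cpp
#include <cstdint>

// Materialise one 48-bit constant. The value is arbitrary (an assumption).
static inline uint64_t materialize() {
  uint64_t x;
#if defined(__aarch64__)
  // volatile: GCC must neither delete this asm nor move it out of a loop.
  asm volatile("movz %0, #0x5678\n\t"
               "movk %0, #0x1234, lsl #16\n\t"
               "movk %0, #0x00ab, lsl #32"
               : "=r"(x));
#else
  // Portable stand-in so the sketch still builds on other targets.
  x = 0x00ab12345678ULL;
#endif
  return x;
}

// Sum the results so that every asm output is actually consumed.
uint64_t run(uint64_t iterations) {
  uint64_t sum = 0;
  for (uint64_t i = 0; i < iterations; i++) {
    sum += materialize();
  }
  return sum;
}
```

The caller should print or return the value of run(), otherwise the sum itself
is dead and we are back where we started.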

However, if we change the benchmark to actually *do something* with the data
(simply add the results together) we get this for movz+movk on Apple M1:

      2,332,615,983      cycles:u                         #    3.135 GHz                         (95.26%)
     18,660,205,348      instructions:u                   #    8.00  insn per cycle              (95.26%)

and this for adrp+movk:

      2,563,872,489      cycles:u                         #    3.057 GHz                         (96.03%)
     14,357,197,644      instructions:u                   #    5.60  insn per cycle              (96.03%)

Here we can see that the M1 is totally front-end limited: 8 ipc is the
speed of light on an M1. Nonetheless, the timings are similar, with the
win going to movz+movk.

On Neoverse V2, however, I see an advantage for adrp+movk (movz+movk first,
then adrp+movk):

         4162362189      cycles:u                         #    2.796 GHz
        25002398111      instructions:u                   #    6.01  insn per cycle

         3243420864      cycles:u                         #    2.796 GHz
        21002398115      instructions:u                   #    6.48  insn per cycle

So, it looks like adrp+movk has an overall advantage. I'm still somewhat
skeptical, though, that this usage really deserves a reloc handler of its own.
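
To put numbers on that, here is a small sketch that recomputes the ratios from
the perf figures above. The function and constant names are made up for the
example, and it assumes that the first Neoverse V2 run is movz+movk and the
second is adrp+movk (the one with fewer instructions).

```cpp
#include <cstdint>

// Instructions retired per cycle.
double ipc(uint64_t instructions, uint64_t cycles) {
  return static_cast<double>(instructions) / static_cast<double>(cycles);
}

// Cycles for adrp+movk relative to movz+movk; below 1.0 means adrp+movk wins.
double cycle_ratio(uint64_t adrp_movk_cycles, uint64_t movz_movk_cycles) {
  return static_cast<double>(adrp_movk_cycles) /
         static_cast<double>(movz_movk_cycles);
}

// Figures copied from the perf output above.
const uint64_t m1_movz_cycles = 2332615983ULL;
const uint64_t m1_movz_insns  = 18660205348ULL;
const uint64_t m1_adrp_cycles = 2563872489ULL;
const uint64_t v2_movz_cycles = 4162362189ULL;  // assumed to be movz+movk
const uint64_t v2_adrp_cycles = 3243420864ULL;  // assumed to be adrp+movk

// Apple M1: adrp+movk needs about 1.10x the cycles of movz+movk.
double m1_ratio() { return cycle_ratio(m1_adrp_cycles, m1_movz_cycles); }

// Neoverse V2: adrp+movk needs about 0.78x the cycles of movz+movk.
double v2_ratio() { return cycle_ratio(v2_adrp_cycles, v2_movz_cycles); }
```

So on these runs adrp+movk costs roughly 10% more cycles on the M1 and saves
roughly 22% on the V2.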

If we do decide to do this, please give the forced movk version of adrp() a
new name, and have adrp() call it.
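
Something like the following toy model shows the shape I mean. It is not
HotSpot code: ToyAssembler, adrp_reachable() and adrp_movk() are hypothetical
names, and the model only records which mnemonics would be emitted. The one
hard fact it relies on is that ADRP encodes a signed 21-bit page offset, so it
reaches about +/-4 GiB around the current pc.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy model, not HotSpot code: it only records the mnemonics it would emit.
struct ToyAssembler {
  uint64_t pc = 0;                   // address of the next instruction
  std::vector<std::string> emitted;  // mnemonics emitted so far

  // ADRP holds a signed 21-bit delta counted in 4 KiB pages.
  bool adrp_reachable(uint64_t target) const {
    int64_t delta = static_cast<int64_t>(target >> 12) -
                    static_cast<int64_t>(pc >> 12);
    return delta >= -(INT64_C(1) << 20) && delta < (INT64_C(1) << 20);
  }

  // Hypothetical name: always emits the fixed two-instruction form.
  void adrp_movk(uint64_t /*target*/) {
    emitted.push_back("adrp");
    emitted.push_back("movk");
  }

  // Ordinary entry point: short form when in range, otherwise delegate.
  void adrp(uint64_t target) {
    if (adrp_reachable(target)) {
      emitted.push_back("adrp");
    } else {
      adrp_movk(target);
    }
  }
};
```

Callers that need a fixed-length, patchable sequence would call the forced
version directly, and everyone else keeps calling adrp() unchanged.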

Having said all of that, I'm not sure why we're seeing such different instruction
counts for Apple M1 and Neoverse V2. I guess it must be the compiler, but I don't
know why, so take all of this with a big pinch of salt. For this really to be
valid I guess we'd have to use the exact same binaries.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671



