RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD
Andrew Haley
aph at openjdk.org
Fri Aug 8 08:57:10 UTC 2025
On Fri, 8 Aug 2025 08:08:36 GMT, Fei Gao <fgao at openjdk.org> wrote:
> > I've done some modelling using llvm-mca and it looks like `adrp; add` is a win on recent Apple processors as well as on Arm processors, so go ahead with making this the default.
>
> Thanks for testing that — really great to hear! I’ll update the patch with a constraint for AOT cache shortly.
Correction: I'm afraid that the llvm-mca results are nonsense. It says that this sequence
movk w0, #0x1234, lsl 16
movk w1, #0x1234, lsl 16
movk w2, #0x1234, lsl 16
movk w3, #0x1234, lsl 16
movk w4, #0x1234, lsl 16
movk w5, #0x1234, lsl 16
movk w6, #0x1234, lsl 16
movk w7, #0x1234, lsl 16
takes 2 clock cycles on Apple M1, but Dougall Johnson measured real hardware executing it in 1 clock cycle.
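To put the discrepancy in throughput terms, here is a rough sketch in Python. It assumes both figures are steady-state cycle counts for the whole eight-instruction block; the function name `implied_throughput` is made up for illustration and has nothing to do with llvm-mca's own output format.

```python
from fractions import Fraction


def implied_throughput(instructions: int, cycles: int) -> Fraction:
    """Instructions completed per cycle for a straight-line block (hypothetical helper)."""
    return Fraction(instructions, cycles)


N_MOVK = 8  # the eight independent movk instructions in the sequence above

# Assumption: each figure is the cycle count for the whole block.
modelled = implied_throughput(N_MOVK, 2)  # llvm-mca's figure: 2 cycles
measured = implied_throughput(N_MOVK, 1)  # figure measured on hardware: 1 cycle

print(f"modelled: {modelled} movk/cycle")
print(f"measured: {measured} movk/cycle")
print(f"model is off by a factor of {measured / modelled}")
```

Under that reading, the model allows 4 `movk` per cycle where the hardware sustains 8, so it is off by a factor of two on this sequence.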
I'm not going to believe any more of its output without numbers we can trust.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3167078839