RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD
Andrew Haley
aph at openjdk.org
Mon Aug 18 18:41:55 UTC 2025
On Thu, 14 Aug 2025 15:55:24 GMT, Fei Gao <fgao at openjdk.org> wrote:
> Do you think these numbers are trustworthy?
Yes, but it's a microarchitecture-dependent optimization, and it's just a single case. I'm seeing virtually identical times on Apple M1 between these:
#define ACTION1 \
"movz x0, #1234; " \
"movk x0, #1234, lsl #16; " \
"movk x0, #1234, lsl #32; " \
"movz x2, #1234; " \
"movk x2, #1234, lsl #16; " \
"movk x2, #1234, lsl #32; " \
"add x1, x2, x0; "
#define ACTION2 \
"adrp x0, . + 20480 * 4096; " \
"add x0, x0, #48; " \
"adrp x2, . + 20480 * 4096; " \
"add x2, x2, #48; " \
"add x1, x2, x0; "
ACTION1 (7 instructions per iteration):
96,642,308 cycles:u # 2.858 GHz
702,095,662 instructions:u # 7.26 insn per cycle
ACTION2 (5 instructions per iteration):
103,939,352 cycles:u # 2.930 GHz
502,095,644 instructions:u # 4.83 insn per cycle
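As a quick sanity check on the counters above, the arithmetic can be sketched in C. The helper names `ipc` and `rel_diff` are hypothetical and exist only for this illustration; they are not part of perf or of the benchmark.

```c
/* Hypothetical helpers for checking the perf counters quoted above. */

/* Instructions retired per cycle. */
static double ipc(double instructions, double cycles) {
    return instructions / cycles;
}

/* Relative difference of `other` against `baseline`, as a fraction. */
static double rel_diff(double baseline, double other) {
    return (other - baseline) / baseline;
}
```

Calling `ipc(702095662, 96642308)` and `ipc(502095644, 103939352)` reproduces the 7.26 and 4.83 figures, and `rel_diff(96642308, 103939352)` puts the cycle counts within about 7.5% of each other, even though the second run retires roughly 200 million fewer instructions.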
All of this stuff is pretty marginal. I can at least accept that `adrp; add` is shorter and therefore better.
But I do not look forward to a blizzard of such changes.
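For readers less familiar with the two sequences, here is a minimal sketch in portable C of what each one computes. It models only the register results, not timing, and every function name here is made up for the illustration. It assumes a 48-bit target and, for the ADRP form, a target within the instruction's +/-4 GiB range of the program counter.

```c
#include <stdint.h>

/* MOVZ: place a 16-bit immediate at `shift` and zero all other bits. */
static uint64_t movz(uint16_t imm16, unsigned shift) {
    return (uint64_t)imm16 << shift;
}

/* MOVK: overwrite one 16-bit field and keep the remaining bits. */
static uint64_t movk(uint64_t reg, uint16_t imm16, unsigned shift) {
    uint64_t mask = (uint64_t)0xFFFF << shift;
    return (reg & ~mask) | ((uint64_t)imm16 << shift);
}

/* ADRP: 4 KiB page base of pc plus a signed number of pages. */
static uint64_t adrp(uint64_t pc, int64_t page_delta) {
    return (pc & ~(uint64_t)0xFFF) + ((uint64_t)page_delta << 12);
}

/* Three instructions: builds any 48-bit value, independent of pc. */
static uint64_t materialize_mov(uint64_t target) {
    uint64_t r = movz((uint16_t)target, 0);
    r = movk(r, (uint16_t)(target >> 16), 16);
    r = movk(r, (uint16_t)(target >> 32), 32);
    return r;
}

/* Two instructions: page of the target via ADRP, then ADD the low 12 bits. */
static uint64_t materialize_adrp(uint64_t pc, uint64_t target) {
    int64_t page_delta = (int64_t)(target >> 12) - (int64_t)(pc >> 12);
    uint64_t r = adrp(pc, page_delta);
    return r + (target & 0xFFF);
}
```

Both helpers return the same value for an in-range target. The difference is that the ADRP form is pc-relative, so it needs one instruction fewer but only reaches addresses near the code that contains it.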
-------------
PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3189854914