RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD

Andrew Haley aph at openjdk.org
Mon Aug 18 18:41:55 UTC 2025


On Thu, 14 Aug 2025 15:55:24 GMT, Fei Gao <fgao at openjdk.org> wrote:

> Do you think these numbers are trustworthy?

Yes, but it's a microarchitecture-dependent optimization, and it's just a single case. I'm seeing virtually identical times on Apple M1 between these:


/* Build two 48-bit constants with MOVZ+MOVK+MOVK, then add them (7 instructions). */
#define ACTION1                                 \
    "movz  x0, #1234; "                         \
    "movk  x0, #1234, lsl #16; "                \
    "movk  x0, #1234, lsl #32; "                \
    "movz  x2, #1234; "                         \
    "movk  x2, #1234, lsl #16; "                \
    "movk  x2, #1234, lsl #32; "                \
    "add   x1, x2, x0; "

/* Form two PC-relative addresses with ADRP+ADD, then add them (5 instructions). */
#define ACTION2                                 \
    "adrp  x0, . + 20480 * 4096; "              \
    "add   x0, x0, #48; "                       \
    "adrp  x2, . + 20480 * 4096; "              \
    "add   x2, x2, #48; "                       \
    "add   x1, x2, x0; "




        96,642,308      cycles:u                         #    2.858 GHz                       
       702,095,662      instructions:u                   #    7.26  insn per cycle            
 
       103,939,352      cycles:u                         #    2.930 GHz                       
       502,095,644      instructions:u                   #    4.83  insn per cycle            
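
For anyone checking the arithmetic behind those counters, here is a small sketch in C. The helper names are hypothetical, and the mapping of runs to macros is an inference: the first run retires about 702M instructions and the second about 502M, which matches the 7-instruction and 5-instruction sequences if each was executed roughly 100M times with otherwise identical overhead.

```c
#include <stdint.h>

/* Instructions retired per cycle, as reported in perf's "insn per cycle" column. */
static double ipc(uint64_t instructions, uint64_t cycles) {
    return (double)instructions / (double)cycles;
}

/* How many more cycles run b took than run a, as a percentage of a. */
static double percent_more(uint64_t a, uint64_t b) {
    return 100.0 * ((double)b - (double)a) / (double)a;
}

/*
 * Number of times each sequence was executed, inferred from two runs that
 * differ only in sequence length (len_a vs len_b instructions).
 * Assumption: all other retired instructions are the same in both runs.
 */
static uint64_t implied_executions(uint64_t insns_a, uint64_t insns_b,
                                   unsigned len_a, unsigned len_b) {
    return (insns_a - insns_b) / (uint64_t)(len_a - len_b);
}
```

With the figures above, `ipc` reproduces the 7.26 and 4.83 that perf prints, `implied_executions(702095662, 502095644, 7, 5)` gives 100,000,009, and `percent_more(96642308, 103939352)` comes out at roughly 7.5%. So on this machine the shorter sequence retires fewer instructions but does not finish in fewer cycles.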



All of this stuff is pretty marginal. I can at least accept that `adrp; add` is shorter and therefore better.
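
To spell out what "shorter" means here: every A64 instruction is 4 bytes, so MOVZ+MOVK+MOVK costs 12 bytes per constant and ADRP+ADD costs 8. The following is a rough C model of what each instruction computes, not HotSpot code. The function names and the example PC are made up for illustration.

```c
#include <stdint.h>

/* MOVZ: place a 16-bit immediate at `shift` (0, 16, 32 or 48), zeroing the rest. */
static uint64_t movz(uint16_t imm16, unsigned shift) {
    return (uint64_t)imm16 << shift;
}

/* MOVK: overwrite one 16-bit field of `reg`, keeping the other bits. */
static uint64_t movk(uint64_t reg, uint16_t imm16, unsigned shift) {
    uint64_t field = (uint64_t)0xFFFF << shift;
    return (reg & ~field) | ((uint64_t)imm16 << shift);
}

/* ADRP: 4 KiB page base of `pc` plus a signed number of pages. */
static uint64_t adrp(uint64_t pc, int64_t page_delta) {
    return (pc & ~(uint64_t)0xFFF) + (uint64_t)page_delta * 4096;
}

/* ADD (immediate): add the low 12 bits of the target (0..4095) to the page base. */
static uint64_t add_imm(uint64_t reg, uint32_t imm12) {
    return reg + imm12;
}
```

For example, `movk(movk(movz(1234, 0), 1234, 16), 1234, 32)` yields 0x04D204D204D2, and with a hypothetical PC of 0x400123, `add_imm(adrp(0x400123, 20480), 48)` yields 0x5400030. The ADRP form depends on where the code sits, so it only works when the target is within ADRP's range of +/-4 GiB from the instruction.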

But I do not look forward to a blizzard of such changes.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3189854914

