RFR: 8362504: AArch64: Replace MOVZ+MOVK+MOVK with ADRP+ADD

Fei Gao fgao at openjdk.org
Thu Aug 7 11:46:13 UTC 2025


On Wed, 6 Aug 2025 09:59:11 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> If the relocation or target address is guaranteed to reside within the CodeCache, we can safely replace a `movz + movk + movk` sequence with a more compact and efficient `adrp + add` instruction pair.
>> 
>> In `MacroAssembler::mov(Register r, Address dest)`, this replacement can be applied if any of the following rules hold:
>> 
>> 1. The relocation type indicates that the address resides within the CodeCache and the necessary patching logic is provided in `fix_relocation_after_move()`.
>> 2. The target address is fixed (i.e., does not require relocation) and is within the reachable range for `adrp`.
>> 
>> The patch performs the filtering in `is_relocated_within_codecache()` and `is_adrp_reachable()` to ensure this optimization is applied safely and selectively.
>
> It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs.

Thanks for your review, @theRealAph and @adinn.

> It's certainly smaller, but whether it's more efficient is dependent on the circumstances. For example, on Firestorm we can do two ADRPs per cycle, but eight MOVZ/MOVKs.

@theRealAph That makes a lot of sense. On Neoverse N1/N2/V1, `ADRP` and `MOVZ/MOVK` have comparable latency and throughput and share the same pipeline, so we can expect these microarchitectures to benefit from this patch. I’m planning to update the patch with an `aarch64` option to enable or disable this replacement based on the target microarchitecture. What do you think?
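
For readers following along, here is a minimal standalone sketch of how the two sequences split up a target address. It is not code from the patch: the struct and function names are hypothetical, and it assumes a 48-bit address and 4 KiB `adrp` pages.

```cpp
#include <cstdint>

// movz + movk + movk: three 16-bit immediates cover a 48-bit address.
// (Hypothetical helper for illustration; not part of HotSpot.)
struct MovImms {
  uint16_t lo;   // movz  xN, #lo
  uint16_t mid;  // movk  xN, #mid, lsl #16
  uint16_t hi;   // movk  xN, #hi,  lsl #32
};

MovImms split_for_movz_movk(uint64_t target) {
  return { static_cast<uint16_t>(target & 0xffff),
           static_cast<uint16_t>((target >> 16) & 0xffff),
           static_cast<uint16_t>((target >> 32) & 0xffff) };
}

// adrp + add: adrp produces the 4 KiB page of the target, expressed as a
// signed page distance from the page of the instruction itself; add then
// supplies the low 12 bits.
struct AdrpAdd {
  int64_t  page_delta;  // target page minus pc page, in 4 KiB pages
  uint32_t low12;       // add xN, xN, #low12
};

AdrpAdd split_for_adrp_add(uint64_t pc, uint64_t target) {
  int64_t page_delta = static_cast<int64_t>(target >> 12) -
                       static_cast<int64_t>(pc >> 12);
  return { page_delta, static_cast<uint32_t>(target & 0xfff) };
}

// Recompute the address the adrp + add pair would materialize at `pc`.
uint64_t rebuild_from_adrp_add(uint64_t pc, AdrpAdd parts) {
  // Unsigned arithmetic so a negative page_delta wraps correctly.
  uint64_t page_base = (pc & ~static_cast<uint64_t>(0xfff)) +
                       static_cast<uint64_t>(parts.page_delta) * 4096;
  return page_base + parts.low12;
}
```

The point of the sketch is that the `adrp` immediate depends on where the instruction lives, which is why the pair needs patching when code moves, while the `movz`/`movk` immediates depend only on the target.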

> The solution is to make the compiler always generate the 3-instruction load when compiling in an Assembly VM and otherwise generate the 2-instruction load based on reachability i.e. AOT code won't 'benefit' from this patch but runtime generated code will (assuming it is a benefit).

> So, the reachability method needs to return false if `AOTCodeCache::is_on_for_dump()` returns true.

@adinn Thanks for pointing this out. I'll add this constraint in the next commit.
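
Roughly, the constrained check could look like the sketch below. This is an illustration under stated assumptions, not the code in the PR: `adrp_reaches`, `can_use_adrp_add`, and the `dumping_aot_code` parameter are hypothetical names (the flag stands in for whatever `AOTCodeCache::is_on_for_dump()` reports), and the code cache is modeled as a plain `[code_low, code_high)` address range with `code_high > code_low`.

```cpp
#include <cstdint>

// adrp encodes a signed 21-bit page offset, so it reaches targets whose
// 4 KiB page lies within [-2^20, 2^20) pages of the instruction's page,
// i.e. roughly +/-4 GiB.
bool adrp_reaches(uint64_t pc, uint64_t target) {
  int64_t page_delta = static_cast<int64_t>(target >> 12) -
                       static_cast<int64_t>(pc >> 12);
  const int64_t limit = static_cast<int64_t>(1) << 20;
  return page_delta >= -limit && page_delta < limit;
}

// Hypothetical decision helper: fall back to the 3-instruction sequence
// when dumping AOT code, because the address layout at load time may
// differ. Otherwise require the target to be reachable from both ends of
// the code cache; the page delta is monotonic in pc, so that covers every
// position the code could occupy in between.
bool can_use_adrp_add(uint64_t code_low, uint64_t code_high,
                      uint64_t target, bool dumping_aot_code) {
  if (dumping_aot_code) {
    return false;
  }
  return adrp_reaches(code_low, target) &&
         adrp_reaches(code_high - 1, target);
}
```

Checking both ends of the range rather than the current pc is a design choice of this sketch: it keeps the answer valid even if the code is later relocated within the code cache.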

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26653#issuecomment-3163784211


More information about the hotspot-dev mailing list