RFR: 8336464: Force `CastX2P` to be a two-address instruction

Andrew Haley aph at openjdk.org
Wed Jul 17 14:00:04 UTC 2024


On Fri, 12 Jul 2024 13:59:23 GMT, Fei Gao <fgao at openjdk.org> wrote:

> This patch forces `CastX2P` to be a two-address instruction, so that C2 can allocate the same register for `dst` and `src`. We can then remove the instruction from the generated assembly entirely.
> 
> The motivation comes from cast operations such as `castPP`. For ADLC, the difference between `castPP` and `CastX2P` is that `CastX2P` always has different types for `dst` and `src`. We can force ADLC to generate an extra `two_adr()` for `CastX2P`, as it does automatically for `castPP`, which tells the register allocator that the instruction needs the same register for `dst` and `src`.
> 
> However, RA and GCM in C2 sometimes don't work as we expect.
> 
> For example, the existing code produces this assembly:
> 
>   ldp    x10, x11, [x17,#136]
>   add    x10, x10, x15
>   add    x11, x11, x10
>   ldr    x12, [x17,#152]
>   str    x16, [x10]
>   add    x10, x12, x15
>   str    x16, [x11]
>   str    x16, [x10]
> 
> 
> After applying this patch on its own, the assembly is:
> 
>   ldr    x10, [x16,#136]  <--- 1
>   add    x10, x10, x15
>   ldr    x11, [x16,#144]  <--- 2
>   mov    x13, x10         <--- 3
>   str    x17, [x13]
>   ldr    x12, [x16,#152]
>   add    x10, x11, x10
>   str    x17, [x10]
>   add    x10, x12, x15
>   str    x17, [x10]
> 
> 
> C2 generates an entirely unnecessary `mov` (see 3), and we even lose the chance to merge the two loads into a load pair (see 1 and 2). That's terrible.
> 
> Although this scenario would disappear after combining with https://github.com/openjdk/jdk/pull/20157, I'm still not sure if this patch is worthwhile.

src/hotspot/share/adlc/output_h.cpp line 1692:

> 1690: 
> 1691: #if defined(AARCH64)
> 1692:     // Allocate the same register for src and dst, then we can remove

This isn't so much AArch64-specific as specific to any machine that doesn't have separate address and data registers. x86 prefers some registers for forming addresses, but for other targets perhaps use either a target macro or a callback function in $cpu.ad.
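
As a rough illustration of the callback idea, here is a minimal toy model in C++. It is not HotSpot or ADLC code: `ToyTarget`, `ToyCastX2P`, `cast_x2p_wants_two_address` and `emit` are all hypothetical names invented for this sketch. It assumes only that a target can report whether integers and addresses share one register file, and shows that the cast costs nothing once `src` and `dst` land in the same register.

```cpp
#include <cassert>
#include <string>

// Toy model only; every name below is hypothetical, not a HotSpot API.
struct ToyTarget {
    // True when integers and addresses live in the same register file,
    // so a pointer-sized integer can be reinterpreted as a pointer in place.
    bool unified_int_and_address_regs;
};

struct ToyCastX2P {
    int src_reg;  // register holding the integer input
    int dst_reg;  // register assigned to the pointer result
};

// Hypothetical per-target hook, standing in for "a target macro or a
// callback function in $cpu.ad": should the cast be two-address here?
bool cast_x2p_wants_two_address(const ToyTarget& target) {
    return target.unified_int_and_address_regs;
}

// Emit assembly text for the cast: nothing when src and dst share a
// register, otherwise a plain register move.
std::string emit(const ToyCastX2P& node) {
    if (node.src_reg == node.dst_reg) {
        return "";
    }
    return "mov x" + std::to_string(node.dst_reg) +
           ", x" + std::to_string(node.src_reg);
}

// Small self-check of the toy model.
void check_toy_model() {
    ToyTarget unified{true};
    ToyTarget split{false};
    assert(cast_x2p_wants_two_address(unified));
    assert(!cast_x2p_wants_two_address(split));

    ToyCastX2P same{10, 10};
    ToyCastX2P different{10, 13};
    assert(emit(same).empty());
    assert(emit(different) == "mov x13, x10");
}
```

The point of putting the decision behind a hook is that the shared ADLC code asks the target instead of testing for one architecture by name. How such a hook would actually be wired into ADLC is left open here.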

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20159#discussion_r1677662886

