RFR: 8336245: AArch64: remove extra register copy when converting from long to pointer

Fei Gao fgao at openjdk.org
Fri Jul 12 13:52:05 UTC 2024


On Fri, 12 Jul 2024 13:44:25 GMT, Fei Gao <fgao at openjdk.org> wrote:

> In cases like:
> 
>   UNSAFE.putLong(address + off1 + 1030, lseed);
>   UNSAFE.putLong(address + 1023, lseed);
>   UNSAFE.putLong(address + off2 + 1001, lseed);
> 
> 
> Unsafe intrinsifies direct memory accesses that use a long as the base address, generating a `CastX2P` node in C2 that converts the long to a pointer. We then get OptoAssembly code like:
> 
>   ldr  R10, [R15, #120]    # int ! Field: address
>   ldr  R11, [R16, #136]    # int ! Field: off1
>   ldr  R12, [R16, #144]    # int ! Field: off2
>   add  R11, R11, R10
>   mov R11, R11    # long -> ptr
>   add  R12, R12, R10
>   mov R10, R10    # long -> ptr
>   add R11, R11, #1030    # ptr
>   str  R17, [R11]    # int
>   add R10, R10, #1023    # ptr
>   str  R17, [R10]    # int
>   mov R10, R12    # long -> ptr
>   add R10, R10, #1001    # ptr
>   str  R17, [R10]    # int
> 
> 
> On aarch64, the conversion from long to pointer can be a nop, but C2 doesn't know that. In the existing code, we emit nothing for `mov dst, src` only when `dst` == `src` [1], so we get the following assembly:
> 
>   ldr    x10, [x15,#120]
>   ldp    x11, x12, [x16,#136]
>   add    x11, x11, x10
>   add    x12, x12, x10
>   add    x11, x11, #0x406
>   str    x17, [x11]
>   add    x10, x10, #0x3ff
>   str    x17, [x10]
>   mov    x10, x12  <--- extra register copy
>   add    x10, x10, #0x3e9
>   str    x17, [x10]
> 
> 
> There is still one extra register copy, which we're trying to remove in this patch.
> 
> This patch folds `CastX2P` into memory operands by introducing `indirectX2P` and `indOffX2P`. We also create a new opclass `iRegPorL2P` to remove extra copies from `CastX2P` in pointer addition.
> 
> Tiers 1-3 passed on aarch64. There is no obvious change in the size of libjvm.so.
> 
> [1] https://github.com/openjdk/jdk/blob/5c612c230b0a852aed5fd36e58b82ebf2e1838af/src/hotspot/cpu/aarch64/aarch64.ad#L7906
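
For anyone who wants to reproduce the access pattern quoted above, here is a minimal, self-contained sketch. The class name `CastX2PRepro`, the `run` helper, the offset values, the buffer size, and the warm-up count are all assumptions made for illustration and are not part of the patch. It also assumes a JDK where `sun.misc.Unsafe` memory access is still permitted.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical reproducer: names and constants are illustrative only.
final class CastX2PRepro {
    private static final Unsafe UNSAFE = loadUnsafe();

    // Fields mirror the names used in the quoted snippet.
    long address;
    long off1 = 16; // assumed value
    long off2 = 32; // assumed value

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // The three stores from the description; each base is a long,
    // so C2 needs a long -> pointer conversion for every address.
    void store(long lseed) {
        UNSAFE.putLong(address + off1 + 1030, lseed);
        UNSAFE.putLong(address + 1023, lseed);
        UNSAFE.putLong(address + off2 + 1001, lseed);
    }

    // Allocates a scratch buffer, runs the stores repeatedly so that
    // store() becomes hot, then reads the three values back.
    static long[] run(long lseed) {
        CastX2PRepro r = new CastX2PRepro();
        r.address = UNSAFE.allocateMemory(2048); // assumed size, covers all offsets
        try {
            for (int i = 0; i < 20_000; i++) {   // assumed warm-up count
                r.store(lseed);
            }
            return new long[] {
                UNSAFE.getLong(r.address + r.off1 + 1030),
                UNSAFE.getLong(r.address + 1023),
                UNSAFE.getLong(r.address + r.off2 + 1001),
            };
        } finally {
            UNSAFE.freeMemory(r.address);
        }
    }
}
```

Calling `CastX2PRepro.run(seed)` from a small driver should return the seed three times. The generated code for `store` is what the assembly listings above are about.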

src/hotspot/share/opto/machnode.cpp line 400:

> 398: 
> 399:   if (t->isa_intptr_t() &&
> 400: #if !defined(AARCH64)

After applying the operand `indirectX2P`, we may have some patterns like:

  str val, [CastX2P base]

The code path here will resolve the `base`, which is actually an `intptr`, not a `ptr`, and the offset is `0`.

I guess the code here was intended to support `[base, offset]`, where the base can be an `intptr` but the offset cannot be `0`. I'm not sure why the offset is required to be nonzero; maybe for some old machines?

I don't think the limitation applies to aarch64 machines now, so I've lifted it for aarch64.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20157#discussion_r1675959482

