RFR: 8336245: AArch64: remove extra register copy when converting from long to pointer
Andrew Haley
aph at openjdk.org
Thu Jul 25 09:56:32 UTC 2024
On Thu, 25 Jul 2024 09:37:42 GMT, Andrew Dinn <adinn at openjdk.org> wrote:
>> In cases like:
>>
>> UNSAFE.putLong(address + off1 + 1030, lseed);
>> UNSAFE.putLong(address + 1023, lseed);
>> UNSAFE.putLong(address + off2 + 1001, lseed);
>>
>>
>> Unsafe intrinsifies direct memory access using a long as the base address, generating a `CastX2P` node converting long to pointer in C2. Then we get optoassembly code like:
>>
>> ldr R10, [R15, #120] # int ! Field: address
>> ldr R11, [R16, #136] # int ! Field: off1
>> ldr R12, [R16, #144] # int ! Field: off2
>> add R11, R11, R10
>> mov R11, R11 # long -> ptr
>> add R12, R12, R10
>> mov R10, R10 # long -> ptr
>> add R11, R11, #1030 # ptr
>> str R17, [R11] # int
>> add R10, R10, #1023 # ptr
>> str R17, [R10] # int
>> mov R10, R12 # long -> ptr
>> add R10, R10, #1001 # ptr
>> str R17, [R10] # int
>>
>>
>> On AArch64, the conversion from long to pointer could be a nop, but C2 doesn't know that. The existing code emits nothing for `mov dst, src` only when `dst` == `src` [1], so we get this assembly:
>>
>> ldr x10, [x15,#120]
>> ldp x11, x12, [x16,#136]
>> add x11, x11, x10
>> add x12, x12, x10
>> add x11, x11, #0x406
>> str x17, [x11]
>> add x10, x10, #0x3ff
>> str x17, [x10]
>> mov x10, x12 <--- extra register copy
>> add x10, x10, #0x3e9
>> str x17, [x10]
>>
>>
>> There is still one extra register copy, which we're trying to remove in this patch.
>>
>> This patch folds `CastX2P` into memory operands by introducing `indirectX2P` and `indOffX2P`. We also create a new opclass `iRegPorL2P` to remove extra copies from `CastX2P` in pointer addition.
>>
>> Tier 1~3 passed on aarch64. No obvious change in size of libjvm.so
>>
>> [1] https://github.com/openjdk/jdk/blob/5c612c230b0a852aed5fd36e58b82ebf2e1838af/src/hotspot/cpu/aarch64/aarch64.ad#L7906
>
> src/hotspot/cpu/aarch64/aarch64.ad line 4235:
>
>> 4233: operand immLOffset()
>> 4234: %{
>> 4235: predicate(n->get_long() >= -256 && n->get_long() <= 65520);
>
> Why is this using hard wired constants rather than using Address::offset_ok_for_immed?
>
> Also, why is the constant value 65520?
I think `Address::offset_ok_for_immed` is too restrictive here: we want a predicate that accepts a superset of all possible address offsets. The upper bound is the largest such offset:
jshell> ((1<<12)-1) <<4
$3 ==> 65520
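
A minimal Java sketch of where both bounds could come from. It assumes the usual AArch64 load/store immediate forms: an unscaled signed 9-bit byte offset, and an unsigned 12-bit offset scaled by the access size, with access sizes up to 16 bytes (shift 4). The names `OffsetRange`, `maxScaledOffset` and `encodableForSomeSize` are hypothetical and are not HotSpot code.

```java
public class OffsetRange {
    // Assumption: unscaled form takes a signed 9-bit byte offset.
    static final long UNSCALED_MIN = -(1L << 8);     // -256
    static final long UNSCALED_MAX = (1L << 8) - 1;  // 255

    // Assumption: scaled form takes an unsigned 12-bit offset,
    // multiplied by the access size (1 << shift bytes).
    static long maxScaledOffset(int shift) {
        return ((1L << 12) - 1) << shift;
    }

    // Hypothetical helper: true if at least one access size from
    // 1 to 16 bytes could encode this offset as an immediate.
    static boolean encodableForSomeSize(long offset) {
        if (offset >= UNSCALED_MIN && offset <= UNSCALED_MAX) {
            return true;
        }
        for (int shift = 0; shift <= 4; shift++) {
            long size = 1L << shift;
            if (offset >= 0 && offset % size == 0
                    && offset <= maxScaledOffset(shift)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (int shift = 0; shift <= 4; shift++) {
            System.out.println((1 << shift) + "-byte access: max scaled offset "
                    + maxScaledOffset(shift));
        }
        System.out.println("superset bounds: [" + UNSCALED_MIN + ", "
                + maxScaledOffset(4) + "]");
    }
}
```

Under those assumptions, every offset that any access size can encode falls in [-256, 65520], but not every value in that range is encodable for every size, which is why the operand predicate is only a superset.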
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/20157#discussion_r1691171119