RFR: 8336245: AArch64: remove extra register copy when converting from long to pointer

Andrew Haley aph at openjdk.org
Thu Jul 25 09:56:32 UTC 2024


On Thu, 25 Jul 2024 09:37:42 GMT, Andrew Dinn <adinn at openjdk.org> wrote:

>> In cases like:
>> 
>>   UNSAFE.putLong(address + off1 + 1030, lseed);
>>   UNSAFE.putLong(address + 1023, lseed);
>>   UNSAFE.putLong(address + off2 + 1001, lseed);
>> 
>> 
>> Unsafe intrinsifies direct memory access using a long as the base address, generating a `CastX2P` node converting long to pointer in C2. Then we get optoassembly code like:
>> 
>>   ldr  R10, [R15, #120]    # int ! Field: address
>>   ldr  R11, [R16, #136]    # int ! Field: off1
>>   ldr  R12, [R16, #144]    # int ! Field: off2
>>   add  R11, R11, R10
>>   mov R11, R11    # long -> ptr
>>   add  R12, R12, R10
>>   mov R10, R10    # long -> ptr
>>   add R11, R11, #1030    # ptr
>>   str  R17, [R11]    # int
>>   add R10, R10, #1023    # ptr
>>   str  R17, [R10]    # int
>>   mov R10, R12    # long -> ptr
>>   add R10, R10, #1001    # ptr
>>   str  R17, [R10]    # int
>> 
>> 
>> On AArch64, the conversion from long to pointer could be a no-op, but C2 does not know that. The existing code emits nothing for `mov dst, src` only when `dst` == `src` [1], so the resulting assembly is:
>> 
>>   ldr    x10, [x15,#120]
>>   ldp    x11, x12, [x16,#136]
>>   add    x11, x11, x10
>>   add    x12, x12, x10
>>   add    x11, x11, #0x406
>>   str    x17, [x11]
>>   add    x10, x10, #0x3ff
>>   str    x17, [x10]
>>   mov    x10, x12  <--- extra register copy
>>   add    x10, x10, #0x3e9
>>   str    x17, [x10]
>> 
>> 
>> There is still one extra register copy, which we're trying to remove in this patch.
>> 
>> This patch folds `CastX2P` into memory operands by introducing `indirectX2P` and `indOffX2P`. We also create a new opclass `iRegPorL2P` to remove extra copies from `CastX2P` in pointer addition.
>> 
>> Tier 1-3 tests pass on AArch64. There is no noticeable change in the size of libjvm.so.
>> 
>> [1] https://github.com/openjdk/jdk/blob/5c612c230b0a852aed5fd36e58b82ebf2e1838af/src/hotspot/cpu/aarch64/aarch64.ad#L7906
>
> src/hotspot/cpu/aarch64/aarch64.ad line 4235:
> 
>> 4233: operand immLOffset()
>> 4234: %{
>> 4235:   predicate(n->get_long() >= -256 && n->get_long() <= 65520);
> 
> Why is this using hard wired constants rather than using Address::offset_ok_for_immed?
> 
> Also, why is the constant value 65520?

I think `Address::offset_ok_for_immed` is too restrictive: we want a predicate that accepts a superset of all possible address offsets.


jshell> ((1<<12)-1) <<4
$3 ==> 65520
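
The jshell line above gives the upper bound. Here is a minimal sketch of where both bounds appear to come from, assuming the usual AArch64 load/store immediate forms: an unsigned 12-bit immediate scaled by the access size, and a signed 9-bit unscaled immediate. The class and method names are hypothetical and are not HotSpot code.

```java
// Sketch only: OffsetRange and its methods are hypothetical names, not HotSpot code.
public class OffsetRange {
    // Largest offset encodable in the unsigned, scaled 12-bit immediate form
    // for an access of (1 << log2Size) bytes.
    static long maxScaledOffset(int log2Size) {
        return ((1L << 12) - 1) << log2Size;
    }

    // Smallest offset encodable in the signed, unscaled 9-bit immediate form.
    static long minUnscaledOffset() {
        return -(1L << 8);
    }

    // Mirrors the range check in the quoted predicate, assuming 16 bytes
    // (log2Size == 4) is the largest access size of interest.
    static boolean inSupersetRange(long offset) {
        return offset >= minUnscaledOffset() && offset <= maxScaledOffset(4);
    }

    public static void main(String[] args) {
        for (int log2Size = 0; log2Size <= 4; log2Size++) {
            System.out.println((1 << log2Size) + "-byte access: max scaled offset = "
                    + maxScaledOffset(log2Size));
        }
        System.out.println("min unscaled offset = " + minUnscaledOffset());
    }
}
```

Under these assumptions, no offset outside [-256, 65520] is encodable for any access size up to 16 bytes, so the range is a superset. Whether a given offset is valid for a particular access size still has to be checked separately.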

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20157#discussion_r1691171119

