RFR: 8336245: AArch64: remove extra register copy when converting from long to pointer

Andrew Haley aph at openjdk.org
Fri Jul 12 14:36:51 UTC 2024


On Fri, 12 Jul 2024 13:49:32 GMT, Fei Gao <fgao at openjdk.org> wrote:

>> In the cases like:
>> 
>>   UNSAFE.putLong(address + off1 + 1030, lseed);
>>   UNSAFE.putLong(address + 1023, lseed);
>>   UNSAFE.putLong(address + off2 + 1001, lseed);
>> 
>> 
>> Unsafe intrinsifies direct memory access using a long as the base address, generating a `CastX2P` node converting long to pointer in C2. Then we get optoassembly code like:
>> 
>>   ldr  R10, [R15, #120]    # int ! Field: address
>>   ldr  R11, [R16, #136]    # int ! Field: off1
>>   ldr  R12, [R16, #144]    # int ! Field: off2
>>   add  R11, R11, R10
>>   mov R11, R11    # long -> ptr
>>   add  R12, R12, R10
>>   mov R10, R10    # long -> ptr
>>   add R11, R11, #1030    # ptr
>>   str  R17, [R11]    # int
>>   add R10, R10, #1023    # ptr
>>   str  R17, [R10]    # int
>>   mov R10, R12    # long -> ptr
>>   add R10, R10, #1001    # ptr
>>   str  R17, [R10]    # int
>> 
>> 
>> On AArch64, the conversion from long to pointer can be a no-op, but C2 doesn't know that. The existing code emits nothing for `mov dst, src` only when `dst` == `src` [1], so the resulting assembly is:
>> 
>>   ldr    x10, [x15,#120]
>>   ldp    x11, x12, [x16,#136]
>>   add    x11, x11, x10
>>   add    x12, x12, x10
>>   add    x11, x11, #0x406
>>   str    x17, [x11]
>>   add    x10, x10, #0x3ff
>>   str    x17, [x10]
>>   mov    x10, x12  <--- extra register copy
>>   add    x10, x10, #0x3e9
>>   str    x17, [x10]
>> 
>> 
>> There is still one extra register copy, which we're trying to remove in this patch.
>> 
>> This patch folds `CastX2P` into memory operands by introducing `indirectX2P` and `indOffX2P`. We also create a new opclass `iRegPorL2P` to remove extra copies from `CastX2P` in pointer addition.
>> 
>> Tier 1~3 passed on aarch64. No obvious change in the size of libjvm.so.
>> 
>> [1] https://github.com/openjdk/jdk/blob/5c612c230b0a852aed5fd36e58b82ebf2e1838af/src/hotspot/cpu/aarch64/aarch64.ad#L7906
>
> src/hotspot/share/opto/machnode.cpp line 400:
> 
>> 398: 
>> 399:   if (t->isa_intptr_t() &&
>> 400: #if !defined(AARCH64)
> 
> After applying the operand `indirectX2P`, we may have some patterns like:
> 
> str val, [CastX2P base]
> 
> The code path here will resolve the `base`, which is actually an `intptr`, not a `ptr`, and the offset is `0`.
> 
> I guess the code here was intended to support `[base, offset]`, where the base can be an `intptr` but the offset cannot be `0`. I'm not sure why there is a limitation that the offset cannot be `0`; maybe it is for some old machines?
> 
> I don't think the limitation applies to aarch64 machines now, so I unblocked it for aarch64.

I think it's the other way around. Isn't this code saying that if the address is an intptr plus a nonzero offset, then the returned type is bottom, i.e. nothing? What effect does this change have?
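
To make the question concrete, here is a toy Java model of that reading of the condition. It is a sketch under assumptions, not the HotSpot source: the names `AdrTypeSketch`, `Result` and `classify` are hypothetical, it assumes the `#if !defined(AARCH64)` guard covers only the "offset is nonzero" test, and it omits every other check the real function may perform.

```java
// Toy model only: NOT the HotSpot source. It encodes one reading of the
// guarded condition in machnode.cpp that this thread is discussing.
final class AdrTypeSketch {
    // Hypothetical stand-ins for the two outcomes under discussion.
    enum Result { BOTTOM, FALL_THROUGH }

    // baseIsIntptr: the base node is typed as an integer (intptr), not a pointer.
    // offset:       constant displacement of the memory operand.
    // aarch64:      models a build with AARCH64 defined, which is ASSUMED
    //               to drop the "offset != 0" test.
    static Result classify(boolean baseIsIntptr, long offset, boolean aarch64) {
        boolean offsetCheckPasses = aarch64 || offset != 0;
        if (baseIsIntptr && offsetCheckPasses) {
            return Result.BOTTOM;
        }
        return Result.FALL_THROUGH;
    }

    public static void main(String[] args) {
        // Pattern "str val, [CastX2P base]": intptr base, zero offset.
        System.out.println("zero offset, other platforms: " + classify(true, 0, false));
        System.out.println("zero offset, aarch64:         " + classify(true, 0, true));
        // Pattern "[base, #1023]": intptr base, nonzero offset.
        System.out.println("nonzero offset, any platform: " + classify(true, 1023, false));
    }
}
```

Under these assumptions the only case whose outcome differs between platforms is an intptr base with a zero offset, which is the `[CastX2P base]` pattern described above.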

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20157#discussion_r1676024922

