RFR: 8346890: AArch64: Type profile counters generate suboptimal code
Aleksey Shipilev
shade at openjdk.org
Fri Jan 10 09:53:35 UTC 2025
On Thu, 9 Jan 2025 16:30:49 GMT, Andrew Haley <aph at openjdk.org> wrote:
> Type profile counters are emitted many times in C1-generated code. The generator was written a long time ago before we knew how best to write AArch64 code, and the generated code is rather suboptimal.
>
> This PR reduces the size of a typical bimorphic type profile counter from 33 to 27 instructions.
Can you explain a bit more here? I think I get why we would want to rewrite `lea+ldr` to `slot_at`.
I do not quite understand why we rewrite this one:
- Address data_addr(mdo, md->byte_offset_of_slot(data, ReceiverTypeData::receiver_count_offset(i)));
- __ addptr(data_addr, DataLayout::counter_increment);
+ __ addptr(slot_at(ReceiverTypeData::receiver_count_offset(i)),
+ DataLayout::counter_increment);
Does it really optimize anything to rewrite it to `slot_at`? If so, shouldn't this one in the other hunk also get rewritten?
Address data_addr(mdo, md->byte_offset_of_slot(data, VirtualCallData::receiver_count_offset(i)));
__ addptr(data_addr, DataLayout::counter_increment);
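To make the question concrete, here is a toy model of the two spellings. It is only a sketch under the assumption that `slot_at` is a thin wrapper around the same `Address(mdo, md->byte_offset_of_slot(data, ...))` computation. The `Address` struct, the cell size, and the offsets below are made-up stand-ins, not the HotSpot definitions. If that assumption holds, both spellings hand `addptr` the same operand, so I would expect no difference in the emitted code for this hunk.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in, NOT the HotSpot type: an address is a base register plus
// a byte displacement.
struct Address {
  int     base;  // register number (hypothetical)
  int64_t disp;  // byte displacement
  bool operator==(const Address& o) const {
    return base == o.base && disp == o.disp;
  }
};

// Hypothetical layout: each receiver row is {receiver, count}, one 8-byte
// cell each, starting at a made-up offset inside the MDO.
constexpr int64_t kCellSize = 8;
constexpr int64_t kDataBase = 64;

constexpr int64_t receiver_count_offset(int row) {
  return (2 * row + 1) * kCellSize;
}
constexpr int64_t byte_offset_of_slot(int64_t slot_offset) {
  return kDataBase + slot_offset;
}

// Old spelling: build a named Address, then use it.
Address old_spelling(int mdo, int row) {
  Address data_addr{mdo, byte_offset_of_slot(receiver_count_offset(row))};
  return data_addr;
}

// New spelling, ASSUMING slot_at only wraps the same computation.
Address new_spelling(int mdo, int row) {
  auto slot_at = [=](int64_t offset) {
    return Address{mdo, byte_offset_of_slot(offset)};
  };
  return slot_at(receiver_count_offset(row));
}

// True when both spellings produce the same base and displacement.
bool same_operand(int mdo, int row) {
  return old_spelling(mdo, row) == new_spelling(mdo, row);
}

// Checks the two rows of a bimorphic profile in this toy model.
void check_rows() {
  for (int row = 0; row < 2; ++row) {
    assert(same_operand(/*mdo=*/1, row));
  }
}
```

If `slot_at` does something more than this in the PR, for example reusing a precomputed base register, then the model above does not apply and the `VirtualCallData` hunk would presumably benefit from the same rewrite.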
-------------
PR Review: https://git.openjdk.org/jdk/pull/23012#pullrequestreview-2542002023