RFR: 8020282: Generated code quality: redundant LEAs in the chained dereferences [v2]
Manuel Hässig
mhaessig at openjdk.org
Thu Jun 12 14:26:36 UTC 2025
On Thu, 12 Jun 2025 12:26:01 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> Manuel Hässig has updated the pull request incrementally with three additional commits since the last revision:
>>
>> - Add comment to benchmark as to why we fix the heap size
>> - Add missing null chec
>> - Fix typos
>
> src/hotspot/cpu/x86/peephole_x86_64.cpp line 288:
>
>> 286: bool is_spill = lea_derived_oop->in(AddPNode::Address) != decode->in(1) &&
>> 287: lea_derived_oop->in(AddPNode::Address)->is_SpillCopy() &&
>> 288: decode->in(1)->is_SpillCopy();
>
> The logic around `is_spill` could be simplified by declaring pointers to the lea and decode address inputs and setting these either to their direct inputs or one level up in case of spilling. Something like this:
>
>
> Node* lea_address = lea_derived_oop->in(AddPNode::Address);
> Node* decode_address = decode->in(1);
>
> bool is_spill = lea_address != decode_address &&
> lea_address->is_SpillCopy() &&
> decode_address->is_SpillCopy();
>
> if (is_spill) {
> decode_address = decode_address->in(1);
> lea_address = lea_address->in(1);
> }
>
> // The leaP* and the decode must have the same parent. If we have a spill, they must have
> // the same grandparent.
> if (lea_address != decode_address) {
> return false;
> }
>
> (...)
>
>
>
> This creates more opportunities for simplification below, e.g. rewiring the base address of the affected leas directly to `lea_address` (or `decode_address`).
That makes a lot of things much clearer. I like it! Thank you for the suggestion! Incorporated in 3d6f8972a58
> src/hotspot/cpu/x86/peephole_x86_64.cpp line 325:
>
>> 323: // Ensure the MachProj is in the same block as the decode and the lea.
>> 324: if (proj == nullptr || !block->contains(proj)) {
>> 325: return false;
>
> The only scenario in which I think this may be possible is when we stress scheduling, otherwise `RFLAGS` projections should always be scheduled immediately after their input. Consider adding an assertion like `assert(StressGCM, "should be scheduled contiguously otherwise");` or similar here.
Interesting. I thought that this case would be unlikely, but did not know this. I added the assert in 0f464e131d4.
> src/hotspot/cpu/x86/peephole_x86_64.cpp line 349:
>
>> 347: // Remove spill for the decode if it does not have any other uses.
>> 348: if (is_spill && decode->in(1)->is_Mach() && decode->in(1)->outcnt() == 1 && block->contains(decode->in(1))) {
>> 349: MachNode* decode_spill = decode->in(1)->as_Mach();
>
> This could be simplified if `decode_spill` was defined already within the `if (is_spill)` statement proposed above. Furthermore, `decode->in(1)->is_Mach()` is implied by `is_spill`, so not needed.
Incorporated in 3d6f8972a58
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2142912020
PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2142906368
PR Review Comment: https://git.openjdk.org/jdk/pull/25471#discussion_r2142913342
More information about the hotspot-compiler-dev
mailing list