RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v4]
Emanuel Peter
epeter at openjdk.org
Wed Jun 4 12:46:27 UTC 2025
On Wed, 4 Jun 2025 07:50:24 GMT, Kuai Wei <kwei at openjdk.org> wrote:
>> @kuaiwei I'm not in a rush with this one. I'd rather we have a good design and be reasonably sure that it is correct, rather than rush it now and having to do extra cycles fixing things later ;)
>
> Hi @eme64 , I tried to use match pattern for `MergePrimitiveLoads::has_no_merge_load_combine_below()` . But I think it has some difficulty. For mergeable operators, they can be linked in different way, like:
> 1) (((item1 Or item2) Or item3) Or item4)
> 2) ((item1 Or item2) Or (item3 Or item4))
> ...
> To check the next `Or` operator is a valid last one of combine operator chain. We may check its all input recursively. I didn't find a good way to revolve it. If you have better idea, I will check it.
>
> I think it's more easy to mark the combine operator checked. It works in this way:
> * If the checking combine operator has successor combine operator , which is not checked before, we do not optimize it and let the next one has chance to be optimized.
> * If we try to merge but failed, so we mark it as a `checked` and add its input into GVN worklist. So its input operators can be checked.
>
> I added comments of MergePrimitiveLoads::has_no_merge_load_combine_below() to describe the design.
>
> To reduce the memory size of `AddNode`. I removed the flag from `AddNode` and add 2 virtual fucntions
> ```c++
> // Check if this node is checked by merge_memops phase
> virtual bool is_merge_memops_checked() const { return false; };
> virtual void set_merge_memops_checked(bool v) { ShouldNotReachHere(); };
>
> The flag , `_merge_memops_checked`, is only added in OrINode and OrLNode.
>
> Could you help to check the design and code?
>
> Thanks.
@kuaiwei Thanks for your reply!
> I think it's more easy to mark the combine operator checked.
It may seem easier now. But over time, if multiple operations had such flags, things would become very messy. And now every node that can be such a `combine operator` has to have an additional flag, and consumes more memory.
> I tried to use match pattern for MergePrimitiveLoads::has_no_merge_load_combine_below() . But I think it has some difficulty. For mergeable operators, they can be linked in different way, like:
> (((item1 Or item2) Or item3) Or item4)
> ((item1 Or item2) Or (item3 Or item4))
> ...
Yes, we may have to deal with inputs being permuted. But I think we should be able to deal with the permutations, we do that in other places too.
> To check the next Or operator is a valid last one of combine operator chain. We may check its all input recursively. I didn't find a good way to revolve it. If you have better idea, I will check it.
I'm not sure I understood what you said here.
> We may check its all input recursively
You probably mean we could check all outputs?
So if you are looking at the `OrINode`, and the pattern above it is already a `MergeLoad` pattern, then we should also look down, and see if we find other `OrINode`. For each of these output nodes, we should check if their other input could also be merged with what we already have. Do you not think this is possible? What exactly makes it difficult or impossible?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24023#issuecomment-2939903492
More information about the hotspot-compiler-dev
mailing list