RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v4]
Kuai Wei
kwei at openjdk.org
Mon Jun 9 07:18:53 UTC 2025
On Wed, 4 Jun 2025 12:44:10 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Hi @eme64 , I tried to use match pattern for `MergePrimitiveLoads::has_no_merge_load_combine_below()` . But I think it has some difficulty. For mergeable operators, they can be linked in different way, like:
>> 1) (((item1 Or item2) Or item3) Or item4)
>> 2) ((item1 Or item2) Or (item3 Or item4))
>> ...
>> To check the next `Or` operator is a valid last one of combine operator chain. We may check its all input recursively. I didn't find a good way to revolve it. If you have better idea, I will check it.
>>
>> I think it's more easy to mark the combine operator checked. It works in this way:
>> * If the checking combine operator has successor combine operator , which is not checked before, we do not optimize it and let the next one has chance to be optimized.
>> * If we try to merge but failed, so we mark it as a `checked` and add its input into GVN worklist. So its input operators can be checked.
>>
>> I added comments of MergePrimitiveLoads::has_no_merge_load_combine_below() to describe the design.
>>
>> To reduce the memory size of `AddNode`. I removed the flag from `AddNode` and add 2 virtual fucntions
>> ```c++
>> // Check if this node is checked by merge_memops phase
>> virtual bool is_merge_memops_checked() const { return false; };
>> virtual void set_merge_memops_checked(bool v) { ShouldNotReachHere(); };
>>
>> The flag , `_merge_memops_checked`, is only added in OrINode and OrLNode.
>>
>> Could you help to check the design and code?
>>
>> Thanks.
>
> @kuaiwei Thanks for your reply!
>
>> I think it's more easy to mark the combine operator checked.
>
> It may seem easier now. But over time, if multiple operations had such flags, things would become very messy. And now every node that can be such a `combine operator` has to have an additional flag, and consumes more memory.
>
>> I tried to use match pattern for MergePrimitiveLoads::has_no_merge_load_combine_below() . But I think it has some difficulty. For mergeable operators, they can be linked in different way, like:
>> (((item1 Or item2) Or item3) Or item4)
>> ((item1 Or item2) Or (item3 Or item4))
>> ...
>
> Yes, we may have to deal with inputs being permuted. But I think we should be able to deal with the permutations, we do that in other places too.
>
>> To check the next Or operator is a valid last one of combine operator chain. We may check its all input recursively. I didn't find a good way to revolve it. If you have better idea, I will check it.
>
> I'm not sure I understood what you said here.
>
>> We may check its all input recursively
>
> You probably mean we could check all outputs?
>
> So if you are looking at the `OrINode`, and the pattern above it is already a `MergeLoad` pattern, then we should also look down, and see if we find other `OrINode`. For each of these output nodes, we should check if their other input could also be merged with what we already have. Do you not think this is possible? What exactly makes it difficult or impossible?
@eme64 From your description, I may change like below. Could you check if I understand correct? Thanks.
When IGVN check the input combine operator, called it as `_checked`. We can go down the combine operators chain to find the _last one.
for op from _last to _checked: // _checked is not include
collect merge_info_list by op
if it can be merged and _checked is in the list:
return // it will be merged when IGVN optimize this op
if can not merge or _checked is not in list:
continue;
// all successors of combine operators are checked, we can start to merge with _checked
...
I think it can work but there are some redundant `collect and check` work. And we can add a cache in IGVN to reduce it. Do you have other suggestion about it ?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24023#issuecomment-2954901892
More information about the hotspot-compiler-dev
mailing list