RFR: 8345485: C2 MergeLoads: merge adjacent array/native memory loads into larger load [v4]

Kuai Wei kwei at openjdk.org
Mon Jun 16 07:10:40 UTC 2025


On Wed, 11 Jun 2025 12:13:22 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> @eme64 From your description, I may change like below. Could you check if I understand correct? Thanks.
>> When IGVN check the input combine operator, called it as `_checked`. We can go down the combine operators chain to find the _last one.
>> 
>> for op from _last to _checked:  // _checked is not include
>>   collect merge_info_list by op
>>   if it can be merged and _checked is in the list:
>>       return     // it will be merged when IGVN optimize this op
>>   if can not merge or _checked is not in list:
>>       continue;
>> // all successors of combine operators are checked, we can start to merge with _checked
>> ...
>> 
>> I think it can work but there are some redundant `collect and check` work. And we can add a cache in IGVN to reduce it. Do you have other suggestion about it ?
>
> @kuaiwei I'm struggling to follow your pseudocode, can you please expand a little or try to describe in other words?
> 
> It is good to reduce redundant work, but it has to be worth the complexity. But I'm skeptical of caching things between IGVN optimizations, it is just not something we do, as far as I know. Caching could also be tricky when the cached things are not accurate any more. What exactly would be your approach with caching?

@eme64 I added some detail to pseudo code. I hope it can explain my design. 
A difficult part of the optimization is that we need find the right combine operator which can be replaced with merged load. Especially it has successor combine operator. We need to find which one is better candidate.
For a given combine operator, the `MergePrimitiveLoad::run()` will do like this:

```c++
void MergePrimitiveLoad::run(AddNode* _check) {
   // go down through the unique output of _check to collect successor combine operators
   GrowableArray<AddNode*> rest_combine_operators = collect_successor_combine_operators(_check)

   if ( !rest_combine_operator.is_empty() ) {
       for (AddNode* op in rest_combine_operators) {
           // collect mergeable load info list from this combine operator
           merge_list = collect_merge_info_list(op)
           if (_check is in merge_list) {
              // we will not optimize _check, it will be merged when op is optimized
              return;
           }
       }
   }

   // all successor combine operators are checked, we can start to optimize the given _check operator
   ...
}

// For a given combine operator, collect mergeable load info list. Every item
// in the result list is a tuple of (Load, combine, shift_value)
// It's similar with MergePrimitive::collect_merge_list() in previous patch, more detail can be found in code.

GrowableArray<MergeLoadInfo> collect_merge_info_list(AddNode* combine) {
   // get the load from input of combine
   LoadNode* load = get_load_from_input(combine);
   if (load == nullptr) return;

   MemNode* mem = memory input of load;

   for every output of mem {
      if (output->isa_load()) {
         // make a merged info for this load node
         info = merge_load_info(output);
      }
      if ( !info->is_invalid() ) {
         append info into result list
      }
   }

   // check the merged bytes and if they are adjacent.
   ...

   return result list;
}

-------------

PR Comment: https://git.openjdk.org/jdk/pull/24023#issuecomment-2969508990


More information about the hotspot-compiler-dev mailing list