RFR: 8351414: C2: MergeStores must happen after RangeCheck smearing

Emanuel Peter epeter at openjdk.org
Mon Mar 10 12:04:29 UTC 2025


On Fri, 7 Mar 2025 16:28:56 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> @merykitty I'm not sure if I understood you right. You seem to say that I should do this (please correct me):
>> - Traverse the WHOLE graph, looking for stores (I would like to avoid that, that's why I carry around the list).
>> - Rewrite the logic in MergeStores, and somehow have a global view rather than the local one. That's lots of work and I'm not sure I want to invest the time, unless it is somehow clearly better.
>> 
>> This is really a fix for Valhalla, see https://bugs.openjdk.org/browse/JDK-8348959, so it would be nice to fix this rather sooner. If someone wants to then refactor the code later, that's fine with me ;)
>
> @eme64
> 
>> Traverse the WHOLE graph, looking for stores (I would like to avoid that, that's why I carry around the list).
> 
> I think keeping a list is fine.
> 
>> Rewrite the logic in MergeStores, and somehow have a global view rather than the local one. That's lots of work and I'm not sure I want to invest the time, unless it is somehow clearly better.
> 
> I think the logic would still be the same. We start at a store then try to find the last store in the chain, then group the stores and do the merge. After that, we can remove the replaced stores from the work list. It is global in the sense that we can freely walk the graph instead of being restricted to the current node that is invoking `Ideal`. This leads to a series of:
> 
>     Status status_use = find_adjacent_use_store(_store);
>     if (status_use.found_store() != nullptr) {
>       return nullptr;
>     }
> 
> while we can do
> 
>     while (next != nullptr) {
>       StoreNode* last = next;
>       next = find_adjacent_use_store(last);
>     }
> 
> It is also better because idealisation of a store node may be invoked several times, leading to useless `find_adjacent_use_store` invocations.

@merykitty Thanks for giving more details. I agree that your idea would lead to some fewer adjacency checks, and so it would be somewhat desirable to do that. However, splitting out the `merge_stores` list would have to be done anyway, and that is almost all of the code I have here, so this here is a step in the right direction. Using IGVN is fine in my view, especially because I'm not newly introducing it here, but just moving it.

I personally have decided to only put minimal effort into MergeStores, my priorities are elsewhere. This issue here is primarily addressing the problems in the Valhalla repo, where the issues with the order of RC smearing and MergeStores is somehow more apparent than on mainline. So if you feel that this idea is important to you, feel free to file an RFE, maybe someone else wants to take this effort on.

@kuaiwei I think for now I'll leave it with `merge_stores`, but once you implement the `merge_loads`, you can just rename it. I'm not sure yet what would be the best name. `merge_mem` reminds me too much of the `MergeMemNode`. Maybe `merge_loads_and_stores` or `merge_memops` could be a better alternative.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23944#issuecomment-2710163924


More information about the hotspot-compiler-dev mailing list