RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v10]

Tue Jan 28 13:33:56 UTC 2025

On Mon, 27 Jan 2025 17:19:12 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

> My problem is that you don't formally define what "equivalent memory inputs" is, and as a consequence, I don't understand what it means.

I have made another attempt at reorganizing the source code comment to try to make the definition of search roots (= equivalent memory inputs = same memory state of the load) clear. The definition starts at line 866. I still reason in terms of the dominator graph, and I do not have a definition that relates directly to the memory graph topology. If this is unacceptable, I'm fine with closing this PR and keep working on finding the fundamental issue.

> This raises the question what you mean by "ideal graph validity" and how it presents itself on the graph.

I mean equivalency in terms of program semantics and have made another attempt at rewording this paragraph. In the source code comment EXAMPLE 1, node 6 and 7 are the equivalent memory inputs to the load. They both represent the same memory state from which we load and choosing one or the other as the load's memory input does not change the program the ideal graph represents. In EXAMPLE 2, nodes 1 and 4 are instead the equivalent memory inputs.

> I strongly disagree, unless we can come up with a formal restriction on the memory graph that will imply this property, just thinking we mess up somewhere and blindly following this assumption that something must hold is not a good approach.

Yes, we would of course need to formalize one or more suitable memory graph
invariants and add assertions to ensure they hold at `insert_anti_dependences`
(and/or elsewhere). I'm not saying we should blindly follow anything.

> Consider a load that has initial_mem at B1 and early at B4:

Great, I think it would be good if we look at more examples. I guess you mean B3 and not B4?

> According to your logic, the Phi would be an equivalent memory input to the load.

Only if B1 dominates B2 and B2 dominates B3. It doesn't look like B1 dominates B2 in your illustration? Of course, it depends on if the `...` part connects to B1.

> But if the load has the memory input at initial_mem, initial_mem will be inspected to find anti-dependent stores, while if the memory input is at the Phi at B2, initial_mem is not inspected. This does not sound equivalent to me.

Assuming there is a path from B1 to B2 through the `...` so that the Phi at B2 is indeed a search root, `initial_mem` and the Phi both represent the same memory state from which we want to load (this is what I mean by equivalence). The memory state cannot have been overwritten in `...`, as otherwise `initial_mem` would not be at B1. What we have observed in this issue is that there may be relevant anti-dependences downstream of both `initial_mem` and the Phi. We need to identify and search for anti-dependences from both.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22852#issuecomment-2619005970