RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v10]
Quan Anh Mai
qamai at openjdk.org
Mon Jan 27 09:44:51 UTC 2025
On Mon, 27 Jan 2025 09:13:13 GMT, Daniel Lundén <dlunden at openjdk.org> wrote:
>> If `n` is the memory input of the load such that `n` is a `Phi`, `a` is an input of `n` and `a` is also an input of another `Phi` `m`, then we need to also search for anti-dependencies at `m`. If `m` is before `n` then your fix will miss it. Is there any mechanism that prevents this from happening?
>
> @merykitty: It sounds like you are proposing something very similar to my first [alternative changeset](https://github.com/openjdk/jdk/compare/master...dlunde:jdk:insert-anti-dependences-8333393+rewrite), where I search upwards through Phis in a first phase and then downwards through (a subset of) Phis in a second phase. I would advise against taking this route for the following reasons.
> - I believe that the current algorithm is, in fact, correct under the assumption that there is only one valid memory input for each load. We now see cases where we violate this invariant and have many equivalent memory inputs for loads, likely because of some previous change or changes elsewhere in C2. I plan to investigate this in a follow-up RFE (and if I find an issue, we may be able to remove the search for equivalent memory inputs added in this changeset).
> - The alternative changeset results in a significant compilation time regression for the C2 scheduling phase, compared to no measurable regression for the proposed changeset.
>
> I should also add that the current approach where we stop immediately at all Phis is quite conservative, and that there are opportunities to make it less conservative by not considering all Phis as anti-dependent (specifically, Phis where we know `initial_mem` is live on all inputs). This is another follow-up RFE that I plan to look at.
>
>> If n is the memory input of the load such that n is a Phi, a is an input of n and a is also an input of another Phi m, then we need to also search for anti-dependencies at m. If m is before n then your fix will miss it. Is there any mechanism that prevents this from happening?
>
> In all failing cases that we have observed, we only need to search for anti-dependences if `m` is an equivalent memory input to the load, in which case the proposed changeset will find `m`.
@dlunde No I am just proposing a way to explain the correctness of your approach as currently I don't understand the invariants of the memory graph, the relation between the topology of the memory graph and the constraints on scheduling and how our gcm algorithm can manage to impose such constraints.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/22852#issuecomment-2615263168
More information about the hotspot-compiler-dev
mailing list