RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v2]
Emanuel Peter
epeter at openjdk.org
Fri Jan 17 08:21:43 UTC 2025
On Fri, 17 Jan 2025 08:09:15 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>>> Hmm, ok. I think the description here should be made more clear then, and explain what the strategy is, and attempt a kind of informal proof why this is an ok approach. What do you think?
>>
>> Do you mean a general proof sketch of the approach in `PhaseCFG::insert_anti_dependences`? This changeset does not really change the approach itself, but just extends the search to more initial memory states.
>>
>>> but it does not really reassure me that we have all cases covered
>>
>> I am not fully reassured we cover all cases either, but we at least now cover more cases than before (and specifically the cases that appear for this issue). See also the earlier comment by Roberto on investigating whether we can somehow enfore in earlier phases the invariant that the original version of `PhaseCFG::insert_anti_dependences` assumed: that we only need to search from one initial memory state. I would suggest doing this in a separate RFE.
>>
>>> What does the initial mean to you?
>>
>> To me it means the initial memory states that we start our searches at. I'm fine with your renaming suggestion, calling them roots. I would perhaps suggest that we then also say `input_mem` instead of `initial_mem` to signify that it is the actual input to the load.
>
>> I am not fully reassured we cover all cases either, but we at least now cover more cases than before (and specifically the cases that appear for this issue). See also the earlier comment by Roberto on investigating whether we can somehow enfore in earlier phases the invariant that the original version of PhaseCFG::insert_anti_dependences assumed: that we only need to search from one initial memory state. I would suggest doing this in a separate RFE.
>
> Hmm. I guess we could allow your "ad hoc" looking fix. Still, I think you could improve the comment, so that if someone else hits a bug here they have an easier starting point.
>
> Maybe you could say something like this:
> We only add anti-dependency edges for `load -> store` and not for `load -> phi`. But below a `phi` there could be more `stores`, which must happen after the `load`.
>
> Then say something more about how you find all the relevant `load`s with your forest traversal.
>
> Another question: why not just traverse down through `phi` nodes, just like with the `MergeMem`s? That way, we could also find all `stores` that have to happen after a `load`, right? It seems to me that would be easier to make a proof for.
>
> Actually, does your current algorithm with traversing all blocks on the idom change find all relevant `loads`? Maybe you can actually make a good argument here too. Maybe you can argue in which blocks the `load` could be scheduled (all from `early` to the initial_mem_block ?), and so we must ensure the anti-dependency edges for any `store` in those blocks that could alias. Am I getting this right?
>
> If that is true, you could alternatively just search down through `MergeMem` and `Phi`, collecting all `stores` at the "leaves". But constrain the search to the blocks between `early` and `initial_mem_block`.
>
> I'm not saying that you need to change your algorithm, I'm just trying to understand at this point. Maybe your solution is correct, and we can convince ourselves that it actually covers all cases if we come up with a good proof / argument.
About naming: I don't care which exact names you use, it would just be nice to have some clear definition of what means what at the beginning, so the reader does not have to reverse engineer it ;)
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22852#discussion_r1919716838
More information about the hotspot-compiler-dev
mailing list