RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v5]

Emanuel Peter epeter at openjdk.org
Mon Jan 20 10:44:41 UTC 2025


On Tue, 7 Jan 2025 18:00:21 GMT, Daniel Lundén <dlunden at openjdk.org> wrote:

>> Good analysis, Daniel! Given the presence of overlapping memory phis (memory phis that are placed in the same block and include aliasing memory slices), the general idea of this fix seems reasonable to me. As a more fundamental solution, it would be worth investigating (perhaps separately) the root cause of this overlap and exploring whether it is feasible to enforce disjointness (an invariant apparently assumed by the original `PhaseCFG::insert_anti_dependences` algorithm), at least during code generation.
>> 
>> Does the comment above the definition of `initial_mem` require any update as part of this change?
>
> Thanks for the review @robcasloz!
> 
>> Given the presence of overlapping memory phis (memory phis that are placed in the same block and include aliasing memory slices), the general idea of this fix seems reasonable to me. As a more fundamental solution, it would be worth investigating (perhaps separately) the root cause of this overlap and exploring whether it is feasible to enforce disjointness (an invariant apparently assumed by the original PhaseCFG::insert_anti_dependences algorithm), at least during code generation.
> 
> Yes, I think it could be useful to investigate further. Based on my observations while working on this issue, the overlapping memory Phis likely result from loop peeling. However, the Phi overlap is not the sole cause of this issue, as the second example demonstrates. I suggest we write an RFE and investigate in a separate issue.
> 
>> Does the comment above the definition of initial_mem require any update as part of this change?
> 
> Yes, thanks. Added now.

@dlunde Thanks for the updates. I still struggle to understand the algorithm completely. Christian and I looked at it today for 1.5h. I think it may be best if we schedule a call to discuss this tomorrow.

We also thought that the documentation of the algorithm is a big difficult to understand (before your changes), and it would be nice if we could improve that now that we are digging into it anyway.

We are also not sure if the graph is sane before `insert_anti_dependences`. It would be nice if you could show us a graph that includes the CFG graph and the memory graph, so that it is a little easier to understand what happens. It is interesting that in Example 1, it seems that both the `Load` and `membar_release` are in the same `early` block.

We also wondered why the `membar_release` for the volatile byte ends up on the same slice as the int field. It's well possible that this is correct, but just less than optimal.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/22852#issuecomment-2602061177


More information about the hotspot-compiler-dev mailing list