RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v2]

Daniel Lundén dlunden at openjdk.org
Thu Jan 9 08:58:41 UTC 2025


On Thu, 9 Jan 2025 08:18:34 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Thanks for the comments @eme64!
>> 
>>> In your example 1, why do we therefore not put an anti-dependency edge between the 183 load and the 106 Phi? Would that not be enough to ensure the load is scheduled before the other memory affecting nodes further below 106 Phi?
>>>
>>> Or is the issue that this traversal is somehow restricted to blocks - I don't remember that from last time...
>>> I'll keep reading the changes now.
>> 
>> Yes, Phis only result in LCA changes at the block level and we never add anti-dependence edges directly between the load and Phi nodes. In example 1, we do mark the last block in between the path from 107 Phi to 106 Phi (which is B27) for raising the LCA. However, when doing the joint LCA raising operation later on (`raise_LCA_above_marks`), we start at the original LCA and stop when we reach the early block (B20). Therefore, we never even consider B27. My very first attempt at solving this issue was to try and identify some dominance relation between the early block and the blocks for and in between 107 Phi and 106 Phi and use this information to force the LCA to the early block. This kind of worked at the block level, but we still need to identify somehow that we need an anti-dependence edge to 64 membar_release. Otherwise, it can happen that the load is scheduled correctly in the early block, but incorrectly (after an overwriting store) within the block (easily verified with `-XX:+StressLCM`).
>> 
>>> And in example 2, we should schedule before the Phi as well:
>>> Why don't we do that?
>> 
>> Same here as above, we do tag both B19 and B21 for raising the LCA, but never consider them in `raise_LCA_above_marks` since they are above the early block B9.
>
>> we never add anti-dependence edges directly between the load and Phi nodes
> 
> Ah interesting. Do you know why we do not do that? Would that generate worse code? Because it seems to me that would add fewer edges, and would probably require a smaller traversal. What do you think?

My understanding is that anti-dependence edges are only relevant for local scheduling (within blocks). Because Phis merge control-flow paths (by definition at the start of blocks), I would say it makes little sense to add an anti-dependence edge to Phi nodes. Does it make sense semantically to schedule loads before Phi nodes within a block? I don't think so, but I may be wrong.

I think what you are getting at is, at the block level, whether or not it is possible to raise the LCA above the Phi itself, rather than before the relevant inputs. That would make the scheduling less conservative. Have a look at [this comment](https://github.com/openjdk/jdk/pull/22852/files#diff-13dc4f80ba6ccaa27b0612318074e35200ffe9314405e30ace331807e56b5f60L870-L876) in the source. There are previous attempts at making the LCA raising less conservative in this manner (see, e.g., [JDK-8192992](https://bugs.openjdk.org/browse/JDK-8192992)), but it turns out to be quite tricky to get right. It is definitely an issue separate from the present one!
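
Coming back to the present issue, the block-level behavior quoted above (blocks tagged for raising the LCA are never considered when they sit above the early block) can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the HotSpot code: `ToyBlock`, `visited_marks`, and `toy_cfg` are hypothetical names, and the search below is a plain backward walk over predecessors that stops at the early block, which is all the quoted description relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG block. NOT HotSpot's Block class; all names here are hypothetical.
struct ToyBlock {
  std::vector<int> preds;  // indices of CFG predecessor blocks
  bool marked = false;     // "tagged for raising the LCA"
};

// Backward search from `lca` that never looks at or past `early`.
// Returns the indices of the marked blocks the search actually visits.
std::vector<int> visited_marks(const std::vector<ToyBlock>& cfg, int lca, int early) {
  assert(lca >= 0 && static_cast<std::size_t>(lca) < cfg.size());
  std::vector<bool> seen(cfg.size(), false);
  std::vector<int> worklist{lca};
  std::vector<int> found;
  while (!worklist.empty()) {
    int b = worklist.back();
    worklist.pop_back();
    if (b == early || seen[b]) continue;  // stop searching at the early block
    seen[b] = true;
    if (cfg[b].marked) found.push_back(b);
    for (int p : cfg[b].preds) worklist.push_back(p);
  }
  return found;
}

// Straight-line toy CFG: 0 -> 1 -> 2 -> 3.
// Block 0 is marked but lies above the early block (1); block 2 is marked
// and lies between the early block and the original LCA (3).
std::vector<ToyBlock> toy_cfg() {
  std::vector<ToyBlock> cfg(4);
  cfg[1].preds = {0};
  cfg[2].preds = {1};
  cfg[3].preds = {2};
  cfg[0].marked = true;
  cfg[2].marked = true;
  return cfg;
}
```

Calling `visited_marks(toy_cfg(), 3, 1)` in this toy model only ever sees the mark on block 2. The mark on block 0 is set but never reached, which mirrors how B27 (example 1) and B19/B21 (example 2) are tagged but never considered.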

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22852#discussion_r1908378186

