RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v4]

Christian Hagedorn chagedorn at openjdk.org
Thu Feb 27 14:42:00 UTC 2025


On Tue, 25 Feb 2025 19:11:15 GMT, Daniel Lundén <dlunden at openjdk.org> wrote:

>> When searching for load anti-dependences in GCM, the memory state for the load is sometimes represented not only by the memory node input of the load, but also other memory nodes. Because PhaseCFG::insert_anti_dependences searches for anti-dependences only from the load's memory input, it is, therefore, possible to sometimes overlook anti-dependences. The result is that loads are potentially scheduled too late, after stores that redefine the memory states of the loads.
>> 
>> ### Changeset
>> 
>> It is not yet clear why multiple nodes sometimes represent the memory state of a load, nor if this is expected. We can, however, resolve all the miscompiled test cases seen in this issue by improving the idealization of Phi nodes. Specifically, there is an idealization where we split Phis through input MergeMems, that we, prior to this changeset, applied too conservatively.
>> 
>> To illustrate the idealization and how it resolves this issue, consider the example below.
>> 
>> ![failure-graph-1](https://github.com/user-attachments/assets/ecbd204f-bdf0-49cb-a62e-8081d08cfe0c)
>> 
>> `64 membar_release` is a critical anti-dependence for `183 loadI`. The anti-dependence search starts at the load's direct memory input, `107 Phi`, and stops immediately at Phis. Therefore, the search ends at `106 Phi` and we never find `64 membar_release`.
>> 
>> We can apply the split-through-MergeMem Phi idealization to `119 Phi`. This idealization pushes `119 Phi` through `120 MergeMem` and `121 MergeMem`, splitting it into the individual inputs of the MergeMems in the process. As a result, we replace `119 Phi` with two new Phis. One of these generated Phis has identical inputs to `107 Phi` (`106 Phi` and `104 Phi`), and further idealizations will merge this new Phi and `107 Phi`. As a result, `107 Phi` then has a Phi-free path to `64 membar_release` and we correctly discover the anti-dependence.
>> 
>> The changeset consists of the following changes.
>> - Add an analysis that allows applying the split-through-MergeMem idealization in more cases than before (including in the above example) while still ensuring termination.
>> - Add a missing `ResourceMark` in `PhiNode::split_out_instance`.
>> - Add multiple new regression tests in `TestGCMLoadPlacement.java`.
>> 
>> For reference, [here](https://github.com/openjdk/jdk/pull/22852) is a previous PR with an alternative fix that we decided to discard in favor of the fix in this PR.
>> 
>> ### Testing
>> 
>> - [GitHub Actions](https://github.com/dlunde/jdk/ac...
>
> Daniel Lundén has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix subtle bug introduced in previous update

I agree with this solution. Thanks for deep diving once more into it an proposing this alternative solution. Great work! I left a few comments, otherwise, looks good.

Thanks for the credit! Maybe you also want to add Emanuel as contributor since he also joined in the discussions :-)

src/hotspot/share/opto/cfgnode.cpp line 2393:

> 2391:     // non-termination.
> 2392:     uint merge_width = 0;
> 2393:     bool split_must_terminate = false; // Is splitting guaranteed to terminate?

Needed to read it twice to understand it. How about `split_always_terminates` instead to make it more clear?

src/hotspot/share/opto/cfgnode.cpp line 2435:

> 2433:       ResourceMark rm;
> 2434:       VectorSet visited;
> 2435:       Node_List worklist;

You could also use a `Unique_Node_List` instead of a `Node_List` + `VectorSet`:

Unique_Node_List worklist;
for (uint i = 0; i < worklist.size(); i++) {
}

src/hotspot/share/opto/cfgnode.cpp line 2448:

> 2446:       };
> 2447:       split_must_terminate = true; // Assume no circularity until proven otherwise.
> 2448:       while (split_must_terminate && worklist.size() > 0) {

Seems like you have `split_must_terminate` in this condition because the `break` in the `else` path does not exit both loops. When using a separate method, you could directly return false when finding that `split_must_terminate` is false. Then you can remove `split_must_terminate` from this exit condition.

src/hotspot/share/opto/cfgnode.cpp line 2469:

> 2467:         }
> 2468:       }
> 2469:     }

Would it make sense to extract this to a separate method `is_split_through_mergemem_terminating()` (or something similar) which returns the value for `split_must_terminate`? The `PhiNode::Ideal()` method is already extremely large.

test/hotspot/jtreg/compiler/codegen/TestGCMLoadPlacement.java line 30:

> 28:  * @bug 8333393
> 29:  * @summary Test that loads are not scheduled too late.
> 30:  * @run main/othervm -XX:+UnlockDiagnosticVMOptions

`PerMethodTrapLimit` is a pure product flag, so you can remove `-XX:+UnlockDiagnosticVMOptions`. Same below where you don't use `Stress*` flags (better double check again with a product build to be safe :-) )

test/hotspot/jtreg/compiler/codegen/TestGCMLoadPlacement.java line 31:

> 29:  * @summary Test that loads are not scheduled too late.
> 30:  * @run main/othervm -XX:+UnlockDiagnosticVMOptions
> 31:  *                   -XX:CompileCommand=quiet

Is `quiet` required?

test/hotspot/jtreg/compiler/codegen/TestGCMLoadPlacement.java line 101:

> 99: 
> 100:         int test() {
> 101:             for (int i = 0; i < 50; ++i)

You should add missing braces here

-------------

PR Review: https://git.openjdk.org/jdk/pull/23691#pullrequestreview-2647941285
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973677330
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973689113
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973691300
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973682276
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973704095
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973699874
PR Review Comment: https://git.openjdk.org/jdk/pull/23691#discussion_r1973705788


More information about the hotspot-compiler-dev mailing list