RFR: 8347018: C2: Insertion of Assertion Predicates ignores the effects of PhaseIdealLoop::clone_up_backedge_goo() [v2]
Christian Hagedorn
chagedorn at openjdk.org
Fri Jan 17 09:06:36 UTC 2025
On Tue, 14 Jan 2025 17:56:09 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:
>> ## Failing Assert
>>
>> The failing assert in `PhaseCFG::schedule_late()` checks the following:
>> https://github.com/openjdk/jdk/blob/55c6904e8f3d02530749bf28f2cc966e8983a984/src/hotspot/share/opto/gcm.cpp#L1431-L1438
>>
>> In the test case, this is violated for `87 storeI`:
>> 
>>
>> The early block `early` for `87 storeI` is bound by `115 loadI` pinned at `161 Region` which is dominated by the control input `146 Region` of `87 storeI`. This lets the assert fail.
>>
>> ## How Did `115 loadI` End up Being Pinned below `87 storeI`?
>> ### Before Pre/Main/Post Loop Creation
>> Before the creation of pre/main/post loops, we have the following graph:
>>
>> 
>>
>> Everything looks fine: The control input of `312 StoreI` (which is eventually cloned and becomes `87 storeI` in the Mach graph) corresponds to the early placement of the store. `415 LoadI` was hoisted out of the loop during Loop Predication and is pinned above at a Template Assertion Predicate.
>>
>> ### Pre/Main/Post Loop Creation
>> #### Post Loop Body Creation
>> During the creation of pre/main/post loops, we clone the main loop body for the post loop body:
>>
>> 
>>
>> We notice that `312 StoreI` is pinned on the main loop backedge. When finishing the last iteration from the main loop and possibly continuing in the post loop, we need to feed everything on the loop backedge of the main loop to the post loop. However, the pinned nodes on the main loop backedge cannot float. Therefore, we need to create new copies of these pinned nodes with `PhaseIdealLoop::clone_up_backedge_goo()`.
>>
>> The pins are updated to the entry of the post loop. All inputs into these pinned nodes that have their current control (fetched with `get_ctrl()`) on the main loop backedge as well are also cloned but keep their control inputs (if any) if it's not the loop backedge.
>>
>> In our example, this applies to `453 StoreI` -> `479 StoreI`, and some inputs recursively (`454 AddI` -> `482 AddI`, `481 LoadI` -> `541 Load`):
>>
>> 
>>
>> Still, all looks fine. Notice that the clone `481 LoadI` of `455 LoadI` is currently still pinned at the same Template Assertion.
>>
>> #### Assertion Predica...
>
> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision:
>
> Renaming
Thanks Emanuel for your questions, let's first make a step back. This patch essentially fixes two issues:
1. "Cloned Nodes with `clone_up_backedge_goo()` Mess with "Node inside Main Loop" Check" (see section above):
Regardless of the observed assertion failure, I think the cloned nodes only belonging to the main loop should not be pinned before the pre loop. I could not come up with a failing case but I think we should fix this (done with `NodeInMainLoopBody` class).
2. The observed assert where the control input of a store does not match its early block in GCM:
The main part of the fix ensures that the cloned nodes from `clone_up_backedge_goo()` actually end up at the loop entry as originally assumed by the method.
> Why is it a problem that the control the store dominates the control of the load? Could such a constellation not happen in other circumstances too? Could we not come up with some case where the store has its control moved up above the control of the load somehow - or is that generally not supposed to happen?
The assert was introduced around 3 years ago and never failed before. It sounds like a condition that is always met - until discovered now. But when closer looking at the the original intention of `clone_up_backedge_goo()` to pin the cloned nodes at the loop entry, it should still ensure that this assert would hold. But Assertion Predicates now mess with that since we inject additional `Ifs` between the cloned nodes and the loop entry.
The comment at the failing assert suggests that this invariant is required such that `hoist_to_cheaper_block()` works properly. However, I'm unclear about the impact - whether it's just missing some optimization or possibly leading to wrong code or a crash. Maybe @robcasloz can comment on that who introduced the assert.
I've tried to play around to create such a situation without Assertion Predicates but could not find a case how we could violate the assert - but of course, that does not prove anything.
> I suppose I am wondering why can GCM not just handle such cases?
I think that's also a possible option. I've decided to do the fix at the Assertion Predicates creation point for the following reasons:
- Without Assertion Predicates, we have not seen this assert fail.
- Assertion Predicates are violating the guarantees that `clone_up_backedge_goo()` wants to promise: The cloned nodes are not ending up at the loop entry. I'm not sure who else is indirectly relying on this. We do not seem to have traced anything back to this, yet, but we also cannot be sure if there are other hidden problems.
- I'm not sure how difficult it will be to fix GCM and what the impact is. When assuming that Assertion Predicates should have been inserted correctly, without messing with the effects of `clone_up_backedge_goo()`, we have never seen a case where GCM should be able to support this case.
- Since I first considered to get this into JDK 24, fixing the new Assertion Predicate code seemed more straight forward and less risky.
For these reasons, I've chosen the current fix idea without trying to change the current behavior of GCM. What are your thoughts about that?
> You should probably adjust the title of the PR / Bug.
Good point, even though it's addressing two problems, I could just make it more obvious what the fixes are about. Changed!
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23071#issuecomment-2597744603
More information about the hotspot-compiler-dev
mailing list