RFR: 8292088: C2: assert(is_OuterStripMinedLoop()) failed: invalid node class: IfTrue
Christian Hagedorn
chagedorn at openjdk.org
Mon Sep 19 10:15:55 UTC 2022
On Fri, 16 Sep 2022 12:00:43 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:
> In `testKnownLimit()`, we directly use the (pre-incremented) iv phi `IV_PHI_i` (`232 Phi`) in the loop exit check of the `while` loop:
>
> ![Screenshot from 2022-09-16 10-33-58](https://user-images.githubusercontent.com/17833009/190604454-50aa1b1e-7111-4723-a329-b95e0f26c220.png)
>
> Such pre-incremented iv phi uses after the loop are detected in `PhaseIdealLoop::reorg_offsets()` and replaced in order to reduce register pressure. We insert an additional `Opaque2` node to prevent any optimizations to undo the effect of `PhaseIdealLoop::reorg_offsets()`:
>
>
> // iv Phi iv Phi
> // | |
> // | AddI (+stride)
> // | |
> // | Opaque2 # Blocks IGVN from folding these nodes until loop opts are over.
> // | ====> |
> // | AddI (-stride)
> // | |
> // | CastII # Preserve type of iv Phi
> // | |
> // Outside Use Outside Use
>
>
> In the test case, this is done before CCP and looks like this:
>
> ![Screenshot from 2022-09-16 10-33-35](https://user-images.githubusercontent.com/17833009/190623922-3b0c9eeb-8468-4cd7-8fe1-1f7df3dc5071.png)
>
> At that point, we do not know yet that the `while` loop is only gonna be executed once (i.e. `422 CountedLoopEnd` is always false). This only becomes known after CCP where the type of `232 Phi` improves. But since we have an `Opaque2` node, this update is not propagated until the `Opaque2` nodes are removed in macro expansion:
>
> https://github.com/openjdk/jdk/blob/11e7d53b23796cbd3d878048f7553885ae07f4d1/src/hotspot/share/opto/macro.cpp#L2412-L2414
>
> During macro expansion, we also adjust the strip mined loop: We move the `421 Bool` of the inner loop exit check `422 CountedLoopEnd` to the outer strip mined loop exit check and adjust the inner loop exit check in such a way that C2 cannot figure out that the entire loop is only run once. In the next IGVN phase, the outer strip mined loop node is removed while the inner loop `429 CountedLoop` is not.
>
> Later in `verify_strip_mined()`, we cannot find the outer strip mined loop of `429 CountedLoop` anymore and we fail with the assertion.
>
> The first thought to fix this problem is to add `Opaque2::Value()` to let type information flow. But this does not fix the problem completely if the type of the iv phi has no known upper limit. There we have the problem that in general `type(phi) != type(phi + x - x)` because `phi + x` could overflow and we end up with type `int` (which happens in `testUnknownLimit()`).
>
> I therefore suggest to remove `Opaque2` nodes earlier before macro expansion to fix this bug. A good place seems to be right after loop opts are over. We can remove them at the same time as `Opaque1` nodes by adding a similar `Identity()` method. This lets the loop nodes to be folded away before trying to adjust the outer strip mined loop limit.
>
> #### Are Opaque2 nodes really useful?
>
> When working on this bug, I started to question the usage of `Opaque2` nodes in general. We are still running IGVN after `Opaque2` nodes are currently removed. This simply undoes the effects of `PhaseIdealLoop::reorg_offsets()` again and we end up using pre-incremented iv phis anyways. My theory was that we are either blocking some specific optimizations during loop opts which cannot be reverted later in IGVN or that we initially (when this `Opaque2` optimization was added) did not run IGVN anymore once `Opaque2` nodes are removed.
>
> I could not think of any such non-revertable optimization that `Opaque2` nodes could prevent. On top of that, `PhaseIdealLoop::reorg_offsets()` also does not mention anything alike. I therefore had a look at the history of `Opaque2` nodes. Unfortunately, they were added before the initial load commit. I've dug deeper through some old closed repo and found that at the time the `Opaque2` nodes were introduced around 20 years ago, we did not do any IGVN anymore after the removal of the `Opaque2` nodes - and we generated code with these unoptimized `iv phi + x - x` patterns.
>
> This suggests that today the `Opaque2` nodes are indeed not really doing what they were originally supposed to do. I would therefore suggest to investigate their complete removal in a separate RFE and go with the suggested fix above which does not make the current situation of the questionable `Opaque2` node usages any worse.
>
> Thanks,
> Christian
Thanks Vladimir for your review!
-------------
PR: https://git.openjdk.org/jdk/pull/10306
More information about the hotspot-compiler-dev
mailing list