RFR: 8351889: C2 crash: assertion failed: Base pointers must match (addp 344)

Emanuel Peter epeter at openjdk.org
Wed May 28 08:36:51 UTC 2025


On Thu, 22 May 2025 08:35:18 GMT, Roland Westrelin <roland at openjdk.org> wrote:

> The test case has an out of loop `Store` with an `AddP` address
> expression that has other uses and is in the loop body. Schematically,
> only showing the address subgraph and the bases for the `AddP`s:
> 
> 
> Store#195 -> AddP#133 -> AddP#134 -> CastPP#110
>                      -> CastPP#110
> 
> 
> Both `AddP`s have the same base, a `CastPP` that's also in the loop
> body.
> 
> That loop is a counted loop and only has 3 iterations so is fully
> unrolled. First, one iteration is peeled:
> 
> 
>                                 /-> CastPP#110
> Store#195 -> Phi#360 -> AddP#133 -> AddP#134 -> CastPP#110
>                     -> AddP#277 -> AddP#278 -> CastPP#283
>                                 -> CastPP#283
> 
> 
> 
> The `AddP`s and `CastPP` are cloned (because in the loop body). As
> part of peeling, `PhaseIdealLoop::peeled_dom_test_elim()` is
> called. It finds the test that guards `CastPP#283` in the peeled
> iteration dominates and replaces the test that guards `CastPP#110`
> (the test in the peeled iteration is the clone of the test in the
> loop). That causes `CastPP#110`'s control to be updated to that of the
> test in the peeled iteration and to be yanked from the loop. So now
> `CastPP#283` and `CastPP#110` have the same inputs.
> 
> Next unrolling happens:
> 
> 
>                                            /-> CastPP#110
>                                /-> AddP#400 -> AddP#401 -> CastPP#110
> Store#195 -> Phi#360 -> Phi#477 -> AddP#133 -> AddP#134 -> CastPP#110
>                   \                        -> CastPP#110
>                    -> AddP#277 -> AddP#278 -> CastPP#283
>                                -> CastPP#283
> 
> 
> 
> `AddP`s are cloned once more but not the `CastPP`s because they are
> both in the peeled iteration now. A new `Phi` is added.
> 
> Next igvn runs. It's going to push the `AddP`s through the `Phi`s.
> 
> Through `Phi#477`:
> 
> 
> 
>                                 /-> CastPP#110
> Store#195 -> Phi#360 -> AddP#510 -> Phi#509 -> AddP#401 -> CastPP#110
>                   \                        -> AddP#134 -> CastPP#110
>                    -> AddP#277 -> AddP#278 -> CastPP#283
>                                -> CastPP#283
> 
> 
> 
> Through `Phi#360`:
> 
> 
>                                            /-> AddP#134 -> CastPP#110
>                                 /-> Phi#509 -> AddP#401 -> CastPP#110
> Store#195 -> AddP#516 -> Phi#515 -> AddP#278 -> CastPP#283
>                      -> Phi#514 -> CastPP#283
>                                 -> CastPP#110
> 
> 
> Then `Phi#514` which has 2 `CastPP`s as input with identical inputs is
> transformed into anot...

@rwestrel Thanks for looking into this!

In my (somewhat limited) experience, delaying optimizations can be quite brittle. You need to get the condition just right, otherwise the same failure shows up again in some more general case: maybe you check for 1 level, and later it happens with 2 or more levels.

I only half understand the example you present. Can you show the IR nodes for your last step:

Store#195 -> AddP#516 -> AddP#544 -> CastPP#110
                     -> CastPP#529

What exactly are the bases there?
Your simplified drawings seem to show the flow of computation, but I cannot tell from them what the bases are, right? You could annotate the nodes, for example with `AddP#nnn(base:nnn)`. I think that would help me follow the example.
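To make the suggested notation concrete, here is a minimal sketch of how such an annotated chain could be printed. `ToyNode`, `label()` and `chain()` are hypothetical names made up for this illustration; they are not HotSpot's `Node` API, and the struct only models the base and address inputs.

```cpp
#include <string>

// Toy stand-in for a C2 node (hypothetical, NOT HotSpot's Node class).
struct ToyNode {
  std::string name;     // node kind, e.g. "AddP" or "CastPP"
  int idx;              // node index, e.g. 133
  const ToyNode* base;  // base input (only set for AddP), else nullptr
  const ToyNode* addr;  // address input (only set for AddP), else nullptr
};

// Format a single node as "Name#idx", appending "(base:idx)" if it has a base.
std::string label(const ToyNode& n) {
  std::string s = n.name + "#" + std::to_string(n.idx);
  if (n.base != nullptr) {
    s += "(base:" + std::to_string(n.base->idx) + ")";
  }
  return s;
}

// Follow the address inputs and join the labels with " -> ".
std::string chain(const ToyNode& n) {
  std::string s = label(n);
  for (const ToyNode* cur = n.addr; cur != nullptr; cur = cur->addr) {
    s += " -> " + label(*cur);
  }
  return s;
}
```

Applied to the first drawing in the description, where both `AddP`s have `CastPP#110` as base, this would give `AddP#133(base:110) -> AddP#134(base:110) -> CastPP#110`.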

Maybe some more full IR snippets could be helpful, maybe even IGV drawings. But that may be more work for you.

I'm wondering whether we could instead have some other "cleanup" optimizations that fix up the bases. What are the assumptions about merging `AddP`s at a `Phi`? Is the base from before the `Phi` propagated to after the `Phi`? I'm missing some base understanding here to see through this ;)

src/hotspot/share/opto/cfgnode.cpp line 2107:

> 2105:   }
> 2106:   return false;
> 2107: }

You check for a single level here. Could the same happen over multiple levels?
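To illustrate the concern, here is a toy sketch contrasting a one-level check with a transitive one. The actual condition in the patch is not visible in this snippet, so everything below (`SketchNode`, `has_direct_addp_input()`, `has_addp_behind_phis()`) is a hypothetical model rather than the code under review.

```cpp
#include <unordered_set>
#include <vector>

enum class Kind { Phi, AddP, Other };

// Toy stand-in for a C2 node (hypothetical, NOT HotSpot's Node class).
struct SketchNode {
  Kind kind;
  int idx;
  std::vector<const SketchNode*> in;  // inputs
};

// Single-level check: is one of the direct inputs an AddP?
bool has_direct_addp_input(const SketchNode& phi) {
  for (const SketchNode* n : phi.in) {
    if (n != nullptr && n->kind == Kind::AddP) return true;
  }
  return false;
}

// Helper: walk through nested Phis; the visited set guards against Phi cycles.
static bool addp_behind_phis_impl(const SketchNode& phi,
                                  std::unordered_set<const SketchNode*>& visited) {
  if (!visited.insert(&phi).second) return false;  // already seen
  for (const SketchNode* n : phi.in) {
    if (n == nullptr) continue;
    if (n->kind == Kind::AddP) return true;
    if (n->kind == Kind::Phi && addp_behind_phis_impl(*n, visited)) return true;
  }
  return false;
}

// Multi-level check: is an AddP reachable through any number of Phi layers?
bool has_addp_behind_phis(const SketchNode& phi) {
  std::unordered_set<const SketchNode*> visited;
  return addp_behind_phis_impl(phi, visited);
}
```

With a `Phi` whose only input is another `Phi` that in turn merges an `AddP`, the single-level check reports nothing while the transitive walk finds the `AddP`. That is the kind of shape I would worry about.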

-------------

PR Review: https://git.openjdk.org/jdk/pull/25386#pullrequestreview-2874076352
PR Review Comment: https://git.openjdk.org/jdk/pull/25386#discussion_r2111247366


More information about the hotspot-compiler-dev mailing list