RFR: 8350864: C2: verify structural invariants of the Ideal graph [v5]

Mon Sep 8 17:07:31 UTC 2025

On Fri, 5 Sep 2025 08:13:35 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:

>> Some crashes are consequences of earlier misshaped ideal graphs, which could be detected earlier, closer to the source, before the possibly many transformations that lead to the crash.
>> 
>> Let's verify that the ideal graph is well-shaped earlier then! I propose here such a feature. This runs after IGVN, because at this point, the graph, should be cleaned up for any weirdness happening earlier or during IGVN.
>> 
>> This feature is enabled with the develop flag `VerifyIdealStructuralInvariants`. Open to renaming. No problem with me! This feature is only available in debug builds, and most of the code is even not compiled in product, since it uses some debug-only functions, such as `Node::dump` or `Node::Name`.
>> 
>> For now, only local checks are implemented: they are checks that only look at a node and its neighborhood, wherever it happens in the graph. Typically: under a `If` node, we have a `IfTrue` and a `IfFalse`. To ease development, each check is implemented in its own class, independently of the others. Nevertheless, one needs to do always the same kind of things: checking there is an output of such type, checking there is N inputs, that the k-th input has such type... To ease writing such checks, in a readable way, and in a less error-prone way than pile of copy-pasted code that manually traverse the graph, I propose a set of compositional helpers to write patterns that can be matched against the ideal graph. Since these patterns are... patterns, so not related to a specific graph, they can be allocated once and forever. When used, one provides the node (called center) around which one want to check if the pattern holds.
>> 
>> On top of making the description of pattern easier, these helpers allows nice printing in case of error, by showing the path from the center to the violating node. For instance (made up for the purpose of showing the formatting), a violation with a path climbing only inputs:
>> 
>> 1 failure for node
>>  211  OuterStripMinedLoopEnd  === 215 39  [[ 212 198 ]] P=0,948966, C=23799,000000
>> At node
>>     209  CountedLoopEnd  === 182 208  [[ 210 197 ]] [lt] P=0,948966, C=23799,000000 !orig=[196] !jvms: StringLatin1::equals @ bci:12 (line 100)
>>   From path:
>>     [center] 211  OuterStripMinedLoopEnd  === 215 39  [[ 212 198 ]] P=0,948966, C=23799,000000
>>       <-(0)- 215  SafePoint  === 210 1 7 1 1 216 37 54 185  [[ 211 ]]  SafePoint  !orig=186 !jvms: StringLatin1::equals @ bci:29 (line 100)
>>       <-(0)- 210  IfFalse  === 209  [[ 21...
>
> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision:
> 
>   One more ResourceMark

Alright, I have another barrage of comments.

Things are much better already.

Though it would be good to discuss a bit more how the patterns now look, especially if this becomes something that we do more widely eventually.
I would like to know what are the advantages and disadvantages, and what alternatives we would have ;)

src/hotspot/share/opto/compile.cpp line 702:

> 700:       ,
> 701:       _in_dump_cnt(0),
> 702:       _invariant_checker(GraphInvariantChecker::make_default())

How does this interface with `ResouceMarks`?
Because it is now resource allocated. And so is the `_checks`.
How does this not trip the nesting asserts of allocation there?
I'm probably missing something here.

I would have expected that we need to allocate it from the `_comp_arena`.

src/hotspot/share/opto/graphInvariants.cpp line 32:

> 30: constexpr int LocalGraphInvariant::OutputStep;
> 31: 
> 32: void LocalGraphInvariant::LazyReachableCFGNodes::fill() {

Nit: I would call it `compute`. `fill` sounds like you are going to fill the nodes themselves or something.
And: I know they are synonyms. But why do you use both `reachable` (class name) and `live` (array)?

src/hotspot/share/opto/graphInvariants.cpp line 45:

> 43:       }
> 44:     }
> 45:   }

It seems you are assuming that all CFG nodes are reachable "from below".
That is true in most cases... but:
Have we not had this pesky case where we have a "infinite loop", where there is really no reachability from below, but from above it is reachable.

See `_root_and_safepoints` in `PhaseCCP`. I'm not sure we need to worry about this, but I'd like to be sure that we have considered infinite loops here.

The risk is that otherwise you just call those nodes dead, and do not verify them, right? Or you would just ignore failures there.

src/hotspot/share/opto/graphInvariants.cpp line 61:

> 59:  * and compositional to express complex structures from simple properties.
> 60:  * For instance, we have a pattern for saying "the first input of the center match P" where P is another
> 61:  * Pattern. We end up with trees of patterns matching the graph.

`the first input of the center match P` does not sound like a proper assertion.
Some alternatives I could think of:
- `match P on the first input of center` ok
- `the first input of center must match P` ok
- `match the first input of center with P` meh

src/hotspot/share/opto/graphInvariants.cpp line 63:

> 61:  * Pattern. We end up with trees of patterns matching the graph.
> 62:  */
> 63: struct Pattern : ResourceObj {

Why not move the `Pattern` classes to a separate `pattern.hpp/cpp`? If we did ever use them for `IGVN`, then it would not make so much sense to have them in `graphInvariants.cpp`, right?

src/hotspot/share/opto/graphInvariants.cpp line 64:

> 62:  */
> 63: struct Pattern : ResourceObj {
> 64:   virtual bool check(const Node* center, Node_List& steps, GrowableArray<int>& path, stringStream&) const = 0;

Since this is the abstract class, it could make sense to define all inputs, as well as their invariants: precondition / postcondition. Why do all args have a name except the stream?

src/hotspot/share/opto/graphInvariants.cpp line 67:

> 65: };
> 66: 
> 67: /* This pattern just accepts any node. This is convenient mostly as leaves in a pattern tree.

Suggestion:

/* This pattern just accepts any node. This is convenient mostly as leaf in a pattern tree.

I think this is a bit more consistently singular? Optional.

src/hotspot/share/opto/graphInvariants.cpp line 116:

> 114: private:
> 115:   const N*& _binding;
> 116: };

Would it not make sense to move it a bit closer to the related code? Do you need it much before `NodeClassIsAndBind`?

src/hotspot/share/opto/graphInvariants.cpp line 128:

> 126:  *    new AtInput(1, P1),
> 127:  *    new AtInput(2, P2),
> 128:  * )

`In particular, check a node has enough inputs`: At first it is not clear if the code already does that, or if the user is supposed to do it. Why "in particular", does the statement make more clear what you just said? Ah no you are saying that it is best practice to do #input checking first for good reporting :)
Suggestion:

 * Evaluation order is guaranteed to be left-to-right.
 * Good practice:
 *   To get better reporting, the number of inputs should be checked first, before checking concrete inputs.
 *    If you know a node has 3 inputs and want patterns to be applied to each input, it would look like
 *   And::make(
 *      new HasExactlyNInputs(3),
 *      new AtInput(0, P0),
 *      new AtInput(1, P1),
 *      new AtInput(2, P2),
 *   )

src/hotspot/share/opto/graphInvariants.cpp line 159:

> 157:       if (!_checks.at(i)->check(center, steps, path, ss)) {
> 158:         return false;
> 159:       }

Why do you not update steps and path here? If there is a reason, add a comment ;)
I suppose it is because you don't step to another `center`?

src/hotspot/share/opto/graphInvariants.cpp line 175:

> 173:     }
> 174:   }
> 175: }

`make` suggests that this is a factory pattern. But it rather just prints / dumps.
I'd suggest `print_list_of_inputs`.

src/hotspot/share/opto/graphInvariants.cpp line 181:

> 179:   bool check(const Node* center, Node_List& steps, GrowableArray<int>& path, stringStream& ss) const override {
> 180:     if (center->req() != _expect_req) {
> 181:       ss.print_cr("Unexpected number of input. Expected: %d. Found: %d", _expect_req, center->req());

Suggestion:

      ss.print_cr("Unexpected number of inputs. Expected exactly: %d. Found: %d", _expect_req, center->req());

Something should say that the expected number was exact.

src/hotspot/share/opto/graphInvariants.cpp line 187:

> 185:     return true;
> 186:   }
> 187:   const uint _expect_req;

Suggestion:

private:
  const uint _expect_req;

src/hotspot/share/opto/graphInvariants.cpp line 194:

> 192:   bool check(const Node* center, Node_List& steps, GrowableArray<int>& path, stringStream& ss) const override {
> 193:     if (center->req() < _expect_req) {
> 194:       ss.print_cr("Too small number of input. Expected: %d. Found: %d", _expect_req, center->req());

Grammar: Either "Too few inputs" or "Number of inputs too small".

src/hotspot/share/opto/graphInvariants.cpp line 200:

> 198:     return true;
> 199:   }
> 200:   const uint _expect_req;

Suggestion:

private:
  const uint _expect_req;

src/hotspot/share/opto/graphInvariants.cpp line 211:

> 209:   AtInput(uint which_input, const Pattern* pattern) : _which_input(which_input), _pattern(pattern) {}
> 210:   bool check(const Node* center, Node_List& steps, GrowableArray<int>& path, stringStream& ss) const override {
> 211:     assert(_which_input < center->req(), "Input number is out of range");

Hmm. Could still be nice if we did our best here, and responded nicely.
Just in case someone messes up the pattern, and then we get an assert here.
Maybe the bug is hard to reproduce, and having the printed statements would have helped a little?

src/hotspot/share/opto/graphInvariants.cpp line 215:

> 213:       ss.print_cr("Input at index %d is nullptr.", _which_input);
> 214:       return false;
> 215:     }

So we would never do `AtInput(0, ExpectNullptr())` for example?
Fine with me, just an idea to consider ;)

src/hotspot/share/opto/graphInvariants.cpp line 222:

> 220:     }
> 221:     return result;
> 222:   }

Would this not read better?
Suggestion:

    bool success = _pattern->check(center->in(_which_input), state);
    if (!success) {
      state.trace_failure_path(center, _which_input);
    }
    return success;
  }

src/hotspot/share/opto/graphInvariants.cpp line 224:

> 222:   }
> 223:   const uint _which_input;
> 224:   const Pattern* const _pattern;

Suggestion:

private:
  const uint _which_input;
  const Pattern* const _pattern;

src/hotspot/share/opto/graphInvariants.cpp line 234:

> 232:   bool check(const Node* center, Node_List& steps, GrowableArray<int>& path, stringStream& ss) const override {
> 233:     if (!(center->*_type_check)()) {
> 234:       ss.print_cr("Unexpected type: %s.", center->Name());

Is there a way we could say what we actually do expect? Not really, right? We'd need to do it via macro again.

src/hotspot/share/opto/graphInvariants.cpp line 239:

> 237:     return true;
> 238:   }
> 239:   bool (Node::*_type_check)() const;

Suggestion:

private:
  bool (Node::*_type_check)() const;

I would also suggest that you use a `typedef` here.
Something like:
`typedef bool (Node::*TypeCheckMethod)() const;`
Then you can write
Suggestion:

public:
  const TypeCheckMethod _type_check;

src/hotspot/share/opto/graphInvariants.cpp line 282:

> 280: 
> 281:   bool check(const Node* center, Node_List& steps, GrowableArray<int>& path, stringStream& ss) const override {
> 282:     Node_List outputs_of_correct_type;

You should probably have a `ResourceMark` here.
Or just avoid the allocation by first only holding a pointer, and then if you find multiple you just traverse again.

src/hotspot/share/opto/graphInvariants.cpp line 304:

> 302:   }
> 303:   bool (Node::*_type_check)() const;
> 304:   const Pattern* const _pattern;

Suggestion:

private:
  bool (Node::*_type_check)() const;
  const Pattern* const _pattern;

src/hotspot/share/opto/graphInvariants.cpp line 307:

> 305: };
> 306: 
> 307: /* A LocalGraphInvariant that mostly use a Pattern for checking.

Suggestion:

/* A LocalGraphInvariant that mostly uses a Pattern for checking.

src/hotspot/share/opto/graphInvariants.cpp line 312:

> 310:  */
> 311: struct PatternBasedCheck : LocalGraphInvariant {
> 312:   const Pattern* const _pattern;

Suggestion:

private:
  const Pattern* const _pattern;
public:

src/hotspot/share/opto/graphInvariants.cpp line 336:

> 334:       return CheckResult::NOT_APPLICABLE;
> 335:     }
> 336:     CheckResult r = PatternBasedCheck::check(center, reachable_cfg_nodes, steps, path, ss);

Suggestion:

    CheckResult result = PatternBasedCheck::check(center, reachable_cfg_nodes, state);

Packing the 3 args would give us some extra space to write out a name for `r` ;)

src/hotspot/share/opto/graphInvariants.cpp line 351:

> 349:  */
> 350: struct PhiArity : PatternBasedCheck {
> 351:   const RegionNode* region_node = nullptr;

Suggestion:

private:
  const RegionNode* _region_node = nullptr;

You've been giving fields the `_` consistenly up to now, as we usually doing in hotspot ;)

src/hotspot/share/opto/graphInvariants.cpp line 359:

> 357:                     0,
> 358:                     NodeClassIsAndBind(Region, region_node)))) {
> 359:   }

Are there Phi's that only have the ctrl input? I'd be quite surprised if they did not at least have a single data input. What do you think?

src/hotspot/share/opto/graphInvariants.cpp line 378:

> 376:     return CheckResult::VALID;
> 377:   }
> 378: };

I am wondering if it is really worth it to do the whole pattern matching approach, if we still have to write so much code.

There is a lot of boiler plate now, that has replaced the procedural code.

I'm just wondering if we are there yet, or if we need to find some way to make it more concise.
Maybe we can do something like this:

return <something>
       .applies_if(&Node::is_Phi)
       .check([&]() { return PatternBasedCheck::check(center, reachable_cfg_nodes, steps, path, ss); })
       .require(...)
       .finish();

Just an idea. It would probably be lambda based again, which has its disadvantages.
Maybe you have an even better idea.
I'd just like to understand why the Pattern based approach is really super desirable, what are the advantages and disadvantages?

src/hotspot/share/opto/graphInvariants.cpp line 400:

> 398:     }
> 399: 
> 400:     uint cfg_out = ctrl_succ.size();

Suggestion:

    const uint cfg_out = ctrl_succ.size();

Though you could also use `ctrl_succ.size()` directly. Matter of taste.

src/hotspot/share/opto/graphInvariants.cpp line 421:

> 419:           ss.print("  ");
> 420:           ctrl_succ.at(i)->dump("\n", false, &ss);
> 421:         }

You repeat this 4x. Can we do something reasonable about that?

src/hotspot/share/opto/graphInvariants.cpp line 438:

> 436:         ss.print_cr("%s node must have at least one control successors. Found %d.", center->Name(), cfg_out);
> 437:         return CheckResult::FAILED;
> 438:       }

Is there some upper bound?

src/hotspot/share/opto/graphInvariants.cpp line 454:

> 452: };
> 453: 
> 454: /* Checks that Region Start and Root nodes' first input is a self loop, except for copy regions, which then must have only one non null input.

Suggestion:

/* Checks that Region, Start and Root nodes' first input is a self loop, except for copy regions, which then must have only one non null input.

src/hotspot/share/opto/graphInvariants.cpp line 487:

> 485:       if (non_null_inputs_count != 1) {
> 486:         // Should be a rare case, hence the second (but more expensive) traversal.
> 487:         Node_List non_null_inputs;

`ResourceMark`?

src/hotspot/share/opto/graphInvariants.cpp line 509:

> 507: // CountedLoopEnd -> IfTrue -> CountedLoop
> 508: struct CountedLoopInvariants : PatternBasedCheck {
> 509:   const BaseCountedLoopEndNode* counted_loop_end = nullptr;

Suggestion:

private:
  const BaseCountedLoopEndNode* _counted_loop_end = nullptr;
public:

src/hotspot/share/opto/graphInvariants.cpp line 528:

> 526:     if (!center->is_CountedLoop() && !center->is_LongCountedLoop()) {
> 527:       return CheckResult::NOT_APPLICABLE;
> 528:     }

Actually: why not applie that to `OuterStripMinedLoop` as well? Or any `BaseCountedLoop`? Are there more than these 3 cases? If there are ever more, they should probably also adhere to this backedge pattern, we'll just need an extension. But it would be nice to trip over something here if we ever do extend.

src/hotspot/share/opto/graphInvariants.cpp line 547:

> 545:         return CheckResult::FAILED;
> 546:       }
> 547:     }

If you do add `OuterStripMinedLoop`, make it a swich, and assert in the default case ;)

src/hotspot/share/opto/graphInvariants.cpp line 552:

> 550: };
> 551: 
> 552: // CountedLoopEnd -> IfFalse -> SafePoint -> OuterStripMinedLoopEnd[center] -> IfTrue -> OuterStripMinedLoop -> CountedLoop

Could we close the loop, and check that the CountedLoop match via their backedge?

src/hotspot/share/opto/graphInvariants.hpp line 73:

> 71:    * In addition, if the check fails, it must write its error message in [ss].
> 72:    *
> 73:    * If the check succeeds or is not applicable, [steps], [path] and [ss] must be untouched.

I wonder if we should not have some object that represents these 3 args. You pass them everywhere, and they seem to be a unit. And they have invariants that we may want to check.
You could for example enforce that steps and path are in synch just by only providing the access methods that allow it.
What do you think?

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/26362#pullrequestreview-3196932276
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330519851
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330535043
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330549479
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330572874
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330560727
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330581064
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330584310
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330600482
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330620646
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330682903
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330647056
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330634803
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330699234
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330643795
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330699535
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330652147
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330654896
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330691118
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330699783
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330709824
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330706819
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330732097
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330734834
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330736070
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330737470
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330747146
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330754630
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330750840
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330786828
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330798028
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330801396
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330803456
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330805158
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330814454
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330815461
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330820754
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330822466
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330832982
PR Review Comment: https://git.openjdk.org/jdk/pull/26362#discussion_r2330674639