RFR: 8275202: C2: optimize out more redundant conditions
Emanuel Peter
epeter at openjdk.org
Wed Oct 4 14:29:58 UTC 2023
On Wed, 21 Jun 2023 12:47:26 GMT, Roland Westrelin <roland at openjdk.org> wrote:
> This change adds a new loop opts pass to optimize redundant conditions
> such as the second one in:
>
>
> if (i < 10) {
> if (i < 42) {
>
>
> In the branch of the first if, the type of i can be narrowed down to
> [min_jint, 9] which can then be used to constant fold the second
> condition.
>
> The compiler already keeps track of type[n] for every node in the
> current compilation unit. That's not sufficient to optimize the
> snippet above though because the type of i can only be narrowed in
> some sections of the control flow (that is a subset of all
> controls). The solution is to build a new table that tracks the type
> of n at every control c
>
>
> type'[n, root] = type[n] // initialized from igvn's type table
> type'[n, c] = type[n, idom(c)]
>
>
> This pass iterates over the CFG looking for conditions such as:
>
>
> if (i < 10) {
>
>
> that allows narrowing the type of i and updates the type' table
> accordingly.
>
> At a region r:
>
>
> type'[n, r] = meet(type'[n, r->in(1)], type'[n, r->in(2)]...)
>
>
> For a Phi phi at a region r:
>
>
> type'[phi, r] = meet(type'[phi->in(1), r->in(1)], type'[phi->in(2), r->in(2)]...)
>
>
> Once a type is narrowed, uses are enqueued and their types are
> computed by calling the Value() methods. If a use's type is narrowed,
> it's recorded at c in the type' table. Value() methods retrieve types
> from the type table, not the type' table. To address that issue while
> leaving Value() methods unchanged, before calling Value() at c, the
> type table is updated so:
>
>
> type[n] = type'[n, c]
>
>
> An exception is for Phi::Value which needs to retrieve the type of
> nodes are various controls: there, a new type(Node* n, Node* c)
> method is used.
>
> For most n and c, type'[n, c] is likely the same as type[n], the type
> recorded in the global igvn table (that is there shouldn't be many
> nodes at only a few control for which we can narrow the type down). As
> a consequence, the types'[n, c] table is implemented with:
>
> - At c, narrowed down types are stored in a GrowableArray. Each entry
> records the previous type at idom(c) and the narrowed down type at
> c.
>
> - The GrowableArray of type updates is recorded in a hash table
> indexed by c. If there's no update at c, there's no entry in the
> hash table.
>
> This pass operates in 2 steps:
>
> - it first iterates over the graph looking for conditions that narrow
> the types of some nodes and propagate type updates to uses until a
> fix point.
>
> - it transforms the graph so newly found constant nodes are folded.
>
>
> The new pass is run on every loop opts. There are a couple rea...
@rwestrel this looks like an awesome idea.
I gave the code a quick glance, and have a few questions:
**How powerful is this?**
Can we make inferences from `expr(i) > 0` about `i` itself? I guess not really, that would require us to propagate type information "upward" through `expr` to `i`.
Assume we have`expr1(i) > 0` and `expr2(i) < 0`:
- And let's assume that together this implied that `i` was a constant at c. Your algorithm would not detect this, right?
- Alternatively, let's assume this meant that `i` was TOP at c. We would not directly detect this, but we would somehow detect that one of the two expressions is not TOP at c, and therefore we could collapse c? This is very powerful for dead code elimination.
**How expensive is this?**
Worst case, you cannot get around having a basically quadratic runtime and memory requirement, right? So basically `O(C * D)`, where C is number of control nodes, and D is number of data nodes.
I'm trying to think of such a worst-case example: Can there be a program where we have lots of expressions of `i`, and lots of comparisons on `i`, and therefore have to update all expressions of `i` at every comparison of `i`?
This worst-case is probably far off of the "average" program, right? But still, one wonders what the distribution is - even the occasional program/method that hits quadratic runtime could be problematically expensive.
If you don't think the runtime is indeed worst-case quadratic, then I'd be interested in hearing the argument ;)
**Do you think this really solves all the other bugs?**
If we put this behind a product flag, then we do need to fix all other bugs that are linked as duplicate separately. But that may not be easy, because they basically happen because separate type info about expressions of `i` are suddenly combined at some data node, but not visible to the control -> inconsistent graph.
src/hotspot/share/opto/c2_globals.hpp line 775:
> 773: "Verify receiver types at runtime") \
> 774: \
> 775: develop(bool, PrintLoopConditionalPropagation, false, \
When do we call a flag "Print" and when "Trace"?
src/hotspot/share/opto/c2_globals.hpp line 778:
> 776: "Trace Loop Conditional Propagation pass") \
> 777: \
> 778: product(bool, UseLoopConditionalPropagation, true, \
I guess that means we can disable it in product. So all the duplicate bugs are not really solved - the VM can still crash if we disable the flag.
src/hotspot/share/opto/loopConditionalPropagation.cpp line 40:
> 38: snippet above though because the type of i can only be narrowed in
> 39: some section of the control flow (that is a subset of all
> 40: controls). The solution is to build a new table that tracks the type
"new table" will read badly in 10 Years. Maybe give it a specific name?
src/hotspot/share/opto/loopConditionalPropagation.cpp line 137:
> 135: }
> 136:
> 137: void PhaseConditionalPropagation::enqueue_uses(const Node* n, Node* c) {
This sounds like `PhaseCCP::push_child_nodes_to_worklist` and `PhaseIterGVN::add_users_to_worklist`.
Looks like it will be quite easy to forget some "notification-pattern". Do you have some kind of verification that everything was processed to the full extent, just like `PhaseIterGVN::verify_node_value`, which is used for both IGVN and CCP?
[JDK-8298094](https://bugs.openjdk.org/browse/JDK-8298094) Refactor PhaseCCP and IGVN notification
src/hotspot/share/opto/loopConditionalPropagation.cpp line 927:
> 925:
> 926: The second if is redundant but first if updates the type of i-1, not i alone, we can't tell i != 0.
> 927: Because i-1 becomes top in the second if branch, we can tell that branch is dead
Ok. So we have some value that becomes top in this branch.
So if any value is TOP at a control, then we have to assume that the control is dead?
This is really cool. But what happens if we do things like `i + 1 > 0` and `i - 1 < 0`? I guess that would not give us more info about `i` or if anything was still valid or TOP?
test/hotspot/jtreg/compiler/c2/irTests/TestLoopConditionalPropagation.java line 99:
> 97:
> 98:
> 99: static volatile int volatileField;
nit: this belongs at the head of the class.
test/hotspot/jtreg/compiler/c2/irTests/TestLoopConditionalPropagation.java line 678:
> 676: // }
> 677: // return i;
> 678: // }
Dead code? You are probably aware of this.
-------------
PR Review: https://git.openjdk.org/jdk/pull/14586#pullrequestreview-1657494542
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345770852
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345770984
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345772311
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345798864
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345838349
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345848227
PR Review Comment: https://git.openjdk.org/jdk/pull/14586#discussion_r1345852920
More information about the hotspot-compiler-dev
mailing list