RFR: 8275202: C2: optimize out more redundant conditions
Roland Westrelin
roland at openjdk.org
Wed Jun 21 13:33:39 UTC 2023
This change adds a new loop opts pass to optimize redundant conditions
such as the second one in:
if (i < 10) {
if (i < 42) {
In the branch of the first if, the type of i can be narrowed down to
[min_jint, 9] which can then be used to constant fold the second
condition.
The compiler already keeps track of type[n] for every node in the
current compilation unit. That's not sufficient to optimize the
snippet above though because the type of i can only be narrowed in
some sections of the control flow (that is a subset of all
controls). The solution is to build a new table that tracks the type
of n at every control c
type'[n, root] = type[n] // initialized from igvn's type table
type'[n, c] = type[n, idom(c)]
This pass iterates over the CFG looking for conditions such as:
if (i < 10) {
that allows narrowing the type of i and updates the type' table
accordingly.
At a region r:
type'[n, r] = meet(type'[n, r->in(1)], type'[n, r->in(2)]...)
For a Phi phi at a region r:
type'[phi, r] = meet(type'[phi->in(1), r->in(1)], type'[phi->in(2), r->in(2)]...)
Once a type is narrowed, uses are enqueued and their types are
computed by calling the Value() methods. If a use's type is narrowed,
it's recorded at c in the type' table. Value() methods retrieve types
from the type table, not the type' table. To address that issue while
leaving Value() methods unchanged, before calling Value() at c, the
type table is updated so:
type[n] = type'[n, c]
An exception is for Phi::Value which needs to retrieve the type of
nodes are various controls: there, a new type(Node* n, Node* c)
method is used.
For most n and c, type'[n, c] is likely the same as type[n], the type
recorded in the global igvn table (that is there shouldn't be many
nodes at only a few control for which we can narrow the type down). As
a consequence, the types'[n, c] table is implemented with:
- At c, narrowed down types are stored in a GrowableArray. Each entry
records the previous type at idom(c) and the narrowed down type at
c.
- The GrowableArray of type updates is recorded in a hash table
indexed by c. If there's no update at c, there's no entry in the
hash table.
This pass operates in 2 steps:
- it first iterates over the graph looking for conditions that narrow
the types of some nodes and propagate type updates to uses until a
fix point.
- it transforms the graph so newly found constant nodes are folded.
The new pass is run on every loop opts. There are a couple reasons for
that:
1- one of the goals is to avoid the many bugs we've been hitting where
a CastII nodes capture a type that, at some point, conflicts with
its input type. The CastII becomes top, that control path is dead
but the compiler fails to prove it. By running the new pass often,
there's a better chance that these inconsistencies can be caught and
fixed.
2- I tried running it less often and ran into inconsistencies where:
when the new pass is run, it eliminates a range check. At a later
point, the range check CastII becomes top. If the pass is not run
then, we hit the problem of 1- above. If it is run, then it can
prove the path that leads to the CastII is dead.
I looked at compilation time by running ctw on java.base and looking
at times reported by CITime. Overall, the new pass adds +50% to the
time spent in loop opts and about 5% to total compilation time. I
already spent quite a bit of time trying to decrease the compilation
time overhead and I don't see any obvious bottleneck at this
point. The pass runs until a fixed point is reached and can go over
the cfg several times. If it goes only once over the cfg, the
compilation time overhead goes down to ~40%. That's a fairly small
improvement in terms of compilation time in my opinion and, I would
say, it's better to let it run until a fixed point and find all
constants that it can find.
There are several changes that I had to make outside the new pass
itself. Some of them could be pushed separately eventhough they are
solving issues that may not exist without the new pass. In particular,
I had to add assert predicates for all eliminated conditions during
range check eliminations.
-------------
Commit messages:
- conditional propagation
Changes: https://git.openjdk.org/jdk/pull/14586/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14586&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8275202
Stats: 2924 lines in 29 files changed: 2750 ins; 110 del; 64 mod
Patch: https://git.openjdk.org/jdk/pull/14586.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14586/head:pull/14586
PR: https://git.openjdk.org/jdk/pull/14586
More information about the hotspot-compiler-dev
mailing list