RFR: 8251544: CTW: C2 fails with assert(no_dead_loop) failed: dead loop detected
Christian Hagedorn
chagedorn at openjdk.java.net
Wed Sep 30 07:52:12 UTC 2020
In the testcase, we hit a dead data loop while dead nodes are being removed on a dead control path. A region node, lets
say `r`, that represents a loop (has three inputs: self, a loop entry and backedge) but is not a loop node, yet,
becomes dead when its entry control is replaced by top in the first IGVN run after parsing. All its phi nodes also
become dead by replacing the corresponding entry control input by top. The problem is now that some phi nodes of `r`
are processed by IGVN before the corresponding (dead) region node `r`. In `PhiNode::Ideal`, we actually check if there
is a dead loop. But after some of the phis of `r` were already removed, `is_unsafe_data_reference()` on
[L1939](https://github.com/openjdk/jdk/compare/master...chhagedorn:JDK-8251544#diff-efe6b3bde157b833249cd9a8d8b6645bL1939)
returns false. As a result, we do not realize in `PhiNode::Ideal` for one of the remaining phis that it is actually
dead and we apply the normal propagation in `PhiNode::Identity` which replaces the phi by its only non-top input. We
later apply an additional optimization for a `LoadNode` input of an `AddINode` in which we replace the `LoadNode` by
the `AddINode` itself (because of the data loop) and we end up with a dead data loop and fail with the dead loop
assertion.
The order in which the nodes are processed in IGVN is crucial. I could only reproduce this bug with a very specific
CTW-like testcase which makes this quite an edge case. I can think of two ways ways how to fix this:
1. Delay phi nodes which have only one non-top input left and whose region node is not a loop node, has three inputs
from which the entry control is top and the region has not been processed by IGVN, yet. 2. Extend the dead data loop
check in `PhiNode::Ideal()` to already do an unreachable region check as done in `RegionNode::Ideal()`. The result can
be cached as a region should not become reachable anymore once we figured out it is dead.
I chose the second approach because I think it is preferable as we are not delaying IGVN and all other phi nodes can
already use the information of a dead region before it is processed. This avoid any further unwanted optimizations on
dead nodes.
Thanks,
Christian
-------------
Commit messages:
- 8251544: CTW: C2 fails with assert(no_dead_loop) failed: dead loop detected
Changes: https://git.openjdk.java.net/jdk/pull/425/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=425&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8251544
Stats: 264 lines in 3 files changed: 233 ins; 11 del; 20 mod
Patch: https://git.openjdk.java.net/jdk/pull/425.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/425/head:pull/425
PR: https://git.openjdk.java.net/jdk/pull/425
More information about the hotspot-compiler-dev
mailing list