RFR: JDK-8272574: Crashes in PhaseIdealLoop::build_loop_late_post_work [v2]
王超
github.com+25214855+casparcwang at openjdk.java.net
Wed Aug 18 12:23:38 UTC 2021
On Wed, 18 Aug 2021 07:57:02 GMT, Tobias Hartmann <thartmann at openjdk.org> wrote:
> Could you please explain in more detail how we ended up with the `400 LoadI` having a dependency on `207 IfTrue` which is below the predicate insertion point? Is it part of another predicate? I think it would help if you could add comments to the test, explaining what is going on (i.e. where we insert predicates for which range checks).
>
> Why did you add the test to `loopstripmining/LoadSplitThruPhi.java`? It's unrelated to loop strip mining, right?
Thank you very much for your review!
I removed one line from the test to simplify the graph. The node indices change as a result, but the logic is the same.
    public static void getPermutations(byte[] inputArray, byte[][] outputArray) {
        int[] indexes = new int[]{0, 2};
        for (int a = 0; a < (int)(a + 16); a++) {
            int oneIdx = indexes[0]++;
            for (int b = a + 1; b < inputArray.length; b++) {
                int twoIdx = indexes[1]++;
                outputArray[twoIdx][0] = inputArray[a];
            }
        }
    }
The correspondence between the source and the nodes is:
1. The range check of `outputArray[twoIdx]` is node 248.
2. `int oneIdx = indexes[0]` is LoadI node 398, which has a dependency on node 207.
3. The loop predicate of `b < inputArray.length` is node 207.
4. LoadI node 398 is created by `LoadNode::split_through_phi` from the originally parsed bytecode of `int twoIdx = indexes[1]`.
![image](https://user-images.githubusercontent.com/25214855/129887886-adda81ed-3fc2-4572-8eb3-c0752feb1c8f.png)
Loop predication optimization wants to float the range check of `outputArray[twoIdx]` out of the inner `for` loop, but it starts to clone from node 432 with the following stack trace:
#1 0x00007ffff622d4d9 in Node::Node (this=0x7fff5809abf0, n0=0x0, n1=0x7fff5809a3d0, n2=0x7fff5809a450) at /data/openjdk/jdk_dev/src/hotspot/share/opto/node.cpp:386
#2 0x00007ffff57c213d in AddNode::AddNode (this=0x7fff5809abf0, in1=0x7fff5809a3d0, in2=0x7fff5809a450) at /data/openjdk/jdk_dev/src/hotspot/share/opto/addnode.hpp:44
#3 0x00007ffff57c21e9 in AddINode::AddINode (this=0x7fff5809abf0, in1=0x7fff5809a3d0, in2=0x7fff5809a450) at /data/openjdk/jdk_dev/src/hotspot/share/opto/addnode.hpp:88
#4 0x00007ffff60c1784 in PhaseIdealLoop::is_scaled_iv_plus_offset (this=0x7fff98123a90, exp=0x7fff5c05fcc8, iv=0x7fff58082ce0, p_scale=0x7fff98122e98, p_offset=0x7fff98122ea0, depth=0) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopTransform.cpp:2527
#5 0x00007ffff60b2718 in PhaseIdealLoop::loop_predication_impl_helper (this=0x7fff98123a90, loop=0x7fff5807df08, proj=0x7fff58084538, predicate_proj=0x7fff58130bd8, cl=0x7fff5c05daf8, zero=0x7fff581278d0, invar=..., reason=Deoptimization::Reason_predicate)
at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopPredicate.cpp:1154
#6 0x00007ffff60b3a71 in PhaseIdealLoop::loop_predication_impl (this=0x7fff98123a90, loop=0x7fff5807df08) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopPredicate.cpp:1414
#7 0x00007ffff60b3eab in IdealLoopTree::loop_predication (this=0x7fff5807df08, phase=0x7fff98123a90) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopPredicate.cpp:1473
#8 0x00007ffff60b3e58 in IdealLoopTree::loop_predication (this=0x7fff5807dfd0, phase=0x7fff98123a90) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopPredicate.cpp:1468
#9 0x00007ffff60b3e58 in IdealLoopTree::loop_predication (this=0x7fff5807e098, phase=0x7fff98123a90) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopPredicate.cpp:1468
#10 0x00007ffff60dccff in PhaseIdealLoop::build_and_optimize (this=0x7fff98123a90, mode=LoopOptsSkipSplitIf) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopnode.cpp:3994
#11 0x00007ffff5a9dc8c in PhaseIdealLoop::PhaseIdealLoop (this=0x7fff98123a90, igvn=..., mode=LoopOptsSkipSplitIf) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopnode.hpp:1068
#12 0x00007ffff5a9de7e in PhaseIdealLoop::optimize (igvn=..., mode=LoopOptsSkipSplitIf) at /data/openjdk/jdk_dev/src/hotspot/share/opto/loopnode.hpp:1146
#13 0x00007ffff5a91a75 in Compile::Optimize (this=0x7fff98126cb0) at /data/openjdk/jdk_dev/src/hotspot/share/opto/compile.cpp:2198
#14 0x00007ffff5a8aec2 in Compile::Compile (this=0x7fff98126cb0, ci_env=0x7fff98127a20, target=0x7fff580aaa58, osr_bci=-1, subsume_loads=true, do_escape_analysis=true, eliminate_boxing=true, do_locks_coarsening=true, install_code=true, directive=
0x7ffff01f2df0) at /data/openjdk/jdk_dev/src/hotspot/share/opto/compile.cpp:781
#15 0x00007ffff5978068 in C2Compiler::compile_method (this=0x7ffff02f24b0, env=0x7fff98127a20, target=0x7fff580aaa58, entry_bci=-1, install_code=true, directive=0x7ffff01f2df0) at /data/openjdk/jdk_dev/src/hotspot/share/opto/c2compiler.cpp:107
#16 0x00007ffff5aa83a2 in CompileBroker::invoke_compiler_on_method (task=0x7ffff04c9760) at /data/openjdk/jdk_dev/src/hotspot/share/compiler/compileBroker.cpp:2291
#17 0x00007ffff5aa6edd in CompileBroker::compiler_thread_loop () at /data/openjdk/jdk_dev/src/hotspot/share/compiler/compileBroker.cpp:1964
#18 0x00007ffff5ac7819 in CompilerThread::thread_entry (thread=0x7ffff02f2ad0, __the_thread__=0x7ffff02f2ad0) at /data/openjdk/jdk_dev/src/hotspot/share/compiler/compilerThread.cpp:59
#19 0x00007ffff653cb85 in JavaThread::thread_main_inner (this=0x7ffff02f2ad0) at /data/openjdk/jdk_dev/src/hotspot/share/runtime/thread.cpp:1270
#20 0x00007ffff653ca1b in JavaThread::run (this=0x7ffff02f2ad0) at /data/openjdk/jdk_dev/src/hotspot/share/runtime/thread.cpp:1253
#21 0x00007ffff653a33e in Thread::call_run (this=0x7ffff02f2ad0) at /data/openjdk/jdk_dev/src/hotspot/share/runtime/thread.cpp:361
#22 0x00007ffff628225d in thread_native_entry (thread=0x7ffff02f2ad0) at /data/openjdk/jdk_dev/src/hotspot/os/linux/os_linux.cpp:720
#23 0x00007ffff79b0e25 in start_thread (arg=0x7fff98128700) at pthread_create.c:308
#24 0x00007ffff74d935d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
So the range check of `outputArray[twoIdx]` floats out of the inner `for` loop, but it has a dependency on node 398 LoadI, whose control input is node 207 (the loop predicate of `b < inputArray.length`), and node 207 is itself dominated by the predicate insertion point. The hoisted check would therefore sit above a node it depends on. A later stage finds that this cyclic dependency makes the graph bad.
The predicate insertion point is calculated starting from node 207 and skipping all the predicates, which leads to node 183.
![image](https://user-images.githubusercontent.com/25214855/129889879-58c2485d-c3b3-41ab-9c2f-f6de445a71a5.png)
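To make the walk from node 207 to node 183 concrete, here is a minimal toy sketch in Java. It is not HotSpot code: the class `InsertionPointSketch`, its methods, and the node idx `1` standing for the method entry are hypothetical, and any predicates that sit between 207 and 183 in the real graph are omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: this is not HotSpot code, and all names are hypothetical.
final class InsertionPointSketch {
    // Control input of each control node, keyed by node idx.
    private final Map<Integer, Integer> controlInput = new HashMap<>();
    // Whether a control node is a predicate projection.
    private final Map<Integer, Boolean> isPredicate = new HashMap<>();

    void addControl(int node, int input, boolean predicate) {
        controlInput.put(node, input);
        isPredicate.put(node, predicate);
    }

    // Walks up from 'start' while the current node is a predicate and
    // returns the first non-predicate control node that is reached.
    int insertionPoint(int start) {
        int current = start;
        while (isPredicate.getOrDefault(current, false)) {
            current = controlInput.get(current);
        }
        return current;
    }

    public static void main(String[] args) {
        InsertionPointSketch g = new InsertionPointSketch();
        g.addControl(207, 183, true);  // 207: loop predicate of b < inputArray.length
        g.addControl(183, 1, false);   // 183: not a predicate; 1 is a made-up entry node
        System.out.println(g.insertionPoint(207)); // prints 183
    }
}
```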
So the solution is straightforward: do not perform the predication if the check depends on a node whose control is dominated by the predicate insertion point. But the implementation seems to have some latent problems.
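The idea behind the check can be sketched with a toy dominator tree in Java. This is only an illustration under stated assumptions, not the code in the PR: the class `DominanceSketch`, its methods, and the node idx `1` for the method entry are hypothetical, the control nodes between 183 and 207 are omitted, and a node is treated as dominating itself.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: this is not HotSpot code, and all names are hypothetical.
final class DominanceSketch {
    // Immediate dominator of each control node, keyed by node idx.
    // The root has no entry.
    private final Map<Integer, Integer> idom = new HashMap<>();

    void setIdom(int node, int immediateDominator) {
        idom.put(node, immediateDominator);
    }

    // True if 'dominator' is 'node' itself or lies on its idom chain.
    boolean dominates(int dominator, int node) {
        Integer current = node;
        while (current != null) {
            if (current == dominator) {
                return true;
            }
            current = idom.get(current);
        }
        return false;
    }

    // A check may be hoisted to 'insertionPoint' only if none of the
    // controls of its inputs is dominated by that point.
    boolean canHoist(int insertionPoint, int... inputControls) {
        for (int ctrl : inputControls) {
            if (dominates(insertionPoint, ctrl)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        DominanceSketch g = new DominanceSketch();
        g.setIdom(183, 1);    // 1 is a made-up idx for the method entry
        g.setIdom(207, 183);  // intermediate control nodes omitted
        // LoadI 398 has control 207, which is below insertion point 183.
        System.out.println(g.canHoist(183, 207)); // prints false
        // An input pinned at the entry would not block hoisting.
        System.out.println(g.canHoist(183, 1));   // prints true
    }
}
```

In this model the range check (node 248) is rejected because its input LoadI 398 is pinned at node 207, which lies below the insertion point 183.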
-------------
PR: https://git.openjdk.java.net/jdk/pull/5142
More information about the hotspot-compiler-dev mailing list