RFR: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory [v3]
Emanuel Peter
epeter at openjdk.org
Mon Feb 24 14:32:57 UTC 2025
On Mon, 24 Feb 2025 12:52:42 GMT, Roland Westrelin <roland at openjdk.org> wrote:
> > @rwestrel Do you want me to find examples for the pre-loop disappearing? I suppose I can find some easily by adding an assert in SuperWord, where we bail out, as I showed above.
>
> Yes, if not too much work.
Ok, let's add this:
diff --git a/src/hotspot/share/opto/vectorization.cpp b/src/hotspot/share/opto/vectorization.cpp
index e607a1065dd..290ee249a42 100644
--- a/src/hotspot/share/opto/vectorization.cpp
+++ b/src/hotspot/share/opto/vectorization.cpp
@@ -98,6 +98,7 @@ VStatus VLoop::check_preconditions_helper() {
// the pre-loop limit.
CountedLoopEndNode* pre_end = _cl->find_pre_loop_end();
if (pre_end == nullptr) {
+ assert(false, "found no pre-loop");
return VStatus::make_failure(VLoop::FAILURE_PRE_LOOP_LIMIT);
}
Node* pre_opaq1 = pre_end->limit();
And run that:
rr /oracle-work/jdk-fork7/build/linux-x64-slowdebug/jdk/bin/java -Xcomp -XX:+TraceLoopOpts -XX:CompileCommand=compileonly,jdk.internal.classfile.impl.StackMapGenerator::processBlock --version
....
PreMainPost Loop: N7127/N4014 limit_check profile_predicated predicated counted [0,int),+1 (2147483648 iters) rc has_sfpt strip_mined
Unroll 2 Loop: N7127/N4014 counted [int,int),+1 (2147483648 iters) main rc has_sfpt strip_mined
Loop: N0/N0 has_call has_sfpt
Loop: N7453/N7460 limit_check profile_predicated predicated counted [0,int),+1 (4 iters) pre rc has_sfpt
Loop: N7126/N7125 sfpts={ 7128 }
Loop: N7508/N4014 counted [int,int),+2 (2147483648 iters) main rc has_sfpt strip_mined
Loop: N7409/N7416 counted [int,int),+1 (4 iters) post rc has_sfpt
Parallel IV: 7728 Loop: N7453/N7460 limit_check profile_predicated predicated counted [0,int),+1 (4 iters) pre has_sfpt
Parallel IV: 7725 Loop: N7508/N4014 counted [int,int),+2 (2147483648 iters) main has_sfpt strip_mined
Parallel IV: 7718 Loop: N7409/N7416 counted [int,int),+1 (4 iters) post has_sfpt
Loop: N0/N0 has_call has_sfpt
Loop: N7453/N7460 limit_check profile_predicated predicated counted [0,int),+1 (4 iters) pre has_sfpt
Loop: N7126/N7125 sfpts={ 7128 }
Loop: N7508/N4014 counted [int,int),+2 (2147483648 iters) main has_sfpt strip_mined
Loop: N7409/N7416 counted [int,int),+1 (4 iters) post has_sfpt
RangeCheck Loop: N7508/N4014 counted [int,int),+2 (2147483648 iters) main has_sfpt rce strip_mined
Unroll 4 Loop: N7508/N4014 limit_check counted [int,int),+2 (2147483648 iters) main has_sfpt rce strip_mined
Loop: N0/N0 has_call has_sfpt
Loop: N7453/N7460 limit_check profile_predicated predicated counted [0,int),+1 (4 iters) pre rc has_sfpt
Loop: N7126/N7125 limit_check sfpts={ 7128 }
Loop: N8146/N4014 limit_check counted [int,int),+4 (2147483648 iters) main has_sfpt strip_mined
Loop: N7409/N7416 counted [int,int),+1 (4 iters) post rc has_sfpt
...
# Internal Error (/oracle-work/jdk-fork7/open/src/hotspot/share/opto/vectorization.cpp:101), pid=1381339, tid=1381348
# assert(false) failed: found no pre-loop
The pre-loop is actually not dead. The problem shows up for the main-loop, in `CountedLoopNode::is_canonical_loop_entry`.
We skip through some predicates, but then we do not find the ZeroTripGuard. Instead, I'm seeing this:
(rr) p ctrl->dump_bfs(2,0,"#cd")
dist dump
---------------------------------------------
2 974 ConI === 0 [[ ... ]] #int:1
2 8060 IfTrue === 8056 [[ 8073 ]] #1
1 8073 If === 8060 974 [[ 8074 8077 ]] #Last Value Assertion Predicate P=0.999999, C=-1.000000
0 8077 IfTrue === 8073 [[ 8103 ]] #1
The pre-loop is further up though:
(rr) p this->dump_bfs(26,0,"#c")
dist dump
---------------------------------------------
26 7453 CountedLoop === 7453 4015 7460 [[ 7452 7453 7454 7455 ]] inner stride: 1 pre of N7127 !orig=[7127],[7118],[2645] !jvms: StackMapGenerator::processBlock @ bci:2677 (line 671)
25 7455 If === 7453 7441 [[ 7456 7464 ]] P=0.000001, C=-1.000000 !orig=[2686] !jvms: StackMapGenerator$Frame::popStack @ bci:5 (line 1001) StackMapGenerator::processBlock @ bci:2681 (line 671)
24 7456 IfFalse === 7455 [[ 7448 7457 ]] #0 !orig=[2631],[2628] !jvms: StackMapGenerator$Frame::popStack @ bci:5 (line 1001) StackMapGenerator::processBlock @ bci:2681 (line 671)
23 7457 RangeCheck === 7456 7446 [[ 7458 7467 ]] P=0.999999, C=-1.000000 !orig=[1189] !jvms: StackMapGenerator$Frame::popStack @ bci:33 (line 1002) StackMapGenerator::processBlock @ bci:2681 (line 671)
22 7458 IfTrue === 7457 [[ 7459 ]] #1 !orig=[777],385 !jvms: StackMapGenerator$Frame::popStack @ bci:33 (line 1002) StackMapGenerator::processBlock @ bci:2681 (line 671)
21 7459 CountedLoopEnd === 7458 7443 [[ 7460 7482 ]] [lt] P=0.900000, C=-1.000000 !orig=7122,[5398] !jvms: StackMapGenerator::processBlock @ bci:2674 (line 670)
20 7482 IfFalse === 7459 [[ 7486 ]] #0
19 7486 If === 7482 7485 [[ 7461 7487 ]] P=0.999999, C=-1.000000
18 7487 IfTrue === 7486 [[ 7977 ]] #1
17 7977 If === 7487 974 [[ 7978 7981 ]] #Init Value Assertion Predicate P=0.999999, C=-1.000000
16 7981 IfTrue === 7977 [[ 7994 ]] #1
15 7994 If === 7981 974 [[ 7995 7998 ]] #Last Value Assertion Predicate P=0.999999, C=-1.000000
14 7998 IfTrue === 7994 [[ 8118 ]] #1
13 8118 If === 7998 8117 [[ 8119 8122 ]] #Last Value Assertion Predicate P=0.999999, C=-1.000000
12 8122 IfTrue === 8118 [[ 8007 ]] #1
11 8007 If === 8122 8006 [[ 8008 8011 ]] #Init Value Assertion Predicate P=0.999999, C=-1.000000
10 8011 IfTrue === 8007 [[ 8056 ]] #1
9 8056 If === 8011 974 [[ 8057 8060 ]] #Init Value Assertion Predicate P=0.999999, C=-1.000000
8 8060 IfTrue === 8056 [[ 8073 ]] #1
7 8073 If === 8060 974 [[ 8074 8077 ]] #Last Value Assertion Predicate P=0.999999, C=-1.000000
6 8077 IfTrue === 8073 [[ 8103 ]] #1
5 8173 IfFalse === 7122 [[ 7128 7129 ]] #0 !orig=[7524],[7123],[5442] !jvms: StackMapGenerator::processBlock @ bci:2674 (line 670)
5 8103 If === 8077 8102 [[ 8104 8107 ]] #Last Value Assertion Predicate P=0.999999, C=-1.000000
4 7128 SafePoint === 8173 1 778 1 1 7129 780 1 1 781 781 782 783 784 1 1 1 785 786 [[ 7124 ]] SafePoint !orig=385 !jvms: StackMapGenerator::processBlock @ bci:2688 (line 670)
4 8107 IfTrue === 8103 [[ 8086 ]] #1
3 7124 OuterStripMinedLoopEnd === 7128 781 [[ 7125 7471 ]] P=0.900000, C=-1.000000
3 8086 If === 8107 8085 [[ 8087 8090 ]] #Init Value Assertion Predicate P=0.999999, C=-1.000000
2 7122 CountedLoopEnd === 8146 7121 [[ 8173 4014 ]] [lt] P=0.900000, C=-1.000000 !orig=[5398] !jvms: StackMapGenerator::processBlock @ bci:2674 (line 670)
2 7125 IfTrue === 7124 [[ 7126 ]] #1
2 8090 IfTrue === 8086 [[ 7126 ]] #1
1 4014 IfTrue === 7122 [[ 8146 ]] #1 !jvms: StackMapGenerator::processBlock @ bci:2674 (line 670)
1 7126 OuterStripMinedLoop === 7126 8090 7125 [[ 7126 8146 ]]
0 8146 CountedLoop === 8146 7126 4014 [[ 8146 1191 8157 8158 7122 7503 ]] inner stride: 4 main of N8146 strip mined !orig=[7508],[7127],[7118],[2645] !jvms: StackMapGenerator::processBlock @ bci:2677 (line 671)
It looks like we are skipping some predicates, but maybe not enough of them?
In `AssertionPredicates::find_entry` we see:
- `8090 IfTrue === 8086 [[ 7126 ]] #1`: `is_predicate` returns `true`.
- `8107 IfTrue === 8103 [[ 8086 ]] #1`: `is_predicate` returns `true`.
- `8077 IfTrue === 8073 [[ 8103 ]] #1`: `is_predicate` returns `false`. The reason is that the assertion predicate Opaque nodes have already disappeared.
I talked with @chhagedorn and he says that there are some "dying" initialized assertion predicates from unrolling that can be in the way. IGVN would clean them out later, and then we could see through. But at this point they are still in the way, so we cannot see through them to find the ZeroTripGuard: the predicate iterator is not good enough yet. @chhagedorn is working on that, see https://bugs.openjdk.org/browse/JDK-8350579
The implication is that the ZeroTripGuard temporarily cannot be found, so we cannot find the pre-loop either, nor the multiversion-if. So I cannot really add an assert now. And who knows, there may be other blocking reasons on top of that.
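
To make the failure mode concrete, here is a minimal, self-contained C++ sketch. It is a toy model and not HotSpot code: `Kind`, `CtrlNode`, `is_recognized_predicate`, `find_zero_trip_guard` and the two chain builders are hypothetical names, and treating node 7487 as the ZeroTripGuard projection is an assumption made only for this illustration. The node numbers 8090, 8107 and 8077 are taken from the dump above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy classification of the control nodes above the main-loop entry.
// These are NOT HotSpot types, just labels for the illustration.
enum class Kind {
    ZeroTripGuard,
    LiveAssertionPredicate,   // Opaque node still present, recognized
    DyingAssertionPredicate   // Opaque node already gone, not recognized
};

struct CtrlNode {
    int idx;
    Kind kind;
};

// Stand-in for the "is this a predicate we may skip?" check.
// Only predicates that still have their Opaque node are recognized.
bool is_recognized_predicate(const CtrlNode& n) {
    return n.kind == Kind::LiveAssertionPredicate;
}

// Walk upward from the loop entry (chain[0]), skipping recognized
// predicates. Returns the idx of the zero-trip guard if it is the first
// non-skipped node, and -1 otherwise.
int find_zero_trip_guard(const std::vector<CtrlNode>& chain) {
    assert(!chain.empty());
    std::size_t i = 0;
    while (i < chain.size() && is_recognized_predicate(chain[i])) {
        ++i;
    }
    if (i < chain.size() && chain[i].kind == Kind::ZeroTripGuard) {
        return chain[i].idx;
    }
    return -1;
}

// State at the time of the failing assert: 8077 is a dying predicate.
std::vector<CtrlNode> chain_before_igvn() {
    return {
        {8090, Kind::LiveAssertionPredicate},
        {8107, Kind::LiveAssertionPredicate},
        {8077, Kind::DyingAssertionPredicate},
        {7487, Kind::ZeroTripGuard}  // assumed guard, for illustration only
    };
}

// Hypothetical state once the dying predicate has been cleaned out.
std::vector<CtrlNode> chain_after_igvn() {
    return {
        {8090, Kind::LiveAssertionPredicate},
        {8107, Kind::LiveAssertionPredicate},
        {7487, Kind::ZeroTripGuard}  // assumed guard, for illustration only
    };
}
```

In this model, `find_zero_trip_guard(chain_before_igvn())` returns -1 because the walk stops at 8077, while `find_zero_trip_guard(chain_after_igvn())` returns 7487. That mirrors why an assert at the bail-out would fire even though the pre-loop is still alive.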
@rwestrel Does that make sense? What do you think we should do?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/22016#issuecomment-2678602660