RFR: 8290910: Wrong memory state is picked in SuperWord::co_locate_pack()
Fei Gao
fgao at openjdk.org
Wed Aug 17 07:57:16 UTC 2022
After JDK-8283091, the loop below can be vectorized partially.
Statement 1 can be vectorized but statement 2 can't.
// int[] iArr; long[] lArrFld; int i1,i2;
for (i1 = 6; i1 < 227; i1++) {
iArr[i1] += lArrFld[i1]++; // statement 1
iArr[i1 + 1] -= (i2++); // statement 2
}
But we got incorrect results because the vector packs of iArr are
scheduled incorrectly like:
...
load_vector XMM1,[R8 + #16 + R11 << #2]
movl RDI, [R8 + #20 + R11 << #2] # int
load_vector XMM2,[R9 + #8 + R11 << #3]
subl RDI, R11 # int
vpaddq XMM3,XMM2,XMM0 ! add packedL
store_vector [R9 + #8 + R11 << #3],XMM3
vector_cast_l2x XMM2,XMM2 !
vpaddd XMM1,XMM2,XMM1 ! add packedI
addl RDI, #228 # int
movl [R8 + #20 + R11 << #2], RDI # int
movl RBX, [R8 + #24 + R11 << #2] # int
subl RBX, R11 # int
addl RBX, #227 # int
movl [R8 + #24 + R11 << #2], RBX # int
...
movl RBX, [R8 + #40 + R11 << #2] # int
subl RBX, R11 # int
addl RBX, #223 # int
movl [R8 + #40 + R11 << #2], RBX # int
movl RDI, [R8 + #44 + R11 << #2] # int
subl RDI, R11 # int
addl RDI, #222 # int
movl [R8 + #44 + R11 << #2], RDI # int
store_vector [R8 + #16 + R11 << #2],XMM1
...
simplified as:
load_vector iArr in statement 1
unvectorized loads/stores in statement 2
store_vector iArr in statement 1
We cannot pick the memory state from the first load for LoadI pack
here, as the LoadI vector operation must load the new values in memory
after iArr writes `iArr[i1 + 1] - (i2++)` to `iArr[i1 + 1]`(statement 2).
We must take the memory state of the last load where we have assigned
new values `iArr[i1 + 1] - (i2++)` to the iArr array.
In JDK-8240281, we picked the memory state of the first load[1]. Different
from the scenario in JDK-8240281, the store, which is dependent on an
earlier load here, is in a pack to be scheduled and the LoadI pack
depends on the last_mem. As designed[2], to schedule the StoreI pack,
all memory operations in another single pack should be moved in the same
direction. We know that the store in the pack depends on one of loads in
the LoadI pack, so the LoadI pack should be scheduled before the StoreI
pack. And the LoadI pack depends on the last_mem, so the last_mem must
be scheduled before the LoadI pack and also before the store pack.
Therefore, we need to take the memory state of the last load for the
LoadI pack here.
To fix it, the pack adds additional checks while picking the memory state
of the first load. When the store locates in a pack and the load pack
relies on the last_mem, we shouldn't choose the memory state of the
first load but choose the memory state of the last load.
[1]https://github.com/openjdk/jdk/blob/0ae834105740f7cf73fe96be22e0f564ad29b18d/src/hotspot/share/opto/superword.cpp#L2380
[2]https://github.com/openjdk/jdk/blob/0ae834105740f7cf73fe96be22e0f564ad29b18d/src/hotspot/share/opto/superword.cpp#L2232
-------------
Commit messages:
- 8290910: Wrong memory state is picked in SuperWord::co_locate_pack()
Changes: https://git.openjdk.org/jdk/pull/9898/files
Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=9898&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8290910
Stats: 124 lines in 3 files changed: 120 ins; 0 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/9898.diff
Fetch: git fetch https://git.openjdk.org/jdk pull/9898/head:pull/9898
PR: https://git.openjdk.org/jdk/pull/9898
More information about the hotspot-compiler-dev
mailing list