RFR: 8290910: Wrong memory state is picked in SuperWord::co_locate_pack()

Fei Gao fgao at openjdk.org
Wed Aug 17 07:57:16 UTC 2022


After JDK-8283091, the loop below can be partially vectorized:
statement 1 can be vectorized, but statement 2 cannot.

// int[] iArr; long[] lArrFld; int i1,i2;
for (i1 = 6; i1 < 227; i1++) {
  iArr[i1] += lArrFld[i1]++; // statement 1
  iArr[i1 + 1] -= (i2++); // statement 2
}


However, we got incorrect results because the vector packs of iArr are
scheduled incorrectly, as shown in this generated code:

...
load_vector XMM1,[R8 + #16 + R11 << #2]
movl    RDI, [R8 + #20 + R11 << #2] # int
load_vector XMM2,[R9 + #8 + R11 << #3]
subl    RDI, R11    # int
vpaddq  XMM3,XMM2,XMM0  ! add packedL
store_vector [R9 + #8 + R11 << #3],XMM3
vector_cast_l2x  XMM2,XMM2  !
vpaddd  XMM1,XMM2,XMM1  ! add packedI
addl    RDI, #228   # int
movl    [R8 + #20 + R11 << #2], RDI # int
movl    RBX, [R8 + #24 + R11 << #2] # int
subl    RBX, R11    # int
addl    RBX, #227   # int
movl    [R8 + #24 + R11 << #2], RBX # int
...
movl    RBX, [R8 + #40 + R11 << #2] # int
subl    RBX, R11    # int
addl    RBX, #223   # int
movl    [R8 + #40 + R11 << #2], RBX # int
movl    RDI, [R8 + #44 + R11 << #2] # int
subl    RDI, R11    # int
addl    RDI, #222   # int
movl    [R8 + #44 + R11 << #2], RDI # int
store_vector [R8 + #16 + R11 << #2],XMM1
...

This schedule can be simplified as:

load_vector iArr in statement 1
unvectorized loads/stores in statement 2
store_vector iArr in statement 1

We cannot pick the memory state of the first load for the LoadI pack
here, because the LoadI vector operation must load the values that are
in memory after statement 2 has written `iArr[i1 + 1] - (i2++)` to
`iArr[i1 + 1]`. Instead, we must take the memory state of the last load,
at which point the new values `iArr[i1 + 1] - (i2++)` have already been
assigned to the iArr array.
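
The effect of the two load placements can be modeled in plain Java. The
sketch below is only an illustration, not HotSpot code: the class name
ScheduleSketch, the pack width LANES = 4, the array length N = 400 and
the initial values are all assumptions made for the example. It runs
the loop from this post as a scalar reference, then replays it in
blocks where statement 1 is handled as a "vector" and statement 2 stays
scalar, once with the vector load before the scalar stores and once
after them.

```java
import java.util.Arrays;

// Hypothetical model of the two schedules; names and sizes are assumptions.
class ScheduleSketch {
    static final int N = 400;                 // assumed array length
    static final int LANES = 4;               // assumed pack width
    static final int START = 6, LIMIT = 227;  // loop bounds from the post

    static int[] freshInts() {
        int[] a = new int[N];
        Arrays.fill(a, 10);                   // arbitrary initial contents
        return a;
    }

    static long[] freshLongs() {
        long[] a = new long[N];
        Arrays.fill(a, 3L);                   // arbitrary initial contents
        return a;
    }

    // Reference result: the loop exactly as written in the post.
    static int[] scalar() {
        int[] iArr = freshInts();
        long[] lArrFld = freshLongs();
        int i2 = 1;                           // arbitrary start value
        for (int i1 = START; i1 < LIMIT; i1++) {
            iArr[i1] += lArrFld[i1]++;        // statement 1
            iArr[i1 + 1] -= (i2++);           // statement 2
        }
        return iArr;
    }

    // Statement 1 is processed LANES elements at a time, statement 2 stays
    // scalar. loadFirst == true models the bad schedule, where the vector
    // load of iArr happens before the scalar stores of statement 2.
    static int[] blocked(boolean loadFirst) {
        int[] iArr = freshInts();
        long[] lArrFld = freshLongs();
        int[] vec = new int[LANES];
        int i2 = 1;
        int i1 = START;
        for (; i1 + LANES <= LIMIT; i1 += LANES) {
            if (loadFirst) {
                System.arraycopy(iArr, i1, vec, 0, LANES);  // load_vector (too early)
            }
            for (int k = 0; k < LANES; k++) {
                iArr[i1 + k + 1] -= (i2++);                 // scalar statement 2
            }
            if (!loadFirst) {
                System.arraycopy(iArr, i1, vec, 0, LANES);  // load_vector (after stores)
            }
            for (int k = 0; k < LANES; k++) {
                vec[k] += lArrFld[i1 + k]++;                // vector part of statement 1
            }
            System.arraycopy(vec, 0, iArr, i1, LANES);      // store_vector
        }
        for (; i1 < LIMIT; i1++) {                          // scalar tail iterations
            iArr[i1] += lArrFld[i1]++;
            iArr[i1 + 1] -= (i2++);
        }
        return iArr;
    }

    public static void main(String[] args) {
        int[] expected = scalar();
        System.out.println("load after statement 2 matches:  "
                + Arrays.equals(expected, blocked(false)));
        System.out.println("load before statement 2 matches: "
                + Arrays.equals(expected, blocked(true)));
    }
}
```

In this model the early load makes store_vector overwrite most of the
elements that statement 2 has just updated, so only the variant that
loads after the scalar stores agrees with the scalar loop.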

In JDK-8240281, we picked the memory state of the first load[1]. Unlike
the scenario in JDK-8240281, here the store that depends on an earlier
load is itself in a pack to be scheduled, and the LoadI pack depends on
the last_mem. As designed[2], to schedule the StoreI pack, all memory
operations in another single pack should be moved in the same direction.
We know that the store in the pack depends on one of the loads in the
LoadI pack, so the LoadI pack should be scheduled before the StoreI
pack. The LoadI pack in turn depends on the last_mem, so the last_mem
must be scheduled before the LoadI pack and hence also before the
StoreI pack. Therefore, we need to take the memory state of the last
load for the LoadI pack here.

To fix it, the patch adds additional checks when picking the memory
state of the first load. When the store is located in a pack and the
load pack relies on the last_mem, we should not choose the memory state
of the first load but the memory state of the last load instead.

[1] https://github.com/openjdk/jdk/blob/0ae834105740f7cf73fe96be22e0f564ad29b18d/src/hotspot/share/opto/superword.cpp#L2380
[2] https://github.com/openjdk/jdk/blob/0ae834105740f7cf73fe96be22e0f564ad29b18d/src/hotspot/share/opto/superword.cpp#L2232

-------------

Commit messages:
 - 8290910: Wrong memory state is picked in SuperWord::co_locate_pack()

Changes: https://git.openjdk.org/jdk/pull/9898/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=9898&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8290910
  Stats: 124 lines in 3 files changed: 120 ins; 0 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/9898.diff
  Fetch: git fetch https://git.openjdk.org/jdk pull/9898/head:pull/9898

PR: https://git.openjdk.org/jdk/pull/9898

