RFR: 8298935: fix cyclic dependency bug in create_pack logic in SuperWord::find_adjacent_refs [v15]

Jatin Bhateja jbhateja at openjdk.org
Mon Mar 6 05:24:19 UTC 2023


On Mon, 6 Mar 2023 05:13:27 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> I fixed my script, it should now compute the ranges correctly, and not add IR rules with impossible ranges.
>
> Thanks. Even though the newly added test now passes at all AVX and SSE levels, can you kindly investigate why the following should be vectorized with unaligned accesses when it carries a cross-iteration true dependency with distance 4?
> 
> 
>     @Test
>     // CPU: sse4.1 to avx -> vector_width: 16 -> elements in vector: 4
>     //   positive byte_offset 12 can lead to cyclic dependency
>     @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
>         applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
>         applyIfCPUFeatureAnd = {"sse4.1", "true", "avx2", "false"})
>     // CPU: avx2 -> vector_width: 32 -> elements in vector: 8
>     //   positive byte_offset 12 can lead to cyclic dependency
>     @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
>         applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
>         applyIfCPUFeatureAnd = {"avx2", "true", "avx512", "false"})
>     // CPU: avx512 -> vector_width: 64 -> elements in vector: 16
>     //   positive byte_offset 12 can lead to cyclic dependency
>     @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
>         applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
>         applyIfCPUFeature = {"avx512", "true"})
>     // CPU: asimd -> vector_width: 32 -> elements in vector: 8
>     //   positive byte_offset 12 can lead to cyclic dependency
>     @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
>         applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
>         applyIfCPUFeature = {"asimd", "true"})
>     public static void testIntP3(int[] data) {
>         for (int j = 0; j < RANGE - 3; j++) {
>             data[j + 3] = (int)(data[j] * (int)-11);
>         }
>     }
> 
> 
> Also, SLP now operates under SuperWordMaxVectorSize, so it would be good to use it instead.

With +AlignVector, the behavior with and without the Vectorize,true pragma should match.


static void test1() {
     for (int i = 4; i < 100; i++) { 
         fArr[i + 4] = fArr[i];
     }
} 

  

CPROMPT>javad -XX:+TraceNewVectors -XX:+AlignVector  -cp . bug
WARNING: Using incubator modules: jdk.incubator.vector
res = 0.0
CPROMPT>
CPROMPT>javad -XX:+TraceNewVectors -XX:+AlignVector -XX:CompileCommand=Vectorize,bug::test1,true -cp . bug
CompileCommand: Vectorize bug.test1 bool Vectorize = true
WARNING: Using incubator modules: jdk.incubator.vector
new Vector node:  990  LoadVector  === 373 856 824  [[ 822 802 800 798 718 706 556 196 ]]  @float[int:>=0]:NotNull:exact+any *, idx=6; mismatched #vectory[8]:{float} !orig=[823],[719],[557],[199],143 !jvms: bug::test1 @ bci:18 (line 7)
new Vector node:  991  StoreVector  === 855 856 825 990  [[ 988 195 856 ]]  @float[int:>=0]:NotNull:exact+any *, idx=6; mismatched  Memory: @float[int:>=0]:NotNull:exact+any *, idx=6; !orig=[822],[718],[556],[196],164 !jvms: bug::test1 @ bci:19 (line 7)
res = 0.0
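
To make the concern concrete: the trace above shows a vectory[8] float pack for a loop whose store-to-load dependency distance is 4 elements. Below is a minimal sketch, not HotSpot code, that emulates a packed loop in plain Java (load all lanes first, then store them) and compares it with scalar execution. The class name, method names, and the array length of 104 are assumptions made only for illustration.

```java
import java.util.Arrays;

public class DependencyDistanceSketch {
    // Dependency distance of the loop body: fArr[i + 4] = fArr[i]
    static final int DISTANCE = 4;

    // Reference semantics: one element per iteration, i = 4 .. 99 for length 104.
    static float[] scalar(float[] in) {
        float[] a = in.clone();
        for (int i = 4; i + DISTANCE < a.length; i++) {
            a[i + DISTANCE] = a[i];
        }
        return a;
    }

    // Emulates a packed loop: read `lanes` elements, then write them all at once.
    static float[] packed(float[] in, int lanes) {
        float[] a = in.clone();
        int i = 4;
        for (; i + DISTANCE + lanes <= a.length; i += lanes) {
            float[] v = Arrays.copyOfRange(a, i, i + lanes);   // stands in for LoadVector
            System.arraycopy(v, 0, a, i + DISTANCE, lanes);    // stands in for StoreVector
        }
        for (; i + DISTANCE < a.length; i++) {                 // scalar tail
            a[i + DISTANCE] = a[i];
        }
        return a;
    }

    public static void main(String[] args) {
        float[] init = new float[104];
        for (int k = 0; k < init.length; k++) {
            init[k] = k;
        }
        float[] ref = scalar(init);
        // 4 lanes never read an element that the same pack writes.
        System.out.println("4 lanes match scalar: " + Arrays.equals(ref, packed(init, 4)));
        // 8 lanes read elements i+4 .. i+7 before the pack has written them.
        System.out.println("8 lanes match scalar: " + Arrays.equals(ref, packed(init, 8)));
    }
}
```

With this emulation, the 4-lane version agrees with the scalar loop and the 8-lane version does not, which is why a pack wider than the dependency distance deserves a closer look here.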

-------------

PR: https://git.openjdk.org/jdk/pull/12350

