RFR: 8298935: fix cyclic dependency bug in create_pack logic in SuperWord::find_adjacent_refs [v15]

Jatin Bhateja jbhateja at openjdk.org
Mon Mar 6 05:16:13 UTC 2023


On Thu, 2 Mar 2023 15:56:00 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> @jatin-bhateja I think the IR rule is just ineffective. I have the following condition in it that will never be met:
>> `applyIfAnd = {"MaxVectorSize", ">= 8", "MaxVectorSize", "<= 4"},`
>> The `<= 4` must hold so that `byte_offset <= MaxVectorSize`, and so the cyclical dependency would not happen. But `>= 8` must hold so that two ints fit in a vector, so that we even vectorize.
>> 
>> I could improve the script and filter out such ineffective IR rules. Not sure if that is worth it though.
>
> I fixed my script, it should now compute the ranges correctly, and not add IR rules with impossible ranges.

Thanks. Even though the newly added test now passes at all AVX and SSE levels, could you kindly investigate why the following should be vectorized with unaligned accesses when it carries a cross-iteration true dependency with distance 3 (byte_offset 12)?


    @Test
    // CPU: sse4.1 to avx -> vector_width: 16 -> elements in vector: 4
    //   positive byte_offset 12 can lead to cyclic dependency
    @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
        applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
        applyIfCPUFeatureAnd = {"sse4.1", "true", "avx2", "false"})
    // CPU: avx2 -> vector_width: 32 -> elements in vector: 8
    //   positive byte_offset 12 can lead to cyclic dependency
    @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
        applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
        applyIfCPUFeatureAnd = {"avx2", "true", "avx512", "false"})
    // CPU: avx512 -> vector_width: 64 -> elements in vector: 16
    //   positive byte_offset 12 can lead to cyclic dependency
    @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
        applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
        applyIfCPUFeature = {"avx512", "true"})
    // CPU: asimd -> vector_width: 32 -> elements in vector: 8
    //   positive byte_offset 12 can lead to cyclic dependency
    @IR(counts = {IRNode.LOAD_VECTOR, "> 0", IRNode.MUL_V, "> 0", IRNode.STORE_VECTOR, "> 0"},
        applyIfAnd = {"AlignVector", "false", "MaxVectorSize", ">= 8", "MaxVectorSize", "<= 12"},
        applyIfCPUFeature = {"asimd", "true"})
    public static void testIntP3(int[] data) {
        for (int j = 0; j < RANGE - 3; j++) {
            data[j + 3] = (int)(data[j] * (int)-11);
        }
    }
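
To make the concern concrete, here is a minimal sketch that models the loop above outside the JIT. It is not what C2 emits. The class and method names are hypothetical, `data.length` stands in for RANGE, and `packed` only simulates a pack of `lanes` ints being loaded together and then stored at offset +3. Under those assumptions, a 2-int pack (8 bytes, below the 12-byte offset) reproduces the scalar result, while a 4-int pack (16 bytes) reads data[j + 3] before the same pack has stored it.

```java
import java.util.Arrays;

public class DependencyDistanceDemo {
    // Scalar reference: same loop body as testIntP3 (data.length replaces RANGE).
    public static int[] scalar(int[] input) {
        int[] data = input.clone();
        for (int j = 0; j < data.length - 3; j++) {
            data[j + 3] = data[j] * -11;
        }
        return data;
    }

    // Hypothetical model of a packed loop: all `lanes` loads happen first,
    // then all `lanes` stores, as a vector load followed by a vector store would.
    public static int[] packed(int[] input, int lanes) {
        int[] data = input.clone();
        int limit = data.length - 3;
        int j = 0;
        for (; j + lanes <= limit; j += lanes) {
            int[] v = Arrays.copyOfRange(data, j, j + lanes); // "vector load"
            for (int k = 0; k < lanes; k++) {
                data[j + 3 + k] = v[k] * -11;                 // "vector store"
            }
        }
        for (; j < limit; j++) {                              // scalar tail
            data[j + 3] = data[j] * -11;
        }
        return data;
    }

    public static void main(String[] args) {
        int[] in = new int[16];
        Arrays.fill(in, 1);
        // 2 lanes = 8 bytes: the pack never reads an element it also writes.
        System.out.println(Arrays.equals(scalar(in), packed(in, 2))); // prints "true"
        // 4 lanes = 16 bytes: lane 3 loads data[j + 3] before lane 0 stores it.
        System.out.println(Arrays.equals(scalar(in), packed(in, 4))); // prints "false"
    }
}
```

With an all-ones input, the scalar loop leaves data[6] == 121 (data[3] was already -11 when it was read), whereas the 4-lane model leaves data[6] == -11.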


Also, SLP now operates under SuperWordMaxVectorSize, so it would be good to use it instead of MaxVectorSize.

-------------

PR: https://git.openjdk.org/jdk/pull/12350


More information about the hotspot-compiler-dev mailing list