RFR: 8286125: C2: "bad AD file" with PopulateIndex on x86_64 [v2]

Tobias Hartmann thartmann at openjdk.java.net
Tue May 10 09:05:56 UTC 2022


On Tue, 10 May 2022 05:09:44 GMT, Pengfei Li <pli at openjdk.org> wrote:

>> A fuzzer test reports an assertion failure issue with PopulateIndexNode
>> on x86_64. It can be reproduced by the new jtreg case inside this patch.
>> Root cause is that C2 superword creates a PopulateIndexNode by mistake
>> while vectorizing below loop.
>> 
>>     for (int i = 304; i > 15; i -= 3) {
>>         int c = 16;
>>         do {
>>             for (int t = 1; t < 1; t++) {}
>>             arr[c + 1] >>= i;
>>         } while (--c > 0);
>>     }
>> 
>> This is a corner loop case with redundant code inside. After several C2
>> optimizations, the do-while loop inside is unrolled and then isomorphic
>> right shift statements can be combined in the superword optimization.
>> Since all shift counts are the same loop IV value `i`, superword should
>> generate a RShiftCntVNode to create a vector of scalar replications of
>> the loop IV. But after JDK-8280510, a PopulateIndexNode is generated by
>> mistake because of the `opd == iv()` condition.
>> 
>> To fix this, we add a `have_same_inputs` condition here checking if all
>> inputs at position `opd_idx` of nodes in the pack are the same. If true,
>> C2 code should NOT run into this block to generate a PopulateIndexNode.
>> Instead, it should run into the next block for scalar replications.
>> 
>> Additionally, only adding this condition here is still not good enough
>> because it breaks the experimental post loop vectorization. As in post
>> loops, all packs are singleton, i.e., `have_same_inputs` is always true.
>> Hence, we also add a pack size check here to make post loop logic run
>> into this block. It's safe to let it go because post loop never needs
>> scalar replications of the loop IV - it never combines nodes in packs.
>> 
>> We also add two more assertions in the code.
>> 
>> Jtreg hotspot::hotspot_all_no_apps, jdk::tier1~3 and langtools::tier1
>> are tested and no issue is found.
>
> Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
> 
>  - Merge branch 'master' into slpfix
>  - 8286125: C2: "bad AD file" with PopulateIndex on x86_64
>    
>    A fuzzer test reports an assertion failure issue with PopulateIndexNode
>    on x86_64. It can be reproduced by the new jtreg case inside this patch.
>    Root cause is that C2 superword creates a PopulateIndexNode by mistake
>    while vectorizing below loop.
>    
>        for (int i = 304; i > 15; i -= 3) {
>            int c = 16;
>            do {
>                for (int t = 1; t < 1; t++) {}
>                arr[c + 1] >>= i;
>            } while (--c > 0);
>        }
>    
>    This is a corner loop case with redundant code inside. After several C2
>    optimizations, the do-while loop inside is unrolled and then isomorphic
>    right shift statements can be combined in the superword optimization.
>    Since all shift counts are the same loop IV value `i`, superword should
>    generate a RShiftCntVNode to create a vector of scalar replications of
>    the loop IV. But after JDK-8280510, a PopulateIndexNode is generated by
>    mistake because of the `opd == iv()` condition.
>    
>    To fix this, we add a `have_same_inputs` condition here checking if all
>    inputs at position `opd_idx` of nodes in the pack are the same. If true,
>    C2 code should NOT run into this block to generate a PopulateIndexNode.
>    Instead, it should run into the next block for scalar replications.
>    
>    Additionally, only adding this condition here is still not good enough
>    because it breaks the experimental post loop vectorization. As in post
>    loops, all packs are singleton, i.e., `have_same_inputs` is always true.
>    Hence, we also add a pack size check here to make post loop logic run
>    into this block. It's safe to let it go because post loop never needs
>    scalar replications of the loop IV - it never combines nodes in packs.
>    
>    We also add two more assertions in the code.
>    
>    Jtreg hotspot::hotspot_all_no_apps, jdk::tier1~3 and langtools::tier1
>    are tested and no issue is found.

Please remove the newline from the beginning of the test before integrating.

Thanks for the clarification. Makes sense.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/8587


More information about the hotspot-compiler-dev mailing list