RFR: 8286125: C2: "bad AD file" with PopulateIndex on x86_64 [v2]
Tobias Hartmann
thartmann at openjdk.java.net
Tue May 10 09:05:56 UTC 2022
On Tue, 10 May 2022 05:09:44 GMT, Pengfei Li <pli at openjdk.org> wrote:
>> A fuzzer test reports an assertion failure issue with PopulateIndexNode
>> on x86_64. It can be reproduced by the new jtreg case inside this patch.
>> Root cause is that C2 superword creates a PopulateIndexNode by mistake
>> while vectorizing below loop.
>>
>> for (int i = 304; i > 15; i -= 3) {
>> int c = 16;
>> do {
>> for (int t = 1; t < 1; t++) {}
>> arr[c + 1] >>= i;
>> } while (--c > 0);
>> }
>>
>> This is a corner loop case with redundant code inside. After several C2
>> optimizations, the do-while loop inside is unrolled and then isomorphic
>> right shift statements can be combined in the superword optimization.
>> Since all shift counts are the same loop IV value `i`, superword should
>> generate a RShiftCntVNode to create a vector of scalar replications of
>> the loop IV. But after JDK-8280510, a PopulateIndexNode is generated by
>> mistake because of the `opd == iv()` condition.
>>
>> To fix this, we add a `have_same_inputs` condition here checking if all
>> inputs at position `opd_idx` of nodes in the pack are the same. If true,
>> C2 code should NOT run into this block to generate a PopulateIndexNode.
>> Instead, it should run into the next block for scalar replications.
>>
>> Additionally, only adding this condition here is still not good enough
>> because it breaks the experimental post loop vectorization. As in post
>> loops, all packs are singleton, i.e., `have_same_inputs` is always true.
>> Hence, we also add a pack size check here to make post loop logic run
>> into this block. It's safe to let it go because post loop never needs
>> scalar replications of the loop IV - it never combines nodes in packs.
>>
>> We also add two more assertions in the code.
>>
>> Jtreg hotspot::hotspot_all_no_apps, jdk::tier1~3 and langtools::tier1
>> are tested and no issue is found.
>
> Pengfei Li has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision:
>
> - Merge branch 'master' into slpfix
> - 8286125: C2: "bad AD file" with PopulateIndex on x86_64
>
> A fuzzer test reports an assertion failure issue with PopulateIndexNode
> on x86_64. It can be reproduced by the new jtreg case inside this patch.
> Root cause is that C2 superword creates a PopulateIndexNode by mistake
> while vectorizing below loop.
>
> for (int i = 304; i > 15; i -= 3) {
> int c = 16;
> do {
> for (int t = 1; t < 1; t++) {}
> arr[c + 1] >>= i;
> } while (--c > 0);
> }
>
> This is a corner loop case with redundant code inside. After several C2
> optimizations, the do-while loop inside is unrolled and then isomorphic
> right shift statements can be combined in the superword optimization.
> Since all shift counts are the same loop IV value `i`, superword should
> generate a RShiftCntVNode to create a vector of scalar replications of
> the loop IV. But after JDK-8280510, a PopulateIndexNode is generated by
> mistake because of the `opd == iv()` condition.
>
> To fix this, we add a `have_same_inputs` condition here checking if all
> inputs at position `opd_idx` of nodes in the pack are the same. If true,
> C2 code should NOT run into this block to generate a PopulateIndexNode.
> Instead, it should run into the next block for scalar replications.
>
> Additionally, only adding this condition here is still not good enough
> because it breaks the experimental post loop vectorization. As in post
> loops, all packs are singleton, i.e., `have_same_inputs` is always true.
> Hence, we also add a pack size check here to make post loop logic run
> into this block. It's safe to let it go because post loop never needs
> scalar replications of the loop IV - it never combines nodes in packs.
>
> We also add two more assertions in the code.
>
> Jtreg hotspot::hotspot_all_no_apps, jdk::tier1~3 and langtools::tier1
> are tested and no issue is found.
Please remove the newline from the beginning of the test before integrating.
Thanks for the clarification. Makes sense.
-------------
Marked as reviewed by thartmann (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/8587
More information about the hotspot-compiler-dev
mailing list