RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2]
Pengfei Li
pli at openjdk.org
Mon Jul 3 09:07:12 UTC 2023
On Tue, 27 Jun 2023 16:58:52 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> src/hotspot/share/opto/vmaskloop.cpp line 550:
>>
>>> 548: // 2) Address is growing down (index scale * loop stride < 0)
>>> 549: // 3) Memory access scale is different from data size
>>> 550: // 4) The loop increment node is on the SWPointer's node stack
>>
>> Why should the `incr` not be on the node stack?
>
> Does that not prevent `a[i+1]` from being accepted?
That's really a corner case. In C2's ideal graph, most loop statements eventually use the loop induction variable `phi` node as an input, which is fine. The exception is a loop statement that contains the sub-expression `iv + stride`. In that case, IGVN may apply common sub-expression elimination, after which those inputs may come from the loop increment node instead. Because the final step of the vector masked transformation replaces the loop increment node, the calculation of `iv + stride` gets replaced as well, and that causes a mis-compilation.

In the current patch, I duplicate the loop increment pattern used for the update to avoid this issue (that's why we have `is_loop_incr_pattern()`, see commit 2). For now this only applies to expressions that are not in array indices, such as `a[i] = i + 1`. For patterns like `a[i+1] = i`, I'm still looking for a better way to handle them.
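To make the two cases concrete, here is a minimal Java sketch of the source shapes discussed above. The class and method names are hypothetical and chosen only for illustration; they do not come from the patch or its tests. Whether C2 actually commons `i + 1` with the loop increment depends on the JIT and the flags in use, so this shows only what the loops look like at the source level, not a reproducer.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the patch or its tests.
class IncrementUsePatterns {
    // `i + 1` is a stored value, not part of the array index.
    // Per the reply, this is the shape the current patch covers.
    static int[] valueUsesIncrement(int n) {
        int[] a = new int[n];
        for (int i = 0; i < n; i++) {
            a[i] = i + 1;
        }
        return a;
    }

    // `i + 1` appears in the array index.
    // Per the reply, this shape is not yet handled by the patch.
    static int[] indexUsesIncrement(int n) {
        int[] a = new int[n + 1];
        for (int i = 0; i < n; i++) {
            a[i + 1] = i;
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(valueUsesIncrement(4)));
        System.out.println(Arrays.toString(indexUsesIncrement(4)));
    }
}
```

In both loops the expression `i + 1` has the same shape as the loop's own `i++` update with stride 1, which is what makes it a candidate for being merged with the increment node.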
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250523973