RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs [v43]
Christian Hagedorn
chagedorn at openjdk.org
Fri Dec 15 10:50:24 UTC 2023
On Fri, 15 Dec 2023 08:28:51 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:
>> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>>
>> renamings and proof improvement in adjust_pre_loop_limit_to_align_main_loop_vectors
>
> src/hotspot/share/opto/vectorization.cpp line 824:
>
>> 822: // (C_const + C_pre * pre_iter_C_const) % aw = 0 (4c)
>> 823: //
>> 824: // We can only guarantee solutions to (4a) and (4b) if:
>
> Should we also mention here that strengthening it with (4a-c) is the best we can do to prove (3) statically while we might miss some solutions for (3) where some of (4a-c) are false which, however, cannot be proven statically?
Maybe we can also put (5a-b) in words:
Suggestion:
// We can only guarantee solutions to (4a) and (4b) if C_init and C_invar are zero or multiples of C_pre:
> src/hotspot/share/opto/vectorization.cpp line 940:
>
>> 938: // hence we have to add a dependency for invar, and scale (pre_stride is the
>> 939: // same for all mem_refs in the loop). If there is no invariant, then we add
>> 940: // a dependency that there is no invariant.
>
> I found it a little difficult to understand this part. I had to write it down to verify that we indeed are not missing a dependency. Maybe we can be more explicit here. How about something like that?
>
>
> // Hence, the solution for pre_iter depends on:
> // - Always: pre_r and pre_q
> // - If an init is present, we have a dependency on it. But since init is fixed and given by the loop for all mem_refs
> // of all packs, we can drop it.
> // - If init is a constant, we do not have another dependency.
> // - If init is a non-constant, we have a dependency on scale since C_init = scale. We need to keep since it is not
> // given by the loop and could vary from pack to pack. We do not have to add another dependency since (5a*):
> //
> // C_init % abs(C_pre) = 0 <=> C_init = C_pre * X
> //
> // can only be satisfied if X is constant:
> //
> // C_init = C_pre * X
> // X = C_init / C_pre
> // X = scale / (scale * pre_stride)
> // X = 1 / pre_stride
> //
> // - If an invariant is present, we have a dependency on it which we need to keep since it is not given by the loop
> // and could vary from pack to pack. We additionally have to add another dependency on scale since (5b*):
> //
> // C_invar % abs(C_pre) = 0 <=> C_invar = C_pre * Y
> //
> // can only be satisfied if we have the following Y:
> //
> // C_invar = C_pre * Y
> // Y = (C_invar / C_pre)
> // Y = abs(invar_factor) / (scale * pre_stride)
> //
> // A dependency for pre_stride is not required as it is fixed and given by the loop for all mem_refs of all packs
> // (similar to init).
Another thought: Should we assert that if init is a non-constant that `abs(stride) = 1` if there is a solution?
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1427694498
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1427819252
More information about the hotspot-compiler-dev
mailing list