RFR: 8298935: fix cyclic dependency bug in create_pack logic in SuperWord::find_adjacent_refs [v15]
Jatin Bhateja
jbhateja at openjdk.org
Mon Mar 6 10:05:25 UTC 2023
On Mon, 6 Mar 2023 09:01:49 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> @jatin-bhateja
>>> Thanks, even though newly added test now passes at all AVX and SSE level can you kindly investigate why should following be vectorized with un-aligned accesses when it carries a cross iteration true dependency with distance 4.
>>
>> The cyclic dependency is at a distance of 3, not 4 in this example. Ints are 4 bytes. Thus, the `byte_offset` is 12 bytes. So if `MaxVectorSize <= 12`, we cannot ever have a cyclic dependency within a vector.
>>
>> See my explanations at the beginning of the test file, for example:
>> https://github.com/openjdk/jdk/blob/fb7f6dd9fbc2a6086d2ad36e0681fbc9eff6c9a7/test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java#L49-L57
>
> You can see that the rules for distance 4 are adjusted, if you look at `testIntP4`:
> https://github.com/openjdk/jdk/blob/fb7f6dd9fbc2a6086d2ad36e0681fbc9eff6c9a7/test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java#L1202-L1230
Correct, my bad just pasted a wrong example!, Do you see any value in
> @jatin-bhateja
>
> > Thanks, even though newly added test now passes at all AVX and SSE level can you kindly investigate why should following be vectorized with un-aligned accesses when it carries a cross iteration true dependency with distance 4.
>
> The cyclic dependency is at a distance of 3, not 4 in this example. Ints are 4 bytes. Thus, the `byte_offset` is 12 bytes. So if `MaxVectorSize <= 12`, we cannot ever have a cyclic dependency within a vector.
>
> See my explanations at the beginning of the test file, for example:
>
> https://github.com/openjdk/jdk/blob/fb7f6dd9fbc2a6086d2ad36e0681fbc9eff6c9a7/test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java#L49-L57
My bad, just pasted a wrong example, do you really see any value in generating tests for synthetic vector sizes where MaxVectorSize is a non-power of two. Lets remove them to reduce noise ?
-------------
PR: https://git.openjdk.org/jdk/pull/12350
More information about the hotspot-compiler-dev
mailing list