RFR: 8298935: fix cyclic dependency bug in create_pack logic in SuperWord::find_adjacent_refs [v15]

Jatin Bhateja jbhateja at openjdk.org
Mon Mar 6 10:05:25 UTC 2023


On Mon, 6 Mar 2023 09:01:49 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> @jatin-bhateja 
>>> Thanks, even though newly added test now passes at all AVX and SSE level can you kindly investigate why should following be vectorized with un-aligned accesses when it carries a cross iteration true dependency with distance 4.
>> 
>> The cyclic dependency is at a distance of 3, not 4 in this example. Ints are 4 bytes. Thus, the `byte_offset` is 12 bytes. So if `MaxVectorSize <= 12`, we cannot ever have a cyclic dependency within a vector.
>> 
>> See my explanations at the beginning of the test file, for example:
>> https://github.com/openjdk/jdk/blob/fb7f6dd9fbc2a6086d2ad36e0681fbc9eff6c9a7/test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java#L49-L57
>
> You can see that the rules for distance 4 are adjusted, if you look at `testIntP4`:
> https://github.com/openjdk/jdk/blob/fb7f6dd9fbc2a6086d2ad36e0681fbc9eff6c9a7/test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java#L1202-L1230

Correct, my bad just pasted a wrong example!,  Do you see any value in 

> @jatin-bhateja
> 
> > Thanks, even though newly added test now passes at all AVX and SSE level can you kindly investigate why should following be vectorized with un-aligned accesses when it carries a cross iteration true dependency with distance 4.
> 
> The cyclic dependency is at a distance of 3, not 4 in this example. Ints are 4 bytes. Thus, the `byte_offset` is 12 bytes. So if `MaxVectorSize <= 12`, we cannot ever have a cyclic dependency within a vector.
> 
> See my explanations at the beginning of the test file, for example:
> 
> https://github.com/openjdk/jdk/blob/fb7f6dd9fbc2a6086d2ad36e0681fbc9eff6c9a7/test/hotspot/jtreg/compiler/loopopts/superword/TestDependencyOffsets.java#L49-L57

My bad, just pasted a wrong example, do you really see any value in generating tests for synthetic vector sizes where MaxVectorSize is a non-power of two. Lets remove them to reduce noise ?

-------------

PR: https://git.openjdk.org/jdk/pull/12350


More information about the hotspot-compiler-dev mailing list