RFR: 8298935: fix independence bug in create_pack logic in SuperWord::find_adjacent_refs [v26]

Emanuel Peter epeter at openjdk.org
Wed Mar 15 08:15:36 UTC 2023


On Wed, 15 Mar 2023 04:50:48 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:

>> https://github.com/openjdk/jdk/pull/12350#issuecomment-1461681129
>> 
>>> So is there a **4th Bug** lurking here?
>> 
>> @vnkozlov I now found an example that reveals this **Bug 4**. I want to fix it in a separate Bug [JDK-8304042](https://bugs.openjdk.org/browse/JDK-8304042).
>> 
>> This is the method:
>> 
>>     static void test(int[] dataI1, int[] dataI2, float[] dataF1, float[] dataF2) {
>>         for (int i = 0; i < RANGE/2; i+=2) {
>>             dataF1[i+0] = dataI1[i+0] + 0.33f;            // 1
>>             dataI2[i+1] = (int)(11.0f * dataF2[i+1]);     // 2
>> 
>>             dataI2[i+0] = (int)(11.0f * dataF2[i+0]);     // 3
>>             dataF1[i+1] = dataI1[i+1] + 0.33f;            // 4
>>         }
>>     }
>> 
>> 
>> Note: `dataI1 == dataI2` and `dataF1 == dataF2`. I only had to use two references so that C2 does not know this, and does not optimize away load after store.
>> 
>> Lines 1 and 4 are `isomorphic` and `independent`. The same holds for line 2 and 3. We creates the packs `[1,4]` and `[2,3]`, and vectorize (with and without my patch). However, we have the following dependencies: `1->3` and `2->4`. This creates a cyclic dependency between the two packs.
>> 
>> As explained in the previous https://github.com/openjdk/jdk/pull/12350#issuecomment-1461681129, we have to verify that there are no cyclic dependencies between the packs, just before we schedule. The SuperWord paper states this in "3.7 Scheduling".
>
> @eme64, I ran with `-XX:-TieredCompilation -Xbatch -XX:CICompilerCount=1 -XX:+TraceNewVectors` on AVX512 linux machine our vectors jtregs tests (including `jdk/incubator/vector) and everything is fine except one test:
> 
> compiler/loopopts/superword/TestPickLastMemoryState.java`
> 
> With these changes we almost don't generate vectors (may be 2 per `@run`). Without changes we got about 160 (50 per `@run`) new vectors. It has several `@run` commands for different vector sizes.

@vnkozlov I'm looking into `TestPickLastMemoryState.java`

These are relevant vectorized methods:

- `f`: vectorize with with master, nothing with my patch.
- `reset`: both vectorize.
- `test1 - 6`: vectorize with master, nothing with patch.

Explanation:
 - `f`: `b[h - 1]` and `b[h]` misaligned.
 - `reset`: just a simple `VectorStore` with all zeros. Not much can go wrong here.
 - `test1`: `iArr[i1]` and `iArr[i1+1]` misaligned.
 - `test2`: `iArr[i1]` and `iArr[i1+2]` misaligned.
 - `test3`: `iArr[i1]` and `iArr[i1-2]` misaligned.
 - `test4`: `iArr[i1]` and `iArr[i1-3]` misaligned.
 - `test5`: `iArr[i1]` and `iArr[i1+2]` misaligned.
 - `test6`: `iArr[i1]` and `iArr[i1+1]` misaligned.

My first guess is that it means that we reject the misaligned slice during `find_adjacent_refs`. But then we re-introduce (some of) the memops during `extend_packlist`, because we do not only follow non-memops, but also memops. I call this "happy accident" before my fix, which should not be allowed. It can indeed lead to bugs on very similar examples. I now disallow these "happy accidents", I forbid the re-introduction of memops during `extend_packlist`.

TODO: I will look at all examples in detail now, and see if this guess is correct.

If this is correct, we should just use the `_do_vector_loop` in general, which would probably allow most if not all of these cases.

-------------

PR: https://git.openjdk.org/jdk/pull/12350


More information about the hotspot-compiler-dev mailing list