RFR: 8298935: fix independence bug in create_pack logic in SuperWord::find_adjacent_refs [v26]
Emanuel Peter
epeter at openjdk.org
Wed Mar 15 08:15:36 UTC 2023
On Wed, 15 Mar 2023 04:50:48 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> https://github.com/openjdk/jdk/pull/12350#issuecomment-1461681129
>>
>>> So is there a **4th Bug** lurking here?
>>
>> @vnkozlov I now found an example that reveals this **Bug 4**. I want to fix it in a separate Bug [JDK-8304042](https://bugs.openjdk.org/browse/JDK-8304042).
>>
>> This is the method:
>>
>> static void test(int[] dataI1, int[] dataI2, float[] dataF1, float[] dataF2) {
>> for (int i = 0; i < RANGE/2; i+=2) {
>> dataF1[i+0] = dataI1[i+0] + 0.33f; // 1
>> dataI2[i+1] = (int)(11.0f * dataF2[i+1]); // 2
>>
>> dataI2[i+0] = (int)(11.0f * dataF2[i+0]); // 3
>> dataF1[i+1] = dataI1[i+1] + 0.33f; // 4
>> }
>> }
>>
>>
>> Note: `dataI1 == dataI2` and `dataF1 == dataF2`. I only had to use two references so that C2 does not know this, and does not optimize away load after store.
>>
>> Lines 1 and 4 are `isomorphic` and `independent`. The same holds for line 2 and 3. We creates the packs `[1,4]` and `[2,3]`, and vectorize (with and without my patch). However, we have the following dependencies: `1->3` and `2->4`. This creates a cyclic dependency between the two packs.
>>
>> As explained in the previous https://github.com/openjdk/jdk/pull/12350#issuecomment-1461681129, we have to verify that there are no cyclic dependencies between the packs, just before we schedule. The SuperWord paper states this in "3.7 Scheduling".
>
> @eme64, I ran with `-XX:-TieredCompilation -Xbatch -XX:CICompilerCount=1 -XX:+TraceNewVectors` on AVX512 linux machine our vectors jtregs tests (including `jdk/incubator/vector) and everything is fine except one test:
>
> compiler/loopopts/superword/TestPickLastMemoryState.java`
>
> With these changes we almost don't generate vectors (may be 2 per `@run`). Without changes we got about 160 (50 per `@run`) new vectors. It has several `@run` commands for different vector sizes.
@vnkozlov I'm looking into `TestPickLastMemoryState.java`
These are relevant vectorized methods:
- `f`: vectorize with with master, nothing with my patch.
- `reset`: both vectorize.
- `test1 - 6`: vectorize with master, nothing with patch.
Explanation:
- `f`: `b[h - 1]` and `b[h]` misaligned.
- `reset`: just a simple `VectorStore` with all zeros. Not much can go wrong here.
- `test1`: `iArr[i1]` and `iArr[i1+1]` misaligned.
- `test2`: `iArr[i1]` and `iArr[i1+2]` misaligned.
- `test3`: `iArr[i1]` and `iArr[i1-2]` misaligned.
- `test4`: `iArr[i1]` and `iArr[i1-3]` misaligned.
- `test5`: `iArr[i1]` and `iArr[i1+2]` misaligned.
- `test6`: `iArr[i1]` and `iArr[i1+1]` misaligned.
My first guess is that it means that we reject the misaligned slice during `find_adjacent_refs`. But then we re-introduce (some of) the memops during `extend_packlist`, because we do not only follow non-memops, but also memops. I call this "happy accident" before my fix, which should not be allowed. It can indeed lead to bugs on very similar examples. I now disallow these "happy accidents", I forbid the re-introduction of memops during `extend_packlist`.
TODO: I will look at all examples in detail now, and see if this guess is correct.
If this is correct, we should just use the `_do_vector_loop` in general, which would probably allow most if not all of these cases.
-------------
PR: https://git.openjdk.org/jdk/pull/12350
More information about the hotspot-compiler-dev
mailing list