RFR: 8298935: fix cyclic dependency bug in create_pack logic in SuperWord::find_adjacent_refs

Emanuel Peter epeter at openjdk.org
Wed Mar 1 09:46:13 UTC 2023


On Mon, 13 Feb 2023 01:43:49 GMT, Fei Gao <fgao at openjdk.org> wrote:

>>> @fg1417 Do you have the possibility to test on arm32?
>> 
>> Sure. I'll do the testing with a 32-bit docker container on a 64-bit host.
>
>> > @fg1417 Do you have the possibility to test on arm32?
>> 
>> Sure. I'll do the testing with a 32-bit docker container on a 64-bit host.
> 
> The testing for tier 1 - 3 and jcstress looks good. No new failures on arm32. Thanks.

I have been discussing with @fg1417 over emails (thanks for the conversation).

@fg1417 voiced concern about my change here:
> Disallow extend_packlist from adding MemNodes back in. Because if we have rejected some memops, we do not want them to be added back in later.

This refers to:
https://github.com/openjdk/jdk/blob/8349906d63a5746383752091904bd1b1e75c83ae/src/hotspot/share/opto/superword.cpp#L1562-L1564

https://github.com/openjdk/jdk/blob/8349906d63a5746383752091904bd1b1e75c83ae/src/hotspot/share/opto/superword.cpp#L1603-L1612

Before my patch, the condition was only demanding the extension to be within the block, and did not exclude memops
(eg. it was `if (!in_bb(t1) || !in_bb(t2))  {`)

Before my patch, this means we may be adding back (resurrecting) memops in `extend_packlist` that were rejected in `find_adjacent_refs`. For example they were rejected for these reasons:

**1: +AlignVector requires all packs to have vector_with <= that of best_align_to_mem_ref**
`+AlignVector` requires that all packs have a `vector_width` smaller or equal of that of `best_align_to_mem_ref`.
https://github.com/openjdk/jdk/blob/8349906d63a5746383752091904bd1b1e75c83ae/src/hotspot/share/opto/superword.cpp#L773-L781
The issue here is that if we align the main-loop based on `best_align_to_mem_ref`.
https://github.com/openjdk/jdk/blob/8349906d63a5746383752091904bd1b1e75c83ae/src/hotspot/share/opto/superword.cpp#L2636-L2640
https://github.com/openjdk/jdk/blob/8349906d63a5746383752091904bd1b1e75c83ae/src/hotspot/share/opto/superword.cpp#L3998
If there is any other pack that now has a larger `vector_width` than that of the `best_align_to_mem_ref` we cannot ensure that it is properly aligned under `+AlignVector`.

It seems that the current assumption of the VM is that `+AlignVector` requires all vector memory ops to be aligned to their vector size. Of course many CPU's only have a 4-byte or 8-byte alignment requirement. On those platforms, our current alignment analysis is much too coarse and prevents SuperWord in many instances, unnecessarily.

One example that used to work under master, but not with my patch (**performance regression**), is this example (from `TestVectorizeTypeConversion.testConvI2D`):

    @Test
    @IR(counts = {IRNode.LOAD_VECTOR, ">0",
                  IRNode.VECTOR_CAST_I2X, ">0",
                  IRNode.STORE_VECTOR, ">0"})
    private static void testConvI2D(double[] d, int[] a) {
        for(int i = 0; i < d.length; i++) {
            d[i] = (double) (a[i]);
        }
    }

In this example, `best_align_to_mem_ref` is `StoreD, which on some machines may have a smaller `vector_width` than the `LoadI`, and hence gets rejected. But under master, it was added back in during `extend_packlist`.


**WIP, I will write more**

-------------

PR: https://git.openjdk.org/jdk/pull/12350


More information about the hotspot-compiler-dev mailing list