RFR: 8298935: fix cyclic dependency bug in create_pack logic in SuperWord::find_adjacent_refs [v9]

Emanuel Peter epeter at openjdk.org
Mon Feb 27 16:13:30 UTC 2023


On Fri, 24 Feb 2023 15:29:14 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:

>> @jatin-bhateja Ok, I have reconsidered it. I will add some `SuperWordMaxVectorSize` and `AlignVector` combinations. But I will do it in a separate file, and always have CompileCommand directive `Vectorize` enabled (`_do_vector_loop == true`). I might refactor `TestOptionVectorizeIR.java` for that.
>> Let me know if you find it essential to have the tests also with `_do_vector_loop == false`.
>
> Sounds good.

@jatin-bhateja I am trying to script-generate lots of test functions, and the `IR` rules. But this is a bit tricky.

I'm using the types `byte, char, short, int, long, float, double` with offsets in the range `-129 ... 0 ... 129`.
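To illustrate the kind of script generation meant here, a minimal sketch in Java follows. It is not the actual generator from the PR: the class name `OffsetTestGenerator`, the method naming scheme (`testIntM3`, `testIntP3`) and the loop body are assumptions made for illustration only. Only the type list and the offset range are taken from the text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator sketch; names and loop shape are illustrative assumptions.
class OffsetTestGenerator {
    // Element types and offset range as listed in the mail.
    static final List<String> TYPES =
        List.of("byte", "char", "short", "int", "long", "float", "double");
    static final int MIN_OFFSET = -129;
    static final int MAX_OFFSET = 129;

    // Build a method name such as testIntM3 (offset -3) or testIntP3 (offset +3).
    static String methodName(String type, int offset) {
        String cap = Character.toUpperCase(type.charAt(0)) + type.substring(1);
        return "test" + cap + (offset < 0 ? "M" + (-offset) : "P" + offset);
    }

    // Emit the source of one test method. The loop bounds are chosen so that
    // both a[i] and a[i + offset] stay inside the array.
    static String generate(String type, int offset) {
        int start = Math.max(0, -offset);  // keeps i + offset >= 0
        int margin = Math.max(0, offset);  // keeps i + offset < a.length
        return "static void " + methodName(type, offset) + "(" + type + "[] a) {\n"
             + "    for (int i = " + start + "; i < a.length - " + margin + "; i++) {\n"
             + "        a[i + (" + offset + ")] = a[i]; // placeholder loop body\n"
             + "    }\n"
             + "}\n";
    }

    // Emit one method per (type, offset) combination.
    static List<String> generateAll() {
        List<String> methods = new ArrayList<>();
        for (String type : TYPES) {
            for (int offset = MIN_OFFSET; offset <= MAX_OFFSET; offset++) {
                methods.add(generate(type, offset));
            }
        }
        return methods;
    }

    public static void main(String[] args) {
        System.out.println(generate("int", -3));
        System.out.println("generated methods: " + generateAll().size());
    }
}
```

The matching `IR` rules are left out of this sketch on purpose, since they depend on the platform details discussed below.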

I need to ensure that we have enough elements per vector. For that, I need to ensure the loop is unrolled sufficiently. Even if I set `LoopMaxUnroll` extremely high, we often stop unrolling earlier than I expected.
In `SuperWord::unrolling_analysis` we figure out how many elements of a type fit into a vector, and that sets the `slp_max_unroll`. For that, we ask `Matcher::vector_width_in_bytes(BasicType bt)`. The result is limited by `SuperWordMaxVectorSize`, but it also depends on hardware features such as `UseAVX` or `VM_Version::supports_avx512bw`.
For the Arm platforms (arm32 and aarch64) it seems a bit easier: it just depends on `MaxVectorSize`.
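The "elements per vector" arithmetic can be sketched as follows. This is a simplified model in Java, not HotSpot code: the class `VectorLanes` and its methods are hypothetical, and the vector width is passed in as a plain parameter. In the VM the width comes from `Matcher::vector_width_in_bytes` and can differ per type and per CPU feature set, which is exactly the complication described above.

```java
// Simplified model: elements per vector = vector width in bytes / element size in bytes.
// The vector width is an input here; in HotSpot it is platform- and type-dependent.
class VectorLanes {
    // Size in bytes of one element of the given Java primitive type.
    static int elementSizeInBytes(String type) {
        switch (type) {
            case "byte":   return Byte.BYTES;
            case "char":   return Character.BYTES;
            case "short":  return Short.BYTES;
            case "int":    return Integer.BYTES;
            case "long":   return Long.BYTES;
            case "float":  return Float.BYTES;
            case "double": return Double.BYTES;
            default: throw new IllegalArgumentException("unsupported type: " + type);
        }
    }

    // Number of elements of the given type that fit into one vector of the given width.
    static int elementsPerVector(String type, int vectorWidthInBytes) {
        return vectorWidthInBytes / elementSizeInBytes(type);
    }

    public static void main(String[] args) {
        String[] types = {"byte", "char", "short", "int", "long", "float", "double"};
        int[] widths = {16, 32, 64}; // assumed example widths in bytes
        for (int width : widths) {
            for (String type : types) {
                System.out.println(width + " bytes, " + type + ": "
                                   + elementsPerVector(type, width) + " elements");
            }
        }
    }
}
```

Under this model, a loop over `int` with a 32-byte vector needs at least 8 unrolled iterations to fill one vector, and a loop over `byte` with a 64-byte vector needs 64.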

I fear that I will basically have to write one `IR` rule per `UseAVX` value. Maybe for the Arm platforms it's a bit easier.
I think I will limit the `IR` rules to `x64` and `aarch64`. The other platforms can at least verify the values.

Please let me know if you have a better idea @jatin-bhateja !

-------------

PR: https://git.openjdk.org/jdk/pull/12350


More information about the hotspot-compiler-dev mailing list