RFR: 8332163: C2 SuperWord: refactor PacksetGraph and SuperWord::output into VTransformGraph

Emanuel Peter epeter at openjdk.org
Tue Jul 2 13:06:24 UTC 2024


On Mon, 1 Jul 2024 09:10:20 GMT, Christian Hagedorn <chagedorn at openjdk.org> wrote:

>> The original PR was [here](https://github.com/openjdk/jdk/pull/19261), it got too chaotic.
>> 
>> I added some extra tests for this in: https://github.com/openjdk/jdk/pull/19558
>> I extracted some refactorings to: https://github.com/openjdk/jdk/pull/19573
>> 
>> We used to have:
>> - `PacksetGraph`: this detects cycles introduced by packs, and schedules/reorders the memops.
>> - `SuperWord::apply_vectorization`: creates `VectorNodes` directly from the `PackSet`.
>> 
>> In my blog, I have published lots of ideas for SuperWord / AutoVectorization improvements:
>> https://eme64.github.io/blog/2023/11/03/C2-AutoVectorizer-Improvement-Ideas.html
>> 
>> Many ideas are based on the "VectorTransform IR": cost-model, if-conversion, direct widening of scalars to vectors, additional optimizations/features with shuffle/pack/extract, handling more reduction patterns, etc.
>> 
>> I have now decided to name it `VTransform`, which is essentially a graph `VTransformGraph` of nodes `VTransformNodes` that resemble the C2 Node on purpose, because the `VTransform` models the C2 graph after vectorization. We can now model the transformation from scalar-loop to vectorized-loop without modifying the C2 graph yet.
>> 
>> The new code has these steps:
>> - Given the `PackSet` from `SuperWord`, we create a `VTransformGraph` with `SuperWordVTransformBuilder`.
>> - [Not yet: all sorts of optimizations / checks on the `VTransformGraph`, in future RFEs]
>> - We then schedule the `VTransformGraph`, and check for cycles.
>> - Once we are ready to commit to vectorization, we call `VTransformGraph::apply_vectorization` which lets each individual `VTransformNode::apply` generate the new vectorized C2 nodes.
>> 
>> **Testing**
>> 
>> Regression testing passed.
>> 
>> Performance testing: no significant change in performance (as expected).
>
> src/hotspot/share/opto/vectorization.hpp line 1411:
> 
>> 1409:   bool _is_trace_align_vector;
>> 1410:   bool _is_trace_info;
>> 1411:   bool _is_trace_verbose;
> 
> Can these somehow be made `const`?

The tricky part is the initialization: I currently do it in the constructor body. If the fields are `const`, I have to initialize them in the member initializer list instead. But I would like to use a local variable `bool is_trace = _vloop.vtrace().is_trace(TraceAutoVectorizationTag::VTRANSFORM);`, and local variables can only be declared in the constructor body, which runs after the members have been initialized. I thought of a few alternative solutions, but all of them are nasty in different ways.
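
To illustrate the constraint, here is a minimal sketch of one possible alternative: a private delegating constructor that receives the would-be local variable as a parameter, so the `const` members can be set in the initializer list. This is not the actual HotSpot code. `TraceSource`, `Tag`, `VTransformTraceSketch`, and the way the flags are combined with `||` are hypothetical stand-ins chosen only to make the example self-contained.

```cpp
// Hypothetical stand-in for the trace tags; not the real TraceAutoVectorizationTag.
enum class Tag { VTRANSFORM, ALIGN_VECTOR, INFO, VERBOSE };

// Hypothetical stand-in for the object that answers "is this tag enabled?".
struct TraceSource {
  bool vtransform;
  bool align_vector;
  bool info;
  bool verbose;

  bool is_trace(Tag tag) const {
    switch (tag) {
      case Tag::VTRANSFORM:   return vtransform;
      case Tag::ALIGN_VECTOR: return align_vector;
      case Tag::INFO:         return info;
      case Tag::VERBOSE:      return verbose;
    }
    return false;
  }
};

class VTransformTraceSketch {
private:
  const bool _is_trace_align_vector;
  const bool _is_trace_info;
  const bool _is_trace_verbose;

  // Delegation target: "is_trace" plays the role of the local variable.
  // It is computed once by the public constructor and is visible to every
  // member initializer here.
  // Assumption: the general flag simply enables all the specific ones.
  VTransformTraceSketch(const TraceSource& src, bool is_trace) :
    _is_trace_align_vector(is_trace || src.is_trace(Tag::ALIGN_VECTOR)),
    _is_trace_info        (is_trace || src.is_trace(Tag::INFO)),
    _is_trace_verbose     (is_trace || src.is_trace(Tag::VERBOSE)) {}

public:
  // Public constructor: evaluates the shared value, then delegates.
  explicit VTransformTraceSketch(const TraceSource& src) :
    VTransformTraceSketch(src, src.is_trace(Tag::VTRANSFORM)) {}

  bool is_trace_align_vector() const { return _is_trace_align_vector; }
  bool is_trace_info()         const { return _is_trace_info; }
  bool is_trace_verbose()      const { return _is_trace_verbose; }
};
```

The cost of this variant is the extra private constructor, which is part of why I would call it nasty: the shared value is evaluated only once, but the class gets a second constructor just to carry it.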

@chhagedorn Do you have a good idea?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1662478663

