RFR: 8332163: C2 SuperWord: refactor PacksetGraph and SuperWord::output into VTransformGraph [v10]
Emanuel Peter
epeter at openjdk.org
Wed Jul 3 14:10:29 UTC 2024
On Wed, 3 Jul 2024 13:54:46 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> The original PR was [here](https://github.com/openjdk/jdk/pull/19261), but it got too chaotic.
>>
>> I added some extra tests for this in: https://github.com/openjdk/jdk/pull/19558
>> I extracted some refactorings to: https://github.com/openjdk/jdk/pull/19573
>>
>> We used to have:
>> - `PacksetGraph`: this detects cycles introduced by packs, and schedules/reorders the memops.
>> - `SuperWord::apply_vectorization`: creates `VectorNodes` directly from the `PackSet`.
>>
>> In my blog, I have published lots of ideas for SuperWord / AutoVectorization improvements:
>> https://eme64.github.io/blog/2023/11/03/C2-AutoVectorizer-Improvement-Ideas.html
>>
>> Many ideas are based on the "VectorTransform IR": cost-model, if-conversion, direct widening of scalars to vectors, additional optimizations/features with shuffle/pack/extract, handling more reduction patterns, etc.
>>
>> I have now decided to name it `VTransform`, which is essentially a graph (`VTransformGraph`) of nodes (`VTransformNode`) that resemble the C2 `Node` on purpose, because the `VTransform` models the C2 graph after vectorization. We can now model the transformation from scalar loop to vectorized loop without modifying the C2 graph yet.
>>
>> The new code has these steps:
>> - Given the `PackSet` from `SuperWord`, we create a `VTransformGraph` with `SuperWordVTransformBuilder`.
>> - [Not yet: all sorts of optimizations / checks on the `VTransformGraph`, in future RFEs]
>> - We then schedule the `VTransformGraph`, and check for cycles.
>> - Once we are ready to commit to vectorization, we call `VTransformGraph::apply_vectorization` which lets each individual `VTransformNode::apply` generate the new vectorized C2 nodes.
>>
>> **Testing**
>>
>> Regression testing passed.
>>
>> Performance testing: no significant change in performance (as expected).
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
> tabs and includes
src/hotspot/share/opto/superword.cpp line 40:
> 38: #include "opto/vectornode.hpp"
> 39: #include "opto/movenode.hpp"
> 40: #include "utilities/powerOfTwo.hpp"
Note: I was able to remove some includes. Some were probably already unnecessary before this change, and others became unnecessary because I removed logic or moved it to other files.
src/hotspot/share/opto/superwordVTransformBuilder.cpp line 28:
> 26: #include "opto/vectornode.hpp"
> 27:
> 28: void SuperWordVTransformBuilder::build() {
Note: used to be `PacksetGraph::build`
src/hotspot/share/opto/superwordVTransformBuilder.cpp line 181:
> 179: // Either get existing vtnode vector input (when input is a pack), or else make a
> 180: // new one vector vtnode for the input (e.g. for Replicate or PopulateIndex).
> 181: VTransformNode* SuperWordVTransformBuilder::get_or_make_vtnode_vector_input_at_index(const Node_List* pack, const int index) {
Note: used to be `SuperWord::vector_opd`
src/hotspot/share/opto/superwordVTransformBuilder.hpp line 30:
> 28: #define SHARE_OPTO_SUPERWORD_VTRANSFORM_BUILDER_HPP
> 29:
> 30: // Facility class that builds a VTransform from a SuperWord PackSet.
Note: this takes over a part of the old `PacksetGraph` and `SuperWord::output`.
src/hotspot/share/opto/vectorization.hpp line 1332:
> 1330: VTransformBoolTest(const BoolTest::mask mask, bool is_negated) :
> 1331: _mask(mask), _is_negated(is_negated) {}
> 1332: };
Note: moved to `vtransform.hpp`
src/hotspot/share/opto/vtransform.cpp line 46:
> 44: //
> 45: // We return "true" IFF we find no cycle, i.e. if the linearization succeeds.
> 46: bool VTransformGraph::schedule() {
Note: used to be `PacksetGraph::schedule`. I now use a reverse-post-order algorithm; it used to be a topsort.
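To illustrate the reverse-post-order idea, here is a minimal standalone sketch. It is not the HotSpot code: `SketchGraph` and `schedule_rpo` are hypothetical names, and the graph is reduced to plain index lists. It only shows how a DFS post-order, reversed, gives a linearization, and how an edge back to a node that is still on the DFS stack reveals a cycle.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a graph of vtnodes: outs[i] lists the indices
// of the nodes that use the result of node i.
struct SketchGraph {
  std::vector<std::vector<int>> outs;
};

// Returns true iff the graph has no cycle. On success, "order" holds a
// linearization in which every node comes before all of its uses.
bool schedule_rpo(const SketchGraph& g, std::vector<int>& order) {
  enum State { Unvisited, OnStack, Done };
  const int n = static_cast<int>(g.outs.size());
  std::vector<State> state(n, Unvisited);
  std::vector<int> post_order;
  std::vector<std::pair<int, std::size_t>> stack; // (node, next out index)

  for (int root = 0; root < n; root++) {
    if (state[root] != Unvisited) { continue; }
    state[root] = OnStack;
    stack.emplace_back(root, 0);
    while (!stack.empty()) {
      const int node = stack.back().first;
      const std::size_t idx = stack.back().second;
      if (idx < g.outs[node].size()) {
        stack.back().second++; // advance before the stack may reallocate
        const int use = g.outs[node][idx];
        if (state[use] == OnStack) { return false; } // back edge: cycle
        if (state[use] == Unvisited) {
          state[use] = OnStack;
          stack.emplace_back(use, 0);
        }
      } else {
        // All uses are visited: the node is finished (post-order visit).
        state[node] = Done;
        post_order.push_back(node);
        stack.pop_back();
      }
    }
  }
  // Reversing the post-order places every node before its uses.
  order.assign(post_order.rbegin(), post_order.rend());
  return true;
}
```

The explicit stack avoids recursion on deep graphs; whether the real implementation traverses along uses or inputs is not something this sketch tries to reproduce.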
src/hotspot/share/opto/vtransform.hpp line 59:
> 57: // - Pack/Unpack/Shuffle: introduce additional nodes not present in the scalar loop.
> 58: // This is difficult to do with the SuperWord packset approach.
> 59: // - If-conversion: convert predicated nodes into CFG.
Note: please read this description, it explains the basic idea of the `VTransform`.
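For readers who want the shape of the idea before opening the file, here is a rough standalone sketch. All names (`SketchC2Graph`, `SketchVTNode`, `SketchVTGraph`, and so on) are hypothetical and heavily simplified; only the `apply` / `apply_vectorization` naming mirrors the PR description. The point is that the transform graph is built first as a separate model, and the underlying graph is only touched once every node's `apply` runs.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the C2 graph: just a log of emitted nodes.
struct SketchC2Graph {
  std::vector<std::string> emitted;
};

// Hypothetical analogue of a vtnode: it describes one node of the future
// vectorized loop, without modifying the underlying graph until apply().
class SketchVTNode {
public:
  virtual ~SketchVTNode() = default;
  virtual void apply(SketchC2Graph& c2) const = 0;
};

// A scalar node is carried over unchanged.
class SketchScalarNode : public SketchVTNode {
  std::string _name;
public:
  explicit SketchScalarNode(std::string name) : _name(std::move(name)) {}
  void apply(SketchC2Graph& c2) const override { c2.emitted.push_back(_name); }
};

// A vector node stands for a whole pack and emits one vector operation.
class SketchVectorNode : public SketchVTNode {
  std::string _op;
  int _lanes;
public:
  SketchVectorNode(std::string op, int lanes) : _op(std::move(op)), _lanes(lanes) {}
  void apply(SketchC2Graph& c2) const override {
    c2.emitted.push_back(_op + "V x" + std::to_string(_lanes));
  }
};

// The graph owns its nodes. In this sketch they are assumed to be added in
// an already valid schedule order.
class SketchVTGraph {
  std::vector<std::unique_ptr<SketchVTNode>> _nodes;
public:
  void add(std::unique_ptr<SketchVTNode> node) { _nodes.push_back(std::move(node)); }

  // Commit point: only here does the modeled transformation reach the graph.
  void apply_vectorization(SketchC2Graph& c2) const {
    for (const auto& node : _nodes) { node->apply(c2); }
  }
};
```

Because nothing is emitted until `apply_vectorization`, a caller can still abandon the transform (for example after finding a cycle) and leave the original graph untouched.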
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664232667
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664243632
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664247162
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664248990
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664250172
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664252012
PR Review Comment: https://git.openjdk.org/jdk/pull/19719#discussion_r1664254914