RFR: 8325252: C2 SuperWord: refactor the packset
Emanuel Peter
epeter at openjdk.org
Thu Mar 21 08:19:33 UTC 2024
On Wed, 13 Mar 2024 14:25:57 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> I'm refactoring the packset, separating the details of packset-manipulation from the SuperWord algorithm.
>
> Most importantly: I split it into two classes: `PairSet` and `PackSet`.
> `combine_pairs_to_longer_packs` converts the first into the second.
>
> I was able to simplify the combining, and remove the pack-sorting.
> I now walk "pair-chains" directly with `PairSetIterator`. One such pair-chain is equivalent to a pack.
>
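
To illustrate the claim that one pair-chain is equivalent to one pack, here is a minimal, self-contained sketch. It is not the HotSpot code: `ToyPairSet` and `combine_pairs_to_packs` are hypothetical names, nodes are plain `int` ids rather than `Node*`, and it assumes every node is the left element of at most one pair and the right element of at most one pair.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy stand-in for a pair set. Nodes are plain ints, not HotSpot Node*.
// All names are hypothetical and only mirror the idea described above.
struct ToyPairSet {
    std::map<int, int> left_to_right;  // pair (left, right), keyed by left
    std::map<int, int> right_to_left;  // same pairs, keyed by right

    // Assumption: a node is the left of at most one pair and the right of
    // at most one pair, so chains never branch or loop.
    void add_pair(int left, int right) {
        left_to_right[left] = right;
        right_to_left[right] = left;
    }

    // A chain starts at a node that is a left element but never a right one.
    bool is_chain_start(int n) const {
        return left_to_right.count(n) != 0 && right_to_left.count(n) == 0;
    }
};

// Walk every pair-chain from its start and emit it as one pack.
std::vector<std::vector<int>> combine_pairs_to_packs(const ToyPairSet& pairs) {
    std::vector<std::vector<int>> packs;
    for (const auto& entry : pairs.left_to_right) {
        int start = entry.first;
        if (!pairs.is_chain_start(start)) {
            continue;  // interior node, reached while walking some other chain
        }
        std::vector<int> pack{start};
        auto it = pairs.left_to_right.find(start);
        while (it != pairs.left_to_right.end()) {
            pack.push_back(it->second);
            it = pairs.left_to_right.find(it->second);
        }
        packs.push_back(pack);
    }
    return packs;
}
```

With the pairs `(1,2)`, `(2,3)`, `(3,4)` and `(10,11)`, this sketch produces two packs, one per chain, and needs no separate sorting step because each chain is already walked in order.
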
> I moved all the `filter / split` functionality to the `PackSet`, which allows hiding a lot of packset-manipulation from the SuperWord algorithm.
>
> I ran into some issues when I was extending the pairset in `extend_pairset_with_more_pairs_by_following_use_and_def`:
> Using the `PairSetIterator` changed the order of extension, and that interfered with the packing heuristic: quite a few examples no longer vectorized, because we would pack the wrong 2 nodes out of a choice of 4 (e.g. we would pack `ac bd` instead of `ab cd`). Hence, I still have to keep the insertion order for the pairs, which basically means we extend the pairset in BFS order. Maybe this restriction can be removed if I improve the packing heuristic with some look-ahead expansion approach (but that is for another day, see [JDK-8309908](https://bugs.openjdk.org/browse/JDK-8309908)).
>
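
The link between "keep the insertion order" and "BFS order" can be shown with a small sketch. This is only a toy under stated assumptions: `ToyPair`, `extend_in_insertion_order` and the `neighbors` map are hypothetical, and the map stands in for "pairs found by following uses and defs". It does not model the packing heuristic itself.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

using ToyPair = std::pair<int, int>;  // hypothetical (left, right) node ids

// Extend from a seed pair. The result vector doubles as the worklist:
// iterating it by index while appending at the end visits pairs in the
// order they were inserted, which is a breadth-first traversal.
std::vector<ToyPair> extend_in_insertion_order(
        ToyPair seed,
        const std::map<ToyPair, std::vector<ToyPair>>& neighbors) {
    std::vector<ToyPair> order{seed};
    for (std::size_t i = 0; i < order.size(); i++) {
        ToyPair current = order[i];  // copy: push_back below may reallocate
        auto it = neighbors.find(current);
        if (it == neighbors.end()) {
            continue;  // nothing to extend from this pair
        }
        for (const ToyPair& next : it->second) {
            // Only insert pairs that are not in the set yet.
            if (std::find(order.begin(), order.end(), next) == order.end()) {
                order.push_back(next);
            }
        }
    }
    return order;
}
```

If the seed `(0,1)` leads to `(2,3)` and `(4,5)`, and those lead to `(6,7)` and `(8,9)` respectively, the resulting order is `(0,1), (2,3), (4,5), (6,7), (8,9)`. A traversal driven by some other iteration order could visit `(6,7)` before `(4,5)`, which is the kind of reordering that changes which candidates a greedy heuristic sees first.
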
> Since I had already spent some time on parts of the packing heuristic (reordering and cost estimate), I did a light refactoring and added extra tests for MulAddS2I.
>
> More details are described in the annotations in the code.
src/hotspot/share/opto/superword.cpp line 1199:
> 1197: if (num_s1_uses > 1) {
> 1198: _race_possible = true;
> 1199: }
Note: I removed `_race_possible` and `num_s1_uses`. We checked here if there was any node in a pack that has multiple uses. If that happens, it is possible that `order_inputs_of_uses_to_match_def_pair` changes the order of inputs. This flag was set so that everything could be fixed up again afterwards. Now I simply always run that fix-up, which is equivalent to always setting `_race_possible = true`. This does not cost much, and makes things quite a bit simpler.
src/hotspot/share/opto/superword.cpp line 1309:
> 1307: auto adjacent_profit = [&] (Node* s1, Node* s2) { return 2; };
> 1308: auto pack_cost = [&] (int ct) { return ct; };
> 1309: auto unpack_cost = [&] (int ct) { return ct; };
Note: I moved these methods from the `SuperWord` class to lambdas in this method, since they were not used anywhere else.
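
As a rough illustration of the pattern (helpers that have a single caller become local lambdas), here is a self-contained sketch. The three lambdas return the same values as in the quoted lines, but `toy_cost_savings`, its parameters, and the way the values are combined are hypothetical and are not the formula used in `superword.cpp`; the `int` arguments stand in for `Node*`.

```cpp
#include <cassert>

// Hypothetical toy estimate, NOT the real SuperWord cost model.
// It only shows helpers living as lambdas inside their single caller.
int toy_cost_savings(bool inputs_already_packed, int uses_outside_pack) {
    // Local helpers, visible only inside this function.
    auto adjacent_profit = [] (int /*s1*/, int /*s2*/) { return 2; };
    auto pack_cost       = [] (int ct) { return ct; };
    auto unpack_cost     = [] (int ct) { return ct; };

    int savings = adjacent_profit(0, 1);  // toy node ids
    if (!inputs_already_packed) {
        savings -= pack_cost(2);          // assumed: two inputs need packing
    }
    savings -= unpack_cost(uses_outside_pack);  // assumed: one unpack per outside use
    return savings;
}
```

Keeping the helpers local means the class interface no longer exposes them, and a reader sees the cost constants right next to the only code that uses them.
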
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18276#discussion_r1533408297
PR Review Comment: https://git.openjdk.org/jdk/pull/18276#discussion_r1533409251