RFR: 8325252: C2 SuperWord: refactor the packset

Emanuel Peter epeter at openjdk.org
Thu Mar 21 08:15:28 UTC 2024


On Wed, 13 Mar 2024 14:25:57 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> I'm refactoring the packset, separating the details of packset-manipulation from the SuperWord algorithm.
> 
> Most importantly: I split it into two classes: `PairSet` and `PackSet`.
> `combine_pairs_to_longer_packs` converts the first into the second.
> 
> I was able to simplify the combining, and remove the pack-sorting.
> I now walk "pair-chains" directly with `PairSetIterator`. One such pair-chain is equivalent to a pack.
> 
> I moved all the `filter / split` functionality to the `PackSet`, which allows hiding a lot of packset-manipulation from the SuperWord algorithm.
> 
> I ran into some issues when I was extending the pairset in `extend_pairset_with_more_pairs_by_following_use_and_def`:
> Using the `PairSetIterator` changed the order of extension, which interfered with the packing heuristic: quite a few examples no longer vectorized, because we would pack the wrong 2 nodes out of a choice of 4 (e.g. we would pack `ac bd` instead of `ab cd`). Hence, I still have to keep the insertion order for the pairs, which basically means we extend in BFS order. Maybe this restriction can be removed if I improve the packing heuristic with some look-ahead expansion approach (but that is for another day, see [JDK-8309908](https://bugs.openjdk.org/browse/JDK-8309908)).
> 
> But since I already spent some time on parts of the packing heuristic (reordering and cost estimate), I did a light refactoring, and added extra tests for MulAddS2I.
> 
> More details are described in the annotations in the code.
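
The pair-chain idea from the description above can be sketched in isolation. This is a minimal toy model, not the HotSpot code: `ToyPairSet` and `toy_combine_pairs` are hypothetical names, nodes are plain `int`s, and it assumes that a node is the left element of at most one pair and the right element of at most one pair.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical, simplified stand-in for a pairset: nodes are plain ints.
// Assumption: each node is the left element of at most one pair and the
// right element of at most one pair.
struct ToyPairSet {
  std::map<int, int> right_of;  // left  -> right
  std::map<int, int> left_of;   // right -> left

  void add_pair(int left, int right) {
    // Enforce the assumed invariant.
    assert(right_of.count(left) == 0 && left_of.count(right) == 0);
    right_of[left] = right;
    left_of[right] = left;
  }
};

// Walk every pair-chain from its head (a left element that is not the right
// element of any pair) and collect the chain as one pack.
std::vector<std::vector<int>> toy_combine_pairs(const ToyPairSet& pairs) {
  std::vector<std::vector<int>> packs;
  for (const auto& entry : pairs.right_of) {
    int head = entry.first;
    if (pairs.left_of.count(head) != 0) { continue; }  // not a chain head
    std::vector<int> pack{head};
    auto it = pairs.right_of.find(head);
    while (it != pairs.right_of.end()) {
      pack.push_back(it->second);            // append the right element
      it = pairs.right_of.find(it->second);  // and follow the chain
    }
    packs.push_back(pack);
  }
  return packs;
}
```

In this toy model, the pairs `(1,2)`, `(2,3)`, `(3,4)` form one chain and become a single pack of four nodes, while an unrelated pair `(10,11)` becomes its own pack of two. No sorting step is needed, because each chain is already in left-to-right order.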

src/hotspot/share/opto/superword.cpp line 1047:

> 1045:       Node* s2 = _pairset.right_at(i);
> 1046:       changed |= extend_pairset_with_more_pairs_by_following_def(s1, s2);
> 1047:       changed |= extend_pairset_with_more_pairs_by_following_use(s1, s2);

Note: I could not use the `PairSetIterator` here, because it changed the order of extension. Keeping the old insertion order leads to something closer to a BFS, which seems to be more successful on the IR tests we have. That does not mean that this is an optimal solution. After all, SuperWord is a greedy algorithm. I hope to implement something more optimal in the future.
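
To illustrate why the order matters for a greedy approach, here is a small toy sketch. It is not the SuperWord heuristic: `greedy_pairs` is a hypothetical helper, nodes are single characters, and the only rule is the assumed invariant that a node may be the left element of at most one pair and the right element of at most one pair.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

using Pair = std::pair<char, char>;

// Greedily accept candidate pairs in the given order. A candidate is rejected
// if its left node is already a left element, or its right node is already a
// right element, of an earlier accepted pair (assumed, simplified invariant).
std::string greedy_pairs(const std::vector<Pair>& candidates) {
  std::set<char> used_left;
  std::set<char> used_right;
  std::string accepted;
  for (const Pair& p : candidates) {
    if (used_left.count(p.first) != 0 || used_right.count(p.second) != 0) {
      continue;  // conflicts with an already accepted pair
    }
    used_left.insert(p.first);
    used_right.insert(p.second);
    if (!accepted.empty()) { accepted += ' '; }
    accepted += p.first;
    accepted += p.second;
  }
  return accepted;
}
```

With the candidates visited as `ab, cd, ac, bd` this yields `ab cd`, while visiting the same candidates as `ac, bd, ab, cd` yields `ac bd`. The set of candidates is identical, only the visiting order differs, which is the kind of sensitivity described in the note above.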

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/18276#discussion_r1533404492

