RFR: 8315361: C2: Create a superclass of SuperWord [v2]

Wed Nov 1 07:05:00 UTC 2023

On Wed, 1 Nov 2023 03:14:17 GMT, Fei Gao <fgao at openjdk.org> wrote:

>> Hi @fg1417 I gave it a quick look, generally looks very reasonable.
>> 
>> My question: should we not also move the `velt` (vector element type) and maybe even the dependence graph parts?
>> If I remember right, then the post-loop-vectorizer also needs some way to determine element type (`_elem_bt` [here](https://github.com/openjdk/jdk/pull/14581)), but not sure about dependence graph. And also the memory slices may be helpful for all vectorizers. We could also have more classes that have different parts of these features (eg. `Vectorizer` with only basic functionality, and then a `VectorizerWithDependencyGraph`).
>
>> My question: should we not also move the `velt` (vector element type) and maybe even the dependence graph parts? If I remember right, then the post-loop-vectorizer also needs some way to determine element type (`_elem_bt` [here](https://github.com/openjdk/jdk/pull/14581)), but not sure about dependence graph. And also the memory slices may be helpful for all vectorizers. We could also have more classes that have different parts of these features (eg. `Vectorizer` with only basic functionality, and then a `VectorizerWithDependencyGraph`).
> 
> Hi @eme64, thanks for your review!
> 
> Yes, and both SuperWord and post-loop-vectorizer need to determine vector element type. There are still some deficiencies in current [`compute_vector_element_type()`](https://github.com/openjdk/jdk/blob/b3fec6b5f32c338ae1a84dd20bdcbd3d9b7186f3/src/hotspot/share/opto/superword.cpp#L3350) of SuperWord. For example, some loop cases of subword types, including byte and short, can't be recognized very well. See the description in https://github.com/openjdk/jdk/pull/7954. Maybe the main idea of [`find_vector_element_types()`](https://github.com/openjdk/jdk/blob/bd1b939b21a23675fd072b91d4eab538ff6d2a7d/src/hotspot/share/opto/vmaskloop.cpp#L318) proposed in post-loop-vectorizer can be improved to solve it?
> 
> About the dependence graph part, SuperWord works better than post-loop-vectorizer, because it can support more cases like reading-forward, after your great refactoring 👍 . But now, post-loop-vectorizer can’t support any cases with index offset.
> 
> So, maybe both two parts need more complex refactoring and improving, before they can be shared well by two vectorizers, not like code parts in this patch.
> 
> Thanks!

@fg1417 thanks for the answer. I think I'd be ok with doing only the part you did for now. We can do the `velt` part separately. But I would prefer that we do not integrate the post-loop-vectorizer until we have found a way to unify the `velt` logic.

I think it would be nice to find a solution to what you did in https://github.com/openjdk/jdk/pull/7954. I wonder if we can do some of the work already during IGVN, and simplify the graph (best option if possible). If not, then we would need some sort of pre-processing for the AutoVectorizer, and either modify the graph (maybe not such a good idea) or somehow have a separate mapping of the graph/copy (more overhead / complexity).

How does the post-loop-vectorizer deal with all of this in the current draft?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16353#issuecomment-1788506642