RFR: 8315361: C2 SuperWord: refactor out loop analysis into shared auto-vectorization facility VLoopAnalyzer [v12]
Emanuel Peter
epeter at openjdk.org
Mon Jan 22 16:11:02 UTC 2024
> This is a refactoring of `SuperWord`.
>
> **Goals**
>
> 1. Clean up `SuperWord`: disentangle different components, make them more **modular**.
> 2. Make the loop analysis parts a **shared facility**, not just for SuperWord but also the post-loop-vectorizer ([JDK-8308994](https://bugs.openjdk.org/browse/JDK-8308994)).
> 3. It is also a necessary step toward my bigger plans for improving the C2 Auto-Vectorizer ([see my blog post](https://eme64.github.io/blog/2023/11/03/C2-AutoVectorizer-Improvement-Ideas.html)).
> 4. Improve tracing in the auto-vectorization by making it more systematic.
>
> **Summary**
>
> - I wrote a summary of how C2 auto-vectorization with SuperWord works (please read!):
> https://github.com/openjdk/jdk/blob/95fd361e60fc66eb91edad321662e508b2d1bdde/src/hotspot/share/opto/superword.hpp#L32-L177
> - I moved many `SuperWord` components out to `VLoop` and its subclass `VLoopAnalyzer`. The idea is that any vectorizer can use these facilities in the future. They are therefore made more modular, which should make future changes easier. These components are:
>   - Checking the pre-conditions for vectorization (e.g. no unwanted ctrl-flow).
>     - `VLoop::check_preconditions_helper` replaces code from old `SuperWord::transform_loop`.
>   - Running all submodules of `VLoopAnalyzer`: `VLoopAnalyzer::analyze_helper`. Replaces analysis part of `SuperWord::SLP_extract`.
>   - Finding and marking reductions -> `VLoopReductions`
>   - Detecting memory slices -> `VLoopMemorySlices`
>   - Analyzing the body -> `VLoopBody` (renamed `in_bb` -> `in_body`)
>   - Determining vector element types, and functions to determine the `vector_width` of a node -> `VLoopTypes`
>   - Constructing the dependence graph -> `VLoopDependenceGraph`. Replaces old `DepGraph` with all its components.
> - New: CompileCommand option `TraceAutovectorization`
>   - Run with `-XX:CompileCommand=traceAutovectorization,*::*,help` to get a usage description.
>   - Replaced all printing that was guarded by the flags `TraceSuperWord` (and `Verbose`) and by `VectorizeDebug`.
>   - The advantage of a CompileCommand is that tracing can be applied selectively to a limited set of Java classes / methods.
>   - It uses tags, which are more readable than the `VectorizeDebug` bit-flags. These tags can be used for all parts of the vectorizer, but one can also target SuperWord specifically.
>   - I systematically added tracing at every point where vectorization (partially) fails (use tag `SW_REJECTIONS`).
>   - `TraceSuperWord` still works, and perfo...
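
To make the selective tracing described above concrete, here is a minimal sketch of a test program one might trace. The class `VectorizeExample` and its methods are hypothetical and not part of the PR. The command line in the comment assumes a debug build of the JVM and assumes that a tag such as `SW_REJECTIONS` is passed in the same position as `help` in the usage example quoted above; neither assumption is confirmed by this mail.

```java
// Hypothetical example class; the names are illustrative and not taken from the PR.
//
// Assumed invocation (debug JVM build; tag position inferred from the `help` example):
//   java -XX:CompileCommand=traceAutovectorization,VectorizeExample::*,SW_REJECTIONS VectorizeExample
public class VectorizeExample {

    // Element-wise add over arrays: a typical candidate loop for SuperWord.
    public static void add(int[] a, int[] b, int[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
    }

    // Sum of an array: a reduction, the kind of pattern the reduction analysis looks for.
    public static int sum(int[] a) {
        int s = 0;
        for (int i = 0; i < a.length; i++) {
            s += a[i];
        }
        return s;
    }

    public static void main(String[] args) {
        int[] a = new int[1024];
        int[] b = new int[1024];
        int[] c = new int[1024];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        int total = 0;
        // Repeat the calls so that the methods become hot enough for C2 to compile them.
        for (int iter = 0; iter < 20_000; iter++) {
            add(a, b, c);
            total = sum(c);
        }
        System.out.println(total);
    }
}
```

Restricting the method pattern to `VectorizeExample::*` instead of `*::*` is what keeps the trace limited to these two loops.
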
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
remove SuperWord::init, and reserve space in data structures
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/16620/files
- new: https://git.openjdk.org/jdk/pull/16620/files/b05444dc..30ef793b
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=16620&range=11
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=16620&range=10-11
Stats: 63 lines in 3 files changed: 7 ins; 18 del; 38 mod
Patch: https://git.openjdk.org/jdk/pull/16620.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16620/head:pull/16620
PR: https://git.openjdk.org/jdk/pull/16620