RFR: 8343685: C2 SuperWord: refactor VPointer with MemPointer

Emanuel Peter epeter at openjdk.org
Mon Dec 2 13:42:02 UTC 2024


I know this one is large, but it consists of a lot of renamings, and new tests. On the whole, the new `VPointer` code is less than the old!

**Goal**

Replace old `VPointer` with a new version that relies on `MemPointer` - which then is a shared utility for both `MergeStores` and `SuperWord / AutoVectorization`. `MemPointer` generally parses pointers, and `VPointer` specializes this facility for the use in loops (`VLoop`).

The old `VPointer` implementation with its recursive pattern matching was quite complicated and difficult to reason about for correctness. The approach in `MemPointer` is much simpler: iteratively decomposing sub-expressions. Further: the new implementation is more powerful at detecting equivalent invariants.

**Future**:  with the `MemPointer` implementation of `VPointer`, it should be easier to implement speculative runtime-checks for Aliasing-Analysis [JDK-8324751](https://bugs.openjdk.org/browse/JDK-8324751). The pressing need for this has come from the FFM / MemorySegment folks, like @mcimadamore and @minborg .

**Details**

This looks like a rather big patch, so let me explain the parts.
- Refactor of `MemPointer` in `mepointer.hpp/cpp`:
  - Added concept of `Base` to `MemPointer`. This is required for the aliasing computation in `VPointer`.
  - `sub_expression_has_native_base_candidate`: add special case to parse through `CastX2P` if we find a native memory base `MemorySegment.address()`, i.e. `jdk.internal.foreign.NativeMemorySegmentImpl.min`. This helps some native memory segment cases to vectorize that did not before.
  - So far `MemPointer` could only answer adjacency queries. But VPointer also needs overlap queries, see the old `VPointer::not_equal` (i.e. can we prove that the two `VPointer` never overlap?). So I had to add a new case to aliasing computation: `NotOrAtDistance`. It is useful to answer the new and better named `MemPointer::never_overlaps_with`.
  - Collapsed together `MemPointerDecomposedForm` and `MemPointer`. It was an unnecessary and unhelpful split.
- Re-write of `VPointer` based on `MemPointer`:
  - Old pattern:
    - `VPointer[mem:  847     StoreI, base:   37, adr:   37,  base[  37] + offset(  16) + invar(   0) + scale(   4) * iv]`
    - `VPointer[mem: 3189      LoadB, base:    1, adr: 2273,  base[   1] + offset(   0) + invar(   0) + scale(   1) * iv]` -> `adr = CastX2P`, the addition of `11 Param`, `1819 Phi` and `629 LoadL`.
  - New pattern:
    - `VPointer[size:  4, object, base(110 CastPP) + con( 16) + iv_scale(  4) * iv + invar(0)]`
    - `VPointer[size:  1, native, base(629 LoadL) + con(  0) + iv_scale(  1) * iv + invar(1 * [11 Parm] + 1 * [1819 Phi])]` -> parse through the `CastX2P`, and detect the more helpful base `629 LoadL` (`MemorySegment.address()`), take `11 Param` and `1819 Phi` as invariants.
  - Move `invariant` check to `VLoop::is_pre_loop_invariant`.
  - Re-work the aliasing computation, and give it better names:
    - Before, it was done via `VPointer::cmp`, which returned quite cryptic return codes. This allowed us to do adjacency and overlap queries. Now I have explicit queries `is_adjacent_to_and_before` and `never_overlaps_with`.
    - `VPointer::make_with_size` and `VPointer::make_with_iv_offset`: one can now create a copy of a `VPointer` with a changed size (e.g. when going from smaller scalar to larger vector), or with a shifted offset (when simulating a `VPointer` in the next iteration). Now we can do aliasing computations with such modified `VPointers`, which is quite powerful and I will probably use for other things in the future.
- Refactor many uses of `VPointer`:
  - since we now do not have a single `invar`, but rather a set of `invar_summands`, I had to change some code in `AlignmentSolution / AlignmentSolver`, `VMemoryRegion`, `create_adjacent_memop_pairs` and `VTransform::adjust_pre_loop_limit_to_align_main_loop_vectors`. This unfortunately makes the change quite large, but a lot of it is just renaming of variables (e.g. `scale` -> `iv_scale` to avoid confusion with scales of other summands).
  - `VTransformMemVectorNode`: it now carries the `VPointer` corresponding to the vector load/store, adjusted for its size. We can now do aliasing checks with these nodes. This is how I refactored away `overlap_possible_with_any_in`, and replaced it with a single call to `never_overlaps_with`.
- Fix existing tests:  more cases now vectorize:
  - Non-power-of-2 stride: the issue used to be that e.g. `stride=3` would make IGVN split `3 * iv -> iv + iv << 1`. This was not properly parsed by the old `VPointer`, as it found the `iv` twice. But the `MemPointerParser` simply collects all usages and adds them up again, back to `3 * iv`.
  - Some cases in `TestMemorySegment.java` now vectorize, because `VPointer` is better at detecting the invariants and detecting that they are equivalent.
- Add new tests:
  - A while ago, I had worked on: [JDK-8330274](https://bugs.openjdk.org/browse/JDK-8330274) / https://github.com/openjdk/jdk/pull/18795, but then gave it up, in favour of this refactoring here. I had previously noticed, that the old VPointer does not deal very well with multiple invariants, because if the order of addition is off, then a different `invar` node was generated, and then the aliasing computation gets confused, i.e. thinks the aliasing is unknown. I added the tests from there:`TestEquivalentInvariants.java`.

**Future work:**
[JDK-8330274](https://bugs.openjdk.org/browse/JDK-8330274): investigate why some cases with multiple invariants still do not vectorize, and what can be done.
[JDK-8331576](https://bugs.openjdk.org/browse/JDK-8331576): Investigate case where native memory does not vectorize because of `CastX2P`.
[JDK-8324751](https://bugs.openjdk.org/browse/JDK-8324751): Aliasing-Analysis with speculative runtime-checks

-------------

Commit messages:
 - rename
 - fix up print
 - add TestEquivalentInvariants.java
 - improve documentation
 - hide parser via delegation
 - Merge branch 'master' into JDK-8343685-VPointer-MemPointer
 - make sort stable
 - some comment and naming improvements
 - check if field not found
 - update comments
 - ... and 102 more: https://git.openjdk.org/jdk/compare/28ae281b...4b3c7d29

Changes: https://git.openjdk.org/jdk/pull/21926/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=21926&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8343685
  Stats: 4032 lines in 18 files changed: 1849 ins; 1538 del; 645 mod
  Patch: https://git.openjdk.org/jdk/pull/21926.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/21926/head:pull/21926

PR: https://git.openjdk.org/jdk/pull/21926


More information about the hotspot-compiler-dev mailing list