RFR: 8343685: C2 SuperWord: refactor VPointer with MemPointer [v14]

Christian Hagedorn chagedorn at openjdk.org
Fri Jan 17 14:35:54 UTC 2025


On Thu, 16 Jan 2025 16:19:15 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> **This is a required step towards adding runtime-checks for Aliasing Analysis, especially Important for FFM / MemorySegments.**
>> See: https://eme64.github.io/blog/2025/01/01/AutoVectorization-Status.html
>> 
>> I know this one is large, but it consists of a lot of renamings, and new tests. On the whole, the new `VPointer` code is less than the old!
>> 
>> **Goal**
>> 
>> Replace old `VPointer` with a new version that relies on `MemPointer` - which then is a shared utility for both `MergeStores` and `SuperWord / AutoVectorization`. `MemPointer` generally parses pointers, and `VPointer` specializes this facility for the use in loops (`VLoop`).
>> 
>> The old `VPointer` implementation with its recursive pattern matching was quite complicated and difficult to reason about for correctness. The approach in `MemPointer` is much simpler: iteratively decomposing sub-expressions. Further: the new implementation is more powerful at detecting equivalent invariants.
>> 
>> **Future**:  with the `MemPointer` implementation of `VPointer`, it should be easier to implement speculative runtime-checks for Aliasing-Analysis [JDK-8324751](https://bugs.openjdk.org/browse/JDK-8324751). The pressing need for this has come from the FFM / MemorySegment folks, like @mcimadamore and @minborg .
>> 
>> **Details**
>> 
>> This looks like a rather big patch, so let me explain the parts.
>> - Refactor of `MemPointer` in `mepointer.hpp/cpp`:
>>   - Added concept of `Base` to `MemPointer`. This is required for the aliasing computation in `VPointer`.
>>   - `sub_expression_has_native_base_candidate`: add special case to parse through `CastX2P` if we find a native memory base `MemorySegment.address()`, i.e. `jdk.internal.foreign.NativeMemorySegmentImpl.min`. This helps some native memory segment cases to vectorize that did not before.
>>   - So far `MemPointer` could only answer adjacency queries. But VPointer also needs overlap queries, see the old `VPointer::not_equal` (i.e. can we prove that the two `VPointer` never overlap?). So I had to add a new case to aliasing computation: `NotOrAtDistance`. It is useful to answer the new and better named `MemPointer::never_overlaps_with`.
>>   - Collapsed together `MemPointerDecomposedForm` and `MemPointer`. It was an unnecessary and unhelpful split.
>> - Re-write of `VPointer` based on `MemPointer`:
>>   - Old pattern:
>>     - `VPointer[mem:  847     StoreI, base:   37, adr:   37,  base[  37] + offset(  16) + invar(   0) + scale(   4) * iv]`
>>     - `VPointer[mem: 31...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Batch 2 for Christian

Made a complete pass. I left some comments but otherwise, it looks good, great work!

src/hotspot/share/opto/mempointer.cpp line 227:

> 225:   for (int i = 0; i < worklist.length(); i++) {
> 226:     Node* n = worklist.at(i);
> 227:     switch(n->Opcode()) {

Suggestion:

    switch (n->Opcode()) {

src/hotspot/share/opto/mempointer.cpp line 277:

> 275:     return false;
> 276:   }
> 277:   return true;

Could be simplified to:
Suggestion:

  return holder_symbol == vmSymbols::jdk_internal_foreign_NativeMemorySegmentImpl() &&
         field_symbol == vmSymbols::min_name();

src/hotspot/share/opto/mempointer.cpp line 513:

> 511: #endif
> 512:     return MemPointerAliasing::make_not_or_at_distance(distance.value());
> 513:   }

You can remove the `else` and just continue because you return in the if-branch:
Suggestion:

 
   // At runtime, the two object bases can be:
  //   (1) different: pointers do not alias.
  //   (2) the same:  implies that (S2) holds. The summands are all the same, and with
  //                  the Pre-Condition, we know that both pointers are in bounds of the
  //                  same memory object, i.e. (S1) holds. We have already proven (S0)
  //                  and (S3), so all 4 conditions for "MemPointer Lemma" are given.
#ifndef PRODUCT
  if (trace.is_trace_aliasing()) {
    tty->print_cr("  -> Aliasing not or at distance = %d.", distance.value());
  }
#endif
  return MemPointerAliasing::make_not_or_at_distance(distance.value());

src/hotspot/share/opto/mempointer.cpp line 607:

> 605:   //
> 606:   // Which we can restate as:
> 607:   //   distance <= -other.size    ||  this.size <= distance

Did you try to align it with L604?
Suggestion:

  //   distance <= -other.size         ||  this.size <= distance

src/hotspot/share/opto/mempointer.hpp line 234:

> 232: //     Still: we would like to find such a base if possible, and if two pointers are similar (i.e. have the
> 233: //     same summands), we would like to find the same base. Further, it is reasonable to speculatively
> 234: //     assume that such base addresses are aligned (need to add this speculative check in  JDK-8323582).

It might also be helpful to add a TODO here.
Suggestion:

//     assume that such base addresses are aligned (TODO: need to add this speculative check in JDK-8323582).

src/hotspot/share/opto/mempointer.hpp line 243:

> 241: //          candidate for a native memory base.
> 242: //      (2) LoadL from field jdk.internal.foreign.NativeMemorySegmentImpl.min
> 243: //          This would be preferrable over CastX2P, because it holds the address() of a native

Suggestion:

//          This would be preferable over CastX2P, because it holds the address() of a native

src/hotspot/share/opto/mempointer.hpp line 510:

> 508:     if (p1.variable() == nullptr) {
> 509:       return (p2.variable() == nullptr) ? 0 : 1;
> 510:     } else if (p2.variable() == nullptr) {

`else` can be removed:
Suggestion:

    } if (p2.variable() == nullptr) {

src/hotspot/share/opto/mempointer.hpp line 612:

> 610:     Node* object()           const { assert(is_object(), ""); return _base; }
> 611:     Node* native()           const { assert(is_native(), ""); return _base; }
> 612:     Node* object_or_native() const { assert(is_known(),  ""); return _base; }

Add a message for good measure:
Suggestion:

    Node* object()           const { assert(is_object(), "unexpected kind"); return _base; }
    Node* native()           const { assert(is_native(), "unexpected kind"); return _base; }
    Node* object_or_native() const { assert(is_known(),  "unexpected kind"); return _base; }

src/hotspot/share/opto/mempointer.hpp line 678:

> 676: #endif
> 677: 
> 678:     // Put the base in in the 0th summand.

Suggestion:

    // Put the base in the 0th summand.

src/hotspot/share/opto/mempointer.hpp line 763:

> 761:   bool has_same_non_base_summands_as(const MemPointer& other) const {
> 762:     if (!base().is_known() || !other.base().is_known()) {
> 763:       assert(false, "unknonw base case is not answered optimally");

Suggestion:

      assert(false, "unknown base case is not answered optimally");

src/hotspot/share/opto/mempointer.hpp line 829:

> 827: };
> 828: 
> 829: class MemPointerParser : public StackObj {

Maybe a quick class comment to summarize the input and output could be helpful here.

src/hotspot/share/opto/superword.cpp line 98:

> 96: };
> 97: 
> 98: void SuperWord::unrolling_analysis(const VLoop &vloop, int &local_loop_unroll_factor) {

This method is quite complex. Is it possible to add a short summary as method comment? Can also be done in a follow-up since you have not changed much in this refactoring.

src/hotspot/share/opto/superword.cpp line 559:

> 557: 
> 558: // Collect all memops that could potentially be vectorized.
> 559: void SuperWord::collect_valid_memops(GrowableArray<MemOp>& memops) {

Can be made const.

src/hotspot/share/opto/superword.cpp line 2930:

> 2928: 
> 2929:   // 1.3: base (unless base is guaranteed aw aligned)
> 2930:   if (aw > ObjectAlignmentInBytes || is_base_native) {

You could move the definition of `is_base_native` down before this line to easier see where it's coming from.

src/hotspot/share/opto/superword.hpp line 568:

> 566:     MemNode* _mem;
> 567:     const VPointer* _vpointer;
> 568:     int _original_index;

Can also be made const:
Suggestion:

    const int _original_index;

src/hotspot/share/opto/vectorization.cpp line 186:

> 184: void VLoopVPointers::allocate_vpointers_array() {
> 185:   uint bytes2 = _vpointers_length * sizeof(VPointer);
> 186:   _vpointers = (VPointer*)_arena->Amalloc(bytes2);

Why `bytes2`?

src/hotspot/share/opto/vectorization.cpp line 484:

> 482:   //
> 483:   //      Note: we have been assuming that this also holds for native memory base
> 484:   //            addresses. This is incorrect, see JDK-8323582.

Maybe it's justified to add a preceding TODO here?
Suggestion:

  //      TODO: Note: we have been assuming that this also holds for native memory base
  //                  addresses. This is incorrect, see JDK-8323582.

src/hotspot/share/opto/vectorization.cpp line 525:

> 523:   // modulo operator "%" such that the remainder is always positive, see AlignmentSolution::mod(i, q).
> 524:   //
> 525:   // Note: the following assumption is incorrect for native memory bases, see JDK-8323582.

Same here:
Suggestion:

  // TODO: Note: the following assumption is incorrect for native memory bases, see JDK-8323582.

src/hotspot/share/opto/vectorization.cpp line 883:

> 881:   //
> 882:   //   -> base aligned: base % aw = 0
> 883:   //        Note: this assumption is incorrect for native memory bases, see JDK-8323582.

Suggestion:

  //        TODO: Note: this assumption is incorrect for native memory bases, see JDK-8323582.

src/hotspot/share/opto/vectorization.cpp line 962:

> 960:     // adr = base + con + invar + iv_scale * iv
> 961:     tty->print("  adr = base[%d]", base().object_or_native()->_idx);
> 962:     tty->print(" + con(%d) + invar + iv_scale(%d) * iv", _vpointer.con(), iv_scale());

Nit: In previous comments, you put `con` last. Should we also do this here?

src/hotspot/share/opto/vtransform.cpp line 148:

> 146: 
> 147: // Helper-class for VTransformGraph::has_store_to_load_forwarding_failure.
> 148: // It wraps a VPointer. The VPointer have an iv_offset applied, which

Suggestion:

// It wraps a VPointer. The VPointer has an iv_offset applied, which

test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegment.java line 649:

> 647:     @Test
> 648:     // FAILS: invariants are sorted differently, because of differently inserted Cast.
> 649:     // See: JDK-8330274

8330274 is closed as dup of this one. Should the comment be updated? Same below.

-------------

PR Review: https://git.openjdk.org/jdk/pull/21926#pullrequestreview-2558494167
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920230801
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920254553
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920266329
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920269117
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920201787
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920203388
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920209130
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920212319
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920214435
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920217872
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920220328
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920150297
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920152865
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920163422
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920140485
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919888113
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919898691
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919899561
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919900715
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919902716
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920132303
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920278468


More information about the hotspot-compiler-dev mailing list