RFR: 8343685: C2 SuperWord: refactor VPointer with MemPointer [v4]

Vladimir Kozlov kvn at openjdk.org
Mon Jan 13 20:24:49 UTC 2025


On Fri, 3 Jan 2025 08:27:54 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> **This is a required step towards adding runtime-checks for Aliasing Analysis, especially Important for FFM / MemorySegments.**
>> See: https://eme64.github.io/blog/2025/01/01/AutoVectorization-Status.html
>> 
>> I know this one is large, but it consists of a lot of renamings, and new tests. On the whole, the new `VPointer` code is less than the old!
>> 
>> **Goal**
>> 
>> Replace old `VPointer` with a new version that relies on `MemPointer` - which then is a shared utility for both `MergeStores` and `SuperWord / AutoVectorization`. `MemPointer` generally parses pointers, and `VPointer` specializes this facility for the use in loops (`VLoop`).
>> 
>> The old `VPointer` implementation with its recursive pattern matching was quite complicated and difficult to reason about for correctness. The approach in `MemPointer` is much simpler: iteratively decomposing sub-expressions. Further: the new implementation is more powerful at detecting equivalent invariants.
>> 
>> **Future**:  with the `MemPointer` implementation of `VPointer`, it should be easier to implement speculative runtime-checks for Aliasing-Analysis [JDK-8324751](https://bugs.openjdk.org/browse/JDK-8324751). The pressing need for this has come from the FFM / MemorySegment folks, like @mcimadamore and @minborg .
>> 
>> **Details**
>> 
>> This looks like a rather big patch, so let me explain the parts.
>> - Refactor of `MemPointer` in `mepointer.hpp/cpp`:
>>   - Added concept of `Base` to `MemPointer`. This is required for the aliasing computation in `VPointer`.
>>   - `sub_expression_has_native_base_candidate`: add special case to parse through `CastX2P` if we find a native memory base `MemorySegment.address()`, i.e. `jdk.internal.foreign.NativeMemorySegmentImpl.min`. This helps some native memory segment cases to vectorize that did not before.
>>   - So far `MemPointer` could only answer adjacency queries. But VPointer also needs overlap queries, see the old `VPointer::not_equal` (i.e. can we prove that the two `VPointer` never overlap?). So I had to add a new case to aliasing computation: `NotOrAtDistance`. It is useful to answer the new and better named `MemPointer::never_overlaps_with`.
>>   - Collapsed together `MemPointerDecomposedForm` and `MemPointer`. It was an unnecessary and unhelpful split.
>> - Re-write of `VPointer` based on `MemPointer`:
>>   - Old pattern:
>>     - `VPointer[mem:  847     StoreI, base:   37, adr:   37,  base[  37] + offset(  16) + invar(   0) + scale(   4) * iv]`
>>     - `VPointer[mem: 31...
>
> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 116 commits:
> 
>  - copyright 2025
>  - Merge branch 'master' into JDK-8343685-VPointer-MemPointer
>  - manual merge
>  - fix printing
>  - rename
>  - fix up print
>  - add TestEquivalentInvariants.java
>  - improve documentation
>  - hide parser via delegation
>  - Merge branch 'master' into JDK-8343685-VPointer-MemPointer
>  - ... and 106 more: https://git.openjdk.org/jdk/compare/84e6432b...b64f9295

I have few comments.

src/hotspot/share/opto/memnode.cpp line 2951:

> 2949: #endif
> 2950:   const MemPointer pointer_use(NOT_PRODUCT(trace COMMA) use_store);
> 2951:   const MemPointer pointer_def(NOT_PRODUCT(trace COMMA) def_store);

Why you swapped arguments? Main argument will different in debug vs product VMs.

src/hotspot/share/opto/mempointer.cpp line 38:

> 36:   MemPointer(MemPointerParser::parse(NOT_PRODUCT(trace COMMA)
> 37:                                      mem,
> 38:                                      callback)) {}

Again. Why not product argument first?

src/hotspot/share/opto/mempointer.cpp line 243:

> 241:     // is too deep. The constant is chosen arbitrarily, not too large but big
> 242:     // enough for all normal cases.
> 243:     if (worklist.length() > 100) { return false; }

May be specify size when creating `worklist` so there is no need for resizing when it is grow.

src/hotspot/share/opto/mempointer.hpp line 620:

> 618: 
> 619: private:
> 620:   NOT_PRODUCT( const TraceMemPointer& _trace; )

Why you prefer `_trace` to be first and not last?

src/hotspot/share/opto/mempointer.hpp line 677:

> 675:     assert(pos == summands.length(), "copied all summands");
> 676: 
> 677:     assert(1 <= _size && _size <= 2048 && is_power_of_2(_size), "valid size");

Where 2048 comes from? Do you have a runtime check somewhere too?

src/hotspot/share/opto/noOverflowInt.hpp line 109:

> 107:     } else if (b.is_NaN()) {
> 108:       return -1;
> 109:     }

This is strange NaN compare results. May be add comment explaining that it is not really float arithmetic "NaN".

src/hotspot/share/opto/superword.cpp line 500:

> 498: 
> 499: // We use two comparisons, because a subtraction could underflow.
> 500: #define RETURN_CMP_VALUE_IF_NOT_EQUAL(a, b) \

Please use local static function instead of macro - you can't step through macros in debugger.

test/hotspot/jtreg/compiler/loopopts/superword/TestEquivalentInvariants.java line 2:

> 1: /*
> 2:  * Copyright (c) 2024, 2025, Oracle and/or its affiliates. All rights reserved.

This is new file. Why two years?

test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegment.java line 655:

> 653:     // FAILS: invariants are sorted differently, because of differently inserted Cast.
> 654:     // See: JDK-8330274
> 655:     // Interestingly, it now passes for native, but not for objects.

Should we list new success conditions instead of just commenting old?

test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegment.java line 674:

> 672:     // FAILS: invariants are sorted differently, because of differently inserted Cast.
> 673:     // See: JDK-8330274
> 674:     // Interestingly, it now passes for native, but not for objects.

The same.  May be skip these 2 tests.

-------------

PR Review: https://git.openjdk.org/jdk/pull/21926#pullrequestreview-2547675401
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913715311
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913723129
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913729662
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913742360
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913743816
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913715176
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913757811
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913748892
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913752684
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1913753662


More information about the hotspot-compiler-dev mailing list