RFR: 8343685: C2 SuperWord: refactor VPointer with MemPointer [v14]
Christian Hagedorn
chagedorn at openjdk.org
Fri Jan 17 14:35:54 UTC 2025
On Thu, 16 Jan 2025 16:19:15 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> **This is a required step towards adding runtime-checks for Aliasing Analysis, especially Important for FFM / MemorySegments.**
>> See: https://eme64.github.io/blog/2025/01/01/AutoVectorization-Status.html
>>
>> I know this one is large, but it consists of a lot of renamings, and new tests. On the whole, the new `VPointer` code is less than the old!
>>
>> **Goal**
>>
>> Replace old `VPointer` with a new version that relies on `MemPointer` - which then is a shared utility for both `MergeStores` and `SuperWord / AutoVectorization`. `MemPointer` generally parses pointers, and `VPointer` specializes this facility for the use in loops (`VLoop`).
>>
>> The old `VPointer` implementation with its recursive pattern matching was quite complicated and difficult to reason about for correctness. The approach in `MemPointer` is much simpler: iteratively decomposing sub-expressions. Further: the new implementation is more powerful at detecting equivalent invariants.
>>
>> **Future**: with the `MemPointer` implementation of `VPointer`, it should be easier to implement speculative runtime-checks for Aliasing-Analysis [JDK-8324751](https://bugs.openjdk.org/browse/JDK-8324751). The pressing need for this has come from the FFM / MemorySegment folks, like @mcimadamore and @minborg .
>>
>> **Details**
>>
>> This looks like a rather big patch, so let me explain the parts.
>> - Refactor of `MemPointer` in `mepointer.hpp/cpp`:
>> - Added concept of `Base` to `MemPointer`. This is required for the aliasing computation in `VPointer`.
>> - `sub_expression_has_native_base_candidate`: add special case to parse through `CastX2P` if we find a native memory base `MemorySegment.address()`, i.e. `jdk.internal.foreign.NativeMemorySegmentImpl.min`. This helps some native memory segment cases to vectorize that did not before.
>> - So far `MemPointer` could only answer adjacency queries. But VPointer also needs overlap queries, see the old `VPointer::not_equal` (i.e. can we prove that the two `VPointer` never overlap?). So I had to add a new case to aliasing computation: `NotOrAtDistance`. It is useful to answer the new and better named `MemPointer::never_overlaps_with`.
>> - Collapsed together `MemPointerDecomposedForm` and `MemPointer`. It was an unnecessary and unhelpful split.
>> - Re-write of `VPointer` based on `MemPointer`:
>> - Old pattern:
>> - `VPointer[mem: 847 StoreI, base: 37, adr: 37, base[ 37] + offset( 16) + invar( 0) + scale( 4) * iv]`
>> - `VPointer[mem: 31...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
> Batch 2 for Christian
Made a complete pass. I left some comments but otherwise, it looks good, great work!
src/hotspot/share/opto/mempointer.cpp line 227:
> 225: for (int i = 0; i < worklist.length(); i++) {
> 226: Node* n = worklist.at(i);
> 227: switch(n->Opcode()) {
Suggestion:
switch (n->Opcode()) {
src/hotspot/share/opto/mempointer.cpp line 277:
> 275: return false;
> 276: }
> 277: return true;
Could be simplified to:
Suggestion:
return holder_symbol == vmSymbols::jdk_internal_foreign_NativeMemorySegmentImpl() &&
field_symbol == vmSymbols::min_name();
src/hotspot/share/opto/mempointer.cpp line 513:
> 511: #endif
> 512: return MemPointerAliasing::make_not_or_at_distance(distance.value());
> 513: }
You can remove the `else` and just continue because you return in the if-branch:
Suggestion:
// At runtime, the two object bases can be:
// (1) different: pointers do not alias.
// (2) the same: implies that (S2) holds. The summands are all the same, and with
// the Pre-Condition, we know that both pointers are in bounds of the
// same memory object, i.e. (S1) holds. We have already proven (S0)
// and (S3), so all 4 conditions for "MemPointer Lemma" are given.
#ifndef PRODUCT
if (trace.is_trace_aliasing()) {
tty->print_cr(" -> Aliasing not or at distance = %d.", distance.value());
}
#endif
return MemPointerAliasing::make_not_or_at_distance(distance.value());
src/hotspot/share/opto/mempointer.cpp line 607:
> 605: //
> 606: // Which we can restate as:
> 607: // distance <= -other.size || this.size <= distance
Did you try to align it with L604?
Suggestion:
// distance <= -other.size || this.size <= distance
src/hotspot/share/opto/mempointer.hpp line 234:
> 232: // Still: we would like to find such a base if possible, and if two pointers are similar (i.e. have the
> 233: // same summands), we would like to find the same base. Further, it is reasonable to speculatively
> 234: // assume that such base addresses are aligned (need to add this speculative check in JDK-8323582).
It might also be helpful to add a TODO here.
Suggestion:
// assume that such base addresses are aligned (TODO: need to add this speculative check in JDK-8323582).
src/hotspot/share/opto/mempointer.hpp line 243:
> 241: // candidate for a native memory base.
> 242: // (2) LoadL from field jdk.internal.foreign.NativeMemorySegmentImpl.min
> 243: // This would be preferrable over CastX2P, because it holds the address() of a native
Suggestion:
// This would be preferable over CastX2P, because it holds the address() of a native
src/hotspot/share/opto/mempointer.hpp line 510:
> 508: if (p1.variable() == nullptr) {
> 509: return (p2.variable() == nullptr) ? 0 : 1;
> 510: } else if (p2.variable() == nullptr) {
`else` can be removed:
Suggestion:
} if (p2.variable() == nullptr) {
src/hotspot/share/opto/mempointer.hpp line 612:
> 610: Node* object() const { assert(is_object(), ""); return _base; }
> 611: Node* native() const { assert(is_native(), ""); return _base; }
> 612: Node* object_or_native() const { assert(is_known(), ""); return _base; }
Add a message for good measure:
Suggestion:
Node* object() const { assert(is_object(), "unexpected kind"); return _base; }
Node* native() const { assert(is_native(), "unexpected kind"); return _base; }
Node* object_or_native() const { assert(is_known(), "unexpected kind"); return _base; }
src/hotspot/share/opto/mempointer.hpp line 678:
> 676: #endif
> 677:
> 678: // Put the base in in the 0th summand.
Suggestion:
// Put the base in the 0th summand.
src/hotspot/share/opto/mempointer.hpp line 763:
> 761: bool has_same_non_base_summands_as(const MemPointer& other) const {
> 762: if (!base().is_known() || !other.base().is_known()) {
> 763: assert(false, "unknonw base case is not answered optimally");
Suggestion:
assert(false, "unknown base case is not answered optimally");
src/hotspot/share/opto/mempointer.hpp line 829:
> 827: };
> 828:
> 829: class MemPointerParser : public StackObj {
Maybe a quick class comment to summarize the input and output could be helpful here.
src/hotspot/share/opto/superword.cpp line 98:
> 96: };
> 97:
> 98: void SuperWord::unrolling_analysis(const VLoop &vloop, int &local_loop_unroll_factor) {
This method is quite complex. Is it possible to add a short summary as method comment? Can also be done in a follow-up since you have not changed much in this refactoring.
src/hotspot/share/opto/superword.cpp line 559:
> 557:
> 558: // Collect all memops that could potentially be vectorized.
> 559: void SuperWord::collect_valid_memops(GrowableArray<MemOp>& memops) {
Can be made const.
src/hotspot/share/opto/superword.cpp line 2930:
> 2928:
> 2929: // 1.3: base (unless base is guaranteed aw aligned)
> 2930: if (aw > ObjectAlignmentInBytes || is_base_native) {
You could move the definition of `is_base_native` down before this line to easier see where it's coming from.
src/hotspot/share/opto/superword.hpp line 568:
> 566: MemNode* _mem;
> 567: const VPointer* _vpointer;
> 568: int _original_index;
Can also be made const:
Suggestion:
const int _original_index;
src/hotspot/share/opto/vectorization.cpp line 186:
> 184: void VLoopVPointers::allocate_vpointers_array() {
> 185: uint bytes2 = _vpointers_length * sizeof(VPointer);
> 186: _vpointers = (VPointer*)_arena->Amalloc(bytes2);
Why `bytes2`?
src/hotspot/share/opto/vectorization.cpp line 484:
> 482: //
> 483: // Note: we have been assuming that this also holds for native memory base
> 484: // addresses. This is incorrect, see JDK-8323582.
Maybe it's justified to add a preceding TODO here?
Suggestion:
// TODO: Note: we have been assuming that this also holds for native memory base
// addresses. This is incorrect, see JDK-8323582.
src/hotspot/share/opto/vectorization.cpp line 525:
> 523: // modulo operator "%" such that the remainder is always positive, see AlignmentSolution::mod(i, q).
> 524: //
> 525: // Note: the following assumption is incorrect for native memory bases, see JDK-8323582.
Same here:
Suggestion:
// TODO: Note: the following assumption is incorrect for native memory bases, see JDK-8323582.
src/hotspot/share/opto/vectorization.cpp line 883:
> 881: //
> 882: // -> base aligned: base % aw = 0
> 883: // Note: this assumption is incorrect for native memory bases, see JDK-8323582.
Suggestion:
// TODO: Note: this assumption is incorrect for native memory bases, see JDK-8323582.
src/hotspot/share/opto/vectorization.cpp line 962:
> 960: // adr = base + con + invar + iv_scale * iv
> 961: tty->print(" adr = base[%d]", base().object_or_native()->_idx);
> 962: tty->print(" + con(%d) + invar + iv_scale(%d) * iv", _vpointer.con(), iv_scale());
Nit: In previous comments, you put `con` last. Should we also do this here?
src/hotspot/share/opto/vtransform.cpp line 148:
> 146:
> 147: // Helper-class for VTransformGraph::has_store_to_load_forwarding_failure.
> 148: // It wraps a VPointer. The VPointer have an iv_offset applied, which
Suggestion:
// It wraps a VPointer. The VPointer has an iv_offset applied, which
test/hotspot/jtreg/compiler/loopopts/superword/TestMemorySegment.java line 649:
> 647: @Test
> 648: // FAILS: invariants are sorted differently, because of differently inserted Cast.
> 649: // See: JDK-8330274
8330274 is closed as dup of this one. Should the comment be updated? Same below.
-------------
PR Review: https://git.openjdk.org/jdk/pull/21926#pullrequestreview-2558494167
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920230801
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920254553
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920266329
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920269117
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920201787
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920203388
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920209130
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920212319
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920214435
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920217872
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920220328
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920150297
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920152865
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920163422
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920140485
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919888113
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919898691
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919899561
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919900715
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1919902716
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920132303
PR Review Comment: https://git.openjdk.org/jdk/pull/21926#discussion_r1920278468
More information about the hotspot-compiler-dev
mailing list