RFR: 8330274: C2 SuperWord: VPointer invar: same sum with different addition order should be equal [v2]
Emanuel Peter
epeter at openjdk.org
Wed Apr 24 09:11:33 UTC 2024
On Wed, 24 Apr 2024 09:02:49 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> This is an enhancement for AutoVectorization.
>>
>> I want to improve the detection of `invar`s that are equivalent (guaranteed to compute the same value), but don't have the identical node (the computation is in a different order).
>>
>> Note: only about 100 lines are real changes, the rest is tests. These are the first tests that check vectorization for MemorySegments.
>>
>> **Solution Sketch: "canonicalize" the invar**
>>
>> - Extract all summands of the `invar`: make a list.
>> - Parse through `AddL`, `SubL`, `AddI`, `SubI`, to get summands.
>> - Bypass `CastLL` and `CastII`
>> - Recursively treat `ConvI2L`, `LShiftI` and `LShiftL`: i.e. canonicalize their input.
>>
>> - Sort all extracted summands by node idx.
>> - Add up all summands in new order.
>>
>> If two `invar`s use the same summands, then we know that after canonicalization the new nodes representing the `invar`s must be the same.
>>
>> **Example**
>>
>>
>> invar1 = b + c + d + a
>> invar2 = d + b + a + c
>>
>> -> equivalent but not identical nodes
>>
>> Sort, and add up again:
>>
>> invar1 = a + b + c + d
>> invar2 = a + b + c + d
>>
>> -> now the nodes are identical
>>
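The canonicalization idea quoted above can be sketched on a toy expression tree. This is only an illustration under stated assumptions, not the HotSpot code from the pull request: `InvarSketch`, `Node`, `leaf`, `add`, `collectSummands`, `canonicalSummands` and `equivalent` are hypothetical names, only additions are handled (no `SubL`/`SubI`, casts, `ConvI2L` or shifts), and instead of rebuilding the sum as new nodes the sketch compares the sorted summand lists directly.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model, NOT C2's node classes: every name here is hypothetical.
final class InvarSketch {
    // A node is either a leaf (no inputs) or an addition of two inputs.
    static final class Node {
        final int idx;          // stand-in for a unique node index
        final Node left, right; // both null for a leaf
        Node(int idx, Node left, Node right) {
            this.idx = idx;
            this.left = left;
            this.right = right;
        }
        boolean isAdd() { return left != null; }
    }

    static Node leaf(int idx) { return new Node(idx, null, null); }
    static Node add(int idx, Node l, Node r) { return new Node(idx, l, r); }

    // Step 1: walk through additions and collect the leaf summands.
    static void collectSummands(Node n, List<Node> out) {
        if (n.isAdd()) {
            collectSummands(n.left, out);
            collectSummands(n.right, out);
        } else {
            out.add(n);
        }
    }

    // Step 2: sort the summands by idx and return the sorted idx list.
    static List<Integer> canonicalSummands(Node invar) {
        List<Node> summands = new ArrayList<>();
        collectSummands(invar, summands);
        summands.sort(Comparator.comparingInt(n -> n.idx));
        List<Integer> result = new ArrayList<>();
        for (Node s : summands) {
            result.add(s.idx);
        }
        return result;
    }

    // Two invars are treated as equivalent if their sorted summands match.
    static boolean equivalent(Node x, Node y) {
        return canonicalSummands(x).equals(canonicalSummands(y));
    }

    public static void main(String[] args) {
        Node a = leaf(1), b = leaf(2), c = leaf(3), d = leaf(4);
        Node invar1 = add(12, add(11, add(10, b, c), d), a); // b + c + d + a
        Node invar2 = add(22, add(21, add(20, d, b), a), c); // d + b + a + c
        System.out.println(equivalent(invar1, invar2));      // prints true
    }
}
```

Comparing sorted lists keeps the sketch short. Rebuilding the sum in sorted order, as the description proposes, would additionally need some way to deduplicate structurally identical nodes, which this toy model leaves out.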
>> **Motivation: MemorySegment with invar**
>>
>> One might think that this is a bit of a special case: why would anybody write indices to an Array or MemorySegment where the invar has a different addition order for its summands?
>>
>> This example did not vectorize, even though it should:
>> https://github.com/openjdk/jdk/blob/78e42d6e311c33548d16c6c74493388d9850238e/test/hotspot/jtreg/compiler/loopopts/superword/TestEquivalentInvariants.java#L425-L441
>>
>> Both the `get` and the `set` look like they have the same address, and the address increases by a byte in each iteration.
>>
>> Upon inspection, I saw that the `invar`s that `VPointer` produces for the two operations are not identical: the summands of each `invar` are added in a different order, and thus the `invar` nodes are different.
>>
>> The consequence: only if we can prove that the two `invar`s are identical can we know that the addresses are identical, and that there is no aliasing for loop-carried dependencies. Since the two `invar`s are different nodes, we don't know how the two addresses alias, and that prevents vectorization.
>>
>> Why does this happen? After parsing, the graph looks like this:
>> ![image](https://github.com/openjdk/jdk/assets/32593061/f768d0b0-0b2f-48f0-bfdc-61e93e62bb4f)
>>
>> We already see that the two addresses are different only by a `CastLL`, with type `long:>=0`. So...
>
> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 16 additional commits since the last revision:
>
> - Merge branch 'master' into JDK-8330274-invar-sum-equality
> - IR rules for test only on 64 bit
> - more tests, more comments, rm trace code
> - more int/long tests: where offsetPlain moves away
> - add long tests
> - verify cfg case
> - test: handle AlignVector
> - some int tests
> - allow LShift for scaling
> - better comments
> - ... and 6 more: https://git.openjdk.org/jdk/compare/a5cd6d8c...fdfd7ca2
src/hotspot/share/opto/vectorization.hpp line 726:
> 724: NONCOPYABLE(VPointer);
> 725:
> 726: Node* convI2L(Node* n);
TODO: remove
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/18795#discussion_r1577557220