RFR: 8330819: C2 SuperWord: bad dominance after pre-loop limit adjustment with base that has CastLL after pre-loop [v3]
Vladimir Kozlov
kvn at openjdk.org
Tue Apr 23 21:13:28 UTC 2024
On Tue, 23 Apr 2024 14:29:05 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Summary: the address `adr` of the vector we want to align the main-loop for has a `CastLL` after the pre-loop and before the main-loop. When we use this address to adjust the pre-loop limit, we create a use before the `CastLL`, which leads to a "bad dominance" assert. Solution: make sure that all such base addresses `adr` are not just invariant in the main-loop, but also are invariant of/before the pre-loop.
>>
>> **Example where we get the "bad dominance"**
>>
>> This code shape comes from the attached regression tests (no matter if with Unsafe or MemorySegment).
>>
>> The loop is PreMainPost-ed and the main-loop unrolled. `1326 CountedLoop` is the pre-loop, and `1657 CountedLoop` is the main-loop, which contains the `1648 LoadI`. During `SuperWord`, we take this load's address to align the main-loop.
>>
>> The address is parsed into its components by `VPointer`:
>> `VPointer[mem: 1648 LoadI, base: 1, adr: 1669, base[ 1] + offset( 0) + invar( 0) + scale( 4) * iv]`
>>
>> Note that this is an access to native memory via Unsafe / MemorySegment, so there is no array pointer base, and `base = 1 TOP`. `VPointer` still tries to find a "base" address `adr` by parsing the left-most input of the chain of `AddP`s. Here, there is only a single `1711 AddP`, and its left input is `adr = 1669 CastX2P`. The right side of the `AddP` is also parsed, and determined to be `4 * iv`.
>>
>> The problematic part: `1669 CastX2P` is "pinned" below the pre-loop by the `1513 CastLL` that is applied to `11 Parm` (= `long offset` parameter in the test).
>>
>> ![image](https://github.com/openjdk/jdk/assets/32593061/d5579226-797c-489e-8aa1-0c906ca59755)
>>
>> During `SuperWord`, we want to align the main-loop vectors. We do this by adjusting the pre-loop limit `1439 Opaque1`:
>>
>> ![image](https://github.com/openjdk/jdk/assets/32593061/23bafa67-1438-4057-88a7-fb72e8b06c5c)
>>
>> You can see the dark-green IR nodes, which compute the `new_limit = old_limit + adjustment`, where the adjustment is a modulo `1734 AndI` of the address of the `1738 LoadVector` for which we are aligning. In this computation we are also using the `adr` of our `VPointer`, which depends on the `1513 CastLL` which is pinned below the pre-loop. Thus, we are using a node inside the pre-loop that is pinned after the pre-loop. Hence the "bad dominance" assert.
>>
>> **Why does this happen?**
>>
>> Usually, the `base` and/or `adr` of a `VPointer` are invariant not just of the main-loop but als...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
> rm lib from test
Good.
-------------
Marked as reviewed by kvn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/18892#pullrequestreview-2018307381
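
For readers following the quoted summary, the pre-loop limit adjustment can be illustrated with a small arithmetic sketch. This is not the C2 implementation: the class name, method name, and parameters below are hypothetical, and the sketch assumes a power-of-two alignment width and an address that is already a multiple of the element size. It only shows why the adjustment needs the value of `adr` before the pre-loop runs: the number of extra pre-loop iterations is computed from `adr` itself.

```java
public class PreLoopAlignSketch {
    /**
     * Hypothetical helper: how many extra pre-loop iterations are needed so that
     * the address adr + scale * iv lands on an alignWidth-byte boundary.
     * Assumes alignWidth is a power of two and the address is a multiple of scale.
     */
    static int extraPreLoopIterations(long adr, int iv, int scale, int alignWidth) {
        long addr = adr + (long) scale * iv;             // address at the current iv
        long misalign = addr & (alignWidth - 1);         // modulo via a bitwise AND
        long bytesToBoundary = (alignWidth - misalign) & (alignWidth - 1);
        return (int) (bytesToBoundary / scale);          // bytes -> loop iterations
    }

    public static void main(String[] args) {
        // adr = 0x1008, 4-byte elements, 32-byte vectors: 24 bytes to the boundary
        int adjustment = extraPreLoopIterations(0x1008L, 0, 4, 32);
        int oldLimit = 10;                               // arbitrary illustrative value
        int newLimit = oldLimit + adjustment;            // new_limit = old_limit + adjustment
        System.out.println("adjustment=" + adjustment + " newLimit=" + newLimit);
    }
}
```

Because `adjustment` is a function of `adr`, the new limit can only be computed where `adr` is already available. If `adr` depends on a node pinned after the pre-loop, as with the `CastLL` in the quoted example, the computation has no valid placement before the pre-loop.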