RFR: 8330819: C2 SuperWord: bad dominance after pre-loop limit adjustment with base that has CastLL after pre-loop [v3]

Vladimir Kozlov kvn at openjdk.org
Tue Apr 23 21:13:28 UTC 2024


On Tue, 23 Apr 2024 14:29:05 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Summary: the address `adr` of the vector we want to align the main-loop for has a `CastLL` after the pre-loop and before the main-loop. When we use this address to adjust the pre-loop limit, we create a use before the `CastLL`, which leads to a "bad dominance" assert. Solution: make sure that all such base addresses `adr` are not just invariant in the main-loop, but also are invariant of/before the pre-loop.
>> 
>> **Example where we get the "bad dominance"**
>> 
>> This code shape comes from the attached regression tests (regardless of whether they use Unsafe or MemorySegment).
>> 
>> The loop is PreMainPost-ed and the main-loop unrolled. `1326 CountedLoop` is the pre-loop, and `1657 CountedLoop` is the main-loop, which contains the `1648 LoadI`. During `SuperWord`, we take this load's address to align the main-loop.
>> 
>> The address is parsed into its components by `VPointer`:
>> `VPointer[mem: 1648      LoadI, base:    1, adr: 1669,  base[   1] + offset(   0) + invar(   0) + scale(   4) * iv]`
>> 
>> We note that this is an access to native memory via Unsafe / MemorySegment, so there is no array pointer base, and `base = 1 TOP`. `VPointer` still tries to find a "base" address `adr` by parsing the left-most input to the chain of `AddP`s. Here, there is only a single `1711 AddP`, and its left input is `adr = 1669 CastX2P`. The right side of the `AddP` is also parsed, and determined to be `4 * iv`.
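
To make the decomposition quoted above concrete, here is a minimal Java sketch of the form `adr + offset + invar + scale * iv`. It is only an illustration: the class `VPointerSketch` and the method `addressAt` are hypothetical names and are not part of HotSpot, and the sample address `1000` is made up.

```java
// Hypothetical illustration of the VPointer form quoted above:
//   address = adr + offset + invar + scale * iv
// None of these names exist in HotSpot; they are assumptions for this sketch.
final class VPointerSketch {
    // Computes the byte address touched in iteration iv.
    static long addressAt(long adr, long offset, long invar, int scale, int iv) {
        return adr + offset + invar + (long) scale * iv;
    }

    public static void main(String[] args) {
        // Values from the quoted VPointer: offset = 0, invar = 0, scale = 4.
        // The base address 1000 is an arbitrary example value.
        for (int iv = 0; iv < 4; iv++) {
            System.out.println("iv=" + iv + " address=" + addressAt(1000L, 0L, 0L, 4, iv));
        }
    }
}
```

With `offset` and `invar` both zero, as in the quoted `VPointer`, each iteration advances the address by `scale = 4` bytes, one `int` per step.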
>> 
>> The problematic part: `1669 CastX2P` is "pinned" down below the pre-loop by the `1513 CastLL` that is applied to `11 Parm` (= `long offset` parameter in the test).
>> 
>> ![image](https://github.com/openjdk/jdk/assets/32593061/d5579226-797c-489e-8aa1-0c906ca59755)
>> 
>> During `SuperWord`, we want to align the main-loop vectors. We do this by adjusting the pre-loop limit `1439 Opaque1`:
>> 
>> ![image](https://github.com/openjdk/jdk/assets/32593061/23bafa67-1438-4057-88a7-fb72e8b06c5c)
>> 
>> You can see the dark-green IR nodes, which compute the `new_limit = old_limit + adjustment`, where the adjustment is a modulo `1734 AndI` of the address of the `1738 LoadVector` for which we are aligning. In this computation we are also using the `adr` of our `VPointer`, which depends on the `1513 CastLL` which is pinned below the pre-loop. Thus, we are using a node inside the pre-loop that is pinned after the pre-loop. Hence the "bad dominance" assert.
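
The arithmetic behind `new_limit = old_limit + adjustment` can be sketched in plain Java. This is a simplified model under stated assumptions, not the actual C2 computation: the names `PreLoopLimitSketch`, `adjustment` and `newLimit` are hypothetical, the loop stride is assumed to be 1, `alignBytes` is assumed to be a power of two that `scale` divides, and `adr` is assumed to be a multiple of `scale`. The point is that the adjustment needs the value of `adr`, so `adr` has to be available before the pre-loop.

```java
// Simplified model of aligning the main-loop by adjusting the pre-loop limit.
// NOT the C2 implementation; all names and assumptions here are hypothetical.
final class PreLoopLimitSketch {
    // Extra pre-loop iterations needed so that (adr + scale * limit) is aligned.
    // Assumes: stride 1, alignBytes is a power of two, scale divides alignBytes,
    // and adr is a multiple of scale.
    static int adjustment(long adr, int scale, int oldLimit, int alignBytes) {
        long addressAtLimit = adr + (long) scale * oldLimit;
        // Modulo by a power of two, expressed as an AND with (alignBytes - 1).
        long misalignment = addressAtLimit & (alignBytes - 1);
        // Bytes missing to the next aligned address (0 if already aligned).
        long bytesToAligned = (alignBytes - misalignment) & (alignBytes - 1);
        return (int) (bytesToAligned / scale);
    }

    static int newLimit(long adr, int scale, int oldLimit, int alignBytes) {
        return oldLimit + adjustment(adr, scale, oldLimit, alignBytes);
    }

    public static void main(String[] args) {
        long adr = 1000L;    // arbitrary example base address
        int limit = newLimit(adr, 4, 3, 32);
        long firstMainLoopAddress = adr + 4L * limit;
        System.out.println("new_limit=" + limit
                + " aligned=" + (firstMainLoopAddress % 32 == 0));
    }
}
```

Because `adjustment` reads `adr` while computing the limit of the pre-loop, any node feeding `adr` must dominate the pre-loop; a `CastLL` pinned between the pre-loop and the main-loop violates exactly that.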
>> 
>> **Why does this happen?**
>> 
>> Usually, the `base` and/or `adr` of a `VPointer` are invariant not just of the main-loop but als...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
> 
>   rm lib from test

Good.

-------------

Marked as reviewed by kvn (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/18892#pullrequestreview-2018307381


More information about the hotspot-compiler-dev mailing list