RFR: 8330819: C2 SuperWord: bad dominance after pre-loop limit adjustment with base that has CastLL after pre-loop [v3]
Emanuel Peter
epeter at openjdk.org
Tue Apr 23 14:29:05 UTC 2024
> Summary: the address `adr` of the vector we want to align the main-loop for has a `CastLL` after the pre-loop and before the main-loop. When we use this address to adjust the pre-loop limit, we create a use before the `CastLL`, which leads to a "bad dominance" assert. Solution: make sure that all such base addresses `adr` are not just invariant in the main-loop, but also are invariant of/before the pre-loop.
>
> **Example where we get the "bad dominance"**
>
> This code shape comes from the attached regression tests (no matter if with Unsafe or MemorySegment).
>
> The loop is PreMainPost-ed and the main-loop unrolled. `1326 CountedLoop` is the pre-loop, and `1657 CountedLoop` is the main-loop, which contains the `1648 LoadI`. During `SuperWord`, we take this load's address to align the main-loop.
>
> The address is parsed into its components by `VPointer`:
> `VPointer[mem: 1648 LoadI, base: 1, adr: 1669, base[ 1] + offset( 0) + invar( 0) + scale( 4) * iv]`
>
> We note that this is the access to native memory via Unsafe / MemorySegment, and so there is no array pointer base, and the `base = 1 TOP`. `VPointer` tries still to find a "base" adress `adr` by parsing the very left-most input to the chain of `AddP`s. Here, there is only a single `1711 AddP`, and the left input is `adr = 1669 CastX2P`. The right side of the `AddP` is also parsed, and determined to be `4 * iv`.
>
> The problematic part: `1669 CastX2P` is "pinned" down below the pre-loop by the `1513CastLL` that is applied to `11 Parm` (= `long offset` parameter in the test).
>
> ![image](https://github.com/openjdk/jdk/assets/32593061/d5579226-797c-489e-8aa1-0c906ca59755)
>
> During `SuperWord`, we want to align the main-loop vectors. We do this by adjusting the pre-loop limit `1439 Opaque1`:
>
> ![image](https://github.com/openjdk/jdk/assets/32593061/23bafa67-1438-4057-88a7-fb72e8b06c5c)
>
> You can see the dark-green IR nodes, which compute the `new_limit = old_limit + adjustment`, where the adjustment is a modulo `1734 AndI` of the address of the `1738 LoadVector` for which we are aligning. In this computation we are also using the `adr` of our `VPointer`, which depends on the `1513 CastLL` which is pinned below the pre-loop. Thus, we are using a node inside the pre-loop that is pinned after the pre-loop. Hence the "bad dominance" assert.
>
> **Why does this happen?**
>
> Usually, the `base` and/or `adr` of a `VPointer` are invariant not just of the main-loop but also the pre-loop. The pinning after pre-loop and befor...
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
rm lib from test
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/18892/files
- new: https://git.openjdk.org/jdk/pull/18892/files/d5e51590..ffdafa80
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=18892&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=18892&range=01-02
Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod
Patch: https://git.openjdk.org/jdk/pull/18892.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/18892/head:pull/18892
PR: https://git.openjdk.org/jdk/pull/18892
More information about the hotspot-compiler-dev
mailing list