Integrated: 8303466: C2: failed: malformed control flow. Limit type made precise with MaxL/MinL
Emanuel Peter
epeter at openjdk.org
Wed Apr 26 05:59:57 UTC 2023
On Fri, 31 Mar 2023 12:44:17 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
> **Context**
>
> During `PhaseIdealLoop::do_unroll`, we hack the loop-limit, and subtract `stride` from it. We have to prevent underflow on that subtract. Currently, we do this with a `CMoveI`. The problem with this: `CMoveI` is not smart enough to generate a precise type. For example, there are many cases where the input types get better, and underflow is not possible anymore. But the `CMoveI` does not detect this, and still has type `min_jint..hi`.
>
> We have the same issue in `PhaseIdealLoop::adjust_limit`, where we use `CMoveL` to implement long max/min. The types are not as precise as they could and should be.
>
> **Problem**
>
> The imprecise type is used for the zero-trip-guard. It does not fold to false, even though the data-path into the post loop does constant fold to `TOP`. The graph breaks, and assert `malformed control flow` triggers.
>
> Details: In these cases, we have the super-unrolled main-loop (SuperWord'ed, then further unrolled) directly leading to a vectorized post-loop. The effect is that there is no `region/phi` merging main-exit and main-zero-trip-guard. So the types are already more narrow here. It may be possible that the values are such that we find out that we should never enter the vectorized post-loop. But if data finds out and control does not, we get a broken graph.
> Note: we have pre-loop. Then a main-loop and vectorized post loop. Then we merge the main-zero-trip-guard. And at the end we have the scalar post loop.
>
> I have already recently fixed a bug around this `CMoveI`. https://github.com/openjdk/jdk/commit/5a4945c0d95423d0ab07762c915e9cb4d3c66abb I would now like to have a more satisfactory fix, that properly propagates the types.
>
> **Solution**
>
> `PhaseIdealLoop::adjust_limit` already converts the limit from int to long, and does all computations in long, including taking max/min with a `CMoveL`. I now use the so far unused `MaxL/MinL`. I implemented some missing `Value/Identity` components for it. Since `MaxL/MinL` is not implemented in the backend, I just expand it in macro-expansion to a `CMoveL`. At that point the loop-opts are over, and it is most likely ok that we do not make the types more precise after this.
>
> I take the same approach for `PhaseIdealLoop::do_unroll`: convert limits to long, do subtraction in long, take `MinL/MaxL` to clamp it to the int-range (prevent subtraction underflow).
>
> **Discussion**
>
> This solution seems much cleaner to me, and I hope that we will see less bugs because of imprecise types in the limit computation, which were often due to the `CMove` not being smart enough to analyze all inputs (it would have to recognize a multitude of patterns, for the Cmp inputs and the direct inputs to the CMove - we currently do not do that, but just take the union of the input types - this is very inprecise).
>
> There is a bit of an overhead here: We use longs even though we only want to have int values. But I think we should prefer a clean implementation here, with correct type computation. The performance impact is probably non-existent on 64-bit machines anyway.
>
> **Caveat**
>
> I found some cases with the same assert `malformed control flow` that are most likely skeleton/assertion predicate bugs [JDK-8288981](https://bugs.openjdk.org/browse/JDK-8288981). Some of those cases were new patterns, for example where we PreMainPost a main loop.
>
> I hope that this fix here at least reduces the frequency of failures significantly.
>
> **Testing**
>
> I added 2 regression tests. Our fuzzer seems to spit out examples regularly, so that gives us extra coverage.
>
> Tested up to `tier5` and stress testing. Performance testing **running...**
>
> **Future Work**
>
> We should implement `MaxL/MinL` in the backend. We should also use them during parsing. This would also allow to `SuperWord` the instruction, on the platforms that support it.
>
> Should we add such an assert during IGVN? I think after IGVN, we should never have a `MultiBranchNode` that does not have the required number of outputs, right? We could add it to `VerifyIterativeGVN`.
This pull request has now been integrated.
Changeset: cc894d84
Author: Emanuel Peter <epeter at openjdk.org>
URL: https://git.openjdk.org/jdk/commit/cc894d849aa5f730d5a806acfc7a237cf5170af1
Stats: 455 lines in 9 files changed: 364 ins; 69 del; 22 mod
8303466: C2: failed: malformed control flow. Limit type made precise with MaxL/MinL
Reviewed-by: roland, kvn, chagedorn, thartmann
-------------
PR: https://git.openjdk.org/jdk/pull/13269
More information about the hotspot-compiler-dev
mailing list