RFR: 8173709: Fix VerifyLoopOptimizations - step 1 [v3]

Emanuel Peter epeter at openjdk.org
Wed Mar 29 13:51:07 UTC 2023


> I am reviving `-XX:+VerifyLoopOptimizations` after many years of abandonment. There were many bugs filed, but so far it has not been addressed.
> 
> The hope is that this work will allow us to catch ctrl / idom / loop body bugs quicker, and fix many of the existing ones along the way.
> 
> **The Idea of VerifyLoopOptimizations**
> Before loop-opts, we build many data-structures for dominance (idom), control, and loop membership. Then, loop-opts use this data to transform the graph. At the same time, they must maintain the correctness of the data-structures, so that other optimizations can be made, without needing to re-compute the data-structures every time.
> `VerifyLoopOptimizations` was implemented to verify correctness of the data-structures. After some loop-opts, we re-compute a verification data-structure and compare it to the one we created before the loop-opts and maintained during loopopts.
> 
> **My Approach**
> I soon realized that there were many reasons why `VerifyLoopOptimizations` was broken. It seemed infeasible to fix all of them at once. I decided to first remove any part that was failing, until I have a minimal set that is working. I will leave many parts commented out. In follow-up RFE's, I will then iteratively improve the verification by re-enabling some verification and fixing the corresponding bugs.
> 
> **What I fixed**
> 
>  - `verify_compare`
>    - Renamed it to `verify_nodes`, since it does verification node-by-node (vs `verify_tree`, which verifies the loop-tree).
>    - Previously, it was implemented as a BFS with recursion, which lead to stack-overflow. I flattened the BFS into a loop.
>    - The BFS calls `verify_node` on every node. I refactored `verify_node` a bit, so that it is more readable.
>    - Rather than having a thread-unsafe static variable `fail`, I now made it a reference argument.
>  - `verify_tree`
>    - I corrected the style and improved comments.
>    - I removed the broken verification for `Opaque` nodes. I added some rudamentary verification for `CountedLoop`. I leave more of this work for follow-up RFE's.
> 
> **Disabled Verifications**
> I commented out the following verifications:
> 
> (A) data nodes should have same ctrl
> (B) ctrl node should belong to same loop
> (C) ctrl node should have same idom
> (D) loop should have same tail
> (E) loop should have same body (list of nodes)
> (F) broken verification in PhaseIdealLoop::build_loop_late_post, because ctrl was set wrong
> 
> 
> Note: verifying `idom`, `ctrl` and `_body` is the central goal of `VerifyLoopOptimizations`. But all of them are broken in many parts of the VM, as we have now not verified them for many years.
> 
> **Follow-Up Work**
> 
> I filed a first follow-up RFE [JDK-8305073](https://bugs.openjdk.org/browse/JDK-8305073). The following tasks should be addressed in it, or in subsequent follow-up RFE's.
> 
> I propose the following order:
> 
>  - idom (C): The dominance structure is at the base of everything else.
>  - ctrl / loop (A, B): Once dominance is fixed, we can ensure every node is assigned to the correct ctrl/loop.
>  - tail (D): ensure the tail of a loop is updated correctly
>  - body (E): nodes are assigned to the `_body` of a loop, according to the node ctrl.
>  - other issues like (F)
>  - Add more verification to IdealLoopTree::verify_tree. For example zero-trip-guard, etc.
>  - Evaluate from where else we should call `PhaseIdealLoop::verify`. Maybe we are missing some cases.
> 
> **Testing**
> I am running `tier1-tier6` and stress testing.
> Preliminary results are all good.
> 
> **Conclusion**
> With this fix, I have the basic infrastructure of the verification working.
> However, all of the substantial verification are now still disabled, because there are too many places in the VM that do not maintain the data-structures properly.
> Follow-up RFE's will have to address these one-by-one.

Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:

  fix after Christian's suggestions

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/13207/files
  - new: https://git.openjdk.org/jdk/pull/13207/files/6cd3479b..5ef6113f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=13207&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13207&range=01-02

  Stats: 124 lines in 2 files changed: 55 ins; 37 del; 32 mod
  Patch: https://git.openjdk.org/jdk/pull/13207.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13207/head:pull/13207

PR: https://git.openjdk.org/jdk/pull/13207


More information about the hotspot-compiler-dev mailing list