RFR: 8173709: Fix VerifyLoopOptimizations - step 1
Emanuel Peter
epeter at openjdk.org
Wed Mar 29 08:04:50 UTC 2023
On Wed, 29 Mar 2023 07:50:34 GMT, Roberto Castañeda Lozano <rcastanedalo at openjdk.org> wrote:
>> I am reviving `-XX:+VerifyLoopOptimizations` after many years of abandonment. There were many bugs filed, but so far it has not been addressed.
>>
>> The hope is that this work will allow us to catch ctrl / idom / loop body bugs quicker, and fix many of the existing ones along the way.
>>
>> **The Idea of VerifyLoopOptimizations**
>> Before loop-opts, we build many data-structures for dominance (idom), control, and loop membership. Then, loop-opts use this data to transform the graph. At the same time, they must maintain the correctness of the data-structures, so that other optimizations can be made, without needing to re-compute the data-structures every time.
>> `VerifyLoopOptimizations` was implemented to verify correctness of the data-structures. After some loop-opts, we re-compute a verification data-structure and compare it to the one we created before the loop-opts and maintained during loopopts.
>>
>> **My Approach**
>> I soon realized that there were many reasons why `VerifyLoopOptimizations` was broken. It seemed infeasible to fix all of them at once. I decided to first remove any part that was failing, until I have a minimal set that is working. I will leave many parts commented out. In follow-up RFE's, I will then iteratively improve the verification by re-enabling some verification and fixing the corresponding bugs.
>>
>> **What I fixed**
>>
>> - `verify_compare`
>> - Renamed it to `verify_nodes`, since it does verification node-by-node (vs `verify_tree`, which verifies the loop-tree).
>> - Previously, it was implemented as a BFS with recursion, which lead to stack-overflow. I flattened the BFS into a loop.
>> - The BFS calls `verify_node` on every node. I refactored `verify_node` a bit, so that it is more readable.
>> - Rather than having a thread-unsafe static variable `fail`, I now made it a reference argument.
>> - `verify_tree`
>> - I corrected the style and improved comments.
>> - I removed the broken verification for `Opaque` nodes. I added some rudamentary verification for `CountedLoop`. I leave more of this work for follow-up RFE's.
>>
>> **Disabled Verifications**
>> I commented out the following verifications:
>>
>> (A) data nodes should have same ctrl
>> (B) ctrl node should belong to same loop
>> (C) ctrl node should have same idom
>> (D) loop should have same tail
>> (E) loop should have same body (list of nodes)
>> (F) broken verification in PhaseIdealLoop::build_loop_late_post, because ctrl was set wrong
>>
>>
>> Note: verifying `idom`, `ctrl` and `_body` is the central goal of `VerifyLoopOptimizations`. But all of them are broken in many parts of the VM, as we have now not verified them for many years.
>>
>> **Follow-Up Work**
>>
>> I filed a first follow-up RFE [JDK-8305073](https://bugs.openjdk.org/browse/JDK-8305073). The following tasks should be addressed in it, or in subsequent follow-up RFE's.
>>
>> I propose the following order:
>>
>> - idom (C): The dominance structure is at the base of everything else.
>> - ctrl / loop (A, B): Once dominance is fixed, we can ensure every node is assigned to the correct ctrl/loop.
>> - tail (D): ensure the tail of a loop is updated correctly
>> - body (E): nodes are assigned to the `_body` of a loop, according to the node ctrl.
>> - other issues like (F)
>> - Add more verification to IdealLoopTree::verify_tree. For example zero-trip-guard, etc.
>> - Evaluate from where else we should call `PhaseIdealLoop::verify`. Maybe we are missing some cases.
>>
>> **Testing**
>> I am running `tier1-tier6` and stress testing.
>> Preliminary results are all good.
>>
>> **Conclusion**
>> With this fix, I have the basic infrastructure of the verification working.
>> However, all of the substantial verification are now still disabled, because there are too many places in the VM that do not maintain the data-structures properly.
>> Follow-up RFE's will have to address these one-by-one.
>
> Thanks for working on this, Emanuel! Is there any chance that, in the scope of this work, we could enable the `VerifyLoopOptimizations` flag by default? This would mitigate the risk of this code degrading again in the future. In my opinion, it would be worth doing this even if it meant sacrificing some of the more expensive verification steps.
@robcasloz This is a trade-off with all verification and stress code.
I think our general approach is that any time-consuming verification/stress testing only runs with a flag. This is important so that the performance does not drop too much. Because it would change the profiling and ergonomics, order of compilation and inlining, etc. Product failures may not easily reproduce in debug. Plus, it just makes all testing slower and more expensive.
`VerifyLoopOptimizations` has quite the overhead. First, you need to build the loop-tree. That alone consists of multiple traversals over the whole graph.
I suggest we keep it guarded by the flag, but add it to stress testing. And when one is touching anything to do with loop-opts, one should probably run testing with the flag.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/13207#issuecomment-1488119585
More information about the hotspot-compiler-dev
mailing list