RFR: 8307084: C2: Vectorized drain loop is not executed for some small trip counts [v4]
Fei Gao
fgao at openjdk.org
Thu Jan 15 18:30:45 UTC 2026
On Wed, 14 Jan 2026 14:08:37 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> Fei Gao has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits:
>>
>> - Fix build failure after rebasing and address review comments
>> - Merge branch 'master' into optimize-atomic-post
>> - Fixed new test failures after rebasing and refined parts of the code to address review comments
>> - Merge branch 'master' into optimize-atomic-post
>> - Merge branch 'master' into optimize-atomic-post
>> - Clean up comments for consistency and add spacing for readability
>> - Fix some corner case failures and refined part of code
>> - Merge branch 'master' into optimize-atomic-post
>> - Refine ascii art, rename some variables and resolve conflicts
>> - Merge branch 'master' into optimize-atomic-post
>> - ... and 3 more: https://git.openjdk.org/jdk/compare/a8552243...ab1de504
>
> src/hotspot/share/opto/loopTransform.cpp line 1337:
>
>> 1335: } else {
>> 1336: if (get_ctrl(n) != back_ctrl) { return n; }
>> 1337: }
>
> Is the `clone_up_backedge` name still correct for all cases?
> It seems before we only cloned up nodes that belonged to `back_ctrl`.
> But now we also clone from exit pre-loop **exit path**.
>
> I never liked the `goo` anyway...
>
> And: why is it called `clone_up`? What "up" does it refer to?
>
> What about `clone_up(FromBackedge, ...` and `clone_up(FromPreLoopExit, ...`, using two enums? Then we can be a bit more explicit which case we are in, and add corresponding asserts for `back_ctrl` (non-null vs null).
Yeah, I agree. We should rename it. How about `resolve_value_for_preheader`?
> What about clone_up(FromBackedge, ... and clone_up(FromPreLoopExit, ..., using two enums? Then we can be a bit more explicit which case we are in, and add corresponding asserts for back_ctrl (non-null vs null).
That sounds good. Something like:
resolve_value_for_preheader(FromBackedge, ...)
resolve_value_for_preheader(FromPreLoopExit, ...)
> src/hotspot/share/opto/loopTransform.cpp line 1402:
>
>> 1400: Node* main_backedge_ctrl = main_head->back_control();
>> 1401: // For the post loop, we call clone_up_backedge_goo() to obtain the fall-out values
>> 1402: // from the main loop, which serve as the fall-in values for the post loop.
>
> The naming of `clone_up_backedge_goo` is confusing me a bit:
> We seem to clone down (main -> post) and we clone fall-out (exit) values, and not backedge values.
Agreed - that’s really confusing. I had considered renaming it before, but it would have required touching some unrelated files, so I decided against it. However, I now think the rename is necessary for clarity.
> I'm wondering if/how ControlAroundStripMined relates to the post loop case?
We use `ControlAroundStripMined` to clone the post loop. See the mainline code here: https://github.com/openjdk/jdk/blob/75420e9314c54adc5b45f9b274a87af54dd6b5a8/src/hotspot/share/opto/loopTransform.cpp#L1697
> if the branch is not taken, you assume it can only be the drain loop.
Yes.
> Could we assert mode == InsertVectorizedDrain down there?
Sounds good. I can add it.
> What about renaming pre_incr -> main_input or after_pre. That would remove the iv connotation, and be more parallel to drain_input or after_main.
Sounds reasonable.
> But a similar question here: what if we had a split-through-phi here?
Before split-through-phi, we would have had some input x, but afterwards we'd have op(x) here.
Would that not mean that we should use x for the input to drain_input, but are getting op(x)?
A similar argument applies here as well. We also use `clone_up_backedge_goo()` along this path to resolve the correct pre_incr value after the pre-loop. Even if a split-through-phi has occurred, the helper will walk and clone the input chain as needed to recover the appropriate value whose target control is the pre-exit control (`main_merge_region->in(1)`).
> src/hotspot/share/opto/loopTransform.cpp line 1482:
>
>> 1480: pre_incr = clone_up_backedge_goo(nullptr, main_merge_region->in(1), pre_incr, visited, clones);
>> 1481: }
>> 1482: drain_input->set_req(1, pre_incr);
>
> Just a control question: above you did:
> `drain_input = PhiNode::make(main_merge_region, iv_after_main);`
> Does that not put `iv_after_main` at slot `1`, and now we overwrite it with `pre_incr`?
Yes, this is what we're generating:
`drain_input = merge (pre_incr, iv_after_main)`
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2694628680
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2694641458
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2694662395
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2695142459
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2695189822
More information about the hotspot-compiler-dev
mailing list