RFR: 8307084: C2: Vectorized drain loop is not executed for some small trip counts [v2]

Emanuel Peter epeter at openjdk.org
Mon Sep 8 11:03:33 UTC 2025


On Mon, 8 Sep 2025 09:38:11 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> src/hotspot/share/opto/loopTransform.cpp line 1887:
>> 
>>> 1885:     // from the main loop and the pre loop.
>>> 1886:     zero_ctrl = main_exit->unique_ctrl_out_or_null();
>>> 1887:     assert(zero_ctrl, "if zero_ctrl doesn't exist, pre-main-post model fails.");
>> 
>> Style guide forbids implicit null / zero checks.
>> Suggestion:
>> 
>>     assert(zero_ctrl != nullptr, "if zero_ctrl doesn't exist, pre-main-post model fails.");
>
> What do you mean by `pre-main-post model fails`? PreMainPost has presumably already succeeded. Can you reformulate?

Why not add the `Region` check already here?

>> src/hotspot/share/opto/loopTransform.cpp line 1934:
>> 
>>> 1932: 
>>> 1933:   // Step 3: Find a 'new_phi' which is the input trip count of the zero trip guard.
>>> 1934:   Node* new_incr = nullptr;
>> 
>> Is it called `new_phi` or `new_incr`?
>
> `phi` is usually the `PhiNode`, and `incr` is the `AddINode`, right?

So you could actually make the type more precise than `Node*` :)

>> src/hotspot/share/opto/loopopts.cpp line 2458:
>> 
>>> 2456: 
>>> 2457:   while (worklist.size()) {
>>> 2458:     Node* use = worklist.pop();
>> 
>> Can you add a quick comment what kind of traversal this is? BFS? Over what nodes?
>
> Ah, are we only removing nodes?

Oh, you have another implicit zero check here.

>> src/hotspot/share/opto/loopopts.cpp line 2477:
>> 
>>> 2475:       while (visit_list.size()) {
>>> 2476:         Node* curr = visit_list.at(0);
>>> 2477:         visit_list.remove(0);
>> 
>> That `remove` ends up calling `Node_Array::remove`, which copies all upper entries. Generally not very performant. Not sure if it matters here, just noticed it.
>
> Maybe you can construct some graph where this really visits a lot of nodes, then this could blow up quadratically.

`pop` is more efficient, because it just takes it from the end. But then you'd get a DFS and not BFS.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2329711083
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2329746816
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2329856226
PR Review Comment: https://git.openjdk.org/jdk/pull/22629#discussion_r2329879100


More information about the hotspot-compiler-dev mailing list