RFR: 8356708: C2: loop strip mining expansion doesn't take sunk stores into account [v4]

Emanuel Peter epeter at openjdk.org
Tue Jun 17 07:17:35 UTC 2025


On Mon, 16 Jun 2025 15:26:03 GMT, Roland Westrelin <roland at openjdk.org> wrote:

>> `test1()` has a counted loop with a `Store` to `field`. That `Store`
>> is sunk out of loop. When the `OuterStripMinedLoop` is expanded, only
>> `Phi`s that exist at the inner loop are added to the outer
>> loop. There's no `Phi` for the slice of the sunk `Store` (because
>> there's no `Store` left in the inner loop) so no `Phi` is added for
>> that slice to the outer loop. As a result, there's a missing anti
>> dependency for `Load` of `field` that's before the loop and it can be
>> scheduled inside the outer strip mined loop which is incorrect.
>> 
>> `test2()` is the same as `test1()` but with a chain of 2 `Store`s.
>> 
>> `test3()` is another variant where a `Store` is left in the inner loop
>> after one is sunk out of it so the inner loop still has a `Phi`. As a
>> result, the outer loop also gets a `Phi` but it's incorrectly wired as
>> the sunk `Store` should be the input along the backedge but is
>> not. That one doesn't cause any failure AFAICT.
>> 
>> The fix I propose is some extra logic at expansion of the
>> `OuterStripMinedLoop` to handle these corner cases.
>
> Roland Westrelin has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - Update test/hotspot/jtreg/compiler/loopstripmining/TestStoresSunkInOuterStripMinedLoop.java
>    
>    Co-authored-by: Roberto Castañeda Lozano <robcasloz at users.noreply.github.com>
>  - Update test/hotspot/jtreg/compiler/loopstripmining/TestStoresSunkInOuterStripMinedLoop.java
>    
>    Co-authored-by: Roberto Castañeda Lozano <robcasloz at users.noreply.github.com>

@rwestrel Thanks for the work here :)

As mentioned previously, it feels like we have allowed a violation of C2 IR assumptions, namely that there is always a Phi if we mutate the memory state. And now we need to clean up things after that violation. Not great, but I understand why we got here: for simplicity of the outer strip mined loop, and not affecting other optimizations.

Like I ask in one of my code comments below:
Is there any place where we describe why we do not have Phis at the outer loop, and why we think that should be ok? It would be good to have those assumptions documented. And then you can refer to the method `OuterStripMinedLoopNode::handle_sunk_stores_at_expansion`, where we have to clean things up, and from there also back to the main description.

I see you have some minimal comments at `PhaseIdealLoop::create_outer_strip_mined_loop`:

// to the loop head. The inner strip mined loop is left as it is. Only
// once loop optimizations are over, do we adjust the inner loop exit
// condition to limit its number of iterations, set the outer loop
// exit condition and add Phis to the outer loop head.

I think we should add some more info there, and link to and from `OuterStripMinedLoopNode::handle_sunk_stores_at_expansion`.

Additionally / alternatively, you could also comment directly at the `OuterStripMinedLoopNode` class.

I just would like to prevent the situation where you are the only person who is able to understand how the outer strip mined loop works ;)

src/hotspot/share/opto/loopnode.cpp line 2876:

> 2874:                                                                         PhaseIdealLoop* iloop) const {
> 2875:   CountedLoopNode* inner_cl = inner_counted_loop();
> 2876:   Node* cle_out = inner_loop_exit();

Suggestion:

  IfFalseNode* cle_out = inner_loop_exit();

Optional :)

src/hotspot/share/opto/loopnode.cpp line 2992:

> 2990: }
> 2991: 
> 2992: // Sunk stores should be referenced from an outer loop memory Phi

I think you really need to give some longer explanation here why we need to do what you do here.

Also: is there anywhere a description why we do not have the phi already by default for outer loops? Because I think we should really describe that somewhere, and state our assumptions. And then you could also refer to that description from here, and from there to here.

src/hotspot/share/opto/loopnode.cpp line 2993:

> 2991: 
> 2992: // Sunk stores should be referenced from an outer loop memory Phi
> 2993: void OuterStripMinedLoopNode::handle_sunk_stores_at_expansion(PhaseIterGVN* igvn) {

What does the word "expansion" refer to? Could you also mention that in your code comment above, please?

src/hotspot/share/opto/loopnode.cpp line 2994:

> 2992: // Sunk stores should be referenced from an outer loop memory Phi
> 2993: void OuterStripMinedLoopNode::handle_sunk_stores_at_expansion(PhaseIterGVN* igvn) {
> 2994:   Node* cle_exit_proj = inner_loop_exit();

Suggestion:

  IfFalseNode* cle_exit_proj = inner_loop_exit();

Optional

src/hotspot/share/opto/loopnode.cpp line 2996:

> 2994:   Node* cle_exit_proj = inner_loop_exit();
> 2995: 
> 2996:   // Sunk stores are pinned on the loop exit projection of the inner loop

Can you add a description why?

src/hotspot/share/opto/loopnode.cpp line 3007:

> 3005: #endif
> 3006: 
> 3007:   // Sunk stores are reachable from the memory state of the outer loop safepoint

Is it true that the control of the safepoint is the `cle_exit_proj`? Could we add an assert for that? So we are just looking for all memory between those two control nodes? Or is it more complicated?

src/hotspot/share/opto/loopnode.cpp line 3008:

> 3006: 
> 3007:   // Sunk stores are reachable from the memory state of the outer loop safepoint
> 3008:   Node* safepoint = outer_safepoint();

Suggestion:

  SafePointNode* safepoint = outer_safepoint();

Optional

src/hotspot/share/opto/loopnode.cpp line 3014:

> 3012:     return;
> 3013:   }
> 3014:   MergeMemNode* mm = safepoint_mem->as_MergeMem();

You don't use `safepoint_mem` for anything else than checking if it is a `MergeMem`. So why not do this:
Suggestion:

  MergeMemNode* mm = safepoint->in(TypeFunc::Memory)->isa_MergeMem;
  if (mm == nullptr) {
    // There is no MergeMem, which should only happen if there was no memory node
    // sunk out of the loop.
    assert(stores_in_outer_loop_cnt == 0, "inconsistent");
    return;
  }

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25717#pullrequestreview-2934351620
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151459471
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151462464
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151487902
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151459759
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151463314
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151465652
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151466480
PR Review Comment: https://git.openjdk.org/jdk/pull/25717#discussion_r2151472016


More information about the hotspot-compiler-dev mailing list