RFR: 8333393: PhaseCFG::insert_anti_dependences can fail to raise LCAs and to add necessary anti-dependence edges [v13]

Wed Feb 5 16:21:11 UTC 2025

On Wed, 5 Feb 2025 14:32:11 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> src/hotspot/share/opto/gcm.cpp line 831:
>> 
>>> 829:   // | 8 membar_release <- 7 | early
>>> 830:   // | ...                   |
>>> 831:   // +-----------------------+
>> 
>> I just discussed this example with @chhagedorn 
>> 
>> // Patch the existing phi to select an input from the merge:
>> // Phi:AT1(...MergeMem(m0, m1, m2)...) into
>> //     Phi:AT1(...m1...)
>> int alias_idx = phase->C->get_alias_index(at);
>> 
>> And
>> 
>>         // Phi(...MergeMem(m0, m1:AT1, m2:AT2)...) into
>>         //     MergeMem(Phi(...m0...), Phi:AT1(...m1...), Phi:AT2(...m2...))
>> 
>> 
>> In `cfgnode.cpp`, we try to move the MergeMem after Phi. Why does this not happen in this example?
>> 
>> There are many cases in that code... but it seems to me that here something may be missing. I have not given it more time though.
>> 
>> If we knew that MergeMem always happened after the Phi, then we could search only from the `initial_mem`, and would walk through all relevant MergeMem, right?
>> 
>> This is just an intuition, but maybe having MergeMem after Phi is a fundamental assumption. Or maybe it just happens in all cases, and yours is the only one we found so far where that is not possible.
>> 
>> What do you think?
>
> @dlunde I really don't want to block you here. I never understood the memory graph above the initial mem. Now that I see the example I'm getting new ideas 😅

Thanks for the comments, @eme64 and @chhagedorn! I'm happy to iterate, so never hesitate to comment. I do recall that we discussed these MergeMem/Phi swap idealizations offline last week.

I think this looks very promising. Looking at the two rules you mention and applying them iteratively to our example

7 Phi(3 MergeMem(1:A, 2:L), 5 MergeMem(1:A, 4:L))

I get

7 Phi(3 MergeMem(1:A, 2:L), 5 MergeMem(1:A, 4:L)) into
    MergeMem(Phi:A(1:A, 5 MergeMem(1:A, 4:L)),
             Phi:L(2:L, 5 MergeMem(1:A, 4:L))) into
    MergeMem(MergeMem(Phi:A(1:A, 1:A), Phi:L(1:A, 4:L)),
             Phi:L(2:L, 5 MergeMem(1:A, 4:L))) into
    MergeMem(MergeMem(Phi:A(1:A, 1:A), Phi:L(1:A, 4:L)),
             Phi:L(2:L, 4:L))

After this, we should be able to merge the resulting `Phi:L(2:L, 4:L)` with `6 Phi` (`initial_mem`). Essentially, we have broken out the `L` part of `7 Phi` and found that it is the same as `6 Phi`. I guess this is also what you are saying?

For EXAMPLE 2:

4 Phi(1:A, 3 MergeMem(1:A, 2:!L)) into
    MergeMem(Phi(1:A, 1:A), Phi(1:A, 2:!L))

`Phi(1:A, 1:A)` is just `1:A`, so we then have a Phi-free path from `1 MachProj` to `5 membar_release` as well!
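
To make the two rewrites above easier to check by hand, here is a minimal toy sketch of the per-slice split and the `Phi(x, x)` collapse. It is not HotSpot code: `MemState`, `SplitPhi`, `split_phi`, and `collapse` are hypothetical names invented for illustration, nodes are represented by their numbers as strings, and a MergeMem is assumed to be just a base (`A`) memory plus per-slice overrides.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for one Phi input (hypothetical, not a HotSpot type).
// A plain wide memory state has only `base`; a MergeMem also has overrides.
struct MemState {
  std::string base;                           // producer of the wide (A) memory
  std::map<std::string, std::string> slices;  // slice name -> overriding producer

  // Producer seen by `slice`: the override if present, otherwise the base.
  const std::string& at(const std::string& slice) const {
    auto it = slices.find(slice);
    return it == slices.end() ? base : it->second;
  }
};

// Result of Phi(...MergeMem(...)...) into MergeMem(Phi(...), Phi:AT1(...), ...):
// the inputs of the base Phi and of one narrow Phi per slice.
struct SplitPhi {
  std::vector<std::string> base;
  std::map<std::string, std::vector<std::string>> slices;
};

SplitPhi split_phi(const std::vector<MemState>& inputs) {
  assert(!inputs.empty());
  // Collect every slice that any input overrides.
  std::set<std::string> names;
  for (const MemState& in : inputs)
    for (const auto& kv : in.slices) names.insert(kv.first);

  SplitPhi out;
  for (const MemState& in : inputs) {
    out.base.push_back(in.base);
    for (const std::string& name : names)
      out.slices[name].push_back(in.at(name));
  }
  return out;
}

// Phi(x, x, ...) is just x. Returns "" when the inputs differ,
// meaning a real Phi is still needed.
std::string collapse(const std::vector<std::string>& phi_inputs) {
  assert(!phi_inputs.empty());
  for (const std::string& in : phi_inputs)
    if (in != phi_inputs.front()) return "";
  return phi_inputs.front();
}
```

In this model, the first example (inputs with base `1` and `L` overrides `2` and `4`) splits into a base Phi that collapses to `1` and an `L` Phi with inputs `2` and `4`. EXAMPLE 2 splits into a base Phi that collapses to `1` and a `!L` Phi with inputs `1` and `2`, which does not collapse.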

I'll have a look and see if I can figure out why we do not apply such idealizations here.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/22852#discussion_r1943240611