RFC : Approach to handle Allocation Merges in C2 Scalar Replacement

Thu Apr 28 18:31:23 UTC 2022

Hi, Xin Liu.

Sorry for the delay in getting back to you!

> I found that 'ConnectionGraph::split_unique_types' is the key for Scalar
> Replacement in HotSpot. It enforces a property 'unique typing' for a
> java object. If a java object is 'unique typing', its all uses are
> exclusively refer to that java object.

Yes, I agree with you here. split_unique_types create a new instance type 
for each Allocate and propagate the type transitively to all nodes accessing 
that allocate. The instance type has a "_instance_id" property that is used to 
identify the Allocate related to the object being accessed.

> A phi(o1, o2) original has adr_type alias<X>. X is a general MyPair*.
> After some transformations, we can have distinct alias<Y> and
> alias<Y+1>. They are both instance-specific aliases. In our previous
> discussion, the splitting I called Ivanov/Divino Approach :) not only
> gets rid of phi(o1, o2) but also makes o1 and o2 unique typing. 

I think you're referring to the selector-based approach that Vladimir 
and I were discussing. Note that I found that the idea doesn't work 
because we can't access the bases directly as the idea proposes.

> Here is my new thinking on this. If you agree with my understanding of
> 'unique typing' property so far. Let's take a step back and review it. I
> feel it's too strong and it's really complex to enforce it. All we need
> is more precise alias analysis.

I guess you might say that the "unique typing" property is strong/complex. 
But on the other hand, you really need something to uniquely identify which 
object you're scalar replacing AFAIU. We might find a way to get rid of the 
"unique typing" approach but we need to consider all the pros and cons. The
current implementation has been in place for more than a decade and is solid,
making a big change to it risks adding new bugs.

> Let's assume we have a phi(o1, o2), and we know that o1 has alias<Y> and
> o2 has alias<Y+1>. splitting phi beforehand is not required. We can
> leverage the fact that Java is a strong-typed OO language, so memory
> locations of o1 and o2 don't overlap. We know which java object we are
> dealing with in scalar replacement. It is alloc->_idx! we can take a
> path at phi(o1, o2).

Which alloc node exactly? Also, note that the phi functions are only merging
the references to the objects, not the objects themselves. Suppose you have this Java code:

Point p = null;
Point p0 = new Point(...);
Point p1 = new Point(...);
if (...) p = p0; else p = p1;
for (...) p.x += i;
return p0.x + p1.x;

When doing the "split" you need to take into consideration that the inputs to the phi *might* 
be used directly after the "split".

> Here is the simplest case.  we can allow SR to deal with NSR objects
> which have been dead before reaching any safepoint. In your example
> escapeAnalysisFails(), 'o' is not live at the exiting safepoint.
> Therefore, it's not c2's responsibility to save its state.
> process_users_of_allocation(alloc) can simplify
> from (LoadI, memphi(o1, o2), AddP(phi(o1, o2), #offset))
> to      (LoadI, o1, AddP(o1, #offset)) for o1. It will do the similar change for the 2nd java object.

I think you'll not be able to access o1 and o2 directly in the Load because their 
definitions might not dominate the addp->load. Other things need to be considered too if you want to handle the general case:
- Even if you are able to split the loads and move them "above" the merge phi, it's not always easy to find the appropriate Memory input for the loads.
- In some cases the loads->addp have control inputs and moving them "above" the merge phi isn't easy/possible.
- Even if we are able to replace the loads/stores there might be some uses of the object itself preventing the removal of the phi and therefore scalarization.
--- For instance, you're comparing the output of the phi with another object or with NULL to decide to access the object fields.

****

As I mentioned to you in person: I have an implementation that can identify certain patterns of 
merge+uses and remove them. That helps in some cases and I see more objects being scalar replaced. 
I'm doing more experiments and trying to make it more general before creating a PR. *However*, it's
not a general solution for the merge problem.

That being said, I have an idea that I think can be used to solve the general allocation merge problem. I'm working on
some details and I'll post it here before EOW.

Thank you for looking into this Xin Liu. Thank you for asking these questions!

Cesar