RFR: 8289943: Simplify some object allocation merges [v13]

Tue Oct 11 23:16:16 UTC 2022

On Thu, 6 Oct 2022 16:50:28 GMT, Cesar Soares Lucas <cslucas at openjdk.org> wrote:

>> Hi there, can I please get some feedback on this approach to simplify object allocation merges in order to promote Scalar Replacement of the objects involved in the merge?
>> 
>> The basic idea for this [approach was discussed in this thread](https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2022-April/055189.html) and it consists of: 
>> 1) Identify Phi nodes that merge object allocations and replace them with a new IR node called ReducedAllocationMergeNode (RAM node).
>> 2) Scalar Replace the incoming allocations to the RAM node.
>> 3) Scalar Replace the RAM node itself.
>> 
>> There are a few conditions for doing the replacement of the Phi by a RAM node though - Although I plan to work on removing them in subsequent PRs:
>> 
>> - The only supported users of the original Phi are AddP->Load, SafePoints/Traps, DecodeN.
>> 
>> These are the critical parts of the implementation and I'd appreciate it very much if you could tell me if what I implemented isn't violating any C2 IR constraints:
>> 
>> - The way I identify/use the memory edges that will be used to find the last stored values to the merged object fields.
>> - The way I check if there is an incoming Allocate node to the original Phi node.
>> - The way I check if there is no store to the merged objects after they are merged.
>> 
>> Testing:
>> - Windows/Linux/MAC fastdebug/release 
>>   - hotspot_all
>>   - tier1
>>   - Renaissance
>>   - dacapo
>>   - new IR-based tests
>
> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Fix x86 tests.

Additional high-level comments.

(1) Overall, I don't see (yet) enough motivation to justify the addition of `ReducedAllocationMerge`. I'd prefer to see the new node go away.

`ReducedAllocationMerge` nodes are short-lived. As I can infer from the code, they are used solely for the purpose of caching field information (in addition to marking that the original phi satisfied some requirements). Have you considered using existing `ConnectionGraph` instance for bookkeeping purposes? It's available during IGVN and macro expansion. 

Also, I believe you face some ideal graph inconsistencies because you capture information too early (before `split_unique_types` and following IGVN pass; and previous allocation eliminations during `eliminate_macro_nodes()` may contribute to that).

(2) Following up on my earlier question about interactions with `split_unique_types()`, I'm worried that you remove corresponding `LocalVar`s from the ConnectionGraph and introduce with unique memory slices. I'd feel much more confident in the correctness if you split slices for unions of interacting allocations instead.

-------------

PR: https://git.openjdk.org/jdk/pull/9073