RFR: 8333258: C2: high memory usage in PhaseCFG::insert_anti_dependences()

Roland Westrelin roland at openjdk.org
Thu Jun 20 13:53:16 UTC 2024


On Thu, 20 Jun 2024 13:16:04 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> In a debug build, `PhaseCFG::insert_anti_dependences()` is called
>> twice for a single node: once for actual processing, once for
>> verification.
>> 
>> In TestAntiDependenciesHighMemUsage, the test has a `Region` that
>> merges 337 incoming paths. It also has one `Phi` per memory slice that
>> is stored to: 1000 `Phi` nodes. Each `Phi` node has 337 inputs that
>> are identical except for one. The common input is the memory state on
>> method entry. The test has 60 `Load` nodes that need to be processed for
>> anti dependences. All `Load` nodes share the same memory input: the memory
>> state on method entry. For each `Load`, all `Phi` nodes are pushed 336
>> times on the work lists for anti dependence processing because all of
>> them appear multiple times as uses of each `Load`'s memory state: `Phi`s
>> are pushed 336,000 times on 2 work lists. Memory is not reclaimed on exit
>> from `PhaseCFG::insert_anti_dependences()` so memory usage grows as
>> `Load` nodes are processed:
>> 
>> 336000 * 2 work lists * 60 loads * 8 bytes pointer = 322 MB. 
>> 
>> The fix I propose for this is to not push `Phi` nodes more than once
>> when they have the same inputs multiple times.
>> 
>> In TestAntiDependenciesHighMemUsage2, the test has 4000 loads. For
>> each of them, when processed for anti dependences, all 4000 loads are
>> pushed on the work lists because they share the same memory
>> input. Then when they are popped from the work list, they are
>> discarded because only stores are of interest:
>> 
>> 4000 loads processed * 4000 loads pushed * 2 work lists * 8 bytes pointer = 256 MB. 
>> 
>> The fix I propose for this is to test before pushing on the work list
>> whether a node is a store or not.
>> 
>> Finally, I propose adding a `ResourceMark` so memory doesn't
>> accumulate over calls to `PhaseCFG::insert_anti_dependences()`.
>
> src/hotspot/share/opto/gcm.cpp line 689:
> 
>> 687: 
>> 688:     if (op == Op_MachProj || op == Op_Catch)   continue;
>> 689:     if (store->needs_anti_dependence_check())  continue;  // not really a store
> 
> Why did you remove this?

I didn't remove it, I moved it. That is what this part of the description refers to:

>> The fix I propose for this is to test before pushing on the work list
>> whether a node is a store or not.

Today we push all uses and then filter out those that are not of interest when they are popped. What I propose instead is to filter out what's not useful before it's pushed, so the queue doesn't grow large with nodes that are going to be discarded anyway.
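
To make the difference concrete, here is a minimal standalone sketch of the two strategies. It is not the HotSpot code: `Node`, `is_store`, `Result`, `filter_on_pop()` and `filter_before_push()` are hypothetical names made up for illustration, and the work list is a plain `std::vector`. It only shows that the filtering result is the same while the peak work list size differs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a C2 node; this is NOT HotSpot's Node class.
struct Node {
  bool is_store;  // stand-in for "of interest to the anti-dependence search"
};

struct Result {
  std::size_t stores_seen;    // nodes that survived the filter
  std::size_t peak_worklist;  // largest work list size reached
};

// Behaviour described as "today": push every use, discard non-stores on pop.
Result filter_on_pop(const std::vector<Node>& uses) {
  std::vector<const Node*> worklist;
  Result r{0, 0};
  for (const Node& n : uses) {
    worklist.push_back(&n);  // pushed unconditionally
  }
  r.peak_worklist = worklist.size();
  while (!worklist.empty()) {
    const Node* n = worklist.back();
    worklist.pop_back();
    if (!n->is_store) continue;  // filtered only after it took up space
    ++r.stores_seen;
  }
  return r;
}

// Proposed behaviour: apply the same filter before pushing.
Result filter_before_push(const std::vector<Node>& uses) {
  std::vector<const Node*> worklist;
  Result r{0, 0};
  for (const Node& n : uses) {
    if (!n.is_store) continue;  // never enters the work list
    worklist.push_back(&n);
  }
  r.peak_worklist = worklist.size();
  while (!worklist.empty()) {
    worklist.pop_back();
    ++r.stores_seen;
  }
  return r;
}
```

With 4000 non-store uses and a single store as input, both functions report one store, but the first one holds 4001 entries at its peak while the second holds one.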

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19791#discussion_r1647614136

