RFR: 8333334: C2: Make result of `Node::dominates` more precise to enhance scalar replacement

Tobias Hartmann thartmann at openjdk.org
Wed Jun 5 05:42:57 UTC 2024


On Fri, 31 May 2024 09:01:38 GMT, MaxXing <duke at openjdk.org> wrote:

> This patch changes the algorithm of `Node::dominates` to make the result more precise, and allows the iterators of `ConcurrentHashMap` to be scalar replaced.
> 
> The previous algorithm returns a conservative result when it encounters dead control flow, and only tries the first two input paths of a multi-input Region node, which may prevent scalar replacement in some cases.
> 
> For example, with G1 GC enabled, C2 generates GC barriers for `ConcurrentHashMap` iteration operations in some early phases and then eliminates them in a later IGVN, but `LoadNode` is also idealized in the same IGVN. This causes `LoadNode::Ideal` to see some dead barrier control flow and, because of the conservative result of `Node::dominates`, to refuse to split some instance field loads through Phi. As a result, scalar replacement cannot be applied to the iterators in the later macro elimination phase.
> 
> This patch allows `Node::dominates` to try other paths of the last multi-input Region node when the first path is dead, and makes `ConcurrentHashMap` iteration ~30% faster:
> 
> 
> Benchmark                            (nkeys)  Mode  Cnt        Score       Error  Units
> Maps.testConcurrentHashMapIterators    10000  avgt   15   414099.085 ± 33230.945  ns/op   # baseline
> Maps.testConcurrentHashMapIterators    10000  avgt   15   315490.281 ±  3037.056  ns/op   # patch
> 
> 
> Testing: tier1-4.
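
To make the quoted description concrete, here is a toy Java model of the idea. It is a sketch under stated assumptions, not HotSpot code and not the algorithm in the patch: `Ctrl`, `Answer`, `conservative` and `precise` are hypothetical names, a dead control input is modelled as `null`, and the graph is assumed to be acyclic. It only shows how a dominance query that gives up on a dead Region input differs from one that skips the dead input and checks the remaining live paths.

```java
import java.util.Arrays;
import java.util.List;

public class DominatesSketch {
    public enum Answer { YES, NO, UNKNOWN }

    /** Toy control node; a null entry in inputs stands for a dead control input. */
    public static final class Ctrl {
        final String name;
        final List<Ctrl> inputs;

        public Ctrl(String name, Ctrl... inputs) {
            this.name = name;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** Conservative variant: any dead input makes the answer UNKNOWN. */
    public static Answer conservative(Ctrl dom, Ctrl sub) {
        if (sub == dom) return Answer.YES;
        if (sub.inputs.isEmpty()) return Answer.NO; // reached the root without meeting dom
        Answer result = Answer.YES;
        for (Ctrl in : sub.inputs) {
            if (in == null) return Answer.UNKNOWN;  // dead path: give up
            Answer a = conservative(dom, in);
            if (a == Answer.UNKNOWN) return Answer.UNKNOWN;
            if (a == Answer.NO) result = Answer.NO;
        }
        return result;
    }

    /** More precise variant: dead inputs are skipped, every live path must pass through dom. */
    public static Answer precise(Ctrl dom, Ctrl sub) {
        if (sub == dom) return Answer.YES;
        if (sub.inputs.isEmpty()) return Answer.NO; // reached the root without meeting dom
        boolean sawLive = false;
        for (Ctrl in : sub.inputs) {
            if (in == null) continue;               // dead path: ignore it
            sawLive = true;
            Answer a = precise(dom, in);
            if (a != Answer.YES) return a;
        }
        return sawLive ? Answer.YES : Answer.UNKNOWN; // all inputs dead: nothing to conclude
    }

    public static void main(String[] args) {
        Ctrl root = new Ctrl("root");
        Ctrl check = new Ctrl("check", root);
        // Region whose first input (e.g. an eliminated barrier path) is dead.
        Ctrl region = new Ctrl("region", null, check);
        System.out.println(conservative(check, region)); // prints UNKNOWN
        System.out.println(precise(check, region));      // prints YES
    }
}
```

In this toy, the only difference between the two variants is how a dead input is treated; the real `Node::dominates` walks the C2 control graph and has additional cases that are not modelled here.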

Impressive results! I haven't looked at the change yet but here are a few questions / requests:
- Could you add a screenshot of the IR of the case you are describing?
- Wouldn't it help to add the LoadNode back to the IGVN worklist and wait for the dead path to be removed?
- Could you add an [IR framework](https://github.com/openjdk/jdk/blob/master/test/hotspot/jtreg/compiler/lib/ir_framework/README.md) test that verifies that the optimization works as expected?
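
For reference, the Java shape under discussion might look like the plain-JDK sketch below. It is not the `Maps.testConcurrentHashMapIterators` benchmark and not an IR framework test; the class and method names are hypothetical. The point is only that the iterator created by the for-each loop never escapes `sumKeys`, which is what makes it a candidate for scalar replacement. Whether C2 actually removes the allocation depends on the JIT and is not checked by this code.

```java
import java.util.concurrent.ConcurrentHashMap;

public class ChmIterationSketch {
    /** Builds a map with keys 0..nkeys-1 (nkeys mirrors the benchmark parameter name). */
    public static ConcurrentHashMap<Integer, Integer> fill(int nkeys) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        for (int i = 0; i < nkeys; i++) {
            map.put(i, i);
        }
        return map;
    }

    /** Sums all keys; the key iterator is local to this method and does not escape. */
    public static long sumKeys(ConcurrentHashMap<Integer, Integer> map) {
        long sum = 0;
        for (Integer key : map.keySet()) {
            sum += key;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumKeys(fill(10000)));
    }
}
```

An IR framework test would wrap a method like `sumKeys` and assert on the compiled IR instead of on the returned sum.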

Thanks,
Tobias

-------------

PR Review: https://git.openjdk.org/jdk/pull/19496#pullrequestreview-2098076379

