RFR: 8333334: C2: Make result of `Node::dominates` more precise to enhance scalar replacement [v7]

Christian Hagedorn chagedorn at openjdk.org
Mon Aug 19 10:31:52 UTC 2024


On Tue, 9 Jul 2024 03:10:55 GMT, Qizheng Xing <qxing at openjdk.org> wrote:

>> This patch changes the algorithm of `Node::dominates` to make the result more precise, and allows the iterators of `ConcurrentHashMap` to be scalar replaced.
>> 
>> The previous algorithm will return a conservative result when encountering a dead control flow, and only try the first two input paths of a multi-input Region node, which may prevent the scalar replacement in some cases.
>> 
>> For example, with G1 GC enabled, C2 generates GC barriers for `ConcurrentHashMap` iteration operations at some early phases, and then eliminates them in a later IGVN, but `LoadNode` is also idealized in the same IGVN. This causes `LoadNode::Ideal` to see some dead barrier control flows, and refuse to split some instance field loads through Phi due to the conservative result of `Node::dominates`, and thus the scalar replacement can not be applied to iterators in the later macro elimination phase.
>> 
>> This patch allows `Node::dominates` to try other paths of the last multi-input Region node when the first path is dead, and makes `ConcurrentHashMap` iteration ~30% faster:
>> 
>> 
>> Benchmark                            (nkeys)  Mode  Cnt        Score       Error  Units
>> Maps.testConcurrentHashMapIterators    10000  avgt   15   414099.085 ± 33230.945  ns/op   # baseline
>> Maps.testConcurrentHashMapIterators    10000  avgt   15   315490.281 ±  3037.056  ns/op   # patch
>> 
>> 
>> Testing: tier1-4.
>
> Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add copyright.

Generally, the fix idea looks good to me, too!

As Tobias has already mentioned, you should add some braces for the existing one-liner if-statements, just to be safe.

I have a few comment.

src/hotspot/share/opto/memnode.cpp line 431:

> 429: // control input of a memory operation predates (dominates)
> 430: // an allocation it wants to look past.
> 431: Node::DomResult MemNode::all_controls_dominate(Node* dom, Node* sub) {

Now you have many checks of the form:

all_controls_dominate(this, st_alloc) == DomResult::Dominate

But actually, there is only one place in IGVN where you care about the third dead code result. Maybe you can abstract that away and do the following:
- Rename this method to `maybe_all_control_dominate()`.
- Add a new method `all_control_dominate()` which checks the result for `DomResult::Dominate`:

bool MemNode::all_controls_dominate(Node* dom, Node* sub) {
  DomResult dom_result = maybe_all_controls_dominate(dom, sub);
  return dom_result == DomResult::Dominate 
}

- The calls in `LoadNode::split_through_phi()` use `maybe_all_controls_dominate()`.
- All other callers in existing code do not need to be updated since they call the new `all_controls_dominate()` method which mimics the old behavior without caring about dead code.

Might be cleaner but it's just a thought.

src/hotspot/share/opto/memnode.cpp line 1695:

> 1693:   if (dom_result != DomResult::Dominate) {
> 1694:     if (dom_result == DomResult::EncounteredDeadCode) {
> 1695:       // Wait for the dead code to be removed.

You could extend the comment here to mention that it is guaranteed that the dead code will eventually be removed in IGVN such that we have an unambiguous result whether it's dominated or not.

test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement/ScalarReplacementWithGCBarrierTests.java line 107:

> 105:     @IR(phase = { CompilePhase.AFTER_PARSING }, counts = { IRNode.ALLOC, "1" })
> 106:     @IR(phase = { CompilePhase.INCREMENTAL_BOXING_INLINE }, counts = { IRNode.ALLOC, "2" })
> 107:     @IR(applyIf = { "UseG1GC", "true" }, phase = { CompilePhase.ITER_GVN_AFTER_ELIMINATION }, counts = { IRNode.ALLOC, "1" })

I think the `applyIf` is redundant and can be removed because you are only running the framework test with

TestFramework.runWithFlags("-XX:+UseG1GC");

Suggestion:

    @IR(phase = { CompilePhase.ITER_GVN_AFTER_ELIMINATION }, counts = { IRNode.ALLOC, "1" })

test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement/ScalarReplacementWithGCBarrierTests.java line 110:

> 108:     private int testScalarReplacementWithGCBarrier(List list) {
> 109:         Iter iter = list.iter();
> 110:         for (;;) {

I think it's cleaner to use `while (true)` instead:
Suggestion:

        while (true) {

-------------

PR Review: https://git.openjdk.org/jdk/pull/19496#pullrequestreview-2245027794
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1721468222
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1721477723
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1721483868
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1721484716


More information about the hotspot-compiler-dev mailing list