RFR: 8333334: C2: Make result of `Node::dominates` more precise to enhance scalar replacement [v5]

Emanuel Peter epeter at openjdk.org
Fri Jul 5 08:32:25 UTC 2024


On Mon, 1 Jul 2024 08:12:13 GMT, Qizheng Xing <qxing at openjdk.org> wrote:

>> This patch changes the algorithm of `Node::dominates` to make the result more precise, and allows the iterators of `ConcurrentHashMap` to be scalar replaced.
>> 
>> The previous algorithm will return a conservative result when encountering a dead control flow, and only try the first two input paths of a multi-input Region node, which may prevent the scalar replacement in some cases.
>> 
>> For example, with G1 GC enabled, C2 generates GC barriers for `ConcurrentHashMap` iteration operations at some early phases, and then eliminates them in a later IGVN, but `LoadNode` is also idealized in the same IGVN. This causes `LoadNode::Ideal` to see some dead barrier control flows, and refuse to split some instance field loads through Phi due to the conservative result of `Node::dominates`, and thus the scalar replacement can not be applied to iterators in the later macro elimination phase.
>> 
>> This patch allows `Node::dominates` to try other paths of the last multi-input Region node when the first path is dead, and makes `ConcurrentHashMap` iteration ~30% faster:
>> 
>> 
>> Benchmark                            (nkeys)  Mode  Cnt        Score       Error  Units
>> Maps.testConcurrentHashMapIterators    10000  avgt   15   414099.085 ± 33230.945  ns/op   # baseline
>> Maps.testConcurrentHashMapIterators    10000  avgt   15   315490.281 ±  3037.056  ns/op   # patch
>> 
>> 
>> Testing: tier1-4.
>
> Qizheng Xing has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Add `@requires` for G1 GC.

Nice, looks like this is a worth-wile improvement!

I have some comments about style and software-engineering. Mainly:
- Generally return values passed via a argument is already a bit of a nasty solution - but sometimes it is the least nasty solution - so fine.
- But if you do pass a return value via argument, it should probably be a reference, not a pointer. But yours is not just a pointer, but an optional pointer.
- That leads you to the `dummy_flag` hack.
- How about this alternative: You create two `MemNode::all_controls_dominate` methods, one without the `bool& dead_code`, and one with. The one without simply creates a local variable and delegates to the implementation with the return-argument. Would that work?
- I would also give `dead_code` a more descriptive name. Currently it is not immediately clear what this flag is. For one, you should probably mention above `MemNode::all_controls_dominate` that it is a return-argument. Maybe call it `encountered_dead_code`.

Also: there are only 11 occurances of `all_controls_dominate`. Maybe you could have a `enum` return instead of a `bool`. It could have states `Dominate`, `Not_Dominate` and `Encountered_Dead_Code`. You would of course have to adjust all use-cases accordingly - but maybe we want to look at all use-cases anyway and make sure we handle the dead-code case correctly.

Side note:
`bool dummy_flag, &dead_code_flag = dead_code != nullptr ? *dead_code : dummy_flag;`

Lines like this make me really nervous. I'm not sure if the `dummy_flag` is initialized (maybe it doesn't matter, but still).

I'm not settled on any particular solution here, these are just my first reactions ;)

What do you think?

src/hotspot/share/opto/memnode.cpp line 430:

> 428: // control input of a memory operation predates (dominates)
> 429: // an allocation it wants to look past.
> 430: bool MemNode::all_controls_dominate(Node* dom, Node* sub, bool *dead_code) {

Suggestion:

bool MemNode::all_controls_dominate(Node* dom, Node* sub, bool* dead_code) {

style

src/hotspot/share/opto/memnode.cpp line 431:

> 429: // an allocation it wants to look past.
> 430: bool MemNode::all_controls_dominate(Node* dom, Node* sub, bool *dead_code) {
> 431:   bool dummy_flag, &dead_code_flag = dead_code != nullptr ? *dead_code : dummy_flag;

This line is hard to read. And I'm not sure that `dummy_flag` is initialized. If not, it could most of the time happen to be `false`, but also randomly `true` sometimes. I would put the two variable definitions on separate lines.

src/hotspot/share/opto/memnode.cpp line 510:

> 508:           only_dominating_controls = true;
> 509:         } else {
> 510:           if (dead_code) dead_code_flag = true;

A few comments about this line:
- You need curly braces for the body
- We do not do implicit null-checks: you should always do `dead_code != null`. But why are you even checking this here? In all other cases you would just set the `dead_code_flag`, and if `dead_code` then this would go to the `dummy_flag`, right?

src/hotspot/share/opto/memnode.cpp line 1677:

> 1675:   // Select Region to split through.
> 1676:   Node* region;
> 1677:   bool dead_code, can_not_split = false;

Suggestion:

  bool dead_code     = false;
  bool can_not_split = false;

With your version, would actually both variables be set to `false`, or could one be uninitialized? I just don't know enough about C++ here - but this makes me nervous ;)

src/hotspot/share/opto/memnode.hpp line 109:

> 107:   static Node *optimize_memory_chain(Node *mchain, const TypePtr *t_adr, Node *load, PhaseGVN *phase);
> 108:   // This one should probably be a phase-specific function:
> 109:   static bool all_controls_dominate(Node* dom, Node* sub, bool *dead_code = nullptr);

Suggestion:

  static bool all_controls_dominate(Node* dom, Node* sub, bool* dead_code = nullptr);

We put the `*` next to the type. Putting it with the variable is old style - so you will see it but we try to change it whenever we touch a line.

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/19496#pullrequestreview-2160184419
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1666487145
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1666486821
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1666493397
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1666482358
PR Review Comment: https://git.openjdk.org/jdk/pull/19496#discussion_r1666478298


More information about the hotspot-compiler-dev mailing list