RFR: 8347555: [REDO] C2: implement optimization for series of Add of unique value [v7]

Kangcheng Xu kxu at openjdk.org
Mon Mar 17 19:18:25 UTC 2025


On Thu, 13 Mar 2025 08:14:04 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> Kangcheng Xu has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   add micro benchmark
>
> This looks really interesting!
> 
> I see that you are doing some special pattern matching. I wonder if it might be worth generalizing the algorithm, to search through an arbitrary "tree" of additions, collect all "leaves" of (`variable * multiplier`), sort by `variable`, and compute new additions for each `variable`. What do you think?

> @eme64  I wonder if it might be worth generalizing the algorithm, to search through an arbitrary "tree" of additions, collect all "leaves" of (`variable * multiplier`) [...]

This was actually my first approach, a top-down approach starts from "root" and collect all leaves to convert to multiplications in one go. It didn't work mainly due to two problems:

1. Not being a technical "tree" introduces additional complexity with repeated nodes (and/or sub-"trees")
    
It's very likely that a subtree contains repeated nodes (e.g., `Add(x, x)`). This is especially problematic if `x` itself is a complex subgraph. Deduplication (probably with memoization) introduces additional complexity. Luckily, even if it's not a tree, it's still strictly a DAG. A bottom-up approach converting leaves incrementally would avoid the issue of duplicated computation.

2. A very large tree results in resource exhaustion on one (i)GVN pass

It's common for some simple loop to be unrolled into a very large subgraph, in which case the transforming the entire graph in one go is blocking and takes a long time to run. By dividing work into individual passes, we allow the optimization to progress and reduce memory footprint.

> src/hotspot/share/opto/addnode.cpp line 447:
> 
>> 445: 
>> 446:   return nullptr;
>> 447: }
> 
> I'm not a great fan of "output arguments" such as the `multiplier` here.
> Why not create a class/struct `Multiplication`, which has a field `valid` (instead of returning `nullptr`). And fields `variable` and `multiplier`. The fields can all be constant.
> You could even have an `add` method, that adds two such `Multiplication`s together.

Good suggestion. I didn't make fields `const` as implicitly deleting the constructor makes pattern matching more complex.

> src/hotspot/share/opto/addnode.cpp line 500:
> 
>> 498: // Note that one of the term of the addition could simply be `a` (i.e., a << 0). Calling this function with `multiplier`
>> 499: // being null is safe.
>> 500: Node* AddNode::find_power_of_two_addition_pattern(Node* n, BasicType bt, jlong* multiplier) {
> 
> This code here looks quite complicated. Why not parse both sides of the add with a `find_simple_lshift_pattern`, and then check that they use the same variable?

Good point. Updated to use `find_simple_lshift_pattern`

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23506#issuecomment-2730570456
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1999453274
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1999454340


More information about the hotspot-compiler-dev mailing list