RFR: 8347555: [REDO] C2: implement optimization for series of Add of unique value [v7]
Emanuel Peter
epeter at openjdk.org
Thu Mar 13 08:17:01 UTC 2025
On Tue, 11 Mar 2025 19:49:41 GMT, Kangcheng Xu <kxu at openjdk.org> wrote:
>> [JDK-8347555](https://bugs.openjdk.org/browse/JDK-8347555) is a redo of [JDK-8325495](https://bugs.openjdk.org/browse/JDK-8325495) was [first merged](https://git.openjdk.org/jdk/pull/20754) then backed out due to a regression. This patch redos the feature and fixes the bit shift overflow problem. For more information please refer to the previous PR.
>>
>> When constanlizing multiplications (possibly in forms on `lshifts`), the multiplier is upgraded to long and then later narrowed to int if needed. However, when a `lshift` operand is exactly `32`, overflowing an int, using long has an unexpected result. (i.e., `(1 << 32) = 1` and `(int) (1L << 32) = 0`)
>>
>> The following was implemented to address this issue.
>>
>> if (UseNewCode2) {
>> *multiplier = bt == T_INT
>> ? (jlong) (1 << con->get_int()) // loss of precision is expected for int as it overflows
>> : ((jlong) 1) << con->get_int();
>> } else {
>> *multiplier = ((jlong) 1 << con->get_int());
>> }
>>
>>
>> Two new bitshift overflow tests were added.
>
> Kangcheng Xu has updated the pull request incrementally with one additional commit since the last revision:
>
> add micro benchmark
This looks really interesting!
I see that you are doing some special pattern matching. I wonder if it might be worth generalizing the algorithm, to search through an arbitrary "tree" of additions, collect all "leaves" of (`variable * multiplier`), sort by `variable`, and compute new additions for each `variable`. What do you think?
src/hotspot/share/opto/addnode.cpp line 407:
> 405: }
> 406:
> 407: // Try to convert a serial of additions into a single multiplication. Also convert `(a * CON) + a` to `(CON + 1) * a` as
What about `(a * CON1) + (a * CON2)`? Like `11 * a + 5 * a`. Do we also optimize that?
src/hotspot/share/opto/addnode.cpp line 413:
> 411: // power-of-2 addition (e.g., 3 * a => (a << 2) + a). Without this check, GVN would keep trying to optimize the same
> 412: // node and can't progress. For example, 3 * a => (a << 2) + a => 3 * a => (a << 2) + a => ...
> 413: if (find_power_of_two_addition_pattern(this, bt, nullptr) != nullptr) {
Where does the optimization `3 * a => (a << 2) + a` happen? Do we use `find_power_of_two_addition_pattern` there too? If not: how do we prevent the code of the two locations from diverging in the future?
src/hotspot/share/opto/addnode.cpp line 427:
> 425: || find_simple_multiplication_pattern(in1, bt, &multiplier) == in2
> 426: || find_power_of_two_addition_pattern(in1, bt, &multiplier) == in2) {
> 427: multiplier++; // +1 for the in2 term
Nit: I think we generally have the `||` at the end of the line, not the beginning.
src/hotspot/share/opto/addnode.cpp line 447:
> 445:
> 446: return nullptr;
> 447: }
I'm not a great fan of "output arguments" such as the `multiplier` here.
Why not create a class/struct `Multiplication`, which has a field `valid` (instead of returning `nullptr`). And fields `variable` and `multiplier`. The fields can all be constant.
You could even have an `add` method, that adds two such `Multiplication`s together.
src/hotspot/share/opto/addnode.cpp line 480:
> 478: if (!con->is_Con()) {
> 479: swap(con, base);
> 480: }
Is that necessary? Does `Mul` not automatically get canonicalized so that the constant is on the rhs?
src/hotspot/share/opto/addnode.cpp line 500:
> 498: // Note that one of the term of the addition could simply be `a` (i.e., a << 0). Calling this function with `multiplier`
> 499: // being null is safe.
> 500: Node* AddNode::find_power_of_two_addition_pattern(Node* n, BasicType bt, jlong* multiplier) {
This code here looks quite complicated. Why not parse both sides of the add with a `find_simple_lshift_pattern`, and then check that they use the same variable?
test/hotspot/jtreg/compiler/c2/TestSerialAdditions.java line 43:
> 41: */
> 42: public class TestSerialAdditions {
> 43: private static final Random RNG = Utils.getRandomInstance();
Could you use `Generators.java` instead? It will produce more "interesting" random values, such as powers of 2, and values close to powers of 2.
-------------
PR Review: https://git.openjdk.org/jdk/pull/23506#pullrequestreview-2680840690
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992974452
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992971118
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992953844
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992966421
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992978164
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992980967
PR Review Comment: https://git.openjdk.org/jdk/pull/23506#discussion_r1992984685
More information about the hotspot-compiler-dev
mailing list