RFR: 8356813: Improve Mod(I|L)Node::Value [v2]

Wed May 28 09:55:57 UTC 2025

On Sun, 25 May 2025 13:17:49 GMT, Hannes Greule <hgreule at openjdk.org> wrote:

>> This change improves the precision of the `Mod(I|L)Node::Value()` functions.
>> 
>> I reordered the structure a bit. First, we handle constants, afterwards, we handle ranges. The bottom checks seem to be excessive (`Type::BOTTOM` is covered by using `isa_(int|long)()`, the local bottom is just the full range). Given we can even give reasonable bounds if only one input has any bounds, we don't want to return early.
>> The changes after that are commented. Please let me know if the explanations are good, or if you have any suggestions.
>> 
>> ### Monotonicity
>> 
>> Before, a 0 divisor resulted in `Type(Int|Long)::POS`. Initially I wanted to keep it this way, but that violates monotonicity during PhaseCCP. As an example, if we see a 0 divisor first and a 3 afterwards, we might try to go from `>=0` to `-2..2`, but the meet of these would be `>=-2` rather than `-2..2`. Using `Type(Int|Long)::ZERO` instead (zero is always in the resulting value if we cover a range).
>> 
>> ### Testing
>> 
>> I added tests for cases around the relevant bounds. I also ran tier1, tier2, and tier3 but didn't see any related failures after addressing the monotonicity problem described above (I'm having a few unrelated failures on my system currently, so separate testing would be appreciated in case I missed something).
>> 
>> Please review and let me know what you think.
>> 
>> ### Other
>> 
>> The `UMod(I|L)Node`s were adjusted to be more in line with its signed variants. This change diverges them again, but similar improvements could be made after #17508.
>> 
>> During experimenting with these changes, I stumbled upon a few things that aren't directly related to this change, but might be worth to further look into:
>> - If the divisor is a constant, we will directly replace the `Mod(I|L)Node` with more but less expensive nodes in `::Ideal()`. Type analysis for these nodes combined is less precise, means we miss potential cases were this would help e.g., removing range checks. Would it make sense to delay the replacement?
>> - To force non-negative ranges, I'm using `char`. I noticed that method parameters of sub-int integer types all fall back to `TypeInt::INT`. This seems to be an intentional change of https://github.com/openjdk/jdk/commit/200784d505dd98444c48c9ccb7f2e4df36dcbb6a. The bug report is private, so I can't really judge if that part is necessary, but it seems odd.
>
> Hannes Greule has updated the pull request incrementally with three additional commits since the last revision:
> 
>  - Update ModL comment
>  - Use TOP instead of ZERO
>  - Apply suggested test changes

@SirYwell Thanks for looking into this, that looks promising!

I have two bigger comments:
- Could we unify the L and I code, either using C++ templating or `BasicType`? It would reduce code duplication.
- Can we have some tests where the input ranges are random as well, and where we check the output ranges with some comparisons?

------------------
Copied from the code comment:

> Nice work with the examples you already have, and randomizing some of it!
> 
> I would like to see one more generalized test.
> - compute `res = lhs % rhs`
> - Truncate both `lhs` and `rhs` with randomly produced bounds from Generators, like this: `lhs = Math.max(lo, Math.min(hi, lhs))`.
> - Below, add all sorts of comparisons with random constants, like this: `if (res < CON) { sum += 1; }`. If the output range is wrong, this could wrongly constant fold, and allow us to catch that.
> 
> Then fuzz the generated method a few times with random inputs for `lhs` and `rhs`, and check that the `sum` and `res` value are the same for compiled and interpreted code.
> 
> I hope that makes sense :)
> This is currently my best method to check if ranges are correct, and I think it is quite important because often tests are only written with constants in mind, but less so with ranges, and then we mess up the ranges because it is just too tricky.
> 
> This is an example, where I asked someone to try this out as well:
> https://github.com/openjdk/jdk/pull/23089/files#diff-12bebea175a260a6ab62c22a3681ccae0c3d9027900d2fdbd8c5e856ae7d1123R404-R422

src/hotspot/share/opto/divnode.cpp line 1222:

> 1220: 
> 1221:   const TypeInt *i1 = t1->isa_int();
> 1222:   const TypeInt *i2 = t2->isa_int();

Suggestion:

  const TypeInt* i1 = t1->isa_int();
  const TypeInt* i2 = t2->isa_int();

In new code style, we put the asterisk on the left. We fix it whenever we touch old code.

src/hotspot/share/opto/divnode.cpp line 1515:

> 1513: 
> 1514:   const TypeLong *i1 = t1->isa_long();
> 1515:   const TypeLong *i2 = t2->isa_long();

The code below is basically a duplication. Could you unify the code, either using c++ templates or `BasicType`?

test/hotspot/jtreg/compiler/c2/gvn/ModINodeValueTests.java line 92:

> 90:     @IR(failOn = {IRNode.MOD_I, IRNode.CMP_I})
> 91:     // The sign of the result of % is the same as the sign of the dividend,
> 92:     // i.e., posVal % x < 0 => false.

Suggestion:

    // i.e., POS_INT % x < 0 => false.

Same everywhere below.

test/hotspot/jtreg/compiler/c2/gvn/ModINodeValueTests.java line 201:

> 199:         // in bounds, cannot optimize
> 200:         return ((byte) x) % (((char) y) + 1) <= -128;
> 201:     }

Nice work with the examples you already have, and randomizing some of it!

I would like to see one more generalized test.
- compute `res = lhs % rhs`
- Truncate both `lhs` and `rhs` with randomly produced bounds from Generators, like this: `lhs = Math.max(lo, Math.min(hi, lhs))`.
- Below, add all sorts of comparisons with random constants, like this: `if (res < CON) { sum += 1; }`. If the output range is wrong, this could wrongly constant fold, and allow us to catch that.

Then fuzz the generated method a few times with random inputs for `lhs` and `rhs`, and check that the `sum` and `res` value are the same for compiled and interpreted code.

I hope that makes sense :)
This is currently my best method to check if ranges are correct, and I think it is quite important because often tests are only written with constants in mind, but less so with ranges, and then we mess up the ranges because it is just too tricky.

This is an example, where I asked someone to try this out as well:
https://github.com/openjdk/jdk/pull/23089/files#diff-12bebea175a260a6ab62c22a3681ccae0c3d9027900d2fdbd8c5e856ae7d1123R404-R422

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/25254#pullrequestreview-2874299006
PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2111393407
PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2111407282
PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2111413138
PR Review Comment: https://git.openjdk.org/jdk/pull/25254#discussion_r2111426388