RFR: 8286638: C2: CmpU needs to do more precise over/underflow analysis

Thu May 12 12:42:31 UTC 2022

`CmpUNode::Value` already does an under/overflow analysis, in case we have an `AddI` or `SubI` above it.
Instead of assuming the types are now the full `#int` range, we separately analyze the normal and the over/underflowed range.

We get the two ranges `tr1` and `tr2`, which we now both compare with the right input `t2`, via `sub`.
If both `cmp1` and `cmp2` are equal, for example both are `[ge]`, and below we have a Bool node that checks for `[lt]`, we know that this can never be true.

However, I now encountered a case where `cmp1` was `[gt]` and `cmp2` was `[ge]`. Unfortunately, they are not the same, so we just discarded our analysis, and since it is an overflow case just cannot say anything. But we could actually know that both will never be `[lt]`.

Thus, **I propose** to take the `meet` (the union of all possible results of `cmp1` and `cmp2`. In this example, the meet would be `[ge]`.

**Why is this important?**
I got a bug, where a ConvI2L node was able to determine the range was impossible, ripping out the data-flow. But the range-check did not manage to do the same analysis, because of an underflow. This leads to some mangled code further down.

**Detailed analysis of that case:**

type `i: [minint...0]`
access to `c[i-1]`

**Range-check:**
`int index = AddI(i, -1)`
-> type index: [minint-1 ... -1] -> underflow
We detect that this AddI may have 2 ranges:
`tr1: int:<=-1`
`tr2: int:max `(underflow: minint-1)

We then check how these ranges compare to in2:
`t2: int:>=0`

For this we compute:
`const Type* cmp1 = sub(tr1, t2);` -> TypeInt::CC_GT = [1]
`const Type* cmp2 = sub(tr2, t2);` -> TypeInt::CC_GE = [0...1]

But then, we only do something with this result if `cmp1 == cmp2`.
We never detect that the `Bool [lt] `could never be true.

**Data-flow:**
`long index = ConvI2L( AddI(i, -1) )`
-> type of` ConvI2L: [0...maxint-1]`
-> why do we know this? Because this is before an array access. We assume range-check guarantees index in range `[0...c.size()-1]`, and `c.size()<=maxint`.
Then there is a push_thru_add, and we get:
`long index = AddL( ConvI2L(i), -1)`
-> type of new `ConvI2L: [1...maxint-1]` - because we correct the lo by 1 for the add. Somehow we do not adjust hi, in my opinion it should now be maxint, to correct by 1.
Consequence: if hi is maxint or maxint-1, there is no overflow.
Then, we statically detect that:
type `i: [minint...0]`
type` ConvI2L: [1...maxint-1]`
-> filter results in `TOP` -> data-flow is eliminated sucessfully.

Added **regression test** that matches this example above.
Running larger test suite now...

-------------

Commit messages:
 - fix whitespace
 - 8286638: C2: CmpU needs to do more precise over/underflow analysis

Changes: https://git.openjdk.java.net/jdk/pull/8679/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=8679&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8286638
  Stats: 67 lines in 2 files changed: 62 ins; 1 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/8679.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/8679/head:pull/8679

PR: https://git.openjdk.java.net/jdk/pull/8679