RFR: 8315066: Add unsigned bounds and known bits to TypeInt/Long [v47]

Emanuel Peter epeter at openjdk.org
Tue Apr 22 16:15:17 UTC 2025


On Mon, 21 Apr 2025 12:05:27 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Hi,
>> 
>> This patch adds unsigned bounds and known bits constraints to TypeInt and TypeLong. This opens more transformation opportunities in an elegant manner as well as helps avoid some ad-hoc rules in Hotspot.
>> 
>> In general, a `TypeInt/Long` represents a set of values `x` that satisfies: `x s>= lo && x s<= hi && x u>= ulo && x u<= uhi && (x & zeros) == 0 && (x & ones) == ones`. These constraints are not independent, e.g. an int that lies in [0, 3] in signed domain must also lie in [0, 3] in unsigned domain and have all bits but the last 2 being unset. As a result, we must canonicalize the constraints (tighten the constraints so that they are optimal) before constructing a `TypeInt/Long` instance.
>> 
>> This is extracted from #15440 , node value transformations are left for later PRs. I have also added unit tests to verify the soundness of constraint normalization.
>> 
>> Please kindly review, thanks a lot.
>> 
>> Testing
>> 
>> - [x] GHA
>> - [x] Linux x64, tier 1-4
>
> Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 61 commits:
> 
>  - Merge branch 'master' into unsignedbounds
>  - Merge branch 'master' into unsignedbounds
>  - reviews
>  - Merge branch 'master' into unsignedbounds
>  - refine comments
>  - Merge branch 'master' into unsignedbounds
>  - Merge branch 'master' into unsignedbounds
>  - harden SimpleCanonicalResult
>  - number lemmas
>  - include
>  - ... and 51 more: https://git.openjdk.org/jdk/compare/0995b940...cdab1911

Another small batch for the else case.

src/hotspot/share/opto/rangeinference.cpp line 277:

> 275:     assert(lo < new_lo, "this case cannot overflow");
> 276:     return new_lo;
> 277:   } else {

Suggestion:

  } else {
    assert(zero_violation > one_violation, "remaining case");

Could be a help to the reader, and a defense against bad changes.

src/hotspot/share/opto/rangeinference.cpp line 280:

> 278:     // This means that the first bit that does not satisfy the bit requirement
> 279:     // is a 1 that should be a 0. Trace backward to find i which is the last
> 280:     // bit that is 0 in both lo and zeros.

Does this `i` have a name below? What value does it have in the example?

src/hotspot/share/opto/rangeinference.cpp line 293:

> 291:     // different bit between the result and lo must be the 3rd bit. As a result,
> 292:     // the result must not be smaller than:
> 293:     //           1 0 1 0 0 0 0 0

Oh, I'm starting to get an intuition here. I think we can make it a bit more "intuitive".

We start at the "first violation". We would like to flip the bit from 1 to 0, but since we are only allowed to increase the number, we also need to flip the 5th bit. But that is already 1 too, so we need to go to the 4th. That one we cannot flip from 0 to 1, because it is forced to be a 0 by zeros. So the first bit we can flip is the 3rd.

Maybe this formulation with "which is the highest bit we can flip" could be helpful for the intuition?

src/hotspot/share/opto/rangeinference.cpp line 299:

> 297: 
> 298:     juint first_violation = count_leading_zeros(zero_violation);
> 299:     // This mask out all bits from the first violation

Suggestion:

    // This mask out all bits after the first violation.

src/hotspot/share/opto/rangeinference.cpp line 307:

> 305:     // violation, which is the last set bit of tmp
> 306:     //           0 1 1 0 0 0 0 0
> 307:     U tmp = ~either & find_mask;

Did I understand that right: `tmp` is the bits that we cannot flip? Or is it the ones we can flip? A better name would be appreciated :)
Same for `either`. worst case you call it `lo_or_zeros`... but that's not great either.

I'm not fully seeing through the logic here yet, so I struggle to make good suggestions.

src/hotspot/share/opto/rangeinference.cpp line 309:

> 307:     U tmp = ~either & find_mask;
> 308:     // i == 2 here, shortcut the calculation instead of explicitly spelling out
> 309:     // i

Suggestion:

    // i == 2 here, shortcut the calculation instead of explicitly spelling out i.

The single `i` looks a little funny :)

-------------

PR Review: https://git.openjdk.org/jdk/pull/17508#pullrequestreview-2784584965
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2054401952
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2054406394
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2054418681
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2054422027
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2054429286
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2054430521


More information about the hotspot-compiler-dev mailing list