RFR: 8315066: Add unsigned bounds and known bits to TypeInt/Long [v53]

Emanuel Peter epeter at openjdk.org
Thu May 1 08:42:01 UTC 2025


On Thu, 1 May 2025 07:00:46 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Hi,
>> 
>> This patch adds unsigned bounds and known bits constraints to TypeInt and TypeLong. This opens more transformation opportunities in an elegant manner as well as helps avoid some ad-hoc rules in Hotspot.
>> 
>> In general, a `TypeInt/Long` represents a set of values `x` that satisfies: `x s>= lo && x s<= hi && x u>= ulo && x u<= uhi && (x & zeros) == 0 && (x & ones) == ones`. These constraints are not independent, e.g. an int that lies in [0, 3] in signed domain must also lie in [0, 3] in unsigned domain and have all bits but the last 2 being unset. As a result, we must canonicalize the constraints (tighten the constraints so that they are optimal) before constructing a `TypeInt/Long` instance.
>> 
>> This is extracted from #15440 , node value transformations are left for later PRs. I have also added unit tests to verify the soundness of constraint normalization.
>> 
>> Please kindly review, thanks a lot.
>> 
>> Testing
>> 
>> - [x] GHA
>> - [x] Linux x64, tier 1-4
>
> Quan Anh Mai has updated the pull request incrementally with two additional commits since the last revision:
> 
>  - wording
>  - grammar, more details for non-existence

A few more comments around canonicalization.

src/hotspot/share/opto/rangeinference.cpp line 39:

> 37: public:
> 38:   bool _progress; // whether there is progress compared to the last iteration
> 39:   bool _present;  // whether the result is empty, typically due to the calculation arriving at contradiction

Yeah, why not just call it `_is_non_empty`? I guess that is a matter of taste.

src/hotspot/share/opto/rangeinference.cpp line 443:

> 441: template <class U>
> 442: static AdjustResult<KnownBits<U>>
> 443: adjust_bits_from_bounds(const KnownBits<U>& bits, const RangeInt<U>& bounds) {

Ok, so this only deals with the unsigned bounds. Is there an analogue for the signed bits?
Ah, you could make the name more precise.
Suggestion:

template <class U>
static AdjustResult<KnownBits<U>>
adjust_bits_from_unsigned_bounds(const KnownBits<U>& bits, const RangeInt<U>& bounds) {

Hmm, maybe you only deal with simple intervals, and there the signed and unsigned bounds end up giving you the equivalent info... is  this correct?

src/hotspot/share/opto/rangeinference.cpp line 447:

> 445:   // and bounds._hi should share this common prefix in terms of bits
> 446:   U mismatch = bounds._lo ^ bounds._hi;
> 447:   // Find the first mismatch, all bits before it is the same in bounds._lo and

Suggestion:

  // Find the first mismatch, all bits before it are the same in bounds._lo and

src/hotspot/share/opto/rangeinference.cpp line 454:

> 452:   // it
> 453:   U new_zeros = bits._zeros | (match_mask & ~bounds._lo);
> 454:   U new_ones = bits._ones | (match_mask & bounds._lo);

Suggestion:

  U common_prefix = match_mask & bounds._lo;
  assert(common_prefix == match_mask & bounds._lo, "common prefix of both");
  U neg_common_prefix = match_mask & ~bounds._lo;
  assert(neg_common_prefix == match_mask & ~bounds._lo, "common negated prefix of both");
  U new_zeros = bits._zeros | neg_common_prefix;
  U new_ones = bits._ones | common_prefix;

Just an idea. Up to you.

src/hotspot/share/opto/rangeinference.cpp line 456:

> 454:   U new_ones = bits._ones | (match_mask & bounds._lo);
> 455:   bool progress = (new_zeros != bits._zeros) || (new_ones != bits._ones);
> 456:   bool present = ((new_zeros & new_ones) == U(0));

Hmm. It sounds like `present` could also be named `is_non_empty`?

src/hotspot/share/opto/rangeinference.cpp line 466:

> 464: // not be larger than 64.
> 465: // This function is called simple because it deals with a simple intervals (see
> 466: // TypeInt at type.hpp).

Could we somehow assert that the input bounds are indeed a simple interval?

src/hotspot/share/opto/rangeinference.cpp line 509:

> 507:   // Trivially canonicalize the bounds so that srange._lo and urange._hi are
> 508:   // both < 0 or >= 0. The same for srange._hi and urange._ulo. See TypeInt for
> 509:   // detailed explanation.

You seem to suggest that `urange._hi` can be `< 0`. But it is unsigned, so that is confusing.
This looks like Lemma 3 from TypeInt is relevant here, right? You might want to state that here,
and still also bring this ASCII art up again here:

 *     Signed:
 *     -----lo=========uhi---------0--------ulo==========hi-----
 *     Unsigned:
 *                                 0--------ulo==========hi----------lo=========uhi---------

Also, does `S(urange._lo) > S(urange._hi)` imply `U(srange._lo) > U(srange._hi)`?
- If yes, assert it!
- If no: are we missing an optimization?

src/hotspot/share/opto/rangeinference.cpp line 519:

> 517:       // This means that there should be no element in the interval
> 518:       // [S(urange._lo), max_S], tighten urange._lo to min_S
> 519:       urange._lo = U(std::numeric_limits<S>::min());

You could also add `min_S` and `max_S` in the ASCII art :)

-------------

PR Review: https://git.openjdk.org/jdk/pull/17508#pullrequestreview-2809309491
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069971106
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069953917
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069957235
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069968393
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069970304
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069975655
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069985155
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r2069989449


More information about the hotspot-compiler-dev mailing list