RFR: 8315066: Add unsigned bounds and known bits to TypeInt/Long [v7]

Emanuel Peter epeter at openjdk.org
Tue Sep 3 14:26:31 UTC 2024


On Wed, 14 Aug 2024 16:30:26 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> Hi,
>> 
>> This patch adds unsigned bounds and known bits constraints to TypeInt and TypeLong. This opens more transformation opportunities in an elegant manner as well as helps avoid some ad-hoc rules in Hotspot.
>> 
>> In general, a TypeInt/Long represents a set of values x that satisfies: x s>= lo && x s<= hi && x u>= ulo && x u<= uhi && (x & zeros) == 0 && (~x & ones) == 0. These constraints are not independent, e.g. an int that lies in [0, 3] in signed domain must also lie in [0, 3] in unsigned domain and have all bits but the last 2 being unset. As a result, we must normalize the constraints (tighten the constraints so that they are optimal) before constructing a TypeInt/Long instance.
>> 
>> This is extracted from #15440 , node value transformations are left for later PRs. I have also added unit tests to verify the soundness of constraint normalization.
>> 
>> Please kindly review, thanks a lot.
>> 
>> Testing
>> 
>> - [x] GHA
>> - [x] Linux x64, tier 1-4
>
> Quan Anh Mai has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits:
> 
>  - fix compile errors
>  - Merge branch 'master' into unsignedbounds
>  - add comments
>  - Merge branch 'master' into unsignedbounds
>  - fix release build
>  - add comments, group arguments to reduce C-style reference passing arguments
>  - fix tests, add verify
>  - add unit tests
>  - fix template parameter
>  - refactor
>  - ... and 1 more: https://git.openjdk.org/jdk/compare/d8e4d3f2...d5ad9f1a

This is great work!
I'm struggling with the bit-manipulation, a lot of it looks a little magical to me 🙈 . Some more comments or examples in the code would help me review.

This is just a first pass, I'd have to go over this in more detail another time.

src/hotspot/share/opto/rangeinference.cpp line 54:

> 52: static AdjustResult<RangeInt<T>>
> 53: adjust_bounds_from_bits(const RangeInt<T>& bounds, const KnownBits<T>& bits) {
> 54:   static_assert(std::is_unsigned<T>::value, "");

Suggestion:

  static_assert(std::is_unsigned<T>::value, "only unsingned juint and julong allowed");

src/hotspot/share/opto/rangeinference.cpp line 56:

> 54:   static_assert(std::is_unsigned<T>::value, "");
> 55: 
> 56:   auto adjust_lo = [](T lo, const KnownBits<T>& bits) {

This does not even capture anything. Why not make it its own dedicated method with a nice name? I guess this could also be a member method of `KnownBits`.

src/hotspot/share/opto/rangeinference.cpp line 62:

> 60:     if (zero_violation == one_violation) {
> 61:       return lo;
> 62:     }

I need more explanation here.

src/hotspot/share/opto/rangeinference.cpp line 94:

> 92:     return {true, false, {}};
> 93:   }
> 94:   T new_hi = ~adjust_lo(~bounds._hi, {bits._ones, bits._zeros});

Wow, that looks like some magic 😆 Can you please explain this?

src/hotspot/share/opto/rangeinference.cpp line 109:

> 107: static AdjustResult<KnownBits<T>>
> 108: adjust_bits_from_bounds(const KnownBits<T>& bits, const RangeInt<T>& bounds) {
> 109:   static_assert(std::is_unsigned<T>::value, "");

Again: could be a member function of `KnownBits`, where unsigned could be verified already...

src/hotspot/share/opto/rangeinference.cpp line 113:

> 111:   T match_mask = mismatch == 0 ? std::numeric_limits<T>::max()
> 112:                                : ~(std::numeric_limits<T>::max() >> count_leading_zeros(mismatch));
> 113:   T new_zeros = bits._zeros | (match_mask &~ bounds._lo);

Suggestion:

  T new_zeros = bits._zeros | (match_mask & ~bounds._lo);

I think this was a typo?

src/hotspot/share/opto/rangeinference.cpp line 143:

> 141: }
> 142: 
> 143: // Tighten all constraints of a TypeIntPrototype to its canonical form.

Oh I like this definition of "canonical form". I think you should rename `normalize_constraints` -> `canonicalize`. Otherwise, you have two words, and you would need a definition for both "normalized" and "canonical".

src/hotspot/share/opto/rangeinference.cpp line 152:

> 150: TypeIntPrototype<T, U>::normalize_constraints() const {
> 151:   static_assert(std::is_signed<T>::value, "");
> 152:   static_assert(std::is_unsigned<U>::value, "");

I wonder if you should then generally use `T` for signed and unsigned, `S` for signed, and `U` for unsigned?

src/hotspot/share/opto/rangeinference.cpp line 153:

> 151:   static_assert(std::is_signed<T>::value, "");
> 152:   static_assert(std::is_unsigned<U>::value, "");
> 153:   static_assert(sizeof(T) == sizeof(U), "");

Honestly, should this check not be done by `TypeIntPrototype` already? Or are you putting it here to make the pre-conditions explicit?

src/hotspot/share/opto/rangeinference.cpp line 160:

> 158:       urange._lo > urange._hi ||
> 159:       (_bits._zeros & _bits._ones) != 0) {
> 160:     return {false, {}};

We are returning a `Pair` here. I don't know what the `bool` stands for. Can you add some description at the top of the method, or make an explicit type instead of a `Pair` that makes it intuitive what we are returning here?

src/hotspot/share/opto/rangeinference.cpp line 164:

> 162: 
> 163:   if (T(urange._lo) > T(urange._hi)) {
> 164:     if (T(urange._hi) < srange._lo) {

These would be much more intuitive to read if `T -> S`!

src/hotspot/share/opto/rangeinference.cpp line 434:

> 432: #ifndef PRODUCT
> 433: template <class T>
> 434: static const char* intnamenear(T origin, const char* xname, char* buf, size_t buf_size, T n) {

Suggestion:

static const char* int_name_near(T origin, const char* xname, char* buf, size_t buf_size, T n) {

simpler to read

src/hotspot/share/opto/rangeinference.cpp line 444:

> 442:       return nullptr;
> 443:     }
> 444:     os::snprintf_checked(buf, buf_size, "%s+" INT32_FORMAT, xname, jint(n - origin));

Was there an explicit choice to use buffers, instead of `outputStream`?

src/hotspot/share/opto/rangeinference.hpp line 44:

> 42:   T _zeros;
> 43:   T _ones;
> 44: };

Are the `KnownBits` always unsigned? Why not name the type `U` and verify that it is unsigned?

You could also have member-functions on it then, like the bounds adjustment. And then you can either have the type free (i.e. allow signed type), or require the same `U` type on the input ranges or values.

src/hotspot/share/opto/type.cpp line 488:

> 486:   TypeInt::UBYTE   = TypeInt::make(0, 255,       WidenMin)->is_int(); // Unsigned Bytes
> 487:   TypeInt::CHAR    = TypeInt::make(0,65535,      WidenMin)->is_int(); // Java chars
> 488:   TypeInt::SHORT   = TypeInt::make(-32768,32767, WidenMin)->is_int(); // Java shorts

Suggestion:

  TypeInt::CHAR    = TypeInt::make(0, 65535,      WidenMin)->is_int(); // Java chars
  TypeInt::SHORT   = TypeInt::make(-32768, 32767, WidenMin)->is_int(); // Java shorts

Might as well fix the spacing now

src/hotspot/share/opto/type.cpp line 500:

> 498:   assert( TypeInt::CC_EQ == TypeInt::ZERO,    "types must match for CmpL to work" );
> 499:   assert( TypeInt::CC_GE == TypeInt::BOOL,    "types must match for CmpL to work" );
> 500:   assert( (juint)(TypeInt::CC->_hi - TypeInt::CC->_lo) <= SMALLINT, "CC is truly small");

Why did you remove this check?

src/hotspot/share/opto/type.hpp line 604:

> 602:   const jint _lo, _hi;       // Lower bound, upper bound in the signed domain
> 603:   const juint _ulo, _uhi;    // Lower bound, upper bound in the unsigned domain
> 604:   const juint _zeros, _ones; // Bits that are known to be 0 or 1

It would have been nice to have some sort of `Range` and `KnownBits` class... not sure how feasible this is, especially since `_lo` and `_hi` are already used all over the place...

src/hotspot/share/opto/type.hpp line 606:

> 604:   const juint _zeros, _ones; // Bits that are known to be 0 or 1
> 605: 
> 606:   static const TypeInt* cast(const Type* t) { return t->is_int(); }

Is this even used?

src/hotspot/share/opto/type.hpp line 617:

> 615:   bool is_con(jint i) const { return is_con() && _lo == i; }
> 616:   jint get_con() const { assert(is_con(), "");  return _lo; }
> 617:   // Check if a TypeInt is a subset of this TypeInt (i.e. all elements of the

Suggestion:

  // Check if a jint or TypeInt is a subset of this TypeInt (i.e. all elements of the

-------------

Changes requested by epeter (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17508#pullrequestreview-2277424756
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742093319
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742099992
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742108156
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742112203
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742114496
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742128592
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742137154
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742131718
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742139791
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742143974
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742145788
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742154568
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742158039
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742103749
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742077724
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742078511
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742057104
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742060868
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1742065136


More information about the hotspot-compiler-dev mailing list