RFR: 8315066: Add unsigned bounds and known bits to TypeInt/Long [v9]
Emanuel Peter
epeter at openjdk.org
Wed Sep 4 10:07:27 UTC 2024
On Tue, 3 Sep 2024 20:39:41 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:
>> Hi,
>>
>> This patch adds unsigned bounds and known bits constraints to TypeInt and TypeLong. This opens more transformation opportunities in an elegant manner as well as helps avoid some ad-hoc rules in Hotspot.
>>
>> In general, a TypeInt/Long represents a set of values x that satisfies: x s>= lo && x s<= hi && x u>= ulo && x u<= uhi && (x & zeros) == 0 && (~x & ones) == 0. These constraints are not independent, e.g. an int that lies in [0, 3] in signed domain must also lie in [0, 3] in unsigned domain and have all bits but the last 2 being unset. As a result, we must normalize the constraints (tighten the constraints so that they are optimal) before constructing a TypeInt/Long instance.
>>
>> This is extracted from #15440 , node value transformations are left for later PRs. I have also added unit tests to verify the soundness of constraint normalization.
>>
>> Please kindly review, thanks a lot.
>>
>> Testing
>>
>> - [x] GHA
>> - [x] Linux x64, tier 1-4
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
>
> move static_asserts
I got myself quite confused reading the code now. I have to admit I did not sleep much the last 2 nights, so I hope it does not just all come down to that... 😅
But one general comment: @merykitty I think you are very very strong in math. I wasn't bad at it either, and love understanding bit manipulations. But I'm definitely struggling hard, and do not have the time to reverse-engineer the code, or write proofs myself. And I would really like to see more proofs. So this will take quite a bit of effort to make this code more readable.
I mean we are going to spend a lot of time in the future with this optimization. There will be countless follow-up bugs with them. So we need to have really high-quality code that many people can fairly easily understand ;)
Please just assume that your reviewers are novices in all the tricks you are applying, and lead us through everything step-by-step, starting with clear definitions.
All of that said: I really admire your work, and am excited where this is going :)
src/hotspot/share/opto/compile.cpp line 4485:
> 4483: index_max = sizetype->_hi - 1;
> 4484: }
> 4485: const TypeInt* iidxtype = TypeInt::make(0, index_max, Type::WidenMax)->is_int();
Can you explain this change, and the new condition `sizetype->_hi > 0`?
src/hotspot/share/opto/compile.hpp line 928:
> 926: // Workhorse function to sort out the blocked Node_Notes array:
> 927: Node_Notes* locate_node_notes(GrowableArray<Node_Notes*>* arr,
> 928: int idx, bool can_grow = false);
What was wrong with the `inline`?
src/hotspot/share/opto/rangeinference.cpp line 46:
> 44: RangeInt<U> _bounds;
> 45: KnownBits<U> _bits;
> 46: };
Could there be a `static_assert` for the `U` unsigned type?
A quick comment about the semantics of these classes would be appreciated.
src/hotspot/share/opto/rangeinference.cpp line 50:
> 48: // Try to tighten the bound constraints from the known bit information
> 49: // E.g: if lo = 0 but the lowest bit is always 1 then we can tighten
> 50: // lo = 1
Add an example like this:
lo = 2, hi = 9
zeros = 1111
ones = 1100
-> 4-aligned
0 1 2 3 4 5 6 7 8 9 10
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010
bits: ok . . . ok . . . ok . .
bounds: lo hi
adjust: --------> lo hi <---
src/hotspot/share/opto/rangeinference.cpp line 55:
> 53: adjust_bounds_from_bits(const RangeInt<U>& bounds, const KnownBits<U>& bits) {
> 54: // Find the minimum value that is not less than lo and satisfies bits
> 55: auto adjust_lo = [](U lo, const KnownBits<U>& bits) {
I'm not ok with a lambda that is larger than the body of its enclosing function. We also have no benefit here from using a lambda, other than hiding it. But we could better model this with a class that has public and private methods.
src/hotspot/share/opto/rangeinference.cpp line 64:
> 62: assert(zero_violation == 0, "");
> 63: return lo;
> 64: }
It would be nice if you state a definition `violation` in words in a comment.
lo = 1100
zeros = 1111
ones = 1111
zv = 1100
ov = 0011
-> I struggle to see what exactly is violated here...
-> The "violations" are non-zero, so I'd assume the lo is outside what the bits allow... but that is not true.
src/hotspot/share/opto/rangeinference.cpp line 108:
> 106: if (new_hi > bounds._hi) {
> 107: return {true, false, {}};
> 108: }
I need a proof that this is ok.
src/hotspot/share/opto/rangeinference.cpp line 118:
> 116: // extracting the common prefix of lo and hi and combining with the current
> 117: // bit constraints
> 118: // E.g: if lo = 0 and hi = 10, then all but the lowest 4 bits must be 0
Please add some ASCII art like I proposed above.
src/hotspot/share/opto/rangeinference.cpp line 131:
> 129: bool progress = (new_zeros != bits._zeros) || (new_ones != bits._ones);
> 130: bool present = ((new_zeros & new_ones) == 0);
> 131: return {progress, present, {new_zeros, new_ones}};
Too dense for me to read, after 2 min I gave up. I need comments ;)
src/hotspot/share/opto/rangeinference.cpp line 136:
> 134: // Try to tighten both the bounds and the bits at the same time
> 135: // Iteratively tighten 1 using the other until no progress is made.
> 136: // This function converges because bit constraints converge fast.
You could say that each iteration constrains at least one new bit, and we have only 32 or 64 bits in total. That is correct, right?
src/hotspot/share/opto/rangeinference.cpp line 139:
> 137: template <class U>
> 138: static SimpleCanonicalResult<U>
> 139: normalize_constraints_simple(const RangeInt<U>& bounds, const KnownBits<U>& bits) {
What is the difference between "canonical" and "normalized"?
src/hotspot/share/opto/rangeinference.cpp line 169:
> 167: if (srange._lo > srange._hi ||
> 168: urange._lo > urange._hi ||
> 169: (_bits._zeros & _bits._ones) != 0) {
Am I misunderstanding `zeros` and `ones`? I thought that if `zeros[i] = ones[i] = 1`, then it can be either a 0 or one there, and so if you AND and get a 1, then you know you have multiple options.
Or are the bits all inverted, so that `zeros=0000, ones=1111` means we can only have zeros and no ones?
Oh dear am I confused now 🙈
src/hotspot/share/opto/rangeinference.cpp line 170:
> 168: urange._lo > urange._hi ||
> 169: (_bits._zeros & _bits._ones) != 0) {
> 170: return {false, {}};
Why not make a special function `CanonicalizedTypeIntPrototype<S,U>::make_empty()`? Would be more readable here.
src/hotspot/share/opto/rangeinference.hpp line 46:
> 44: U _zeros;
> 45: U _ones;
> 46: };
Can you please give a definition / specification of what set bits mean, together with a few examples?
-------------
Changes requested by epeter (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17508#pullrequestreview-2279379523
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743309996
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743310916
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743317541
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743369065
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743408727
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743351860
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743406337
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743409735
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743414178
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743418278
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743416368
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743431677
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743427354
PR Review Comment: https://git.openjdk.org/jdk/pull/17508#discussion_r1743433348
More information about the hotspot-compiler-dev
mailing list