RFR: 8282365: Consolidate and improve division by constant idealizations [v42]

Kim Barrett kbarrett at openjdk.org
Sat Dec 30 18:48:03 UTC 2023


On Mon, 25 Dec 2023 17:57:28 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> This patch implements idealisation for unsigned divisions to change a division by a constant into a series of multiplication and shift. I also change the idealisation of `DivI` to get a more efficient series when the magic constant overflows an int32.
>> 
>> In general, the idea behind a signed division transformation is that for a positive constant `d`, we would need to find constants `c` and `m` so that:
>> 
>>     floor(x / d) = floor(x * c / 2**m) for 0 < x < 2**(N - 1) (1)
>>     ceil(x / d) = floor(x * c / 2**m) + 1 for -2**(N - 1) <= x < 0 (2)
>> 
>> The implementation in the original book takes into consideration that the machine may not be able to perform the full multiplication `x * c`, so the constant overflow and we need to add back the dividend as in `DivLNode::Ideal` cases. However, for int32 division, `x * c` cannot overflow an int64. As a result, it is always feasible to just calculate the product and shift the result.
>> 
>> For unsigned multiplication, the situation is somewhat trickier because the condition needs to be twice as strong (the condition (1) and (2) above are mostly the same). This results in the magic constant `c` calculated based on the method presented in Hacker's Delight by Henry S. Warren, Jr. may overflow an uintN. For int division, we can depend on the theorem devised by Arch D. Robison in N-Bit Unsigned Division Via N-Bit Multiply-Add, which states that there exists either:
>> 
>>     c1 in uint32 and m1, such that floor(x / d) = floor(x * c1 / 2**m1) for 0 < x < 2**32 (3)
>>     c2 in uint32 and m2, such that floor(x / d) = floor((x + 1) * c2 / 2**m2) for 0 < x < 2**32 (4)
>> 
>> which means that either `x * c1` never overflows an uint64 or `(x + 1) * c2` never overflows an uint64. And we can perform a full multiplication.
>> 
>> For longs, there is no way to do a full multiplication so we do some basic transformations to achieve a computable formula. The details I have written as comments in the overflow case.
>> 
>> More tests are added to cover the possible patterns.
>> 
>> Please take a look and have some reviews. Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   power of 2

Changes requested by kbarrett (Reviewer).

test/hotspot/gtest/opto/test_constant_division.cpp line 33:

> 31: 
> 32: // Generate a random positive integer of type T in a way that biases
> 33: // towards smaller values

Why is there a bias toward smaller numbers?  Maybe it should be named differently to indicate that bias?

test/hotspot/gtest/opto/test_constant_division.cpp line 54:

> 52: template <>
> 53: julong random<jlong, julong>() {
> 54:   juint bits = juint(os::random()) % 63 + 1;

This change (`&` => `%`, and the similar change below) go a long way toward explaining why I couldn't
puzzle out what this function was intended to do.  Note that `&` has lower precedence than `+`, so the
earlier version was masking with 64.  The new version doesn't have that operator precedence mistake,
though I'd prefer the precedence be made explicit using parens.

test/hotspot/gtest/opto/test_constant_division.cpp line 132:

> 130:   for (int i = 0; i < iter_num;) {
> 131:     UT d = random<T, UT>();
> 132:     if ((d & (d - 1)) == 0) {

We have `is_power_of_2` for this.

test/hotspot/gtest/opto/test_constant_division.cpp line 139:

> 137:     UT N_pos = random<T, UT>();
> 138:     if (N_neg < d && N_pos < d) {
> 139:       continue;

With sufficiently bad luck, we could spin here for a long time.  (Similarly, though much less likely above with
the power-of-2 case.) That doesn't seem great.  Of course, if one does count these skipped cases against
the iteration limit then with sufficiently bad luck one might not test anything.  Rather than skipping the test
here, could you instead modify one of the values and proceed with the test?

-------------

PR Review: https://git.openjdk.org/jdk/pull/9947#pullrequestreview-1799579259
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1438680076
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1438679875
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1438680376
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1438681980


More information about the hotspot-compiler-dev mailing list