RFR: 8282365: Consolidate and improve division by constant idealizations [v35]

Thu Dec 7 22:54:47 UTC 2023

On Thu, 7 Dec 2023 14:24:31 GMT, Quan Anh Mai <qamai at openjdk.org> wrote:

>> This patch implements idealisation for unsigned divisions to change a division by a constant into a series of multiplication and shift. I also change the idealisation of `DivI` to get a more efficient series when the magic constant overflows an int32.
>> 
>> In general, the idea behind a signed division transformation is that for a positive constant `d`, we would need to find constants `c` and `m` so that:
>> 
>>     floor(x / d) = floor(x * c / 2**m) for 0 < x < 2**(N - 1) (1)
>>     ceil(x / d) = floor(x * c / 2**m) + 1 for -2**(N - 1) <= x < 0 (2)
>> 
>> The implementation in the original book takes into consideration that the machine may not be able to perform the full multiplication `x * c`, so the constant overflow and we need to add back the dividend as in `DivLNode::Ideal` cases. However, for int32 division, `x * c` cannot overflow an int64. As a result, it is always feasible to just calculate the product and shift the result.
>> 
>> For unsigned multiplication, the situation is somewhat trickier because the condition needs to be twice as strong (the condition (1) and (2) above are mostly the same). This results in the magic constant `c` calculated based on the method presented in Hacker's Delight by Henry S. Warren, Jr. may overflow an uintN. For int division, we can depend on the theorem devised by Arch D. Robison in N-Bit Unsigned Division Via N-Bit Multiply-Add, which states that there exists either:
>> 
>>     c1 in uint32 and m1, such that floor(x / d) = floor(x * c1 / 2**m1) for 0 < x < 2**32 (3)
>>     c2 in uint32 and m2, such that floor(x / d) = floor((x + 1) * c2 / 2**m2) for 0 < x < 2**32 (4)
>> 
>> which means that either `x * c1` never overflows an uint64 or `(x + 1) * c2` never overflows an uint64. And we can perform a full multiplication.
>> 
>> For longs, there is no way to do a full multiplication so we do some basic transformations to achieve a computable formula. The details I have written as comments in the overflow case.
>> 
>> More tests are added to cover the possible patterns.
>> 
>> Please take a look and have some reviews. Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with one additional commit since the last revision:
> 
>   missing include

Not a full review, just a bit of spot-checking.  I'm not planning to review the magic
constants algorithm.

src/hotspot/share/opto/divconstants.cpp line 27:

> 25: #include "precompiled.hpp"
> 26: #include <limits>
> 27: #include <type_traits>

We generally put standard library includes at the end.  There was a long internal to Oracle discussion
about include ordering about a year ago that still hasn't made it into the style guide.

src/hotspot/share/opto/divconstants.cpp line 33:

> 31: // division by constant into a multiply/shift series.
> 32: 
> 33: // (1) Theory:

All these blank lines in what is really a single large block comment seem odd to me.  Anyone else bothered?

src/hotspot/share/opto/divnode.cpp line 85:

> 83: }
> 84: 
> 85: // magic_divide_constants in utilities/javaArithmetic.hpp calculates the constant c, s

javaArithmetic.hpp doesn't exist in this version of the PR.

src/hotspot/share/opto/divnode.hpp line 40:

> 38: template <class T>
> 39: void magic_divide_constants(T d, T N_neg, T N_pos, juint min_s, T& c, bool& c_ovf, juint& s);
> 40: void magic_divide_constants_round_down(juint d, juint& c, juint& s);

The definitions of these are in new file divconstants.cpp.  So I think these should be in divconstants.hpp.

src/hotspot/share/utilities/globalDefinitions.hpp line 1107:

> 1105:   using U = std::make_unsigned_t<T>;
> 1106:   return (x >= 0) ? x : U(0) - U(x);
> 1107: }

I understand what this to change to ABS is doing, though it's not obvious.  (Dodging overflow UB
for -x when x is the minimum value of a signed integral type.)  I'm not entirely sure that's a wise move.

As written this will trigger `-Wconversion` warnings someday (maybe).  static_casting the subtraction
result to T will eliminate that concern.

However, this is an API change.  The previous definition worked for floating point types, while this
change does not.  (std::make_unsigned requires T be an integral or enum, but not bool, type.)

I also don't understand why this change is part of this PR.

So I'm inclined to say no to this change without some compelling rationale.

test/hotspot/gtest/opto/test_constant_division.cpp line 29:

> 27: #include <random>
> 28: #include <type_traits>
> 29: #include <vector>

We mostly don't use C++ standard library facilities in HotSpot, even in tests; see the style guide.
<type_traits> is permitted.  We have GrowableArray instead of <vector>.  And we have os::random,
unless this really needs something better (seems unlikely, other than needing to deal with wider
types than int.)  Also, stdlib includes at the end again (except I think unittest.hpp is supposed to
_really_ be last.)

test/hotspot/gtest/opto/test_constant_division.cpp line 31:

> 29: #include <vector>
> 30: 
> 31: #undef assert

We have utilities/vmassert_(uninstall,reinstall).hpp for dealing with stdlib assert.

-------------

Changes requested by kbarrett (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/9947#pullrequestreview-1771164625
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419683294
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419684321
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419690273
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419688078
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419725886
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419740698
PR Review Comment: https://git.openjdk.org/jdk/pull/9947#discussion_r1419740532