RFR: 8234331: Add robust and optimized utility for rounding up to next power of two+
Thomas Stüfe
thomas.stuefe at gmail.com
Wed Dec 4 16:32:44 UTC 2019
Hi Claes,
I think this is very good and useful.
General remarks:
- Pity that it only works for 32/64bit values. It would be neat to have
variants for 8 and 16 bit values too.
- +1 for a "lumpier" header.
- the round_down() variants: would it make sense to allow 0 as valid input?
I can imagine scenarios where 0 could be a valid input and result for this
operation.
- I tested on AIX, builds and gtests run through without errors.
----
http://cr.openjdk.java.net/~redestad/8234331/open.03/src/hotspot/share/runtime/threadSMR.cpp.udiff.html
+ int hash_table_size =
round_up_power_of_2(MIN2((int)get_java_thread_list()->length(), 32) << 1);
Style nit, I would have preferred this to be two operations for
readability.
http://cr.openjdk.java.net/~redestad/8234331/open.03/src/hotspot/share/utilities/powerOfTwo.hpp.html
43 inline typename EnableIf<IsSigned<T>::value, T>::type
round_down_power_of_2(T value) {
...
47 assert(lz < sizeof(T) * BitsPerByte, "Sanity");
Do we need this if we check the input value for >0 ? (Same question for the
signed version of round_up() below).
48 assert(lz > 0, "Overflow to negative");
Same here?
Also, small nit, the assert text is slightly off since no overflow happened
yet, but the input value is negative.
(The asserts do not bother me much, I am just trying to understand why we
need them)
-
100 // Accepts 0 (returns 1); overflows if value is larger than or equal
to 2^31
101 // or 2^63, for 32- and 64-bit integers, respectively
Comment is a bit off. True only for unsigned values. Also it should mention
that overflow results in assert to be in sync with the other comments.
----
http://cr.openjdk.java.net/~redestad/8234331/open.03/src/hotspot/share/utilities/count_leading_zeros.hpp.frames.html
template <typename T> inline uint32_t count_leading_zeros(T x) {
Instead of all the "if sizeof(T) == 4|8", could we specialize the functions
for different operand sizes? Example:
template <typename T, size_t n> struct Impl;
template <typename T> struct Impl<T, 8> { static int doit(T v) { return
__builtin_clzll(v); } };
template <typename T> struct Impl<T, 4> { static int doit(T v) { return
__builtin_clz(v); } };
template <typename T> int count_leading_zero(T v) { return Impl<T,
sizeof(T)>::doit(v); }
Would make it easier later to plug in implementations for different sizes,
and we'd still get a compile time error for invalid input types.
-
I wondered why count_leading_zeros did not allow 0 as input until I saw
that __builtin_clz(ll) unfolds to bsr; and that does not work for input=0.
I still find it an artificial limitation but that was there before your
patch.
----
Thank you for the enhanced comment in the fallback implementation, thats
helpful.
----
http://cr.openjdk.java.net/~redestad/8234331/open.03/test/hotspot/gtest/utilities/test_powerOfTwo.cpp.html
It bothers me a bit that we do not test invalid input and overflow. There
is a way, I believe, to have gtests expect an assert. We have
TEST_VM_ASSERT but I am not sure if that is a good idea since crashing
takes time. I also see that we only use it at a single place, so I guess
others share the same concern :)
Alternatively, one could have variants of round_up/down which return a
definite value on overflow. I wonder whether this may be useful someplace
outside the tests.
-
Just a thought, but could you cut down the code for the many TEST functions
if you would use a templatized test function? Basically, something like
template <class T> struct MyMaxImpl;
template <int32_t> MyMaxImpl { static int32 get() { return
int32_max_pow2; } };
.. etc
template <typename T> void round_up_test() {
EXPECT_EQ(round_up_power_of_2((T)1), (T)1) << "value = " << 1;
...
const T max = MyMaxImpl<T>::get();
EXPECT_EQ(round_up_power_of_2(max - 1), max)
...
}
TEST(power_of_2, round_up_power_of_2_signed_32) {
round_up_do_test<int32_t>();
}
That way the test would also be easily extendable later if we decide to add
support for uint8_t or uint16_t.
Thanks, Thomas
On Wed, Dec 4, 2019 at 12:36 AM Claes Redestad <claes.redestad at oracle.com>
wrote:
> Hi Thomas and others,
>
> thanks for the thorough feedback!
>
> I also had offline discussions about the existing utilities in
> zUtils.inline.hpp with Per Liden, and decided to try and work this into
> a templated solution that enables both 32-bit and 64-bit
> implementations to work correctly - along with signed variants
> (especially nice with types such as size_t).
>
> http://cr.openjdk.java.net/~redestad/8234331/open.03
>
> This is a pretty significant rework. Changes since .01:
>
> - introduce utilities/powerOfTwo.hpp
> - move round_up_* and round_down_* from zUtils, re-implement to use
> the new implementation
> - implement next_* as round_up_*(value + 1) as per John's suggestion.
> - added tests
> - corrected the issues Thomas, Ivan and others pointed out, mainly
> ensuring we don't depend on undefined behavior in neither product
> code, asserts nor tests
> - .. and ensure to use next and round_up as appropriate to preserve
> behavior
>
> Other notes:
>
> - Moving existing power-of-two functions like is_power_of_2 from
> globalDefinitions.hpp to powerOfTwo.hpp would be straightforward, but
> tedious. I'd like to defer this to a follow-up
> - The xlc implementation is untested, but should work. If someone can
> verify, I'd be much obliged.
> - Many thanks to Erik Österlund, who guided me through a maze of
> undefined behavior and template metaprogramming to a workable and
> relatively clean implementation
> - HotSpot shrinks by ~15Kb :-)
>
> Testing: tier1-5
>
> On 2019-11-28 08:34, Thomas Stüfe wrote:
> > Hi Claes,
> >
> > I think this is useful. Why not a 64bit variant too? If you do not want
> > to go through the hassle of providing a count_leading_zeros(uint64_t),
> > you could call the 32bit variant twice and take care of endianness for
> > the caller.
> >
> > --
> >
> > In inline int32_t next_power_of_two(int32_t value) , should we weed out
> > negative input values right away instead of asserting at the end of the
> > function?
> >
> > --
> >
> > The functions will always return the next power of two, even if the
> > input is a power of two - e.g. "2" for "1". Is that intended? It would
> > be nice to have an API comment in the header describing these corner
> > cases (what happens for negative input, what happens if input is power
> 2).
> >
> > --
> >
> > The patch can cause subtle differences in some caller code, I think, if
> > input value is a power of 2 already. See e.g:
> >
> >
> http://cr.openjdk.java.net/~redestad/8234331/open.01/src/hotspot/share/libadt/dict.cpp.udiff.html
> >
> > - i=16;
> > - while( i < size ) i <<= 1;
> > + i = MAX2(16, (int)next_power_of_two(size));
> >
> > If i == size == 16, old code would keep i==16, new code would come to
> > i==32, I think.
> >
> >
> http://cr.openjdk.java.net/~redestad/8234331/open.01/src/hotspot/share/opto/phaseX.cpp.udiff.html
> >
> >
> //------------------------------round_up---------------------------------------
> > // Round up to nearest power of 2
> > -uint NodeHash::round_up( uint x ) {
> > - x += (x>>2); // Add 25% slop
> > - if( x <16 ) return 16; // Small stuff
> > - uint i=16;
> > - while( i < x ) i <<= 1; // Double to fit
> > - return i; // Return hash table size
> > +uint NodeHash::round_up(uint x) {
> > + x += (x >> 2); // Add 25% slop
> > + return MAX2(16U, next_power_of_two(x));
> > }
> >
> > same here. If x == 16, before we'd return 16, now 32.
> >
> > ---
> >
> >
> http://cr.openjdk.java.net/~redestad/8234331/open.01/src/hotspot/share/runtime/threadSMR.cpp.udiff.html
> >
> > I admit I do not understand the current coding :) I do not believe it
> > works for all input values, e.g. were
> > get_java_thread_list()->length()==1025, we'd get 1861 - if I am not
> > mistaken. Your code is definitely clearer but not equivalent to the old
> one.
>
> FTR, the algorithm used is described here:
>
> http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
>
> (1025 should round up to 2048)
>
> Thanks!
>
> /Claes
>
> >
> > ---
> >
> > In the end, I wonder whether we should have two kind of APIs, or a
> > parameter, distinguishing between "next power of 2" and "next power of 2
> > unless input value is already power of 2".
> >
> > Cheers, Thomas
> >
> >
> >
> >
> >
> > On Tue, Nov 26, 2019 at 10:42 AM Claes Redestad
> > <claes.redestad at oracle.com <mailto:claes.redestad at oracle.com>> wrote:
> >
> > Hi,
> >
> > in various places in the hotspot we have custom code to calculate the
> > next power of two, some of which have potential to go into an
> infinite
> > loop in case of an overflow.
> >
> > This patch proposes adding next_power_of_two utility methods which
> > avoid infinite loops on overflow, while providing slightly more
> > efficient code in most cases.
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8234331
> > Webrev: http://cr.openjdk.java.net/~redestad/8234331/open.01/
> >
> > Testing: tier1-3
> >
> > Thanks!
> >
> > /Claes
> >
>
More information about the hotspot-compiler-dev
mailing list