RFR: 8234331: Add robust and optimized utility for rounding up to next power of two+

Wed Dec 4 16:32:44 UTC 2019

Hi Claes,

I think this is very good and useful.

General remarks:

- Pity that it only works for 32/64bit values. It would be neat to have
variants for 8 and 16 bit values too.

- +1 for a "lumpier" header.

- the round_down() variants: would it make sense to allow 0 as valid input?
I can imagine scenarios where 0 could be a valid input and result for this
operation.

- I tested on AIX, builds and gtests run through without errors.

----

http://cr.openjdk.java.net/~redestad/8234331/open.03/src/hotspot/share/runtime/threadSMR.cpp.udiff.html

+  int hash_table_size =
round_up_power_of_2(MIN2((int)get_java_thread_list()->length(), 32) << 1);

Style nit, I would have preferred this to be two operations for
readability.

http://cr.openjdk.java.net/~redestad/8234331/open.03/src/hotspot/share/utilities/powerOfTwo.hpp.html

  43 inline typename EnableIf<IsSigned<T>::value, T>::type
round_down_power_of_2(T value) {
...

  47   assert(lz < sizeof(T) * BitsPerByte, "Sanity");

Do we need this if we check the input value for >0 ? (Same question for the
signed version of round_up() below).

  48   assert(lz > 0, "Overflow to negative");

Same here?
Also, small nit, the assert text is slightly off since no overflow happened
yet, but the input value is negative.

(The asserts do not bother me much, I am just trying to understand why we
need them)

-

 100 // Accepts 0 (returns 1); overflows if value is larger than or  equal
to 2^31
 101 // or 2^63, for 32- and 64-bit integers, respectively

Comment is a bit off. True only for unsigned values. Also it should mention
that overflow results in assert to be in sync with the other comments.

----

http://cr.openjdk.java.net/~redestad/8234331/open.03/src/hotspot/share/utilities/count_leading_zeros.hpp.frames.html

template <typename T> inline uint32_t count_leading_zeros(T x) {

Instead of all the "if sizeof(T) == 4|8", could we specialize the functions
for different operand sizes? Example:

template <typename T, size_t n> struct Impl;
template <typename T> struct Impl<T, 8> { static int doit(T v) { return
__builtin_clzll(v); } };
template <typename T> struct Impl<T, 4> { static int doit(T v) { return
__builtin_clz(v); } };
template <typename T> int count_leading_zero(T v) { return Impl<T,
sizeof(T)>::doit(v); }

Would make it easier later to plug in implementations for different sizes,
and we'd still get a compile time error for invalid input types.

-

I wondered why count_leading_zeros did not allow 0 as input until I saw
that __builtin_clz(ll) unfolds to bsr; and that does not work for input=0.
I still find it an artificial limitation but that was there before your
patch.

----

Thank you for the enhanced comment in the fallback implementation, thats
helpful.

----

http://cr.openjdk.java.net/~redestad/8234331/open.03/test/hotspot/gtest/utilities/test_powerOfTwo.cpp.html

It bothers me a bit that we do not test invalid input and overflow. There
is a way, I believe, to have gtests expect an assert. We have
TEST_VM_ASSERT but I am not sure if that is a good idea since crashing
takes time. I also see that we only use it at a single place, so I guess
others share the same concern :)

Alternatively, one could have variants of round_up/down which return a
definite value on overflow. I wonder whether this may be useful someplace
outside the tests.

-

Just a thought, but could you cut down the code for the many TEST functions
if you would use a templatized test function? Basically, something like

template <class T> struct MyMaxImpl;
template <int32_t> MyMaxImpl   { static int32 get() { return
int32_max_pow2; } };
.. etc

template <typename T> void round_up_test() {
  EXPECT_EQ(round_up_power_of_2((T)1), (T)1) << "value = " << 1;
...
    const T max = MyMaxImpl<T>::get();
  EXPECT_EQ(round_up_power_of_2(max - 1), max)
...
}

TEST(power_of_2, round_up_power_of_2_signed_32) {
  round_up_do_test<int32_t>();
}

That way the test would also be easily extendable later if we decide to add
support for uint8_t or uint16_t.

Thanks, Thomas

On Wed, Dec 4, 2019 at 12:36 AM Claes Redestad <claes.redestad at oracle.com>
wrote:

> Hi Thomas and others,
>
> thanks for the thorough feedback!
>
> I also had offline discussions about the existing utilities in
> zUtils.inline.hpp with Per Liden, and decided to try and work this into
> a templated solution that enables both 32-bit and 64-bit
> implementations to work correctly - along with signed variants
> (especially nice with types such as size_t).
>
>   http://cr.openjdk.java.net/~redestad/8234331/open.03
>
> This is a pretty significant rework. Changes since .01:
>
> - introduce utilities/powerOfTwo.hpp
> - move round_up_* and round_down_* from zUtils, re-implement to use
>    the new implementation
> - implement next_* as round_up_*(value + 1) as per John's suggestion.
> - added tests
> - corrected the issues Thomas, Ivan and others pointed out, mainly
>    ensuring we don't depend on undefined behavior in neither product
>    code, asserts nor tests
> - .. and ensure to use next and round_up as appropriate to preserve
>    behavior
>
> Other notes:
>
> - Moving existing power-of-two functions like is_power_of_2 from
>    globalDefinitions.hpp to powerOfTwo.hpp would be straightforward, but
>    tedious. I'd like to defer this to a follow-up
> - The xlc implementation is untested, but should work. If someone can
>    verify, I'd be much obliged.
> - Many thanks to Erik Österlund, who guided me through a maze of
>    undefined behavior and template metaprogramming to a workable and
>    relatively clean implementation
> - HotSpot shrinks by ~15Kb :-)
>
> Testing: tier1-5
>
> On 2019-11-28 08:34, Thomas Stüfe wrote:
> > Hi Claes,
> >
> > I think this is useful. Why not a 64bit variant too? If you do not want
> > to go through the hassle of providing a count_leading_zeros(uint64_t),
> > you could call the 32bit variant twice and take care of endianness for
> > the caller.
> >
> > --
> >
> > In inline int32_t next_power_of_two(int32_t value) , should we weed out
> > negative input values right away instead of asserting at the end of the
> > function?
> >
> > --
> >
> > The functions will always return the next power of two, even if the
> > input is a power of two - e.g. "2" for "1". Is that intended? It would
> > be nice to have an API comment in the header describing these corner
> > cases (what happens for negative input, what happens if input is power
> 2).
> >
> > --
> >
> > The patch can cause subtle differences in some caller code, I think, if
> > input value is a power of 2 already. See e.g:
> >
> >
> http://cr.openjdk.java.net/~redestad/8234331/open.01/src/hotspot/share/libadt/dict.cpp.udiff.html
> >
> > -  i=16;
> > -  while( i < size ) i <<= 1;
> > +  i = MAX2(16, (int)next_power_of_two(size));
> >
> > If i == size == 16, old code would keep i==16, new code would come to
> > i==32, I think.
> >
> >
> http://cr.openjdk.java.net/~redestad/8234331/open.01/src/hotspot/share/opto/phaseX.cpp.udiff.html
> >
> >
>  //------------------------------round_up---------------------------------------
> >   // Round up to nearest power of 2
> > -uint NodeHash::round_up( uint x ) {
> > -  x += (x>>2);                  // Add 25% slop
> > -  if( x <16 ) return 16;        // Small stuff
> > -  uint i=16;
> > -  while( i < x ) i <<= 1;       // Double to fit
> > -  return i;                     // Return hash table size
> > +uint NodeHash::round_up(uint x) {
> > +  x += (x >> 2);                  // Add 25% slop
> > +  return MAX2(16U, next_power_of_two(x));
> >   }
> >
> > same here. If x == 16, before we'd return 16, now 32.
> >
> > ---
> >
> >
> http://cr.openjdk.java.net/~redestad/8234331/open.01/src/hotspot/share/runtime/threadSMR.cpp.udiff.html
> >
> > I admit I do not understand the current coding :) I do not believe it
> > works for all input values, e.g. were
> > get_java_thread_list()->length()==1025, we'd get 1861 - if I am not
> > mistaken. Your code is definitely clearer but not equivalent to the old
> one.
>
> FTR, the algorithm used is described here:
>
> http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2
>
> (1025 should round up to 2048)
>
> Thanks!
>
> /Claes
>
> >
> > ---
> >
> > In the end, I wonder whether we should have two kind of APIs, or a
> > parameter, distinguishing between "next power of 2" and "next power of 2
> > unless input value is already power of 2".
> >
> > Cheers, Thomas
> >
> >
> >
> >
> >
> > On Tue, Nov 26, 2019 at 10:42 AM Claes Redestad
> > <claes.redestad at oracle.com <mailto:claes.redestad at oracle.com>> wrote:
> >
> >     Hi,
> >
> >     in various places in the hotspot we have custom code to calculate the
> >     next power of two, some of which have potential to go into an
> infinite
> >     loop in case of an overflow.
> >
> >     This patch proposes adding next_power_of_two utility methods which
> >     avoid infinite loops on overflow, while providing slightly more
> >     efficient code in most cases.
> >
> >     Bug: https://bugs.openjdk.java.net/browse/JDK-8234331
> >     Webrev: http://cr.openjdk.java.net/~redestad/8234331/open.01/
> >
> >     Testing: tier1-3
> >
> >     Thanks!
> >
> >     /Claes
> >
>