optimize KlassInfoTable size to power of 2

Andrew Haley aph at redhat.com
Tue Dec 11 11:05:41 UTC 2018


On 12/11/18 9:53 AM, 臧琳 wrote:
>   And I just did a quick test: using the constant _num_buckets for the modulo, gcc issues several instructions instead
> of idiv, and the speedup is about 12%, while using 65536 for _num_buckets gives a speedup of about 20%.

OK. So GCC's constant optimization is an improvement over what we have already.

There is something that I do not understand. You said that GCC didn't know
that _num_buckets was constant. In that case, how did GCC know not to use
a divide instruction when you tried 65536?
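
For reference, a minimal sketch of why a literal 65536 needs no divide: for
unsigned operands, n % 65536 equals the low 16 bits of n, so a compiler that
can see the literal is free to emit a mask. The function names below are
hypothetical and are not taken from the HotSpot sources.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Modulo by a literal power of two: the divisor is visible to the compiler. */
uint32_t mod_pow2(uint32_t n) {
  return n % 65536u;
}

/* The equivalent operation written as a mask of the low 16 bits. */
uint32_t mask_pow2(uint32_t n) {
  return n & 0xFFFFu;
}
```

Both functions return the same value for every 32-bit input; whether a given
compiler emits identical code for them depends on the compiler and flags.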

In any case, if we really do care so much about this, I would have thought
that the best solution would be to use 65537 as the table size because there is a
nice way to calculate n % 65537:

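// Assumes unsigned is 32 bits. Since 65536 == -1 (mod 65537),
// n = hi * 65536 + lo is congruent to lo - hi.
unsigned mod_m(unsigned n) {
  unsigned tmp = n % 65536;  // lo: the low 16 bits
  tmp -= n / 65536;          // minus hi: the high 16 bits
  if (tmp >= 65537)          // the subtraction wrapped below zero
    tmp += 65537;            // wraps back into range
  return tmp;
}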
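
A quick way to gain confidence in this reduction is to compare it with the
plain % operator. The sketch below restates mod_m with an explicit 32-bit type
and checks a sample of inputs; mod_65537 and count_mismatches are hypothetical
names, and the sampling step of 4099 is an arbitrary choice.

```c
#include <stdint.h>

/* Same reduction as mod_m above, with the 32-bit width made explicit. */
uint32_t mod_65537(uint32_t n) {
  uint32_t tmp = n % 65536u;  /* low 16 bits */
  tmp -= n / 65536u;          /* minus high 16 bits: 65536 == -1 (mod 65537) */
  if (tmp >= 65537u)          /* the subtraction wrapped below zero */
    tmp += 65537u;            /* wraps back into the range 0..65536 */
  return tmp;
}

/* Hypothetical self-check: counts sampled inputs where the result
   differs from the ordinary % operator. */
unsigned count_mismatches(void) {
  unsigned bad = 0;
  for (uint64_t n = 0; n <= UINT32_MAX; n += 4099) {
    uint32_t v = (uint32_t)n;
    if (mod_65537(v) != v % 65537u)
      bad++;
  }
  return bad;
}
```

The check relies on the operand being exactly 32 bits wide, so that both
halves fit in 16 bits; a wider type would need an extra folding step.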

It's very difficult to prove that using a non-prime table size won't impact the
performance on some systems; using only a few bits of the address isn't worth
the risk, IMO.
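
To illustrate the concern, here is a small sketch that counts how many
distinct buckets are used when evenly spaced keys are reduced modulo the
table size. Everything in it is an assumption made for illustration: the keys
stand in for aligned addresses, the key itself is used as the hash, and the
base value and stride are arbitrary. It does not model the hash function
actually used by the table.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical experiment: reduce `count` keys spaced `stride` apart
   modulo `table_size` and return the number of distinct buckets hit.
   Returns 0 if the scratch array cannot be allocated. */
unsigned distinct_buckets(uint32_t table_size, uint32_t stride, uint32_t count) {
  unsigned char *used = calloc(table_size, 1);
  if (used == NULL)
    return 0;
  unsigned distinct = 0;
  uint64_t key = 0x10000000u;  /* arbitrary aligned starting "address" */
  for (uint32_t i = 0; i < count; i++, key += stride) {
    uint32_t b = (uint32_t)(key % table_size);
    if (!used[b]) {
      used[b] = 1;
      distinct++;
    }
  }
  free(used);
  return distinct;
}
```

With 10000 keys spaced 4096 apart, distinct_buckets(65536, 4096, 10000)
returns 16, while distinct_buckets(65537, 4096, 10000) returns 10000. A hash
that first shifts or mixes the address bits would change these numbers.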

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

