optimize KlassInfoTable size to power of 2
Andrew Haley
aph at redhat.com
Tue Dec 11 11:05:41 UTC 2018
On 12/11/18 9:53 AM, 臧琳 wrote:
> And I just did a quick test: with a constant _num_buckets for the modulo, gcc emits several instructions instead
> of idiv, and the speedup is about 12%, while using 65536 for _num_buckets gives a speedup of about 20%.
OK. So GCC's constant optimization is an improvement over what we have already.
There is something that I do not understand. You said that GCC didn't know
that _num_buckets was constant. In that case, how did GCC know not to use
a divide instruction when you tried 65536?
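
For reference, when the divisor is a literal power of two, an unsigned modulo is equivalent to a bit mask, so no divide is needed. A minimal sketch, assuming a 32-bit hash value; the helper names bucket_mod, bucket_mask and count_bucket_mismatches are hypothetical and not taken from HotSpot:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Modulo by a compile-time constant power of two. */
static inline uint32_t bucket_mod(uint32_t hash) {
    return hash % 65536u;
}

/* The same result written as a mask of the low 16 bits. */
static inline uint32_t bucket_mask(uint32_t hash) {
    return hash & 0xFFFFu;
}

/* Walks the 32-bit range in steps of `stride` and counts the values
   for which the two forms disagree (expected: none). */
static uint64_t count_bucket_mismatches(uint32_t stride) {
    assert(stride != 0);
    uint64_t bad = 0;
    for (uint64_t n = 0; n <= UINT32_MAX; n += stride) {
        if (bucket_mod((uint32_t)n) != bucket_mask((uint32_t)n))
            bad++;
    }
    return bad;
}
```

This only holds when the compiler can see the divisor; with a value loaded from a non-constant variable it has to fall back to a general division.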
In any case, if we really do care so much about this, I would have thought
that the best solution would be to use 65537 as the table size because there is a
nice way to calculate n % 65537:
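// Assumes unsigned is 32 bits wide. Uses 65536 == -1 (mod 65537).
unsigned mod_m(unsigned n) {
  unsigned tmp = n % 65536;   // low 16 bits
  tmp -= n / 65536;           // minus the high 16 bits
  if (tmp >= 65537)           // the subtraction wrapped below zero
    tmp += 65537;
  return tmp;
}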
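
The trick relies on 65536 being congruent to -1 modulo 65537, so n = hi * 65536 + lo reduces to lo - hi, with one correction when that difference goes negative. A small sketch of a sanity check against the plain % operator, assuming 32-bit inputs; mod_65537 and count_mismatches are hypothetical names, and the check samples the range rather than proving anything:

```c
#include <assert.h>
#include <stdint.h>

/* Same reduction as above, written with a fixed-width type. */
static uint32_t mod_65537(uint32_t n) {
    uint32_t tmp = n % 65536u;  /* low 16 bits */
    tmp -= n / 65536u;          /* subtract high 16 bits; may wrap */
    if (tmp >= 65537u)          /* wrapped: bring back into range */
        tmp += 65537u;
    return tmp;
}

/* Walks the 32-bit range in steps of `stride` and counts the values
   where the reduction differs from the built-in modulo. */
static uint64_t count_mismatches(uint32_t stride) {
    assert(stride != 0);
    uint64_t bad = 0;
    for (uint64_t n = 0; n <= UINT32_MAX; n += stride) {
        uint32_t v = (uint32_t)n;
        if (mod_65537(v) != v % 65537u)
            bad++;
    }
    return bad;
}
```

A stride of 1 covers every 32-bit value if a full check is wanted; a larger odd stride keeps the run short.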
It's very difficult to prove that using a non-prime table size won't impact the
performance on some systems; using only a few bits of the address isn't worth
the risk, IMO.
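
To illustrate the concern with a synthetic case: if keys are addresses spaced by a power-of-two stride, a power-of-two table size maps them onto very few buckets, while a prime size such as 65537 spreads them out. This is only a sketch under assumed inputs; the stride, the count, and the function distinct_buckets are hypothetical, and it hashes the raw address rather than whatever KlassInfoTable actually computes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_BUCKETS 65537u

/* Counts how many distinct buckets are hit by `count` addresses
   base, base + stride, base + 2*stride, ... under addr % table_size. */
static uint32_t distinct_buckets(uint64_t base, uint64_t stride,
                                 uint32_t count, uint32_t table_size) {
    static unsigned char seen[MAX_BUCKETS];
    assert(table_size > 0 && table_size <= MAX_BUCKETS);
    memset(seen, 0, sizeof seen);

    uint32_t distinct = 0;
    for (uint32_t i = 0; i < count; i++) {
        uint64_t addr = base + stride * i;
        uint32_t b = (uint32_t)(addr % table_size);
        if (!seen[b]) {
            seen[b] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

With a stride of 4096 and 1000 addresses, a table of 65536 buckets uses only 16 of them, whereas a table of 65537 buckets gives each address its own bucket. Real address patterns will be less regular than this.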
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671