Faster HashMap implementation
dl at cs.oswego.edu
Tue Jun 9 12:09:45 UTC 2009
Alex Yakovlev wrote:
> entrySet() iterator which is very expensive.
> Very big speedup would be to reuse Entry object,
We had originally done this in a few classes.
The Iterator spec even allows it. However, most
usages ignore that part of the spec, and it led
to a bunch of bug reports, so now all Entry iterators
for classes using non-Entry-based representations
create new ones. When running on the -server compiler
this is not a huge loss, but it hurts noticeably
using -client. BTW, when checking performance,
do always run both, and also run across multiple
machines. Especially for hash tables, there are
usually interactions with processor-level properties.
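To make the hazard concrete, here is a hypothetical illustration (my own sketch, not JDK code) of an entrySet()-style iterator that reuses one mutable Entry, as the Iterator spec technically permits, together with the common client pattern of retaining entries that triggers the bug reports:

```java
import java.util.*;

// Hedged illustration: a reusing iterator, and a client that stashes entries.
public class ReusedEntryDemo {
    static Iterator<Map.Entry<String, Integer>> reusingIterator(Map<String, Integer> src) {
        Iterator<Map.Entry<String, Integer>> it = src.entrySet().iterator();
        // One entry object, mutated in place on every call to next().
        AbstractMap.SimpleEntry<String, Integer> reused = new AbstractMap.SimpleEntry<>("", 0);
        return new Iterator<Map.Entry<String, Integer>>() {
            public boolean hasNext() { return it.hasNext(); }
            public Map.Entry<String, Integer> next() {
                // SimpleEntry's key is final, so only the value is reused here;
                // that is enough to show the aliasing problem.
                reused.setValue(it.next().getValue());
                return reused;
            }
        };
    }

    // Returns true when two retained entries turn out to be the same object.
    static boolean retainedEntriesAlias() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("a", 1);
        m.put("b", 2);
        List<Map.Entry<String, Integer>> kept = new ArrayList<>();
        reusingIterator(m).forEachRemaining(kept::add);
        return kept.get(0) == kept.get(1); // both alias the single reused entry
    }
}
```

Every element the client kept is the same object, holding only the last value seen -- exactly the misuse that the spec allows but real code does not tolerate.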
> 3) as for DenseMapMicroBenchmark test - I think that
> CompactHashMap.scala will perform well in such tests
> since it stores primitive types in primitive arrays,
Right. The upcoming Scala specialization support is great
for collections especially. Too bad there doesn't seem
to be a plausible path to this in Java. (Paul Tyma once created
a Java "specifics compiler" to do it manually, but I don't
see this around any more -- http://www.classhat.com seems to have
> Actually, the Scala version is optimized for memory footprint:
> and there are very slow resizes when switching
> from byte to short and from short to int arrays.
More generally, the cost of switching representations
and the overhead for choosing and switching are the main
reasons there are so few JDK classes that use multiple internal
data structures. But still there are probably a few
cases where it is worthwhile.
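As a hedged illustration of why those resizes are slow (assuming, as the quoted description suggests, that the index arrays store unsigned values), widening a byte[] index array to short[] and then to int[] means copying and zero-extending every element:

```java
// Hypothetical sketch, not the actual CompactHashMap code: widening an
// unsigned index array when indices outgrow the current element type.
public class WideningIndex {
    static short[] widenToShort(byte[] a) {
        short[] b = new short[a.length];
        for (int i = 0; i < a.length; i++)
            b[i] = (short) (a[i] & 0xFF);   // zero-extend unsigned byte
        return b;
    }

    static int[] widenToInt(short[] a) {
        int[] b = new int[a.length];
        for (int i = 0; i < a.length; i++)
            b[i] = a[i] & 0xFFFF;           // zero-extend unsigned short
        return b;
    }
}
```

Each switch is an O(n) full copy on top of the usual rehash, which is the extra cost of multi-representation designs.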
> Well... It is also possible to reorganize index bit structure
> to allow for example up to 2.0 load factor but I am not sure
> if it is really needed.
The main issue is that people (should) use large load factors
only when they want to save pre-allocated array (table) overhead,
which yours doesn't do. Using non-default load factors is
uncommon though. Scanning http://www.google.com/codesearch for
lang:java new\sHashMap\(\S+,\s*\d+\.\d+f\)
and further variants (which aren't completely accurate but
close enough) finds only a few load factors > 1, and none I saw > 2.0.
Still, there probably needs to be a better effort to
approximately match current space consumption in these cases.
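For reference, a small sketch of the arithmetic, using HashMap's documented threshold rule (capacity times load factor); the class and method names are my own:

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch: a large load factor only saves table space by deferring
// resizes. With initial capacity 16 and load factor 2.0f, the resize
// threshold is 16 * 2.0 = 32 entries, so all 32 insertions below fit in
// the original 16-slot table, at the price of longer collision chains.
public class LoadFactorDemo {
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor); // HashMap's documented rule
    }

    public static void main(String[] args) {
        Map<Integer, Integer> m = new HashMap<>(16, 2.0f);
        for (int i = 0; i < threshold(16, 2.0f); i++) m.put(i, i);
        System.out.println(m.size());
    }
}
```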
> It strikes me that there might be a bigger win available
> here by using some hybrid strategy based on
> IdentityHashMap-like single-indirect (i.e., usually
> at most a single cache miss) for collision-free accesses
> but using a variant of your clever packed-indices
> under collisions. I might try this out someday.
> Interesting, I'm looking forward to seeing it.
On a little bit of preliminary exploration, it might
be better yet to use something like ternary tries for
collision/overflow. It might be a while before I get to it, though.
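For concreteness, here is a rough sketch (my own illustration, not anyone's actual implementation) of the IdentityHashMap-style single-indirect layout quoted above: keys and values interleaved in one Object[] with linear probing, so a collision-free hit costs a single array access after hashing. Resizing and removal are omitted, and the table is assumed never to fill.

```java
// Hedged sketch of a single-indirect open-addressed map:
// table[2i] holds a key, table[2i+1] its value.
public class SingleIndirectMap {
    private final Object[] table;
    private final int mask;

    public SingleIndirectMap(int capacityPow2) {
        table = new Object[capacityPow2 * 2]; // key/value pairs interleaved
        mask = capacityPow2 - 1;
    }

    private int slot(Object key) { return (key.hashCode() & mask) * 2; }

    public void put(Object key, Object value) {
        int i = slot(key);
        while (table[i] != null && !table[i].equals(key))
            i = (i + 2) % table.length;       // linear probe in steps of 2
        table[i] = key;
        table[i + 1] = value;
    }

    public Object get(Object key) {
        for (int i = slot(key); table[i] != null; i = (i + 2) % table.length)
            if (table[i].equals(key)) return table[i + 1];
        return null;
    }
}
```

A hybrid along the lines discussed above would fall back to a packed-index or trie structure only once a probe chain grows long.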
> or you mean iterateFirst/iterateNext methods?
> Their only purpose is to simplify LinkedMap and reuse code,
> but maybe having different iterators will give some speedup,
> but it's just one virtual call - you think it's really that bad?
> But hence it could not be inlined by HotSpot...
Right. Overridden virtual calls inside iterators usually hurt
enough to avoid, but it is worth empirically checking.
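A minimal sketch of the two designs under discussion (hypothetical classes, not the actual CompactHashMap code): a shared iterator that dispatches through an overridable nextIndex() hook on every element. With a single loaded subclass HotSpot can usually inline the hook; once a second subclass (say, a linked variant walking insertion-order links) overrides it, the call site becomes polymorphic and typically stops inlining, which is where the per-element cost comes from.

```java
import java.util.NoSuchElementException;
import java.util.PrimitiveIterator;

// Shared-iterator design: one iterator for all subclasses, paying a
// virtual nextIndex() dispatch per element.
abstract class CompactBase {
    final int[] values;
    CompactBase(int[] values) { this.values = values; }

    // Overridden hook: next occupied index after i, or -1 when done.
    abstract int nextIndex(int i);

    final PrimitiveIterator.OfInt sharedIterator() {
        return new PrimitiveIterator.OfInt() {
            int i = nextIndex(-1);
            public boolean hasNext() { return i >= 0; }
            public int nextInt() {
                if (i < 0) throw new NoSuchElementException();
                int v = values[i];
                i = nextIndex(i);   // virtual call per element
                return v;
            }
        };
    }
}

// A dense subclass; a LinkedMap-style subclass overriding nextIndex()
// differently is what would make this call site polymorphic.
class DenseMap extends CompactBase {
    DenseMap(int[] values) { super(values); }
    int nextIndex(int i) { return i + 1 < values.length ? i + 1 : -1; }
}
```

The alternative is a private iterator per concrete class whose loop body stays monomorphic; it duplicates code, which is the tradeoff to measure.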
> I was thinking if it would be possible to make concurrent
> implementation based on AtomicIntegerArray with CAS/retries,
> but there are some problems I cannot solve.
Cliff Click's hash table (http://sourceforge.net/projects/high-scale-lib/)
does something along these lines. The tradeoffs involved here
only sometimes seem to win out over the current ConcurrentHashMap.
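For flavor, a minimal sketch along these lines (my own, far simpler than Cliff Click's design): a fixed-size int-to-int map over two AtomicIntegerArrays, with key slots claimed by compareAndSet. Zero marks an empty slot, so keys must be nonzero; there is no resize or removal, and the table is assumed never to fill. It also shows one of the problems alluded to above: a reader can observe a claimed key before its value is published.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hedged sketch of a lock-free open-addressed int->int map using CAS.
public class CasIntMap {
    private final AtomicIntegerArray keys;
    private final AtomicIntegerArray vals;
    private final int mask;

    public CasIntMap(int capacityPow2) {
        keys = new AtomicIntegerArray(capacityPow2);
        vals = new AtomicIntegerArray(capacityPow2);
        mask = capacityPow2 - 1;
    }

    public void put(int key, int value) {
        if (key == 0) throw new IllegalArgumentException("0 marks empty slots");
        for (int i = key & mask; ; i = (i + 1) & mask) {
            int k = keys.get(i);
            // Claim an empty slot with CAS, or reuse the slot we already own.
            if (k == key || (k == 0 && keys.compareAndSet(i, 0, key))) {
                vals.set(i, value); // published after the key: readers may
                return;             // briefly see key claimed, value still 0
            }
        }
    }

    public int get(int key) {
        for (int i = key & mask; ; i = (i + 1) & mask) {
            int k = keys.get(i);
            if (k == key) return vals.get(i);
            if (k == 0) return 0;   // absent
        }
    }
}
```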