Faster HashMap implementation

Thu Jun 4 10:50:47 UTC 2009

See also:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6812862

-Ulf


Am 04.06.2009 01:21, Alex Yakovlev schrieb:
> Hello,
>
> While playing with scala programming language I've come to a HashMap 
> implementation
> that appears to be faster than current java.util.HashMap, also with a 
> lower memory footprint.
>
> Alex Miller advised me to post about it in this maillist.
>
> I've put current implementation (both scala and java) and some 
> benchmark results at github:
> http://github.com/alex14n/CompactHashMap
>
> Current java version implements most of the external HashMap behaviour
> inluding serialization, but does not throw ConcurrentModificationException
> and does not implement internal methods that can be requeired for some 
> subclasses.
>
> Main trick is in storing all data in 2 arrays without any HashEntry 
> objects.
> One is array of Objects with keys and values, another in a complex 
> array of indices.
>
> Index array has 2 parts. First part contains hashcode-to-index mappings,
> second part is index-to-index mapping for elements with the same hashcode.
> The rest bits are used for:
> 1. 'Deleted' flag to mark empty spots after key removal
> (other bits part of such values is used as deleted-linked-list indices)
> 2. 'End-of-list' flag allowing not to access next index if this 
> element is last.
> 3. Some bits of hashcode used to compare with hashcode of the key we 
> are looking for.
>
> Finding an existing value for a key has approximately the same speed 
> as the currrent
> HashMap implementation. It is:
> 1. Computing element hashcode,
> 2. Accessing array to get index/address for this hashcode
> 3a. Current HashMap access HashEntry object with 
> key/hashcode/value/next properties
> 4a. In my implementation hashcode and next are packed in index array bits,
> key and value are stored in adjacent cells of the second array.
>
> When looking for missing key my implementation can be twice faster in 
> some situations,
> since it's only one random memory access (element index array). If 
> it's empty,
> or its hashcode bits are not equal to hashcode bits of the key being 
> searched,
> and it's `end-of-list` bit is set - it is enough to stop searching. 
> Current HashMap
> needs 2 random memory reads: (1) array cell with address and (2) 
> HashEntry at that address.
>
> Insertion of a new key does not require any memory allocation (new 
> HashEntry object)
> since all arrays are pre-allocated except for resize operation. Resize 
> consists of
> Arrays.copyOf to a twice large array and almost sequental rehashing - 
> reading one array
> and writing it to either the same index or with `1` in highest bit.
>
> When adding several key/values they are sequentally written to 
> key/values array,
> and sequental memory read is faster and more cache-friendly.
>
> Having 2 arrays instead of a lot of HashEntry objects is more garbage 
> collector
> friendly and has significantly lower memory footprint.
>
> Iteration over elements is a sequental array read which is faster.
> To clone such HashMap we only need to clone 2 arrays with is also much 
> faster.
>
> Elements order is preserved even after resize, only when some key is 
> removed
> it leaves an empty 'hole' that will be filled after next key 
> insertion. Thus iteration
> order is more predictable and I think it's possible not to throw 
> ConcurrentModificationException
> in most situations. Iterator needs only 1 parameter - array index - 
> and even resize
> does not break iteration order.
>
> But there can be problems. For example if we iterated over some key
> that was deleted after that and then inserted again into higher array 
> index,
> it can eventually came up second time into iterator. But if we save 
> highest
> non-empty index that was at iteration start, and monitor insert/delete
> operations (possible holes) in indices we've not iterated over yet,
> we will not need to throw ConcurrentModificationException in most cases.
>
> I've done some benchmarks that proves my assumptions:
> * Reading existing key-value mapping has the same speed,
> * Adding new mapping and looking for missing key is significantly faster.
>
> But there can be some problems in my testing strategy
> or benchmark can be somehow biased, so it's better to retest it
> with some independent benchmarks.
>
> Hope it will help to make java collections faster since such approach
> (preallocated arrays instead of ...Entry objects) can be used to speedup
> other collections. In my scala version both HashMap and HashSet
> are implemented this way. It also stores primitive types like int
> in primitive arrays which saves a lot of memory. But on one hand
> scala has built-in boxed arrays, and on the other hand such array boxing
> is slow. It also has to store keys and values in different arrays,
> which gives extra random memory read and thus slower -
> when keys and values are in adjacent cells of the same array
> value will usually be pre-fetched when key is read.
>
> Best,
> Alex Yakovlev