HashMap collision speed (regression 7->8)

Tue Jan 13 00:42:26 UTC 2015

Overall, I'm very happy with the way that Doug's heroic treebinization
project worked out - we have expected O(1) and worst case O(log N) when
trees are possible.

I didn't consider the problem of not hurting iterator performance.  I might
consider having the iterator read-ahead all the elements in the bucket into
an array and discard the pointers in the tree nodes that are there only to
help the iterator.

My intuition is to build a tree ordered only by hashcode, and then within
each of those nodes, have a sequence of either homogenous Comparable trees
or non-Comparable linked list, and each of those terminal structures, being
homogeneous and concrete, will be easy for hotspot to deal with.

But all of these ideas are expensive to implement and test, and make this
complicated code even more so - probably we should declare victory instead.

On Mon, Jan 12, 2015 at 4:15 AM, Doug Lea <dl at cs.oswego.edu> wrote:

> On 01/12/2015 03:26 AM, Peter Levart wrote:
>
>> Did bit smearing function used in JDK 7 have negative impact on any
>> hashCodes or
>> was it replaced with simpler one just because it is not needed when tree
>> bins
>> are used?
>>
>
> I don't have detailed records, but in pre-release jdk8 tests, smearing was
> detectably but not hugely slower for the cases of String, identity-hash,
> and Integer keys. It is worth rechecking with jdk9. We believe these
> cases together account for the majority of HashMap instances --  based on
> some old instrumentation of a reference load, around 90% of the them.
> (It would be great if someone re-ran this on a more recent corpus.)
> The design goal of HashMap is to work best in common cases,
> and to work well in all others except pathological cases of
> non-comparable elements with many duplicate hashCodes.
>
> One issue that might be causing treeified bins to be worse than they
> would otherwise be is that the recursion in TreeBin.find makes
> it harder for compilers (JITs) to perform class analysis to determine
> concrete types, leading to more virtual dispatching and slower code.
> Someone from a compiler group once asked whether this code could
> be changed to help JITs. Allocating emulated stack-frames
> during lookups seems not to work well, but it is possible that
> a scheme that temporarily reused internal node pointers might help.
> I haven't looked into this.
>
> -Doug
>
>