RFR 8005698 : Handle Frequent HashMap Collisions with Balanced Trees
Brent Christian
brent.christian at oracle.com
Thu May 23 19:02:07 UTC 2013
Thank you for having a look, Doug.
Comments inline...
On 5/23/13 5:20 AM, Doug Lea wrote:
> * Given that balanced trees are not used in WeakHashMap
> or Hashtable, I wonder why the TreeNode etc classes are
> not just nested inside HashMap (usable from subclass LinkedHashMap).
> And if so, it would be simpler, faster, and less code-intensive
> to declare TreeNode as a subclass of the internal Entry class.
> Plus some further gymnastics to overcome the unfortunate
> decision to make LinkedHashMap a subclass of HashMap.
> Had you considered this?
I definitely did. I wanted a cleaner class hierarchy for
Entry/TreeNode, but LinkedHashMap makes it difficult, since its Entries
need extra before/after references.
I couldn't work out a TreeNode class that could be used by both HashMap
and LinkedHashMap, since LHM would need a TreeNode derived from
LinkedHashMap.Entry, not HashMap.Entry.
So instead I used composition, where TreeNode has-a [Linked]HashMap.Entry.
> * The comparableClassFor code turns out to be a bottleneck
> (and is bypassed in the particular case of String)
> in tests of the analogous code in ConcurrentHashMap, mainly
> because the only available reflection methods allocate little
> objects while checking for the form "class C implements Comparable<C>"
> At some point, someone might want to implement a JDK-internal
> streamlined form.
OK
> * Notice that if trees were ever to be used in WeakHashMap,
> the nodes would surely need to be declared as WeakReference
> subclasses, as are the current Entry nodes.
Full-disclosure: I did have a working version of WeakHashMap with
balanced trees. Here again, composition helped: TreeNode has-a
WeakReference instead of being-a WeakReference. But guaranteeing that
TreeBin was protected from encountering null/cleared keys took some
contortions. In the end, the extra complexity wasn't worth it, so it's
not included.
> The whole thought
> of doing this though is suspect (so I'm happy to see it
> not being done)
Glad to hear that taking it out was the right move. :)
Thanks,
-Brent
> On 05/22/13 16:50, Brent Christian wrote:
>> I have updated the changes as follows:
>>
>> * TreeBin.createEntryForNode() + the anonymous TreeBin subclass in
>> [Linked]HashMap have been replaced by a TreeBin.map reference back to the
>> containing map, enabling TreeBin to just call newEntry() directly.
>>
>> * TreeNode.entry was made final
>>
>> * Added a top-level comment in HashMap giving a brief overview of how
>> balanced
>> trees are used
>>
>> * Updated the TreeBinSplitBackToEntries test for TREE_THRESHOLD=16
>>
>> * Test code was updated to not need to be on the bootclasspath.
>> LaunchOnBootClass.java was removed from the changes (and would not
>> have been
>> needed anyway, due to the recent jtreg update)
>>
>>
>> The new webrev is here:
>> http://cr.openjdk.java.net/~bchristi/8005698/webrev.01/
>>
>> Thanks,
>> -Brent
>>
>> On 5/13/13 9:48 AM, Brent Christian wrote:
>>> Hi,
>>>
>>> Please review my changes for 8005698 : Handle Frequent HashMap
>>> Collisions with Balanced Trees.
>>>
>>> For HashMap and LinkedHashMap, we would like to duplicate the technique
>>> (and some of the code) that the latest ConcurrentHashMap[1] uses for
>>> handling hash collisions: use of balanced trees to store map entries in
>>> hash bins containing many collisions.
>>>
>>> Webrev is here:
>>> http://cr.openjdk.java.net/~bchristi/8005698/webrev.00/
>>>
>>> Some high-level details copied from the JEP[2]:
>>>
>>> Earlier work in this area in JDK 8, namely the alternative
>>> string-hashing implementation[3], improved collision performance for
>>> string-valued keys only, and it did so at the cost of adding a new
>>> (private) field to every String instance.
>>>
>>> The changes proposed here will improve collision performance for any key
>>> type that implements Comparable. The alternative string-hashing
>>> mechanism, including the private hash32 field added to the String class,
>>> can then be removed.
>>>
>>> The principal idea is that once the number of items in a hash bucket
>>> grows beyond a certain threshold, that bucket will switch from using a
>>> linked list of entries to a balanced tree.
>>>
>>> We will not implement this for the Hashtable class - some legacy code
>>> that uses it is known to depend upon iteration order. Instead, Hashtable
>>> will be reverted to its state prior to the introduction of the
>>> alternative string-hashing implementation, and will maintain its
>>> historical iteration order.
>>>
>>> We also will not implement this technique in WeakHashMap. An attempt was
>>> made, but the complexity of having to account for weak keys resulted in
>>> an unacceptable drop in microbenchmark performance. WeakHashMap will
>>> also be reverted to its prior state.
>>>
>>> Thanks!
>>> -Brent
>>>
>>> 1.
>>> http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/ConcurrentHashMap.java?view=log
>>>
>>>
>>>
>>> 2. http://openjdk.java.net/jeps/180
>>> 3. http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/43bd5ee0205e
>>
>
More information about the core-libs-dev
mailing list