HashMap collision speed (regression 7->8)

Peter Levart peter.levart at gmail.com
Sat Jan 10 13:01:18 UTC 2015


On 01/10/2015 01:20 AM, Doug Lea wrote:
> On 01/09/2015 06:29 PM, Martin Buchholz wrote:
>> Given the prevalence of sub-optimal hashcodes, my own intuition is 
>> also that
>> raising the treeification threshold from 8 will be a win.
>
> That's what I thought at first. But 8 is a better choice for String
> and other Comparable keys, which account for the majority of HashMaps
> out there. (For non-comparables, infinity is the best threshold.)
> How much slower should we make the most common cases to make the others
> faster? The only way to decide empirically is to take a large
> corpus of programs and vary thresholds. Short of that, speeding up
> comparableClassFor is still the best bet for reducing impact on
> non-comparables.

Hi Doug,

comparableClassFor() for non-comparables that don't implement Comparable 
is already as fast as it can be (the 1st check is instanceof 
Comparable). For other comparables (and non-comparables) that implement 
Comparable (except for String which is special-cased), we could improve 
the situation by caching the result.

Here's another attempt at that. This time it uses plain old JDK1 stuff, 
so it actually works even in HashMap (using IdentityHashMap so no danger 
of circular usage if it is to be applied to CHM also):

http://cr.openjdk.java.net/~plevart/jdk9-dev/Class.getGenericDerivative/webrev.01/

With this patch, the results of Bernd's JMH benchmark do give some boost 
to keys that implement Comparable (badDistWithComp case).

These are the results with original JDK9 HashMap:

Benchmark                               (initialSize)   Mode 
Samples        Score  Score error    Units
j.t.HashMapCollision.badDistNoComp                 16   avgt 6     
3104.047      278.057    ms/op
j.t.HashMapCollision.badDistWithComp               16   avgt 6     
2754.499      243.780    ms/op
j.t.HashMapCollision.goodDistNoComp                16   avgt 6     
1031.992       26.422    ms/op
j.t.HashMapCollision.goodDistWithComp              16   avgt 6     
1082.347       30.981    ms/op

And this is with patch applied:

Benchmark                               (initialSize)   Mode 
Samples        Score  Score error    Units
j.t.HashMapCollision.badDistNoComp                 16   avgt 6     
3081.419      386.125    ms/op
j.t.HashMapCollision.badDistWithComp               16   avgt 6     
2116.030      281.160    ms/op
j.t.HashMapCollision.goodDistNoComp                16   avgt 6     
1015.224       81.843    ms/op
j.t.HashMapCollision.goodDistWithComp              16   avgt 6     
1078.719       38.351    ms/op


Caching is performed as part of Class generic types information caching 
(ClassRepository), so there's no overhead for those that don't need 
generic types information. All logic is kept inside (C)HM.

Regards, Peter

>
> -Doug
>




More information about the core-libs-dev mailing list