HashMap collision speed (regression 7->8)
Doug Lea
dl at cs.oswego.edu
Tue Jan 13 11:36:37 UTC 2015
On 01/12/2015 05:12 PM, Peter Levart wrote:
> Hi,
>
> I added results obtained with JDK 8 (FCS and u20) - same machine, same VM
> options, just different JDKs:
Thanks very much. It remains a mystery why the original report
by Bernd showed a 40% difference from jdk7. The approx 10% hit
seen in these more careful measurements is about what we
expected as the intended tradeoff for avoiding DOS attacks
and better performance in more common cases. There are
still some tweaky improvements we should keep exploring,
but I don't think there's any need for fundamental changes.
-Doug
>
> Original JDK 7 HashMap (and JVM):
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 2839.458 157.299 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2673.924 187.063 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 686.972 32.928 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 631.001 6.574 ms
>
> Original JDK 8 HashMap (JDK 8 FCS JVM):
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 3186.305 74.890 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2479.155 136.924 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 673.819 13.236 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 673.636 8.676 ms
>
> Original JDK 8 HashMap (JDK 8u20 JVM):
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 3107.455 72.524 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2986.006 9.796 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 631.295 7.281 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 641.041 17.139 ms
>
> Original JDK 9 HashMap:
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 3011.738 78.249 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2984.280 48.315 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 682.060 52.341 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 685.705 55.183 ms
>
> Original JDK 9 HashMap with TREEIFY_THRESHOLD = 1 << 20:
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 2780.771 236.647 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2541.740 233.429 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 757.364 67.869 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 671.617 54.943 ms
>
> Caching of comparableClassFor (in ClassRepository - good for heterogeneous keys
> too):
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 3014.888 71.778 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2279.757 54.159 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 760.743 70.674 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 725.188 67.853 ms
>
> Caching of comparableClassFor (internally - good for homogeneous keys only):
>
> Benchmark (initialSize) Mode Samples
> Score Score error Units
> j.t.HashMapCollision.badDistNoComp 16 ss 60
> 3026.707 84.571 ms
> j.t.HashMapCollision.badDistWithComp 16 ss 60
> 2137.296 66.140 ms
> j.t.HashMapCollision.goodDistNoComp 16 ss 60
> 635.964 8.213 ms
> j.t.HashMapCollision.goodDistWithComp 16 ss 60
> 685.129 46.783 ms
>
>
> Peter
>
> On 01/12/2015 12:26 AM, Peter Levart wrote:
>>
>> On 01/11/2015 10:00 PM, Doug Lea wrote:
>>> On 01/11/2015 02:26 PM, Peter Levart wrote:
>>>
>>>> Although majority of entries constitute the bins of size 13 or 14, there's only
>>>> a single hashCode value per bin.
>>>>
>>>> So in this benchmark, treeifying with non-comparable keys is a waste of effort.
>>>
>>> On the other hand, the waste seems to only cost about 10% in your runs.
>>> I wonder why the original report using jdk7 vs jdk8 seemed larger.
>>
>> I don't know. I ran the same benchmark with same VM options on JDK 7 too. Here
>> are all results together:
>>
>> Original JDK 7 HashMap (and JVM):
>>
>> Benchmark (initialSize) Mode Samples
>> Score Score error Units
>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>> 2839.458 157.299 ms
>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>> 2673.924 187.063 ms
>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>> 686.972 32.928 ms
>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>> 631.001 6.574 ms
>>
>> Original JDK 9 HashMap:
>>
>> Benchmark (initialSize) Mode Samples
>> Score Score error Units
>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>> 3011.738 78.249 ms
>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>> 2984.280 48.315 ms
>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>> 682.060 52.341 ms
>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>> 685.705 55.183 ms
>>
>> Original JDK 9 HashMap with TREEIFY_THRESHOLD = 1 << 20:
>>
>> Benchmark (initialSize) Mode Samples
>> Score Score error Units
>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>> 2780.771 236.647 ms
>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>> 2541.740 233.429 ms
>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>> 757.364 67.869 ms
>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>> 671.617 54.943 ms
>>
>> Caching of comparableClassFor (in ClassRepository - good for heterogeneous
>> keys too):
>>
>> Benchmark (initialSize) Mode Samples
>> Score Score error Units
>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>> 3014.888 71.778 ms
>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>> 2279.757 54.159 ms
>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>> 760.743 70.674 ms
>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>> 725.188 67.853 ms
>>
>> Caching of comparableClassFor (internally - good for homogeneous keys only):
>>
>> Benchmark (initialSize) Mode Samples
>> Score Score error Units
>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>> 3026.707 84.571 ms
>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>> 2137.296 66.140 ms
>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>> 635.964 8.213 ms
>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>> 685.129 46.783 ms
>>
>>>
>>>>
>>>> Are there (non-forged) sets of non-comparable keys with hashCodes where
>>>> treeifying makes sense?
>>>
>>> Try using a class like:
>>> class FHC { float f; int hashCode() { return Float.floatToIntBits(f); } }
>>> and populate with instances with integral values for f.
>>>
>>> Similarly for doubles.
>>>
>>> Pre-jdk8, we devised a bit-smearing function that (among other
>>> constraints) did OK for float/double keys with integral values,
>>> that are not all that rare. With treeification, we don't need to
>>> penalize classes with decent hashCodes by bit-smearing to still
>>> get OK performance for these kinds of cases where the tree-based
>>> hashCode comparison helps more than Comparability per se.
>>
>> I see. These keys actually have unique or near unique hashCodes but which are
>> not good for power of two length tables without bit-smearing. With tree bins
>> we don't need heavy bit-smearing to get decent performance in speed, but the
>> table gets quite sparse anyway (although this is the smaller of the space
>> overheads - tree nodes are bigger). For example, for 1M integral Floats, we
>> get the following:
>>
>> >>> Float ...
>> Capacity: 2097152
>> Load factor: 0.75
>> Size: 1000000
>> Bin sizes: 0*1966080 1*0 2*0 3*24288 4*41248 5*0 6*0 7*0 8*0
>> 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000
>> Empty bins: 93.8 %
>> Unique hash codes per bin: 0*1966080 1*0 2*0 3*24288 4*41248 5*0 6*0 7*0 8*0
>> 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000
>>
>>
>>>
>>> Also...
>>>
>>> It looks like the simplest path to a minor improvement is
>>> just to cache internally (your fourth test below). But I now
>>> recall not doing this because it adds to footprint and
>>> the field could prevent class unloading, for only a small
>>> benefit.
>>
>> Footprint, yes (one reference field in HM instance), while class unloading is
>> taken care of using WeakReference:
>>
>> http://cr.openjdk.java.net/~plevart/jdk9-dev/HM.comparableClassFor/HomogeneousKeysCache/webrev.01/
>>
>>>
>>> (Every time HashMap has changed, there have been reports of
>>> performance regressions even though typical performance
>>> generally improves.)
>>>
>>> -Doug
>>
>> Regards, Peter
>>
>>>
>>>
>>>>> Original JDK9 HashMap:
>>>>>
>>>>> Benchmark (initialSize) Mode Samples
>>>>> Score Score error Units
>>>>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>>>>> 3011.738 78.249 ms
>>>>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>>>>> 2984.280 48.315 ms
>>>>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>>>>> 682.060 52.341 ms
>>>>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>>>>> 685.705 55.183 ms
>>>>>
>>>>> Original JDK9 HashMap with TREEIFY_THRESHOLD = 1 << 20:
>>>>>
>>>>> Benchmark (initialSize) Mode Samples
>>>>> Score Score error Units
>>>>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>>>>> 2780.771 236.647 ms
>>>>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>>>>> 2541.740 233.429 ms
>>>>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>>>>> 757.364 67.869 ms
>>>>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>>>>> 671.617 54.943 ms
>>>>>
>>>>> Caching of comparableClassFor (in ClassRepository - good for heterogeneous
>>>>> keys too):
>>>>>
>>>>> Benchmark (initialSize) Mode Samples
>>>>> Score Score error Units
>>>>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>>>>> 3014.888 71.778 ms
>>>>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>>>>> 2279.757 54.159 ms
>>>>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>>>>> 760.743 70.674 ms
>>>>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>>>>> 725.188 67.853 ms
>>>>>
>>>>> Caching of comparableClassFor (internally - good for homogeneous keys only):
>>>>>
>>>>> Benchmark (initialSize) Mode Samples
>>>>> Score Score error Units
>>>>> j.t.HashMapCollision.badDistNoComp 16 ss 60
>>>>> 3026.707 84.571 ms
>>>>> j.t.HashMapCollision.badDistWithComp 16 ss 60
>>>>> 2137.296 66.140 ms
>>>>> j.t.HashMapCollision.goodDistNoComp 16 ss 60
>>>>> 635.964 8.213 ms
>>>>> j.t.HashMapCollision.goodDistWithComp 16 ss 60
>>>>> 685.129 46.783 ms
>>>>>
>>>>>
>>>
>>
>
More information about the core-libs-dev
mailing list