HashMap collision speed (regression 7->8)

Bernd Eckenfels ecki at zusammenkunft.net
Tue Jan 13 17:11:00 UTC 2015


Hello,

Windows 7 x64 on Core i7 Q720 (4C/8T/6MB L3, 1,6GHz) 8GB RAM

my numbers in the gist are from a notebook, but I plan to repeat my and
your benachmark versions on a different machine tomorow (which has
less memory pressure), I will report.

I was testing single threaded, will see if this makes a difference.

Gruss
Bernd


Am Tue, 13 Jan 2015 16:51:01 +0100 schrieb Peter Levart
<peter.levart at gmail.com>:

> On 01/13/2015 12:36 PM, Doug Lea wrote:
> > On 01/12/2015 05:12 PM, Peter Levart wrote:
> >> Hi,
> >>
> >> I added results obtained with JDK 8 (FCS and u20) - same machine, 
> >> same VM
> >> options, just different JDKs:
> >
> > Thanks very much. It remains a mystery why the original report
> > by Bernd showed a 40% difference from jdk7. The approx 10% hit
> > seen in these more careful measurements is about what we
> > expected as the intended tradeoff for avoiding DOS attacks
> > and better performance in more common cases. There are
> > still some tweaky improvements we should keep exploring,
> > but I don't think there's any need for fundamental changes.
> >
> > -Doug
> 
> Perhaps it was because of different HW. I ran the tests on Intel®
> Core™ i7-2600K CPU with 8 MB of Cache. I wonder what Bernd used. The
> number of different keys is 250.000, they are accessed randomly in a
> loop. Such HashMap consists of majority of TreeNode(s). TreeNode(s)
> have 9 reference fields + int and boolean. I don't claim that such
> HashMap fits intirely into Cache, but half of it perhaps does. Maybe
> it is just that cache-hit ratio was larger on my HW than on Bernd's
> and that Bernd's cache was thrashing more than mine. With JDK 7
> (normal Nodes have just 3 reference fields + int) such HashMap has
> roughly half the memory footprint.
> 
> Bernd, what did you run your benchmark on?
> 
> 
> Peter
> 
> >
> >>
> >> Original JDK 7 HashMap (and JVM):
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 2839.458      157.299       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2673.924      187.063       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 686.972       32.928       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 631.001        6.574       ms
> >>
> >> Original JDK 8 HashMap (JDK 8 FCS JVM):
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 3186.305       74.890       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2479.155      136.924       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 673.819       13.236       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 673.636        8.676       ms
> >>
> >> Original JDK 8 HashMap (JDK 8u20 JVM):
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 3107.455       72.524       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2986.006        9.796       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 631.295        7.281       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 641.041       17.139       ms
> >>
> >> Original JDK 9 HashMap:
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 3011.738       78.249       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2984.280       48.315       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 682.060       52.341       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 685.705       55.183       ms
> >>
> >> Original JDK 9 HashMap with TREEIFY_THRESHOLD = 1 << 20:
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 2780.771      236.647       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2541.740      233.429       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 757.364       67.869       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 671.617       54.943       ms
> >>
> >> Caching of comparableClassFor (in ClassRepository - good for 
> >> heterogeneous keys
> >> too):
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 3014.888       71.778       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2279.757       54.159       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 760.743       70.674       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 725.188       67.853       ms
> >>
> >> Caching of comparableClassFor (internally - good for homogeneous
> >> keys only):
> >>
> >> Benchmark                               (initialSize)   Mode
> >> Samples Score  Score error    Units
> >> j.t.HashMapCollision.badDistNoComp                 16     ss 60
> >> 3026.707       84.571       ms
> >> j.t.HashMapCollision.badDistWithComp               16     ss 60
> >> 2137.296       66.140       ms
> >> j.t.HashMapCollision.goodDistNoComp                16     ss 60
> >> 635.964        8.213       ms
> >> j.t.HashMapCollision.goodDistWithComp              16     ss 60
> >> 685.129       46.783       ms
> >>
> >>
> >> Peter
> >>
> >> On 01/12/2015 12:26 AM, Peter Levart wrote:
> >>>
> >>> On 01/11/2015 10:00 PM, Doug Lea wrote:
> >>>> On 01/11/2015 02:26 PM, Peter Levart wrote:
> >>>>
> >>>>> Although majority of entries constitute the bins of size 13 or
> >>>>> 14, there's only
> >>>>> a single hashCode value per bin.
> >>>>>
> >>>>> So in this benchmark, treeifying with non-comparable keys is a 
> >>>>> waste of effort.
> >>>>
> >>>> On the other hand, the waste seems to only cost about 10% in
> >>>> your runs.
> >>>> I wonder why the original report using jdk7 vs jdk8 seemed
> >>>> larger.
> >>>
> >>> I don't know. I ran the same benchmark with same VM options on
> >>> JDK 7 too. Here
> >>> are all results together:
> >>>
> >>> Original JDK 7 HashMap (and JVM):
> >>>
> >>> Benchmark                               (initialSize)   Mode
> >>> Samples Score  Score error    Units
> >>> j.t.HashMapCollision.badDistNoComp                 16 ss        60
> >>> 2839.458      157.299       ms
> >>> j.t.HashMapCollision.badDistWithComp               16 ss        60
> >>> 2673.924      187.063       ms
> >>> j.t.HashMapCollision.goodDistNoComp                16 ss        60
> >>> 686.972       32.928       ms
> >>> j.t.HashMapCollision.goodDistWithComp              16 ss        60
> >>> 631.001        6.574       ms
> >>>
> >>> Original JDK 9 HashMap:
> >>>
> >>> Benchmark                               (initialSize)   Mode
> >>> Samples Score  Score error    Units
> >>> j.t.HashMapCollision.badDistNoComp                 16 ss        60
> >>> 3011.738       78.249       ms
> >>> j.t.HashMapCollision.badDistWithComp               16 ss        60
> >>> 2984.280       48.315       ms
> >>> j.t.HashMapCollision.goodDistNoComp                16 ss        60
> >>> 682.060       52.341       ms
> >>> j.t.HashMapCollision.goodDistWithComp              16 ss        60
> >>> 685.705       55.183       ms
> >>>
> >>> Original JDK 9 HashMap with TREEIFY_THRESHOLD = 1 << 20:
> >>>
> >>> Benchmark                               (initialSize)   Mode
> >>> Samples Score  Score error    Units
> >>> j.t.HashMapCollision.badDistNoComp                 16 ss        60
> >>> 2780.771      236.647       ms
> >>> j.t.HashMapCollision.badDistWithComp               16 ss        60
> >>> 2541.740      233.429       ms
> >>> j.t.HashMapCollision.goodDistNoComp                16 ss        60
> >>> 757.364       67.869       ms
> >>> j.t.HashMapCollision.goodDistWithComp              16 ss        60
> >>> 671.617       54.943       ms
> >>>
> >>> Caching of comparableClassFor (in ClassRepository - good for 
> >>> heterogeneous
> >>> keys too):
> >>>
> >>> Benchmark                               (initialSize)   Mode
> >>> Samples Score  Score error    Units
> >>> j.t.HashMapCollision.badDistNoComp                 16 ss        60
> >>> 3014.888       71.778       ms
> >>> j.t.HashMapCollision.badDistWithComp               16 ss        60
> >>> 2279.757       54.159       ms
> >>> j.t.HashMapCollision.goodDistNoComp                16 ss        60
> >>> 760.743       70.674       ms
> >>> j.t.HashMapCollision.goodDistWithComp              16 ss        60
> >>> 725.188       67.853       ms
> >>>
> >>> Caching of comparableClassFor (internally - good for homogeneous 
> >>> keys only):
> >>>
> >>> Benchmark                               (initialSize)   Mode
> >>> Samples Score  Score error    Units
> >>> j.t.HashMapCollision.badDistNoComp                 16 ss        60
> >>> 3026.707       84.571       ms
> >>> j.t.HashMapCollision.badDistWithComp               16 ss        60
> >>> 2137.296       66.140       ms
> >>> j.t.HashMapCollision.goodDistNoComp                16 ss        60
> >>> 635.964        8.213       ms
> >>> j.t.HashMapCollision.goodDistWithComp              16 ss        60
> >>> 685.129       46.783       ms
> >>>
> >>>>
> >>>>>
> >>>>> Are there (non-forged) sets of non-comparable keys with
> >>>>> hashCodes where
> >>>>> treeifying makes sense?
> >>>>
> >>>> Try using a class like:
> >>>>   class FHC { float f; int hashCode() { return 
> >>>> Float.floatToIntBits(f); } }
> >>>> and populate with instances with integral values for f.
> >>>>
> >>>> Similarly for doubles.
> >>>>
> >>>> Pre-jdk8, we devised a bit-smearing function that (among other
> >>>> constraints) did OK for float/double keys with integral values,
> >>>> that are not all that rare.  With treeification, we don't need to
> >>>> penalize classes with decent hashCodes by bit-smearing to still
> >>>> get OK performance for these kinds of cases where the tree-based
> >>>> hashCode comparison helps more than Comparability per se.
> >>>
> >>> I see. These keys actually have unique or near unique hashCodes
> >>> but which are
> >>> not good for power of two length tables without bit-smearing.
> >>> With tree bins
> >>> we don't need heavy bit-smearing to get decent performance in
> >>> speed, but the
> >>> table gets quite sparse anyway (although this is the smaller of
> >>> the space
> >>> overheads - tree nodes are bigger). For example, for 1M integral 
> >>> Floats, we
> >>> get the following:
> >>>
> >>> >>> Float ...
> >>>                  Capacity: 2097152
> >>>               Load factor: 0.75
> >>>                      Size: 1000000
> >>>                 Bin sizes: 0*1966080 1*0 2*0 3*24288 4*41248 5*0
> >>> 6*0 7*0 8*0
> >>> 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000
> >>>                Empty bins: 93.8 %
> >>> Unique hash codes per bin: 0*1966080 1*0 2*0 3*24288 4*41248 5*0
> >>> 6*0 7*0 8*0
> >>> 9*0 10*4456 11*22963 12*30554 13*7539 14*24 total=1000000
> >>>
> >>>
> >>>>
> >>>> Also...
> >>>>
> >>>> It looks like the simplest path to a minor improvement is
> >>>> just to cache internally (your fourth test below). But I now
> >>>> recall not doing this because it adds to footprint and
> >>>> the field could prevent class unloading, for only a small
> >>>> benefit.
> >>>
> >>> Footprint, yes (one reference field in HM instance), while class 
> >>> unloading is
> >>> taken care of using WeakReference:
> >>>
> >>> http://cr.openjdk.java.net/~plevart/jdk9-dev/HM.comparableClassFor/HomogeneousKeysCache/webrev.01/ 
> >>>
> >>>
> >>>>
> >>>> (Every time HashMap has changed, there have been reports of
> >>>> performance regressions even though typical performance
> >>>> generally improves.)
> >>>>
> >>>> -Doug
> >>>
> >>> Regards, Peter
> >>>
> >>>>
> >>>>
> >>>>>> Original JDK9 HashMap:
> >>>>>>
> >>>>>> Benchmark                               (initialSize) Mode
> >>>>>> Samples Score  Score error    Units
> >>>>>> j.t.HashMapCollision.badDistNoComp                 16
> >>>>>> ss        60 3011.738       78.249       ms
> >>>>>> j.t.HashMapCollision.badDistWithComp               16
> >>>>>> ss        60 2984.280       48.315       ms
> >>>>>> j.t.HashMapCollision.goodDistNoComp                16
> >>>>>> ss        60 682.060       52.341       ms
> >>>>>> j.t.HashMapCollision.goodDistWithComp              16
> >>>>>> ss        60 685.705       55.183       ms
> >>>>>>
> >>>>>> Original JDK9 HashMap with TREEIFY_THRESHOLD = 1 << 20:
> >>>>>>
> >>>>>> Benchmark                               (initialSize) Mode
> >>>>>> Samples Score  Score error    Units
> >>>>>> j.t.HashMapCollision.badDistNoComp                 16
> >>>>>> ss        60 2780.771      236.647       ms
> >>>>>> j.t.HashMapCollision.badDistWithComp               16
> >>>>>> ss        60 2541.740      233.429       ms
> >>>>>> j.t.HashMapCollision.goodDistNoComp                16
> >>>>>> ss        60 757.364       67.869       ms
> >>>>>> j.t.HashMapCollision.goodDistWithComp              16
> >>>>>> ss        60 671.617       54.943       ms
> >>>>>>
> >>>>>> Caching of comparableClassFor (in ClassRepository - good for 
> >>>>>> heterogeneous
> >>>>>> keys too):
> >>>>>>
> >>>>>> Benchmark                               (initialSize) Mode
> >>>>>> Samples Score  Score error    Units
> >>>>>> j.t.HashMapCollision.badDistNoComp                 16
> >>>>>> ss        60 3014.888       71.778       ms
> >>>>>> j.t.HashMapCollision.badDistWithComp               16
> >>>>>> ss        60 2279.757       54.159       ms
> >>>>>> j.t.HashMapCollision.goodDistNoComp                16
> >>>>>> ss        60 760.743       70.674       ms
> >>>>>> j.t.HashMapCollision.goodDistWithComp              16
> >>>>>> ss        60 725.188       67.853       ms
> >>>>>>
> >>>>>> Caching of comparableClassFor (internally - good for
> >>>>>> homogeneous keys only):
> >>>>>>
> >>>>>> Benchmark                               (initialSize) Mode
> >>>>>> Samples Score  Score error    Units
> >>>>>> j.t.HashMapCollision.badDistNoComp                 16
> >>>>>> ss        60 3026.707       84.571       ms
> >>>>>> j.t.HashMapCollision.badDistWithComp               16
> >>>>>> ss        60 2137.296       66.140       ms
> >>>>>> j.t.HashMapCollision.goodDistNoComp                16
> >>>>>> ss        60 635.964        8.213       ms
> >>>>>> j.t.HashMapCollision.goodDistWithComp              16
> >>>>>> ss        60 685.129       46.783       ms
> >>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> >
> 




More information about the core-libs-dev mailing list