Do Set implementations waste memory?

Ulf Zibis Ulf.Zibis at gmx.de
Thu Mar 18 13:54:17 UTC 2010


+1

-Ulf

Am 18.03.2010 14:22, schrieb Osvaldo Doederlein:
>
> The oldest collection classes were designed for the needs of J2SE 1.2, 
> a full decade ago. This was discussed before, IIRC there was some 
> reply from Josh agreeing that some speeed-vs-size tradeoffs made last 
> decade should be revisited today. The extra runtime size/bloat that a 
> specialized HashSet implementation would cost, was reasonably 
> significant in 1999 but completely irrelevant in 2010. I mean, HashSet 
> is a HUGELY important collection, and the benefit of any optimization 
> of its implementation would spread to many APIs and applications.
>
> And the problem is not only the extra value field, there is also 
> overhead from the extra indirection (plus extra polymorphic call) from 
> the HashSet object to the internal HashMap object. This overhead may 
> sometimes be sufficient to block inlining and devirtualization, so 
> it's a potentially bigger cost than just a single extra memory load 
> (which is easily hoisted out of loops etc.). Look at this code inside 
> HashSet for a much worse cost:
>
>     public Iterator<E> iterator() {
>         return map.keySet().iterator();
>     }
>
> Yeah we pay the cost of building the internal HashMap's key-set (which 
> is lazily-built), just to iterate the freaking HashSet. (Notice that 
> differently from HashMap, a Set is a true Collection that we can 
> iterate directly without any view-collection of keys/values/entries.)
>
> IMHO all this adds evidence that the current HashSet implementation is 
> a significant performance bug. We need a brand-new impl that does the 
> hashing internally, without relying on HashMap, without any unused 
> fields, extra indirections, or surprising costs like that for 
> iterator(). I guess it would be relatively simple to copy-paste 
> HashMap's code, cut stuff until just a Set of keys is left, and merge 
> in the most specific pieces of HashSet (basically just 
> readObject()/writeObject()).
>




More information about the core-libs-dev mailing list