[concurrency-interest] NavigableMap implementation bugs
Joshua Bloch
josh at bloch.us
Mon Jun 9 23:20:53 UTC 2008
Oooh, ooh, pick me!
Josh
P.S. It isn't as tedious as you think. Doing the obvious transformation on
HashMap gives you a benchmark. Then you try to beat it.
On Mon, Jun 9, 2008 at 3:01 PM, Martin Buchholz <martinrb at google.com> wrote:
> Osvaldo, I fundamentally agree with you that
> optimizing HashSet in the manner you describe
> is worthwhile, but because the implementation
> needs to "parallel" HashMap, it would be good
> to do this after the current hacking activity on
> HashMap quiesces.
>
> Unfortunately, this kind of programming
> is pure tedium - who will volunteer?
>
> Martin
>
> On Mon, Jun 9, 2008 at 2:09 PM, Osvaldo Pinali Doederlein
> <osvaldo at visionnaire.com.br> wrote:
> > Doug Lea wrote:
> >>
> >> I don't think so. Relaying the subset methods to the *Set classes
> >> was just an implementation expediency under the mis-thought that they
> >> would take care of recursive subsets etc rather than needing
> >> special implementations for KeySets. But the byproduct for remove()
> >> is clearly wrong so they do need separate implementations.
> >>
> >
> > This remembers me from an old pet peeve, the fact that java.util.HashSet
> is
> > implemented as a wrapper over java.util.HashMap. This sucks
> > performance-wise, due to the additional indirections everywhere, and even
> > more important, because every entry single object is larger than
> necessary -
> > it contains a key and a value fields, but a single field would be
> necessary
> > for HashSet. If you have a Set with millions of entries, the overhead of
> one
> > useless pointer per entry is significant. Plus, the wrapping
> implementation
> > uses a dummy, fixed Object instance as the value for all entries - and
> this
> > may create another subtle performance problem: large numbers of
> > "cross-space" references in the heap. I know that in most GCs only
> > old-to-young refs are evil, but it seems that in some GCs any cross-space
> > reference is bad (requires remset tracking), like Detlef et al's new
> > "Garbage-First" collector (if I understood it).
> >
> > Back in J2SE1.2 time, the runtime savings from the wrapping
> implementation
> > could be significant... but, today? The HashMap* classes inside the
> latest
> > rt.jar are 18,5Kb total (uncompressed), versus 3Kb from HashSet. That's a
> > 12Kb of space saving, plus some small saving in JIT time too. I'd say
> that
> > this is an irrelevant advantage, even in JavaME. The performance bonus of
> a
> > tuned HashSet, however modest, would certainly be much more significant.
> I
> > know it's ugly from a maintenance POV to have two large classes that will
> be
> > 90% identical, but to paraphrase the movie Some Like It Hot, nothing is
> > perfect...
> >
> > A+
> > Osvaldo
> >
> > --
> > -----------------------------------------------------------------------
> > Osvaldo Pinali Doederlein Visionnaire Informática S/A
> > osvaldo at visionnaire.com.br http://www.visionnaire.com.br
> > Arquiteto de Tecnologia +55 (41) 337-1000 #223
> >
> >
>
> _______________________________________________
> Concurrency-interest mailing list
> Concurrency-interest at altair.cs.oswego.edu
> http://altair.cs.oswego.edu/mailman/listinfo/concurrency-interest
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/core-libs-dev/attachments/20080609/83c4f281/attachment.html>
More information about the core-libs-dev
mailing list