RFR (M): JDK-6394757: AbstractSet.removeAll semantics are surprisingly dependent on relative sizes

Fri May 3 13:20:07 UTC 2019

I'm not a reviewer, but strongly support this change. Simpler is better.
Thanks!

With best regards,
Tagir Valeev.

пт, 3 мая 2019 г., 7:37 Stuart Marks <stuart.marks at oracle.com>:

> Hi all,
>
> Please review these spec and implementation changes to remove the
> "optimization"
> to AbstractSet.removeAll. Briefly, this method was specified (and
> implemented)
> to iterate one collection or the other depending on the relative sizes of
> the
> collections. The problem is that this would cause an unexpected semantic
> shift,
> since one or the other collection's contains() method would be called
> depending
> on their relative sizes, and the contains() methods might implement
> different
> semantics depending upon the kind of collection.
>
> The fix is to remove the specification and implementation of
> AbstractSet.removeAll and to inherit AbstractCollection.removeAll, which
> does
> the iteration one way consistently.
>
> I've removed overriding removeAll method implementations from
> IdentityHashMap's
> view sets which were added in order to avoid the "optimization" inherited
> from
> AbstractSet.removeAll.
>
> I've added some words to the Collection interface to introduce the term
> "membership semantics" for a concept that's been around for a long time
> but
> which never had a name, essentially the contains() method. I've then
> updated the
> specifications of containsAll, removeAll, retainAll, and (where necessary)
> equals to specify which collection's membership semantics are used.
>
> Finally, since this change may introduce some performance issues the
> optimization was intended to avoid, I've added some implementation notes
> to the
> various methods to warn about potential performance issues if this
> collection's
> (or the other's) contains() method is linear or worse.
>
> There have been various discussions in the past (see JDK-8178425 for
> example)
> that propose optimizations to the various bulk operations. They usually
> involve
> modifying the decision criteria for iterating this collection vs iterating
> the
> other collection. However, as we concluded previously, doing this
> introduces
> semantic problems. Another approach to optimizating the bulk cases is to
> copy
> the other collection into an intermediate collection that has an O(1)
> contains()
> method; however, this also changes the semantics and thus must be ruled
> out.
> Such approaches should be left to the caller.
>
> Bug:
>
>      https://bugs.openjdk.java.net/browse/JDK-6394757
>
> Previous discussions:
>
>
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/058140.html
>
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-February/058378.html
>
>      (see also additional bugs and linked to the above bug)
>
> Webrev:
>
>      http://cr.openjdk.java.net/~smarks/reviews/6394757/webrev.0/
>
> Thanks,
>
> s'marks
>