RFR (M): JDK-6394757: AbstractSet.removeAll semantics are surprisingly dependent on relative sizes

Brent Christian brent.christian at oracle.com
Tue May 7 23:33:40 UTC 2019


Hi, Stuart.

That all looks pretty good to me.  I think, "membership semantics" is a 
good term.

Just a few minor comments:

Collection.java:

  110  * that use different membership semantics. For operations that 
involve more than
  111  * one collection, it is specified which collection's membership 
semantics are
  112  * used by the operation.

addAll() and copyOf() involve more than one collection, though I agree 
that they do not need to be updated to specify membership semantics.


AbstractCollection.java:


404 * obtaining an iterator from the {@code iterator} method. Each element
405 * is passed to the {@code contains} method of the specified collection.
406 * If this call returns {@code false}, the element is removed from
          ^^^^^^^^^

Is "this call" a little ambiguous?  Maybe:

"If contains() returns false..."
or
"If false is returned..."

?

List.java:

Should containsAll(), removeAll(), retainAll() have the @implNote about 
contains() performance?

Thanks,
-Brent

On 5/2/19 6:36 PM, Stuart Marks wrote:
> Hi all,
> 
> Please review these spec and implementation changes to remove the 
> "optimization" to AbstractSet.removeAll. Briefly, this method was 
> specified (and implemented) to iterate one collection or the other 
> depending on the relative sizes of the collections. The problem is that 
> this would cause an unexpected semantic shift, since one or the other 
> collection's contains() method would be called depending on their 
> relative sizes, and the contains() methods might implement different 
> semantics depending upon the kind of collection.
> 
> The fix is to remove the specification and implementation of 
> AbstractSet.removeAll and to inherit AbstractCollection.removeAll, which 
> does the iteration one way consistently.
> 
> I've removed overriding removeAll method implementations from 
> IdentityHashMap's view sets which were added in order to avoid the 
> "optimization" inherited from AbstractSet.removeAll.
> 
> I've added some words to the Collection interface to introduce the term 
> "membership semantics" for a concept that's been around for a long time 
> but which never had a name, essentially the contains() method. I've then 
> updated the specifications of containsAll, removeAll, retainAll, and 
> (where necessary) equals to specify which collection's membership 
> semantics are used.
> 
> Finally, since this change may introduce some performance issues the 
> optimization was intended to avoid, I've added some implementation notes 
> to the various methods to warn about potential performance issues if 
> this collection's (or the other's) contains() method is linear or worse.
> 
> There have been various discussions in the past (see JDK-8178425 for 
> example) that propose optimizations to the various bulk operations. They 
> usually involve modifying the decision criteria for iterating this 
> collection vs iterating the other collection. However, as we concluded 
> previously, doing this introduces semantic problems. Another approach to 
> optimizating the bulk cases is to copy the other collection into an 
> intermediate collection that has an O(1) contains() method; however, 
> this also changes the semantics and thus must be ruled out. Such 
> approaches should be left to the caller.
> 
> Bug:
> 
>      https://bugs.openjdk.java.net/browse/JDK-6394757
> 
> Previous discussions:
> 
>      
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-January/058140.html 
> 
>      
> http://mail.openjdk.java.net/pipermail/core-libs-dev/2019-February/058378.html 
> 
> 
>      (see also additional bugs and linked to the above bug)
> 
> Webrev:
> 
>      http://cr.openjdk.java.net/~smarks/reviews/6394757/webrev.0/
> 
> Thanks,
> 
> s'marks


More information about the core-libs-dev mailing list