JDK-6982173: Small problem causing thousands of wasted CPU hours

Alan Snyder javalists at cbfiddle.com
Fri Feb 15 19:30:56 UTC 2019


I think this situation is a mess.

The “general contract” of a Collection, I believe, is that it contains zero or more identified member objects, as defined by the appropriate equals() method. I know this is hard to specify precisely, but I presume we all “know it when we see it”.

There is value to “collections” whose members are not objects but are equivalence classes of objects, as defined by a nonstandard equality test, but I think it is a mistake to call them Collections.

If a method takes a parameter of type Collection, it should be free to assume that the parameter object supports the “general contract” of Collection. Is there any plausible alternative?

Changing one method in the JDK to support a non-standard Collection parameter does not solve the problem, because non-JDK methods with Collection (etc.) parameters could have similar “misbehavior”. How would the developer know when a specific TreeSet can or cannot be passed to a method? Does every method that accepts a Collection (etc.) parameter require boilerplate (as in the disjoint example) explaining its exact requirements or how it can go wrong?

Perhaps it would be useful to define specific behaviors that these nonstandard “collections” might support. For example, a Membership interface (with method contains(e)) would be perfect for a removeAll(Membership) method on Collection, implemented as you propose.
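
To make the Membership idea concrete, here is a rough sketch. Nothing in it exists in the JDK: the Membership interface, the class name, and the static removeAll helper (standing in for a removeAll(Membership) method on Collection) are all hypothetical. It assumes the proposed implementation always iterates over the receiver and asks the argument whether each element is a member.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

public class MembershipSketch {

    /** Hypothetical interface: promises a membership test and nothing about equals(). */
    @FunctionalInterface
    public interface Membership {
        boolean contains(Object e);
    }

    /** Hypothetical stand-in for a removeAll(Membership) method on Collection. */
    public static boolean removeAll(Collection<?> target, Membership members) {
        boolean modified = false;
        // Always iterate over the target and ask the Membership object, so the
        // result does not depend on the relative sizes of the two arguments.
        for (Iterator<?> it = target.iterator(); it.hasNext(); ) {
            if (members.contains(it.next())) {
                it.remove();
                modified = true;
            }
        }
        return modified;
    }

    public static void main(String[] args) {
        // A TreeSet whose ordering is not compatible with equals().
        TreeSet<String> ignoreCase = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        ignoreCase.add("A");

        List<String> names = new ArrayList<>(List.of("a", "b", "c"));
        removeAll(names, ignoreCase::contains);
        System.out.println(names); // prints [b, c]
    }
}
```

The caller passes ignoreCase::contains, so the TreeSet is used only for its membership test and is never treated as a Collection that obeys equals().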

  Alan




> On Feb 15, 2019, at 10:44 AM, Stuart Marks <stuart.marks at oracle.com> wrote:
> 
> 
> 
> On 2/14/19 9:30 AM, Alan Snyder wrote:
>>> Care must be exercised if this method is used on collections that
>>> do not comply with the general contract for {@code Collection}.
>> So, what does this mean? Are we catering to incorrect implementations?
> 
> I think this is a quote from the specification of Collections.disjoint():
> 
>>     * <p>Care must be exercised if this method is used on collections that
>>     * do not comply with the general contract for {@code Collection}.
>>     * Implementations may elect to iterate over either collection and test
>>     * for containment in the other collection (or to perform any equivalent
>>     * computation).  If either collection uses a nonstandard equality test
>>     * (as does a {@link SortedSet} whose ordering is not <em>compatible with
>>     * equals</em>, or the key set of an {@link IdentityHashMap}), both
>>     * collections must use the same nonstandard equality test, or the
>>     * result of this method is undefined.
> 
> (Collections.disjoint() has similar issues to removeAll/retainAll but I don't think it's as important to fix, nor do I think it necessarily must be made consistent with them. But that's kind of a separate discussion.)
> 
> I dislike the phrase "general contract" because it's, well, too general. I think it mainly refers to what I've been referring to as "contains() semantics" or "membership semantics". The rest of the paragraph refers to cases where a "nonstandard equality test" is used. I prefer to say that the membership semantics of things like IdentityHashMap and SortedSet differ from the membership semantics specified in the Set interface.
> 
> Specifically, Set says
> 
>    sets contain no pair of elements e1 and e2 such that e1.equals(e2)
> 
> whereas IdentityHashMap implies that it cannot contain any two keys k1 and k2 such that k1 == k2, and SortedSet should say that it contains no pair of elements e1 and e2 such that compare(e1, e2) == 0.
> 
> (This last is the subject that JDK-8190545 is intended to address.)
> 
> The upshot is that Collections.disjoint might not work as expected if it's used on collections that have different membership semantics.
> 
> s'marks
> 
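
For readers who want to see the three membership semantics discussed above side by side, here is a small sketch. The class and method names are made up for illustration. It only calls contains() and containsKey-style lookups, whose behavior is specified, and it deliberately avoids Collections.disjoint(), whose result is undefined for mixed semantics.

```java
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MembershipSemantics {

    /** equals()-based membership, as specified by the Set interface. */
    public static boolean hashSetContains(String stored, Object probe) {
        Set<String> set = new HashSet<>();
        set.add(stored);
        return set.contains(probe);
    }

    /** Identity (==) membership, as used by the key set of an IdentityHashMap. */
    public static boolean identityKeySetContains(String stored, Object probe) {
        Map<String, Boolean> map = new IdentityHashMap<>();
        map.put(stored, Boolean.TRUE);
        return map.keySet().contains(probe);
    }

    /** Comparator-based membership: an element is present if compare(...) == 0. */
    public static boolean caseInsensitiveTreeSetContains(String stored, Object probe) {
        Set<String> set = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        set.add(stored);
        return set.contains(probe);
    }

    public static void main(String[] args) {
        // Two distinct String objects that are equal according to equals().
        String a1 = new String("a");
        String a2 = new String("a");

        System.out.println(hashSetContains(a1, a2));                 // true:  a1.equals(a2)
        System.out.println(identityKeySetContains(a1, a2));          // false: a1 != a2
        System.out.println(hashSetContains("a", "A"));               // false: not equal
        System.out.println(caseInsensitiveTreeSetContains("a", "A")); // true:  compare == 0
    }
}
```

Each collection answers "is this element a member?" differently for the same pair of objects, which is why code that combines two of them can give surprising answers.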



More information about the core-libs-dev mailing list