RFR (M): JDK-6394757: rev1: AbstractSet.removeAll semantics are surprisingly dependent on relative sizes
Alan Snyder
javalists at cbfiddle.com
Fri May 17 05:06:43 UTC 2019
Hi Stuart,
I believe I already accepted the fact of ersatz Collections.
Could you explain the inconsistency in the specification that affects removeAll()? I don’t see it.
In the current specification, the classes that can create ersatz Collections all call out the ersatz Collections in their documentation.
SortedSet:
Note that the ordering maintained by a sorted set (whether or not an
explicit comparator is provided) must be <i>consistent with equals</i> if
the sorted set is to correctly implement the {@code Set} interface.
The behavior of a sorted set <i>is</i> well-defined even if its
ordering is inconsistent with equals; it just fails to obey the general
contract of the {@code Set} interface.
TreeSet is similar
IdentityHashMap:
This class is <i>not</i> a general-purpose {@code Map} implementation! While this class implements the {@code Map} interface, it
intentionally violates {@code Map's} general contract, which mandates the
use of the {@code equals} method when comparing objects.
IdentityHashMap.keySet:
While the object returned by this method implements the
{@code Set} interface, it does <i>not</i> obey {@code Set's} general
contract. Like its backing map, the set returned by this method
defines element equality as reference-equality rather than
object-equality. This affects the behavior of its {@code contains},
{@code remove}, {@code containsAll}, {@code equals}, and
{@code hashCode} methods.
The logical implication of not supporting the general contract is that Collection operations performed on an ersatz Collection may have
unexpected (implementation specific) effects, which implies that passing one of these ersatz Collections to a parameter of a Collection
type may have unexpected (implementation specific) effects.
In other words, whether removeAll is performed on an ersatz Collection or the parameter is an ersatz Collection, the effects may be
implementation specific and inconsistent with the specification of removeAll.
That is not to say that improving the design would not have benefit, of course it does. The bug reports you cite prove that.
However, the compatibility impact of the change needs to be taken into account.
Alan
> On May 16, 2019, at 8:42 PM, Stuart Marks <stuart.marks at oracle.com> wrote:
>
> Hi Alan,
>
> Whether you call them "ersatz" collections, "non-standard" collections, or collections with different membership semantics, they have been around in the collections framework from day one. This is perhaps unfortunate, and it may be considered to be bad design, but it is a fact.
>
> The issue is not with undefined or implementation-specific behavior. The issue is that the specification as it stands is self-contradictory. In particular, it assumes that certain operations are symmetric that in fact are not, if you read other parts of the spec and derive logical conclusions from them. This is what I'm trying to fix. This is not a mere a matter of intellectual consistency. This state of affairs has real consequences. There is a whole family of bugs linked to this one which complain both about the performance issue and the semantic issue. All of these bugs are directly related to the self-contradictory nature of the current specification. With my changes, the spec is self-contradictory in fewer places. (There is, however, more work to be done; see JDK-8190545.)
>
> Your lead point, though, is about performance. The performance optimization of AbstractSet.removeAll is one of those that assumes that the operation is symmetric, when in fact it isn't. As a rule, optimizations mustn't change semantics. Therefore, it has to go.
>
> s'marks
>
More information about the core-libs-dev
mailing list