JDK-6982173: Small problem causing thousands of wasted CPU hours
Jan Peter Stotz
jpstotz at gmx.de
Wed Jan 23 20:05:11 UTC 2019
Hi,
like many other I ran into bug JDK-698217 which is about
AbstractSet.removeAll() and it's "aptitude" in wasting CPU time if you
accidentally hit this bug. And there are hundred of developers hitting
this bug every year.
I simply don't understand why there was no progress in 8 years, on a
severe performance issue (a removeAll method on an efficient set that
can require O(n^2)!) caused by code that was written to speed-up the
removeAll implementation.
Which makes this bug worse is that it is triggered by the relative size
of the current set compared to the collection passed as parameter.
Therefore for most developers this means not to use this buggy function
at all (once they realized how worse it is).
IMHO there is a extreme simple workaround for avoiding this problem. The
relevant part of the current implementation is as follows:
if (size() > c.size()) {
for (Object e : c)
modified |= remove(e);
} else {
for (Iterator<?> i = iterator(); i.hasNext(); ) {
if (c.contains(i.next())) {
i.remove();
modified = true;
}
}
}
The problem is that the code in the first if block only makes sense if
the collections passed as parameter implements the remove(Object o)
method in an efficient way - this means with a complexity of less than
O(n).
Java currently has no way to identify Collection implementations that
satisfy this property we should at least implement a workaround that
prevents the Java developers form unexpected "damage".
Which mean that the first if block should only be triggered if we are
sure that it is faster than the second case (because removing an element
from a Set is usually implemented efficiently).
Therefore I would propose a patch that no only compare the size of the
current set and the given collection but also makes sure that the given
collection is a Set or a Map:
if (size() > c.size() && (c instanceof Set || c instanceof Map) {
for (Object e : c)
modified |= remove(e);
} else {
....
Jan
More information about the core-libs-dev
mailing list