RFR [15] 6394757: rev2: AbstractSet.removeAll semantics are surprisingly dependent on relative sizes

Wed May 20 18:23:16 UTC 2020

Alan,

This contains that and that contains this at least avoids calling size which could be expensive or ephemeral at the AbstractSet level.  Using the iterators you can at least do a final check to see there are no more elements to walk.   Biasing towards ordering could pay off for things like naturally ordered SortedSet to SortedSet, LinkedHashSet, or CopyOnWriteArraySet.

Computing the hash during equals is about the best I can come up with:  

public class SetEquals {

    public static void main(String[] args) {
        Comparator<String> cc = String.CASE_INSENSITIVE_ORDER;
        Set<String> s1 = new TreeSet<>(cc);
        Set<String> s2 = new TreeSet<>(cc);
        s1.add("hello");
        s2.add("Hello");
        System.out.println(equals(s1, s2));
        System.out.println(equals(s2, s1));
        System.out.println(s1.hashCode() == s2.hashCode());
    }

    private static boolean equals(Set<?> s1, Set<?> s2) {
        if (s2 == null) {
            return false;
        }

        if (s1 == s2) {
            return true;
        }

        int s1h = 0;
        int s2h = 0;
        Iterator<?> e1 = s1.iterator();
        Iterator<?> e2 = s2.iterator();

        if (s1 instanceof SortedSet && s2 instanceof SortedSet
                && isSameOrder((SortedSet) s1, (SortedSet) s2)) {
            //TODO: raw type warnings.
            Comparator cmp = ((SortedSet<?>) s1).comparator();
            if (cmp == null) {
                cmp = Comparator.naturalOrder();
            }
            while (e1.hasNext() && e2.hasNext()) {
                Object o1 = e1.next();
                Object o2 = e2.next();

                try {
                    if (cmp.compare(o1, o2) != 0
                            && (!s1.contains(o2) || !s2.contains(o1))) {
                        return false;
                    }
                } catch (ClassCastException cce) {
                    return false;
                }

                s1h += Objects.hashCode(o1);
                s2h += Objects.hashCode(o2);

                //Iteration order should be the same.
                if (s1h != s2h) {
                    return false;
                }
            }
        } else {
            while (e1.hasNext() && e2.hasNext()) {
                Object o1 = e1.next();
                Object o2 = e2.next();

                try {
                    if (!Objects.equals(o1, o2)
                            && (!s1.contains(o2) || !s2.contains(o1))) {
                        return false;
                    }
                } catch (ClassCastException cce) {
                    return false;
                }

                s1h += Objects.hashCode(o1);
                s2h += Objects.hashCode(o2);
            }
        }
        return s1h == s2h && !e1.hasNext() && !e2.hasNext();
    }

    private static boolean isSameOrder(SortedSet<?> s1, SortedSet<?> s2) {
        //null matches natural ordering or unwrap to find a singleton.
        //Avoid calling equals on comparator.
        return Collections.reverseOrder(s1.comparator())
                == Collections.reverseOrder(s2.comparator());
    }
}

Jason
________________________________________
From: Alan Snyder <javalists at cbfiddle.com>
Sent: Thursday, May 14, 2020 8:39 AM
To: Jason Mehrens
Cc: core-libs-dev
Subject: Re: RFR [15] 6394757: rev2: AbstractSet.removeAll semantics are surprisingly dependent on relative sizes

> HashSet/TreeSet could do what ConcurrentHashMap/ConcurrentSkipListSet do by using the "this contains that and that contains this" logic.

Yes, at the cost of yet another performance regression.

But how about this problem:

Comparator<String> cc = String.CASE_INSENSITIVE_ORDER;
Set<String> s1 = new TreeSet<>(cc);
Set<String> s2 = new TreeSet<>(cc);
s1.add("hello");
s2.add("Hello");
s1.equals(s2) -> true
s2.equals(s1) -> true
s1.hashCode() == s2.hashCode() -> false