ReversibleCollection proposal
Remi Forax
forax at univ-mlv.fr
Sat Apr 17 16:49:34 UTC 2021
----- Mail original -----
> De: "Stuart Marks" <stuart.marks at oracle.com>
> À: "core-libs-dev" <core-libs-dev at openjdk.java.net>
> Envoyé: Vendredi 16 Avril 2021 19:40:55
> Objet: ReversibleCollection proposal
> This is a proposal to add a ReversibleCollection interface to the Collections
> Framework. I'm looking for comments on overall design before I work on detailed
> specifications and tests. Please send such comments as replies on this email
> thread.
>
> Here's a link to a draft PR that contains the code diffs. It's prototype
> quality,
> but it should be good enough to build and try out:
>
> https://github.com/openjdk/jdk/pull/3533
>
> And here's a link to a class diagram showing the proposed additions:
>
>
> https://cr.openjdk.java.net/~smarks/ReversibleCollection/ReversibleCollectionDiagram.pdf
>
> Thanks,
>
> s'marks
>
>
> # Ordering and Reversibility
>
>
> A long-standing concept that's been missing from collections is that of the
> positioning, sequencing, or arrangement of elements as a structural property of
> a
> collection. (This is sometimes called the "iteration order" of a collection.)
> For
> example, a HashSet is not ordered, but a List is. This concept is mostly not
> manifested in the collections API.
>
> Iterating a collection produces elements one after another in *some* sequence.
> The
> concept of "ordered" determines whether this sequence is defined or whether it's
> a
> coincidence of implementation. What does "having an order" mean? It implies that
> there is a first element and that each element has a successor. Since
> collections
> have a finite number of elements, it further implies that there is a last
> element
> that has no successor. However, it is difficult to discern whether a collection
> has
> a defined order. HashSet generally iterates its elements in the same undefined
> order, and you can't actually tell that it's not a defined order.
>
> Streams do have a notion of ordering ("encounter order") and this is preserved,
> where appropriate, through the stream pipeline. It's possible to detect this by
> testing whether its spliterator has the ORDERED characteristic. Any collection
> with
> a defined order will have a spliterator with this characteristic. However, this
> is
> quite a roundabout way to determine whether a collection has a defined order.
> Furthermore, knowing this doesn't enable any additional operations. It only
> provides
> constraints on the stream's implementations (keeping the elements in order) and
> provides stronger semantics for certain operations. For example, findFirst() on
> an
> unordered stream is the same as findAny(), but actually finds the first element
> if
> the stream is ordered.
>
> The concept of ordering is thus present in the system but is surfaced only in a
> fairly indirect way. We can strengthen abstraction of ordering by making a few
> observations and considering their implications.
>
> Given that there is a first element and a last element, the sequence of elements
> has
> two ends. It's reasonable to consider operations (add, get, remove) on either
> end.
> Indeed, the Deque interface has a full complement of operations at each end.
> This is
> an oft-requested feature on various other collections.
>
> Each element except for the last has a successor, implying that each element
> except
> for the first has a predecessor. Thus it's reasonable to consider iterating the
> elements from first to last or from last to first (that is, in forward or
> reverse
> order). Indeed, the concept of iterating in reverse order appears already in
> bits
> and pieces in particular places around the collections:
>
> - List has indexOf() and lastIndexOf()
> - Deque has removeFirstOccurrence() and removeLastOccurrence()
> - List has a ListIterator with hasPrevious/previous methods
> - Deque and NavigableSet have descendingIterator methods
>
> Given an ordered collection, though, there's no general way to iterate it in
> reverse
> order. Reversed iteration isn't the most common operation, but there are some
> frequent use cases, such as operating on elements in most-recently-added order.
> Questions and bug reports about this have come up repeatedly over the years.
>
> Unfortunately, iterating in reverse order is much harder than iterating in
> forward
> order. There are a variety of ways to iterate in forward order. For example,
> given a
> List, one can do any of the following:
>
> for (var e : list) { ... }
> list.forEach(...)
> list.stream()
> list.toArray()
>
> However, to iterate a list in reverse order, one must use an explicit loop over
> ListIterator:
>
> for (var it = list.listIterator(list.size()); it.hasPrevious(); ) {
> var e = it.previous();
> ...
> }
>
> Streaming the elements of a List in reverse order is even worse. One approach
> would
> be to implement a reverse-ordered Iterator that wraps a ListIterator and
> delegates
> hasNext/next calls to the ListIterator's hasPrevious/previous methods. Then,
> this
> Iterator can be turned into a Spliterator, which is then turned into a Stream.
> (The
> code to do this is left as an exercise for the reader.) Obtaining the elements
> in
> reverse via other means -- forEach() or toArray() -- is similarly difficult.
>
> Obtaining the elements in reverse order can be accomplished by adopting a
> concept
> from NavigableSet's descendingSet() method, which provides a reverse-ordered
> view
> collection. This view is also a NavigableSet, which means that any operation
> that
> can be performed on the original set can also be applied to the reverse-ordered
> view. If it were possible to obtain a similar reverse-ordered view on any kind
> of
> ordered collection, this would enable the elements to be processed in reverse
> order
> in any fashion, not just special-purposes APIs such as ListIterator or
> descendingIterator().
>
>
> # LinkedHashSet
>
>
> The main feature of LinkedHashSet is that it does have a defined ordering, as
> opposed to HashSet, which does not. LinkedHashSet clearly has a first and a last
> element. However, LinkedHashSet is deficient in a number of ways:
>
> - access to and removal of the first element is reasonable (using an iterator)
> but
> it is not possible to add at the front
>
> - it is possible to add an element at the end, but access to and removal of the
> last element are expensive
>
> - it's not possible to iterate in reverse without copying the entire collection
>
> Most frustratingly, LinkedHashSet is implemented using a doubly-linked list, so
> there is no fundamental implementation reason that prevents these operations
> from
> being supported. The main reason these operations aren't supported is probably
> that
> there hasn't been a good place to push such APIs.
>
>
> # Proposal
>
>
> Introduce a new ReversibleCollection<E> interface. This provides a new method to
> obtain a reverse-ordered view. It also contains several methods (copied from
> Deque)
> that allow operating on elements at both ends.
>
>
> interface ReversibleCollection<E> extends Collection<E> {
> ReversibleCollection<E> reversed();
> default void addFirst(E e)
> default void addLast(E e)
> default E getFirst()
> default E getLast()
> default E removeFirst()
> default E removeLast()
> }
>
>
> The List, Deque, and SortedSet interfaces, and the LinkedHashSet class are
> retrofitted to implement ReversibleCollection. They provide covariant overriding
> implementations of the reversed() method. For example, List.reversed() returns a
> List. Default implementations for all of these methods are provided in the
> appropriate interfaces.
>
> Covariant overrides are also provided in several other classes. This presents a
> difficulty for LinkedHashSet, as there's no suitable return type for reversed().
> To
> remedy this, we add another interface
>
>
> interface ReversibleSet<E> extends ReversibleCollection<E>, Set<E> { }
>
>
> This adds no new methods but is essentially an intersection of
> ReversibleCollection
> and Set. As such, it's suitable as the return type of LinkedHashSet.reversed()
> and
> the set views of LinkedHashMap.
>
> SortedSet::addFirst and addLast throw UnsupportedOperationException. This is
> because
> SortedSet's ordering is determined by the comparison method and cannot be
> controlled
> explicitly.
>
> LinkedHashSet::addFirst and addLast add the element at the appropriate position
> or
> reposition the element if it is already present in the set.
>
> New methods are added to LinkedHashMap
>
> V putFirst(K, V)
> V putLast(K, V)
>
> which put the mapping at the designated position or reposition the mapping if is
> already present in the map.
>
I think the analysis is spot on but I don't think the proposed solution is the right one.
Introducing a new interface (two in this case) has a really huge cost and in this case, i've trouble to see why i will want to write a method that takes a ReversibleCollection as parameter, ReversibleCollection as a type does not seem to be useful. About the cost of introducing a new collection interface, it means that we also have to update all emptyXXX, singleXXX, uncheckedXXX with a new interface, and that only for the JDK. Every libraries that proxy collections has also to be updated to take care of this new type, otherwise the integration with an application that uses this new type and the library is not seamless anymore.
I fully agree that having a way to reverse a collection is a powerful API point, as Brian said, it's better to reverse the collection and then ask for a stream than providing a method reverseStream (and this is true for iterators or views like keySet/entrySet/values or subList too). Several people also ask to have findLast() on Stream, using list.descendingList().findFirst() will be equivalent.
As Donald said, reversed() is perhaps not the right name because there are already collections that have methods with the same name.
Given that we already have descdendingSet() on NavigableSet, i think the method "reversed" should be called descendingDeque() on Deque, descendingList() on List, etc.
This also sove the problem of LinkedList implementing both Deque and List.
Getting the first element of a Set is something that is requested a lot, so having getFirst(), i think directly on Set, is a good idea too,
having a method removeLast() on List (mainly for ArrayList) is also a good idea, because it's better than list.remove(list.size() - 1).
All these methods can be introduced has default methods on the existing collection interfaces, there is no need of a special interface (or two) for that.
[...]
>
> # END
Rémi
More information about the core-libs-dev
mailing list