Re: ReversibleCollection proposal

Anthony Vanelverdinghe dev at anthonyv.be
Thu May 13 19:27:08 UTC 2021


Hi Stuart

Will EnumSet also be updated to implement ReversibleSet? Or will it be updated to implement NavigableSet [1] independently of this enhancement?

I'd also like to propose adding `ReversibleSet::of` factory methods. This is something I miss having on SortedSet occasionally, but ReversibleSet would actually be a better fit for such methods.

Kind regards, Anthony

[1] https://bugs.openjdk.java.net/browse/JDK-6278287
 
On Friday, April 16, 2021 19:40 CEST, Stuart Marks <stuart.marks at oracle.com> wrote: 
 
> This is a proposal to add a ReversibleCollection interface to the Collections 
> Framework. I'm looking for comments on overall design before I work on detailed 
> specifications and tests. Please send such comments as replies on this email thread.
> 
> Here's a link to a draft PR that contains the code diffs. It's prototype quality, 
> but it should be good enough to build and try out:
> 
>      https://github.com/openjdk/jdk/pull/3533
> 
> And here's a link to a class diagram showing the proposed additions:
> 
>  
> https://cr.openjdk.java.net/~smarks/ReversibleCollection/ReversibleCollectionDiagram.pdf
> 
> Thanks,
> 
> s'marks
> 
> 
> # Ordering and Reversibility
> 
> 
> A long-standing concept that's been missing from collections is that of the 
> positioning, sequencing, or arrangement of elements as a structural property of a 
> collection. (This is sometimes called the "iteration order" of a collection.) For 
> example, a HashSet is not ordered, but a List is. This concept is mostly not 
> manifested in the collections API.
> 
> Iterating a collection produces elements one after another in *some* sequence. The 
> concept of "ordered" determines whether this sequence is defined or whether it's a 
> coincidence of implementation. What does "having an order" mean? It implies that 
> there is a first element and that each element has a successor. Since collections 
> have a finite number of elements, it further implies that there is a last element 
> that has no successor. However, it is difficult to discern whether a collection has 
> a defined order. HashSet generally iterates its elements in the same undefined 
> order, and you can't actually tell that it's not a defined order.
> 
> Streams do have a notion of ordering ("encounter order") and this is preserved, 
> where appropriate, through the stream pipeline. It's possible to detect this by 
> testing whether its spliterator has the ORDERED characteristic. Any collection with 
> a defined order will have a spliterator with this characteristic. However, this is 
> quite a roundabout way to determine whether a collection has a defined order. 
> Furthermore, knowing this doesn't enable any additional operations. It only provides 
> constraints on the stream's implementations (keeping the elements in order) and 
> provides stronger semantics for certain operations. For example, findFirst() on an 
> unordered stream is the same as findAny(), but actually finds the first element if 
> the stream is ordered.
> 
> The concept of ordering is thus present in the system but is surfaced only in a 
> fairly indirect way. We can strengthen abstraction of ordering by making a few 
> observations and considering their implications.
> 
> Given that there is a first element and a last element, the sequence of elements has 
> two ends. It's reasonable to consider operations (add, get, remove) on either end. 
> Indeed, the Deque interface has a full complement of operations at each end. This is 
> an oft-requested feature on various other collections.
> 
> Each element except for the last has a successor, implying that each element except 
> for the first has a predecessor. Thus it's reasonable to consider iterating the 
> elements from first to last or from last to first (that is, in forward or reverse 
> order). Indeed, the concept of iterating in reverse order appears already in bits 
> and pieces in particular places around the collections:
> 
>   - List has indexOf() and lastIndexOf()
>   - Deque has removeFirstOccurrence() and removeLastOccurrence()
>   - List has a ListIterator with hasPrevious/previous methods
>   - Deque and NavigableSet have descendingIterator methods
> 
> Given an ordered collection, though, there's no general way to iterate it in reverse 
> order. Reversed iteration isn't the most common operation, but there are some 
> frequent use cases, such as operating on elements in most-recently-added order. 
> Questions and bug reports about this have come up repeatedly over the years.
> 
> Unfortunately, iterating in reverse order is much harder than iterating in forward 
> order. There are a variety of ways to iterate in forward order. For example, given a 
> List, one can do any of the following:
> 
>      for (var e : list) { ... }
>      list.forEach(...)
>      list.stream()
>      list.toArray()
> 
> However, to iterate a list in reverse order, one must use an explicit loop over 
> ListIterator:
> 
>      for (var it = list.listIterator(list.size()); it.hasPrevious(); ) {
>          var e = it.previous();
>          ...
>      }
> 
> Streaming the elements of a List in reverse order is even worse. One approach would 
> be to implement a reverse-ordered Iterator that wraps a ListIterator and delegates 
> hasNext/next calls to the ListIterator's hasPrevious/previous methods. Then, this 
> Iterator can be turned into a Spliterator, which is then turned into a Stream. (The 
> code to do this is left as an exercise for the reader.) Obtaining the elements in 
> reverse via other means -- forEach() or toArray() -- is similarly difficult.
> 
> Obtaining the elements in reverse order can be accomplished by adopting a concept 
> from NavigableSet's descendingSet() method, which provides a reverse-ordered view 
> collection. This view is also a NavigableSet, which means that any operation that 
> can be performed on the original set can also be applied to the reverse-ordered 
> view. If it were possible to obtain a similar reverse-ordered view on any kind of 
> ordered collection, this would enable the elements to be processed in reverse order 
> in any fashion, not just special-purposes APIs such as ListIterator or 
> descendingIterator().
> 
> 
> # LinkedHashSet
> 
> 
> The main feature of LinkedHashSet is that it does have a defined ordering, as 
> opposed to HashSet, which does not. LinkedHashSet clearly has a first and a last 
> element. However, LinkedHashSet is deficient in a number of ways:
> 
>   - access to and removal of the first element is reasonable (using an iterator) but 
> it is not possible to add at the front
> 
>   - it is possible to add an element at the end, but access to and removal of the 
> last element are expensive
> 
>   - it's not possible to iterate in reverse without copying the entire collection
> 
> Most frustratingly, LinkedHashSet is implemented using a doubly-linked list, so 
> there is no fundamental implementation reason that prevents these operations from 
> being supported. The main reason these operations aren't supported is probably that 
> there hasn't been a good place to push such APIs.
> 
> 
> # Proposal
> 
> 
> Introduce a new ReversibleCollection<E> interface. This provides a new method to 
> obtain a reverse-ordered view. It also contains several methods (copied from Deque) 
> that allow operating on elements at both ends.
> 
> 
>      interface ReversibleCollection<E> extends Collection<E> {
>          ReversibleCollection<E> reversed();
>          default void addFirst(E e)
>          default void addLast(E e)
>          default E getFirst()
>          default E getLast()
>          default E removeFirst()
>          default E removeLast()
>      }
> 
> 
> The List, Deque, and SortedSet interfaces, and the LinkedHashSet class are 
> retrofitted to implement ReversibleCollection. They provide covariant overriding 
> implementations of the reversed() method. For example, List.reversed() returns a 
> List. Default implementations for all of these methods are provided in the 
> appropriate interfaces.
> 
> Covariant overrides are also provided in several other classes. This presents a 
> difficulty for LinkedHashSet, as there's no suitable return type for reversed(). To 
> remedy this, we add another interface
> 
> 
>      interface ReversibleSet<E> extends ReversibleCollection<E>, Set<E> { }
> 
> 
> This adds no new methods but is essentially an intersection of ReversibleCollection 
> and Set. As such, it's suitable as the return type of LinkedHashSet.reversed() and 
> the set views of LinkedHashMap.
> 
> SortedSet::addFirst and addLast throw UnsupportedOperationException. This is because 
> SortedSet's ordering is determined by the comparison method and cannot be controlled 
> explicitly.
> 
> LinkedHashSet::addFirst and addLast add the element at the appropriate position or 
> reposition the element if it is already present in the set.
> 
> New methods are added to LinkedHashMap
> 
>      V putFirst(K, V)
>      V putLast(K, V)
> 
> which put the mapping at the designated position or reposition the mapping if is 
> already present in the map.
> 
> 
> # Design & Implementation Issues
> 
> 
> Covariant overrides for reversed() are provided where possible. However, there is a 
> conflict between List and Deque, as there are examples in the JDK (such as 
> LinkedList) that implement both List and Deque. Since List is much more popular than 
> Deque, I've decided against adding a covariant override to Deque. Instead, a method 
> Deque::reversedDeque is added that returns a Deque, and Deque::reversed returns 
> ReversibleCollection.
> 
> There is no ReversibleMap interface. Most iteration over maps is over the entrySet, 
> keySet, or values views. NavigableMap already has descendingMap(), and LinkedHashMap 
> provides ReversibleX views, which cover most of the cases already.
> 
> Introducing new methods into an interface hierarchy always raises the possibility of 
> name clashes. I've done some searching but I haven't found any major issues, but we 
> should be on the lookout for this. I'll continue to do more searching for such 
> conflicts.
> 
> Introducing covariant overrides on existing methods (notably LinkedHashMap entrySet, 
> keySet, and values) could present incompatibility issues with subclasses that 
> override these methods. I've done some searching and again I haven't found major 
> problems, but this is nonetheless a point of concern. I'll do more searching here, too.
> 
> The default implementations for List::reversed, Deque::reversed, and 
> SortedSet::reversed return reverse-ordered views that are implemented using only 
> public methods of the original interface. This demonstrates the feasibility of 
> retrofitting reverse ordering onto any instance of these interfaces. Similarly, the 
> various add/get/remove methods' default implementations are all implementable in 
> terms of an iterator or reverse-ordered iterator. That said, some concurrent 
> collection implementations will need to override these default implementations in 
> order to preserve their safety invariants. It would probably also be wise to add 
> optimized implementations to the more commonly-used collection implementations.
> 
> Given that a ReversibleCollection has a defined ordering, any spliterator obtained 
> from such a collection should have the Spliterator.ORDERED characteristic.
> 
> This proposal is related to a previous discussion on a similar topic, where Tagir 
> Valeev had proposed OrderedSet and OrderedMap:
> 
>      http://mail.openjdk.java.net/pipermail/core-libs-dev/2020-April/066028.html
> 
> I think there were some good ideas in there and in the ensuing discussion, but it 
> didn't go anywhere. Several of the concepts in this proposal are an evolution of 
> some ideas discussed in that thread.
> 
> One of the ideas from that thread was to use Deque as a common interface for many 
> collections to support reversibility and operations at the ends. We've abandoned 
> that idea in this proposal. The reasons are that it's cluttered with a bunch of 
> methods that are inherited from Queue. Also, some methods are null-returning instead 
> of throwing, which introduces confusion for collections that can contain nulls.
> 
> # END



More information about the core-libs-dev mailing list