[External] : Sequenced Collections

Tagir Valeev amaembo at gmail.com
Sat Feb 12 03:24:24 UTC 2022


Wow, I missed that the Sequenced Collections JEP draft was posted!

Of course, I strongly support this initiative and am happy that my proposal
got some love and is moving forward. In general, I like the JEP in the way
it is. I have only two slight concerns:
1. I'm not sure that having addition methods (addFirst, addLast, putFirst,
putLast) is a good idea, as not every mutable implementation may support
them. While this adds some API unification, it's quite a rare case when
this could be necessary. I think most real world applications of Sequenced*
types would be around querying, or maybe draining (so removal operations
are ok). Probably it would be enough to add addFirst, addLast, putFirst,
putLast directly to the compatible implementations/subinterfaces like List,
LinkedHashSet, and LinkedHashMap removing them from the Sequenced*
interfaces. In this case, SortedSet interface will not be polluted with
operations that can never be implemented. Well my opinion is not very
strong here.

2. SequencedCollection name is a little bit too long. I think every extra
letter adds a hesitation for users to use the type, especially in APIs
where it could be the most useful. I see the Naming section and must admit
that I don't have better ideas. Well, maybe just Sequenced would work? Or
too vague?

Speaking of Remi's suggestion, I don't think it's a good idea. Maybe it
could be if we designed the Collection API from scratch. But given the
current state of Java collections, it's better to add new interfaces than
to put the new semantics to the java.util.List and greatly increase the
amount of non-random-accessed lists in the wild. There are tons of code
that implicitly assume fast random access of every incoming list (using
indexed iteration inside). The suggested approach could become a
performance disaster.

With best regards,
Tagir Valeev.

On Sat, Feb 12, 2022 at 2:26 AM Stuart Marks <stuart.marks at oracle.com>
wrote:

> Hi Rémi,
>
> I see that you're trying to reduce the number of interfaces introduced by
> unifying
> things around an existing interface, List. Yes, it's true that List is an
> ordered
> collection. However, your analysis conveniently omits other facts about
> List that
> make it unsuitable as a general "ordered collection" interface.
> Specifically:
>
> 1) List supports element access by int index; and
>
> 2) List is externally ordered. That is, its ordering is determined by a
> succession
> of API calls, irrespective of element values. This is in contrast to
> SortedSet et al
> which are internally ordered, in that the ordering is determined by the
> element values.
>
> The problem with indexed element access is that it creates a bunch of
> hidden
> performance pitfalls for any data structure where element access is other
> than O(1).
> So get(i) degrades to O(n), binarySearch degrades from O(log n) to O(n).
> (This is in
> the sequential implementation; the random access implementation degrades
> to O(n log
> n)). Apparently innocuous indexed for-loops degrade to quadratic. This is
> one of the
> reasons why LinkedList is a bad List implementation.
>
> If we refactor LinkedHashSet to implement List, we basically have created
> another
> situation just like LinkedList. That's a step in the wrong direction.
>
> Turning to internal ordering (SortedSet): it's fundamentally incompatible
> with
> List's external ordering. List has a lot of positional mutation operations
> such as
> add(i, obj); after this call, you expect obj to appear at position i. That
> can't
> work with a SortedSet.
>
> There is implicit positioning semantics in other methods that don't have
> index
> arguments. For example, replaceAll replaces each element of a List with
> the result
> of calling a function on that element. Crucially, the function result goes
> into the
> same location as the original element. That to cannot work with SortedSet.
>
> Well, we can try to deal with these issues somehow, like making certain
> methods
> throw UnsupportedOperationException, or by relaxing the semantics of the
> methods so
> that they no longer have the same element positioning semantics. Either of
> these
> approaches contorts the List interface to such an extent that it's no
> longer a List.
>
> So, no, it's not useful or effective to try to make List be the common
> "ordered
> collection" interface.
>
> s'marks
>
>
>
> On 2/10/22 3:14 AM, Remi Forax wrote:
> > I've read the draft of the JEP on sequenced collection, and i think the
> proposed design can be improved.
> >    https://bugs.openjdk.java.net/browse/JDK-8280836
> >
> > I agree with the motivation, there is a need for an API to consider the
> element of a list, a sorted set and a linked hash set as an ordered
> sequence of elements with a simple way to access/add/remove the first/last
> element and also reverse the elements as view.
> >
> > I disagree about the conclusion that we need to introduce 4 new
> interfaces for that matter.
> >
> > Here are the reasons
> > 1/ Usually an ordered collection is called a list. Introducing an
> interface SequencedCollection for something which is usually called a list
> will cause more harm than good. Or maybe we should rename LISP to SEQP :)
> >
> > 2/ There is already an interface List in Java, that represents an
> ordered sequence of elements, with LinkedList being the name of the the
> double linked list implementation. You can argue that there is a slight
> difference between the semantics of java.util.List and the proposed syntax
> of java.util.SequencedCollection, but given that people already have
> difficulties to understand basic data structure concepts, as a teacher i
> dread to have a discussion on those slight differences that are only true
> in Java.
> >
> > If the collection API was not already existing, we may discuss about
> having the same interface java.util.List to both indexed collection and
> ordered collection, but that boat has sailed a long time ago.
> >
> > So in first approach, we should refactor sorted set and linked hash set
> to directly implement java.util.List and all the proposed methods into
> java.util.List. But as you hint in the Risks and Assumptions section, this
> will cause regression due to inference and also we will have trouble with
> LinkedHashMap (see below).
> >
> > 3/ LinkedHashMap mixes 3 implementations in one class, some of these
> implementations does not conform to the semantics of SequencedMap.
> > - You can opt-out having the key sequentially ordered as defined by
> SequencedMap by using the constructor LinkedHashMap(int initialCapacity,
> float loadFactor, boolean accessOrder) and passing true as last parameter.
> > - You can opt-out having the key sequentially ordered as defined by
> SequencedMap by overriding removeEldestEntry(), removing the first entry at
> the same time you add a new one.
> >
> > Because all these reasons, i think we should move to another design,
> using delegation instead of inheritance, which for the collection framework
> means exposing new way to access/modify sorted set and linked hash set
> through java.util.List views.
> >
> > The concept of views is not a new concept, it's used in Arrays.asList(),
> List.subList() or Map.keySet()/values()/entrySet() (and more). The idea is
> not that a sorted set is a list but that it provides a method to see it as
> a list. It solves our problem of compatibility by not adding super types to
> existing type and also the problem of the semantics of LinkedHashMap
> because a view keeps the semantics of the data structure it originated.
> >
> > Here is the proposed new methods in List, SortedSet and SortedMap.
> >
> > interface List<E> extends Collection<E> {
> >    // new methods
> >    void addFirst();
> >    void addLast();
> >    E getFirst();
> >    E getLast();
> >    E removeFirst();
> >    E removeLast();
> >    List<E> reversedList(); // or descendingList() ??
> > }
> >
> > interface SortedSet<E> implements Set<E> {
> >    // new methods
> >    List<E> asList();
> > }
> >
> > interface SortedMap<K,V> implements Map<K,V> {
> >    // new methods
> >    List<K> keyList();  // do not use covariant return type
> >    List<Map.Entry<K,V>> entryList();  // same
> > }
> >
> > I believe this design is objectively better than the one proposed
> because as a user being able to use new interfaces is a slow process, the
> libraries/dependencies must be updated to take the new interfaces as
> parameter before the new types can be used. By contrast, the proposed
> design only enhance existing interfaces so people will enjoy the new
> methods directly when introduced.
> >
> > Rémi
> >
>


More information about the core-libs-dev mailing list