LinkedHashMap/LinkedHashSet enhancement: OrderedMap/OrderedSet

Tagir Valeev amaembo at gmail.com
Wed Apr 29 05:20:24 UTC 2020


Hello, Paul and Stuart!

Thank you for sharing your thoughts regarding my proposal. Yes, the
proposal was triggered by twitter discussion but I needed this
functionality occasionally in my daily work several times.

Sorry for mixing the goal and possible solution, so let's start again.

The goal of my proposal is to provide a way to
- Get the last element of LinkedHashSet
- Iterate through LinkedHashSet in backward order
- Iterate through LinkedHashSet starting from a given element (both
forward and backward)

Throughout this e-mail let's not discuss whether having such
functionality is useful or not. I will try to write a separate e-mail
regarding this later. Here let's discuss the possible solutions. I see
the following approaches.

1. Do not add any new interfaces, nor implement existing ones. Just
add new methods to LinkedHashSet that allow having the desired
functionality.
2. Add a single method to LinkedHashSet that returns a view
implementing some existing interface and has desired functionality
(Deque looks like a good candidate).
3. Create a new interface like OrderedSet that provides necessary
methods, and implement it (my proposal)
4. Add a single method to LinkedHashSet that returns a view
implementing some new interface.
Also, we should take into account that in many cases the Set has the
corresponding Map whose keys behave like the set (in this case,
LinkedHashMap). It looks reasonable to be able to provide the same
functionality for LinkedHashMap::keySet. Also, in this particular
case, it's technically possible to provide the same functionality for
LinkedHashMap::entrySet as well.

The first approach doesn't allow us to enhance LinkedHashMap::keySet
and/or LinkedHashMap::entrySet. Paul suggests adding
LinkedHashMap::entryLinkedSet that returns LinkedHashSet. However,
LinkedHashSet is a concrete implementation that delegates to the Map
assuming that our set is technically the keys of that Map. Changing it
to be able to behave like a view for another Map entrySet would
require changing the LinkedHashSet internal structure significantly,
increasing its complexity and probably adding some overhead for the
old clients who don't need the new functionality. Also, returning a
concrete class from the method like entryLinkedSet really limits
future refactoring capabilities. It's possible to give up any visible
changes in LinkedHashMap restricting this proposal to LinkedHashSet
only. This is acceptable, though, to me, it looks incomplete. On the
other hand, such kind of solution could be enhanced later to approach
#3.

The second proposal suggests reusing an existing interface that
provides a view. It's true that I suggested asDeque() for
LinkedHashSet in Twitter. We could also provide similar methods to
LinkedHashMap, like keySetAsDeque() and/or entrySetAsDeque(). This
provides symmetrical functionality for both LinkedHashMap and
LinkedHashSet without introducing any new interfaces. Why I'm not so
enthusiastic about this proposal anymore? First, unlike Collection,
Deque doesn't specify that some operations could be optional. E.g.
Collection::add is specified to throw UnsupportedOperationException if
it's not supported, but Deque::add omits this possibility. So not
implementing certain modification methods looks like a violation of
Deque contract. Probably that's why we don't have
Collections.unmodifiableDeque(). We could wire the Deque::add method
to LinkedHashSet::add but it would behave strange because we cannot
always add the new element if it already exists. And we cannot even
return false from Deque::add because it's specified to return true.
Finally, we cannot implement Deque::add for
LinkedHashMap::keySetAsDeque view because we don't know the
corresponding value to put. We could refine the Deque spec to allow
unmodifiable or remove-only deque, but I agree with Paul that it
sounds odd.

Another point against Deque is that it doesn't contain methods to
iterate from a given element, so we should either give up this part of
proposal or add new methods to Deque, providing a default O(n)
implementation. They are less natural for generic Deque because it may
contain repeating elements, so they should be like
Deque::iteratorFromFirstOccurrence, rather than simply
Deque::iteratorFrom. All of this makes me think that Deque is a poor
choice for my proposal.

That said, adding a reversed() method to Deque providing a default
implementation sounds a good idea to me. It's independent of my
proposal and could be done separately. I can work on this if everybody
agrees that it could be a useful addition.

Why do I like adding a new interface like OrderedSet?
- It allows having a uniform implementation of LinkedHashSet,
LinkedHashMap::keySet and LinkedHashMap::entrySet
- It allows documenting the API via type system that an ordered set is
expected/returned. How many times did we see that the program behavior
depends on HashSet/HashMap iteration order? This often occurs if the
method behavior depends on the iteration order but the client provides
an unordered Set. Having this required on the API level would help to
prevent people from doing such mistakes.
- Existing APIs like TreeSet, TreeMap::navigableKeySet,
ConcurrentSkipListSet can also implement the suggested OrderedSet.
Most of APIs except iteratorFrom/descendingIteratorFrom already exist
in NavigableSet and for these two default implementation is possible.
- All methods from the new interface could be implemented. Unlike
possible linkedHashSet.asDeque().addFirst(), etc. we don't add any
APIs that actually don't work.
OrderedMap is probably less useful, but having it for symmetry would
be nice. Though we could omit it or add it later. Implementing
OrderedSet without OrderedMap doesn't prevent us from adding
OrderedMap later.

Regarding a SortedSet. I treat it as a historical artifact. Probably
it would be even good to deprecate it. JDK doesn't provide any
SortedSet implementations that don't implement NavigableSet except
wrappers like Collections::unmodifiableSortedSet. It doesn't provide
descendingIterator, and default implementation is impossible in terms
of other SortedSet methods (except copying the whole set and iterating
through the copy). It's a pity that we have this interface and it's a
pity that it cannot extend the proposed OrderedSet but we can live
with this, given that people rarely have something that implements
SortedSet but not implements NavigableSet.

I agree that adding a new interface to java.util requires some high
bar, but here, I believe, it would be useful enough.

There's also the fourth approach: create a new interface for
bidirectional traversal only. This is close to the thing described by
Paul as "a new form Iterable supporting access to first + last
elements". Probably something like (the name is TBD, of course):

interface Walkable<E> extends Iterable<E> {
  E first();
  E last();
  Walkable<E> reversed(); // reversed().iterator() can be used instead
of descendingIterator(), and it's for-each loop friendly!
  Iterator<E> iteratorFromFirstOccurrence(E e);
  // inherit iterator(), spliterator(), forEach() from Iterable
}

It doesn't have any mutating methods (well, except Iterator::remove,
it could be discussed whether it should be specified as throws UOE()
when created from Walkable) and has no direct relation to sets, so
could be used in other contexts as well. I like it less, but this
could be discussed as well.

With best regards,
Tagir Valeev

On Wed, Apr 29, 2020 at 6:58 AM Stuart Marks <stuart.marks at oracle.com> wrote:
>
> Hi Paul,
>
> I too hesitate about adding Ordered* interfaces. As I said previously, I don't
> think they're very useful, and they do seem rather Queue- or Deque-like.
>
> Indeed, Tagir was musing the other day on Twitter about a Deque view of
> LinkedHashSet and possibly LinkedHashMap. I think that might be a more
> worthwhile direction to pursue. Providing Deque views is more in keeping with
> the current collections idioms than making LHS/LHM implement Deque directly.
> This is rather like Map providing entry and key Set views and values Collection
> views.
>
> Note that these "view collections" provide live views of the backing collection
> and that they aren't necessarily read-only. They support iteration in various
> forms (e.g., streaming), and they can also permit removal (but not addition).
> This is a bit odd but it can come in handy.
>
> I'm thinking that the form of Iterable that supports access to first/last
> elements, and iteration in either order, is in fact Deque itself. Deque already
> supports descendingIterator(), which is a useful mechanism, but it's somewhat
> inconvenient: you can't use it with enhanced-for, stream, or toArray. It would
> be good to have a reversed-view Deque that flips the order of all the
> operations. That avoids having to provide descendingStream(),
> descendingToArray(), etc., and it adapts to anything that consumes an Iterable
> or Collection.
>
> (It occurs to me that we're missing a Collections.unmodifiableDeque wrapper.)
>
> The LHS/LHM views, and various Deque wrappers, should all be pretty simple, with
> the typical bunch of one-liner methods. (With the possible exception of
> toArray(T[]).) Whether they are sufficiently useful, though, is still an open
> question.
>
> s'marks
>
> On 4/28/20 10:59 AM, Paul Sandoz wrote:
> > Hi Tagir,
> >
> > I am hesitant to add new interfaces specific to Set/Map for a non-indexed encounter order.
> >
> > The interface OrderedSet is tantalizingly queue-like, in fact rather close to the read-only part of Deque.  Did you consider the possibility of LinkedHashSet implementing Deque?
> >
> > I have not thought about this too deeply, but since LinkedHashMap is implemented as a doubly linked list it might be possible (as LinkedList implements Deque).
> >
> > If so we could also add the method:
> >
> >    LinkedHashSet<Map.Entry<K,V>> LinkedHashMap.entryLinkedSet()
> >
> > and thereby avoid the addition of any new interfaces, although it might require some refinement to the specification of Deque; a new notion of an unmodifiable queue, which I admit is a little odd, and as a result is probably not be a good idea.
> >
> > More generally I think the core abstraction is really a new form Iterable supporting access to first + last elements and reverse iteration (and possibly iteration starting from a containing element).
> >
> > Paul.
> >
> >> On Apr 25, 2020, at 9:51 AM, Tagir Valeev <amaembo at gmail.com> wrote:
> >>
> >> Hello!
> >>
> >> To me, it was always a pity that the internal structure of
> >> LinkedHashMap allows doing more things easily, yet this functionality
> >> is not exposed to the external clients, so if somebody really needs
> >> such things they will need to waste memory/CPU to emulate them. The
> >> functionality I'm speaking about includes:
> >> - getting the last (i.e. most recently added/accessed) entry
> >> - getting the descending iterator going from newest to oldest entries
> >> - getting the iterator that starts at given existing entry
> >>
> >> It's easy to do such things with TreeMap and other NavigableMap
> >> implementations. However, LinkedHashMap provides no additional visible
> >> methods in addition to the Map interface (well, except
> >> removeEldestEntry()). The same considerations apply to LinkedHashSet.
> >>
> >> So we can provide new interfaces j.u.OrderedMap and OrderedSet that
> >> provide the additional functionality and make
> >> LinkedHashMap/LinkedHashSet implement them. Aside from new
> >> functionality, the interfaces will carry additional semantics: their
> >> iteration order is always predictable and their spliterators return
> >> DISTINCT|ORDERED characteristics, so some clients may require
> >> OrderedMap/Set just to assert that they rely on predictable iteration
> >> order. We can even make existing NavigableMap/Set extend
> >> OrderedMap/Set because NavigableMap/Set are ordered. Some of
> >> NavigableMap/Set methods will naturally fit the OrderedMap/Set
> >> interfaces.
> >>
> >> I think I can devote my spare time to write the spec, implementation,
> >> and tests for this enhancement, participate in the API design
> >> discussions and do any other job necessary to make this improvement
> >> happen. However, I'd like to know in advance whether such kind of
> >> change would be met positively. What do you think? Is it possible to
> >> do this or this improvement contradicts the OpenJDK goals and will
> >> likely be rejected? Also, do I need to write a JEP for this?
> >>
> >> It's probably too early to discuss the exact API, but for more
> >> specificity of my proposal, here's the quick draft:
> >>
> >> package java.util;
> >>
> >> public interface OrderedSet<E> extends Set<E> {
> >>   // First or NSEE if empty
> >>   E first() throws NoSuchElementException;
> >>   // Last or NSEE if empty
> >>   E last() throws NoSuchElementException;
> >>   // Retrieve&remove first or null if empty
> >>   E pollFirst();
> >>   // Retrieve&remove last or null if empty
> >>   E pollLast();
> >>   // Iterator starting from element, or NSEE if not found
> >>   Iterator<E> iteratorFrom(E fromElement) throws NoSuchElementException;
> >>   // Descending
> >>   Iterator<E> descendingIterator();
> >>   // Descending iterator starting from element, or NSEE if not found
> >>   Iterator<E> descendingIteratorFrom(E fromElement) throws
> >> NoSuchElementException;
> >>   // TBD: descendingSet()
> >> }
> >>
> >> All the methods except iteratorFrom and descendingIteratorFrom are
> >> already present in NavigableSet, so the spec will be mostly copied
> >> from there. For iteratorFrom and descendingIteratorFrom it's easy to
> >> provide a default implementation in NavigableSet. All methods can be
> >> implemented easily in LinkedHashSet and LinkedHashMap::keySet
> >>
> >> package java.util;
> >>
> >> public interface OrderedMap<K, V> extends Map<K, V> {
> >>   // First entry or null if empty
> >>   Map.Entry<K,V> firstEntry();
> >>   // Last entry or null if empty
> >>   Map.Entry<K,V> lastEntry();
> >>   // Retrieve&remove first entry or null if empty
> >>   Map.Entry<K,V> pollFirstEntry();
> >>   // Retrieve&remove last entry or null if empty
> >>   Map.Entry<K,V> pollLastEntry();
> >>   // First key or NSEE if empty
> >>   K firstKey();
> >>   // Last key or NSEE if empty
> >>   K lastKey();
> >>   // Ordered keySet()
> >>   OrderedSet<K> orderedKeySet();
> >>   // TBD: OrderedSet<K> descendingKeySet();
> >>   // TBD: OrderedSet<Entry<K,V>> orderedEntrySet();
> >> }
> >>
> >> All the methods except orderedKeySet() are already present in
> >> NavigableMap, and the orderedKeySet() is just an alias for
> >> navigableKeySet(). Also, all of them can be implemented easily in
> >> LinkedHashMap.
> >>
> >> Any feedback is welcome.
> >>
> >> Thank you in advance,
> >> Tagir Valeev.
> >


More information about the core-libs-dev mailing list