Proposal: Collection mutability marker interfaces

John Hendrikx john.hendrikx at gmail.com
Fri Aug 26 21:04:48 UTC 2022


On 26/08/2022 18:54, Ethan McCue wrote:
> If the collections would decide whether or not to copy, I don't think 
> just requesting an immutable reference would be enough.
>
>     static <E> List<E> listCopy(Collection<? extends E> coll) {
>         if (coll instanceof List12 || (coll instanceof ListN && !((ListN<?>)coll).allowNulls)) {
>             return (List<E>)coll;
>         } else {
>             return (List<E>)List.of(coll.toArray()); // implicit nullcheck of coll
>         }
>     }
>
> The two things that List.copyOf needs to know are that the list is 
> immutable, but also that it isn't a variant that might contain a null.

I really don't care about the null problem; that's a problem the 
designers basically brought upon themselves, not one that stems from any 
inherent limitation: nothing prevents an immutable collection from 
containing `null`. What irks me even more is that the `List` interface 
provides no way to determine whether an implementation is actively 
null-hostile, meaning that this code is no longer safe (or, strictly 
speaking, never really was safe, given the rather weak guarantees made by 
the `List` interface):

       List<?> aList = ... ; // a list from somewhere

       // this is unsafe, and will cause an NPE depending on the list type
       if (aList.contains(null)) throw new IllegalArgumentException();

This unfortunate choice was never that visible, but since `List.of` was 
introduced it occurs more frequently in standard code, and it highlights 
that a leniently specified interface is mostly a useless interface.
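
To make the hazard concrete, here is a minimal sketch. It assumes an 
OpenJDK runtime (9 or later), where the lists returned by `List.of` throw 
when asked `contains(null)`. The class name `NullHostileDemo` and the 
helper `containsNull` are made up for this illustration; the helper 
sidesteps the issue by iterating instead of calling `contains(null)`.

```java
import java.util.ArrayList;
import java.util.List;

public class NullHostileDemo {

    /**
     * Hypothetical helper: checks for a null element by iterating, so it
     * never passes null to the list's own contains() implementation.
     */
    public static boolean containsNull(List<?> list) {
        for (Object element : list) {
            if (element == null) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> lenient = new ArrayList<>(List.of("a", "b"));
        List<String> hostile = List.of("a", "b");

        // ArrayList tolerates the query and simply answers "no".
        System.out.println(lenient.contains(null));

        // The List.of implementation rejects the query itself.
        try {
            hostile.contains(null);
        } catch (NullPointerException e) {
            System.out.println("contains(null) threw NPE");
        }

        // The iterating helper works for both kinds of list.
        System.out.println(containsNull(hostile));
    }
}
```

The cost is a linear scan even for implementations that could answer 
faster, which is part of why the lenient specification is annoying.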

So, I don't see the reason to jump through hoops to use the same type of 
`List` that `List.of` or `List.copyOf` returns.  All that is required is 
that an immutable list is returned, which can be as simple as:

       return Collections.unmodifiableList(clone());

Or:

       return Collections.unmodifiableList(new ArrayList<>(this));

Or, if the list is already wrapped in the immutable wrapper, simply 
`return this`.
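
As a rough sketch of how these options could fit together (all names 
here are hypothetical, nothing below exists in the JDK): a private 
wrapper type plays the role of the "immutable wrapper", and a static 
`immutableCopy` helper stands in for the proposed instance method. It 
copies unless the argument is already such a wrapper, and it allows 
`null` elements.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

public class ImmutableCopySketch {

    /**
     * Hypothetical wrapper. AbstractList rejects add/set/remove by default,
     * and the backing list is never handed out, so the contents cannot
     * change once the wrapper has been created.
     */
    private static final class ImmutableWrapper<E> extends AbstractList<E> {
        private final List<E> backing;

        ImmutableWrapper(List<E> backing) {
            this.backing = backing;
        }

        @Override
        public E get(int index) {
            return backing.get(index);
        }

        @Override
        public int size() {
            return backing.size();
        }
    }

    /** Returns a shallow immutable copy, or the argument itself if already wrapped. */
    public static <E> List<E> immutableCopy(List<E> source) {
        if (source instanceof ImmutableWrapper) {
            return source; // the "return this" case: no copy needed
        }
        return new ImmutableWrapper<>(new ArrayList<>(source));
    }
}
```

Calling `immutableCopy` on its own result returns the same instance, so 
repeated defensive copies cost nothing after the first.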

>
> So maybe instead of
>
>      List<T> y = x.immutableCopy();
>
> It could be appropriate to use the spliterator approach and request a 
> copy which has certain characteristics.
>
>     static <E> List<E> listCopy(Collection<? extends E> coll) {
>         if (coll instanceof List<?> list) {
>             return list.copyWhere(EnumSet.of(IMMUTABLE, DISALLOW_NULLS));
>         } else {
>             return (List<E>)List.of(coll.toArray()); // implicit nullcheck of coll
>         }
>     }
>
> but that leaves open whether you would want to request the *presence* 
> of capabilities or the *absence* of them.
>
> Maybe
>
>     List.of().copyWhere();
>
> Could be defined to give a list where it is immutable and nulls aren't 
> allowed. And then
>
>    List.of(1, 2, 3).copyWhere(EnumSet.of(ADDABLE, NULLS_ALLOWED));
>
> gives you a mutable copy where nulls are allowed.
>
> This still does presume that making a copy if a capability isn't 
> present is the only use of knowing the capabilities - which from the 
> conversation so far isn't that unrealistic

I fear there are too many possibilities here to cover all the use cases 
one could think of: Appendable, Prependable, Insertable, Removable, 
Poppable, HeadRemovable(?), Permutable, Replaceable, just to name a few.  
Making a copy to obtain a modifiable version seems sufficient, and if 
that causes performance issues a custom solution is probably in order 
(like a wrapper around an actual list that only allows specific 
functionality, for example one that implements Appendable<T>).

Perhaps with a method (or constructor) of the form:

       <E, L extends List<E> & Appendable<E>> void giveMeAnAppendableList(L appendable);
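
A minimal sketch of that idea, using an intersection bound so the method 
only accepts lists that also advertise the capability. Everything here 
is hypothetical: the nested `Appendable<E>` interface is not 
`java.lang.Appendable`, and `AppendableArrayList` and `appendAll` are 
invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class AppendableSketch {

    /** Hypothetical capability interface (unrelated to java.lang.Appendable). */
    public interface Appendable<E> {
        void append(E element);
    }

    /** Hypothetical list implementation that advertises the capability. */
    public static final class AppendableArrayList<E> extends ArrayList<E>
            implements Appendable<E> {
        @Override
        public void append(E element) {
            add(element);
        }
    }

    /**
     * Accepts only arguments that are both a List<E> and an Appendable<E>;
     * a plain ArrayList or a List.of(...) result is rejected at compile time.
     */
    public static <E, L extends List<E> & Appendable<E>> void appendAll(
            L target, List<? extends E> extra) {
        for (E element : extra) {
            target.append(element);
        }
    }
}
```

With this shape the capability check happens in the type system at the 
call site, at the price of the caller having to hold the list under a 
type that implements both interfaces.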

--John

>
> On Fri, Aug 26, 2022 at 11:20 AM John Hendrikx 
> <john.hendrikx at gmail.com> wrote:
>
>
>     On 24/08/2022 15:38, Ethan McCue wrote:
>>     A use case that doesn't cover is adding to a collection.
>>     Say as part of a method's contract you state that you take
>>     ownership of a List. You aren't going to copy even if the list is
>>     mutable.
>>
>>     Later on, you may want to add to the list. Add is supported on
>>     ArrayList so you don't need to copy and replace your reference,
>>     but you would if the list you were given was made with List.of or
>>     Arrays.asList
>
>     I don't think this is a common enough use case that should be
>     catered for.  It might be better handled with concurrent lists
>     instead.
>
>     The most common use case by far is wanting to make sure a
>     collection you've received is not going to be modified while you
>     are working with it.  I don't think another proposal which does
>     cover the most common cases should be dismissed out of hand
>     because it doesn't support a rather rare use case.
>
>     --John
>
>
>>
>>     On Wed, Aug 24, 2022, 8:13 AM John Hendrikx
>>     <john.hendrikx at gmail.com> wrote:
>>
>>         Would it be an option to not make the receiver responsible
>>         for the decision whether to make a copy or not?  Instead put
>>         this burden (using default methods) on the various collections?
>>
>>         If List/Set/Map had a method like this:
>>
>>              // returns a (shallow) immutable copy if the list is mutable
>>              // (basically always copies, unless proven otherwise)
>>              List<T> immutableCopy();
>>
>>         Paired with methods on Collections to prevent collections
>>         from being modified:
>>
>>              Collections.immutableList(List<T>)
>>
>>         This wrapper is similar to `unmodifiableList` except it
>>         implements `immutableCopy` as `return this`.
>>
>>         Then for the various scenarios, where `x` is an untrusted
>>         source of List with unknown status:
>>
>>              // Create a defensive copy; result is a private list
>>              // that cannot be modified:
>>
>>              List<T> y = x.immutableCopy();
>>
>>              // Create a defensive copy for sharing, promising it
>>              // won't ever change:
>>
>>              List<T> y = Collections.immutableList(x.immutableCopy());
>>
>>              // Create a defensive copy for mutating:
>>
>>              List<T> y = new ArrayList<>(x); // same as always
>>
>>              // Create a mutable copy, modify it, then expose as
>>              // immutable:
>>
>>              List<T> y = new ArrayList<>(x); // same as always
>>
>>              y.add( <some element> );
>>
>>              List<T> z = Collections.immutableList(y);
>>
>>              // we promise `z` won't change again by clearing the
>>              // only path to mutating it!
>>              y = null;
>>
>>         The advantage would be that this information isn't part of
>>         the type system where it can easily get lost. The actual
>>         implementation knows best whether a copy must be made or not.
>>
>>         Of course, the immutableList wrapper can be used incorrectly
>>         and the promise here can be broken by keeping a reference to
>>         the original (mutable) list, but I think that's an acceptable
>>         trade-off.
>>
>>         --John
>>
>>         PS. Chosen names are just for illustration; there is some
>>         discussion as to what "unmodifiable" vs "immutable" means in
>>         the context of collections that may contain elements that are
>>         mutable. In this post, immutable refers to shallow immutability.
>>
>>         On 24/08/2022 03:24, Ethan McCue wrote:
>>>         Ah, I'm an idiot.
>>>
>>>         There is still a proposal here somewhere...maybe. right now
>>>         non jdk lists can't participate in the special casing?
>>>
>>>         On Tue, Aug 23, 2022, 9:00 PM Paul Sandoz
>>>         <paul.sandoz at oracle.com> wrote:
>>>
>>>             List.copyOf already does what you want.
>>>
>>>             https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/List.java#L1068
>>>             https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/ImmutableCollections.java#L168
>>>
>>>             Paul.
>>>
>>>             > On Aug 23, 2022, at 4:49 PM, Ethan McCue
>>>             <ethan at mccue.dev> wrote:
>>>             >
>>>             > Hi all,
>>>             >
>>>             > I am running into an issue with the collections
>>>             framework where I have to choose between good semantics
>>>             for users and performance.
>>>             >
>>>             > Specifically I am taking a java.util.List from my
>>>             users and I need to choose to either
>>>             > * Not defensively copy and expose a potential footgun
>>>             when I pass that List to another thread
>>>             > * Defensively copy and make my users pay an
>>>             unnecessary runtime cost.
>>>             >
>>>             > What I would really want, in a nutshell, is for
>>>             List.copyOf to be a no-op when used on lists made with
>>>             List.of().
>>>             >
>>>             > Below the line is a pitch I wrote up on reddit 7
>>>             months ago for a mechanism I think could accomplish
>>>             that. My goal is to share the idea a bit more widely and
>>>             to this specific audience to get feedback.
>>>             >
>>>             >
>>>             https://www.reddit.com/r/java/comments/sf8qrv/comment/hv8or92/?utm_source=share&utm_medium=web2x&context=3
>>>
>>>             >
>>>             > Important also for context is Ron Pressler's comment
>>>             above.
>>>             > --------------
>>>             >
>>>             > What if the collections api added more marker
>>>             interfaces like RandomAccess?
>>>             >
>>>             > It's already a common thing for codebases to make
>>>             explicit null checks at error boundaries because the
>>>             type system can't encode null | List<String>.
>>>             >
>>>             > This feels like a similar problem.
>>>             > If you have a List<T> in the type system then you
>>>             don't know for sure you can call any methods on it until
>>>             you check that it's not null. In the same way, there is a
>>>             set of methods that you don't know at the type/interface
>>>             level if you are allowed to call.
>>>             >
>>>             > If the List is actually a __      | Then you can definitely call | And you know other reference holders might call | And you can confirm it's this case by
>>>             > ----------------------------------|------------------------------|-------------------------------------------------|--------------------------------------
>>>             > null                              | no methods                   | no methods                                      | list == null
>>>             > List.of(...)                      | get, size                    | get, size                                       | ???
>>>             > Collections.unmodifiableList(...) | get, size                    | get, size, add, set                             | ???
>>>             > Arrays.asList(...)                | get, size, set               | get, size, set                                  | ???
>>>             > new ArrayList<>()                 | get, size, add, set          | get, size, add, set                             | ???
>>>             >
>>>             > While there is no feasible way to encode these
>>>             things in the type system, it's not impossible to encode
>>>             them at runtime.
>>>             >
>>>             > interface FullyImmutable {
>>>             >     // So you know the existence of this implies the absence
>>>             >     // of the others
>>>             >     default Void cantIntersect() { return null; }
>>>             > }
>>>             >
>>>             > interface MutationCapability {
>>>             >     default String cantIntersect() { return ""; }
>>>             > }
>>>             >
>>>             > interface Addable extends MutationCapability {}
>>>             > interface Settable extends MutationCapability {}
>>>             >
>>>             > If the List is actually a __      | Then you can definitely call | And you know other reference holders might call | And you can confirm it's this case by
>>>             > ----------------------------------|------------------------------|-------------------------------------------------|------------------------------------------------
>>>             > null                              | no methods                   | no methods                                      | list == null
>>>             > List.of(...)                      | get, size                    | get, size                                       | instanceof FullyImmutable
>>>             > Collections.unmodifiableList(...) | get, size                    | get, size, add, set                             | !(instanceof Addable) && !(instanceof Settable)
>>>             > Arrays.asList(...)                | get, size, set               | get, size, set                                  | instanceof Settable
>>>             > new ArrayList<>()                 | get, size, add, set          | get, size, add, set                             | instanceof Settable && instanceof Addable
>>>             >
>>>             > In the same way a RandomAccess check lets
>>>             implementations decide whether they want to try an
>>>             alternative algorithm or crash, some marker "capability"
>>>             interfaces would let users of a collection decide if
>>>             they want to clone what they are given before working on it.
>>>             >
>>>             >
>>>             > --------------
>>>             >
>>>             > So the applicability of this would be that the list
>>>             returned by List.of could implement FullyImmutable,
>>>             signifying that there is no caller which might have a
>>>             mutable handle on the collection. Then List.copyOf could
>>>             check for this interface and skip a copy.
>>>             >
>>>             >
>>>