<div dir="ltr"><div dir="ltr">I understand that null is prohibited by default, but can we also provide a set of factory methods that accept null?<div>They can be named like List.ofNullable/copyOfNullable.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 31, 2023 at 10:13 AM Stuart Marks <<a href="mailto:stuart.marks@oracle.com">stuart.marks@oracle.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">In this reply I'll focus on the null handling issues in collections.<br>
<br>
As you've noted, there are really (at least) two distinct issues here: whether a <br>
collection permits nulls, and the behavior of contains(null) queries. There have <br>
been continual complaints about both, and the issues are somewhat distinct.<br>
<br>
The collection implementations in the JDK are all over the place with regard to <br>
nulls. I believe this is because the original collections and the concurrent <br>
collections up through JDK 1.6 or so were mostly done by Josh Bloch and Doug Lea, <br>
who disagreed about null handling. They had this exchange in 2006:<br>
<br>
Doug Lea wrote: [1]<br>
<br>
> The main reason that nulls aren't allowed in ConcurrentMaps<br>
> (ConcurrentHashMaps, ConcurrentSkipListMaps) is that<br>
> ambiguities that may be just barely tolerable in non-concurrent<br>
> maps can't be accommodated. The main one is that if<br>
> map.get(key) returns null, you can't detect whether the<br>
> key explicitly maps to null vs the key isn't mapped.<br>
> In a non-concurrent map, you can check this via map.contains(key),<br>
> but in a concurrent one, the map might have changed between calls.<br>
> <br>
> Further digressing: I personally think that allowing<br>
> nulls in Maps (also Sets) is an open invitation for programs<br>
> to contain errors that remain undetected until<br>
> they break at just the wrong time. (Whether to allow nulls even<br>
> in non-concurrent Maps/Sets is one of the few design issues surrounding<br>
> Collections that Josh Bloch and I have long disagreed about.)<br>
<br>
<br>
Josh Bloch wrote: [2]<br>
<br>
> I have moved towards your position over the years. It was probably a<br>
> mistake to allow null keys in Maps and null elements in Sets. I'm<br>
> still not sure about Map values and List elements.<br>
> <br>
> In other words, Doug hates null more than I do, but over the years<br>
> I've come to see it as quite troublesome.<br>
<br>
<br>
(I haven't talked to Josh or Doug about this particular issue recently, so I suppose <br>
it's possible their opinions have changed in the intervening time.)<br>
<br>
I had to decide what to do about nulls when I added the unmodifiable collections in <br>
JDK 9. Given that Doug "hates" nulls, and Josh finds them at least quite <br>
troublesome, I started from a position of disallowing null members in all the new <br>
collections. Most of the unmodifiable collections are still this way (but see below).<br>
<br>
What about contains(null)? Surely it should be ok to query for a null, and return <br>
false, even if the collection itself can't contain null. However, a bunch of <br>
collections and collection views in the JDK throw NPE on contains(null), so the <br>
unmodifiable collections don't set a precedent here.<br>
<br>
If you're dealing with all non-null values, and a null creeps in somehow, then <br>
calling contains(null) might be an error, and it would be good for the library to <br>
inform you of that instead of just returning false and letting the program continue <br>
until it fails later (fail-fast). A similar issue arises with the non-generic <br>
contains() method. If you have a Collection<Integer>, then shouldn't contains("abc") <br>
be an error? It's not, and it simply returns false, because the signature is <br>
contains(Object). But there have been complaints that this should have been an error <br>
because a Collection of Integer cannot possibly contain a String. Similar reasoning <br>
applies to contains(null) on a collection that can't contain null. (Yes, the error <br>
occurs at runtime instead of compile time, but better late than never.)<br>
<br>
A meta-issue is: should a new thing correct a perceived design flaw in the older <br>
thing, or should it be consistent with the old thing? Either way, you lose.<br>
<br>
Another factor in my original decision was that it's much easier to start with a <br>
restriction and relax it later than it is to go the other way. We have some <br>
flexibility to allow nulls and contains(null) if we want; but it's essentially a <br>
non-starter to disallow nulls once they've been allowed somewhere.<br>
<br>
That's why my starting position for the design prohibited nulls everywhere.<br>
<br>
Over time we've relaxed some restrictions around null handling in the unmodifiable <br>
collections. Streams permit nulls, and the List returned from Stream.toList() also <br>
permits nulls, and it also allows contains(null). So there's a different <br>
unmodifiable List implementation under there, accessible only via streams. (Note <br>
that the null-permitting unmodifiable lists lose the optimizations for sizes 0, 1, <br>
and 2, which are in part based on nulls being disallowed.)<br>
<br>
I'm fairly sympathetic to the case for changing the unmodifiable collections to <br>
allow contains(null), given that the current NPE-throwing behavior has generated a <br>
fair amount of complaints over the years. And the situation with having two flavors <br>
of unmodifiable list is quite odd and is an uncomfortable point in the design space.<br>
<br>
I should point out however that even if the unmodifiable collections were changed to <br>
allow contains(null), it's still kind of unclear as to whether this code is safe:<br>
<br>
public void addShoppingItems(Collection<String> shoppingItems) {<br>
if (shoppingItems.contains(null)) { // this looks quite reasonable and safe...<br>
throw new IllegalArgumentException("shoppingItems should not contain nulls");<br>
}<br>
...<br>
}<br>
<br>
If you're only ever passing ArrayList, Arrays.asList(), and List.of() lists here, <br>
then sure, this would work great. If the source collection is of unknown provenance, <br>
you can still run into trouble:<br>
<br>
1. contains(null) calls on the keySet and values collections of ConcurrentHashMap <br>
and Hashtable all throw NPE. So does contains(null) on a set from <br>
Collections.newSetFromMap(), which is built from those maps' keySets. (I suppose <br>
Doug Lea might be convinced to relax the CHM behavior.)<br>
<br>
2. Speaking of concurrency, if the source collection is being modified concurrently, <br>
then a null could be introduced after the check (TOCTOU) so you'd need to make a <br>
defensive copy before checking it.<br>
<br>
3. NavigableSet and friends delegate contains() to the comparison method, which <br>
might or might not accept nulls. There's no way the collection implementation can <br>
tell in advance.<br>
<br>
You might consider List.copyOf() as an alternative, as that makes a null-free <br>
defensive copy in one shot, and avoids copying if the source is an unmodifiable <br>
list. If you don't care about concurrency, then<br>
<br>
shoppingItems.forEach(Objects::requireNonNull)<br>
<br>
might be suitable.<br>
<br>
If unmodifiable collections were to change to allow contains(null), this probably <br>
would make them somewhat less annoying, but it wouldn't necessarily solve yours or <br>
everyone's problems. There's simply no way at this point to make the collections' <br>
treatment of nulls uniform.<br>
<br>
s'marks<br>
<br>
<br>
[1] <br>
<a href="https://web.archive.org/web/20210510190135/https://cs.oswego.edu/pipermail/concurrency-interest/2006-May/002485.html" rel="noreferrer" target="_blank">https://web.archive.org/web/20210510190135/https://cs.oswego.edu/pipermail/concurrency-interest/2006-May/002485.html</a><br>
<br>
[2] <br>
<a href="https://web.archive.org/web/20200930213909/http://cs.oswego.edu/pipermail/concurrency-interest/2006-May/002486.html" rel="noreferrer" target="_blank">https://web.archive.org/web/20200930213909/http://cs.oswego.edu/pipermail/concurrency-interest/2006-May/002486.html</a><br>
<br>
<br>
<br>
<br>
On 1/29/23 7:28 AM, John Hendrikx wrote:<br>
> TLDR; why does contains(null) not just return false for collections that don't allow <br>
> nulls. Just because the interface allows it, does not mean it should do it as it <br>
> devalues the usefulness of the abstraction provided by the interface.<br>
> <br>
> Background:<br>
> <br>
> I'm someone that likes to check correctness of any constructor or method parameter <br>
> before allowing an object to be constructed or to be modified, in order to maintain <br>
> invariants that are provided by the class to its users.<br>
> <br>
> This ranges from simple null checks, to range checks on numeric values, to complete <br>
> checks like "is a collection sorted" or is a list of nodes acyclic. Anything I can <br>
> check in the constructor that may avoid problems further down the line or that may <br>
> affect what I can guarantee on its own API methods. For example, if I check in the <br>
> constructor that something is not null, then the associated getter will guarantee <br>
> that the returned value is not null. If I check that a List doesn't contain nulls, <br>
> then the associated getter will reflect that as well (assuming it is immutable or <br>
> defensivily copied).<br>
> <br>
> For collections, this is currently becoming harder and harder because more and more <br>
> new collections are written to be null hostile. It is fine if a collection doesn't <br>
> accept nulls, but throwing NPE when I ask such a collection if it contains null <br>
> seems to be counter productive, and reduces the usefulness of the collection <br>
> interfaces.<br>
> <br>
> This strict interpretation makes the collection interfaces harder to use than <br>
> necessary. Interfaces are only useful when their contract is well defined. The more <br>
> things an interface allows or leaves unspecified, the less useful it is for its users.<br>
> <br>
> I know that the collection interfaces allow this, but you have to ask yourself this <br>
> important question: how useful is an interface that makes almost no guarantees about <br>
> what its methods will do? Interfaces like `List`, `Map` and `Set` are passed as <br>
> method parameters a lot, and to make these useful, implementations of these <br>
> interfaces should do their best to provide as consistent an experience as is <br>
> reasonably possible. The alternative is that these interfaces will slowly decline in <br>
> usefulness as methods will start asking for `ArrayList` instead of `List` to avoid <br>
> having to deal with a too lenient specification.<br>
> <br>
> With the collection interfaces I get the impression that recently there has been too <br>
> much focus on what would be easy for the collection implementation instead of what <br>
> would be easy for the users of said interfaces. In my opinion, the concerns of the <br>
> user of interfaces almost always far outweigh the concerns of the implementors of <br>
> said interfaces.<br>
> <br>
> In the past, immutable collections were rare, but they get are getting more and more <br>
> common all the time. For example, in unit testing. Unfortunately, these immutable <br>
> collections differ quite a bit from their mutable counterparts. Some parts are only <br>
> midly annoying (not accepting `null` as the **value** in a `Map` for example), but <br>
> other parts require code to be rewritten for it to be able to work as a drop-in <br>
> replacement for the mutable variant. A simple example is this:<br>
> <br>
> public void addShoppingItems(Collection<String> shoppingItems) {<br>
> if (shoppingItems.contains(null)) { // this looks quite reasonable and <br>
> safe...<br>
> throw new IllegalArgumentException("shoppingItems should not contain <br>
> nulls");<br>
> }<br>
> <br>
> this.shoppingItems.addAll(shoppingItems);<br>
> }<br>
> <br>
> Testing this code is annoying:<br>
> <br>
> x.addShoppingItems(List.of("a", null")); // can't construct immutable <br>
> collection with null<br>
> <br>
> x.addShoppingItems(Arrays.asList("a", null")); // fine, go back to what we <br>
> did before then...<br>
> <br>
> The above problems, I suppose we can live with it; immutable collections don't want <br>
> `null` although I don't see any reason to not allow it as I can write a similar <br>
> `List.of` that returns immutable collections that do allow `null`. For JDK code this <br>
> is a bit disconcerting, as it is supposed to be as flexible as is reasonable without <br>
> having too much of an opinion about what is good or bad.<br>
> <br>
> This next one however:<br>
> <br>
> assertNoExceptionThrown(() -> x.addShoppingItems(List.of("a", "b"))); // not <br>
> allowed, contains(null) in application code throws NPE<br>
> <br>
> This is much more insidious; the `contains(null)` check has been a very practical <br>
> way to check if collections contain null, and this works for almost all collections <br>
> in common use, so there is no problem. But now, more and more collections are <br>
> starting to just throw NPE immediately even just to **check** if a null element is <br>
> present. This only serves to distinguish themselves loudly from other similar <br>
> collections that will simply return `false`.<br>
> <br>
> I think this behavior is detrimental to the java collection interfaces in general, <br>
> especially since there is a clear answer to the question if such a collection <br>
> contains null or not. In fact, why `contains` was ever allowed to throw an exception <br>
> aside from "UnsupportedOperationException" is a mystery to me, and throwing one when <br>
> one could just as easily return the correct and expected answer seems very opiniated <br>
> and almost malicious -- not behavior I'd expect from JDK core libraries.<br>
> <br>
> Also note that there is no way to know in advance if `contains(null)` is going to be <br>
> throwing the NPE. If interfaces had a method `allowsNulls()` that would already be <br>
> an improvement.<br>
> <br>
> --John<br>
> <br>
> <br>
</blockquote></div></div>