Letting the nulls flow (Was: Exhaustiveness)
Brian Goetz
brian.goetz at oracle.com
Sat Aug 22 17:14:15 UTC 2020
Breaking into a separate thread. I hope we can put this one to bed
once and for all.
> I'm not hostile to that view, but may I ask an honest question: why
> are these semantics better? Do you have examples where it makes sense
> to let null slip through a statement switch? As far as I can see,
> being null-hostile is a good default; it follows the mottos "blow
> early, blow often" and "when in doubt, throw."
Charitably, I think this approach is borne of a belief that, if we keep
the nulls out by posting sentries at the door, we can live an interior
life unfettered by stray nulls. But I think it is also time to
recognize that this approach of "block the nulls at the door" (a)
doesn't actually work, (b) creates sharp edges when the doors move
(which they do, through refactoring), and (c) pushes the problems elsewhere.
(To illustrate (c), just look at the conversation about nulls in
patterns and switch we are having right now! We all came to this
exercise thinking "switch is null-hostile, that's how it's always been,
that's how it must be", and are contorting ourselves to try to come up
with a consistent explanation. But, if we look deeper, we see that
switch is *only accidentally* null-hostile, based on some highly
contextual decisions that were made when adding enum and autoboxing in
Java 5. I'll talk more about that decision in a moment, but my point
right now is that we are doing a _lot_ of work to try to be consistent
with an arbitrary decision that was made in the past, in a specific and
limited context, and probably not with the greatest care. Truly, today's
problems come from yesterday's "solutions." If we weren't careful, an
accidental decision about nulls in enum switch almost polluted the
semantics of pattern matching! That would be terrible! So let's stop
doing that, and let's stop creating new ways for our future selves
to be painted into a corner.)
As background, I'll observe that every time a new context comes up,
someone suggests "we should make it null-hostile." (Closely related: we
should make that new kind of variable immutable.) And, nearly every
time, this ends up being the wrong choice. This happened with Streams;
when we first wrestled with nulls in streams, someone pushed for "Just
have streams throw on null elements." But this would have been
terrible; it would have meant that calculations on null-friendly
domains, that were prepared to engage null directly, simply could not
use streams in the obvious way; calculations like:
    Stream.of(arrayOfStuff)
          .map(Stuff::methodThatMightReturnNull)
          .filter(x -> x != null)
          .map(Stuff::doSomething)
          .collect(toList())
would not be directly expressible, because we would have already NPEed.
Sure, there are workarounds, but for what? Out of a naive hope that, if
we inject enough null checks, no one will ever have to deal with null?
Out of irrational hatred for nulls? Nothing good comes from either of
these motivations.
But, this episode wasn't over. It was then suggested "OK, we can't NPE,
but how about we filter the nulls?" Which would have been worse. It
would mean that, for example, doing a map+toArray on an array might not
have the same size as the initial array -- which would violate what
should be a pretty rock-solid intuition. It would kill all the
pre-sized-array optimizations. It would mean `zip` would have no useful
semantics. Etc etc.
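The size-invariant argument is easy to make concrete. In this sketch (the class name and the `lookup` method are hypothetical, for illustration only), mapping preserves one output per input, while silently filtering nulls changes the element count:

```java
import java.util.Arrays;
import java.util.Objects;

public class NullFilterDemo {
    // Hypothetical lookup that returns null for some inputs.
    static String lookup(String key) {
        return key.startsWith("a") ? key.toUpperCase() : null;
    }

    public static void main(String[] args) {
        String[] keys = { "apple", "banana", "avocado" };

        // Letting the nulls flow preserves one output per input...
        Object[] mapped = Arrays.stream(keys)
                                .map(NullFilterDemo::lookup)
                                .toArray();
        System.out.println(mapped.length);   // 3 -- same size as the input

        // ...while silently dropping nulls breaks that invariant.
        Object[] filtered = Arrays.stream(keys)
                                  .map(NullFilterDemo::lookup)
                                  .filter(Objects::nonNull)
                                  .toArray();
        System.out.println(filtered.length); // 2 -- an element vanished
    }
}
```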
In the end, we came to the right answer for streams, which is "let the
nulls flow". And this was the right choice because Streams is
general-purpose plumbing. The "blow early" bias is about guarding the
gates, and thereby hopefully keeping the nulls from getting into the
house and having wild null parties at our expense. And this works when
the gates are few, fixed, and well marked. But if your language
exhibits any compositional mechanisms (which is our best tool), then
what was the front door soon becomes the middle of the hallway after a
trivial refactoring -- which means that no refactorings are really
trivial. Oof.
We already went through a good example recently where it would be
foolish to try to exclude null (and yet we tried anyway) --
deconstruction patterns. If a constructor
new Foo(x)
can accept null, then a deconstructor
case Foo(var x)
should dutifully serve up that null. The guard-the-gates brigade tried
valiantly to put up new gates at each deconstructor, but that would have
been a foolish place to put such a boundary. I offered an analogy to
having deconstruction reject null over on amber-dev:
> In languages with side-effects (like Java), not all aggregation
> operations are reversible; if I bake a pie, I can't later recover the
> apples and the sugar. But many are, and we like abstractions like
> these (collections, Optional, stream, etc) because they are very
> useful and easily reasoned about. So those that are, should commit to
> the principle. It would be OK for a list implementation to behave
> like this:
>
> Listy list = new Listy();
> list.add(null); // throws NPE
>
> because a List is free to express constraints on its domain. But it
> would be exceedingly bizarre for a list implementation to behave like
> this:
>
> Listy list = new Listy();
> list.add(3); // ok, I like ints
> list.add(null); // ok, I like nulls too
> assertTrue(list.size() == 2); // ok
> assertTrue(list.get(0) == 3); // ok
> assertTrue(list.get(1) == null); // NPE!
>
> If the list takes in nulls, it should give them back.
Now, this is like the first suggested form of null-hostility in streams,
and to everyone's credit, no one suggested exactly that, but what was
suggested was the second, silent form of hostility -- just pretend you
don't see the nulls. And, like with streams, that would have been
silly. So, OK, we dodged the bullet of infecting patterns with special
nullity rules. Whew.
Now, switch. As I mentioned, I think we're here mostly because we are
perpetuating the null biases of the past. In Java 1.0, switches were
only over primitives, so there was no question about nulls. In Java 5,
we added two new reference-typed switch targets: enums and boxes. I
wasn't in the room when that decision was made, but I can imagine how it
went: Java 5 was a *very* full release, and under dramatic pressure to
get out the door. The discussion came up about nulls, maybe someone
even suggested `case null` back then. And I'm sure the answer was some
form of "null enums and primitive boxes are almost always bugs, let's
not bend over backwards and add new complexity to the language (case
null) just to accommodate this bug; let's just throw NPE."
And, given how limited switch was, and the special characteristics of
enums and boxes, this was probably a pragmatic decision, but I think we
lost sight of the subtleties of the context. It is almost certainly
right that 99.999% of the time, a null enum or box is a bug. But this
is emphatically not true when we broaden the type to Object. Since the
context and conditions change, the decision should be revisited before
copying it to other contexts.
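The Java 5 behavior is easy to demonstrate; in this sketch (the `label` method is hypothetical), the switch unboxes its selector, so a null Integer blows up at the switch itself rather than anywhere downstream:

```java
public class SwitchNullDemo {
    // Since Java 5, switching on a box unboxes it first,
    // so a null Integer throws NPE at the switch, not later.
    static String label(Integer i) {
        switch (i) {
            case 1:  return "one";
            default: return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(label(1));   // one
        try {
            label(null);                // unboxing null...
        } catch (NullPointerException e) {
            System.out.println("NPE at the gate");
        }
    }
}
```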
In Java 7, when we added switching on strings, I do remember the
discussion about nulls; it was mostly about "well, there's a precedent,
and it's not worth breaking the precedent even if null strings are more
common than null Integers, and besides, the mandate of Project Coin is
very limited, and `case null` would probably be out of scope." While
this may have again been a pragmatic choice at the time given the
constraints, it further set us down a slippery slope where the
assumption that "switches always throw on null" is set in concrete. But
this assumption is not founded on solid ground.
So, the better way to approach this is to imagine Java had no switch,
and we were adding a general switch today. Would we really be
advocating so hard for "Oooh, another door we can guard, let's stick it
to the nulls there too"? (And, even if we were tempted to, should we?)
The plain fact is that we got away with null-hostility in the first
three forms of reference types in switch because switch (at the time)
was such a weak and non-compositional mechanism, and there are darn few
things it can actually do well. But, if we were designing a
general-purpose switch, with rich labels and enhanced control flow
(e.g., guards) as we are today, where we envisioned refactoring between
switches on nested patterns and patterns with nested switches, this
would be more like a general plumbing mechanism, like streams, and when
plumbing has an opinion about the nulls, frantic calls to the plumber
are not far behind. The nulls must flow unimpeded, because otherwise,
we create new anomalies and blockages like the streams examples I gave
earlier and refactoring surprises. And having these anomalies doesn't
really make life any better for the users -- it actually makes
everything just less predictable, because it means simple refactorings
are not simple -- and in a way that is very easy to forget about.
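The refactoring hazard can be sketched using the record patterns that later shipped in Java 21 (the `Box` record and both methods are hypothetical): extracting a nested binding into its own switch silently changes where nulls blow up.

```java
public class RefactorDemo {
    record Box(Object content) {}

    // Nested-pattern form: the binding lets the null flow through.
    static String viaPattern(Box box) {
        return switch (box) {
            case Box(var c) -> String.valueOf(c);
        };
    }

    // "Trivially" refactored form: switching on the component directly
    // runs into switch's historical null-hostility and throws NPE.
    static String viaNestedSwitch(Box box) {
        Object c = box.content();
        return switch (c) {
            case Integer i -> "int " + i;
            default        -> String.valueOf(c);
        };
    }

    public static void main(String[] args) {
        System.out.println(viaPattern(new Box(null)));  // null
        try {
            viaNestedSwitch(new Box(null));
        } catch (NullPointerException e) {
            System.out.println("NPE after refactoring");
        }
    }
}
```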
If we really could keep the nulls out at the front gate, and thus define
a clear null-free domain to work in, then I would be far more
sympathetic to the calls of "new gates, new guards!" But the gates
approach just doesn't work, and we have ample evidence of this. And the
richer and more compositional we make the language, the more sharp edges
this creates, because old interiors become new gates.
So, back to the case at hand (though we should bring the specifics back
to the case-at-hand thread): what's happening here is our baby switch is
growing up into a general purpose mechanism. And, we should expect it
to take on responsibilities suited to its new abilities.
Now, for the backlash. Whenever we make an argument for
what-appears-to-be relaxing an existing null-hostility, there is much
concern about how the nulls will run free and wreak havoc. But, let's
examine that more closely.
The concern seems to be that, if we let the nulls through the gate,
we'll just get more NPEs, in worse places. Well, we can't get more
NPEs; at most, we can get exactly the same number. But in reality, we
will likely get fewer. There are three cases.
1. The domain is already null-free. In this case, it doesn't make a
difference; no NPEs before, none after.
2. The domain is mostly null-free, but nulls do creep in, we see them
as bugs, and we are happy to get notified. This is the case today with
enums, where a null enum is almost always a bug. Yes, in cases like
this, not guarding the gates means that the bug will get further before
it is detected, or might go undetected. This isn't fantastic, but it
also isn't a disaster, because it is rare and the bug is still likely
to be detected eventually.
3. The domain is at least partially null-tolerant. Here, we are moving
from an always-throw at the gates to a
might-throw-in-the-guts-if-you-forget. But also, there are plenty of
things you can do with a null binding that don't NPE, such as pass it to
a method that deals sensibly with nulls, add it to an ArrayList, print
it, etc. This is a huge improvement, from "must treat null in a
special, out of band way" to "treat null uniformly." At worst, it is no
worse, and often better.
And, when it comes to general purpose domains, #3 is much bigger than
#2. So I think we have to optimize for #3.
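Case 3's "plenty of things you can do with a null binding" is easy to check; everything in this sketch (names are illustrative) handles null without throwing:

```java
import java.util.ArrayList;
import java.util.List;

public class NullTolerantDemo {
    // A method that deals sensibly with nulls.
    static String describe(Object o) {
        return o == null ? "missing" : o.toString();
    }

    public static void main(String[] args) {
        Object binding = null;   // stand-in for a null pattern binding

        List<Object> list = new ArrayList<>();
        list.add(binding);                     // ArrayList accepts null
        System.out.println(list.size());       // 1

        System.out.println(binding);           // prints "null", no NPE
        System.out.println(describe(binding)); // missing
    }
}
```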
Finally, there are those who argue we should "just" have nullable types
(T? and T!), and then all of this goes away. I would love to get there,
but it would be a very long road. But let's imagine we do get there.
OMG how terrible it would be when constructs like lambdas, switches, or
patterns willfully try to save us from the nulls, thus doing the job
(badly) of the type system! We'd have explicitly nullable types for
which some constructs NPE anyway. Or, we'd have to redefine the
semantics of everything in complex ways based on whether the underlying
input types are nullable or not. We would feel pretty stupid for having
created new corners to paint ourselves into.
Our fears of untamed nulls wantonly running through the streets are
overblown. Our attempts to contain the nulls through ad-hoc
gate-guarding have all been failures. Let the nulls flow.