Letting the nulls flow (Was: Exhaustiveness)
Tagir Valeev
amaembo at gmail.com
Sun Aug 23 03:46:35 UTC 2020
Hello!
Some data from the current IntelliJ IDEA codebase.
We have 64 occurrences of this code pattern:
if($x$ == null) {...} // presumably completes abruptly
switch($x$) {...}
Roughly half of them are enum switches and the other half are string switches.
Also, we have 29 occurrences of this code pattern:
if($x$ != null) {
switch($x$) { ... }
...
}
Also, we have one occurrence of this code pattern:
if($x$ == null) {...
} else {
switch($x$) {...}
}
All of them could benefit from a null-friendly switch. By the way, the
null branch is often the same as the default branch (or some other
non-null branch); see the sketch below.
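
For illustration, here is roughly how one of these patterns could
collapse if nulls were allowed into the switch. The names (status,
ACTIVE, handleUnknown, ...) are invented, and the `case null` /
combined-with-default syntax is only a sketch of what is being
discussed, not a concrete proposal:

    // Today (schematic form of the patterns above):
    if (status == null) {
        handleUnknown();
    } else {
        switch (status) {
            case ACTIVE:   start();         break;
            case DISABLED: stop();          break;
            default:       handleUnknown(); break;
        }
    }

    // With a null-friendly switch (sketch only):
    switch (status) {
        case ACTIVE:        start();         break;
        case DISABLED:      stop();          break;
        case null, default: handleUnknown(); break;
    }
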
With best regards,
Tagir Valeev
On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz <brian.goetz at oracle.com> wrote:
>
> Breaking into a separate thread. I hope we can put this one to bed
> once and for all.
>
> > I'm not hostile to that view, but may I ask an honest question: why
> > are these semantics better?
> > Do you have examples where it makes sense to let null slip through
> > the statement switch? Because as far as I can see, being null-hostile
> > is a good default; it follows the mottos "blow early, blow often" and
> > "when in doubt, throw".
>
> Charitably, I think this approach is borne of a belief that, if we keep
> the nulls out by posting sentries at the door, we can live an interior
> life unfettered by stray nulls. But I think it is also time to
> recognize that this approach to "block the nulls at the door" (a)
> doesn't actually work, (b) creates sharp edges when the doors move
> (which they do, through refactoring), and (c) pushes the problems elsewhere.
>
> (To illustrate (c), just look at the conversation about nulls in
> patterns and switch we are having right now! We all came to this
> exercise thinking "switch is null-hostile, that's how it's always been,
> that's how it must be", and are contorting ourselves to try to come up
> with a consistent explanation. But, if we look deeper, we see that
> switch is *only accidentally* null-hostile, based on some highly
> contextual decisions that were made when adding enum and autoboxing in
> Java 5. I'll talk more about that decision in a moment, but my point
> right now is that we are doing a _lot_ of work to try to be consistent
> with an arbitrary decision that was made in the past, in a specific and
> limited context, and probably not with the greatest care. Truly today's
> problems come from yesterday's "solutions." If we weren't careful, an
> accidental decision about nulls in enum switch almost polluted the
> semantics of pattern matching! That would be terrible! So let's stop
> doing that, and let's stop creating new ways for our future selves
> to be painted into a corner.)
>
>
> As background, I'll observe that every time a new context comes up,
> someone suggests "we should make it null-hostile." (Closely related: we
> should make that new kind of variable immutable.) And, nearly every
> time, this ends up being the wrong choice. This happened with Streams;
> when we first wrestled with nulls in streams, someone pushed for "Just
> have streams throw on null elements." But this would have been
> terrible; it would have meant that calculations on null-friendly
> domains that were prepared to engage null directly simply could not
> use streams in the obvious way; calculations like:
>
>     Stream.of(arrayOfStuff)
>           .map(Stuff::methodThatMightReturnNull)
>           .filter(x -> x != null)
>           .map(Stuff::doSomething)
>           .collect(toList())
>
> would not be directly expressible, because we would have already NPEed.
> Sure, there are workarounds, but for what? Out of a naive hope that, if
> we inject enough null checks, no one will ever have to deal with null?
> Out of irrational hatred for nulls? Nothing good comes from either of
> these motivations.
>
> But, this episode wasn't over. It was then suggested "OK, we can't NPE,
> but how about we filter the nulls?" Which would have been worse. It
> would mean that, for example, the result of a map+toArray on an array
> might not have the same size as the initial array -- which would violate what
> should be a pretty rock-solid intuition. It would kill all the
> pre-sized-array optimizations. It would mean `zip` would have no useful
> semantics. Etc etc.
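>
> A quick sketch of why silent filtering breaks that intuition (the array
> contents here are made up):
>
>     String[] input = { "a", null, "b" };       // three elements in
>     Object[] out = Stream.of(input)
>                          .map(s -> s)           // identity mapping
>                          .toArray();
>     // Under the real stream semantics, out.length is 3. Under a
>     // hypothetical "silently drop the nulls" semantics it would be 2 --
>     // map would no longer preserve size, pre-sized-array optimizations
>     // would be dead, and zip-like operations would lose any sensible
>     // meaning.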
>
> In the end, we came to the right answer for streams, which is "let the
> nulls flow". And this is was the right choice because Streams is
> general-purpose plumbing. The "blow early" bias is about guarding the
> gates, and thereby hopefully keeping the nulls from getting into the
> house and having wild null parties at our expense. And this works when
> the gates are few, fixed, and well marked. But if your language
> exhibits any compositional mechanisms (which is our best tool), then
> what was the front door soon becomes the middle of the hallway after a
> trivial refactoring -- which means that no refactorings are really
> trivial. Oof.
>
> We already went through a good example recently where it would be
> foolish to try to exclude null (and yet we tried anyway) --
> deconstruction patterns. If a constructor
>
> new Foo(x)
>
> can accept null, then a deconstructor
>
> case Foo(var x)
>
> should dutifully serve up that null. The guard-the-gates brigade tried
> valiantly to put up new gates at each deconstructor, but that would have
> been a foolish place to put such a boundary. I offered an analogy to
> having deconstruction reject null over on amber-dev:
>
> > In languages with side-effects (like Java), not all aggregation
> > operations are reversible; if I bake a pie, I can't later recover the
> > apples and the sugar. But many are, and we like abstractions like
> > these (collections, Optional, stream, etc) because they are very
> > useful and easily reasoned about. So those that are, should commit to
> > the principle. It would be OK for a list implementation to behave
> > like this:
> >
> > Listy list = new Listy();
> > list.add(null) // throws NPE
> >
> > because a List is free to express constraints on its domain. But it
> > would be exceedingly bizarre for a list implementation to behave like
> > this:
> >
> > Listy list = new Listy();
> > list.add(3); // ok, I like ints
> > list.add(null); // ok, I like nulls too
> > assertTrue(list.size() == 2); // ok
> > assertTrue(list.get(0) == 3); // ok
> > assertTrue(list.get(1) == null); // NPE!
> >
> > If the list takes in nulls, it should give them back.
>
> Now, this is like the first suggested form of null-hostility in streams,
> and to everyone's credit, no one suggested exactly that, but what was
> suggested was the second, silent form of hostility -- just pretend you
> don't see the nulls. And, like with streams, that would have been
> silly. So, OK, we dodged the bullet of infecting patterns with special
> nullity rules. Whew.
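>
> To make that concrete, here is roughly what the "nulls flow" behavior
> looks like for a deconstruction pattern. The syntax is illustrative
> only, and Foo / use are stand-ins:
>
>     record Foo(Object x) {}
>
>     Object o = new Foo(null);         // the constructor accepted a null
>     switch (o) {
>         case Foo(var x) -> use(x);    // matches, and x is bound to null:
>                                       // the deconstructor gives back what was put in
>         default -> { }
>     }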
>
> Now, switch. As I mentioned, I think we're here mostly because we are
> perpetuating the null biases of the past. In Java 1.0, switches were
> only over primitives, so there was no question about nulls. In Java 5,
> we added two new reference-typed switch targets: enums and boxes. I
> wasn't in the room when that decision was made, but I can imagine how it
> went: Java 5 was a *very* full release, and under dramatic pressure to
> get out the door. The discussion about nulls came up; maybe someone
> even suggested `case null` back then. And I'm sure the answer was some
> form of "null enums and primitive boxes are almost always bugs, let's
> not bend over backwards and add new complexity to the language (case
> null) just to accommodate this bug, let's just throw NPE."
>
> And, given how limited switch was, and the special characteristics of
> enums and boxes, this was probably a pragmatic decision, but I think we
> lost sight of the subtleties of the context. It is almost certainly
> right that 99.999% of the time, a null enum or box is a bug. But this
> is emphatically not true when we broaden the type to Object. Since the
> context and conditions have changed, the decision should be revisited before
> copying it to other contexts.
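>
> For reference, this is the inherited behavior: today's switch NPEs
> before any label is even consulted. (The enum and method here are just
> an example.)
>
>     enum Suit { HEARTS, DIAMONDS, CLUBS, SPADES }
>
>     static String color(Suit s) {
>         switch (s) {                     // if s is null, this line throws NPE,
>             case HEARTS:                 // before any case label is considered
>             case DIAMONDS: return "red";
>             default:       return "black";
>         }
>     }
>     // color(null) -> NullPointerException at the switch itself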
>
> In Java 7, when we added switching on strings, I do remember the
> discussion about nulls; it was mostly about "well, there's a precedent,
> and it's not worth breaking the precedent even if null strings are more
> common than null Integers, and besides, the mandate of Project Coin is
> very limited, and `case null` would probably be out of scope." While
> this may have again been a pragmatic choice at the time given the
> constraints, it further set us down a slippery slope where the
> assumption that "switches always throw null" is set in concrete. But
> this assumption is not founded on solid ground.
>
> So, the better way to approach this is to imagine Java had no switch,
> and we were adding a general switch today. Would we really be
> advocating so hard for "Oooh, another door we can guard, let's stick it
> to the nulls there too"? (And, even if we were tempted to, should we?)
>
> The plain fact is that we got away with null-hostility in the first
> three forms of reference types in switch because switch (at the time)
> was such a weak and non-compositional mechanism, and there are darn few
> things it can actually do well. But, if we were designing a
> general-purpose switch, with rich labels and enhanced control flow
> (e.g., guards) as we are today, where we envisioned refactoring between
> switches on nested patterns and patterns with nested switches, this
> would be more like a general plumbing mechanism, like streams, and when
> plumbing has an opinion about the nulls, frantic calls to the plumber
> are not far behind. The nulls must flow unimpeded; otherwise we create
> new anomalies and blockages, like the streams examples I gave earlier,
> and new refactoring surprises. And having these anomalies doesn't
> really make life any better for the users -- it actually makes
> everything just less predictable, because it means simple refactorings
> are not simple -- and in a way that is very easy to forget about.
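>
> A sketch of that refactoring hazard, with made-up names (Box,
> handleString, handleOther) and illustrative syntax. These two forms
> ought to be interchangeable, but if the inner switch insists on
> throwing for null, they diverge exactly when the box holds a null:
>
>     // Form 1: one switch with a nested pattern
>     switch (box) {
>         case Box(String s) -> handleString(s);
>         case Box(var other) -> handleOther(other);   // matches Box(null); other is bound to null
>     }
>
>     // Form 2: the "same" logic, refactored into a nested switch on the component
>     switch (box) {
>         case Box(var contents) -> {
>             switch (contents) {                    // if this inner switch threw on null,
>                 case String s -> handleString(s);  // the "trivial" refactoring would silently
>                 default -> handleOther(contents);  // change behavior for Box(null)
>             }
>         }
>     }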
>
> If we really could keep the nulls out at the front gate, and thus define
> a clear null-free domain to work in, then I would be far more
> sympathetic to the calls of "new gates, new guards!" But the gates
> approach just doesn't work, and we have ample evidence of this. And the
> richer and more compositional we make the language, the more sharp edges
> this creates, because old interiors become new gates.
>
> So, back to the case at hand (though we should bring the specifics back
> to the case-at-hand thread): what's happening here is our baby switch is
> growing up into a general-purpose mechanism. And, we should expect it
> to take on responsibilities suited to its new abilities.
>
>
> Now, for the backlash. Whenever we make an argument for
> what-appears-to-be relaxing an existing null-hostility, there is much
> concern about how the nulls will run free and wreak havoc. But, let's
> examine that more closely.
>
> The concern seems to be that, if we let the null through the gate,
> we'll just get more NPEs, in worse places. Well, we can't get more
> NPEs; at most, we can get exactly the same number. But in reality, we
> will likely get fewer. There are three cases.
>
> 1. The domain is already null-free. In this case, it doesn't make a
> difference; no NPEs before, none after.
>
> 2. The domain is mostly null-free, but nulls do creep in, we see them
> as bugs, and we are happy to get notified. This is the case today with
> enums, where a null enum is almost always a bug. Yes, in cases like
> this, not guarding the gates means that the bug will get further before
> it is detected, or might go undetected. This isn't fantastic, but it
> also isn't a disaster, because this case is rare and the bug is still
> likely to get detected eventually.
>
> 3. The domain is at least partially null-tolerant. Here, we are moving
> from an always-throw at the gates to a
> might-throw-in-the-guts-if-you-forget. But also, there are plenty of
> things you can do with a null binding that don't NPE, such as pass it to
> a method that deals sensibly with nulls, add it to an ArrayList, or
> print it (see the sketch below). This is a huge improvement, from "must
> treat null in a special, out-of-band way" to "treat null uniformly." At
> worst, it is no worse, and it is often better.
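>
> A rough sketch of case 3, again with invented names (msg, Request,
> body, seen, process) and illustrative syntax; the binding may be null,
> and plenty of downstream code is perfectly happy with that:
>
>     switch (msg) {                                // msg is some broader type
>         case Request(var header, var body) -> {   // body may well be null
>             seen.add(body);                       // an ArrayList takes nulls happily
>             System.out.println(body);             // printing null just prints "null"
>             process(body);                        // a method that handles null sensibly
>         }
>         default -> { }
>     }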
>
> And, when it comes to general purpose domains, #3 is much bigger than
> #2. So I think we have to optimize for #3.
>
>
> Finally, there are those who argue we should "just" have nullable types
> (T? and T!), and then all of this goes away. I would love to get there,
> but it would be a very long road. But let's imagine we do get there.
> OMG, how terrible it would be if constructs like lambdas, switches, or
> patterns willfully tried to save us from the nulls, thus doing the job
> (badly) of the type system! We'd have explicitly nullable types for
> which some constructs NPE anyway. Or, we'd have to redefine the
> semantics of everything in complex ways based on whether the underlying
> input types are nullable or not. We would feel pretty stupid for having
> created new corners to paint ourselves into.
>
> Our fears of untamed nulls wantonly running through the streets are
> overblown. Our attempts to contain the nulls through ad-hoc
> gate-guarding have all been failures. Let the nulls flow.
>