Letting the nulls flow (Was: Exhaustiveness)

forax at univ-mlv.fr forax at univ-mlv.fr
Sun Aug 23 19:28:57 UTC 2020


I think we agree that switch should be able to be null-friendly, 
the question is more what is the default, and how "default" works. 

I wonder if "case null" is the right design, if for a lot of switch, null behavior is the same as the behavior of an existing case or default. 
Currently case null is the first (depending on nesting) case, so you can not easily said that null and another case share the same behavior. 

Whatever we decide for "default", a syntax that let append null to an existing case seems better ? 
Something along "case Foo, null: ... " 

Rémi 

> De: "Brian Goetz" <brian.goetz at oracle.com>
> À: "Tagir Valeev" <amaembo at gmail.com>
> Cc: "Remi Forax" <forax at univ-mlv.fr>, "Guy Steele" <guy.steele at oracle.com>,
> "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Envoyé: Dimanche 23 Août 2020 17:43:03
> Objet: Re: Letting the nulls flow (Was: Exhaustiveness)

> Thanks, Tagir -- this is a perfect example of what I meant yesterday by how the
> "blow early, blow often" approach is a false promise. It just means that
> responsible programmers who need to deal with null as a fact-of-life have to do
> *extra* work (which is therefore more duplicative or error-prone) to deal with
> it.

> On 8/22/2020 11:46 PM, Tagir Valeev wrote:

>> Hello!

>> Some data from the current IntelliJ IDEA codebase

>> We have 64 occurrences of this code pattern
>> if($x$ == null) {...} // presumably completes abruptly
>> switch($x) {...}
>> Roughly half of them are enum switches and the other half is string switches

>> Also, we have 29 occurrences of this code pattern:
>> if($x$ != null) {
>>   switch($x$) { ... }
>>   ...
>> }

>> Also, we have one occurrence of this code pattern:
>> if($x$ == null) {...
>> } else {
>>   switch($x) {...}
>> }

>> All of them could benefit from null-friendly switch. Btw often null
>> branch is the same as default branch (or some other non-null branch).

>> With best regards,
>> Tagir Valeev

>> On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz [ mailto:brian.goetz at oracle.com |
>> <brian.goetz at oracle.com> ] wrote:

>>> Breaking into a separate thread.   I hope we can put this one to bed
>>> once and for all.

>>>> I'm not hostile to that view, but may i ask an honest question, why
>>>> this semantics is better ?
>>>> Do you have examples where it makes sense to let the null to slip
>>>> through the statement switch ? Because as i can see why being null
>>>> hostile is a good default, it follows the motos "blow early, blow
>>>> often" or "in case of doubt throws".

>>> Charitably, I think this approach is borne of a belief that, if we keep
>>> the nulls out by posting sentries at the door, we can live an interior
>>> life unfettered by stray nulls.  But I think it is also time to
>>> recognize that this approach to "block the nulls at the door" (a)
>>> doesn't actually work, (b) creates sharp edges when the doors move
>>> (which they do, though refactoring), and (c) pushes the problems elsewhere.

>>> (To illustrate (c), just look at the conversation about nulls in
>>> patterns and switch we are having right now!  We all came to this
>>> exercise thinking "switch is null-hostile, that's how it's always been,
>>> that's how it must be", and are contorting ourselves to try to come up
>>> with a consistent explanation.   But, if we look deeper, we see that
>>> switch is *only accidentally* null-hostile, based on some highly
>>> contextual decisions that were made when adding enum and autoboxing in
>>> Java 5.  I'll talk more about that decision in a moment, but my point
>>> right now is that we are doing a _lot_ of work to try to be consistent
>>> with an arbitrary decision that was made in the past, in a specific and
>>> limited context, and probably not with the greatest care.  Truly today's
>>> problems come from yesterdays "solutions."  If we weren't careful, an
>>> accidental decision about nulls in enum switch almost polluted the
>>> semantics of pattern matching!  That would be terrible!  So let's stop
>>> doing that, and let's stop creating new ways for our tomorrow's selves
>>> to be painted into a corner.)

>>> As background, I'll observe that every time a new context comes up,
>>> someone suggests "we should make it null-hostile."  (Closely related: we
>>> should make that new kind of variable immutable.)  And, nearly every
>>> time, this ends up being the wrong choice.  This happened with Streams;
>>> when we first wrestled with nulls in streams, someone pushed for "Just
>>> have streams throw on null elements."  But this would have been
>>> terrible; it would have meant that calculations on null-friendly
>>> domains, that were prepared to engage null directly, simply could not
>>> use streams in the obvious way; calculations like:

>>>      Stream.of(arrayOfStuff)
>>>                  .map(Stuff::methodThatMightReturnNull)
>>>                  .filter(x -> x != null)
>>>                  .map(Stuff::doSomething)
>>>                  .collect(toList())

>>> would not be directly expressible, because we would have already NPEed.
>>> Sure, there are workarounds, but for what?  Out of a naive hope that, if
>>> we inject enough null checks, no one will ever have to deal with null?
>>> Out of irrational hatred for nulls?  Nothing good comes from either of
>>> these motivations.

>>> But, this episode wasn't over.  It was then suggested "OK, we can't NPE,
>>> but how about we filter the nulls?"  Which would have been worse.  It
>>> would mean that, for example, doing a map+toArray on an array might not
>>> have the same size as the initial array -- which would violate what
>>> should be a pretty rock-solid intuition.  It would kill all the
>>> pre-sized-array optimizations.  It would mean `zip` would have no useful
>>> semantics.  Etc etc.

>>> In the end, we came to the right answer for streams, which is "let the
>>> nulls flow".   And this is was the right choice because Streams is
>>> general-purpose plumbing.  The "blow early" bias is about guarding the
>>> gates, and thereby hopefully keeping the nulls from getting into the
>>> house and having wild null parties at our expense. And this works when
>>> the gates are few, fixed, and well marked.  But if your language
>>> exhibits any compositional mechanisms (which is our best tool), then
>>> what was the front door soon becomes the middle of the hallway after a
>>> trivial refactoring -- which means that no refactorings are really
>>> trivial.  Oof.

>>> We already went through a good example recently where it would be
>>> foolish to try to exclude null (and yet we tried anyway) --
>>> deconstruction patterns.  If a constructor

>>>      new Foo(x)

>>> can accept null, then a deconstructor

>>>      case Foo(var x)

>>> should dutifully serve up that null.  The guard-the-gates brigade tried
>>> valiently to put up new gates at each deconstructor, but that would have
>>> been a foolish place to put such a boundary.  I offered an analogy to
>>> having deconstruction reject null over on amber-dev:

>>>> In languages with side-effects (like Java), not all aggregation
>>>> operations are reversible; if I bake a pie, I can't later recover the
>>>> apples and the sugar.  But many are, and we like abstractions like
>>>> these (collections, Optional, stream, etc) because they are very
>>>> useful and easily reasoned about.  So those that are, should commit to
>>>> the principle.  It would be OK for a list implementation to behave
>>>> like this:

>>>>     Listy list = new Listy();
>>>>     list.add(null) // throws NPE

>>>> because a List is free to express constraints on its domain.  But it
>>>> would be exceedingly bizarre for a list implementation to behave like
>>>> this:

>>>>     Listy list = new Listy();
>>>>     list.add(3);     // ok, I like ints
>>>>     list.add(null); // ok, I like nulls too
>>>>     assertTrue(list.size() == 2);   // ok
>>>>     assertTrue(list.get(0) == 3); // ok
>>>>     assertTrue(list.get(1) == null);  // NPE!

>>>> If the list takes in nulls, it should give them back.

>>> Now, this is like the first suggested form of null-hostility in streams,
>>> and to everyone's credit, no one suggested exactly that, but what was
>>> suggested was the second, silent form of hostility -- just pretend you
>>> don't see the nulls.  And, like with streams, that would have been
>>> silly.  So, OK, we dodged the bullet of infecting patterns with special
>>> nullity rules.  Whew.

>>> Now, switch.  As I mentioned, I think we're here mostly because we are
>>> perpetuating the null biases of the past.  In Java 1.0, switches were
>>> only over primitives, so there was no question about nulls.  In Java 5,
>>> we added two new reference-typed switch targets: enums and boxes.  I
>>> wasn't in the room when that decision was made, but I can imagine how it
>>> went: Java 5 was a *very* full release, and under dramatic pressure to
>>> get out the door.  The discussion came up about nulls, maybe someone
>>> even suggested `case null` back then.  And I'm sure the answer was some
>>> form of "null enums and primitive boxes are almost always bugs, let's
>>> not bend over backwards and add new complexity to the language (case
>>> null) just to accomodate this bug, let's just throw NPE."

>>> And, given how limited switch was, and the special characteristics of
>>> enums and boxes, this was probably a pragmatic decision, but I think we
>>> lost sight of the subtleties of the context.  It is almost certainly
>>> right that 99.999% of the time, a null enum or box is a bug.  But this
>>> is emphatically not true when we broaden the type to Object.  Since the
>>> context and conditions change, the decision should be revisited before
>>> copying it to other contexts.

>>> In Java 7, when we added switching on strings, I do remember the
>>> discussion about nulls; it was mostly about "well, there's a precedent,
>>> and it's not worth breaking the precedent even if null strings are more
>>> common than null Integers, and besides, the mandate of Project Coin is
>>> very limited, and `case null` would probably be out of scope."  While
>>> this may have again been a pragmatic choice at the time given the
>>> constraints, it further set us down a slippery slope where the
>>> assumption that "switches always throw null" is set in concrete.  But
>>> this assumption is not founded on solid ground.

>>> So, the better way to approach this is to imagine Java had no switch,
>>> and we were adding a general switch today.  Would we really be
>>> advocating so hard for "Oooh, another door we can guard, let's stick it
>>> to the nulls there too"?  (And, even if we were tempted to, should we?)

>>> The plain fact is that we got away with null-hostility in the first
>>> three forms of reference types in switch because switch (at the time)
>>> was such a weak and non-compositional mechanism, and there are darn few
>>> things it can actually do well.  But, if we were designing a
>>> general-purpose switch, with rich labels and enhanced control flow
>>> (e.g., guards) as we are today, where we envisioned refactoring between
>>> switches on nested patterns and patterns with nested switches, this
>>> would be more like a general plumbing mechanism, like streams, and when
>>> plumbing has an opinion about the nulls, frantic calls to the plumber
>>> are not far behind.  The nulls must flow unimpeded, because otherwise,
>>> we create new anomalies and blockages like the streams examples I gave
>>> earlier and refactoring surprises. And having these anomalies doesn't
>>> really make life any better for the users -- it actually makes
>>> everything just less predictable, because it means simple refactorings
>>> are not simple -- and in a way that is very easy to forget about.

>>> If we really could keep the nulls out at the front gate, and thus define
>>> a clear null-free domain to work in, then I would be far more
>>> sympathetic to the calls of "new gates, new guards!"  But the gates
>>> approach just doesn't work, and we have ample evidence of this.  And the
>>> richer and more compositional we make the language, the more sharp edges
>>> this creates, because old interiors become new gates.

>>> So, back to the case at hand (though we should bring specifics this back
>>> to the case-at-hand thread): what's happening here is our baby switch is
>>> growing up into a general purpose mechanism.  And, we should expect it
>>> to take on responsibilities suited to its new abilities.

>>> Now, for the backlash.  Whenever we make an argument for
>>> what-appears-to-be relaxing an existing null-hostility, there is much
>>> concern about how the nulls will run free and wreak havoc. But, let's
>>> examine that more closely.

>>> The concern seems to be that, if if we let the null through the gate,
>>> we'll just get more NPEs, at worse places.  Well, we can't get more
>>> NPEs; at most, we can get exactly the same number.  But in reality, we
>>> will likely get less.  There are three cases.

>>> 1.  The domain is already null-free.  In this case, it doesn't make a
>>> difference; no NPEs before, none after.

>>> 2.  The domain is mostly null-free, but nulls do creep in, we see them
>>> as bugs, and we are happy to get notified.  This is the case today with
>>> enums, where a null enum is almost always a bug.  Yes, in cases like
>>> this, not guarding the gates means that the bug will get further before
>>> it is detected, or might go undetected.  This isn't fantastic, but this
>>> also isn't a disaster, because it is rare and is still likely it will
>>> get detected eventually.

>>> 3.  The domain is at least partially null tolerant.  Here, we are moving
>>> an always-throw at the gates to a
>>> might-throw-in-the-guts-if-you-forget.  But also, there are plenty of
>>> things you can do with a null binding that don't NPE, such as pass it to
>>> a method that deals sensibly with nulls, add it to an ArrayList, print
>>> it, etc.  This is a huge improvement, from "must treat null in a
>>> special, out of band way" to "treat null uniformly."  At worst, it is no
>>> worse, and often better.

>>> And, when it comes to general purpose domains, #3 is much bigger than
>>> #2.  So I think we have to optimize for #3.

>>> Finally, there are those who argue we should "just" have nullable types
>>> (T? and T!), and then all of this goes away.  I would love to get there,
>>> but it would be a very long road.  But let's imagine we do get there.
>>> OMG how terrible it would be when constructs like lambdas, switches, or
>>> patterns willfully try to save us from the nulls, thus doing the job
>>> (badly) of the type system!  We'd have explicitly nullable types for
>>> which some constructs NPE anyway. Or, we'd have to redefine the
>>> semantics of everything in complex ways based on whether the underlying
>>> input types are nullable or not.  We would feel pretty stupid for having
>>> created new corners to paint ourselves into.

>>> Our fears of untamed nulls wantonly running through the streets are
>>> overblown.  Our attempts to contain the nulls through ad-hoc
>>> gate-guarding have all been failures.  Let the nulls flow.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20200823/50d8c92a/attachment-0001.htm>


More information about the amber-spec-experts mailing list