Next up for patterns: type patterns in switch

Mon Aug 10 22:20:42 UTC 2020

Letting the nulls flow is a good move in that absorbing
game of “find the primitive”.  Here, you are observing forcefully
that instanceof and switch, while important precedents and
sources of use cases and design patterns, are not the primitives.
We are not so lucky that the answer is “we just need more sugar
for the existing constructs”.  But we are not so unlucky that
we must build our new primitives out of alien materials.

The existing ideas about type matching, and specifically that
`T x = v;` requires that `T` be total over the type of `v`,
are available and useful.

Also useful is the idea that some constructs are necessarily
null-hostile (starting with dot: `x.f`).  In a “let the nulls
flow” design, if a construct is not *necessarily* null-hostile,
it is necessarily *null-permissive*.  So we hunt for necessary
hostility, and among patterns we find it with destructuring
(`Box(var t)` as opposed to `Box`) and with value testing
patterns like string or numeric literals (if those are patterns,
which I think they should be).  We also note that the level
of hostility (from patterns) must be compatible with
generally null-agnostic use cases:  This means a pattern
can fail on null (if it *must*) but it *must not throw on
null*, because the next pattern in line might be the one
that matches the null.

So instanceof and switch turn out to be sugar for certain
uses of patterns.  And together they are universal enough
that (luckily) we might not need a new syntax to directly
denote the new primitive, of applying a pattern (partial
or total) and extracting bindings.

The existing behavior w.r.t. nulls of instanceof and switch
needs to be rationalized.  I think that is easy, although there
is a little bit to learn.  (Just as there’s something to learn
today:  They are unconditionally null-rejecting at present.)
An important thing (as Brian points out) is that if you
are choosing to write null-agnostic code, your learning
curve should be gentle-to-none.

Here are the rules the way I see them, in the presence
of primitive patterns which are null-permissive (because
they support null-agnostic use cases):

`x instanceof P` includes an additional check that `x != null` before
it tests the pattern P against x.  Rationale:  Compatibility.
Also look at the name:  `null` is never an *instance* *of* any
type.  A pattern might match null, but we are testing whether
`x` is an instance, which is to say, an object.

Some equations to relate instanceof to the primitive __Matches:

x instanceof P  ≡  x __Matches P && (__PermitsNull(P) ? x != null : true)

x __Matches P  ≡  x instanceof P || (__PermitsNull(P) ? x == null : false)

Do we need syntax for __Matches P?  Probably not, because the
above equations allow workarounds when the instanceof syntax
isn’t exactly right.  (And it usually *is* exactly right; the trailing
null logic folds away in context, or is harmless in some other way,
as the nulls flow around.)
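
To make the two equations concrete, here is a minimal Java sketch of a toy model.  Everything in it is hypothetical: `Pattern`, `TypePattern`, and `MatchEquations` are illustration names, not proposed syntax, and I am assuming that `__PermitsNull(P)` holds exactly when a type pattern is total.  The sketch only shows that the two definitions agree with each other and that nothing throws on null.

```java
// Toy model of the equations; all names here are hypothetical.
interface Pattern {
    boolean matches(Object x);   // stands in for the primitive `x __Matches P`
    boolean permitsNull();       // stands in for `__PermitsNull(P)`
}

final class TypePattern implements Pattern {
    private final Class<?> type;
    private final boolean total;  // assumption: total means "covers the whole target type"

    TypePattern(Class<?> type, boolean total) {
        this.type = type;
        this.total = total;
    }

    @Override public boolean matches(Object x) {
        // Never throws on null: a total pattern accepts it, a partial one just fails.
        return x == null ? total : type.isInstance(x);
    }

    @Override public boolean permitsNull() { return total; }
}

final class MatchEquations {
    // x instanceof P  ==  x __Matches P && (__PermitsNull(P) ? x != null : true)
    static boolean instanceOf(Object x, Pattern p) {
        return p.matches(x) && (p.permitsNull() ? x != null : true);
    }

    // x __Matches P  ==  x instanceof P || (__PermitsNull(P) ? x == null : false)
    static boolean matchesViaInstanceOf(Object x, Pattern p) {
        return instanceOf(x, p) || (p.permitsNull() ? x == null : false);
    }

    public static void main(String[] args) {
        Pattern total = new TypePattern(Object.class, true);
        Pattern partial = new TypePattern(String.class, false);
        System.out.println(instanceOf(null, total));            // instanceof view of a total pattern
        System.out.println(matchesViaInstanceOf(null, total));  // primitive view, recovered
        System.out.println(instanceOf("hi", partial));
    }
}
```

In this model the second equation reconstructs the primitive from instanceof, which is the workaround alluded to above.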

What about switch?  I like to think that a switch statement is simply
sugar (plus optimizations) for a *decision chain*, an if/else chain which
tests each case in turn (in source code order, of course):

   switch (x) {
   case P: p(); break;
   case Q: q(); break;
   …
   default: d(); }

⇒ (approximately)

  { var x_ = x;
  if (x_ __Matches P) p(); else
  if (x_ __Matches Q) q(); else
  …
  d(); }

(Note that this account of classic switch requires extra tweaks
to deal with two embarrassing features:  (a) fall through, which
requires some way of contriving transfers between arms of the
decision chain, and (b) the fact that default can go anywhere,
and sometimes is placed in the middle to make use of fall-through.
These are embarrassments, not show-stoppers.)

So what about nulls?  The simple—I will say naive—account
of switch is that there is a null check at the head of the switch
near `var x_ = x;`.  This would account for all of switch’s behaviors
as of today, but makes switch hostile to nulls.

A more nuanced and useful account of switch’s behavior comes
from the following observations:

1. All switch cases *today*, if regarded as patterns, are necessarily
null-rejecting.  *None of them ever match null.*

2. The NPE observed from a switch-on-null, today, might as well
be viewed as arising from the *bottom* of the decision chain,
*after all matches* fail.  From that point of view, the fact that
the failure appears to come from the *top* is simply an optimization,
a fast-fail when it is statically provable that there’s no hope of ever
matching that pesky null, in any given legacy switch.

3. When null meets default, we are painted into a corner, so we
have to enjoy the only remaining option:  At least in legacy switches,
the default case is *also* mandated to reject nulls.  (So “default”
turns out to mean “anything but null”.  But that doesn’t parlay
into a general anti-null story; sorry null-haters.)  This feature
of default can (maybe) be turned into a benefit:  Perhaps we
can teach users that by saying “default” you are *asking* for
an NPE, if a null escapes all the intervening patterns in the
decision chain.  I don’t have a strong opinion on that.
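
The three observations can be checked against a legacy switch as it behaves today.  The sketch below is only an illustration (the class and method names are made up): `viaSwitch` is an ordinary string switch, and `viaChain` is a hand-written decision chain in which the NPE is thrown from the *bottom*, after every case, including `default` read as “anything but null”, has declined the null.

```java
final class LegacySwitchNull {
    // A legacy string switch; `default` does not rescue a null selector.
    static String viaSwitch(String s) {
        switch (s) {
            case "a": return "A";
            case "b": return "B";
            default:  return "other";
        }
    }

    // The same logic as a decision chain, with the NPE placed at the bottom.
    static String viaChain(String s) {
        String s_ = s;
        if ("a".equals(s_)) return "A";          // constant patterns never match null
        else if ("b".equals(s_)) return "B";
        else if (s_ != null) return "other";     // `default` read as "anything but null"
        else throw new NullPointerException();   // all matches failed
    }

    // Helper: reports whether applying f to arg throws a NullPointerException.
    static boolean throwsNpe(java.util.function.Function<String, String> f, String arg) {
        try {
            f.apply(arg);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsNpe(LegacySwitchNull::viaSwitch, null));
        System.out.println(throwsNpe(LegacySwitchNull::viaChain, null));
    }
}
```

The two methods are indistinguishable to a caller; where the NPE “really” comes from is a matter of how one chooses to tell the story.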

The previous three observations fully account for today’s
legacy switches, with their limited set of patterns.  The next
one is also necessary to extend to switch cases which may
support null-friendly patterns:

4. We need a rule to allow nulls to flow through switches
until the user is ready to handle them.  This means that
null-permissive patterns in *some* switch cases need to
be shielded from null just as with instanceof.

What is this rule?  We’ve already discussed it adequately;
it comes in two parts:

A. `case null:` is allowed and does the obvious thing.
We might as well require that it always come first.

B. There is a way of issuing a case which accepts nulls,
and that way is a total pattern that is null friendly.
(As Brian points out, this fits with the useful idea
that a null-friendly pattern of the form `T v` or `var v`
works just like the similar declaration.)

Note that B is less arbitrary than it might seem at first
blush:  To avoid dead code, any total pattern in a switch
must come *last*, at the bottom of the decision chain.
(There can be no `default:` after it either, since that would
be dead.)

So the rules together mean:

1. If there is a `case null` at the top, that’s where nulls go.
2. If there is a total pattern at the bottom, that’s where nulls go.
3. Non-total patterns don’t catch nulls *in a switch*, just like in instanceof.
4. If there is neither a `case null` nor a total pattern, the switch throws NPE on null.
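
The four rules are small enough to model directly.  The following Java sketch is a toy dispatcher, not a proposal for an implementation: `NullRoutingSwitch` and its builder methods are hypothetical names, patterns are reduced to a type test plus a “total” flag, and each case simply returns its label.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class NullRoutingSwitch {
    private static final class Case {
        final String label;
        final Predicate<Object> test;
        final boolean total;

        Case(String label, Predicate<Object> test, boolean total) {
            this.label = label;
            this.test = test;
            this.total = total;
        }
    }

    private final List<Case> cases = new ArrayList<>();
    private boolean hasCaseNull;

    NullRoutingSwitch caseNull() { hasCaseNull = true; return this; }

    // A non-total type pattern: behaves like an instanceof test.
    NullRoutingSwitch partial(String label, Class<?> type) {
        cases.add(new Case(label, type::isInstance, false));
        return this;
    }

    // A total pattern: matches everything, so it only makes sense as the last case.
    NullRoutingSwitch total(String label) {
        cases.add(new Case(label, x -> true, true));
        return this;
    }

    String dispatch(Object x) {
        if (x == null) {
            if (hasCaseNull) return "case null";                     // rule 1
            for (Case c : cases) if (c.total) return c.label;        // rule 2; rule 3 skips partial cases
            throw new NullPointerException("no case accepts null");  // rule 4
        }
        for (Case c : cases) if (c.test.test(x)) return c.label;     // ordinary decision chain
        return "no match";
    }

    public static void main(String[] args) {
        NullRoutingSwitch sw = new NullRoutingSwitch()
                .partial("String s", String.class)
                .total("Object o");
        System.out.println(sw.dispatch(null));
        System.out.println(sw.dispatch("hi"));
    }
}
```

Note that the null branch never consults a partial case, which is rule 3: in a switch, as in instanceof, a non-total pattern is shielded from null.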

I think this covers the use cases, except (perhaps) for some
“anecdotal” (really, artificial) use cases one could come
up with, where these rules are slightly more burdensome than
alternative rules would be (alternatives which would be
burdensome in more substantial ways elsewhere).

Corresponding rules apply to sub-patterns (and to refactorings).

a. a null-hostile sub-pattern (in Box(Pox(var t))) is null-hostile
b. a null-permissive sub-pattern (in Box(Object t)) that is also total permits nulls
c. a null-permissive sub-pattern (in Box(String t)) that is partial does not match null

c. is debatable, but I think it’s the right answer also.  It aligns
the contextual behavior of case patterns with that of sub-patterns.
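
Here is a hand-desugared Java sketch of cases b and c, under loudly stated assumptions: `Box` is a hypothetical class whose single component is declared as `Object`, and case c is read as “a partial sub-pattern does not match a null component”.  The two methods spell out what `Box(Object t)` and `Box(String t)` would test under those assumptions.

```java
final class Box {
    final Object content;   // assumed component type: Object

    Box(Object content) { this.content = content; }
}

final class SubPatterns {
    // Box(Object t): the sub-pattern is total over the component type,
    // so a null content flows through; only the Box itself is tested.
    static boolean matchesBoxOfObject(Object x) {
        return x instanceof Box;   // the destructuring rejects a null Box
    }

    // Box(String t): the sub-pattern narrows the component type,
    // so it acts as a type test and declines a null content.
    static boolean matchesBoxOfString(Object x) {
        return x instanceof Box && ((Box) x).content instanceof String;
    }

    public static void main(String[] args) {
        Box empty = new Box(null);
        System.out.println(matchesBoxOfObject(empty));
        System.out.println(matchesBoxOfString(empty));
    }
}
```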

The contextual behavior can be summarized generally:

a. some patterns are null hostile, so never match null (constants, destructurings)
b. null-permissive patterns which are asked to narrow the target type do not match null
c. null-permissive patterns which widen or reiterate the target type let nulls flow through

Case b is like instanceof, while case c is like a declaration.  The declaration-like
behavior applies if the corresponding declaration would also be valid;
otherwise the instanceof behavior applies.
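
The choice between the two behaviors can be sketched as a single function of the type context.  This is a rough model only: `TypeContext.typePatternMatches` is a hypothetical helper, the static type of the target is passed explicitly as a `Class`, and “the corresponding declaration would be valid” is approximated by `Class.isAssignableFrom`.

```java
final class TypeContext {
    // Models a type pattern `T v` applied to a target whose static type is
    // `staticTargetType` and whose run-time value is `x`.
    static boolean typePatternMatches(Class<?> staticTargetType, Class<?> patternType, Object x) {
        // Would `T v = x;` be a valid declaration?  (T widens or reiterates the target type.)
        boolean declarationLike = patternType.isAssignableFrom(staticTargetType);
        if (declarationLike) {
            return true;                                // case c: no test is needed, nulls flow
        }
        return x != null && patternType.isInstance(x);  // case b: a type test, rejects null
    }

    public static void main(String[] args) {
        System.out.println(typePatternMatches(String.class, Object.class, null));  // widening
        System.out.println(typePatternMatches(Object.class, String.class, null));  // narrowing
    }
}
```

Case a (constants and destructurings) has no counterpart in this helper, since such patterns reject null regardless of the type context.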

Score card:

- Patterns intrinsically allow nulls to flow when possible.  (Some necessarily reject nulls; others don’t.)
- Patterns are always applied in a type context which renders them like declarations or like type tests.
- The type context determines whether they do type tests or not; only type tests reject nulls.
- Instanceof is declared to be always a type test (even if its pattern is total).
- Switch rejects nulls with NPE unless there is a case that accepts nulls.

Did I miss anything?