Telling the totality story
Brian Goetz
brian.goetz at oracle.com
Thu Mar 3 20:42:25 UTC 2022
Given the misconceptions about totality and whether "pattern matching
means the same thing in all contexts", it is pretty clear that the story
we're telling is not evoking the right concepts. It is important not
only to have the right answer, but also to have the story that helps
people understand why it's right, so let's take another whack at that.
(The irony here is that it only makes a difference at one point -- null
-- which is the least interesting part of the story. So it is much
sound and fury, signifying nothing.)
None of what we say below represents a change in the language; it is
just test-driving a new way to tell the same story. (From a
mathematical and specification perspective, it is a much worse story,
since it is full of accidental detail about nulls (among other sins),
but it might make some people happier, and maybe we can learn something
from the reaction.)
The key change here is to say that every pattern-matching construct
needs special rules for null. This is already true, so we're just
doubling down on it. None of this changes anything about how it works.
#### Instanceof
We define:
x instanceof P === (x != null) && x matches P
This means that instanceof always says false on null; on non-null, we
can ask the pattern. This is consistent with instanceof today. P will
just never see a null in this case. We'll define "matches" soon.
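As a concrete illustration of the `(x != null) && x matches P` rule, here is a minimal sketch using type patterns in instanceof (JEP 394); the class and method names are made up:

```java
// Sketch: instanceof filters out null before the pattern is consulted,
// so the pattern P never sees a null target.
public class InstanceofNull {
    static boolean matchesString(Object x) {
        // Equivalent to: (x != null) && x matches `String s`
        return x instanceof String s;
    }

    public static void main(String[] args) {
        System.out.println(matchesString("hello")); // true
        System.out.println(matchesString(null));    // false: null filtered out
        System.out.println(matchesString(42));      // false: wrong type
    }
}
```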
#### Switch
We play the same game with switch. A switch:
    switch (x) {
        case P1:
        case P2:
        ...
    }
is defined to mean:
- if x is null
    - if one of P1..Pn is the special label "null", then we take that branch
    - else we throw NPE
- otherwise, we start trying to match against P1..Pn sequentially
    - if none match (the value is in the remainder), throw MatchException
This is consistent with switch today, in that there are no null labels,
so null always falls into the NPE bucket. It is also easy to keep track
of; if the switch does not say "case null", it does not match null.
None of the patterns P1..Pn will ever see a null.
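The null rule for switches can be sketched with pattern-switch syntax (as previewed in JEP 420); the method names are illustrative:

```java
// Sketch: switch intercepts null before trying any patterns.
public class SwitchNull {
    // With a `case null`, null takes that branch.
    static String describe(Object x) {
        return switch (x) {
            case null     -> "null label taken";
            case String s -> "string: " + s;
            default       -> "something else";
        };
    }

    // No `case null` here, so null falls into the NPE bucket;
    // the patterns never see it.
    static String describeNoNull(Object x) {
        return switch (x) {
            case String s -> "string: " + s;
            default       -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(null));   // null label taken
        try {
            describeNoNull(null);
        } catch (NullPointerException e) {
            System.out.println("NPE: no null label");
        }
    }
}
```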
#### Covering patterns
Let's define what it means for a pattern to cover a type T. This is just
a new name for totality; we say that an "any" pattern covers all types
T, and a type pattern `U u` covers T when `T <: U`. (When we do
primitive type patterns, there will be more terms in this list, but
they'll still be type patterns.) No other pattern is a covering
pattern. Covering patterns all have a single binding (`var x`, `String
s`).
#### Let
This is where it gets ugly. Let has no opinions about null, but the
game here is to pretend it does. So in a let statement:
let P = e
- Evaluate e
- If the result is null
    - If P is a covering pattern, its binding is bound to null;
    - else we throw NPE
- Otherwise, the result is matched to P. If it does not match (it is in
  the remainder), a MatchException is thrown (or the else clause is
  taken, if one is present.)
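Since `let` is a proposed construct, not current Java, those rules can only be sketched in pseudocode (the `Point` name and syntax here are hypothetical):

```
Point p = ...;

let Point(var x, var y) = p;
// - p is evaluated
// - Point(var x, var y) is not a covering pattern, so if p is null: NPE
// - otherwise p is matched to the pattern, binding x and y

let var q = p;
// - `var q` is a covering pattern, so even a null p is simply
//   bound to q, with no matching step and no NPE
```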
#### Nesting
Given a nested pattern R(P), where R is `record R(T t) { }`, this means:
- Match the target to record R, and note the resulting t
    - if P is a covering pattern, we put the resulting `t` in the
      binding of P without further matching
    - Otherwise, we match t to P
In other words: when a "non-trivial" (i.e., non-covering) pattern P is
nested in a record pattern, then:
x instanceof R(P) === x instanceof R(var a) && a instanceof P
and a covering pattern is interpreted as "not really a pattern", and we
just slam the result of the outer binding into the inner binding.
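Using record-pattern syntax (as previewed in JEP 405), the covering vs. non-covering distinction at a nesting site can be sketched as follows; the record and method names are made up:

```java
// Sketch: a covering nested pattern has the component value slammed into
// its binding, while a non-covering one unrolls to an instanceof test.
public class Nesting {
    record R(Object t) {}

    // `var a` covers Object: a null component still matches,
    // and a is simply bound to null.
    static boolean coveringNested(R r) {
        return r instanceof R(var a);
    }

    // `String s` does not cover Object, so this unrolls to
    // r instanceof R(var a) && a instanceof String, which null fails.
    static boolean nonCoveringNested(R r) {
        return r instanceof R(String s);
    }

    public static void main(String[] args) {
        R holdsNull = new R(null);
        System.out.println(coveringNested(holdsNull));    // true
        System.out.println(nonCoveringNested(holdsNull)); // false
        System.out.println(coveringNested(null));         // false: instanceof filters top-level null
    }
}
```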
#### Matching
What we've done now is ensure that no pattern ever actually encounters a
null, because the enclosing constructs always filter the nulls out.
This means we can say:
- var patterns match everything
- `T t` is gated by `instanceof T`
- R(P) is gated by `instanceof R`
## Is this a better way to tell the story?
This is clearly a much more complex way to tell the story; every
construct must have a special case for null, and we must be prepared to
treat simple patterns (`var x`, `String s`) not always as patterns, but
sometimes as mere declarations whose bindings will be forcibly crammed
from "outside".
Note too that this will not scale to declared total patterns that treat null
as a valid target value, if we ever want them; it depends on being able
to "peek inside" the pattern from the outside, and say "oh, you're a
covering pattern, I can set your binding without asking you." That
won't work with declared total patterns, which are opaque and may have
multiple bindings. (Saying "this doesn't bother me" is saying "I'm fine
with a new null gate for total declared patterns.")
It also raises the profile of null in the story, which kind of sucks;
while it does not add new "null gates", it is a story that is prominently
parameterized by null. (Which is a funny outcome for a system
explicitly designed to *not* create new null gates!)
#### Further editorializing
So, having written it out, I dislike it even more than I thought I
would when I started writing this mail. Overall this seems like a lot of
mental gymnastics to avoid acknowledging the reality that pattern
matching has three parameters, not two -- the pattern, the target, and
the *static type* of the target.
I think the root cause here is that when we faced the lump-or-split
decision of "do we reuse instanceof for pattern matching or not" back in
Java 13, the choice to lump (which was a rational and defensible choice,
but so was splitting) created an expectation that "no pattern ever
matches null", because instanceof is the first place people saw
patterns, and instanceof never matches null. While this expectation is
only consistent with the simplest of patterns (and is only problematic
at null), there seems to be a desire to force "the semantics of matching
pattern P is whatever instanceof says matches P."
I suspect that had we chosen the split route for instanceof (which would
not have been as silly as choosing the split route for switch, but still
lumping seemed better), we would not be facing this situation. But even
if we decide, in hindsight, that lumping was a mistake, I do not want to
compound this mistake by distorting the meaning of pattern matching to
cater to it. It might seem harmless from the outside, but it will
dramatically limit how far we can push pattern matching in the future.
## Tweaking the current story
So, I think the main thing we can control about the story is the
terminology. I think part of what people find confusing is the use of
the term "total", since that's a math-y term, and also it collides with
"exhaustive", which is similar but not entirely coincident.
I won't try to find new terms here, but I think there are three relevant
terms needed:
- "Pattern set P* is exhaustive on type T." This is a static
property; it means that the pattern set P* covers all the "reasonable"
values of T. Null is only one "potentially unreasonable" value.
- "Pattern P matches expression e". This is a dynamic property.
- "Pattern P is total on type T". This is both a static and dynamic
property; it means the pattern matches all values of T, and that we know
this statically.
What we've been calling "remainder" is the set of T values that are not
matched by any of the patterns in a set P* that is exhaustive on T.
(Exhaustiveness is static; matching is dynamic; the difference is
remainder.)
I think the name "exhaustive" is pretty good; we've been using the term
"exhaustiveness checking" for switch for a while. The problem could be
"matches", as it has a dynamic sound to it; the problem could also be
"total". The above excursion calls totality "covering", but having both
"exhaustive" and "covering" in the same scheme is likely to confuse
people, so I won't suggest that. Perhaps "statically total"? This has
the nice implication that matching is statically known to always
succeed. This would give us:
- If P is statically total on T, it is exhaustive on T;
- Any patterns (var x) are statically total on all T;
- Type patterns `T t` are statically total on `U <: T` (and more, when
we get to primitive type patterns)
- Statically total patterns match null (because they match everything)
- instanceof intercepts nulls: x instanceof P === x != null && x matches P
- switch intercepts nulls: as above
- Let does not have any new opinions about null
- Nesting unrolls to instanceof only for non-statically-total nested
patterns
Also I think being more explicit about the switch/instanceof rules will
help, as this filters out all top-level null reasoning.
More information about the amber-spec-observers mailing list