Telling the totality story
Brian Goetz
brian.goetz at oracle.com
Thu Mar 3 20:42:25 UTC 2022
Given the misconceptions about totality and whether "pattern matching
means the same thing in all contexts", it is pretty clear that the story
we're telling is not evoking the right concepts. It is important not
only to have the right answer, but also to have the story that helps
people understand why it's right, so let's take another whack at that.
(The irony here is that it only makes a difference at one point -- null
-- which is the least interesting part of the story. So it is much
sound and fury, signifying nothing.)
None of what we say below represents a change in the language; it is
just test-driving a new way to tell the same story. (From a
mathematical and specification perspective, it is a much worse story,
since it is full of accidental detail about nulls (among other sins),
but it might make some people happier, and maybe we can learn something
from the reaction.)
The key change here is to say that every pattern-matching construct
needs special rules for null. This is already true, so we're just
doubling down on it. None of this changes anything about how it works.
#### Instanceof
We define:
x instanceof P === (x != null) && x matches P
This means that instanceof always says false on null; on non-null, we
can ask the pattern. This is consistent with instanceof today. P will
just never see a null in this case. We'll define "matches" soon.
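As a concrete illustration of the `(x != null) && x matches P` rule, here is a minimal sketch using type patterns in instanceof (JEP 394); the class and method names are made up:

```java
// Sketch: instanceof filters out null before the pattern is consulted,
// so the pattern P never sees a null target.
public class InstanceofNull {
    static boolean matchesString(Object x) {
        // Equivalent to: (x != null) && x matches `String s`
        return x instanceof String s;
    }

    public static void main(String[] args) {
        System.out.println(matchesString("hello")); // true
        System.out.println(matchesString(null));    // false: null filtered out
        System.out.println(matchesString(42));      // false: wrong type
    }
}
```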
#### Switch
We play the same game with switch. A switch:
    switch (x) {
        case P1:
        case P2:
        ...
    }
is defined to mean:
- if x is null
    - if one of P1..Pn is the special label "null", then we take that branch
    - else we throw NPE
- otherwise, we start trying to match against P1..Pn sequentially
    - if none match (the value is in the remainder), throw MatchException
This is consistent with switch today, in that there are no null labels,
so null always falls into the NPE bucket. It is also easy to keep track
of; if the switch does not say "case null", it does not match null.
None of the patterns P1..Pn will ever see a null.
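The null rule for switches can be sketched with pattern-switch syntax (as previewed in JEP 420); the method names are illustrative:

```java
// Sketch: switch intercepts null before trying any patterns.
public class SwitchNull {
    // With a `case null`, null takes that branch.
    static String describe(Object x) {
        return switch (x) {
            case null     -> "null label taken";
            case String s -> "string: " + s;
            default       -> "something else";
        };
    }

    // No `case null` here, so null falls into the NPE bucket;
    // the patterns never see it.
    static String describeNoNull(Object x) {
        return switch (x) {
            case String s -> "string: " + s;
            default       -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(null));   // null label taken
        try {
            describeNoNull(null);
        } catch (NullPointerException e) {
            System.out.println("NPE: no null label");
        }
    }
}
```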
#### Covering patterns
Let's define what it means for a pattern to cover a type T. This is just
a new name for totality; we say that an "any" pattern covers all types
T, and a type pattern `U u` covers T when `T <: U`. (When we do
primitive type patterns, there will be more terms in this list, but
they'll still be type patterns.) No other pattern is a covering
pattern. Covering patterns all have a single binding (`var x`, `String
s`).
#### Let
This is where it gets ugly. Let has no opinions about null, but the
game here is to pretend it does. So in a let statement:
let P = e
- Evaluate e
- If the result is null
    - If P is a covering pattern, its binding is bound to null;
    - else we throw NPE
- Otherwise, the result is matched to P. If it does not match (it is in
  the remainder), a MatchException is thrown (or the else clause is
  taken, if one is present.)
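Since `let` is a proposed construct, not current Java, those rules can only be sketched in pseudocode (the `Point` name and syntax here are hypothetical):

```
Point p = ...;

let Point(var x, var y) = p;
// - p is evaluated
// - Point(var x, var y) is not a covering pattern, so if p is null: NPE
// - otherwise p is matched to the pattern, binding x and y

let var q = p;
// - `var q` is a covering pattern, so even a null p is simply
//   bound to q, with no matching step and no NPE
```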
#### Nesting
Given a nested pattern R(P), where R is `record R(T t) { }`, this means:
- Match the target to record R, and note the resulting t
    - if P is a covering pattern, we put the resulting `t` in the
      binding of P without further matching
    - Otherwise, we match t to P
In other words: when a "non-trivial" (i.e., non-covering) pattern P is
nested in a record pattern, then:
x instanceof R(P) === x instanceof R(var a) && a instanceof P
and a covering pattern is interpreted as "not really a pattern", and we
just slam the result of the outer binding into the inner binding.
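Using record-pattern syntax (as previewed in JEP 405), the covering vs. non-covering distinction at a nesting site can be sketched as follows; the record and method names are made up:

```java
// Sketch: a covering nested pattern has the component value slammed into
// its binding, while a non-covering one unrolls to an instanceof test.
public class Nesting {
    record R(Object t) {}

    // `var a` covers Object: a null component still matches,
    // and a is simply bound to null.
    static boolean coveringNested(R r) {
        return r instanceof R(var a);
    }

    // `String s` does not cover Object, so this unrolls to
    // r instanceof R(var a) && a instanceof String, which null fails.
    static boolean nonCoveringNested(R r) {
        return r instanceof R(String s);
    }

    public static void main(String[] args) {
        R holdsNull = new R(null);
        System.out.println(coveringNested(holdsNull));    // true
        System.out.println(nonCoveringNested(holdsNull)); // false
        System.out.println(coveringNested(null));         // false: instanceof filters top-level null
    }
}
```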
#### Matching
What we've done now is ensure that no pattern ever actually encounters a
null, because the enclosing constructs always filter the nulls out.
This means we can say:
- var patterns match everything
- `T t` is gated by `instanceof T`
- R(P) is gated by `instanceof R`
## Is this a better way to tell the story?
This is clearly a much more complex way to tell the story; every
construct must have a special case for null, and we must be prepared to
treat simple patterns (`var x`, `String s`) not always as patterns, but
sometimes as mere declarations whose bindings will be forcibly crammed
from "outside".
Note too that this will not scale to declared total patterns that treat null
as a valid target value, if we ever want them; it depends on being able
to "peek inside" the pattern from the outside, and say "oh, you're a
covering pattern, I can set your binding without asking you." That
won't work with declared total patterns, which are opaque and may have
multiple bindings. (Saying "this doesn't bother me" is saying "I'm fine
with a new null gate for total declared patterns.")
It also raises the profile of null in the story, which kind of sucks;
while it does not add new "null gates", it is a story that is prominently
parameterized by null. (Which is a funny outcome for a system
explicitly designed to *not* create new null gates!)
#### Further editorializing
So, having written it out, I dislike it even more than I thought I
would when I started writing this mail. Overall this seems like a lot of
mental gymnastics to avoid acknowledging the reality that pattern
matching has three parameters, not two -- the pattern, the target, and
the *static type* of the target.
I think the root cause here is that when we faced the lump-or-split
decision of "do we reuse instanceof for pattern matching or not" back in
Java 13, the choice to lump (which was a rational and defensible choice,
but so was splitting) created an expectation that "no pattern ever
matches null", because instanceof is the first place people saw
patterns, and instanceof never matches null. While this expectation is
only consistent with the simplest of patterns (and is only problematic
at null), there seems to be a desire to force "the semantics of matching
pattern P is whatever instanceof says matches P."
I suspect that had we chosen the split route for instanceof (which would
not have been as silly as choosing the split route for switch, but still
lumping seemed better), we would not be facing this situation. But even
if we decide, in hindsight, that lumping was a mistake, I do not want to
compound this mistake by distorting the meaning of pattern matching to
cater to it. It might seem harmless from the outside, but it will
dramatically limit how far we can push pattern matching in the future.
## Tweaking the current story
So, I think the main thing we can control about the story is the
terminology. I think part of what people find confusing is the use of
the term "total", since that's a math-y term, and also it collides with
"exhaustive", which is similar but not entirely coincident.
I won't try to find new terms here, but I think there are three relevant
terms needed:
- "Pattern set P* is exhaustive on type T." This is a static
property; it means that the pattern set P* covers all the "reasonable"
values of T. Null is only one "potentially unreasonable" value.
- "Pattern P matches expression e". This is a dynamic property.
- "Pattern P is total on type T". This is both a static and dynamic
property; it means the pattern matches all values of T, and that we know
this statically.
What we've been calling "remainder" is the set of T values that are not
matched by any of the patterns in a set P* that is exhaustive on T.
(Exhaustiveness is static; matching is dynamic; the difference is
remainder.)
I think the name "exhaustive" is pretty good; we've been using the term
"exhaustiveness checking" for switch for a while. The problem could be
"matches", as it has a dynamic sound to it; the problem could also be
"total". The above excursion calls totality "covering", but having both
"exhaustive" and "covering" in the same scheme is likely to confuse
people, so I won't suggest that. Perhaps "statically total"? This has
the nice implication that matching is statically known to always
succeed. This would give us:
- If P is statically total on T, it is exhaustive on T;
- Any patterns (var x) are statically total on all T;
- Type patterns `T t` are statically total on `U <: T` (and more, when
we get to primitive type patterns)
- Statically total patterns match null (because they match everything)
- instanceof intercepts nulls: x instanceof P === x != null && x matches P
- switch intercepts nulls: as above
- Let does not have any new opinions about null
- Nesting unrolls to instanceof only for non-statically-total nested
patterns
Also I think being more explicit about the switch/instanceof rules will
help, as this filters out all top-level null reasoning.
More information about the amber-spec-observers mailing list