[pattern-switch] Totality

Brian Goetz brian.goetz at oracle.com
Wed Aug 26 17:01:16 UTC 2020


I think we now have a sound story for totality in both patterns and 
switches.  Let's start with refining what we mean by totality.

We have seen a lot of cases -- and not just those involving enums or 
sealed types -- where we want to say a pattern or set of patterns is 
"total enough" to not force the user to explicitly handle the corner 
cases.  In these cases, we will let the compiler handle the corner cases 
by generating exceptions.

A prime example is the deconstruction pattern Foo(var x); this matches 
all Foos, but not null.  Similarly, there is a whole family of such 
corner cases: Foo(Bar(var x)) matches all Foos, except for null and 
Foo(null).  But we are in agreement that it would be overly pedantic to 
force the user to handle these explicitly.

Note that these show up not only in switch, but in pattern assignment:

     Object o = e;    // Object o is a total pattern on typeof(e), so 
always succeeds
     Foo(var x) = foo // Total on Foo, except null, so should NPE on null
     Foo(Bar(var x)) = foo // Total on Foo, except null and Foo(null), 
throw on these

     var y = switch (foo) {
         case Foo(var x) -> bar(x);  // Total on Foo, except null, so 
NPE on null
     }

These goals come from a pragmatic desire to not pedantically require the 
user to spell out the residue, but to accept the pattern (or set of 
patterns) as total anyway.


Now, let's put this intuition on a sounder footing by writing some 
formal rules for saying that a pattern P is _total on T with remainder 
R_, where R is a _set of patterns_. We will say P is _strongly total on 
T_ if P is total on T with empty remainder.

The intuition is that, if P is total on T with remainder R, the values 
matched by R but not by P are "silly" values and a language construct 
like switch can (a) consider P sufficient to establish totality and (b) 
can insert synthetic tests for each of the patterns in R that throw.

We will lean on this in _both_ switch and pattern assignment.  For 
switch, we will treat it as if we insert synthetic cases after P that 
throw, so that the remaining values can still be matched by earlier 
explicit patterns.

Invariant: If P is total on T with remainder R then, for all t in T, 
either t matches P, or t matches some pattern in R.  (This is not the 
definition of what makes a pattern total; it is just something that is 
true about total patterns.)

Base cases:

  - The type pattern `T t` is strongly total on U <: T.
  - The inferred type pattern `var x` is strongly total on all T.

Induction cases:

  - Let D(T) be a deconstructor.
    - The deconstruction pattern D(Q), where Q is strongly total on T, 
is total on D with remainder { null }.
    - The deconstruction pattern D(Q), where Q is total on T with 
remainder R*, is total on D with remainder { null } union { D(R) : R in R* }

We can easily generalize the definition of totality to a set of 
patterns.  In this case, we can handle sealed types and enums:

  - Let E be an enum type.  The set of patterns { C1 .. Cn } is total on 
E with remainder { null, E e } if C1 .. Cn contains all the constants of E.

Observation: that `E e` pattern is pretty broad!  But that's OK; it 
captures all novel constant values, and, because the explicit cases 
cover all the known values, captures only the novel values.  Same for 
sealed types:

  - Let S be a sealed abstract type or interface, with permitted direct 
subtypes C*, and P* be a set of patterns applicable to S.  If for each C 
in C*, there exists a subset Pc of P* that is total on C with remainder 
Rc, then P* is total on S with remainder { null } union \forall{c \in 
C}{ Rc }.

  - Let D(T) be a deconstructor, and let P* be total on T with remainder 
R*.  Then { D(P) : P in P* } is total on D with remainder { null } 
\union { D(R) : R in R* }.

Example:
   Container<T> = Box<T> | Bag<T>
   Shape = Circle | Rect
   P* = { Box(Circle), Box(Rect), Bag(Circle), Bag(Rect) }

{ Circle, Rect } total on Shape with remainder { Shape, null }

-> { Box(Circle), Box(Rect) total on Box<Shape> with remainder { 
Box(Shape), Box(null), null }

-> { Bag(Circle), Bag(Rect) total on Bag<Shape> with remainder { 
Bag(Shape), Bag(null), null }

-> P* total on Container<Shape> with remainder { Container(Box(Shape)), 
Container(Box(null)), Container(Bag(Shape)), Container(Box(Shape)), 
Container(null), null }


Now:

  - A pattern assignment `P = e` requires that P be total on the static 
type of `e`, with some remainder R, and throws on the remainder.
  - A total switch on `e` requires that its cases be total on the static 
type of `e`, with some remainder R, and inserts synthetic throwing cases 
at the end of the switch for each pattern in R.


We can then decide how to opt into totality in switches, other than "be 
an expression switch."


On 8/21/2020 4:18 PM, Brian Goetz wrote:
>
>
> On 8/21/2020 11:14 AM, Brian Goetz wrote:
>>
>> Next up (separate topic): letting statement switches opt into totality.
>>
>
> Assuming the discussion on Exhaustiveness is good, let's talk about 
> totality.
>
> Expression switches must be total; we totalize them by throwing when 
> we encounter any residue, even though we only require that the set of 
> cases in the switch be optimistically total.  Residue includes:
>
>  - `null` switch targets in String, Enum, and primitive box switches only;
>  - novel values in enum switches without a total case clause;
>  - novel subtypes in switches on sealed types without a total case clause;
>  - when an optimistically total subchain of deconstruction pattern 
> cases wraps a residue value (e.g., D(null) or D(novel))
>
> What about statement switches?  Right now, any residue for a statement 
> switch without a total case clause will just be silently ignored 
> (because statement switches need not be total.)
>
> What we would like is a way to say "this switch is total, please type 
> check it for me as such, and insert any needed residue-catching 
> cases."  I think this is a job for `default`.
>
> Now that we've got some clarity that switches _don't_ throw on null, 
> but instead it is as if string/enum/box switches have an implicit 
> `case null` when no explicit one is present, we can define `default`, 
> once again, to be total (and not just weakly total.)  So in:
>
>     switch (object) {
>         case "foo":
>         case Box(Frog fs):
>         default: ...
>     }
>
> a `null` just falls into `default` just like anything else that is not 
> the string "foo" or a box of frogs ("let the nulls flow"). Default 
> would have to come last (except in legacy switches, where a legacy 
> switch has one of the distinguished target types and all constant case 
> labels.)
>
> What if we want to destructure too?  Well, add a pattern:
>
>     switch (object) {
>         case "foo":
>         case Box(Frog fs):
>         default Object o: ...
>     }
>
> This would additionally assert that the following pattern is total, 
> otherwise a compilation error ensues.  (Note, though, that this is 
> entirely about `switch`, not patterns.  The semantics of the pattern 
> is unchanged, and I do not believe that sprinkling `default` into 
> nested patterns to shout "TOTALITY HERE, I MEAN IT" carries its weight.)
>
> This seems a better job to give default in this new world; anything 
> not previously matched, where we retcon the current null behavior as 
> being only about string, enum, or boxes.
>
> This leaves us with only one hole, which is: suppose I have an 
> _optimistically total_ statement switch.   Users might like to (a) 
> assert the switch is total, and get the concomitant type checking, and 
> (b) get residue ejection for free.  Of the two, though, A is much more 
> important than B, but we'll take B when we can get it. Perhaps, if the 
> target of a switch is a sealed type, we can interpret:
>
>     switch (shape) {
>         case Rect r: ...
>         default Circle c: ...
>     }
>
> as meaning that `Circle c` _closes_ the switch to make it total, and 
> engages the totality checking to ensure this is true.  So, `default P` 
> would mean either:
>
>  - P is total, or
>  - P is not total, but taken with the other cases, makes the switch 
> optimistically total
>
> and in the latter case, would engage the 
> residue-detection-and-ejection machinery.
>
> This might be stretching it a tad too far, but I like that we can 
> given `default` useful new jobs to do in `switch` rather than just 
> giving him a gold watch.
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20200826/8133945c/attachment.htm>


More information about the amber-spec-experts mailing list