[pattern-switch] Totality

Wed Aug 26 19:27:40 UTC 2020

I like this new analysis.  See below.

> On Aug 26, 2020, at 1:01 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> I think we now have a sound story for totality in both patterns and switches.  Let's start with refining what we mean by totality.  
> 
> We have seen a lot of cases -- and not just those involving enums or sealed types -- where we want to say a pattern or set of patterns is "total enough" to not force the user to explicitly handle the corner cases.  In these cases, we will let the compiler handle the corner cases by generating exceptions.  
> 
> A prime example is the deconstruction pattern Foo(var x); this matches all Foos, but not null.  Similarly, there is a whole family of such corner cases: Foo(Bar(var x)) matches all Foos, except for null and Foo(null).  But we are in agreement that it would be overly pedantic to force the user to handle these explicitly.  
> 
> Note that these show up not only in switch, but in pattern assignment:
> 
>     Object o = e;    // Object o is a total pattern on typeof(e), so always succeeds
>     Foo(var x) = foo // Total on Foo, except null, so should NPE on null
>     Foo(Bar(var x)) = foo // Total on Foo, except null and Foo(null), throw on these
> 
>     var y = switch (foo) {
>         case Foo(var x) -> bar(x);  // Total on Foo, except null, so NPE on null
>     }
> 
> These goals come from a pragmatic desire to not pedantically require the user to spell out the residue, but to accept the pattern (or set of patterns) as total anyway.  
> 
> 
> Now, let's put this intuition on a sounder footing by writing some formal rules for saying that a pattern P is _total on T with remainder R_, where R is a _set of patterns_.  We will say P is _strongly total on T_ if P is total on T with empty remainder.  
> 
> The intuition is that, if P is total on T with remainder R, the values matched by R but not by P are "silly" values and a language construct like switch can (a) consider P sufficient to establish totality and (b) can insert synthetic tests for each of the patterns in R that throw.  
> 
> We will lean on this in _both_ switch and pattern assignment.  For switch, we will treat it as if we insert synthetic cases after P that throw, so that the remaining values can still be matched by earlier explicit patterns.  
> 
> Invariant: If P is total on T with remainder R then, for all t in T, either t matches P, or t matches some pattern in R.  (This is not the definition of what makes a pattern total; it is just something that is true about total patterns.)  
> 
> Base cases: 
> 
>  - The type pattern `T t` is strongly total on U <: T.  
>  - The inferred type pattern `var x` is strongly total on all T.  
> 
> Induction cases: 
> 
>  - Let D(T) be a deconstructor.  
>    - The deconstruction pattern D(Q), where Q is strongly total on T, is total on D with remainder { null }.  
>    - The deconstruction pattern D(Q), where Q is total on T with remainder R*, is total on D with remainder { null } union { D(R) : R in R* }

Note that the first sub-bullet is “merely” an important special case of the second sub-subbullet.

> We can easily generalize the definition of totality to a set of patterns.  In this case, we can handle sealed types and enums: 
> 
>  - Let E be an enum type.  The set of patterns { C1 .. Cn } is total on E with remainder { null, E e } if C1 .. Cn contains all the constants of E.  
> 
> Observation: that `E e` pattern is pretty broad!  But that's OK; it captures all novel constant values, and, because the explicit cases cover all the known values, captures only the novel values.  Same for sealed types: 
> 
>  - Let S be a sealed abstract type or interface, with permitted direct subtypes C*, and P* be a set of patterns applicable to S.  If for each C in C*, there exists a subset Pc of P* that is total on C with remainder Rc, then P* is total on S with remainder { null } union \forall{c \in C}{ Rc }.  

I think there should not be set braces around that last occurrence of “Rc”.

>  - Let D(T) be a deconstructor, and let P* be total on T with remainder R*.  Then { D(P) : P in P* } is total on D with remainder { null } \union { D(R) : R in R* }.  
> 
> Example: 
>   Container<T> = Box<T> | Bag<T>
>   Shape = Circle | Rect
>   P* = { Box(Circle), Box(Rect), Bag(Circle), Bag(Rect) }

I assume that this use of syntax “T = U | V” is meant to imply that T is a sealed type that permits U and V.

> { Circle, Rect } total on Shape with remainder { Shape, null }
> 
> -> { Box(Circle), Box(Rect) total on Box<Shape> with remainder { Box(Shape), Box(null), null }
> 
> -> { Bag(Circle), Bag(Rect) total on Bag<Shape> with remainder { Bag(Shape), Bag(null), null }
> 
> -> P* total on Container<Shape> with remainder { Container(Box(Shape)), Container(Box(null)), Container(Bag(Shape)), Container(Box(Shape)), Container(null), null }

I believe that, in this last remainder, the second occurrence of `Container(Box(Shape))` was intended to be `Container(Bag(null))`.

But I also think that the phrase `Container(Box(Shape))` is inconsistent or incoherent; there has been some confusion of the _type parameter_ of `Container` with a _deconstruction parameter_ of `Container`.  To say `Container(Box(…))` is as silly as to say `Shape(Rect)`.

I will try to redo this derivation while being very explicit about the type parameters:

	Container<T> = Box<T> | Bag<T>
	Shape = Circle | Rect
	P* = { Box<Shape>(Circle), Box<Shape>(Rect), Bag<Shape>(Circle), Bag<Shape>(Rect) }

	{ Box<T>(T), Bag<T>(T) } total on Container<T> with remainder { Container<T>, null }

Now instantiate the previous rule with T=Shape to get

	{ Box<Shape>(Shape), Bag<Shape>(Shape) } total on Container<Shape>(Shape) with remainder { Container<Shape>(Shape), null }

	{ Circle, Rect } total on Shape with remainder { Shape, null }

	-> { Box<Shape>(Circle), Box<Shape>(Rect) total on Box<Shape> with remainder { Box<Shape>(Shape), Box<Shape>(null), null }

	-> { Bag<Shape>(Circle), Bag<Shape>(Rect) total on Bag<Shape> with remainder { Bag<Shape>(Shape), Bag<Shape>(null), null }

	-> P* total on Container<Shape> with remainder
			{ Box<Shape>(Shape), Box<shape>(null), Bag<Shape>(Shape), Bag<Shape>(null), Container<Shape>, null }

> Now:
> 
>  - A pattern assignment `P = e` requires that P be total on the static type of `e`, with some remainder R, and throws on the remainder.  
>  - A total switch on `e` requires that its cases be total on the static type of `e`, with some remainder R, and inserts synthetic throwing cases at the end of the switch for each pattern in R.

Yes!

To which we can perhaps add: a pattern `instanceof` expression `x instanceof P` requires that the pattern P _not_ be total on the type of x.

> We can then decide how to opt into totality in switches, other than "be an expression switch.”

Yes!