Opting into totality

Mon Aug 24 22:01:16 UTC 2020

> On Aug 24, 2020, at 5:04 PM, Brian Goetz <brian.goetz at oracle.com> wrote:
> 
> 
> 
>> I am going to argue here that, just as fear of letting nulls flow stemmed from a early design that conflated multiple design issues as a result of extrapolating from too few data points (enums and strings), we have been boxed into another corner because we conflated expression-ness and the need for totality.  
> 
> I'm not sure we _conflated_ the two, as we did this with our eyes open (and fairly recently), but I suspect I agree with the rest -- that for $REASONS, we introduced an asymmetry that we knew would come back to bite us, and left a note for ourselves to come back and revisit, especially as optimistic totality became more important (e.g., through sealed types.)  
> 
>> Going back to the dawn of time, a switch statement does not have to be total.  Why is this possible?  Because there is an obvious default behavior: do nothing.  If we were to view it in terms of delivering a value of some type, we would say that type is “void”.
> 
> Yep.  Cue usual comparison to "if without else."  Partiality, for statements, is OK, but not for expressions.  Can't have a ternary expression with no `: alternative` part.  
> 
>> Then why did we not allow a switch expression to be _exactly_ analogous?  In fact, we could have, by relying on existing precedent in the language: if no switch label matches and there is no default, or if execution of the statements of the switch block completes normally, we could simply decree that a switch expression has the default behavior “do nothing” and delivers a _default value_—exactly as we do for initialization of fields and array components.  So for
>> 
>> 	enum Color { RED, GREEN, BLUE }
>> 	Color x = …
>> 	int n = switch (x) { RED -> 1; GREEN -> 2; };
>> 
>> then if x is BLUE, n will get the value 0.
>> 
>> But I am guessing that we worried about programming errors and demanded totality for switch expressions, so we enforced it by fiat because we had no other mechanism to request totality.
> 
> Yes, that's right.  Note that this is analogous to another feature that is frequently requested -- the so-called "safe dereference" operators, where `x?.y` yields a default value if `x` is null.  When this one first reared its head (and about a thousand times since, since it comes up near-constantly), our objection was that, while `null` might conceivably be a reasonable default for such an expression if `y` is of reference type, yielding a default primitive value is more likely to lead to errors than not.  (This is not unlike the problem with `Map::get`, where `null` means both "mapping not present" and "element mapped to null", and there's no way to tell the difference unless you can freeze the map for updates while asking two questions (Map::containsKey and Map::get.)   The argument against both is the same.  
> 
>> The first is a switch label of the form “default <pattern>”, which behaves just like a switch label “case <pattern>” except that it is a static error if the <pattern> is not total on the type of the selector expression.  This mechanism is good for extensible type hierarchies, where we expect to call out a number of special cases and then have a catch-all case, and we want the compiler to confirm to us on every compilation that the catch-all case actually does catch everything.
> 
> What I like about this use of default is that it decomposes into independent parts.   The `default` case means "everything else"; `default <total-pattern>` means "everything else, and destructure that everything else with this pattern, which had better be total because, well, I'm matching everything else."   THe     addition of a pattern doesn't change the meaning of default; it is a form of composition.  You could recast this as taking one feature -- "patterns in switch" -- and turning it into two: patterns in case labels, and patterns in default labels.  
> 
>> The second is the possibility of writing “switch case” rather than “switch”, which introduces these extra constraints on the switch block: It is a static error if any SwitchLabel of the switch statement begins with “default".  It is a static error if the set of case patterns is not at least optimistically total on the type of the selector expression.  It is a static error if the last BlockStatement in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, can complete normally.  It is a static error if any SwitchLabel of the switch statement is not part of a SwitchBlockStatementGroup or SwitchRule.  In addition, the compiler automatically inserts SwitchBlockStatementGroups or SwitchRules to cover the residue, so as to throw an appropriate error at run time if the value produced by the selector expression belongs to the residue.  This mechanism is good for enums and sealed types, that is, situations where we expect to enumerate all the special cases explicitly and want to be notified by the compiler (or failing that, at run time) if we have failed to do so.
> 
> Is it fair to recharacterize this as follows -- we divide potentially-total statement switches into two kinds: 
>  - Those that arrive at totality via a catch-all clause;
>  - Those that arrive at totality via an (optimistic) covering of an enumerated domain ("switch by parts")
> 
> and we provide a separate mechanism for each to declare their totality (which engages different type checking and translation.)  
> 
> (I see later that you say this, so good, we're on the same page.)
> 
> (We might have called this "enum switch", especially if we had taken Alan's suggestion of declaring sealed types with a more enum-like syntax.)
> 
> <digression>
> Some comments on the restrictions:
> 
>> It is a static error if the last BlockStatement in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, can complete normally.  
> 
> This is mostly already true in general; we envision it is a static error if you fall into, or out of, any case that has bindings.  (We might relax this to allow falling _into_ a total case.)   This was a simplification; we could support binding merging, but the return-on-complexity didn't seem quite there.
> 
> </digression>
> 
>> •  If the type of the selector expression is not an enum type, then either the “switch case” form is used or there is exactly one default label associated with the switch block.
> 
> I presume you intend that this eventually becomes true for switches on sealed types as well.

Yes, sorry, I didn’t state the conditions quite correctly, and I believe the correct way to state them will emerge once we work out the complete theory of optimistic totality.

> One question I have is that optimistic totality can apply in switches that are not on sealed types, or where sealed types show up in nested contexts.  For example:
> 
>     switch (box) {
>         case Box(var x): ...
>     }
> 
> Here, I think we want to say that Box(var x) is o.t. on Box, but it doesn't match null.   So how does the programmer indicate that they want to get totality checking and residue rejection?  

I believe that “switch case” can handle this:

    switch case (box) {
        case Box(var x): ...
    }

This says, among other things, that it is a static error if the (singleton) set of case patterns { Box(var x) } is not o.t. on the type of “box”, and it says we want residue checking, so it’s as if the compiler rewrote it to:

    switch case (box) {
	case null: throw <null residue error, which could be NPE or something else>
        case Box(var x): ...
    }

Alternatively, we could write

    switch (box) {
        default Box(var x): ...
    }

which says that it is a static error if the pattern Box(var x) is not total on the type of “box”.  It’s not, because it doesn’t match null, so we get a static error, as desired.  Perhaps we should have written

    switch (box) {
        case Box(var x): …
	default Box z: ...
    }

But I’m thinking the “switch case” solution is preferable for this specific example.

> Similarly, suppose we have a sealed type Shape = Circ + Rect, and the obvious container Box<Shape>: 
> 
>     switch (boxOfShape) {
>         case Box(Circ c): 
>         case Box(Rect r):
>     }
> 
> Again, I think we want this set of cases to be o.t., but the switch is not on a sealed type.  I am not sure how to integrate these cases into your model.

And again, I think that “switch case” does the job:

    switch case (boxOfShape) {
        case Box(Circ c): ...
        case Box(Rect r): ...
    }

This says, among other things, that it is a static error if the set of case patterns { Box(Circ c), Box(Rest r) } is not o.t. on the type of “boxOfShape”, and it says we want residue checking, so it’s as if the compiler rewrote it to:

    switch case (boxOfShape) {
	case null: throw <null residue error, which could be NPE or something else>
        case Box(Circ c): ...
        case Box(Rect r): …
	case Box(var _): throw <unknown residue error, which could be ICCE or something else>
    }