Opting into totality

Mon Aug 24 21:04:40 UTC 2020

> I am going to argue here that, just as fear of letting nulls flow 
> stemmed from a early design that conflated multiple design issues as a 
> result of extrapolating from too few data points (enums and strings), 
> we have been boxed into another corner because we conflated 
> expression-ness and the need for totality.

I'm not sure we _conflated_ the two, as we did this with our eyes open 
(and fairly recently), but I suspect I agree with the rest -- that for 
$REASONS, we introduced an asymmetry that we knew would come back to 
bite us, and left a note for ourselves to come back and revisit, 
especially as optimistic totality became more important (e.g., through 
sealed types.)

> Going back to the dawn of time, a switch statement does not have to be 
> total.  Why is this possible?  Because there is an obvious default 
> behavior: do nothing.  If we were to view it in terms of delivering a 
> value of some type, we would say that type is “void”.

Yep.  Cue usual comparison to "if without else."  Partiality, for 
statements, is OK, but not for expressions.  Can't have a ternary 
expression with no `: alternative` part.

> Then why did we not allow a switch expression to be _exactly_ 
> analogous?  In fact, we could have, by relying on existing precedent 
> in the language: if no switch label matches and there is no default, 
> or if execution of the statements of the switch block completes 
> normally, we could simply decree that a switch expression has the 
> default behavior “do nothing” and delivers a _default value_—exactly 
> as we do for initialization of fields and array components.  So for
>
> enum Color { RED, GREEN, BLUE }
> Color x = …
> int n = switch (x) { RED -> 1; GREEN -> 2; };
>
> then if x is BLUE, n will get the value 0.
>
> But I am guessing that we worried about programming errors and 
> demanded totality for switch expressions, so we enforced it by fiat 
> because we had no other mechanism to request totality.

Yes, that's right.  Note that this is analogous to another feature that 
is frequently requested -- the so-called "safe dereference" operators, 
where `x?.y` yields a default value if `x` is null.  When this one first 
reared its head (and about a thousand times since, since it comes up 
near-constantly), our objection was that, while `null` might conceivably 
be a reasonable default for such an expression if `y` is of reference 
type, yielding a default primitive value is more likely to lead to 
errors than not.  (This is not unlike the problem with `Map::get`, where 
`null` means both "mapping not present" and "element mapped to null", 
and there's no way to tell the difference unless you can freeze the map 
for updates while asking two questions (Map::containsKey and 
Map::get.)   The argument against both is the same.

> The first is a switch label of the form “default <pattern>”, which 
> behaves just like a switch label “case <pattern>” except that it is a 
> static error if the <pattern> is not total on the type of the selector 
> expression.  This mechanism is good for extensible type hierarchies, 
> where we expect to call out a number of special cases and then have a 
> catch-all case, and we want the compiler to confirm to us on every 
> compilation that the catch-all case actually does catch everything.

What I like about this use of default is that it decomposes into 
independent parts.   The `default` case means "everything else"; 
`default <total-pattern>` means "everything else, and destructure that 
everything else with this pattern, which had better be total because, 
well, I'm matching everything else."   THe addition of a pattern doesn't 
change the meaning of default; it is a form of composition.  You could 
recast this as taking one feature -- "patterns in switch" -- and turning 
it into two: patterns in case labels, and patterns in default labels.

> The second is the possibility of writing “switch case” rather than 
> “switch”, which introduces these extra constraints on the switch 
> block: It is a static error if any SwitchLabel of the switch statement 
> begins with “default".  It is a static error if the set of case 
> patterns is not at least optimistically total on the type of the 
> selector expression.  It is a static error if the last BlockStatement 
> in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, 
> can complete normally.  It is a static error if any SwitchLabel of the 
> switch statement is not part of a SwitchBlockStatementGroup or 
> SwitchRule.  In addition, the compiler automatically inserts 
> SwitchBlockStatementGroups or SwitchRules to cover the residue, so as 
> to throw an appropriate error at run time if the value produced by the 
> selector expression belongs to the residue.  This mechanism is good 
> for enums and sealed types, that is, situations where we expect to 
> enumerate all the special cases explicitly and want to be notified by 
> the compiler (or failing that, at run time) if we have failed to do so.

Is it fair to recharacterize this as follows -- we divide 
potentially-total statement switches into two kinds:
  - Those that arrive at totality via a catch-all clause;
  - Those that arrive at totality via an (optimistic) covering of an 
enumerated domain ("switch by parts")

and we provide a separate mechanism for each to declare their totality 
(which engages different type checking and translation.)

(I see later that you say this, so good, we're on the same page.)

(We might have called this "enum switch", especially if we had taken 
Alan's suggestion of declaring sealed types with a more enum-like syntax.)

<digression>
Some comments on the restrictions:

> It is a static error if the last BlockStatement in _any_ 
> SwitchBlockStatementGroup, or the Block in any SwitchRule, can 
> complete normally.

This is mostly already true in general; we envision it is a static error 
if you fall into, or out of, any case that has bindings.  (We might 
relax this to allow falling _into_ a total case.)   This was a 
simplification; we could support binding merging, but the 
return-on-complexity didn't seem quite there.

</digression>

> •  If the type of the selector expression is not an enum type, then 
> either the “switch case” form is used or there is exactly 
> one default label associated with the switch block.

I presume you intend that this eventually becomes true for switches on 
sealed types as well.

One question I have is that optimistic totality can apply in switches 
that are not on sealed types, or where sealed types show up in nested 
contexts.  For example:

     switch (box) {
         case Box(var x): ...
     }

Here, I think we want to say that Box(var x) is o.t. on Box, but it 
doesn't match null.   So how does the programmer indicate that they want 
to get totality checking and residue rejection?

Similarly, suppose we have a sealed type Shape = Circ + Rect, and the 
obvious container Box<Shape>:

     switch (boxOfShape) {
         case Box(Circ c):
         case Box(Rect r):
     }

Again, I think we want this set of cases to be o.t., but the switch is 
not on a sealed type.  I am not sure how to integrate these cases into 
your model.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20200824/4652be6f/attachment-0001.htm>