Optimistic totality

Mon Aug 24 16:23:47 UTC 2020

As I mentioned yesterday, I think our ideas about totality and null 
handling were getting polluted by our desire to support intuitive, 
optimistic totality.  So let's try to separate them, by outlining some 
goals for optimistic totality.

First, I'll posit that we're now able to stand on a more solid foundation:

  - Null is just another value for purposes of pattern matching; total 
patterns match null.
  - Null is just another value for purposes of switches; switches will 
feed null into total cases.
  - The perceived null-hostility of switch is actually about switches on 
enums, boxes, and strings; in the general case, we don't want, or need, 
such null-hostility.

This is a much simpler story, and has many fewer sharp edges.

Declaring clarity on that, we now have two additional problems to solve:

  - How to fill the gap between (always total) expression switches and 
(always partial) statement switches.  Totality checking is a useful 
static analysis that can identify bugs earlier, and being able to 
restore symmetry in semantics (even if it requires asymmetry in syntax) 
reduces unexpected errors and potholes.

  - How to extend the optimistic totality of expression switches over 
enums (which is a very restricted case) to the more general case of 
switches over sealed types, and switches with weakly total cases (such 
as total deconstruction patterns.)

This mail will focus mostly on the second problem; I'll start another 
thread for the first.

The goal of optimistic totality handling is to allow users to write a 
set cases that covers the target "well enough" that a catch-all throwing 
default is not needed.  This has two benefits:

  - Let the compiler write the dumb do-nothing code, rather than making 
the user do it;
  - If the user writes a throwing catch-all, we lose the opportunity to 
type-check the assumption that the switch was total in the first place.

Users are well aware of the first benefit, but the second benefit is 
actually more important.  If the user writes:

     Freq frequency = switch (trafficLight) {
         case RED -> Freq.ofTHz(450);
         case YELLOW -> Freq.ofTHz(525);
         default -> throw ...;
     }

We are deprived of two ways to help:
  - We cannot tell whether the user meant for { RED, YELLOW } to cover 
the space, so we cannot offer helpful type checking of "you forgot green."
  - Even if the code, as written, does cover the space, if a new 
constant / permitted subtype is added later, we lose the opportunity to 
catch it at next compilation, and alert the user to the fact that their 
assumption of totality was broken by someone else.

On the other hand, if there is no default clause, we get exhaustiveness 
checking when the code is first written, and continual revalidation of 
this assumption on every recompile.

OK, so optimistic totality is good.  What does that really mean?  We 
already know one case:

  - Specifying all the known constants of an enum, but no default or 
null case.

Because this case is so limited, we handled this one pretty well in 12; 
we NPE on null, and ICCE on everything else.

Another (new) case is:

  - When we have a _weakly total_ (total except for null) pattern on the 
target type.  A key category of weakly total patterns are deconstruction 
patterns whose sub-patterns are total.  Such as:

     var x = switch (box) {
         case Box(var x) b -> ...
     }

The pattern `Box(var x)` matches all _non-null_ boxes.  (It can't match 
null boxes, because we'd be invoking the deconstructor with a null 
receiver, which would surely NPE anyway, since a deconstructor is going 
to have some `x = this.x` ~99.99% of the time.)  So, should this be good 
enough to declare the switch optimistically total?  I think so; having 
to say `case null` in this switch would be irritating to users for no 
good reason.

What we've done is flipped things around; rather than saying "switches 
NPE on null", we can say "total switches with optimistically total case 
sets can throw on silly inputs" -- because the very concept of 
optimistic totality suggests that we think the residue consists only of 
silly inputs (and we are only throwing when the switch is total 
anyway.)  Now we can have a more refined definition of silly inputs.

Another case:

  - The sealed class analogue of an enum switch.

Here, we have a sealed class C, and a set of patterns that, for every 
permitted subtype D of C, some subset of the patterns is 
(optimistically) total on D.  Now, our residue has two inhabitants: 
null, and novel subclasses.

Do we think this should be optimistically total?  Yes; all the reasons 
why a throwing default is bad on the enum case apply to the sealed case, 
there is just a larger residue set.

Another case:

  - When we have a deconstructor D(C), and a set of patterns 
D(P1)...D(Pn) such that P1..Pn are optimistically total on C, we would 
like to conclude that the lifted patterns are optimistically total on D.

Example:

     switch (boxOfShape) {   // Shape = Circle + Rect
         case Box(Circle c):
         case Box(Rect r):
     }

Our claim here is that because Circle + Rect are o.t. on Shape, 
Box(Circle)+Box(Rect) should be o.t. on Box<Shape>.  Do we buy that?  
Again, I think we want this; asking users to insert Box(null) or 
Box(novel) cases to get totality checking is counterproductive.

What we see here is that we have an accumulation of situations where we 
think a given set of patterns covers a target type "well enough" that we 
are willing to (a) let the user skate on being truly total, and (b) 
engage enhanced type checking against the set of "good enough" cases.

After writing this, I think we are, once again, being overly constrained 
by (and worse, distracted by) "consistency" with what we decided in 12 
for the simple case of enum switches: that the answer always has to be 
some form of ICCE or NPE.  These were easy answers when the residue was 
so easily characterized, but trying to extrapolate from them too rigidly 
may be a mistake.

So, I think that we should save NPE and ICCE for the more accurate, 
narrow uses we found for them in 12, and for any "complex" residue, just 
define a new exception type -- and focus our energy on ensuring we get 
good error messages out of it, and move past this distraction.

The real point here is defining what we consider to be acceptable 
residue for an optimistically total switch, and ensure that we can 
deliver clear error messages when a Box(Hexagon) shows up.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.java.net/pipermail/amber-spec-experts/attachments/20200824/0a8f9f43/attachment-0001.htm>