[pattern-switch] Exhaustiveness

Thu Aug 20 15:56:20 UTC 2020

Tagir's question about exhaustiveness in switches points to some 
technical debt left over from expression switches.

(Note: this entire discussion has nothing to do with whether `case 
Object o` is nullable; it has strictly to do with extending the existing 
treatment of exhaustive switches over enums to sealed classes, when we 
can conclude a switch over such a type is total without a default / 
total case, and what implicit cases we have to insert to make up for 
that.  Please let's not conflate this thread with that issue.)

When we did expression switch, for an exhaustive switch that covered all 
the enums without a default, we inserted an extra catch-all case that 
throws ICCE, on the theory that nulls are already checked by the switch 
and so anything that hits the synthetic default must be a novel enum 
value, which merits an ICCE.  This worked for enum switches (where all 
case labels are discrete values), but doesn't quite scale to sealed 
types. Let's fix that.

As a recap, suppose we have

     enum E { A, B; }

and suppose that, via separate compilation, a novel value C is 
introduced that was unknown at the time the switch was compiled.

An "exhaustive" statement switch on E:

     switch (e) {
         case A:
         case B:
     }

throws NPE on null but does nothing on C, because switch statements make 
no attempt at being exhaustive.

An _expression_ switch that is deemed exhaustive without a default case:

     var s = switch (e) {
         case A -> ...
         case B -> ...
     }

throws NPE on null and ICCE on C.

At the time, we were concerned about the gap between statement and 
expression switches, and talked about having a way to make statement 
switches exhaustive.  That's still on the table, and we should still 
address this, but that's not the subject of this mail.

What I want to focus on in this mail is the interplay between 
exhaustiveness analysis and (exhaustive) switch semantics, and what code 
we have to inject to make up for gaps.  We've identified two sources of 
gaps: nulls, and novel enum values. When we get to sealed types, we can 
add novel subtypes to the list of things we have to detect and 
implicitly reject; when we get to deconstruction patterns, we need to 
address these at nested levels too.

Let's analyze switches on Container<Shape> assuming:

     Container<T> = Box<T> | Bag<T>
     Shape = Rect | Circle

and assume a novel shape Pentagon shows up unexpectedly via separate 
compilation.

If we have a switch _statement_ with:

     case Box(Rect r)
     case Box(Circle c)
     case Bag(Rect r)
     case Bag(Circle c)

then the only value we implicitly handle is null; everything else just 
falls out of the switch, because they don't try to be exhaustive.

If this is an expression switch, then I think its safe to say:

  - The switch should deemed exhaustive; no Box(null) etc cases needed.
  - We get an NPE on null.

But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon).  
We have to do something (the switch has to be total) with these, and 
again, asking users to manually handle these is unreasonable.  A 
reasonable strawman here is:

     ICCE on Box(Pentagon) and Bag(Pentagon)
     NPE on Box(null) and Bag(null)

Essentially, what this means is: we need to explicitly consider null and 
novel values/types of enum/sealed classes in our exhaustiveness 
analysis, and, if these are not seen to be explicitly covered and the 
implicit coverage plays into the conclusion of overall weak totality, 
then we need to insert implicit catch-alls for these cases.

If we switch over:

     case Box(Rect r)
     case Box(Circle c)
     case Box b
     case Bag(Rect r)
     case Bag(Circle c)

then Box(Pentagon) and Box(null) are handled by the `Box b` case and 
don't need to be handled by a catch-all.

If we have:

     case Box(Rect r)
     case Box(Circle c)
     case Bag(Rect r)
     case Bag(Circle c)
     default

then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the 
default case, so no special handling is needed there.

Are we in agreement on what _should_ happen in these cases?  If so, I 
can put a more formal basis on it.

On 8/14/2020 1:20 PM, Brian Goetz wrote:
>>
>>  - Exhaustiveness and null. (Tagir)  For sealed domains (enums and 
>> sealed types), we kind of cheated with expression switches because we 
>> could count on the switch filtering out the null.  But Tagir raises 
>> an excellent point, which is that we do not yet have a sound 
>> definition of exhaustiveness that scales to nested patterns (do 
>> Box(Rect) and Box(Circle) cover Box(Shape)?)  This is an interaction 
>> between sealed types and patterns that needs to be ironed out.  
>> (Thanks Tagir!)
>
> [ Breaking this out from Tagir's more comprehensive reply ]
>
>> It's unclear for me how exhaustiveness on nested patterns plays with
>> null. case Box(Circle c) and case Box(Rect r) don't cover case
>> Box(null) which is a valid possibility for Box<Shape> type.
>
> It’s not even clear how exhaustiveness plays with null even without 
> nesting, so let's start there.
>
> Consider this switch:
>
> switch (trafficLight) {
> case GREEN, YELLOW: driveReallyFast();
> case RED: sigh();
>     }
>
> Is it exhaustive?  Well, we want to say yes.  And with the existing 
> null-hostility of switch, it is.  But even without that, we’d like to 
> say yes, because a null enum value is almost always an error, and 
> making users deal with cases that don’t happen in reality is kind of 
> rude.
>
> For a domain sealed to a set of alternatives (enums or sealed 
> classes), let’s say that a set of patterns is _weakly exhaustive_ if 
> it covers all the alternatives but not null, and _strongly exhaustive_ 
> if it also covers null.  When we did switch expressions, we said that 
> weakly exhaustive coverings didn’t need a default in a switch 
> expression.  I think we’re primed to say the same thing for sealed 
> classes.  But, this “weak is good enough” leans on the fact that the 
> existing hostility of switch will cover what we miss.  We get no such 
> cover in nested cases.
>
> I think it’s worth examining further why we are willing to accept the 
> weak coverage with enums.  Is it really that we’re willing to assume 
> that enums just should never be null?  If we had type cardinalities in 
> the language, would we treat `enum X` as declaring a cardinality-1 
> type called X?  I think we might.  OK, what about sealed classes? 
>  Would the same thing carry over?  Not so sure there.  And this is a 
> problem, because we ultimately want:
>
> case Optional.of(var x):
> case Optional.empty():
>
> to be exhaustive on Optional<T>, and said exhaustiveness will likely 
> lean on some sort of sealing.
>
> This is related to Guy's observation that totality is a "subtree all 
> the way down" property.  Consider:
>
>     sealed class Container<T> permits Box, Bag { }
>     sealed class Shape permits Rect, Circle { }
>
> Ignoring null, Box+Bag should be exhaustive on container, and 
> Rect+Circle should be exhaustive on shape.  So if we are switching 
> over a Container<Shape>, then what of:
>
>     case Box(Rect r):
>     case Box(Circle c):
>     case Bag(Rect r):
>     case Bag(Circle c):
>
> We have some "nullity holes" in three places: Box(null), Bag(null), 
> and null itself.   Is this set of cases exhaustive on Bags, Boxes, or 
> Containers?
>
> I think users would like to be able to write the above four cases and 
> treat it as exhaustive; having to explicitly provide Box(null) / Box 
> b, Bag(null) / Bag b, or a catch-all to accept 
> null+Box(null)+Bag(null) would all be deemed unpleasant ceremony.
>
> Hmm...
>