[pattern-switch] Exhaustiveness

Thu Aug 20 16:58:01 UTC 2020

Hello!

> Are we in agreement on what _should_ happen in these cases?

This sounds like a good idea. I see the following alternatives (not
saying that they are better, just putting them on the table).

1. Do not perform exhaustiveness analysis for nested patterns, except
if all nested components are total. So, in your example, either
explicit default, or case Box and case Bag will be necessary. This is
somewhat limiting but I'm not sure that the exhaustiveness of complex
nested patterns would be very popular. If one really needs this, they
could use nested switch expression:
return switch(container) {
  case Box(var s) -> switch(s) {case Rect r -> ...; case Circle c ->
...; /*optional case null is possible*/};
  case Bag(var s) -> switch(s) {case Rect r -> ...; case Circle c ->
...; /*optional case null is possible*/};
}
Here Box(var s) has total nested component, so it matches everything
that Box matches, the same with Bag, thus we perform exhaustiveness
analysis using Container declaration.
This approach does not close the door for us. We can rethink later and
add exhaustiveness analysis for nested patterns when we will finally
determine how it should look like. Note that this still allows making
Optional.of(var x) + Optional.empty exhaustive if we provide an
appropriate mechanism to declare this kind of patterns.

2. Allow deconstructors and records to specify whether null is a
possible value for a given component. Like make deconstructors and
records null-hostile by default and provide a syntax (T? or T|null or
whatever) to allow nulls. In this case, if the deconstructor is
null-friendly, then the exhaustive pattern must handle Box(null).
Otherwise, Box(null) is a compilation error. Yes, I know, this may
quickly evolve to the point when we will need to add a full-fledged
nullability to the type system. But probably it's possible to allow
nullability specification for new language features only? Ok, ok,
sorry, my imagination drove me too far away from reality. Forget it.

With best regards,
Tagir Valeev.

On Thu, Aug 20, 2020 at 10:57 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>
> Tagir's question about exhaustiveness in switches points to some technical debt left over from expression switches.
>
> (Note: this entire discussion has nothing to do with whether `case Object o` is nullable; it has strictly to do with extending the existing treatment of exhaustive switches over enums to sealed classes, when we can conclude a switch over such a type is total without a default / total case, and what implicit cases we have to insert to make up for that.  Please let's not conflate this thread with that issue.)
>
>
> When we did expression switch, for an exhaustive switch that covered all the enums without a default, we inserted an extra catch-all case that throws ICCE, on the theory that nulls are already checked by the switch and so anything that hits the synthetic default must be a novel enum value, which merits an ICCE.  This worked for enum switches (where all case labels are discrete values), but doesn't quite scale to sealed types.  Let's fix that.
>
> As a recap, suppose we have
>
>     enum E { A, B; }
>
> and suppose that, via separate compilation, a novel value C is introduced that was unknown at the time the switch was compiled.
>
> An "exhaustive" statement switch on E:
>
>     switch (e) {
>         case A:
>         case B:
>     }
>
> throws NPE on null but does nothing on C, because switch statements make no attempt at being exhaustive.
>
> An _expression_ switch that is deemed exhaustive without a default case:
>
>     var s = switch (e) {
>         case A -> ...
>         case B -> ...
>     }
>
> throws NPE on null and ICCE on C.
>
> At the time, we were concerned about the gap between statement and expression switches, and talked about having a way to make statement switches exhaustive.  That's still on the table, and we should still address this, but that's not the subject of this mail.
>
> What I want to focus on in this mail is the interplay between exhaustiveness analysis and (exhaustive) switch semantics, and what code we have to inject to make up for gaps.  We've identified two sources of gaps: nulls, and novel enum values.  When we get to sealed types, we can add novel subtypes to the list of things we have to detect and implicitly reject; when we get to deconstruction patterns, we need to address these at nested levels too.
>
> Let's analyze switches on Container<Shape> assuming:
>
>     Container<T> = Box<T> | Bag<T>
>     Shape = Rect | Circle
>
> and assume a novel shape Pentagon shows up unexpectedly via separate compilation.
>
> If we have a switch _statement_ with:
>
>     case Box(Rect r)
>     case Box(Circle c)
>     case Bag(Rect r)
>     case Bag(Circle c)
>
> then the only value we implicitly handle is null; everything else just falls out of the switch, because they don't try to be exhaustive.
>
> If this is an expression switch, then I think its safe to say:
>
>  - The switch should deemed exhaustive; no Box(null) etc cases needed.
>  - We get an NPE on null.
>
> But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon).  We have to do something (the switch has to be total) with these, and again, asking users to manually handle these is unreasonable.  A reasonable strawman here is:
>
>     ICCE on Box(Pentagon) and Bag(Pentagon)
>     NPE on Box(null) and Bag(null)
>
> Essentially, what this means is: we need to explicitly consider null and novel values/types of enum/sealed classes in our exhaustiveness analysis, and, if these are not seen to be explicitly covered and the implicit coverage plays into the conclusion of overall weak totality, then we need to insert implicit catch-alls for these cases.
>
> If we switch over:
>
>     case Box(Rect r)
>     case Box(Circle c)
>     case Box b
>     case Bag(Rect r)
>     case Bag(Circle c)
>
> then Box(Pentagon) and Box(null) are handled by the `Box b` case and don't need to be handled by a catch-all.
>
> If we have:
>
>     case Box(Rect r)
>     case Box(Circle c)
>     case Bag(Rect r)
>     case Bag(Circle c)
>     default
>
> then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the default case, so no special handling is needed there.
>
>
> Are we in agreement on what _should_ happen in these cases?  If so, I can put a more formal basis on it.
>
>
> On 8/14/2020 1:20 PM, Brian Goetz wrote:
>
>
>  - Exhaustiveness and null. (Tagir)  For sealed domains (enums and sealed types), we kind of cheated with expression switches because we could count on the switch filtering out the null.  But Tagir raises an excellent point, which is that we do not yet have a sound definition of exhaustiveness that scales to nested patterns (do Box(Rect) and Box(Circle) cover Box(Shape)?)  This is an interaction between sealed types and patterns that needs to be ironed out.  (Thanks Tagir!)
>
>
> [ Breaking this out from Tagir's more comprehensive reply ]
>
> It's unclear for me how exhaustiveness on nested patterns plays with
> null. case Box(Circle c) and case Box(Rect r) don't cover case
> Box(null) which is a valid possibility for Box<Shape> type.
>
>
> It’s not even clear how exhaustiveness plays with null even without nesting, so let's start there.
>
> Consider this switch:
>
>     switch (trafficLight) {
>         case GREEN, YELLOW: driveReallyFast();
>         case RED: sigh();
>     }
>
> Is it exhaustive?  Well, we want to say yes.  And with the existing null-hostility of switch, it is.  But even without that, we’d like to say yes, because a null enum value is almost always an error, and making users deal with cases that don’t happen in reality is kind of rude.
>
> For a domain sealed to a set of alternatives (enums or sealed classes), let’s say that a set of patterns is _weakly exhaustive_ if it covers all the alternatives but not null, and _strongly exhaustive_ if it also covers null.  When we did switch expressions, we said that weakly exhaustive coverings didn’t need a default in a switch expression.  I think we’re primed to say the same thing for sealed classes.  But, this “weak is good enough” leans on the fact that the existing hostility of switch will cover what we miss.  We get no such cover in nested cases.
>
> I think it’s worth examining further why we are willing to accept the weak coverage with enums.  Is it really that we’re willing to assume that enums just should never be null?  If we had type cardinalities in the language, would we treat `enum X` as declaring a cardinality-1 type called X?  I think we might.  OK, what about sealed classes?  Would the same thing carry over?  Not so sure there.  And this is a problem, because we ultimately want:
>
>     case Optional.of(var x):
>     case Optional.empty():
>
> to be exhaustive on Optional<T>, and said exhaustiveness will likely lean on some sort of sealing.
>
> This is related to Guy's observation that totality is a "subtree all the way down" property.  Consider:
>
>     sealed class Container<T> permits Box, Bag { }
>     sealed class Shape permits Rect, Circle { }
>
> Ignoring null, Box+Bag should be exhaustive on container, and Rect+Circle should be exhaustive on shape.  So if we are switching over a Container<Shape>, then what of:
>
>     case Box(Rect r):
>     case Box(Circle c):
>     case Bag(Rect r):
>     case Bag(Circle c):
>
> We have some "nullity holes" in three places: Box(null), Bag(null), and null itself.   Is this set of cases exhaustive on Bags, Boxes, or Containers?
>
> I think users would like to be able to write the above four cases and treat it as exhaustive; having to explicitly provide Box(null) / Box b, Bag(null) / Bag b, or a catch-all to accept null+Box(null)+Bag(null) would all be deemed unpleasant ceremony.
>
> Hmm...
>
>