[pattern-switch] Exhaustiveness
Remi Forax
forax at univ-mlv.fr
Thu Aug 20 20:34:27 UTC 2020
> De: "Brian Goetz" <brian.goetz at oracle.com>
> À: "Tagir Valeev" <amaembo at gmail.com>
> Cc: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
> Envoyé: Jeudi 20 Août 2020 21:09:00
> Objet: Re: [pattern-switch] Exhaustiveness
> Here's an attempt at a formalism for capturing this.
> There are several categories of patterns we might call total on a type T. We
> could refine the taxonomy as:
> - Strongly total -- matches all values of T.
> - Weakly total -- matches all values of T except perhaps null.
> What we want to do is characterize the aggregate totality on T of a _set_ of
> patterns P*. A set of patterns could in the aggregate be either of the above,
> or also:
> - Optimistically total -- matches all values of subtypes of T _known at compile
> time_, except perhaps null.
> Note that we have an ordering:
> partial < optimistically total < weakly total < strongly total
> Now, some rules about defining the totality of a set of patterns.
> T-Total: The singleton set containing the type pattern `T t` is strongly total
> on U <: T. (This is the rule we've been discussing, but that's not the point of
> this mail -- we just need a base case right now.)
> T-Subset: If a set of patterns P* contains a subset of patterns that is X-total
> on T, then P* is X-total on T.
> T-Sealed: If T is sealed to U* (direct subtypes only), and for each U in U*,
> there is some subset of P* that is optimistically total on U, then P* is
> optimistically total on T.
> T-Nested: Given a deconstructor D(U) and a collection of patterns { P1..Pn }, if
> { P1..Pn } is X-total on U, then { D(P1)..D(Pn) } is min(X,weak)-total on D.
> OK, examples. Let's say
> Container<T> = Box<T> | Bag<T>
> Shape = Round | Rect
> Round = Circle | Ellipse
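> A rough Java sketch of this hierarchy (one possible encoding; the record
> components are invented purely for illustration):
>
>     sealed interface Container<T> permits Box, Bag { }
>     record Box<T>(T content) implements Container<T> { }
>     record Bag<T>(T content) implements Container<T> { }
>
>     sealed interface Shape permits Round, Rect { }
>     sealed interface Round extends Shape permits Circle, Ellipse { }
>     record Circle(double radius) implements Round { }
>     record Ellipse(double a, double b) implements Round { }
>     record Rect(double w, double h) implements Shape { }
>
> The examples below deconstruct Box and Bag through that single component.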
> { Container c }: total on Container by T-Total.
> { Box b, Bag b }: optimistically total on Container
> - Container sealed to Box and Bag
> - `Box b` total on Box, `Bag b` total on Bag
> - { Box b, Bag b } optimistically total on Container by T-Sealed
> { Box(Round r), Box(Rect r) }: optimistically total on Box<Shape>
> - Shape sealed to Round and Rect
> - { Round r, Rect r } optimistically total on Shape by T-Sealed
> - { Box(Round r), Box(Rect r) } optimistically total on Box<Shape> by T-Nested
> { Box(Object o) } weakly total on Box<Object>
> - Object o total on Object
> - { Object o } total on Object by T-Subset
> - { Box(Object o) } weakly total on Box<Object> by T-Nested
> { Box(Rect r), Box(Circle c), Box(Ellipse e) } optimistically total on
> Box<Shape>
> - Shape sealed to Round and Rect
> - { Rect r } total on Rect
> - { Circle c, Ellipse e } optimistically total on Round
> - { Rect r, Circle c, Ellipse e } is optimistically total on Shape, because for
> each of { Rect, Round }, there is a subset that is optimistically total on that
> type
> - { Box(Rect r), Box(Circle c), Box(Ellipse e) } optimistically total on
> Box<Shape> by T-Nested
> We can enhance this model to construct the residue (a characterization of what
> falls through the cracks, and therefore has to be handled by a catch-all in a
> putatively total switch). A grammar for the residue would be:
> R := null | novel | D(R)
> so the residue might include Box(null), Box(novel), Bag(Box(novel)), etc. We would need to
> extend this to support deconstructors with multiple bindings too.
> OK, coming back to reality:
> - The patterns of a switch expression must be at least optimistically total.
> - The translation of a switch expression must include a synthetic case that
> catches all elements of the residue of its patterns, and throws the appropriate
> exceptions:
> - NPE for a null
> - ICCE for a novel value
> - One of the above, or maybe something else, for D(novel), D(null), D(E(novel,
> null)), etc
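> For illustration, a switch over the Container<Shape> hierarchy sketched above
> might end up translated roughly as below (a sketch only, not a committed
> translation strategy; the method name is invented):
>
>     String label(Container<Shape> c) {
>         return switch (c) {
>             case Box(Rect r)  -> "boxed rect";
>             case Box(Round o) -> "boxed round";
>             case Bag(Rect r)  -> "bagged rect";
>             case Bag(Round o) -> "bagged round";
>             // synthetic catch-all for the residue, inserted in this model:
>             //   null                  -> NullPointerException
>             //   Box(null), Bag(null)  -> NPE (or something else, per above)
>             //   a novel subtype       -> IncompatibleClassChangeError
>         };
>     }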
> We still have not addressed how we might nominate a _statement_ switch as being
> some form of total; that's a separate story.
> Also a separate story: under what conditions, in the new world, switches throw
> NPE. But this seems like progress.
> Given the weird shape of the residue, it's not clear there's a clean way to
> extrapolate the NPE|ICCE rule, since we might have Foo(null, novel), and would
> arbitrarily have to pick which exception to throw, and neither would really be
> all that great. Perhaps there's a new exception type lurking here.
I disagree here: if we move the null checks in front of the type checks, an NPE will always be thrown before an ICCE for the same pattern, which is the usual rule for JDK methods
(if you view a pattern match as equivalent to a cascade of instanceof checks plus method calls).
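Concretely, that reading amounts to something like the following sketch, reusing the Container<Shape> hierarchy sketched above (accessor and variable names are made up):

    // container : Container<Shape>
    if (container instanceof Box<?> box) {
        Object content = box.content();          // hypothetical accessor
        if (content == null)
            throw new NullPointerException();    // null is rejected first...
        if (content instanceof Rect r) {
            // body of case Box(Rect r)
        } else if (content instanceof Round o) {
            // body of case Box(Round o)
        } else {
            // ...so a novel Shape subtype is only detected here
            throw new IncompatibleClassChangeError();
        }
    }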
Rémi
> On 8/20/2020 12:58 PM, Tagir Valeev wrote:
>> Hello!
>>> Are we in agreement on what _should_ happen in these cases?
>> This sounds like a good idea. I see the following alternatives (not
>> saying that they are better, just putting them on the table).
>> 1. Do not perform exhaustiveness analysis for nested patterns, except
>> if all nested components are total. So, in your example, either
>> explicit default, or case Box and case Bag will be necessary. This is
>> somewhat limiting but I'm not sure that the exhaustiveness of complex
>> nested patterns would be very popular. If one really needs this, they
>> could use nested switch expression:
>> return switch(container) {
>>     case Box(var s) -> switch(s) {
>>         case Rect r -> ...; case Circle c -> ...;
>>         /* optional case null is possible */
>>     };
>>     case Bag(var s) -> switch(s) {
>>         case Rect r -> ...; case Circle c -> ...;
>>         /* optional case null is possible */
>>     };
>> }
>> Here Box(var s) has a total nested component, so it matches everything
>> that Box matches, and the same holds for Bag; thus we can perform
>> exhaustiveness analysis using the Container declaration.
>> This approach does not close the door for us. We can rethink later and
>> add exhaustiveness analysis for nested patterns when we finally
>> determine what it should look like. Note that this still allows making
>> Optional.of(var x) + Optional.empty exhaustive if we provide an
>> appropriate mechanism to declare this kind of pattern.
>> 2. Allow deconstructors and records to specify whether null is a
>> possible value for a given component. For example, make deconstructors and
>> records null-hostile by default and provide a syntax (T? or T|null or
>> whatever) to allow nulls. In this case, if the deconstructor is
>> null-friendly, then the exhaustive pattern must handle Box(null).
>> Otherwise, Box(null) is a compilation error. Yes, I know, this may
>> quickly evolve to the point where we will need to add full-fledged
>> nullability to the type system. But probably it's possible to allow
>> nullability specification for new language features only? Ok, ok,
>> sorry, my imagination drove me too far away from reality. Forget it.
>> With best regards,
>> Tagir Valeev.
>> On Thu, Aug 20, 2020 at 10:57 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>> Tagir's question about exhaustiveness in switches points to some technical debt
>>> left over from expression switches.
>>> (Note: this entire discussion has nothing to do with whether `case Object o` is
>>> nullable; it has strictly to do with extending the existing treatment of
>>> exhaustive switches over enums to sealed classes, when we can conclude a switch
>>> over such a type is total without a default / total case, and what implicit
>>> cases we have to insert to make up for that. Please let's not conflate this
>>> thread with that issue.)
>>> When we did expression switch, for an exhaustive switch that covered all the
>>> enums without a default, we inserted an extra catch-all case that throws ICCE,
>>> on the theory that nulls are already checked by the switch and so anything that
>>> hits the synthetic default must be a novel enum value, which merits an ICCE.
>>> This worked for enum switches (where all case labels are discrete values), but
>>> doesn't quite scale to sealed types. Let's fix that.
>>> As a recap, suppose we have
>>> enum E { A, B; }
>>> and suppose that, via separate compilation, a novel value C is introduced that
>>> was unknown at the time the switch was compiled.
>>> An "exhaustive" statement switch on E:
>>> switch (e) {
>>> case A:
>>> case B:
>>> }
>>> throws NPE on null but does nothing on C, because switch statements make no
>>> attempt at being exhaustive.
>>> An _expression_ switch that is deemed exhaustive without a default case:
>>> var s = switch (e) {
>>> case A -> ...
>>> case B -> ...
>>> }
>>> throws NPE on null and ICCE on C.
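>>> Roughly, the synthetic catch-all makes that expression switch behave as if the
>>> following default had been written (illustration only):
>>>
>>>     var s = switch (e) {
>>>         case A -> ...
>>>         case B -> ...
>>>         // synthetic, inserted by the compiler:
>>>         default -> throw new IncompatibleClassChangeError(); // a novel value such as C
>>>     };
>>>
>>> Null is handled by the switch's own null check before any case is tried.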
>>> At the time, we were concerned about the gap between statement and expression
>>> switches, and talked about having a way to make statement switches exhaustive.
>>> That's still on the table, and we should still address this, but that's not
>>> the subject of this mail.
>>> What I want to focus on in this mail is the interplay between exhaustiveness
>>> analysis and (exhaustive) switch semantics, and what code we have to inject to
>>> make up for gaps. We've identified two sources of gaps: nulls, and novel enum
>>> values. When we get to sealed types, we can add novel subtypes to the list of
>>> things we have to detect and implicitly reject; when we get to deconstruction
>>> patterns, we need to address these at nested levels too.
>>> Let's analyze switches on Container<Shape> assuming:
>>> Container<T> = Box<T> | Bag<T>
>>> Shape = Rect | Circle
>>> and assume a novel shape Pentagon shows up unexpectedly via separate
>>> compilation.
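>>> For concreteness, the recompiled hierarchy might look like this (a sketch with a
>>> made-up record component; the switches below were compiled against the
>>> two-subtype version of Shape):
>>>
>>>     sealed interface Shape permits Rect, Circle, Pentagon { }
>>>     record Pentagon(double side) implements Shape { }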
>>> If we have a switch _statement_ with:
>>> case Box(Rect r)
>>> case Box(Circle c)
>>> case Bag(Rect r)
>>> case Bag(Circle c)
>>> then the only value we implicitly handle is null; everything else just falls out
>>> of the switch, because they don't try to be exhaustive.
>>> If this is an expression switch, then I think it's safe to say:
>>> - The switch should be deemed exhaustive; no Box(null) etc. cases needed.
>>> - We get an NPE on null.
>>> But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon). We have
>>> to do something (the switch has to be total) with these, and again, asking
>>> users to manually handle these is unreasonable. A reasonable strawman here is:
>>> ICCE on Box(Pentagon) and Bag(Pentagon)
>>> NPE on Box(null) and Bag(null)
>>> Essentially, what this means is: we need to explicitly consider null and novel
>>> values/types of enum/sealed classes in our exhaustiveness analysis, and, if
>>> these are not seen to be explicitly covered and the implicit coverage plays
>>> into the conclusion of overall weak totality, then we need to insert implicit
>>> catch-alls for these cases.
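>>> As a sketch, the implicit handling for the four cases above might amount to
>>> something like this (the thrown types follow the strawman and are not settled):
>>>
>>>     var result = switch (container) {
>>>         case Box(Rect r)   -> ...
>>>         case Box(Circle c) -> ...
>>>         case Bag(Rect r)   -> ...
>>>         case Bag(Circle c) -> ...
>>>         // implicit, per the strawman:
>>>         //   Box(null), Bag(null)         -> NullPointerException
>>>         //   Box(Pentagon), Bag(Pentagon) -> IncompatibleClassChangeError
>>>     };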
>>> If we switch over:
>>> case Box(Rect r)
>>> case Box(Circle c)
>>> case Box b
>>> case Bag(Rect r)
>>> case Bag(Circle c)
>>> then Box(Pentagon) and Box(null) are handled by the `Box b` case and don't need
>>> to be handled by a catch-all.
>>> If we have:
>>> case Box(Rect r)
>>> case Box(Circle c)
>>> case Bag(Rect r)
>>> case Bag(Circle c)
>>> default
>>> then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the default
>>> case, so no special handling is needed there.
>>> Are we in agreement on what _should_ happen in these cases? If so, I can put a
>>> more formal basis on it.
>>> On 8/14/2020 1:20 PM, Brian Goetz wrote:
>>> - Exhaustiveness and null. (Tagir) For sealed domains (enums and sealed types),
>>> we kind of cheated with expression switches because we could count on the
>>> switch filtering out the null. But Tagir raises an excellent point, which is
>>> that we do not yet have a sound definition of exhaustiveness that scales to
>>> nested patterns (do Box(Rect) and Box(Circle) cover Box(Shape)?) This is an
>>> interaction between sealed types and patterns that needs to be ironed out.
>>> (Thanks Tagir!)
>>> [ Breaking this out from Tagir's more comprehensive reply ]
>>> It's unclear to me how exhaustiveness on nested patterns plays with
>>> null. case Box(Circle c) and case Box(Rect r) don't cover case
>>> Box(null), which is a valid possibility for the Box<Shape> type.
>>> It’s not even clear how exhaustiveness plays with null even without nesting, so
>>> let's start there.
>>> Consider this switch:
>>> switch (trafficLight) {
>>> case GREEN, YELLOW: driveReallyFast();
>>> case RED: sigh();
>>> }
>>> Is it exhaustive? Well, we want to say yes. And with the existing
>>> null-hostility of switch, it is. But even without that, we’d like to say yes,
>>> because a null enum value is almost always an error, and making users deal with
>>> cases that don’t happen in reality is kind of rude.
>>> For a domain sealed to a set of alternatives (enums or sealed classes), let’s
>>> say that a set of patterns is _weakly exhaustive_ if it covers all the
>>> alternatives but not null, and _strongly exhaustive_ if it also covers null.
>>> When we did switch expressions, we said that weakly exhaustive coverings
>>> didn’t need a default in a switch expression. I think we’re primed to say the
>>> same thing for sealed classes. But, this “weak is good enough” leans on the
>>> fact that the existing null-hostility of switch will cover what we miss. We get no
>>> such cover in nested cases.
>>> I think it’s worth examining further why we are willing to accept the weak
>>> coverage with enums. Is it really that we’re willing to assume that enums just
>>> should never be null? If we had type cardinalities in the language, would we
>>> treat `enum X` as declaring a cardinality-1 type called X? I think we might.
>>> OK, what about sealed classes? Would the same thing carry over? Not so sure
>>> there. And this is a problem, because we ultimately want:
>>> case Optional.of(var x):
>>> case Optional.empty():
>>> to be exhaustive on Optional<T>, and said exhaustiveness will likely lean on
>>> some sort of sealing.
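>>> In the hypothetical pattern syntax used in this thread, that would be a switch
>>> along these lines (not current Java; the patterns on Optional are speculative):
>>>
>>>     // opt : Optional<String>
>>>     var len = switch (opt) {
>>>         case Optional.of(var x) -> x.length();
>>>         case Optional.empty()   -> 0;
>>>     };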
>>> This is related to Guy's observation that totality is a "subtree all the way
>>> down" property. Consider:
>>> sealed class Container<T> permits Box, Bag { }
>>> sealed class Shape permits Rect, Circle { }
>>> Ignoring null, Box+Bag should be exhaustive on Container, and Rect+Circle should
>>> be exhaustive on Shape. So if we are switching over a Container<Shape>, then
>>> what of:
>>> case Box(Rect r):
>>> case Box(Circle c):
>>> case Bag(Rect r):
>>> case Bag(Circle c):
>>> We have some "nullity holes" in three places: Box(null), Bag(null), and null
>>> itself. Is this set of cases exhaustive on Bags, Boxes, or Containers?
>>> I think users would like to be able to write the above four cases and treat it
>>> as exhaustive; having to explicitly provide Box(null) / Box b, Bag(null) / Bag
>>> b, or a catch-all to accept null+Box(null)+Bag(null) would all be deemed
>>> unpleasant ceremony.
>>> Hmm...