From brian.goetz at oracle.com Mon Aug 3 17:20:09 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 3 Aug 2020 13:20:09 -0400
Subject: Finalizing in JDK 16 - Records
In-Reply-To: <576588AE-CE85-46B3-92AD-C13F536F8F9F@oracle.com>
References: <576588AE-CE85-46B3-92AD-C13F536F8F9F@oracle.com>
Message-ID: <4108f339-7850-af4f-1b9f-27c91cd71353@oracle.com>

One of the less-well-known "featurelets" that is buried in records has to
do with nesting. We want records to nest cleanly in methods (local records)
and classes. During the second preview, we made some progress, but not
complete progress, on this.

As of Preview/2, the following are allowed:

 - Records can be nested in top-level (non-inner) classes. They are
   implicitly static.
 - Records can be nested in methods (local records). They are implicitly
   static. (The spec now gives meaning to the notion of a static local
   class, whether or not you can explicitly declare one.)

At the same time as we did the latter, we removed the restrictions from
local interfaces and enums too. This is what we call a "horizontal" move; we
could have stuck strictly to local records, which some might argue is more
in line with the JEP's mandate, but the result would be that it leaves the
language more irregular: why concrete classes and records, but not enums (on
which the design of records is modeled)? It made more sense to allow all of
these (or at least, move towards allowing all of them) rather than leave an
arbitrary trail of ad-hoc "X and Y but not Z" constraints. And like records,
local enums and interfaces are implicitly static. This is all in Preview/2.

There is still one restriction we would like to bust: you can't have records
(today) in inner classes because they are implicitly static, and there is a
blanket restriction against static members (explicit or implicit) in inner
classes.
As above, we could drill the smallest hole possible and just allow records
in there, but it makes more sense (and will be easier for users to reason
about) if we just drop the "no statics in inner" rule. As has been stated
before, this rule was added in 1.1 out of an "abundance of caution", and I
think it is time to retire it in its entirety. So I would like the next
round of record spec to permit records, enums, interfaces, static classes,
and static fields in inner classes. With refactoring in the spec that has
already been done for Preview/2, this is largely a matter of removing these
restrictions from the spec, implementations, and tests.

On 7/27/2020 6:54 AM, Gavin Bierman wrote:
> [Second email of two, looking to close out features we hope to finalize in
> JDK 16.]
>
> Records
> -------
>
> Record classes are a special kind of class that are used primarily to
> define a simple aggregate of values.
>
> Records can be thought of as _nominal tuples_; their declaration commits to
> a description of their state and, given that, their representation, as well
> as all of the interesting protocols an object might expose -- construction,
> property access, equality, etc. -- are derived from that state description.
>
> Because we can derive everything from a common state description, the
> declaration can be extremely parsimonious. Here is an example of a record
> class declaration:
>
>     record Point(int x, int y){}
>
> The state or, more formally, a record component list, (int x, int y),
> drives the implicit declaration of a number of members of the Point class.
>
> - A `private` field is declared for each record component
> - A `public` accessor method is declared for each record component
> - A constructor is declared with an argument list matching the record
>   component list, and whose body assigns the fields with the corresponding
>   argument. This constructor is called the _canonical constructor_.
> - Implementations of the methods: equals, toString and hashCode.
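
To make the list of implicitly declared members concrete, here is a minimal sketch, assuming Java 16 or later. It puts the `Point` record from the text next to a hand-written class with the same shape. `RecordMembersDemo` and `PlainPoint` are hypothetical names used only for comparison, and `PlainPoint` is an approximation of what the compiler supplies, not its exact output.

```java
import java.util.Objects;

public class RecordMembersDemo {
    // The one-line declaration from the text.
    record Point(int x, int y) {}

    // Hypothetical hand-written counterpart, shown only for comparison.
    static final class PlainPoint {
        private final int x;
        private final int y;

        PlainPoint(int x, int y) {   // plays the role of the canonical constructor
            this.x = x;
            this.y = y;
        }

        int x() { return x; }        // accessors named after the components
        int y() { return y; }

        @Override public boolean equals(Object o) {
            return o instanceof PlainPoint p && p.x == x && p.y == y;
        }

        @Override public int hashCode() { return Objects.hash(x, y); }

        @Override public String toString() {
            return "PlainPoint[x=" + x + ", y=" + y + "]";
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        System.out.println(p.x() + " " + p.y());            // 1 2
        System.out.println(p.equals(new Point(1, 2)));      // true
        System.out.println(p);                              // Point[x=1, y=2]
        System.out.println(new PlainPoint(1, 2).equals(new PlainPoint(1, 2))); // true
    }
}
```

The record gets state-based `equals`, `hashCode`, and `toString` from the component list alone, while the plain class has to spell each of them out.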
>
> The body of a record class declaration is often empty, but it can contain
> method declarations as usual. Indeed, if it is necessary, the implicitly
> declared members -- the accessors, canonical constructor, and equals,
> toString, or hashCode methods -- can alternatively be explicitly declared
> in the body.
>
> Often the reason for explicitly providing a canonical constructor for a
> record class is to validate and/or normalize the argument values. To
> enhance the readability of record class declarations, we provide a new
> compact form of canonical constructor declaration, where only this
> validation/normalization code is required. Here is an example:
>
>     record Rational(int num, int denom) {
>         Rational {
>             int gcd = gcd(num, denom);
>             num /= gcd;
>             denom /= gcd;
>         }
>     }
>
> The intention of a compact constructor declaration is that only validation
> and/or normalization code need be given in the constructor body; the
> remaining initialization code is automatically supplied by the compiler.
> The formal argument list is not required in a compact constructor
> declaration as it is taken from the record component list. In other words,
> this declaration is equivalent to the following one that uses the
> conventional constructor form:
>
>     record Rational(int num, int denom) {
>         Rational(int num, int denom) {
>             // Validation/Normalization
>             int gcd = gcd(num, denom);
>             num /= gcd;
>             denom /= gcd;
>             // Initialization
>             this.num = num;
>             this.denom = denom;
>         }
>     }
>
> Once we settled on the design of record classes, things have been pretty
> stable. Three issues that did arise were:
>
> 1. Initially canonical constructors were required to be public. This was
>    changed in the second preview. Now, if the canonical constructor is
>    implicitly declared then its access modifier is the same as the record
>    class. If it is explicitly declared then its access modifier must
>    provide at least as much access as the record class.
>
> 2. We have extended the meaning of the `@Override` annotation to include
>    the case that the annotated method is an explicitly declared accessor
>    method for a record component.
>
> 3. To enforce the intended use of compact constructors, we made it a
>    compile-time error to assign to any of the instance fields in the
>    constructor body.
>
> One area that has generated a number of questions is annotations. Our
> intention is that an annotation on a record component is propagated to the
> field, accessor, and/or constructor parameter, according to the
> applicability of the annotation. It is not clear what other design choices
> there are. So we hope this is just something that has to be learnt, and
> afterwards it feels natural.
>
> The records JEP also allows for local record declarations. This is
> important as records will often be used as containers for intermediate data
> within method bodies. Being able to declare these record classes locally is
> essential to stop proliferation of classes. We are aware of some small
> tweaks that will be required to the specification during the second preview
> period, but overall this feature has not generated any controversy.

From forax at univ-mlv.fr Mon Aug 3 21:59:37 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 3 Aug 2020 23:59:37 +0200 (CEST)
Subject: Finalizing in JDK 16 - Records
In-Reply-To: <4108f339-7850-af4f-1b9f-27c91cd71353@oracle.com>
References: <576588AE-CE85-46B3-92AD-C13F536F8F9F@oracle.com>
 <4108f339-7850-af4f-1b9f-27c91cd71353@oracle.com>
Message-ID: <2010073375.95701.1596491977116.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Gavin Bierman", "amber-spec-experts"
> Sent: Monday, 3 August 2020 19:20:09
> Subject: Re: Finalizing in JDK 16 - Records

> [...]
> So I would like the next round of record spec to permit records, enums,
> interfaces, static classes, and static fields in inner classes. With
> refactoring in the spec that has already been done for Preview/2, this is
> largely a matter of removing these restrictions from the spec,
> implementations, and tests.

There is another restriction that doesn't make a lot of sense: an interface
cannot declare a member record (or a class, an enum, etc.) as private.

This restriction also interacts poorly with sealed interfaces that have an
implicit permits directive.

  sealed interface I {
    int value();

    private record Impl(int value) implements I { }   // does not compile

    public static I wrap(int value) {
      return new Impl(value);
    }
  }

I propose to relax this restriction as part of Sealed Preview 2.

Rémi

From brian.goetz at oracle.com Mon Aug 3 22:22:31 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 3 Aug 2020 18:22:31 -0400
Subject: Finalizing in JDK 16 - Records
In-Reply-To: <2010073375.95701.1596491977116.JavaMail.zimbra@u-pem.fr>
References: <576588AE-CE85-46B3-92AD-C13F536F8F9F@oracle.com>
 <4108f339-7850-af4f-1b9f-27c91cd71353@oracle.com>
 <2010073375.95701.1596491977116.JavaMail.zimbra@u-pem.fr>
Message-ID: <4ff32147-bae8-cc66-1062-4ea2a56f5918@oracle.com>

> There is another restriction that doesn't make a lot of sense, an
> interface can not declare a member record (or a class, an enum, etc)
> as private.

Yes, I know, and I'd like to eventually clear this one too; it's a
gratuitous restriction (though there is a VM component, which makes it more
work). I deliberately chose not to include this one in this proposed batch,
because there was no credible connection to records. But the connection to
sealed types might be just enough to justify it as part of that project.

> This restriction also interacts poorly with sealed interface with
> implicit permits directive.
>
>   sealed interface I {
>     int value();
>
>     private record Impl(int value) implements I { }   // does not compile
>
>     public static I wrap(int value) {
>       return new Impl(value);
>     }
>   }
>
> I propose to relax this restriction as part of Sealed Preview 2.
>
> Rémi
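
The nesting rules discussed in this thread can be sketched as follows, assuming Java 16 or later, where records were finalized and the "no statics in inner" rule was dropped. `NestingDemo`, `Inner`, `Pair`, `Mode`, `Entry`, and `localTotal` are hypothetical names chosen for illustration.

```java
public class NestingDemo {
    // An inner (non-static) class: each instance is tied to a NestingDemo.
    class Inner {
        // Static members in an inner class compile on Java 16 and later.
        static int created = 0;

        // Records and enums nested here are implicitly static.
        record Pair(int left, int right) {}
        enum Mode { ON, OFF }

        Inner() { created++; }
    }

    static int localTotal(int[] values) {
        // A local record: implicitly static, so it cannot capture
        // local variables of the enclosing method.
        record Entry(int index, int value) {}

        int total = 0;
        for (int i = 0; i < values.length; i++) {
            Entry e = new Entry(i, values[i]);
            total += e.index() * e.value();
        }
        return total;
    }

    public static void main(String[] args) {
        NestingDemo outer = new NestingDemo();
        outer.new Inner();
        outer.new Inner();
        System.out.println(Inner.created);                    // 2
        System.out.println(new Inner.Pair(1, 2));             // Pair[left=1, right=2]
        System.out.println(Inner.Mode.ON);                    // ON
        System.out.println(localTotal(new int[] {5, 6, 7}));  // 20
    }
}
```

Note that `Inner.Pair` is created without an enclosing `Inner` instance, which is what "implicitly static" means in practice.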
From brian.goetz at oracle.com Tue Aug 4 18:11:38 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 4 Aug 2020 14:11:38 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References:
Message-ID: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>

One thing this left open was the actual syntax of guards. (We know it's
snowing in hell now, because I am actually encouraging a syntax
conversation.)

Patterns in `instanceof` do not need guards, because `instanceof` is an
expression, and expressions conjoin with `&&`:

    if (x instanceof Foo f && f.bar() == 3) { ... }

We explored the approach of making the boolean guard expression part of the
pattern -- where the above would parse as `x instanceof P`, where P is
`Foo f && f.bar() == 3`. Since pattern matching is conditional, we could
think of boolean expressions as patterns that are independent of their
target. But this approach didn't pan out due to various ambiguities.

The most obvious ambiguity was the obvious interpretation of constant
patterns: that `0` could be the literal zero or a pattern that matches zero.
(I have since proposed we try to avoid constant patterns entirely.) Under
this interpretation, not only was there room for confusion ("Is foo(0) an
invocation of foo, or a pattern?"), but there were puzzlers like:

    switch (b) {
        case true && false: ...
        case false && true: ...
    }

It would not be clear whether this would be two patterns conjoined together
(neither of which would match anything) or the constant pattern
`true && false`. (There are similar ambiguities with deconstruction patterns
with no bindings.)

Overall, all of this was enough to sour me on trying to press && into
service here. Which leaves some relatively mundane syntax options:

    case P when e
    case P where e
    case P if e
    case P only-if e
    case P where (e)

and I guess weirder things like

    case P &&& e
    case P : e
    etc

I particularly don't like `if` since it makes it even harder to tell where
the case ends and the consequent begins. Also, `e` should not contain a
switch expression, since no one wants to try to parse:

    case Foo f where switch (f.bar()) {
        case Bar b -> 3;
    } > 3 -> blah();

in their heads. (We already excluded switch expressions from candidates for
constant expressions in 12 for similar reasons.)

I mostly think this is a matter of picking a (contextual) keyword. (I kind
of like that `only-if` could be an actual keyword.)

(Looking ahead, if we ever want to reenable support for merging, we might
have patterns like:

    case Foo(int x), Bar(int x) where x > 0:

and we'd have to accept that the guard applies to _all_ the patterns.)

On 6/24/2020 10:44 AM, Brian Goetz wrote:
> There are a lot of directions we could take next for pattern matching. The
> one that builds most on what we've already done, and offers significant
> incremental expressiveness, is extending the type patterns we already have
> to a new context: switch. (There is still plenty of work to do on
> deconstruction patterns, pattern assignment, etc, but these require more
> design work.)
>
> Here's an overview of where I think we are here.
>
> [JEP 305][jep305] introduced the first phase of [pattern
> matching][patternmatch] into the Java language. It was deliberately
> limited, focusing on only one kind of pattern (type test patterns) and one
> linguistic context (`instanceof`). Having introduced the concept to Java
> developers, we can now extend both the kinds of patterns and the linguistic
> context where patterns are used.
>
> ## Patterns in switch
>
> The obvious next context in which to introduce pattern matching is
> `switch`; a switch using patterns as `case` labels can replace
> `if .. else if` chains with a more direct way of expressing a multi-way
> conditional.
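
For reference, the `if .. else if` chain that a pattern `switch` would replace can already be written with `instanceof` type patterns, with `&&` playing the role of a guard. This is a minimal sketch, assuming Java 16 or later. The `Circle` and `Square` records and the `describe` method are hypothetical names used only for illustration.

```java
public class InstanceofGuardDemo {
    record Circle(int radius) {}
    record Square(int side) {}

    static String describe(Object o) {
        // The type pattern binds `c`; the `&&` clause acts as the guard.
        if (o instanceof Circle c && c.radius() > 10) {
            return "large circle";
        } else if (o instanceof Circle c) {
            // Reached for circles that failed the guard above.
            return "circle";
        } else if (o instanceof Square s) {
            return "square of side " + s.side();
        } else {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new Circle(20)));  // large circle
        System.out.println(describe(new Circle(3)));   // circle
        System.out.println(describe(new Square(2)));   // square of side 2
        System.out.println(describe("text"));          // unknown
    }
}
```

The chain falls through to the next test when a guard fails, which is the "keep trying from the next label" behavior that a guard syntax for `case` labels would need to express.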
>
> Unfortunately, `switch` is one of the most complex, irregular constructs we
> have in Java, so we must teach it some new tricks while avoiding some
> existing traps. Such tricks and traps may include:
>
> **Typing.** Currently, the operand of a `switch` may only be one of the
> integral primitive types, the box type of an integral primitive, `String`,
> or an `enum` type. (Further, if the `switch` operand is an `enum` type, the
> `case` labels must be _unqualified_ enum constant names.) Clearly we can
> relax this restriction to allow other types, and constrain the case labels
> to only be patterns that are applicable to that type, but it may leave a
> seam of "legacy" vs "pattern" switch, especially if we do not adopt bare
> constant literals as the denotation of constant patterns. (We have
> confronted this issue before with expression switch, and concluded that it
> was better to rehabilitate the `switch` we have rather than create a new
> construct, and we will make the same choice here, but the cost of this is
> often a visible seam.)
>
> **Parsing.** The grammar currently specifies that the operand of a `case`
> label is a `CaseConstant`, which casts a wide syntactic net, later narrowed
> with post-checks after attribution. This means that, since parsing is done
> before we know the type of the operand, we must be watchful for ambiguities
> between patterns and expressions (and possibly refine the production for
> `case` labels.)
>
> **Nullity.** The `switch` construct is currently hostile to `null`, but
> some patterns do match `null`, and it may be desirable if nulls can be
> handled within a suitably crafted `switch`.
>
> **Exhaustiveness.** For switches over the permitted subtypes of sealed
> types, we will want to be able to do exhaustiveness analysis -- including
> for nested patterns (i.e., if `Shape` is `Circle` or `Rect`, then
> `Box(Circle c)` and `Box(Rect r)` are exhaustive on `Box`.)
>
> **Fallthrough.** Fallthrough is everyone's least favorite feature of
> `switch`, but it exists for a reason. (The mistake was making fallthrough
> the default behavior, but that ship has sailed.) In the absence of an OR
> pattern combinator, one might find fallthrough in switch useful in
> conjunction with patterns:
>
> ```
> case Box(int x):
> case Bag(int x):
>     // use x
> ```
>
> However, it is likely that we will, at least initially, disallow falling
> out of, or into, a case label with binding variables.
>
> #### Translation
>
> Switches on primitives and their wrapper types are translated using the
> `tableswitch` or `lookupswitch` bytecodes; switches on strings and enums
> are lowered in the compiler to switches involving hash codes (for strings)
> or ordinals (for enums.)
>
> For switches on patterns, we would need a new strategy, one likely built on
> `invokedynamic`, where we lower the cases to a densely numbered `int`
> switch, and then invoke a classifier function with the operand which tells
> us the first case number it matches. So a switch like:
>
> ```
> switch (o) {
>     case P: A
>     case Q: B
> }
> ```
>
> is lowered to:
>
> ```
> int target = indy[BSM=PatternSwitch, args=[P,Q]](o)
> switch (target) {
>     case 0: A
>     case 1: B
> }
> ```
>
> A symbolic description of the patterns is provided as the bootstrap
> argument list, which builds a decision tree based on analysis of the
> patterns and their target types.
>
> #### Guards
>
> No matter how rich our patterns are, it is often the case that we will want
> to provide additional filtering on the results of a pattern:
>
> ```
> if (shape instanceof Cylinder c && c.color() == RED) { ... }
> ```
>
> Because we use `instanceof` as part of a boolean expression, it is easy to
> narrow the results by conjoining additional checks with `&&`. But in a
> `case` label, we do not necessarily have this opportunity.
> Worse, the semantics of `switch` mean that once a `case` label is selected,
> there is no way to say "oops, forget it, keep trying from the next label".
>
> It is common in languages with pattern matching to support some form of
> "guard" expression, which is a boolean expression that conditions whether
> the case matches, such as:
>
> ```
> case Point(var x, var y)
>     __where x == y: ...
> ```
>
> Bindings from the pattern would have to be available in guard expressions.
>
> Syntactic options (and hazards) for guards abound; users would probably
> find it natural to reuse `&&` to attach guards to patterns; C# has chosen
> `when` for introducing guards; we could use `case P if (e)`, etc. Whatever
> we do here, there is a readability risk, as the more complex guards are,
> the harder it is to tell where the case label ends and the "body" begins.
> (And worse if we allow switch expressions inside guards.)
>
> An alternative to guards is to allow an imperative `continue` statement in
> `switch`, which would mean "keep trying to match from the next label."
> Given the existing semantics of `continue`, this is a natural extension,
> but since `continue` does not currently have meaning for switch, some work
> would have to be done to disambiguate continue statements in switches
> enclosed in loops. The imperative version is strictly more expressive than
> most reasonable forms of the declarative version, but users are likely to
> prefer the declarative version.
>
> ## Nulls
>
> Almost no language design exercise is complete without some degree of
> wrestling with `null`. As we define more complex patterns than simple type
> patterns, and extend constructs such as `switch` (which have existing
> opinions about nullity) to support patterns, we need to have a clear
> understanding of which patterns are nullable, and separate the nullity
> behaviors of patterns from the nullity behaviors of those constructs which
> use patterns.
>
> ## Nullity and patterns
>
> This topic has a number of easily-tangled concerns:
>
>  - **Construct nullability.** Constructs to which we want to add pattern
>    awareness (`instanceof`, `switch`) already have their own opinion about
>    nulls. Currently, `instanceof` always says false when presented with a
>    `null`, and `switch` always NPEs. We may, or may not, wish to refine
>    these rules in some cases.
>  - **Pattern nullability.** Some patterns clearly would never match `null`
>    (such as deconstruction patterns), whereas others (an "any" pattern, and
>    surely the `null` constant pattern) might make sense to match null.
>  - **Refactoring friendliness.** There are a number of cases that we would
>    like to freely refactor back and forth, such as certain chains of
>    `if ... else if` with switches.
>  - **Nesting vs top-level.** The "obvious" thing to do at the top level of
>    a construct is not always the "obvious" thing to do in a nested
>    construct.
>  - **Totality vs partiality.** When a pattern is partial on the operand
>    type (e.g., `case String` when the operand of switch is `Object`), it is
>    almost never the case we want to match null (except in the case of the
>    `null` constant pattern), whereas when a pattern is total on the operand
>    type (e.g., `case Object` in the same example), it is more justifiable
>    to match null.
>  - **Inference.** It would be nice if a `var` pattern were simply inference
>    for a type pattern, rather than some possibly-non-denotable union.
>
> As a starting example, consider:
>
> ```
> record Box(Object o) { }
>
> Box box = ...
> switch (box) {
>     case Box(Chocolate c):
>     case Box(Frog f):
>     case Box(var o):
> }
> ```
>
> It would be highly confusing and error-prone for either of the first two
> patterns to match `Box(null)` -- given that `Chocolate` and `Frog` have no
> type relation, it should be perfectly safe to reorder the two.
> But, because the last
> pattern seems so obviously total on boxes, it is quite likely that what the
> author wants is to match all remaining boxes, including those that contain null.
> (Further, it would be terrible if there were _no_ way to say "Match any `Box`,
> even if it contains `null`." (While one might initially think this could be
> repaired with OR patterns, imagine that `Box` had _n_ components -- we'd need to
> OR together _2^n_ patterns, with complex merging, to express all the possible
> combinations of nullity.))
>
> Scala and C# took the approach of saying that "var" patterns are not just type
> inference, they are "any" patterns -- so `Box(Object o)` matches boxes
> containing a non-null payload, where `Box(var o)` matches all boxes. This
> means, unfortunately, that `var` is not mere type inference -- which complicates
> the role of `var` in the language considerably. Users should not have to choose
> between the semantics they want and being explicit about types; these should be
> orthogonal choices. The above `switch` should be equivalent to:
>
> ```
> Box box = ...
> switch (box) {
>     case Box(Chocolate c):
>     case Box(Frog f):
>     case Box(Object o):
> }
> ```
>
> and the choice to use `Object` or `var` should be solely one of whether the
> manifest types are deemed to improve or impair readability.
>
> #### Construct and pattern nullability
>
> Currently, `instanceof` always says `false` on `null`, and `switch` always
> throws on `null`. Whatever null opinions a construct has, these are applied
> before we even test any patterns.
>
> We can formalize the intuition outlined above as: type patterns that are _total_
> on their target operand (`var x`, and `T t` on an operand of type `U`, where `U
> <: T`) match null, and non-total type patterns do not. (Another way to say
> this is: a `var` pattern is the "any" pattern, and a type pattern that is total
> on its operand type is also an "any" pattern.)
> Additionally, the `null`
> constant pattern matches null. These are the _only_ nullable patterns.
>
> In our `Box` example, this means that the last case (whether written as `Box(var
> o)` or `Box(Object o)`) matches all boxes, including those containing null
> (because the nested pattern is total on the nested operand), but the first two
> cases do not.
>
> If we retain the current absolute hostility of `switch` to nulls, we can't
> trivially refactor from
>
> ```
> switch (o) {
>     case Box(Chocolate c):
>     case Box(Frog f):
>     case Box(var o):
> }
> ```
> to
>
> ```
> switch (o) {
>     case Box(var contents):
>         switch (contents) {
>             case Chocolate c:
>             case Frog f:
>             case Object o:
>         }
> }
> ```
>
> because the inner `switch(contents)` would NPE before we tried to match any of
> the patterns it contains. Instead, the user would explicitly have to do an `if
> (contents == null)` test, and, if the intent was to handle `null` in the same
> way as the `Object o` case, some duplication of code would be needed. We can
> address this sharp corner by slightly relaxing the null-hostility of `switch`,
> as described below.
>
> A similar sharp corner is the decomposition of a nested pattern `P(Q)` into
> `P(alpha) & alpha instanceof Q`; while this is intended to be a universally
> valid transformation, if P's 1st component might be null and Q is total, this
> transformation would not be valid because of the existing (mild) null-hostility
> of `instanceof`. Again, we may be able to address this by adjusting the rules
> surrounding `instanceof` slightly.
>
> ## Generalizing switch
>
> The refactoring example above motivates why we might want to relax the
> null-handling behavior of `switch`. On the other hand, the one thing the
> current behavior has going for it is that at least the current behavior is easy
> to reason about; it always throws when confronted with a `null`.
> Any relaxed
> behavior would be more complex; some switches would still have to throw (for
> compatibility with existing semantics), and some (which can't be expressed
> today) would accept nulls. This is a tricky balance to achieve, but I think we
> have found a good one.
>
> A starting point is that we don't want to require readers to do an _O(n)_
> analysis of each of the `case` labels just to determine whether a given switch
> accepts `null` or not; this should be an _O(1)_ analysis. (We do not want to
> introduce a new flavor of `switch`, such as `switch-nullable`; this might seem
> to fix the proximate problem but would surely create others. As we've done with
> expression switch and patterns, we'd rather rehabilitate `switch` than create
> an almost-but-not-quite-the-same variant.)
>
> Let's start with the null pattern, which we'll spell for sake of exposition
> `case null`. What if you were allowed to say `case null` in a switch, and the
> switch would do the obvious thing?
>
> ```
> switch (o) {
>     case null -> System.out.println("Ugh, null");
>     case String s -> System.out.println("Yay, non-null: " + s);
> }
> ```
>
> Given that the `case null` appears so close to the `switch`, it does not seem
> confusing that this switch would match `null`; the existence of `case null` at
> the top of the switch makes it pretty clear that this is intended behavior. (We
> could further restrict the null pattern to being the first pattern in a switch,
> to make this clearer.)
>
> Now, let's look at the other end of the switch -- the last case. What if the
> last pattern is a total pattern? (Note that if any `case` has a total pattern,
> it _must_ be the last one, otherwise the cases after that would be dead, which
> would be an error.) Is it also reasonable for that to match null? After all,
> we're saying "everything":
>
> ```
> switch (o) {
>     case String s: ...
>     case Object o: ...
> }
> ```
>
> Under this interpretation, the switch-refactoring anomaly above goes away.
>
> The direction we're going here is that if we can localize the null-acceptance of
> switches in the first (is it `case null`?) and last (is it total?) cases, then
> the incremental complexity of allowing _some_ switches to accept null might be
> outweighed by the incremental benefit of treating `null` more uniformly (and
> thus eliminating the refactoring anomalies.) Note also that there is no actual
> code compatibility issue; this is all mental-model compatibility.
>
> So far, we're suggesting:
>
>  - A switch with a constant `null` case will accept nulls;
>  - If present, a constant `null` case must go first;
>  - A switch with a total (any) case also accepts nulls;
>  - If present, a total (any) case must go last.
>
> #### Relocating the problem
>
> It might be more helpful to view these changes as not changing the behavior of
> `switch`, but of the `default` case of `switch`. We can equally well interpret
> the current behavior as:
>
>  - `switch` always accepts `null`, but matching the `default` case of a `switch`
>    throws `NullPointerException`;
>  - any `switch` without a `default` case has an implicit do-nothing `default`
>    case.
>
> If we adopt this change of perspective, then `default`, not `switch`, is in
> control of the null rejection behavior -- and we can view these changes as
> adjusting the behavior of `default`. So we can recast the proposed changes as:
>
>   - Switches accept null;
>   - A constant `null` case will match nulls, and must go first;
>   - A total switch (a switch with a total `case`) cannot have a `default` case;
>   - A non-total switch without a `default` case gets an implicit do-nothing
>     `default` case;
>   - Matching the (implicit or explicit) default case with a `null` operand
>     always throws NPE.
>
> The main casualty here is that the `default` case does not mean the same
> thing as `case var x` or `case Object o`. We can't deprecate `default`, but
> for pattern switches, it becomes much less useful.
>
> #### What about method (declared) patterns?
>
> So far, we've declared all patterns, except the `null` constant pattern and the
> total (any) pattern, to not match `null`. What about patterns that are
> explicitly declared in code? It turns out we can rule out these matching
> `null` fairly easily.
>
> We can divide declared patterns into three kinds: deconstruction patterns (dual
> to constructors), static patterns (dual to static methods), and instance
> patterns (dual to instance methods.) For both deconstruction and instance
> patterns, the match target becomes the receiver; method bodies are never
> expected to deal with the case where `this == null`.
>
> For static patterns, it is conceivable that they could match `null`, but this
> would put a fairly serious burden on writers of static patterns to check for
> `null` -- which they would invariably forget, and many more NPEs would ensue.
> (Think about writing the pattern for `Optional.of(T t)` -- it would be
> overwhelmingly likely we'd forget to check the target for nullity.) So there
> is a strong argument to simply say "declared patterns never match null", to
> not put writers of such patterns in this situation.
>
> So, only the top and bottom patterns in a switch could match null; if the top
> pattern is not `case null`, and the bottom pattern is not total, then the switch
> throws NPE on null, otherwise it accepts null.
>
> #### Adjusting instanceof
>
> The remaining anomaly we had was about unrolling nested patterns when the inner
> pattern is total. We can plug this by simply outlawing total patterns in
> `instanceof`.
>
> This may seem like a cheap trick, but it makes sense on its own.
> If the
> following statement was allowed:
>
> ```
> if (e instanceof var x) { X }
> ```
>
> it would simply be confusing; on the one hand, it looks like it should always
> match, but on the other, `instanceof` is historically null-hostile. And, if the
> pattern always matches, then the `if` statement is silly; it should be replaced
> with:
>
> ```
> var x = e;
> X
> ```
>
> since there's nothing conditional about it. So by banning "any" patterns on the
> RHS of `instanceof`, we both avoid a confusion about what is going to happen,
> and we prevent the unrolling anomaly.
>
> For reasons of compatibility, we will have to continue to allow
>
> ```
> if (e instanceof Object) { ... }
> ```
>
> which succeeds on all non-null operands.
>
> We've been a little sloppy with the terminology of "any" vs "total"; note that
> in
>
> ```
> Point p;
> if (p instanceof Point(var x, var y)) { }
> ```
>
> the pattern `Point(var x, var y)` is total on `Point`, but not an "any" pattern
> -- it still doesn't match on p == null.
>
> On the theory that an "any" pattern in `instanceof` is silly, we may also want
> to ban other "silly" patterns in `instanceof`, such as constant patterns, since
> all of the following have simpler forms:
>
> ```
> if (x instanceof null) { ... }
> if (x instanceof "") { ... }
> if (i instanceof 3) { ... }
> ```
>
> In the first round (type patterns in `instanceof`), we mostly didn't confront
> this issue, saying that `instanceof T t` matched in all the cases where
> `instanceof T` would match. But given that the solution for `switch` relies
> on "any" patterns matching null, we may wish to adjust the behavior of
> `instanceof` before it exits preview.
>
>
> [jep305]: https://openjdk.java.net/jeps/305
> [patternmatch]: pattern-match.html
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guy.steele at oracle.com Tue Aug 4 18:27:57 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 4 Aug 2020 14:27:57 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
Message-ID: <91D0A1FB-CE2C-40A5-B508-8DC247BA4EBF@oracle.com>

If we are to choose a contextual keyword for this purpose, I think "where" is the most natural for reading out loud, and is shorter (in several ways) than "only-if" (the hyphenation of which strikes me as a bit precious in this context).

--Guy

> On Aug 4, 2020, at 2:11 PM, Brian Goetz wrote:
>
> One thing this left open was the actual syntax of guards. (We know it's snowing in hell now, because I am actually encouraging a syntax conversation.)
>
> Patterns in `instanceof` do not need guards, because `instanceof` is an expression, and expressions conjoin with `&&`:
>
>     if (x instanceof Foo f && f.bar() == 3) { ... }
>
> We explored the approach of making the boolean guard expression part of the pattern -- where the above would parse as `x instanceof P`, where P is `Foo f && f.bar() == 3`. Since pattern matching is conditional, we could think of boolean expressions as patterns that are independent of their target. But, this approach didn't pan out due to various ambiguities.
>
> The most obvious ambiguity was the obvious interpretation of constant patterns; that `0` could be the literal zero or a pattern that matches zero. (I have since proposed we try to avoid constant patterns entirely.) Under this interpretation, not only was there room for confusion ("Is foo(0) an invocation of foo, or a pattern?"), but there were puzzlers like:
>
>     switch (b) {
>         case true && false: ...
>         case false && true: ...
>     }
>
> It would not be clear whether this would be two patterns conjoined together (neither of which would match anything) or the constant pattern `true && false`.
> (There are similar ambiguities with deconstruction patterns with no bindings.)
>
> Overall, all of this was enough to sour me on trying to press && into service here. Which leaves some relatively mundane syntax options:
>
>     case P when e
>     case P where e
>     case P if e
>     case P only-if e
>     case P where (e)
>
> and I guess weirder things like
>
>     case P &&& e
>     case P : e
>     etc
>
> I particularly don't like `if` since it makes it even harder to tell where the case ends and the consequent begins. Also, `e` should not contain a switch expression, since no one wants to try to parse:
>
>     case Foo f where switch (f.bar()) {
>         case Bar b -> 3;
>     } > 3 -> blah();
>
> in their heads. (We already excluded switch expressions from candidates for constant expressions in 12 for similar reasons.)
>
> I mostly think this is a matter of picking a (contextual) keyword. (I kind of like that `only-if` could be an actual keyword.)
>
>
>
> (Looking ahead, if we ever want to reenable support for merging, we might have patterns like:
>
>     case Foo(int x), Bar(int x) where x > 0:
>
> and we'd have to accept that the guard applies to _all_ the patterns.)
>
>
> On 6/24/2020 10:44 AM, Brian Goetz wrote:
>> There are a lot of directions we could take next for pattern matching. The one that builds most on what we've already done, and offers significant incremental expressiveness, is extending the type patterns we already have to a new context: switch. (There is still plenty of work to do on deconstruction patterns, pattern assignment, etc, but these require more design work.)
>>
>> Here's an overview of where I think we are here.
>>
>> [JEP 305][jep305] introduced the first phase of [pattern matching][patternmatch]
>> into the Java language. It was deliberately limited, focusing on only one kind
>> of pattern (type test patterns) and one linguistic context (`instanceof`).
>> Having introduced the concept to Java developers, we can now extend both the >> kinds of patterns and the linguistic context where patterns are used. >> >> ## Patterns in switch >> >> The obvious next context in which to introduce pattern matching is `switch`; a >> switch using patterns as `case` labels can replace `if .. else if` chains with >> a more direct way of expressing a multi-way conditional. >> >> Unfortunately, `switch` is one of the most complex, irregular constructs we have >> in Java, so we must teach it some new tricks while avoiding some existing traps. >> Such tricks and traps may include: >> >> **Typing.** Currently, the operand of a `switch` may only be one of the >> integral primitive types, the box type of an integral primitive, `String`, or an >> `enum` type. (Further, if the `switch` operand is an `enum` type, the `case` >> labels must be _unqualified_ enum constant names.) Clearly we can relax this >> restriction to allow other types, and constrain the case labels to only be >> patterns that are applicable to that type, but it may leave a seam of "legacy" >> vs "pattern" switch, especially if we do not adopt bare constant literals as >> the denotation of constant patterns. (We have confronted this issue before with >> expression switch, and concluded that it was better to rehabilitate the `switch` >> we have rather than create a new construct, and we will make the same choice >> here, but the cost of this is often a visible seam.) >> >> **Parsing.** The grammar currently specifies that the operand of a `case` label >> is a `CaseConstant`, which casts a wide syntactic net, later narrowed with >> post-checks after attribution. This means that, since parsing is done before we >> know the type of the operand, we must be watchful for ambiguities between >> patterns and expressions (and possibly refine the production for `case` labels.) 
>> >> **Nullity.** The `switch` construct is currently hostile to `null`, but some >> patterns do match `null`, and it may be desirable if nulls can be handled >> within a suitably crafted `switch`. >> >> **Exhaustiveness.** For switches over the permitted subtypes of sealed types, >> we will want to be able to do exhaustiveness analysis -- including for nested >> patterns (i.e., if `Shape` is `Circle` or `Rect`, then `Box(Circle c)` and >> `Box(Rect r)` are exhaustive on `Box`.) >> >> **Fallthrough.** Fallthrough is everyone's least favorite feature of `switch`, >> but it exists for a reason. (The mistake was making fallthrough the default >> behavior, but that ship has sailed.) In the absence of an OR pattern >> combinator, one might find fallthrough in switch useful in conjunction with >> patterns: >> >> ``` >> case Box(int x): >> case Bag(int x): >> // use x >> ``` >> >> However, it is likely that we will, at least initially, disallow falling out >> of, or into, a case label with binding variables. >> >> #### Translation >> >> Switches on primitives and their wrapper types are translated using the >> `tableswitch` or `lookupswitch` bytecodes; switches on strings and enums are >> lowered in the compiler to switches involving hash codes (for strings) or >> ordinals (for enums.) >> >> For switches on patterns, we would need a new strategy, one likely built on >> `invokedynamic`, where we lower the cases to a densely numbered `int` switch, >> and then invoke a classifier function with the operand which tells us the first >> case number it matches. So a switch like: >> >> ``` >> switch (o) { >> case P: A >> case Q: B >> } >> ``` >> >> is lowered to: >> >> ``` >> int target = indy[BSM=PatternSwitch, args=[P,Q]](o) >> switch (target) { >> case 0: A >> case 1: B >> } >> ``` >> >> A symbolic description of the patterns is provided as the bootstrap argument >> list, which builds a decision tree based on analysis of the patterns and their >> target types. 
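
The lowering described above can be imitated by hand in plain Java. This is only a sketch under stated assumptions: the `classify` helper and the `PatternSwitchSketch` class are hypothetical stand-ins for the `invokedynamic` call site and its bootstrap method, the "patterns" are reduced to simple type tests, and a linear scan replaces the decision tree a real bootstrap would build.

```java
import java.util.Arrays;
import java.util.List;

public class PatternSwitchSketch {
    // Hypothetical stand-in for the indy-linked classifier: returns the index
    // of the first case type the operand is an instance of, or -1 if none is.
    static int classify(Object operand, List<Class<?>> caseTypes) {
        for (int i = 0; i < caseTypes.size(); i++) {
            if (caseTypes.get(i).isInstance(operand)) {
                return i;
            }
        }
        return -1;
    }

    // The "bootstrap arguments": a symbolic description of the case labels.
    static final List<Class<?>> CASES =
            Arrays.<Class<?>>asList(String.class, Integer.class);

    // The pattern switch, lowered to a densely numbered int switch.
    static String describe(Object o) {
        int target = classify(o, CASES);
        switch (target) {
            case 0: return "string of length " + ((String) o).length();
            case 1: return "integer " + (Integer) o;
            default: return "no match";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("abc"));
        System.out.println(describe(42));
        System.out.println(describe(3.5));
    }
}
```

Because `Class.isInstance` is false for `null`, a null operand lands in the `default` arm of this sketch; how the real translation should treat null is exactly what the later sections discuss.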
>> >> #### Guards >> >> No matter how rich our patterns are, it is often the case that we will want >> to provide additional filtering on the results of a pattern: >> >> ``` >> if (shape instanceof Cylinder c && c.color() == RED) { ... } >> ``` >> >> Because we use `instanceof` as part of a boolean expression, it is easy to >> narrow the results by conjoining additional checks with `&&`. But in a `case` >> label, we do not necessarily have this opportunity. Worse, the semantics of >> `switch` mean that once a `case` label is selected, there is no way to say >> "oops, forget it, keep trying from the next label". >> >> It is common in languages with pattern matching to support some form of "guard" >> expression, which is a boolean expression that conditions whether the case >> matches, such as: >> >> ``` >> case Point(var x, var y) >> __where x == y: ... >> ``` >> >> Bindings from the pattern would have to be available in guard expressions. >> >> Syntactic options (and hazards) for guards abound; users would probably find it >> natural to reuse `&&` to attach guards to patterns; C# has chosen `when` for >> introducing guards; we could use `case P if (e)`, etc. Whatever we do here, >> there is a readability risk, as the more complex guards are, the harder it is >> to tell where the case label ends and the "body" begins. (And worse if we allow >> switch expressions inside guards.) >> >> An alternate to guards is to allow an imperative `continue` statement in >> `switch`, which would mean "keep trying to match from the next label." Given >> the existing semantics of `continue`, this is a natural extension, but since >> `continue` does not currently have meaning for switch, some work would have to >> be done to disambiguate continue statements in switches enclosed in loops. The >> imperative version is strictly more expressive than most reasonable forms of the >> declarative version, but users are likely to prefer the declarative version. 
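
Since no guard syntax has been chosen at this point in the thread, the intended behavior can be approximated with the `instanceof` patterns that already exist. The sketch below assumes Java 16 or later; the `GuardSketch` class and its `Point` record are illustrative names only. Each branch plays the role of one guarded `case` label, and a failed guard moves on to the next branch.

```java
public class GuardSketch {
    record Point(int x, int y) { }

    // When the condition after && is false, control moves to the next branch.
    // That is the "keep trying from the next label" behavior that a guard
    // on a case label would provide.
    static String classify(Object o) {
        if (o instanceof Point p && p.x() == p.y()) {
            return "diagonal point";
        } else if (o instanceof Point p) {
            return "other point";
        } else {
            return "not a point";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new Point(2, 2)));
        System.out.println(classify(new Point(1, 2)));
        System.out.println(classify("hello"));
    }
}
```

The binding `p` from the first test is in scope inside its own guard, which corresponds to the requirement that pattern bindings be available in guard expressions.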
>> >> ## Nulls >> >> Almost no language design exercise is complete without some degree of wrestling >> with `null`. As we define more complex patterns than simple type patterns, and >> extend constructs such as `switch` (which have existing opinions about nullity) >> to support patterns, we need to have a clear understanding of which patterns >> are nullable, and separate the nullity behaviors of patterns from the nullity >> behaviors of those constructs which use patterns. >> >> ## Nullity and patterns >> >> This topic has a number of easily-tangled concerns: >> >> - **Construct nullability.** Constructs to which we want to add pattern >> awareness (`instanceof`, `switch`) already have their own opinion about >> nulls. Currently, `instanceof` always says false when presented with a >> `null`, and `switch` always NPEs. We may, or may not, wish to refine these >> rules in some cases. >> - **Pattern nullability.** Some patterns clearly would never match `null` >> (such as deconstruction patterns), whereas others (an "any" pattern, and >> surely the `null` constant pattern) might make sense to match null. >> - **Refactoring friendliness.** There are a number of cases that we would like >> to freely refactor back and forth, such as certain chains of `if ... else if` >> with switches. >> - **Nesting vs top-level.** The "obvious" thing to do at the top level of a >> construct is not always the "obvious" thing to do in a nested construct. >> - **Totality vs partiality.** When a pattern is partial on the operand type >> (e.g., `case String` when the operand of switch is `Object`), it is almost >> never the case we want to match null (except in the case of the `null` >> constant pattern), whereas when a pattern is total on the operand type (e.g., >> `case Object` in the same example), it is more justifiable to match null. >> - **Inference.** It would be nice if a `var` pattern were simply inference for >> a type pattern, rather than some possibly-non-denotable union. 
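
The existing "opinions" of the two constructs, as stated in the first bullet, can be observed directly. A minimal sketch with a hypothetical `NullConstructs` class, using only long-standing language behavior:

```java
public class NullConstructs {
    // instanceof reports false for a null operand rather than throwing.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // A switch on a null String throws before any label is tried,
    // even when a default label is present.
    static String describe(String s) {
        try {
            switch (s) {
                case "a": return "matched a";
                default: return "default";
            }
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(isString(null));
        System.out.println(describe("a"));
        System.out.println(describe(null));
    }
}
```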
>> >> As a starting example, consider: >> >> ``` >> record Box(Object o) { } >> >> Box box = ... >> switch (box) { >> case Box(Chocolate c): >> case Box(Frog f): >> case Box(var o): >> } >> ``` >> >> It would be highly confusing and error-prone for either of the first two >> patterns to match `Box(null)` -- given that `Chocolate` and `Frog` have no type >> relation, it should be perfectly safe to reorder the two. But, because the last >> pattern seems so obviously total on boxes, it is quite likely that what the >> author wants is to match all remaining boxes, including those that contain null. >> (Further, it would be terrible if there were _no_ way to say "Match any `Box`, >> even if it contains `null`. (While one might initially think this could be >> repaired with OR patterns, imagine that `Box` had _n_ components -- we'd need to >> OR together _2^n_ patterns, with complex merging, to express all the possible >> combinations of nullity.)) >> >> Scala and C# took the approach of saying that "var" patterns are not just type >> inference, they are "any" patterns -- so `Box(Object o)` matches boxes >> containing a non-null payload, where `Box(var o)` matches all boxes. This >> means, unfortunately, that `var` is not mere type inference -- which complicates >> the role of `var` in the language considerably. Users should not have to choose >> between the semantics they want and being explicit about types; these should be >> orthogonal choices. The above `switch` should be equivalent to: >> >> ``` >> Box box = ... >> switch (box) { >> case Box(Chocolate c): >> case Box(Frog f): >> case Box(Object o): >> } >> ``` >> >> and the choice to use `Object` or `var` should be solely one of whether the >> manifest types are deemed to improve or impair readability. >> >> #### Construct and pattern nullability >> >> Currently, `instanceof` always says `false` on `null`, and `switch` always >> throws on `null`. 
Whatever null opinions a construct has, these are applied >> before we even test any patterns. >> >> We can formalize the intuition outlined above as: type patterns that are _total_ >> on their target operand (`var x`, and `T t` on an operand of type `U`, where `U >> <: T`) match null, and non-total type patterns do not. (Another way to say >> this is: a `var` pattern is the "any" pattern, and a type pattern that is total >> on its operand type is also an "any" pattern.) Additionally, the `null` >> constant pattern matches null. These are the _only_ nullable patterns. >> >> In our `Box` example, this means that the last case (whether written as `Box(var >> o)` or `Box(Object o)`) matches all boxes, including those containing null >> (because the nested pattern is total on the nested operand), but the first two >> cases do not. >> >> If we retain the current absolute hostility of `switch` to nulls, we can't >> trivially refactor from >> >> ``` >> switch (o) { >> case Box(Chocolate c): >> case Box(Frog f): >> case Box(var o): >> } >> ``` >> to >> >> ``` >> switch (o) { >> case Box(var contents): >> switch (contents) { >> case Chocolate c: >> case Frog f: >> case Object o: >> } >> } >> } >> ``` >> >> because the inner `switch(contents)` would NPE before we tried to match any of >> the patterns it contains. Instead, the user would explicitly have to do an `if >> (contents == null)` test, and, if the intent was to handle `null` in the same >> way as the `Object o` case, some duplication of code would be needed. We can >> address this sharp corner by slightly relaxing the null-hostility of `switch`, >> as described below. >> >> A similar sharp corner is the decomposition of a nested pattern `P(Q)` into >> `P(alpha) & alpha instanceof Q`; while this is intended to be a universally >> valid transformation, if P's 1st component might be null and Q is total, this >> transformation would not be valid because of the existing (mild) null-hostility >> of `instanceof`. 
Again, we may be able to address this by adjusting the rules >> surrounding `instanceof` slightly. >> >> ## Generalizing switch >> >> The refactoring example above motivates why we might want to relax the >> null-handling behavior of `switch`. On the other hand, the one thing the >> current behavior has going for it is that at least the current behavior is easy >> to reason about; it always throws when confronted with a `null`. Any relaxed >> behavior would be more complex; some switches would still have to throw (for >> compatibility with existing semantics), and some (which can't be expressed >> today) would accept nulls. This is a tricky balance to achieve, but I think we >> have a found a good one. >> >> A starting point is that we don't want to require readers to do an _O(n)_ >> analysis of each of the `case` labels just to determine whether a given switch >> accepts `null` or not; this should be an _O(1)_ analysis. (We do not want to >> introduce a new flavor of `switch`, such as `switch-nullable`; this might seem >> to fix the proximate problem but would surely create others. As we've done with >> expression switch and patterns, we'd rather rehabilitate `switch` than create >> an almost-but-not-quite-the-same variant.) >> >> Let's start with the null pattern, which we'll spell for sake of exposition >> `case null`. What if you were allowed to say `case null` in a switch, and the >> switch would do the obvious thing? >> >> ``` >> switch (o) { >> case null -> System.out.println("Ugh, null"); >> case String s -> System.out.println("Yay, non-null: " + s); >> } >> ``` >> >> Given that the `case null` appears so close to the `switch`, it does not seem >> confusing that this switch would match `null`; the existence of `case null` at >> the top of the switch makes it pretty clear that this is intended behavior. (We >> could further restrict the null pattern to being the first pattern in a switch, >> to make this clearer.) 
>> >> Now, let's look at the other end of the switch -- the last case. What if the >> last pattern is a total pattern? (Note that if any `case` has a total pattern, >> it _must_ be the last one, otherwise the cases after that would be dead, which >> would be an error.) Is it also reasonable for that to match null? After all, >> we're saying "everything": >> >> ``` >> switch (o) { >> case String s: ... >> case Object o: ... >> } >> ``` >> >> Under this interpretation, the switch-refactoring anomaly above goes away. >> >> The direction we're going here is that if we can localize the null-acceptance of >> switches in the first (is it `case null`?) and last (is it total?) cases, then >> the incremental complexity of allowing _some_ switches to accept null might be >> outweighed by the incremental benefit of treating `null` more uniformly (and >> thus eliminating the refactoring anomalies.) Note also that there is no actual >> code compatibility issue; this is all mental-model compatibility. >> >> So far, we're suggesting: >> >> - A switch with a constant `null` case will accept nulls; >> - If present, a constant `null` case must go first; >> - A switch with a total (any) case matches also accepts nulls; >> - If present, a total (any) case must go last. >> >> #### Relocating the problem >> >> It might be more helpful to view these changes as not changing the behavior of >> `switch`, but of the `default` case of `switch`. We can equally well interpret >> the current behavior as: >> >> - `switch` always accepts `null`, but matching the `default` case of a `switch` >> throws `NullPointerException`; >> - any `switch` without a `default` case has an implicit do-nothing `default` >> case. >> >> If we adopt this change of perspective, then `default`, not `switch`, is in >> control of the null rejection behavior -- and we can view these changes as >> adjusting the behavior of `default`. 
So we can recast the proposed changes as: >> >> - Switches accept null; >> - A constant `null` case will match nulls, and must go first; >> - A total switch (a switch with a total `case`) cannot have a `default` case; >> - A non-total switch without a `default` case gets an implicit do-nothing >> `default` case; >> - Matching the (implicit or explicit) default case with a `null` operand >> always throws NPE. >> >> The main casualty here is that the `default` case does not mean the same >> thing as `case var x` or `case Object o`. We can't deprecate `default`, but >> for pattern switches, it becomes much less useful. >> >> #### What about method (declared) patterns? >> >> So far, we've declared all patterns, except the `null` constant pattern and the >> total (any) pattern, to not match `null`. What about patterns that are >> explicitly declared in code? It turns out we can rule out these matching >> `null` fairly easily. >> >> We can divide declared patterns into three kinds: deconstruction patterns (dual >> to constructors), static patterns (dual to static methods), and instance >> patterns (dual to instance methods.) For both deconstruction and instance >> patterns, the match target becomes the receiver; method bodies are never >> expected to deal with the case where `this == null`. >> >> For static patterns, it is conceivable that they could match `null`, but this >> would put a fairly serious burden on writers of static patterns to check for >> `null` -- which they would invariably forget, and many more NPEs would ensue. >> (Think about writing the pattern for `Optional.of(T t)` -- it would be >> overwhelmingly likely we'd forget to check the target for nullity.) SO there >> is a strong argument to simply say "declared patterns never match null", to >> not put writers of such patterns in this situation. 
>> >> So, only the top and bottom patterns in a switch could match null; if the top >> pattern is not `case null`, and the bottom pattern is not total, then the switch >> throws NPE on null, otherwise it accepts null. >> >> #### Adjusting instanceof >> >> The remaining anomaly we had was about unrolling nested patterns when the inner >> pattern is total. We can plug this by simply outlawing total patterns in >> `instanceof`. >> >> This may seem like a cheap trick, but it makes sense on its own. If the >> following statement was allowed: >> >> ``` >> if (e instanceof var x) { X } >> ``` >> >> it would simply be confusing; on the one hand, it looks like it should always >> match, but on the other, `instanceof` is historically null-hostile. And, if the >> pattern always matches, then the `if` statement is silly; it should be replaced >> with: >> >> ``` >> var x = e; >> X >> ``` >> >> since there's nothing conditional about it. So by banning "any" patterns on the >> RHS of `instanceof`, we both avoid a confusion about what is going to happen, >> and we prevent the unrolling anomaly. >> >> For reasons of compatibility, we will have to continue to allow >> >> ``` >> if (e instanceof Object) { ... } >> ``` >> >> which succeeds on all non-null operands. >> >> We've been a little sloppy with the terminology of "any" vs "total"; note that >> in >> >> ``` >> Point p; >> if (p instanceof Point(var x, var y)) { } >> ``` >> >> the pattern `Point(var x, var y)` is total on `Point`, but not an "any" pattern >> -- it still doesn't match on p == null. >> >> On the theory that an "any" pattern in `instanceof` is silly, we may also want >> to ban other "silly" patterns in `instanceof`, such as constant patterns, since >> all of the following have simpler forms: >> >> ``` >> if (x instanceof null) { ... } >> if (x instanceof "") { ... } >> if (i instanceof 3) { ... 
}
>> ```
>>
>> In the first round (type patterns in `instanceof`), we mostly didn't confront
>> this issue, saying that `instanceof T t` matched in all the cases where
>> `instanceof T` would match. But given that the solution for `switch` relies
>> on "any" patterns matching null, we may wish to adjust the behavior of
>> `instanceof` before it exits preview.
>>
>>
>> [jep305]: https://openjdk.java.net/jeps/305
>> [patternmatch]: pattern-match.html
>>
>

From brian.goetz at oracle.com Thu Aug 6 19:10:45 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 6 Aug 2020 15:10:45 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References:
Message-ID: <3da2a849-84f1-6f7b-ef81-9fe152ea0268@oracle.com>

As some of you may have noticed from the automated messages, we're in the
process of migrating design docs to GitHub. I've created an amber-docs repo
where I'm checking in Markdown docs; soon, a GitHub Action will automatically
format them on push, and publish them to the openjdk website.

In the meantime, this is an easier way to deal with and collaborate on
documents, and I can accept pull requests for corrections and such. For those
of you who have commit rights to amber, you probably have it for amber-docs as
well (if not, let me know).

Etiquette: it is acceptable to directly commit "obvious" fixes (correcting
typos, etc) on documents authored by others. Everything else, send a pull
request.

I've staged a slightly refined version of the story outlined here at:

https://github.com/openjdk/amber-docs/blob/master/site/design-notes/type-patterns-in-switch.md

On 6/24/2020 10:44 AM, Brian Goetz wrote:
> There are a lot of directions we could take next for pattern matching. The
> one that builds most on what we've already done, and offers significant
> incremental expressiveness, is extending the type patterns we already have
> to a new context: switch.
(There is still > plenty of work to do on deconstruction patterns, pattern assignment, > etc, but these require more design work.) > > Here's an overview of where I think we are here. > > [JEP 305][jep305] introduced the first phase of [pattern > matching][patternmatch] > into the Java language.? It was deliberately limited, focusing on only > one kind > of pattern (type test patterns) and one linguistic context (`instanceof`). > Having introduced the concept to Java developers, we can now extend > both the > kinds of patterns and the linguistic context where patterns are used. > > ## Patterns in switch > > The obvious next context in which to introduce pattern matching is > `switch`;? a > switch using patterns as `case` labels can replace `if .. else if` > chains with > a more direct way of expressing a multi-way conditional. > > Unfortunately, `switch` is one of the most complex, irregular > constructs we have > in Java, so we must teach it some new tricks while avoiding some > existing traps. > Such tricks and traps may include: > > **Typing.**? Currently, the operand of a `switch` may only be one of the > integral primitive types, the box type of an integral primitive, > `String`, or an > `enum` type.? (Further, if the `switch` operand is an `enum` type, the > `case` > labels must be _unqualified_ enum constant names.)? Clearly we can > relax this > restriction to allow other types, and constrain the case labels to only be > patterns that are applicable to that type, but it may leave a seam of > "legacy" > vs "pattern" switch, especially if we do not adopt bare constant > literals as > the denotation of constant patterns.? (We have confronted this issue > before with > expression switch, and concluded that it was better to rehabilitate > the `switch` > we have rather than create a new construct, and we will make the same > choice > here, but the cost of this is often a visible seam.) > > **Parsing.**? 
The grammar currently specifies that the operand of a > `case` label > is a `CaseConstant`, which casts a wide syntactic net, later narrowed with > post-checks after attribution.? This means that, since parsing is done > before we > know the type of the operand, we must be watchful for ambiguities between > patterns and expressions (and possibly refine the production for > `case` labels.) > > **Nullity.**? The `switch` construct is currently hostile to `null`, > but some > patterns do match `null`, and it may be desirable if nulls can be handled > within a suitably crafted `switch`. > > **Exhaustiveness.**? For switches over the permitted subtypes of > sealed types, > we will want to be able to do exhaustiveness analysis -- including for > nested > patterns (i.e., if `Shape`? is `Circle` or `Rect`, then `Box(Circle > c)` and > `Box(Rect r)` are exhaustive on `Box`.) > > **Fallthrough.**? Fallthrough is everyone's least favorite feature of > `switch`, > but it exists for a reason.? (The mistake was making fallthrough the > default > behavior, but that ship has sailed.)? In the absence of an OR pattern > combinator, one might find fallthrough in switch useful in conjunction > with > patterns: > > ``` > case Box(int x): > case Bag(int x): > ??? // use x > ``` > > However, it is likely that we will, at least initially, disallow > falling out > of, or into, a case label with binding variables. > > #### Translation > > Switches on primitives and their wrapper types are translated using the > `tableswitch` or `lookupswitch` bytecodes; switches on strings and > enums are > lowered in the compiler to switches involving hash codes (for strings) or > ordinals (for enums.) > > For switches on patterns, we would need a new strategy, one likely > built on > `invokedynamic`, where we lower the cases to a densely numbered `int` > switch, > and then invoke a classifier function with the operand which tells us > the first > case number it matches.? 
So a switch like: > > ``` > switch (o) { > ??? case P: A > ??? case Q: B > } > ``` > > is lowered to: > > ``` > int target = indy[BSM=PatternSwitch, args=[P,Q]](o) > switch (target) { > ??? case 0: A > ??? case 1: B > } > ``` > > A symbolic description of the patterns is provided as the bootstrap > argument > list, which builds a decision tree based on analysis of the patterns > and their > target types. > > #### Guards > > No matter how rich our patterns are, it is often the case that we will > want > to provide additional filtering on the results of a pattern: > > ``` > if (shape instanceof Cylinder c && c.color() == RED) { ... } > ``` > > Because we use `instanceof` as part of a boolean expression, it is easy to > narrow the results by conjoining additional checks with `&&`.? But in > a `case` > label, we do not necessarily have this opportunity.? Worse, the > semantics of > `switch` mean that once a `case` label is selected, there is no way to say > "oops, forget it, keep trying from the next label". > > It is common in languages with pattern matching to support some form > of "guard" > expression, which is a boolean expression that conditions whether the case > matches, such as: > > ``` > case Point(var x, var y) > ??? __where x == y: ... > ``` > > Bindings from the pattern would have to be available in guard expressions. > > Syntactic options (and hazards) for guards abound; users would > probably find it > natural to reuse `&&` to attach guards to patterns; C# has chosen > `when` for > introducing guards; we could use `case P if (e)`, etc. Whatever we do > here, > there is a readability risk,? as the more complex guards are, the > harder it is > to tell where the case label ends and the "body" begins.? (And worse > if we allow > switch expressions inside guards.) > > An alternate to guards is to allow an imperative `continue` statement in > `switch`, which would mean "keep trying to match from the next > label."? 
Given > the existing semantics of `continue`, this is a natural extension, but > since > `continue` does not currently have meaning for switch, some work would > have to > be done to disambiguate continue statements in switches enclosed in > loops.? The > imperative version is strictly more expressive than most reasonable > forms of the > declarative version, but users are likely to prefer the declarative > version. > > ## Nulls > > Almost no language design exercise is complete without some degree of > wrestling > with `null`.? As we define more complex patterns than simple type > patterns, and > extend constructs such as `switch` (which have existing opinions about > nullity) > to support patterns, we need to have a clear understanding of which > patterns > are nullable, and separate the nullity behaviors of patterns from the > nullity > behaviors of those constructs which use patterns. > > ## Nullity and patterns > > This topic has a number of easily-tangled concerns: > > ?- **Construct nullability.**? Constructs to which we want to add pattern > ?? awareness (`instanceof`, `switch`) already have their own opinion about > ?? nulls.? Currently, `instanceof` always says false when presented with a > ?? `null`, and `switch` always NPEs.? We may, or may not, wish to > refine these > ?? rules in some cases. > ?- **Pattern nullability.**? Some patterns clearly would never match > `null` > ?? (such as deconstruction patterns), whereas others (an "any" > pattern, and > ?? surely the `null` constant pattern) might make sense to match null. > ?- **Refactoring friendliness.**? There are a number of cases that we > would like > ?? to freely refactor back and forth, such as certain chains of `if > ... else if` > ?? with switches. > ?- **Nesting vs top-level.**? The "obvious" thing to do at the top > level of a > ?? construct is not always the "obvious" thing to do in a nested > construct. > ?- **Totality vs partiality.**? When a pattern is partial on the > operand type > ?? 
(e.g., `case String` when the operand of switch is `Object`), it is > almost > ?? never the case we want to match null (except in the case of the `null` > ?? constant pattern), whereas when a pattern is total on the operand > type (e.g., > ?? `case Object` in the same example), it is more justifiable to match > null. > ?- **Inference.**? It would be nice if a `var` pattern were simply > inference for > ?? a type pattern, rather than some possibly-non-denotable union. > > As a starting example, consider: > > ``` > record Box(Object o) { } > > Box box = ... > switch (box) { > ??? case Box(Chocolate c): > ??? case Box(Frog f): > ??? case Box(var o): > } > ``` > > It would be highly confusing and error-prone for either of the first two > patterns to match `Box(null)` -- given that `Chocolate` and `Frog` > have no type > relation, it should be perfectly safe to reorder the two. But, because > the last > pattern seems so obviously total on boxes, it is quite likely that > what the > author wants is to match all remaining boxes, including those that > contain null. > (Further, it would be terrible if there were _no_ way to say "Match > any `Box`, > even if it contains `null`.? (While one might initially think this > could be > repaired with OR patterns, imagine that `Box` had _n_ components -- > we'd need to > OR together _2^n_ patterns, with complex merging, to express all the > possible > combinations of nullity.)) > > Scala and C# took the approach of saying that "var" patterns are not > just type > inference, they are "any" patterns -- so `Box(Object o)` matches boxes > containing a non-null payload, where `Box(var o)` matches all boxes.? This > means, unfortunately, that `var` is not mere type inference -- which > complicates > the role of `var` in the language considerably.? Users should not have > to choose > between the semantics they want and being explicit about types; these > should be > orthogonal choices.? 
The above `switch` should be equivalent to: > > ``` > Box box = ... > switch (box) { > ??? case Box(Chocolate c): > ??? case Box(Frog f): > ??? case Box(Object o): > } > ``` > > and the choice to use `Object` or `var` should be solely one of > whether the > manifest types are deemed to improve or impair readability. > > #### Construct and pattern nullability > > Currently, `instanceof` always says `false` on `null`, and `switch` always > throws on `null`.? Whatever null opinions a construct has, these are > applied > before we even test any patterns. > > We can formalize the intuition outlined above as: type patterns that > are _total_ > on their target operand (`var x`, and `T t` on an operand of type `U`, > where `U > <: T`) match null, and non-total type patterns do not. (Another way to say > this is: a `var` pattern is the "any" pattern, and a type pattern that > is? total > on its operand type is also an "any" pattern.)? Additionally, the `null` > constant pattern matches null.? These are the _only_ nullable patterns. > > In our `Box` example, this means that the last case (whether written > as `Box(var > o)` or `Box(Object o)`) matches all boxes, including those containing null > (because the nested pattern is total on the nested operand), but the > first two > cases do not. > > If we retain the current absolute hostility of `switch` to nulls, we can't > trivially refactor from > > ``` > switch (o) { > ??? case Box(Chocolate c): > ??? case Box(Frog f): > ??? case Box(var o): > } > ``` > to > > ``` > switch (o) { > ??? case Box(var contents): > ??????? switch (contents) { > ??????????? case Chocolate c: > ??????????? case Frog f: > ??????????? case Object o: > ??????? } > ??? } > } > ``` > > because the inner `switch(contents)` would NPE before we tried to > match any of > the patterns it contains.? 
Instead, the user would explicitly have to > do an `if > (contents == null)` test, and, if the intent was to handle `null` in > the same > way as the `Object o` case, some duplication of code would be needed.? > We can > address this sharp corner by slightly relaxing the null-hostility of > `switch`, > as described below. > > A similar sharp corner is the decomposition of a nested pattern `P(Q)` > into > `P(alpha) & alpha instanceof Q`; while this is intended to be a > universally > valid transformation, if P's 1st component might be null and Q is > total,? this > transformation would not be valid because of the existing (mild) > null-hostility > of `instanceof`.? Again, we may be able to address this by adjusting > the rules > surrounding `instanceof` slightly. > > ## Generalizing switch > > The refactoring example above motivates why we might want to relax the > null-handling behavior of `switch`.? On the other hand, the one thing the > current behavior has going for it is that at least the current > behavior is easy > to reason about; it always throws when confronted with a `null`.? Any > relaxed > behavior would be more complex; some switches would still have to > throw (for > compatibility with existing semantics), and some (which can't be expressed > today) would accept nulls.? This is a tricky balance to achieve, but I > think we > have a found a good one. > > A starting point is that we don't want to require readers to do an _O(n)_ > analysis of each of the `case` labels just to determine whether a > given switch > accepts `null` or not; this should be an _O(1)_ analysis.? (We do not > want to > introduce a new flavor of `switch`, such as `switch-nullable`; this > might seem > to fix the proximate problem but would surely create others. As we've > done with > expression switch and patterns, we'd rather rehabilitate `switch` than > create > an almost-but-not-quite-the-same variant.) 
>
> Let's start with the null pattern, which we'll spell for sake of exposition
> `case null`. What if you were allowed to say `case null` in a switch, and the
> switch would do the obvious thing?
>
> ```
> switch (o) {
>     case null -> System.out.println("Ugh, null");
>     case String s -> System.out.println("Yay, non-null: " + s);
> }
> ```
>
> Given that the `case null` appears so close to the `switch`, it does not seem
> confusing that this switch would match `null`; the existence of `case null` at
> the top of the switch makes it pretty clear that this is intended behavior. (We
> could further restrict the null pattern to being the first pattern in a switch,
> to make this clearer.)
>
> Now, let's look at the other end of the switch -- the last case. What if the
> last pattern is a total pattern? (Note that if any `case` has a total pattern,
> it _must_ be the last one, otherwise the cases after that would be dead, which
> would be an error.) Is it also reasonable for that to match null? After all,
> we're saying "everything":
>
> ```
> switch (o) {
>     case String s: ...
>     case Object o: ...
> }
> ```
>
> Under this interpretation, the switch-refactoring anomaly above goes away.
>
> The direction we're going here is that if we can localize the null-acceptance of
> switches in the first (is it `case null`?) and last (is it total?) cases, then
> the incremental complexity of allowing _some_ switches to accept null might be
> outweighed by the incremental benefit of treating `null` more uniformly (and
> thus eliminating the refactoring anomalies.) Note also that there is no actual
> code compatibility issue; this is all mental-model compatibility.
>
> So far, we're suggesting:
>
>  - A switch with a constant `null` case will accept nulls;
>  - If present, a constant `null` case must go first;
>  - A switch with a total (any) case also accepts nulls;
>  - If present, a total (any) case must go last.
>
> #### Relocating the problem
>
> It might be more helpful to view these changes as not changing the behavior of
> `switch`, but of the `default` case of `switch`. We can equally well interpret
> the current behavior as:
>
>  - `switch` always accepts `null`, but matching the `default` case of a `switch`
>    throws `NullPointerException`;
>  - any `switch` without a `default` case has an implicit do-nothing `default`
>    case.
>
> If we adopt this change of perspective, then `default`, not `switch`, is in
> control of the null rejection behavior -- and we can view these changes as
> adjusting the behavior of `default`. So we can recast the proposed changes as:
>
>   - Switches accept null;
>   - A constant `null` case will match nulls, and must go first;
>   - A total switch (a switch with a total `case`) cannot have a `default` case;
>   - A non-total switch without a `default` case gets an implicit do-nothing
>     `default` case;
>   - Matching the (implicit or explicit) default case with a `null` operand
>     always throws NPE.
>
> The main casualty here is that the `default` case does not mean the same
> thing as `case var x` or `case Object o`. We can't deprecate `default`, but
> for pattern switches, it becomes much less useful.
>
> #### What about method (declared) patterns?
>
> So far, we've declared all patterns, except the `null` constant pattern and the
> total (any) pattern, to not match `null`. What about patterns that are
> explicitly declared in code? It turns out we can rule out these matching
> `null` fairly easily.
>
> We can divide declared patterns into three kinds: deconstruction patterns (dual
> to constructors), static patterns (dual to static methods), and instance
> patterns (dual to instance methods.) For both deconstruction and instance
> patterns, the match target becomes the receiver; method bodies are never
> expected to deal with the case where `this == null`.
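
[Illustrative aside, not part of the original message.] The receiver argument in the paragraph above rests on ordinary Java method-invocation semantics: an instance method body is never entered with a null receiver, because the invocation itself throws first. A minimal sketch of that behavior follows; the class and method names (`NullReceiverDemo`, `size`, `invokeOnNull`) are hypothetical and are not taken from the design note.

```java
// Illustrative sketch only; all names here are hypothetical.
final class NullReceiverDemo {
    // Set to true only if the body of size() actually runs.
    static boolean bodyEntered = false;

    // An ordinary instance method; its body can assume `this` is non-null.
    int size() {
        bodyEntered = true;
        return 0;
    }

    // Invokes size() on a null receiver and reports what happened:
    // "NPE" if the call threw before the body ran, "entered" otherwise.
    static String invokeOnNull() {
        NullReceiverDemo target = null;
        try {
            target.size();
            return "entered";
        } catch (NullPointerException e) {
            return bodyEntered ? "entered" : "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(invokeOnNull());
    }
}
```

If deconstruction and instance patterns are modeled on instance methods, as the note describes, the same reasoning suggests their bodies would never see a null match target.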
>
> For static patterns, it is conceivable that they could match `null`, but this
> would put a fairly serious burden on writers of static patterns to check for
> `null` -- which they would invariably forget, and many more NPEs would ensue.
> (Think about writing the pattern for `Optional.of(T t)` -- it would be
> overwhelmingly likely we'd forget to check the target for nullity.) So there
> is a strong argument to simply say "declared patterns never match null", to
> not put writers of such patterns in this situation.
>
> So, only the top and bottom patterns in a switch could match null; if the top
> pattern is not `case null`, and the bottom pattern is not total, then the switch
> throws NPE on null, otherwise it accepts null.
>
> #### Adjusting instanceof
>
> The remaining anomaly we had was about unrolling nested patterns when the inner
> pattern is total. We can plug this by simply outlawing total patterns in
> `instanceof`.
>
> This may seem like a cheap trick, but it makes sense on its own. If the
> following statement was allowed:
>
> ```
> if (e instanceof var x) { X }
> ```
>
> it would simply be confusing; on the one hand, it looks like it should always
> match, but on the other, `instanceof` is historically null-hostile. And, if the
> pattern always matches, then the `if` statement is silly; it should be replaced
> with:
>
> ```
> var x = e;
> X
> ```
>
> since there's nothing conditional about it. So by banning "any" patterns on the
> RHS of `instanceof`, we both avoid a confusion about what is going to happen,
> and we prevent the unrolling anomaly.
>
> For reasons of compatibility, we will have to continue to allow
>
> ```
> if (e instanceof Object) { ... }
> ```
>
> which succeeds on all non-null operands.
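
[Illustrative aside, not part of the original message.] The existing null behavior that the note takes as its starting point (classic `instanceof` answers false on null, and a classic `switch` throws on a null operand) can be checked with a small program that uses only pre-pattern Java. This is a sketch; the names `InstanceofNullDemo`, `isObject` and `describe` are hypothetical.

```java
// Illustrative sketch only; all names here are hypothetical.
final class InstanceofNullDemo {
    // Classic type-test instanceof: true for every non-null reference,
    // false when the operand is null.
    static boolean isObject(Object e) {
        return e instanceof Object;
    }

    // Classic switch on a String: a null operand throws
    // NullPointerException before any case label is considered.
    static String describe(String s) {
        switch (s) {
            case "a":
                return "letter a";
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(isObject("hello"));
        System.out.println(isObject(null));
        try {
            describe(null);
        } catch (NullPointerException e) {
            System.out.println("switch threw NullPointerException");
        }
    }
}
```

Neither construct gets as far as consulting a pattern here, which is the distinction the note draws between the null opinion of a construct and the nullability of a pattern.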
>
> We've been a little sloppy with the terminology of "any" vs "total"; note that
> in
>
> ```
> Point p;
> if (p instanceof Point(var x, var y)) { }
> ```
>
> the pattern `Point(var x, var y)` is total on `Point`, but not an "any" pattern
> -- it still doesn't match on p == null.
>
> On the theory that an "any" pattern in `instanceof` is silly, we may also want
> to ban other "silly" patterns in `instanceof`, such as constant patterns, since
> all of the following have simpler forms:
>
> ```
> if (x instanceof null) { ... }
> if (x instanceof "") { ... }
> if (i instanceof 3) { ... }
> ```
>
> In the first round (type patterns in `instanceof`), we mostly didn't confront
> this issue, saying that `instanceof T t` matched in all the cases where
> `instanceof T` would match. But given that the solution for `switch` relies
> on "any" patterns matching null, we may wish to adjust the behavior of
> `instanceof` before it exits preview.
>
>
> [jep305]: https://openjdk.java.net/jeps/305
> [patternmatch]: pattern-match.html
>

From forax at univ-mlv.fr Thu Aug 6 21:35:43 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 6 Aug 2020 23:35:43 +0200 (CEST)
Subject: Nullable switch
In-Reply-To: <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com>
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com>
Message-ID: <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr>

I've re-read the proposed spec about the nullable switch,
and I still don't understand why people will think that the following switch
is a nullable switch

  switch(o) {
    case String s ->
    case Object o ->
  }

The current spec merges the notions of totality and nullability, and I fail to
understand why.
Moreover, the spec explains that we need an any pattern but doesn't explain why
having an explicit syntax "any foo", which separates the notions of totality
and nullability, is not a good idea.

I wonder what the problem is with introducing a syntax "any foo" explicitly,
with the following rules for a nullable switch:

 * A switch with a constant null case will accept nulls;
 * If present, a constant null case must go first;
 * A switch with an any case also accepts nulls;
 * If present, an any case must go last.

For example,

  Object o = ...
  switch(o) {
    case var o -> // non null Object
    case any o -> // nullable Object
  }

In that case, the following switch is not nullable

  switch(o) {
    case String s ->
    case Object o ->
  }

and the nullable version is

  switch(o) {
    case String s ->
    case any o ->
  }

regards,
Rémi

> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz" , "amber-spec-experts"
> Sent: Friday, 24 July 2020 03:20:20
> Subject: Re: Next up for patterns: type patterns in switch

> On Jul 23, 2020, at 2:53 PM, Remi Forax < [ mailto:forax at univ-mlv.fr |
> forax at univ-mlv.fr ] > wrote:

>> var x and default are not on the same plane. So it's not really a third thing.
>> We are introducing something special for the bottom, null, but not for the top?

> Eh; null doesn't need to be that special, but Brian's point is that
> you can just mandate that it appears at the top or nowhere.

> If you don't mandate that, then type-coverage checks ensure
> that a "case null" which appears after a nullable case (a total
> one) will be not-reachable, and a static error.
From brian.goetz at oracle.com Thu Aug 6 22:14:03 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 6 Aug 2020 18:14:03 -0400
Subject: Nullable switch
In-Reply-To: <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr>
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr>
Message-ID:

If we were paying the full cost of nullable types (T?), then there would be an
obvious choice: T would be a non-nullable type pattern, and T? would be a
nullable type pattern. But introducing an `any Foo` notion has similar
conceptual surface area, but dramatically less utility. So the "return on
syntax" for "any Foo" is not good.

We were considering T? at one point, back when we were considering T? over in
Valhalla land, but as soon as that didn't pan out over there, the
attractiveness of using it over here pretty quickly went to zero.

The argument for using totality to wriggle out of the nullity trap is a
sophisticated one, which may be part of the problem, but it is largely about
uniformity (and partially a pick-your-poison.)

I think this is a forced move: that

    case Box(var o):

be total (including Box(null)). Any other conclusion is hard to take
seriously. If this excludes Box(null), users will make frequent errors --
because users routinely ignore null's special behaviors (and we mostly want
them to keep doing so.)

The next step is a principled one: It's not acceptable for `var` patterns to
be inconsistent with type inference. The choice Scala/C# made here is
terrible, that switching between an inferred and manifest type changes the
semantics of the match. Super Not OK. So that means

    case Box(Object o):

has to be total on boxes too.

Another principled choice is that we want the invariant that

    x P(Q)

and

    x P(var alpha) && alpha Q

be equivalent.
The alternative will lead to bad refactoring anomalies and bugs. (You can
consider things like this as being analogous to the monad laws; they're what
let you freely refactor forms that users will assume are equivalent.)

Which leads us right to: the pattern `Object o` is total on Object --
including null. (If `instanceof` or `switch` have a rigid opinion on nullity,
we won't get to the part where we try the match, but it can still be a
nullable pattern.)

You can make a similar argument for refactoring between if-else chains and
switches, or between switch-of-nest and nested switch:

    switch (b) {
        case Box(Frog f): ...      <-- partial
        case Box(Object o): ...    <-- total
    }

should be equivalent to

    switch (b) {
        case Box(var x):
            switch (x) {
                case Frog f:
                case Object o:     <-- must match null, otherwise refactoring is invalid
            }
    }

But if the inner switch throws NPE, our refactoring is broken. Sharp edge,
user is bleeding.

You're saying that the language is better without asking users to reason about
complex rules about nullity. But the cost of this is sharp edges when
refactoring (swap var for manifest type, swap instanceof chain for switch,
swap nested pattern switch for switch of nested pattern), and user surprises.
The complexity isn't gone, it's just moved to where we don't talk about it,
but it is still waiting there to cut your fingers.

The totality rule is grounded in principle, and leads to the "obvious" answers
in the most important cases. Yes, it's a little more complicated. But the
alternative just distributes the complication around the room, like the shards
of a broken window no one has bothered to clean up.

On 8/6/2020 5:35 PM, forax at univ-mlv.fr wrote:
> I've re-read the proposed spec about the nullable switch,
> and I still don't understand why people will think that the
> following switch is a nullable switch
>   switch(o) {
>     case String s ->
>     case Object o ->
>   }

It's not clear that they have to. Right now, switches throw hard on null, and
yet the world is not full of NPEs that come from switches. So most users don't
even consider null when writing switches, and somehow it all works out. The
last thing we want to do is grab them by the collar, shake them, and say "you
are a bad programmer, not thinking about nulls in switch! You must start
obsessing over it immediately!" Instead, if they're in blissful ignorance now,
let's let them stay there. Let's make the idioms do the obvious thing
(Box(var x) matches all boxes), and then, for the 1% of users who find they
need to reason about nulls, give them rules they can understand.

From forax at univ-mlv.fr Thu Aug 6 23:22:32 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 7 Aug 2020 01:22:32 +0200 (CEST)
Subject: Nullable switch
In-Reply-To:
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr>
Message-ID: <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts" , "John Rose"
> Sent: Friday, 7 August 2020 00:14:03
> Subject: Re: Nullable switch

> If we were paying the full cost of nullable types (T?), then there would be an
> obvious choice: T would be a non-nullable type pattern, and T? would be a
> nullable type pattern. But introducing an `any Foo` notion has similar
> conceptual surface area, but dramatically less utility. So the "return on
> syntax" for "any Foo" is not good.

It has the same utility as introducing case null.
The return on syntax is not good, you're right, but this is true for both case
null and case any because they are dual.

[...]
> The argument for using totality to wriggle out of the nullity trap is a sophisticated one, which may be part of the problem, but it is largely about uniformity (and partially a pick-your-poison.)

> I think this is a forced move: that
>     case Box(var o):
> be total (including Box(null)). Any other conclusion is hard to take seriously. If this excludes Box(null), users will make frequent errors -- because users routinely ignore null's special behaviors (and we mostly want them to keep doing so.)

You made this argument several times but I don't understand it: why does case Box(var o) have to allow null if a case Box(any o) that allows null exists?

> The next step is a principled one: It's not acceptable for `var` patterns to be inconsistent with type inference. The choice Scala/C# made here is terrible, that switching between an inferred and manifest type changes the semantics of the match. Super Not OK. So that means
>     case Box(Object o):
> has to be total on boxes too.

The meaning of case Box(String s) should be the same in every switch; it should not accept null in some switches and reject it in others.
I'm able to follow your way of thinking, but the end result is just bad.

And again, people are not compilers. Because you are mixing totality and nullability, whether "case Comparable c" accepts null in the following code is not an easy question.

    var o = (i == 3) ? "hello" : 42;
    switch(o) {
      case Comparable c ->
    }

Worse, it may depend on which JDK/library version you are using, because an interface may have been added to the implements list of a class in the next version of a library.

> Another principled choice is that we want the invariant that
>     x P(Q)
> and
>     x P(var alpha) && alpha Q
> be equivalent. The alternative will lead to bad refactoring anomalies and bugs. (You can consider things like this as being analogous to the monad laws; they're what let you freely refactor forms that users will assume are equivalent.)

> Which leads us right to: the pattern `Object o` is total on Object -- including null. (If `instanceof` or `switch` have a rigid opinion on nullity, we won't get to the part where we try the match, but it can still be a nullable pattern.)

As you said, the current semantics of instanceof has a strong opinion about null, so here we have a choice: either we generalize instanceof so that nullable patterns are allowed, or we disallow nullable patterns there and instanceof never matches null.
This decision is unrelated to how we define the nullable pattern; it can be "any" or it can be "var|Object + totality".

> You can make a similar argument for refactoring between if-else chains and switches, or between switch-of-nest and nested switch:
>     switch (b) {
>         case Box(Frog f): ...      <-- partial
>         case Box(Object o): ...    <-- total
>     }
> should be equivalent to
>     switch (b) {
>         case Box(var x):
>             switch (x) {
>                 case Frog f:
>                 case Object o:     <-- must match null, otherwise refactoring is invalid
>             }
>     }
> But if the inner switch throws NPE, our refactoring is broken. Sharp edge, user is bleeding.

Again, this has nothing to do with how to write the nullable pattern.
So this argument also works if the nullable pattern uses any.

    switch(b) {
      case Box(Frog f) ->
      case Box(any a) ->
    }

is equivalent to

    switch(b) {
      case Box(any a) -> switch(a) {
        case Frog f ->
        case any o ->
      };
    }

> You're saying that the language is better without asking users to reason about complex rules about nullity. But the cost of this is sharp edges when refactoring (swap var for manifest type, swap instanceof chain for switch, swap nested pattern switch for switch of nested pattern), and user surprises. The complexity isn't gone, it's just moved to where we don't talk about it, but it is still waiting there to cut your fingers.

No. The refactoring rules work whatever we decide the nullable pattern is.

> The totality rule is grounded in principle, and leads to the "obvious" answers in the most important cases. Yes, it's a little more complicated. But the alternative just distributes the complication around the room, like the shards of a broken window no one has bothered to clean up.

The obvious answer is not obvious. It rests on your belief that someone who sees case Object o and knows that the case is total will conclude that this pattern allows null. But knowing whether a case is total is far from obvious, and conversely, if you do not want a total case to match null, you have no solution.

Rémi

> On 8/6/2020 5:35 PM, forax at univ-mlv.fr wrote:
>> I've re-read the proposed spec about the nullable switch, and I still don't understand why people will think that the following switch is a nullable switch
>>   switch(o) {
>>     case String s ->
>>     case Object o ->
>>   }

> It's not clear that they have to. Right now, switches throw hard on null, and yet the world is not full of NPEs that come from switches. So most users don't even consider null when writing switches, and somehow it all works out. The last thing we want to do is grab them by the collar, shake them, and say "you are a bad programmer, not thinking about nulls in switch! You must start obsessing over it immediately!" Instead, if they're in blissful ignorance now, let's let them stay there. Let's make the idioms do the obvious thing (Box(var x) matches all boxes), and then, for the 1% of users who find they need to reason about nulls, give them rules they can understand.
-------------- next part --------------
An HTML attachment was scrubbed...
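
[Editorial sketch] The `Comparable` example in the message above turns on what static type the compiler infers for `o`. The following is a minimal sketch, assuming Java 10 or later for `var`; the class name `ConditionalTypeDemo` and the method `describe` are hypothetical and not from the thread. It shows only that the inferred type of the conditional is a subtype of `Comparable`, which is presumably what would make `case Comparable c` cover `o` under the totality rule being discussed. It does not exercise pattern switches themselves.

```java
class ConditionalTypeDemo {
    // Returns the simple name of the runtime class selected by the conditional.
    static String describe(int i) {
        // Standalone conditional mixing a String and an int: the int is boxed,
        // and the type inferred for 'o' is the least upper bound of String and Integer.
        var o = (i == 3) ? "hello" : 42;
        // This assignment compiles only because that inferred type is a subtype of Comparable.
        Comparable<?> c = o;
        return c.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(describe(3));
        System.out.println(describe(4));
    }
}
```

A reader has to perform roughly this inference mentally to decide whether the case is total, which is the difficulty the message points at.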
URL:
From brian.goetz at oracle.com Fri Aug 7 01:28:32 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 6 Aug 2020 21:28:32 -0400
Subject: Nullable switch
In-Reply-To: <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>
Message-ID:

> If we were paying the full cost of nullable types (T?), then there would be an obvious choice: T would be a non-nullable type pattern, and T? would be a nullable type pattern. But introducing an `any Foo` notion has similar conceptual surface area, but dramatically less utility. So the "return on syntax" for "any Foo" is not good.
>
> It has the same utility as introducing case null.

Even if I agreed with that, that's benefit, but "return" is benefit-relative-to-cost. So let's talk about cost: `case null` is not an entirely new syntactic form whose meaning is non-obvious; it is an obvious extension of `case 0` or `case "foo"`. So the incremental "cost" is effectively zero by comparison.

In any case, `any Foo` is bad for multiple reasons. As is `Foo|Null`. A serious problem with both is that they are too easily forgotten, and people will think `Box(var x)` is total when it is not. In expression switches, the need for totality may save them when they get a compile error (maybe), but in statement switches, it never will.

> [...]
>
> The argument for using totality to wriggle out of the nullity trap is a sophisticated one, which may be part of the problem, but it is largely about uniformity (and partially a pick-your-poison.)
>
> I think this is a forced move: that
>
>     case Box(var o):
>
> be total (including Box(null)). Any other conclusion is hard to take seriously. If this excludes Box(null), users will make frequent errors -- because users routinely ignore null's special behaviors (and we mostly want them to keep doing so.)
>
> You made this argument several times but I don't understand it: why does case Box(var o) have to allow null if a case Box(any o) that allows null exists?

That's not an argument, it's just assuming your conclusion :)

So, if you want to make your case, I suggest you start over, and start with problems and goals and principles rather than a (partial) solution. And I think you're going to want to work through the whole thing, and write it all up, rather than describing only an incremental diversion -- because the whole thing has to stand together, and it's way too easy to think you've solved the whole thing when you've really just moved the problem elsewhere (I've been there a few times already on this very topic.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From amaembo at gmail.com Fri Aug 7 03:48:34 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Fri, 7 Aug 2020 10:48:34 +0700
Subject: [records] Mark generated toString/equals/hashCode in class files somehow?
Message-ID:

Hello!

I'm working on a class-file decompiler for records and discovered that there's no special flag for generated equals/hashCode/toString (like ACC_SYNTHETIC). This means the only way to determine whether such a method was explicitly specified in the source code is to look into the method implementation and check whether it has an ObjectMethods.bootstrap indy or not. This looks implementation-dependent and somewhat fragile (though, of course, we will do this if we have no other options). We also have a stub decompiler that decompiles declarations only without checking method bodies at all and it also wants to know whether equals/hashCode/toString methods were autogenerated. Finally, other bytecode tools like code coverage may need this to avoid calculating coverage for methods not present in the source.

Is it possible to mark generated methods via ACC_SYNTHETIC or any other flag or add any attribute that can be used to differentiate auto-generated methods from the ones present in the source code?

Having a synthetic mark for auto-generated canonical constructor or accessor methods is less critical (as their bodies could be actually written in the source code like this) but it would be also nice to have it.

With best regards,
Tagir Valeev.

From guy.steele at oracle.com Fri Aug 7 04:05:50 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 7 Aug 2020 00:05:50 -0400
Subject: Nullable switch
In-Reply-To:
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>
Message-ID: <6D8C1730-4556-46A2-AF9D-ED87D8EF9923@oracle.com>

Okay, so it would seem that we need two keywords (or other syntax) for use in patterns; I will temporarily call them "anything-but-null" and "anything-including-null".

And it seems that what Brian and Remi are arguing over is whether "anything-including-null" should be spelled "var" or "any". And the arguments include some good points about consistency with other parts of the language, consistent rules for refactoring, user expectations, and potential pitfalls for users.

IIUC, Remi wants to spell "anything-but-null" as "var" (at least in patterns) and "anything-including-null" as "any".

Brian wants to spell "anything-including-null" as "var", and I am not sure how he wants to spell "anything-but-null".

In particular, I am uncertain about this situation: suppose we have a pattern FrogBox(Frog x) (that is, it is known that the argument to FrogBox must be of type Frog), and Tadpole is a class that extends (or implements) Frog; then consider

    FrogBox fb = ...;
    switch (fb) {
        case FrogBox(Tadpole x) -> ... ;
        case FrogBox(Frog x) -> ... ;
    }

(I think this is similar to previous examples, but I want the catchall case to be Frog, not Object.) Is the preceding switch total, or do I need to say either

    FrogBox fb = ...;
    switch (fb) {
        case FrogBox(Tadpole x) -> ... ;
        case FrogBox(Frog x) -> ... ;
        case FrogBox(var x) -> ... ;
    }

or

    FrogBox fb = ...;
    switch (fb) {
        case null -> ... ;
        case FrogBox(Tadpole x) -> ... ;
        case FrogBox(Frog x) -> ... ;
    }

in order to be total?

> On Aug 6, 2020, at 9:28 PM, Brian Goetz wrote:
>
>> If we were paying the full cost of nullable types (T?), then there would be an obvious choice: T would be a non-nullable type pattern, and T? would be a nullable type pattern. But introducing an `any Foo` notion has similar conceptual surface area, but dramatically less utility. So the "return on syntax" for "any Foo" is not good.
>>
>> It has the same utility as introducing case null.
>
> Even if I agreed with that, that's benefit, but "return" is benefit-relative-to-cost. So let's talk about cost: `case null` is not an entirely new syntactic form whose meaning is non-obvious; it is an obvious extension of `case 0` or `case "foo"`. So the incremental "cost" is effectively zero by comparison.
>
> In any case, `any Foo` is bad for multiple reasons. As is `Foo|Null`. A serious problem with both is that they are too easily forgotten, and people will think `Box(var x)` is total when it is not. In expression switches, the need for totality may save them when they get a compile error (maybe), but in statement switches, it never will.
>
>> [...]
>>
>> The argument for using totality to wriggle out of the nullity trap is a sophisticated one, which may be part of the problem, but it is largely about uniformity (and partially a pick-your-poison.)
>>
>> I think this is a forced move: that
>>
>>     case Box(var o):
>>
>> be total (including Box(null)). Any other conclusion is hard to take seriously.
If this excludes Box(null), users will make frequent errors -- because users routinely ignore null's special behaviors (and we mostly want them to keep doing so.) >> >> You made this arguments several times but i don't understand it, why case Box(var o) has to allow null if there is a case Box(any o) which allow null exists. > > That's not an argument, it's just assuming your conclusion :) > > So, if you want to make your case, I suggest you start over, and start with problems and goals and principles rather than a (partial) solution. And I think you're going to want to work through the whole thing, and write it all up, rather than describing only an incremental diversion -- because the whole thing has to stand together, and it's way too easy to think you've solved the whole thing when you've really just moved the problem elsewhere (I've been there a few times already on this very topic.) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.gibbons at oracle.com Fri Aug 7 04:12:38 2020 From: jonathan.gibbons at oracle.com (Jonathan Gibbons) Date: Thu, 6 Aug 2020 21:12:38 -0700 Subject: [records] Mark generated toString/equals/hashCode in class files somehow? In-Reply-To: References: Message-ID: Tagir, The concept and word you are looking for is "mandated", which is similar to but different from "synthetic". See https://docs.oracle.com/en/java/javase/14/docs/api/java.compiler/javax/lang/model/util/Elements.Origin.html#MANDATED -- Jon On 8/6/20 8:48 PM, Tagir Valeev wrote: > Hello! > > I'm working on class-file decompiler for records and discovered that > there's no special flag for generated equals/hashCode/toString (like > ACC_SYNTHETIC). This allows determining whether this method was > explicitly specified in the source code only by looking into method > implementation whether it has an ObjectMethods.bootstrap indy or not. 
> This looks implementation-dependent and somewhat fragile (though, of > course, we will do this if we have no other options). We also have a > stub decompiler that decompiles declarations only without checking > method bodies at all and it also wants to know whether > equals/hashCode/toString methods were autogenerated. Finally, other > bytecode tools like code coverage may need this to avoid calculating > coverage for methods not present in the source. > > Is it possible to mark generated methods via ACC_SYNTHETIC or any > other flag or add any attribute that can be used to differentiate > auto-generated methods from the ones presented in the source code? > > Having a synthetic mark for auto-generated canonical constructor or > accessor methods is less critical (as their bodies could be actually > written in the source code like this) but it would be also nice to > have it. > > With best regards, > Tagir Valeev. From amaembo at gmail.com Fri Aug 7 04:20:35 2020 From: amaembo at gmail.com (Tagir Valeev) Date: Fri, 7 Aug 2020 11:20:35 +0700 Subject: [records] Mark generated toString/equals/hashCode in class files somehow? In-Reply-To: References: Message-ID: Hello, Jonathan! I believe, current JVM specification doesn't say that methods could be marked with ACC_MANDATED [1]. I won't mind if it will be used instead of SYNTHETIC. To me, anything is ok if I can avoid bytecode inspection. With best regards, Tagir Valeev. [1] https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-4.html#jvms-4.6 On Fri, Aug 7, 2020 at 11:12 AM Jonathan Gibbons wrote: > > Tagir, > > The concept and word you are looking for is "mandated", which is similar > to but different from "synthetic". > > See > https://docs.oracle.com/en/java/javase/14/docs/api/java.compiler/javax/lang/model/util/Elements.Origin.html#MANDATED > > -- Jon > > > On 8/6/20 8:48 PM, Tagir Valeev wrote: > > Hello! 
> > > > I'm working on class-file decompiler for records and discovered that > > there's no special flag for generated equals/hashCode/toString (like > > ACC_SYNTHETIC). This allows determining whether this method was > > explicitly specified in the source code only by looking into method > > implementation whether it has an ObjectMethods.bootstrap indy or not. > > This looks implementation-dependent and somewhat fragile (though, of > > course, we will do this if we have no other options). We also have a > > stub decompiler that decompiles declarations only without checking > > method bodies at all and it also wants to know whether > > equals/hashCode/toString methods were autogenerated. Finally, other > > bytecode tools like code coverage may need this to avoid calculating > > coverage for methods not present in the source. > > > > Is it possible to mark generated methods via ACC_SYNTHETIC or any > > other flag or add any attribute that can be used to differentiate > > auto-generated methods from the ones presented in the source code? > > > > Having a synthetic mark for auto-generated canonical constructor or > > accessor methods is less critical (as their bodies could be actually > > written in the source code like this) but it would be also nice to > > have it. > > > > With best regards, > > Tagir Valeev. 
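
[Editorial sketch] The flag check a stub decompiler or coverage tool would like to rely on can be approximated with reflection. The following is a minimal sketch, not the decompiler's actual logic: the class `SyntheticFlagDemo`, its nested types, and the helper `isSynthetic` are hypothetical names. Because records were still a preview feature at the time of this thread, the sketch uses an enum instead. An enum's implicitly declared `values()` method is likewise absent from the source, yet it is not flagged synthetic, whereas a compiler-generated bridge method is.

```java
import java.lang.reflect.Method;

class SyntheticFlagDemo {
    // values() and valueOf(String) are implicitly declared for every enum.
    enum Color { RED, GREEN }

    // Implementing a generic interface makes the compiler emit a bridge
    // method compareTo(Object) next to the declared compareTo(Name).
    static class Name implements Comparable<Name> {
        final String value;
        Name(String value) { this.value = value; }
        @Override public int compareTo(Name other) { return value.compareTo(other.value); }
    }

    // Looks up a declared method and reports whether it is flagged as synthetic.
    static boolean isSynthetic(Class<?> owner, String name, Class<?>... params) {
        try {
            Method m = owner.getDeclaredMethod(name, params);
            return m.isSynthetic();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Color.values() synthetic: "
                + isSynthetic(Color.class, "values"));
        System.out.println("Name.compareTo(Object) synthetic: "
                + isSynthetic(Name.class, "compareTo", Object.class));
        System.out.println("Name.compareTo(Name) synthetic: "
                + isSynthetic(Name.class, "compareTo", Name.class));
    }
}
```

A tool that filters only on the synthetic flag would therefore treat `values()` as if it had been written by hand, which is the kind of gap the request above describes for the generated record methods.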
From brian.goetz at oracle.com Fri Aug 7 14:48:13 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 7 Aug 2020 10:48:13 -0400
Subject: Nullable switch
In-Reply-To: <6D8C1730-4556-46A2-AF9D-ED87D8EF9923@oracle.com>
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr> <6D8C1730-4556-46A2-AF9D-ED87D8EF9923@oracle.com>
Message-ID:

> Okay, so it would seem that we need two keywords (or other syntax) for use in patterns; I will temporarily call them "anything-but-null" and "anything-including-null".

Not necessarily; the approach we've been driving towards has no (new) keywords, and no _explicit_ consideration of nullability. There's just type patterns, but their semantics take into account whether or not the type pattern "covers" the target type. This is subtle, I grant, and I can see where people would get confused, but it is far more compositional and less ad-hoc.

Ignoring the epicyclical* distastefulness of the "any x" idea, I think the syntax issues are a bit of a red herring -- the issue is structural. Under Remi's proposal, there is simply _no_ way to write a switch where any number of cases covers "anything including null", because the switch will throw before you get there:

    switch (x) {
        case String s:
        case Object o:
    }

would throw NPE on a null `x` (as switches do today) before any cases are considered, whether you say "var" or "any" or "Object." This is, essentially, saying "switch is permanently polluted with null behavior, we're not going to do anything about it, and anyone who wants to treat nulls uniformly with other values just can't use switch (or has to duplicate code, etc.)" It also, as mentioned, provides pitfalls for several expected-to-be-common refactorings, because the "obvious" refactoring does not have the same semantics.

Instead, we are proposing to refine the handling of null in switch to work in line with _totality_ of patterns -- building on the use of totality in several other places. If we have a total pattern (like `Object o`), we already use it for dead-code detection:

    switch (x) {
        case Object o: ...
        case P: // error, dead code, no matter what P is
    }

When we do pattern assignment, we use totality in flow analysis:

    Point p = ...
    ...
    Point(var x, var y) = p;  // OK, Point(...) is total on Point

    Object o = ...
    ...
    Point(var x, var y) = o;  // error, Point is not total on Object

Note that this definition of totality (so far) has a hole: nullity. We can check the static type of the operand and verify that the pattern covers all instances of that type, but without nullity in the type system, we can't statically exclude null. So we propose to rectify that: a total pattern matches null too. (You can consider `var x` to be an "anything" pattern, or consider it to be inference for the obvious type pattern; under this interpretation of total type patterns, the two get you to the same semantics.) And this matches expected intuition (I claim) for what "case Box(Object o)" at the end of a switch on boxes should do -- but gets there entirely in terms of mechanical composition rules for patterns.

We can only use patterns in switch and instanceof, but both of these constructs currently have fail-fast behaviors with null. So we are proposing to refine the null handling of switch (compatibly) so that we can work with, rather than against, totality.

*Celestial term for "bag nailed on the side"

> In particular, I am uncertain about this situation: suppose we have a pattern FrogBox(Frog x) (that is, it is known that the argument to FrogBox must be of type Frog), and Tadpole is a class that extends (or implements) Frog; then consider
>
>     FrogBox fb = ...;
>     switch (fb) {
>         case FrogBox(Tadpole x) -> ... ;
>         case FrogBox(Frog x) -> ... ;
>     }
>
> (I think this is similar to previous examples, but I want the catchall case to be Frog, not Object.) Is the preceding switch total, or do I need to say either

Yes, it is total. Let's write out the declaration of the FrogBox pattern to see.

    class FrogBox {
        Frog frog;

        deconstructor FrogBox(Frog f) {
            f = this.frog;
        }
    }

For

    case FrogBox(Frog x):

we do overload resolution to find the deconstruction pattern, and discover that its binding is of type Frog. The nested pattern `Frog x` is total on `Frog`, and the deconstruction pattern FrogBox(Q) is total on FrogBox when Q is total on Frog, and therefore the compound pattern `FrogBox(Frog x)` is total on FrogBox (under the working proposal.)

Under Remi's proposal, it is not; it excludes FrogBox(null). (Also, if this were a switch expression, you'd be told it is not exhaustive.) Further, under Remi's proposal, you can't easily refactor into a nested switch, because that would change the null behavior from "ignore" to "throw", unless you added an extra "if x == null".

If you wanted to be total under Remi's proposal, you'd have to change the last case to

    case FrogBox(any x):

But you'd still have no way to refactor to a nested switch without manually handling the null that pops out, and duplicating the code for FrogBox(null) and FrogBox(Frog).
-------------- next part --------------
An HTML attachment was scrubbed...
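
[Editorial sketch] The "fail-fast behaviors with null" that the message above attributes to today's `switch` and `instanceof` can be observed directly. The following is a minimal sketch using only classic (pre-pattern) Java; the class name `NullFailFastDemo` and its methods are hypothetical. It illustrates the existing behavior that the totality proposal would refine, not the proposal itself.

```java
class NullFailFastDemo {
    // Classic switch on a String: a null selector throws NullPointerException
    // before any case label, including default, is considered.
    static String describe(String s) {
        switch (s) {
            case "frog": return "a frog";
            default:     return "something else";
        }
    }

    // Calls describe(null) and reports whether it threw NullPointerException.
    static boolean switchThrowsOnNull() {
        try {
            describe(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // instanceof never matches null, whatever type appears on the right-hand side.
    static boolean instanceofMatchesNull() {
        Object o = null;
        return o instanceof Object;
    }

    public static void main(String[] args) {
        System.out.println("switch throws on null: " + switchThrowsOnNull());
        System.out.println("null instanceof Object: " + instanceofMatchesNull());
    }
}
```

Note that even `default` does not catch the null selector, which is why a nested switch cannot today stand in for a nested pattern that is meant to accept null.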
URL: From alex.buckley at oracle.com Fri Aug 7 17:36:47 2020 From: alex.buckley at oracle.com (Alex Buckley) Date: Fri, 7 Aug 2020 10:36:47 -0700 Subject: [records] Mark generated toString/equals/hashCode in class files somehow? In-Reply-To: References: Message-ID: You are right that ACC_MANDATED is not expressible for methods. This is unfortunate not only for equals/hashCode/toString in a record class, but also for values/valueOf in an enum class. ACC_SYNTHETIC indicates an implementation artifact -- something that varies from compiler to compiler (or from one release of a compiler to the next release of the same compiler). It would be wrong to use ACC_SYNTHETIC to mark the five methods in the previous paragraph. They are language artifacts, whose existence + signature + semantics are the same across compilers. It would be legitimate to add ACC_MANDATED to method_info.access_flags. ACC_MANDATED is defined as 0x8000 in other contexts, so convention dictates that it would have to be defined as 0x8000 in method_info.access_flags too. Happily, 0x8000 is available there. This also applies to field_info.access_flags. Alex On 8/6/2020 9:20 PM, Tagir Valeev wrote: > Hello, Jonathan! > > I believe, current JVM specification doesn't say that methods could be > marked with ACC_MANDATED [1]. I won't mind if it will be used instead > of SYNTHETIC. To me, anything is ok if I can avoid bytecode > inspection. > > With best regards, > Tagir Valeev. > > [1] https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-4.html#jvms-4.6 > > On Fri, Aug 7, 2020 at 11:12 AM Jonathan Gibbons > wrote: >> >> Tagir, >> >> The concept and word you are looking for is "mandated", which is similar >> to but different from "synthetic". >> >> See >> https://docs.oracle.com/en/java/javase/14/docs/api/java.compiler/javax/lang/model/util/Elements.Origin.html#MANDATED >> >> -- Jon >> >> >> On 8/6/20 8:48 PM, Tagir Valeev wrote: >>> Hello! 
>>>
>>> I'm working on class-file decompiler for records and discovered that there's no special flag for generated equals/hashCode/toString (like ACC_SYNTHETIC). This allows determining whether this method was explicitly specified in the source code only by looking into method implementation whether it has an ObjectMethods.bootstrap indy or not. This looks implementation-dependent and somewhat fragile (though, of course, we will do this if we have no other options). We also have a stub decompiler that decompiles declarations only without checking method bodies at all and it also wants to know whether equals/hashCode/toString methods were autogenerated. Finally, other bytecode tools like code coverage may need this to avoid calculating coverage for methods not present in the source.
>>>
>>> Is it possible to mark generated methods via ACC_SYNTHETIC or any other flag or add any attribute that can be used to differentiate auto-generated methods from the ones presented in the source code?
>>>
>>> Having a synthetic mark for auto-generated canonical constructor or accessor methods is less critical (as their bodies could be actually written in the source code like this) but it would be also nice to have it.
>>>
>>> With best regards,
>>> Tagir Valeev.

From guy.steele at oracle.com Fri Aug 7 18:17:59 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 7 Aug 2020 14:17:59 -0400
Subject: Nullable switch
In-Reply-To:
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr> <6D8C1730-4556-46A2-AF9D-ED87D8EF9923@oracle.com>
Message-ID:

Okay, thanks. The trouble with using "Object" in this sort of discussion is that it is special in several ways that can be hard to tease apart.
Going through the FrogBox example made the issues and distinctions involved, and the positions or designs being argued for, much clearer to me.

--Guy

> On Aug 7, 2020, at 10:48 AM, Brian Goetz wrote:
>
>> Okay, so it would seem that we need two keywords (or other syntax) for use in patterns; I will temporarily call them "anything-but-null" and "anything-including-null".
>
> Not necessarily; the approach we've been driving towards has no (new) keywords, and no _explicit_ consideration of nullability. There's just type patterns, but their semantics take into account whether or not the type pattern "covers" the target type. This is subtle, I grant, and I can see where people would get confused, but it is far more compositional and less ad-hoc.
>
> Ignoring the epicyclical* distastefulness of the "any x" idea, I think the syntax issues are a bit of a red herring -- the issue is structural. Under Remi's proposal, there is simply _no_ way to write a switch where any number of cases covers "anything including null", because the switch will throw before you get there:
>
>     switch (x) {
>         case String s:
>         case Object o:
>     }
>
> would throw NPE on a null `x` (as switches do today) before any cases are considered, whether you say "var" or "any" or "Object." This is, essentially, saying "switch is permanently polluted with null behavior, we're not going to do anything about it, and anyone who wants to treat nulls uniformly with other values just can't use switch (or has to duplicate code, etc.)" It also, as mentioned, provides pitfalls for several expected-to-be-common refactorings, because the "obvious" refactoring does not have the same semantics.
>
> Instead, we are proposing to refine the handling of null in switch to work in line with _totality_ of patterns -- building on the use of totality in several other places. If we have a total pattern (like `Object o`), we already use it for dead-code detection:
>
>     switch (x) {
>         case Object o: ...
>         case P: // error, dead code, no matter what P is
>     }
>
> When we do pattern assignment, we use totality in flow analysis:
>
>     Point p = ...
>     ...
>     Point(var x, var y) = p; // OK, Point(...) is total on Point
>
>     Object o = ...
>     ...
>     Point(var x, var y) = o; // error, Point is not total on Object
>
> Note that this definition of totality (so far) has a hole: nullity. We can check the static type of the operand and verify that the pattern covers all instances of that type, but without nullity in the type system, we can't statically exclude null. So we propose to rectify that: a total pattern matches null too. (You can consider `var x` to be an "anything" pattern, or consider it to be inference for the obvious type pattern; under this interpretation of total type patterns, the two get you to the same semantics.) And this matches expected intuition (I claim) for what "case Box(Object o)" at the end of a switch on boxes should do -- but gets there entirely in terms of mechanical composition rules for patterns.
>
> We can only use patterns in switch and instanceof, but both of these constructs currently have fail-fast behaviors with null. So we are proposing to refine the null handling of switch (compatibly) so that we can work with, rather than against, totality.
>
> *Celestial term for "bag nailed on the side"
>
>> In particular, I am uncertain about this situation: suppose we have a pattern FrogBox(Frog x) (that is, it is known that the argument to FrogBox must be of type Frog), and Tadpole is a class that extends (or implements) Frog; then consider
>>
>>     FrogBox fb = ...;
>>     switch (fb) {
>>         case FrogBox(Tadpole x) -> ... ;
>>         case FrogBox(Frog x) -> ... ;
>>     }
>>
>> (I think this is similar to previous examples, but I want the catchall case to be Frog, not Object.) Is the preceding switch total, or do I need to say either
>
> Yes. Let's write out the declaration of the FrogBox pattern to see.
>
>     class FrogBox {
>         Frog frog;
>
>         deconstructor FrogBox(Frog f) {
>             f = this.frog;
>         }
>     }
>
> For
>
>     case FrogBox(Frog x):
>
> we do overload resolution to find the deconstruction pattern, and discover that its binding is of type Frog. The nested pattern `Frog x` is total on `Frog`, and the deconstruction pattern FrogBox(Q) is total on FrogBox when Q is total on Frog, and therefore the compound pattern `FrogBox(Frog x)` is total on FrogBox (under the working proposal.)
>
> Under Remi's proposal, it is not; it excludes FrogBox(null). (Also, if this were a switch expression, you'd be told it is not exhaustive.) Further, under Remi's proposal, you can't easily refactor into a nested switch, because that would change the null behavior from "ignore" to "throw", unless you added an extra "if x == null".
>
> If you wanted to be total under Remi's proposal, you'd have to change the last case to
>
>     case FrogBox(any x):
>
> But you'd still have no way to refactor to a nested switch without manually handling the null that pops out, and duplicating the code for FrogBox(null) and FrogBox(Frog).
>

From john.r.rose at oracle.com Fri Aug 7 18:44:56 2020
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 7 Aug 2020 11:44:56 -0700
Subject: Nullable switch
In-Reply-To: <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Aug 6, 2020, at 4:22 PM, forax at univ-mlv.fr wrote:
>
> people are not compilers,

People do model compilers, and as Brian has shown, there is already a workable concept (which Java programmers must model) about totality.

> because you are mixing totality and nullability,

This is the "pick your poison"
moment: Often, cleanly factoring out separate notations for separate concepts is the best move.

Not in this case. The use cases (for totality and for null tolerance) are very strongly correlated after (a process that takes months in which) you run through all the various use cases and desired symmetries and other principles that Brian alludes to. It's better to mix here; using separate "unmixed" notations leads to two strongly correlated channels of information that the poor programmer will have to manage in tandem, even in cases (common as Brian shows) where null policy isn't even on the user's mental radar. *That's* requiring the user to do the compiler's job. Amber is about reducing ceremony; having too many notations to make the same choices, and requiring *both at the same time* is increasing ceremony.

> is the "case Comparable c" accept null or not in the following code
> is not an easy question

But as Brian showed, it is the sort of question that programmers have to be aware of: Totality, like the DA/DU rules, is a concept they need to know about to program correctly.

But the programmer doesn't need to "be a compiler"; of course the compiler does the detail work and tells the programmer whether a given construct will fly or not. And the programmer learns by pattern matching, hunches, intuitions, which sorts of statements and idioms are safe and which require greater scrutiny. If you look at the examples and use cases, and/or try out some coding, I think you'll find (as I did) that, because the total patterns in a switch are *forced* by the compiler to fall to the bottom, it's *easy* to see at a glance which cases are null-tolerant and which are null-hostile.

HTH
John

From john.r.rose at oracle.com Fri Aug 7 23:07:45 2020
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 7 Aug 2020 16:07:45 -0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
Message-ID:

On Aug 4, 2020, at 11:11 AM, Brian Goetz wrote:
>
> One thing this left open was the actual syntax of guards. (We know it's snowing in hell now, because I am actually encouraging a syntax conversation.)

If this is only a "syntax of guards" discussion I have little to say, except that I subjectively still find "P&&z" charming even if the boolean z isn't (and shouldn't be) a pattern, and (ouch, this will leave a mark) even though "false&&true" matches "(Boolean)false" but "false&&false" doesn't.

If this is a discussion of "how to guard patterns", then see the more substantive comment below, in context.

> ...
> The most obvious ambiguity was the obvious interpretation of constant patterns; that `0` could be the literal zero or a pattern that matches zero. (I have since proposed we try to avoid constant patterns entirely.)

On this thread I have objected to this move, on the grounds that it sets users up for failure with "==" when they try refactoring legacy constant labels to instanceof tests. Basically, relative to case labels and patterns, "==" is a treacherous crutch to lean on, in about three different ways (string identity, NaN, and NPE hazards).

> On 6/24/2020 10:44 AM, Brian Goetz wrote:
>> ...
>> An alternate to guards is to allow an imperative `continue` statement in
>> `switch`, which would mean "keep trying to match from the next label." Given
>> the existing semantics of `continue`, this is a natural extension, but since
>> `continue` does not currently have meaning for switch, some work would have to
>> be done to disambiguate continue statements in switches enclosed in loops.
>> The imperative version is strictly more expressive than most reasonable forms of the
>> declarative version, but users are likely to prefer the declarative version.

I still think, despite our guesses at user preferences, that the best option is this more powerful and explicit one, even if it sacrifices one-liner clarity for common guards:

    case P:
        if (!G) continue switch;
        // here P matches and G is true

    case P -> {
        if (!G) continue switch;
        // here P matches and G is true
        return ...;
    }

Here we would spell "when G:" as ": if (!G) continue switch;"

If the extra power doesn't cause problems, I would prefer for us to try to live with the verbosity for a while before introducing additional bespoke guard-for-a-pattern syntax. Yes, it will provoke users to have to type more tokens for simple guards. But the provocation will be useful in the end, because then bloggers will begin to explain what "continuing" a decision chain does, and that will unlock power for any users willing to move beyond griping about syntax, i.e., most users.

From forax at univ-mlv.fr Sat Aug 8 21:04:27 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 8 Aug 2020 23:04:27 +0200 (CEST)
Subject: Nullable switch
In-Reply-To:
References: <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr> <6D8C1730-4556-46A2-AF9D-ED87D8EF9923@oracle.com>
Message-ID: <1484084516.384308.1596920667140.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Guy Steele"
> Cc: "Remi Forax", "amber-spec-experts", "John Rose"
> Sent: Friday, 7 August 2020 16:48:13
> Subject: Re: Nullable switch

>> Okay, so it would seem that we need two keywords (or other syntax) for use in
>> patterns; I will temporarily call them "anything-but-null" and
>> "anything-including-null".

> Not necessarily; the approach we've been driving towards has no (new) keywords,
> and no _explicit_ consideration of nullability.
> There's just type patterns, but
> their semantics take into account whether or not the type pattern "covers" the
> target type. This is subtle, I grant, and I can see where people would get
> confused, but it is far more compositional and less ad-hoc.

> Ignoring the epicyclical* distastefulness of the "any x" idea, I think the
> syntax issues are a bit of a red herring -- the issue is structural. Under
> Remi's proposal, there is simply _no_ way to write a switch where any number of
> cases covers "anything including null", because the switch will throw before
> you get there:

>     switch (x) {
>         case String s:
>         case Object o:
>     }

> would throw an NPE (as switches do today) before any cases are considered,
> whether you say "var" or "any" or "Object."

That is not true.
You're right that the switch above will generate an NPE, as switches do today, because under the rules I propose there is no case that accepts null.

But if you add an any case (or a null case), then the switch will accept null; for example, the switch below accepts null.

    switch (x) {
        case String s:
        case any o:
    }

As Guy said, I'm proposing to have two different cases, one "anything-but-null" and one "anything-including-null", instead of relying on the non-local property of totality.

You can re-read my email from the 6th of August for the rules allowing a switch to accept null and more examples.

regards,
Rémi

From brian.goetz at oracle.com Sat Aug 8 21:08:41 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 8 Aug 2020 17:08:41 -0400
Subject: Nullable switch
In-Reply-To: <1484084516.384308.1596920667140.JavaMail.zimbra@u-pem.fr>
References: <1484084516.384308.1596920667140.JavaMail.zimbra@u-pem.fr>
Message-ID:

This is exactly why I asked you to provide a *complete* proposal.

Sent from my iPad

> On Aug 8, 2020, at 5:04 PM, forax at univ-mlv.fr wrote:
>

From forax at univ-mlv.fr Sat Aug 8 21:37:02 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 8 Aug 2020 23:37:02 +0200 (CEST)
Subject: Nullable switch
In-Reply-To:
References: <0567C5FB-DCF6-49B1-8E1A-47CB158C3A5D@univ-mlv.fr> <6A81A458-9FA8-43D3-9D81-6EB7E2D61CA1@oracle.com> <577630200.188826.1596749743120.JavaMail.zimbra@u-pem.fr> <2048519557.191709.1596756152316.JavaMail.zimbra@u-pem.fr>
Message-ID: <1270364443.389357.1596922622449.JavaMail.zimbra@u-pem.fr>

----- Original message -----
> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Friday, 7 August 2020 20:44:56
> Subject: Re: Nullable switch

> On Aug 6, 2020, at 4:22 PM, forax at univ-mlv.fr wrote:
>>
>> people are not compilers,
>
> People do model compilers, and as Brian has shown, there is
> already a workable concept (which Java programmers must
> model) about totality.
>
>> because you are mixing totality and nullability,
>
> This is the "pick your poison" moment: Often, cleanly factoring
> out separate notations for separate concepts is the best move.
>
> Not in this case. The use cases (for totality and for null tolerance)
> are very strongly correlated after (a process that takes months in
> which) you run through all the various use cases and desired
> symmetries and other principles that Brian alludes to.

I'm always worried about this kind of sentence because of the sunk cost fallacy.

> It's better to mix here; using separate "unmixed" notations leads to two
> strongly correlated channels of information that the poor
> programmer will have to manage in tandem, even in cases
> (common as Brian shows) where null policy isn't even on the
> user's mental radar. *That's* requiring the user to do the
> compiler's job. Amber is about reducing ceremony; having
> too many notations to make the same choices, and requiring
> *both at the same time* is increasing ceremony.

I'm not against reducing the ceremony (obviously), and I understand that trying to guess what the user wants can be appealing. But using the fact that the case is total as a bit of information about what the user thinks supposes that the user knows the rules, while usually it's the opposite: the compiler enforces the rules.

Moreover, whether a case is total is not a local property. It depends on the type hierarchy, which can be outside the compilation unit, and, as Jens Lideström also noticed, it depends on the type of the expression switched upon, which can be hidden. Combined with the fact that a switch statement doesn't have to cover all cases because of backward compatibility, it's a recipe for disaster.

>
>> is the "case Comparable c" accept null or not in the following code
>> is not an easy question
>
> But as Brian showed, it is the sort of question that programmers
> have to be aware of: Totality, like the DA/DU rules, is a concept
> they need to know about to program correctly.

Thanks for mentioning the DA/DU rules; it helps me to refine why using the fact that the case is total or not is a bad idea.
- DA/DU rules are fully local; they don't rely on information from outside the compilation unit.
- DA/DU rules either allow an expression or reject it; here, in the case of a switch statement, if you guess wrong and the case is not total, you get different semantics (NPE or not) instead of "compiles or doesn't".

>
> But the programmer doesn't need to "be a compiler"; of course
> the compiler does the detail work and tells the programmer
> whether a given construct will fly or not. And the programmer
> learns by pattern matching, hunches, intuitions, which sorts
> of statements and idioms are safe and which require greater
> scrutiny.
> If you look at the examples and use cases, and/or
> try out some coding, I think you'll find (as I did) that, because
> the total patterns in a switch are *forced* by the compiler
> to fall to the bottom, it's *easy* to see at a glance which cases
> are null-tolerant and which are null-hostile.

Unfortunately, being at the bottom says nothing about whether the case is total or not. It works in the opposite direction: a total case has to be at the bottom.

I think Brian and you have forgotten at some point that a switch is in most cases defined in a different compilation unit/package than the type it works on. So having the last case be total at some point and later not be total because a file has changed somewhere, basically a separate compilation issue, will be far more frequent than with the other constructs we usually deal with.

So changing the semantics of the last case (and the switch) depending on information we know may not be in sync is dangerous. In a sense, using the case being total or not is a clever trick, not a smart one.

>
> HTH
>
> John

Rémi

From brian.goetz at oracle.com Mon Aug 10 17:57:01 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 10 Aug 2020 13:57:01 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
Message-ID:

There seems to be an awful lot of confusion about the motivation for the nullity proposal, so let me step back and address this from first principles.

Let's factor away the null-tolerance of the constructs (switch and instanceof) from what patterns mean, and then we can return to how, if necessary, to resolve any mismatches. We'll do this by defining what it means for a target to match a pattern, and only then define the semantics of the pattern-aware constructs in terms of that.

Let me also observe that some people, in their belief that `null` was a mistake, tend to have a latent hostility to null, and therefore tend to want new features to be at least as null-hostile as the most null-hostile of old features. (A good example is streams; it was suggested (by some of the same people) that it should be an error for streams to have null elements. And we considered this briefly -- and concluded this would have been a terrible idea! The lesson of that investigation was that the desire to "fix" the null mistake by patching individual holes is futile, and tends to lead to worse results. Instead, being null-agnostic was the right move for streams.)

I think we're also being distracted by the fact that, in part because we've chosen `instanceof` as our syntax, we want to use `instanceof` as our mental model for what matching means. This is a good guiding principle but we must be careful of following it blindly.

As a modeling simplification, let's assume that all patterns have exactly one binding variable, and the type of that binding variable is part of the pattern definition. We could model our match predicate and (conditional) binding function as:

    match :: (Pattern t) u -> Maybe t

A pattern represents the fusion of an applicability predicate, zero or more conditional extractions, and a binding mechanism. For the simple case of a type pattern `Foo f`, the applicability predicate is "are you a Foo", and there are two possible interpretations -- "would `instanceof` say you are a `Foo`" (which means non-null), or "could you be assigned to a variable of type Foo" (or, equivalently, "are you in the value set of Foo".)

A pattern P is _total_ on U if `match P u` returns `Some t` for every `u : U`. Total patterns are useful because they allow the compiler to reason about control flow and provide better error checking (detecting dead code, silly pattern matches, totality of expression switches, etc.)

Let's go back to our trusty Box example.
We can think of the `Box` constructor as a mapping:

    enBox :: t -> Box t

and the Box deconstructor as

    unBox :: Box t -> t

Now, what algebraic relationship do we want between enBox and unBox? The whole point is that a Box is a structure containing some properties, and that patterns let us destructure Boxes to recover those properties. enBox and unBox should form a projection-embedding pair, which means that enBox is allowed to be picky about what `t` values it accepts (think of the Rational constructor as throwing on denom==0), but, once boxed, we should be able to recover whatever is in the box. (The Box code gets to mediate access in both directions, but the _language_ shouldn't make guesses about what this code is going to do.)

From the perspective of Box, is `null` a valid value of T? The answer is: "That's the Box author's business. The constructor accepts a T, and `null` is a valid member of T's value set. So if the imperative body of the constructor doesn't do anything special to reject it, then it's part of the domain." And if it's part of the domain, then `unBox` should hand back what we handed to `enBox`. T in, T out.

It has been a driving goal throughout the pattern matching exploration to exploit these dualities, because (among other things) this minimizes sharp edges and makes composition do what you expect it to. If I do:

    Box b = new Box(t);

and this succeeds, then our `match` function applied to `Box(T)` and `b` should yield what we started with -- `t`. Singling out `null` for special treatment here as an illegal binding result is unwarranted; it creates a sharp edge where you can put things into boxes but you can only get them out on Tuesdays. The language has no business telling Box it can't contain nulls, or punishing null-happy boxes by making them harder to deconstruct. Null-hostility is for the Box author to choose or not. I should be able to compose construction and deconstruction without surprises.
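
The "T in, T out" duality described above can be sketched in Java as it exists today. This is only an illustration under stated assumptions: the names `BoxRoundTrip`, `StrictBox`, and `strictBoxRejects` are hypothetical, `enBox`/`unBox` are modeled as plain static helpers over a record accessor (records need Java 16 or later), and no deconstruction-pattern syntax is used because that syntax was still under discussion in this thread.

```java
import java.util.Objects;

class BoxRoundTrip {
    // Null-agnostic box: the constructor accepts whatever T it is given.
    record Box<T>(T contents) {}

    // Null-hostile box: here the author chooses to reject null at construction.
    record StrictBox<T>(T contents) {
        StrictBox {
            Objects.requireNonNull(contents, "contents must not be null");
        }
    }

    // enBox :: t -> Box t
    static <T> Box<T> enBox(T t) {
        return new Box<>(t);
    }

    // unBox :: Box t -> t (hands back exactly what enBox was given)
    static <T> T unBox(Box<T> box) {
        return box.contents();
    }

    // Reports whether StrictBox refuses the given value in its constructor.
    static boolean strictBoxRejects(Object value) {
        try {
            new StrictBox<>(value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Box<String> full = enBox("frog");
        Box<String> empty = enBox(null);
        System.out.println(unBox(full));            // frog
        System.out.println(unBox(empty) == null);   // true
        System.out.println(strictBoxRejects(null)); // true
    }
}
```

In this sketch the decision about null lives entirely in the constructor: `Box` lets null in and gives it back unchanged, while `StrictBox` excludes it from its domain up front, so the extraction side never has to special-case null.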

Remember, we're not yet talking about language syntax here -- we're talking about the semantics of matching (and what we let class authors model). At this level, there is simply no other reasonable set of semantics here -- the `Box(T)` deconstructor, when applied to a valid Box, should be able to recover whatever was passed to the `new Box(T)` constructor. Nulls should be rejected by pattern matching at the point where they would be dereferenced, not preemptively.

There's also only one reasonable definition of the semantics of nested matching. If `P : Pattern t`, then the nested pattern P(Q) matches u iff

    u matches P(T alpha) && alpha matches Q

It follows that if `Box(Object o)` is going to be total on all boxes, then Object o is total on all objects.

(There's also only one reasonable definition of the `var` pattern; it is type inference where we infer the type pattern for whatever type is the target of the match. So if `P : Pattern T`, then `P(var x)` infers `T x` for the nested pattern.)

Doing anything else is an impediment to composition (and composition is the only tool we have, as language designers, that separates us from the apes.) I can compose constructors:

    Box<Flox<Pox<T>>> b = new Box<>(new Flox<>(new Pox<>(t)));

and I should be able to take this apart exactly the same way:

    if (b matches Box(Flox(Pox(var t))))

The reason `Flox(Pox p)` doesn't match null floxes is not because patterns shouldn't match null, but because a _deconstruction pattern_ that takes apart a Flox is intrinsically going to look inside the Flox -- which means dereferencing it. But an ordinary type pattern is not necessarily going to.

Looking at it from another angle, there is a natural interpretation of applying a total pattern as a generalization of assignment. It's not an accident that `T t` (or `var x`) looks both like a pattern and like a local variable declaration. We know that this:

    T t = e

or

    var t = e

is a local variable declaration with initializer, but we can also reasonably (and profitably) interpret it as a pattern match -- take the (total on T) pattern `T t`, and match `e : T` to it. And the compiler already knows that this is going to succeed if `e : T`. To gratuitously reject null here makes no sense. (Totality is important here; if the pattern were not total, then `t` would not be DA after the assignment, and therefore the declaration either has to throw a runtime error, or the compiler has to reject it.)

## Back to switch and instanceof

The above discussion argues why there is only one reasonable null behavior for patterns _in the abstract_. But, I hear you cry, the semantics for switch and instanceof today are entirely reasonable and intuitive, so how could they be so wrong?

And the answer is: we have only been able to use `switch` and `instanceof` so far for pretty trivial things! When we add patterns to the language, we're raising the expressive ability of these constructs to some power. And extrapolating from our existing intuitions about these is like extrapolating the behavior of polynomials from their zeroth-order Taylor expansion.

(Now, at this point, the split-over-lump crowd says "Then you should define new constructs, if they're so much more powerful." But I still claim it is far better to refine our intuitions about what switch means, even with some discomfort, than to try to keep track of the subtle differences between switch and snitch.)

So, why do we have the current null behavior for `instanceof` and `switch`? Well, right now, `instanceof` only lets you ask a very very simple question -- "is the dynamic type of the target X". And, the designers judged (reasonably) that, since 99.999% of the time, what you're about to do is cast the target and then dereference it, saying "no" is less error-prone than saying OK and then having the subsequent dereference fail.
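
For reference, the existing null behavior that this section starts from can be reproduced with a short sketch. It uses only long-standing Java (a plain `instanceof` test and a classic `switch` on a `String`); the class and method names are illustrative and not part of any proposal.

```java
class LegacyNullBehavior {
    // instanceof answers "no" for null instead of throwing.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // A classic switch on a String dereferences its operand up front,
    // so a null operand throws before any label (even default) is tried.
    static String classify(String s) {
        try {
            switch (s) {
                case "frog":
                    return "amphibian";
                default:
                    return "unknown";
            }
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(isString("frog")); // true
        System.out.println(isString(null));   // false
        System.out.println(classify("frog")); // amphibian
        System.out.println(classify(null));   // NPE
    }
}
```

The two constructs are already asymmetric with respect to null: `instanceof` quietly says no, while `switch` fails fast. Those are the two legacy behaviors the remainder of the message weighs against the semantics of patterns.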

But now, `instanceof` can answer far more sophisticated questions, and that 99.999% becomes a complete unknown. With what confidence can you say that the body of:

    if (b instanceof Box(var t)) { ... }

is going to dereference t? If you say more than 50%, you're lying. It would be totally reasonable to just take that t and assign it somewhere else, rebox it into another box, pass it to some T-consuming method, etc. And who are we to say that Box-consuming protocols are somehow "bad" if they like to truck in null contents? That's not our business! So the conditions under which "always says no" was reasonable for Java 1.0 are no longer applicable.

The same is true for switch, because of the very limited reference types which switch permits (and which were only added in Java 5 and, for strings, Java 7) -- boxed primitives, strings, and enums. In all of these cases, we are asking very simple questions ("are you 3"), and these are domains where nulls have historically been denigrated -- so it seemed reasonable for switch to be hostile to them. But once we introduce patterns, the set of questions you can ask gets enormously larger, and the set of types you can switch over does too. The old conditions don't apply. In:

    switch (o) {
        case Box(var t): ...
        case Bag(var t): ...
    }

we care about the contents, not the wrapping; the switch is there to do the unwrapping for us. Who are we to say "sorry, no one should ever be allowed to put a null in a Bag?" That's not our business!

At this point, I suspect Remi says "I'm not saying you can't put a null in a Box, but there should be a different way to unpack it." But unless you can say with 99.99% certainty that nulls are always errors, it is better to be agnostic to nulls in the plumbing and let users filter them at the ultimate point of consumption, than to make the plumbing null-hostile and make users jump through hoops to get the nulls to flow.
The same was true for streams; we made the (absolutely correct) choice to let the nulls flow through the stream, and, if you are using a maybe-null-containing source, and doing null-incompatible things on the elements, it's on you to filter them. It is easier to filter nulls than to add back a special encoding for nulls. (And, the result of that experiment was pretty conclusive: of the hundreds of Stack Overflow questions I have seen on streams, not one centered around unexpected nulls.)

If we have guards, and you want to express "no Boxes with nulls", that's easy:

    case Box(var t) when t != null: ...

And again, as with `instanceof`, we have no reason to believe that there's a 99.99% chance that the next thing the user is going to do is dereference it. So the justification that null-hostility is the "obvious" semantics here doesn't translate to the new, more powerful language feature.

And it gets worse: the people who really want the nulls now have to do additional error-prone work, either use some ad-hoc epicyclical syntax at each use site (and, if the deconstruction pattern has five bindings, you have to say it five times), or having to duplicate blocks of code to avoid the switch anomaly.

The conclusion of this section is that while the existing null behavior for instanceof and switch is justified relative to their _current_ limitations, once we remove those limitations, those behaviors are much more arbitrary (and kind of mean: "nulls are so bad, that if you are a null-using person, we will make it harder for you, 'for your own good'.")

#### Split the baby?

Now, there is room to make a reasonable argument that we'd rather keep the existing switch behavior, but accept the null-friendly matching behavior. My take is that this is a bad trade, but let's look at it more carefully.

Gain: I don't have to learn a new set of rules about what switch/instanceof do with null.

Loss: code duplication.
If I want my fallback to handle nulls, I have to duplicate code; instead of

    switch (o) {
        case String s: A
        case Long l: B
        case Object o: C
    }

I have to do

    if (o == null) { C }
    else switch (o) {
        case String s: A
        case Long l: B
        case Object o: C
    }

resulting in duplicating C. (We have this problem today, but because of the limitations of switch today, it is rarely a problem. When our case labels are more powerful, we'll be using switch for more stuff, and it will surely come up more often.)

Loss: refactoring anomaly. Refactoring a nested switch with:

    case P(Q):
    case P(R):
    case P(S):

to

    case P(var x):
        switch (x) {
            case Q: ...
            case R: ...
            case S: ...
        }
    }

doesn't work in the obvious way. Yes, there's a way to refactor it, and the IDE will do it correctly. But it becomes a sharp edge that users will trip over. The reason the above refactoring is desirable is because users will reasonably assume it works, and rather than cut them with a sharp edge, we can just make it work the way they reasonably think it should.

So, we could make this trade, and it would be more "minimal" -- but I think it would result in a less useful switch in the long run. I think we would regret it.

#### Conclusion

If we were designing pattern matching and switch together from scratch, we would never even consider the current nullity behavior; the "wait until someone actually dereferences before we throw" is the obvious and only reasonable choice. We're being biased based on our existing assumptions about instanceof and switch. This is a reasonable starting point, but we have to admit that these biases in turn come from the fact that the current interpretations of those constructs are dramatically limited compared to supporting patterns.

It is easy to trot out anecdotes where any of the possible schemes would cause a particular user to be confused.
But this is just a way to justify our biases. The reality is that, as switch and instanceof get more powerful, we don't get to make as many assumptions about the likelihood of whether `null` is an error or not. And, the more likely it is not an error, the less justification we have for giving it special semantics.

Let the nulls flow.

From brian.goetz at oracle.com Mon Aug 10 21:02:21 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 10 Aug 2020 17:02:21 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
Message-ID: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com>

Some further color on this, to characterize why all the angst over matching Box(null) seems mostly like a collective "bleah, different is scary" freakout...

Case 1. The Box domain rejects nulls in the ctor. Then it doesn't matter what we do; all the schemes discussed for `Box(Object o)` will do the same thing.

Case 2. The Box domain loves nulls! Boxes can contain nulls, and users should always expect to find a null in a box; not doing so is using boxes wrong.

In that case, `case Box(Object o)` should surely match `Box(null)`, since it's an unremarkable element of the Box domain. Here, though, people get nervous: "if we bind o to null, a careless user might NPE!" But that's likely to happen anyway -- and should.

Suppose we didn't have deconstruction patterns, and instead the user writes:

    case Box b: ...

There's no question this matches Box(null). And, the same code the careless programmer might write with `Box(var o)`, they're going to write almost exactly the same thing here:

    case Box b:
        Object boxContents = b.contents(); // returns null, no problem
        boxContents.foo()                  // Same NPE

In this case, we do the users no favors -- actually, we do anti-favors -- by "hiding" Box(null) from the domain, on the off chance that they will screw it up.
If Box is a null-loving domain, then clients need to write null-aware code, and hiding the nulls doesn't help.

Further, this example shows another element from our refactoring catalog: users should be able to freely refactor:

    case Foo target:
        Object component = target.component();

with

    case Foo(Object component) target: ...

without changing the semantics. But if `Foo(Object)` doesn't match `Foo(null)`, that's yet another sharp edge.

Essentially, I think the "never match nulls" crowd just really hates nulls and wants them to go away. But they are not going away, and we do no one any favors by hiding our heads in the sand.

On 8/10/2020 1:57 PM, Brian Goetz wrote:
> There seems to be an awful lot of confusion about the motivation for
> the nullity proposal, so let me step back and address this from first
> principles.
>
> Let's factor away the null-tolerance of the constructs (switch and
> instanceof) from what patterns mean, and then we can return to how, if
> necessary, to resolve any mismatches. We'll do this by defining what
> it means for a target to match a pattern, and only then define the
> semantics of the pattern-aware constructs in terms of that.
>
> Let me also observe that some people, in their belief that `null` was
> a mistake, tend to have a latent hostility to null, and therefore tend
> to want new features to be at least as null-hostile as the most
> null-hostile of old features. (A good example is streams; it was
> suggested (by some of the same people) that it should be an error for
> streams to have null elements. And we considered this briefly -- and
> concluded this would have been a terrible idea! The lesson of that
> investigation was that the desire to "fix" the null mistake by
> patching individual holes is futile, and tends to lead to worse
> results. Instead, being null-agnostic was the right move for streams.)
>
> I think we're also being distracted by the fact that, in part because we've chosen `instanceof` as our syntax, we want to use `instanceof` as our mental model for what matching means. This is a good guiding principle but we must be careful of following it blindly.
>
> As a modeling simplification, let's assume that all patterns have exactly one binding variable, and the type of that binding variable is part of the pattern definition. We could model our match predicate and (conditional) binding function as:
>
>     match :: (Pattern t) u -> Maybe t
>
> A pattern represents the fusion of an applicability predicate, zero or more conditional extractions, and a binding mechanism. For the simple case of a type pattern `Foo f`, the applicability predicate is "are you a Foo", and there are two possible interpretations -- "would `instanceof` say you are a `Foo`" (which means non-null), or "could you be assigned to a variable of type Foo" (or, equivalently, "are you in the value set of Foo".)
>
> A pattern P is _total_ on U if `match P u` returns `Some t` for every `u : U`. Total patterns are useful because they allow the compiler to reason about control flow and provide better error checking (detecting dead code, silly pattern matches, totality of expression switches, etc.)
>
> Let's go back to our trusty Box example. We can think of the `Box` constructor as a mapping:
>
>     enBox :: t -> Box t
>
> and the Box deconstructor as
>
>     unBox :: Box t -> t
>
> Now, what algebraic relationship do we want between enBox and unBox? The whole point is that a Box is a structure containing some properties, and that patterns let us destructure Boxes to recover those properties.
> enBox and unBox should form a projection-embedding pair, which means that enBox is allowed to be picky about what `t` values it accepts (think of the Rational constructor as throwing on denom==0), but, once boxed, we should be able to recover whatever is in the box. (The Box code gets to mediate access in both directions, but the _language_ shouldn't make guesses about what this code is going to do.)
>
> From the perspective of Box, is `null` a valid value of T? The answer is: "That's the Box author's business. The constructor accepts a T, and `null` is a valid member of T's value set. So if the imperative body of the constructor doesn't do anything special to reject it, then it's part of the domain." And if it's part of the domain, then `unBox` should hand back what we handed to `enBox`. T in, T out.
>
> It has been a driving goal throughout the pattern matching exploration to exploit these dualities, because (among other things) this minimizes sharp edges and makes composition do what you expect it to. If I do:
>
>     Box b = new Box(t);
>
> and this succeeds, then our `match` function applied to `Box(T)` and `b` should yield what we started with -- `t`. Singling out `null` for special treatment here as an illegal binding result is unwarranted; it creates a sharp edge where you can put things into boxes but you can only get them out on Tuesdays. The language has no business telling Box it can't contain nulls, or punishing null-happy boxes by making them harder to deconstruct. Null-hostility is for the Box author to choose or not. I should be able to compose construction and deconstruction without surprises.
>
> Remember, we're not yet talking about language syntax here -- we're talking about the semantics of matching (and what we let class authors model).
> At this level, there is simply no other reasonable set of semantics here -- the `Box(T)` deconstructor, when applied to a valid Box, should be able to recover whatever was passed to the `new Box(T)` constructor. Nulls should be rejected by pattern matching at the point where they would be dereferenced, not preemptively.
>
> There's also only one reasonable definition of the semantics of nested matching. If `P : Pattern t`, then the nested pattern P(Q) matches u iff
>
>     u matches P(T alpha) && alpha matches Q
>
> It follows that if `Box(Object o)` is going to be total on all boxes, then Object o is total on all objects.
>
> (There's also only one reasonable definition of the `var` pattern; it is type inference where we infer the type pattern for whatever type is the target of the match. So if `P : Pattern T`, then `P(var x)` infers `T x` for the nested pattern.)
>
> Doing anything else is an impediment to composition (and composition is the only tool we have, as language designers, that separates us from the apes.) I can compose constructors:
>
>     Box<Flox<Pox<T>>> b = new Box(new Flox(new Pox(t)));
>
> and I should be able to take this apart exactly the same way:
>
>     if (b matches Box(Flox(Pox(var t))))
>
> The reason `Flox(Pox p)` doesn't match null floxes is not because patterns shouldn't match null, but because a _deconstruction pattern_ that takes apart a Flox is intrinsically going to look inside the Flox -- which means dereferencing it. But an ordinary type pattern is not necessarily going to.
>
> Looking at it from another angle, there is a natural interpretation of applying a total pattern as a generalization of assignment. It's not an accident that `T t` (or `var x`) looks both like a pattern and like a local variable declaration. We know that this:
>
>     T t = e
> or
>     var t = e
>
> is a local variable declaration with initializer, but we can also reasonably (and profitably) interpret it as a pattern match -- take the (total on T) pattern `T t`, and match `e : T` to it. And the compiler already knows that this is going to succeed if `e : T`. To gratuitously reject null here makes no sense. (Totality is important here; if the pattern were not total, then `t` would not be DA after the assignment, and therefore the declaration either has to throw a runtime error, or the compiler has to reject it.)
>
> ## Back to switch and instanceof
>
> The above discussion argues why there is only one reasonable null behavior for patterns _in the abstract_. But, I hear you cry, the semantics for switch and instanceof today are entirely reasonable and intuitive, so how could they be so wrong?
>
> And the answer is: we have only been able to use `switch` and `instanceof` so far for pretty trivial things! When we add patterns to the language, we're raising the expressive ability of these constructs to some power. And extrapolating from our existing intuitions about these is like extrapolating the behavior of polynomials from their zeroth-order Taylor expansion.
>
> (Now, at this point, the split-over-lump crowd says "Then you should define new constructs, if they're so much more powerful." But I still claim it is far better to refine our intuitions about what switch means, even with some discomfort, than to try to keep track of the subtle differences between switch and snitch.)
>
> So, why do we have the current null behavior for `instanceof` and `switch`? Well, right now, `instanceof` only lets you ask a very, very simple question -- "is the dynamic type of the target X".
> And, the designers judged (reasonably) that, since 99.999% of the time, what you're about to do is cast the target and then dereference it, saying "no" is less error-prone than saying OK and then having the subsequent dereference fail.
>
> But now, `instanceof` can answer far more sophisticated questions, and that 99.999% becomes a complete unknown. With what confidence can you say that the body of:
>
>     if (b instanceof Box(var t)) { ... }
>
> is going to dereference t? If you say more than 50%, you're lying. It would be totally reasonable to just take that t and assign it somewhere else, rebox it into another box, pass it to some T-consuming method, etc. And who are we to say that Box-consuming protocols are somehow "bad" if they like to truck in null contents? That's not our business! So the conditions under which "always says no" was reasonable for Java 1.0 are no longer applicable.
>
> The same is true for switch, because of the very limited reference types which switch permits (and which were only added in Java 5 and Java 7) -- boxed primitives, strings, and enums. In all of these cases, we are asking very simple questions ("are you 3"), and these are domains where nulls have historically been denigrated -- so it seemed reasonable for switch to be hostile to them. But once we introduce patterns, the set of questions you can ask gets enormously larger, and the set of types you can switch over does too. The old conditions don't apply. In:
>
>     switch (o) {
>         case Box(var t): ...
>         case Bag(var t): ...
>     }
>
> we care about the contents, not the wrapping; the switch is there to do the unwrapping for us. Who are we to say "sorry, no one should ever be allowed to put a null in a Bag?" That's not our business!
>
> At this point, I suspect Remi says "I'm not saying you can't put a null in a Box, but there should be a different way to unpack it."
> But unless you can say with 99.99% certainty that nulls are always errors, it is better to be agnostic to nulls in the plumbing and let users filter them at the ultimate point of consumption, than to make the plumbing null-hostile and make users jump through hoops to get the nulls to flow. The same was true for streams; we made the (absolutely correct) choice to let the nulls flow through the stream, and, if you are using a maybe-null-containing source, and doing null-incompatible things on the elements, it's on you to filter them. It is easier to filter nulls than to add back a special encoding for nulls. (And, the result of that experiment was pretty conclusive: of the hundreds of Stack Overflow questions I have seen on streams, not one centered around unexpected nulls.)
>
> If we have guards, and you want to express "no Boxes with nulls", that's easy:
>
>     case Box(var t) when t != null: ...
>
> And again, as with `instanceof`, we have no reason to believe that there's a 99.99% chance that the next thing the user is going to do is dereference it. So the justification that null-hostility is the "obvious" semantics here doesn't translate to the new, more powerful language feature.
>
> And it gets worse: the people who really want the nulls now have to do additional error-prone work, either using some ad-hoc epicyclical syntax at each use site (and, if the deconstruction pattern has five bindings, you have to say it five times), or duplicating blocks of code to avoid the switch anomaly.
>
> The conclusion of this section is that while the existing null behavior for instanceof and switch is justified relative to their _current_ limitations, once we remove those limitations, those behaviors are much more arbitrary (and kind of mean: "nulls are so bad, that if you are a null-using person, we will make it harder for you, 'for your own good'.")
>
> #### Split the baby?
>
> Now, there is room to make a reasonable argument that we'd rather keep the existing switch behavior, but accept the null-friendly matching behavior. My take is that this is a bad trade, but let's look at it more carefully.
>
> Gain: I don't have to learn a new set of rules about what switch/instanceof do with null.
>
> Loss: code duplication. If I want my fallback to handle nulls, I have to duplicate code; instead of
>
>     switch (o) {
>         case String s: A
>         case Long l: B
>         case Object x: C
>     }
>
> I have to do
>
>     if (o == null) { C }
>     else switch (o) {
>         case String s: A
>         case Long l: B
>         case Object x: C
>     }
>
> resulting in duplicating C. (We have this problem today, but because of the limitations of switch today, it is rarely a problem. When our case labels are more powerful, we'll be using switch for more stuff, and it will surely come up more often.)
>
> Loss: refactoring anomaly. Refactoring a nested switch with:
>
>     case P(Q):
>     case P(R):
>     case P(S):
>
> to
>
>     case P(var x):
>         switch (x) {
>             case Q: ...
>             case R: ...
>             case S: ...
>         }
>
> doesn't work in the obvious way. Yes, there's a way to refactor it, and the IDE will do it correctly. But it becomes a sharp edge that users will trip over. The reason the above refactoring is desirable is because users will reasonably assume it works, and rather than cut them with a sharp edge, we can just make it work the way they reasonably think it should.
>
> So, we could make this trade, and it would be more "minimal" -- but I think it would result in a less useful switch in the long run. I think we would regret it.
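The duplication cost described above is easy to reproduce. The following is a minimal sketch, assuming a JDK with pattern matching for switch (Java 21 or later), where a switch without a `case null` label throws NullPointerException on a null target. The names `NullHostileSwitch`, `describe`, and `fallback` are hypothetical, with `fallback` standing in for arm C.

```java
class NullHostileSwitch {
    // Hypothetical stand-in for arm C in the example above.
    static String fallback(Object o) {
        return "C(" + o + ")";
    }

    // Without a `case null` label the switch rejects a null target, so
    // null is peeled off first and the fallback call appears twice.
    static String describe(Object o) {
        if (o == null) {
            return fallback(o);           // first copy of C
        }
        return switch (o) {
            case String s -> "A";
            case Long l -> "B";
            case Object x -> fallback(x); // second copy of C
        };
    }

    public static void main(String[] args) {
        System.out.println(describe("hi"));
        System.out.println(describe(42L));
        System.out.println(describe(null));
    }
}
```

With only one fallback the cost looks small; it grows with the size of C and with every switch that needs the same treatment.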
>
> #### Conclusion
>
> If we were designing pattern matching and switch together from scratch, we would never even consider the current nullity behavior; the "wait until someone actually dereferences before we throw" is the obvious and only reasonable choice. We're being biased by our existing assumptions about instanceof and switch. This is a reasonable starting point, but we have to admit that these biases in turn come from the fact that the current interpretations of those constructs are dramatically limited compared to supporting patterns.
>
> It is easy to trot out anecdotes where any of the possible schemes would cause a particular user to be confused. But this is just a way to justify our biases. The reality is that, as switch and instanceof get more powerful, we don't get to make as many assumptions about the likelihood of whether `null` is an error or not. And, the more likely it is not an error, the less justification we have for giving it special semantics.
>
> Let the nulls flow.

From john.r.rose at oracle.com Mon Aug 10 21:12:24 2020
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 10 Aug 2020 14:12:24 -0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
Message-ID: <91916049-7354-4AD4-AA94-81D07F7F7453@oracle.com>

This is all that I wished to imply when I said earlier:

> Eh; null doesn't need to be that special

...And so much more. Thanks.

Yes, let the nulls flow.
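The Box(null) case argued over in this thread can be tried directly. The following is a minimal sketch, assuming a JDK with record patterns (Java 21 or later); `BoxNullDemo`, `Box`, `open`, and `openString` are hypothetical names, and `Box` is deliberately a null-tolerant record whose constructor rejects nothing.

```java
class BoxNullDemo {
    // Hypothetical null-tolerant Box: the constructor does not reject null.
    record Box(Object contents) {}

    static String open(Object target) {
        // `Object o` is total on the component type, so Box(null) matches
        // and o is bound to null; nothing is dereferenced here.
        if (target instanceof Box(Object o)) {
            return "box holding " + o;
        }
        return "not a box";
    }

    static String openString(Object target) {
        // `String s` narrows the component type, so it acts as a type test
        // and does not match a null component.
        if (target instanceof Box(String s)) {
            return "box holding string " + s;
        }
        return "no string inside";
    }

    public static void main(String[] args) {
        System.out.println(open(new Box(null)));
        System.out.println(open(null));
        System.out.println(openString(new Box(null)));
        System.out.println(openString(new Box("x")));
    }
}
```

The deconstruction itself rejects a null target, since it has to look inside the Box, while the nested total pattern lets the null contents through to whoever consumes them.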
From john.r.rose at oracle.com Mon Aug 10 22:20:42 2020
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 10 Aug 2020 15:20:42 -0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com>
Message-ID: <52BBA718-C08A-4AC9-AF61-0E05F6BC5C7F@oracle.com>

Letting the nulls flow is a good move in that absorbing game of "find the primitive". Here, you are observing forcefully that instanceof and switch, while important precedents and sources of use cases and design patterns, are not the primitives. We are not so lucky that the answer is "we just need more sugar for the existing constructs". But we are not so unlucky that we must build our new primitives out of alien materials.

The existing ideas about type matching, and specifically that `T x = v;` requires that `T` be total over the type of `v`, are available and useful. Also useful is the idea that some constructs are necessarily null-hostile (starting with dot: `x.f`). In a "let the nulls flow" design, if a construct is not *necessarily* null hostile, it is necessarily *null permissive*. So we hunt for necessary hostility, and among patterns we find it with destructuring (`Box(var t)` as opposed to `Box`) and with value testing patterns like string or numeric literals (if those are patterns, which I think they should be). We also note that the level of hostility (from patterns) must be compatible with generally null-agnostic use cases: This means a pattern can fail on null (if it *must*) but it *must not throw on null*, because the next pattern in line might be the one that matches the null.

So instanceof and switch turn out to be sugar for certain uses of patterns. And together they are universal enough that (luckily) we might not need a new syntax to directly denote the new primitive, of applying a pattern (partial or total) and extracting bindings.

The existing behavior w.r.t. nulls of instanceof and switch needs to be rationalized.
I think that is easy, although there is a little bit to learn. (Just as there's something to learn today: They are unconditionally null-rejecting at present.) An important thing (as Brian points out) is that if you are choosing to write null-agnostic code, your learning curve should be gentle-to-none.

Here are the rules the way I see them, in the presence of primitive patterns which are null-permissive (because they support null-agnostic use cases):

`x instanceof P` includes an additional check `x==null` before it tests the pattern P against x. Rationale: Compatibility. Also look at the name: `null` is never an *instance* *of* any type. A pattern might match null, but we are testing whether `x` is an instance, which is to say, an object.

Some equations to relate instanceof to the primitive __Matches:

    x instanceof P  ≡  x __Matches P && (__PermitsNull(P) ? x != null : true)
    x __Matches P   ≡  x instanceof P || (__PermitsNull(P) ? x == null : false)

Do we need syntax for __Matches P? Probably not, because the above equations allow workarounds when the instanceof syntax isn't exactly right. (And it usually *is* exactly right; the trailing null logic folds away in context, or is harmless in some other way, as the nulls flow around.)

What about switch? I like to think that a switch statement is simply sugar (plus optimizations) for a *decision chain*, an if/else chain which tests each case in turn (in source code order, of course):

    switch (x) {
    case P: p(); break;
    case Q: q(); break;
    ...
    default: d();
    }
  ≡ (approximately)
    {
      var x_ = x;
      if (x_ __Matches P) p();
      else if (x_ __Matches Q) q();
      else ... d();
    }

(Note that this account of classic switch requires extra tweaks to deal with two embarrassing features: (a) fall through, which requires some way of contriving transfers between arms of the decision chain, and (b) the fact that default can go anywhere, and sometimes is placed in the middle to make use of fall-through. These are embarrassments, not show-stoppers.)
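The decision-chain reading can be sketched in plain Java by modeling each case as a predicate plus a body. This is only an illustration under stated assumptions: `DecisionChain`, `Arm`, `decide`, and `classify` are hypothetical names, a `Predicate` stands in for `__Matches`, fall-through and mid-switch `default` are ignored, and no null check is performed at the head of the chain.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class DecisionChain {
    // Hypothetical stand-in for one `case P: body` arm of a switch.
    record Arm<T, R>(Predicate<T> matches, Function<T, R> body) {}

    // Tests each arm in source order; the first arm that matches wins,
    // otherwise the default body runs.
    static <T, R> R decide(T x, List<Arm<T, R>> arms, Function<T, R> dflt) {
        T x_ = x; // the target is evaluated exactly once
        for (Arm<T, R> arm : arms) {
            if (arm.matches().test(x_)) {
                return arm.body().apply(x_);
            }
        }
        return dflt.apply(x_);
    }

    // Two constant-like arms. Both predicates fail on null rather than
    // throw, so a null target simply falls down the chain.
    static String classify(Object o) {
        List<Arm<Object, String>> arms = List.of(
            new Arm<Object, String>(v -> "P".equals(v), v -> "p()"),
            new Arm<Object, String>(v -> Integer.valueOf(3).equals(v), v -> "q()"));
        return decide(o, arms, v -> "d()");
    }

    public static void main(String[] args) {
        System.out.println(classify("P"));
        System.out.println(classify(3));
        System.out.println(classify(null));
    }
}
```

In this sketch a null target reaches the default body, which is exactly the point where a legacy switch would instead have to throw.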
So what about nulls? The simple -- I will say naive -- account of switch is that there is a null check at the head of the switch near `var x_ = x;`. This would account for all of switch's behaviors as of today, but makes switch hostile to nulls.

A more nuanced and useful account of switch's behavior comes from the following observations:

1. All switch cases *today*, if regarded as patterns, are necessarily null-rejecting. *None of them ever match null.*

2. The NPE observed from a switch-on-null, today, might as well be viewed as arising from the *bottom* of the decision chain, *after all matches* fail. From that point of view, the fact that the failure appears to come from the *top* is simply an optimization, a fast-fail when it is statically provable that there's no hope of ever matching that pesky null, in any given legacy switch.

3. When null meets default, we are painted into a corner, so we have to enjoy the only remaining option: At least in legacy switches, the default case is *also* mandated to reject nulls. (So "default" turns out to mean "anything but null". But that doesn't parlay into a general anti-null story; sorry null-haters.) This feature of default can (maybe) be turned into a benefit: Perhaps we can teach users that by saying "default" you are *asking* for an NPE, if a null escapes all the intervening patterns in the decision chain. I don't have a strong opinion on that.

The previous three observations fully account for today's legacy switches, with their limited set of patterns. The next one is also necessary to extend to switch cases which may support null-friendly patterns:

4. We need a rule to allow nulls to flow through switches until the user is ready to handle them. This means that null-permissive patterns in *some* switch cases need to be shielded from null just as with instanceof.

What is this rule? We've already discussed it adequately; it comes in two parts:

A. `case null:` is allowed and does the obvious thing.
We might as well require that it always come first.

B. There is a way of issuing a case which accepts nulls, and that way is a total pattern that is null friendly. (As Brian points out, this fits with the useful idea that a null-friendly pattern of the form `T v` or `var v` works just like the similar declaration.)

Note that B is less arbitrary than it might seem at first blush: To avoid dead code, any total pattern in a switch must come *last*, at the bottom of the decision chain. (There can be no `default:` after it either, since that would be dead.)

So the rules together mean:

1. If there is a `case null` at the top, that's where nulls go.
2. If there is a total pattern at the bottom, that's where nulls go.
3. Non-total patterns don't catch nulls *in a switch*, just like in instanceof.
4. If there is neither a `case null` nor a total pattern, the switch throws NPE.

I think this covers the use cases, except (perhaps) for some of the "anecdotal" (really, artificial) use cases one could come up with where the rules get slightly more burdensome than if they were different (and burdensome in more substantial ways).

Corresponding rules apply to sub-patterns (and to refactorings).

a. a null-hostile sub-pattern (in Box(Pox(var t))) is null-hostile
b. a null-permissive sub-pattern (in Box(Object t)) that is also total permits nulls
c. a null-permissive sub-pattern (in Box(String t)) that is partial

c. is debatable, but I think it's the right answer also. It aligns the contextual behavior of case patterns with that of sub-patterns.

The contextual behavior can be summarized generally:

a. some patterns are null hostile, so never match null (constants, destructurings)
b. null-permissive patterns which are asked to narrow the target type do not match null
c. null-permissive patterns which widen or reiterate the target type let nulls flow through

Case b is like instanceof, while case c is like a declaration.
The declaration-like behavior is allowed if the corresponding declaration would also be valid; otherwise the instanceof-like behavior applies.

Score card:

- Patterns intrinsically allow nulls to flow when possible. (Some necessarily reject nulls; others don't.)
- Patterns are always applied in a type context which renders them like declarations or like type tests.
- The type context determines whether they do type tests or not; only type tests reject nulls.
- Instanceof is declared to be always a type test (even if its pattern is total).
- Switch rejects nulls with NPE unless there is a case that accepts nulls.

Did I miss anything?

From amaembo at gmail.com Tue Aug 11 03:26:00 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Tue, 11 Aug 2020 10:26:00 +0700
Subject: [records] Mark generated toString/equals/hashCode in class files somehow?
In-Reply-To:
References:
Message-ID:

Thank you, Alex!

I created an issue to track this:
https://bugs.openjdk.java.net/browse/JDK-8251375
I'm not sure about the component. I set 'javac', though it also touches JLS and JVMS; hopefully, no actual changes in HotSpot are necessary as HotSpot can happily ignore this flag. I tried to formulate proposed changes in JLS and JVMS right in the issue. For now, I omit fields from the discussion because supporting fields isn't critical for my applications. But it's also ok to extend this to fields.

What else can I do to move this forward?

With best regards,
Tagir Valeev.

On Sat, Aug 8, 2020 at 12:37 AM Alex Buckley wrote:
>
> You are right that ACC_MANDATED is not expressible for methods. This is unfortunate not only for equals/hashCode/toString in a record class, but also for values/valueOf in an enum class.
>
> ACC_SYNTHETIC indicates an implementation artifact -- something that varies from compiler to compiler (or from one release of a compiler to the next release of the same compiler). It would be wrong to use ACC_SYNTHETIC to mark the five methods in the previous paragraph.
They > are language artifacts, whose existence + signature + semantics are the > same across compilers. > > It would be legitimate to add ACC_MANDATED to method_info.access_flags. > ACC_MANDATED is defined as 0x8000 in other contexts, so convention > dictates that it would have to be defined as 0x8000 in > method_info.access_flags too. Happily, 0x8000 is available there. This > also applies to field_info.access_flags. > > Alex > > On 8/6/2020 9:20 PM, Tagir Valeev wrote: > > Hello, Jonathan! > > > > I believe, current JVM specification doesn't say that methods could be > > marked with ACC_MANDATED [1]. I won't mind if it will be used instead > > of SYNTHETIC. To me, anything is ok if I can avoid bytecode > > inspection. > > > > With best regards, > > Tagir Valeev. > > > > [1] https://docs.oracle.com/javase/specs/jvms/se14/html/jvms-4.html#jvms-4.6 > > > > On Fri, Aug 7, 2020 at 11:12 AM Jonathan Gibbons > > wrote: > >> > >> Tagir, > >> > >> The concept and word you are looking for is "mandated", which is similar > >> to but different from "synthetic". > >> > >> See > >> https://docs.oracle.com/en/java/javase/14/docs/api/java.compiler/javax/lang/model/util/Elements.Origin.html#MANDATED > >> > >> -- Jon > >> > >> > >> On 8/6/20 8:48 PM, Tagir Valeev wrote: > >>> Hello! > >>> > >>> I'm working on class-file decompiler for records and discovered that > >>> there's no special flag for generated equals/hashCode/toString (like > >>> ACC_SYNTHETIC). This allows determining whether this method was > >>> explicitly specified in the source code only by looking into method > >>> implementation whether it has an ObjectMethods.bootstrap indy or not. > >>> This looks implementation-dependent and somewhat fragile (though, of > >>> course, we will do this if we have no other options). 
> >>> We also have a stub decompiler that decompiles declarations only without checking method bodies at all and it also wants to know whether equals/hashCode/toString methods were autogenerated. Finally, other bytecode tools like code coverage may need this to avoid calculating coverage for methods not present in the source.
> >>>
> >>> Is it possible to mark generated methods via ACC_SYNTHETIC or any other flag or add any attribute that can be used to differentiate auto-generated methods from the ones present in the source code?
> >>>
> >>> Having a synthetic mark for auto-generated canonical constructor or accessor methods is less critical (as their bodies could be actually written in the source code like this) but it would be also nice to have it.
> >>>
> >>> With best regards,
> >>> Tagir Valeev.

From forax at univ-mlv.fr Tue Aug 11 12:23:27 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 11 Aug 2020 14:23:27 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com>
References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com> <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com>
Message-ID: <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "John Rose"
> Cc: "amber-spec-experts"
> Sent: Monday, August 10, 2020 23:02:21
> Subject: Re: Next up for patterns: type patterns in switch

> Some further color on this, to characterize why all the angst over matching Box(null) seems mostly like a collective "bleah, different is scary" freakout...

> Case 1. The Box domain rejects nulls in the ctor. Then it doesn't matter what we do; all the schemes discussed for `Box(Object o)` will do the same thing.

> Case 2. The Box domain loves nulls! Boxes can contain nulls, and users should always expect to find a null in a box; not doing so is using boxes wrong.
> In that case, `case Box(Object o)` should surely match `Box(null)`, since its an > unremarkable element of the Box domain. Here, though, people get nervous: "if > we bind o to null, a careless users might NPE!" But that's likely to happen > anyway -- and should. > Suppose we didn't have deconstruction patterns, and instead the user writes: > case Box b: ... > There's no question this matches Box(null). And, the same code the careless > programmer might write with `Box(var o)`, they're going to write almost exactly > the same thing here: > case Box b: > Object boxContents = b.contents(); // returns null, no problem > boxContents.foo() // Same NPE > In this case, we do the users no favors -- actually, we do anti-favors -- by > "hiding" Box(null) from the domain, on the off chance that they will screw it > up. If Box is a null-loving domain, then clients need to write null-aware code, > and hiding the nulls doesn't help. > Further, this example shows another element from our refactoring catalog: users > should be able to freely refactor: > case Foo target: > Object component = target.component(); > with > case Foo(Object component) target: ... > without changing the semantics. But if `Foo(Object)` doesn't match `Foo(null)`, > that's yet another sharp edge. > Essentially, I think the "never match nulls" crowd just really hates nulls and > wants them to go away. But they are not going away, and we do no one any favors > by hiding our heads in the sand. > On 8/10/2020 1:57 PM, Brian Goetz wrote: >> There seems to be an awful lot of confusion about the motivation for the nullity >> proposal, so let me step back and address this from first principles. >> Let's factor away the null-tolerance of the constructs (switch and instanceof) >> from what patterns mean, and then we can return to how, if necessary, to >> resolve any mismatches. 
We'll do this by defining what it means for a target to >> match a pattern, and only then define the semantics of the pattern-aware >> constructs in terms of that. >> Let me also observe that some people, in their belief that `null` was a mistake, >> tend to have a latent hostility to null, and therefore tend to want new >> features to be at least as null-hostile as the most null-hostile of old >> features. (A good example is streams; it was suggested (by some of the same >> people) that it should be an error for streams to have null elements. And we >> considered this briefly -- and concluded this would have been a terrible idea! >> The lesson of that investigation was that the desire to "fix" the null mistake >> by patching individual holes is futile, and tends to lead to worse results. >> Instead, being null-agnostic was the right move for streams.) >> I think we're also being distracted by the fact that, in part because we've >> chosen `instanceof` as our syntax, we want to use `instanceof` as our mental >> model for what matching means. This is a good guiding principle but we must be >> careful of following it blindly. >> As a modeling simplification, let's assume that all patterns have exactly one >> binding variable, and the type of that binding variable is part of the pattern >> definition. We could model our match predicate and (conditional) binding >> function as: >> match :: (Pattern t) u -> Maybe t >> A pattern represents the fusion of an applicability predicate, zero or more >> conditional extractions, and a binding mechanism. For the simple case of a type >> pattern `Foo f`, the applicability predicate is "are you a Foo", and there are >> two possible interpretations -- "would `instanceof` say you are a `Foo`" (which >> means non-null), or "could you be assigned to a variable of type Foo" (or, >> equivalently, "are you in the value set of Foo".) >> A pattern P is _total_ on U if `match P u` returns `Some t` for every `u : U`. 
>> Total patterns are useful because they allow the compiler to reason about >> control flow and provide better error checking (detecting dead code, silly >> pattern matches, totality of expression switches, etc.) >> Let's go back to our trusty Box example. We can think of the `Box` constructor >> as a mapping: >> enBox :: t -> Box t >> and the Box deconstructor as >> unBox :: Box t -> t >> Now, what algebraic relationship do we want between enBox and unBox? The whole >> point is that a Box is a structure containing some properties, and that >> patterns let us destructure Boxes to recover those properties. enBox and unBox >> should form a projection-embedding pair, which means that enBox is allowed to >> be picky about what `t` values it accepts (think of the Rational constructor as >> throwing on denom==0), but, once boxed, we should be able to recover whatever >> is in the box. (The Box code gets to mediate access in both directions, but the >> _language_ shouldn't make guesses about what this code is going to do.) >> From the perspective of Box, is `null` a valid value of T? The answer is: >> "That's the Box author's business. The constructor accepts a T, and `null` is a >> valid member of T's value set. So if the imperative body of the constructor >> doesn't do anything special to reject it, then it's part of the domain." And if >> its part of the domain, then `unBox` should hand back what we handed to >> `enBox`. T in, T out. >> It has been a driving goal throughout the pattern matching exploration to >> exploit these dualities, because (among other things) this minimizes sharp >> edges and makes composition do what you expect it to. If I do: >> Box b = new Box(t); >> and this succeeds, then our `match` function applied to `Box(T)` and `b` should >> yield what we started with -- `t`. 
Singling out `null` for special treatment >> here as an illegal binding result is unwarranted; it creates a sharp edge where >> you can put things into boxes but you can only get them out on Tuesdays. The >> language has no business telling Box it can't contain nulls, or punishing >> null-happy boxes by making them harder to deconstruct. Null-hostility is for >> the Box author to choose or not. I should be able to compose construction and >> deconstruction without surprises. >> Remember, we're not yet talking about language syntax here -- we're talking >> about the semantics of matching (and what we let class authors model). At this >> level, there is simply no other reasonable set of semantics here -- the >> `Box(T)` deconstructor, when applied to a valid Box, should be able to >> recover whatever was passed to the `new Box(T)` constructor. Nulls should be >> rejected by pattern matching at the point where they would be dereferenced, not >> preemptively. >> There's also only one reasonable definition of the semantics of nested matching. >> If `P : Pattern t`, then the nested pattern P(Q) matches u iff >> u matches P(T alpha) && alpha matches Q >> It follows that if `Box(Object o)` is going to be total on all boxes, then >> `Object o` is total on all objects. >> (There's also only one reasonable definition of the `var` pattern; it is type >> inference where we infer the type pattern for whatever type is the target of >> the match. So if `P : Pattern T`, then `P(var x)` infers `T x` for the nested >> pattern.) >> Doing anything else is an impediment to composition (and composition is the only >> tool we have, as language designers, that separates us from the apes.)
I can >> compose constructors: >> Box<Flox<Pox<T>>> b = new Box<>(new Flox<>(new Pox<>(t))); >> and I should be able to take this apart exactly the same way: >> if (b matches Box(Flox(Pox(var t)))) >> The reason `Flox(Pox p)` doesn't match null floxes is not because patterns >> shouldn't match null, but because a _deconstruction pattern_ that takes apart a >> Flox is intrinsically going to look inside the Flox -- which means >> dereferencing it. But an ordinary type pattern is not necessarily going to. >> Looking at it from another angle, there is a natural interpretation of applying >> a total pattern as a generalization of assignment. It's not an accident that `T >> t` (or `var x`) looks both like a pattern and like a local variable >> declaration. We know that this: >> T t = e >> or >> var t = e >> is a local variable declaration with initializer, but we can also reasonably >> (and profitably) interpret it as a pattern match -- take the (total on T) >> pattern `T t`, and match `e : T` to it. And the compiler already knows that >> this is going to succeed if `e : T`. To gratuitously reject null here makes no >> sense. (Totality is important here; if the pattern were not total, then `t` >> would not be DA after the assignment, and therefore the declaration either has >> to throw a runtime error, or the compiler has to reject it.) >> ## Back to switch and instanceof >> The above discussion argues why there is only one reasonable null behavior for >> patterns _in the abstract_. But, I hear you cry, the semantics for switch and >> instanceof today are entirely reasonable and intuitive, so how could they be so >> wrong? >> And the answer is: we have only been able to use `switch` and `instanceof` so >> far for pretty trivial things! When we add patterns to the language, we're >> raising the expressive ability of these constructs to some power.
And >> extrapolating from our existing intuitions about these is like extrapolating >> the behavior of polynomials from their zeroth-order Taylor expansion. >> (Now, at this point, the split-over-lump crowd says "Then you should define >> new constructs, if they're so much more powerful." But I still claim it is far >> better to refine our intuitions about what switch means, even with some >> discomfort, than to try to keep track of the subtle differences between switch >> and snitch.) >> So, why do we have the current null behavior for `instanceof` and `switch`? >> Well, right now, `instanceof` only lets you ask a very very simple question -- >> "is the dynamic type of the target X". And, the designers judged (reasonably) >> that, since 99.999% of the time, what you're about to do is cast the target and >> then dereference it, saying "no" is less error-prone than saying OK and then >> having the subsequent dereference fail. >> But now, `instanceof` can answer far more sophisticated questions, and that >> 99.999% becomes a complete unknown. With what confidence can you say that the >> body of: >> if (b instanceof Box(var t)) { ... } >> is going to dereference t? If you say more than 50%, you're lying. It would be >> totally reasonable to just take that t and assign it somewhere else, rebox it >> into another box, pass it to some T-consuming method, etc. And who are we to >> say that Box-consuming protocols are somehow "bad" if they like to truck in >> null contents? That's not our business! So the conditions under which "always >> says no" was reasonable for Java 1.0 are no longer applicable. >> The same is true for switch, because of the very limited reference types which >> switch permits (and which were only added in Java 5) -- boxed primitives, >> strings, and enums.
In all of these cases, we are asking very simple questions >> ("are you 3"), and these are domains where nulls have historically been >> denigrated -- so it seemed reasonable for switch to be hostile to them. But >> once we introduce patterns, the set of questions you can ask gets enormously >> larger, and the set of types you can switch over does too. The old conditions >> don't apply. In: >> switch (o) { >> case Box(var t): ... >> case Bag(var t): ... >> } >> we care about the contents, not the wrapping; the switch is there to do the >> unwrapping for us. Who are we to say "sorry, no one should ever be allowed to >> put a null in a Bag?" That's not our business! >> At this point, I suspect Remi says "I'm not saying you can't put a null in a >> Box, but there should be a different way to unpack it." But unless you can say >> with 99.99% certainty that nulls are always errors, it is better to be agnostic >> to nulls in the plumbing and let users filter them at the ultimate point of >> consumption, than to make the plumbing null-hostile and make users jump through >> hoops to get the nulls to flow. The same was true for streams; we made the >> (absolutely correct) choice to let the nulls flow through the stream, and, if >> you are using a maybe-null-containing source, and doing null-incompatible >> things on the elements, it's on you to filter them. It is easier to filter >> nulls than to add back a special encoding for nulls. (And, the result of >> that experiment was pretty conclusive: of the hundreds of Stack Overflow >> questions I have seen on streams, not one centered around unexpected nulls.) >> If we have guards, and you want to express "no Boxes with nulls", that's easy: >> case Box(var t) when t != null: ... >> And again, as with `instanceof`, we have no reason to believe that there's a >> 99.99% chance that the next thing the user is going to do is dereference it.
So >> the justification that null-hostility is the "obvious" semantics here doesn't >> translate to the new, more powerful language feature. >> And it gets worse: the people who really want the nulls now have to do >> additional error-prone work, either use some ad-hoc epicyclical syntax at each >> use site (and, if the deconstruction pattern has five bindings, you have to say >> it five times), or having to duplicate blocks of code to avoid the switch >> anomaly. >> The conclusion of this section is that while the existing null behavior for >> instanceof and switch is justified relative to their _current_ limitations, >> once we remove those limitations, those behaviors are much more arbitrary (and >> kind of mean: "nulls are so bad, that if you are a null-using person, we will >> make it harder for you, 'for your own good'.") >> #### Split the baby? >> Now, there is room to make a reasonable argument that we'd rather keep the >> existing switch behavior, but accept the null-friendly matching behavior. My >> take is that this is a bad trade, but let's look at it more carefully. >> Gain: I don't have to learn a new set of rules about what switch/instanceof do >> with null. >> Loss: code duplication. If I want my fallback to handle nulls, I have to >> duplicate code; instead of >> switch (o) { >> case String s: A >> case Long l: B >> case Object o: C >> } >> I have to do >> if (o == null) { C } >> else switch (o) { >> case String s: A >> case Long l: B >> case Object o: C >> } >> resulting in duplicating C. (We have this problem today, but because of the >> limitations of switch today, it is rarely a problem. When our case labels are >> more powerful, we'll be using switch for more stuff, and it will surely come up >> more often.) >> Loss: refactoring anomaly. Refactoring a nested switch with: >> case P(Q): >> case P(R): >> case P(S): >> to >> case P(var x): >> switch (x) { >> case Q: ... >> case R: ... >> case S: ... >> } >> } >> doesn't work in the obvious way. 
Yes, there's a way to refactor it, and the IDE >> will do it correctly. But it becomes a sharp edge that users will trip over. >> The reason the above refactoring is desirable is because users will reasonably >> assume it works, and rather than cut them with a sharp edge, we can just make >> it work the way they reasonably think it should. >> So, we could make this trade, and it would be more "minimal" -- but I think it >> would result in a less useful switch in the long run. I think we would regret >> it. >> #### Conclusion >> If we were designing pattern matching and switch together from scratch, we would >> never even consider the current nullity behavior; the "wait until someone >> actually dereferences before we throw" is the obvious and only reasonable >> choice. We're being biased based on our existing assumptions about instanceof >> and switch. This is a reasonable starting point, but we have to admit that >> these biases in turn come from the fact that the current interpretations of >> those constructs are dramatically limited compared to supporting patterns. >> It is easy to trot out anecdotes where any of the possible schemes would cause a >> particular user to be confused. But this is just a way to justify our biases. >> The reality is that, as switch and instanceof get more powerful, we don't get >> to make as many assumptions about the likelihood of whether `null` is an error >> or not. And, the more likely it is not an error, the less justification we have >> for giving it special semantics. >> Let the nulls flow. I mostly agree with everything you say, but - I don't think that framing the problem we have in terms of null haters / null friends is productive. - match :: (Pattern t) u -> Maybe t, the railway design pattern as coined by Scott Wlaschin[1], means that __each__ matcher is able to decide what to do with null, so it's a more lax model than the one we are working on.
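The `match :: (Pattern t) u -> Maybe t` model, and the two readings of a type pattern's applicability predicate quoted above, can be sketched roughly in Java. This is only an illustration under stated assumptions: `MatchSketch`, `Match`, `Pattern`, `instanceOf` and `assignableTo` are hypothetical names invented here, not anything proposed in this thread, and a small `Match` record stands in for `Maybe` because `java.util.Optional` cannot carry a null binding.

```java
final class MatchSketch {
    /** Outcome of a match; unlike java.util.Optional, a success may bind null. */
    record Match<T>(boolean matched, T binding) {
        static <T> Match<T> some(T binding) { return new Match<>(true, binding); }
        static <T> Match<T> none() { return new Match<>(false, null); }
    }

    /** Hypothetical stand-in for `match :: (Pattern t) u -> Maybe t`. */
    interface Pattern<T> {
        Match<T> match(Object target);
    }

    /** Reading 1: "would instanceof say you are a T" (never matches null). */
    static <T> Pattern<T> instanceOf(Class<T> type) {
        return target -> type.isInstance(target)
                ? Match.<T>some(type.cast(target))
                : Match.<T>none();
    }

    /** Reading 2: "could you be assigned to a variable of type T" (also matches null). */
    static <T> Pattern<T> assignableTo(Class<T> type) {
        return target -> (target == null || type.isInstance(target))
                ? Match.<T>some(type.cast(target))   // Class.cast(null) returns null
                : Match.<T>none();
    }
}
```

In this sketch each pattern object decides for itself what to do with null, which is the point both sides of the discussion appeal to; the disagreement is over which patterns should take which reading.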
There is a way to see a switch as a cascade of instanceof and to allow null. A switch like this switch (v) { case String s: A case Long l: B case Object o: C } can be seen as if (v instanceof String s) { A } else if (v instanceof Long l) { B } else { var o = (Object) v; // <--- cast here C } i.e. considering the last case not as an instanceof but as a cast, which obviously allows null. This is almost the same semantics as the one you are proposing but instead of the notion of totality, being the last case is enough to accept null. The difference appears in the switch statement, if A and B are unrelated, switch(v) { case A a: (1) case B b: (2) } if v is null, it will execute (2) instead of not executing anything or throwing an NPE. I prefer this semantics, because it's local, the last case allows null, it doesn't depend on the type of v or the relationship between A and B. Now, to bridge with the fact that the current incarnations of switch are null-hostile, as John said in a related mail thread, a switch with only constants or without a type pattern is null-hostile, and if there is a type pattern, a switch is null-friendly. Rémi [1] https://fsharpforfunandprofit.com/rop/ From forax at univ-mlv.fr Tue Aug 11 12:27:50 2020 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 11 Aug 2020 14:27:50 +0200 (CEST) Subject: Next up for patterns: type patterns in switch In-Reply-To: <52BBA718-C08A-4AC9-AF61-0E05F6BC5C7F@oracle.com> References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com> <52BBA718-C08A-4AC9-AF61-0E05F6BC5C7F@oracle.com> Message-ID: <1109639776.140209.1597148870399.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "John Rose" > To: "Brian Goetz" > Cc: "amber-spec-experts" > Sent: Tuesday, 11 August 2020 00:20:42 > Subject: Re: Next up for patterns: type patterns in switch > Letting the nulls flow is a good move in that absorbing > game of "find the primitive".
Here, you are observing forcefully > that instanceof and switch, while important precedents and > sources of use cases and design patterns, are not the primitives. > We are not so lucky that the answer is ?we just need more sugar > for the existing constructs?. But we are not so unlucky that > we must build our new primitives out of alien materials. > > The existing ideas about type matching, and specifically that > `T x = v;` requires that `T` be total over the type of `v`, > are available and useful. > > Also useful is the idea that some constructs are necessarily > null-hostile (starting with dot: `x.f`). In a ?let the nulls > flow design?, if a construct is not *necessarily* null hostile, > it is necessarily *null permissive*. So we hunt for necessary > hostility, and among patterns we find it with destructuring > (`Box(var t)` as opposed to `Box`) and with value testing > patterns like string or numeric literals (if those are patterns, > which I think they should be). We also note that the level > of hostility (from patterns) must be compatible with > generally null-agnostic use cases: This means a pattern > can fail on null (if it *must*) but it *must not throw on > null*, because the next pattern in line might be the one > that matches the null. > > So instanceof and switch turn out to be sugar for certain > uses of patterns. And together they are universal enough > that (luckily) we might not need a new syntax to directly > denote the new primitive, of applying a pattern (partial > or total) and extracting bindings. > > The existing behavior w.r.t. nulls of instanceof and switch > need to be rationalized. I think that is easy, although there > is a little bit to learn. (Just as there?s something to learn > today: They are unconditionally null-rejecting at present.) > An important thing (as Brian points out) is that if you > are choosing to write null-agnostic code, your learning > curve should be gentle-to-none. 
> > Here are the rules the way I see them, in the presence > of primitive patterns which are null-permissive (because > they support null-agnostic use cases): > > `x instanceof P` includes an additional check `x==null` before > it tests the pattern P against x. Rationale: Compatibility. > Also look at the name: `null` is never an *instance* *of* any > type. A pattern might match null, but we are testing whether > `x` is an instance, which is to say, an object. > > Some equations to relate instanceof to the primitive __Matches: > > x instanceof P ≡ x __Matches P && (__PermitsNull(P) ? x != null : true) > > x __Matches P ≡ x instanceof P || (__PermitsNull(P) ? x == null : false) > > Do we need syntax for __Matches P? Probably not, because the > above equations allow workarounds when the instanceof syntax > isn't exactly right. (And it usually *is* exactly right; the trailing > null logic folds away in context, or is harmless in some other way, > as the nulls flow around.) > > What about switch? I like to think that a switch statement is simply > sugar (plus optimizations) for a *decision chain*, an if/else chain which > tests each case in turn (in source code order, of course): > > switch (x) { > case P: p(); break; > case Q: q(); break; > ... > default: d(); } > > ≡ (approximately) > > { var x_ = x; > if (x_ __Matches P) p(); else > if (x_ __Matches Q) q(); else > ... > d(); } > > (Note that this account of classic switch requires extra tweaks > to deal with two embarrassing features: (a) fall through, which > requires some way of contriving transfers between arms of the > decision chain, and (b) the fact that default can go anywhere, > and sometimes is placed in the middle to make use of fall-through. > These are embarrassments, not show-stoppers.) > > So what about nulls? The simple -- I will say naive -- account > of switch is that there is a null check at the head of the switch > near `var x_ = x;`.
This would account for all of switch?s behaviors > as of today, but makes switch hostile to nulls. > > A more nuanced and useful account of switch?s behavior comes > from the following observations: > > 1. All switch cases *today*, if regarded as patterns, are necessarily > null-rejecting. *None of them ever match null.* > > 2. The NPE observed from a switch-on-null, today, might as well > be viewed as arising from the *bottom* of the decision chain, > *after all matches* fail. From that point of view, the fact that > the failure appears to come from the *top* is simply an optimization, > a fast-fail when it is statically provable that there?s no hope ever > matching that pesky null, in any given legacy switch. > > 3. When null meets default, we are painted into a corner, so we > have to enjoy the only remaining option: At least in legacy switches, > the default case is *also* mandated to reject nulls. (So ?default? > turns out to mean ?anything but null?. But that doesn?t parley > into a general anti-null story; sorry null-haters.) This feature > of default can (maybe) be turned into a benefit: Perhaps we > can teach users that by saying ?default? you are *asking* for > an NPE, if a null escapes all the intervening patterns in the > decision chain. I don?t have a strong opinion on that. > > The previous three observations fully account for today?s > legacy switches, with their limited set of patterns. The next > one is also necessary to extend to switch cases which may > support null-friendly patterns: > > 4. We need a rule to allow nulls to flow through switches > until the user is ready to handle them. This means that > null-permissive patterns in *some* switch cases need to > be shielded from null just as with instanceof. > > What is this rule? We?ve already discussed it adequately; > it comes in two parts: > > A. `case null:` is allowed and does the obvious thing. > We might as well require that it always come first. > > B. 
There is a way of issuing a case which accepts nulls, > and that way is a total pattern that is null friendly. > (As Brian points out, this fits with the useful idea > that a null-friendly pattern of the form `T v` or `var v` > works just like the similar declaration.) > > Note that B is less arbitrary than it might seem at first > blush: To avoid dead code, any total pattern in a switch > must come *last*, at the bottom of the decision chain. > (There can be no `default:` after it either, since that would > be dead.) > > So the rules together mean: > > 1. If there is a `case null` at the top, that?s where nulls go. > 2. If there is a total pattern at the bottom, that?s where nulls go. > 3. Non-total patterns don?t catch nulls *in a switch*, just like in instanceof. > 4. If there is neither a `case null` nor a total pattern, the switch throws NPE. I'm proposing B. if the last case is a type pattern, then the switch is null friendly and this last case accept null. I get, 1. If there is a `case null` at the top, that?s where nulls go. 2. If there is a type pattern at the bottom, that?s where nulls go. 4. If there is neither a `case null` nor a type pattern, the switch throws NPE R?mi From brian.goetz at oracle.com Tue Aug 11 13:57:36 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Aug 2020 09:57:36 -0400 Subject: Next up for patterns: type patterns in switch In-Reply-To: <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com> <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> Message-ID: <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> > - i don't think that framing the problem we have in term of null > haters / null friends is productive. It may not be productive, but it is one of the elephants in the room.? 
Some people sometimes take a hostile view of null ("We should prevent stream elements from being null!"), and, while this is often motivated by the best of intentions, it is a bias we need to be aware of. > - match :: (Pattern t) u -> Maybe t, the railway design pattern as > coined by Scott Wlaschin[1], means that __each__ matcher is able to > decide what to do with null, so it's a more lax model than the one we > are working on. Seems like exactly the model we are working on. Some patterns can match null (total type patterns, any patterns, null constant pattern) and some cannot (deconstruction patterns, non-null constant patterns.) Eventually, there will be a way to write patterns in Java code (e.g., deconstruction patterns for arbitrary classes), but we can still make the "null OK / null not OK" decision for these entire categories at once, so humans and compilers can reason about them. > There is a way to see a switch as a cascade of instanceof and to allow > null, Herein lies danger, as both `switch` and `instanceof` have pre-existing, somewhat accidental null opinions. But I agree that it is a good goal that switches and if-else chains of instanceof tests on the same target be refactorable to each other to the extent possible. > This is almost the same semantics as the one you are proposing but > instead of the notion of totality, being the last case is enough to > accept null. ... but what if the switch isn't total? I can have switch (o) { case Integer i: ... case String s: ... // no default, no total pattern } and it would be quite surprising to randomly shove nulls into the last case. This would prevent the cases from being reordered even though they have no dominance ordering. The reason the last case is special in the examples we've given so far is ... wait for it ... THEY ARE TOTAL. It would be a compiler error to have any more cases after them. They are intrinsically catch-alls.
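The reordering concern can be made concrete with a small sketch. It does not use pattern switch at all; it hand-emulates the "last case accepts null" rule with plain `instanceof` tests, so the class and method names (`LastCaseAcceptsNull`, `integerThenString`, `stringThenInteger`) and the string results are assumptions made up for illustration.

```java
final class LastCaseAcceptsNull {
    // Hypothetical emulation of "the last case accepts null" for
    //   switch (o) { case Integer i: ...  case String s: ... }
    static String integerThenString(Object o) {
        if (o instanceof Integer) return "Integer";
        if (o == null || o instanceof String) return "String";   // last case also takes null
        return "no match";
    }

    // The same two unrelated cases, written in the opposite order.
    static String stringThenInteger(Object o) {
        if (o instanceof String) return "String";
        if (o == null || o instanceof Integer) return "Integer"; // last case also takes null
        return "no match";
    }
}
```

For non-null arguments the two methods agree, but `integerThenString(null)` lands in the String arm while `stringThenInteger(null)` lands in the Integer arm, even though neither case dominates the other.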
> I prefer this semantics, because it's local, the last case allows > null, it doesn't depends on the type of v or the relationship between > A and B. This proposal seems entirely motivated by "let's have a really simple rule", rather than based on principles of what actually should happen in real programs or what programs should be expressible.? And, regardless of motivation, it is surely the wrong answer. From forax at univ-mlv.fr Tue Aug 11 15:42:56 2020 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 11 Aug 2020 17:42:56 +0200 (CEST) Subject: Next up for patterns: type patterns in switch In-Reply-To: <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com> <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> Message-ID: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" , "John Rose" > Cc: "amber-spec-experts" > Envoy?: Mardi 11 Ao?t 2020 15:57:36 > Objet: Re: Next up for patterns: type patterns in switch >> - i don't think that framing the problem we have in term of null >> haters / null friends is productive. > > It may not be productive, but it is one of the elephants in the room. > Some people sometimes take a hostile view of null ("We should prevent > stream elements from being null!"), and, while this is often motivated > by the best of intentions, it is a bias we need to be aware of. > As far as i know, nobody in the EG has proposed that. > >> There is a way to see a switch as a cascade of instanceof and to allow >> null, > > Herein lies danger, as both `switch` and `instanceof` have pre-existing, > somewhat accidental null opinions.? But I agree that it is a good goal > that switches and if-else chains of instanceof tests on the same target > be refactorable to each other to the extent possible. 
> >> This is almost the same semantics as the one you are proposing but >> instead of the notion of totality, being the last case is enough to >> accept null. > > ... but what if the switch isn't total? I can have > > switch (o) { > case Integer i: ... > case String s: ... > // no default, no total pattern > } > > and it would be quite surprising to randomly shove nulls into the last case. It's not random, it's based on the fact that all cases but the last one are instanceof and the last one is a cast (see below). > This would prevent the cases from being reordered even though > they have no dominance ordering. The reason the last case is special in > the examples we've given so far is ... wait for it ... THEY ARE TOTAL. > It would be a compiler error to have any more cases after them. They > are intrinsically catch-alls. That's why I first proposed to make the difference syntactic, using the syntax case String|null, but I'm sure someone can come up with a better syntax. Again, I understand why you think that a semantics based on the pattern being total or not makes sense, but you are only talking about the positive side and not the negative one. The main drawback: being total is not a __local__ property. So this is far worse than reorganizing two cases in the switch, because there, as a user, you have at least done something. A pattern can be total or not depending on whether someone changes the return type of the method you are switching on (this method can be in another module) or changes the class hierarchy (again, the hierarchy can be defined in another module); the code stays exactly the same but the semantics is changed. Adding an interface to a class should be harmless, but if this class is a case in a switch, this changes the semantics of the switch. What could be worse as a property? (Again, C# doesn't have this issue because it uses "var", so the semantics is stable even if the type switched on changes.)
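A rough illustration of the non-locality being described, under two assumptions: it uses record patterns as they later shipped in Java 21 (which postdates this thread), and it shows only the nested case, where a type pattern that covers the declared component type also accepts a null component. The records `TextBox` and `AnyBox` and the class name are hypothetical.

```java
final class NonLocalTotality {
    // Two hypothetical declarations that differ only in the component type,
    // standing in for "someone changed a type in another module".
    record TextBox(CharSequence value) {}
    record AnyBox(Object value) {}

    // `CharSequence cs` covers every value of the component type, so a null
    // component matches.
    static boolean matchesText(TextBox b) {
        return b instanceof TextBox(CharSequence cs);
    }

    // The nested pattern text is identical, but it no longer covers the
    // component type (Object), so a null component does not match.
    static boolean matchesAny(AnyBox b) {
        return b instanceof AnyBox(CharSequence cs);
    }
}
```

With these declarations, `matchesText(new TextBox(null))` succeeds while `matchesAny(new AnyBox(null))` does not, although the nested pattern `CharSequence cs` is written the same way in both methods.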
> >> I prefer this semantics, because it's local, the last case allows >> null, it doesn't depend on the type of v or the relationship between >> A and B. > > This proposal seems entirely motivated by "let's have a really simple > rule", rather than based on principles of what actually should happen in > real programs or what programs should be expressible. And, regardless > of motivation, it is surely the wrong answer. You cannot say in the same mail that it's a good goal "that switches and if-else chains of instanceof tests on the same target be refactorable to each other to the extent possible." and end with "This proposal seems entirely motivated by ..., rather than based on principles of what actually should happen in real programs". The semantics I'm proposing is based on real code: A switch like this switch (v) { case String s: A case Long l: B case Object o: C } can be seen as if (v instanceof String s) { A } else if (v instanceof Long l) { B } else { var o = (Object) v; // <--- cast here C } It's a cut and paste from the mail you're answering. Rémi From alex.buckley at oracle.com Tue Aug 11 16:00:53 2020 From: alex.buckley at oracle.com (Alex Buckley) Date: Tue, 11 Aug 2020 09:00:53 -0700 Subject: [records] Mark generated toString/equals/hashCode in class files somehow? In-Reply-To: References: Message-ID: If the mandated status of a class/member was to be reified in the class file, then you would need Core Reflection and Language Model APIs to expose that status, along the lines of isSynthetic. Alex On 8/10/2020 8:26 PM, Tagir Valeev wrote: > Thank you, Alex! > > I created an issue to track this: > https://bugs.openjdk.java.net/browse/JDK-8251375 > I'm not sure about the component. I set 'javac', though it also > touches JLS and JVMS; hopefully, no actual changes in HotSpot are > necessary as HotSpot can happily ignore this flag. > I tried to formulate proposed changes in JLS and JVMS right in the > issue.
For now, I omit fields from the discussion because supporting > fields isn't critical for my applications. > But it's also ok to extend this to fields. > > What else can I do to move this forward? > > With best regards, > Tagir Valeev. From brian.goetz at oracle.com Tue Aug 11 16:36:18 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 11 Aug 2020 12:36:18 -0400 Subject: Next up for patterns: type patterns in switch In-Reply-To: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> References: <836d1fdb-0e7f-50a5-e6e6-305bba2e0f81@oracle.com> <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> Message-ID: <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> > The main drawback, being total is not a __local__ property. This is true, and we've talked about this from the beginning, but I also think that the fear of nonlocality here is pretty overblown. Yes, there are puzzlers, but they are mostly of the kind where `var` and "diamond" interact -- and the solution to those puzzlers is "add back some explicitness, so it's clear what is going on."? The solution is not "let's cripple var so it doesn't interact with diamond", or "let's outlaw the interaction of var and diamond" or "let's not do var." We proposed using totality not because its "clever", or because it makes a certain pesky problem go away, but because it is the most _natural_ way to unify the semantics of all the ways patterns are used, and this becomes obvious once you get to examples that actually use nested patterns.? I think much of the reaction to this aspect is based on holding on to the existing model of switch/instanceof, which are both very limited right now.? 
Once you see these as more general deconstruction operations, the role of totality immediately comes to the fore, because of the expanded range of problems we will see with conditional destructuring.? So I think there's a lot of "judging the new feature by the use cases of the old feature."? We may be reusing the syntax of instanceof and switch (for good reasons), but these are not the instanceof and switch you learned about in CS 101.? Lets not hobble the new feature because the old version was insufficiently ambitious. > So this is far worst that re-organizing two cases in the switch because as a user you have done something. A pattern can be total or not depending if someone change the return type of the method you are switching on (this methods can be in another module) or change the class hierarchy (again, the hierarchy can be defined in another module), the code stay exactly the same but the semantics is changed. Sorry, but I think this is mostly FUD.? There are a zillion ways to leave information implicit: var, diamond, generic invocation, method chaining, method references (which pick an overloading), implicit lambdas, etc.? And there are a zillion ways to construct puzzlers from code that does something implicit, where the answer would be obvious if we wrote out the fully explicit representation.? And the same thing always happens when a new feature wants to join this club: "There exists a possible way someone could get confused by combining these ten features, so we must blame the last feature that came in the door!" So, coming back to the main goal, our job here is first to find the _natural_ semantics of pattern? matching.? And every path I've gone down tells me that totality is part of the natural semantics.? And I've not seen any credible arguments that suggest it is not -- just fear that, through interactions with other features, it might be too new, too complicated, too confusing, etc.? But this happens with every new feature!? 
We're used to (and have come to terms with) the OLD features interacting in possibly puzzling ways (e.g., var+diamond), but when a corner of a corner of a NEW feature threatens to do so, people declare "this entire new feature is broken."

More importantly, we want to have the right _global_ story. Proposing localized hacks to deal with the aspects you don't like is highly unlikely to bring us to the right global story.

From forax at univ-mlv.fr Tue Aug 11 20:01:04 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 11 Aug 2020 22:01:04 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
Message-ID: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "John Rose", "amber-spec-experts"
> Sent: Tuesday, 11 August 2020 18:36:18
> Subject: Re: Next up for patterns: type patterns in switch

>> The main drawback, being total is not a __local__ property.
>
> This is true, and we've talked about this from the beginning, but I also
> think that the fear of nonlocality here is pretty overblown. Yes, there
> are puzzlers, but they are mostly of the kind where `var` and "diamond"
> interact -- and the solution to those puzzlers is "add back some
> explicitness, so it's clear what is going on." The solution is not
> "let's cripple var so it doesn't interact with diamond", or "let's
> outlaw the interaction of var and diamond" or "let's not do var."
>
> We proposed using totality not because it's "clever", or because it makes
> a certain pesky problem go away, but because it is the most _natural_
> way to unify the semantics of all the ways patterns are used, and this
> becomes obvious once you get to examples that actually use nested
> patterns. I think much of the reaction to this aspect is based on
> holding on to the existing model of switch/instanceof, which are both
> very limited right now. Once you see these as more general
> deconstruction operations, the role of totality immediately comes to the
> fore, because of the expanded range of problems we will see with
> conditional destructuring. So I think there's a lot of "judging the new
> feature by the use cases of the old feature." We may be reusing the
> syntax of instanceof and switch (for good reasons), but these are not
> the instanceof and switch you learned about in CS 101. Let's not hobble
> the new feature because the old version was insufficiently ambitious.
>
>> So this is far worse than re-organizing two cases in the switch because as a
>> user you have done something. A pattern can be total or not depending on whether
>> someone changes the return type of the method you are switching on (this method
>> can be in another module) or changes the class hierarchy (again, the hierarchy
>> can be defined in another module), the code stays exactly the same but the
>> semantics is changed.
>
> Sorry, but I think this is mostly FUD. There are a zillion ways to
> leave information implicit: var, diamond, generic invocation, method
> chaining, method references (which pick an overloading), implicit
> lambdas, etc. And there are a zillion ways to construct puzzlers from
> code that does something implicit, where the answer would be obvious if
> we wrote out the fully explicit representation.
> And the same thing
> always happens when a new feature wants to join this club: "There exists
> a possible way someone could get confused by combining these ten
> features, so we must blame the last feature that came in the door!"

It's not FUD, and worse, you know that there is a solution to avoid the problems of non locality of requiring the last case to be total.

This is how i see the future if we introduce the current proposed semantics.
Someone will ask on StackOverflow why a particular switch doesn't accept null (hint: the last case is not total even if it looks like it is).
Someone will answer that instead of trying to know if the last case is total or not, they should use "case var foo" because it's always total (quoting this email to get an Inception-like taste).
The IDEs will start to have a hint to refactor the "case TotalType foo" to "case var foo" because it solves the non locality problems, making the code more future proof.
In 5 years from now, most devs that are using pattern matching will write the code as in C#.

So i propose to disallow that a case with a total pattern can use an explicit type with a nice message saying to use "case var" instead.
This will effectively fix the non locality issues of the current proposal.

>
> So, coming back to the main goal, our job here is first to find the
> _natural_ semantics of pattern matching. And every path I've gone down
> tells me that totality is part of the natural semantics. And I've not
> seen any credible arguments that suggest it is not -- just fear that,
> through interactions with other features, it might be too new, too
> complicated, too confusing, etc. But this happens with every new
> feature! We're used to (and have come to terms with) the OLD features
> interacting in possibly puzzling ways (e.g., var+diamond), but when a
> corner of a corner of a NEW feature threatens to do so, people declare
> "this entire new feature is broken."
yes, we are spending a lot of time on a corner of a corner case, and i agree that totality is part of the natural semantics,
but it doesn't mean we have to force users to think in terms of totality.

>
> More importantly, we want to have the right _global_ story. Proposing
> localized hacks to deal with the aspects you don't like is highly
> unlikely to bring us to the right global story.

given we are talking about a corner case, it should play only a small part of the global story anyway.

regards,
Rémi

From brian.goetz at oracle.com Tue Aug 11 20:22:32 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 11 Aug 2020 16:22:32 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
Message-ID:

> It's not FUD, and worse, you know that there is a solution to avoid the problems of non locality of requiring the last case to be total.

It's not FUD in the sense that it is something to be aware of and evaluate. But it is FUD in the sense that you are presenting these examples as being SO CATASTROPHICALLY BAD that you are ready to pull the emergency brake cord on the path we've been converging on for three years, as if some fatal flaw was just discovered last week.

So, let's please dial down the volume, because nothing is remotely as bad as is being claimed. And the local tweaks suggested so far are neither local nor grounded in principle nor an improvement.

> This is how i see the future if we introduce the current proposed semantics.

It could play out this way, but this feels largely like speculation.
Speculation is of course part of our analysis toolbox, but let's not give it so much credit that we're willing to redesign the language over it.

> So i propose to disallow that a case with a total pattern can use an explicit type with a nice message saying to use "case var" instead.

OK, that's a new "solution" (which surely leads to its own problems), but can we please agree on an underlying problem first? (I think the problem you are alluding to is that it is too hard to tell whether a case is total?) And can we please stop proposing "patches" until we are in full agreement on the problem, since it doesn't seem to be working.

> yes, we are spending a lot of time on a corner of a corner case, and i agree that totality is part of the natural semantics,
> but it doesn't mean we have to force users to think in terms of totality.

That's part of the problem -- I don't remotely think we are forcing users to think in these terms! I think 99.999% of the time, the obvious code will do the obvious thing, without forcing the users into fully internalizing it.

From forax at univ-mlv.fr Tue Aug 11 21:29:57 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 11 Aug 2020 23:29:57 +0200 (CEST)
Subject: On the last case being explicitly total
In-Reply-To:
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
Message-ID: <1054488882.230933.1597181397649.JavaMail.zimbra@u-pem.fr>

The problem: if there is no explicit "case null", the spec relies on the last case being total or not to accept null.

Relying only on the last case being total or not is a bad idea, because
1/ the syntax for a total case and a non total case are exactly the same while the semantics is not.
2/ knowing if something is total for a human is not obvious
3/ being total is not a local property.

The proposed solution is necessarily "just a fix" because it's the only part of the spec that bugs me.

So i've unsuccessfully proposed several ways to make explicit that the case is total.
First, using a special syntax to declare that a case is total, "case var|null" or "case any", or whatever syntax you find cool. Then i've proposed that the last case is always total because it's always like a cast and not like an instanceof.

I'm now proposing to disallow the use of an explicit type in a total pattern and to ask users to use "case var" instead which is a restriction on the current proposed semantics.
Using "case var" makes it obvious that the pattern is total, it's syntactically different from the other case and it is a local property because the type is inferred as the same as the type switched upon (so even if the type switched upon changes, the case stays total).

Rémi

From brian.goetz at oracle.com Tue Aug 11 22:12:56 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 11 Aug 2020 18:12:56 -0400
Subject: On the last case being explicitly total
In-Reply-To: <1054488882.230933.1597181397649.JavaMail.zimbra@u-pem.fr>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <1054488882.230933.1597181397649.JavaMail.zimbra@u-pem.fr>
Message-ID: <8160e6a1-e15c-75a9-282d-f08449661e2d@oracle.com>

Thanks for clarifying that your sole concern here is that it seems too hard to tell, visually, whether the last case is total or not.

First, let me try to explain why this doesn't bother me so much. Here's one example, but there are more.
In a lot of code today that uses switches, nulls never get near a switch anyway -- otherwise we'd have many many more complaints about the NPEs that switches would be throwing. But we don't, meaning the current uses of switch never see null in the first place.

But tomorrow, switches will get stronger. People will want to replace chains of if-then-else like:

    if (x instanceof Fork) { A }
    else if (x instanceof Knife) { B }
    else { C }

with type switches:

    switch (x) {
        case Fork f: A
        case Knife k: B
        case Object o: C
    }

And guess what, the two have exactly the same semantics, down to the nullity behavior! The old version dumped the nulls (if there were any) into the else clause, and the new version dumps them into the Object clause. But if we keep the preemptive null-hostility of switch, then the above wouldn't work (though you could use `default` instead of Object.) So what I think you object to is not relaxing the null-hostility, but relaxing it when there's no flashing red light that says "warning, total switch, new null semantics." (I worry this is just Stroustrup's rule over again, though.)

The day after that, switches will get stronger again -- and be able to handle deconstruction and nested patterns. And this is where it really pays off. (But, if you don't like the totality-seeking behavior, any "fix" would have to cover the nested total pattern too.)

In general, I think total patterns _will_ look total, because they are intrinsically catch-alls. I think we worry it will not because we haven't been programming at all with catch-all switch cases, because switch has historically been too weak to do this. But once we get used to the patterns that naturally arise, I think we'll be happy and it won't be scary at all.

What about sealed types? If we have a type sealed to A/B/C, then a switch with

    case A:
    case B:
    case C:

is total but NPEs on null.
That's probably OK, though, since it's the same thing a switch does today on enums.

In any case, I'm going to take an action item to write up the whole refactoring catalog -- just to make sure we are comfortable with any asymmetries that arise.

OK, now back to your story.

I rejected the "any X" approach because I think it is too error-prone -- it flips the default on what should be the catch-all case. The natural thing is for

    case Box(var x):

to match all boxes. Having to remember to specify a different ad-hoc syntax everywhere you think there might be a null is asking too much of users, they'll forget all the time and I don't think they'll thank us. It feels like the same "wrong default" as field accessibility and mutability today, but worse.

I rejected the "last case gets null" idea because we don't always want the last case to be a catch all! And also, because we should be free to reorder type / total deconstruction patterns for unrelated types, but this would prevent that. We want catch-all cases to be total, but not all switches have a catch-all, and we don't want to create subtle reordering anomalies that no one will ever remember.

Now you're suggesting that a total pattern must be "case var". Unfortunately that's not great either, because think about:

    switch (multiBox) {
        case MultiBox(String a, String b, String c): ...
        case MultiBox(Integer a, Integer b, Integer c): ...
        case var x: // are you kidding me? I can't destructure here?
    }

Here, in the catch-all clause, we KNOW it is a multi-box; why can't we use a deconstruction pattern? Isn't that what patterns are for?

Maybe you will be OK if the pattern has all nested var clauses:

    case MultiBox(var x, var y, var z): ....

but my read is that you would still say that doesn't look "total" enough?

I think this is still coming at it from the wrong angle.
If the problem is that it's not obvious that a switch is total (note that this is not a problem with expression switches -- they are always total, well, almost always), maybe you're really asking for something like (as we've discussed)

    total-switch (o) {
        // compiler, please type check that I cover the target type
    }

or, with a smaller hammer:

    switch (multiBox) {
        case MultiBox(String a, String b, String c): ...
        case MultiBox(Integer a, Integer b, Integer c): ...
        final case MultiBox(Object a, Object b, Object c): ...
    }

where `final case` would mean "this covers it all, it's an error if it doesn't." And then you are asking that any total non-expression switch without a `final case` (unless the last case is `var x`?) give an error. That seems pretty fussy, but I might be talked into _allowing_ you to say `final case` (or `finally`) as a way to force the totality type checking, and make the totality clear.

But, if that's what we're talking about, I'd prefer to keep that on the shelf as an option, rather than preemptively plunk for it now. We can always add it later compatibly, and I'm still not convinced this is remotely as big a problem as you think it is. We knew back from the switch expression days that we might want to come back for a "check me for exhaustiveness please" option, and we still might.

> The problem: if there is no explicit "case null", the spec relies on the last case being total or not to accept null.
>
> Relying only on the last case being total or not is a bad idea, because
> 1/ the syntax for a total case and a non total case are exactly the same while the semantics is not.
> 2/ knowing if something is total for a human is not obvious
> 3/ being total is not a local property.
>
> The proposed solution is necessarily "just a fix" because it's the only part of the spec that bugs me.
>
> So i've unsuccessfully proposed several ways to make explicit that the case is total.
> First, using a special syntax to declare that a case is total, "case var|null" or "case any", or whatever syntax you find cool. Then i've proposed that the last case is always total because it's always like a cast and not like an instanceof.
>
> I'm now proposing to disallow the use of an explicit type in a total pattern and to ask users to use "case var" instead which is a restriction on the current proposed semantics.
> Using "case var" makes it obvious that the pattern is total, it's syntactically different from the other case and it is a local property because the type is inferred as the same as the type switched upon (so even if the type switched upon changes, the case stays total).
>
> Rémi
>
>
>

From guy.steele at oracle.com Tue Aug 11 22:13:28 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 11 Aug 2020 18:13:28 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
Message-ID: <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>

> On Aug 11, 2020, at 4:01 PM, forax at univ-mlv.fr wrote:
> . . .
>
> So i propose to disallow that a case with a total pattern can use an explicit type with a nice message saying to use "case var" instead.
> This will effectively fix the non locality issues of the current proposal.

That specific proposal makes me slightly uneasy, if only because it forces the programmer to remove from the program some type information that may possibly be useful (to the human reader, if not also the compiler).

On the other hand, Remi's proposal that

> From: forax at univ-mlv.fr
> Date: August 11, 2020 at 11:42:56 AM EDT
> . . .
> A switch like this
>   switch (v) {
>     case String s: A
>     case Long l: B
>     case Object o: C
>   }
> can be seen as
>   if (v instanceof String s) {
>     A
>   } else if (v instanceof Long l) {
>     B
>   } else {
>     var o = (Object) v; // <--- cast here
>     C
>   }

bothers me for a different reason: it seems quite strange and arbitrary that syntactically identical constructs (in this situation, case labels) should mean two very different things depending on which one happens to be the last one in a switch.

On the other hand, I think Remi's point about totality being an implicit and non-local property that is easily undermined by code changes in another compilation unit is worrisome.

I think Brian's arguments about the nature of pattern matching are very well thought out, and I agree that it is better for decisions about the handling of null to lie in the hands of the programmer, or in the design of constructs such as switch and instanceof, rather than be baked into the definition of pattern matching itself.

Putting this all together, I reach two conclusions:

(1) We can live with the current definition of instanceof, provided we make it clear that instanceof is not purely equivalent to pattern matching, and that instanceof and pattern matching can be simply defined in terms of each other.

(2) We have a real disagreement about switch, but I think the fault lies in the design of switch rather than with pattern matching, and the fault is this:

Sometimes when we write

    switch (v) {
        case Type1 x: A
        case Type2 y: B
        case Type3 z: C
    }

we mean for Type1 and Type2 and Type3 to be three disparate and co-equal things -- in which case it seems absurd for any of them to match null; but other times we mean for Type3 to be a catchall, in which case we do want it to match null if nothing before it has.

If I understand correctly, Brian suggests that the compiler decide which to do by analyzing whether Type3 is total; but I, the programmer, have no way to put in my two cents'
worth about whether I think Type3 is total, so I (and Remi) worry about whether the compiler and I will be on the same page regarding this question, especially if it is easy for the programmer to be mistaken (yes! it is always easy for the programmer to be mistaken!).

I believe some previous discussion has focused on ways to modify the _pattern_ to indicate either an expectation of totality or a specific way of handling null. But at this point I think augmenting patterns is overkill; what we need (and all we need) is a modification to the syntax of switch to indicate an expectation of totality. I have a modest suggestion:

    switch (v) {
        case Type1 x: A
        case Type2 y: B
        case Type3 z: C    // Type3 is not expected to be a catchall
    }

    switch (v) {
        case Type1 x: A
        case Type2 y: B
        default case Type3 z: C    // Type3 is expected to be a catchall; it is a static error if Type3 is not total on v,
                                   // and Type3 will match null (unlike Type1 and Type2)
    }

Now, I will admit that this syntax is a wee bit delicate, because adding a colon might apparently change the meaning:

    switch (v) {
        case Type1 x: A
        case Type2 y: B
        default: case Type3 z: C
    }

but I believe that in situations that matter, the compiler can and will reject this last example on other grounds (please correct me if I am mistaken).

--Guy

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guy.steele at oracle.com Tue Aug 11 22:18:53 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 11 Aug 2020 18:18:53 -0400
Subject: On the last case being explicitly total
In-Reply-To: <8160e6a1-e15c-75a9-282d-f08449661e2d@oracle.com>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <1054488882.230933.1597181397649.JavaMail.zimbra@u-pem.fr> <8160e6a1-e15c-75a9-282d-f08449661e2d@oracle.com>
Message-ID:

> On Aug 11, 2020, at 6:12 PM, Brian Goetz wrote:
> . . .
>
> or, with a smaller hammer:
>
>     switch (multiBox) {
>         case MultiBox(String a, String b, String c): ...
>         case MultiBox(Integer a, Integer b, Integer c): ...
>         final case MultiBox(Object a, Object b, Object c): ...
>     }
>
> where `final case` would mean "this covers it all, it's an error if it doesn't."

Looks like we independently came upon the same thing literally within seconds of each other. :-)

Which of course is not evidence either for or against its being a good idea.
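
For readers following the Fork/Knife refactoring discussed earlier in this thread, here is a minimal sketch of the if-else side of it. The names `NullInElseDemo`, `Fork`, `Knife`, and `classify`, and the strings standing in for the thread's A/B/C placeholders, are hypothetical. It relies only on the fact that `instanceof` is false for null, so a null argument reaches the final else branch; it does not demonstrate the switch semantics proposed here, which were still under discussion.

```java
class NullInElseDemo {
    // Hypothetical types named after the example in the thread.
    static class Fork {}
    static class Knife {}

    // instanceof evaluates to false for null, so null skips both
    // tests and falls into the final else branch.
    static String classify(Object x) {
        if (x instanceof Fork) {
            return "A";
        } else if (x instanceof Knife) {
            return "B";
        } else {
            return "C";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new Fork()));   // prints "A"
        System.out.println(classify(new Knife()));  // prints "B"
        System.out.println(classify(null));         // prints "C"
    }
}
```

This is the behavior a catch-all `case Object o` would need to reproduce for the refactoring to preserve meaning, which is the motivation given in the thread for letting a total last case accept null.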
From guy.steele at oracle.com Tue Aug 11 22:22:09 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 11 Aug 2020 18:22:09 -0400
Subject: On the last case being explicitly total
In-Reply-To: <8160e6a1-e15c-75a9-282d-f08449661e2d@oracle.com>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <1054488882.230933.1597181397649.JavaMail.zimbra@u-pem.fr> <8160e6a1-e15c-75a9-282d-f08449661e2d@oracle.com>
Message-ID: <2E0694F2-950C-4696-9C0A-F137FE32EE99@oracle.com>

> On Aug 11, 2020, at 6:12 PM, Brian Goetz wrote:
>
> . . . I might be talked into _allowing_ you to say `final case` (or `finally`) as a way to force the totality type checking, and make the totality clear.
>
> But, if that's what we're talking about, I'd prefer to keep that on the shelf as an option, rather than preemptively plunk for it now. We can always add it later compatibly, and I'm still not convinced this is remotely as big a problem as you think it is. We knew back from the switch expression days that we might want to come back for a "check me for exhaustiveness please" option, and we still might.

I agree that the important thing for now is to have some kind of plan, but it need not be implemented right away.
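
The thread compares a switch over a sealed type that names every subtype with what a switch already does on enums: it covers all the cases yet still throws on null. A minimal sketch of the existing enum behavior follows; the names `EnumSwitchNullDemo`, `Color`, `describe`, and `throwsOnNull` are hypothetical, and the example shows only the long-standing enum switch, not any pattern-matching feature.

```java
class EnumSwitchNullDemo {
    enum Color { RED, GREEN }

    // A switch statement on an enum evaluates its selector first;
    // if the selector is null, a NullPointerException is thrown
    // before any case label (including default) is considered.
    static String describe(Color c) {
        switch (c) {
            case RED:
                return "red";
            case GREEN:
                return "green";
            default:
                return "other";
        }
    }

    // Helper that reports whether describe(null) throws an NPE.
    static boolean throwsOnNull() {
        try {
            describe(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED));  // prints "red"
        System.out.println(throwsOnNull());       // prints "true"
    }
}
```

Note that even the `default` label does not catch null here, which is why the thread treats null handling as a property of switch rather than of the individual case labels.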
From brian.goetz at oracle.com Tue Aug 11 22:26:42 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 11 Aug 2020 18:26:42 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
Message-ID:

> On the other hand, I think Remi's point about totality being an
> implicit and non-local property that is easily undermined by code
> changes in another compilation unit is worrisome.

... which in turn derives from something else worrisome (a problem we bought last year): that it is not clear from looking at a switch whether it is exhaustive or not. Expression switches must be exhaustive, but statement switches need not be. Here, we are saying that exhaustive switch statements are a useful new thing (which they are) and get rewarded with new behaviors (some may not find it a reward), but you have to look too closely to determine whether the switch is total. If it is, the last clause is total too (well, unless it is an enum switch that names all the constants, or a switch over a sealed type that names all the sub-types but without a catch-all.)

So I claim that, if there is a problem, it is that it should be more obvious that a switch is exhaustive on its target.

> Putting this all together, I reach two conclusions:
>
> (1) We can live with the current definition of instanceof, provided we
> make it clear that instanceof is not purely equivalent to pattern
> matching, and that instanceof and pattern matching can be simply
> defined in terms of each other.

I think we can do slightly better than this. I argue that
    x instanceof null

is silly because we can just say

    x == null

instead (which is more direct), and similarly

    x instanceof var y
    x instanceof Object o

are silly because we can just say

    var y = x

instead (again more direct). So let's just ban the nullable patterns in instanceof, and no one will ever notice.

> (2) We have a real disagreement about switch, but I think the fault
> lies in the design of switch rather than with pattern matching, and
> the fault is this:
>
> Sometimes when we write
>
>     switch (v) {
>         case Type1 x: A
>         case Type2 y: B
>         case Type3 z: C
>     }
>
> we mean for Type1 and Type2 and Type3 to be three disparate and
> co-equal things -- in which case it seems absurd for any of them to match
> null; but other times we mean for Type3 to be a catchall, in which
> case we do want it to match null if nothing before it has.

Agreed. The fundamental concern that Remi (and Stephen, over on a-dev) have raised is that we can't tell which it is, and that is disturbing. (I still think it won't matter in reality, but I understand the concern.) The same ambiguity happens with deconstruction patterns:

    case Type3(var x, var y, var z) t3: ...

which we can think of as "enhanced" type patterns.

(encouraging that our mails crossed with mostly the same observation and possible fix.)

> I believe some previous discussion has focused on ways to modify the
> _pattern_ to indicate either an expectation of totality or a specific
> way of handling null. But at this point I think augmenting patterns
> is overkill; what we need (and all we need) is a modification to the
> syntax of switch to indicate an expectation of totality. I have a
> modest suggestion:
>
>     switch (v) {
>         case Type1 x: A
>         case Type2 y: B
>         case Type3 z: C    // Type3 is not expected to be a catchall
>     }
>
>     switch (v) {
>         case Type1 x: A
>         case Type2 y: B
>         default case Type3 z: C
>         // Type3 is expected to be a catchall; it
>         // is a static error if Type3 is not total on v,
>         // and Type3 will match null (unlike Type1 and Type2)
>     }

And we already had another reason to want something like this: expression switches are exhaustive, statement switches are not, and we'd like to be able to engage the compiler to do exhaustiveness checking for statement switches even in the absence of patterns.

> Now, I will admit that this syntax is a wee bit delicate, because
> adding a colon might apparently change the meaning:
>
>     switch (v) {
>         case Type1 x: A
>         case Type2 y: B
>         default: case Type3 z: C
>     }

Or `final case` or `finally` or `default-case` or ...

I am iffy about `default` because of its historical association, but I will have to re-think it in light of this idea before I have an opinion.

> but I believe that in situations that matter, the compiler can and
> will reject this last example on other grounds (please correct me if I
> am mistaken).

yes, the compiler can catch this.

The other degree of freedom on this mini-feature is whether `default` is a hint, or whether it would be an error to not say `default` on a total pattern. I think it might be seen as a burden if it were required, but Remi might think it not strong enough if it's just a hint.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guy.steele at oracle.com Tue Aug 11 22:37:20 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 11 Aug 2020 18:37:20 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com> <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr> <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com> <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
Message-ID:

> On Aug 11, 2020, at 6:26 PM, Brian Goetz wrote:
>
>
>> On the other hand, I think Remi's point about totality being an implicit and non-local property that is easily undermined by code changes in another compilation unit is worrisome.
>
> ... which in turn derives from something else worrisome (a problem we bought last year): that it is not clear from looking at a switch whether it is exhaustive or not. Expression switches must be exhaustive, but statement switches need not be. Here, we are saying that exhaustive switch statements are a useful new thing (which they are) and get rewarded with new behaviors (some may not find it a reward), but you have to look too closely to determine whether the switch is total. If it is, the last clause is total too (well, unless it is an enum switch that names all the constants, or a switch over a sealed type that names all the sub-types but without a catch-all.)
>
> So I claim that, if there is a problem, it is that it should be more obvious that a switch is exhaustive on its target.
>
>> Putting this all together, I reach two conclusions:
>>
>> (1) We can live with the current definition of instanceof, provided we make it clear that instanceof is not purely equivalent to pattern matching, and that instanceof and pattern matching can be simply defined in terms of each other.
>
> I think we can do slightly better than this.
> I argue that
>
>     x instanceof null
>
> is silly because we can just say
>
>     x == null
>
> instead (which is more direct), and similarly
>
>     x instanceof var y
>     x instanceof Object o
>
> are silly because we can just say
>
>     var y = x
>
> instead (again more direct). So let's just ban the nullable patterns in instanceof, and no one will ever notice.
>
>> (2) We have a real disagreement about switch, but I think the fault lies in the design of switch rather than with pattern matching, and the fault is this:
>>
>> Sometimes when we write
>>
>>     switch (v) {
>>         case Type1 x: A
>>         case Type2 y: B
>>         case Type3 z: C
>>     }
>>
>> we mean for Type1 and Type2 and Type3 to be three disparate and co-equal things, in which case it seems absurd for any of them to match null; but other times we mean for Type3 to be a catchall, in which case we do want it to match null if nothing before it has.
>
> Agreed. The fundamental concern that Remi (and Stephen, over on a-dev) have raised is that we can't tell which it is, and that is disturbing. (I still think it won't matter in reality, but I understand the concern.) The same ambiguity happens with deconstruction patterns:
>
>     case Type3(var x, var y, var z) t3: ...
>
> which we can think of as "enhanced" type patterns.

Sure.

> (encouraging that our mails crossed with mostly the same observation and possible fix.)
>
>> I believe some previous discussion has focused on ways to modify the _pattern_ to indicate either an expectation of totality or a specific way of handling null. But at this point I think augmenting patterns is overkill; what we need (and all we need) is a modification to the syntax of switch to indicate an expectation of totality.
>> I have a modest suggestion:
>>
>>     switch (v) {
>>         case Type1 x: A
>>         case Type2 y: B
>>         case Type3 z: C    // Type3 is not expected to be a catchall
>>     }
>>
>>     switch (v) {
>>         case Type1 x: A
>>         case Type2 y: B
>>         default case Type3 z: C    // Type3 is expected to be a catchall; it is a static error if Type3 is not total on v,
>>                                    // and Type3 will match null (unlike Type1 and Type2)
>>     }
>
> And we already had another reason to want something like this: expression switches are exhaustive, statement switches are not, and we'd like to be able to engage the compiler to do exhaustiveness checking for statement switches even in the absence of patterns.
>
>> Now, I will admit that this syntax is a wee bit delicate, because adding a colon might apparently change the meaning:
>>
>>     switch (v) {
>>         case Type1 x: A
>>         case Type2 y: B
>>         default: case Type3 z: C
>>     }
>
> Or `final case` or `finally` or `default-case` or ...
>
> I am iffy about `default` because of its historical association, but I will have to re-think it in light of this idea before I have an opinion.

I don't care about the syntax very much. I thought of "default" because it sort of communicates the right idea and is already a keyword: it says that the last clause is BOTH a case (with a pattern) and also a catchall.

I have to admit that "default case" (there is really no need for a hyphen here) is a bit wordy compared to "finally", which is very clever but could cause some cognitive dissonance in users who think too hard about "try" (really? a case clause that is always executed before you exit the switch??).

>> but I believe that in situations that matter, the compiler can and will reject this last example on other grounds (please correct me if I am mistaken
>
> yes, the compiler can catch this.
>
> The other degree of freedom on this mini-feature is whether `default` is a hint, or whether it would be an error to not say `default` on a total pattern.
> I think it might be seen as a burden if it were required, but Remi might think it not strong enough if it's just a hint.

Yeah, I thought about that, and decided that it would be a bad idea for the compiler to complain about the absence of "default", in part because you don't want to feel vaguely obligated to include it in simple cases involving, for example, exhaustive use of enums.

From brian.goetz at oracle.com Tue Aug 11 22:43:59 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 11 Aug 2020 18:43:59 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com>
 <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr>
 <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com>
 <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
Message-ID: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>

Let's assume that your trick works fine, Remi is happy, and `default P` means that P is total, that the switch is total, everything. Great. Now, what is the story for nested patterns?

    switch (container) {
        case Box(Frog f): ...
        case Box(Chocolate c): ...
        case Box(var x): ...

        case Bag(Frog f): ...
        case Bag(Chocolate c): ...
        case Bag(var x): ...
    }

We still have totality at the nested level; when container is a box or a bag, the catch-all case is total on that kind of container, but no one has said "default" or "total" or anything like that. And a Box(null) will get dumped into the third case. Do we care?

I don't, really, but you knew that. Remi might, but we'll have to hear from him.
But I offer this example as a bisection to determine whether the discomfort is really about totality-therefore-nullable in patterns, or really just about exhaustive switches (and the nullity thing is a red herring.)

So, who is bothered by the fact that case #3 gets Box(null), and case #6 gets Bag(null)? Anyone? (And, if not, but you are bothered by the lack of totality on the true catch-alls, why not?)

From brian.goetz at oracle.com Tue Aug 11 22:51:59 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 11 Aug 2020 18:51:59 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com>
 <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr>
 <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com>
 <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
Message-ID: <439060ed-a4e1-b1bc-81ed-30be49d02851@oracle.com>

> If it is, the last clause is total too (well, unless it is an enum
> switch that names all the constants, or a switch over a sealed type
> that names all the sub-types but without a catch-all.)

I want to drill into this comment because it's significant.

If we have

    enum E { A, B; }

then a switch

    switch (e) {
        case A: ...
        case B: ...
    }

is exhaustive but has no catch-all and will still NPE on null. (Which seems reasonable, there's a strong presumption that null enum values are a bug.) And we would not want to make the user explicitly say `case null`, just like we don't make them say `default` when all the cases are covered.

Presumably the same is true with:

    sealed interface E permits A, B { }

    switch (e) {
        case A a:
        case B b:
    }

and presumably this is also reasonable.
Which is to say: the nullity behaviors we've designed around pattern matching are specific to catch-all cases, which we presume are going to be common in pattern switches. If null is not part of your domain, you won't notice; if it is part of your domain, you want the catch-all catching it.

From guy.steele at oracle.com Wed Aug 12 02:57:31 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 11 Aug 2020 22:57:31 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
References: <022e3ed3-d311-0791-d99b-12d12f91caa2@oracle.com>
 <342037542.139843.1597148607384.JavaMail.zimbra@u-pem.fr>
 <67279243-ede2-5f74-8fcb-daf35a68891f@oracle.com>
 <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
 <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
Message-ID:

> On Aug 11, 2020, at 6:43 PM, Brian Goetz wrote:
>
> Let's assume that your trick works fine, Remi is happy, and `default P` means that P is total, that the switch is total, everything. Great.

Yes, if people will stand for it, "default P" is certainly an improvement over "default case P", and a better word choice than "finally". And it avoids the ticklish one-colon difference to which I had alluded.

--Guy
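The enum case discussed earlier in this exchange (a switch that names every constant, has no catch-all, and still throws on null) can be checked with a small program. This is a sketch: `EnumSwitchNull`, `name` and `throwsOnNull` are hypothetical names, and the code relies only on the existing behavior of statement switches over enums.

```java
// Illustrative sketch; the class and method names are hypothetical.
final class EnumSwitchNull {
    enum E { A, B }

    // Names every constant of E, with no default clause.
    static String name(E e) {
        switch (e) {                 // throws NullPointerException when e is null
            case A: return "a";
            case B: return "b";
        }
        // Required by the compiler: a statement switch is not checked for exhaustiveness.
        throw new IllegalStateException("unexpected constant: " + e);
    }

    // Returns true if switching on a null enum reference throws NullPointerException.
    static boolean throwsOnNull() {
        try {
            name(null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(name(E.A));
        System.out.println(throwsOnNull());
    }
}
```

The trailing `throw` is there only because the compiler does not treat the statement switch as exhaustive, which ties back to the exhaustiveness concern raised in the thread.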
From forax at univ-mlv.fr Wed Aug 12 13:45:30 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 12 Aug 2020 15:45:30 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
References: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
 <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
Message-ID: <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Guy Steele"
> Cc: "Remi Forax", "John Rose", "amber-spec-experts"
> Sent: Wednesday, August 12, 2020 00:43:59
> Subject: Re: Next up for patterns: type patterns in switch

> Let's assume that your trick works fine, Remi is happy, and `default P` means
> that P is total, that the switch is total, everything. Great.

Yep, I'm happy with "default P"; let me explain why.

I see the switch semantics as a kind of compact way to represent a cascade of if-else; with that, there are several kinds of pattern:
    case Constant is equivalent to if (o.equals(Constant)) or if (o == Constant)
    case null is equivalent to if (o == null)
    case P p is equivalent to if (o instanceof P p)
and
    default P p is equivalent to else { P p = o; }

Some patterns accept null (case null and default P), so if one of them is present in a switch, the switch accepts null; otherwise it does not.

So the following code
    if (o instanceof Frog f) { ... }
    else if (o instanceof Chocolate c) { ... }
    else { var x = o; ... }
can be refactored to
    switch(o) {
        case Frog f: ...
        case Chocolate c: ...
        default var x: ...
    }

About exhaustiveness: for example, with
    sealed interface Container permits Box, Bag { }
the following switch is exhaustive
    switch(container) {
        case Box box: ...
        case Bag bag: ...
    }

Here there is no need for a total pattern, and if a user wants to allow null, they can add a "case null".

> Now, what is the story for nested patterns?
>     switch (container) {
>         case Box(Frog f): ...
>         case Box(Chocolate c): ...
>         case Box(var x): ...
>         case Bag(Frog f): ...
>         case Bag(Chocolate c): ...
>         case Bag(var x): ...
>     }

So this is a mix: an exhaustive switch with two patterns that are total once deconstructed. For me, it should be written like this
    switch(container) {
        case Box(Frog g): ...
        case Box(Chocolate c): ...
        default Box(var x): ...

        case Bag(Frog g): ...
        case Bag(Chocolate c): ...
        default Bag(var x): ...
    }

using the syntax "default Box(var x)" to say that the nested patterns are locally total and thus accept null.
It's a little weird to have the "default" in front of the type name while it applies to the nested part, but I'm OK with that.

> We still have totality at the nested level; when container is a box or a bag,
> the catch-all case is total on that kind of container, but no one has said
> "default" or "total" or anything like that. And a Box(null) will get dumped
> into the third case. Do we care?
> I don't, really, but you knew that. Remi might, but we'll have to hear from him.
> But I offer this example as a bisection to determine whether the discomfort is
> really about totality-therefore-nullable in patterns, or really just about
> exhaustive switches (and the nullity thing is a red herring.)

I think it's a mix: at top level it's an exhaustive switch, so not nullable, but if the Box and the Bag may contain null, "Box(var x)" and "Bag(var x)" should use a default pattern because they are a kind of locally total pattern.

> So, who is bothered by the fact that case #3 gets Box(null), and case #6 gets
> Bag(null)? Anyone? (And, if not, but you are bothered by the lack of totality
> on the true catch-alls, why not?)

I'm bothered if the patterns are not declared as total, and I believe Stephen Colebourne on amber-dev is proposing exactly the same rules.
Rémi

From brian.goetz at oracle.com Wed Aug 12 14:29:50 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Aug 2020 10:29:50 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>
References: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
 <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
 <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>
Message-ID:

>> So, who is bothered by the fact that case #3 gets Box(null), and
>> case #6 gets Bag(null)? Anyone? (And, if not, but you are
>> bothered by the lack of totality on the true catch-alls, why not?)
>
> I'm bothered if the pattern are not declared as total

That's too bad, because there was a proposal that might work (`default` enables type checking for totality on switches) and now you want to escalate it to something that "totally" does not work (`default` makes _patterns_ total.)

> and i believe Stephen Colebourne on amber-dev is proposing exactly the
> same rules.

You should read my long explanation to Stephen over there.
TL;DR: this is not engaging with half of what pattern matching is really for, and so is coming to the wrong answer. Totality needs to be a property of the pattern, not the context -- otherwise, the same pattern in switch doesn't mean the same pattern in instanceof, and each construct needs its own totality hacks. This is just another bad variant; worse, in fact, than the "any x" proposal.

Having total patterns is very important; if I have:

    record Bag<T>(T t) implements Container<T> { }
    record DoubleBox<T>(InnerBox<T> x) implements Container<T> { }
    record InnerBox<T>(T x) { }

and I do

    switch (container) {
        case DoubleBox(InnerBox(var x)): ...
    }

the outer pattern is not total on Containers (doesn't match Bag), but the inner pattern is once we match DoubleBox. I want to match all DoubleBoxes, and destructure their contents. If I try to wedge totality into patterns through switch case modifiers, then (a) the totality is all or nothing, at all levels, and (b) what do I do when I want to refactor that switch into a chain of instanceof operations? Then there's no way to express the pattern I want!

Totality must be a property of the pattern, and then we can define (orthogonally, please) how the patterns interact with the enclosing construct.

I think the mistake that is dragging you down the wrong road is the assumption that pattern matching is always about conditionality. But destructuring is just as important as conditionality. Sometimes a pattern is total (on a given type), and sometimes it's not, but in all cases it describes the same destructuring. What was wrong with the "just use case var x" proposal is that it said "you can't use destructuring when you're total, because a null might sneak along for the ride", which was just mean.
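One way to picture the "chain of instanceof operations" mentioned above is to destructure by hand. The sketch below assumes Java 16 or later (records and `instanceof` type patterns) and simplifies the generic records to `Object`-typed components; `NestedTotality` and `describe` are hypothetical names. It is one reading of the intended behavior (the inner `var x` accepts whatever the box holds, including null), not the specified semantics.

```java
// Illustrative sketch, assuming Java 16+; names and the Object-typed components are simplifications.
final class NestedTotality {
    record InnerBox(Object x) {}
    record DoubleBox(InnerBox inner) {}
    record Bag(Object t) {}

    // Hand-written stand-in for `case DoubleBox(InnerBox(var x))`:
    // conditional on the outer shape, unconditional on the contents.
    static String describe(Object container) {
        if (container instanceof DoubleBox db && db.inner() != null) {
            Object x = db.inner().x();   // plays the role of `var x`; may be null
            return "DoubleBox containing " + x;
        }
        return "no match";
    }

    public static void main(String[] args) {
        System.out.println(describe(new DoubleBox(new InnerBox("frog"))));
        System.out.println(describe(new DoubleBox(new InnerBox(null))));
        System.out.println(describe(new Bag("frog")));
    }
}
```

The explicit `db.inner() != null` check reflects the assumption that a deconstruction pattern such as `InnerBox(...)` does not match a null `InnerBox`, while the innermost binding takes any value.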
From guy.steele at oracle.com Wed Aug 12 16:20:32 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 12 Aug 2020 12:20:32 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>
References: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
 <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
 <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>
Message-ID: <3975CA7D-772C-468D-99CA-D63E036C6816@oracle.com>

> On Aug 12, 2020, at 9:45 AM, forax at univ-mlv.fr wrote:
> . . .
> Yep, I'm happy with "default P"; let me explain why.
>
> I see the switch semantics as a kind of compact way to represent a cascade of if-else; with that, there are several kinds of pattern:
>     case Constant is equivalent to if (o.equals(Constant)) or if (o == Constant)
>     case null is equivalent to if (o == null)
>     case P p is equivalent to if (o instanceof P p)
> and
>     default P p is equivalent to else { P p = o; }
>
> Some patterns accept null (case null and default P), so if one of them is present in a switch, the switch accepts null; otherwise it does not.
>
> So the following code
>     if (o instanceof Frog f) { ... }
>     else if (o instanceof Chocolate c) { ... }
>     else { var x = o; ... }
> can be refactored to
>     switch(o) {
>         case Frog f: ...
>         case Chocolate c: ...
>         default var x: ...
>     }
>
> About exhaustiveness: for example, with
>     sealed interface Container permits Box, Bag { }
> the following switch is exhaustive
>     switch(container) {
>         case Box box: ...
>         case Bag bag: ...
>     }
>
> Here there is no need for a total pattern, and if a user wants to allow null, they can add a "case null".
>
>> Now, what is the story for nested patterns?
>>
>>     switch (container) {
>>         case Box(Frog f): ...
>>         case Box(Chocolate c): ...
>>         case Box(var x): ...
>>
>>         case Bag(Frog f): ...
>>         case Bag(Chocolate c): ...
>>         case Bag(var x): ...
>>     }
>
> So this is a mix: an exhaustive switch with two patterns that are total once deconstructed. For me, it should be written like this
>     switch(container) {
>         case Box(Frog g): ...
>         case Box(Chocolate c): ...
>         default Box(var x): ...
>
>         case Bag(Frog g): ...
>         case Bag(Chocolate c): ...
>         default Bag(var x): ...
>     }
>
> using the syntax "default Box(var x)" to say that the nested patterns are locally total and thus accept null.
> It's a little weird to have the "default" in front of the type name while it applies to the nested part, but I'm OK with that.

Very interesting proposal, to allow more than one "default" clause in a switch! I'm not yet sure whether I like this path, but I want to explore it further, and to do that I will attempt to formalize it a bit more and then look at a more detailed example.

Remi suggested these rules:

    `case Constant` is equivalent to `if (o.equals(Constant))` or `if (o == Constant)`
    `case null` is equivalent to `if (o == null)`
    `case P p` is equivalent to `if (o instanceof P p)`
    `default P p` is equivalent to `else { P p = o; }`

I think that to get the desired effect in the last example, it is necessary to be more detailed, and distinguish various kinds of patterns (let T stand for a type, let P: Pattern T, let Q be any pattern, and let S be a statement):

    `case X` is equivalent to `if (CASE_EXPAND(o, X))`
    `default X` is equivalent to `if (DEFAULT_EXPAND(o, X))`    (**) we will refer to this rule later
    `default` is equivalent to `default var unused_variable`

    `CASE_EXPAND(o, Constant)` is equivalent to `o.equals(Constant)` or `(o == Constant)`
    `CASE_EXPAND(o, null)` is equivalent to `(o == null)`
    `CASE_EXPAND(o, T p)` is equivalent to `o instanceof T p`
    `CASE_EXPAND(o, var p)` is equivalent to `o instanceof var p`    [just a way to bind p to the value of o in the middle of an expression]
    `CASE_EXPAND(o, P(Q))` is equivalent to `o instanceof P(T alpha) && CASE_EXPAND(alpha, Q)`

    `DEFAULT_EXPAND(o, Constant)` is a static error?
    `DEFAULT_EXPAND(o, null)` is a static error?
    `DEFAULT_EXPAND(o, T p)` is equivalent to `{ T p = o; }`
    `DEFAULT_EXPAND(o, var p)` is equivalent to `{ var p = o; }`
    `DEFAULT_EXPAND(o, P(Q))` is equivalent to `o instanceof P(T alpha) && DEFAULT_EXPAND(alpha, Q)`

where by an abuse of notation I write `{ T p = o; }` for an "expression" that checks to see whether the value of o is assignable to type T, and if it is then binds the variable p to that value and produces the value true, and otherwise produces false. (In other words, it is like `o instanceof T p` but accepts nulls.) I am assuming that `o instanceof P(T alpha)` is null-friendly (it always allows the possibility that alpha may be bound to null). (And I note that I have been sloppy about how the cases and their associated statements are glued together to make a complete translation of a switch statement.)

Now let's examine this extended example (assume `record Box(Object f)` and `record Bag(Object f)` and `record FrogBox(Frog f)`):

    switch(o) {
        case Box(Frog g): ...
        case Box(Chocolate c): ...
        default Box(var x): ...

        case Bag(Frog g): ...
        case Bag(Chocolate c): ...
        default Bag(Object x): ...    // I changed `var x` to `Object x` here

        case FrogBox(Toad t): ...
        case FrogBox(Tadpole tp): ...
        default FrogBox(Frog fr): ...

        default: ...
    }

This would expand to something like:

    if (o instanceof Box(Object alpha) && alpha instanceof Frog g) ...
    else if (o instanceof Box(Object alpha) && alpha instanceof Chocolate c) ...
    else if (o instanceof Box(Object alpha) && { var x = alpha; }) ...
    else if (o instanceof Bag(Object alpha) && alpha instanceof Frog g) ...
    else if (o instanceof Bag(Object alpha) && alpha instanceof Chocolate c) ...
    else if (o instanceof Bag(Object alpha) && { Object x = alpha; }) ...
    else if (o instanceof FrogBox(Frog alpha) && alpha instanceof Toad t) ...
    else if (o instanceof FrogBox(Frog alpha) && alpha instanceof Tadpole tp) ...
    else if (o instanceof FrogBox(Frog alpha) && { Frog fr = alpha; }) ...
    else ...

But now I realize that this model is not quite what we had discussed before: I think I need to change one of the rules above (**) to three rules:

    `default T p` is equivalent to `T p = o; if (true)`
    `default var p` is equivalent to `var p = o; if (true)`
    `default P(Q)` is equivalent to `if (DEFAULT_EXPAND(o, P(Q)))`

That is, at "top level", the situations `default T p` and `default var p` are not conditional, but are required to succeed, and you get a static error for the first one if o is not assignable to T.

This may be ugly, but at least it reveals explicitly that we are treating the outermost situation in a `default` label a bit differently from nested situations.

Does this model capture the intent of what everyone wants?

--Guy

From guy.steele at oracle.com Wed Aug 12 16:29:17 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 12 Aug 2020 12:29:17 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <3975CA7D-772C-468D-99CA-D63E036C6816@oracle.com>
References: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr>
 <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com>
 <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr>
 <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com>
 <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com>
 <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>
 <3975CA7D-772C-468D-99CA-D63E036C6816@oracle.com>
Message-ID: <2046B42B-880D-4B66-AA09-F64A650273E4@oracle.com>

And now that I have posed the model below, I can spot at least two ways in which the model is wrong in its details. But I think its structure provides a useful framework for discussion. I will try to produce a corrected version late today.

--Guy

> On Aug 12, 2020, at 12:20 PM, Guy Steele wrote:
>
>> On Aug 12, 2020, at 9:45 AM, forax at univ-mlv.fr wrote:
>> . . .
>> Yep, I'm happy with "default P"; let me explain why:
>>
>> I see the switch semantics as a kind of compact way to represent a cascade of if-else.
>> With that, there are several patterns:
>>     case Constant is equivalent to if (o.equals(Constant)) or if (o == Constant)
>>     case null is equivalent to if (o == null)
>>     case P p is equivalent to if (o instanceof P p)
>> and
>>     default P p is equivalent to else { P p = o; }
>>
>> Some patterns accept null (case null and default P), so if one of them is present in a switch, the switch accepts null; otherwise it does not.
>>
>> So the following code
>>     if (o instanceof Frog f) { ... }
>>     else if (o instanceof Chocolate c) { ... }
>>     else { var x = o; ... }
>> can be refactored to
>>     switch (o) {
>>         case Frog f: ...
>>         case Chocolate c: ...
>>         default var x: ...
>>     }
>>
>> About exhaustiveness: if a switch is exhaustive, for example with
>>     sealed interface Container permits Box, Bag { }
>> the following switch is exhaustive:
>>     switch (container) {
>>         case Box box: ...
>>         case Bag bag: ...
>>     }
>>
>> Here there is no need for a total pattern, and if a user wants to allow null, they can add a "case null".
>>
>>> Now, what is the story for nested patterns?
>>>
>>> switch (container) {
>>>     case Box(Frog f): ...
>>>     case Box(Chocolate c): ...
>>>     case Box(var x): ....
>>>
>>>     case Bag(Frog f): ...
>>>     case Bag(Chocolate c): ...
>>>     case Bag(var x): ....
>>>
>>> }
>>
>> So this is a mix between an exhaustive switch and two total patterns once deconstructed; for me, it should be written like this:
>>     switch (container) {
>>         case Box(Frog g): ...
>>         case Box(Chocolate c): ...
>>         default Box(var x): ...
>>
>>         case Bag(Frog g): ...
>>         case Bag(Chocolate c): ...
>>         default Bag(var x): ...
>>     }
>>
>> using the syntax "default Box(var x)" to say that the nested patterns are locally total and thus accept null.
>> It's a little weird to have the "default" in front of the type name while it applies to the nested part, but I'm OK with that.
>
> Very interesting proposal, to allow more than one "default" clause in a switch!
>
> I'm not yet sure whether I like this path, but I want to explore it further, and to do that I will attempt to formalize it a bit more and then look at a more detailed example.
>
> Remi suggested these rules:
>
>     `case Constant` is equivalent to `if (o.equals(Constant))` or `if (o == Constant)`
>     `case null` is equivalent to `if (o == null)`
>     `case P p` is equivalent to `if (o instanceof P p)`
>     `default P p` is equivalent to `else { P p = o; }`
>
> I think that to get the desired effect in the last example, it is necessary to be more detailed, and distinguish various kinds of patterns (let T stand for a type, let P: Pattern T, let Q be any pattern, and let S be a statement):
>
>     `case X` is equivalent to `if (CASE_EXPAND(o, X))`
>     `default X` is equivalent to `if (DEFAULT_EXPAND(o, X))`    (**) we will refer to this rule later
>     `default` is equivalent to `default var unused_variable`
>
>     `CASE_EXPAND(o, Constant)` is equivalent to `o.equals(Constant)` or `(o == Constant)`
>     `CASE_EXPAND(o, null)` is equivalent to `(o == null)`
>     `CASE_EXPAND(o, T p)` is equivalent to `o instanceof T p`
>     `CASE_EXPAND(o, var p)` is equivalent to `o instanceof var p`    [just a way to bind p to the value of o in the middle of an expression]
>     `CASE_EXPAND(o, P(Q))` is equivalent to `o instanceof P(T alpha) && CASE_EXPAND(alpha, Q)`
>
>     `DEFAULT_EXPAND(o, Constant)` is a static error?
>     `DEFAULT_EXPAND(o, null)` is a static error?
>     `DEFAULT_EXPAND(o, T p)` is equivalent to `{ T p = o; }`
>     `DEFAULT_EXPAND(o, var p)` is equivalent to `{ var p = o; }`
>     `DEFAULT_EXPAND(o, P(Q))` is equivalent to `o instanceof P(T alpha) && DEFAULT_EXPAND(alpha, Q)`
>
> where by an abuse of notation I write `{ T p = o; }` for an "expression" that checks to see whether the value of o is assignable to type T, and if it is then binds the variable p to that value and produces the value true, and otherwise produces false. (In other words, it is like `o instanceof T p` but accepts nulls.)
>
> I am assuming that `o instanceof P(T alpha)` is null-friendly (it always allows the possibility that alpha may be bound to null).
>
> (And I note that I have been sloppy about how the cases and their associated statements are glued together to make a complete translation of a switch statement.)
>
>
> Now let's examine this extended example (assume `record Box(Object f)` and `record Bag(Object f)` and `record FrogBox(Frog f)`):
>
>     switch (o) {
>         case Box(Frog g): ...
>         case Box(Chocolate c): ...
>         default Box(var x): ...
>
>         case Bag(Frog g): ...
>         case Bag(Chocolate c): ...
>         default Bag(Object x): ...    // I changed `var x` to `Object x` here
>
>         case FrogBox(Toad t): ...
>         case FrogBox(Tadpole tp): ...
>         default FrogBox(Frog fr): ...
>
>         default: ...
>     }
>
> This would expand to something like:
>
>     if (o instanceof Box(Object alpha) && alpha instanceof Frog g) ...
>     else if (o instanceof Box(Object alpha) && alpha instanceof Chocolate c) ...
>     else if (o instanceof Box(Object alpha) && { var x = alpha; }) ...
>
>     else if (o instanceof Bag(Object alpha) && alpha instanceof Frog g) ...
>     else if (o instanceof Bag(Object alpha) && alpha instanceof Chocolate c) ...
>     else if (o instanceof Bag(Object alpha) && { Object x = alpha; }) ...
>
>     else if (o instanceof FrogBox(Frog alpha) && alpha instanceof Toad t) ...
>     else if (o instanceof FrogBox(Frog alpha) && alpha instanceof Tadpole tp) ...
>     else if (o instanceof FrogBox(Frog alpha) && { Frog fr = alpha; }) ...
>
>     else ...
>
>
> But now I realize that this model is not quite what we had discussed before: I think I need to change one of the rules above (**) to three rules:
>
>     `default T p` is equivalent to `T p = o; if (true)`
>     `default var p` is equivalent to `var p = o; if (true)`
>     `default P(Q)` is equivalent to `if (DEFAULT_EXPAND(o, P(Q)))`
>
> That is, at "top level", the situations `default T p` and `default var p` are not conditional, but are required to succeed, and you get a static error for the first one if o is not assignable to T. This may be ugly, but at least it reveals explicitly that we are treating the outermost situation in a `default` label a bit differently from nested situations.
>
>
> Does this model capture the intent of what everyone wants?
>
> --Guy
>

From brian.goetz at oracle.com Wed Aug 12 16:57:19 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Aug 2020 12:57:19 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <3975CA7D-772C-468D-99CA-D63E036C6816@oracle.com>
References: <256357890.203351.1597160576595.JavaMail.zimbra@u-pem.fr> <4b1e1f3d-b0df-000d-0165-642ac734b3c4@oracle.com> <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com> <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <3975CA7D-772C-468D-99CA-D63E036C6816@oracle.com>
Message-ID:

> But now I realize that this model is not quite what we had discussed before: I think I need to change one of the rules above (**) to three rules:
>
>     `default T p` is equivalent to `T p = o; if (true)`
>     `default var p` is equivalent to `var p = o; if (true)`
>     `default P(Q)` is equivalent to `if (DEFAULT_EXPAND(o, P(Q)))`
>
> That is, at "top level", the situations `default T p` and `default var p` are not conditional, but are required to succeed, and you get a static error for the first one if o is not assignable to T.
> This may be ugly, but at least it reveals explicitly that we are treating the outermost situation in a `default` label a bit differently from nested situations.

This is the key thing -- the important part is pattern composition (the non-nested cases are the mostly uninteresting ones, except for how they compose with containing patterns.) The essence of this approach is that `default` modifies not only the pattern, but how patterns compose with each other.

And we have been here before, on the road to the current proposed semantics! In an earlier iteration, we first tried to roll null-handling into the definition of nesting rather than the definition of matching (because we were leaning on the wrong primitive, instanceof). This gave us more flexibility, but we had to give up this simple rule for composition:

    x matches P(Q) == x matches P(alpha) && alpha matches Q

to adjust for the observation that total patterns seemed very desirable at nested levels even if it was not obvious whether they are as useful at the top level. The cost of this version was that the rules for nesting were less compositional, and again, this inhibits what refactorings can be done. Essentially, it meant that top-level patterns and nested patterns were different, as a way of rescuing the null behavior of switch.

(One conclusion I've come to from the many attempts we've made to find the perfect story here is that it is far, far easier to start with totality and define composition simply and then filter, than to start with a non-total set of base cases or sharp-edged compositional rules and then try to bake totality back in.)

This direction strikes me as a whack-a-mole exercise for several reasons:

 - It is mucking with composition. This rarely ends well, and you rarely end up with only one knob (or, if one is enough, you still have two sets of composition semantics.)
 - There is no fine-grained control; either you muck with the composition all the way down, or you don't.
 - (big one) It is using switch syntax to affect pattern semantics. What happens for the same patterns in instanceof or catch or assignment? Do we have to invent a different way to say "modify this pattern this way" in all these contexts?

(aside, of course, from the baseline "I think this is fixing a problem that doesn't really exist" suspicion.)

Here's an observation that the Remi+Stephen crowd are missing: the shape of pattern switches will likely be dramatically different from the shape of existing switches, and we're trying to apply the wrong intuitions.

The current switches are very, very simple: N-way comparison to constants, plus maybe a catch-all, in domains where null almost never shows up (and when it does, is almost always an error.) They are "symmetric" in this way.

In pattern switches, though, I'd expect the common (nontrivial) case to be much more like:

    case Box(SomethingSpecial x): ...
    case Box(EverythingElse):
    case Bag(SomethingElseSpecial x):
    case Bag(EverythingElse):
    case EverythingElse:

If you think about the partial ordering on patterns given by dominance/value set inclusion, you see multiple disjoint chains of strictly ordered cases, where each chain locally terminates in a catch-all for that chain (or a sub-chain), leaning on the dictionary order imposed by composition, with the whole thing often terminating in a global catch-all. That's a very different "shape" than our switches today, and I think we're trying to impose intuition from the old shape on the new.
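
[Illustration, not part of the original message.] The composition rule `x matches P(Q) == x matches P(alpha) && alpha matches Q` and the "chains ending in catch-alls" shape can be sketched with plain `instanceof` type patterns. This is only a rough sketch under stated assumptions: the class name `ChainShape`, the method `classify`, the component name `content`, and the returned strings are all hypothetical, and the nested patterns are spelled out by hand with accessor calls rather than with any of the syntaxes under discussion.

```java
// Hypothetical types mirroring the names used in the thread.
final class ChainShape {
    record Frog() {}
    record Chocolate() {}
    record Box(Object content) {}
    record Bag(Object content) {}

    // Each outer test plays the role of "x matches P(alpha)";
    // the inner tests play the role of "alpha matches Q".
    static String classify(Object o) {
        if (o instanceof Box box) {
            Object alpha = box.content();            // alpha may be null
            if (alpha instanceof Frog) return "Box(Frog)";
            if (alpha instanceof Chocolate) return "Box(Chocolate)";
            return "Box(other)";                     // catch-all for the Box chain, reached for Box(null) too
        }
        if (o instanceof Bag bag) {
            Object alpha = bag.content();
            if (alpha instanceof Frog) return "Bag(Frog)";
            return "Bag(other)";                     // catch-all for the Bag chain
        }
        return "other";                              // global catch-all, also reached when o is null
    }

    public static void main(String[] args) {
        System.out.println(classify(new Box(new Frog())));
        System.out.println(classify(new Box(null)));
        System.out.println(classify(new Bag(new Chocolate())));
        System.out.println(classify(null));
    }
}
```

In this hand-written form the per-chain catch-all receives `Box(null)` simply because nothing filters it out, which is the behavior the totality discussion is trying to pin down for real nested patterns.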
From forax at univ-mlv.fr Wed Aug 12 19:57:41 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 12 Aug 2020 21:57:41 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com> <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr>
Message-ID: <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Guy Steele", "John Rose", "amber-spec-experts"
> Sent: Wednesday, 12 August 2020 16:29:50
> Subject: Re: Next up for patterns: type patterns in switch

>>> So, who is bothered by the fact that case #3 gets Box(null), and case #6 gets Bag(null)? Anyone? (And, if not, but you are bothered by the lack of totality on the true catch-alls, why not?)

>> I'm bothered if the patterns are not declared as total

> That's too bad, because there was a proposal that might work (`default` enables type checking for totality on switches) and now you want to escalate it to something that "totally" does not work (`default` makes _patterns_ total.)

>> and I believe Stephen Colebourne on amber-dev is proposing exactly the same rules.

In fact, they are not exactly the same rules, because Stephen's rules allow "case Foo foo" and "default Foo foo" in the same switch, while I think that "case Foo foo" should be flagged as an error if the pattern is total.

> You should read my long explanation to Stephen over there. TL;DR: this is not engaging with half of what pattern matching is really for, and so is coming to the wrong answer. Totality needs to be a property of the pattern, not the context -- otherwise, the same pattern in switch doesn't mean the same pattern in instanceof, and each construct needs its own totality hacks. This is just another bad variant; worse, in fact, than the "any x" proposal.

I agree that totality should be a property of a pattern, not of a case, which is how "default" currently works.
As you said, using "any" is better because it's per pattern and not per case.

Continuing with the "default" keyword, even if we may come up with a better keyword later, we are back to what Guy was proposing: "case default var x", or inside a nested pattern "case Box(default var x)", default being the keyword saying the pattern is total and thus allows null.

> Having total patterns is very important; if I have:
>
>     record Bag<T>(T t) implements Container<T> { }
>     record DoubleBox<T>(InnerBox<T> x) implements Container<T> { }
>     record InnerBox<T>(T x) { }
>
> and I do
>
>     switch (container) {
>         case DoubleBox(InnerBox(var x)): ...
>     }
>
> the outer pattern is not total on Containers (doesn't match Bag), but the inner pattern is once we match DoubleBox. I want to match all DoubleBoxes, and destructure their contents. If I try to wedge totality into patterns through switch case modifiers, then (a) the totality is all or nothing, at all levels, and (b) what do I do when I want to refactor that switch into a chain of instanceof operations? Then there's no way to express the pattern I want!
>
> Totality must be a property of the pattern, and then we can define (orthogonally, please) how the patterns interact with the enclosing construct.

Right, it should be

    switch (container) {
        case DoubleBox(InnerBox(default var x)): ...
    }

> I think the mistake that is dragging you down the wrong road is the assumption that pattern matching is always about conditionality. But destructuring is just as important as conditionality. Sometimes a pattern is total (on a given type), and sometimes it's not, but in all cases it describes the same destructuring.
>
> What was wrong with the "just use case var x" proposal is that it said "you can't use destructuring when you're total, because a null might sneak along for the ride", which was just mean.

I agree destructuring is just as important as conditionality and those two things should be orthogonal.
But I still think having a keyword to signal that a pattern (not a case) is total is better than letting people guess.

Rémi

From brian.goetz at oracle.com Wed Aug 12 20:05:02 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Aug 2020 16:05:02 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr>
References: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com> <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr>
Message-ID: <641bfdf2-c38b-2a78-faab-fca022a2933a@oracle.com>

Since I am always the one saying "state your concern, not your solution", let me frame this:

> But I still think having a keyword to signal that a pattern (not a case) is total is better than letting people guess.

From this I take away that (a) the rules we've proposed for totality vs not are mostly OK in terms of their expressiveness and their defaults, but (b) you are worried that it is too subtle for Java developers to determine whether a given sub-tree of a pattern is total on the part it is matching, so (c) you would like some additional assertions to say "this is total, error if I'm wrong". These assertions would benefit both the writer (to catch errors) and reader (so totality snaps off the page.)

Do I have it right?

But isn't the "switch is total" kind of the half-brother of this story, too? Since statement switches might be total or partial, stating the intent would be useful in the same way, right? (To be clear, I think this is two separate issues, but they are related.)

From brian.goetz at oracle.com Wed Aug 12 20:44:04 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Aug 2020 16:44:04 -0400
Subject: A peek at the roadmap for pattern matching and more
Message-ID: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>

Several folks have asked that I sketch out a little more of the roadmap for pattern matching, so that we can better evaluate the features being discussed now (since, for example, the semantics of pattern matching are heavily influenced by nested patterns.)

I've checked into two drafts which are VERY ROUGH, but which correspond to the next two logical increments after patterns in switch.

Deconstruction patterns:

https://github.com/openjdk/amber-docs/blob/master/eg-drafts/deconstruction-patterns-records-and-classes.md

This document outlines the semantics of deconstruction patterns and nested patterns, and declaration of deconstructors in classes.

Reconstructors:

https://github.com/openjdk/amber-docs/blob/master/eg-drafts/reconstruction-records-and-classes.md

This one addresses the challenge of "mutating" immutable objects such as records and inline classes, and builds on deconstructors.

(There are some teasers for related features, mostly to put these in context, but I don't want to get distracted on those features until these are nailed down, so take them as merely context.)
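
[Illustration, not part of the original message.] For context on the "mutating immutable objects" challenge, here is roughly what one has to write by hand for a record today. This is a minimal sketch and not the reconstructor design from the draft: the names `WitherSketch`, `withX`, and `withY` are hypothetical, chosen only for this example.

```java
final class WitherSketch {
    record Point(int x, int y) {
        // Hand-written "withers": each returns a fresh Point that differs
        // from the receiver in exactly one component.
        Point withX(int newX) { return new Point(newX, y); }
        Point withY(int newY) { return new Point(x, newY); }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.withX(10);          // p itself is unchanged
        System.out.println(p + " -> " + q);
    }
}
```

Each new component needs another such method, which is the kind of boilerplate a language-level feature would presumably aim to remove.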

From forax at univ-mlv.fr Wed Aug 12 21:42:09 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 12 Aug 2020 23:42:09 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <641bfdf2-c38b-2a78-faab-fca022a2933a@oracle.com>
References: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> <641bfdf2-c38b-2a78-faab-fca022a2933a@oracle.com>
Message-ID: <1984247510.5901.1597268529951.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Guy Steele", "John Rose", "amber-spec-experts"
> Sent: Wednesday, 12 August 2020 22:05:02
> Subject: Re: Next up for patterns: type patterns in switch

> Since I am always the one saying "state your concern, not your solution", let me frame this:
>
>> But I still think having a keyword to signal that a pattern (not a case) is total is better than letting people guess.
>
> From this I take away that (a) the rules we've proposed for totality vs not are mostly OK in terms of their expressiveness and their defaults, but (b) you are worried that it is too subtle for Java developers to determine whether a given sub-tree of a pattern is total on the part it is matching, so (c) you would like some additional assertions to say "this is total, error if I'm wrong". These assertions would benefit both the writer (to catch errors) and reader (so totality snaps off the page.)
>
> Do I have it right?

Yes. You can add (d): the keyword also makes the fact that a pattern is total a local property, so actions at a distance that change the pattern from total to non-total will lead to an error.

>
> But isn't the "switch is total" kind of the half-brother of this story, too? Since statement switches might be total or partial, stating the intent would be useful in the same way, right?
> (To be clear, I think this is two separate issues, but they are related.)

Yes, there is no problem with expression switches, or with statement switches that have one total pattern at top level (if there is a keyword for pattern totality). But if a statement switch is exhaustive it can become partial after a change, and there is currently no way to express that you want to keep the statement switch total.

Rémi

From forax at univ-mlv.fr Wed Aug 12 22:00:56 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 13 Aug 2020 00:00:56 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
Message-ID: <64862796.6714.1597269656683.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Wednesday, 12 August 2020 22:44:04
> Subject: A peek at the roadmap for pattern matching and more

> Several folks have asked that I sketch out a little more of the roadmap for pattern matching, so that we can better evaluate the features being discussed now (since, for example, the semantics of pattern matching are heavily influenced by nested patterns.)

> Reconstructors:
> https://github.com/openjdk/amber-docs/blob/master/eg-drafts/reconstruction-records-and-classes.md
> This one addresses the challenge of "mutating" immutable objects such as records and inline classes, and builds on deconstructors.

There is nothing on the meaning of "this" in the block of with. I believe that, like a constructor, "this" should be a reference to the instance on which with is called and be final?

> (There are some teasers for related features, mostly to put these in context, but I don't want to get distracted on those features until these are nailed down, so take them as merely context.)

Rémi

From brian.goetz at oracle.com Wed Aug 12 22:18:18 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Aug 2020 18:18:18 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <64862796.6714.1597269656683.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <64862796.6714.1597269656683.JavaMail.zimbra@u-pem.fr>
Message-ID: <4961548b-cb63-8376-0049-b803eb5b1c40@oracle.com>

> There is nothing on the meaning of "this" in the block of with. I believe that, like a constructor, "this" should be a reference to the instance on which with is called and be final?

This is one possible interpretation, yes. But, it's not clear whether this carries its weight.

It would have the advantage that you could call methods on the reconstruction target, but has the same disadvantage as the name resolution for inner classes, which offers all sorts of puzzlers-in-waiting, since `foo()` might now be a method on the target, or a method in the local context. I think "lambda" is a better model than "anonymous constructor body" here.

(In any case, observant readers will notice that it is a state monad in disguise.)
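
[Illustration, not part of the original message.] The name-resolution difference between the two models can be seen in existing Java, without any `with` block. This is a small sketch with hypothetical names (`ThisResolution`, `name`, `viaLambda`, `viaAnonymousClass`): inside a lambda body an unqualified call resolves in the enclosing context, while inside an anonymous class body a member of the anonymous class takes precedence over the enclosing one.

```java
import java.util.function.Supplier;

final class ThisResolution {
    String name() { return "enclosing"; }

    // Lambda body: `name()` means this.name() on the enclosing instance.
    Supplier<String> viaLambda() {
        return () -> name();
    }

    // Anonymous class body: `name()` resolves to the anonymous class's own method,
    // which shadows the enclosing method of the same name.
    Supplier<String> viaAnonymousClass() {
        return new Supplier<String>() {
            String name() { return "anonymous"; }
            @Override public String get() { return name(); }
        };
    }

    public static void main(String[] args) {
        ThisResolution outer = new ThisResolution();
        System.out.println(outer.viaLambda().get());          // prints "enclosing"
        System.out.println(outer.viaAnonymousClass().get());  // prints "anonymous"
    }
}
```

The same call text `name()` means two different things depending on the body it sits in, which is the sort of puzzler the "lambda" model avoids.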

From forax at univ-mlv.fr Wed Aug 12 22:48:08 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 00:48:08 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <4961548b-cb63-8376-0049-b803eb5b1c40@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <64862796.6714.1597269656683.JavaMail.zimbra@u-pem.fr> <4961548b-cb63-8376-0049-b803eb5b1c40@oracle.com>
Message-ID: <18849068.8547.1597272488900.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 13 August 2020 00:18:18
> Subject: Re: A peek at the roadmap for pattern matching and more

>> There is nothing on the meaning of "this" in the block of with. I believe that, like a constructor, "this" should be a reference to the instance on which with is called and be final?

OK, I said two different things in the same sentence, "like a constructor" and the "receiver of with", sorry.

>
> This is one possible interpretation, yes. But, it's not clear whether this carries its weight.
>
> It would have the advantage that you could call methods on the reconstruction target, but has the same disadvantage as the name resolution for inner classes, which offers all sorts of puzzlers-in-waiting, since `foo()` might now be a method on the target, or a method in the local context. I think "lambda" is a better model than "anonymous constructor body" here.

Choices are:
 1/ the enclosing instance, like in a lambda
 2/ the receiver of "with"
 3/ the newly created instance, like in a constructor
 4/ a poison

I'm not sure lambda is the best model; I'm betting safely on 4 for now.

>
> (In any case, observant readers will notice that it is a state monad in disguise.)

Rémi

From forax at univ-mlv.fr Wed Aug 12 23:28:09 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 13 Aug 2020 01:28:09 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
Message-ID: <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Wednesday, 12 August 2020 22:44:04
> Subject: A peek at the roadmap for pattern matching and more

> Several folks have asked that I sketch out a little more of the roadmap for pattern matching, so that we can better evaluate the features being discussed now (since, for example, the semantics of pattern matching are heavily influenced by nested patterns.)

> I've checked into two drafts which are VERY ROUGH, but which correspond to the next two logical increments after patterns in switch.

> Deconstruction patterns:
> https://github.com/openjdk/amber-docs/blob/master/eg-drafts/deconstruction-patterns-records-and-classes.md
> This document outlines the semantics of deconstruction patterns and nested patterns, and declaration of deconstructors in classes.

In the code:

    public deconstructor B(int a, int b) {
        super(var aa) = this;
        a = aa;
        b = this.b;
    }

I believe the first line should be

    A(var aa) = super;

I know that you have considered something like this, but I prefer making the deconstructor a method returning a tuple at the Java level, to be closer to the JVM level.
So a syntax more like

    class Point {
        int x;
        int y;

        (int x, int y) deconstructor {
            return (this.x, this.y);
        }
    }

Conceptually, it's also more like the reverse of a constructor: a constructor takes the values from the stack and moves them to the heap, while a deconstructor takes the values from the heap and moves them to the stack.

> (There are some teasers for related features, mostly to put these in context, but I don't want to get distracted on those features until these are nailed down, so take them as merely context.)

Rémi

From john.r.rose at oracle.com Wed Aug 12 23:36:41 2020
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 12 Aug 2020 16:36:41 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Aug 12, 2020, at 4:28 PM, Remi Forax wrote:
>
>
> I know that you have considered something like this, but I prefer making the deconstructor a method returning a tuple at the Java level, to be closer to the JVM level.
> So a syntax more like
>
>     class Point {
>         int x;
>         int y;
>
>         (int x, int y) deconstructor {
>             return (this.x, this.y);
>         }
>     }
>
> Conceptually, it's also more like the reverse of a constructor: a constructor takes the values from the stack and moves them to the heap, while a deconstructor takes the values from the heap and moves them to the stack.

To be closer to the JVM level we should number our variables, not name them. That is, if being closer to the JVM level were so important as to prefer positional notations to name-based notations.

One reason to avoid tuples is we'd have to reify them more thoroughly in the language, and that seems like busy-work.

The more important reason to avoid tuples is they don't have named components, and the stuff we are looking at these days with records, constructors, and deconstructors is *all about names*.

-- John

From brian.goetz at oracle.com Thu Aug 13 01:51:51 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 12 Aug 2020 21:51:51 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
Message-ID:

> In the code:
>
>     public deconstructor B(int a, int b) {
>         super(var aa) = this;
>         a = aa;
>         b = this.b;
>     }
>
> I believe the first line should be
>
>     A(var aa) = super;

P = e is like if (e instanceof P) { rest of method }. What would be on the LHS of the instanceof would be `this`, not `super`. This is like `if (this instanceof super(var aa))`.

> I know that you have considered something like this, but I prefer making the deconstructor a method returning a tuple at the Java level, to be closer to the JVM level.

Yes, we did consider this, but I don't like it, because it's fake. Having tuple-like syntax that you could only use in one place would feel like "glass 99% empty." Unless people can use tuples as returns, destructure them, store them in variables, denote their types, pass them to methods, etc, it will just be a tease. No one will thank us, and I don't think it really carries the message home the way the current framing does.
URL: From guy.steele at oracle.com Thu Aug 13 02:46:31 2020 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 12 Aug 2020 22:46:31 -0400 Subject: Next up for patterns: type patterns in switch In-Reply-To: <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> References: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com> <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> Message-ID: <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com> > On Aug 12, 2020, at 3:57 PM, forax at univ-mlv.fr wrote: > . . . > > I agree destructuring is just as important as conditionality and those two things should be orthogonal. > But i still think having a keyword to signal that a pattern (not a case) is total is better than letting people guess. Yes, and here is the example that convinced me that one needs to be able to mark patterns as total, not just cases: (Assume for the following example that any pattern may be preceded by ?default?, that the only implication of ?default? is that you get a static error if the pattern it precedes is not total, and that we can abbreviate ?case default? as simply ?default?.) record Box(T t) { } record Bag(T t) { } record Pair(T t, U u) { } Triple, Bag> p; switch (x) { case Pair(Box(Tadpole t), Bag(String s)): ? case Pair(Box(Tadpole t), Bag(default Object o)): ? case Pair(Box(default Frog f), Bag(String s)): ? default Pair(Box(Frog f), Bag(Object o)): ? } I think there is some charm to this. To be clear, the intent of this variant proposal is that _any_ pattern may have ?default? in front of it, but I suspect that in practice its primary use will be within ?switch? (or at least within some sort of conditional context), which is what justifies the use of the keyword ?default?. 
Certainly

    default Pair(T x, U y) = expr;

is not especially useful, because if Pair(T x, U y) is not total on the type of expr we presumably get a static error from the declaration even without the keyword "default" present. Possibly someone might write

    if (x instanceof FrogBox(Toad t)) ...
    else if (x instanceof FrogBox(Tadpole tp)) ...
    else if (x instanceof default FrogBox(Frog f)) ...

or

    if (x instanceof FrogBox(Toad t)) ...
    else if (x instanceof FrogBox(Tadpole tp)) ...
    else if (x instanceof FrogBox(default Frog f)) ...

and either of those might actually be useful (in fact, I believe they are equivalent but differ in stylistic emphasis), but then that someone should remember that instanceof is never true when x is null (which may be what is wanted in this specific code).

So "default" is in practice used much like "any" that Rémi earlier discussed, but the theory is that "default" does not dictate null-acceptance; rather, null-acceptance depends on pattern totality (in the manner Brian has argued for), and "default" is a way for the programmer to ensure pattern totality.

From forax at univ-mlv.fr Thu Aug 13 12:19:42 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 14:19:42 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com>
References: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com>
Message-ID: <1036714364.51116.1597321182916.JavaMail.zimbra@u-pem.fr>

> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "John Rose", "amber-spec-experts"
> Sent: Thursday, August 13, 2020 04:46:31
> Subject: Re: Next up for patterns: type patterns in switch

>> On Aug 12, 2020, at 3:57 PM, forax at univ-mlv.fr wrote:
>> . . .
>> I agree destructuring is just as important as conditionality and those two
>> things should be orthogonal.
>> But i still think having a keyword to signal that a pattern (not a case) is
>> total is better than letting people guess.

> Yes, and here is the example that convinced me that one needs to be able to mark
> patterns as total, not just cases:

> (Assume for the following example that any pattern may be preceded by "default",
> that the only implication of "default" is that you get a static error if the
> pattern it precedes is not total, and that we can abbreviate "case default" as
> simply "default".)

> record Box<T>(T t) { }
> record Bag<T>(T t) { }
> record Pair<T, U>(T t, U u) { }

> Triple<Box<Frog>, Bag<Object>> p;

> switch (x) {
>     case Pair(Box(Tadpole t), Bag(String s)): ...
>     case Pair(Box(Tadpole t), Bag(default Object o)): ...
>     case Pair(Box(default Frog f), Bag(String s)): ...
>     default Pair(Box(Frog f), Bag(Object o)): ...
> }

> I think there is some charm to this.

For the last use of default,

    default Pair(Box(Frog f), Bag(Object o)): ...
I wonder if we find it natural only because we are used to use the keyword "default" inside a switch, using a different keyword, by example "total",

    total Pair(Box(Frog f), Bag(Object o)): ...

breaks the spell for me.

I think i prefer using "default" (or any other keyword) only where it makes sense and doesn't allow "default" to be propagated.
so

    default Pair p: ...

is ok but

    default Pair(Box(Frog f), Bag(Object o)): ...

should be written

    case Pair(Box(Frog f), Bag(default Object o)): ...

> To be clear, the intent of this variant proposal is that _any_ pattern may have
> "default" in front of it, but I suspect that in practice its primary use will
> be within "switch" (or at least within some sort of conditional context), which
> is what justifies the use of the keyword "default". Certainly

> default Pair(T x, U y) = expr;

> is not especially useful, because if Pair(T x, U y) is not total on the type of
> expr we presumably get a static error from the declaration even without the
> keyword "default" present. Possibly someone might write

> if (x instanceof FrogBox(Toad t)) ...
> else if (x instanceof FrogBox(Tadpole tp)) ...
> else if (x instanceof default FrogBox(Frog f)) ...

> or

> if (x instanceof FrogBox(Toad t)) ...
> else if (x instanceof FrogBox(Tadpole tp)) ...
> else if (x instanceof FrogBox(default Frog f)) ...

> and either of those might actually be useful (in fact, I believe they are
> equivalent but differ in stylistic emphasis), but then that someone should
> remember that instanceof is never true when x is null (which may be what is
> wanted in this specific code).

> So "default" is in practice used much like "any" that Rémi earlier discussed,
> but the theory is that "default" does not dictate null-acceptance; rather,
> null-acceptance depends on pattern totality (in the manner Brian has argued
> for), and "default" is a way for the programmer to ensure pattern totality.

yes !

Rémi

From forax at univ-mlv.fr Thu Aug 13 12:25:51 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 14:25:51 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
Message-ID: <80909575.51644.1597321551854.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, August 13, 2020 01:36:41
> Subject: Re: A peek at the roadmap for pattern matching and more

> On Aug 12, 2020, at 4:28 PM, Remi Forax wrote:
>>
>> I know that you have consider something like this, but i prefer making the
>> deconstructor a method returning a tuple at Java level, to be closer to the JVM
>> level.
>> So a syntax more like
>>   class Point {
>>     int x;
>>     int y;
>>
>>     (int x, int y) deconstructor {
>>       return (this.x, this.y);
>>     }
>>   }
>>
>> Conceptually, it's also more like the reverse of a constructor, a constructor
>> takes the values from the stack to move them to the heap, a desconstructor
>> takes the value from the heap and move them to the stack.
>
> To be closer to the JVM level we should number our variables, not name them.
> That is, if being closer to the JVM level were so important as to prefer
> positional notations to name-based notations.

I was thinking about closer to the generated classfile, not closer to the VM internals.

>
> One reason to avoid tuples is we'd have to reify them more thoroughly in the
> language, and that seems like busy-work.

It depends if tuples are reified by the VM or not.

>
> The more important reason to avoid tuples is they don't have named components,
> and the stuff we are looking at these days with records, constructors, and
> deconstructors is *all about names*.
yes, a tuple as a type is a structural type, if we use the now classical move of using a nominal type + some inference rules to represent the type of a tuple, the record class is a tuple type, so yes it has names the same way an abstract method of a functional interface has a name.

>
> -- John

Rémi

From forax at univ-mlv.fr Thu Aug 13 13:06:10 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 15:06:10 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
Message-ID: <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, August 13, 2020 03:51:51
> Subject: Re: A peek at the roadmap for pattern matching and more

[...]

>> I know that you have consider something like this, but i prefer making the
>> deconstructor a method returning a tuple at Java level, to be closer to the JVM
>> level.

> Yes, we did consider this, but I don't like it, because it's fake. Having
> tuple-like syntax that you could only use in one place would feel like "glass
> 99% empty." Unless people can use tuples as returns, destructure them, store
> them in variables, denote their types, pass them to methods, etc, it will just
> be a tease. No one will thank us, and I don't think it really carries the
> message home the way the current framing does.

ok, let's take a step back because i think the current proposal has fall in a kind of local maximum in term of design.

We want a destructor, a destructor is a method that returns a list of values the same way a constructor takes a list of parameters.
The way to represent that list of value is to use a record.
So we can already write a de-constructor with the current version of Java,
With a mutable Point

    class MutablePoint {
        int x, y;

        record PointTuple(int x, int y) { }

        public PointTuple deconstructor() {
            return new PointTuple(this.x, this.y);
        }
    }

We need to enhance a little the compiler because we want to have different deconstructor but while the classfile let us to have several methods with the same name and same parameters but different return type, Java doesn't allow that.
So the name deconstructor is considered as special by the compiler, so one can write

    class MutablePoint {
        int x, y;

        record PointTuple(int x, int y) { }
        record Point3DTuple(int x, int y, int z) { }

        public PointTuple deconstructor() {
            return new PointTuple(this.x, this.y);
        }
        public Point3DTuple deconstructor() {
            return new Point3DTuple(this.x, this.y, 0);
        }
    }

And that enough, we don't need more to declare a deconstructor.

Here we are half way to the current proposal, because we are forcing users to explicitly declare the record, but we have face the same kind of choice with lambdas, should we let the compiler transform (int, int -> int) into a synthetic functional interface and decide to not follow on that idea.
Here for a reason, i will be happy to heard, you have decided to cross the rubicon.

Let say i'm ok with that, deconstructors are not the only place we may want a method to return several values, so i see no point to only enable that feature for deconstructors, i should wuth by example a method minMax that returns both the minimum and the maximum of an array

    public static (int min, int max) minMax(int[] array) {
        ...
    }

the compiler will generate a synthetic record, so the generated code is equivalent to

    public record FunnyNameWithIntMinAndIntMaxInIt(int min, int max) { }
    public static (int min, int max) minMax(int[] array) {
        ...
        return new FunnyNameWithIntMinAndIntMaxInIt(min, max);
    }

One interesting question is inside minMax, how write the return given that the record is declared with a synthetic name unknown from the user, for the simple answer is to use the syntax (min, max)
So our example can be written to

    public static (int min, int max) minMax(int[] array) {
        ...
        return (min, max);
    }

At that point, if we take a look the compiler does two different operations, first it desugar the return type (int min, int max) to the record FunnyNameWithIntMinAndIntMaxInIt, then it uses inference to convert the syntax (min, max) to new FunnyNameWithIntMinAndIntMaxInIt(min, max).
I see no reason to not allow this inference to work with user defined record, so the class MutablePoint can be written that way

    class MutablePoint {
        int x, y;

        record PointTuple(int x, int y) { }
        record Point3DTuple(int x, int y, int z) { }

        public PointTuple deconstructor() {
            return (this.x, this.y);    // inferred as PointTuple
        }
        public Point3DTuple deconstructor() {
            return (this.x, this.y, 0);  // inferred as Point3DTuple
        }
    }

Now, let's take a look to the use-site, where a deconstructor is used, it can be in a switch,

    MutablePoint p = ...
    switch(p) {
        case MutablePoint(var x, var y): ...
    }

or it can be when doing a de-structured assignment

    MutablePoint p = ...
    MutablePoint(var x, var y) = p;

in both case, we can use inference in the same way we can infer the constructor, we can infer part of the de-structuring pattern, so it's reasonable given that a constructor and a de-constructor are dual.
So you can write:

    MutablePoint p = ...
    switch(p) {
        case (var x, var y): ...   // inferred MutablePoint
    }

or

    MutablePoint p = ...
    (var x, var y) = p;

So the good news is that we can have deconstructors with only a small tweak of the compiler but i will argue that if we introduce the notion of synthetic record, we should also implement inference for record constructors and the de-structuring patterns.
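A minimal sketch of the record-returning approach described above, restricted to what records in Java 16 already allow. It deliberately does not use the proposed `(int min, int max)` return syntax, the `(min, max)` inferred-constructor form, or overloading on return type, since none of those exist in the language. The names `DeconstructorSketch`, `MinMax`, and `deconstruct` are hypothetical and chosen only for illustration.

```java
public class DeconstructorSketch {
    // Hypothetical carrier records; the user must declare and name them explicitly.
    record PointTuple(int x, int y) { }
    record MinMax(int min, int max) { }

    static final class MutablePoint {
        int x, y;

        MutablePoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // An ordinary method standing in for a deconstructor:
        // it copies the current field values into a fresh record.
        PointTuple deconstruct() {
            return new PointTuple(this.x, this.y);
        }
    }

    // Returns both extremes of a non-empty array in a single record.
    static MinMax minMax(int[] array) {
        if (array.length == 0) {
            throw new IllegalArgumentException("empty array");
        }
        int min = array[0];
        int max = array[0];
        for (int value : array) {
            if (value < min) min = value;
            if (value > max) max = value;
        }
        return new MinMax(min, max);
    }

    public static void main(String[] args) {
        PointTuple t = new MutablePoint(1, 2).deconstruct();
        System.out.println(t.x() + ", " + t.y());

        MinMax mm = minMax(new int[] {3, 1, 4, 1, 5});
        System.out.println(mm.min() + " .. " + mm.max());
    }
}
```

Because the method has an ordinary name, a second carrier shape (say, a three-component one) would need a second method name; that is the overloading limitation the thread goes on to discuss.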
regards,
Rémi

From forax at univ-mlv.fr Thu Aug 13 13:10:46 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 15:10:46 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr>
Message-ID: <1529996403.55324.1597324246152.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, August 13, 2020 03:51:51
> Subject: Re: A peek at the roadmap for pattern matching and more

>> In the code:
>>   public deconstructor B(int a, int b) {
>>     super(var aa) = this;
>>     a = aa;
>>     b = this.b;
>>   }
>> i believe the first line should be
>>   A(var aa) = super;

> P = e
> is like
> if (e instanceof P) { rest of method }.

> What would be on the LHS of the instanceof would be `this`, not `super`. This is
> like `if (this instanceof super(var aa))`.

I'm not able to parse "super(var aa)", it's not a pattern we have not talk about.
And "super" is "this" typed as the superclass and with the method calls on it using invokespecial instead of invokevirtual.

Rémi

From brian.goetz at oracle.com Thu Aug 13 13:29:43 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 09:29:43 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1529996403.55324.1597324246152.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1529996403.55324.1597324246152.JavaMail.zimbra@u-pem.fr>
Message-ID: <349D941D-69C3-44E6-A6C3-5D1558F9831A@oracle.com>

> I'm not able to parse "super(var aa)", it's not a pattern we have not talk about.
> And "super" is "this" typed as the superclass and with the method calls on it using invokespecial instead of invokevirtual.

It's a straightforward duality to constructor-super relationships. In a _constructor_, you can say super(...) and it will overload select against constructors in the superclass, and chain to that. In a _deconstructor_, you can use super(...) as a pattern, and it will overload select against deconstructors in the superclass, and match to that.

From brian.goetz at oracle.com Thu Aug 13 13:42:40 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 09:42:40 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID: <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com>

> We want a destructor, a destructor is a method that returns a list of values the same way a constructor takes a list of parameters.
> The way to represent that list of value is to use a record.

Yes, already been down that road, I did an entire sketch of this feature where we expose "anonymous records." It felt really cool -- for about five minutes. The conclusion of this exercise was that in this case, records are better as a _translation mechanism_ than a surface syntax.

So yes, the way to represent it is a record, but in the class file, not the language.

> So we can already write a de-constructor with the current version of Java,

As you figured out, if you cast a deconstructor as a method with a special name, you say goodbye to deconstructor overloading. And if you do make it magically overloadable, then it's magic, and you've lost all the benefit of pretending it's a method. You just moved the complexity around.
And again here comes that glass half empty feeling: "Why can I only do it with this method?". Much better to recognize that deconstruction is a fundamental operation on objects, just like construction -- because it is.

> i should wuth by example a method minMax that returns both the minimum and the maximum of an array
> public static (int min, int max) minMax(int[] array) {

Nope. Not going there. I went down this road too, but multiple-return is another one of those "tease" features that looks cool but very quickly looks like glass 80% empty. More fake tuples for which the user will not thank us. And, to the extent we pick _this particular_ tuple syntax, the reaction is "well then WTF did you do records for?" Having records _and_ structural tuples would be silly. We've made our choice: Java's tuples are nominal.

Worse, it only addresses the simplest of patterns -- deconstructors -- because they are inherently total. When you get to conditional patterns (Optional.of(var x)), you want to wrap that record in something like an optional. If the return value is fake, now you have to deal with the interaction of trade tuples and generics. No thank you.

Believe me, I've been down ALL of these roads.

> the compiler will generate a synthetic record, so the generated code is equivalent to

The synthetic record part is the part I liked, and kept.

There's room for an "anonymous tuple" feature someday, maybe, but we don't need it now. I'm willing to consider the "record conversion" when we get to collection literals, but not now.

From brian.goetz at oracle.com Thu Aug 13 14:24:11 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 10:24:11 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com>
References: <295034635.218312.1597176064428.JavaMail.zimbra@u-pem.fr> <16803318-455A-4F8A-A92B-9CBB0EBC406C@oracle.com> <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com>
Message-ID:

>> I agree destructuring is just as important as conditionality and those two things should be orthogonal.
>> But i still think having a keyword to signal that a pattern (not a case) is total is better than letting people guess.

I'm glad this discussion has moved to the right thing -- whether we should provide extra help in type-checking totality. (Note, though, there were several dozen aspects to my patterns-in-switch proposal, and we've spent 99% of our time discussing a single, pretty cornery one. This is the hazard of discussing these things on mailing lists, where all the discussion is replies to the first thing that someone complained about, especially when the first complaint is loud. So EVERYONE, PLEASE, take a detailed review through the doc. It can't be 100% perfect aside from this issue.)

I still think, though, that this may be mostly fear of the unknown. Having spent the last year staring at pattern switches, it is really pretty obvious which are the total ones. I think that will be true for most users, once they use the feature a little. (Remember how alien people said method references looked, or how worried people were about the fact that static mrefs and unbound instance mrefs had the same syntax? Wasn't a problem.)
So all of this discussion rests on the assumption that (a) it is really hard to tell when a pattern is total or not, and (b) the stakes for getting that wrong are really high. I worry that both of these concerns are significantly overstated, and that the incremental user model complexity may be worse than the disease.

> Yes, and here is the example that convinced me that one needs to be able to mark patterns as total, not just cases:

The underlying observation here is that totality is a property of an arbitrary sub-tree of a pattern, which could be the whole thing, a leaf, or some intermediate sub-tree.

Guy's exploration into overloading `default` does shine a light on one of the uglier parts of my proposal, which is that the existing `default` case is mostly left to rot. I'm going to take this as a cue that, whether we do something to highlight pattern totality or not, that we should try to integrate default cases better into the design.

So, here are the properties of the existing default case, in the current state of the proposal:

 - it matches all non-null values with no bindings
 - it can only appear at the end (unlike the current language, where it can appear anywhere)
 - You cannot fall into it from a pattern case (unlike the current language, where you can fall into it)
 - People will either use default or a total pattern, but rarely both (since their only difference is null)

Just as we've leaned towards rehabilitating switch, maybe we can try to rehabilitate default. And the role it can play is in signaling totality of the switch.

There are two "cases" of exhaustive switches: nullable ones and non-nullable ones. The nullable ones are those that end in a total (catch-all) pattern. The non-nullable ones are the analogues of the current exhaustive switch on enums; those where all the cases are "parts"
of the whole:

    switch (day) {
        case MONDAY..FRIDAY: work();
        case SATURDAY: play();
        case SUNDAY: worship();
        // And on the null day, there was a NullPointerException
    }

This is a total switch, but not a nullable one. On the other hand:

    switch (something) {
        case X: ...
        case Object o:
    }

This is also a total switch, but a nullable one.

So here's a proposed rehabilitation of default, inspired by Guy's exploration:

 - A switch has a sequence of cases, with zero or one default clauses
 - The default case must be last (except for legacy switches)
 - A switch with a default case must be total (possibly modulo null)
 - Default can be used with no pattern, which means "everything else but null"
 - Default can be used with a pattern, in which case it has exactly the same semantics as the same pattern with "case"
 - Nullability is still strictly a function of whether there are any nullable cases, which can only be "case null" (first) and total pattern (last)

So we can say

    switch (something) {
        case X: ...
        case Object o:
    }

or

    switch (something) {
        case X: ...
        default Object o:
    }

which mean the same thing, but the latter engages the additional type checking that the switch is total. This is not inconsistent with Guy's sketch, it just is the switch-specific part.

From guy.steele at oracle.com Thu Aug 13 17:22:46 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 13 Aug 2020 13:22:46 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <1036714364.51116.1597321182916.JavaMail.zimbra@u-pem.fr>
References: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com> <1036714364.51116.1597321182916.JavaMail.zimbra@u-pem.fr>
Message-ID: <5FCCADEE-0F72-4EFB-8A63-B6AD46673086@oracle.com>

> On Aug 13, 2020, at 8:19 AM, forax at univ-mlv.fr wrote:
>
> . . .
>
> I wonder if we find it natural only because we are used to use the keyword "default" inside a switch, . . .

I think that may be so; but given that it is so, I am happy to exploit that fact!

> I think i prefer using "default" (or any other keyword) only where it makes sense and doesn't allow "default" to be propagated.
> so
>   default Pair p: ...
> is ok but
>   default Pair(Box(Frog f), Bag(Object o)): ...
> should be written
>   case Pair(Box(Frog f), Bag(default Object o)): ...

I think you intended that last line to read

    case Pair(Box(default Frog f), Bag(default Object o)): ...

and if so, I agree that this may be a better way to write it in the context I originally gave:

    switch (x) {
        case Pair(Box(Tadpole t), Bag(String s)): ...
        case Pair(Box(Tadpole t), Bag(default Object o)): ...
        case Pair(Box(default Frog f), Bag(String s)): ...
        case Pair(Box(default Frog f), Bag(default Object o)): ...    // I originally had "default Pair(Box(Frog f), Bag(Object o)): ..." here
    }

But either way works, because of the subtle fact that if P: Pattern T, then Q is total over type T if and only if P(Q) is total over type P, so one can choose, on purely stylistic grounds, whether to use the "default"
tag at the root of a pattern subtree that is total, or at all the relevant leaves, or for that matter at a relevant set of interior subtrees.

From guy.steele at oracle.com Thu Aug 13 19:15:23 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 13 Aug 2020 15:15:23 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com>
Message-ID: <7FF83B5A-1AF0-415C-983B-C3938B8A29DB@oracle.com>

> On Aug 13, 2020, at 9:42 AM, Brian Goetz wrote:
>
> . . .
>> i should wuth by example a method minMax that returns both the minimum and the maximum of an array
>
>> public static (int min, int max) minMax(int[] array) {
>
> Nope. Not going there. I went down this road too, but multiple-return is another one of those "tease" features that looks cool but very quickly looks like glass 80% empty. More fake tuples for which the user will not thank us. And, to the extent we pick _this particular_ tuple syntax, the reaction is "well then WTF did you do records for?" Having records _and_ structural tuples would be silly. We've made our choice: Java's tuples are nominal.
> . . .
>
> There's room for an "anonymous tuple" feature someday, maybe, but we don't need it now.

I think it is important that we first get the right story for general nominal tuples.

And I also recommend that we then quickly nail down some turf by immediately putting into the library (say, in java.util) some standard record classes

    Pair<A, B>(A first, B second)
    Triple<A, B, C>(A first, B second, C third)
    Quadruple<A, B, C, D>(A first, B second, C third, D fourth)
    . . .
    Octuple<A, B, C, D, E, F, G, H>(A first, B second, C third, D fourth, E fifth, F sixth, G seventh, H eighth)

Once that is done, I believe it would be very easy, if we ever care to do it, to introduce

    (a, b, c)

in an expression context as a syntactic abbreviation for

    new java.util.Triple(a, b, c)

and

    (A, B, C)

in a type context as a syntactic abbreviation for

    java.util.Triple<A, B, C>

N'est-ce pas?

(This is where I get to recount once again my little story about that fact that way back in 1995, I persuaded James and the Java team to remove C's comma operator from Java on the grounds that (a) in C it's used only in macros, which Java doesn't have, and in `for` statements, which can be handled specially in the syntax, and (b) it would leave open the possibility of supporting tuples later. And 25 years later it's still an option.)

From guy.steele at oracle.com Thu Aug 13 19:39:33 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 13 Aug 2020 15:39:33 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID:

> On Aug 13, 2020, at 9:06 AM, forax at univ-mlv.fr wrote:
>
> . . .
> ok, let's take a step back because i think the current proposal has fall in a kind of local maximum in term of design.
>
> We want a destructor, a destructor is a method that returns a list of values the same way a constructor takes a list of parameters.
> The way to represent that list of value is to use a record.

While I agree that this sort of approach can be made to work, I have to say it is philosophically puzzling to say that the way to destruct (or deconstruct) an object is to take values from the relevant fields and then package them up in some other object!
Superficially, at least, it seems like this approach leads to an infinite regress.

Whereas I can more easily understand that the job of

    public deconstructor Point(int x, int y) {
        x = this.x;
        y = this.y;
    }

is to take values out of the object "this" and put them into separate _variables_, not a new object. (Granted, these variables have a somewhat new and mysterious existence and character.)

From brian.goetz at oracle.com Thu Aug 13 20:14:29 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 16:14:29 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID:

>
> Whereas I can more easily understand that the job of
>
>     public deconstructor Point(int x, int y) {
>         x = this.x;
>         y = this.y;
>     }
>
> is to take values out of the object "this" and put them into separate
> _variables_, not a new object. (Granted, these variables have a
> somewhat new and mysterious existence and character.)

And ...

It is very easy to look at things from the perspective of the language we have, and say "that's just a method, it should look like a method." This looks like it doesn't add new abstractions that the user must understand, and that's often the right move. But really, we should look at it from the perspective of the language we _want to have in the future_.

IMO, that the language does not have deconstructors yet looks to me like a gaping hole! A key principle of OO is _mediated access to encapsulated state_. That is, we hide our representation, and then expose behavior that mediates access to a relevant subset of the state, adding validation, copying, adaptation, etc along the way.
But why do we have mediated aggregation, mediated access, but not mediated disaggregation? Why must every object creation be a one-way trip? (Or, if not, why must the return trip look nothing like the outbound?) Something is missing.

By framing deconstructors as the arrow-reversed form of constructors, we provide the missing edges in our graph in a symmetrical way. By framing deconstruction as merely a "multi-getter", we may provide the missing edges, but the result is badly asymmetric.

From forax at univ-mlv.fr Thu Aug 13 20:22:34 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 22:22:34 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com>
Message-ID: <710875014.108955.1597350154092.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 13 August 2020 15:42:40
> Subject: Re: A peek at the roadmap for pattern matching and more

>> We want a destructor, a destructor is a method that returns a list of values the
>> same way a constructor takes a list of parameters.
>> The way to represent that list of value is to use a record.

> Yes, already been down that road, I did an entire sketch of this feature where
> we expose "anonymous records." It felt really cool -- for about five minutes.
> The conclusion of this exercise was that in this case, records are better as a
> _translation mechanism_ than a surface syntax.

First, why? And second, my experience is the opposite: the more difference there is between Java the language and the translation strategy, the more it will bite you with corner cases.

> So yes, the way to represent it is a record, but in the class file, not the
> language.
>> So we can already write a de-constructor with the current version of Java,

> As you figured out, if you cast a deconstructor as a method with a special name,
> you say goodbye to deconstructor overloading. And if you do make it magically
> overloadable, then it's magic,

It's not magic, it's a translation strategy, the same way as with bridges: you can have methods with the same name and different return types.

> and you've lost all the benefit of pretending it's a method. You just moved the
> complexity around. And again here comes that glass half empty feeling: "Why can
> I only do it with this method?". Much better to recognize that deconstruction
> is a fundamental operation on objects, just like construction -- because it is.

But constructors are plain methods with some special bells and whistles (funny names, funny rules for the initialization of this and final fields, etc). Having a deconstructor being a method with several overloads doesn't strike me as very magic compared to a constructor.

And you can not have several overloaded methods with different return types because Java, like C, doesn't force you to specify the return type of a method call. So unlike with a pattern where you provide the type, you don't provide the return type when you do a method call.

>> i should wuth by example a method minMax that returns both the minimum and the
>> maximum of an array
>> public static (int min, int max) minMax(int[] array) {

> Nope. Not going there. I went down this road too, but multiple-return is another
> one of those "tease" features that looks cool but very quickly looks like glass
> 80% empty. More fake tuples for which the user will not thank us. And, to the
> extent we pick _this particular_ tuple syntax, the reaction is "well then WTF
> did you do records for?"

Records are the underlying type used by the tuple syntax. The tuple syntax is, as its name suggests, just a syntax. From the semantics POV, it's an anonymous record, thus, not a structural type.
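
For comparison, here is a minimal sketch of what the minMax example looks like with a named record, assuming Java 16 or later. The names MinMax and MinMaxDemo are hypothetical and chosen only for illustration; the (int min, int max) return syntax quoted above does not exist in Java.

```java
// Hypothetical named carrier for the two results. This is the nominal
// record that the quoted "(int min, int max)" syntax would stand for.
record MinMax(int min, int max) {}

final class MinMaxDemo {
    // Returns both the minimum and the maximum of a non-empty array.
    static MinMax minMax(int[] array) {
        if (array.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int min = array[0];
        int max = array[0];
        for (int value : array) {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        return new MinMax(min, max);
    }
}
```

The caller reads the components through the generated accessors, for example minMax(values).min(). The record is nominal: its identity comes from the name MinMax, not from the shape (int, int).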
You may use a record instead of an anonymous record if you want to reuse the same record for multiple deconstructors; if i have several classes representing rectangular shapes, i may want them to share the same record. It's really very similar to the difference between a method reference and a lambda, or a classical class and an anonymous class. A record is explicitly named, it can have annotations, it can be serializable, implement interfaces, have its own toString() etc. An anonymous record is the anonymous version of a record, where you don't care about the name or resource sharing (one record per deconstructor).

> Having records _and_ structural tuples would be silly. We've made our choice:
> Java's tuples are nominal.

But the syntax you are proposing for deconstruction is using the syntax of _structural tuples_. Here the glass is half empty, because why can the deconstructor have that feature and not any method, given that in the end a de-constructor is desugared as a method that returns a record? As i said in my original mail. I agree with you that we don't want _structural tuples_, but it's not, it's an anonymous record, and restricting them only to deconstructors does not make a lot of sense. It's like saying: look at this shiny anonymous record, deconstructors can use them but not regular methods.

> Worse, it only addresses the simplest of patterns -- deconstructors -- because
> they are inherently total. When you get to conditional patterns
> (Optional.of(var x)), you want to wrap that record in something like an
> optional. If the return value is fake, now you have to deal with the
> interaction of trade tuples and generics. No thank you.

I don't see the issue here, where is the problem with the following code ?

    Optional<(int x, int y)> deconstructor() {
        return Optional.of( (this.x, this.y) );
    }

Maybe users will find the syntax weird ? But it's like any new feature ?
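
As a point of reference, the same shape can be written in today's Java (16 or later) with a named record inside an Optional. This is only a sketch: Coords, Cell, and the occupied flag are hypothetical names invented for the example, and the method is an ordinary method, not the proposed deconstructor feature.

```java
import java.util.Optional;

// Hypothetical named record standing in for the anonymous "(int x, int y)".
record Coords(int x, int y) {}

final class Cell {
    private final int x;
    private final int y;
    private final boolean occupied;

    Cell(int x, int y, boolean occupied) {
        this.x = x;
        this.y = y;
        this.occupied = occupied;
    }

    // A conditional "multi-getter": yields the components only when the
    // cell is occupied, mirroring a partial (non-total) pattern.
    Optional<Coords> coords() {
        return occupied ? Optional.of(new Coords(x, y)) : Optional.empty();
    }
}
```

Because Coords is an ordinary nominal class, it composes with generics such as Optional without any new typing rules.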
I believe you are thinking that the type (int x, int y) is a tuple while technically it's an anonymous record, so from the type system POV it acts as a record, i.e. a nominal class.

> Believe me, I've been down ALL of these roads.

>> the compiler will generate a synthetic record, so the generated code is
>> equivalent to

> The synthetic record part is the part I liked, and kept.
> There's room for an "anonymous tuple" feature someday, maybe, but we don't need
> it now.

I'm not suggesting to introduce an anonymous tuple, as you said above, we have decided to use record instead, a nominal type. I'm suggesting to introduce the concept of anonymous record, which is what the synthetic record is, because i see no point to restrict it only to deconstructors.

> I'm willing to consider the "record conversion" when we get to collection
> literals, but not now.

To represent the couple key/value i suppose.
I don't disagree, but i don't think it's wise to introduce the idea of a synthetic record without the notion of anonymous record, but we don't need a synthetic record for deconstructors, it's nice to have but not an essential part of the design.

Rémi

From brian.goetz at oracle.com Thu Aug 13 20:28:52 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 16:28:52 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <710875014.108955.1597350154092.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com> <710875014.108955.1597350154092.JavaMail.zimbra@u-pem.fr>
Message-ID:

This was meant to be a sneak peek, for context, not an invitation to argue about the syntax. Was it a mistake to share it at this time?
(And remember the rule: don't comment on syntax until you've said everything you have to say about the model, because otherwise nothing else will ever get said. So shall I assume you think the model is perfect?)

From forax at univ-mlv.fr Thu Aug 13 20:41:13 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 22:41:13 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID: <362249704.109869.1597351273358.JavaMail.zimbra@u-pem.fr>

> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, 13 August 2020 21:39:33
> Subject: Re: A peek at the roadmap for pattern matching and more

>> On Aug 13, 2020, at 9:06 AM, forax at univ-mlv.fr wrote:
>> . . .
>> ok, let's take a step back because i think the current proposal has fall in a
>> kind of local maximum in term of design.
>> We want a destructor, a destructor is a method that returns a list of values the
>> same way a constructor takes a list of parameters.
>> The way to represent that list of value is to use a record.
> While I agree that this sort of approach can be made to work, I have to say it
> is philosophically puzzling to say that the way to destruct (or deconstruct) an
> object is to take values from the relevant fields and then package them up in
> some other object! Superficially, at least, it seems like this approach leads
> to an infinite regress.

We hope most of these records will be inline types (see the first sentence about the translation strategy), so they are boxes in terms of Java syntax but are more like immutable compounds of values at runtime.

> Whereas I can more easily understand that the job of
>     public deconstructor Point(int x, int y) {
>         x = this.x;
>         y = this.y;
>     }
> is to take values out of the object "this" and put them into separate
> _variables_, not a new object.

If i am one of my students, either i will not understand why assigning x and y changes something, or i will declare x and y final because StackOverflow says you should do that for every method.

> (Granted, these variables have a somewhat new and mysterious existence and
> character.)

yes it's full of magic.

Rémi
From forax at univ-mlv.fr Thu Aug 13 20:44:59 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 13 Aug 2020 22:44:59 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1C1EBC04-ABBB-4F37-B352-BF0D21937646@oracle.com> <710875014.108955.1597350154092.JavaMail.zimbra@u-pem.fr>
Message-ID: <1267009961.110217.1597351499357.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 13 August 2020 22:28:52
> Subject: Re: A peek at the roadmap for pattern matching and more

> This was meant to be a sneak peek, for context, not an invitation to argue about
> the syntax. Was it a mistake to share it at this time?

No you can do that by the end of September, i'm usually too busy. Currently, i'm on vacation walking on the beach and swimming, so i've a lot of free time to think :)

> (And remember the rule: don't comment on syntax until you've said everything you
> have to say about the model, because otherwise nothing else will ever get said.
> So shall I assume you think the model is perfect?)

Rémi
From brian.goetz at oracle.com Thu Aug 13 21:18:56 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 17:18:56 -0400
Subject: Last call on pattern switch (well, not really)
Message-ID: <3ef710bd-06df-a634-3069-37e9b48d3d6f@oracle.com>

So, what happened is what always happens on mailing lists -- I put out a multi-page writeup reflecting hundreds of hours of research and incorporating years of discussion, and 99% of the discussion was a too-loud, back-and-forth thread on a relatively uninteresting corner case on the subject of whatever happened to be the first strongly-stated opinion.

The result is that we didn't have a substantive discussion on the other 99% of the proposal, and some folks may even have been intimidated by the back-and-forth (see, for example, Tagir's comment: https://twitter.com/tagir_valeev/status/1293931093066997761) and held back on their feedback. I would be very unhappy if we missed out on Tagir's feedback because we had made the environment inhospitable because of a long back-and-forth on a less important topic.

Let me reiterate some guidelines for discussion, that hopefully will keep us from finding ourselves in this corner too frequently. I've said all this before, but it's good to repeat once in a while.

 - Be aware that syntax discussions always suck up the oxygen. Once the syntax discussion starts, it is unlikely any substantive discussion on the more important issues will take root. (With the right model, the right syntax can be found later; the wrong model can't be saved even with the best of syntax.) So please, save these until you're confident that you -- and everyone else -- have said what you have to say about goals, models, success metrics, and the like first.

 - Be mindful of the shape of the reply chain. The best discussions usually have wide but shallow trees, where many people comment, but no reply-chain goes too long. The worst are usually long and narrow.

 - Lead with uncertainty. Things usually start on the wrong foot if we lead with "X is wrong" or "You should do it Y way instead." Better to ask rather than tell; there's a good chance that the proposal author has already spent a lot of time thinking about the problem and may already have considered X or Y, or there may be bigger-picture issues that have motivated the proposed direction.

 - The trivial crowds out the substantial. We all have a tendency to "I'll just reply quickly with the trivial stuff", because that's easy and we're busy. But very often these things tend to dominate the discussion. Probably best to try to cover everything in your first draft (or ask questions if you're stuck) rather than send the trivial comments first.

Thanks for everyone's help in keeping the discussion moving in the right directions. We need everyone's perspective here.

And for those of you who haven't reviewed the patterns-in-switch draft, please do ... the ship is leaving the dock soon.

From john.r.rose at oracle.com Thu Aug 13 21:37:47 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 14:37:47 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Aug 13, 2020, at 12:39 PM, Guy Steele wrote:
>
> Whereas I can more easily understand that the job of
>
>     public deconstructor Point(int x, int y) {
>         x = this.x;
>         y = this.y;
>     }
>
> is to take values out of the object "this" and put them into separate _variables_, not a new object. (Granted, these variables have a somewhat new and mysterious existence and character.)
And if this mysterious character were something completely unrelated to any other part of the Java language, I'd be more inclined to admit that maybe the missing primitive is some sort of tuple. It might have to be given a status like that of Java arrays to avoid the infinite regress problem you point out.

BUT, when I stare at a block of code that is setting some named/typed variables, where those variables must be DA at the end of the block, and then they are to be available in various other scopes (not the current scope, but other API points), THEN I say, "where have I seen that pattern before?" There is ALREADY a well-developed part of the Java language which covers this sort of workflow (of putting values into a set of required named/typed variables).

Of course, it's a constructor, especially when the object has final fields that are checked by the DA/DU rules. Now, maybe those rules aren't everybody's favorite; they are complicated to learn. But Java programmers do learn them. How about if we give them additional rewards for having learned them? Instead of asking them to learn yet another sub-language (tuples-for-deconstructors) that is completely different?

(Yes, I'm partial to DA/DU because those are my rules, for better or worse. Maybe Remi's going to say something about a sunk cost fallacy. But I think the rules are useful and, with an IDE's help, intuitive. And they can do more for us now. Let's double down on them.)

So here's a principle to try out:

The natural form of a multiple-value producing construct, in Java, is a scope in which named/typed variables are in scope, are DU, and must be DA before exit.
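
For readers less familiar with the rules being referred to, here is a minimal sketch of how definite assignment (DA) and definite unassignment (DU) already govern blank final fields in a constructor body in today's Java. The class name ClampedPoint and the clamping rule are hypothetical, invented for this illustration.

```java
final class ClampedPoint {
    // Blank finals: definitely unassigned (DU) on entry to the constructor.
    private final int x;
    private final int y;

    ClampedPoint(int x, int y) {
        // Each branch assigns this.x exactly once, so this.x is
        // definitely assigned (DA) after the if/else on every path.
        if (x < 0) {
            this.x = 0;
        } else {
            this.x = x;
        }
        this.y = y;
        // The compiler rejects a constructor that can exit normally with a
        // blank final still unassigned, and rejects a second assignment.
    }

    int x() { return x; }
    int y() { return y; }
}
```

The proposed deconstructor body would apply the same discipline in the other direction, with the binding variables playing the role that the final fields play here.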
As a variation, the natural form of a transactional multiple-value consuming-and-producing construct, in Java, is a scope where the values to be produced and consumed are both named/typed variables (as above), and where if a name is to be produced and consumed, it is a (mutable) DA value which is updated in the scope, and if a name is to be consumed only it is an (immutable) DA value, and otherwise the name to be produced only is DU but must be DA at every (normal) block exit.

Regarding "block exit": A keyword like "return" (as in lambda) or "yield" (as in e-switch) or "break" (as in s-switch) can provide early return, exactly as today with constructors, and with suitable DU/DA requirements on the variables. (A value-producing return (or yield) could in some cases be taken to return a right-typed bundle. This might make sense, for example, with a constructor for an inline type, which aborts the construction of the current "this" in favor of some replacement value. This is illegitimate for an identity class, but reasonable for an inline class.)

The above proposed patterns make sense both internally to a class (e.g., as constructor bodies which can touch private names) or externally (for untrusted code outside the capsule), as some sort of with-block construction, or (today) lambdas.

Note that currently the natural form of a multiple-value consuming construct in Java is a method or lambda body with formal parameter list containing (wait for it) a set of named/typed variables in scope. And/or a method body, where the named/typed variables are instance variables of the method's class. Or both (there are extra axes of variation here).
From john.r.rose at oracle.com Thu Aug 13 21:38:29 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 14:38:29 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID: <5108C91B-11CA-4B62-BE6B-045A57CECF33@oracle.com>

What you said!

On Aug 13, 2020, at 1:14 PM, Brian Goetz wrote:
>
> By framing deconstructors as the arrow-reversed form of constructors, we provide the missing edges in our graph in a symmetrical way. By framing deconstruction as merely a "multi-getter", we may provide the missing edges, but the result is badly asymmetric.

From guy.steele at oracle.com Thu Aug 13 22:01:51 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 13 Aug 2020 18:01:51 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <5108C91B-11CA-4B62-BE6B-045A57CECF33@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <5108C91B-11CA-4B62-BE6B-045A57CECF33@oracle.com>
Message-ID:

> On Aug 13, 2020, at 5:37 PM, John Rose wrote:
>
> On Aug 13, 2020, at 12:39 PM, Guy Steele wrote:
>>
>> Whereas I can more easily understand that the job of
>>
>>     public deconstructor Point(int x, int y) {
>>         x = this.x;
>>         y = this.y;
>>     }
>>
>> is to take values out of the object "this" and put them into separate _variables_, not a new object. (Granted, these variables have a somewhat new and mysterious existence and character.)
>
> And if this mysterious character were something completely unrelated
> to any other part of the Java language, I'd be more inclined to admit
> that maybe the missing primitive is some sort of tuple. It might have
> to be given a status like that of Java arrays to avoid the infinite regress
> problem you point out.
>
> BUT, when I stare at a block of code that is setting some named/typed
> variables, where those variables must be DA at the end of the block,
> and then they are to be available in various other scopes (not the
> current scope, but other API points), THEN I say, "where have I
> seen that pattern before?" There is ALREADY a well-developed
> part of the Java language which covers this sort of workflow
> (of putting values into a set of required named/typed variables).
>
> Of course, it's a constructor,

Actually, a constructor _body_.

Let us also recall that there is a second well-developed part of the Java language that puts values into a set of required named/typed variables: method invocation. And its structure and behavior are rather different from that of a constructor body.

(more below)

> especially when the object has final
> fields that are checked by the DA/DU rules. Now, maybe those rules
> aren't everybody's favorite; they are complicated to learn. But
> Java programmers do learn them. How about if we give them
> additional rewards for having learned them? Instead of asking
> them to learn yet another sub-language (tuples-for-deconstructors)
> that is completely different?
>
> (Yes, I'm partial to DA/DU because those are my rules, for better
> or worse. Maybe Remi's going to say something about a sunk cost
> fallacy. But I think the rules are useful and, with an IDE's help,
> intuitive. And they can do more for us now. Let's double down
> on them.)
>
> So here's a principle to try out:
>
> The natural form of a multiple-value producing construct, in Java,
> is a scope in which named/typed variables are in scope, are DU,
> and must be DA before exit.
>
> As a variation, the natural form of a transactional multiple-value
> consuming-and-producing construct, in Java, is a scope where
> the values to be produced and consumed are both named/typed
> variables (as above), and where if a name is to be produced and
> consumed, it is a (mutable) DA value which is updated in the
> scope, and if a name is to be consumed only it is an (immutable)
> DA value, and otherwise the name to be produced only is DU
> but must be DA at every (normal) block exit.
>
> Regarding "block exit": A keyword like "return" (as in lambda)
> or "yield" (as in e-switch) or "break" (as in s-switch) can provide
> early return, exactly as today with constructors, and with suitable
> DU/DA requirements on the variables. (A value-producing return
> (or yield) could in some cases be taken to return a right-typed
> bundle. This might make sense, for example, with a constructor
> for an inline type, which aborts the construction of the current
> "this" in favor of some replacement value. This is illegitimate
> for an identity class, but reasonable for an inline class.)
>
> The above proposed patterns make sense both internally to
> a class (e.g., as constructor bodies which can touch private names)
> or externally (for untrusted code outside the capsule), as some sort
> of with-block construction, or (today) lambdas.
>
> Note that currently the natural form of a multiple-value consuming
> construct in Java is a method or lambda body with formal parameter list
> containing (wait for it) a set of named/typed variables in scope.
> And/or a method body, where the named/typed variables are instance
> variables of the method's class. Or both (there are extra axes of
> variation here).
>
> On Aug 13, 2020, at 5:38 PM, John Rose wrote:
>
> What you said!
>
> On Aug 13, 2020, at 1:14 PM, Brian Goetz wrote:
>>
>> By framing deconstructors as the arrow-reversed form of constructors, we provide the missing edges in our graph in a symmetrical way. By framing deconstruction as merely a "multi-getter", we may provide the missing edges, but the result is badly asymmetric.
>

All of which would seem to suggest Rémi's multi-value-return minmax example as the dual to method invocation:

>> . . . a method minMax that returns both the minimum and the maximum of an array
>>     public static (int min, int max) minMax(int[] array) {
>>
> Nope. Not going there. I went down this road too, but multiple-return is another one of those "tease" features that looks cool but very quickly looks like glass 80% empty.

Part of the job of method invocation is to take a set of values and definitely assign them to a set of variables (the method parameters). This could be done with a block that is charged with the task of definitely assigning to those variables:

    Math.atan{ x = 2.0; y = 3.0 }
    myString.substring{ if (weird) { beginIndex = 3; endIndex = 5; } else { beginIndex = 0; endIndex = myString.length(); } }

but for convenience (or for compatibility with C) we provide a different mechanism, with different syntax, that in effect uses positional tuples. A block-with-assignment mechanism is possible, but that's not Java.

Therefore we will keep re-encountering the question of why positional tuples are good Java style for passing several arguments to a method but not for returning several values from a method.
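
For comparison with the hypothetical `substring{ ... }` block above, here is a minimal sketch of the form Java already accepts: a constructor body whose blank final fields are definitely unassigned (DU) on entry and must be definitely assigned (DA) on every normal exit. The class `Range`, its method `applyTo`, and the `weird` flag are hypothetical names chosen to mirror the example; they are not part of any real API.

```java
// Hypothetical class mirroring the substring{...} example above.
final class Range {
    final int beginIndex;   // blank final: DU on entry to the constructor
    final int endIndex;     // blank final: DU on entry to the constructor

    Range(String s, boolean weird) {
        if (weird) {
            beginIndex = 3;
            endIndex = 5;
        } else {
            beginIndex = 0;
            endIndex = s.length();
        }
        // Both fields are DA here on every path. Dropping either
        // assignment from one branch is a compile-time error.
    }

    // Feeds the two named values back into a positional call.
    String applyTo(String s) {
        return s.substring(beginIndex, endIndex);
    }
}
```

The named assignments live inside the block, while the call to `substring` at the end is positional, which is the asymmetry under discussion.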

From john.r.rose at oracle.com  Thu Aug 13 23:00:28 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 16:00:28 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <5108C91B-11CA-4B62-BE6B-045A57CECF33@oracle.com>
Message-ID: <710B9B56-01B7-4E6A-BA2F-BA146ED953E5@oracle.com>

On Aug 13, 2020, at 3:01 PM, Guy Steele wrote:
>
>>
>> On Aug 13, 2020, at 5:37 PM, John Rose wrote:
>>
>> On Aug 13, 2020, at 12:39 PM, Guy Steele wrote:
>>>
>>> Whereas I can more easily understand that the job of
>>>
>>>     public deconstructor Point(int x, int y) {
>>>         x = this.x;
>>>         y = this.y;
>>>     }
>>>
>>> is to take values out of the object "this" and put them into separate _variables_, not a new object. (Granted, these variables have a somewhat new and mysterious existence and character.)
>>
>> And if this mysterious character were something completely unrelated
>> to any other part of the Java language, I'd be more inclined to admit
>> that maybe the missing primitive is some sort of tuple. It might have
>> to be given a status like that of Java arrays to avoid the infinite regress
>> problem you point out.
>>
>> BUT, when I stare at a block of code that is setting some named/typed
>> variables, where those variables must be DA at the end of the block,
>> and then they are to be available in various other scopes (not the
>> current scope, but other API points), THEN I say, "where have I
>> seen that pattern before?" There is ALREADY a well-developed
>> part of the Java language which covers this sort of workflow
>> (of putting values into a set of required named/typed variables).
>>
>> Of course, it's a constructor,
>
> Actually, a constructor _body_.

Yep.
And it is distinguished from a tuple-based notation in its reference to (live) named/typed values on exit.

We *could* have used tuples there, by requiring that every (normal) exit from a constructor must "return multiple values" by specifying a positional argument package (a tuple) corresponding to all required (final) field settings. We *could* have observed that something like `this(a,b,c)`, where the argument list is exactly the required fields, is a perfectly universal way to commit all required field values to an object, at the end of its constructor.

Why didn't we? It would have been more symmetric in some way, to have the outputs of the constructor leave the block in the same format as the inputs. One reason is the entities which are already present: The fields are there, ready and waiting for assignment.

Another reason is surely that tuples would have been the wrong notation for that job. In a nutshell, positional notations only work well when there are only a few positions, and named notations, though more verbose, are more robustly expressive regardless of the number of positions; they also degrade gracefully when items may be omitted (optional initialization/binding/assignment). I think we should (continue to) design for object arities which are larger than (comfortable) parameter list arities.

> Let us also recall that there is a second well-developed part of the Java language
> that puts values into a set of required named/typed variables: method invocation.
> And its structure and behavior are rather different from that of a constructor body.
>
> (more below)
>
> ...
> All of which would seem to suggest Rémi's multi-value-return minmax example as the dual to method invocation:
>
>>> . . . a method minMax that returns both the minimum and the maximum of an array
>>>     public static (int min, int max) minMax(int[] array) {
>>>
>> Nope. Not going there. I went down this road too, but multiple-return is another one of those "tease" features that looks cool but very quickly looks like glass 80% empty.
>
> Part of the job of method invocation is to take a set of values and definitely assign them to a set of variables (the method parameters). This could be done with a block that is charged with the task of definitely assigning to those variables:
>
>     Math.atan{ x = 2.0; y = 3.0 }
>     myString.substring{ if (weird) { beginIndex = 3; endIndex = 5; } else { beginIndex = 0; endIndex = myString.length(); } }
>
> but for convenience (or for compatibility with C) we provide a different mechanism, with different syntax, that in effect uses positional tuples. A block-with-assignment mechanism is possible, but that's not Java.
>
> Therefore we will keep re-encountering the question of why positional tuples are good Java style for passing several arguments to a method but not for returning several values from a method.

That's a good argument; your code example looks plenty ugly. Surely positional notation is better for those simple use cases, of well-known APIs where programmers have committed the order of arguments firmly to memory.

But there are two reasons "that's not Java" is not the whole story here.

1. At high arities, positional notations falter, and people ask for keyword-based argument notations, because it's hard to commit to memory the order of arguments for every API. Java might answer those demands at some point. What we are discussing here could do the job.

2. Java already has a "block of assignments" notation, the constructor body. Using that notation elsewhere, rewarding programmers for learning that notation by giving them more ways to use it, is a legitimate tactic. (Yeah, maybe putting it in an external block, outside its class, is "Not Java"; but lambdas were similarly "Not Java" at one point; now they are.)

The imperative constructor body, with its named assignments, can be more expressive and compact than a tuple expression.
It can be read piecewise, and the names help the reading (and writing) process. Conditional control flow can visually reify case analysis for setting up the field values to be output from the constructor body, without introducing extra temps.

All this is even more true when we connect up record parameters to record fields, and allow elision of assignments of the form `this.x = x`. That amounts to an optionality feature where the (positional) argument list of a record provides defaults and then the compact constructor body provides a named argument set (not an ordered list) of additionally processed values. Tuples are not the right notation here; it would be less clear code if changing one record component (say, doing a range clip) required the coder to specify the adjusted record components as a new argument list. Tuple notations work OK for two or three items but don't scale nearly as well as name-based notations when you have a larger collection of columns to wrangle.

You could say, well, tuples are better if you are going to specify all the names in some well-determined order -- as is the case with argument lists I suppose -- because you can drop the noise of the names (they don't add anything). Yes, in that case tuples are better. But even for argument lists there is a place where you really want by-name arguments, because remembering the order of names is just too hard. That's what I mean by positional notation not scaling well to high arities.

When we are talking about objects, I think we need to design for field sets that are more numerous than comfortable argument list arities. The constructor body notation is therefore a better precedent to build on, for deconstructors and reconstructors, and anything else that has a transaction on an object-sized scope (bigger than an arg-list sized scope).

ADTs like Box and Rational and Point3D don't support my case very well, because they amount (at most) to pairs or triples. But if you get anywhere close to database rows (and I do think we want to scale out that way), then tuples won't take us where we want to go, but transactional blocks on names (that is, constructor bodies suitably generalized) will take us places, and will make use of mindshare already present in Java programmers.

Back to the point about "the fields are already there": While this may be why constructor bodies are the way they are, I think we could reconsider the source of the names that are present in what I call a "transactional block" (with named values falling out the bottom, and perhaps also falling in the top), starting with deconstructors. These names could be specified by an argument list for an ad hoc API point, not the (final non-static) field set of a class.

So an arrow-reversed constructor body is not just a fine way to unpack the pre-existing fields of a class (that wants to cooperate with pattern matching). It is a direction in which Java can, maybe, move to add some benefits of keyword-based calling sequences, without importing something completely new.

-- John
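
The range-clip case mentioned above can be sketched with a record's compact canonical constructor, assuming a JDK where records are a final feature (Java 16 or later). The record `Reading` and its components are hypothetical names used only for illustration. The body reassigns one component by name, and the remaining `this.x = x` style assignments are supplied implicitly.

```java
// Hypothetical record; the names are illustrative only.
record Reading(String sensor, int percent, long timestamp) {
    // Compact canonical constructor: the parameter list is implicit, and
    // the fields are assigned from the parameters after this body runs.
    Reading {
        // Adjust one component by name; the other two pass through untouched.
        percent = Math.max(0, Math.min(100, percent));
    }
}
```

With a positional notation, the same clip would have to restate all three components in order just to change one of them.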

From forax at univ-mlv.fr  Thu Aug 13 23:11:15 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 14 Aug 2020 01:11:15 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr>
Message-ID: <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Guy Steele"
> Cc: "Remi Forax", "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, 13 August 2020 23:37:47
> Subject: Re: A peek at the roadmap for pattern matching and more

> On Aug 13, 2020, at 12:39 PM, Guy Steele <guy.steele at oracle.com> wrote:

>> Whereas I can more easily understand that the job of

>>     public deconstructor Point(int x, int y) {
>>         x = this.x;
>>         y = this.y;
>>     }

>> is to take values out of the object "this" and put them into separate
>> _variables_, not a new object. (Granted, these variables have a somewhat new
>> and mysterious existence and character.)

> And if this mysterious character were something completely unrelated
> to any other part of the Java language, I'd be more inclined to admit
> that maybe the missing primitive is some sort of tuple. It might have
> to be given a status like that of Java arrays to avoid the infinite regress
> problem you point out.

> BUT, when I stare at a block of code that is setting some named/typed
> variables, where those variables must be DA at the end of the block,
> and then they are to be available in various other scopes (not the
> current scope, but other API points), THEN I say, "where have I
> seen that pattern before?" There is ALREADY a well-developed
> part of the Java language which covers this sort of workflow
> (of putting values into a set of required named/typed variables).

> Of course, it's a constructor, especially when the object has final
> fields that are checked by the DA/DU rules. Now, maybe those rules
> aren't everybody's favorite; they are complicated to learn. But
> Java programmers do learn them. How about if we give them
> additional rewards for having learned them? Instead of asking
> them to learn yet another sub-language (tuples-for-deconstructors)
> that is completely different?

> (Yes, I'm partial to DA/DU because those are my rules, for better
> or worse. Maybe Remi's going to say something about a sunk cost
> fallacy. But I think the rules are useful and, with an IDE's help,
> intuitive. And they can do more for us now. Let's double down
> on them.)

That's true, most Java devs already know how to write constructors using DA/DU rules, but we don't need those complex rules in a deconstructor. DA/DU rules exist because we want to be able to see the instance not yet fully initialized in the constructor, so users can write things like a circular list. There is no such need for a deconstructor, the result is always a tree, not a graph.

A deconstructor is just a method with several return values, which is the dual of the constructor that takes several parameter values.

For me it's like having a complex lock on the front door and wanting to have the same mechanism on the opposite side of the front door (to go out) because you already know how to unlock the front door.

[...]

Rémi
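
One way to picture the "method with several return values" reading in today's Java is a method that returns a small record as the carrier. This is only a sketch under stated assumptions: the `deconstructor` declaration form quoted in this thread is a proposal, not Java syntax, and the names `Point`, `Parts`, and `deconstruct` are hypothetical. Records require Java 16 or later.

```java
// Hypothetical plain class with private state.
final class Point {
    private final int x;
    private final int y;

    // Hypothetical nominal carrier standing in for "several return values".
    record Parts(int x, int y) {}

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // The dual of the constructor: takes the object, yields its components.
    Parts deconstruct() {
        return new Parts(x, y);
    }
}
```

No DA/DU analysis is involved here: the result is built in a single positional expression, which is the contrast being drawn with the block-of-assignments form.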

From john.r.rose at oracle.com  Thu Aug 13 23:24:13 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 16:24:13 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr>
Message-ID: <670D9E1B-206E-45E9-B912-D16C24F6CD33@oracle.com>

On Aug 13, 2020, at 4:11 PM, forax at univ-mlv.fr wrote:
>
> For me it's like having a complex lock on the front door and wanting to have the same mechanism on the opposite side of the front door (to go out) because you already know how to unlock the front door.

Ow! Tough crowd.

As you may have noted from my previous note, I'm also concerned with managing, not single or double or triple doors, but cases where there are too many doors to go through all at once. This is where languages add optional and named arguments. And once having done so, I do admit that we could use such a new thing profitably to manage wide multi-component data flows both in and out of methods, if the symmetry argument holds. And we could leave constructor blocks where they always have been, in a corner.

My proposals double down on the *asymmetric* way Java delivers multiple values out of blocks, compared to how they are sent into blocks by position-argument-to-parameter binding.

Brian's point about symmetry is that it can be a siren song: You put tuples in one place for symmetry (with argument lists) and suddenly you risk having a new kind of value, neither primitive nor class nor array. (Arrays are the old tuple; you really want to do that again?) Tuples incur technical debt which makes the symmetry proposal expensive, which is why we are looking at other options as well.

-- John

From alex.buckley at oracle.com  Thu Aug 13 23:35:10 2020
From: alex.buckley at oracle.com (Alex Buckley)
Date: Thu, 13 Aug 2020 16:35:10 -0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com>
Message-ID: <8e3b4d87-87f7-fbfa-3617-90ce156f6262@oracle.com>

On 7/23/2020 11:52 AM, Brian Goetz wrote:
> On 7/23/2020 2:38 PM, Remi Forax wrote:
>> On guards, they do not work well with exhaustiveness. Still not sure
>> they are useful.
>
> It works fine, it's just more work for the user to get right.
>
> We induce a domination ordering on patterns. If `T <: U`, then `T t` <
> `U u` (`T t` is more specific than `U u`.) Similarly, for all guard
> conditions g, `P & g` < `P`. What this says is that if you want
> exhaustiveness, you need an unguarded pattern somewhere, either:
>
>     case A:
>     case B & g:
>     case B:              // catches B & !g
>     case C:
>
> or
>
>     case A:
>     case B & g:
>     case C:
>     case Object:         // catches B & !g
>
> I understand your diffidence about guards, but I'm not sure we can do
> nothing. The main reason that some sort of guards feel like a forced
> move (could be an imperative guard, like `continue`, but I don't think
> anyone would be happy with that) is that the fall-off-the-cliff behavior
> is so bad. If you have a 26-arm switch, and you want the equivalent of
> the second of the above cases -- B-but-not-g gets shunted into the
> bottom clause -- you may very well have to refactor away from switch, or
> at least mangle your switch badly, which would be pretty bad.

Is the following what you mean by "mangle your switch badly" ?

switch (o) {
    case A: ...
    case B: do some B-ish stuff ... also, if (g) {...}
    case C: ...
    ...
    case Z: ...
    case Object: if (o instanceof B && !g) { do the B-ish non-g thing }
}

Is a guard (a) part of the `case` construct, or (b) part of the pattern operand for a `case` construct?
The original mail introduced "guard expression" as "a boolean expression that conditions whether the case matches", which sounds like (a). However, the purpose of a `case` construct is to enumerate one or more possible values of the selector expression, and if a `case` construct has a post-condition `& g()` then it's not just enumerating, and it isn't a `case` construct anymore. I mean, we don't want to see guards in the `case` constructs of legacy switches, right? (`switch (i) { case 100 & g():`) So, is the answer (b) ?

Alex

From forax at univ-mlv.fr  Thu Aug 13 23:53:57 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 14 Aug 2020 01:53:57 +0200 (CEST)
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <670D9E1B-206E-45E9-B912-D16C24F6CD33@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr> <670D9E1B-206E-45E9-B912-D16C24F6CD33@oracle.com>
Message-ID: <1637082231.119921.1597362837307.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "John Rose"
> To: "Remi Forax"
> Cc: "Guy Steele", "Brian Goetz", "amber-spec-experts"
> Sent: Friday, 14 August 2020 01:24:13
> Subject: Re: A peek at the roadmap for pattern matching and more

> On Aug 13, 2020, at 4:11 PM, forax at univ-mlv.fr wrote:
>>
>> For me it's like having a complex lock on the front door and wanting to have the
>> same mechanism on the opposite side of the front door (to go out) because you
>> already know how to unlock the front door.
>
> Ow! Tough crowd.
>
> As you may have noted from my previous note, I'm
> also concerned with managing, not single or double
> or triple doors, but cases where there are too many
> doors to go through all at once. This is where languages
> add optional and named arguments. And once having
> done so, I do admit that we could use such a new thing
> profitably to manage wide multi-component data flows
> both in and out of methods, if the symmetry argument
> holds. And we could leave constructor blocks where
> they always have been, in a corner.
>
> My proposals double down on the *asymmetric*
> way Java delivers multiple values out of blocks,
> compared to how they are sent into blocks by
> position-argument-to-parameter binding.
>
> Brian's point about symmetry is that it can be
> a siren song: You put tuples in one place for
> symmetry (with argument lists) and suddenly
> you risk having a new kind of value, neither
> primitive nor class nor array. (Arrays are the
> old tuple; you really want to do that again?)
> Tuples incur technical debt which makes
> the symmetry proposal expensive, which is
> why we are looking at other options as well.

I'm a little disappointed by the current discussions, when Brian announced that a record will be immutable, I was flabbergasted how brilliant the idea was because not only you can use a record and do directly pattern destructuring on it but also you can use it as an anonymous carrier of values to transfer the values from a plain old class to a representation you can do destructuring on it.

It seems that that plan has been lost at some point.

An anonymous record is not a tuple, it's a plain Java class from the VM POV, so not a new kind of value (apart from the fact that it will be an inline most of the time) so i don't think it's a risky move.

>
> -- John

Rémi

From forax at univ-mlv.fr  Fri Aug 14 00:01:11 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 14 Aug 2020 02:01:11 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <5FCCADEE-0F72-4EFB-8A63-B6AD46673086@oracle.com>
References: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com> <1036714364.51116.1597321182916.JavaMail.zimbra@u-pem.fr> <5FCCADEE-0F72-4EFB-8A63-B6AD46673086@oracle.com>
Message-ID: <663728521.120548.1597363271906.JavaMail.zimbra@u-pem.fr>

> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "John Rose", "amber-spec-experts"
> Sent: Thursday, 13 August 2020 19:22:46
> Subject: Re: Next up for patterns: type patterns in switch

>> On Aug 13, 2020, at 8:19 AM, forax at univ-mlv.fr wrote:

>> . . .
>> I wonder if we find it natural only because we are used to using the keyword
>> "default" inside a switch, . . .

> I think that may be so; but given that it is so, I am happy to exploit that
> fact!

>> I think i prefer using "default" (or any other keyword) only where it makes
>> sense and doesn't allow "default" to be propagated.

>> so
>>     default Pair p: ...
>> is ok but
>>     default Pair(Box(Frog f), Bag(Object o)): ...
>> should be written
>>     case Pair(Box(Frog f), Bag(default Object o)): ...

> I think you intended that last line to read

>     case Pair(Box(default Frog f), Bag(default Object o)): ...

yes, thank you

> and if so, I agree that this may be a better way to write it in the context I
> originally gave:

>     switch (x) {
>         case Pair(Box(Tadpole t), Bag(String s)): ...
>         case Pair(Box(Tadpole t), Bag(default Object o)): ...
>         case Pair(Box(default Frog f), Bag(String s)): ...
>         case Pair(Box(default Frog f), Bag(default Object o)): ... // I originally had "default Pair(Box(Frog f), Bag(Object o)): ..." here
>     }

> But either way works, because of the subtle fact that if P: Pattern T, then Q is
> total over type T if and only if P(Q) is total over type P, so one can choose,
> on purely stylistic grounds, whether to use the "default" tag at the root of a
> pattern subtree that is total, or at all the relevant leaves, or for that
> matter at a relevant set of interior subtrees.

yes, i'm advocating putting it in the subtree because i find the resulting code more readable because you can see how the subtrees of each case are related to each other, the result seems "balanced" visually.

Rémi

From john.r.rose at oracle.com  Fri Aug 14 00:17:42 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 17:17:42 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1637082231.119921.1597362837307.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr> <670D9E1B-206E-45E9-B912-D16C24F6CD33@oracle.com> <1637082231.119921.1597362837307.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Aug 13, 2020, at 4:53 PM, forax at univ-mlv.fr wrote:
>
> use it as an anonymous carrier of values to transfer the values from a plain old class to a representation you can do destructuring on it

Yes, something like that could play a role in reifying a "bundle of names coming out of a block", just as an object of a functional interface reifies a lambda expression with an ad hoc block ("with a bundle of incoming names and an outgoing value").

Put another way: Anonymous named records have the same information content as the result of a set (unordered) of assignments to typed names. Hmm... Does Java have a syntax for that? Could be sugar for constructing values of that type.

In short: I like those anonymous records, and have already proposed some good sugar for creating them.

And if you have an API point with named parameters that takes them positionally (the way the JVM likes them), there are various fruitful ways to spread out an anonymous record into the correct argument positions (starting with a new anti-varargs rule, and an extension to MH.invokeWithArguments). And, given that, you *could* (but *shouldn't*) write that kind of ugly code that Guy was calling "not Java".

From brian.goetz at oracle.com  Fri Aug 14 00:18:21 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 13 Aug 2020 20:18:21 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <1637082231.119921.1597362837307.JavaMail.zimbra@u-pem.fr>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr> <670D9E1B-206E-45E9-B912-D16C24F6CD33@oracle.com> <1637082231.119921.1597362837307.JavaMail.zimbra@u-pem.fr>
Message-ID: <22e79552-3fa8-45d9-58b4-aff9b1f94b51@oracle.com>

First, recall that I asked already that we table this digression; I provided the peek because several folks asked me for some more context, not to distract us with a shiny bikeshed.

Second, I get what you are saying -- I really do, because I've been exactly there on my long and winding journey -- and even thought briefly I had reached the end there. And I totally get the idea that it seems like we're adding more concepts than we need to. But I'm looking at this from the point of view of _what mental model do we want to encourage_, not how can we make the fewest changes to the language.
I realize the "just use a fake tuple" (or anonymous record) appeals to a sense of economy, but I do not believe it equilibrates in the right place. We have invested a significant amount in integrating pattern matching into the language; I don't want to nail something on the side.

> I'm a little disappointed by the current discussions, when Brian announced that a record will be immutable, I was flabbergasted how brilliant the idea was because not only you can use a record and do directly pattern destructuring on it but also you can use it as an anonymous carrier of values to transfer the values from a plain old class to a representation you can do destructuring on it.

And we're doing all those things. Just at the translation level, not the source level.

> It seems that that plan has been lost at some point.

No, it was not lost, it got extensive scrutiny, I spent a significant amount of time considering it and working out the details before I soured on the idea. Not all initially attractive ideas turn out to be right.

But, back to my top point: this is not the issue that is in front of us now.

From john.r.rose at oracle.com  Fri Aug 14 00:22:46 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 17:22:46 -0700
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <992187173.9453.1597274889949.JavaMail.zimbra@u-pem.fr> <1632800374.54555.1597323970020.JavaMail.zimbra@u-pem.fr> <1174063589.118596.1597360275349.JavaMail.zimbra@u-pem.fr> <670D9E1B-206E-45E9-B912-D16C24F6CD33@oracle.com> <1637082231.119921.1597362837307.JavaMail.zimbra@u-pem.fr>
Message-ID: <72274E99-9D16-4D89-94F4-E5675A45E943@oracle.com>

P.S. I'm going to take Brian's lead now and talk about the short-term matters immediately in front of us. It's been fun... We'll get back to it later.
> On Aug 13, 2020, at 5:17 PM, John Rose wrote:
>
> On Aug 13, 2020, at 4:53 PM, forax at univ-mlv.fr wrote:
>>
>> use it as an anonymous carrier of values to transfer the values from a plain old class to a representation you can do destructuring on it
>
> Yes, something like that could play a role in reifying
> a "bundle of names coming out of a block", just as an
> object of functional interface reifies a lambda expression
> with an ad hoc block ("with a bundle of incoming names
> and an outgoing value").
>
> Put another way: Anonymous named records have the
> same information content as the result of a set (unordered)
> of assignments to typed names. Hmm... Does Java have
> a syntax for that? Could be sugar for constructing values
> of that type.
>
> In short: I like those anonymous records, and have
> already proposed some good sugar for creating them.
>
> And if you have an API point with named parameters
> that takes them positionally (the way the JVM likes
> them), there are various fruitful ways to spread out
> an anonymous record into the correct argument positions
> (starting with a new anti-varargs rule, and an extension
> to MH.invokeWithArguments). And, given that, you
> *could* (but *shouldn't*) write that kind of ugly code
> that Guy was calling "not Java".
From forax at univ-mlv.fr Fri Aug 14 00:37:56 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 14 Aug 2020 02:37:56 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <48587cd2-032d-50b4-a7a1-ed3e214868c6@oracle.com> <859639325.325160.1597239930054.JavaMail.zimbra@u-pem.fr> <648491328.346938.1597262261833.JavaMail.zimbra@u-pem.fr> <1BD5CAF9-C4A7-4AFA-9510-BF24FCE0CEBC@oracle.com>
Message-ID: <1382479871.121474.1597365476779.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Guy Steele"
> Cc: "Remi Forax", "John Rose",
> "amber-spec-experts"
> Sent: Thursday, August 13, 2020 16:24:11
> Subject: Re: Next up for patterns: type patterns in switch

>>> I agree destructuring is just as important as conditionality and those two
>>> things should be orthogonal.
>>> But i still think having a keyword to signal that a pattern (not a case) is
>>> total is better than letting people guess.

> I'm glad this discussion has moved to the right thing -- whether we should
> provide extra help in type-checking totality. (Note, though, there were several
> dozen aspects to my patterns-in-switch proposal, and we've spent 99% of our
> time discussing a single, pretty cornery one. This is the hazard of discussing
> these things on mailing lists, where all the discussion is replies to the first
> thing that someone complained about, especially when the first complaint is
> loud. So EVERYONE, PLEASE, take a detailed review through the doc. It can't be
> 100% perfect aside from this issue.)

> I still think, though, that this may be mostly fear of the unknown. Having spent
> the last year staring at pattern switches, it is really pretty obvious which
> are the total ones. I think that will be true for most users, once they use the
> feature a little. (Remember how alien people said method references looked, or
> how worried people were about the fact that static mrefs and unbound instance
> mrefs had the same syntax? Wasn't a problem.)
> So all of this discussion rests
> on the assumption that (a) it is really hard to tell when a pattern is total or
> not, and (b) the stakes for getting that wrong are really high. I worry that
> both of these concerns are significantly overstated, and that the incremental
> user model complexity may be worse than the disease.

I'm still not sure that using a keyword to mark that a pattern is total is the best solution; I still like the rule that a total pattern should use var, it's less intrusive, but perhaps I've written too much OCaml, so I've no problem with all types being inferred in a pattern matching.

If you are doing pattern matching on a hierarchy you have not built yourself, it's quite easy to think that a pattern is total while it's not. As an anecdote, in the example with TadPole, I did not know what a tadpole was (why on earth you have two words for a pollywog in English ?) so the relation between a Frog and a TadPole was not that clear for me.

And given that the semantics is different when a pattern is total or not, I think a syntactic difference does more good than harm.

>> Yes, and here is the example that convinced me that one needs to be able to mark
>> patterns as total, not just cases:

> The underlying observation here is that totality is a property of an arbitrary
> sub-tree of a pattern, which could be the whole thing, a leaf, or some
> intermediate sub-tree.

> Guy's exploration into overloading `default` does shine a light on one of the
> uglier parts of my proposal, which is that the existing `default` case is
> mostly left to rot. I'm going to take this as a cue that, whether we do
> something to highlight pattern totality or not, that we should try to integrate
> default cases better into the design.
> So, here are the properties of the existing default case, in the current state
> of the proposal:

> - it matches all non-null values with no bindings
> - it can only appear at the end (unlike the current language, where it can
>   appear anywhere)
> - You cannot fall into it from a pattern case (unlike the current language,
>   where you can fall into it)
> - People will either use default or a total pattern, but rarely both (since
>   their only difference is null)

> Just as we've leaned towards rehabilitating switch, maybe we can try to
> rehabilitate default. And the role it can play is in signaling totality of the
> switch.

yes

> There are two "cases" of exhaustive switches: nullable ones and non-nullable
> ones.

> The nullable ones are those that end in a total (catch-all) pattern. The
> non-nullable ones are the analogues of the current exhaustive switch on enums;
> those where all the cases are "parts" of the whole:

> switch (day) {
>     case MONDAY..FRIDAY: work();
>     case SATURDAY: play();
>     case SUNDAY: worship();
>     // And on the null day, there was a NullPointerException
> }

> This is a total switch, but not a nullable one. On the other hand:

> switch (something) {
>     case X: ...
>     case Object o:
> }

> This is also a total switch, but a nullable one.

> So here's a proposed rehabilitation of default, inspired by Guy's exploration:

> - A switch has a sequence of cases, with zero or one default case
> - The default case must be last (except for legacy switches)
> - A switch with a default case must be total (possibly modulo null)
> - Default can be used with no pattern, which means "everything else but null"
> - Default can be used with a pattern, in which case it has exactly the same
>   semantics as the same pattern with "case"
> - Nullability is still strictly a function of whether there are any nullable
>   cases, which can only be "case null" (first) and total pattern (last)

> So we can say

> switch (something) {
>     case X: ...
>     case Object o:
> }

> or

> switch (something) {
>     case X: ...
>     default Object o:
> }

> which mean the same thing, but the latter engages the additional type checking
> that the switch is total.

> This is not inconsistent with Guy's sketch, it just is the switch-specific part.

I think you know my position on this: I would prefer the compiler to emit an error on the former code and ask to add default, converting the former code to the latter. The current status quo is a way to be sure that people will talk endlessly about that, with the "pro" default advocating for more safety while the others will answer that seeing if a pattern is total or not is obvious and it's fewer keystrokes.

Rémi

From john.r.rose at oracle.com Fri Aug 14 01:34:01 2020
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 13 Aug 2020 18:34:01 -0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References:
Message-ID: <54CF901C-1393-4E0D-9458-922153D0FD00@oracle.com>

OK, I'm going to make some *concise* responses to what appear to be the choice points in your Email. I'm going to leave aside the JEP document[1] for now.

[1]: https://cr.openjdk.java.net/~briangoetz/amber/pattern-match.html

On Jun 24, 2020, at 7:44 AM, Brian Goetz wrote:
> ...
> ## Patterns in switch

> **Typing.** Currently, the operand of a `switch` may only be one of the
> integral primitive types, the box type of an integral primitive, `String`, or an
> `enum` type. (Further, if the `switch` operand is an `enum` type, the `case`
> labels must be _unqualified_ enum constant names.) Clearly we can relax this
> restriction to allow other types, and constrain the case labels to only be
> patterns that are applicable to that type, but it may leave a seam of "legacy"
> vs "pattern" switch, especially if we do not adopt bare constant literals as
> the denotation of constant patterns.
> (We have confronted this issue before with
> expression switch, and concluded that it was better to rehabilitate the `switch`
> we have rather than create a new construct, and we will make the same choice
> here, but the cost of this is often a visible seam.)

Yes, let's do this. The in-place enhancement is worth the seam.

So (modulo syntax ambiguities) if `x instanceof P` is valid, then `switch (x) { ... case P ... }` is valid, right?

(Is there a more formal draft spec of this syntax somewhere, yet, or is that TBW after this discussion?)

> **Parsing.** The grammar currently specifies that the operand of a `case` label
> is a `CaseConstant`, which casts a wide syntactic net, later narrowed with
> post-checks after attribution. This means that, since parsing is done before we
> know the type of the operand, we must be watchful for ambiguities between
> patterns and expressions (and possibly refine the production for `case` labels.)

(OK, so something like CaseConstant | Pattern is needed, with rules for dealing with ambiguity.)

> **Nullity.** The `switch` construct is currently hostile to `null`, but some
> patterns do match `null`, and it may be desirable if nulls can be handled
> within a suitably crafted `switch`.

Is it desirable? Yes; let those nulls flow. We've discussed that one to exhaustion and beyond, and it seems the next step is a formal spec.

>
> **Exhaustiveness.** For switches over the permitted subtypes of sealed types,
> we will want to be able to do exhaustiveness analysis -- including for nested
> patterns (i.e., if `Shape` is `Circle` or `Rect`, then `Box(Circle c)` and
> `Box(Rect r)` are exhaustive on `Box`.)

Ref: https://cr.openjdk.java.net/~briangoetz/amber/pattern-match.html#exhaustiveness

A solution for this should fix enums, I think. The proposal on this list of saying "default P:" instead of "default:" where "P" had better cover the rest of the switch values, looks like a winner to me. Surely there are other ways to do it.
Failing that, just requiring the user to write "default:", for now, as with enums, is adequate. So, it's a nice-to-have, but could be left for later.

> **Fallthrough.** Fallthrough is everyone's least favorite feature of `switch`,
> but it exists for a reason. (The mistake was making fallthrough the default
> behavior, but that ship has sailed.) In the absence of an OR pattern
> combinator, one might find fallthrough in switch useful in conjunction with
> patterns:
>
> ```
> case Box(int x):
> case Bag(int x):
>     // use x
> ```
>
> However, it is likely that we will, at least initially, disallow falling out
> of, or into, a case label with binding variables.

This is good. We can revisit. OR-patterns with bindings are very hard, perhaps too hard.

Side thought: Range patterns can be done with libraries. So maybe use cases for other OR-patterns can be done similarly, with either libraries or ad hoc destructor expressions. (See my discussion elsewhere of external transactional blocks.) Exploring those ideas should not hinder or delay the release of basic pattern-in-switch.

> #### Translation
>
> Switches on primitives and their wrapper types are translated using the
> `tableswitch` or `lookupswitch` bytecodes; switches on strings and enums are
> lowered in the compiler to switches involving hash codes (for strings) or
> ordinals (for enums.)
>
> For switches on patterns, we would need a new strategy, one likely built on
> `invokedynamic`, where we lower the cases to a densely numbered `int` switch,
> and then invoke a classifier function with the operand which tells us the first
> case number it matches.

Yep. I note that the classifier function could be given a "continue after case #N" entry point which would allow any form of guard to be issued as in-line code, if you wrap a suitable loop around the desugared switch. Also, if we ever add fall-through back in, there's probably a natural way to map it to fall-through in the desugared switch.
Basically the switch structure stays but the head and cases are lowered to compactly assigned case numbers.

> ...
>
> #### Guards
>
> Worse, the semantics of
> `switch` mean that once a `case` label is selected, there is no way to say
> "oops, forget it, keep trying from the next label".

As with null hostility, we can blame this on the very limited portfolio of "switch" to date. Since case labels are *always* singleton values, they *never* overlap and so there's never a significant need to retry a switch operation.

The exception to this observation is that "default" overlaps *all* labels. And this leads me to observe that C and Java switch allows you to pick *any one* case, and position it above the default, so that *just that one* case has the ability to fall through into the default logic. That is so very annoying when you have *two or more* cases that need to continue on to the default logic. It's a sharp edge that stems from using fall-through as a weaker substitute for (a) or-patterns and (b) continue-to-default.

Now that case labels can overlap almost arbitrarily (they cannot shadow later ones), we need to question the axiom of "switch logic" that you can't restart the selection process once you have picked a case. If we allow a restarting branch, we won't need a guard syntax per se.

> It is common in languages with pattern matching to support some form of "guard"
> expression, which is a boolean expression that conditions whether the case
> matches, such as:
>
> ```
> case Point(var x, var y)
>     __where x == y: ...
> ```

Imperative guard:

    case Point(var x, var y):
        if (x != y) continue switch;

>
> Bindings from the pattern would have to be available in guard expressions.

Comes for free with imperative guards.

> ...the more complex guards are, the harder it is
> to tell where the case label ends and the "body" begins. (And worse if we allow
> switch expressions inside guards.)

An imperative guard is just control flow. That's pretty hard to miss.
(It's also uglier than a declarative guard; maybe we can add them later as sugar.)

> An alternate to guards is to allow an imperative `continue` statement in
> `switch`, which would mean "keep trying to match from the next label." Given
> the existing semantics of `continue`, this is a natural extension, but since
> `continue` does not currently have meaning for switch, some work would have to
> be done to disambiguate continue statements in switches enclosed in loops.

Indeed, that's really the only bikeshed that needs painting.

At an absolute minimum allow "continue L", where "L:" precedes a (statement) switch, to mean "restart matching after current case". As a bonus this fixes the sharp edge I mentioned above, of wanting to fall through to default from *two or more* cases in a classic switch.

I also suggest "continue switch". Maybe while we are at it "continue for", etc. (I.e., every breakable or continuable construct accepts its own head keyword in place of a label.) This covers expression switches, which do not admit labels.

> The
> imperative version is strictly more expressive than most reasonable forms of the
> declarative version,

Indeed. It also translates really easily.

> but users are likely to prefer the declarative version.

Then offer them sugar, but later, for dessert. Not yet.

```
case P __OnlyIf E:
//desugars to:
case P: {if (!(E)) continue switch;}
```

> ## Nulls
>
> Almost no language design exercise is complete without some degree of wrestling
> with `null`. As we define more complex patterns than simple type patterns, and
> extend constructs such as `switch` (which have existing opinions about nullity)
> to support patterns, we need to have a clear understanding of which patterns
> are nullable, and separate the nullity behaviors of patterns from the nullity
> behaviors of those constructs which use patterns.

Let 'em flow through the patterns. Arrange to catch 'em with `case null` (only at the top) and a total case (only at the bottom).
Keep the existing NPE for `default:` if there wasn't a `case null` (and there must not have been a previous total case).

I also like `default P:` as meaning "P had better be total." Which in turn means "yeah, nulls flow here". But that's a separable feature, which can be carved off for later.

> ## Nullity and patterns
>
> This topic has a number of easily-tangled concerns:
>
> - **Construct nullability.** Constructs to which we want to add pattern
>   awareness (`instanceof`, `switch`) already have their own opinion about
>   nulls. Currently, `instanceof` always says false when presented with a
>   `null`, and `switch` always NPEs. We may, or may not, wish to refine these
>   rules in some cases.

For now keep instanceof null-free by disallowing expressions which, in a switch, would have allowed nulls. That means total patterns (except those mandated by legacy) and `null`. Revisit later.

Would that meet the various requirements we've discussed? (It's hard to keep track.)

> - **Pattern nullability.** Some patterns clearly would never match `null`
>   (such as deconstruction patterns), whereas others (an "any" pattern, and
>   surely the `null` constant pattern) might make sense to match null.

null and T where T <: X (X the target type) can match null.
Nothing else can at this point.

(Future discussion: I think there's an argument still to be had about *static* matchers matching null. My take on that is currently "let 'em flow.")

> - **Refactoring friendliness.** There are a number of cases that we would like
>   to freely refactor back and forth, such as certain chains of `if ... else if`
>   with switches.

Yes, refactoring is a deep test of good design. (Sorry about case Object vs. instanceof Object, though. I think that's a seam we live with.)

> - **Nesting vs top-level.** The "obvious" thing to do at the top level of a
>   construct is not always the "obvious" thing to do in a nested construct.
I think we are in uneasy agreement that a pattern does not intrinsically issue NPEs, but either lets nulls through or rejects them, whether or not the programmer approves of or is conscious of nulls (we can't guess).

Meanwhile, the construct using a pattern has a null policy:

1. instanceof : return false
2. switch : issue NPE if there is no statically nullable pattern (no chance of matching)
3. nested patterns : let 'em flow (FUTURE WORK)

> - **Totality vs partiality.** When a pattern is partial on the operand type
>   (e.g., `case String` when the operand of switch is `Object`), it is almost
>   never the case we want to match null (except in the case of the `null`
>   constant pattern), whereas when a pattern is total on the operand type (e.g.,
>   `case Object` in the same example), it is more justifiable to match null.

Again, uneasy agreement. Let's do it this way.

Optionally, consider a notation which lets the programmer say, "this last pattern here should be total, now and forever". Reward the extra information by allowing "default:" to be dropped. Support this for enum switches also, maybe even primitive switches. Specify an exception to be thrown if execution ever runs past the "default" point.

> - **Inference.** It would be nice if a `var` pattern were simply inference for
>   a type pattern, rather than some possibly-non-denotable union.

Yes please. "var" should always be refactorable with a denotable type pattern.

> #### Construct and pattern nullability
>
> Currently, `instanceof` always says `false` on `null`, and `switch` always
> throws on `null`. Whatever null opinions a construct has, these are applied
> before we even test any patterns.

I'm OK with this, although I prefer to think about it as something that happens after all the patterns fail to match the null. In fact, I'd hope that my preferred frame of mind is a legitimate alternative account; i.e., you can't tell the difference, except maybe from a line number in the debugger.
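
The existing construct-level null policies referred to above (`instanceof` says false on `null`, a classic `switch` throws on `null`) can be observed with today's language. The following is only a minimal sketch; the class and method names (`NullPolicyDemo`, `isString`, `describe`) are made up for illustration and are not part of any proposal in this thread.

```java
class NullPolicyDemo {
    // instanceof never matches null, whatever the static type of the operand.
    static boolean isString(Object o) {
        return o instanceof String;
    }

    // A classic switch on a String rejects a null selector with a
    // NullPointerException before any case label is considered.
    static String describe(String s) {
        switch (s) {
            case "a":
                return "letter a";
            default:
                return "something else";
        }
    }

    public static void main(String[] args) {
        System.out.println(isString(null)); // prints "false"
        System.out.println(isString("x"));  // prints "true"
        try {
            describe(null);
        } catch (NullPointerException e) {
            System.out.println("switch threw NPE on null");
        }
    }
}
```

Note that even the `default` label in `describe` is never reached for `null`, which is the behavior the proposal would keep unless a `case null` or a total pattern is present.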
> A similar sharp corner is the decomposition of a nested pattern `P(Q)` into
> `P(alpha) & alpha instanceof Q`; while this is intended to be a universally
> valid transformation, if P's 1st component might be null and Q is total, this
> transformation would not be valid because of the existing (mild) null-hostility
> of `instanceof`. Again, we may be able to address this by adjusting the rules
> surrounding `instanceof` slightly.

I hope so. We don't need to decide this yet, until we add in deconstructors?

There was some discussion of `default P(Q,R)` vs. `case P(default Q,default R)`. I think this needs more thought.

> ...
> Given that the `case null` appears so close to the `switch`, it does not seem
> confusing that this switch would match `null`; the existence of `case null` at
> the top of the switch makes it pretty clear that this is intended behavior. (We
> could further restrict the null pattern to being the first pattern in a switch,
> to make this clearer.)

Granted.

> Now, let's look at the other end of the switch -- the last case. What if the
> last pattern is a total pattern? (Note that if any `case` has a total pattern,
> it _must_ be the last one, otherwise the cases after that would be dead, which
> would be an error.) Is it also reasonable for that to match null? After all,
> we're saying "everything":

Granted; the last case is inevitably special. (Though maybe we need that extra marker to unlock the no-default-needed reward.)

>
> So far, we're suggesting:
>
> - A switch with a constant `null` case will accept nulls;
> - If present, a constant `null` case must go first;
> - A switch with a total (any) case also accepts nulls;
> - If present, a total (any) case must go last.

Yes.

>
> #### Relocating the problem
>
> It might be more helpful to view these changes as not changing the behavior of
> `switch`, but of the `default` case of `switch`.

(See above. If it's just perspective I don't much care.)

>
> ...
> The main casualty here is that the `default` case does not mean the same
> thing as `case var x` or `case Object o`. We can't deprecate `default`, but
> for pattern switches, it becomes much less useful.

(Subsequent discussion has maybe revived default as a useful syntax.)

> #### What about method (declared) patterns?
> ...

(More later, I hope.)

From joe.darcy at oracle.com Fri Aug 14 02:22:23 2020
From: joe.darcy at oracle.com (Joe Darcy)
Date: Thu, 13 Aug 2020 19:22:23 -0700
Subject: [records] Mark generated toString/equals/hashCode in class files somehow?
In-Reply-To:
References:
Message-ID: <1e3f37e8-67ec-4a63-40ee-b63f13e7997b@oracle.com>

Note that the javax.lang.model API already exposes that information:

https://docs.oracle.com/en/java/javase/14/docs/api/java.compiler/javax/lang/model/util/Elements.html#getOrigin(javax.lang.model.element.Element)

This is a case where there can be higher fidelity when a source file vs a class file is the basis of the element, as noted in the javadoc:

> Note that if this method returns EXPLICIT and the element was created
> from a class file, then the element may not, in fact, correspond to an
> explicitly declared construct in source code. This is due to
> limitations of the fidelity of the class file format in preserving
> information from source code. For example, at least some versions of
> the class file format do not preserve whether a constructor was
> explicitly declared by the programmer or was implicitly declared as
> the default constructor.

If it is not deemed worthwhile to expand the usage of the ACC_MANDATED bit position for other constructs, it would be possible to define new attributes for that information, at a higher spec and implementation cost.

Cheers,

-Joe

On 8/11/2020 9:00 AM, Alex Buckley wrote:
> If the mandated status of a class/member was to be reified in the
> class file, then you would need Core Reflection and Language Model
> APIs to expose that status, along the lines of isSynthetic.
>
> Alex
>
> On 8/10/2020 8:26 PM, Tagir Valeev wrote:
>> Thank you, Alex!
>>
>> I created an issue to track this:
>> https://bugs.openjdk.java.net/browse/JDK-8251375
>> I'm not sure about the component. I set 'javac', though it also
>> touches JLS and JVMS; hopefully, no actual changes in HotSpot are
>> necessary as HotSpot can happily ignore this flag.
>> I tried to formulate proposed changes in JLS and JVMS right in the
>> issue. For now, I omit fields from the discussion because supporting
>> fields isn't critical for my applications.
>> But it's also ok to extend this to fields.
>>
>> What else can I do to move this forward?
>>
>> With best regards,
>> Tagir Valeev.

From amaembo at gmail.com Fri Aug 14 03:00:35 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Fri, 14 Aug 2020 10:00:35 +0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References:
Message-ID:

Hello!

I haven't read all the discussions in this thread, so sorry if my points were already covered.

It looks like relocating the problem to defaults has a hole for this trivial switch:

switch(obj) {
  case null: // fallthrough
  default:
}

If we think that default case always throws for null, should this switch throw?

I agree that method patterns should never match null.

I have no strong opinion on guards but 'continue' idea looks nice to me. Sometimes it's desired to perform a non-trivial decision on whether to match or not, and it's not always convenient to express it as a single expression. In this case, 'continue' would work. Also, it looks more friendly to step-by-step debugger: when debugger position lands at 'continue', you clearly see that the match failed.

It's unclear to me how exhaustiveness on nested patterns plays with null. case Box(Circle c) and case Box(Rect r) don't cover case Box(null) which is a valid possibility for Box type.

I'm still heavily concerned about any patterns with the explicit type (`T t` on an operand of type `U`, where `U <: T`).
The exact type of the switch selector expression might not be readily visible around the switch statement, so it could be unclear from reading the switch statement whether it's null-friendly or not without IDE help. Moreover, this allows action at a distance. E.g.:

interface Super {}
interface Sub extends Super {}

Sub getValue();
...
switch(getValue()) {
  ...
  case Sub s: ...
}

Suppose I decided to generalize the return value of getValue() changing it to Super. Most of the clients would either stop compiling or continue to work as expected, so I can go through the compilation errors and fix them. However, this switch will silently change the behavior throwing on null instead of matching on the last case. This may introduce subtle errors.

Also (wearing the IDE developer hat), we have the parameter nullability inference in IDEA. It works during the indexing when resolve is not available and the results of this inference should not depend on other Java files, so it's quite limited. Currently, it works if the method starts with switch:

static void doSmth(String s) {
  switch(s) { // we can safely mark s parameter as not-null producing a warning on call-sites if null is passed
    ..
  }
}

This would not work if the switch ends with explicit type pattern:

static void doSmth(U u) {
  switch(u) {
    ...
    case T t: ...
  }
}

Now, without resolving T and U we don't know whether U <: T, thus we cannot infer u nullity anymore. In 99% of cases if T != U then it's not any pattern, thus null is not allowed in this method, but we cannot say this for sure.

So I would stick with more explicit ways to show whether the given pattern matches null. In particular, separating `var t` (allows null) and `T t` (disallows null) is one possibility. I don't think overloading the `var` keyword with a new meaning is very harmful.
Another possibility (I like it more) is to keep the proposal as it is for nested patterns but invent something for top-level patterns because only top-level patterns affect switch null-friendliness. E.g. the following could work:

- disable "any" top-level patterns (they are still "any" patterns but using them as top-level patterns is not allowed, same as in instanceof).
- `case null` is allowed and it's the only allowed top-level pattern that matches null. It must go as the last case, but it's allowed to have a default branch after that.
- you can fallthrough from `case null` to default.
- for arrow-style switches (when fallthrough is not allowed), we can invent a special syntax like `case null, default ->`
- no binding variable for any pattern. If you need it, just reuse a selector expression (or extract it to the new local prior the switch and use it).

So to determine whether the switch is null-friendly, we have to look at the last non-default case. If it's 'case null' then switch is null-friendly, otherwise it's not. This analysis can be easily done both by human and by machine, doesn't require non-local information like type resolution. It's also easier from the learning point of view. It's quite evident what explicit 'case null' does: if you see this construct for the first time you will likely guess correctly. However when seeing 'case T t:' you may mistakenly assume that this doesn't match null if you haven't read the textbook. There's no visual hint that null is matched.

This approach will make the nested pattern refactoring a little bit harder. Namely, if we want to refactor this sample:

switch (o) {
  case Box(Chocolate c):
  case Box(Frog f):
  case Box(var o):
}

we would need to write

switch (o) {
  case Box(var contents):
    switch (contents) {
      case Chocolate c:
      case Frog f:
      case null: // add 'case null' or not, depending on whether you expect that content could be null
      default: // use contents instead of 'o' here
    }
}

However, I think that 1.
Not so many people will need such a refactoring at all.
2. This is still a mechanical transformation and can be easily automated by IDEs, so nobody would need to do this manually.
3. The resulting nested switch is in fact more readable, as it clearly says that it matches null.

With best regards,
Tagir Valeev.

On Wed, Jun 24, 2020 at 9:48 PM Brian Goetz wrote:
>
> There are a lot of directions we could take next for pattern matching. The one that builds most on what we've already done, and offers significant incremental expressiveness, is extending the type patterns we already have to a new context: switch. (There is still plenty of work to do on deconstruction patterns, pattern assignment, etc, but these require more design work.)
>
> Here's an overview of where I think we are here.
>
> [JEP 305][jep305] introduced the first phase of [pattern matching][patternmatch]
> into the Java language. It was deliberately limited, focusing on only one kind
> of pattern (type test patterns) and one linguistic context (`instanceof`).
> Having introduced the concept to Java developers, we can now extend both the
> kinds of patterns and the linguistic context where patterns are used.
>
> ## Patterns in switch
>
> The obvious next context in which to introduce pattern matching is `switch`; a
> switch using patterns as `case` labels can replace `if .. else if` chains with
> a more direct way of expressing a multi-way conditional.
>
> Unfortunately, `switch` is one of the most complex, irregular constructs we have
> in Java, so we must teach it some new tricks while avoiding some existing traps.
> Such tricks and traps may include:
>
> **Typing.** Currently, the operand of a `switch` may only be one of the
> integral primitive types, the box type of an integral primitive, `String`, or an
> `enum` type. (Further, if the `switch` operand is an `enum` type, the `case`
> labels must be _unqualified_ enum constant names.)
Clearly we can relax this > restriction to allow other types, and constrain the case labels to only be > patterns that are applicable to that type, but it may leave a seam of "legacy" > vs "pattern" switch, especially if we do not adopt bare constant literals as > the denotation of constant patterns. (We have confronted this issue before with > expression switch, and concluded that it was better to rehabilitate the `switch` > we have rather than create a new construct, and we will make the same choice > here, but the cost of this is often a visible seam.) > > **Parsing.** The grammar currently specifies that the operand of a `case` label > is a `CaseConstant`, which casts a wide syntactic net, later narrowed with > post-checks after attribution. This means that, since parsing is done before we > know the type of the operand, we must be watchful for ambiguities between > patterns and expressions (and possibly refine the production for `case` labels.) > > **Nullity.** The `switch` construct is currently hostile to `null`, but some > patterns do match `null`, and it may be desirable if nulls can be handled > within a suitably crafted `switch`. > > **Exhaustiveness.** For switches over the permitted subtypes of sealed types, > we will want to be able to do exhaustiveness analysis -- including for nested > patterns (i.e., if `Shape` is `Circle` or `Rect`, then `Box(Circle c)` and > `Box(Rect r)` are exhaustive on `Box`.) > > **Fallthrough.** Fallthrough is everyone's least favorite feature of `switch`, > but it exists for a reason. (The mistake was making fallthrough the default > behavior, but that ship has sailed.) In the absence of an OR pattern > combinator, one might find fallthrough in switch useful in conjunction with > patterns: > > ``` > case Box(int x): > case Bag(int x): > // use x > ``` > > However, it is likely that we will, at least initially, disallow falling out > of, or into, a case label with binding variables. 
> > #### Translation > > Switches on primitives and their wrapper types are translated using the > `tableswitch` or `lookupswitch` bytecodes; switches on strings and enums are > lowered in the compiler to switches involving hash codes (for strings) or > ordinals (for enums.) > > For switches on patterns, we would need a new strategy, one likely built on > `invokedynamic`, where we lower the cases to a densely numbered `int` switch, > and then invoke a classifier function with the operand which tells us the first > case number it matches. So a switch like: > > ``` > switch (o) { > case P: A > case Q: B > } > ``` > > is lowered to: > > ``` > int target = indy[BSM=PatternSwitch, args=[P,Q]](o) > switch (target) { > case 0: A > case 1: B > } > ``` > > A symbolic description of the patterns is provided as the bootstrap argument > list, which builds a decision tree based on analysis of the patterns and their > target types. > > #### Guards > > No matter how rich our patterns are, it is often the case that we will want > to provide additional filtering on the results of a pattern: > > ``` > if (shape instanceof Cylinder c && c.color() == RED) { ... } > ``` > > Because we use `instanceof` as part of a boolean expression, it is easy to > narrow the results by conjoining additional checks with `&&`. But in a `case` > label, we do not necessarily have this opportunity. Worse, the semantics of > `switch` mean that once a `case` label is selected, there is no way to say > "oops, forget it, keep trying from the next label". > > It is common in languages with pattern matching to support some form of "guard" > expression, which is a boolean expression that conditions whether the case > matches, such as: > > ``` > case Point(var x, var y) > __where x == y: ... > ``` > > Bindings from the pattern would have to be available in guard expressions. 
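The lowering and the guard idea described above can be sketched in plain Java without `invokedynamic`. This is only an illustration under assumptions: `CaseLabel`, `PatternSwitchSketch`, `classify` and `describe` are hypothetical names, type patterns are modelled as `Class` objects, guards as `Predicate`s, and a linear scan stands in for whatever decision tree a real bootstrap method might build.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of one case label: a type pattern plus a guard.
final class CaseLabel {
    final Class<?> type;
    final Predicate<Object> guard;

    CaseLabel(Class<?> type, Predicate<Object> guard) {
        this.type = type;
        this.guard = guard;
    }
}

final class PatternSwitchSketch {
    // Stand-in for the classifier: index of the first matching case, or -1.
    static int classify(List<CaseLabel> cases, Object operand) {
        for (int i = 0; i < cases.size(); i++) {
            CaseLabel c = cases.get(i);
            // A failed guard simply means "keep trying from the next label".
            if (c.type.isInstance(operand) && c.guard.test(operand)) {
                return i;
            }
        }
        return -1;
    }

    static String describe(Object o) {
        List<CaseLabel> cases = Arrays.asList(
            new CaseLabel(String.class, x -> ((String) x).isEmpty()), // guarded
            new CaseLabel(String.class, x -> true),
            new CaseLabel(Integer.class, x -> true));
        switch (classify(cases, o)) {   // the densely numbered int switch
            case 0:  return "empty string";
            case 1:  return "string";
            case 2:  return "integer";
            default: return "no match";
        }
    }
}
```

In this sketch a `null` operand matches no case at all, because `Class.isInstance` is false for `null`; the null-handling rules are a separate question that the sketch does not try to answer.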
> > Syntactic options (and hazards) for guards abound; users would probably find it > natural to reuse `&&` to attach guards to patterns; C# has chosen `when` for > introducing guards; we could use `case P if (e)`, etc. Whatever we do here, > there is a readability risk, as the more complex guards are, the harder it is > to tell where the case label ends and the "body" begins. (And worse if we allow > switch expressions inside guards.) > > An alternate to guards is to allow an imperative `continue` statement in > `switch`, which would mean "keep trying to match from the next label." Given > the existing semantics of `continue`, this is a natural extension, but since > `continue` does not currently have meaning for switch, some work would have to > be done to disambiguate continue statements in switches enclosed in loops. The > imperative version is strictly more expressive than most reasonable forms of the > declarative version, but users are likely to prefer the declarative version. > > ## Nulls > > Almost no language design exercise is complete without some degree of wrestling > with `null`. As we define more complex patterns than simple type patterns, and > extend constructs such as `switch` (which have existing opinions about nullity) > to support patterns, we need to have a clear understanding of which patterns > are nullable, and separate the nullity behaviors of patterns from the nullity > behaviors of those constructs which use patterns. > > ## Nullity and patterns > > This topic has a number of easily-tangled concerns: > > - **Construct nullability.** Constructs to which we want to add pattern > awareness (`instanceof`, `switch`) already have their own opinion about > nulls. Currently, `instanceof` always says false when presented with a > `null`, and `switch` always NPEs. We may, or may not, wish to refine these > rules in some cases. 
> - **Pattern nullability.** Some patterns clearly would never match `null` > (such as deconstruction patterns), whereas others (an "any" pattern, and > surely the `null` constant pattern) might make sense to match null. > - **Refactoring friendliness.** There are a number of cases that we would like > to freely refactor back and forth, such as certain chains of `if ... else if` > with switches. > - **Nesting vs top-level.** The "obvious" thing to do at the top level of a > construct is not always the "obvious" thing to do in a nested construct. > - **Totality vs partiality.** When a pattern is partial on the operand type > (e.g., `case String` when the operand of switch is `Object`), it is almost > never the case we want to match null (except in the case of the `null` > constant pattern), whereas when a pattern is total on the operand type (e.g., > `case Object` in the same example), it is more justifiable to match null. > - **Inference.** It would be nice if a `var` pattern were simply inference for > a type pattern, rather than some possibly-non-denotable union. > > As a starting example, consider: > > ``` > record Box(Object o) { } > > Box box = ... > switch (box) { > case Box(Chocolate c): > case Box(Frog f): > case Box(var o): > } > ``` > > It would be highly confusing and error-prone for either of the first two > patterns to match `Box(null)` -- given that `Chocolate` and `Frog` have no type > relation, it should be perfectly safe to reorder the two. But, because the last > pattern seems so obviously total on boxes, it is quite likely that what the > author wants is to match all remaining boxes, including those that contain null. > (Further, it would be terrible if there were _no_ way to say "Match any `Box`, > even if it contains `null`. 
(While one might initially think this could be > repaired with OR patterns, imagine that `Box` had _n_ components -- we'd need to > OR together _2^n_ patterns, with complex merging, to express all the possible > combinations of nullity.)) > > Scala and C# took the approach of saying that "var" patterns are not just type > inference, they are "any" patterns -- so `Box(Object o)` matches boxes > containing a non-null payload, where `Box(var o)` matches all boxes. This > means, unfortunately, that `var` is not mere type inference -- which complicates > the role of `var` in the language considerably. Users should not have to choose > between the semantics they want and being explicit about types; these should be > orthogonal choices. The above `switch` should be equivalent to: > > ``` > Box box = ... > switch (box) { > case Box(Chocolate c): > case Box(Frog f): > case Box(Object o): > } > ``` > > and the choice to use `Object` or `var` should be solely one of whether the > manifest types are deemed to improve or impair readability. > > #### Construct and pattern nullability > > Currently, `instanceof` always says `false` on `null`, and `switch` always > throws on `null`. Whatever null opinions a construct has, these are applied > before we even test any patterns. > > We can formalize the intuition outlined above as: type patterns that are _total_ > on their target operand (`var x`, and `T t` on an operand of type `U`, where `U > <: T`) match null, and non-total type patterns do not. (Another way to say > this is: a `var` pattern is the "any" pattern, and a type pattern that is total > on its operand type is also an "any" pattern.) Additionally, the `null` > constant pattern matches null. These are the _only_ nullable patterns. 
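The totality rule just stated can be modelled with reflection. The following is a minimal sketch, not the specified semantics: `TypePatternSketch` and `matches` are hypothetical names, and the static operand type `U` is passed explicitly as a `Class` because it is not available at run time.

```java
final class TypePatternSketch {
    // Would the type pattern `T t` match `value`, given that the operand
    // has static type U? (patternType = T, operandType = U)
    static boolean matches(Class<?> patternType, Class<?> operandType, Object value) {
        // Total on the operand: U <: T, so every value of U is a T.
        boolean total = patternType.isAssignableFrom(operandType);
        if (total) {
            return true;                       // behaves as "any": matches null too
        }
        return patternType.isInstance(value);  // partial: never matches null
    }
}
```

For example, `matches(Object.class, Object.class, null)` is true because `Object o` is total on an `Object` operand, while `matches(String.class, Object.class, null)` is false because `String s` is only partial there. A `var` pattern would correspond to always taking the total branch.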
> > In our `Box` example, this means that the last case (whether written as `Box(var > o)` or `Box(Object o)`) matches all boxes, including those containing null > (because the nested pattern is total on the nested operand), but the first two > cases do not. > > If we retain the current absolute hostility of `switch` to nulls, we can't > trivially refactor from > > ``` > switch (o) { > case Box(Chocolate c): > case Box(Frog f): > case Box(var o): > } > ``` > to > > ``` > switch (o) { > case Box(var contents): > switch (contents) { > case Chocolate c: > case Frog f: > case Object o: > } > } > } > ``` > > because the inner `switch(contents)` would NPE before we tried to match any of > the patterns it contains. Instead, the user would explicitly have to do an `if > (contents == null)` test, and, if the intent was to handle `null` in the same > way as the `Object o` case, some duplication of code would be needed. We can > address this sharp corner by slightly relaxing the null-hostility of `switch`, > as described below. > > A similar sharp corner is the decomposition of a nested pattern `P(Q)` into > `P(alpha) & alpha instanceof Q`; while this is intended to be a universally > valid transformation, if P's 1st component might be null and Q is total, this > transformation would not be valid because of the existing (mild) null-hostility > of `instanceof`. Again, we may be able to address this by adjusting the rules > surrounding `instanceof` slightly. > > ## Generalizing switch > > The refactoring example above motivates why we might want to relax the > null-handling behavior of `switch`. On the other hand, the one thing the > current behavior has going for it is that at least the current behavior is easy > to reason about; it always throws when confronted with a `null`. Any relaxed > behavior would be more complex; some switches would still have to throw (for > compatibility with existing semantics), and some (which can't be expressed > today) would accept nulls. 
This is a tricky balance to achieve, but I think we > have a found a good one. > > A starting point is that we don't want to require readers to do an _O(n)_ > analysis of each of the `case` labels just to determine whether a given switch > accepts `null` or not; this should be an _O(1)_ analysis. (We do not want to > introduce a new flavor of `switch`, such as `switch-nullable`; this might seem > to fix the proximate problem but would surely create others. As we've done with > expression switch and patterns, we'd rather rehabilitate `switch` than create > an almost-but-not-quite-the-same variant.) > > Let's start with the null pattern, which we'll spell for sake of exposition > `case null`. What if you were allowed to say `case null` in a switch, and the > switch would do the obvious thing? > > ``` > switch (o) { > case null -> System.out.println("Ugh, null"); > case String s -> System.out.println("Yay, non-null: " + s); > } > ``` > > Given that the `case null` appears so close to the `switch`, it does not seem > confusing that this switch would match `null`; the existence of `case null` at > the top of the switch makes it pretty clear that this is intended behavior. (We > could further restrict the null pattern to being the first pattern in a switch, > to make this clearer.) > > Now, let's look at the other end of the switch -- the last case. What if the > last pattern is a total pattern? (Note that if any `case` has a total pattern, > it _must_ be the last one, otherwise the cases after that would be dead, which > would be an error.) Is it also reasonable for that to match null? After all, > we're saying "everything": > > ``` > switch (o) { > case String s: ... > case Object o: ... > } > ``` > > Under this interpretation, the switch-refactoring anomaly above goes away. > > The direction we're going here is that if we can localize the null-acceptance of > switches in the first (is it `case null`?) and last (is it total?) 
cases, then > the incremental complexity of allowing _some_ switches to accept null might be > outweighed by the incremental benefit of treating `null` more uniformly (and > thus eliminating the refactoring anomalies.) Note also that there is no actual > code compatibility issue; this is all mental-model compatibility. > > So far, we're suggesting: > > - A switch with a constant `null` case will accept nulls; > - If present, a constant `null` case must go first; > - A switch with a total (any) case matches also accepts nulls; > - If present, a total (any) case must go last. > > #### Relocating the problem > > It might be more helpful to view these changes as not changing the behavior of > `switch`, but of the `default` case of `switch`. We can equally well interpret > the current behavior as: > > - `switch` always accepts `null`, but matching the `default` case of a `switch` > throws `NullPointerException`; > - any `switch` without a `default` case has an implicit do-nothing `default` > case. > > If we adopt this change of perspective, then `default`, not `switch`, is in > control of the null rejection behavior -- and we can view these changes as > adjusting the behavior of `default`. So we can recast the proposed changes as: > > - Switches accept null; > - A constant `null` case will match nulls, and must go first; > - A total switch (a switch with a total `case`) cannot have a `default` case; > - A non-total switch without a `default` case gets an implicit do-nothing > `default` case; > - Matching the (implicit or explicit) default case with a `null` operand > always throws NPE. > > The main casualty here is that the `default` case does not mean the same > thing as `case var x` or `case Object o`. We can't deprecate `default`, but > for pattern switches, it becomes much less useful. > > #### What about method (declared) patterns? > > So far, we've declared all patterns, except the `null` constant pattern and the > total (any) pattern, to not match `null`. 
What about patterns that are > explicitly declared in code? It turns out we can rule out these matching > `null` fairly easily. > > We can divide declared patterns into three kinds: deconstruction patterns (dual > to constructors), static patterns (dual to static methods), and instance > patterns (dual to instance methods.) For both deconstruction and instance > patterns, the match target becomes the receiver; method bodies are never > expected to deal with the case where `this == null`. > > For static patterns, it is conceivable that they could match `null`, but this > would put a fairly serious burden on writers of static patterns to check for > `null` -- which they would invariably forget, and many more NPEs would ensue. > (Think about writing the pattern for `Optional.of(T t)` -- it would be > overwhelmingly likely we'd forget to check the target for nullity.) SO there > is a strong argument to simply say "declared patterns never match null", to > not put writers of such patterns in this situation. > > So, only the top and bottom patterns in a switch could match null; if the top > pattern is not `case null`, and the bottom pattern is not total, then the switch > throws NPE on null, otherwise it accepts null. > > #### Adjusting instanceof > > The remaining anomaly we had was about unrolling nested patterns when the inner > pattern is total. We can plug this by simply outlawing total patterns in > `instanceof`. > > This may seem like a cheap trick, but it makes sense on its own. If the > following statement was allowed: > > ``` > if (e instanceof var x) { X } > ``` > > it would simply be confusing; on the one hand, it looks like it should always > match, but on the other, `instanceof` is historically null-hostile. And, if the > pattern always matches, then the `if` statement is silly; it should be replaced > with: > > ``` > var x = e; > X > ``` > > since there's nothing conditional about it. 
> So by banning "any" patterns on the
> RHS of `instanceof`, we both avoid a confusion about what is going to happen,
> and we prevent the unrolling anomaly.
>
> For reasons of compatibility, we will have to continue to allow
>
> ```
> if (e instanceof Object) { ... }
> ```
>
> which succeeds on all non-null operands.
>
> We've been a little sloppy with the terminology of "any" vs "total"; note that
> in
>
> ```
> Point p;
> if (p instanceof Point(var x, var y)) { }
> ```
>
> the pattern `Point(var x, var y)` is total on `Point`, but not an "any" pattern
> -- it still doesn't match on p == null.
>
> On the theory that an "any" pattern in `instanceof` is silly, we may also want
> to ban other "silly" patterns in `instanceof`, such as constant patterns, since
> all of the following have simpler forms:
>
> ```
> if (x instanceof null) { ... }
> if (x instanceof "") { ... }
> if (i instanceof 3) { ... }
> ```
>
> In the first round (type patterns in `instanceof`), we mostly didn't confront
> this issue, saying that `instanceof T t` matched in all the cases where
> `instanceof T` would match. But given that the solution for `switch` relies
> on "any" patterns matching null, we may wish to adjust the behavior of
> `instanceof` before it exits preview.
>
>
> [jep305]: https://openjdk.java.net/jeps/305
> [patternmatch]: pattern-match.html
>

From brian.goetz at oracle.com  Fri Aug 14 11:24:10 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 07:24:10 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To: <8e3b4d87-87f7-fbfa-3617-90ce156f6262@oracle.com>
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com>
 <8e3b4d87-87f7-fbfa-3617-90ce156f6262@oracle.com>
Message-ID:

> Is the following what you mean by "mangle your switch badly"?
>
> switch (o) {
>   case A: ...
>   case B: do some B-ish stuff ... also, if (g) {...}
>   case C: ...
>   ...
>   case Z: ...
>   case Object: if (o instanceof B && !g) { do the B-ish non-g thing }
> }

Worse! What I meant is that if you do not have guards, you have to
move such stuff OUTSIDE of the switch. Instead of:

    case B & g: ...
    case B: b-logic
    case Object: default-logic

you have to do

    switch (x) {
        case B:
            if (!g) { break out of the switch! }
            B logic;
        case Object:
            default-logic;
    }
    if (it was B but not G) { do the default logic again }

The problem with a guard-less switch is that you get exactly one
chance to fall into one of the buckets, and once you're in a bucket,
your only choice is to fall out of the switch.

> Is a guard (a) part of the `case` construct, or (b) part of the pattern operand for a `case` construct?

Good question. We went back and forth a few times on this. My initial
preference was to make the guard logic part of the pattern; ideally, to
make "Pattern && boolean-expr" a pattern. But this is messy, since it
invites ambiguities, and it is not really needed in the other primary
consumer of patterns, since boolean expressions can already be
conjoined with &&, and flow scoping already does everything we want.
The real problem is that switch is too inflexible, a problem revealed
when its case labels are made more powerful. So it seems that the
sensible thing to do is to make guards a feature of switch, and say a
case label is one of:

    case
    case
    case when

> The original mail introduced "guard expression" as "a boolean expression that conditions whether the case matches", which sounds like (a). However, the purpose of a `case` construct is to enumerate one or more possible values of the selector expression, and if a `case` construct has a post-condition `& g()` then it's not just enumerating, and it isn't a `case` construct anymore. I mean, we don't want to see guards in the `case` constructs of legacy switches, right? (`switch (i) { case 100 & g():`) So, is the answer (b) ?

Does the above grammar suggestion answer your question?
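The "one chance to fall into a bucket" problem can be reproduced with a classic switch today. The sketch below rests on assumptions: the enum `Kind` is a hypothetical tag standing in for type patterns, the strings stand in for the B logic and default logic, and the guarded form is emulated with an `if`, since the guard syntax was still under discussion in this thread.

```java
final class GuardSketch {
    enum Kind { B, OTHER }   // hypothetical tag standing in for type patterns

    // Without guards: once inside the B bucket, the only escape is out of
    // the switch, so the default logic has to be repeated afterwards.
    static String withoutGuard(Kind x, boolean g) {
        switch (x) {
            case B:
                if (!g) break;          // fall out of the switch
                return "b-logic";
            default:
                return "default-logic";
        }
        return "default-logic";         // default logic again, for "B but not g"
    }

    // With a guard (emulated): the B bucket is only entered when g holds,
    // so "B but not g" reaches the default logic in the ordinary way.
    static String withGuard(Kind x, boolean g) {
        if (x == Kind.B && g) {         // reads as a B case guarded by g
            return "b-logic";
        }
        return "default-logic";
    }
}
```

Both methods give the same answers for every combination of `x` and `g`; the difference is that only the guard-less version has to state the default logic twice.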
From forax at univ-mlv.fr  Fri Aug 14 15:16:49 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 14 Aug 2020 17:16:49 +0200 (CEST)
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References:
Message-ID: <1754652500.221166.1597418209692.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Tagir Valeev"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Friday, 14 August 2020 05:00:35
> Subject: Re: Next up for patterns: type patterns in switch

> Hello!
>
> I haven't read all the discussions in this thread, so sorry if my
> points were already covered.

Yep, they have been covered, but your mail is a good opportunity to try to summarize the discussions.

1/ You have the same rules for top-level patterns (the ones that start with case) and nested patterns,
so the same rules work for
  switch(s) {
    case Sub sub:
    default Super zuper:
  }
and
  switch(b) {
    case Box(Sub sub):
    case Box(default Super zuper):
  }
So the behavior depends on the patterns that start with the same de-structuring patterns.

2/ Patterns have to be ordered from the most specific to the least specific,
so the following switch is valid
  switch(b) {
    case Box(Sub sub):
    case Box(default Super zuper):
  }
but not this one
  switch(b) {
    case Box(Super zuper):
    case Box(Sub sub):
  }
It's like the catch clauses of a try/catch block.

3/ Nullability: with
  switch(b) {
    case Box(Sub sub):
    case Box(default Super zuper):
  }
the question is whether Box(null) is allowed or not.
The rule is that null is allowed if either the null pattern or a total pattern is used.
  - the null pattern: case Box(null):
  - a total pattern: case Box(Super zuper):
A total pattern is a pattern that covers all the others for a destructuring prefix.
So in the switch above, Box(null) is allowed.
Also: because of 2/, case Box(null) has to be the first case of the switch above.
4/ Exhaustiveness: patterns are exhaustive if
  - the patterns cover all the cases of a sealed type or all the constants of an enum, or
  - there is a total pattern that covers all the patterns.
An expression switch requires exhaustiveness; a statement switch doesn't.

5/ Totality: As you said below, whether a pattern is total or not can change when code changes or is refactored.
Moreover, users may think that a pattern is total when it's not.
So the keyword "default" can be used as a prefix to verify that a pattern is total.
Also, if the pattern is top-level, instead of writing "case default Super ...", the syntax is simplified to "default Super".
An example with a total pattern at top level
  switch(s) {
    case Sub sub:
    default Super zuper:
  }
and an example with nested patterns
  switch(b) {
    case Box(Sub sub):
    case Box(default Super zuper):
  }
Brian thinks that default is not mandatory; I disagree.

6/ Var: a type pattern using "var" is always total, and thus accepts null.
For example, the following switch accepts Box(null)
  switch(b) {
    case Box(Sub sub):
    case Box(default var zuper):
  }
Here using both 'default' and 'var' is a little stupid, given that they both mean that the pattern is total.

7/ Default: the default in
  switch(foo) {
    default:
  }
is not considered a total pattern, to be backward compatible.
This is quite messy because it means that "default" is not a simpler syntax for "default var x", which allows null.

I hope I've not forgotten something, or painted something with the wrong color :)

>
> It looks like relocating the problem to defaults have a hole for this
> trivial switch:
>
> switch(obj) {
> case null: // fallthrough
> default:
> }
>
> If we think that default case always throws for null, should this switch throw?

No, it doesn't; there are two kinds of switches, those that disallow null and those that allow null.
A switch allows null if either there is a case null or there is a total case, i.e. a case that covers all the others (this is a recursive rule, it works at all de-structuring levels).
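The rule in the reply above can be written down as a small check. This is a sketch under assumptions: `NullAcceptanceSketch`, `Label` and `acceptsNull` are hypothetical names, only top-level type patterns and the null pattern are modelled (not the recursive, nested case), and a legacy `default:` label is deliberately not representable here, in line with point 7/.

```java
import java.util.List;

final class NullAcceptanceSketch {
    // Hypothetical case label: the null pattern when type == null,
    // otherwise a type pattern on `type`.
    static final class Label {
        final Class<?> type;

        private Label(Class<?> type) { this.type = type; }

        static Label ofNull() { return new Label(null); }
        static Label ofType(Class<?> t) { return new Label(t); }
    }

    // A switch allows null if it has a null case or a total case.
    static boolean acceptsNull(Class<?> operandType, List<Label> labels) {
        for (Label l : labels) {
            boolean isNullCase = l.type == null;
            // Total: the pattern type covers the static type of the operand.
            boolean isTotal = l.type != null && l.type.isAssignableFrom(operandType);
            if (isNullCase || isTotal) {
                return true;
            }
        }
        return false;
    }
}
```

With a `String` operand, a single `case String s` label is total and the switch accepts null; widen the operand to `CharSequence` and the same label becomes partial, so the switch rejects null. The answer depends on the operand's static type as well as on the labels themselves.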
>
> I agree that method patterns should never match null. I have no strong
> opinion on guards but 'continue' idea looks nice to me. Sometimes it's
> desired to perform a non-trivial decision on whether to match or not,
> and it's not always convenient to express it as a single expression.
> In this case, 'continue' would work. Also, it looks more friendly to
> step-by-step debugger: when debugger position lands at 'continue', you
> clearly see that the match was failed.
>
> It's unclear for me how exhaustiveness on nested patterns plays with
> null. case Box(Circle c) and case Box(Rect r) don't cover case
> Box(null) which is a valid possibility for Box type.

If you have only case Box(Circle) and case Box(Rect), null is not allowed;
if you add default Box(Shape), null is allowed because it's a total pattern (it covers Circle and Rect) and thus allows null.

>
> I'm still heavily concerned about any patterns with the explicit type
> (`T t` on an operand of type `U`, where `U <: T`). The exact type of
> the switch selector expression might not be readily visible around the
> switch statement, so it could be unclear from reading the switch
> statement whether it's null-friendly or not without IDE help.

Yes.

> Moreover, this allows action at a distance. E.g.:
>
> interface Super {}
> interface Sub extends Super {}
>
> Sub getValue();
> ...
> switch(getValue()) {
> ...
> case Sub s: ...
> }
>
> Suppose I decided to generalize the return value of getValue()
> changing it to Super. Most of the clients would either stop compiling
> or continue work as expected, so I can go through the compilation
> errors and fix them. However, this switch will silently change the
> behavior throwing on null instead of matching on the last case. This
> may introduce subtle errors.

Yes, that's why we are thinking of introducing the keyword default in front of the pattern (see 5/).

>
> Also (wearing the IDE developer hat), we have the parameter
> nullability inference in IDEA.
> It works during the indexing when
> resolve is not available and the results of this inference should not
> depend on other Java files, so it's quite limited. Currently, it works
> if the method starts with switch:
>
> static void doSmth(String s) {
>   switch(s) { // we can safely mark s parameter as not-null producing
>               // a warning on call-sites if null is passed
>     ..
>   }
> }
>
> This would not work if the switch ends with explicit type pattern:
>
> static void doSmth(U u) {
>   switch(u) {
>     ...
>     case T t: ...
>   }
> }
>
> Now, without resolving T and U we don't know whether U <: T, thus we
> cannot infer u nullity anymore. In 99% of cases if T != U then it's
> not any pattern, thus null is not allowed in this method, but we
> cannot say this for sure.

We have discussed having the keyword default, but not how it is desugared by the translation strategy,
i.e. whether or not the translation strategy encodes that a pattern is total in the constant arguments of invokedynamic.

>
> So I would stick with more explicit ways to show whether the given
> pattern matches null. In particular, separating `var t` (allows null)
> and `T t` (disallows null) is one possibility. I don't think
> overloading the `var` keyword with a new meaning is very harmful.

I agree, it's a less verbose solution than using default, but at least both Brian and Guy disagree.

> Another possibility (I like it more) is to keep the proposal as it is
> for nested patterns but invent something for top-level patterns
> because only top-level patterns affect switch null-friendliness.

[...]

Nope, every solution that has different rules for top-level and nested patterns does not pass muster (see 1/).

>
> With best regards,
> Tagir Valeev.

regards,
Rémi

>
> On Wed, Jun 24, 2020 at 9:48 PM Brian Goetz wrote:
>>
>> There are a lot of directions we could take next for pattern matching.
The one >> that builds most on what we've already done, and offers significant incremental >> expressiveness, is extending the type patterns we already have to a new >> context: switch. (There is still plenty of work to do on deconstruction >> patterns, pattern assignment, etc, but these require more design work.) >> >> Here's an overview of where I think we are here. >> >> [JEP 305][jep305] introduced the first phase of [pattern matching][patternmatch] >> into the Java language. It was deliberately limited, focusing on only one kind >> of pattern (type test patterns) and one linguistic context (`instanceof`). >> Having introduced the concept to Java developers, we can now extend both the >> kinds of patterns and the linguistic context where patterns are used. >> >> ## Patterns in switch >> >> The obvious next context in which to introduce pattern matching is `switch`; a >> switch using patterns as `case` labels can replace `if .. else if` chains with >> a more direct way of expressing a multi-way conditional. >> >> Unfortunately, `switch` is one of the most complex, irregular constructs we have >> in Java, so we must teach it some new tricks while avoiding some existing traps. >> Such tricks and traps may include: >> >> **Typing.** Currently, the operand of a `switch` may only be one of the >> integral primitive types, the box type of an integral primitive, `String`, or an >> `enum` type. (Further, if the `switch` operand is an `enum` type, the `case` >> labels must be _unqualified_ enum constant names.) Clearly we can relax this >> restriction to allow other types, and constrain the case labels to only be >> patterns that are applicable to that type, but it may leave a seam of "legacy" >> vs "pattern" switch, especially if we do not adopt bare constant literals as >> the denotation of constant patterns. 
(We have confronted this issue before with >> expression switch, and concluded that it was better to rehabilitate the `switch` >> we have rather than create a new construct, and we will make the same choice >> here, but the cost of this is often a visible seam.) >> >> **Parsing.** The grammar currently specifies that the operand of a `case` label >> is a `CaseConstant`, which casts a wide syntactic net, later narrowed with >> post-checks after attribution. This means that, since parsing is done before we >> know the type of the operand, we must be watchful for ambiguities between >> patterns and expressions (and possibly refine the production for `case` labels.) >> >> **Nullity.** The `switch` construct is currently hostile to `null`, but some >> patterns do match `null`, and it may be desirable if nulls can be handled >> within a suitably crafted `switch`. >> >> **Exhaustiveness.** For switches over the permitted subtypes of sealed types, >> we will want to be able to do exhaustiveness analysis -- including for nested >> patterns (i.e., if `Shape` is `Circle` or `Rect`, then `Box(Circle c)` and >> `Box(Rect r)` are exhaustive on `Box`.) >> >> **Fallthrough.** Fallthrough is everyone's least favorite feature of `switch`, >> but it exists for a reason. (The mistake was making fallthrough the default >> behavior, but that ship has sailed.) In the absence of an OR pattern >> combinator, one might find fallthrough in switch useful in conjunction with >> patterns: >> >> ``` >> case Box(int x): >> case Bag(int x): >> // use x >> ``` >> >> However, it is likely that we will, at least initially, disallow falling out >> of, or into, a case label with binding variables. >> >> #### Translation >> >> Switches on primitives and their wrapper types are translated using the >> `tableswitch` or `lookupswitch` bytecodes; switches on strings and enums are >> lowered in the compiler to switches involving hash codes (for strings) or >> ordinals (for enums.) 
>> >> For switches on patterns, we would need a new strategy, one likely built on >> `invokedynamic`, where we lower the cases to a densely numbered `int` switch, >> and then invoke a classifier function with the operand which tells us the first >> case number it matches. So a switch like: >> >> ``` >> switch (o) { >> case P: A >> case Q: B >> } >> ``` >> >> is lowered to: >> >> ``` >> int target = indy[BSM=PatternSwitch, args=[P,Q]](o) >> switch (target) { >> case 0: A >> case 1: B >> } >> ``` >> >> A symbolic description of the patterns is provided as the bootstrap argument >> list, which builds a decision tree based on analysis of the patterns and their >> target types. >> >> #### Guards >> >> No matter how rich our patterns are, it is often the case that we will want >> to provide additional filtering on the results of a pattern: >> >> ``` >> if (shape instanceof Cylinder c && c.color() == RED) { ... } >> ``` >> >> Because we use `instanceof` as part of a boolean expression, it is easy to >> narrow the results by conjoining additional checks with `&&`. But in a `case` >> label, we do not necessarily have this opportunity. Worse, the semantics of >> `switch` mean that once a `case` label is selected, there is no way to say >> "oops, forget it, keep trying from the next label". >> >> It is common in languages with pattern matching to support some form of "guard" >> expression, which is a boolean expression that conditions whether the case >> matches, such as: >> >> ``` >> case Point(var x, var y) >> __where x == y: ... >> ``` >> >> Bindings from the pattern would have to be available in guard expressions. >> >> Syntactic options (and hazards) for guards abound; users would probably find it >> natural to reuse `&&` to attach guards to patterns; C# has chosen `when` for >> introducing guards; we could use `case P if (e)`, etc. 
Whatever we do here, >> there is a readability risk, as the more complex guards are, the harder it is >> to tell where the case label ends and the "body" begins. (And worse if we allow >> switch expressions inside guards.) >> >> An alternate to guards is to allow an imperative `continue` statement in >> `switch`, which would mean "keep trying to match from the next label." Given >> the existing semantics of `continue`, this is a natural extension, but since >> `continue` does not currently have meaning for switch, some work would have to >> be done to disambiguate continue statements in switches enclosed in loops. The >> imperative version is strictly more expressive than most reasonable forms of the >> declarative version, but users are likely to prefer the declarative version. >> >> ## Nulls >> >> Almost no language design exercise is complete without some degree of wrestling >> with `null`. As we define more complex patterns than simple type patterns, and >> extend constructs such as `switch` (which have existing opinions about nullity) >> to support patterns, we need to have a clear understanding of which patterns >> are nullable, and separate the nullity behaviors of patterns from the nullity >> behaviors of those constructs which use patterns. >> >> ## Nullity and patterns >> >> This topic has a number of easily-tangled concerns: >> >> - **Construct nullability.** Constructs to which we want to add pattern >> awareness (`instanceof`, `switch`) already have their own opinion about >> nulls. Currently, `instanceof` always says false when presented with a >> `null`, and `switch` always NPEs. We may, or may not, wish to refine these >> rules in some cases. >> - **Pattern nullability.** Some patterns clearly would never match `null` >> (such as deconstruction patterns), whereas others (an "any" pattern, and >> surely the `null` constant pattern) might make sense to match null. 
>> - **Refactoring friendliness.** There are a number of cases that we would like >> to freely refactor back and forth, such as certain chains of `if ... else if` >> with switches. >> - **Nesting vs top-level.** The "obvious" thing to do at the top level of a >> construct is not always the "obvious" thing to do in a nested construct. >> - **Totality vs partiality.** When a pattern is partial on the operand type >> (e.g., `case String` when the operand of switch is `Object`), it is almost >> never the case we want to match null (except in the case of the `null` >> constant pattern), whereas when a pattern is total on the operand type (e.g., >> `case Object` in the same example), it is more justifiable to match null. >> - **Inference.** It would be nice if a `var` pattern were simply inference for >> a type pattern, rather than some possibly-non-denotable union. >> >> As a starting example, consider: >> >> ``` >> record Box(Object o) { } >> >> Box box = ... >> switch (box) { >> case Box(Chocolate c): >> case Box(Frog f): >> case Box(var o): >> } >> ``` >> >> It would be highly confusing and error-prone for either of the first two >> patterns to match `Box(null)` -- given that `Chocolate` and `Frog` have no type >> relation, it should be perfectly safe to reorder the two. But, because the last >> pattern seems so obviously total on boxes, it is quite likely that what the >> author wants is to match all remaining boxes, including those that contain null. >> (Further, it would be terrible if there were _no_ way to say "Match any `Box`, >> even if it contains `null`. 
(While one might initially think this could be >> repaired with OR patterns, imagine that `Box` had _n_ components -- we'd need to >> OR together _2^n_ patterns, with complex merging, to express all the possible >> combinations of nullity.)) >> >> Scala and C# took the approach of saying that "var" patterns are not just type >> inference, they are "any" patterns -- so `Box(Object o)` matches boxes >> containing a non-null payload, where `Box(var o)` matches all boxes. This >> means, unfortunately, that `var` is not mere type inference -- which complicates >> the role of `var` in the language considerably. Users should not have to choose >> between the semantics they want and being explicit about types; these should be >> orthogonal choices. The above `switch` should be equivalent to: >> >> ``` >> Box box = ... >> switch (box) { >> case Box(Chocolate c): >> case Box(Frog f): >> case Box(Object o): >> } >> ``` >> >> and the choice to use `Object` or `var` should be solely one of whether the >> manifest types are deemed to improve or impair readability. >> >> #### Construct and pattern nullability >> >> Currently, `instanceof` always says `false` on `null`, and `switch` always >> throws on `null`. Whatever null opinions a construct has, these are applied >> before we even test any patterns. >> >> We can formalize the intuition outlined above as: type patterns that are _total_ >> on their target operand (`var x`, and `T t` on an operand of type `U`, where `U >> <: T`) match null, and non-total type patterns do not. (Another way to say >> this is: a `var` pattern is the "any" pattern, and a type pattern that is total >> on its operand type is also an "any" pattern.) Additionally, the `null` >> constant pattern matches null. These are the _only_ nullable patterns. 
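The rule for type patterns stated above can be modelled in a few lines of ordinary Java. This is only a sketch of the proposed semantics, not of any implementation: `NullabilityRule`, `isTotal` and `matches` are hypothetical names, and using `Class` objects ignores generics, which a real definition of totality would have to handle.

```java
class NullabilityRule {
    // A type pattern `T t` is total on an operand of static type U when U <: T.
    static boolean isTotal(Class<?> patternType, Class<?> operandType) {
        return patternType.isAssignableFrom(operandType);
    }

    // Proposed rule: a total type pattern matches every value, including
    // null; a non-total type pattern needs a non-null instance of its type.
    static boolean matches(Class<?> patternType, Class<?> operandType, Object value) {
        if (isTotal(patternType, operandType)) {
            return true;
        }
        return value != null && patternType.isInstance(value);
    }
}
```

Under this model, `matches(Object.class, Object.class, null)` is true (the `Box(Object o)` case), while `matches(String.class, Object.class, null)` is false (a partial pattern such as `Box(Chocolate c)` never sees null).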
>> >> In our `Box` example, this means that the last case (whether written as `Box(var >> o)` or `Box(Object o)`) matches all boxes, including those containing null >> (because the nested pattern is total on the nested operand), but the first two >> cases do not. >> >> If we retain the current absolute hostility of `switch` to nulls, we can't >> trivially refactor from >> >> ``` >> switch (o) { >> case Box(Chocolate c): >> case Box(Frog f): >> case Box(var o): >> } >> ``` >> to >> >> ``` >> switch (o) { >> case Box(var contents): >> switch (contents) { >> case Chocolate c: >> case Frog f: >> case Object o: >> } >> } >> } >> ``` >> >> because the inner `switch(contents)` would NPE before we tried to match any of >> the patterns it contains. Instead, the user would explicitly have to do an `if >> (contents == null)` test, and, if the intent was to handle `null` in the same >> way as the `Object o` case, some duplication of code would be needed. We can >> address this sharp corner by slightly relaxing the null-hostility of `switch`, >> as described below. >> >> A similar sharp corner is the decomposition of a nested pattern `P(Q)` into >> `P(alpha) & alpha instanceof Q`; while this is intended to be a universally >> valid transformation, if P's 1st component might be null and Q is total, this >> transformation would not be valid because of the existing (mild) null-hostility >> of `instanceof`. Again, we may be able to address this by adjusting the rules >> surrounding `instanceof` slightly. >> >> ## Generalizing switch >> >> The refactoring example above motivates why we might want to relax the >> null-handling behavior of `switch`. On the other hand, the one thing the >> current behavior has going for it is that at least the current behavior is easy >> to reason about; it always throws when confronted with a `null`. 
Any relaxed behavior would be more complex; some switches would still have to throw (for compatibility with existing semantics), and some (which can't be expressed today) would accept nulls. This is a tricky balance to achieve, but I think we have found a good one.
>>
>> A starting point is that we don't want to require readers to do an _O(n)_ analysis of each of the `case` labels just to determine whether a given switch accepts `null` or not; this should be an _O(1)_ analysis. (We do not want to introduce a new flavor of `switch`, such as `switch-nullable`; this might seem to fix the proximate problem but would surely create others. As we've done with expression switch and patterns, we'd rather rehabilitate `switch` than create an almost-but-not-quite-the-same variant.)
>>
>> Let's start with the null pattern, which we'll spell for sake of exposition `case null`. What if you were allowed to say `case null` in a switch, and the switch would do the obvious thing?
>>
>> ```
>> switch (o) {
>>     case null -> System.out.println("Ugh, null");
>>     case String s -> System.out.println("Yay, non-null: " + s);
>> }
>> ```
>>
>> Given that the `case null` appears so close to the `switch`, it does not seem confusing that this switch would match `null`; the existence of `case null` at the top of the switch makes it pretty clear that this is intended behavior. (We could further restrict the null pattern to being the first pattern in a switch, to make this clearer.)
>>
>> Now, let's look at the other end of the switch -- the last case. What if the last pattern is a total pattern? (Note that if any `case` has a total pattern, it _must_ be the last one, otherwise the cases after that would be dead, which would be an error.) Is it also reasonable for that to match null? After all, we're saying "everything":
>>
>> ```
>> switch (o) {
>>     case String s: ...
>>     case Object o: ...
>> }
>> ```
>>
>> Under this interpretation, the switch-refactoring anomaly above goes away.
>>
>> The direction we're going here is that if we can localize the null-acceptance of switches in the first (is it `case null`?) and last (is it total?) cases, then the incremental complexity of allowing _some_ switches to accept null might be outweighed by the incremental benefit of treating `null` more uniformly (and thus eliminating the refactoring anomalies.) Note also that there is no actual code compatibility issue; this is all mental-model compatibility.
>>
>> So far, we're suggesting:
>>
>> - A switch with a constant `null` case will accept nulls;
>> - If present, a constant `null` case must go first;
>> - A switch with a total (any) case also accepts nulls;
>> - If present, a total (any) case must go last.
>>
>> #### Relocating the problem
>>
>> It might be more helpful to view these changes as not changing the behavior of `switch`, but of the `default` case of `switch`. We can equally well interpret the current behavior as:
>>
>> - `switch` always accepts `null`, but matching the `default` case of a `switch` throws `NullPointerException`;
>> - any `switch` without a `default` case has an implicit do-nothing `default` case.
>>
>> If we adopt this change of perspective, then `default`, not `switch`, is in control of the null rejection behavior -- and we can view these changes as adjusting the behavior of `default`. So we can recast the proposed changes as:
>>
>> - Switches accept null;
>> - A constant `null` case will match nulls, and must go first;
>> - A total switch (a switch with a total `case`) cannot have a `default` case;
>> - A non-total switch without a `default` case gets an implicit do-nothing `default` case;
>> - Matching the (implicit or explicit) default case with a `null` operand always throws NPE.
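The behavior that this change of perspective has to preserve can be observed in any existing `switch`. The sketch below uses hypothetical names (`DefaultAndNull`, `describe`, `throwsOnNull`) and only demonstrates today's semantics: a null operand produces a `NullPointerException`, and nothing observable distinguishes "the switch threw" from "the default case threw".

```java
class DefaultAndNull {
    // An ordinary string switch with an explicit default.
    static String describe(String s) {
        switch (s) {
            case "a": return "letter a";
            default:  return "something else";
        }
    }

    // Returns true if switching on null threw NullPointerException.
    static boolean throwsOnNull() {
        try {
            describe(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```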
>>
>> The main casualty here is that the `default` case does not mean the same thing as `case var x` or `case Object o`. We can't deprecate `default`, but for pattern switches, it becomes much less useful.
>>
>> #### What about method (declared) patterns?
>>
>> So far, we've declared all patterns, except the `null` constant pattern and the total (any) pattern, to not match `null`. What about patterns that are explicitly declared in code? It turns out we can rule out these matching `null` fairly easily.
>>
>> We can divide declared patterns into three kinds: deconstruction patterns (dual to constructors), static patterns (dual to static methods), and instance patterns (dual to instance methods.) For both deconstruction and instance patterns, the match target becomes the receiver; method bodies are never expected to deal with the case where `this == null`.
>>
>> For static patterns, it is conceivable that they could match `null`, but this would put a fairly serious burden on writers of static patterns to check for `null` -- which they would invariably forget, and many more NPEs would ensue. (Think about writing the pattern for `Optional.of(T t)` -- it would be overwhelmingly likely we'd forget to check the target for nullity.) So there is a strong argument to simply say "declared patterns never match null", to not put writers of such patterns in this situation.
>>
>> So, only the top and bottom patterns in a switch could match null; if the top pattern is not `case null`, and the bottom pattern is not total, then the switch throws NPE on null, otherwise it accepts null.
>>
>> #### Adjusting instanceof
>>
>> The remaining anomaly we had was about unrolling nested patterns when the inner pattern is total. We can plug this by simply outlawing total patterns in `instanceof`.
>>
>> This may seem like a cheap trick, but it makes sense on its own.
If the >> following statement was allowed: >> >> ``` >> if (e instanceof var x) { X } >> ``` >> >> it would simply be confusing; on the one hand, it looks like it should always >> match, but on the other, `instanceof` is historically null-hostile. And, if the >> pattern always matches, then the `if` statement is silly; it should be replaced >> with: >> >> ``` >> var x = e; >> X >> ``` >> >> since there's nothing conditional about it. So by banning "any" patterns on the >> RHS of `instanceof`, we both avoid a confusion about what is going to happen, >> and we prevent the unrolling anomaly. >> >> For reasons of compatibility, we will have to continue to allow >> >> ``` >> if (e instanceof Object) { ... } >> ``` >> >> which succeeds on all non-null operands. >> >> We've been a little sloppy with the terminology of "any" vs "total"; note that >> in >> >> ``` >> Point p; >> if (p instanceof Point(var x, var y)) { } >> ``` >> >> the pattern `Point(var x, var y)` is total on `Point`, but not an "any" pattern >> -- it still doesn't match on p == null. >> >> On the theory that an "any" pattern in `instanceof` is silly, we may also want >> to ban other "silly" patterns in `instanceof`, such as constant patterns, since >> all of the following have simpler forms: >> >> ``` >> if (x instanceof null) { ... } >> if (x instanceof "") { ... } >> if (i instanceof 3) { ... } >> ``` >> >> In the first round (type patterns in `instanceof`), we mostly didn't confront >> this issue, saying that `instanceof T t` matched in all the cases where >> `instanceof T` would match. But given that the solution for `switch` relies >> on "any" patterns matching null, we may wish to adjust the behavior of >> `instanceof` before it exits preview. 
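The existing `instanceof` behavior that these adjustments start from is easy to check. The sketch below (hypothetical class and method names; it needs a JDK in which type patterns in `instanceof` are available) shows that `instanceof Object` succeeds on every non-null operand and fails on null, and that a type pattern matches exactly where the plain type test would.

```java
class InstanceofAndNull {
    // Plain type test: true for every non-null reference, false for null.
    static boolean isObject(Object e) {
        return e instanceof Object;
    }

    // Type pattern: binds `s` only when the plain test would succeed.
    static String lengthOrNone(Object e) {
        if (e instanceof String s) {
            return "length " + s.length();
        }
        return "none";
    }
}
```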
>>
>>
>> [jep305]: https://openjdk.java.net/jeps/305
>> [patternmatch]: pattern-match.html

From Daniel_Heidinga at ca.ibm.com Fri Aug 14 16:01:25 2020
From: Daniel_Heidinga at ca.ibm.com (Daniel Heidinga)
Date: Fri, 14 Aug 2020 16:01:25 +0000
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From alex.buckley at oracle.com Fri Aug 14 16:46:35 2020
From: alex.buckley at oracle.com (Alex Buckley)
Date: Fri, 14 Aug 2020 09:46:35 -0700
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References: <8a0d00c7-07a4-2d1e-8570-7e2cebd7f00e@oracle.com> <8e3b4d87-87f7-fbfa-3617-90ce156f6262@oracle.com>
Message-ID:

On 8/14/2020 4:24 AM, Brian Goetz wrote:
> My initial preference was to make the guard logic part of the pattern; ideally, to make "Pattern && boolean-expr" a pattern. But this is messy, since it invites ambiguities, and not really needed in the other primary consumer of patterns, since boolean expressions can already be conjoined with &&, and flow scoping already does everything we want. The real problem is switch is too inflexible, a problem revealed when its case labels are made more powerful. So it seems that the sensible thing to do is to make guards a feature of switch, and say a case label is one of:
>
>     case
>     case
>     case when

Got it, thanks.

Alex

From brian.goetz at oracle.com Fri Aug 14 17:19:34 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 13:19:34 -0400
Subject: [pattern-switch] Summary of open issues
Message-ID:

Here's a summary of the issues raised in the reviews of the patterns-in-switch document. I'm going to (try to) start a new thread for each of them; let's not reply to this one with new topics (or with discussion on these topics.)
I'll update this thread as we add or remove things from the list.

 - Is totality too subtle? (Remi) While the notion of using totality to subsume nullability (at least in nested contexts) appears sound, he is concerned that the difference between total and non-total patterns may be too subtle, and this may lead to NPE issues. To evaluate this, we need to evaluate both the "is totality too subtle" and the "how much are we worried about NPE in this context" directions.

 - Guards. (John, Tagir) There is acknowledgement that some sort of "whoops, not this case" support is needed in order to maintain switch as a useful construct in the face of richer case labels, but some disagreement about whether an imperative statement (e.g., continue) or a declarative guard (e.g., `when `) is the right choice.

 - Exhaustiveness and null. (Tagir) For sealed domains (enums and sealed types), we kind of cheated with expression switches because we could count on the switch filtering out the null. But Tagir raises an excellent point, which is that we do not yet have a sound definition of exhaustiveness that scales to nested patterns (do Box(Rect) and Box(Circle) cover Box(Shape)?) This is an interaction between sealed types and patterns that needs to be ironed out. (Thanks Tagir!)

 - Switch and null. (Tagir, Kevin) Should we reconsider trying to rehabilitate switch's null-acceptance? Several people are questioning whether this is trying to push things too far for too little benefit.

 - Rehabilitating default. The current design leaves default to rot; it is possible it has a better role to play with respect to the rehabilitation of switch, such as signalling that the switch is total.

 - Restrictions on instanceof. It has been proposed that we restrict total patterns from instanceof to avoid confusion; while no one has really objected, a few people have expressed mild discomfort.
Leaving it on the list for now until we resolve some of the other nullity questions.

 - Meta. (Brian) Nearly all of this is about null. Is it possible that everything else about the proposal is so perfect that there's nothing else to talk about? Seems unlikely. I recommend we turn up the attenuation knob on nullity issues to leave some oxygen for some of the other flowers.

From brian.goetz at oracle.com Fri Aug 14 17:20:26 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 13:20:26 -0400
Subject: [pattern-switch] Exhaustiveness
Message-ID:

>
>  - Exhaustiveness and null. (Tagir) For sealed domains (enums and sealed types), we kind of cheated with expression switches because we could count on the switch filtering out the null. But Tagir raises an excellent point, which is that we do not yet have a sound definition of exhaustiveness that scales to nested patterns (do Box(Rect) and Box(Circle) cover Box(Shape)?) This is an interaction between sealed types and patterns that needs to be ironed out. (Thanks Tagir!)

[ Breaking this out from Tagir's more comprehensive reply ]

> It's unclear for me how exhaustiveness on nested patterns plays with null. case Box(Circle c) and case Box(Rect r) don't cover case Box(null) which is a valid possibility for Box type.

It's not even clear how exhaustiveness plays with null even without nesting, so let's start there. Consider this switch:

    switch (trafficLight) {
        case GREEN, YELLOW: driveReallyFast();
        case RED: sigh();
    }

Is it exhaustive? Well, we want to say yes. And with the existing null-hostility of switch, it is. But even without that, we'd like to say yes, because a null enum value is almost always an error, and making users deal with cases that don't happen in reality is kind of rude.
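For reference, an expression switch over an enum already behaves this way: it compiles without a `default` when every constant is covered, and a null operand is rejected by the switch itself. The sketch below uses hypothetical names (`TrafficDemo`, `react`, `throwsOnNull`) and string results in place of the `driveReallyFast()` and `sigh()` calls.

```java
class TrafficDemo {
    enum TrafficLight { GREEN, YELLOW, RED }

    // No default needed: all three constants are covered.
    static String react(TrafficLight light) {
        return switch (light) {
            case GREEN, YELLOW -> "drive";
            case RED -> "sigh";
        };
    }

    // Returns true if a null operand was rejected with NullPointerException.
    static boolean throwsOnNull() {
        try {
            react(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

The cases never have to mention null, because the construct's own null-hostility covers that hole.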
For a domain sealed to a set of alternatives (enums or sealed classes), let's say that a set of patterns is _weakly exhaustive_ if it covers all the alternatives but not null, and _strongly exhaustive_ if it also covers null. When we did switch expressions, we said that weakly exhaustive coverings didn't need a default in a switch expression. I think we're primed to say the same thing for sealed classes. But, this "weak is good enough" leans on the fact that the existing hostility of switch will cover what we miss. We get no such cover in nested cases.

I think it's worth examining further why we are willing to accept the weak coverage with enums. Is it really that we're willing to assume that enums just should never be null? If we had type cardinalities in the language, would we treat `enum X` as declaring a cardinality-1 type called X? I think we might. OK, what about sealed classes? Would the same thing carry over? Not so sure there. And this is a problem, because we ultimately want:

    case Optional.of(var x):
    case Optional.empty():

to be exhaustive on Optional, and said exhaustiveness will likely lean on some sort of sealing.

This is related to Guy's observation that totality is a "subtree all the way down" property. Consider:

    sealed class Container permits Box, Bag { }
    sealed class Shape permits Rect, Circle { }

Ignoring null, Box+Bag should be exhaustive on container, and Rect+Circle should be exhaustive on shape. So if we are switching over a Container, then what of:

    case Box(Rect r):
    case Box(Circle c):
    case Bag(Rect r):
    case Bag(Circle c):

We have some "nullity holes" in three places: Box(null), Bag(null), and null itself. Is this set of cases exhaustive on Bags, Boxes, or Containers?
I think users would like to be able to write the above four cases and treat it as exhaustive; having to explicitly provide Box(null) / Box b, Bag(null) / Bag b, or a catch-all to accept null+Box(null)+Bag(null) would all be deemed unpleasant ceremony.

Hmm...

From brian.goetz at oracle.com Fri Aug 14 17:20:30 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 13:20:30 -0400
Subject: [pattern-switch] Guards
Message-ID: <09e39b8d-2e1c-ca19-9617-dec2a3f5996e@oracle.com>

>
>  - Guards. (John, Tagir) There is acknowledgement that some sort of "whoops, not this case" support is needed in order to maintain switch as a useful construct in the face of richer case labels, but some disagreement about whether an imperative statement (e.g., continue) or a declarative guard (e.g., `when `) is the right choice.

This is probably the biggest blocking decision in front of us.

John correctly points out that the need for some sort of guard is a direct consequence of making switch stronger; with the current meaning of switch, which is "which one of these is it", there's no need for backtracking, but as we can express richer case labels, the risk of the case label _not being rich enough_ starts to loom.

We explored rolling boolean guards into patterns themselves (`P && g`), which was theoretically attractive but turned out to not be all that great. There are some potential ambiguities (even if we do something else about constant patterns, there are still some patterns that look like expressions and vice versa, making the grammar ugly here) and it just doesn't have that much incremental expressive power, since the most credible other use of patterns already (instanceof) has no problem conjoining additional conditions, because it's a boolean expression. So this is largely about filling in the gaps of switch so that we don't have fall-off-the-cliff behaviors.
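The point about `instanceof` already conjoining conditions can be seen in a small, self-contained sketch. All names here (`GuardWithInstanceof`, `Color`, `Cylinder`, `Sphere`, `classify`) are hypothetical and chosen to echo the `Cylinder` example from the design note; the code needs a JDK with records and type patterns in `instanceof`.

```java
class GuardWithInstanceof {
    enum Color { RED, BLUE }
    record Cylinder(Color color) { }
    record Sphere(Color color) { }

    static String classify(Object shape) {
        // The binding `c` is in scope on the right of `&&` and in the body,
        // so the extra check acts as a guard on the pattern.
        if (shape instanceof Cylinder c && c.color() == Color.RED) {
            return "red cylinder";
        } else if (shape instanceof Cylinder c) {
            // Reached when the guard failed: the chain simply keeps trying.
            return "other cylinder";
        }
        return "not a cylinder";
    }
}
```

In an `if ... else if` chain a failed guard just falls through to the next test; that "keep trying" step is what a `case` label lacks.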
There are two credible approaches here:

 - An imperative statement (like `continue` or `next-case`), which means "whoops, fell in the wrong bucket, please backtrack to the dispatch";

 - A declarative clause on the case label (like `when `) that qualifies whether the case is selected.

Most of the discussion so far has been on the axis of "continue is lower-level, and therefore better suited to be a language primitive" vs "the code that uses guards is easier to read and reason about." Assuming we have to do one (and I think we do), we have three choices (one, the other, or both.) I think we should step away from the either/or mentality and try to shine a light on what goes well, or badly, when we _don't_ have one or the other.

For example, with guards, we can express fine degrees of refinement in the case labels:

    case P & g1: ...
    case P & g2: ...
    case P & g3: ...

but without them, we can only have one `case P`:

    case P:
        if (g1) { ... }
        else if (g2) { ... }
        else if (g3) { ... }

My main fear of the without-guards branches is that it will be prohibitively hard to understand what a switch is doing, because the case arms will be full of imperative control-flow logic.

On the other hand, a valid concern when you have guards is that there will be so much logic in the guard that you won't be able to tell where the case label ends and where the arm begins.

From brian.goetz at oracle.com Fri Aug 14 17:20:21 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 13:20:21 -0400
Subject: Next up for patterns: type patterns in switch
In-Reply-To:
References:
Message-ID: <8b4faf00-1711-a8b7-5489-7acf45782a91@oracle.com>

> I haven't read all the discussions in this thread, so sorry if my points were already covered.

As it turned out, that's probably best :)

> It looks like relocating the problem to defaults have a hole for this trivial switch:
>
> switch(obj) {
>  case null:  // fallthrough
>  default:
> }
>
> If we think that default case always throws for null, should this switch throw?

Nice "catch", fall through claims another victim.

To summarize some of the previous discussions: the original document kind of leaves "default" to rot, but perhaps it can be rehabilitated to have more of a role, such as indicating "this switch is total and this is the last case." I'm going to start another discussion on this.

> It's unclear for me how exhaustiveness on nested patterns plays with null. case Box(Circle c) and case Box(Rect r) don't cover case Box(null) which is a valid possibility for Box type.

It's not even clear how exhaustiveness plays with null even without nesting :) Moved to a separate thread.

> I'm still heavily concerned about any patterns with the explicit type (`T t` on an operand of type `U`, where `U <: T`).

To be moved to a separate thread.

From brian.goetz at oracle.com Fri Aug 14 17:20:33 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 13:20:33 -0400
Subject: [pattern-switch] Is totality too subtle?
Message-ID: <772a72c6-e569-a2a0-bba9-0ecaa15f8ace@oracle.com>

I just want to summarize what's been said on this:

>  - Is totality too subtle? (Remi) While the notion of using totality to subsume nullability (at least in nested contexts) appears sound, he is concerned that the difference between total and non-total patterns may be too subtle, and this may lead to NPE issues. To evaluate this, we need to evaluate both the "is totality too subtle" and the "how much are we worried about NPE in this context" directions.

... but I'd like to do so without repeating what has been said before. So please, new observations only.

A key assumption underlying the proposed semantics of nullity in patterns (not switches) stems from the notion that the definition of matching a pattern should be independent of the semantics of its consuming construct. Instanceof and switch may have pre-existing opinions about patterns, but we should be wary about polluting those. A pattern should mean something on its own.

Totality is defined relative to a target type; `String s` may be total on String, but definitely not on Object. We can view a nested pattern as a tree; as Guy observed, totality is a property of a _sub-tree_ of this tree, which might be one leaf, the whole tree, or some sub-tree of the tree. But a node cannot be total on its target if the sub-trees are not total on their corresponding target.

Further, the above property is desirable; we anticipate that catch-alls for an entire pattern chain, or for a sub-chain of a pattern chain, will be common, because they are useful; being able to say "Box containing anything", and destructure it at the same time, is an essential case that pattern matching should cover. This requires the existence of leaf (type) patterns that are nullable. One way to get there is with the current proposal; another is to have two of every kind of pattern (or a modifier on patterns) to add in or subtract out nullity.

The main concern, if I understand it correctly, comes in the context of switch. The switch header is already "at a distance" from the possibly-total pattern (at the bottom); the type of the switch target may be further "at a distance" (because of var, expression nesting, etc), and so the concern is it may not be sufficiently obvious that a pattern is total, and therefore the semantics of the switch may not be sufficiently obvious.

To the extent there are two ways to write a pattern (or compose two patterns), identical except for nullability, the choice of which is the default (nullable or not) is extremely consequential for actual user experience.

I think the above is largely neutral and agreed on. Now, some personal observations:

 - I think the concern that "it will be too hard to tell if a switch is meant to be exhaustive" is overblown. Catch-all switches typically look like catch-alls, both because of what they say and where they are placed. There will be common patterns of cases, which may not be familiar to everyone now, but will be very familiar soon enough, which will provide significant context for helping to understand the author intent.

 - Even if the above concern is not overblown, I think the consequences of getting it wrong may still be. Yes, the null-handling behaviors of some switches may not be obvious, but: (a) right now, no switch ever deals with null at all, and there is not an epidemic of NPEs flying out of switches, and (b) what this does is move the null handling from a place where it always throws to a place where it might throw if the user is not careful. Multiplying the low probability of nulls showing up at the gate unexpectedly in the first place, by the low conditional probability that this will make things worse, it seems like we're pretty deep in corner-case-of-a-corner-case.

 - I think that many of the proposed "remedies" are misguided. Many of the action-at-a-distance concerns can be likened to the interaction between `var` and diamond; yes, when you combine two features that have some degree of implicitness, things get less obvious. But the answer is not "don't do var", or "disallow var in the presence of diamond" (because they do interact in a well-behaved and potentially useful way), it is to warn users to not go overboard on being implicit everywhere they possibly can; use `var` where it adds value, and don't use it where it doesn't.

So, while I am sympathetic to the concerns, I am skeptical that this is such a big problem that we have to distort the language (and force users to reason about nullity on every pattern.)
All the cures proposed so far seem much worse than the disease.

We still need to work through the other nullity issues, which might cause some rearrangement of the deck chairs. So at the very least, let's let this one lie for a while, until the others are worked out, but assuming that causes no change, I am of the opinion that we should do nothing now, and revisit in Preview/2 in light of actual experience, to see if it turns out that people can't handle the current behavior. I think we're in deep danger of extrapolating from an abstract fear.

From brian.goetz at oracle.com Fri Aug 14 17:42:42 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 14 Aug 2020 13:42:42 -0400
Subject: A peek at the roadmap for pattern matching and more
In-Reply-To:
References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com>
Message-ID: <58dfd44f-5c4b-1a20-378a-5712f51a145f@oracle.com>

> Aside: My understanding of the MethodParameters attribute is that it's
> intended more as an informational than API and unlikely to affect
> application behaviour at runtime.

The existing MP attribute represents a somewhat unfortunate compromise.

The EE folks (remember EE?) had a compelling-enough use case for being able to reflect over parameter names, which was that they wanted to generate various remote-invocation stubs (corba, *RPC, etc) from classes, but felt that having to annotate each parameter (`foo(@MyNameIs("bob") String bob)`) would suck. So they wanted us to reify the parameter names by default in the classfile, for reflection to serve up.

The ME folks (remember ME?) freaked out; "Whoa, more bytes in the classfile, you have to make that optional, and not the default!"

The EE guys came back with "OK, how about driving retention based on an annotation."

We said (for the zillion'th time): You know that's not what annotations are for.
So the compromise was to make it a tooling option; invoke `javac` with `-parameters`. Result: no one can count on them.

Now, the MP attribute itself should not need to change here; only the conditions under which the language promises to make these available for reflection and condition the compatibility rules.

> Making names significant and part of the API will affect refactoring
> and separate compilation: developers are used to being able to freely
> rename parameters without worrying about affects on other parts of
> their code base or about breaking their consumers (library authors).
> Are there things the language can do to help with name transitions?
> Is treating parameter name changes in a __byname method as a breaking
> API change something that developers will have to adapt to?

Note that we already have this problem with records; the name of a component in records flows into constructor descriptors, accessor names, etc. It is an open question whether the developers can handle the responsibility :)

I have long wanted to have widely available, easy to use tooling that could detect incompatible changes between versions of a module, where it would look at the old bytecode and say "whoa, you did there, did you mean to?" In the presence of such tooling, this would be one more set of assertions for such a mechanism. But so far I haven't gotten my wish. (Perhaps our IDE friends are cooking something like this up?) I do think that detection of incompatible changes is something for tooling to handle, not the language. But I agree that this is a concern, and more so than for records, which have the component names stapled to their forehead.

> Similarly libraries that implement a common specification (Java EE,
> etc) have historically allowed each implementation to name the
> parameters in ways that made sense for their implementation.
> I can see this becoming "names are API, live with it" which is a reasonable
> answer but one that will take some time to filter thru the ecosystem
> after the feature is released. Any concerns that adopting this
> feature forces a "flag day" on consumers of such libraries with all
> implementors needing to move in lockstep?

I think the real challenge here is ensuring that, if the decision is made to make a member __byname, that they understand the consequences. ("Nah, that could never change!", said every developer ever, incorrectly.) Overuse of __byname could easily become a problem. (I think also you're observing that this is in a sense worse than "names are API, live with it"; really, it's "names are *sometimes* API, live with that!")

A member going from insignificant to significant naming should be a compatible move, as should (eventually) adding a named parameter with a default to a constructor (though there's more work to do here to make this compatible.) The rule about telescoping chains is there in part to allow us to evolve such members and leave the old entry points around. The flag day is when we realize "crap, that name really is so terrible we have to change it."

> In the "Refining overloading" section [2]
>
> > Each __byname constructor / deconstructor can be invoked positionally, so it must follow
> > the existing rules for overloading and overload selection. A set of __byname members is
> > valid if and only if for each two such members, their name sets are either disjoint or one
> > is a proper subset of the other and the types corresponding to the common names are the same.
>
> This seems like an answer to the refactoring concerns above as it
> provides a way to leave previous API points around to not break
> consumers. Will the normal deprecation schemes apply to these
> name-only overloads?
> If so it covers most of the refactoring concerns
> - though the need to leave the old API points is a bit of code smell
> that may discourage some kinds of refactoring. Time will tell.

Yes. The real concern is (when we get to default parameters) that, if you have seventeen parameters already, that you'll have 18 tomorrow is virtually a guarantee. This is tricky ground, I think I have a story though.

> Is the intention that the VM would check the disjoint constraints
> during classloading? I can see it being pushed to the VM to validate
> or left as a language level rule with the VM's resolution taking the
> first available match. At first glance, both seem reasonable though
> VM validation will incur startup costs and would need more clarity on
> how checks for duplicate __byname constructors would mesh with the
> "Factories" proposal, either disjoint or combined checks? Details to
> be worked out as the proposal progresses.

Asking the VM to check these seems a bit much, especially given that the names exist primarily for overload selection, which is a language concern. My thinking for by-name invocation is that we do overload selection statically, pick a positional signature, and invoke with an indy that treats that positional signature as a fast path, but falls back to an ugly reflective linkage on fail. This is the standard trick of "replicate the language semantics at runtime in indy" so the VM doesn't have to.

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From eaftan at google.com Fri Aug 14 20:00:38 2020 From: eaftan at google.com (Eddie Aftandilian) Date: Fri, 14 Aug 2020 13:00:38 -0700 Subject: A peek at the roadmap for pattern matching and more In-Reply-To: <58dfd44f-5c4b-1a20-378a-5712f51a145f@oracle.com> References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <58dfd44f-5c4b-1a20-378a-5712f51a145f@oracle.com> Message-ID: On Fri, Aug 14, 2020 at 10:43 AM Brian Goetz wrote: > > > Aside: My understanding of the MethodParameters attribute is that it's > > intended more as an informational than API and unlikely to affect > > application behaviour at runtime. > > The existing MP attribute represents a somewhat unfortunate compromise. > > The EE folks (remember EE?) had a compelling-enough use case for being > able to reflect over parameter names, which was that they wanted to > generate various remote-invocation stubs (corba, *RPC, etc) from > classes, but felt that having to annotate each parameter > (`foo(@MyNameIs("bob") String bob)`) would suck. So they wanted us to > reify the parameter names by default in the classfile, for reflection to > serve up. > > The ME folks (remember ME?) freaked out; "Whoa, more bytes in the > classfile, you have to make that optional, and not the default!" > > The EE guys came back with "OK, how about driving retention based on an > annotation." > > We said (for the zillion'th time): You know that's not what annotations > are for. > > So the compromise was to make it a tooling option; invoke `javac` with > `-parameters`. Result: no one can count on them. > > Now, the MP attribute itself should not need to change here; only the > conditions under which the language promises to make these available for > reflection and condition the compatibility rules. 
> > Making names significant and part of the API will affect refactoring > > and separate compilation: developers are used to being able to freely > > rename parameters without worrying about affects on other parts of > > their code base or about breaking their consumers (library authors). > > Are there things the language can do to help with name transitions? > > Is treating parameter name changes in a __byname method as a breaking > > API change something that developers will have to adapt to? > > Note that we already have this problem with records; the name of a > component in records flows into constructor descriptors, accessor names, > etc. It is an open question whether the developers can handle the > responsibility :) > > I have long wanted to have widely available, easy to use tooling that > could detect incompatible changes between versions of a module, where it > would look at the old bytecode and say "whoa, you did thing> there, did you mean to?" In the presence of such tooling, this > would be one more set of assertions for such a mechanism. But so far I > haven't gotten my wish. (Perhaps our IDE friends are cooking something > like this up?) I do think that detection of incompatible changes is > something for tooling to handle, not the language. But I agree that > this is a concern, and more so than for records, which have the > component names stapled to their forehead. > Inside Google we have enabled the `-parameters` attribute by default and have an Error Prone check that simulates named parameters ( https://errorprone.info/bugpattern/ParameterName). Initially we had it enabled as a compilation error. We believed that renames of parameters happened infrequently and rarely crossed compilation boundaries. We found that those renames were more frequent than expected, and there were a number of accidental breaking changes to core libraries like Guava that caused breakage at a distance. We ended up demoting the check to a warning. 
The general feeling was that this was mostly a problem for core libraries and not typical application code. One proposal was to leave it as an error by default, but to allow core libraries to opt out of publishing their parameter names. All that said, I don't think this is a problem for records since the names there are clearly part of the API. -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Aug 14 20:18:12 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 14 Aug 2020 16:18:12 -0400 Subject: A peek at the roadmap for pattern matching and more In-Reply-To: References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <58dfd44f-5c4b-1a20-378a-5712f51a145f@oracle.com> Message-ID: <97dfca1d-112e-668f-ffbe-dcc201d33eae@oracle.com> > > Inside Google we have enabled the `-parameters` attribute by default > and have an Error Prone check that simulates named parameters > (https://errorprone.info/bugpattern/ParameterName). > > Initially we had it enabled as a compilation error.? We believed that > renames of parameters happened infrequently and rarely crossed > compilation boundaries.? We found that those renames were more > frequent than expected, and there were a number of accidental breaking > changes to core libraries like Guava that caused breakage at a > distance.? We ended up demoting the check to a warning.? The general > feeling was that this was mostly a problem for core libraries and not > typical application code.? One proposal was to leave it as an error by > default, but to allow core libraries to opt out of publishing their > parameter names. > > All that said, I don't think this is a problem for records since the > names there are clearly part of the API. Thanks for enhancing the theoretical discussion with actual data! The part of this that really interests me is the "boundary-crossing" one.? 
Within a maintenance boundary (package, module, multi-module project, monorepo, depending on your tooling) you are free to rename at will, using suitable refactoring tools, since you can find all the clients and fix them.? It only becomes a real problem when references cross boundaries. I'm curious what's behind your intuition about why records would be immune (not that I disagree.)? Is it that they are right there in the header?? That they are so restricted that users can't lose sight of the fact that they are special? From jonathan.gibbons at oracle.com Fri Aug 14 21:06:18 2020 From: jonathan.gibbons at oracle.com (Jonathan Gibbons) Date: Fri, 14 Aug 2020 14:06:18 -0700 Subject: A peek at the roadmap for pattern matching and more In-Reply-To: <97dfca1d-112e-668f-ffbe-dcc201d33eae@oracle.com> References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <58dfd44f-5c4b-1a20-378a-5712f51a145f@oracle.com> <97dfca1d-112e-668f-ffbe-dcc201d33eae@oracle.com> Message-ID: <72073973-4f0c-d0c1-f2a2-608a942305c2@oracle.com> On 8/14/20 1:18 PM, Brian Goetz wrote: > >> >> Inside Google we have enabled the `-parameters` attribute by default >> and have an Error Prone check that simulates named parameters >> (https://errorprone.info/bugpattern/ParameterName). >> >> Initially we had it enabled as a compilation error.? We believed that >> renames of parameters happened infrequently and rarely crossed >> compilation boundaries.? We found that those renames were more >> frequent than expected, and there were a number of accidental >> breaking changes to core libraries like Guava that caused breakage at >> a distance.? We ended up demoting the check to a warning.? The >> general feeling was that this was mostly a problem for core libraries >> and not typical application code. One proposal was to leave it as an >> error by default, but to allow core libraries to opt out of >> publishing their parameter names. 
>> >> All that said, I don't think this is a problem for records since the >> names there are clearly part of the API. > > Thanks for enhancing the theoretical discussion with actual data! > > The part of this that really interests me is the "boundary-crossing" > one.? Within a maintenance boundary (package, module, multi-module > project, monorepo, depending on your tooling) you are free to rename > at will, using suitable refactoring tools, since you can find all the > clients and fix them.? It only becomes a real problem when references > cross boundaries. It wouldn't help the reliability issue at all, but an interesting enhancement to the javac option would be to enable MP for the exported API of a module, and not otherwise. > > I'm curious what's behind your intuition about why records would be > immune (not that I disagree.)? Is it that they are right there in the > header?? That they are so restricted that users can't lose sight of > the fact that they are special? > > From eaftan at google.com Fri Aug 14 23:54:56 2020 From: eaftan at google.com (Eddie Aftandilian) Date: Fri, 14 Aug 2020 16:54:56 -0700 Subject: A peek at the roadmap for pattern matching and more In-Reply-To: <97dfca1d-112e-668f-ffbe-dcc201d33eae@oracle.com> References: <9855cc08-92dc-2025-8543-e7cbfd9f3281@oracle.com> <58dfd44f-5c4b-1a20-378a-5712f51a145f@oracle.com> <97dfca1d-112e-668f-ffbe-dcc201d33eae@oracle.com> Message-ID: On Fri, Aug 14, 2020 at 1:18 PM Brian Goetz wrote: > > > > > Inside Google we have enabled the `-parameters` attribute by default > > and have an Error Prone check that simulates named parameters > > (https://errorprone.info/bugpattern/ParameterName). > > > > Initially we had it enabled as a compilation error. We believed that > > renames of parameters happened infrequently and rarely crossed > > compilation boundaries. 
We found that those renames were more > > frequent than expected, and there were a number of accidental breaking > > changes to core libraries like Guava that caused breakage at a > > distance. We ended up demoting the check to a warning. The general > > feeling was that this was mostly a problem for core libraries and not > > typical application code. One proposal was to leave it as an error by > > default, but to allow core libraries to opt out of publishing their > > parameter names. > > > > All that said, I don't think this is a problem for records since the > > names there are clearly part of the API. > > Thanks for enhancing the theoretical discussion with actual data! > Happy to help :) > The part of this that really interests me is the "boundary-crossing" > one. Within a maintenance boundary (package, module, multi-module > project, monorepo, depending on your tooling) you are free to rename at > will, using suitable refactoring tools, since you can find all the > clients and fix them. It only becomes a real problem when references > cross boundaries. > > I'm curious what's behind your intuition about why records would be > immune (not that I disagree.) Is it that they are right there in the > header? That they are so restricted that users can't lose sight of the > fact that they are special? > I think it's the same reasoning you stated in one of the messages earlier in this thread: "the name of a component in records flows into constructor descriptors, accessor names, etc." Changing the name of a record component seems to obviously be a breaking change in that it affects especially the accessor names. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From forax at univ-mlv.fr Mon Aug 17 14:07:23 2020 From: forax at univ-mlv.fr (Remi Forax) Date: Mon, 17 Aug 2020 16:07:23 +0200 (CEST) Subject: Sealed local interfaces Message-ID: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr> I've found a discrepancies in the current spec of sealed, the current spec allows local sealed interface but you have no way to provide a sub-types apart a sealed sub interfaces. A record or an interface is allowed to be declared inside a method, by example, this is a valid code static void foo() { interface I {} record Foo() implements I {} } the interface can also be sealed static void foo() { sealed interface I {} record Foo implements I {} } but this code does not compile because Foo is a local class and a local class can not implement a sealed interface. This rule was intended to not allow this kind of code sealed interface I {} static void foo() { record Foo implements I {} } because the interface I is visible while Foo is not. But we have forgotten to discuss about the case where both Foo and I are in the same scope. I see two possible fixes - disallow sealed local interfaces - allow local class/record to implement a sealed interface if they are in the same scope. R?mi From vicente.romero at oracle.com Tue Aug 18 23:15:00 2020 From: vicente.romero at oracle.com (Vicente Romero) Date: Tue, 18 Aug 2020 19:15:00 -0400 Subject: Sealed local interfaces In-Reply-To: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr> References: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr> Message-ID: Hi Remi, On 8/17/20 10:07 AM, Remi Forax wrote: > I've found a discrepancies in the current spec of sealed, > the current spec allows local sealed interface but you have no way to provide a sub-types apart a sealed sub interfaces. 
> > A record or an interface is allowed to be declared inside a method, > by example, this is a valid code > static void foo() { > interface I {} > > record Foo() implements I {} > } > > the interface can also be sealed > static void foo() { > sealed interface I {} > > record Foo implements I {} > } > > but this code does not compile because Foo is a local class and a local class can not implement a sealed interface. > > This rule was intended to not allow this kind of code > sealed interface I {} > > static void foo() { > record Foo implements I {} > } > because the interface I is visible while Foo is not. > > But we have forgotten to discuss about the case where both Foo and I are in the same scope. > > I see two possible fixes > - disallow sealed local interfaces +1 having a local sealed class seems like unnecessary, this option makes more sense to me > - allow local class/record to implement a sealed interface if they are in the same scope. > > R?mi Vicente From brian.goetz at oracle.com Tue Aug 18 23:19:54 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 18 Aug 2020 19:19:54 -0400 Subject: Sealed local interfaces In-Reply-To: References: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr> Message-ID: In theory, there is benefit to having local sealed interfaces, in that they would provide a source of exhaustiveness information for switches inside the method.? On the other hand, we don't really want to encourage methods to be enormous.? So at least for now, this fix seems fine. On 8/18/2020 7:15 PM, Vicente Romero wrote: > Hi Remi, > > On 8/17/20 10:07 AM, Remi Forax wrote: >> I've found a discrepancies in the current spec of sealed, >> the current spec allows local sealed interface but you have no way to >> provide a sub-types apart a sealed sub interfaces. >> >> A record or an interface is allowed to be declared inside a method, >> by example, this is a valid code >> ?? static void foo() { >> ???? interface I {} >> >> ???? 
record Foo() implements I {} >> ?? } >> >> the interface can also be sealed >> ?? static void foo() { >> ???? sealed interface I {} >> >> ???? record Foo implements I {} >> ?? } >> >> but this code does not compile because Foo is a local class and a >> local class can not implement a sealed interface. >> >> This rule was intended to not allow this kind of code >> ?? sealed interface I {} >> ?? ?? static void foo() { >> ???? record Foo implements I {} >> ?? } >> because the interface I is visible while Foo is not. >> >> But we have forgotten to discuss about the case where both Foo and I >> are in the same scope. >> >> I see two possible fixes >> - disallow sealed local interfaces > +1 having a local sealed class seems like unnecessary, this option > makes more sense to me >> - allow local class/record to implement a sealed interface if they >> are in the same scope. >> >> R?mi > Vicente -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Wed Aug 19 08:43:35 2020 From: forax at univ-mlv.fr (Remi Forax) Date: Wed, 19 Aug 2020 10:43:35 +0200 (CEST) Subject: Sealed local interfaces In-Reply-To: References: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr> Message-ID: <348474541.466059.1597826615633.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Vicente Romero" , "amber-spec-experts" > > Envoy?: Mercredi 19 Ao?t 2020 01:19:54 > Objet: Re: Sealed local interfaces > In theory, there is benefit to having local sealed interfaces, in that they > would provide a source of exhaustiveness information for switches inside the > method. On the other hand, we don't really want to encourage methods to be > enormous. So at least for now, this fix seems fine. 
I agree, in my defense, i've written that code in a unit test method :) R?mi > On 8/18/2020 7:15 PM, Vicente Romero wrote: >> Hi Remi, >> On 8/17/20 10:07 AM, Remi Forax wrote: >>> I've found a discrepancies in the current spec of sealed, >>> the current spec allows local sealed interface but you have no way to provide a >>> sub-types apart a sealed sub interfaces. >>> A record or an interface is allowed to be declared inside a method, >>> by example, this is a valid code >>> static void foo() { >>> interface I {} >>> record Foo() implements I {} >>> } >>> the interface can also be sealed >>> static void foo() { >>> sealed interface I {} >>> record Foo implements I {} >>> } >>> but this code does not compile because Foo is a local class and a local class >>> can not implement a sealed interface. >>> This rule was intended to not allow this kind of code >>> sealed interface I {} >>> static void foo() { >>> record Foo implements I {} >>> } >>> because the interface I is visible while Foo is not. >>> But we have forgotten to discuss about the case where both Foo and I are in the >>> same scope. >>> I see two possible fixes >>> - disallow sealed local interfaces >> +1 having a local sealed class seems like unnecessary, this option makes more >> sense to me >>> - allow local class/record to implement a sealed interface if they are in the >>> same scope. >>> R?mi >> Vicente -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Thu Aug 20 15:56:20 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 20 Aug 2020 11:56:20 -0400 Subject: [pattern-switch] Exhaustiveness In-Reply-To: References: Message-ID: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> Tagir's question about exhaustiveness in switches points to some technical debt left over from expression switches. 
(Note: this entire discussion has nothing to do with whether `case Object o` is nullable; it has strictly to do with extending the existing treatment of exhaustive switches over enums to sealed classes, when we can conclude a switch over such a type is total without a default / total case, and what implicit cases we have to insert to make up for that. Please let's not conflate this thread with that issue.)

When we did expression switch, for an exhaustive switch that covered all the enums without a default, we inserted an extra catch-all case that throws ICCE, on the theory that nulls are already checked by the switch and so anything that hits the synthetic default must be a novel enum value, which merits an ICCE. This worked for enum switches (where all case labels are discrete values), but doesn't quite scale to sealed types. Let's fix that.

As a recap, suppose we have

    enum E { A, B; }

and suppose that, via separate compilation, a novel value C is introduced that was unknown at the time the switch was compiled.

An "exhaustive" statement switch on E:

    switch (e) {
        case A:
        case B:
    }

throws NPE on null but does nothing on C, because switch statements make no attempt at being exhaustive.

An _expression_ switch that is deemed exhaustive without a default case:

    var s = switch (e) {
        case A -> ...
        case B -> ...
    }

throws NPE on null and ICCE on C.

At the time, we were concerned about the gap between statement and expression switches, and talked about having a way to make statement switches exhaustive. That's still on the table, and we should still address this, but that's not the subject of this mail.

What I want to focus on in this mail is the interplay between exhaustiveness analysis and (exhaustive) switch semantics, and what code we have to inject to make up for gaps. We've identified two sources of gaps: nulls, and novel enum values.
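The null half of the enum recap above can be tried directly. The following is a minimal sketch, assuming a JDK with switch expressions (JDK 14 or later); the class name `EnumSwitchDemo` and the method `describe` are hypothetical. It shows an expression switch over `E` that compiles without a default because it covers every constant, and that rejects a null selector with an NPE. The ICCE path for a novel constant `C` needs separate compilation to reproduce and is not shown here.

```java
// Hypothetical illustration of an exhaustive enum switch expression.
class EnumSwitchDemo {
    enum E { A, B }

    // No default is needed: the cases cover every constant of E
    // known at compile time, so the switch is deemed exhaustive.
    static String describe(E e) {
        return switch (e) {
            case A -> "saw A";
            case B -> "saw B";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(E.A));
        try {
            // A null selector is rejected by the switch itself.
            describe(null);
        } catch (NullPointerException ex) {
            System.out.println("null selector rejected with NPE");
        }
    }
}
```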
When we get to sealed types, we can add novel subtypes to the list of things we have to detect and implicitly reject; when we get to deconstruction patterns, we need to address these at nested levels too.

Let's analyze switches on Container assuming:

    Container = Box | Bag
    Shape = Rect | Circle

and assume a novel shape Pentagon shows up unexpectedly via separate compilation.

If we have a switch _statement_ with:

    case Box(Rect r)
    case Box(Circle c)
    case Bag(Rect r)
    case Bag(Circle c)

then the only value we implicitly handle is null; everything else just falls out of the switch, because statement switches don't try to be exhaustive.

If this is an expression switch, then I think it's safe to say:

 - The switch should be deemed exhaustive; no Box(null) etc cases needed.
 - We get an NPE on null.

But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon). We have to do something (the switch has to be total) with these, and again, asking users to manually handle these is unreasonable. A reasonable strawman here is:

    ICCE on Box(Pentagon) and Bag(Pentagon)
    NPE on Box(null) and Bag(null)

Essentially, what this means is: we need to explicitly consider null and novel values/types of enum/sealed classes in our exhaustiveness analysis, and, if these are not seen to be explicitly covered and the implicit coverage plays into the conclusion of overall weak totality, then we need to insert implicit catch-alls for these cases.

If we switch over:

    case Box(Rect r)
    case Box(Circle c)
    case Box b
    case Bag(Rect r)
    case Bag(Circle c)

then Box(Pentagon) and Box(null) are handled by the `Box b` case and don't need to be handled by a catch-all.

If we have:

    case Box(Rect r)
    case Box(Circle c)
    case Bag(Rect r)
    case Bag(Circle c)
    default

then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the default case, so no special handling is needed there.

Are we in agreement on what _should_ happen in these cases?
If so, I can put a more formal basis on it.

On 8/14/2020 1:20 PM, Brian Goetz wrote:
>>
>> - Exhaustiveness and null. (Tagir) For sealed domains (enums and
>> sealed types), we kind of cheated with expression switches because we
>> could count on the switch filtering out the null. But Tagir raises
>> an excellent point, which is that we do not yet have a sound
>> definition of exhaustiveness that scales to nested patterns (do
>> Box(Rect) and Box(Circle) cover Box(Shape)?) This is an interaction
>> between sealed types and patterns that needs to be ironed out.
>> (Thanks Tagir!)
>
> [ Breaking this out from Tagir's more comprehensive reply ]
>
>> It's unclear for me how exhaustiveness on nested patterns plays with
>> null. case Box(Circle c) and case Box(Rect r) don't cover case
>> Box(null) which is a valid possibility for Box type.
>
> It's not even clear how exhaustiveness plays with null even without
> nesting, so let's start there.
>
> Consider this switch:
>
>     switch (trafficLight) {
>         case GREEN, YELLOW: driveReallyFast();
>         case RED: sigh();
>     }
>
> Is it exhaustive? Well, we want to say yes. And with the existing
> null-hostility of switch, it is. But even without that, we'd like to
> say yes, because a null enum value is almost always an error, and
> making users deal with cases that don't happen in reality is kind of
> rude.
>
> For a domain sealed to a set of alternatives (enums or sealed
> classes), let's say that a set of patterns is _weakly exhaustive_ if
> it covers all the alternatives but not null, and _strongly exhaustive_
> if it also covers null. When we did switch expressions, we said that
> weakly exhaustive coverings didn't need a default in a switch
> expression. I think we're primed to say the same thing for sealed
> classes. But, this "weak is good enough" leans on the fact that the
> existing hostility of switch will cover what we miss. We get no such
> cover in nested cases.
>
> I think it's worth examining further why we are willing to accept the
> weak coverage with enums. Is it really that we're willing to assume
> that enums just should never be null? If we had type cardinalities in
> the language, would we treat `enum X` as declaring a cardinality-1
> type called X? I think we might. OK, what about sealed classes?
> Would the same thing carry over? Not so sure there. And this is a
> problem, because we ultimately want:
>
>     case Optional.of(var x):
>     case Optional.empty():
>
> to be exhaustive on Optional, and said exhaustiveness will likely
> lean on some sort of sealing.
>
> This is related to Guy's observation that totality is a "subtree all
> the way down" property. Consider:
>
>     sealed class Container permits Box, Bag { }
>     sealed class Shape permits Rect, Circle { }
>
> Ignoring null, Box+Bag should be exhaustive on container, and
> Rect+Circle should be exhaustive on shape. So if we are switching
> over a Container, then what of:
>
>     case Box(Rect r):
>     case Box(Circle c):
>     case Bag(Rect r):
>     case Bag(Circle c):
>
> We have some "nullity holes" in three places: Box(null), Bag(null),
> and null itself. Is this set of cases exhaustive on Bags, Boxes, or
> Containers?
>
> I think users would like to be able to write the above four cases and
> treat it as exhaustive; having to explicitly provide Box(null) / Box
> b, Bag(null) / Bag b, or a catch-all to accept
> null+Box(null)+Bag(null) would all be deemed unpleasant ceremony.
>
> Hmm...
>

From amaembo at gmail.com Thu Aug 20 16:58:01 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 20 Aug 2020 23:58:01 +0700
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
Message-ID:

Hello!

> Are we in agreement on what _should_ happen in these cases?
This sounds like a good idea. I see the following alternatives (not
saying that they are better, just putting them on the table).

1. Do not perform exhaustiveness analysis for nested patterns, except
if all nested components are total. So, in your example, either an
explicit default, or case Box and case Bag will be necessary. This is
somewhat limiting but I'm not sure that the exhaustiveness of complex
nested patterns would be very popular. If one really needs this, they
could use a nested switch expression:

  return switch(container) {
    case Box(var s) -> switch(s) {case Rect r -> ...; case Circle c -> ...; /*optional case null is possible*/};
    case Bag(var s) -> switch(s) {case Rect r -> ...; case Circle c -> ...; /*optional case null is possible*/};
  }

Here Box(var s) has a total nested component, so it matches everything
that Box matches, the same with Bag, thus we perform exhaustiveness
analysis using the Container declaration.
This approach does not close the door for us. We can rethink later and
add exhaustiveness analysis for nested patterns when we finally
determine how it should look. Note that this still allows making
Optional.of(var x) + Optional.empty exhaustive if we provide an
appropriate mechanism to declare this kind of pattern.

2. Allow deconstructors and records to specify whether null is a
possible value for a given component. For example, make deconstructors
and records null-hostile by default and provide a syntax (T? or T|null
or whatever) to allow nulls. In this case, if the deconstructor is
null-friendly, then the exhaustive pattern must handle Box(null).
Otherwise, Box(null) is a compilation error. Yes, I know, this may
quickly evolve to the point when we will need to add a full-fledged
nullability to the type system. But probably it's possible to allow
nullability specification for new language features only? Ok, ok,
sorry, my imagination drove me too far away from reality. Forget it.

With best regards,
Tagir Valeev.
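Alternative 1 can be tried out in running form. The following is only a
sketch under stated assumptions: it uses the record-pattern and
pattern-switch syntax that was finalized later, in Java 21, it models
Container and Shape as sealed interfaces over records (the thread talks
about sealed classes), and the names NestedSwitchDemo and describe are
made up for illustration.

```java
// Sketch only: NestedSwitchDemo and describe are hypothetical names, and the
// syntax is that of Java 21, which postdates this thread.
final class NestedSwitchDemo {
    sealed interface Shape permits Rect, Circle {}
    record Rect() implements Shape {}
    record Circle() implements Shape {}

    sealed interface Container permits Box, Bag {}
    record Box(Shape shape) implements Container {}
    record Bag(Shape shape) implements Container {}

    static String describe(Container container) {
        // Box(var s) and Bag(var s) each have a total nested pattern, so the
        // outer switch is exhaustive through the sealing of Container alone.
        return switch (container) {
            case Box(var s) -> switch (s) {
                case null -> "box of null";      // the optional null case
                case Rect r -> "box of rect";
                case Circle c -> "box of circle";
            };
            case Bag(var s) -> switch (s) {      // no null case: Bag(null) throws NPE
                case Rect r -> "bag of rect";
                case Circle c -> "bag of circle";
            };
        };
    }
}
```

Each inner switch is checked for exhaustiveness against Shape on its
own, so no analysis across nesting levels is needed; the price is that
the null decision has to be made once per inner switch.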
From brian.goetz at oracle.com Thu Aug 20 19:09:00 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 20 Aug 2020 15:09:00 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
Message-ID: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>

Here's an attempt at a formalism for capturing this.

There are several categories of patterns we might call total on a type
T.  We could refine the taxonomy as:

 - Strongly total -- matches all values of T.
 - Weakly total -- matches all values of T except perhaps null.

What we want to do is characterize the aggregate totality on T of a
_set_ of patterns P*.  A set of patterns could in the aggregate be
either of the above, or also:

 - Optimistically total -- matches all values of subtypes of T _known
   at compile time_, except perhaps null.

Note that we have an ordering:

    partial < optimistically total < weakly total < strongly total

Now, some rules about defining the totality of a set of patterns.

T-Total: The singleton set containing the type pattern `T t` is strongly
total on U <: T.  (This is the rule we've been discussing, but that's
not the point of this mail -- we just need a base case right now.)

T-Subset: If a set of patterns P* contains a subset of patterns that is
X-total on T, then P* is X-total on T.

T-Sealed: If T is sealed to U* (direct subtypes only), and for each U in
U*, there is some subset of P* that is optimistically total on U, then
P* is optimistically total on T.

T-Nested: Given a deconstructor D(U) and a collection of patterns {
P1..Pn }, if { P1..Pn } is X-total on U, then { D(P1)..D(Pn) } is
min(X,weak)-total on D.

OK, examples.  Let's say

    Container = Box | Bag
    Shape = Round | Rect
    Round = Circle | Ellipse

{ Container c }: total on Container by T-Total.
{ Box b, Bag b }: optimistically total on Container
 - Container sealed to Box and Bag
 - `Box b` total on Box, `Bag b` total on Bag
 - { Box b, Bag b } optimistically total on Container by T-Sealed

{ Box(Round r), Box(Rect r) }: optimistically total on Box
 - Shape sealed to Round and Rect
 - { Round r, Rect r } optimistically total on Shape by T-Sealed
 - { Box(Round r), Box(Rect r) } optimistically total on Box by T-Nested

{ Box(Object o) } weakly total on Box
 - Object o total on Object
 - { Object o } total on Object by T-Subset
 - { Box(Object o) } weakly total on Box by T-Nested

{ Box(Rect r), Box(Circle c), Box(Ellipse e) } optimistically total on
Box
 - Shape sealed to Round and Rect
 - { Rect r } total on Rect
 - { Circle c, Ellipse e } optimistically total on Round
 - { Rect r, Circle c, Ellipse e } is optimistically total on Shape,
   because for each of { Rect, Round }, there is a subset that is
   optimistically total on that type
 - { Box(Rect r), Box(Circle c), Box(Ellipse e) } optimistically total
   on Box by T-Nested

We can enhance this model to construct the residue (a characterization
of what falls through the cracks, and therefore has to be handled by a
catch-all in a putatively total switch).  A grammar for the residue
would be:

    R := null | novel | D(R)

so the residue might include Box(null), Box(novel), Bag(Box(novel)),
etc.  We would need to extend this to support deconstructors with
multiple bindings too.

OK, coming back to reality:

 - The patterns of a switch expression must be at least optimistically
   total.
 - The translation of a switch expression must include a synthetic case
   that catches all elements of the residue of its patterns, and throws
   the appropriate exceptions:
   - NPE for a null
   - ICCE for a novel value
   - One of the above, or maybe something else, for D(novel), D(null),
     D(E(novel, null)), etc

We still have not addressed how we might nominate a _statement_ switch
as being some form of total; that's a separate story.
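The ordering and the T-Nested and T-Sealed rules can be made concrete
with a small model. This is a minimal sketch, not part of any proposal:
the enum Totality and the methods tNested and tSealed are hypothetical
names, and it assumes that "optimistically total on U" in T-Sealed means
"at least optimistically total", which is how the examples above use it.

```java
// Sketch only: Totality, tNested and tSealed are hypothetical names chosen
// to mirror the rule names in the message above.
enum Totality {
    // Declared in ascending order: partial < optimistic < weak < strong.
    PARTIAL, OPTIMISTIC, WEAK, STRONG;

    // The weaker of two levels under the ordering above.
    static Totality min(Totality a, Totality b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    // T-Nested: if { P1..Pn } is X-total on U, then { D(P1)..D(Pn) } is
    // min(X, weak)-total on D, because D itself never matches null.
    static Totality tNested(Totality inner) {
        return min(inner, WEAK);
    }

    // T-Sealed: if every permitted subtype is covered at least
    // optimistically, the sealed parent is optimistically total.
    // PARTIAL here means only that this rule establishes nothing.
    static Totality tSealed(Totality... perPermittedSubtype) {
        if (perPermittedSubtype.length == 0) {
            return PARTIAL;
        }
        for (Totality t : perPermittedSubtype) {
            if (t.compareTo(OPTIMISTIC) < 0) {
                return PARTIAL;
            }
        }
        return OPTIMISTIC;
    }
}
```

Replaying the examples: { Box(Object o) } is tNested(STRONG), which is
WEAK, and { Box(Rect r), Box(Circle c), Box(Ellipse e) } is
tNested(tSealed(STRONG, tSealed(STRONG, STRONG))), which is OPTIMISTIC.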
Also a separate story: under what conditions in the new world do
switches throw NPE, but this seems like progress.

Given the weird shape of the residue, it's not clear there's a clean way
to extrapolate the NPE|ICCE rule, since we might have Foo(null, novel),
and would arbitrarily have to pick which exception to throw, and neither
would really be all that great.  Perhaps there's a new exception type
lurking here.

From forax at univ-mlv.fr Thu Aug 20 20:29:06 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 20 Aug 2020 22:29:06 +0200 (CEST)
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
Message-ID: <1375261316.206645.1597955346798.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Thursday, August 20, 2020 17:56:20
> Subject: Re: [pattern-switch] Exhaustiveness

> Tagir's question about exhaustiveness in switches points to some technical debt
> left over from expression switches.

> (Note: this entire discussion has nothing to do with whether `case Object o` is
> nullable; it has strictly to do with extending the existing treatment of
> exhaustive switches over enums to sealed classes, when we can conclude a switch
> over such a type is total without a default / total case, and what implicit
> cases we have to insert to make up for that. Please let's not conflate this
> thread with that issue.)
> When we did expression switch, for an exhaustive switch that covered all the > enums without a default, we inserted an extra catch-all case that throws ICCE, > on the theory that nulls are already checked by the switch and so anything that > hits the synthetic default must be a novel enum value, which merits an ICCE. > This worked for enum switches (where all case labels are discrete values), but > doesn't quite scale to sealed types. Let's fix that. > As a recap, suppose we have > enum E { A, B; } > and suppose that, via separate compilation, a novel value C is introduced that > was unknown at the time the switch was compiled. > An "exhaustive" statement switch on E: > switch (e) { > case A: > case B: > } > throws NPE on null but does nothing on C, because switch statements make no > attempt at being exhaustive. > An _expression_ switch that is deemed exhaustive without a default case: > var s = switch (e) { > case A -> ... > case B -> ... > } > throws NPE on null and ICCE on C. > At the time, we were concerned about the gap between statement and expression > switches, and talked about having a way to make statement switches exhaustive. > That's still on the table, and we should still address this, but that's not the > subject of this mail. > What I want to focus on in this mail is the interplay between exhaustiveness > analysis and (exhaustive) switch semantics, and what code we have to inject to > make up for gaps. We've identified two sources of gaps: nulls, and novel enum > values. When we get to sealed types, we can add novel subtypes to the list of > things we have to detect and implicitly reject; when we get to deconstruction > patterns, we need to address these at nested levels too. > Let's analyze switches on Container assuming: > Container = Box | Bag > Shape = Rect | Circle > and assume a novel shape Pentagon shows up unexpectedly via separate > compilation. 
> If we have a switch _statement_ with:

>     case Box(Rect r)
>     case Box(Circle c)
>     case Bag(Rect r)
>     case Bag(Circle c)

> then the only value we implicitly handle is null; everything else just falls out
> of the switch, because they don't try to be exhaustive.

> If this is an expression switch, then I think it's safe to say:

> - The switch should be deemed exhaustive; no Box(null) etc cases needed.
> - We get an NPE on null.

> But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon). We
> have to do something (the switch has to be total) with these, and again, asking
> users to manually handle these is unreasonable. A reasonable strawman here is:

>     ICCE on Box(Pentagon) and Bag(Pentagon)
>     NPE on Box(null) and Bag(null)

> Essentially, what this means is: we need to explicitly consider null and novel
> values/types of enum/sealed classes in our exhaustiveness analysis, and, if
> these are not seen to be explicitly covered and the implicit coverage plays
> into the conclusion of overall weak totality, then we need to insert implicit
> catch-alls for these cases.

> If we switch over:

>     case Box(Rect r)
>     case Box(Circle c)
>     case Box b
>     case Bag(Rect r)
>     case Bag(Circle c)

> then Box(Pentagon) and Box(null) are handled by the `Box b` case and don't need
> to be handled by a catch-all.

> If we have:

>     case Box(Rect r)
>     case Box(Circle c)
>     case Bag(Rect r)
>     case Bag(Circle c)
>     default

> then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the default
> case, so no special handling is needed there.

> Are we in agreement on what _should_ happen in these cases? If so, I can put a
> more formal basis on it.

Yes, but that doesn't mean that, in terms of translation strategy, we
need all of those as synthetic cases. Installing some null checks up
front (before a pattern deconstruction) plus a default should be
enough, or am I missing something?

Rémi
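The strawman (NPE for null at any level, ICCE for a novel type) together
with the idea of up-front null checks plus a default can be written out
by hand. This is a minimal sketch of one possible shape, not what a
compiler emits: ResidueDemo and describe are hypothetical names, the
interfaces are deliberately left unsealed, and Pentagon is declared
openly to stand in for a subtype that would really arrive through
separate compilation.

```java
// Sketch only: a hand-written stand-in for the synthetic residue handling.
// ResidueDemo and describe are hypothetical names.
final class ResidueDemo {
    interface Shape {}
    record Rect() implements Shape {}
    record Circle() implements Shape {}
    record Pentagon() implements Shape {}   // simulates a novel subtype

    interface Container {}
    record Box(Shape shape) implements Container {}
    record Bag(Shape shape) implements Container {}

    static String describe(Container c) {
        // Top-level null check, as switch already does today.
        if (c == null) {
            throw new NullPointerException("container");
        }
        String kind;
        Shape s;
        if (c instanceof Box b) {
            kind = "box";
            s = b.shape();
        } else if (c instanceof Bag g) {
            kind = "bag";
            s = g.shape();
        } else {
            throw new IncompatibleClassChangeError("novel Container");
        }
        // Null check ahead of the nested type tests, so that for one
        // component an NPE is raised before any ICCE could be.
        if (s == null) {
            throw new NullPointerException(kind + " component");
        }
        if (s instanceof Rect) {
            return kind + " of rect";
        }
        if (s instanceof Circle) {
            return kind + " of circle";
        }
        // The default: anything left over is a novel Shape.
        throw new IncompatibleClassChangeError("novel Shape in " + kind);
    }
}
```

With a single binding per deconstructor the order of checks is
unambiguous; with several bindings, as in Foo(null, novel), checking
components left to right would be one way to pick which exception to
throw, though that is an assumption here and not something settled in
this thread.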
URL: From forax at univ-mlv.fr Thu Aug 20 20:34:27 2020 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 20 Aug 2020 22:34:27 +0200 (CEST) Subject: [pattern-switch] Exhaustiveness In-Reply-To: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> Message-ID: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Tagir Valeev" > Cc: "amber-spec-experts" > Envoy?: Jeudi 20 Ao?t 2020 21:09:00 > Objet: Re: [pattern-switch] Exhaustiveness > Here's an attempt at a formalism for capturing this. > There are several categories of patterns we might call total on a type T. We > could refine the taxonomy as: > - Strongly total -- matches all values of T. > - Weakly total -- matches all values of T except perhaps null. > What we want to do is characterize the aggregate totality on T of a _set_ of > patterns P*. A set of patterns could in the aggregate be either of the above, > or also: > - Optimistically total -- matches all values of subtypes of T _known at compile > time_, except perhaps null. > Note that we have an ordering: > partial < optimistically total < weakly total < strongly total > Now, some rules about defining the totality of a set of patterns. > T-Total: The singleton set containing the type pattern `T t` is strongly total > on U <: T. (This is the rule we've been discussing, but that's not the point of > this mail -- we just need a base case right now.) > T-Subset: If a set of patterns P* contains a subset of patterns that is X-total > on T, then P* is X-total on T. > T-Sealed: If T is sealed to U* (direct subtypes only), and for each U in U*, > there is some subset of P* that is optimistically total on U, then P* is > optimistically total on T. > T-Nested: Given a deconstructor D(U) and a collection of patterns { P1..Pn }, if > { P1..Pn } is X-total on U, then { D(P1)..D(Pn) } is min(X,weak)-total on D. > OK, examples. 
> Let's say
>     Container = Box | Bag
>     Shape = Round | Rect
>     Round = Circle | Ellipse
> { Container c }: total on Container by T-Total.
> { Box b, Bag b }: optimistically total on Container
>  - Container sealed to Box and Bag
>  - `Box b` total on Box, `Bag b` total on Bag
>  - { Box b, Bag b } optimistically total on Container by T-Sealed
> { Box(Round r), Box(Rect r) }: optimistically total on Box
>  - Shape sealed to Round and Rect
>  - { Round r, Rect r } optimistically total on Shape by T-Sealed
>  - { Box(Round r), Box(Rect r) } optimistically total on Box by T-Nested
> { Box(Object o) }: weakly total on Box
>  - Object o total on Object
>  - { Object o } total on Object by T-Subset
>  - { Box(Object o) } weakly total on Box by T-Nested
> { Box(Rect r), Box(Circle c), Box(Ellipse e) }: optimistically total on
> Box
>  - Shape sealed to Round and Rect
>  - { Rect r } total on Rect
>  - { Circle c, Ellipse e } optimistically total on Round
>  - { Rect r, Circle c, Ellipse e } is optimistically total on Shape, because for
>    each of { Rect, Round }, there is a subset that is optimistically total on that
>    type
>  - { Box(Rect r), Box(Circle c), Box(Ellipse e) } optimistically total on Box by
>    T-Nested
> We can enhance this model to construct the residue (a characterization of what
> falls through the cracks, and therefore has to be handled by a catch-all in a
> putatively total switch). A grammar for the residue would be:
>     R := null | novel | D(R)
> so it might include Box(null), Box(novel), Bag(Box(novel)), etc. We would need to
> extend this to support deconstructors with multiple bindings too.
> OK, coming back to reality:
>  - The patterns of a switch expression must be at least optimistically total.
>  - The translation of a switch expression must include a synthetic case that
>    catches all elements of the residue of its patterns, and throws the appropriate
>    exceptions:
>     - NPE for a null
>     - ICCE for a novel value
>     - One of the above, or maybe something else, for D(novel), D(null), D(E(novel,
>       null)), etc
> We still have not addressed how we might nominate a _statement_ switch as being
> some form of total; that's a separate story.
> Also a separate story: under what conditions in the new world do switches throw
> NPE, but this seems like progress.
> Given the weird shape of the residue, it's not clear there's a clean way to
> extrapolate the NPE|ICCE rule, since we might have Foo(null, novel), and would
> arbitrarily have to pick which exception to throw, and neither would really be
> all that great. Perhaps there's a new exception type lurking here.

I disagree here: if we move the null checks up front, before the type checks,
an NPE will always be thrown before an ICCE on the same pattern, which is the
usual rule for any JDK method (if you see pattern matching as equivalent to a
cascade of instanceof + method call).

Rémi

> On 8/20/2020 12:58 PM, Tagir Valeev wrote:
>> Hello!
>>> Are we in agreement on what _should_ happen in these cases?
>> This sounds like a good idea. I see the following alternatives (not
>> saying that they are better, just putting them on the table).
>> 1. Do not perform exhaustiveness analysis for nested patterns, except
>> if all nested components are total. So, in your example, either
>> explicit default, or case Box and case Bag will be necessary. This is
>> somewhat limiting but I'm not sure that the exhaustiveness of complex
>> nested patterns would be very popular.
If one really needs this, they >> could use nested switch expression: >> return switch(container) { >> case Box(var s) -> switch(s) {case Rect r -> ...; case Circle c -> >> ...; /*optional case null is possible*/}; >> case Bag(var s) -> switch(s) {case Rect r -> ...; case Circle c -> >> ...; /*optional case null is possible*/}; >> } >> Here Box(var s) has total nested component, so it matches everything >> that Box matches, the same with Bag, thus we perform exhaustiveness >> analysis using Container declaration. >> This approach does not close the door for us. We can rethink later and >> add exhaustiveness analysis for nested patterns when we will finally >> determine how it should look like. Note that this still allows making >> Optional.of(var x) + Optional.empty exhaustive if we provide an >> appropriate mechanism to declare this kind of patterns. >> 2. Allow deconstructors and records to specify whether null is a >> possible value for a given component. Like make deconstructors and >> records null-hostile by default and provide a syntax (T? or T|null or >> whatever) to allow nulls. In this case, if the deconstructor is >> null-friendly, then the exhaustive pattern must handle Box(null). >> Otherwise, Box(null) is a compilation error. Yes, I know, this may >> quickly evolve to the point when we will need to add a full-fledged >> nullability to the type system. But probably it's possible to allow >> nullability specification for new language features only? Ok, ok, >> sorry, my imagination drove me too far away from reality. Forget it. >> With best regards, >> Tagir Valeev. >> On Thu, Aug 20, 2020 at 10:57 PM Brian Goetz [ mailto:brian.goetz at oracle.com | >> ] wrote: >>> Tagir's question about exhaustiveness in switches points to some technical debt >>> left over from expression switches. 
>>> (Note: this entire discussion has nothing to do with whether `case Object o` is >>> nullable; it has strictly to do with extending the existing treatment of >>> exhaustive switches over enums to sealed classes, when we can conclude a switch >>> over such a type is total without a default / total case, and what implicit >>> cases we have to insert to make up for that. Please let's not conflate this >>> thread with that issue.) >>> When we did expression switch, for an exhaustive switch that covered all the >>> enums without a default, we inserted an extra catch-all case that throws ICCE, >>> on the theory that nulls are already checked by the switch and so anything that >>> hits the synthetic default must be a novel enum value, which merits an ICCE. >>> This worked for enum switches (where all case labels are discrete values), but >>> doesn't quite scale to sealed types. Let's fix that. >>> As a recap, suppose we have >>> enum E { A, B; } >>> and suppose that, via separate compilation, a novel value C is introduced that >>> was unknown at the time the switch was compiled. >>> An "exhaustive" statement switch on E: >>> switch (e) { >>> case A: >>> case B: >>> } >>> throws NPE on null but does nothing on C, because switch statements make no >>> attempt at being exhaustive. >>> An _expression_ switch that is deemed exhaustive without a default case: >>> var s = switch (e) { >>> case A -> ... >>> case B -> ... >>> } >>> throws NPE on null and ICCE on C. >>> At the time, we were concerned about the gap between statement and expression >>> switches, and talked about having a way to make statement switches exhaustive. >>> That's still on the table, and we should still address this, but that's not >>> the subject of this mail. >>> What I want to focus on in this mail is the interplay between exhaustiveness >>> analysis and (exhaustive) switch semantics, and what code we have to inject to >>> make up for gaps. 
We've identified two sources of gaps: nulls, and novel enum >>> values. When we get to sealed types, we can add novel subtypes to the list of >>> things we have to detect and implicitly reject; when we get to deconstruction >>> patterns, we need to address these at nested levels too. >>> Let's analyze switches on Container assuming: >>> Container = Box | Bag >>> Shape = Rect | Circle >>> and assume a novel shape Pentagon shows up unexpectedly via separate >>> compilation. >>> If we have a switch _statement_ with: >>> case Box(Rect r) >>> case Box(Circle c) >>> case Bag(Rect r) >>> case Bag(Circle c) >>> then the only value we implicitly handle is null; everything else just falls out >>> of the switch, because they don't try to be exhaustive. >>> If this is an expression switch, then I think its safe to say: >>> - The switch should deemed exhaustive; no Box(null) etc cases needed. >>> - We get an NPE on null. >>> But that leaves Box(null), Bag(null), Box(Pentagon), and Bag(Pentagon). We have >>> to do something (the switch has to be total) with these, and again, asking >>> users to manually handle these is unreasonable. A reasonable strawman here is: >>> ICCE on Box(Pentagon) and Bag(Pentagon) >>> NPE on Box(null) and Bag(null) >>> Essentially, what this means is: we need to explicitly consider null and novel >>> values/types of enum/sealed classes in our exhaustiveness analysis, and, if >>> these are not seen to be explicitly covered and the implicit coverage plays >>> into the conclusion of overall weak totality, then we need to insert implicit >>> catch-alls for these cases. >>> If we switch over: >>> case Box(Rect r) >>> case Box(Circle c) >>> case Box b >>> case Bag(Rect r) >>> case Bag(Circle c) >>> then Box(Pentagon) and Box(null) are handled by the `Box b` case and don't need >>> to be handled by a catch-all. 
>>> If we have:
>>>     case Box(Rect r)
>>>     case Box(Circle c)
>>>     case Bag(Rect r)
>>>     case Bag(Circle c)
>>>     default
>>> then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the default
>>> case, so no special handling is needed there.
>>> Are we in agreement on what _should_ happen in these cases? If so, I can put a
>>> more formal basis on it.
>>> On 8/14/2020 1:20 PM, Brian Goetz wrote:
>>> - Exhaustiveness and null. (Tagir) For sealed domains (enums and sealed types),
>>> we kind of cheated with expression switches because we could count on the
>>> switch filtering out the null. But Tagir raises an excellent point, which is
>>> that we do not yet have a sound definition of exhaustiveness that scales to
>>> nested patterns (do Box(Rect) and Box(Circle) cover Box(Shape)?) This is an
>>> interaction between sealed types and patterns that needs to be ironed out.
>>> (Thanks Tagir!)
>>> [ Breaking this out from Tagir's more comprehensive reply ]
>>> It's unclear for me how exhaustiveness on nested patterns plays with
>>> null. case Box(Circle c) and case Box(Rect r) don't cover case
>>> Box(null) which is a valid possibility for Box type.
>>> It's not even clear how exhaustiveness plays with null even without nesting, so
>>> let's start there.
>>> Consider this switch:
>>>     switch (trafficLight) {
>>>         case GREEN, YELLOW: driveReallyFast();
>>>         case RED: sigh();
>>>     }
>>> Is it exhaustive? Well, we want to say yes. And with the existing
>>> null-hostility of switch, it is. But even without that, we'd like to say yes,
>>> because a null enum value is almost always an error, and making users deal with
>>> cases that don't happen in reality is kind of rude.
>>> For a domain sealed to a set of alternatives (enums or sealed classes), let's
>>> say that a set of patterns is _weakly exhaustive_ if it covers all the
>>> alternatives but not null, and _strongly exhaustive_ if it also covers null.
>>> When we did switch expressions, we said that weakly exhaustive coverings
>>> didn't need a default in a switch expression. I think we're primed to say the
>>> same thing for sealed classes. But, this "weak is good enough" leans on the
>>> fact that the existing hostility of switch will cover what we miss. We get no
>>> such cover in nested cases.
>>> I think it's worth examining further why we are willing to accept the weak
>>> coverage with enums. Is it really that we're willing to assume that enums just
>>> should never be null? If we had type cardinalities in the language, would we
>>> treat `enum X` as declaring a cardinality-1 type called X? I think we might.
>>> OK, what about sealed classes? Would the same thing carry over? Not so sure
>>> there. And this is a problem, because we ultimately want:
>>>     case Optional.of(var x):
>>>     case Optional.empty():
>>> to be exhaustive on Optional, and said exhaustiveness will likely lean on
>>> some sort of sealing.
>>> This is related to Guy's observation that totality is a "subtree all the way
>>> down" property. Consider:
>>>     sealed class Container permits Box, Bag { }
>>>     sealed class Shape permits Rect, Circle { }
>>> Ignoring null, Box+Bag should be exhaustive on container, and Rect+Circle should
>>> be exhaustive on shape. So if we are switching over a Container, then
>>> what of:
>>>     case Box(Rect r):
>>>     case Box(Circle c):
>>>     case Bag(Rect r):
>>>     case Bag(Circle c):
>>> We have some "nullity holes" in three places: Box(null), Bag(null), and null
>>> itself. Is this set of cases exhaustive on Bags, Boxes, or Containers?
>>> I think users would like to be able to write the above four cases and treat it
>>> as exhaustive; having to explicitly provide Box(null) / Box b, Bag(null) / Bag
>>> b, or a catch-all to accept null+Box(null)+Bag(null) would all be deemed
>>> unpleasant ceremony.
>>> Hmm...

From brian.goetz at oracle.com  Thu Aug 20 20:37:24 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 20 Aug 2020 16:37:24 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <1375261316.206645.1597955346798.JavaMail.zimbra@u-pem.fr>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <1375261316.206645.1597955346798.JavaMail.zimbra@u-pem.fr>
Message-ID:

> but it doesn't mean that in terms of translation strategy we need all
> those cases as synthetic cases, installing some null checks up front
> (of a pattern deconstruction) and a default should be enough, or am I
> missing something.

They don't have to be implemented as individual synthetic cases, but it
is a reasonable mental model for purposes of discussing what should happen.

But, depending on what mapping we make between the residue and
exceptions to be thrown, "null check + default" may not be a
sufficiently rich model.  (It was for enums, but the shape of the
residue is more complicated here.)  If our residue is null, Box(null),
and Box(novel), and we want to extrapolate from the exception rules we
have for enums, then we are throwing different things for Box(null) and
Box(novel).  Which is why synthetic cases might be a more accurate
starting point; we can express Box(null) and Box(novel) as patterns.
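
[Illustrative sketch, not part of the original message.] The "synthetic
cases" mental model can be written out by hand for the residue { null,
Box(null), Box(novel) }. Everything named here (ResidueSketch, describe,
Pentagon) is hypothetical, plain classes and an instanceof cascade stand in
for sealed types and deconstruction patterns, and the exception mapping is
only the strawman from this thread (NPE for null residue, ICCE for novel
residue), not a settled design.

```java
// Hypothetical model: plain classes stand in for sealed types, and an
// instanceof cascade stands in for deconstruction patterns.
final class ResidueSketch {
    static class Shape {}
    static final class Rect extends Shape {}
    static final class Circle extends Shape {}
    // Plays the "novel" subtype that appears via separate compilation.
    static final class Pentagon extends Shape {}

    static class Container {}
    static final class Box extends Container {
        final Shape shape;
        Box(Shape shape) { this.shape = shape; }
    }

    // User-written cases: Box(Rect r) and Box(Circle c).
    // Every throw below models a compiler-synthesized residue case.
    static String describe(Container c) {
        if (c == null) {
            throw new NullPointerException("residue: null");
        }
        if (c instanceof Box) {
            Shape s = ((Box) c).shape;
            if (s instanceof Rect) return "box of rect";      // case Box(Rect r)
            if (s instanceof Circle) return "box of circle";  // case Box(Circle c)
            if (s == null) {
                throw new NullPointerException("residue: Box(null)");
            }
            throw new IncompatibleClassChangeError("residue: Box(novel)");
        }
        throw new IncompatibleClassChangeError("residue: novel Container");
    }
}
```

Because Box(null) and Box(novel) end in different throws, a single
"null check + default" at the top level could not distinguish them; the
check has to happen at the nested position, which is what writing the
residue as patterns expresses.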

From forax at univ-mlv.fr  Thu Aug 20 21:26:34 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 20 Aug 2020 23:26:34 +0200 (CEST)
Subject: [pattern-switch] Exhaustiveness
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <1375261316.206645.1597955346798.JavaMail.zimbra@u-pem.fr>
Message-ID: <689693259.209417.1597958794148.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 20 August 2020 22:37:24
> Subject: Re: [pattern-switch] Exhaustiveness

>> but it doesn't mean that in terms of translation strategy we need all
>> those cases as synthetic cases, installing some null checks up front
>> (of a pattern deconstruction) and a default should be enough, or am I
>> missing something.
>
> They don't have to be implemented as individual synthetic cases, but it
> is a reasonable mental model for purposes of discussing what should happen.
>
> But, depending on what mapping we make between the residue and
> exceptions to be thrown, "null check + default" may not be a
> sufficiently rich model.  (It was for enums, but the shape of the
> residue is more complicated here.)  If our residue is null, Box(null),
> and Box(novel), and we want to extrapolate from the
> exception rules we have for enums, then we are throwing different things
> for Box(null) and Box(novel).  Which is why synthetic cases might be a
> more accurate starting point; we can express Box(null) and Box(novel) as
> patterns.

Isn't having a meaningful error message that includes the pattern, for both
the NPE and the ICCE, enough?

Rémi

From guy.steele at oracle.com  Thu Aug 20 21:54:25 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 20 Aug 2020 17:54:25 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
Message-ID: <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>

> On Aug 20, 2020, at 4:34 PM, Remi Forax wrote:
> . . .
> Given the weird shape of the residue, it's not clear there's a clean way to
> extrapolate the NPE|ICCE rule, since we might have Foo(null, novel), and
> would arbitrarily have to pick which exception to throw, and neither would
> really be all that great. Perhaps there's a new exception type lurking here.
>
> I disagree here: if we move the null checks up front, before the type
> checks, an NPE will always be thrown before an ICCE on the same pattern,
> which is the usual rule for any JDK method
> (if you see pattern matching as equivalent to a cascade of instanceof +
> method call).

> On Aug 20, 2020, at 4:37 PM, Brian Goetz wrote:
>
>> but it doesn't mean that in terms of translation strategy we need all
>> those cases as synthetic cases, installing some null checks up front (of a
>> pattern deconstruction) and a default should be enough, or am I missing
>> something.
>
> They don't have to be implemented as individual synthetic cases, but it is
> a reasonable mental model for purposes of discussing what should happen.
>
> But, depending on what mapping we make between the residue and exceptions
> to be thrown, "null check + default" may not be a sufficiently rich model.
> (It was for enums, but the shape of the residue is more complicated here.)
> If our residue is null, Box(null), and Box(novel), and we want to
> extrapolate from the exception rules we have for enums, then we are
> throwing different things for Box(null) and Box(novel).
> Which is why synthetic cases might be a more accurate starting point; we
> can express Box(null) and Box(novel) as patterns.

If I am interpreting your comments correctly, the two of you seem to be in
"violent agreement": Brian says (and I agree) that it is a useful mental
model to represent any additional code needed to detect situations that fall
"in the residue" as actual case clauses, which we can imagine as being
implicitly synthesized by the compiler.

If we were to do so, then presumably such synthesized case clauses would
have to be inserted into the switch code at valid positions. In particular,
a "null" case would need to appear *before* all other similar cases. On the
other hand, a "novel" case could be inserted anywhere, _as long as it occurs
after any related "null" case_.

Thus we are led to conclude, as Rémi apparently has, that according to the
mental model proposed by Brian, in any specific position of the pattern, the
null check necessarily occurs before the novel check.

On the other hand, because the compiler is free to handle a novel case at
any point (as long as it's after any null check), a valid strategy for the
compiler (though not the only one) is always to insert the novel case(s)
_after_ all the explicit (user-written) cases; in other words, handling the
"novel" case can be effectively equivalent to handling a "default" case. I
think this is what Rémi means when saying "installing some null checks
upfront (of a pattern deconstruction) and a default should be enough".

The remaining ambiguity is that this analysis still does not address
situations such as D(E(novel, null)); this example is briefly alluded to at
the end of Brian's initial sketch of the formalism, but unfortunately the
sketch does not address multi-parameter deconstructors in detail.
So let's go through this example: suppose that there are explicit cases that
are optimistically total (I like the terminology Brian has provided) on
D(E(Shape, Coin)), which might look like this:

    D(E(Round, Head))
    D(E(Round, Tail))
    D(E(Rect, Head))
    D(E(Rect, Tail))

Then I think the residue would consist of

    D(null)
    D(novel)
    D(E(null, null))
    D(E(null, Head))
    D(E(null, Tail))
    D(E(null, novel))
    D(E(Round, null))
    D(E(Rect, null))
    D(E(Round, novel))
    D(E(Rect, novel))
    D(E(novel, null))
    D(E(novel, Head))
    D(E(novel, Tail))
    D(E(novel, novel))

The order shown above is permissible, but some pairs may be traded, under
the constraint that if two cases differ in one position and one of them has
"null" in that position, then that one must come earlier.

If we wish behavior to be deterministic, it would be Java-like to insist
that (1) the cases be listed consistent with an increasing lexicographic
partial order, where null < novel, and (2) that sub-patterns effectively be
processed from left to right. Under these rules, the cases

    D(E(null, null))
    D(E(null, novel))

would raise NPE, and

    D(E(novel, null))
    D(E(novel, novel))

would raise ICCE.

From brian.goetz at oracle.com  Thu Aug 20 22:14:37 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 20 Aug 2020 18:14:37 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
Message-ID: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>

I suspect there are other orderings too, such as "any nulls beat any
novels" or vice versa, which would also be deterministic and potentially
more natural to the user.
But before we go there, I want to make sure we have something where users
can understand the exceptions that are thrown without too much
head-scratching.

If a user had:

    case Box(Head)
    case Box(Tail)

and a Box(null) arrived unexpectedly at the switch, would NPE really be
what they expect?  An NPE happens when you _dereference_ a null. But no
one is dereferencing anything here; it's just that Box(null) fell into that
middle space of "well, you didn't really cover it, but it's such a silly
case that I didn't want to make you cover it either, but here we are and
we have to do something."  So maybe we want some sort of SillyCaseException
(perhaps with a less silly name) for at least the null residue.

On the other hand, ICCE for Box(novel) does seem reasonable because the
world really has changed in an incompatible way since the user wrote the
code, and they probably do want to be alerted to the fact that their code
is out of sync with the world.

Separately (but not really separately), I'd like to refine my claim that
`switch` is null-hostile.  In reality, `switch` NPEs on null in three
cases: a null enum, String, or primitive box.  And, in each of these
cases, it NPEs because (the implementation) really does dereference the
target!  For a `String`, it calls `hashCode()`.  For an `enum`, it calls
`ordinal()`.  And for a box, it calls `xxxValue()`.  It is _those_ methods
that NPE, not the switch. (Yes, we could have designed it so that the
implementation did a null check before calling those things.)

> The remaining ambiguity is that this analysis still does not address
> situations such as D(E(novel, null)); this example is briefly alluded to
> at the end of Brian's initial sketch of the formalism, but unfortunately
> the sketch does not address multi-parameter deconstructors in detail.
> So let's go through this example: suppose that there are explicit cases
> that are optimistically total (I like the terminology Brian has
> provided) on D(E(Shape, Coin)), which might look like this:
>
>     D(E(Round, Head))
>     D(E(Round, Tail))
>     D(E(Rect, Head))
>     D(E(Rect, Tail))
>
> Then I think the residue would consist of
>
>     D(null)
>     D(novel)
>     D(E(null, null))
>     D(E(null, Head))
>     D(E(null, Tail))
>     D(E(null, novel))
>     D(E(Round, null))
>     D(E(Rect, null))
>     D(E(Round, novel))
>     D(E(Rect, novel))
>     D(E(novel, null))
>     D(E(novel, Head))
>     D(E(novel, Tail))
>     D(E(novel, novel))
>
> The order shown above is permissible, but some pairs may be traded,
> under the constraint that if two cases differ in one position and one
> of them has "null" in that position, then that one must come earlier.
>
> If we wish behavior to be deterministic, it would be Java-like to
> insist that (1) the cases be listed consistent with an increasing
> lexicographic partial order, where null < novel, and (2) that
> sub-patterns effectively be processed from left to right.  Under these
> rules, the cases
>
>     D(E(null, null))
>     D(E(null, novel))
>
> would raise NPE, and
>
>     D(E(novel, null))
>     D(E(novel, novel))
>
> would raise ICCE.

From guy.steele at oracle.com  Fri Aug 21 00:53:22 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 20 Aug 2020 20:53:22 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
Message-ID:

> On Aug 20, 2020, at 5:54 PM, Guy Steele wrote:
>
> . . .
>
> The remaining ambiguity is that this analysis still does not address
> situations such as D(E(novel, null)); this example is briefly alluded to
> at the end of Brian's initial sketch of the formalism, but unfortunately
> the sketch does not address multi-parameter deconstructors in detail. So
> let's go through this example: suppose that there are explicit cases that
> are optimistically total (I like the terminology Brian has provided) on
> D(E(Shape, Coin)), which might look like this:
>
>     D(E(Round, Head))
>     D(E(Round, Tail))
>     D(E(Rect, Head))
>     D(E(Rect, Tail))
>
> Then I think the residue would consist of
>
>     D(null)
>     D(novel)
>     D(E(null, null))
>     D(E(null, Head))
>     D(E(null, Tail))
>     D(E(null, novel))
>     D(E(Round, null))
>     D(E(Rect, null))
>     D(E(Round, novel))
>     D(E(Rect, novel))
>     D(E(novel, null))
>     D(E(novel, Head))
>     D(E(novel, Tail))
>     D(E(novel, novel))
>
> The order shown above is permissible, but some pairs may be traded, under
> the constraint that if two cases differ in one position and one of them
> has "null" in that position, then that one must come earlier.
>
> If we wish behavior to be deterministic, it would be Java-like to insist
> that (1) the cases be listed consistent with an increasing lexicographic
> partial order, where null < novel, and (2) that sub-patterns effectively
> be processed from left to right. Under these rules, the cases
>
>     D(E(null, null))
>     D(E(null, novel))
>
> would raise NPE, and
>
>     D(E(novel, null))
>     D(E(novel, novel))
>
> would raise ICCE.

I went for a walk after supper (always a good time for extra thinking) and
realized that I may have insufficiently appreciated Rémi's comment. What
prompted this was pondering the fact that there are two standard ways to
order tuples: lexicographic order, in which the order of the tuple elements
matters, and the product order, in which it does not. Maybe he wanted all
cases that have a null anywhere checked before any other cases.
This may require a pattern position to be examined more than once, but does
have the nice properties of (a) not requiring left-to-right processing, and
(b) always raising NPE in preference to ICCE if NPE is possible.

We can express this in terms of Brian's "synthetic cases" as follows. The
user writes:

    switch (x) {
        case D(E(Round, Head)): S1
        case D(E(Round, Tail)): S2
        case D(E(Rect, Head)): S3
        case D(E(Rect, Tail)): S4
    }

The residue can be covered by:

    D(null)
    D(E(null, _))
    D(E(_, null))
    D(E(Round, novel))
    D(E(Rect, novel))
    D(E(novel, Head))
    D(E(novel, Tail))
    D(E(novel, novel))
    D(novel)

where I have written "_" for "var unusedVariable". And from this we see
(since we can put all the cases involving "novel" _last_) that we can
rewrite the switch, by adding Brian's synthetic case clauses, as:

    switch (x) {
        case D(null):
        case D(E(null, _)):
        case D(E(_, null)): NPE
        case D(E(Round, Head)): S1
        case D(E(Round, Tail)): S2
        case D(E(Rect, Head)): S3
        case D(E(Rect, Tail)): S4
        default: ICCE
    }

which I now think is the structure that Rémi was really hinting at.

From guy.steele at oracle.com  Fri Aug 21 01:02:02 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 20 Aug 2020 21:02:02 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
 <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
Message-ID: <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com>

> On Aug 20, 2020, at 6:14 PM, Brian Goetz wrote:
>
> I suspect there are other orderings too, such as "any nulls beat any
> novels" or vice versa, which would also be deterministic and potentially
> more natural to the user.
> But before we go there, I want to make sure we have something where users
> can understand the exceptions that are thrown without too much
> head-scratching.
>
> If a user had:
>
>     case Box(Head)
>     case Box(Tail)
>
> and a Box(null) arrived unexpectedly at the switch, would NPE really be
> what they expect? An NPE happens when you _dereference_ a null. But no one
> is dereferencing anything here; it's just that Box(null) fell into that
> middle space of "well, you didn't really cover it, but it's such a silly
> case that I didn't want to make you cover it either, but here we are and we
> have to do something." So maybe we want some sort of SillyCaseException
> (perhaps with a less silly name) for at least the null residue.

I believe that if Head and Tail exhaustively cover an enum or sealed type
(as was the intended implication of my example) -- more generally, in a
situation that is optimistically total -- then the user would be very happy
to have some sort of error signaled if some other value shows up
unexpectedly in a statement switch, whether that value is "Ankle" or
"null". Maybe a new error name would be appropriate, such as
UnexpectedNull.

If the user does not want such implicit handling of an optimistically total
situation in a statement switch, then it is always possible to provide
explicit clauses "case null: break;" and "default: break;".

> On the other hand, ICCE for Box(novel) does seem reasonable because the
> world really has changed in an incompatible way since the user wrote the
> code, and they probably do want to be alerted to the fact that their code
> is out of sync with the world.

Yep.

> Separately (but not really separately), I'd like to refine my claim that
> `switch` is null-hostile. In reality, `switch` NPEs on null in three
> cases: a null enum, String, or primitive box. And, in each of these cases,
> it NPEs because (the implementation) really does dereference the target!
> For a `String`, it calls `hashCode()`. For an `enum`, it calls `ordinal()`.
> And for a box, it calls `xxxValue()`. It is _those_ methods that NPE, not
> the switch. (Yes, we could have designed it so that the implementation did
> a null check before calling those things.)

From brian.goetz at oracle.com  Fri Aug 21 15:14:57 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 21 Aug 2020 11:14:57 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
 <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
 <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com>
Message-ID: <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>

Yes, this is the sort of ordering I was aiming at.

> If the user does not want such implicit handling of an optimistically
> total situation in a statement switch, then it is always possible to
> provide explicit clauses "case null: break;" and "default: break;".

Indeed, and this is why I was trying to break it down into a set of
cases, to ensure that there always is a pattern the user can denote if
they want to catch some part of the residue.  Where we are now is:

 - In a total switch (currently just switch expressions), any residue
   involving novel values gets ICCE, a null gets NPE, and any residue not
   in the above categories gets (something, maybe NPE, maybe something
   else.)

 - If the user explicitly wants Box(null), they have two choices:
   explicitly match Box(null), or, more likely, use some total pattern on
   Box (`Box(var x)`, `Box b`, etc.)  Similarly, if they want (for
   whatever reason) Box(novel), they can use totality in the same way.  (I
   hope people are beginning to see why totality in nesting is so
   critical.)

So, next sub-subject (sub-ject?): when, and under what conditions, do we
get NPE from non-total switches?
I said this yesterday:

> Separately (but not really separately), I'd like to refine my claim that `switch` is null-hostile. In reality, `switch` NPEs on null in three cases: a null enum, String, or primitive box. And, in each of these cases, it NPEs because (the implementation) really does dereference the target! For a `String`, it calls `hashCode()`. For an `enum`, it calls `ordinal()`. And for a box, it calls `xxxValue()`. It is _those_ methods that NPE, not the switch. (Yes, we could have designed it so that the implementation did a null check before calling those things.)

I bring this up because these situations cause current switch to NPE even when the switch is not total, and this muddies the story a lot. We can refine this behavior by saying: "If a switch *on enums, strings, or boxes* has no nullable cases, then there is an implicit `case null: NPE` at the beginning".

In other words, I am proposing to treat this "preemptive throwing" as an artifact of switching over these special types (which is fair because the language already gives these types special treatment.) Then, we are free to treat residue-handling as a consequence of totality, not a general null-hostility of switch.

Let me repeat that, because it's a big deal.

    Switch is *not* null-hostile. We were just extrapolating from too few data points to see it.

    Switches on _enums, strings, and boxes_, that do not explicitly have null-handling cases, are null-hostile, because switching on these involves calling methods on Enum, String, or {Integer,Long,...}.

    If you put a `case null` in a switch on strings/etc, it doesn't throw, it's just matching a value.

    In all other cases, null is just a value that can be matched, or not, and if the switch ignores its residue, the nulls leak out just like the rest of it.

    In the general case, switches throw only when they are total; for partial switches (e.g. statement switches), null is just another value that didn't get matched.

I believe this restores us to sanity.

Next up (separate topic): letting statement switches opt into totality.

Assuming that we're on the right track, and drilling into the next level, we now have to bring this back to totality.

On 8/20/2020 9:02 PM, Guy Steele wrote:
>
>> On Aug 20, 2020, at 6:14 PM, Brian Goetz wrote:
>>
>> I suspect there are other orderings too, such as "any nulls beat any novels" or vice versa, which would also be deterministic and potentially more natural to the user. But before we go there, I want to make sure we have something where users can understand the exceptions that are thrown without too much head-scratching.
>>
>> If a user had:
>>
>>     case Box(Head)
>>     case Box(Tail)
>>
>> and a Box(null) arrived unexpectedly at the switch, would NPE really be what they expect? An NPE happens when you _dereference_ a null. But no one is dereferencing anything here; it's just that Box(null) fell into that middle space of "well, you didn't really cover it, but it's such a silly case that I didn't want to make you cover it either, but here we are and we have to do something." So maybe we want some sort of SillyCaseException (perhaps with a less silly name) for at least the null residue.
> I believe that if Head and Tail exhaustively cover an enum or sealed type (as was the intended implication of my example)---more generally, in a situation that is optimistically total---then the user would be very happy to have some sort of error signaled if some other value shows up unexpectedly in a statement switch, whether that value is "Ankle" or "null". Maybe a new error name would be appropriate, such as UnexpectedNull.
>
> If the user does not want such implicit handling of an optimistically total situation in a statement switch, then it is always possible to provide explicit clauses "case null: break;" and "default: break;".
>
>> On the other hand, ICCE for Box(novel) does seem reasonable because the world really has changed in an incompatible way since the user wrote the code, and they probably do want to be alerted to the fact that their code is out of sync with the world.
> Yep.
>
>> Separately (but not really separately), I'd like to refine my claim that `switch` is null-hostile. In reality, `switch` NPEs on null in three cases: a null enum, String, or primitive box. And, in each of these cases, it NPEs because (the implementation) really does dereference the target! For a `String`, it calls `hashCode()`. For an `enum`, it calls `ordinal()`. And for a box, it calls `xxxValue()`. It is _those_ methods that NPE, not the switch. (Yes, we could have designed it so that the implementation did a null check before calling those things.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Fri Aug 21 20:18:37 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 21 Aug 2020 16:18:37 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
Message-ID:

On 8/21/2020 11:14 AM, Brian Goetz wrote:
>
> Next up (separate topic): letting statement switches opt into totality.
>

Assuming the discussion on Exhaustiveness is good, let's talk about totality.

Expression switches must be total; we totalize them by throwing when we encounter any residue, even though we only require that the set of cases in the switch be optimistically total.
Residue includes:

 - `null` switch targets in String, Enum, and primitive box switches only;
 - novel values in enum switches without a total case clause;
 - novel subtypes in switches on sealed types without a total case clause;
 - when an optimistically total subchain of deconstruction pattern cases wraps a residue value (e.g., D(null) or D(novel))

What about statement switches? Right now, any residue for a statement switch without a total case clause will just be silently ignored (because statement switches need not be total.)

What we would like is a way to say "this switch is total, please type check it for me as such, and insert any needed residue-catching cases." I think this is a job for `default`.

Now that we've got some clarity that switches _don't_ throw on null, but instead it is as if string/enum/box switches have an implicit `case null` when no explicit one is present, we can define `default`, once again, to be total (and not just weakly total.) So in:

    switch (object) {
        case "foo":
        case Box(Frog fs):
        default: ...
    }

a `null` just falls into `default` just like anything else that is not the string "foo" or a box of frogs ("let the nulls flow"). Default would have to come last (except in legacy switches, where a legacy switch has one of the distinguished target types and all constant case labels.)

What if we want to destructure too? Well, add a pattern:

    switch (object) {
        case "foo":
        case Box(Frog fs):
        default Object o: ...
    }

This would additionally assert that the following pattern is total, otherwise a compilation error ensues. (Note, though, that this is entirely about `switch`, not patterns. The semantics of the pattern is unchanged, and I do not believe that sprinkling `default` into nested patterns to shout "TOTALITY HERE, I MEAN IT" carries its weight.)
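For reference, the legacy behavior that the "implicit `case null`" reading is meant to explain can be observed in released Java. The following is a minimal sketch of that existing behavior only, not of the proposed semantics; the class, enum, and method names are illustrative and do not come from this thread. It shows that a null target in a String, enum, or box switch throws NullPointerException even though a `default` clause is present.

```java
// Illustrative names only; this demonstrates today's legacy switch behavior.
class LegacyNullSwitch {
    enum Coin { HEAD, TAIL }

    // Switch on a String: a null target throws before any case is considered.
    static String describeString(String s) {
        try {
            switch (s) {
                case "foo": return "foo";
                default:    return "default";
            }
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    // Switch on an enum: same preemptive NPE for a null target.
    static String describeEnum(Coin c) {
        try {
            switch (c) {
                case HEAD: return "head";
                default:   return "default";
            }
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    // Switch on a box: the target is unboxed, so a null Integer throws.
    static String describeBox(Integer i) {
        try {
            switch (i) {
                case 1:  return "one";
                default: return "default";
            }
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeString("bar")); // prints "default"
        System.out.println(describeString(null));  // prints "NPE"
        System.out.println(describeEnum(null));    // prints "NPE"
        System.out.println(describeBox(null));     // prints "NPE"
    }
}
```

In each method the `default` clause never sees the null, which is consistent with reading these three switch kinds as having an implicit null-rejecting case at the top.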
This seems a better job to give default in this new world; anything not previously matched, where we retcon the current null behavior as being only about string, enum, or boxes.

This leaves us with only one hole, which is: suppose I have an _optimistically total_ statement switch. Users might like to (a) assert the switch is total, and get the concomitant type checking, and (b) get residue ejection for free. Of the two, though, A is much more important than B, but we'll take B when we can get it. Perhaps, if the target of a switch is a sealed type, we can interpret:

    switch (shape) {
        case Rect r: ...
        default Circle c: ...
    }

as meaning that `Circle c` _closes_ the switch to make it total, and engages the totality checking to ensure this is true. So, `default P` would mean either:

 - P is total, or
 - P is not total, but taken with the other cases, makes the switch optimistically total

and in the latter case, would engage the residue-detection-and-ejection machinery.

This might be stretching it a tad too far, but I like that we can give `default` useful new jobs to do in `switch` rather than just giving him a gold watch.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guy.steele at oracle.com Sat Aug 22 00:32:13 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 21 Aug 2020 20:32:13 -0400
Subject: [pattern-switch] Totality
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
Message-ID: <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com>

> On Aug 21, 2020, at 4:18 PM, Brian Goetz wrote:
>
> On 8/21/2020 11:14 AM, Brian Goetz wrote:
>>
>> Next up (separate topic): letting statement switches opt into totality.
>>
>
> Assuming the discussion on Exhaustiveness is good, let's talk about totality.
>
> Expression switches must be total; we totalize them by throwing when we encounter any residue, even though we only require that the set of cases in the switch be optimistically total. Residue includes:
>
>  - `null` switch targets in String, Enum, and primitive box switches only;
>  - novel values in enum switches without a total case clause;
>  - novel subtypes in switches on sealed types without a total case clause;
>  - when an optimistically total subchain of deconstruction pattern cases wraps a residue value (e.g., D(null) or D(novel))
>
> What about statement switches? Right now, any residue for a statement switch without a total case clause will just be silently ignored (because statement switches need not be total.)
>
> What we would like is a way to say "this switch is total, please type check it for me as such, and insert any needed residue-catching cases." I think this is a job for `default`.
>
> Now that we've got some clarity that switches _don't_ throw on null, but instead it is as if string/enum/box switches have an implicit `case null` when no explicit one is present, we can define `default`, once again, to be total (and not just weakly total.) So in:
>
>     switch (object) {
>         case "foo":
>         case Box(Frog fs):
>         default: ...
>     }
>
> a `null` just falls into `default` just like anything else that is not the string "foo" or a box of frogs ("let the nulls flow"). Default would have to come last (except in legacy switches, where a legacy switch has one of the distinguished target types and all constant case labels.)
>
> What if we want to destructure too? Well, add a pattern:
>
>     switch (object) {
>         case "foo":
>         case Box(Frog fs):
>         default Object o: ...
>     }
>
> This would additionally assert that the following pattern is total, otherwise a compilation error ensues. (Note, though, that this is entirely about `switch`, not patterns. The semantics of the pattern is unchanged, and I do not believe that sprinkling `default` into nested patterns to shout "TOTALITY HERE, I MEAN IT" carries its weight.)
>
> This seems a better job to give default in this new world; anything not previously matched, where we retcon the current null behavior as being only about string, enum, or boxes.
>
> This leaves us with only one hole, which is: suppose I have an _optimistically total_ statement switch. Users might like to (a) assert the switch is total, and get the concomitant type checking, and (b) get residue ejection for free. Of the two, though, A is much more important than B, but we'll take B when we can get it. Perhaps, if the target of a switch is a sealed type, we can interpret:
>
>     switch (shape) {
>         case Rect r: ...
>         default Circle c: ...
>     }
>
> as meaning that `Circle c` _closes_ the switch to make it total, and engages the totality checking to ensure this is true.
> So, `default P` would mean either:
>
>  - P is total, or
>  - P is not total, but taken with the other cases, makes the switch optimistically total
>
> and in the latter case, would engage the residue-detection-and-ejection machinery.
>
> This might be stretching it a tad too far, but I like that we can give `default` useful new jobs to do in `switch` rather than just giving him a gold watch.

This is a pretty good story, but I am sufficiently distressed over the asymmetry of having to treat specially the last one of several otherwise completely symmetric and equal cases:

    switch (color) {
        case Red: ...
        case Green: ...
        default Blue: ...
    }

when I would much rather see

    switch (color) {
        case Red: ...
        case Green: ...
        case Blue: ...
    }

that I am going to explore several other design options, some of them more obviously terrible than others, in hopes of prompting someone else to have a brilliant idea.

First of all, let me note that after Brian's detailed analysis about the treatment of `null`, the only real difficulty we face is compatibility with legacy switches on enum types. We missed an opportunity when enum was first introduced. I really hate to recommend an incompatible change to the language, but this message is just brainstorming, so:

Option 1: If the type of the switch expression is an enum or a sealed type, then it is a static error if the patterns are not at least optimistically total. **This would be an incompatible change with respect to existing switches on enum types.**

Option 2: If the type of the switch expression is a sealed type, then it is a static error if the patterns are not at least optimistically total. This treats enums and sealed types differently, but is compatible (as are all the other options I will list below).

Option 3: If the type of the switch expression is a sealed type, then it is a static error if the patterns are not at least optimistically total. You can get the benefit of this feature when switching on an enum type by adding the keyword "sealed" to the declaration of the enum type.

    enum Color { RED, GREEN }
    Color x;
    switch (x) { RED: ... }  // Okay

    sealed enum Color { RED, GREEN }
    Color x;
    switch (x) { RED: ... }  // static error: cases are not optimistically total

Option 4: If the type of the switch expression is a sealed type, then it is a static error if the patterns are not at least optimistically total. You can get the benefit of this feature when switching on an enum type by adding the keyword "enum" to the switch statement.

    enum Color { RED, GREEN }
    Color x;
    switch (x) { RED: ... }  // Okay

    enum Color { RED, GREEN }
    Color x;
    switch enum (x) { RED: ... }  // static error: cases are not optimistically total

Option 5: Expression switches must be total. So if you want a statement switch but want it to be total, convert it to an expression switch by writing "(void)" in front of it (and add a semicolon at the end).

    enum Color { RED, GREEN }
    Color x;
    switch (x) { RED: ... }  // Okay

    enum Color { RED, GREEN }
    Color x;
    (void) switch (x) { RED: ... };  // static error: cases are not optimistically total

(Yeah, I have glossed over a number of details here.)

Option 6: The classic idiom for switching on an enum type looks like this example taken from the JLS:

    switch (c) {
        case PENNY: return CoinColor.COPPER;
        case NICKEL: return CoinColor.NICKEL;
        case DIME: case QUARTER: return CoinColor.SILVER;
        default: throw new AssertionError("Unknown coin: " + c);
    }

The only really annoying thing about this is having to write (and read) the boilerplate code for constructing the error to be thrown. So how about this abbreviation:

    switch (c) {
        case PENNY: return CoinColor.COPPER;
        case NICKEL: return CoinColor.NICKEL;
        case DIME: case QUARTER: return CoinColor.SILVER;
        default throw;
    }

The meaning of "default throw;" is that it is a static error if the case patterns are not optimistically total (and it reminds you that you will get some synthetic default cases that will throw an error if something goes wrong).
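To make the contrast behind Option 6 concrete, here is a minimal sketch, assuming Java 14 or later, that places the classic JLS-style statement switch next to an expression switch over the same enum. The `Coin` and `CoinColor` names follow the example above; the class and method names are illustrative. The hypothetical `default throw;` syntax is not valid Java and is not shown; the expression switch is simply the closest existing construct in which the compiler itself checks that every constant is covered.

```java
// Illustrative sketch; class and method names are not from the thread.
class CoinSwitchSketch {
    enum Coin { PENNY, NICKEL, DIME, QUARTER }
    enum CoinColor { COPPER, NICKEL, SILVER }

    // Classic statement switch: the default clause is hand-written boilerplate,
    // and forgetting a constant is only discovered at run time.
    static CoinColor colorClassic(Coin c) {
        switch (c) {
            case PENNY:  return CoinColor.COPPER;
            case NICKEL: return CoinColor.NICKEL;
            case DIME: case QUARTER: return CoinColor.SILVER;
            default: throw new AssertionError("Unknown coin: " + c);
        }
    }

    // Expression switch with no default: removing any case below makes this
    // method fail to compile, because expression switches must be exhaustive.
    static CoinColor colorTotal(Coin c) {
        return switch (c) {
            case PENNY -> CoinColor.COPPER;
            case NICKEL -> CoinColor.NICKEL;
            case DIME, QUARTER -> CoinColor.SILVER;
        };
    }

    public static void main(String[] args) {
        for (Coin c : Coin.values()) {
            System.out.println(c + " -> " + colorClassic(c) + " / " + colorTotal(c));
        }
    }
}
```

Both methods agree on every constant; the difference is only in when a missing case is detected.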
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Sat Aug 22 11:50:39 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 22 Aug 2020 13:50:39 +0200 (CEST)
Subject: [pattern-switch] Exhaustiveness
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
Message-ID: <1474698324.443116.1598097039494.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Guy Steele"
> To: "Brian Goetz" , "Remi Forax"
> Cc: "Tagir Valeev" , "amber-spec-experts"
> Sent: Friday, 21 August 2020 02:53:22
> Subject: Re: [pattern-switch] Exhaustiveness

>> On Aug 20, 2020, at 5:54 PM, Guy Steele wrote:
>>
>> . . .
>>
>> The ambiguity that this analysis still does not address arises in situations such as D(E(novel, null)); this example is briefly alluded to at the end of Brian's initial sketch of the formalism, but unfortunately the sketch does not address multi-parameter deconstructs in detail. So let's go through this example: suppose that there are explicit cases that are optimistically total (I like the terminology Brian has provided) on D(E(Shape, Coin)), which might look like this:
>>
>>     D(E(Round, Head))
>>     D(E(Round, Tail))
>>     D(E(Rect, Head))
>>     D(E(Rect, Tail))
>>
>> Then I think the residue would consist of
>>
>>     D(null)
>>     D(novel)
>>     D(E(null, null))
>>     D(E(null, Head))
>>     D(E(null, Tail))
>>     D(E(null, novel))
>>     D(E(Round, null))
>>     D(E(Rect, null))
>>     D(E(Round, novel))
>>     D(E(Rect, novel))
>>     D(E(novel, null))
>>     D(E(novel, Head))
>>     D(E(novel, Tail))
>>     D(E(novel, novel))
>>
>> The order shown above is permissible, but some pairs may be traded, under the constraint that if two cases differ in one position and one of them has "null" in that position, then that one must come earlier.
>>
>> If we wish behavior to be deterministic, it would be Java-like to insist that (1) the cases be listed consistent with an increasing lexicographic partial order, where null < novel, and (2) that sub-patterns effectively be processed from left to right. Under these rules, the cases
>>
>>     D(E(null, null))
>>     D(E(null, novel))
>>
>> would raise NPE, and
>>
>>     D(E(novel, null))
>>     D(E(novel, novel))
>>
>> would raise ICCE.
>
> I went for a walk after supper (always a good time for extra thinking) and realized (pondering the fact that there are two standard ways to order tuples, namely lexicographic order, in which the order of the tuple elements matters, and the product order, in which the order of the tuple elements does not matter) that I may have insufficiently appreciated Rémi's comment. Maybe he wanted all cases that have a null anywhere checked before any other cases. This may require a pattern position to be examined more than once, but does have the nice properties of (a) not requiring left-to-right processing, and (b) always raising NPE in preference to ICCE if NPE is possible. We can express this in terms of Brian's "synthetic cases" as follows:
>
> The user writes:
>
>     switch (x) {
>         case D(E(Round, Head)): S1
>         case D(E(Round, Tail)): S2
>         case D(E(Rect, Head)): S3
>         case D(E(Rect, Tail)): S4
>     }
>
> The residue can be covered by:
>
>     D(null)
>     D(E(null, _))
>     D(E(_, null))
>     D(E(Round, novel))
>     D(E(Rect, novel))
>     D(E(novel, Head))
>     D(E(novel, Tail))
>     D(E(novel, novel))
>     D(novel)
>
> where I have written "_" for "var unusedVariable".
> And from this we see (since we can put all the cases involving "novel" _last_) that we can rewrite the switch, by adding Brian's synthetic case clauses, as:
>
>     switch (x) {
>         case D(null):
>         case D(E(null, _)):
>         case D(E(_, null)): NPE
>         case D(E(Round, Head)): S1
>         case D(E(Round, Tail)): S2
>         case D(E(Rect, Head)): S3
>         case D(E(Rect, Tail)): S4
>         default: ICCE
>     }
>
> which I now think is the structure that Rémi was really hinting at.

Yes, once you need to deconstruct something and there is no pattern that traps null (neither case null nor case var), an NPE should be generated if one of the values is null. As you said, it's like adding synthetic cases on top of the deconstructing patterns.

Rémi

From forax at univ-mlv.fr Sat Aug 22 12:01:28 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 22 Aug 2020 14:01:28 +0200 (CEST)
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
References: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
Message-ID: <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Guy Steele"
> Cc: "Remi Forax" , "Tagir Valeev" , "amber-spec-experts"
> Sent: Friday, 21 August 2020 17:14:57
> Subject: Re: [pattern-switch] Exhaustiveness

> Yes, this is the sort of ordering I was aiming at.

>> If the user does not want such implicit handling of an optimistically total situation in a statement switch, then it is always possible to provide explicit clauses "case null: break;" and "default: break;".

> Indeed, and this is why I was trying to break it down into a set of cases, to ensure that there always is a pattern the user can denote if they want to catch some part of the residue.
> Where we are now is:

> - In a total switch (currently just switch expressions), any residue involving novel values gets ICCE, a null gets NPE, and any residue not in the above categories gets (something, maybe NPE, maybe something else.)

> - If the user explicitly wants Box(null), they have two choices: explicitly match Box(null), or, more likely, use some total pattern on Box (`Box(var x)`, `Box b`, etc.) Similarly, if they want (for whatever reason) Box(novel), they can similarly use totality. (I hope people are beginning to see why totality in nesting is so critical.)

> So, next sub-subject (sub-ject?): when, and under what conditions, do we get NPE from non-total switches? I said this yesterday:

>> Separately (but not really separately), I'd like to refine my claim that `switch` is null-hostile. In reality, `switch` NPEs on null in three cases: a null enum, String, or primitive box. And, in each of these cases, it NPEs because (the implementation) really does dereference the target! For a `String`, it calls `hashCode()`. For an `enum`, it calls `ordinal()`. And for a box, it calls `xxxValue()`. It is _those_ methods that NPE, not the switch. (Yes, we could have designed it so that the implementation did a null check before calling those things.)

> I bring this up because these situations cause current switch to NPE even when the switch is not total, and this muddies the story a lot. We can refine this behavior by saying: "If a switch *on enums, strings, or boxes* has no nullable cases, then there is an implicit `case null: NPE` at the beginning".

> In other words, I am proposing to treat this "preemptive throwing" as an artifact of switching over these special types (which is fair because the language already gives these types special treatment.) Then, we are free to treat residue-handling as a consequence of totality, not a general null-hostility of switch.

> Let me repeat that, because it's a big deal.

> Switch is *not* null-hostile. We were just extrapolating from too few data points to see it.

> Switches on _enums, strings, and boxes_, that do not explicitly have null-handling cases, are null-hostile, because switching on these involves calling methods on Enum, String, or {Integer,Long,...}.

> If you put a `case null` in a switch on strings/etc, it doesn't throw, it's just matching a value.

> In all other cases, null is just a value that can be matched, or not, and if the switch ignores its residue, the nulls leak out just like the rest of it.

> In the general case, switches throw only when they are total; for partial switches (e.g. statement switches), null is just another value that didn't get matched.

> I believe this restores us to sanity.

I'm not hostile to that view, but may I ask an honest question: why is this semantics better? Do you have examples where it makes sense to let the null slip through the statement switch?

Because I can see why being null-hostile is a good default: it follows the mottos "blow early, blow often" or "in case of doubt, throw".

Rémi

[...]

> On 8/20/2020 9:02 PM, Guy Steele wrote:

>>> On Aug 20, 2020, at 6:14 PM, Brian Goetz <brian.goetz at oracle.com> wrote:

>>> I suspect there are other orderings too, such as "any nulls beat any novels" or vice versa, which would also be deterministic and potentially more natural to the user. But before we go there, I want to make sure we have something where users can understand the exceptions that are thrown without too much head-scratching.

>>> If a user had:

>>>     case Box(Head)
>>>     case Box(Tail)

>>> and a Box(null) arrived unexpectedly at the switch, would NPE really be what they expect? An NPE happens when you _dereference_ a null.
>>> But no one is dereferencing anything here; it's just that Box(null) fell into that middle space of "well, you didn't really cover it, but it's such a silly case that I didn't want to make you cover it either, but here we are and we have to do something." So maybe we want some sort of SillyCaseException (perhaps with a less silly name) for at least the null residue.

>> I believe that if Head and Tail exhaustively cover an enum or sealed type (as was the intended implication of my example)---more generally, in a situation that is optimistically total---then the user would be very happy to have some sort of error signaled if some other value shows up unexpectedly in a statement switch, whether that value is "Ankle" or "null". Maybe a new error name would be appropriate, such as UnexpectedNull.

>> If the user does not want such implicit handling of an optimistically total situation in a statement switch, then it is always possible to provide explicit clauses "case null: break;" and "default: break;".

>>> On the other hand, ICCE for Box(novel) does seem reasonable because the world really has changed in an incompatible way since the user wrote the code, and they probably do want to be alerted to the fact that their code is out of sync with the world.

>> Yep.

>>> Separately (but not really separately), I'd like to refine my claim that `switch` is null-hostile. In reality, `switch` NPEs on null in three cases: a null enum, String, or primitive box. And, in each of these cases, it NPEs because (the implementation) really does dereference the target! For a `String`, it calls `hashCode()`. For an `enum`, it calls `ordinal()`. And for a box, it calls `xxxValue()`. It is _those_ methods that NPE, not the switch. (Yes, we could have designed it so that the implementation did a null check before calling those things.)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From forax at univ-mlv.fr Sat Aug 22 12:37:17 2020 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 22 Aug 2020 14:37:17 +0200 (CEST) Subject: [pattern-switch] Totality In-Reply-To: <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com> References: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com> Message-ID: <654759416.445557.1598099837886.JavaMail.zimbra@u-pem.fr> > De: "Guy Steele" > ?: "Brian Goetz" > Cc: "Remi Forax" , "Tagir Valeev" , > "amber-spec-experts" > Envoy?: Samedi 22 Ao?t 2020 02:32:13 > Objet: Re: [pattern-switch] Totality >> On Aug 21, 2020, at 4:18 PM, Brian Goetz < [ mailto:brian.goetz at oracle.com | >> brian.goetz at oracle.com ] > wrote: >> On 8/21/2020 11:14 AM, Brian Goetz wrote: >>> Next up (separate topic): letting statement switches opt into totality. >> Assuming the discussion on Exhaustiveness is good, let's talk about totality. >> Expression switches must be total; we totalize them by throwing when we >> encounter any residue, even though we only require that the set of cases in the >> switch be optimistically total. Residue includes: >> - `null` switch targets in String, Enum, and primitive box switches only; >> - novel values in enum switches without a total case clause; >> - novel subtypes in switches on sealed types without a total case clause; >> - when an optimistically total subchain of deconstruction pattern cases wraps a >> residue value (e.g., D(null) or D(novel)) >> What about statement switches? Right now, any residue for a statement switch >> without a total case clause will just be silently ignored (because statement >> switches need not be total.) 
>> What we would like is a way to say "this switch is total, please type check it >> for me as such, and insert any needed residue-catching cases." I think this is >> a job for `default`. >> Now that we've got some clarity that switches _don't_ throw on null, but instead >> it is as if string/enum/box switches have an implicit `case null` when no >> explicit one is present, we can define `default`, once again, to be total (and >> not just weakly total.) So in: >> switch (object) { >> case "foo": >> case Box(Frog fs): >> default: ... >> } >> a `null` just falls into `default` just like anything else that is not the >> string "foo" or a box of frogs ("let the nulls flow"). Default would have to >> come last (except in legacy switches, where a legacy switch has one of the >> distinguished target types and all constant case labels.) >> What if we want to destructure too? Well, add a pattern: >> switch (object) { >> case "foo": >> case Box(Frog fs): >> default Object o: ... >> } >> This would additionally assert that the following pattern is total, otherwise a >> compilation error ensues. (Note, though, that this is entirely about `switch`, >> not patterns. The semantics of the pattern is unchanged, and I do not believe >> that sprinkling `default` into nested patterns to shout "TOTALITY HERE, I MEAN >> IT" carries its weight.) >> This seems a better job to give default in this new world; anything not >> previously matched, where we retcon the current null behavior as being only >> about string, enum, or boxes. >> This leaves us with only one hole, which is: suppose I have an _optimistically >> total_ statement switch. Users might like to (a) assert the switch is total, >> and get the concomitant type checking, and (b) get residue ejection for free. >> Of the two, though, A is much more important than B, but we'll take B when we >> can get it. Perhaps, if the target of a switch is a sealed type, we can >> interpret: >> switch (shape) { >> case Rect r: ... 
>> default Circle c: ... >> } >> as meaning that `Circle c` _closes_ the switch to make it total, and engages the >> totality checking to ensure this is true. So, `default P` would mean either: >> - P is total, or >> - P is not total, but taken with the other cases, makes the switch >> optimistically total >> and in the latter case, would engage the residue-detection-and-ejection >> machinery. >> This might be stretching it a tad too far, but I like that we can given >> `default` useful new jobs to do in `switch` rather than just giving him a gold >> watch. > This is a pretty good story, but I am sufficiently distressed over the asymmetry > of having to treat specially the last one of several otherwise completely > symmetric and equal cases: > switch (color) { > case Red: ? > case Green: ? > default Blue: ? > } > when I would much rather see > switch (color) { > case Red: ? > case Green: ? > case Blue: ? > } > that I am going to explore several other design options, some of them more > obviously terrible than others, in hopes of prompting someone else to have a > brilliant idea. > First of all, let me note that after Brian?s detailed analysis about the > treatment of `null`, the only real difficulty we face is compatibility with > legacy switches on enum types. We missed an opportunity when enum was first > introduced. I really hate to recommend an incompatible change to the language, > but this message is just brainstorming, so: > Option 1: If the type of the switch expression is an enum or a sealed type, then > it is a static error if the patterns are not at least optimistically total. > **This would be an incompatible change with respect to existing switches on > enum types.** > Option 2: If the type of the switch expression is a sealed type, then it is a > static error if the patterns are not at least optimistically total. This treats > enums and sealed types differently, but is compatible (as are all the other > options I will list below). 
>
> Option 3: If the type of the switch expression is a sealed type, then it is a
> static error if the patterns are not at least optimistically total. You can get
> the benefit of this feature when switching on an enum type by adding the
> keyword "sealed" to the declaration of the enum type.
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // Okay
>
>     sealed enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // static error: cases are not optimistically total
>
> Option 4: If the type of the switch expression is a sealed type, then it is a
> static error if the patterns are not at least optimistically total. You can get
> the benefit of this feature when switching on an enum type by adding the
> keyword "enum" to the switch statement.
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // Okay
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch enum (x) { RED: ... }    // static error: cases are not optimistically total
>
> Option 5: Expression switches must be total. So if you want a statement switch
> but want it to be total, convert it to an expression switch by writing "(void)"
> in front of it (and add a semicolon at the end).
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // Okay
>
>     enum Color { RED, GREEN }
>     Color x;
>     (void) switch (x) { RED: ... };    // static error: cases are not optimistically total
>
> (Yeah, I have glossed over a number of details here.)

I have already written code that organically does mostly that, turning a
statement switch into an expression switch so that it stops compiling
when I later add more enum constants:

  var _ = switch(x) {
    case RED:
      ...
      yield false;
    case GREEN:
      ...
      yield false;
  };

> Option 6: The classic idiom for switching on an enum type looks like this example
> taken from the JLS:
>
>     switch (c) {
>         case PENNY: return CoinColor.COPPER;
>         case NICKEL: return CoinColor.NICKEL;
>         case DIME: case QUARTER: return CoinColor.SILVER;
>         default: throw new AssertionError("Unknown coin: " + c);
>     }
>
> The only really annoying thing about this is having to write (and read) the
> boilerplate code for constructing the error to be thrown. So how about this
> abbreviation:
>
>     switch (c) {
>         case PENNY: return CoinColor.COPPER;
>         case NICKEL: return CoinColor.NICKEL;
>         case DIME: case QUARTER: return CoinColor.SILVER;
>         default throw;
>     }
>
> The meaning of "default throw;" is that it is a static error if the case
> patterns are not optimistically total (and it reminds you that you will get
> some synthetic default cases that will throw an error if something goes wrong).

If we go down the route of saying that switches on enum, string and box
are special because they are null-hostile, why not go a step further and
say that, apart from those switches and the switches on primitive types,
all other switches should be total? So obviously an expression switch
should be total, but a statement switch should be total too.

And now we only need to solve the problem of enums inside a statement
switch. Here I disagree with Brian that it's a job for "default"; as a
developer, I want the compiler to emit an error at compile time, not at
runtime.
I wonder if, like Option 1, we cannot bully our way out by first raising
a warning if the statement switch is not optimistically total (IDEs
already do that, but ask for a default) and adding an ICCE automatically
if the switch is total (it's a behaviorally incompatible change, but it
aligns the statement switch with the expression switch, and I believe it
will be fine in real life), then later converting that warning to an
error, as we want to do with wrapper types and ==.
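
A minimal sketch of the statement-to-expression trick mentioned above, assuming a Java 14+ compiler. The class, enum, and method names are hypothetical, and `unused` stands in for `_`, which those compilers reject as a variable name. It only illustrates the compile-time check that switch expressions already perform; it is not a proposal for new syntax.

```java
public class TotalSwitchSketch {
    public enum Color { RED, GREEN }

    // Statement switch: partiality is allowed, so GREEN is silently skipped.
    public static String partial(Color c) {
        String result = "none";
        switch (c) {
            case RED:
                result = "red";
                break;
        }
        return result;
    }

    // Expression switch: the compiler insists on exhaustiveness, so deleting
    // the GREEN arm (or adding a BLUE constant) turns this into a compile error.
    public static String total(Color c) {
        StringBuilder log = new StringBuilder();
        boolean unused = switch (c) {   // the value is discarded on purpose
            case RED:
                log.append("red");
                yield false;
            case GREEN:
                log.append("green");
                yield false;
        };
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(partial(Color.GREEN)); // prints none
        System.out.println(total(Color.GREEN));   // prints green
    }
}
```

The cost of the trick is the dummy value that every arm has to yield, which is the boilerplate the options in this thread are trying to avoid.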

I also want to add that if we add things like guards, we may also want
this kind of switch to be exhaustive:

  int i = ...
  switch(i) {
    case i where i > 0: ...
  }

Rémi

From guy.steele at oracle.com  Sat Aug 22 16:24:42 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Sat, 22 Aug 2020 12:24:42 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com>
Message-ID: <80EC2A63-E38B-4A9C-9F3F-A0FA5DFD1292@oracle.com>

And added below is an Option 7.

> On Aug 21, 2020, at 8:32 PM, Guy Steele wrote:
>
>> On Aug 21, 2020, at 4:18 PM, Brian Goetz wrote:
>>
>> On 8/21/2020 11:14 AM, Brian Goetz wrote:
>>>
>>> Next up (separate topic): letting statement switches opt into totality.
>>>
>>
>> Assuming the discussion on Exhaustiveness is good, let's talk about totality.
>>
>> Expression switches must be total; we totalize them by throwing when we encounter any residue, even though we only require that the set of cases in the switch be optimistically total. Residue includes:
>>
>>  - `null` switch targets in String, Enum, and primitive box switches only;
>>  - novel values in enum switches without a total case clause;
>>  - novel subtypes in switches on sealed types without a total case clause;
>>  - when an optimistically total subchain of deconstruction pattern cases wraps a residue value (e.g., D(null) or D(novel))
>>
>> What about statement switches?
>> Right now, any residue for a statement switch without a total case clause will just be silently ignored (because statement switches need not be total.)
>>
>> What we would like is a way to say "this switch is total, please type check it for me as such, and insert any needed residue-catching cases." I think this is a job for `default`.
>>
>> Now that we've got some clarity that switches _don't_ throw on null, but instead it is as if string/enum/box switches have an implicit `case null` when no explicit one is present, we can define `default`, once again, to be total (and not just weakly total.) So in:
>>
>>     switch (object) {
>>         case "foo":
>>         case Box(Frog fs):
>>         default: ...
>>     }
>>
>> a `null` just falls into `default` just like anything else that is not the string "foo" or a box of frogs ("let the nulls flow"). Default would have to come last (except in legacy switches, where a legacy switch has one of the distinguished target types and all constant case labels.)
>>
>> What if we want to destructure too? Well, add a pattern:
>>
>>     switch (object) {
>>         case "foo":
>>         case Box(Frog fs):
>>         default Object o: ...
>>     }
>>
>> This would additionally assert that the following pattern is total, otherwise a compilation error ensues. (Note, though, that this is entirely about `switch`, not patterns. The semantics of the pattern is unchanged, and I do not believe that sprinkling `default` into nested patterns to shout "TOTALITY HERE, I MEAN IT" carries its weight.)
>>
>> This seems a better job to give default in this new world; anything not previously matched, where we retcon the current null behavior as being only about string, enum, or boxes.
>>
>> This leaves us with only one hole, which is: suppose I have an _optimistically total_ statement switch. Users might like to (a) assert the switch is total, and get the concomitant type checking, and (b) get residue ejection for free. Of the two, though, A is much more important than B, but we'll take B when we can get it.
>> Perhaps, if the target of a switch is a sealed type, we can interpret:
>>
>>     switch (shape) {
>>         case Rect r: ...
>>         default Circle c: ...
>>     }
>>
>> as meaning that `Circle c` _closes_ the switch to make it total, and engages the totality checking to ensure this is true. So, `default P` would mean either:
>>
>>  - P is total, or
>>  - P is not total, but taken with the other cases, makes the switch optimistically total
>>
>> and in the latter case, would engage the residue-detection-and-ejection machinery.
>>
>> This might be stretching it a tad too far, but I like that we can give `default` useful new jobs to do in `switch` rather than just giving him a gold watch.
>
> This is a pretty good story, but I am sufficiently distressed over the asymmetry of having to treat specially the last one of several otherwise completely symmetric and equal cases:
>
>     switch (color) {
>         case Red: ...
>         case Green: ...
>         default Blue: ...
>     }
>
> when I would much rather see
>
>     switch (color) {
>         case Red: ...
>         case Green: ...
>         case Blue: ...
>     }
>
> that I am going to explore several other design options, some of them more obviously terrible than others, in hopes of prompting someone else to have a brilliant idea.
>
> First of all, let me note that after Brian's detailed analysis about the treatment of `null`, the only real difficulty we face is compatibility with legacy switches on enum types. We missed an opportunity when enum was first introduced. I really hate to recommend an incompatible change to the language, but this message is just brainstorming, so:
>
> Option 1: If the type of the switch expression is an enum or a sealed type, then it is a static error if the patterns are not at least optimistically total. **This would be an incompatible change with respect to existing switches on enum types.**
>
> Option 2: If the type of the switch expression is a sealed type, then it is a static error if the patterns are not at least optimistically total.
> This treats enums and sealed types differently, but is compatible (as are all the other options I will list below).
>
> Option 3: If the type of the switch expression is a sealed type, then it is a static error if the patterns are not at least optimistically total. You can get the benefit of this feature when switching on an enum type by adding the keyword "sealed" to the declaration of the enum type.
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // Okay
>
>     sealed enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // static error: cases are not optimistically total
>
> Option 4: If the type of the switch expression is a sealed type, then it is a static error if the patterns are not at least optimistically total. You can get the benefit of this feature when switching on an enum type by adding the keyword "enum" to the switch statement.
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // Okay
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch enum (x) { RED: ... }    // static error: cases are not optimistically total
>
> Option 5: Expression switches must be total. So if you want a statement switch but want it to be total, convert it to an expression switch by writing "(void)" in front of it (and add a semicolon at the end).
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { RED: ... }    // Okay
>
>     enum Color { RED, GREEN }
>     Color x;
>     (void) switch (x) { RED: ... };    // static error: cases are not optimistically total
>
> (Yeah, I have glossed over a number of details here.)
>
> Option 6: The classic idiom for switching on an enum type looks like this example taken from the JLS:
>
>     switch (c) {
>         case PENNY: return CoinColor.COPPER;
>         case NICKEL: return CoinColor.NICKEL;
>         case DIME: case QUARTER: return CoinColor.SILVER;
>         default: throw new AssertionError("Unknown coin: " + c);
>     }
>
> The only really annoying thing about this is having to write (and read) the boilerplate code for constructing the error to be thrown.
> So how about this abbreviation:
>
>     switch (c) {
>         case PENNY: return CoinColor.COPPER;
>         case NICKEL: return CoinColor.NICKEL;
>         case DIME: case QUARTER: return CoinColor.SILVER;
>         default throw;
>     }
>
> The meaning of "default throw;" is that it is a static error if the case patterns are not optimistically total (and it reminds you that you will get some synthetic default cases that will throw an error if something goes wrong).

[Forgive me; I have realized that I omitted the keyword "case" in all the
case clauses in the previous examples.]

Option 7: If the switch expression is a cast expression, then it is a
static error if the patterns are not at least optimistically total.

    enum Color { RED, GREEN }
    Color x;
    switch (x) { case RED: ... }    // Okay

    enum Color { RED, GREEN }
    Color x;
    switch ((Color)x) { case RED: ... }    // static error: cases are not optimistically total

This idea works for _any_ type, not just sealed or enum types. If you
are switching on an int, then

    switch(v) { case 1: ... case 2: ... }

is fine, but

    switch((int)v) { case 1: ... case 2: ... }

will be a static error, and the only way to avoid such a static error
will be to include a default clause or the equivalent (such as
"case var z").

I hate to break (or even bend) the pure compositionality of expression
syntax, but this approach is likely to be backward compatible in practice
and is a fairly clear indication of what the programmer intends.

From brian.goetz at oracle.com  Sat Aug 22 17:14:15 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 22 Aug 2020 13:14:15 -0400
Subject: Letting the nulls flow (Was: Exhaustiveness)
In-Reply-To: <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr>
References: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr>
Message-ID:

Breaking into a separate thread.  I hope we can put this one to bed
once and for all.

> I'm not hostile to that view, but may I ask an honest question: why
> is this semantics better?
> Do you have examples where it makes sense to let the null slip
> through the statement switch? Because I can see why being null
> hostile is a good default; it follows the mottos "blow early, blow
> often" or "in case of doubt, throw".

Charitably, I think this approach is born of a belief that, if we keep
the nulls out by posting sentries at the door, we can live an interior
life unfettered by stray nulls.  But I think it is also time to
recognize that this approach to "block the nulls at the door" (a)
doesn't actually work, (b) creates sharp edges when the doors move
(which they do, through refactoring), and (c) pushes the problems
elsewhere.

(To illustrate (c), just look at the conversation about nulls in
patterns and switch we are having right now!  We all came to this
exercise thinking "switch is null-hostile, that's how it's always been,
that's how it must be", and are contorting ourselves to try to come up
with a consistent explanation.  But, if we look deeper, we see that
switch is *only accidentally* null-hostile, based on some highly
contextual decisions that were made when adding enum and autoboxing in
Java 5.
I'll talk more about that decision in a moment, but my point right
now is that we are doing a _lot_ of work to try to be consistent with an
arbitrary decision that was made in the past, in a specific and limited
context, and probably not with the greatest care.  Truly today's
problems come from yesterday's "solutions."  If we weren't careful, an
accidental decision about nulls in enum switch almost polluted the
semantics of pattern matching!  That would be terrible!  So let's stop
doing that, and let's stop creating new ways for our tomorrow's selves
to be painted into a corner.)

As background, I'll observe that every time a new context comes up,
someone suggests "we should make it null-hostile."  (Closely related: we
should make that new kind of variable immutable.)  And, nearly every
time, this ends up being the wrong choice.  This happened with Streams;
when we first wrestled with nulls in streams, someone pushed for "Just
have streams throw on null elements."  But this would have been
terrible; it would have meant that calculations on null-friendly
domains, that were prepared to engage null directly, simply could not
use streams in the obvious way; calculations like:

    Stream.of(arrayOfStuff)
          .map(Stuff::methodThatMightReturnNull)
          .filter(x -> x != null)
          .map(Stuff::doSomething)
          .collect(toList())

would not be directly expressible, because we would have already NPEed.
Sure, there are workarounds, but for what?  Out of a naive hope that, if
we inject enough null checks, no one will ever have to deal with null?
Out of irrational hatred for nulls?  Nothing good comes from either of
these motivations.

But, this episode wasn't over.  It was then suggested "OK, we can't NPE,
but how about we filter the nulls?"  Which would have been worse.
It would mean that, for example, doing a map+toArray on an array might
not have the same size as the initial array -- which would violate what
should be a pretty rock-solid intuition.  It would kill all the
pre-sized-array optimizations.  It would mean `zip` would have no useful
semantics.  Etc etc.

In the end, we came to the right answer for streams, which is "let the
nulls flow".  And this was the right choice because Streams is
general-purpose plumbing.  The "blow early" bias is about guarding the
gates, and thereby hopefully keeping the nulls from getting into the
house and having wild null parties at our expense.

And this works when the gates are few, fixed, and well marked.  But if
your language exhibits any compositional mechanisms (which is our best
tool), then what was the front door soon becomes the middle of the
hallway after a trivial refactoring -- which means that no refactorings
are really trivial.  Oof.

We already went through a good example recently where it would be
foolish to try to exclude null (and yet we tried anyway) --
deconstruction patterns.  If a constructor

    new Foo(x)

can accept null, then a deconstructor

    case Foo(var x)

should dutifully serve up that null.  The guard-the-gates brigade tried
valiantly to put up new gates at each deconstructor, but that would have
been a foolish place to put such a boundary.  I offered an analogy to
having deconstruction reject null over on amber-dev:

> In languages with side-effects (like Java), not all aggregation
> operations are reversible; if I bake a pie, I can't later recover the
> apples and the sugar.  But many are, and we like abstractions like
> these (collections, Optional, stream, etc) because they are very
> useful and easily reasoned about.  So those that are, should commit to
> the principle.  It would be OK for a list implementation to behave
> like this:
>
>     Listy list = new Listy();
>     list.add(null) // throws NPE
>
> because a List is free to express constraints on its domain.  But it
> would be exceedingly bizarre for a list implementation to behave like
> this:
>
>     Listy list = new Listy();
>     list.add(3);     // ok, I like ints
>     list.add(null); // ok, I like nulls too
>     assertTrue(list.size() == 2);   // ok
>     assertTrue(list.get(0) == 3); // ok
>     assertTrue(list.get(1) == null);  // NPE!
>
> If the list takes in nulls, it should give them back.

Now, this is like the first suggested form of null-hostility in streams,
and to everyone's credit, no one suggested exactly that, but what was
suggested was the second, silent form of hostility -- just pretend you
don't see the nulls.  And, like with streams, that would have been
silly.  So, OK, we dodged the bullet of infecting patterns with special
nullity rules.  Whew.

Now, switch.  As I mentioned, I think we're here mostly because we are
perpetuating the null biases of the past.  In Java 1.0, switches were
only over primitives, so there was no question about nulls.  In Java 5,
we added two new reference-typed switch targets: enums and boxes.  I
wasn't in the room when that decision was made, but I can imagine how it
went: Java 5 was a *very* full release, and under dramatic pressure to
get out the door.  The discussion came up about nulls, maybe someone
even suggested `case null` back then.  And I'm sure the answer was some
form of "null enums and primitive boxes are almost always bugs, let's
not bend over backwards and add new complexity to the language (case
null) just to accommodate this bug, let's just throw NPE."

And, given how limited switch was, and the special characteristics of
enums and boxes, this was probably a pragmatic decision, but I think we
lost sight of the subtleties of the context.  It is almost certainly
right that 99.999% of the time, a null enum or box is a bug.  But this
is emphatically not true when we broaden the type to Object.
Since the context and conditions change, the decision should be
revisited before copying it to other contexts.

In Java 7, when we added switching on strings, I do remember the
discussion about nulls; it was mostly about "well, there's a precedent,
and it's not worth breaking the precedent even if null strings are more
common than null Integers, and besides, the mandate of Project Coin is
very limited, and `case null` would probably be out of scope."  While
this may have again been a pragmatic choice at the time given the
constraints, it further set us down a slippery slope where the
assumption that "switches always throw on null" is set in concrete.  But
this assumption is not founded on solid ground.

So, the better way to approach this is to imagine Java had no switch,
and we were adding a general switch today.  Would we really be
advocating so hard for "Oooh, another door we can guard, let's stick it
to the nulls there too"?  (And, even if we were tempted to, should we?)

The plain fact is that we got away with null-hostility in the first
three forms of reference types in switch because switch (at the time)
was such a weak and non-compositional mechanism, and there are darn few
things it can actually do well.  But, if we were designing a
general-purpose switch, with rich labels and enhanced control flow
(e.g., guards) as we are today, where we envisioned refactoring between
switches on nested patterns and patterns with nested switches, this
would be more like a general plumbing mechanism, like streams, and when
plumbing has an opinion about the nulls, frantic calls to the plumber
are not far behind.  The nulls must flow unimpeded, because otherwise,
we create new anomalies and blockages like the streams examples I gave
earlier and refactoring surprises.
And having these anomalies doesn't really make life any better for the
users -- it actually makes everything just less predictable, because it
means simple refactorings are not simple -- and in a way that is very
easy to forget about.

If we really could keep the nulls out at the front gate, and thus define
a clear null-free domain to work in, then I would be far more
sympathetic to the calls of "new gates, new guards!"  But the gates
approach just doesn't work, and we have ample evidence of this.  And the
richer and more compositional we make the language, the more sharp edges
this creates, because old interiors become new gates.

So, back to the case at hand (though we should bring the specifics back
to the case-at-hand thread): what's happening here is our baby switch is
growing up into a general purpose mechanism.  And, we should expect it
to take on responsibilities suited to its new abilities.

Now, for the backlash.  Whenever we make an argument for
what-appears-to-be relaxing an existing null-hostility, there is much
concern about how the nulls will run free and wreak havoc.  But, let's
examine that more closely.

The concern seems to be that, if we let the null through the gate,
we'll just get more NPEs, at worse places.  Well, we can't get more
NPEs; at most, we can get exactly the same number.  But in reality, we
will likely get fewer.  There are three cases.

1.  The domain is already null-free.  In this case, it doesn't make a
difference; no NPEs before, none after.

2.  The domain is mostly null-free, but nulls do creep in, we see them
as bugs, and we are happy to get notified.  This is the case today with
enums, where a null enum is almost always a bug.  Yes, in cases like
this, not guarding the gates means that the bug will get further before
it is detected, or might go undetected.  This isn't fantastic, but this
also isn't a disaster, because it is rare and it is still likely to be
detected eventually.

3.  The domain is at least partially null tolerant.  Here, we are moving
an always-throw at the gates to a
might-throw-in-the-guts-if-you-forget.  But also, there are plenty of
things you can do with a null binding that don't NPE, such as pass it to
a method that deals sensibly with nulls, add it to an ArrayList, print
it, etc.  This is a huge improvement, from "must treat null in a
special, out of band way" to "treat null uniformly."  At worst, it is no
worse, and often better.

And, when it comes to general purpose domains, #3 is much bigger than
#2.  So I think we have to optimize for #3.

Finally, there are those who argue we should "just" have nullable types
(T? and T!), and then all of this goes away.  I would love to get there,
but it would be a very long road.  But let's imagine we do get there.
OMG how terrible it would be when constructs like lambdas, switches, or
patterns willfully try to save us from the nulls, thus doing the job
(badly) of the type system!  We'd have explicitly nullable types for
which some constructs NPE anyway.  Or, we'd have to redefine the
semantics of everything in complex ways based on whether the underlying
input types are nullable or not.  We would feel pretty stupid for having
created new corners to paint ourselves into.

Our fears of untamed nulls wantonly running through the streets are
overblown.  Our attempts to contain the nulls through ad-hoc
gate-guarding have all been failures.  Let the nulls flow.
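
A minimal sketch of the streams behavior described above, using only the standard `java.util.stream` API. The class `NullsFlowSketch` and the helper `nicknameOrNull` are hypothetical stand-ins for `Stuff::methodThatMightReturnNull`; the point is only that `map` passes nulls along, so the pipeline author chooses where (or whether) to filter them.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NullsFlowSketch {
    // Hypothetical null-friendly lookup: yields null when there is no nickname.
    public static String nicknameOrNull(String name) {
        return name.startsWith("B") ? name.substring(0, 1) : null;
    }

    // The nulls flow through map(); the pipeline itself decides to drop them.
    public static List<String> nicknames(String... names) {
        return Stream.of(names)
                .map(NullsFlowSketch::nicknameOrNull)
                .filter(Objects::nonNull)
                .map(String::toLowerCase)
                .collect(Collectors.toList());
    }

    // map + toArray keeps one slot per input element, null or not.
    public static Object[] mapped(String... names) {
        return Stream.of(names)
                .map(NullsFlowSketch::nicknameOrNull)
                .toArray();
    }

    public static void main(String[] args) {
        System.out.println(nicknames("Brian", "Guy", "Bob")); // prints [b, b]
        System.out.println(mapped("Brian", "Guy").length);    // prints 2
    }
}
```

Had streams thrown on null elements, the first pipeline would fail before reaching `filter`; had they silently dropped nulls, the array in the second would come back shorter than its input.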

From brian.goetz at oracle.com  Sat Aug 22 18:06:13 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 22 Aug 2020 14:06:13 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com>
Message-ID: <70e4f240-3a6d-bf85-1555-2e5060e48911@oracle.com>

>> This might be stretching it a tad too far, but I like that we can
>> give `default` useful new jobs to do in `switch` rather than just
>> giving him a gold watch.
>
> This is a pretty good story, but I am sufficiently distressed over the
> asymmetry of having to treat specially the last one of several
> otherwise completely symmetric and equal cases:

Yeah, you should be distressed.  It is stretching it too far.  I think
the first version of `default` -- total default with destructuring --
does work, and is a good new job for default.  But it doesn't quite
stretch to address the issue that comes up from sealing.  So I'll put
that in the "good try" bucket.

> First of all, let me note that after Brian's detailed analysis about
> the treatment of `null`, the only real difficulty we face is
> compatibility with legacy switches on enum types.  We missed an
> opportunity when enum was first introduced.  I really hate to
> recommend an incompatible change to the language, but this message is
> just brainstorming, so:
>
> Option 1: If the type of the switch expression is an enum or a sealed
> type, then it is a static error if the patterns are not at least
> optimistically total.
> **This would be an incompatible change with
> respect to existing switches on enum types.**

Actually, as stated, this is not inconsistent -- for switch
*expressions*.  We added switch expressions in Java 12 and required that
they be total, and, when the target is an enum, we nodded to optimistic
totality, by not requiring a `default` when the cases were
optimistically total.  Where we don't have an equivalent story is for
switch *statements*, which have always been partial -- and for which
partiality is reasonable (just as an `if` without an `else` is
reasonable.)  And the story for optimistic totality does not scale quite
as well as we'd hoped to sealed types, for a few reasons:

 - Assuming `null` is always a mistake is an OK move for enums, but
seems questionable for generalized sealed types;
 - If we want to lift optimistic totality through deconstruction
patterns (Box(Head), Box(Tail) o.t. on Box), the shape of the residue
gets complicated.

Rémi said:

> If we go down the route of saying that switches on enum, string and
> box are special because they are null-hostile,
> why not go a step further and say that, apart from those switches and
> the switches on primitive types, all other switches should be total? So
> obviously an expression switch should be total, but a statement switch
> should be total too.

I see the attractiveness of this argument, but I don't think we can be
that cavalier, for two main reasons:

 - There is a vaguely principled reason why these specific switches have
special nullity behavior, which doesn't scale to general switches.  So I
think we can get away with "sealing" off the nullity behavior to the
legacy cases, but only because the legacy cases actually have some
nullity-relevance and the general case does not.

 - Partial statement switches are a totally reasonable thing, even on
enums and sealed types!  They are the switch equivalent of an `if`
without an `else`.
Which is an entirely reasonable thing to want to do, and preventing
people from doing so is probably a cure worse than the disease.

In other words, having spent some time analyzing the history and
assumptions, we see that the general nullity-behavior for a non-limited
switch should be permissive (as argued on the other thread) and the
current behavior a special case, and it is therefore reasonable to try
to "seal" the null behavior off in a corner.  But it is not the case
that the general totality behavior for switches should be "always
total"; partial switches are fine, and there are lots of examples of
such.

This argument reminds me of another switch oddity: fallthrough.
Fallthrough is not a wrong feature; the wrong feature was fallthrough BY
DEFAULT.  Similarly, partial switch is not a wrong feature; the wrong
feature is "there's no way to engage totality checking."

> And now we only need to solve the problem of enums inside a statement
> switch. Here I disagree with Brian that it's a job for "default"; as a
> developer, I want the compiler to emit an error at compile time, not at
> runtime.

You have misunderstood my proposal, then.  The errors would be at
compile time, except for the residue (which has always been at runtime,
it's just the residue is getting bigger.)

> I also want to add that if we add things like guards, we may also want
> this kind of switch to be exhaustive:
>   int i = ...
>   switch(i) {
>     case i where i > 0: ...
>   }

Allow me to introduce my friend, Dr. Halting!  It is only the most
trivial kinds of guards we can analyze for (optimistic) totality; it is
my belief (though we should have that discussion (on another thread!))
that if we try to do any analysis of guard conditions here, it will be
worse than if we try to do none.

From guy.steele at oracle.com  Sat Aug 22 18:26:19 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Sat, 22 Aug 2020 14:26:19 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <70e4f240-3a6d-bf85-1555-2e5060e48911@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com> <70e4f240-3a6d-bf85-1555-2e5060e48911@oracle.com>
Message-ID:

> On Aug 22, 2020, at 2:06 PM, Brian Goetz wrote:
> . . .
>>
>> Option 1: If the type of the switch expression is an enum or a sealed type, then it is a static error if the patterns are not at least optimistically total. **This would be an incompatible change with respect to existing switches on enum types.**
>
> Actually, as stated, this is not inconsistent -- for switch *expressions*. We added switch expressions in Java 12 and required that they be total, and, when the target is an enum, we nodded to optimistic totality, by not requiring a `default` when the cases were optimistically total. Where we don't have an equivalent story is for switch *statements*, which have always been partial -- and for which partiality is reasonable (just as an `if` without an `else` is reasonable.) And the story for optimistic totality does not scale quite as well as we'd hoped to sealed types, for a few reasons:

Yes, this was an error on my part; I meant to write "Option 1: If the
type of the switch statement is . . ."

From guy.steele at oracle.com Sat Aug 22 22:38:13 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Sat, 22 Aug 2020 18:38:13 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <80EC2A63-E38B-4A9C-9F3F-A0FA5DFD1292@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com> <80EC2A63-E38B-4A9C-9F3F-A0FA5DFD1292@oracle.com>
Message-ID:

Option 8: The statement "switch case (x) { ... }" is like "switch (x) { ... }" but insists that the value x be handled by some case clause. The switch body cannot contain a default clause (static error if it does), and it's impossible for the switch statement to silently do nothing. It's a static error if the set of case patterns is not at least optimistically total, and you get residue checking.

    enum Color { RED, GREEN }
    Color x;
    switch (x) { case RED: ... }        // Okay

    enum Color { RED, GREEN }
    Color x;
    switch case (x) { case RED: ... }   // static error: cases are not optimistically total

Note that you can still use int and String types, but because default clauses are forbidden, you have to use a total pattern instead:

    switch case (myString.length()) {
      case 2: case 3: case 5: case 7: primeSquawk();
      case 4: case 9: squareSquawk();
      case int n: squawk(n);
    }

I think this option clearly dominates options 4 and 7 ("switch enum (x)" and "switch ((Color) x)").

Note that it's not completely redundant to allow "switch case" expressions as well (which would ease refactoring), but the only extra constraint added by "switch case" is that a default clause cannot appear.

If this option were adopted, I suspect it would quickly become idiomatic to use "switch case"
on enums and many sealed types, and to use "switch" with a "default" clause in most other cases.

From amaembo at gmail.com Sun Aug 23 03:46:35 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 23 Aug 2020 10:46:35 +0700
Subject: Letting the nulls flow (Was: Exhaustiveness)
In-Reply-To:
References: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hello!

Some data from the current IntelliJ IDEA codebase.

We have 64 occurrences of this code pattern:
if($x$ == null) {...} // presumably completes abruptly
switch($x$) {...}
Roughly half of them are enum switches and the other half are string switches.

Also, we have 29 occurrences of this code pattern:
if($x$ != null) {
  switch($x$) { ... }
  ...
}

Also, we have one occurrence of this code pattern:
if($x$ == null) {...
} else {
  switch($x$) {...}
}

All of them could benefit from a null-friendly switch. Btw, often the null branch is the same as the default branch (or some other non-null branch).

With best regards,
Tagir Valeev

On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz wrote:
>
> Breaking into a separate thread.  I hope we can put this one to bed
> once and for all.
>
> > I'm not hostile to that view, but may i ask an honest question, why
> > this semantics is better ?
> > Do you have examples where it makes sense to let the null to slip
> > through the statement switch ? Because as i can see why being null
> > hostile is a good default, it follows the mottos "blow early, blow
> > often" or "in case of doubt throws".
>
> Charitably, I think this approach is born of a belief that, if we keep
> the nulls out by posting sentries at the door, we can live an interior
> life unfettered by stray nulls.
> But I think it is also time to
> recognize that this approach to "block the nulls at the door" (a)
> doesn't actually work, (b) creates sharp edges when the doors move
> (which they do, through refactoring), and (c) pushes the problems
> elsewhere.
>
> (To illustrate (c), just look at the conversation about nulls in
> patterns and switch we are having right now!  We all came to this
> exercise thinking "switch is null-hostile, that's how it's always been,
> that's how it must be", and are contorting ourselves to try to come up
> with a consistent explanation.  But, if we look deeper, we see that
> switch is *only accidentally* null-hostile, based on some highly
> contextual decisions that were made when adding enum and autoboxing in
> Java 5.  I'll talk more about that decision in a moment, but my point
> right now is that we are doing a _lot_ of work to try to be consistent
> with an arbitrary decision that was made in the past, in a specific and
> limited context, and probably not with the greatest care.  Truly today's
> problems come from yesterday's "solutions."  If we weren't careful, an
> accidental decision about nulls in enum switch almost polluted the
> semantics of pattern matching!  That would be terrible!  So let's stop
> doing that, and let's stop creating new ways for our tomorrow's selves
> to be painted into a corner.)
>
> As background, I'll observe that every time a new context comes up,
> someone suggests "we should make it null-hostile."  (Closely related: we
> should make that new kind of variable immutable.)  And, nearly every
> time, this ends up being the wrong choice.  This happened with Streams;
> when we first wrestled with nulls in streams, someone pushed for "Just
> have streams throw on null elements."
> But this would have been
> terrible; it would have meant that calculations on null-friendly
> domains, that were prepared to engage null directly, simply could not
> use streams in the obvious way; calculations like:
>
>     Stream.of(arrayOfStuff)
>           .map(Stuff::methodThatMightReturnNull)
>           .filter(x -> x != null)
>           .map(Stuff::doSomething)
>           .collect(toList())
>
> would not be directly expressible, because we would have already NPEed.
> Sure, there are workarounds, but for what?  Out of a naive hope that, if
> we inject enough null checks, no one will ever have to deal with null?
> Out of irrational hatred for nulls?  Nothing good comes from either of
> these motivations.
>
> But, this episode wasn't over.  It was then suggested "OK, we can't NPE,
> but how about we filter the nulls?"  Which would have been worse.  It
> would mean that, for example, doing a map+toArray on an array might not
> yield a result of the same size as the initial array -- which would
> violate what should be a pretty rock-solid intuition.  It would kill all
> the pre-sized-array optimizations.  It would mean `zip` would have no
> useful semantics.  Etc etc.
>
> In the end, we came to the right answer for streams, which is "let the
> nulls flow".  And this was the right choice because Streams is
> general-purpose plumbing.  The "blow early" bias is about guarding the
> gates, and thereby hopefully keeping the nulls from getting into the
> house and having wild null parties at our expense.  And this works when
> the gates are few, fixed, and well marked.  But if your language
> exhibits any compositional mechanisms (which is our best tool), then
> what was the front door soon becomes the middle of the hallway after a
> trivial refactoring -- which means that no refactorings are really
> trivial.  Oof.
>
> We already went through a good example recently where it would be
> foolish to try to exclude null (and yet we tried anyway) --
> deconstruction patterns.
> If a constructor
>
>     new Foo(x)
>
> can accept null, then a deconstructor
>
>     case Foo(var x)
>
> should dutifully serve up that null.  The guard-the-gates brigade tried
> valiantly to put up new gates at each deconstructor, but that would have
> been a foolish place to put such a boundary.  I offered an analogy to
> having deconstruction reject null over on amber-dev:
>
> > In languages with side-effects (like Java), not all aggregation
> > operations are reversible; if I bake a pie, I can't later recover the
> > apples and the sugar.  But many are, and we like abstractions like
> > these (collections, Optional, stream, etc) because they are very
> > useful and easily reasoned about.  So those that are, should commit to
> > the principle.  It would be OK for a list implementation to behave
> > like this:
> >
> >     Listy list = new Listy();
> >     list.add(null); // throws NPE
> >
> > because a List is free to express constraints on its domain.  But it
> > would be exceedingly bizarre for a list implementation to behave like
> > this:
> >
> >     Listy list = new Listy();
> >     list.add(3); // ok, I like ints
> >     list.add(null); // ok, I like nulls too
> >     assertTrue(list.size() == 2); // ok
> >     assertTrue(list.get(0) == 3); // ok
> >     assertTrue(list.get(1) == null); // NPE!
> >
> > If the list takes in nulls, it should give them back.
>
> Now, this is like the first suggested form of null-hostility in streams,
> and to everyone's credit, no one suggested exactly that, but what was
> suggested was the second, silent form of hostility -- just pretend you
> don't see the nulls.  And, like with streams, that would have been
> silly.  So, OK, we dodged the bullet of infecting patterns with special
> nullity rules.  Whew.
>
> Now, switch.  As I mentioned, I think we're here mostly because we are
> perpetuating the null biases of the past.  In Java 1.0, switches were
> only over primitives, so there was no question about nulls.
> In Java 5,
> we added two new reference-typed switch targets: enums and boxes.  I
> wasn't in the room when that decision was made, but I can imagine how it
> went: Java 5 was a *very* full release, and under dramatic pressure to
> get out the door.  The discussion came up about nulls, maybe someone
> even suggested `case null` back then.  And I'm sure the answer was some
> form of "null enums and primitive boxes are almost always bugs, let's
> not bend over backwards and add new complexity to the language (case
> null) just to accommodate this bug, let's just throw NPE."
>
> And, given how limited switch was, and the special characteristics of
> enums and boxes, this was probably a pragmatic decision, but I think we
> lost sight of the subtleties of the context.  It is almost certainly
> right that 99.999% of the time, a null enum or box is a bug.  But this
> is emphatically not true when we broaden the type to Object.  Since the
> context and conditions change, the decision should be revisited before
> copying it to other contexts.
>
> In Java 7, when we added switching on strings, I do remember the
> discussion about nulls; it was mostly about "well, there's a precedent,
> and it's not worth breaking the precedent even if null strings are more
> common than null Integers, and besides, the mandate of Project Coin is
> very limited, and `case null` would probably be out of scope."  While
> this may have again been a pragmatic choice at the time given the
> constraints, it further set us down a slippery slope where the
> assumption that "switches always throw on null" is set in concrete.  But
> this assumption is not founded on solid ground.
>
> So, the better way to approach this is to imagine Java had no switch,
> and we were adding a general switch today.  Would we really be
> advocating so hard for "Oooh, another door we can guard, let's stick it
> to the nulls there too"?  (And, even if we were tempted to, should we?)
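
To make the behavior under discussion concrete: a classic statement switch over an enum or a String dereferences its operand, so a null operand throws NullPointerException before any case is considered. The sketch below shows this, together with the "test for null first" idiom counted in the IntelliJ IDEA figures earlier in this message. The Color enum and all method names are hypothetical, chosen only for illustration.

```java
// Illustrative only: Color and the method names are hypothetical.
class NullSwitchDemo {
    enum Color { RED, GREEN }

    // Classic enum switch: a null operand throws before any case is examined.
    static boolean throwsOnNullEnum(Color c) {
        try {
            switch (c) {
                case RED: return false;
                default:  return false;
            }
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Classic String switch (Java 7 and later): same null-hostile behavior.
    static boolean throwsOnNullString(String s) {
        try {
            switch (s) {
                case "a": return false;
                default:  return false;
            }
        } catch (NullPointerException e) {
            return true;
        }
    }

    // The guard idiom: handle null out of band, then switch on the rest.
    static String describe(Color c) {
        if (c == null) {
            return "none";
        }
        switch (c) {
            case RED:   return "red";
            case GREEN: return "green";
            default:    return "other";
        }
    }
}
```

The guard in describe is the extra, out-of-band work that a null-friendly switch would let the programmer fold into an ordinary case.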
>
> The plain fact is that we got away with null-hostility in the first
> three forms of reference types in switch because switch (at the time)
> was such a weak and non-compositional mechanism, and there were darn few
> things it could actually do well.  But, if we were designing a
> general-purpose switch, with rich labels and enhanced control flow
> (e.g., guards) as we are today, where we envisioned refactoring between
> switches on nested patterns and patterns with nested switches, this
> would be more like a general plumbing mechanism, like streams, and when
> plumbing has an opinion about the nulls, frantic calls to the plumber
> are not far behind.  The nulls must flow unimpeded, because otherwise,
> we create new anomalies and blockages like the streams examples I gave
> earlier, as well as refactoring surprises.  And having these anomalies
> doesn't really make life any better for the users -- it actually makes
> everything just less predictable, because it means simple refactorings
> are not simple -- and in a way that is very easy to forget about.
>
> If we really could keep the nulls out at the front gate, and thus define
> a clear null-free domain to work in, then I would be far more
> sympathetic to the calls of "new gates, new guards!"  But the gates
> approach just doesn't work, and we have ample evidence of this.  And the
> richer and more compositional we make the language, the more sharp edges
> this creates, because old interiors become new gates.
>
> So, back to the case at hand (though we should bring the specifics back
> to the case-at-hand thread): what's happening here is our baby switch is
> growing up into a general purpose mechanism.  And, we should expect it
> to take on responsibilities suited to its new abilities.
>
> Now, for the backlash.  Whenever we make an argument for
> what-appears-to-be relaxing an existing null-hostility, there is much
> concern about how the nulls will run free and wreak havoc.  But, let's
> examine that more closely.
>
> The concern seems to be that, if we let the null through the gate,
> we'll just get more NPEs, at worse places.  Well, we can't get more
> NPEs; at most, we can get exactly the same number.  But in reality, we
> will likely get fewer.  There are three cases.
>
> 1.  The domain is already null-free.  In this case, it doesn't make a
> difference; no NPEs before, none after.
>
> 2.  The domain is mostly null-free, but nulls do creep in, we see them
> as bugs, and we are happy to get notified.  This is the case today with
> enums, where a null enum is almost always a bug.  Yes, in cases like
> this, not guarding the gates means that the bug will get further before
> it is detected, or might go undetected.  This isn't fantastic, but this
> also isn't a disaster, because it is rare and it is still likely that it
> will get detected eventually.
>
> 3.  The domain is at least partially null tolerant.  Here, we are moving
> an always-throw at the gates to a
> might-throw-in-the-guts-if-you-forget.  But also, there are plenty of
> things you can do with a null binding that don't NPE, such as pass it to
> a method that deals sensibly with nulls, add it to an ArrayList, print
> it, etc.  This is a huge improvement, from "must treat null in a
> special, out of band way" to "treat null uniformly."  At worst, it is no
> worse, and often better.
>
> And, when it comes to general purpose domains, #3 is much bigger than
> #2.  So I think we have to optimize for #3.
>
> Finally, there are those who argue we should "just" have nullable types
> (T? and T!), and then all of this goes away.  I would love to get there,
> but it would be a very long road.  But let's imagine we do get there.
> OMG how terrible it would be when constructs like lambdas, switches, or
> patterns willfully try to save us from the nulls, thus doing the job
> (badly) of the type system!  We'd have explicitly nullable types for
> which some constructs NPE anyway.
> Or, we'd have to redefine the
> semantics of everything in complex ways based on whether the underlying
> input types are nullable or not.  We would feel pretty stupid for having
> created new corners to paint ourselves into.
>
> Our fears of untamed nulls wantonly running through the streets are
> overblown.  Our attempts to contain the nulls through ad-hoc
> gate-guarding have all been failures.  Let the nulls flow.
>

From forax at univ-mlv.fr Sun Aug 23 15:25:29 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 23 Aug 2020 17:25:29 +0200 (CEST)
Subject: switch on Class ?
Message-ID: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr>

There is a feature of Pizza (remember Generic Java ++) we have not discussed yet, being able to do a switch on Class.

public sealed interface Numeric<T extends Numeric<T>>
    permits Amount, Percentage, Quantity {

  private BigDecimal value() {
    return switch(this) {
      case Amount(value) -> value;
      case Percentage(value) -> value;
      case Quantity(value) -> value;
    };
  }

  private static <T extends Numeric<T>> T fromValue(Class<T> type, BigDecimal newValue) {
    return type.cast(switch(type) {
      case Amount.class -> new Amount(newValue);
      case Percentage.class -> new Percentage(newValue);
      case Quantity.class -> new Quantity(newValue);
    });
  }

  default T add(T numeric) { return fromValue(getClass(), value().add(numeric.value())); }
}

with Amount being declared like this
record Amount(BigDecimal value) implements Numeric<Amount> { }

This kind of switch is interesting because it's also one that can be exhaustive, like the switch on type or the switch on Enum.

In the method fromValue, type is typed as a Class<T> so a Class<? extends Numeric<...>> and given that Numeric is a sealed class only permitting Amount, Percentage and Quantity, the only possible Class for a switch(type) are Amount.class, Percentage.class and Quantity.class.

I'm pretty sure the call fromValue(getClass(), ...) does not compile because the compiler has no idea that all subtypes of Numeric<T> implement Numeric<T>, but you get the idea.
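
For comparison, here is one way the Numeric example could be approximated without a switch on Class, assuming a Java version in which records and sealed types are final features (Java 17 or later). This is only a sketch under stated assumptions, not the proposed feature: the type() accessor, the chain of equals comparisons, and the enclosing NumericDemo class are additions made here for illustration. The final throw stands in for the exhaustiveness that a switch on Class could let the compiler check.

```java
import java.math.BigDecimal;

// Illustrative sketch only. The type() accessor and the equals-based dispatch
// are assumptions added here; they are not part of the proposal.
class NumericDemo {
    sealed interface Numeric<T extends Numeric<T>> permits Amount, Percentage, Quantity {
        BigDecimal value();

        // Hypothetical helper: each record reports its own class, which
        // sidesteps the getClass() typing problem mentioned in the message.
        Class<T> type();

        default T add(T other) {
            return fromValue(type(), value().add(other.value()));
        }
    }

    record Amount(BigDecimal value) implements Numeric<Amount> {
        public Class<Amount> type() { return Amount.class; }
    }

    record Percentage(BigDecimal value) implements Numeric<Percentage> {
        public Class<Percentage> type() { return Percentage.class; }
    }

    record Quantity(BigDecimal value) implements Numeric<Quantity> {
        public Class<Quantity> type() { return Quantity.class; }
    }

    // Stand-in for the proposed switch on Class. The compiler cannot verify
    // that the three comparisons are exhaustive, hence the final throw.
    static <T extends Numeric<T>> T fromValue(Class<T> type, BigDecimal newValue) {
        Object result;
        if (type.equals(Amount.class)) {
            result = new Amount(newValue);
        } else if (type.equals(Percentage.class)) {
            result = new Percentage(newValue);
        } else if (type.equals(Quantity.class)) {
            result = new Quantity(newValue);
        } else {
            throw new IllegalArgumentException("unexpected class: " + type);
        }
        return type.cast(result);
    }
}
```

With this sketch, adding two Amount values yields an Amount whose value is the BigDecimal sum of the two operands.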
regards, R?mi From brian.goetz at oracle.com Sun Aug 23 15:40:05 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 23 Aug 2020 11:40:05 -0400 Subject: switch on Class ? In-Reply-To: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr> References: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr> Message-ID: It has indeed come up before.? There is some overlap with pattern switch, and some non-overlap, but it's pretty clear the impact of pattern switch is much larger. I would much prefer to finish the discussions on the fundamentals first, which are actually blocking progress on a much-higher-priority feature.? So let's come back to this later. I also have reason to believe that, if we do generalized patterns property, we won't need to do this as a language feature, we can do it as a library feature.? So, let's come back to this later. On 8/23/2020 11:25 AM, Remi Forax wrote: > There is a feature of Pizza (remember Generic Java ++) we have not discussed yet, > being able to do a switch on Class. > > public sealed interface Numeric> > permits Amount, Percentage, Quantity { > > private BigDecimal value() { > return switch(this) { > case Amount(value) -> value; > case Percentage(value) -> value; > case Quantity(value) -> value; > }; > } > > private static > T fromValue(Class type, BigDecimal newValue) { > return type.cast(switch(type) { > case Amount.class -> new Amount(newValue); > case Percentage.class -> new Percentage(newValue); > case Quantity.class -> new Quantity(newValue); > }); > } > > default T add(T numeric) { return fromValue(getClass(), value().add(numeric.value())); } > } > > with Amount be declared like this > record Amount(BigDecimal value) implements Numeric { } > > > This kind of switch is interesting because it's also one that can be exhaustive, like the switch on type or the switch on Enum. 
> > In the method fromValue, type is typed as a Class so a Class> and given that Numeric is a sealed class only permitting Amount, Percentage and Quantity, the only possible Class for a switch(type) are Amount.class, Percentage.class and Quantity.class. > > I'm pretty sure the call fromValue(getClass(), ...) doesn't not compile because the compiler has no idea that all subtypes of Numeric implements Numeric but you get the idea. > > regards, > R?mi -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Sun Aug 23 15:43:03 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 23 Aug 2020 11:43:03 -0400 Subject: Letting the nulls flow (Was: Exhaustiveness) In-Reply-To: References: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr> Message-ID: <09ce416d-ae78-39c0-efc5-fb8eba566595@oracle.com> Thanks, Tagir -- this is a perfect example of what I meant yesterday by how the "blow early, blow often" approach is a false promise.? It just means that responsible programmers who need to deal with null as a fact-of-life have to do *extra* work (which is therefore more duplicative or error-prone) to deal with it. On 8/22/2020 11:46 PM, Tagir Valeev wrote: > Hello! > > Some data from the current IntelliJ IDEA codebase > > We have 64 occurrences of this code pattern > if($x$ == null) {...} // presumably completes abruptly > switch($x) {...} > Roughly half of them are enum switches and the other half is string switches > > Also, we have 29 occurrences of this code pattern: > if($x$ != null) { > switch($x$) { ... } > ... > } > > Also, we have one occurrence of this code pattern: > if($x$ == null) {... 
> } else { > switch($x) {...} > } > > All of them could benefit from null-friendly switch. Btw often null > branch is the same as default branch (or some other non-null branch). > > With best regards, > Tagir Valeev > > On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz wrote: >> Breaking into a separate thread. I hope we can put this one to bed >> once and for all. >> >>> I'm not hostile to that view, but may i ask an honest question, why >>> this semantics is better ? >>> Do you have examples where it makes sense to let the null to slip >>> through the statement switch ? Because as i can see why being null >>> hostile is a good default, it follows the motos "blow early, blow >>> often" or "in case of doubt throws". >> Charitably, I think this approach is borne of a belief that, if we keep >> the nulls out by posting sentries at the door, we can live an interior >> life unfettered by stray nulls. But I think it is also time to >> recognize that this approach to "block the nulls at the door" (a) >> doesn't actually work, (b) creates sharp edges when the doors move >> (which they do, though refactoring), and (c) pushes the problems elsewhere. >> >> (To illustrate (c), just look at the conversation about nulls in >> patterns and switch we are having right now! We all came to this >> exercise thinking "switch is null-hostile, that's how it's always been, >> that's how it must be", and are contorting ourselves to try to come up >> with a consistent explanation. But, if we look deeper, we see that >> switch is *only accidentally* null-hostile, based on some highly >> contextual decisions that were made when adding enum and autoboxing in >> Java 5. I'll talk more about that decision in a moment, but my point >> right now is that we are doing a _lot_ of work to try to be consistent >> with an arbitrary decision that was made in the past, in a specific and >> limited context, and probably not with the greatest care. Truly today's >> problems come from yesterdays "solutions." 
If we weren't careful, an >> accidental decision about nulls in enum switch almost polluted the >> semantics of pattern matching! That would be terrible! So let's stop >> doing that, and let's stop creating new ways for our tomorrow's selves >> to be painted into a corner.) >> >> >> As background, I'll observe that every time a new context comes up, >> someone suggests "we should make it null-hostile." (Closely related: we >> should make that new kind of variable immutable.) And, nearly every >> time, this ends up being the wrong choice. This happened with Streams; >> when we first wrestled with nulls in streams, someone pushed for "Just >> have streams throw on null elements." But this would have been >> terrible; it would have meant that calculations on null-friendly >> domains, that were prepared to engage null directly, simply could not >> use streams in the obvious way; calculations like: >> >> Stream.of(arrayOfStuff) >> .map(Stuff::methodThatMightReturnNull) >> .filter(x -> x != null) >> .map(Stuff::doSomething) >> .collect(toList()) >> >> would not be directly expressible, because we would have already NPEed. >> Sure, there are workarounds, but for what? Out of a naive hope that, if >> we inject enough null checks, no one will ever have to deal with null? >> Out of irrational hatred for nulls? Nothing good comes from either of >> these motivations. >> >> But, this episode wasn't over. It was then suggested "OK, we can't NPE, >> but how about we filter the nulls?" Which would have been worse. It >> would mean that, for example, doing a map+toArray on an array might not >> have the same size as the initial array -- which would violate what >> should be a pretty rock-solid intuition. It would kill all the >> pre-sized-array optimizations. It would mean `zip` would have no useful >> semantics. Etc etc. >> >> In the end, we came to the right answer for streams, which is "let the >> nulls flow". 
And this is was the right choice because Streams is >> general-purpose plumbing. The "blow early" bias is about guarding the >> gates, and thereby hopefully keeping the nulls from getting into the >> house and having wild null parties at our expense. And this works when >> the gates are few, fixed, and well marked. But if your language >> exhibits any compositional mechanisms (which is our best tool), then >> what was the front door soon becomes the middle of the hallway after a >> trivial refactoring -- which means that no refactorings are really >> trivial. Oof. >> >> We already went through a good example recently where it would be >> foolish to try to exclude null (and yet we tried anyway) -- >> deconstruction patterns. If a constructor >> >> new Foo(x) >> >> can accept null, then a deconstructor >> >> case Foo(var x) >> >> should dutifully serve up that null. The guard-the-gates brigade tried >> valiently to put up new gates at each deconstructor, but that would have >> been a foolish place to put such a boundary. I offered an analogy to >> having deconstruction reject null over on amber-dev: >> >>> In languages with side-effects (like Java), not all aggregation >>> operations are reversible; if I bake a pie, I can't later recover the >>> apples and the sugar. But many are, and we like abstractions like >>> these (collections, Optional, stream, etc) because they are very >>> useful and easily reasoned about. So those that are, should commit to >>> the principle. It would be OK for a list implementation to behave >>> like this: >>> >>> Listy list = new Listy(); >>> list.add(null) // throws NPE >>> >>> because a List is free to express constraints on its domain. 
But it >>> would be exceedingly bizarre for a list implementation to behave like >>> this: >>> >>> Listy list = new Listy(); >>> list.add(3); // ok, I like ints >>> list.add(null); // ok, I like nulls too >>> assertTrue(list.size() == 2); // ok >>> assertTrue(list.get(0) == 3); // ok >>> assertTrue(list.get(1) == null); // NPE! >>> >>> If the list takes in nulls, it should give them back. >> Now, this is like the first suggested form of null-hostility in streams, >> and to everyone's credit, no one suggested exactly that, but what was >> suggested was the second, silent form of hostility -- just pretend you >> don't see the nulls. And, like with streams, that would have been >> silly. So, OK, we dodged the bullet of infecting patterns with special >> nullity rules. Whew. >> >> Now, switch. As I mentioned, I think we're here mostly because we are >> perpetuating the null biases of the past. In Java 1.0, switches were >> only over primitives, so there was no question about nulls. In Java 5, >> we added two new reference-typed switch targets: enums and boxes. I >> wasn't in the room when that decision was made, but I can imagine how it >> went: Java 5 was a *very* full release, and under dramatic pressure to >> get out the door. The discussion came up about nulls, maybe someone >> even suggested `case null` back then. And I'm sure the answer was some >> form of "null enums and primitive boxes are almost always bugs, let's >> not bend over backwards and add new complexity to the language (case >> null) just to accomodate this bug, let's just throw NPE." >> >> And, given how limited switch was, and the special characteristics of >> enums and boxes, this was probably a pragmatic decision, but I think we >> lost sight of the subtleties of the context. It is almost certainly >> right that 99.999% of the time, a null enum or box is a bug. But this >> is emphatically not true when we broaden the type to Object. 
Since the >> context and conditions change, the decision should be revisited before >> copying it to other contexts. >> >> In Java 7, when we added switching on strings, I do remember the >> discussion about nulls; it was mostly about "well, there's a precedent, >> and it's not worth breaking the precedent even if null strings are more >> common than null Integers, and besides, the mandate of Project Coin is >> very limited, and `case null` would probably be out of scope." While >> this may have again been a pragmatic choice at the time given the >> constraints, it further set us down a slippery slope where the >> assumption that "switches always throw null" is set in concrete. But >> this assumption is not founded on solid ground. >> >> So, the better way to approach this is to imagine Java had no switch, >> and we were adding a general switch today. Would we really be >> advocating so hard for "Oooh, another door we can guard, let's stick it >> to the nulls there too"? (And, even if we were tempted to, should we?) >> >> The plain fact is that we got away with null-hostility in the first >> three forms of reference types in switch because switch (at the time) >> was such a weak and non-compositional mechanism, and there are darn few >> things it can actually do well. But, if we were designing a >> general-purpose switch, with rich labels and enhanced control flow >> (e.g., guards) as we are today, where we envisioned refactoring between >> switches on nested patterns and patterns with nested switches, this >> would be more like a general plumbing mechanism, like streams, and when >> plumbing has an opinion about the nulls, frantic calls to the plumber >> are not far behind. The nulls must flow unimpeded, because otherwise, >> we create new anomalies and blockages like the streams examples I gave >> earlier and refactoring surprises. 
And having these anomalies doesn't >> really make life any better for the users -- it actually makes >> everything just less predictable, because it means simple refactorings >> are not simple -- and in a way that is very easy to forget about. >> >> If we really could keep the nulls out at the front gate, and thus define >> a clear null-free domain to work in, then I would be far more >> sympathetic to the calls of "new gates, new guards!" But the gates >> approach just doesn't work, and we have ample evidence of this. And the >> richer and more compositional we make the language, the more sharp edges >> this creates, because old interiors become new gates. >> >> So, back to the case at hand (though we should bring specifics this back >> to the case-at-hand thread): what's happening here is our baby switch is >> growing up into a general purpose mechanism. And, we should expect it >> to take on responsibilities suited to its new abilities. >> >> >> Now, for the backlash. Whenever we make an argument for >> what-appears-to-be relaxing an existing null-hostility, there is much >> concern about how the nulls will run free and wreak havoc. But, let's >> examine that more closely. >> >> The concern seems to be that, if if we let the null through the gate, >> we'll just get more NPEs, at worse places. Well, we can't get more >> NPEs; at most, we can get exactly the same number. But in reality, we >> will likely get less. There are three cases. >> >> 1. The domain is already null-free. In this case, it doesn't make a >> difference; no NPEs before, none after. >> >> 2. The domain is mostly null-free, but nulls do creep in, we see them >> as bugs, and we are happy to get notified. This is the case today with >> enums, where a null enum is almost always a bug. Yes, in cases like >> this, not guarding the gates means that the bug will get further before >> it is detected, or might go undetected. 
>> This isn't fantastic, but this
>> also isn't a disaster, because it is rare, and it is still likely to
>> get detected eventually.
>>
>> 3. The domain is at least partially null tolerant. Here, we are moving
>> an always-throw at the gates to a
>> might-throw-in-the-guts-if-you-forget. But also, there are plenty of
>> things you can do with a null binding that don't NPE, such as pass it to
>> a method that deals sensibly with nulls, add it to an ArrayList, print
>> it, etc. This is a huge improvement, from "must treat null in a
>> special, out of band way" to "treat null uniformly." At worst, it is no
>> worse, and often better.
>>
>> And, when it comes to general purpose domains, #3 is much bigger than
>> #2. So I think we have to optimize for #3.
>>
>>
>> Finally, there are those who argue we should "just" have nullable types
>> (T? and T!), and then all of this goes away. I would love to get there,
>> but it would be a very long road. But let's imagine we do get there.
>> OMG how terrible it would be when constructs like lambdas, switches, or
>> patterns willfully try to save us from the nulls, thus doing the job
>> (badly) of the type system! We'd have explicitly nullable types for
>> which some constructs NPE anyway. Or, we'd have to redefine the
>> semantics of everything in complex ways based on whether the underlying
>> input types are nullable or not. We would feel pretty stupid for having
>> created new corners to paint ourselves into.
>>
>> Our fears of untamed nulls wantonly running through the streets are
>> overblown. Our attempts to contain the nulls through ad-hoc
>> gate-guarding have all been failures. Let the nulls flow.
>>

From amaembo at gmail.com Sun Aug 23 15:44:59 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 23 Aug 2020 22:44:59 +0700
Subject: switch on Class ?
In-Reply-To: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr>
References: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hello!

Here's the previous discussion:
http://mail.openjdk.java.net/pipermail/amber-spec-experts/2018-April/000531.html

I still think that switch on class literal is a good idea.

Tagir.

Sun, 23 Aug 2020, 22:26 Remi Forax:

> There is a feature of Pizza (remember Generic Java ++) we have not
> discussed yet,
> being able to do a switch on Class.
>
> public sealed interface Numeric<T extends Numeric<T>>
>     permits Amount, Percentage, Quantity {
>
>   private BigDecimal value() {
>     return switch(this) {
>       case Amount(value) -> value;
>       case Percentage(value) -> value;
>       case Quantity(value) -> value;
>     };
>   }
>
>   private static <T extends Numeric<T>> T fromValue(Class<T> type,
>       BigDecimal newValue) {
>     return type.cast(switch(type) {
>       case Amount.class -> new Amount(newValue);
>       case Percentage.class -> new Percentage(newValue);
>       case Quantity.class -> new Quantity(newValue);
>     });
>   }
>
>   default T add(T numeric) { return fromValue(getClass(),
>       value().add(numeric.value())); }
> }
>
> with Amount being declared like this
>   record Amount(BigDecimal value) implements Numeric<Amount> { }
>
>
> This kind of switch is interesting because it's also one that can be
> exhaustive, like the switch on type or the switch on Enum.
>
> In the method fromValue, type is typed as a Class<T>, so a Class<? extends
> Numeric<...>>, and given that Numeric is a sealed class only permitting
> Amount, Percentage and Quantity, the only possible Classes for a switch(type)
> are Amount.class, Percentage.class and Quantity.class.
>
> I'm pretty sure the call fromValue(getClass(), ...) doesn't compile
> because the compiler has no idea that all subtypes of Numeric<T> implement
> Numeric<T>, but you get the idea.
>
> regards,
> Rémi
>

From brian.goetz at oracle.com Sun Aug 23 16:23:56 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 23 Aug 2020 12:23:56 -0400
Subject: [pattern-switch] Summary of open issues
In-Reply-To:
References:
Message-ID: <9a932375-cfcf-3e2d-259c-77cbfe665a24@oracle.com>

It's a good time to check in on this list, because I think we've made a
lot of progress in the last week clearing away the layers of overgrowth
that have gotten in the way of seeing a clear story. Here's my summary
of what's been cleared away. (Reminder: this is the summary thread, so
it's for summary information only; if you want to argue with the
specific points I'm making, take it to the right thread.)

Totality and patterns. I think it's even more obvious at this point
that the only realistic interpretation of `T t` as a pattern is that it
is total on `T`, including null. Anything else adds value-destroying
complexity when we try to compose patterns. But, it was hard to see
this because ...

Nullity and switch. We came at this with the mistaken assumption that
"switches are just null-hostile." But that was wrong. After digging at
it, we realized that the null-hostility of switch was an artifact of the
limited domains to which switch had been originally applied. When we
zoom out, it becomes obvious that the null-hostility of switch is not
scalable (we even tried to distort the semantics of patterns to try to
accommodate it). The obvious move here is:

 - Allow `case null` in all reference switches, with the obvious
   semantics;
 - Enums, strings, and boxes get an implicit `case null: throw` at the
   top if there is no explicit `case null`;
 - Total patterns -- including default -- match null.

So this means that _all_ switches are nullable, but these special target
types bring their own special null defaults.
While we first thought it
might be the switch that was throwing, and then we thought maybe it was
the (possibly implicit) default that was throwing, in reality, it is the
invisible `case null` that comes with switching on these special types.
(Meta: this shows how much damage the "blow early, blow often" rule
does -- once we introduce ad-hoc rules to try to create a null-free
playing field, undoing even one of them can take weeks of analysis.)

These two moves clear away almost all of the null hazards with switch,
and put us on a principled foundation: with the exception of the legacy
switch types, null is just another value that flows through patterns and
switches, which can be matched, with the obvious and now-composable
semantics. I think this also should reduce the "totality is too subtle"
concern, because we've put the null hostility into a more well-defined
"box" -- switches are just nullable, so *of course* the nulls flow into
the total pattern.

Separately, there are TWO issues regarding switch totality. The first
is how we can give statement switches the same error checking for
totality that expression switches currently enjoy. I think there is now
room to cleanly repurpose `default` for this. The second, while
tangentially related to nullity, is about _optimistic totality_. The
optimistic totality we embraced in expression switches over enums does
not yet scale cleanly to sealed types, and specifically to _lifting_
type patterns over sealed types. We have more work to do here.

(One of the underappreciated moves of the optimistic totality we did in
12 is that it is _better_ to leave out the default clause when a switch
is believed to be optimistically total, because it leads to better type
checking -- omissions and separate compilation artifacts are detected at
compile time rather than runtime. We would like to get the same for
sealing.)
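
As a minimal sketch of the optimistic totality described above, assuming
Java 14 or later (where switch expressions are final): the enum `Color`
and the method `describe` are illustrative names, not taken from this
thread. Because there is no default clause, the compiler checks that
every constant is covered, and a null selector is still rejected with
an NPE, as for any enum switch.

```java
// Illustrative only: Color and describe are hypothetical names.
class OptimisticTotality {
    enum Color { RED, GREEN, BLUE }

    // No default clause: the compiler verifies that all constants of
    // Color are handled, so adding a constant later becomes a compile
    // error here instead of silently falling into a default branch.
    static String describe(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN -> "natural";
            case BLUE -> "cool";
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED));
        try {
            describe(null); // enum switches reject a null selector
        } catch (NullPointerException e) {
            System.out.println("null selector rejected");
        }
    }
}
```

Adding a `default` clause to `describe` would still compile, but the
compiler would then stop reporting a missing constant, which is the
trade-off the paragraph above points at.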

So, summary:

 - Type patterns `T t` are total on U <: T, and `var t` is total on all
   types;
 - Total patterns match null;
 - Switches are not null-hostile;
 - `default` is a total switch case;
 - You can say `case null` in switch;
 - For switches on *enums, strings, and boxes*, there is an implicit
   `case null` that throws, but you can override this with an explicit
   `case null`. (There are still some fine points to discuss here; if we
   want `case null` to be able to fall into default, then we can't
   require it be at the top.)
 - We can consider enhancing `default` to take a total destructuring
   pattern with minimal distortion;
 - We need to have a longer conversation about optimistic totality.

I think that's good progress for the week, as it checks off 3 of the
items on this list.

On 8/14/2020 1:19 PM, Brian Goetz wrote:
> Here's a summary of the issues raised in the reviews of the
> patterns-in-switch document. I'm going to (try to) start a new thread
> for each of them; let's not reply to this one with new topics (or with
> discussion on these topics.) I'll update this thread as we add or
> remove things from the list.
>
>  - Is totality too subtle? (Remi) While the
> notion of using totality to subsume nullability (at least in nested
> contexts) may be sound, there is some concern that the difference
> between total and non-total patterns may be too subtle, and that this
> may lead to NPE issues. To evaluate this, we need to evaluate both the
> "is totality too subtle" and the "how much are we worried about NPE in
> this context" directions.
>
>  - Guards. (John, Tagir) There is acknowledgement that some sort of
> "whoops, not this case" support is needed in order to maintain switch
> as a useful construct in the face of richer case labels, but some
> disagreement about whether an imperative statement (e.g., continue) or
> a declarative guard (e.g., `when `) is the right choice.
>
>  - Exhaustiveness and null. (Tagir)
> For sealed domains (enums and
> sealed types), we kind of cheated with expression switches because we
> could count on the switch filtering out the null. But Tagir raises an
> excellent point, which is that we do not yet have a sound definition
> of exhaustiveness that scales to nested patterns (do Box(Rect) and
> Box(Circle) cover Box(Shape)?) This is an interaction between sealed
> types and patterns that needs to be ironed out. (Thanks Tagir!)
>
>  - Switch and null. (Tagir, Kevin) Should we reconsider trying to
> rehabilitate switch's null-acceptance? There are several who are
> questioning whether this is trying to push things too far for too
> little benefit.
>
>  - Rehabilitating default. The current design leaves default to rot;
> it is possible it has a better role to play with respect to the
> rehabilitation of switch, such as signalling that the switch is total.
>
>  - Restrictions on instanceof. It has been proposed that we restrict
> total patterns from instanceof to avoid confusion; while no one has
> really objected, a few people have expressed mild discomfort. Leaving
> it on the list for now until we resolve some of the other nullity
> questions.
>
>  - Meta. (Brian) Nearly all of this is about null. Is it possible
> that everything else about the proposal is so perfect that there's
> nothing else to talk about? Seems unlikely. I recommend we turn up
> the attenuation knob on nullity issues to leave some oxygen for some
> of the other flowers.
>
>

From forax at univ-mlv.fr Sun Aug 23 19:28:57 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 23 Aug 2020 21:28:57 +0200 (CEST)
Subject: Letting the nulls flow (Was: Exhaustiveness)
In-Reply-To: <09ce416d-ae78-39c0-efc5-fb8eba566595@oracle.com>
References: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr> <09ce416d-ae78-39c0-efc5-fb8eba566595@oracle.com>
Message-ID: <300656218.566649.1598210937225.JavaMail.zimbra@u-pem.fr>

I think we agree that switch should be able to be null-friendly; the
question is more what the default is, and how "default" works.

I wonder if "case null" is the right design if, for a lot of switches,
the null behavior is the same as the behavior of an existing case or
default. Currently case null is the first (depending on nesting) case,
so you cannot easily say that null and another case share the same
behavior.

Whatever we decide for "default", a syntax that lets you append null to
an existing case seems better?
Something along the lines of "case Foo, null: ... "

Rémi

> From: "Brian Goetz"
> To: "Tagir Valeev"
> Cc: "Remi Forax", "Guy Steele", "amber-spec-experts"
> Sent: Sunday, 23 August 2020 17:43:03
> Subject: Re: Letting the nulls flow (Was: Exhaustiveness)

> Thanks, Tagir -- this is a perfect example of what I meant yesterday by how the
> "blow early, blow often" approach is a false promise. It just means that
> responsible programmers who need to deal with null as a fact-of-life have to do
> *extra* work (which is therefore more duplicative or error-prone) to deal with
> it.

> On 8/22/2020 11:46 PM, Tagir Valeev wrote:
>> Hello!
>> Some data from the current IntelliJ IDEA codebase
>> We have 64 occurrences of this code pattern
>>     if($x$ == null) {...} // presumably completes abruptly
>>     switch($x$) {...}
>> Roughly half of them are enum switches and the other half is string switches
>> Also, we have 29 occurrences of this code pattern:
>>     if($x$ != null) {
>>       switch($x$) { ... }
>>       ...
>>     }
>> Also, we have one occurrence of this code pattern:
>>     if($x$ == null) {...
>>     } else {
>>       switch($x$) {...}
>>     }
>> All of them could benefit from null-friendly switch. Btw often null
>> branch is the same as default branch (or some other non-null branch).
>> With best regards,
>> Tagir Valeev
>> On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz wrote:
>>> Breaking into a separate thread. I hope we can put this one to bed
>>> once and for all.
>>>> I'm not hostile to that view, but may I ask an honest question: why
>>>> is this semantics better?
>>>> Do you have examples where it makes sense to let the null slip
>>>> through the statement switch? Because I can see why being null-hostile
>>>> is a good default: it follows the mottos "blow early, blow
>>>> often" or "in case of doubt, throw".
>>> Charitably, I think this approach is born of a belief that, if we keep
>>> the nulls out by posting sentries at the door, we can live an interior
>>> life unfettered by stray nulls. But I think it is also time to
>>> recognize that this approach to "block the nulls at the door" (a)
>>> doesn't actually work, (b) creates sharp edges when the doors move
>>> (which they do, through refactoring), and (c) pushes the problems elsewhere.
>>> (To illustrate (c), just look at the conversation about nulls in
>>> patterns and switch we are having right now! We all came to this
>>> exercise thinking "switch is null-hostile, that's how it's always been,
>>> that's how it must be", and are contorting ourselves to try to come up
>>> with a consistent explanation.
>>> But, if we look deeper, we see that
>>> switch is *only accidentally* null-hostile, based on some highly
>>> contextual decisions that were made when adding enum and autoboxing in
>>> Java 5. I'll talk more about that decision in a moment, but my point
>>> right now is that we are doing a _lot_ of work to try to be consistent
>>> with an arbitrary decision that was made in the past, in a specific and
>>> limited context, and probably not with the greatest care. Truly today's
>>> problems come from yesterday's "solutions." If we weren't careful, an
>>> accidental decision about nulls in enum switch almost polluted the
>>> semantics of pattern matching! That would be terrible! So let's stop
>>> doing that, and let's stop creating new ways for our tomorrow's selves
>>> to be painted into a corner.)
>>> As background, I'll observe that every time a new context comes up,
>>> someone suggests "we should make it null-hostile." (Closely related: we
>>> should make that new kind of variable immutable.) And, nearly every
>>> time, this ends up being the wrong choice. This happened with Streams;
>>> when we first wrestled with nulls in streams, someone pushed for "Just
>>> have streams throw on null elements." But this would have been
>>> terrible; it would have meant that calculations on null-friendly
>>> domains, that were prepared to engage null directly, simply could not
>>> use streams in the obvious way; calculations like:
>>>     Stream.of(arrayOfStuff)
>>>           .map(Stuff::methodThatMightReturnNull)
>>>           .filter(x -> x != null)
>>>           .map(Stuff::doSomething)
>>>           .collect(toList())
>>> would not be directly expressible, because we would have already NPEed.
>>> Sure, there are workarounds, but for what? Out of a naive hope that, if
>>> we inject enough null checks, no one will ever have to deal with null?
>>> Out of irrational hatred for nulls? Nothing good comes from either of
>>> these motivations.
>>> But, this episode wasn't over.
>>> It was then suggested "OK, we can't NPE,
>>> but how about we filter the nulls?" Which would have been worse. It
>>> would mean that, for example, the result of a map+toArray on an array
>>> might not have the same size as the initial array -- which would violate
>>> what should be a pretty rock-solid intuition. It would kill all the
>>> pre-sized-array optimizations. It would mean `zip` would have no useful
>>> semantics. Etc etc.
>>> In the end, we came to the right answer for streams, which is "let the
>>> nulls flow". And this was the right choice because Streams is
>>> general-purpose plumbing. The "blow early" bias is about guarding the
>>> gates, and thereby hopefully keeping the nulls from getting into the
>>> house and having wild null parties at our expense. And this works when
>>> the gates are few, fixed, and well marked. But if your language
>>> exhibits any compositional mechanisms (which is our best tool), then
>>> what was the front door soon becomes the middle of the hallway after a
>>> trivial refactoring -- which means that no refactorings are really
>>> trivial. Oof.
>>> We already went through a good example recently where it would be
>>> foolish to try to exclude null (and yet we tried anyway) --
>>> deconstruction patterns. If a constructor
>>>     new Foo(x)
>>> can accept null, then a deconstructor
>>>     case Foo(var x)
>>> should dutifully serve up that null. The guard-the-gates brigade tried
>>> valiantly to put up new gates at each deconstructor, but that would have
>>> been a foolish place to put such a boundary. I offered an analogy to
>>> having deconstruction reject null over on amber-dev:
>>>> In languages with side-effects (like Java), not all aggregation
>>>> operations are reversible; if I bake a pie, I can't later recover the
>>>> apples and the sugar. But many are, and we like abstractions like
>>>> these (collections, Optional, stream, etc) because they are very
>>>> useful and easily reasoned about.
>>>> So those that are, should commit to
>>>> the principle. It would be OK for a list implementation to behave
>>>> like this:
>>>>     Listy list = new Listy();
>>>>     list.add(null); // throws NPE
>>>> because a List is free to express constraints on its domain. But it
>>>> would be exceedingly bizarre for a list implementation to behave like
>>>> this:
>>>>     Listy list = new Listy();
>>>>     list.add(3); // ok, I like ints
>>>>     list.add(null); // ok, I like nulls too
>>>>     assertTrue(list.size() == 2); // ok
>>>>     assertTrue(list.get(0) == 3); // ok
>>>>     assertTrue(list.get(1) == null); // NPE!
>>>> If the list takes in nulls, it should give them back.
>>> Now, this is like the first suggested form of null-hostility in streams,
>>> and to everyone's credit, no one suggested exactly that, but what was
>>> suggested was the second, silent form of hostility -- just pretend you
>>> don't see the nulls. And, like with streams, that would have been
>>> silly. So, OK, we dodged the bullet of infecting patterns with special
>>> nullity rules. Whew.
>>> Now, switch. As I mentioned, I think we're here mostly because we are
>>> perpetuating the null biases of the past. In Java 1.0, switches were
>>> only over primitives, so there was no question about nulls. In Java 5,
>>> we added two new reference-typed switch targets: enums and boxes. I
>>> wasn't in the room when that decision was made, but I can imagine how it
>>> went: Java 5 was a *very* full release, and under dramatic pressure to
>>> get out the door. The discussion came up about nulls, maybe someone
>>> even suggested `case null` back then. And I'm sure the answer was some
>>> form of "null enums and primitive boxes are almost always bugs, let's
>>> not bend over backwards and add new complexity to the language (case
>>> null) just to accommodate this bug, let's just throw NPE."
>>> And, given how limited switch was, and the special characteristics of
>>> enums and boxes, this was probably a pragmatic decision, but I think we
>>> lost sight of the subtleties of the context. It is almost certainly
>>> right that 99.999% of the time, a null enum or box is a bug. But this
>>> is emphatically not true when we broaden the type to Object. Since the
>>> context and conditions change, the decision should be revisited before
>>> copying it to other contexts.
>>> In Java 7, when we added switching on strings, I do remember the
>>> discussion about nulls; it was mostly about "well, there's a precedent,
>>> and it's not worth breaking the precedent even if null strings are more
>>> common than null Integers, and besides, the mandate of Project Coin is
>>> very limited, and `case null` would probably be out of scope." While
>>> this may have again been a pragmatic choice at the time given the
>>> constraints, it further set us down a slippery slope where the
>>> assumption that "switches always throw null" is set in concrete. But
>>> this assumption is not founded on solid ground.
>>> So, the better way to approach this is to imagine Java had no switch,
>>> and we were adding a general switch today. Would we really be
>>> advocating so hard for "Oooh, another door we can guard, let's stick it
>>> to the nulls there too"? (And, even if we were tempted to, should we?)
>>> The plain fact is that we got away with null-hostility in the first
>>> three forms of reference types in switch because switch (at the time)
>>> was such a weak and non-compositional mechanism, and there are darn few
>>> things it can actually do well.
>>> But, if we were designing a
>>> general-purpose switch, with rich labels and enhanced control flow
>>> (e.g., guards) as we are today, where we envisioned refactoring between
>>> switches on nested patterns and patterns with nested switches, this
>>> would be more like a general plumbing mechanism, like streams, and when
>>> plumbing has an opinion about the nulls, frantic calls to the plumber
>>> are not far behind. The nulls must flow unimpeded, because otherwise,
>>> we create new anomalies and blockages like the streams examples I gave
>>> earlier and refactoring surprises. And having these anomalies doesn't
>>> really make life any better for the users -- it actually makes
>>> everything just less predictable, because it means simple refactorings
>>> are not simple -- and in a way that is very easy to forget about.
>>> If we really could keep the nulls out at the front gate, and thus define
>>> a clear null-free domain to work in, then I would be far more
>>> sympathetic to the calls of "new gates, new guards!" But the gates
>>> approach just doesn't work, and we have ample evidence of this. And the
>>> richer and more compositional we make the language, the more sharp edges
>>> this creates, because old interiors become new gates.
>>> So, back to the case at hand (though we should bring specifics back
>>> to the case-at-hand thread): what's happening here is our baby switch is
>>> growing up into a general purpose mechanism. And, we should expect it
>>> to take on responsibilities suited to its new abilities.
>>> Now, for the backlash. Whenever we make an argument for
>>> what-appears-to-be relaxing an existing null-hostility, there is much
>>> concern about how the nulls will run free and wreak havoc. But, let's
>>> examine that more closely.
>>> The concern seems to be that, if we let the null through the gate,
>>> we'll just get more NPEs, at worse places. Well, we can't get more
>>> NPEs; at most, we can get exactly the same number.
>>> But in reality, we
>>> will likely get fewer. There are three cases.
>>> 1. The domain is already null-free. In this case, it doesn't make a
>>> difference; no NPEs before, none after.
>>> 2. The domain is mostly null-free, but nulls do creep in, we see them
>>> as bugs, and we are happy to get notified. This is the case today with
>>> enums, where a null enum is almost always a bug. Yes, in cases like
>>> this, not guarding the gates means that the bug will get further before
>>> it is detected, or might go undetected. This isn't fantastic, but this
>>> also isn't a disaster, because it is rare, and it is still likely to
>>> get detected eventually.
>>> 3. The domain is at least partially null tolerant. Here, we are moving
>>> an always-throw at the gates to a
>>> might-throw-in-the-guts-if-you-forget. But also, there are plenty of
>>> things you can do with a null binding that don't NPE, such as pass it to
>>> a method that deals sensibly with nulls, add it to an ArrayList, print
>>> it, etc. This is a huge improvement, from "must treat null in a
>>> special, out of band way" to "treat null uniformly." At worst, it is no
>>> worse, and often better.
>>> And, when it comes to general purpose domains, #3 is much bigger than
>>> #2. So I think we have to optimize for #3.
>>> Finally, there are those who argue we should "just" have nullable types
>>> (T? and T!), and then all of this goes away. I would love to get there,
>>> but it would be a very long road. But let's imagine we do get there.
>>> OMG how terrible it would be when constructs like lambdas, switches, or
>>> patterns willfully try to save us from the nulls, thus doing the job
>>> (badly) of the type system! We'd have explicitly nullable types for
>>> which some constructs NPE anyway. Or, we'd have to redefine the
>>> semantics of everything in complex ways based on whether the underlying
>>> input types are nullable or not.
>>> We would feel pretty stupid for having
>>> created new corners to paint ourselves into.
>>> Our fears of untamed nulls wantonly running through the streets are
>>> overblown. Our attempts to contain the nulls through ad-hoc
>>> gate-guarding have all been failures. Let the nulls flow.

From brian.goetz at oracle.com Sun Aug 23 19:40:53 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 23 Aug 2020 15:40:53 -0400
Subject: Letting the nulls flow (Was: Exhaustiveness)
In-Reply-To: <300656218.566649.1598210937225.JavaMail.zimbra@u-pem.fr>
References: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr> <09ce416d-ae78-39c0-efc5-fb8eba566595@oracle.com> <300656218.566649.1598210937225.JavaMail.zimbra@u-pem.fr>
Message-ID: <9ba5a39d-2d4f-a747-35e0-3901140f576c@oracle.com>

As I joked with Stephen on amber-dev, treating nulls specially in
patterns like this (stapling a null-permit onto a pattern) feels like
something from the "Bargaining" stage of the Kübler-Ross scale: "OK,
fine, I'll let the nulls flow past the 'switch' gate, but I want each
null to show me their permit before passing the 'case' gate." I would
really like to get to the Acceptance stage, where we admit that `null`
is something that needs to be treated uniformly. I think we're almost
there.

(As I mentioned in my other mail this morning, with this new
null-friendly disposition for switch, it seems possible that `case null`
need not come first, since now the presence of `case null` is really the
only one that affects the overall behavior of switch -- and, only for
very specific switches. (I think that might have been some Bargaining
too.) So if we think null-falling-into-default is desirable, then we
can relax the "case null must come first" rule.)
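
For reference, the guard-then-switch workaround that a null-friendly
switch would make unnecessary can be sketched in today's Java as
follows. This is only an illustration of the case where null should
behave like the default branch; the class `NullBeforeSwitch`, the method
`classify`, and the unit strings are hypothetical and do not come from
the thread or from the IntelliJ IDEA codebase.

```java
// Illustrative only: all names here are hypothetical.
class NullBeforeSwitch {
    // The null check must come before the switch, because a string
    // switch throws NullPointerException on a null selector. The
    // default behavior therefore has to be written twice.
    static String classify(String unit) {
        if (unit == null) {
            return "unknown";   // duplicates the default branch below
        }
        switch (unit) {
            case "kg":
                return "mass";
            case "m":
                return "length";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("kg"));
        System.out.println(classify(null));
    }
}
```

If null could fall into `default`, the leading `if` and its duplicated
result would disappear, which is the simplification under discussion.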

On 8/23/2020 3:28 PM, forax at univ-mlv.fr wrote:
> I think we agree that switch should be able to be null-friendly;
> the question is more what the default is, and how "default" works.
>
> I wonder if "case null" is the right design if, for a lot of switches,
> the null behavior is the same as the behavior of an existing case or
> default. Currently case null is the first (depending on nesting) case,
> so you cannot easily say that null and another case share the same
> behavior.
>
> Whatever we decide for "default", a syntax that lets you append null to
> an existing case seems better?
> Something along the lines of "case Foo, null: ... "
>
> Rémi
>
> ------------------------------------------------------------------------
>
> *From: *"Brian Goetz"
> *To: *"Tagir Valeev"
> *Cc: *"Remi Forax", "Guy Steele", "amber-spec-experts"
> *Sent: *Sunday, 23 August 2020 17:43:03
> *Subject: *Re: Letting the nulls flow (Was: Exhaustiveness)
>
> Thanks, Tagir -- this is a perfect example of what I meant
> yesterday by how the "blow early, blow often" approach is a false
> promise. It just means that responsible programmers who need to
> deal with null as a fact-of-life have to do *extra* work (which is
> therefore more duplicative or error-prone) to deal with it.
>
>
> On 8/22/2020 11:46 PM, Tagir Valeev wrote:
>
> Hello!
>
> Some data from the current IntelliJ IDEA codebase
>
> We have 64 occurrences of this code pattern
>     if($x$ == null) {...} // presumably completes abruptly
>     switch($x$) {...}
> Roughly half of them are enum switches and the other half is string switches
>
> Also, we have 29 occurrences of this code pattern:
>     if($x$ != null) {
>       switch($x$) { ... }
>       ...
>     }
>
> Also, we have one occurrence of this code pattern:
>     if($x$ == null) {...
>     } else {
>       switch($x$) {...}
>     }
>
> All of them could benefit from null-friendly switch. Btw often null
> branch is the same as default branch (or some other non-null branch).
>
> With best regards,
> Tagir Valeev
>
> On Sun, Aug 23, 2020 at 12:14 AM Brian Goetz wrote:
>
> Breaking into a separate thread. I hope we can put this one to bed
> once and for all.
>
> I'm not hostile to that view, but may I ask an honest question: why
> is this semantics better?
> Do you have examples where it makes sense to let the null slip
> through the statement switch? Because I can see why being null-hostile
> is a good default: it follows the mottos "blow early, blow
> often" or "in case of doubt, throw".
>
> Charitably, I think this approach is born of a belief that, if we keep
> the nulls out by posting sentries at the door, we can live an interior
> life unfettered by stray nulls. But I think it is also time to
> recognize that this approach to "block the nulls at the door" (a)
> doesn't actually work, (b) creates sharp edges when the doors move
> (which they do, through refactoring), and (c) pushes the problems elsewhere.
>
> (To illustrate (c), just look at the conversation about nulls in
> patterns and switch we are having right now! We all came to this
> exercise thinking "switch is null-hostile, that's how it's always been,
> that's how it must be", and are contorting ourselves to try to come up
> with a consistent explanation. But, if we look deeper, we see that
> switch is *only accidentally* null-hostile, based on some highly
> contextual decisions that were made when adding enum and autoboxing in
> Java 5. I'll talk more about that decision in a moment, but my point
> right now is that we are doing a _lot_ of work to try to be consistent
> with an arbitrary decision that was made in the past, in a specific and
> limited context, and probably not with the greatest care. Truly today's
> problems come from yesterday's "solutions." If we weren't careful, an
> accidental decision about nulls in enum switch almost polluted the
> semantics of pattern matching! That would be terrible!
> So let's stop
> doing that, and let's stop creating new ways for our tomorrow's selves
> to be painted into a corner.)
>
>
> As background, I'll observe that every time a new context comes up,
> someone suggests "we should make it null-hostile." (Closely related: we
> should make that new kind of variable immutable.) And, nearly every
> time, this ends up being the wrong choice. This happened with Streams;
> when we first wrestled with nulls in streams, someone pushed for "Just
> have streams throw on null elements." But this would have been
> terrible; it would have meant that calculations on null-friendly
> domains, that were prepared to engage null directly, simply could not
> use streams in the obvious way; calculations like:
>
>     Stream.of(arrayOfStuff)
>           .map(Stuff::methodThatMightReturnNull)
>           .filter(x -> x != null)
>           .map(Stuff::doSomething)
>           .collect(toList())
>
> would not be directly expressible, because we would have already NPEed.
> Sure, there are workarounds, but for what? Out of a naive hope that, if
> we inject enough null checks, no one will ever have to deal with null?
> Out of irrational hatred for nulls? Nothing good comes from either of
> these motivations.
>
> But, this episode wasn't over. It was then suggested "OK, we can't NPE,
> but how about we filter the nulls?" Which would have been worse. It
> would mean that, for example, the result of a map+toArray on an array
> might not have the same size as the initial array -- which would violate
> what should be a pretty rock-solid intuition. It would kill all the
> pre-sized-array optimizations. It would mean `zip` would have no useful
> semantics. Etc etc.
>
> In the end, we came to the right answer for streams, which is "let the
> nulls flow". And this was the right choice because Streams is
> general-purpose plumbing. The "blow early" bias is about guarding the
> gates, and thereby hopefully keeping the nulls from getting into the
> house and having wild null parties at our expense.
And this works when > the gates are few, fixed, and well marked. But if your language > exhibits any compositional mechanisms (which is our best tool), then > what was the front door soon becomes the middle of the hallway after a > trivial refactoring -- which means that no refactorings are really > trivial. Oof. > > We already went through a good example recently where it would be > foolish to try to exclude null (and yet we tried anyway) -- > deconstruction patterns. If a constructor > > new Foo(x) > > can accept null, then a deconstructor > > case Foo(var x) > > should dutifully serve up that null. The guard-the-gates brigade tried > valiently to put up new gates at each deconstructor, but that would have > been a foolish place to put such a boundary. I offered an analogy to > having deconstruction reject null over on amber-dev: > > In languages with side-effects (like Java), not all aggregation > operations are reversible; if I bake a pie, I can't later recover the > apples and the sugar. But many are, and we like abstractions like > these (collections, Optional, stream, etc) because they are very > useful and easily reasoned about. So those that are, should commit to > the principle. It would be OK for a list implementation to behave > like this: > > Listy list = new Listy(); > list.add(null) // throws NPE > > because a List is free to express constraints on its domain. But it > would be exceedingly bizarre for a list implementation to behave like > this: > > Listy list = new Listy(); > list.add(3); // ok, I like ints > list.add(null); // ok, I like nulls too > assertTrue(list.size() == 2); // ok > assertTrue(list.get(0) == 3); // ok > assertTrue(list.get(1) == null); // NPE! > > If the list takes in nulls, it should give them back. 
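The two Listy behaviors described above can be sketched with standard collections. This is only an illustration, assuming `java.util.ArrayDeque` as a stand-in for a collection that constrains its domain (it rejects null on insertion) and `java.util.ArrayList` as one that accepts nulls and hands them back; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class NullRoundTrip {
    // Returns true if the collection refuses null at insertion time.
    static boolean rejectsNull(Collection<Object> c) {
        try {
            c.add(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // Stores an int and a null, then reads both back by index.
    static List<Object> storeAndReadBack() {
        List<Object> list = new ArrayList<>();
        list.add(3);     // ok, it likes ints
        list.add(null);  // ok, it likes nulls too
        List<Object> seen = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            seen.add(list.get(i)); // the null comes back out, no NPE
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(rejectsNull(new ArrayDeque<>())); // true
        System.out.println(rejectsNull(new ArrayList<>()));  // false
        System.out.println(storeAndReadBack());              // [3, null]
    }
}
```

Either policy is coherent on its own; the "exceedingly bizarre" case is the mix, where insertion succeeds and retrieval throws.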
> > Now, this is like the first suggested form of null-hostility in streams, > and to everyone's credit, no one suggested exactly that, but what was > suggested was the second, silent form of hostility -- just pretend you > don't see the nulls. And, like with streams, that would have been > silly. So, OK, we dodged the bullet of infecting patterns with special > nullity rules. Whew. > > Now, switch. As I mentioned, I think we're here mostly because we are > perpetuating the null biases of the past. In Java 1.0, switches were > only over primitives, so there was no question about nulls. In Java 5, > we added two new reference-typed switch targets: enums and boxes. I > wasn't in the room when that decision was made, but I can imagine how it > went: Java 5 was a *very* full release, and under dramatic pressure to > get out the door. The discussion came up about nulls, maybe someone > even suggested `case null` back then. And I'm sure the answer was some > form of "null enums and primitive boxes are almost always bugs, let's > not bend over backwards and add new complexity to the language (case > null) just to accomodate this bug, let's just throw NPE." > > And, given how limited switch was, and the special characteristics of > enums and boxes, this was probably a pragmatic decision, but I think we > lost sight of the subtleties of the context. It is almost certainly > right that 99.999% of the time, a null enum or box is a bug. But this > is emphatically not true when we broaden the type to Object. Since the > context and conditions change, the decision should be revisited before > copying it to other contexts. > > In Java 7, when we added switching on strings, I do remember the > discussion about nulls; it was mostly about "well, there's a precedent, > and it's not worth breaking the precedent even if null strings are more > common than null Integers, and besides, the mandate of Project Coin is > very limited, and `case null` would probably be out of scope." 
While > this may have again been a pragmatic choice at the time given the > constraints, it further set us down a slippery slope where the > assumption that "switches always throw null" is set in concrete. But > this assumption is not founded on solid ground. > > So, the better way to approach this is to imagine Java had no switch, > and we were adding a general switch today. Would we really be > advocating so hard for "Oooh, another door we can guard, let's stick it > to the nulls there too"? (And, even if we were tempted to, should we?) > > The plain fact is that we got away with null-hostility in the first > three forms of reference types in switch because switch (at the time) > was such a weak and non-compositional mechanism, and there are darn few > things it can actually do well. But, if we were designing a > general-purpose switch, with rich labels and enhanced control flow > (e.g., guards) as we are today, where we envisioned refactoring between > switches on nested patterns and patterns with nested switches, this > would be more like a general plumbing mechanism, like streams, and when > plumbing has an opinion about the nulls, frantic calls to the plumber > are not far behind. The nulls must flow unimpeded, because otherwise, > we create new anomalies and blockages like the streams examples I gave > earlier and refactoring surprises. And having these anomalies doesn't > really make life any better for the users -- it actually makes > everything just less predictable, because it means simple refactorings > are not simple -- and in a way that is very easy to forget about. > > If we really could keep the nulls out at the front gate, and thus define > a clear null-free domain to work in, then I would be far more > sympathetic to the calls of "new gates, new guards!" But the gates > approach just doesn't work, and we have ample evidence of this. 
And the > richer and more compositional we make the language, the more sharp edges > this creates, because old interiors become new gates. > > So, back to the case at hand (though we should bring specifics this back > to the case-at-hand thread): what's happening here is our baby switch is > growing up into a general purpose mechanism. And, we should expect it > to take on responsibilities suited to its new abilities. > > > Now, for the backlash. Whenever we make an argument for > what-appears-to-be relaxing an existing null-hostility, there is much > concern about how the nulls will run free and wreak havoc. But, let's > examine that more closely. > > The concern seems to be that, if if we let the null through the gate, > we'll just get more NPEs, at worse places. Well, we can't get more > NPEs; at most, we can get exactly the same number. But in reality, we > will likely get less. There are three cases. > > 1. The domain is already null-free. In this case, it doesn't make a > difference; no NPEs before, none after. > > 2. The domain is mostly null-free, but nulls do creep in, we see them > as bugs, and we are happy to get notified. This is the case today with > enums, where a null enum is almost always a bug. Yes, in cases like > this, not guarding the gates means that the bug will get further before > it is detected, or might go undetected. This isn't fantastic, but this > also isn't a disaster, because it is rare and is still likely it will > get detected eventually. > > 3. The domain is at least partially null tolerant. Here, we are moving > an always-throw at the gates to a > might-throw-in-the-guts-if-you-forget. But also, there are plenty of > things you can do with a null binding that don't NPE, such as pass it to > a method that deals sensibly with nulls, add it to an ArrayList, print > it, etc. This is a huge improvement, from "must treat null in a > special, out of band way" to "treat null uniformly." At worst, it is no > worse, and often better. 
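Case 3 can be made concrete with a small sketch. It assumes a hypothetical helper `describe` standing in for "a method that deals sensibly with nulls", and shows that a null binding can be passed along, stored, and printed without an NPE, while a forgotten dereference is the only place that throws.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class NullTolerantUses {
    // Hypothetical null-aware method: substitutes a marker for null.
    static String describe(Object o) {
        return Objects.toString(o, "<absent>");
    }

    // Things one can do with a possibly-null binding that never NPE.
    static List<String> useBinding(Object binding) {
        List<String> log = new ArrayList<>();
        log.add(describe(binding));        // pass it to a null-aware method
        List<Object> bag = new ArrayList<>();
        bag.add(binding);                  // add it to an ArrayList
        log.add("stored " + bag.size());
        log.add(String.valueOf(binding));  // print-style conversion
        return log;
    }

    // The "might-throw-in-the-guts-if-you-forget" part: a direct dereference.
    static boolean dereferenceThrows(Object binding) {
        try {
            binding.hashCode();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(useBinding(null));        // [<absent>, stored 1, null]
        System.out.println(dereferenceThrows(null)); // true
    }
}
```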
> > And, when it comes to general purpose domains, #3 is much bigger than > #2. So I think we have to optimize for #3. > > > Finally, there are those who argue we should "just" have nullable types > (T? and T!), and then all of this goes away. I would love to get there, > but it would be a very long road. But let's imagine we do get there. > OMG how terrible it would be when constructs like lambdas, switches, or > patterns willfully try to save us from the nulls, thus doing the job > (badly) of the type system! We'd have explicitly nullable types for > which some constructs NPE anyway. Or, we'd have to redefine the > semantics of everything in complex ways based on whether the underlying > input types are nullable or not. We would feel pretty stupid for having > created new corners to paint ourselves into. > > Our fears of untamed nulls wantonly running through the streets are > overblown. Our attempts to contain the nulls through ad-hoc > gate-guarding have all been failures. Let the nulls flow. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Sun Aug 23 20:12:53 2020 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sun, 23 Aug 2020 22:12:53 +0200 (CEST) Subject: Letting the nulls flow (Was: Exhaustiveness) In-Reply-To: References: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr> Message-ID: <167664622.567978.1598213573573.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "Guy Steele" , "Tagir Valeev" , "amber-spec-experts" > > Envoy?: Samedi 22 Ao?t 2020 19:14:15 > Objet: Letting the nulls flow (Was: Exhaustiveness) > Breaking into a separate thread.?? 
I hope we can put this one to bed > once and for all. > >> I'm not hostile to that view, but may i ask an honest question, why >> this semantics is better ? >> Do you have examples where it makes sense to let the null to slip >> through the statement switch ? Because as i can see why being null >> hostile is a good default, it follows the motos "blow early, blow >> often" or "in case of doubt throws". > > Charitably, I think this approach is borne of a belief that, if we keep > the nulls out by posting sentries at the door, we can live an interior > life unfettered by stray nulls.? But I think it is also time to > recognize that this approach to "block the nulls at the door" (a) > doesn't actually work, (b) creates sharp edges when the doors move > (which they do, though refactoring), and (c) pushes the problems elsewhere. > > (To illustrate (c), just look at the conversation about nulls in > patterns and switch we are having right now!? We all came to this > exercise thinking "switch is null-hostile, that's how it's always been, > that's how it must be", and are contorting ourselves to try to come up > with a consistent explanation. ? But, if we look deeper, we see that > switch is *only accidentally* null-hostile, based on some highly > contextual decisions that were made when adding enum and autoboxing in > Java 5.? I'll talk more about that decision in a moment, but my point > right now is that we are doing a _lot_ of work to try to be consistent > with an arbitrary decision that was made in the past, in a specific and > limited context, and probably not with the greatest care.? Truly today's > problems come from yesterdays "solutions."? If we weren't careful, an > accidental decision about nulls in enum switch almost polluted the > semantics of pattern matching!? That would be terrible!? So let's stop > doing that, and let's stop creating new ways for our tomorrow's selves > to be painted into a corner.) 
>
>
> As background, I'll observe that every time a new context comes up,
> someone suggests "we should make it null-hostile."  (Closely related: we
> should make that new kind of variable immutable.)  And, nearly every
> time, this ends up being the wrong choice.  This happened with Streams;
> when we first wrestled with nulls in streams, someone pushed for "Just
> have streams throw on null elements."  But this would have been
> terrible; it would have meant that calculations on null-friendly
> domains, that were prepared to engage null directly, simply could not
> use streams in the obvious way; calculations like:
>
>     Stream.of(arrayOfStuff)
>           .map(Stuff::methodThatMightReturnNull)
>           .filter(x -> x != null)
>           .map(Stuff::doSomething)
>           .collect(toList())

Each time I see a map(...).filter(x -> x != null), I see code that screams for flatMap() or the new mapMulti():

    .flatMap(stuff -> Optional.ofNullable(stuff.methodThatMightReturnNull()).stream())

    .mapMulti((stuff, consumer) -> {
        var result = stuff.methodThatMightReturnNull();
        if (result != null) {
            consumer.accept(result);
        }
    })

Note that in both cases, no null flows through the Stream anymore.

I think the facts that
- there were already loops declaring variables that can be null,
- ArrayList and HashMap accept null, and
- we want refactoring from a loop to a stream and vice versa
sealed the deal.

>
> would not be directly expressible, because we would have already NPEed.
> Sure, there are workarounds, but for what?  Out of a naive hope that, if
> we inject enough null checks, no one will ever have to deal with null?
> Out of irrational hatred for nulls?  Nothing good comes from either of
> these motivations.
>
> But, this episode wasn't over.  It was then suggested "OK, we can't NPE,
> but how about we filter the nulls?"  Which would have been worse.
> It would mean that, for example, doing a map+toArray on an array might not
> have the same size as the initial array -- which would violate what
> should be a pretty rock-solid intuition.  It would kill all the
> pre-sized-array optimizations.  It would mean `zip` would have no useful
> semantics.  Etc etc.

I agree, ignoring null is never the right way to do something, and which zip? We finally have one!

>
> In the end, we came to the right answer for streams, which is "let the
> nulls flow".  And this was the right choice because Streams is
> general-purpose plumbing.  The "blow early" bias is about guarding the
> gates, and thereby hopefully keeping the nulls from getting into the
> house and having wild null parties at our expense. And this works when
> the gates are few, fixed, and well marked.  But if your language
> exhibits any compositional mechanisms (which is our best tool), then
> what was the front door soon becomes the middle of the hallway after a
> trivial refactoring -- which means that no refactorings are really
> trivial.  Oof.

I agree, but at the same time, if we talk about refactoring, a switch on type is a refactoring from a cascade of if ... instanceof, and instanceof, while not null-hostile, does not match null.

>
> We already went through a good example recently where it would be
> foolish to try to exclude null (and yet we tried anyway) --
> deconstruction patterns.  If a constructor
>
>     new Foo(x)
>
> can accept null, then a deconstructor
>
>     case Foo(var x)
>
> should dutifully serve up that null.  The guard-the-gates brigade tried
> valiantly to put up new gates at each deconstructor, but that would have
> been a foolish place to put such a boundary.

Again, I fully agree with you: Foo(var) has to accept null, otherwise it's not a total pattern.
Again using the analogy of a cascade of if ... instanceof, Foo(var) is equivalent to "else" and thus should accept null.

[...]

>
> Now, switch.
As I mentioned, I think we're here mostly because we are > perpetuating the null biases of the past.? In Java 1.0, switches were > only over primitives, so there was no question about nulls.? In Java 5, > we added two new reference-typed switch targets: enums and boxes.? I > wasn't in the room when that decision was made, but I can imagine how it > went: Java 5 was a *very* full release, and under dramatic pressure to > get out the door.? The discussion came up about nulls, maybe someone > even suggested `case null` back then.? And I'm sure the answer was some > form of "null enums and primitive boxes are almost always bugs, let's > not bend over backwards and add new complexity to the language (case > null) just to accomodate this bug, let's just throw NPE." > > And, given how limited switch was, and the special characteristics of > enums and boxes, this was probably a pragmatic decision, but I think we > lost sight of the subtleties of the context.? It is almost certainly > right that 99.999% of the time, a null enum or box is a bug.? But this > is emphatically not true when we broaden the type to Object.? Since the > context and conditions change, the decision should be revisited before > copying it to other contexts. > > In Java 7, when we added switching on strings, I do remember the > discussion about nulls; it was mostly about "well, there's a precedent, > and it's not worth breaking the precedent even if null strings are more > common than null Integers, and besides, the mandate of Project Coin is > very limited, and `case null` would probably be out of scope."? While > this may have again been a pragmatic choice at the time given the > constraints, it further set us down a slippery slope where the > assumption that "switches always throw null" is set in concrete.? But > this assumption is not founded on solid ground. > > So, the better way to approach this is to imagine Java had no switch, > and we were adding a general switch today.? 
Would we really be > advocating so hard for "Oooh, another door we can guard, let's stick it > to the nulls there too"?? (And, even if we were tempted to, should we?) > > The plain fact is that we got away with null-hostility in the first > three forms of reference types in switch because switch (at the time) > was such a weak and non-compositional mechanism, and there are darn few > things it can actually do well.? But, if we were designing a > general-purpose switch, with rich labels and enhanced control flow > (e.g., guards) as we are today, where we envisioned refactoring between > switches on nested patterns and patterns with nested switches, this > would be more like a general plumbing mechanism, like streams, and when > plumbing has an opinion about the nulls, frantic calls to the plumber > are not far behind.? The nulls must flow unimpeded, because otherwise, > we create new anomalies and blockages like the streams examples I gave > earlier and refactoring surprises. And having these anomalies doesn't > really make life any better for the users -- it actually makes > everything just less predictable, because it means simple refactorings > are not simple -- and in a way that is very easy to forget about. > > If we really could keep the nulls out at the front gate, and thus define > a clear null-free domain to work in, then I would be far more > sympathetic to the calls of "new gates, new guards!"? But the gates > approach just doesn't work, and we have ample evidence of this.? And the > richer and more compositional we make the language, the more sharp edges > this creates, because old interiors become new gates. > > So, back to the case at hand (though we should bring specifics this back > to the case-at-hand thread): what's happening here is our baby switch is > growing up into a general purpose mechanism.? And, we should expect it > to take on responsibilities suited to its new abilities. > > > Now, for the backlash.? 
Whenever we make an argument for > what-appears-to-be relaxing an existing null-hostility, there is much > concern about how the nulls will run free and wreak havoc. But, let's > examine that more closely. > > The concern seems to be that, if if we let the null through the gate, > we'll just get more NPEs, at worse places.? Well, we can't get more > NPEs; at most, we can get exactly the same number.? But in reality, we > will likely get less.? There are three cases. > > 1.? The domain is already null-free.? In this case, it doesn't make a > difference; no NPEs before, none after. > > 2.? The domain is mostly null-free, but nulls do creep in, we see them > as bugs, and we are happy to get notified.? This is the case today with > enums, where a null enum is almost always a bug.? Yes, in cases like > this, not guarding the gates means that the bug will get further before > it is detected, or might go undetected.? This isn't fantastic, but this > also isn't a disaster, because it is rare and is still likely it will > get detected eventually. > > 3.? The domain is at least partially null tolerant.? Here, we are moving > an always-throw at the gates to a > might-throw-in-the-guts-if-you-forget.? But also, there are plenty of > things you can do with a null binding that don't NPE, such as pass it to > a method that deals sensibly with nulls, add it to an ArrayList, print > it, etc.? This is a huge improvement, from "must treat null in a > special, out of band way" to "treat null uniformly."? At worst, it is no > worse, and often better. > > And, when it comes to general purpose domains, #3 is much bigger than > #2.? So I think we have to optimize for #3. > > > Finally, there are those who argue we should "just" have nullable types > (T? and T!), and then all of this goes away.? I would love to get there, > but it would be a very long road.? But let's imagine we do get there. 
> OMG how terrible it would be when constructs like lambdas, switches, or
> patterns willfully try to save us from the nulls, thus doing the job
> (badly) of the type system!  We'd have explicitly nullable types for
> which some constructs NPE anyway. Or, we'd have to redefine the
> semantics of everything in complex ways based on whether the underlying
> input types are nullable or not.  We would feel pretty stupid for having
> created new corners to paint ourselves into.
>
> Our fears of untamed nulls wantonly running through the streets are
> overblown.  Our attempts to contain the nulls through ad-hoc
> gate-guarding have all been failures.  Let the nulls flow.

I think we mostly agree, but "let the nulls flow" is an ambiguous sentence.

If the question is whether a switch should never allow null, obviously the answer is no, but at the same time, I want to keep the analogy that a switch is just a compact form of a cascade of if ... value/instanceof/else (nested when there are nested destructuring patterns).

So, "case null": Tagir's recent mail shook my conviction that "case null" was the right pattern; I think we should allow "case Foo, null" too.

For "case var x", I fully agree with you that it should accept null.

For "default", I think we have not talked enough about it. Technically we don't need one in a switch on types, "case var x" is enough, so my first idea was to disallow "default" except where it is already allowed for backward compatibility. This trick avoids having to define a semantics for "default".
Note that if we keep "default", it has to accept null because fundamentally, "default" is like "else".

For a destructuring pattern on Foo, if there is no "case Foo(null)" or "case Foo(var x)" (or "case var x" at the top level, or "default"), I don't see why this pattern has to accept "null" when destructuring, because the switch will not know what to do with that null. Raising an NPE as early as possible seems the right semantics to me.

For a statement switch, if we want to be 100% compatible, a statement switch behaves as if there is always a "default", so it should always accept null at the top level.

I tried the usual Jedi mind trick to convince you that a statement switch doesn't have to behave like a legacy switch, but my Force powers don't seem to work through the internet.
In that case, I suppose we can choose among the solutions proposed by Guy to get a statement switch which has no implicit "default".

Rémi

From brian.goetz at oracle.com  Sun Aug 23 20:31:09 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 23 Aug 2020 16:31:09 -0400
Subject: Letting the nulls flow (Was: Exhaustiveness)
In-Reply-To: <167664622.567978.1598213573573.JavaMail.zimbra@u-pem.fr>
References: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
 <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
 <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com>
 <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
 <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr>
 <167664622.567978.1598213573573.JavaMail.zimbra@u-pem.fr>
Message-ID: <43b7c262-b529-24cc-94bd-1147ba029fdf@oracle.com>

> Note that if we keep "default", it has to accept null because fundamentally, "default" is like "else".

I agree on default, and I also agree on the comparison to `else`. Which underscores the importance of totality here; a switch:

    case Frog f:
    case Tadpole t:
    default / case var x / case _ / case Object o / ... any of these, they're all total

is really equivalent to the if-else chain:

    if (x instanceof Frog f) { ... }
    else if (x instanceof Tadpole t) { ... }
    else { ... }

... *precisely because* the patterns in the last line are total.  In other words, a total pattern in a switch is like the `else` of an `if` (and like with `if`, nothing can come after an unqualified `else` clause, because it would be dead.)
I believe this is the analogy you are looking for in the comments above about refactoring between switch and if chains; if the pattern is a "no op", when refactoring to/from an if-else chain, the "no op" pattern maps to the else clause.

(Note that we already see this nod to totality elsewhere in the language, too; we distinguish between static casts, unchecked casts, and dynamic casts, based on ... wait for it ... totality.  If the "test" in question is total, that affects the semantics of the "test", such as what exceptions it may throw.)

> For a destructuring pattern on Foo, if there is no "case Foo(null)", "case Foo(var x)" (or case var x at the top-level/or default), i don't see why this pattern has to accept "null" when destructuring because the switch will not know what to do with that null. Raising a NPE the earliest seems the right semantics for me.

I am not sure exactly what you are saying here.

    case Foo(var x)

always matches Foo(null), but it only matches `null` itself when `Foo(var x)` is total on the target type (IOW, when the pattern test is a no-op.)

> I tried the usual jedi mind trick to convince you that a statement switch doesn't have to behave like a legacy switch but my force power doesn't seem to work through internet.

It was a good try, you made me think for a few minutes.

> In that case, i suppose we can choose among the solutions proposed by Guy to get a statement switch which has no implicit "default".

Let's be careful to separate the "optimistically total" feature (in the presence of sealing) from the base switch semantics.  I think the base switch semantics are quite simple now -- we carve out legacy behavior for three kinds of types (enums, strings, and boxes), and then switches are 100% null-friendly after that.

Scaling optimistic totality to sealed classes wrapped in deconstruction patterns looks like it is going to require more work, but I think that's a separate problem.
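The two halves of this message can be sketched with constructs whose behavior is already settled. The sketch below is an illustration only: it deliberately avoids pattern-switch syntax, which was still being designed in this thread, and uses an if-else chain with `instanceof` patterns plus a legacy `switch` on a String. The `Animal`, `Frog`, and `Tadpole` types and the method names are hypothetical.

```java
class TotalityDemo {
    interface Animal {}
    record Frog(String name) implements Animal {}
    record Tadpole(int ageDays) implements Animal {}

    // if-else chain: instanceof never matches null, so a null target
    // skips both tests and lands in the unqualified else.
    static String classify(Animal x) {
        if (x instanceof Frog f) {
            return "frog " + f.name();
        } else if (x instanceof Tadpole t) {
            return "tadpole " + t.ageDays();
        } else {
            return "other: " + x;
        }
    }

    // Legacy behavior: a switch on a String throws NPE for a null
    // target before any case label, including default, is considered.
    static String legacy(String s) {
        try {
            switch (s) {
                case "a": return "A";
                default:  return "default";
            }
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new Frog("Kermit"))); // frog Kermit
        System.out.println(classify(null));               // other: null
        System.out.println(legacy("z"));                  // default
        System.out.println(legacy(null));                 // NPE
    }
}
```

The contrast is the point of the totality argument: the `else` branch is reached for null, while the legacy `default` is not.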
From forax at univ-mlv.fr Sun Aug 23 21:10:41 2020 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sun, 23 Aug 2020 23:10:41 +0200 (CEST) Subject: Letting the nulls flow (Was: Exhaustiveness) In-Reply-To: <43b7c262-b529-24cc-94bd-1147ba029fdf@oracle.com> References: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <1595524268.443568.1598097688934.JavaMail.zimbra@u-pem.fr> <167664622.567978.1598213573573.JavaMail.zimbra@u-pem.fr> <43b7c262-b529-24cc-94bd-1147ba029fdf@oracle.com> Message-ID: <907087257.892.1598217041106.JavaMail.zimbra@u-pem.fr> ----- Mail original ----- > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "Guy Steele" , "Tagir Valeev" , "amber-spec-experts" > > Envoy?: Dimanche 23 Ao?t 2020 22:31:09 > Objet: Re: Letting the nulls flow (Was: Exhaustiveness) >> Note that if we keep "default", it has to accept null because fundamentally, >> "default" is like "else". > > I agree on default, and I also agree on the comparison to `else`. Which > underscores the importance of totality here; a switch: > > ??? case Frog f: > ??? case Tadpole t: > ??? default / case var x / case _ / case Object o / ... any of these, > they're all total > > is really equivalent to the if-else chain: > > ??? if (x instanceof Frog f)? { ... } > ??? else if (x instanceof Tadpole t) { ... } > ??? else { ... } > > ... *precisely because* the patterns in the last line are total.? In > other words, a total pattern in a switch is like the `else` of an `if` > (and like with `if`, nothing can come after an unqualified `else` > clause, because it would be dead.)? I believe this is the analogy you > are looking for in the comments above about refactoring between switch > and if chains; if the pattern is a "no op", when refactoring to/from an > if-else chain, the "no op" pattern maps to the else clause. 
yes > > (Note that we already see this nod to totality elsewhere in the > language, too; we distinguish between static casts, unchecked casts, and > dynamic casts, based on ... wait for it ... totality.? If the "test" in > question is total, that affects the semantics of the "test", such as > what exceptions it may throw.) Not a good analogy because the fact that the compiler may remove the cast is actually crazy, by example, does this code raise an exception or not var list = (List)(List) List.of(3); // i've just an unsafe cast somewhere Object o = list.get(0); System.out.println(o); and same question with var o = list.get(0); The fact that a CCE can appear "randomly" is a big headache for my students. I will prefer not to repeat that mistake of the past. > >> For a destructuring pattern on Foo, if there is no "case Foo(null)", "case >> Foo(var x)" (or case var x at the top-level/or default), i don't see why this >> pattern has to accept "null" when destructuring because the switch will not >> know what to do with that null. Raising a NPE the earliest seems the right >> semantics for me. > > I am not sure exactly what you are saying here. > > ??? case Foo(var x) > > always matches Foo(null), but it only matches `null` itself when > `Foo(var x)` is total on the target type (IOW, when the pattern test is > a no-op.) the sentence should have been i don't see why this pattern has to accept "null" as component > > >> I tried the usual jedi mind trick to convince you that a statement switch >> doesn't have to behave like a legacy switch but my force power doesn't seem to >> work through internet. > > It was a good try, you made me think for a few minutes. >> In that case, i suppose we can choose among the solutions proposed by Guy to get >> a statement switch which has no implicit "default". > > Let's be careful to separate the "optimistically total" feature (in the > presence of sealing) from the base switch semantics.? 
I think the base > switch semantics are quite simple now -- we carve out legacy behavior > for three kinds of types (enums, strings, and boxes), and then switches > are 100% null-friendly after that. > > Scaling optimistic totality to sealed classes wrapped in deconstruction > patterns looks like it is going to require more work, but I think that's > a separate problem. I don't disagree. R?mi From forax at univ-mlv.fr Sun Aug 23 21:16:23 2020 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sun, 23 Aug 2020 23:16:23 +0200 (CEST) Subject: switch on Class ? In-Reply-To: References: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr> Message-ID: <1278596578.1010.1598217383928.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Remi Forax" , "amber-spec-experts" > > Envoy?: Dimanche 23 Ao?t 2020 17:40:05 > Objet: Re: switch on Class ? > It has indeed come up before. There is some overlap with pattern switch, and > some non-overlap, but it's pretty clear the impact of pattern switch is much > larger. > I would much prefer to finish the discussions on the fundamentals first, which > are actually blocking progress on a much-higher-priority feature. So let's come > back to this later. > I also have reason to believe that, if we do generalized patterns property, we > won't need to do this as a language feature, we can do it as a library feature. > So, let's come back to this later. Exhaustiveness is hard to emulate in a library. R?mi > On 8/23/2020 11:25 AM, Remi Forax wrote: >> There is a feature of Pizza (remember Generic Java ++) we have not discussed >> yet, >> being able to do a switch on Class. 
>> public sealed interface Numeric<T extends Numeric<T>>
>>     permits Amount, Percentage, Quantity {
>>   private BigDecimal value() {
>>     return switch(this) {
>>       case Amount(value) -> value;
>>       case Percentage(value) -> value;
>>       case Quantity(value) -> value;
>>     };
>>   }
>>   private static <T extends Numeric<T>> T fromValue(Class<T> type, BigDecimal
>>       newValue) {
>>     return type.cast(switch(type) {
>>       case Amount.class -> new Amount(newValue);
>>       case Percentage.class -> new Percentage(newValue);
>>       case Quantity.class -> new Quantity(newValue);
>>     });
>>   }
>>   default T add(T numeric) { return fromValue(getClass(),
>>       value().add(numeric.value())); }
>> }
>> with Amount being declared like this
>> record Amount(BigDecimal value) implements Numeric<Amount> { }
>> This kind of switch is interesting because it's also one that can be exhaustive,
>> like the switch on type or the switch on Enum.
>> In the method fromValue, type is typed as a Class<T>, so a Class<? extends
>> Numeric<...>>, and given that Numeric is a sealed class only permitting Amount,
>> Percentage and Quantity, the only possible Class for a switch(type) are
>> Amount.class, Percentage.class and Quantity.class.
>> I'm pretty sure the call fromValue(getClass(), ...) does not compile because
>> the compiler has no idea that all subtypes of Numeric implement
>> Numeric but you get the idea.
>> regards,
>> Rémi

From brian.goetz at oracle.com Sun Aug 23 21:49:31 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 23 Aug 2020 17:49:31 -0400
Subject: switch on Class ?
In-Reply-To: <1278596578.1010.1598217383928.JavaMail.zimbra@u-pem.fr>
References: <378945233.554008.1598196329618.JavaMail.zimbra@u-pem.fr> <1278596578.1010.1598217383928.JavaMail.zimbra@u-pem.fr>
Message-ID: <0cd81484-2ec5-8269-78e7-85b6b92b07d2@oracle.com>

>
>     I also have reason to believe that, if we do generalized patterns
>     properly, we won't need to do this as a language feature, we can
>     do it as a library feature.
>     So, let's come back to this later.
>
>
> Exhaustiveness is hard to emulate in a library.

Understood, but I would happily trade non-exhaustive, library-based switches on all kinds of non-constants for adding YET ANOTHER weird, bespoke, ad-hoc form of switch. Or even the possibility of same in the future for another weird thing now. (Anyway, this is a topic for later.)

From guy.steele at oracle.com Mon Aug 24 03:35:14 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Sun, 23 Aug 2020 23:35:14 -0400
Subject: [pattern-switch] Totality
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <10EB1C30-FAF0-4BB7-A9D7-4DC220C4919A@oracle.com> <80EC2A63-E38B-4A9C-9F3F-A0FA5DFD1292@oracle.com>
Message-ID:

Option 8 ("switch case (x) { ... }") is increasingly appealing to me, because it is completely compatible, flags the variant up front rather than at the end, is easily pronounced, and has a story that I think is simple to explain.

However, I would also like to offer this variant, which has two additional constraints and is perhaps in some sense "the switch statement we wish we had had all along":

Option 9: The statement "switch case (x) { ... }" is like "switch (x) { ... }" but insists that the value x be handled by some case clause. It is a static error if any SwitchLabel of the switch statement begins with "default". It is a static error if the set of case patterns is not at least optimistically total on the type of x (therefore it is impossible for the switch statement to silently do nothing), and you get residue checking.
It is a static error if the last BlockStatement in any SwitchBlockStatementGroup can complete normally. It is a static error if any SwitchLabel of the switch statement is not part of a SwitchBlockStatementGroup.

(The effect of the two additional constraints is to prevent fallthrough and fallout. Thus under this definition a "switch case" always transfers control to a nonempty set of BlockStatements that follows some switch label that begins with "case", and those statements cannot fall through -- even the last set of statements needs to have a "break" or something. Draconian, perhaps even Procrustean, but opt-in.)

> On Aug 22, 2020, at 6:38 PM, Guy Steele wrote:
>
> Option 8: The statement "switch case (x) { ... }" is like "switch (x) { ... }" but insists that the value x be handled by some case clause. The switch body cannot contain a default clause (static error if it does), and it's impossible for the switch statement to silently do nothing. It's a static error if the set of case patterns is not at least optimistically total, and you get residue checking.
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch (x) { case RED: ... } // Okay
>
>     enum Color { RED, GREEN }
>     Color x;
>     switch case (x) { case RED: ... } // static error: cases are not optimistically total
>
> Note that you can still use int and String types, but because default clauses are forbidden, you have to use a total pattern instead:
>
>     switch case (myString.length()) {
>         case 2: case 3: case 5: case 7: primeSquawk();
>         case 4: case 9: squareSquawk();
>         case int n: squawk(n);
>     }
>
> I think this option clearly dominates options 4 and 7 ("switch enum (x)" and "switch ((Color) x)").
>
> Note that it's not completely redundant to allow "switch case" expressions as well (which would ease refactoring), but the only extra constraint added by "switch case" is that a default clause cannot appear. If this option were adopted, I suspect it would quickly become idiomatic to use "switch case"
> on enums and many sealed types, and to use "switch" with a "default" clause in most other cases.
>

From forax at univ-mlv.fr Mon Aug 24 14:08:26 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 24 Aug 2020 16:08:26 +0200 (CEST)
Subject: switch: using an expicit type as total is dangerous
Message-ID: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr>

Ok, let's restart this conversation; actually there are two issues.

1/ There is no syntax for saying that a type is total or not. If we take a close look at the different mails, at some point several of us have used different syntax to show that part of a pattern is total: Guy has used '_', I've proposed 'default', Brian used a blank line after the pattern, and we all have used "var" at some point.

2/ Using an explicit type for a total type is a footgun because the semantics will change if the hierarchy, or the return type of a method switched upon, changes. Getting an error from the compiler when the behaviour changes is important. Devs are used to changing something in their code and fixing all the errors reported by the compiler (using the refactoring capability of the IDE or not); if we let people use an explicit type as a total pattern, the change of semantics will go undetected.

We have so far discussed 1/ and 2/ at the same time, but conflating the two is maybe a mistake. I believe that before talking about 1/, we should determine if we should tackle 2/ or not.

As I already said, for me 2/ is the real issue, and I think Tagir agrees.

Rémi

From brian.goetz at oracle.com Mon Aug 24 16:23:47 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 24 Aug 2020 12:23:47 -0400
Subject: Optimistic totality
Message-ID: <1ec09a32-13b6-6739-35a7-691ffad427e1@oracle.com>

As I mentioned yesterday, I think our ideas about totality and null handling were getting polluted by our desire to support intuitive, optimistic totality. So let's try to separate them, by outlining some goals for optimistic totality.
First, I'll posit that we're now able to stand on a more solid foundation:

 - Null is just another value for purposes of pattern matching; total patterns match null.
 - Null is just another value for purposes of switches; switches will feed null into total cases.
 - The perceived null-hostility of switch is actually about switches on enums, boxes, and strings; in the general case, we don't want, or need, such null-hostility.

This is a much simpler story, and has many fewer sharp edges.

Declaring clarity on that, we now have two additional problems to solve:

 - How to fill the gap between (always total) expression switches and (always partial) statement switches. Totality checking is a useful static analysis that can identify bugs earlier, and being able to restore symmetry in semantics (even if it requires asymmetry in syntax) reduces unexpected errors and potholes.
 - How to extend the optimistic totality of expression switches over enums (which is a very restricted case) to the more general case of switches over sealed types, and switches with weakly total cases (such as total deconstruction patterns.)

This mail will focus mostly on the second problem; I'll start another thread for the first.

The goal of optimistic totality handling is to allow users to write a set of cases that covers the target "well enough" that a catch-all throwing default is not needed. This has two benefits:

 - Let the compiler write the dumb do-nothing code, rather than making the user do it;
 - If the user writes a throwing catch-all, we lose the opportunity to type-check the assumption that the switch was total in the first place.

Users are well aware of the first benefit, but the second benefit is actually more important. If the user writes:

    Freq frequency = switch (trafficLight) {
        case RED -> Freq.ofTHz(450);
        case YELLOW -> Freq.ofTHz(525);
        default -> throw ...;
    }

We are deprived of two ways to help:

 - We cannot tell whether the user meant for { RED, YELLOW } to cover the space, so we cannot offer helpful type checking of "you forgot green."
 - Even if the code, as written, does cover the space, if a new constant / permitted subtype is added later, we lose the opportunity to catch it at next compilation, and alert the user to the fact that their assumption of totality was broken by someone else.

On the other hand, if there is no default clause, we get exhaustiveness checking when the code is first written, and continual revalidation of this assumption on every recompile.

OK, so optimistic totality is good. What does that really mean? We already know one case:

 - Specifying all the known constants of an enum, but no default or null case.

Because this case is so limited, we handled this one pretty well in 12; we NPE on null, and ICCE on everything else.

Another (new) case is:

 - When we have a _weakly total_ (total except for null) pattern on the target type. A key category of weakly total patterns are deconstruction patterns whose sub-patterns are total. Such as:

    var x = switch (box) {
        case Box(var x) b -> ...
    }

The pattern `Box(var x)` matches all _non-null_ boxes. (It can't match null boxes, because we'd be invoking the deconstructor with a null receiver, which would surely NPE anyway, since a deconstructor is going to have some `x = this.x` ~99.99% of the time.) So, should this be good enough to declare the switch optimistically total? I think so; having to say `case null` in this switch would be irritating to users for no good reason.

What we've done is flipped things around; rather than saying "switches NPE on null", we can say "total switches with optimistically total case sets can throw on silly inputs" -- because the very concept of optimistic totality suggests that we think the residue consists only of silly inputs (and we are only throwing when the switch is total anyway.)
Now we can have a more refined definition of silly inputs.

Another case:

 - The sealed class analogue of an enum switch.

Here, we have a sealed class C, and a set of patterns such that, for every permitted subtype D of C, some subset of the patterns is (optimistically) total on D. Now, our residue has two inhabitants: null, and novel subclasses. Do we think this should be optimistically total? Yes; all the reasons why a throwing default is bad on the enum case apply to the sealed case, there is just a larger residue set.

Another case:

 - When we have a deconstructor D(C), and a set of patterns D(P1)...D(Pn) such that P1..Pn are optimistically total on C, we would like to conclude that the lifted patterns are optimistically total on D.

Example:

    switch (boxOfShape) {   // Shape = Circle + Rect
        case Box(Circle c):
        case Box(Rect r):
    }

Our claim here is that because Circle + Rect are o.t. on Shape, Box(Circle)+Box(Rect) should be o.t. on Box. Do we buy that? Again, I think we want this; asking users to insert Box(null) or Box(novel) cases to get totality checking is counterproductive.

What we see here is that we have an accumulation of situations where we think a given set of patterns covers a target type "well enough" that we are willing to (a) let the user skate on being truly total, and (b) engage enhanced type checking against the set of "good enough" cases.

After writing this, I think we are, once again, being overly constrained by (and worse, distracted by) "consistency" with what we decided in 12 for the simple case of enum switches: that the answer always has to be some form of ICCE or NPE. These were easy answers when the residue was so easily characterized, but trying to extrapolate from them too rigidly may be a mistake.
So, I think that we should save NPE and ICCE for the more accurate, narrow uses we found for them in 12, and for any "complex" residue, just define a new exception type -- and focus our energy on ensuring we get good error messages out of it, and move past this distraction. The real point here is defining what we consider to be acceptable residue for an optimistically total switch, and ensuring that we can deliver clear error messages when a Box(Hexagon) shows up.

From brian.goetz at oracle.com Mon Aug 24 17:30:31 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 24 Aug 2020 13:30:31 -0400
Subject: Opting into totality
Message-ID: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com>

The previous mail, on optimistic totality, only applied to switches that were already total. Currently that includes only expression switches; we want a not-too-invasive way to mark a statement switch as total as well.

First, I'd like to rule out the variants of "automatically make XYZ statement switches total", such as, e.g., "statement switches over enums or sealed types" -- because partial switches are entirely _useful and reasonable_. A partial switch is like an `if` without an `else`. Nothing wrong with that, regardless of whether the target is an enum or sealed type or some other known-enumerable type. There's no reason to rule out partiality here. Users won't thank us.
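
The comparison between a partial switch and an `if` without an `else` can be sketched in present-day Java. This is only an illustration under assumed names (`PartialSwitchDemo`, `Light`, `viaSwitch`, and `viaIf` are hypothetical, not from the thread): a statement switch over an enum that handles one constant and silently does nothing for the rest.

```java
class PartialSwitchDemo {
    // Hypothetical enum, used only for this illustration.
    enum Light { RED, YELLOW, GREEN }

    // Partial statement switch: only RED is handled; for any other
    // constant the switch completes without doing anything.
    static String viaSwitch(Light light) {
        String action = "";
        switch (light) {
            case RED:
                action = "stop";
                break;
        }
        return action;
    }

    // The same logic (for non-null inputs) written as an if without an else.
    static String viaIf(Light light) {
        String action = "";
        if (light == Light.RED) {
            action = "stop";
        }
        return action;
    }

    public static void main(String[] args) {
        for (Light light : Light.values()) {
            System.out.println(light + ": switch=[" + viaSwitch(light)
                    + "] if=[" + viaIf(light) + "]");
        }
    }
}
```

Both methods agree on every constant, which is the sense in which partiality is unremarkable for statements.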
There are a number of syntactic options for opting in, such as:

 - A modified keyword (total-switch)
 - A modifier before or after `switch` (e.g., `switch enum`, as per Guy's #4, `sealed switch`, or `switch case` as per #8)
 - Turning the statement switch into a void expression switch (Guy's #5)
 - Giving `default` new jobs to do
 - A streamlined form of throw-on-residue default (e.g., `default: throw`, #6)
 - Using cast expressions to highlight the target type (#7)

There is something pretty attractive about saying only "this statement switch is total in exactly the same ways that all expression switches are", because then we have factored the interaction between the switch being total (or not) and the set of cases being optimistically total (or not), rather than introduced yet another mechanism whose only purpose is to plug a hole.

The idea of "turn it into an expression switch by casting to void" is cute, but will always be perceived by users as an "idiom" rather than an actual feature. I would like to encourage developers to think more in terms of totality, because it contains hidden benefits for them. (I am also vaguely fearful of an unexpected interaction when we try to turn `void` into a type in Valhalla, and would rather "avoid" creating new constraints on this.)

The idea of `default: throw` reads nicely, but the semantics feel too ad-hoc, in that we're inferring the behavior of `throw` in the context of a `default` in the context of an otherwise-optimistically-total switch. I would expect that people would then expect `throw` to mean "throw the obvious thing" in other contexts, leading to disappointment.

Another option is to engage some notion of unreachability. It has been requested a few times before to have a feature like an "unreachable" statement:

    void m(int x) {
        if (x == 0)
            throw new FooException();
        else
            return;
        unreachable;
    }

Which would be interpreted as an assertion that this statement is unreachable, that the compiler should issue an error if it can prove the statement _is_ reachable, and translate to throwing some sort of assertion error. If we had such a notion, we could extend it to say:

    default: optimistically-unreachable

which would have the effect of (a) engaging reachability analysis and generating compiler errors if the case is known to be reachable, and (b) throwing the expected set of errors on residue at runtime.

From brian.goetz at oracle.com Mon Aug 24 18:57:03 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 24 Aug 2020 14:57:03 -0400
Subject: switch: using an expicit type as total is dangerous
In-Reply-To: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr>
References: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr>
Message-ID: <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com>

> 2/ using an explicit type for a total type is a footgun because the semantics will change if the hierarchy or the return type of a method switched upon change.

Sorry, I think this argument is a pure red herring. I get why this is one of those "scary the first time you see it" issues, but I think the fear has been overblown to near-panic proportions. We've spent a lot of time talking about it and, the more we talk, the less worried I am.

The conditions that have to combine for this to happen are already individually rare:

    - a hierarchy change, combined with
    - enough use-site type inference that it is not obvious what the type dependencies are, combined with
    - null actually being a member of the domain, combined with
    - users not realizing null is a member of the domain.

Then, for it to actually be a problem, not only do all of the above have to happen, but an unhandled null has to actually show up.
Even then, the severity of this case is low -- most likely, the NPE gets moved from one place to another. Even then, the remediation cost is trivial.

So by the above, this seems like a corner^7 case. How much are you willing to distort the language to avoid it? I think the rational answer is "none". And we've spilled a *lot* of ink on this issue -- to the point where it has distracted us from much bigger concerns -- and the only thing you've managed to convince me of is that, the more you look at the issue, the more it becomes obvious that there is only one reasonable choice here. So unless you have a powerful NEW argument other than "something could change under subtle conditions, and then something bad might happen", I think it's time to move on.

> I believe that before talking about 1/, we should determine if we should tackle 2/ or not.

I thought we had already established this, on both counts :)

To (2), I think the semantics proposed are right. The answer to "I can write code that is brittle because it depends on the type hierarchy" is not to forgo hierarchy analysis.

To (1), I think it treats something whose importance is "x" by spending 1000x worth of user-model complexity budget to arrive at less sensible semantics.

If you have new arguments, I'm willing to hear them. But I don't think there's any point in repeating the same arguments.

From guy.steele at oracle.com Mon Aug 24 20:12:26 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 24 Aug 2020 16:12:26 -0400
Subject: Opting into totality
In-Reply-To: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com>
References: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com>
Message-ID: <396981E9-C301-48CC-AB63-F4BB33F77C64@oracle.com>

> On Aug 24, 2020, at 1:30 PM, Brian Goetz wrote:
>
> The previous mail, on optimistic totality, only applied to switches that were already total. Currently that includes only expression switches; we want a not-too-invasive way to mark a statement switch as total as well. .
. .

I am going to argue here that, just as fear of letting nulls flow stemmed from an early design that conflated multiple design issues as a result of extrapolating from too few data points (enums and strings), we have been boxed into another corner because we conflated expression-ness and the need for totality. In this essay I will first tease these two issues apart, and then suggest how we might go forward using what we have learned from discussions of the last few weeks.

Going back to the dawn of time, a switch statement does not have to be total. Why is this possible? Because there is an obvious default behavior: do nothing. If we were to view it in terms of delivering a value of some type, we would say that type is "void".

Then why did we not allow a switch expression to be _exactly_ analogous? In fact, we could have, by relying on existing precedent in the language: if no switch label matches and there is no default, or if execution of the statements of the switch block completes normally, we could simply decree that a switch expression has the default behavior "do nothing" and delivers a _default value_ -- exactly as we do for initialization of fields and array components. So for

    enum Color { RED, GREEN, BLUE }
    Color x = ...
    int n = switch (x) { RED -> 1; GREEN -> 2; };

then if x is BLUE, n will get the value 0.

But I am guessing that we worried about programming errors and demanded totality for switch expressions, so we enforced it by fiat because we had no other mechanism to request totality.

So, standing where we are today, first imagine that we relax the totality requirement of switch expressions and allow them to produce default values (zero or null) in exactly the same situation that a statement switch would "do nothing". Next, let us introduce pattern matching in switch labels, as we have discussed at length.
Then we introduce two mechanisms that we have discussed more recently, and say that each of these mechanisms may be used in either a switch statement or a switch expression.

The first is a switch label of the form "default <pattern>", which behaves just like a switch label "case <pattern>" except that it is a static error if the <pattern> is not total on the type of the selector expression. This mechanism is good for extensible type hierarchies, where we expect to call out a number of special cases and then have a catch-all case, and we want the compiler to confirm to us on every compilation that the catch-all case actually does catch everything.

The second is the possibility of writing "switch case" rather than "switch", which introduces these extra constraints on the switch block: It is a static error if any SwitchLabel of the switch statement begins with "default". It is a static error if the set of case patterns is not at least optimistically total on the type of the selector expression. It is a static error if the last BlockStatement in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, can complete normally. It is a static error if any SwitchLabel of the switch statement is not part of a SwitchBlockStatementGroup or SwitchRule. In addition, the compiler automatically inserts SwitchBlockStatementGroups or SwitchRules to cover the residue, so as to throw an appropriate error at run time if the value produced by the selector expression belongs to the residue. This mechanism is good for enums and sealed types, that is, situations where we expect to enumerate all the special cases explicitly and want to be notified by the compiler (or failing that, at run time) if we have failed to do so.

In this way two distinct methods are provided for requesting totality checking (and note that they are mutually exclusive), and either may be used with either a switch statement or a switch expression.
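
The fallthrough that the "switch case" constraints are meant to rule out can be seen in existing Java. The following is a minimal sketch, assuming Java 14 or later for the arrow form; the "switch case" syntax itself is only a proposal in this thread, and the class and method names (`FallthroughDemo`, `colonForm`, `arrowForm`) are hypothetical.

```java
class FallthroughDemo {
    // Colon-form statement switch: the "prime" group has no break, so
    // control falls through into the "square" group.
    static String colonForm(int n) {
        StringBuilder out = new StringBuilder();
        switch (n) {
            case 2: case 3: case 5: case 7:
                out.append("prime ");
                // no break here: falls through to the next group
            case 4: case 9:
                out.append("square ");
                break;
            default:
                out.append("other ");
        }
        return out.toString().trim();
    }

    // Arrow-form switch expression: each rule yields exactly one value,
    // so there is no fallthrough between rules.
    static String arrowForm(int n) {
        return switch (n) {
            case 2, 3, 5, 7 -> "prime";
            case 4, 9 -> "square";
            default -> "other";
        };
    }

    public static void main(String[] args) {
        System.out.println(colonForm(3)); // prime square
        System.out.println(arrowForm(3)); // prime
    }
}
```

Under the proposed rule that the last BlockStatement of every group must not complete normally, the colon-form method above would be rejected, while the arrow form already behaves the way the constraint demands.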
At this stage, we have six possibilities, generated by an _orthogonal_ choice of (1) statement versus expression, and (2) use of "default <pattern>", "switch case", or neither.

But we are still justly worried that _one_ of these _six_ cases is error-prone: the possibility of switch expressions generating default values. So we can rule that out again, but in a more principled way that still retains both orthogonality of choice and backward compatibility. We replace this line of the JLS:

  - If the type of the selector expression is not an enum type, then there is exactly one default label associated with the switch block.

with this:

  - If the type of the selector expression is not an enum type, then either the "switch case" form is used or there is exactly one default label associated with the switch block.

Furthermore, we retain the existing sentence in the description of the run-time evaluation of switch expressions that says "If no switch label matches, then an IncompatibleClassChangeError is thrown and the entire switch expression completes abruptly for that reason."

In this way we have six orthogonally generated choices (instead of two non-orthogonally-generated possibilities), of which we then, for the sake of backward compatibility, allow the most dangerous one to be used only for enums, and add back the previously existing ICCE guardrail for that situation, so that switch expressions never generate default values after all.
From brian.goetz at oracle.com Mon Aug 24 21:04:40 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 24 Aug 2020 17:04:40 -0400
Subject: Opting into totality
In-Reply-To: <396981E9-C301-48CC-AB63-F4BB33F77C64@oracle.com>
References: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com> <396981E9-C301-48CC-AB63-F4BB33F77C64@oracle.com>
Message-ID: <84bab351-8521-da49-4592-79c83fab1887@oracle.com>

> I am going to argue here that, just as fear of letting nulls flow
> stemmed from an early design that conflated multiple design issues as a
> result of extrapolating from too few data points (enums and strings),
> we have been boxed into another corner because we conflated
> expression-ness and the need for totality.

I'm not sure we _conflated_ the two, as we did this with our eyes open (and fairly recently), but I suspect I agree with the rest -- that for $REASONS, we introduced an asymmetry that we knew would come back to bite us, and left a note for ourselves to come back and revisit, especially as optimistic totality became more important (e.g., through sealed types.)

> Going back to the dawn of time, a switch statement does not have to be
> total. Why is this possible? Because there is an obvious default
> behavior: do nothing. If we were to view it in terms of delivering a
> value of some type, we would say that type is "void".

Yep. Cue usual comparison to "if without else." Partiality, for statements, is OK, but not for expressions. Can't have a ternary expression with no `: alternative` part.

> Then why did we not allow a switch expression to be _exactly_
> analogous? In fact, we could have, by relying on existing precedent
> in the language: if no switch label matches and there is no default,
> or if execution of the statements of the switch block completes
> normally, we could simply decree that a switch expression has the
> default behavior "do nothing"
> and delivers a _default value_ -- exactly
> as we do for initialization of fields and array components. So for
>
>     enum Color { RED, GREEN, BLUE }
>     Color x = ...
>     int n = switch (x) { RED -> 1; GREEN -> 2; };
>
> then if x is BLUE, n will get the value 0.
>
> But I am guessing that we worried about programming errors and
> demanded totality for switch expressions, so we enforced it by fiat
> because we had no other mechanism to request totality.

Yes, that's right. Note that this is analogous to another feature that is frequently requested -- the so-called "safe dereference" operators, where `x?.y` yields a default value if `x` is null. When this one first reared its head (and about a thousand times since, since it comes up near-constantly), our objection was that, while `null` might conceivably be a reasonable default for such an expression if `y` is of reference type, yielding a default primitive value is more likely to lead to errors than not. (This is not unlike the problem with `Map::get`, where `null` means both "mapping not present" and "element mapped to null", and there's no way to tell the difference unless you can freeze the map for updates while asking two questions (Map::containsKey and Map::get).) The argument against both is the same.

> The first is a switch label of the form "default <pattern>", which
> behaves just like a switch label "case <pattern>" except that it is a
> static error if the <pattern> is not total on the type of the selector
> expression. This mechanism is good for extensible type hierarchies,
> where we expect to call out a number of special cases and then have a
> catch-all case, and we want the compiler to confirm to us on every
> compilation that the catch-all case actually does catch everything.

What I like about this use of default is that it decomposes into independent parts.
The `default` case means "everything else"; `default <pattern>` means "everything else, and destructure that everything else with this pattern, which had better be total because, well, I'm matching everything else." The addition of a pattern doesn't change the meaning of default; it is a form of composition. You could recast this as taking one feature -- "patterns in switch" -- and turning it into two: patterns in case labels, and patterns in default labels.

> The second is the possibility of writing "switch case" rather than
> "switch", which introduces these extra constraints on the switch
> block: It is a static error if any SwitchLabel of the switch statement
> begins with "default". It is a static error if the set of case
> patterns is not at least optimistically total on the type of the
> selector expression. It is a static error if the last BlockStatement
> in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule,
> can complete normally. It is a static error if any SwitchLabel of the
> switch statement is not part of a SwitchBlockStatementGroup or
> SwitchRule. In addition, the compiler automatically inserts
> SwitchBlockStatementGroups or SwitchRules to cover the residue, so as
> to throw an appropriate error at run time if the value produced by the
> selector expression belongs to the residue. This mechanism is good
> for enums and sealed types, that is, situations where we expect to
> enumerate all the special cases explicitly and want to be notified by
> the compiler (or failing that, at run time) if we have failed to do so.

Is it fair to recharacterize this as follows -- we divide potentially-total statement switches into two kinds:

 - Those that arrive at totality via a catch-all clause;
 - Those that arrive at totality via an (optimistic) covering of an enumerated domain ("switch by parts")

and we provide a separate mechanism for each to declare their totality (which engages different type checking and translation.)
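
The two routes to totality have a counterpart in the switch expressions Java already has. A minimal sketch, assuming Java 14 or later; the thread is about extending this to statement switches and patterns, which this example does not attempt, and the names (`TotalityRoutes`, `Suit`, `size`, `color`) are hypothetical.

```java
class TotalityRoutes {
    // Hypothetical enum, used only for this illustration.
    enum Suit { CLUBS, DIAMONDS, HEARTS, SPADES }

    // Route 1: totality via a catch-all clause. The int domain cannot be
    // enumerated, so the default rule is what makes the switch total.
    static String size(int n) {
        return switch (n) {
            case 0 -> "none";
            case 1 -> "one";
            default -> "many";
        };
    }

    // Route 2: totality via covering an enumerated domain. There is no
    // default; the compiler checks that every Suit constant is handled,
    // and deleting one rule turns this into a compile-time error.
    static String color(Suit suit) {
        return switch (suit) {
            case CLUBS, SPADES -> "black";
            case DIAMONDS, HEARTS -> "red";
        };
    }

    public static void main(String[] args) {
        System.out.println(size(5));
        System.out.println(color(Suit.HEARTS));
    }
}
```

The second method is the one that benefits from revalidation on every recompile: adding a fifth constant to `Suit` would make `color` fail to compile until a rule for it is added.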
(I see later that you say this, so good, we're on the same page.)

(We might have called this "enum switch", especially if we had taken Alan's suggestion of declaring sealed types with a more enum-like syntax.)

Some comments on the restrictions:

> It is a static error if the last BlockStatement in _any_
> SwitchBlockStatementGroup, or the Block in any SwitchRule, can
> complete normally.

This is mostly already true in general; we envision it is a static error if you fall into, or out of, any case that has bindings. (We might relax this to allow falling _into_ a total case.) This was a simplification; we could support binding merging, but the return-on-complexity didn't seem quite there.

>   - If the type of the selector expression is not an enum type, then
> either the "switch case" form is used or there is exactly
> one default label associated with the switch block.

I presume you intend that this eventually becomes true for switches on sealed types as well.

One question I have is that optimistic totality can apply in switches that are not on sealed types, or where sealed types show up in nested contexts. For example:

    switch (box) {
        case Box(var x): ...
    }

Here, I think we want to say that Box(var x) is o.t. on Box, but it doesn't match null. So how does the programmer indicate that they want to get totality checking and residue rejection?

Similarly, suppose we have a sealed type Shape = Circ + Rect, and the obvious container Box:

    switch (boxOfShape) {
        case Box(Circ c):
        case Box(Rect r):
    }

Again, I think we want this set of cases to be o.t., but the switch is not on a sealed type. I am not sure how to integrate these cases into your model.
URL: From guy.steele at oracle.com Mon Aug 24 22:01:16 2020 From: guy.steele at oracle.com (Guy Steele) Date: Mon, 24 Aug 2020 18:01:16 -0400 Subject: Opting into totality In-Reply-To: <84bab351-8521-da49-4592-79c83fab1887@oracle.com> References: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com> <396981E9-C301-48CC-AB63-F4BB33F77C64@oracle.com> <84bab351-8521-da49-4592-79c83fab1887@oracle.com> Message-ID: > On Aug 24, 2020, at 5:04 PM, Brian Goetz wrote: > > > >> I am going to argue here that, just as fear of letting nulls flow stemmed from a early design that conflated multiple design issues as a result of extrapolating from too few data points (enums and strings), we have been boxed into another corner because we conflated expression-ness and the need for totality. > > I'm not sure we _conflated_ the two, as we did this with our eyes open (and fairly recently), but I suspect I agree with the rest -- that for $REASONS, we introduced an asymmetry that we knew would come back to bite us, and left a note for ourselves to come back and revisit, especially as optimistic totality became more important (e.g., through sealed types.) > >> Going back to the dawn of time, a switch statement does not have to be total. Why is this possible? Because there is an obvious default behavior: do nothing. If we were to view it in terms of delivering a value of some type, we would say that type is ?void?. > > Yep. Cue usual comparison to "if without else." Partiality, for statements, is OK, but not for expressions. Can't have a ternary expression with no `: alternative` part. > >> Then why did we not allow a switch expression to be _exactly_ analogous? In fact, we could have, by relying on existing precedent in the language: if no switch label matches and there is no default, or if execution of the statements of the switch block completes normally, we could simply decree that a switch expression has the default behavior ?do nothing? 
and delivers a _default value_?exactly as we do for initialization of fields and array components. So for >> >> enum Color { RED, GREEN, BLUE } >> Color x = ? >> int n = switch (x) { RED -> 1; GREEN -> 2; }; >> >> then if x is BLUE, n will get the value 0. >> >> But I am guessing that we worried about programming errors and demanded totality for switch expressions, so we enforced it by fiat because we had no other mechanism to request totality. > > Yes, that's right. Note that this is analogous to another feature that is frequently requested -- the so-called "safe dereference" operators, where `x?.y` yields a default value if `x` is null. When this one first reared its head (and about a thousand times since, since it comes up near-constantly), our objection was that, while `null` might conceivably be a reasonable default for such an expression if `y` is of reference type, yielding a default primitive value is more likely to lead to errors than not. (This is not unlike the problem with `Map::get`, where `null` means both "mapping not present" and "element mapped to null", and there's no way to tell the difference unless you can freeze the map for updates while asking two questions (Map::containsKey and Map::get.) The argument against both is the same. > >> The first is a switch label of the form ?default ?, which behaves just like a switch label ?case ? except that it is a static error if the is not total on the type of the selector expression. This mechanism is good for extensible type hierarchies, where we expect to call out a number of special cases and then have a catch-all case, and we want the compiler to confirm to us on every compilation that the catch-all case actually does catch everything. > > What I like about this use of default is that it decomposes into independent parts. 
The `default` case means "everything else"; `default ` means "everything else, and destructure that everything else with this pattern, which had better be total because, well, I'm matching everything else." THe addition of a pattern doesn't change the meaning of default; it is a form of composition. You could recast this as taking one feature -- "patterns in switch" -- and turning it into two: patterns in case labels, and patterns in default labels. > >> The second is the possibility of writing ?switch case? rather than ?switch?, which introduces these extra constraints on the switch block: It is a static error if any SwitchLabel of the switch statement begins with ?default". It is a static error if the set of case patterns is not at least optimistically total on the type of the selector expression. It is a static error if the last BlockStatement in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, can complete normally. It is a static error if any SwitchLabel of the switch statement is not part of a SwitchBlockStatementGroup or SwitchRule. In addition, the compiler automatically inserts SwitchBlockStatementGroups or SwitchRules to cover the residue, so as to throw an appropriate error at run time if the value produced by the selector expression belongs to the residue. This mechanism is good for enums and sealed types, that is, situations where we expect to enumerate all the special cases explicitly and want to be notified by the compiler (or failing that, at run time) if we have failed to do so. > > Is it fair to recharacterize this as follows -- we divide potentially-total statement switches into two kinds: > - Those that arrive at totality via a catch-all clause; > - Those that arrive at totality via an (optimistic) covering of an enumerated domain ("switch by parts") > > and we provide a separate mechanism for each to declare their totality (which engages different type checking and translation.) 
>
> (I see later that you say this, so good, we're on the same page.)
>
> (We might have called this "enum switch", especially if we had taken Alan's suggestion of declaring sealed types with a more enum-like syntax.)
>
>
> Some comments on the restrictions:
>
>> It is a static error if the last BlockStatement in _any_ SwitchBlockStatementGroup, or the Block in any SwitchRule, can complete normally.
>
> This is mostly already true in general; we envision it is a static error if you fall into, or out of, any case that has bindings.  (We might relax this to allow falling _into_ a total case.)  This was a simplification; we could support binding merging, but the return-on-complexity didn't seem quite there.
>
>
>
>> If the type of the selector expression is not an enum type, then either the "switch case" form is used or there is exactly one default label associated with the switch block.
>
> I presume you intend that this eventually becomes true for switches on sealed types as well.

Yes, sorry, I didn't state the conditions quite correctly, and I believe the correct way to state them will emerge once we work out the complete theory of optimistic totality.

> One question I have is that optimistic totality can apply in switches that are not on sealed types, or where sealed types show up in nested contexts.  For example:
>
>     switch (box) {
>         case Box(var x): ...
>     }
>
> Here, I think we want to say that Box(var x) is o.t. on Box, but it doesn't match null.  So how does the programmer indicate that they want to get totality checking and residue rejection?

I believe that "switch case" can handle this:

    switch case (box) {
        case Box(var x): ...
    }

This says, among other things, that it is a static error if the (singleton) set of case patterns { Box(var x) } is not o.t. on the type of "box", and it says we want residue checking, so it's as if the compiler rewrote it to:

    switch case (box) {
        case null: throw
        case Box(var x): ...
    }

Alternatively, we could write

    switch (box) {
        default Box(var x): ...
    }

which says that it is a static error if the pattern Box(var x) is not total on the type of "box".  It's not, because it doesn't match null, so we get a static error, as desired.  Perhaps we should have written

    switch (box) {
        case Box(var x): ...
        default Box z: ...
    }

But I'm thinking the "switch case" solution is preferable for this specific example.

> Similarly, suppose we have a sealed type Shape = Circ + Rect, and the obvious container Box:
>
>     switch (boxOfShape) {
>         case Box(Circ c):
>         case Box(Rect r):
>     }
>
> Again, I think we want this set of cases to be o.t., but the switch is not on a sealed type.  I am not sure how to integrate these cases into your model.

And again, I think that "switch case" does the job:

    switch case (boxOfShape) {
        case Box(Circ c): ...
        case Box(Rect r): ...
    }

This says, among other things, that it is a static error if the set of case patterns { Box(Circ c), Box(Rect r) } is not o.t. on the type of "boxOfShape", and it says we want residue checking, so it's as if the compiler rewrote it to:

    switch case (boxOfShape) {
        case null: throw
        case Box(Circ c): ...
        case Box(Rect r): ...
        case Box(var _): throw
    }

From guy.steele at oracle.com  Mon Aug 24 22:05:48 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 24 Aug 2020 18:05:48 -0400
Subject: Opting into totality
In-Reply-To:
References: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com> <396981E9-C301-48CC-AB63-F4BB33F77C64@oracle.com> <84bab351-8521-da49-4592-79c83fab1887@oracle.com>
Message-ID: <9BE4B17F-F4EE-4B7D-8B40-77316EA42A7E@oracle.com>

> On Aug 24, 2020, at 6:01 PM, Guy Steele wrote:
>
>
>
>> On Aug 24, 2020, at 5:04 PM, Brian Goetz wrote:
>>
>>
>>
>>> I am going to argue here that, just as fear of letting nulls flow stemmed from an early design that conflated multiple design issues as a result of extrapolating from too few data points (enums and strings), we have been boxed into another corner because we conflated expression-ness and the need for totality.
>>
>> I'm not sure we _conflated_ the two, as we did this with our eyes open (and fairly recently), but I suspect I agree with the rest -- that for $REASONS, we introduced an asymmetry that we knew would come back to bite us, and left a note for ourselves to come back and revisit, especially as optimistic totality became more important (e.g., through sealed types.)

P.S. I stand by my choice of the word "conflate" ("combine (two or more texts, ideas, etc.) into one"), which operation can certainly be intentional; I very carefully did not say "confuse". :-)  I regard "conflate" as a neutral operational word, and certainly not a pejorative.
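An editorial sketch of the residue rewriting described in the 18:01 message above, written in Java as it exists today rather than in the proposed "switch case" syntax. The types Shape, Circ, Rect and Box and the method area are hypothetical stand-ins for the thread's example, and the exception types are assumptions: the message only says "throw", and which exceptions to use is still under discussion in this thread.

```java
final class ResidueSketch {
    // Hypothetical model of the thread's example: Shape = Circ + Rect, plus a Box container.
    sealed interface Shape permits Circ, Rect {}
    record Circ(double radius) implements Shape {}
    record Rect(double width, double height) implements Shape {}
    record Box(Shape content) {}

    // Hand-written equivalent of the proposed `switch case (boxOfShape)`:
    // the two user-written cases plus the two synthesized residue cases.
    static double area(Box boxOfShape) {
        if (boxOfShape == null) {
            // Stands in for the synthesized `case null: throw` (exception type assumed).
            throw new NullPointerException("residue: null box");
        }
        Shape s = boxOfShape.content();
        if (s instanceof Circ c) {          // case Box(Circ c)
            return Math.PI * c.radius() * c.radius();
        }
        if (s instanceof Rect r) {          // case Box(Rect r)
            return r.width() * r.height();
        }
        // Stands in for the synthesized `case Box(var _): throw`;
        // reached for Box(null) or a shape added after compilation.
        throw new IllegalStateException("residue: " + boxOfShape);
    }

    public static void main(String[] args) {
        System.out.println(area(new Box(new Rect(2, 3))));
        System.out.println(area(new Box(new Circ(1))));
    }
}
```

The point of the sketch is that both residue branches are ordinary code the user could have written; under the proposal the compiler would supply them, and a user-written case covering the same values would supersede them.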
From brian.goetz at oracle.com  Mon Aug 24 22:17:38 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 24 Aug 2020 18:17:38 -0400
Subject: Opting into totality
In-Reply-To:
References: <081e6ffa-7716-b5b7-8b8c-4164800b5113@oracle.com> <396981E9-C301-48CC-AB63-F4BB33F77C64@oracle.com> <84bab351-8521-da49-4592-79c83fab1887@oracle.com>
Message-ID: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com>

>> One question I have is that optimistic totality can apply in switches that are not on sealed types, or where sealed types show up in nested contexts.  For example:
>>
>>     switch (box) {
>>         case Box(var x): ...
>>     }
>>
>> Here, I think we want to say that Box(var x) is o.t. on Box, but it doesn't match null.  So how does the programmer indicate that they want to get totality checking and residue rejection?
> I believe that "switch case" can handle this:
>
>     switch case (box) {
>         case Box(var x): ...
>     }
>
> This says, among other things, that it is a static error if the (singleton) set of case patterns { Box(var x) } is not o.t. on the type of "box", and it says we want residue checking, so it's as if the compiler rewrote it to:
>
>     switch case (box) {
>         case null: throw
>         case Box(var x): ...
>     }
>
> Alternatively, we could write
>
>     switch (box) {
>         default Box(var x): ...
>     }
>
> which says that it is a static error if the pattern Box(var x) is not total on the type of "box".  It's not, because it doesn't match null, so we get a static error, as desired.  Perhaps we should have written
>
>     switch (box) {
>         case Box(var x): ...
>         default Box z: ...
>     }
>
> But I'm thinking the "switch case" solution is preferable for this specific example.

OK, so (taking this example with the next) the mental model for `switch case` is not as I suggested -- "switch by covering parts" -- as much as "a switch that is optimistically total, and which gets built-in residue rejection."  Because there are multiple categories of o.t. that don't involve enum/sealed types at all, or that get their optimism only indirectly through enum/sealed types.

Let me probe at another aspect; that it is an error for a `switch case` to have a default clause.  This seems a tad on the pedantic side, in that surely if you add a `default` clause to an already o.t. switch with non-empty residue, it is (a) still total and (b) might afford you the opportunity to customize the residue rejection.  But I think your response would be "that's fine, it's not a switch case, it's an ordinary total switch with a default clause."

Which says to me that you are motivated to highlight the negative space between "optimistically total" and "total"; what a `switch case` asserts is _both_ that the set of cases is optimistically total, _and_ that there's a non-empty residue for which we cede responsibility to the switch.  (Secondarily, I think you're saying that a `default` at the end of the switch is enough to say "obviously, it's total.")

From guy.steele at oracle.com  Mon Aug 24 23:06:45 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 24 Aug 2020 19:06:45 -0400
Subject: Opting into totality
In-Reply-To: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com>
References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com>
Message-ID:

More later (I have a meeting soon), but for now, remember that customized residue handling can _always_ be handled with a case label (such as "case var x", or something more specific); you don't need "default".

Sent from my iPhone

> On Aug 24, 2020, at 6:17 PM, Brian Goetz wrote:
>
>
>>> One question I have is that optimistic totality can apply in switches that are not on sealed types, or where sealed types show up in nested contexts.  For example:
>>>
>>>     switch (box) {
>>>         case Box(var x): ...
>>>     }
>>>
>>> Here, I think we want to say that Box(var x) is o.t. on Box, but it doesn't match null.  So how does the programmer indicate that they want to get totality checking and residue rejection?
>> I believe that ?switch case? can handle this: >> >> switch case (box) { >> case Box(var x): ... >> } >> >> This says, among other things, that it is a static error if the (singleton) set of case patterns { Box(var x) } is not o.t. on the type of ?box?, and it says we want residue checking, so it?s as if the compiler rewrote it to: >> >> switch case (box) { >> case null: throw >> case Box(var x): ... >> } >> >> Alternatively, we could write >> >> switch (box) { >> default Box(var x): ... >> } >> >> which says that it is a static error if the pattern Box(var x) is not total on the type of ?box?. It?s not, because it doesn?t match null, so we get a static error, as desired. Perhaps we should have written >> >> switch (box) { >> case Box(var x): ? >> default Box z: ... >> } >> >> But I?m thinking the ?switch case? solution is preferable for this specific example. > > OK, so (taking this example with the next) the mental model for `switch case` is not as I suggested -- "switch by covering parts" -- as much as "a switch that is optimistically total, and which gets built-in residue rejection." Because there are multiple categories of o.t. that don't involve enum/sealed types at all, or that get their optimism only indirectly through enum/sealed types. > > Let me probe at another aspect; that it is an error for a `switch case` to have a default clause. This seems a tad on the pedantic side, in that surely if you add a `default` clause to an already o.t. switch with non-empty residue, it is (a) still total and (b) might afford you the opportunity to customize the residue rejection. But I think your response would be "that's fine, it's not a switch case, it's an ordinary total switch with a default clause. 
> > Which says to me that you are motivated to highlight the negative space between "optimistically total" and "total"; what a `switch case` asserts is _both_ that the set of cases is optimistically total, _and_ that there's a non-empty residue for which which we cede responsibility to the switch. (Secondarily, I think you're saying that a `default` at the end of the switch is enough to say "obviously, it's total.") > > > From guy.steele at oracle.com Tue Aug 25 02:17:29 2020 From: guy.steele at oracle.com (Guy Steele) Date: Mon, 24 Aug 2020 22:17:29 -0400 Subject: Opting into totality In-Reply-To: References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> Message-ID: <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com> See below and also far below. > On Aug 24, 2020, at 7:06 PM, Guy Steele wrote: > > More later (I have a meeting soon), but for now, remember that customized residue handling can _always_ be handled with a case label (such as ?case var x?, or something more specific); you don?t need ?default?. > > Sent from my iPhone Indeed, to me much of the value of the residue model lies in demonstrating that the residue can be characterized in terms of additional synthesized switch labels, and that the user can therefore easily supersede any such synthesized label by providing it explicitly. >> On Aug 24, 2020, at 6:17 PM, Brian Goetz wrote: >> >> ? >>>> One question I have is that optimistic totality can apply in switches that are not on sealed types, or where sealed types show up in nested contexts. For example: >>>> >>>> switch (box) { >>>> case Box(var x): ... >>>> } >>>> >>>> Here, I think we want to say that Box(var x) is o.t. on Box, but it doesn't match null. So how does the programmer indicate that they want to get totality checking and residue rejection? >>> I believe that ?switch case? can handle this: >>> >>> switch case (box) { >>> case Box(var x): ... 
>>> } >>> >>> This says, among other things, that it is a static error if the (singleton) set of case patterns { Box(var x) } is not o.t. on the type of ?box?, and it says we want residue checking, so it?s as if the compiler rewrote it to: >>> >>> switch case (box) { >>> case null: throw >>> case Box(var x): ... >>> } >>> >>> Alternatively, we could write >>> >>> switch (box) { >>> default Box(var x): ... >>> } >>> >>> which says that it is a static error if the pattern Box(var x) is not total on the type of ?box?. It?s not, because it doesn?t match null, so we get a static error, as desired. Perhaps we should have written >>> >>> switch (box) { >>> case Box(var x): ? >>> default Box z: ... >>> } >>> >>> But I?m thinking the ?switch case? solution is preferable for this specific example. >> >> OK, so (taking this example with the next) the mental model for `switch case` is not as I suggested -- "switch by covering parts" -- as much as "a switch that is optimistically total, and which gets built-in residue rejection." Because there are multiple categories of o.t. that don't involve enum/sealed types at all, or that get their optimism only indirectly through enum/sealed types. >> >> Let me probe at another aspect; that it is an error for a `switch case` to have a default clause. This seems a tad on the pedantic side, in that surely if you add a `default` clause to an already o.t. switch with non-empty residue, it is (a) still total and (b) might afford you the opportunity to customize the residue rejection. But I think your response would be "that's fine, it's not a switch case, it's an ordinary total switch with a default clause. >> >> Which says to me that you are motivated to highlight the negative space between "optimistically total" and "total"; what a `switch case` asserts is _both_ that the set of cases is optimistically total, _and_ that there's a non-empty residue for which which we cede responsibility to the switch. 
>> (Secondarily, I think you're saying that a `default` at the end of the switch is enough to say "obviously, it's total.")

I have thought about what you have just said; (a) I agree, and (b) it has made me realize that I have unnecessarily and unreasonably conflated the o.t. issue with the fallthrough issue.

So let me try once more.  For context, here are the three important paragraphs again, and the only changes I have made are to remove from the third paragraph the sentences that address fallthrough and fallout:

-------------

Then we introduce two mechanisms that we have discussed more recently, and say that each of these mechanisms may be used in either a switch statement or a switch expression.

The first is a switch label of the form "default <pattern>", which behaves just like a switch label "case <pattern>" except that it is a static error if the <pattern> is not total on the type of the selector expression.  This mechanism is good for extensible type hierarchies, where we expect to call out a number of special cases and then have a catch-all case, and we want the compiler to confirm to us on every compilation that the catch-all case actually does catch everything.

The second is the possibility of writing "switch case" rather than "switch", which introduces these extra constraints on the switch block: It is a static error if any SwitchLabel of the switch statement begins with "default".  It is a static error if the set of case patterns is not at least optimistically total on the type of the selector expression.  In addition, the compiler automatically inserts SwitchBlockStatementGroups or SwitchRules to cover the residue, so as to throw an appropriate error at run time if the value produced by the selector expression belongs to the residue.  This mechanism is good for enums and sealed types, that is, situations where we expect to enumerate all the special cases explicitly and want to be notified by the compiler (or failing that, at run time) if we have failed to do so.

-------------

I also considered removing the sentence

    It is a static error if any SwitchLabel of the switch statement begins with "default".

which is the one that makes these two options mutually exclusive.  But under this slimmed-down definition, if you add a "default" clause to a "switch case" construct, then the fact that you used "switch case" has no effect: the "default" clause ensures that the set of patterns is total, therefore also o.t., and the residue will be empty.  So there is no point in allowing both features to be used together.

Now we have twelve combinations produced by three orthogonal choices:

 - statement or expression
 - there is a single pattern that must be total ("default"), or the set of patterns must be o.t. ("switch case"), or no totality constraint
 - fallthrough could happen (switch blocks using colons) or cannot happen (switch rules using ->)

and two of them have additional rules to ensure that a switch expression need not generate default values.

From forax at univ-mlv.fr  Tue Aug 25 11:51:47 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 25 Aug 2020 13:51:47 +0200 (CEST)
Subject: Optimistic totality
In-Reply-To: <1ec09a32-13b6-6739-35a7-691ffad427e1@oracle.com>
References: <1ec09a32-13b6-6739-35a7-691ffad427e1@oracle.com>
Message-ID: <872553292.643386.1598356307535.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Monday, August 24, 2020 18:23:47
> Subject: Optimistic totality

> As I mentioned yesterday, I think our ideas about totality and null handling
> were getting polluted by our desire to support intuitive, optimistic totality.
> So let's try to separate them, by outlining some goals for optimistic totality.

> First, I'll posit that we're now able to stand on a more solid foundation:

> - Null is just another value for purposes of pattern matching; total patterns
> match null.
> - Null is just another value for purposes of switches; switches will feed null
> into total cases.
> - The perceived null-hostility of switch is actually about switches on enums,
> boxes, and strings; in the general case, we don't want, or need, such
> null-hostility.

> This is a much simpler story, and has many fewer sharp edges.

> Declaring clarity on that, we now have two additional problems to solve:

> - How to fill the gap between (always total) expression switches and (always
> partial) statement switches. Totality checking is a useful static analysis that
> can identify bugs earlier, and being able to restore symmetry in semantics
> (even if it requires asymmetry in syntax) reduces unexpected errors and
> potholes.
> - How to extend the optimistic totality of expression switches over enums (which
> is a very restricted case) to the more general case of switches over sealed
> types, and switches with weakly total cases (such as total deconstruction
> patterns.)

> This mail will focus mostly on the second problem; I'll start another thread for
> the first.

> The goal of optimistic totality handling is to allow users to write a set of cases
> that covers the target "well enough" that a catch-all throwing default is not
> needed. This has two benefits:

> - Let the compiler write the dumb do-nothing code, rather than making the user
> do it;
> - If the user writes a throwing catch-all, we lose the opportunity to type-check
> the assumption that the switch was total in the first place.

> Users are well aware of the first benefit, but the second benefit is actually
> more important.

Yes, this point is very important. It is one issue I currently have with the IDEs: when you write an expression switch on an enum, they tend to ask for a default even if the switch is exhaustive, but this has the side effect that no error is reported if someone later adds a new constant to the enum.
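An editorial sketch of the point made in the reply just above, using the switch expressions that shipped in Java 14. The TrafficLight enum and the numeric values are illustrative assumptions, not taken from any real API. Because there is no default clause, the compiler checks that every constant is covered, and adding a new constant to the enum later turns this switch into a compile-time error instead of silently routing the new constant to a throwing default.

```java
final class TrafficLightSketch {
    // Hypothetical enum; the values returned below are illustrative only.
    enum TrafficLight { RED, YELLOW, GREEN }

    // No default clause: exhaustiveness is checked on every compilation.
    // Adding a fourth constant to TrafficLight makes this method fail to compile.
    static int frequencyInTHz(TrafficLight light) {
        return switch (light) {
            case RED -> 450;
            case YELLOW -> 525;
            case GREEN -> 550;
        };
    }

    public static void main(String[] args) {
        for (TrafficLight light : TrafficLight.values()) {
            System.out.println(light + " -> " + frequencyInTHz(light));
        }
    }
}
```

Replacing the GREEN case with `default -> throw new IllegalStateException();` would still compile, but it would give up exactly the recompilation check discussed above. Passing null to this method throws NullPointerException, which is the existing behaviour of switches on enums.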
> If the user writes: > Freq frequency = switch (trafficLight) { > case RED -> Freq.ofTHz(450); > case YELLOW -> Freq.ofTHz(525); > default -> throw ...; > } > We are deprived of two ways to help: > - We cannot tell whether the user meant for { RED, YELLOW } to cover the space, > so we cannot offer helpful type checking of "you forgot green." > - Even if the code, as written, does cover the space, if a new constant / > permitted subtype is added later, we lose the opportunity to catch it at next > compilation, and alert the user to the fact that their assumption of totality > was broken by someone else. yes !, it's exactly what i've just said above ! > On the other hand, if there is no default clause, we get exhaustiveness checking > when the code is first written, and continual revalidation of this assumption > on every recompile. > OK, so optimistic totality is good. What does that really mean? We already know > one case: > - Specifying all the known constants of an enum, but no default or null case. > Because this case is so limited, we handled this one pretty well in 12; we NPE > on null, and ICCE on everything else. > Another (new) case is: > - When we have a _weakly total_ (total except for null) pattern on the target > type. A key category of weakly total patterns are deconstruction patterns whose > sub-patterns are total. Such as: > var x = switch (box) { > case Box(var x) b -> ... > } > The pattern `Box(var x)` matches all _non-null_ boxes. (It can't match null > boxes, because we'd be invoking the deconstructor with a null receiver, which > would surely NPE anyway, since a deconstructor is going to have some `x = > this.x` ~99.99% of the time.) So, should this be good enough to declare the > switch optimistically total? 
Hmm, this is where I don't like the current state of the syntax, because I don't know whether this case is an instanceof or not. You seem to think it's not an instanceof and just a deconstruction, so the deconstructor can NPE because there is no instanceof before it.
Forcing the user to use 'default' here makes the syntax clearer. With

    var x = switch(box) {
        default Box(var x) b ->
    };

it is now clear that this is not an instanceof, it's an "else", thus switch(null) can NPE.

> I think so; having to say `case null` in this switch would be irritating to
> users for no good reason.

I agree.

> What we've done is flipped things around; rather than saying "switches NPE on
> null", we can say "total switches with optimistically total case sets can throw
> on silly inputs" -- because the very concept of optimistic totality suggests
> that we think the residue consists only of silly inputs (and we are only
> throwing when the switch is total anyway.)

Yes.

> Now we can have a more refined definition of silly inputs.

> Another case:

> - The sealed class analogue of an enum switch.

> Here, we have a sealed class C, and a set of patterns that, for every permitted
> subtype D of C, some subset of the patterns is (optimistically) total on D.

> Now, our residue has two inhabitants: null, and novel subclasses.

> Do we think this should be optimistically total? Yes; all the reasons why a
> throwing default is bad on the enum case apply to the sealed case, there is
> just a larger residue set.

Yes: NPE if null, ICCE if it's a novel type.

> Another case:

> - When we have a deconstructor D(C), and a set of patterns D(P1)...D(Pn) such
> that P1..Pn are optimistically total on C, we would like to conclude that the
> lifted patterns are optimistically total on D.

> Example:

>     switch (boxOfShape) { // Shape = Circle + Rect
>         case Box(Circle c):
>         case Box(Rect r):
>     }

> Our claim here is that because Circle + Rect are o.t. on Shape,
> Box(Circle)+Box(Rect) should be o.t. on Box. Do we buy that? Again, I
> think we want this; asking users to insert Box(null) or Box(novel) cases to get
> totality checking is counterproductive.

Yes!

> What we see here is that we have an accumulation of situations where we think a
> given set of patterns covers a target type "well enough" that we are willing to
> (a) let the user skate on being truly total, and (b) engage enhanced type
> checking against the set of "good enough" cases.

> After writing this, I think we are, once again, being overly constrained by (and
> worse, distracted by) "consistency" with what we decided in 12 for the simple
> case of enum switches: that the answer always has to be some form of ICCE or
> NPE. These were easy answers when the residue was so easily characterized, but
> trying to extrapolate from them too rigidly may be a mistake.

> So, I think that we should save NPE and ICCE for the more accurate, narrow uses
> we found for them in 12, and for any "complex" residue, just define a new
> exception type -- and focus our energy on ensuring we get good error messages
> out of it, and move past this distraction.

I disagree. This argument seems weak, because it's not an argument for a new exception; it's an argument for not using NPE and ICCE because a switch on types doesn't behave like a switch on enums.
Also, I still think we can bridge the gap with the switch on enums to make it work like a switch on types; in that case, your argument is moot.

> The real point here is defining what we consider to be acceptable residue for an
> optimistically total switch, and ensure that we can deliver clear error
> messages when a Box(Hexagon) shows up.

Yes, a good error message is more important than a new kind of exception.

One minor issue I see is that the error message will contain a pattern which is not among the patterns that are defined in the switch.
For an ICCE, the pattern reported is Box(Hexagon), which is fine.
For an NPE, for example in the case of the box of shapes, the pattern reported can be "Box(null)", but maybe Box(var) better suits the problem.

Rémi

From guy.steele at oracle.com  Tue Aug 25 16:59:53 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 25 Aug 2020 12:59:53 -0400
Subject: Opting into totality
In-Reply-To: <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com>
References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com>
Message-ID: <2EB1D100-A4D9-4EFC-B3D3-D57727549F9F@oracle.com>

I have now gone back and re-read some of the previous email about totality, including four that I now quote below:

> On Aug 14, 2020, at 1:20 PM, Brian Goetz wrote:
>
>> . . . totality is a "subtree all the way down" property.

This observation was later refined by the message quoted below, but I think we never remarked explicitly on that fact.  It turns out that "weak totality" is a "subtree all the way down property" but "strong totality" is not.

> On Aug 20, 2020, at 3:09 PM, Brian Goetz wrote:
>
> Here's an attempt at a formalism for capturing this.
>
> There are several categories of patterns we might call total on a type T.  We could refine the taxonomy as:
>
>  - Strongly total -- matches all values of T.
>  - Weakly total -- matches all values of T except perhaps null.
>
> What we want to do is characterize the aggregate totality on T of a _set_ of patterns P*.  A set of patterns could in the aggregate be either of the above, or also:
>
>  - Optimistically total -- matches all values of subtypes of T _known at compile time_, except perhaps null.
>
> Note that we have an ordering:
>
>     partial < optimistically total < weakly total < strongly total
>
> Now, some rules about defining the totality of a set of patterns.
>
> T-Total: The singleton set containing the type pattern `T t` is strongly total on U <: T.
(This is the rule we've been discussing, but that's not the point of this mail -- we just need a base case right now.) > > T-Subset: If a set of patterns P* contains a subset of patterns that is X-total on T, then P* is X-total on T. > > T-Sealed: If T is sealed to U* (direct subtypes only), and for each U in U*, there is some subset of P* that is optimistically total on U, then P* is optimistically total on T. > > T-Nested: Given a deconstructor D(U) and a collection of patterns { P1..Pn }, if { P1..Pn } is X-total on U, then { D(P1)..D(Pn) } is min(X,weak)-total on D. That ?min(X,weak)? is crucial here. > > OK, examples. > . . . > > { Box(Object o) } weakly total on Box > > - Object o total on Object > - { Object o } total on Object by T-Subset > - { Box(Object o) } weakly total on Box by T-Nested So, going back to my earlier email about the box of frogs and the bag of objects: > On Aug 12, 2020, at 10:46 PM, Guy Steele wrote: > >> On Aug 12, 2020, at 3:57 PM, forax at univ-mlv.fr wrote: >> . . . >> >> I agree destructuring is just as important as conditionality and those two things should be orthogonal. >> But i still think having a keyword to signal that a pattern (not a case) is total is better than letting people guess. > > Yes, and here is the example that convinced me that one needs to be able to mark patterns as total, not just cases: > > (Assume for the following example that any pattern may be preceded by ?default?, that the only implication of ?default? is that you get a static error if the pattern it precedes is not total, and that we can abbreviate ?case default? as simply ?default?.) > > record Box(T t) { } > record Bag(T t) { } > > record Pair(T t, U u) { } > > Triple, Bag> p; > > switch (x) { > case Pair(Box(Tadpole t), Bag(String s)): ? > case Pair(Box(Tadpole t), Bag(default Object o)): ? > case Pair(Box(default Frog f), Bag(String s)): ? > default Pair(Box(Frog f), Bag(Object o)): ? 
> }

and here we can see that this was wrong, now that we have the terminology to distinguish weak totality and strong totality. When I wrote "you get a static error if the pattern it precedes is not total" we would now say "you get a static error if the pattern it precedes is not _strongly_ total". But "subtree all the way down" is a property of weak totality but not of strong totality. Therefore

    default Pair(Box(Frog f), Bag(Object o)): ...

and

    case Pair(Box(default Frog f), Bag(default Object o)): ...

do _not_ mean the same thing after all. We conclude that not only is the latter preferable, as Rémi indicated:

> On Aug 13, 2020, at 8:19 AM, forax at univ-mlv.fr wrote:
> . . .
>
> I think i prefer using "default" (or any other keyword) only where it makes sense and doesn't allow "default" to be propagated.
> so
>   default Pair p: ...
> is ok but
>   default Pair(Box(Frog f), Bag(Object o)): ...
> should be written
>   case Pair(Box(default Frog f), Bag(default Object o)): ...

(I have corrected a typo), but in fact only the latter is _correct_.

Taking all this into account in the context of my latest proposal: while "switch case" would forbid the use of a switch label that begins with the keyword "default", nevertheless "default" as a _pattern marker_ may be profitably used within a "switch case" construction:

    switch case (x) {
        case Pair(Box(Tadpole t), Bag(String s)): ...
        case Pair(Box(Tadpole t), Bag(default Object o)): ...
        case Pair(Box(default Frog f), Bag(String s)): ...
        case Pair(Box(default Frog f), Bag(default Object o)): ...
    }

or perhaps just

    switch case (x) {
        case Pair(Box(Tadpole t), Bag(String s)): ...
        case Pair(Box(Tadpole t), Bag(Object o)): ...
        case Pair(Box(Frog f), Bag(String s)): ...
        case Pair(Box(default Frog f), Bag(default Object o)): ...
    }
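The ordering and the T-Nested rule quoted above can be modelled in a few lines of ordinary Java. This is only a sketch for checking the reasoning: the names `TotalityDemo`, `Totality`, `min` and `nested` are invented here and appear nowhere in the thread or in any JDK API.

```java
// Hypothetical model of the totality ordering from the quoted formalism:
//   partial < optimistically total < weakly total < strongly total
final class TotalityDemo {

    // Declaration order encodes the ordering, so compareTo() can be used for min().
    enum Totality { PARTIAL, OPTIMISTIC, WEAK, STRONG }

    // The smaller of two totality levels under the ordering above.
    static Totality min(Totality a, Totality b) {
        return a.compareTo(b) <= 0 ? a : b;
    }

    // T-Nested: if { P1..Pn } is X-total on U, then { D(P1)..D(Pn) }
    // is min(X, weak)-total on D. Wrapping in a deconstructor can never
    // yield strong totality, because D(...) does not match null.
    static Totality nested(Totality inner) {
        return min(inner, Totality.WEAK);
    }

    public static void main(String[] args) {
        // `Object o` is strongly total on Object, so `Box(Object o)` is weakly total on Box.
        System.out.println(nested(Totality.STRONG));
        // Nesting again, as in Pair(Box(...), ...), stays at weak: weak totality
        // propagates "all the way down", strong totality does not.
        System.out.println(nested(nested(Totality.STRONG)));
        // An optimistically total inner set stays optimistically total.
        System.out.println(nested(Totality.OPTIMISTIC));
    }
}
```

Under this model a `default` label that demands strong totality can never be satisfied by a deconstruction pattern, which is the reason the two spellings above do not mean the same thing.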
From brian.goetz at oracle.com Tue Aug 25 18:08:50 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 25 Aug 2020 14:08:50 -0400
Subject: Opting into totality
In-Reply-To: <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com>
References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com>
Message-ID:

> I have thought about what you have just said; (a) I agree, and (b) have made me realize that I have unnecessarily and unreasonably conflated the o.t. issue with the fallthrough issue. So let me try once more. For context, here are the three important paragraphs again, and the only changes I have made are to remove from the third paragraph the sentences that address fallthrough and fallout:

While we're doing a conflation inventory, here's another candidate (which I will soon reject after drawing the distinction): totality checking vs automatic residue handling. The interesting cases here all involve getting the compiler to do TWO categories of extra work for the user that is not done on old-school (partial, statement) switches:

 - Validating that the cases are (optimistically) total, or giving a compile error
 - Inserting synthetic cases to throw on the residue of optimistic totality

I'm not seriously suggesting that we should treat them as separate features, as much as observing that we may still only be circling the target. In thinking about it this morning, I initially thought that a better axis for dividing the problem would be on the "shape" of the switch; some switches are "tree shaped" (type-hierarchy-based switches which lean hard on partial ordering of cases, usually culminating in catch-alls) and some are "star shaped" (all of the traditional switch examples, plus many switches over sealed types.)
I initially thought that `switch case` was mostly about the latter shape, but then I realized this is a false distinction; both produce residue, even when no sealing or enums are involved, as a side-effect of the fact that we want to not force users to spell out all the silly cases. The pervasiveness of o.t., and the complex shape of residue that comes out of even well-behaved switches, says to me the place worth spending complexity budget on is not totality, but _optimistic totality with residue_.? (Strong totality is just residue = empty; weak totality is just residue = { null }, so collapsing these all may make sense.)? Because, the set of switches that are truly total with no optimism and no residue is pretty simple, and probably doesn't need an awful lot of help.? (For comparison, we don't need the language to help us assert than an if ... else chain is total; if the last entry is an "else", we're there.)? So I'm happy collapsing these two sub-features, but I want to keep the focus on the optimistic/residue part, because that's where the value is. We already have a 2x2 matrix of orthogonal choices that we started with: { statement, expression } x { arrow, colon }, and we're not going to pare back that space.? So let's look at the other dimension, which you've currently got as { "single point of totality", optimistic totality, partiality }. Given the lack of clear boundary between the tree-shaped and star-shaped switches with respect to their residue, I'm not sure the distinction between SPoT and optimistic totality carries its weight, and we can collapse these into "X-total vs partial", where X-total means both totality checking and automatic residue rejection.? The interesting part is validating that the programmer has contributed enough totality that the compiler can fill in the gaps -- let's highlight that. To pick a different syntax for purposes of exposition, suppose we had `total-switch` variant of `switch`.? 
A `total-switch` would require that (a) the cases be at least optimistically total and (b) would insert additional throwing cases to patch any unhandled residue.? And we could do this with any of the combinations that lead to X-totality -- a default clause (strongly total), a weakly total deconstruction pattern (residue = null), an optimistically total set of type patterns for sealed types (residue = { null, novel }), and all the more complicated ones -- distinguishing between these cases doesn't seem interesting.? In this model, `default` just becomes a shorthand for "everything else, maybe with destructuring." Which leaves: ?? { statement, expression } x { arrow, colon } x { switch, total-switch } where an expression switch of any stripe is automatically upgraded to a total-switch of that type.? I think this is still orthogonal (with the above caveat) since the fallthrough rules apply uniformly, and as mentioned above we can probably rehabilitate `default` so it can remain relevant in all the boxes.? (There are still some details to work out here, notably what to do about weakly total deconstruction patterns.) The asymmetry where expression switches default to total, but statement switches default to partial is unfortunate, but we knew all along that was the necessary cost of rehabilitating switch rather than creating a parallel "snitch" (New Switch) construct. It's just the final payment is coming due now. I would "strongly" prefer to not propagate `default` into patterns, but leave it as a behavior of switch.? I think our refined taxonomy of totality is now good enough to get us where we need to go without festooning pattern nesting with nullity knobs.? 
(I think that, if we have to include totality markers at every stage of the pattern syntax, we will have made a mistake somewhere else; as I mentioned to Remi when he brought up "var vs any" (which is just another spelling for default vs nothing), my objection is not to the syntax but to the amount of complexity budget it burns for a low-value thing -- raising ideally-ignorable corner cases into the user's direct field of view, when ideally we can define the natural positive space and let the rest fall into automatically handled residue. If we have defined the pattern semantics correctly, I'm not sure anyone will notice.)

Which (if you buy all the above) leaves us with a bikeshed to paint on how to spell `total-switch` or `switch-case` or ...

From guy.steele at oracle.com Tue Aug 25 19:03:06 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Tue, 25 Aug 2020 15:03:06 -0400
Subject: Opting into totality
In-Reply-To:
References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com>
Message-ID: <7909BD76-C7C2-43A1-9D2D-D1FC7A59F7E9@oracle.com>

> On Aug 25, 2020, at 2:08 PM, Brian Goetz wrote:
> . . . [good analysis up to this point]
> Which leaves:
>
>     { statement, expression } x { arrow, colon } x { switch, total-switch }
>
> where an expression switch of any stripe is automatically upgraded to a total-switch of that type.

This description actually does not quite describe the current behavior: for an expression switch, if the type of the selector expression is enum, then the switch is _assumed_ to be total (the user need not supply a "default" label), but if the type is int or String, then the switch is _required_ to be total (the user must supply a default label). Did you mean to suggest a change from this current behavior, so that the user would not need to supply a default label for the int and String versions? (My guess is "no".)
> I think this is still orthogonal (with the above caveat) since the fallthrough rules apply uniformly, and as mentioned above we can probably rehabilitate `default` so it can remain relevant in all the boxes. (There are still some details to work out here, notably what to do about weakly total deconstruction patterns.)
>
> The asymmetry where expression switches default to total, but statement switches default to partial is unfortunate, but we knew all along that was the necessary cost of rehabilitating switch rather than creating a parallel "snitch" (New Switch) construct. It's just the final payment is coming due now.
>
> I would "strongly" prefer to not propagate `default` into patterns, but leave it as a behavior of switch. I think our refined taxonomy of totality is now good enough to get us where we need to go without festooning pattern nesting with nullity knobs. (I think that, if we have to include totality markers at every stage of the pattern syntax, we will have made a mistake somewhere else; as I mentioned to Remi when he brought up "var vs any" (which is just another spelling for default vs nothing), my objection is not to the syntax but to the amount of complexity budget it burns for a low-value thing -- raising ideally-ignorable corner cases into the user's direct field of view, when ideally we can define the natural positive space and let the rest fall into automatically handled residue. If we have defined the pattern semantics correctly, I'm not sure anyone will notice.)

Yes, we may not need to introduce "default" as a marker for tagging subpatterns. If we're wrong, it can be introduced later.

> Which (if you buy all the above) leaves us with a bikeshed to paint on how to spell `total-switch` or `switch-case` or ...

We can still ponder whether "default <pattern>" should issue a static error if the <pattern> is not total. We can furthermore ponder whether "default <pattern>" should require the <pattern> to be strongly total or just weakly total.
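The current behavior described in this message (an enum expression switch is assumed total once every constant is covered, while a String expression switch must supply a default) can be seen in released Java. A minimal sketch, assuming Java 14 or later for switch expressions; the class, enum and method names are invented for illustration.

```java
final class SwitchTotalityDemo {

    enum Color { RED, GREEN, BLUE }

    // Enum selector: every constant is covered, so the compiler accepts the
    // switch expression without a default label and supplies the throwing
    // case for any constant it did not know about at compile time.
    static String temperature(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN, BLUE -> "cool";
        };
    }

    // String selector: no finite set of cases can cover the type, so the
    // compiler rejects this switch expression unless a default is present.
    static int level(String s) {
        return switch (s) {
            case "low" -> 1;
            case "high" -> 2;
            default -> 0;
        };
    }

    public static void main(String[] args) {
        System.out.println(temperature(Color.RED));
        System.out.println(level("medium"));
    }
}
```

Removing the `default` arm from `level` makes the file fail to compile, whereas `temperature` compiles as written; that difference is the "assumed total" versus "required total" distinction drawn above.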
From guy.steele at oracle.com Tue Aug 25 23:11:05 2020 From: guy.steele at oracle.com (Guy Steele) Date: Tue, 25 Aug 2020 19:11:05 -0400 Subject: Opting into totality In-Reply-To: References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com> Message-ID: <2825D26A-91A6-43FB-A540-1A5E3C166C6D@oracle.com> > On Aug 25, 2020, at 2:08 PM, Brian Goetz wrote: > . . . > To pick a different syntax for purposes of exposition, suppose we had `total-switch` variant of `switch`. A `total-switch` would require that (a) the cases be at least optimistically total and (b) would insert additional throwing cases to patch any unhandled residue. And we could do this with any of the combinations that lead to X-totality -- a default clause (strongly total), a weakly total deconstruction pattern (residue = null), an optimistically total set of type patterns for sealed types (residue = { null, novel }), and all the more complicated ones -- distinguishing between these cases doesn't seem interesting. In this model, `default` just becomes a shorthand for "everything else, maybe with destructuring." > On Aug 25, 2020, at 3:03 PM, Guy Steele wrote: > . . . > We can still ponder whether ?default ? should issue a static error if the is not total. We can furthermore ponder whether ?default ? should require the to be strongly total or just weakly total. 
On reflection, I believe that in addition to having a choice of `switch` or `total-switch`, we should indeed be a little more careful about `default` switch labels and break them down as follows:

    plain `default` as a switch label means the same as `case var _`
    (which is always _strongly_ total on the type of the selector expression)

    `default ` and `default ` and `default ` are not permitted

    `default <type pattern>` means the same thing as `case <type pattern>`
    but it is a static error if the <type pattern> is not _strongly_ total on the type of the selector expression

    `default <deconstruction pattern>` means the same thing as `case <deconstruction pattern>`
    but it is a static error if the <deconstruction pattern> is not _weakly_ total on the type of the selector expression

I think that this, in addition to your `total-switch` requirements that (a) the set of cases be at least _optimistically_ total and (b) additional throwing cases are inserted to patch any unhandled residue, gives a more complete and perhaps even satisfactory solution. The main point here is that a `default` clause should require _strong_ totality of a type pattern but only _weak_ totality of a deconstruction pattern.

(I am aware that this set of rules would appear to allow two default clauses in a single switch statement if the first has a <deconstruction pattern> and the second has a <type pattern>:

    switch (myBox) {
        case Box(Frog f): ...
        default Box(Object o): ...
        default Box b: ...
    }

or

    switch (myBox) {
        case Box(Frog f): ...
        default Box(Object o): ...
        default: ...
    }

Of course, the second default clause would be chosen only for the value null. We can have a separate debate about whether to arbitrarily forbid such usage.)
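Since `default <pattern>` labels are only a proposal in this thread, the two-default example cannot be compiled as written. The sketch below emulates its intended matching order with released Java (16 or later, for records and `instanceof` patterns). `DefaultLabelSketch`, `classify`, the non-generic `Box(Object t)` and the `Frog` record are assumptions made for illustration.

```java
final class DefaultLabelSketch {

    record Frog(String name) {}

    // Non-generic stand-in for the Box record used in the thread.
    record Box(Object t) {}

    static String classify(Box myBox) {
        // case Box(Frog f): a non-null Box whose content is a Frog.
        if (myBox != null && myBox.t() instanceof Frog f) {
            return "box of frog " + f.name();
        }
        // default Box(Object o): weakly total on Box, so it takes every
        // remaining non-null Box, including one whose content is null.
        if (myBox != null) {
            return "some other box";
        }
        // Second default (plain `default` or `default Box b`): strongly total,
        // but by this point the only value left is null.
        return "no box";
    }

    public static void main(String[] args) {
        System.out.println(classify(new Box(new Frog("Kermit"))));
        System.out.println(classify(new Box(null)));
        System.out.println(classify(null));
    }
}
```

The last branch is reachable only with `null`, which matches the observation that the second default clause would be chosen only for the value null.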
From brian.goetz at oracle.com Tue Aug 25 23:53:05 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 25 Aug 2020 19:53:05 -0400
Subject: Opting into totality
In-Reply-To: <2825D26A-91A6-43FB-A540-1A5E3C166C6D@oracle.com>
References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com> <2825D26A-91A6-43FB-A540-1A5E3C166C6D@oracle.com>
Message-ID: <008C2D8E-82B8-4363-B5C6-2C057C5D04C0@oracle.com>

>> We can still ponder whether "default <pattern>" should issue a static error if the <pattern> is not total. We can furthermore ponder whether "default <pattern>" should require the <pattern> to be strongly total or just weakly total.
>
> On reflection, I believe that in addition to having a choice of `switch` or `total-switch`, we should indeed be a little more careful about `default` switch labels and break them down as follows:
>
>     plain `default` as a switch label means the same as `case var _`
>     (which is always _strongly_ total on the type of the selector expression)
>
>     `default ` and `default ` and `default ` are not permitted
>
>     `default <type pattern>` means the same thing as `case <type pattern>`
>     but it is a static error if the <type pattern> is not _strongly_ total on the type of the selector expression
>
>     `default <deconstruction pattern>` means the same thing as `case <deconstruction pattern>`
>     but it is a static error if the <deconstruction pattern> is not _weakly_ total on the type of the selector expression

FWIW, this is the same as I was going to propose. It is equivalent to the notion that the optional pattern is always at least weakly total, given that deconstruction patterns are the only ones that can currently be weakly total. I'll add that in the last case, it is as if there is an implicit `case null: throw`.

We might also consider requiring that `default` always be the last case (which is also a way to outlaw the dual defaults.) Given that the only residue from the only non-strongly-total pattern allowable is "null", it seems better to allow only one default.
From brian.goetz at oracle.com Wed Aug 26 00:01:43 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 25 Aug 2020 20:01:43 -0400 Subject: Opting into totality In-Reply-To: <7909BD76-C7C2-43A1-9D2D-D1FC7A59F7E9@oracle.com> References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com> <7909BD76-C7C2-43A1-9D2D-D1FC7A59F7E9@oracle.com> Message-ID: <9CFC9F6A-BCF4-4536-9606-7C00C787FAD9@oracle.com> > >> Which leaves: >> >> { statement, expression } x { arrow, colon } x { switch, total-switch } >> >> where an expression switch of any stripe is automatically upgraded to a total-switch of that type. > > This description actually does not quite describe the current behavior: for an expression switch, if the type of the selector expression is enum, then the switch is _assumed_ to be total (the user need not supply a ?default? label), but if the type is int or String, then the switch is _required_ to be total (the user must supply a default label). Did you mean to suggest a change from this current behavior, so that the user would not need to supply a default label for the int and String versions? (My guess is ?no?.) What I meant to suggest is that a switch is total if either (a) it says ?total? or (b) it is an expression switch. A total switch must either have a default clause or have an optimistically total set of cases, otherwise a compilation error ensues. Total switches get some extra implicit cases to handle any residue. In the current language, the only optimistically total set of cases is an enum switch where all the constants are covered, but as we add sealed types and deconstruction patterns, there will be more ways to get there. If we plan to complete the set of primitive types that can be switched on, we?d treat `boolean` as if it were an enum with constants `true` and `false`. 
(We could conceivably do the same for byte, short, and theoretically even int (though I suspect we?d hit file system length limits), but I think this would be silly.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guy.steele at oracle.com Wed Aug 26 00:05:44 2020 From: guy.steele at oracle.com (Guy Steele) Date: Tue, 25 Aug 2020 20:05:44 -0400 Subject: Opting into totality In-Reply-To: <008C2D8E-82B8-4363-B5C6-2C057C5D04C0@oracle.com> References: <008C2D8E-82B8-4363-B5C6-2C057C5D04C0@oracle.com> Message-ID: <7EEF1BAE-0655-429E-B51A-F7222462A52B@oracle.com> > On Aug 25, 2020, at 7:53 PM, Brian Goetz wrote: > > ? >> >>> We can still ponder whether ?default ? should issue a static error if the is not total. We can furthermore ponder whether ?default ? should require the to be strongly total or just weakly total. >> >> On reflection, I believe that in addition to having a choice of `switch` or `total-switch`, we should indeed be a little more careful about `default` switch labels and break them down as follows: >> >> plain `default` as a switch label means the same as `case var _` >> (which is always _strongly_ total on the type of the selector expression) >> >> `default ` and `default ` and `default ` are not permitted >> >> `default ` means the same thing as `case ` >> but it is a static error if the is not _strongly_ total on the type of the selector expression >> >> `default ` means the same thing as `case ` >> but it is a static error if the is not _weakly_ total on the type of the selector expression > > > FWIW, this is the same as I was going to propose. It is equivalent to the notion that the optional pattern is always at least weakly total, given that deconstruction patterns are the only ones that can currently be weakly total. I?ll add that in the last case, it is as if there is an implicit `case null: throw`. 
>
> We might also consider requiring that `default` always be the last case (which is also a way to outlaw the dual defaults.) Given that the only residue from the only non-strongly-total pattern allowable is "null", it seems better to allow only one default.

I concur; I stopped short of that in my previous message to highlight the fact that it's a separate decision.

From brian.goetz at oracle.com Wed Aug 26 13:42:22 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Aug 2020 09:42:22 -0400
Subject: Opting into totality
In-Reply-To: <008C2D8E-82B8-4363-B5C6-2C057C5D04C0@oracle.com>
References: <3ff64bd7-afe3-0901-7d69-9ca3342e8c40@oracle.com> <3DFEE3C0-2E04-4998-873E-73B9EA7250E1@oracle.com> <2825D26A-91A6-43FB-A540-1A5E3C166C6D@oracle.com> <008C2D8E-82B8-4363-B5C6-2C057C5D04C0@oracle.com>
Message-ID:

Note that this:

On 8/25/2020 7:53 PM, Brian Goetz wrote:
>     `default <deconstruction pattern>` means the same thing as `case <deconstruction pattern>`
>     but it is a static error if the <deconstruction pattern> is not _weakly_ total on the type of the selector

when combined with

> I'll add that in the last case, it is as if there is an implicit `case null: throw`.

comports nicely with the planned semantics for pattern assignment:

    P = e

requires that P be (weakly) total on the static type of e, and is permitted to NPE when e is null. When P is a total type pattern or an any pattern, this degenerates to normal variable declaration + initialization. So:

    Box b = ...
    String s = b  // type error
    Box bb = b  // fine, total
    Box(Object o) bb = b  // fine, weakly total, throws NPE when b = null

I will consolidate all of this today into a new message.
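Pattern assignment is described here only as planned semantics, so the following is an emulation in released Java (16 or later, for records) of what a weakly total `P = e` is said to do: bind the components when `e` is non-null and throw NullPointerException when it is null. `PatternAssignmentSketch`, `unbox` and the non-generic `Box(Object t)` are hypothetical names chosen for the sketch.

```java
import java.util.Objects;

final class PatternAssignmentSketch {

    // Non-generic stand-in for the Box record used in the thread.
    record Box(Object t) {}

    // Emulates the proposed `Box(Object o) = b`:
    // the pattern is weakly total on Box, so the only residue is null,
    // and that residue is rejected with a NullPointerException.
    static Object unbox(Box b) {
        Objects.requireNonNull(b, "b");
        Object o = b.t();   // nested `Object o` is total, so a null component still binds
        return o;
    }

    public static void main(String[] args) {
        System.out.println(unbox(new Box("hello")));
        System.out.println(unbox(new Box(null)));
        try {
            unbox(null);
        } catch (NullPointerException e) {
            System.out.println("NPE for null box");
        }
    }
}
```

A strongly total pattern such as `Box bb = b` needs no such check, which is why it degenerates to an ordinary local variable declaration.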
From brian.goetz at oracle.com Wed Aug 26 15:00:47 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Aug 2020 11:00:47 -0400
Subject: Finalizing in JDK 16 - Pattern matching for instanceof
In-Reply-To:
References:
Message-ID:

I have been thinking about this and I have two refinements I would like to suggest for Pattern Matching in instanceof. Both have come out of the further work on the next phases of pattern matching.

1. Instanceof expressions must express conditionality.

One of the uncomfortable collisions between the independently-derived pattern semantics and instanceof semantics is the treatment of total patterns. Instanceof always says "no" on null, but the sensible thing on total patterns is that _strongly total patterns_ match null. This yields a collision between

    x instanceof Object

and

    x instanceof Object o

This is not necessarily a problem for the specification, in that instanceof is free to say "when x is null, we don't even test the pattern." But it is not good for the users, in that these two things are subtly different.

While I get why some people would like to bootstrap this into an argument why the pattern semantics are wrong, the key observation here is: _both of these questions are stupid_. So I think there's an obvious way to fix this so that there is no problem here: instanceof must ask a question. So the second form would be illegal, with a compiler error saying "pattern always matches the target."

Proposed: An `instanceof` expression must be able to evaluate to both true and false, otherwise it is invalid. This rules out strongly total patterns on the RHS. If you have a strongly total pattern, use pattern assignment instead.

2. Mutability of binding variables.

We did it again; we gave in to our desire to try to "fix mistakes of the past", with the obvious results. This time, we did it by making binding variables implicitly final.
This is the same mistake we make over and over again with both nullity and finality; when a new context comes up, we try to exclude the "mistakes" (nullability and mutability) from those contexts. We've seen plenty of examples recently with nullity.? Here's a historical example with finality.? When we did Lambda, some clever fellow said "we could make the lambda parameters implicitly final."? And there was a round of "ooh, that would be nice", because it fed our desire to fix mistakes of the past. But we quickly realized it would be a new mistake, because it would be an impediment to refactoring between lambdas and inner classes, and undermined the mental model of "a lambda is just an anonymous method." Further, the asymmetry has a user-model cost.? And what would be the benefit?? Well, it would make us feel better, but ultimately, would not have a significant impact on accidental-mutation errors because the context was so limited (and most lambdas are small anyway.)? In the end, it would have been a huge mistake. I now think that we have done the same with binding variables. Here are two motivating examples: (a) Pattern assignment.? For (weakly) total pattern P, you will be able to say ??? P = e Note that `int x` and `var x` are both valid patterns and local variable declarations; it would be good if pattern assignment were a strict generalization of local variable declaration.? The sole asymmetry is that for pattern assignment, the variable is final.? Ooops. (b) Reconstruction.? We have analogized that a `with` expression: ??? x with { B } is like the block expression: ??? { X(VARS) = x; B /* mutates vars */; yield new X(VARS) } except that mutating the variables would not be allowed. From a specification perspective, there is nontrivial spec complexity to keep pattern variables and locals separately, but some of their difference is gratuitous (mutability.)? 
If we reduce the gratuitious differences, we can likely bring them closer together, which will reduce friction and technical debt in the future. Like with lambda parameters, I am now thinking that we gave in to the base desire to fix a past mistake, but in a way that doesn't really make the language better or safer, just more complicated.? Let's back this one out before it really bites us. On 7/27/2020 6:53 AM, Gavin Bierman wrote: > In JDK 16 we are planning to finalize two JEPs: > > - Pattern matching for `instanceof` > - Records > > Whilst we don't have any major open issues for either of these features, I would > like us to close them out. So I thought it would be useful to quickly summarize > the features and the issues that have arisen over the preview periods so far. In > this email I will discuss pattern matching; a following email will cover the > Records feature. > > Pattern matching > ---------------- > > Adding conditional pattern matching to an expression form is the main technical > novelty of our design of this feature. There are several advantages that come > from this targeting of an expression form: First, we get to refactor a very > common programming pattern: > > if (e instanceof T) { > T t = (T)e; // grr... > ... > } > > to > > if (e instanceof T t) { > // let the pattern matching do the work! > ... > } > > A second, less obvious advantage is that we can combine the pattern matching > instanceof with other *expressions*. This enables us to compactly express things > with expressions that are unnecessarily complicated using statements. 
For > example, when implementing a class Point, we might write an equals method as > follows: > > public boolean equals(Object o) { > if (!(o instanceof Point)) > return false; > Point other = (Point) o; > return x == other.x > && y == other.y; > } > > Using pattern matching with instanceof instead, we can combine this into a > single expression, eliminating the repetition and simplifying the control flow: > > public boolean equals(Object o) { > return (o instanceof Point other) > && x == other.x > && y == other.y; > } > > The conditionality of pattern matching - if a value does not match a pattern, > then the pattern variable is not bound - means that we have to consider > carefully the scope of the pattern variable. We could do something simple and > say that the scope of the pattern variable is the containing statement and all > subsequent statements in the enclosing block. But this has unfortunate > 'poisoning' consequences, e.g. > > if (a instanceof Point p) { > ... > } > if (b instanceof Point p) { // ERROR - p is in scope > ... > } > > In other words in the second statement the pattern variable is in a poisoned > state - it is in scope, but it should not be accessible as it may not be > instantiated with a value. Moreover, as it is in scope, we can't declare it > again. This means that a pattern variable is 'poisoned' after it is declared, so > the pattern-loving programmer will have to think of lots of distinct names for > their pattern variables. > > We have chosen another way: Java already uses flow analysis - both in checking > the access of local variables and blank final fields, and detecting unreachable > statements. We lean on this concept to introduce the new notion of flow scoping. > A pattern variable is only in scope where the compiler can deduce that the > pattern has matched and the variable will be bound. This analysis is flow > sensitive and works in a similar way to the existing analyses. 
Returning to our > example: > > if (a instanceof Point p) { > // p is in scope > ... > } > // p not in scope here > if (b instanceof Point p) { // Sure! > ... > } > > The motto is "a pattern variable is in scope where it has definitely matched". > This is intuitive, allows for the safe reuse of pattern variables, and Java > developers are already used to flow sensitive analyses. > > As pattern variables are treated in all other respects like normal variables > -- and this was an important design principle -- they can shadow fields. > However, their flow scoping nature means that some care must be taken to > determine whether a name refers to a pattern variable declaration shadowing a > field declaration or a field declaration. > > // field p is in scope > > if (e instanceof Point p) { > // p refers to the pattern variable > } else { > // p refers to the field > } > > We call this unfortunate interaction of flow scoping and shadowing the "Swiss > cheese property". To rule it out would require ad-hoc special cases or more > features, and our sense is that will not be that common, so we have decided to > keep the feature simple. We hope that IDEs will quickly come to help programmers > who have difficulty with flow scoping and shadowing. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amaembo at gmail.com Wed Aug 26 16:43:33 2020 From: amaembo at gmail.com (Tagir Valeev) Date: Wed, 26 Aug 2020 23:43:33 +0700 Subject: Finalizing in JDK 16 - Pattern matching for instanceof In-Reply-To: References: Message-ID: Hello! Quick comments from IDE developer point of view. > Proposed: An `instanceof` expression must be able to evaluate to both true and false, otherwise it is invalid. This rules out strongly total patterns on the RHS. If you have a strongly total pattern, use pattern assignment instead. 1 is an interesting suggestion. 
IntelliJ IDEA has an inspection warning if normal `instanceof` can be replaced with a null-check, or can be removed at all (replaced with 'true'). Sometimes these warnings occur if the expression on the left has the same type as instanceof operand. I wanted to expand it to pattern instanceof, but it's quite unclear which kind of quick-fix can we suggest, as a new pattern variable is introduced (and we could be in the middle of &&-chain, so it's not so easy to convert it to a normal local variable right here). Now, I believe, we don't warn at all if we detect that pattern now because we don't like warnings if we cannot suggest a quick-fix. If we make this a compilation error we will be forced to highlight it (even without a quick-fix, or suggesting a quick-fix only sometimes). I usually support such kinds of compilation errors, because thanks to them code rots not so fast. E.g. if we narrowed the return type of some method, instanceof check on that method call might become useless. If we immediately fail with compilation error, people will have to fix them, reducing amount of useless code in the project. > 2 Mutability of binding variables. We already have an inspection that allows converting old "instanceof+new local+cast" code pattern to the new instanceof. I remember that when I implemented it initially, I forgot to rule out the cases when the new local is mutable, so we had false-positives. After this change, we can remove these extra checks, and the inspection will apply to the wider range of existing code. On the other hand, we have an inline refactoring that substitutes the pattern variable with the cast (converting pattern-instanceof to old-style instanceof). Now, we should probably disable it if the pattern variable is not effectively final. We also have an indication (underlining) for mutating variables, so I believe, we can extend it to pattern variables without any problems. I have mixed feelings about this proposal, but not that I don't like it. 
Probably it's ok. With best regards, Tagir Valeev. On Wed, Aug 26, 2020 at 10:01 PM Brian Goetz wrote: > > I have been thinking about this and I have two refinements I would like to suggest for Pattern Matching in instanceof. Both have come out of the further work on the next phases of pattern matching. > > > 1. Instanceof expressions must express conditionality. > > One of the uncomfortable collisions between the independently-derived pattern semantics and instanceof semantics is the treatment of total patterns. Instanceof always says "no" on null, but the sensible thing on total patterns is that _strongly total patterns_ match null. This yields a collision between > > x instanceof Object > and > x instanceof Object o > > This is not necessarily a problem for the specification, in that instanceof is free to say "when x is null, we don't even test the pattern." But it is not good for the users, in that these two things are subtly different. > > While I get why some people would like to bootstrap this into an argument why the pattern semantics are wrong, the key observation here is: _both of these questions are stupid_. So I think there's an obvious way to fix this so that there is no problem here: instanceof must ask a question. So the second form would be illegal, with a compiler error saying "pattern always matches the target." > > Proposed: An `instanceof` expression must be able to evaluate to both true and false, otherwise it is invalid. This rules out strongly total patterns on the RHS. If you have a strongly total pattern, use pattern assignment instead. > > > 2. Mutability of binding variables. > > We did it again; we gave in to our desire to try to "fix mistakes of the past", with the obvious results. This time, we did it by making binding variables implicitly final. 
> > This is the same mistake we make over and over again with both nullity and finality; when a new context comes up, we try to exclude the "mistakes" (nullability and mutability) from those contexts. > > We've seen plenty of examples recently with nullity. Here's a historical example with finality. When we did Lambda, some clever fellow said "we could make the lambda parameters implicitly final." And there was a round of "ooh, that would be nice", because it fed our desire to fix mistakes of the past. But we quickly realized it would be a new mistake, because it would be an impediment to refactoring between lambdas and inner classes, and undermined the mental model of "a lambda is just an anonymous method." > > Further, the asymmetry has a user-model cost. And what would be the benefit? Well, it would make us feel better, but ultimately, would not have a significant impact on accidental-mutation errors because the context was so limited (and most lambdas are small anyway.) In the end, it would have been a huge mistake. > > > I now think that we have done the same with binding variables. Here are two motivating examples: > > (a) Pattern assignment. For (weakly) total pattern P, you will be able to say > > P = e > > Note that `int x` and `var x` are both valid patterns and local variable declarations; it would be good if pattern assignment were a strict generalization of local variable declaration. The sole asymmetry is that for pattern assignment, the variable is final. Ooops. > > (b) Reconstruction. We have analogized that a `with` expression: > > x with { B } > > is like the block expression: > > { X(VARS) = x; B /* mutates vars */; yield new X(VARS) } > > except that mutating the variables would not be allowed. > > From a specification perspective, there is nontrivial spec complexity to keep pattern variables and locals separately, but some of their difference is gratuitous (mutability.) 
If we reduce the gratuitious differences, we can likely bring them closer together, which will reduce friction and technical debt in the future. > > > Like with lambda parameters, I am now thinking that we gave in to the base desire to fix a past mistake, but in a way that doesn't really make the language better or safer, just more complicated. Let's back this one out before it really bites us. > > > > > > On 7/27/2020 6:53 AM, Gavin Bierman wrote: > > In JDK 16 we are planning to finalize two JEPs: > > - Pattern matching for `instanceof` > - Records > > Whilst we don't have any major open issues for either of these features, I would > like us to close them out. So I thought it would be useful to quickly summarize > the features and the issues that have arisen over the preview periods so far. In > this email I will discuss pattern matching; a following email will cover the > Records feature. > > Pattern matching > ---------------- > > Adding conditional pattern matching to an expression form is the main technical > novelty of our design of this feature. There are several advantages that come > from this targeting of an expression form: First, we get to refactor a very > common programming pattern: > > if (e instanceof T) { > T t = (T)e; // grr... > ... > } > > to > > if (e instanceof T t) { > // let the pattern matching do the work! > ... > } > > A second, less obvious advantage is that we can combine the pattern matching > instanceof with other *expressions*. This enables us to compactly express things > with expressions that are unnecessarily complicated using statements. 
For > example, when implementing a class Point, we might write an equals method as > follows: > > public boolean equals(Object o) { > if (!(o instanceof Point)) > return false; > Point other = (Point) o; > return x == other.x > && y == other.y; > } > > Using pattern matching with instanceof instead, we can combine this into a > single expression, eliminating the repetition and simplifying the control flow: > > public boolean equals(Object o) { > return (o instanceof Point other) > && x == other.x > && y == other.y; > } > > The conditionality of pattern matching - if a value does not match a pattern, > then the pattern variable is not bound - means that we have to consider > carefully the scope of the pattern variable. We could do something simple and > say that the scope of the pattern variable is the containing statement and all > subsequent statements in the enclosing block. But this has unfortunate > 'poisoning' consequences, e.g. > > if (a instanceof Point p) { > ... > } > if (b instanceof Point p) { // ERROR - p is in scope > ... > } > > In other words in the second statement the pattern variable is in a poisoned > state - it is in scope, but it should not be accessible as it may not be > instantiated with a value. Moreover, as it is in scope, we can't declare it > again. This means that a pattern variable is 'poisoned' after it is declared, so > the pattern-loving programmer will have to think of lots of distinct names for > their pattern variables. > > We have chosen another way: Java already uses flow analysis - both in checking > the access of local variables and blank final fields, and detecting unreachable > statements. We lean on this concept to introduce the new notion of flow scoping. > A pattern variable is only in scope where the compiler can deduce that the > pattern has matched and the variable will be bound. This analysis is flow > sensitive and works in a similar way to the existing analyses. 
Returning to our > example: > > if (a instanceof Point p) { > // p is in scope > ... > } > // p not in scope here > if (b instanceof Point p) { // Sure! > ... > } > > The motto is "a pattern variable is in scope where it has definitely matched". > This is intuitive, allows for the safe reuse of pattern variables, and Java > developers are already used to flow sensitive analyses. > > As pattern variables are treated in all other respects like normal variables > -- and this was an important design principle -- they can shadow fields. > However, their flow scoping nature means that some care must be taken to > determine whether a name refers to a pattern variable declaration shadowing a > field declaration or a field declaration. > > // field p is in scope > > if (e instanceof Point p) { > // p refers to the pattern variable > } else { > // p refers to the field > } > > We call this unfortunate interaction of flow scoping and shadowing the "Swiss > cheese property". To rule it out would require ad-hoc special cases or more > features, and our sense is that will not be that common, so we have decided to > keep the feature simple. We hope that IDEs will quickly come to help programmers > who have difficulty with flow scoping and shadowing. > > From brian.goetz at oracle.com Wed Aug 26 17:01:16 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Aug 2020 13:01:16 -0400 Subject: [pattern-switch] Totality In-Reply-To: References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> Message-ID: <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> I think we now have a sound story for totality in both patterns and switches.? 
Let's start with refining what we mean by totality.

We have seen a lot of cases -- and not just those involving enums or sealed types -- where we want to say a pattern or set of patterns is "total enough" to not force the user to explicitly handle the corner cases.  In these cases, we will let the compiler handle the corner cases by generating exceptions.

A prime example is the deconstruction pattern Foo(var x); this matches all Foos, but not null.  Similarly, there is a whole family of such corner cases: Foo(Bar(var x)) matches all Foos, except for null and Foo(null).  But we are in agreement that it would be overly pedantic to force the user to handle these explicitly.

Note that these show up not only in switch, but in pattern assignment:

    Object o = e;         // Object o is a total pattern on typeof(e), so always succeeds
    Foo(var x) = foo      // Total on Foo, except null, so should NPE on null
    Foo(Bar(var x)) = foo // Total on Foo, except null and Foo(null), throw on these

    var y = switch (foo) {
        case Foo(var x) -> bar(x);  // Total on Foo, except null, so NPE on null
    }

These goals come from a pragmatic desire to not pedantically require the user to spell out the residue, but to accept the pattern (or set of patterns) as total anyway.

Now, let's put this intuition on a sounder footing by writing some formal rules for saying that a pattern P is _total on T with remainder R_, where R is a _set of patterns_. We will say P is _strongly total on T_ if P is total on T with empty remainder.

The intuition is that, if P is total on T with remainder R, the values matched by R but not by P are "silly" values and a language construct like switch can (a) consider P sufficient to establish totality and (b) can insert synthetic tests for each of the patterns in R that throw.

We will lean on this in _both_ switch and pattern assignment.
For switch, we will treat it as if we insert synthetic cases after P that throw, so that the remaining values can still be matched by earlier explicit patterns.

Invariant: If P is total on T with remainder R then, for all t in T, either t matches P, or t matches some pattern in R.  (This is not the definition of what makes a pattern total; it is just something that is true about total patterns.)

Base cases:

 - The type pattern `T t` is strongly total on U <: T.
 - The inferred type pattern `var x` is strongly total on all T.

Induction cases:

 - Let D(T) be a deconstructor.
   - The deconstruction pattern D(Q), where Q is strongly total on T, is total on D with remainder { null }.
   - The deconstruction pattern D(Q), where Q is total on T with remainder R*, is total on D with remainder { null } union { D(R) : R in R* }

We can easily generalize the definition of totality to a set of patterns.  In this case, we can handle sealed types and enums:

 - Let E be an enum type.  The set of patterns { C1 .. Cn } is total on E with remainder { null, E e } if C1 .. Cn contains all the constants of E.

Observation: that `E e` pattern is pretty broad!  But that's OK; it captures all novel constant values, and, because the explicit cases cover all the known values, captures only the novel values.  Same for sealed types:

 - Let S be a sealed abstract type or interface, with permitted direct subtypes C*, and P* be a set of patterns applicable to S.  If for each C in C*, there exists a subset Pc of P* that is total on C with remainder Rc, then P* is total on S with remainder { null } union \forall{c \in C}{ Rc }.

 - Let D(T) be a deconstructor, and let P* be total on T with remainder R*.  Then { D(P) : P in P* } is total on D with remainder { null } \union { D(R) : R in R* }.

Example:
  Container = Box | Bag
  Shape = Circle | Rect
  P* = { Box(Circle), Box(Rect), Bag(Circle), Bag(Rect) }

{ Circle, Rect } total on Shape with remainder { Shape, null }

-> { Box(Circle), Box(Rect) } total on Box with remainder { Box(Shape), Box(null), null }

-> { Bag(Circle), Bag(Rect) } total on Bag with remainder { Bag(Shape), Bag(null), null }

-> P* total on Container with remainder { Container(Box(Shape)), Container(Box(null)), Container(Bag(Shape)), Container(Box(Shape)), Container(null), null }

Now:

 - A pattern assignment `P = e` requires that P be total on the static type of `e`, with some remainder R, and throws on the remainder.
 - A total switch on `e` requires that its cases be total on the static type of `e`, with some remainder R, and inserts synthetic throwing cases at the end of the switch for each pattern in R.

We can then decide how to opt into totality in switches, other than "be an expression switch."

On 8/21/2020 4:18 PM, Brian Goetz wrote:
>
>
> On 8/21/2020 11:14 AM, Brian Goetz wrote:
>>
>> Next up (separate topic): letting statement switches opt into totality.
>>
>
> Assuming the discussion on Exhaustiveness is good, let's talk about
> totality.
>
> Expression switches must be total; we totalize them by throwing when
> we encounter any residue, even though we only require that the set of
> cases in the switch be optimistically total.  Residue includes:
>
>  - `null` switch targets in String, Enum, and primitive box switches only;
>  - novel values in enum switches without a total case clause;
>  - novel subtypes in switches on sealed types without a total case clause;
>  - when an optimistically total subchain of deconstruction pattern
>    cases wraps a residue value (e.g., D(null) or D(novel))
>
> What about statement switches?  Right now, any residue for a statement
> switch without a total case clause will just be silently ignored
> (because statement switches need not be total.)
>
> What we would like is a way to say "this switch is total, please type
> check it for me as such, and insert any needed residue-catching
> cases."  I think this is a job for `default`.
>
> Now that we've got some clarity that switches _don't_ throw on null,
> but instead it is as if string/enum/box switches have an implicit
> `case null` when no explicit one is present, we can define `default`,
> once again, to be total (and not just weakly total.)  So in:
>
>     switch (object) {
>         case "foo":
>         case Box(Frog fs):
>         default: ...
>     }
>
> a `null` just falls into `default` just like anything else that is not
> the string "foo" or a box of frogs ("let the nulls flow"). Default
> would have to come last (except in legacy switches, where a legacy
> switch has one of the distinguished target types and all constant case
> labels.)
>
> What if we want to destructure too?  Well, add a pattern:
>
>     switch (object) {
>         case "foo":
>         case Box(Frog fs):
>         default Object o: ...
>     }
>
> This would additionally assert that the following pattern is total,
> otherwise a compilation error ensues.  (Note, though, that this is
> entirely about `switch`, not patterns.  The semantics of the pattern
> is unchanged, and I do not believe that sprinkling `default` into
> nested patterns to shout "TOTALITY HERE, I MEAN IT" carries its weight.)
>
> This seems a better job to give default in this new world; anything
> not previously matched, where we retcon the current null behavior as
> being only about string, enum, or boxes.
>
> This leaves us with only one hole, which is: suppose I have an
> _optimistically total_ statement switch.  Users might like to (a)
> assert the switch is total, and get the concomitant type checking, and
> (b) get residue ejection for free.  Of the two, though, A is much more
> important than B, but we'll take B when we can get it.
> Perhaps, if the target of a switch is a sealed type, we can interpret:
>
>     switch (shape) {
>         case Rect r: ...
>         default Circle c: ...
>     }
>
> as meaning that `Circle c` _closes_ the switch to make it total, and
> engages the totality checking to ensure this is true.  So, `default P`
> would mean either:
>
>  - P is total, or
>  - P is not total, but taken with the other cases, makes the switch
>    optimistically total
>
> and in the latter case, would engage the
> residue-detection-and-ejection machinery.
>
> This might be stretching it a tad too far, but I like that we can
> give `default` useful new jobs to do in `switch` rather than just
> giving him a gold watch.
>
>

From guy.steele at oracle.com Wed Aug 26 18:37:42 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 26 Aug 2020 14:37:42 -0400
Subject: Finalizing in JDK 16 - Pattern matching for instanceof
In-Reply-To:
References:
Message-ID:

> On Aug 26, 2020, at 11:00 AM, Brian Goetz wrote:
>
> I have been thinking about this and I have two refinements I would like to suggest for Pattern Matching in instanceof. Both have come out of the further work on the next phases of pattern matching.
>
>
> 1. Instanceof expressions must express conditionality.
>
> One of the uncomfortable collisions between the independently-derived pattern semantics and instanceof semantics is the treatment of total patterns. Instanceof always says "no" on null, but the sensible thing on total patterns is that _strongly total patterns_ match null. This yields a collision between
>
> x instanceof Object
> and
> x instanceof Object o
>
> This is not necessarily a problem for the specification, in that instanceof is free to say "when x is null, we don't even test the pattern." But it is not good for the users, in that these two things are subtly different.
>
> While I get why some people would like to bootstrap this into an argument why the pattern semantics are wrong, the key observation here is: _both of these questions are stupid_. So I think there's an obvious way to fix this so that there is no problem here: instanceof must ask a question. So the second form would be illegal, with a compiler error saying "pattern always matches the target."
>
> Proposed: An `instanceof` expression must be able to evaluate to both true and false, otherwise it is invalid. This rules out strongly total patterns on the RHS. If you have a strongly total pattern, use pattern assignment instead.

Makes sense to me, but one question: would this restriction "must be able to evaluate to both true and false" be applied to _every_ `instanceof` expression, or only those that have a pattern to the right of the `instanceof` keyword? I ask because if it is applied to _every_ `instanceof` expression, this would represent an incompatible change to the behavior of `x instanceof Object`, among others. Is it indeed the intent to make an incompatible change to the language?

> 2. Mutability of binding variables.
>
> We did it again; we gave in to our desire to try to "fix mistakes of the past", with the obvious results. This time, we did it by making binding variables implicitly final.
>
> This is the same mistake we make over and over again with both nullity and finality; when a new context comes up, we try to exclude the "mistakes" (nullability and mutability) from those contexts.
>
> We've seen plenty of examples recently with nullity. Here's a historical example with finality. When we did Lambda, some clever fellow said "we could make the lambda parameters implicitly final." And there was a round of "ooh, that would be nice", because it fed our desire to fix mistakes of the past.
But we quickly realized it would be a new mistake, because it would be an impediment to refactoring between lambdas and inner classes, and undermined the mental model of "a lambda is just an anonymous method." > > Further, the asymmetry has a user-model cost. And what would be the benefit? Well, it would make us feel better, but ultimately, would not have a significant impact on accidental-mutation errors because the context was so limited (and most lambdas are small anyway.) In the end, it would have been a huge mistake. > > > I now think that we have done the same with binding variables. Here are two motivating examples: > > (a) Pattern assignment. For (weakly) total pattern P, you will be able to say > > P = e > > Note that `int x` and `var x` are both valid patterns and local variable declarations; it would be good if pattern assignment were a strict generalization of local variable declaration. The sole asymmetry is that for pattern assignment, the variable is final. Ooops. > > (b) Reconstruction. We have analogized that a `with` expression: > > x with { B } > > is like the block expression: > > { X(VARS) = x; B /* mutates vars */; yield new X(VARS) } > > except that mutating the variables would not be allowed. > > From a specification perspective, there is nontrivial spec complexity to keep pattern variables and locals separately, but some of their difference is gratuitous (mutability.) If we reduce the gratuitious differences, we can likely bring them closer together, which will reduce friction and technical debt in the future. > > > Like with lambda parameters, I am now thinking that we gave in to the base desire to fix a past mistake, but in a way that doesn't really make the language better or safer, just more complicated. Let's back this one out before it really bites us. I agree with this analysis. 
It does suggest that we should consider whether to extend the syntax of a type pattern from `T x` to `[final] T x`, and the syntax of a deconstruction pattern from `D(Q) [v]` to `[final] D(Q) [v]` (in the latter case, `final` may be present only if `v` is also present), so that the user can choose to mark a pattern binding variable as `final`. (This is something that could be added right away or later.) So one could choose to write such things as:

    x instanceof final String s
    Box(final Frog f) = ...;
    final Box(final Frog f) b = ...;
    Box(Bag(final Frog f)) = ...;
    Box(final Bag(var x) theBag) = ...;


From brian.goetz at oracle.com Wed Aug 26 18:46:09 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Aug 2020 11:46:09 -0700 (PDT)
Subject: Finalizing in JDK 16 - Pattern matching for instanceof
In-Reply-To:
References:
Message-ID: <4bcd9f76-5aa3-e1c9-326f-c23a2042fa19@oracle.com>

>>
>> Proposed: An `instanceof` expression must be able to evaluate to both
>> true and false, otherwise it is invalid.  This rules out strongly
>> total patterns on the RHS.  If you have a strongly total pattern, use
>> pattern assignment instead.
>
> Makes sense to me, but one question: would this restriction "must be
> able to evaluate to both true and false" be applied to
> _every_ `instanceof` expression, or only those that have a pattern to
> the right of the `instanceof` keyword?  I ask because if it is applied to
> _every_ `instanceof` expression, this would represent an incompatible
> change to the behavior of `x instanceof Object`, among others.  Is it
> indeed the intent to make an incompatible change to the language?

Well, it's more of an aspiration than a rule -- based on what we already do.  We already use some form of this to rule out bad casts:

    String s = ...
    if (s instanceof Integer)  { ... }
    // Error, incompatible types

So here, we have a candidate instanceof that would always be false, which the compiler can derive by type analysis, and was always rejected.  It would be joined by

    if (s instanceof Object o) { ... }

because Object o is total in this case, but not

    if (s instanceof Object)

because this _can_ yield either true (if s is not null) or false (if it is.)

The reality is that the natural semantics of `instanceof ` and the semantics of pattern matching disagree in one small place.  But it still made more sense to extend instanceof to patterns rather than create a new "matches" operator, so we patch that small place by not allowing us to ask the confusing question.

>>
>> Like with lambda parameters, I am now thinking that we gave in to the
>> base desire to fix a past mistake, but in a way that doesn't really
>> make the language better or safer, just more complicated.  Let's back
>> this one out before it really bites us.
>
> I agree with this analysis.  It does suggest that we should consider
> whether to extend the syntax of a type pattern from `T x` to `[final]
> T x`, and the syntax of a deconstruction pattern from `D(Q) [v]` to
> `[final] D(Q) [v]` (in the latter case, `final` may be present only if
> `v` is also present), so that the user can choose to mark a pattern
> binding variable as `final`.  (This is something that could be added
> right away or later.)

Exactly so.  This would further be doubling down on the pleasing pun between variable declaration and type patterns.  I don't think we need it now, but some day we might feel it is missing, and if so, there's a ready answer that requires no new bikesheds to be harmed.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guy.steele at oracle.com Wed Aug 26 19:26:05 2020 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 26 Aug 2020 15:26:05 -0400 Subject: Finalizing in JDK 16 - Pattern matching for instanceof In-Reply-To: <4bcd9f76-5aa3-e1c9-326f-c23a2042fa19@oracle.com> References: <4bcd9f76-5aa3-e1c9-326f-c23a2042fa19@oracle.com> Message-ID: <619F3621-DC81-4E0F-BD86-7DC576576A77@oracle.com> > On Aug 26, 2020, at 2:46 PM, Brian Goetz wrote: > > >>> >>> Proposed: An `instanceof` expression must be able to evaluate to both true and false, otherwise it is invalid. This rules out strongly total patterns on the RHS. If you have a strongly total pattern, use pattern assignment instead. >> >> Makes sense to me, but one question: would this restriction "must be able to evaluate to both true and false? be applied to _every_ `instanceof` expression, or only those that have a pattern to right of the `instanceof` keyword? I ask because if it is applied to _every_ `instanceof` expression, this would represent an incompatible change to the behavior of `x instanceof Object`, among others. Is it indeed the intent to make an incompatible change to the language? > > Well, it's more of an aspiration than a rule -- based on what we already do. We already use some form of this to rule out bad casts: > > String s = ... > if (s instanceof Integer) { ... } > // Error, incompatible types > > So here, we have a candidate instanceof that would always be false, which the compiler can derive by type analysis, and was always rejected. It would be joined by > > if (s instanceof Object o) { ... } > > because Object o is total in this case, but not > > if (s instanceof Object) > > because this _can_ yield either true (if s is not null) or false (if it is.) Thank you: I had missed this point about null. Duh. -------------- next part -------------- An HTML attachment was scrubbed... 
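The point about null in the exchange above can be checked with a small sketch. The class and method names (`InstanceofNullSketch`, `isObject`) are hypothetical and chosen only for illustration; the sketch uses the plain type-test form of `instanceof`, not the pattern form under discussion, so it says nothing about how the proposed restriction would be enforced.

```java
public class InstanceofNullSketch {
    // Mirrors `s instanceof Object` from the thread: every String is an
    // Object, so the only way this test can be false is for s to be null.
    static boolean isObject(String s) {
        return s instanceof Object;
    }

    public static void main(String[] args) {
        System.out.println(isObject("hello")); // non-null reference: true
        System.out.println(isObject(null));    // null reference: false
    }
}
```

Because the test can still go either way, it "asks a question" in the sense used in the thread, which is why the type-test form would stay legal under the proposal.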
URL: From guy.steele at oracle.com Wed Aug 26 19:27:40 2020 From: guy.steele at oracle.com (Guy Steele) Date: Wed, 26 Aug 2020 15:27:40 -0400 Subject: [pattern-switch] Totality In-Reply-To: <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> Message-ID: <68A69B16-2B0A-4876-90EC-40CD2F061C6A@oracle.com> I like this new analysis. See below. > On Aug 26, 2020, at 1:01 PM, Brian Goetz wrote: > > I think we now have a sound story for totality in both patterns and switches. Let's start with refining what we mean by totality. > > We have seen a lot of cases -- and not just those involving enums or sealed types -- where we want to say a pattern or set of patterns is "total enough" to not force the user to explicitly handle the corner cases. In these cases, we will let the compiler handle the corner cases by generating exceptions. > > A prime example is the deconstruction pattern Foo(var x); this matches all Foos, but not null. Similarly, there is a whole family of such corner cases: Foo(Bar(var x)) matches all Foos, except for null and Foo(null). But we are in agreement that it would be overly pedantic to force the user to handle these explicitly. 
> > Note that these show up not only in switch, but in pattern assignment: > > Object o = e; // Object o is a total pattern on typeof(e), so always succeeds > Foo(var x) = foo // Total on Foo, except null, so should NPE on null > Foo(Bar(var x)) = foo // Total on Foo, except null and Foo(null), throw on these > > var y = switch (foo) { > case Foo(var x) -> bar(x); // Total on Foo, except null, so NPE on null > } > > These goals come from a pragmatic desire to not pedantically require the user to spell out the residue, but to accept the pattern (or set of patterns) as total anyway. > > > Now, let's put this intuition on a sounder footing by writing some formal rules for saying that a pattern P is _total on T with remainder R_, where R is a _set of patterns_. We will say P is _strongly total on T_ if P is total on T with empty remainder. > > The intuition is that, if P is total on T with remainder R, the values matched by R but not by P are "silly" values and a language construct like switch can (a) consider P sufficient to establish totality and (b) can insert synthetic tests for each of the patterns in R that throw. > > We will lean on this in _both_ switch and pattern assignment. For switch, we will treat it as if we insert synthetic cases after P that throw, so that the remaining values can still be matched by earlier explicit patterns. > > Invariant: If P is total on T with remainder R then, for all t in T, either t matches P, or t matches some pattern in R. (This is not the definition of what makes a pattern total; it is just something that is true about total patterns.) > > Base cases: > > - The type pattern `T t` is strongly total on U <: T. > - The inferred type pattern `var x` is strongly total on all T. > > Induction cases: > > - Let D(T) be a deconstructor. > - The deconstruction pattern D(Q), where Q is strongly total on T, is total on D with remainder { null }. 
>     - The deconstruction pattern D(Q), where Q is total on T with remainder R*, is total on D with remainder { null } union { D(R) : R in R* }

Note that the first sub-bullet is "merely" an important special case of the second sub-bullet.

> We can easily generalize the definition of totality to a set of patterns.  In this case, we can handle sealed types and enums:
>
>  - Let E be an enum type.  The set of patterns { C1 .. Cn } is total on E with remainder { null, E e } if C1 .. Cn contains all the constants of E.
>
> Observation: that `E e` pattern is pretty broad!  But that's OK; it captures all novel constant values, and, because the explicit cases cover all the known values, captures only the novel values.  Same for sealed types:
>
>  - Let S be a sealed abstract type or interface, with permitted direct subtypes C*, and P* be a set of patterns applicable to S.  If for each C in C*, there exists a subset Pc of P* that is total on C with remainder Rc, then P* is total on S with remainder { null } union \forall{c \in C}{ Rc }.

I think there should not be set braces around that last occurrence of "Rc".

>  - Let D(T) be a deconstructor, and let P* be total on T with remainder R*.  Then { D(P) : P in P* } is total on D with remainder { null } \union { D(R) : R in R* }.
>
> Example:
>   Container = Box | Bag
>   Shape = Circle | Rect
>   P* = { Box(Circle), Box(Rect), Bag(Circle), Bag(Rect) }

I assume that this use of syntax "T = U | V" is meant to imply that T is a sealed type that permits U and V.
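To make that reading concrete, here is a minimal sketch in Java of how the example's types might be declared as sealed types, together with a hand-written check of the four patterns in P*. Everything beyond the type names is an assumption made up for illustration: the record components (`radius`, `width`, `height`, `content`), the class name `SealedReading`, and the method `classify` do not come from the example. The final `throw` only stands in for whatever the four patterns leave unmatched.

```java
// Sketch only: one possible Java rendering of "T = U | V" as a sealed type.
class SealedReading {
    // "Shape = Circle | Rect": Shape permits exactly Circle and Rect.
    sealed interface Shape permits Circle, Rect {}
    record Circle(double radius) implements Shape {}               // component is hypothetical
    record Rect(double width, double height) implements Shape {}   // components are hypothetical

    // "Container = Box | Bag": Container permits exactly Box and Bag.
    sealed interface Container permits Box, Bag {}
    record Box(Shape content) implements Container {}              // component is hypothetical
    record Bag(Shape content) implements Container {}              // component is hypothetical

    // Tests a value against the four patterns of P* by hand.
    static String classify(Container c) {
        if (c instanceof Box b) {
            if (b.content() instanceof Circle) return "Box(Circle)";
            if (b.content() instanceof Rect) return "Box(Rect)";
        } else if (c instanceof Bag g) {
            if (g.content() instanceof Circle) return "Bag(Circle)";
            if (g.content() instanceof Rect) return "Bag(Rect)";
        }
        // Reached for null, Box(null), Bag(null), and anything else P* does not match.
        throw new IllegalStateException("not matched by P*: " + c);
    }

    public static void main(String[] args) {
        System.out.println(classify(new Box(new Circle(1.0))));
        System.out.println(classify(new Bag(new Rect(2.0, 3.0))));
    }
}
```

Values such as `new Box(null)` or `null` itself fall through to the `throw`, which is where the question of what belongs in the remainder comes in.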
> { Circle, Rect } total on Shape with remainder { Shape, null } > > -> { Box(Circle), Box(Rect) total on Box with remainder { Box(Shape), Box(null), null } > > -> { Bag(Circle), Bag(Rect) total on Bag with remainder { Bag(Shape), Bag(null), null } > > -> P* total on Container with remainder { Container(Box(Shape)), Container(Box(null)), Container(Bag(Shape)), Container(Box(Shape)), Container(null), null } I believe that, in this last remainder, the second occurrence of `Container(Box(Shape))` was intended to be `Container(Bag(null))`. But I also think that the phrase `Container(Box(Shape))` is inconsistent or incoherent; there has been some confusion of the _type parameter_ of `Container` with a _deconstruction parameter_ of `Container`. To say `Container(Box(?))` is as silly as to say `Shape(Rect)`. I will try to redo this derivation while being very explicit about the type parameters: Container = Box | Bag Shape = Circle | Rect P* = { Box(Circle), Box(Rect), Bag(Circle), Bag(Rect) } { Box(T), Bag(T) } total on Container with remainder { Container, null } Now instantiate the previous rule with T=Shape to get { Box(Shape), Bag(Shape) } total on Container(Shape) with remainder { Container(Shape), null } { Circle, Rect } total on Shape with remainder { Shape, null } -> { Box(Circle), Box(Rect) total on Box with remainder { Box(Shape), Box(null), null } -> { Bag(Circle), Bag(Rect) total on Bag with remainder { Bag(Shape), Bag(null), null } -> P* total on Container with remainder { Box(Shape), Box(null), Bag(Shape), Bag(null), Container, null } > Now: > > - A pattern assignment `P = e` requires that P be total on the static type of `e`, with some remainder R, and throws on the remainder. > - A total switch on `e` requires that its cases be total on the static type of `e`, with some remainder R, and inserts synthetic throwing cases at the end of the switch for each pattern in R. Yes! 
To which we can perhaps add: a pattern `instanceof` expression `x instanceof P` requires that the pattern P _not_ be total on the type of x.

> We can then decide how to opt into totality in switches, other than "be an expression switch."

Yes!

From brian.goetz at oracle.com Wed Aug 26 19:32:21 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 26 Aug 2020 15:32:21 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <68A69B16-2B0A-4876-90EC-40CD2F061C6A@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> <68A69B16-2B0A-4876-90EC-40CD2F061C6A@oracle.com>
Message-ID: <96e414e9-6f89-34ad-e2d4-28b550b3ec45@oracle.com>

>
>> We can easily generalize the definition of totality to a set of
>> patterns.  In this case, we can handle sealed types and enums:
>>
>>  - Let E be an enum type.  The set of patterns { C1 .. Cn } is total
>> on E with remainder { null, E e } if C1 .. Cn contains all the
>> constants of E.
>>
>> Observation: that `E e` pattern is pretty broad!  But that's OK; it
>> captures all novel constant values, and, because the explicit cases
>> cover all the known values, captures only the novel values.  Same for
>> sealed types:
>>
>>  - Let S be a sealed abstract type or interface, with permitted
>> direct subtypes C*, and P* be a set of patterns applicable to S.  If
>> for each C in C*, there exists a subset Pc of P* that is total on C
>> with remainder Rc, then P* is total on S with remainder { null }
>> union \forall{c \in C}{ Rc }.
> > I think there should not be set braces around that last occurrence of > ?Rc?. Right. > I assume that this use of syntax ?T = U | V? is meant to imply that T > is a sealed type that permits U and V. Right. > >> { Circle, Rect } total on Shape with remainder { Shape, null } >> >> -> { Box(Circle), Box(Rect) total on Box with remainder { >> Box(Shape), Box(null), null } >> >> -> { Bag(Circle), Bag(Rect) total on Bag with remainder { >> Bag(Shape), Bag(null), null } >> >> -> P* total on Container with remainder { >> Container(Box(Shape)), Container(Box(null)), Container(Bag(Shape)), >> Container(Box(Shape)), Container(null), null } > > I believe that, in this last remainder, the second occurrence of > `Container(Box(Shape))` was intended to be `Container(Bag(null))`. Yes, cut and paste bug.? This IDE needs better inspection helpers! -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Aug 26 19:37:05 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Aug 2020 15:37:05 -0400 Subject: [pattern-switch] Summary of open issues In-Reply-To: References: Message-ID: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> So, scorecard: ?- I think the totality, exhaustiveness, and nullity problems are now put to bed in a consistent and principled manner, along with a reasonable new job for "default" that doesn't leave it to rot.? Remaining is: ?? - How does a switch opt into totality, other than by being an expression switch? ?- Guards: there's a bikeshed to be painted, what's stopping you guys? ?- Restrictions on instanceof: proposed, under discussion. On 8/14/2020 1:19 PM, Brian Goetz wrote: > Here's a summary of the issues raised in the reviews of the > patterns-in-switch document.? I'm going to (try to) start a new thread > for each of them; let's not reply to this one with new topics (or with > discussion on these topics.)? I'll update this thread as we add or > remove things from the list. 
> > ?- Is totality too subtle? (Remi) There is some concern that the > notion of using totality to subsume nullability (at least in nested > contexts) is sound, he is concerned that the difference between total > and non-total patterns may be too subtle, and this may lead to NPE > issues.? To evaluate this, we need to evaluate both the "is totality > too subtle" and the "how much are we worried about NPE in this > context" directions. > > ?- Guards.? (John, Tagir) There is acknowledgement that some sort of > "whoops, not this case" support is needed in order to maintain switch > as a useful construct in the face of richer case labels, but some > disagreement about whether an imperative statement (e.g., continue) or > a declarative guard (e.g., `when `) is the right choice. > > ?- Exhaustiveness and null. (Tagir)? For sealed domains (enums and > sealed types), we kind of cheated with expression switches because we > could count on the switch filtering out the null. But Tagir raises an > excellent point, which is that we do not yet have a sound definition > of exhaustiveness that scales to nested patterns (do Box(Rect) and > Box(Circle) cover Box(Shape)?)? This is an interaction between sealed > types and patterns that needs to be ironed out.? (Thanks Tagir!) > > ?- Switch and null. (Tagir, Kevin)? Should we reconsider trying to > rehabilitate switches null-acceptance?? There are several who are > questioning whether this is trying to push things too far for too > little benefit. > > ?- Rehabilitating default.? The current design leaves default to rot; > it is possible it has a better role to play with respect to the > rehabilitation of switch, such as signalling that the switch is total. > > ?- Restrictions on instanceof.? It has been proposed that we restrict > total patterns from instanceof to avoid confusion; while no one has > really objected, a few people have expressed mild discomfort.? 
Leaving > it on the list for now until we resolve some of the other nullity > questions. > > ?- Meta. (Brian)? Nearly all of this is about null.? Is it possible > that everything else about the proposal is so perfect that there's > nothing else to talk about?? Seems unlikely.? I recommend we turn up > the attenuation knob on nullity issues to leave some oxygen for some > of the other flowers. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Aug 26 20:43:20 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Aug 2020 16:43:20 -0400 Subject: [pattern-switch] Totality In-Reply-To: <68A69B16-2B0A-4876-90EC-40CD2F061C6A@oracle.com> References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> <68A69B16-2B0A-4876-90EC-40CD2F061C6A@oracle.com> Message-ID: <45cdabfb-a0ca-1d4f-9cbe-2dbbf0e9c3ee@oracle.com> >> >> ?- Let S be a sealed abstract type or interface, with permitted >> direct subtypes C*, and P* be a set of patterns applicable to S.? If >> for each C in C*, there exists a subset Pc of P* that is total on C >> with remainder Rc, then P* is total on S with remainder { null } >> union \forall{c \in C}{ Rc }. > Guy's example implicitly points out that this rule is missing a case.? It should be { null, S s } union { Rc : c in C }.? The `S s` entry is the analogue of `E e` for sealed types. The analogue of this rule, when S is concrete, is: ?- Let S be a sealed concrete class, with permitted direct subtypes C*, and P* be a set of patterns applicable to S.? 
If for each C in C* union { S }, there exists a subset Pc of P* that is total on C with remainder Rc, then P* is total on S with remainder { null } union { Rc : c in C }. That is, we have to consider `S` one of its own subtypes when S is concrete, and then we don't need it in the remainder. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Wed Aug 26 20:53:36 2020 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 26 Aug 2020 13:53:36 -0700 Subject: [pattern-switch] Guards In-Reply-To: <09e39b8d-2e1c-ca19-9617-dec2a3f5996e@oracle.com> References: <09e39b8d-2e1c-ca19-9617-dec2a3f5996e@oracle.com> Message-ID: I think it will usually be easy for users to keep guards simple, by just extracting out a method when they're not, and the aesthetics will even motivate them to do that. With `continue`, there might be any amount of complexity, even state changes, in the statement group before the `continue` is hit, and it's not necessarily extractable. Really, it just seems like we now have two different kinds of fall-through layered on top of each other. The silent continue is for going to the next group, while the explicit continue goes back to checking for a match. Why that way and not the other way around? Just legacy reasons. I like the guards. On Fri, Aug 14, 2020 at 10:21 AM Brian Goetz wrote: > > - Guards. (John, Tagir) There is acknowledgement that some sort of > "whoops, not this case" support is needed in order to maintain switch as a > useful construct in the face of richer case labels, but some disagreement > about whether an imperative statement (e.g., continue) or a declarative > guard (e.g., `when `) is the right choice. > > > This is probably the biggest blocking decision in front of us. 
> > John correctly points out that the need for some sort of guard is a direct > consequence of making switch stronger; with the current meaning of switch, > which is "which one of these is it", there's no need for backtracking, but > as we can express richer case labels, the risk of the case label _not being > rich enough_ starts to loom. > > We explored rolling boolean guards into patterns themselves (`P && g`), > which was theoretically attractive but turned out to not be all that > great. There are some potential ambiguities (even if we do something else > about constant patterns, there are still some patterns that look like > expressions and vice versa, making the grammar ugly here) and it just > doesn't have that much incremental expressive power, since the most > credible other use of patterns already (instanceof) has no problem > conjoining additional conditions, because it's a boolean expression. So > this is largely about filling in the gaps of switch so that we don't have > fall-off-the-cliff behaviors. > > There are two credible approaches here: > > - An imperative statement (like `continue` or `next-case`), which means > "whoops, fell in the wrong bucket, please backtrack to the dispatch"; > > - A declarative clause on the case label (like `when `) that > qualifies whether the case is selected. > > Most of the discussion so far has been on the axis of "continue is > lower-level, and therefore better suited to be a language primitive" vs > "the code that uses guards is easier to read and reason about." Assuming > we have to do one (and I think we do), we have three choices (one, the > other, or both.) I think we should step away from the either/or mentality > and try to shine a light on what goes well, or badly, when we _don't_ have > one or the other. > > For example, with guards, we can express fine degrees of refinement in the > case labels: > > case P & g1: ... > case P & g2: ... > case P & g3: ... 
> > but without them, we can only have one `case P`: > > case P: > if (g1) { ... } > else if (g2) { ... } > else if (g3) { ... } > > My main fear of the without-guards branches is that it will be > prohibitively hard to understand what a switch is doing, because the case > arms will be full of imperative control-flow logic. > > On the other hand, a valid concern when you have guards is that there will > be so much logic in the guard that you won't be able to tell where the case > label ends and where the arm begins. > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Aug 26 20:57:34 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 26 Aug 2020 16:57:34 -0400 Subject: [pattern-switch] Guards In-Reply-To: References: <09e39b8d-2e1c-ca19-9617-dec2a3f5996e@oracle.com> Message-ID: <983c33a5-3147-c5af-8256-275e02f4715b@oracle.com> This inclination (guards are simple, continue is complicated) is also supported by the following observation: we just went through a big exercise for (optimistic) totality, but any pattern that is guarded (imperatively or declaratively) can't be assumed to cover anything.? It is far easier to spot a pattern with a guard (even if the guard is `&& false`) and know it doesn't contribute to totality, than to scan the imperative body looking for a `continue`. On 8/26/2020 4:53 PM, Kevin Bourrillion wrote: > I think it will usually be easy for users to keep guards simple, by > just extracting out a method when they're not, and the aesthetics will > even motivate them to do that. > > With `continue`, there might be any amount of complexity, even state > changes, in the statement group before the `continue` is hit, and it's > not necessarily extractable. > > Really, it just seems like we now have two different kinds of > fall-through layered on top of each other. 
The silent continue is for > going to the next group, while the explicit continue goes back to > checking for a match. Why that way and not the other way around? Just > legacy reasons. > > I like the guards. > > > > On Fri, Aug 14, 2020 at 10:21 AM Brian Goetz > wrote: > >> >> ?- Guards.? (John, Tagir) There is acknowledgement that some sort >> of "whoops, not this case" support is needed in order to maintain >> switch as a useful construct in the face of richer case labels, >> but some disagreement about whether an imperative statement >> (e.g., continue) or a declarative guard (e.g., `when >> `) is the right choice. > > This is probably the biggest blocking decision in front of us. > > John correctly points out that the need for some sort of guard is > a direct consequence of making switch stronger; with the current > meaning of switch, which is "which one of these is it", there's no > need for backtracking, but as we can express richer case labels, > the risk of the case label _not being rich enough_ starts to loom. > > We explored rolling boolean guards into patterns themselves (`P && > g`), which was theoretically attractive but turned out to not be > all that great. There are some potential ambiguities (even if we > do something else about constant patterns, there are still some > patterns that look like expressions and vice versa, making the > grammar ugly here) and it just doesn't have that much incremental > expressive power, since the most credible other use of patterns > already (instanceof) has no problem conjoining additional > conditions, because it's a boolean expression.? So this is largely > about filling in the gaps of switch so that we don't have > fall-off-the-cliff behaviors. 
> > There are two credible approaches here: > > ?- An imperative statement (like `continue` or `next-case`), which > means "whoops, fell in the wrong bucket, please backtrack to the > dispatch"; > > ?- A declarative clause on the case label (like `when > `) that qualifies whether the case is selected. > > Most of the discussion so far has been on the axis of "continue is > lower-level, and therefore better suited to be a language > primitive" vs "the code that uses guards is easier to read and > reason about."? Assuming we have to do one (and I think we do), we > have three choices (one, the other, or both.)? I think we should > step away from the either/or mentality and try to shine a light on > what goes well, or badly, when we _don't_ have one or the other. > > For example, with guards, we can express fine degrees of > refinement in the case labels: > > ? ? case P & g1: ... > ? ? case P & g2: ... > ? ? case P & g3: ... > > but without them, we can only have one `case P`: > > ? ? case P: > ? ? ? ? if (g1) { ... } > ? ? ? ? else if (g2) { ... } > ? ? ? ? else if (g3) { ... } > > My main fear of the without-guards branches is that it will be > prohibitively hard to understand what a switch is doing, because > the case arms will be full of imperative control-flow logic. > > On the other hand, a valid concern when you have guards is that > there will be so much logic in the guard that you won't be able to > tell where the case label ends and where the arm begins. > > > > > -- > Kevin Bourrillion?|?Java Librarian |?Google, Inc.?|kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kevinb at google.com Wed Aug 26 21:16:35 2020 From: kevinb at google.com (Kevin Bourrillion) Date: Wed, 26 Aug 2020 14:16:35 -0700 Subject: [pattern-switch] Guards In-Reply-To: References: <09e39b8d-2e1c-ca19-9617-dec2a3f5996e@oracle.com> Message-ID: On Wed, Aug 26, 2020 at 1:53 PM Kevin Bourrillion wrote: The silent continue is for going to the next group, while the explicit > continue goes back to checking for a match. Why that way and not the other > way around? Just legacy reasons. > Contrast: in a loop, implicit and explicit continue mean the same thing; *neither* of them means unconditionally. So, by "I like the guards", I mean I would favor having guards and NOT supporting continue. You can already have a whole night of pub trivia about switch as it is. :-) On Fri, Aug 14, 2020 at 10:21 AM Brian Goetz wrote: > >> >> - Guards. (John, Tagir) There is acknowledgement that some sort of >> "whoops, not this case" support is needed in order to maintain switch as a >> useful construct in the face of richer case labels, but some disagreement >> about whether an imperative statement (e.g., continue) or a declarative >> guard (e.g., `when `) is the right choice. >> >> >> This is probably the biggest blocking decision in front of us. >> >> John correctly points out that the need for some sort of guard is a >> direct consequence of making switch stronger; with the current meaning of >> switch, which is "which one of these is it", there's no need for >> backtracking, but as we can express richer case labels, the risk of the >> case label _not being rich enough_ starts to loom. >> >> We explored rolling boolean guards into patterns themselves (`P && g`), >> which was theoretically attractive but turned out to not be all that >> great. 
There are some potential ambiguities (even if we do something else >> about constant patterns, there are still some patterns that look like >> expressions and vice versa, making the grammar ugly here) and it just >> doesn't have that much incremental expressive power, since the most >> credible other use of patterns already (instanceof) has no problem >> conjoining additional conditions, because it's a boolean expression. So >> this is largely about filling in the gaps of switch so that we don't have >> fall-off-the-cliff behaviors. >> >> There are two credible approaches here: >> >> - An imperative statement (like `continue` or `next-case`), which means >> "whoops, fell in the wrong bucket, please backtrack to the dispatch"; >> >> - A declarative clause on the case label (like `when `) that >> qualifies whether the case is selected. >> >> Most of the discussion so far has been on the axis of "continue is >> lower-level, and therefore better suited to be a language primitive" vs >> "the code that uses guards is easier to read and reason about." Assuming >> we have to do one (and I think we do), we have three choices (one, the >> other, or both.) I think we should step away from the either/or mentality >> and try to shine a light on what goes well, or badly, when we _don't_ have >> one or the other. >> >> For example, with guards, we can express fine degrees of refinement in >> the case labels: >> >> case P & g1: ... >> case P & g2: ... >> case P & g3: ... >> >> but without them, we can only have one `case P`: >> >> case P: >> if (g1) { ... } >> else if (g2) { ... } >> else if (g3) { ... } >> >> My main fear of the without-guards branches is that it will be >> prohibitively hard to understand what a switch is doing, because the case >> arms will be full of imperative control-flow logic. 
>> >> On the other hand, a valid concern when you have guards is that there >> will be so much logic in the guard that you won't be able to tell where the >> case label ends and where the arm begins. >> >> >> > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Fri Aug 28 01:13:38 2020 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 28 Aug 2020 03:13:38 +0200 (CEST) Subject: [pattern-switch] Totality In-Reply-To: <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> References: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> Message-ID: <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Guy Steele" > Cc: "amber-spec-experts" , > "amber-spec-experts" > Envoy?: Mercredi 26 Ao?t 2020 19:01:16 > Objet: Re: [pattern-switch] Totality > I think we now have a sound story for totality in both patterns and switches. > Let's start with refining what we mean by totality. > We have seen a lot of cases -- and not just those involving enums or sealed > types -- where we want to say a pattern or set of patterns is "total enough" to > not force the user to explicitly handle the corner cases. In these cases, we > will let the compiler handle the corner cases by generating exceptions. > A prime example is the deconstruction pattern Foo(var x); this matches all Foos, > but not null. Similarly, there is a whole family of such corner cases: > Foo(Bar(var x)) matches all Foos, except for null and Foo(null). 
But we are in
> agreement that it would be overly pedantic to force the user to handle these
> explicitly.

While I agree with the general idea, I disagree that the compiler should handle the null and total cases. It cannot handle the total case; that has to be handled by the runtime, not the compiler.

I will use '?' for the total type. For example, this switch is "total enough":

  sealed interface Stuff permits Pixel, Car {}
  record Pixel(int x, int y, Color color) implements Stuff {}
  record Car(Color color) implements Stuff {}

  switch(stuff) {
    case Pixel(? x, ? y, Color color) -> color;
    case Car(Color color) -> color;
  }

We agreed during one of our first meetings that if Pixel is changed to
  record Pixel(BigInteger x, BigInteger y, Color color) {}
the switch should still be OK, with or without recompilation.

So here, '?' cannot be 'var', at least if we want var to mean only the inferred type.
In terms of semantics, there is a difference between var, which is an implicit type, and '?', which stands for any type whatsoever.

That doesn't mean that '?' cannot be spelled 'var', but it demonstrates that the corner cases should be managed by the runtime, not the compiler.

Rémi
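For readers following along, here is a rough sketch of what the "total enough" switch over Stuff could look like when written out by hand, with the leftover cases rejected at run time. It is only an illustration under stated assumptions: the `Color` enum, the class name `TotalEnoughSketch`, the method `colorOf`, and the choice of `IllegalStateException` for an unexpected subtype are all made up here, and the '?' notation is not Java syntax, so the sketch simply never looks at `x` or `y`.

```java
import java.util.Objects;

// Sketch only: a hand-written equivalent of the switch over Stuff.
class TotalEnoughSketch {
    enum Color { RED, GREEN, BLUE }   // hypothetical stand-in for the Color type

    sealed interface Stuff permits Pixel, Car {}
    record Pixel(int x, int y, Color color) implements Stuff {}
    record Car(Color color) implements Stuff {}

    static Color colorOf(Stuff stuff) {
        // null is part of the remainder: reject it with a NullPointerException.
        Objects.requireNonNull(stuff, "stuff");
        // The two explicit cases; the components x and y are never inspected.
        if (stuff instanceof Pixel p) return p.color();
        if (stuff instanceof Car c) return c.color();
        // A Stuff subtype this code was not compiled against (exception type is an assumption).
        throw new IllegalStateException("unexpected Stuff: " + stuff.getClass());
    }

    public static void main(String[] args) {
        System.out.println(colorOf(new Pixel(1, 2, Color.RED)));
        System.out.println(colorOf(new Car(Color.BLUE)));
    }
}
```

Because `colorOf` only calls `color()`, nothing in its body names the types of `x` and `y`; whether a compiled pattern switch could or should have the same property is the point under discussion.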
From brian.goetz at oracle.com Fri Aug 28 13:45:41 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 28 Aug 2020 09:45:41 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
References: <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
Message-ID: <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>

> While i agree with the general idea, i disagree with the fact that the
> compiler should handle the null and total cases, it can not handle the
> total case, it has to be handled by the by the runtime, not the compiler.
>
> I will use '?' for the total type, by example this switch is "total
> enough"
>   sealed interface Stuff permits Pixel, Car {}
>   record Pixel(int x, int y, Color color) implements Stuff {}
>   record Car(Color color) implements Stuff {}
>   switch(stuff) {
>     case Pixel(? x, ? y, Color color) -> color;
>     case Car(Color color) -> color
>   }
>
> We have agreed during one of our first meetings that we want that if
> Pixel is changed to record Pixel(BigInteger x, BigInteger y, Color
> color) {} with or without recompilation of the switch it should be Ok.

I don't recall agreeing to anything like this, so perhaps you can dig up a reference, and I can try to reconstruct what I thought I was agreeing to?

It seems that you are arguing that the use site of a pattern match should be unaffected by (possibly incompatible) changes in the declaration of the pattern.  But this can't possibly be what you are suggesting.

(If you want to have a conversation on what declaration changes are behaviorally compatible, that is a fine conversation!)

> That doesn't mean that '?'
can not be spelt 'var', but it demonstrates
> that the corner cases should be managed by the runtime, not the compiler.
>

I don't understand what you mean by this distinction.  The compiler identifies the corner cases, and generates runtime code to handle them, just like it does with expression enum switches today.

From forax at univ-mlv.fr Fri Aug 28 17:58:08 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 28 Aug 2020 19:58:08 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To: <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
References: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr> <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
Message-ID: <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Guy Steele", "amber-spec-experts"
> Sent: Friday, 28 August 2020 15:45:41
> Subject: Re: [pattern-switch] Totality

>> While i agree with the general idea, i disagree with the fact that the compiler
>> should handle the null and total cases, it can not handle the total case, it
>> has to be handled by the by the runtime, not the compiler.

>> I will use '?' for the total type, by example this switch is "total enough"
>>   sealed interface Stuff permits Pixel, Car {}
>>   record Pixel(int x, int y, Color color) implements Stuff {}
>>   record Car(Color color) implements Stuff {}
>>   switch(stuff) {
>>     case Pixel(? x, ? y, Color color) -> color;
>>     case Car(Color color) -> color
>>   }

>> We have agreed during one of our first meetings that we want that if Pixel is
>> changed to record Pixel(BigInteger x, BigInteger y, Color color) {} with or
>> without recompilation of the switch it should be Ok.
> I don't recall agreeing to anything like this, so perhaps you can dig up a
> reference, and I can try to reconstruct what I thought I was agreeing to?

https://cr.openjdk.java.net/~briangoetz/amber/pattern-match-translation.html
section Migration Compat

> It seems that you are arguing that the use site of a pattern match should be
> unaffected by (possibly incompatible) changes in the declaration of the
> pattern. But this can't possibly be what you are suggesting.

It should not be affected by compatible changes, obviously; the question is what a compatible change is.
Compatible changes and a switch being total are intertwined.

> (If you want to have a conversation on what declaration changes are behaviorally
> compatible, that is a fine conversation!)

>> That doesn't mean that '?' can not be spelt 'var', but it demonstrates that the
>> corner cases should be managed by the runtime, not the compiler.

> I don't understand what you mean by this distinction. The compiler identifies
> the corner cases, and generates runtime code to handle them, just like it does
> with expression enum switches today.

My point is that some patterns, like the type pattern, are runtime checks. For a total pattern, being total is a runtime property, not a compile-time property.

If a case like
  case Pixel(var x, var y, var color)
is total, and if x and y are not used, having the types of x and y implicitly encoded in the bytecode makes the refactoring from impossible, thus the type of any inner pattern that follows a destructuring can only be tested at runtime. Otherwise you will have a lot of ICCEs thrown spuriously.

Rémi
URL: From brian.goetz at oracle.com Fri Aug 28 19:40:02 2020 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 28 Aug 2020 15:40:02 -0400 Subject: [pattern-switch] Totality In-Reply-To: <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr> References: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <1853BA58-6F71-4367-83DE-20E9738631AB@oracle.com> <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com> <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com> <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr> <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com> <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr> Message-ID: <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com> > I don't recall agreeing to anything like this, so perhaps you can > dig up a reference, and I can try to reconstruct what I thought I > was agreeing to? > > > https://cr.openjdk.java.net/~briangoetz/amber/pattern-match-translation.html > section Migration Compat Haha, OK.?? Yes, that was a first draft from before we really understood the problem.? In light of further exploration and understanding, I'm fine to see that particular sentence go under the bus.? It was a reasonable stake-in-the-ground about compatibility given what we knew at the time. > > It should not be affected by compatible changes, obviously, the > question is what a compatible change is. > Compatible changes and a switch being total are intertwined, Realistically, any significant change in the type hierachy is likely to be incompatible.? Consider: ??? case SonOfFoo: ??? case Foo: If we change from Foo extends SonOfFoo to the opposite, compiled switches won't work the same way.? Oh well. > > If a case like case Pixel(var x, var y, var color) is total, and if x > and y are not used, having the type of x and y implicitly encoded in > the bytecode makes the refactoring from impossible, > thus the type on any inner pattern that follows a destructuring can > only be tested at runtime. 
Otherwise you will have a lot of ICCE that
> will be thrown spuriously.
>

What? Changing the signature of a member to use completely unrelated
types is a refactoring that happens "a lot"? Makes no sense. (Nor can it
reasonably be expected to be compatible.)

I think you need to back up and try to distill what you are actually
worried about here.

From forax at univ-mlv.fr Fri Aug 28 21:59:55 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 28 Aug 2020 23:59:55 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To: <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
References: <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
 <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com>
 <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
Message-ID: <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Guy Steele", "amber-spec-experts"
> Sent: Friday, 28 August 2020 21:40:02
> Subject: Re: [pattern-switch] Totality

>>> I don't recall agreeing to anything like this, so perhaps you can dig up a
>>> reference, and I can try to reconstruct what I thought I was agreeing to?

>> https://cr.openjdk.java.net/~briangoetz/amber/pattern-match-translation.html
>> section Migration Compat

> Haha, OK. Yes, that was a first draft from before we really understood the
> problem. In light of further exploration and understanding, I'm fine to see
> that particular sentence go under the bus. It was a reasonable
> stake-in-the-ground about compatibility given what we knew at the time.

>> It should not be affected by compatible changes, obviously, the question is what
>> a compatible change is.
>> Compatible changes and a switch being total are intertwined,

> Realistically, any significant change in the type hierarchy is likely to be
> incompatible. Consider:

> case SonOfFoo:
> case Foo:

> If we change from Foo extends SonOfFoo to the opposite, compiled switches won't
> work the same way. Oh well.

Yes, hierarchy changes can be checked once at runtime, but I'm not
suggesting that. In my opinion, a cascade of if ... instanceof and a switch
on types should have the same constraints; we can evaluate whether we want
to catch more errors at runtime, but I think you should first be sure we
have the same constraints as an if ... instanceof.

>> If a case like case Pixel(var x, var y, var color) is total, and if x and y are
>> not used, having the type of x and y implicitly encoded in the bytecode makes
>> the refactoring impossible,
>> thus the type on any inner pattern that follows a destructuring can only be
>> tested at runtime. Otherwise you will have a lot of ICCE that will be thrown
>> spuriously.

> What? Changing the signature of a member to use completely unrelated types is a
> refactoring that happens "a lot"? Makes no sense. (Nor can it reasonably be
> expected to be compatible.)

It's not rare to add components to a sum type; for example, a MouseEvent is
upgraded to add the number of mouse wheel scroll knobs.

Again, it should work like a cascade of if ... instanceof, so

  case Pixel(var x, var y, var color) -> color

should be equivalent to

  if x instanceof Pixel p { yield p.color() }

or, if it's a total pattern,

  Pixel p = x;
  yield p.color();

As you see, there is no constraint on the other components of a Pixel apart
from color.

Rémi

From forax at univ-mlv.fr Fri Aug 28 22:36:29 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 29 Aug 2020 00:36:29 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To: <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
References: <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com>
 <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
 <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
Message-ID: <185150464.960647.1598654189960.JavaMail.zimbra@u-pem.fr>

> From: "Remi Forax"
> To: "Brian Goetz"
> Cc: "Guy Steele", "amber-spec-experts"
> Sent: Friday, 28 August 2020 23:59:55
> Subject: Re: [pattern-switch] Totality

>> From: "Brian Goetz"
>> To: "Remi Forax"
>> Cc: "Guy Steele", "amber-spec-experts"
>> Sent: Friday, 28 August 2020 21:40:02
>> Subject: Re: [pattern-switch] Totality

>>>> I don't recall agreeing to anything like this, so perhaps you can dig up a
>>>> reference, and I can try to reconstruct what I thought I was agreeing to?

>>> https://cr.openjdk.java.net/~briangoetz/amber/pattern-match-translation.html
>>> section Migration Compat

>> Haha, OK. Yes, that was a first draft from before we really understood the
>> problem. In light of further exploration and understanding, I'm fine to see
>> that particular sentence go under the bus. It was a reasonable
>> stake-in-the-ground about compatibility given what we knew at the time.

>>> It should not be affected by compatible changes, obviously, the question is what
>>> a compatible change is.
>>> Compatible changes and a switch being total are intertwined,

>> Realistically, any significant change in the type hierarchy is likely to be
>> incompatible.
Consider:

>> case SonOfFoo:
>> case Foo:

>> If we change from Foo extends SonOfFoo to the opposite, compiled switches won't
>> work the same way. Oh well.

> Yes, hierarchy changes can be checked once at runtime, but I'm not
> suggesting that. In my opinion, a cascade of if ... instanceof and a switch
> on types should have the same constraints; we can evaluate whether we want
> to catch more errors at runtime, but I think you should first be sure we
> have the same constraints as an if ... instanceof.

>>> If a case like case Pixel(var x, var y, var color) is total, and if x and y are
>>> not used, having the type of x and y implicitly encoded in the bytecode makes
>>> the refactoring impossible,
>>> thus the type on any inner pattern that follows a destructuring can only be
>>> tested at runtime. Otherwise you will have a lot of ICCE that will be thrown
>>> spuriously.

>> What? Changing the signature of a member to use completely unrelated types is a
>> refactoring that happens "a lot"? Makes no sense. (Nor can it reasonably be
>> expected to be compatible.)

> It's not rare to add components to a sum type; for example, a MouseEvent is
> upgraded to add the number of mouse wheel scroll knobs.

sum type => product type

> Again, it should work like a cascade of if ... instanceof, so
>   case Pixel(var x, var y, var color) -> color
> should be equivalent to
>   if x instanceof Pixel p { yield p.color() }
> or, if it's a total pattern,
>   Pixel p = x;
>   yield p.color();

> As you see, there is no constraint on the other components of a Pixel apart
> from color.

> Rémi

From brian.goetz at oracle.com Fri Aug 28 22:43:10 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 28 Aug 2020 18:43:10 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
References: <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
 <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com>
 <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
 <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
Message-ID:

>
> It's not rare to add components to a sum type, by example a MouseEvent
> is upgraded to add the number of mouse wheel scroll knobs.

(later corrected to "product")

To do that, you can't just blindly add a new component, because existing
ctor calls will no longer link. If yesterday you had

    record Foo(int x, int y) { }

then if you want to add a z, you have to:

    record Foo(int x, int y, int z) {
        Foo(int x, int y) { this(x,y, 0); }
    }

and now old ctor sites will link.

It's exactly the same with dtors; if you add a component, you have to
explicitly provide the dtor with the old descriptor.

> in my opinion, a cascade of if ... instanceof and a switch on types
> should have the same constraints

It's a good intuition, but can you be specific about what constraints
you are talking about?
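
The constructor half of the argument above can be tried out directly. The following is a minimal sketch, assuming a JDK where records are final (16 or later); the wrapper class `RecordEvolution`, the `public` modifiers and the `main` method are illustrative additions, while the record is the one from the message. The deconstructor half is not shown, because the thread does not fix a declaration syntax for deconstructors.

```java
public class RecordEvolution {
    // "Today's" version of the record: component z has been added.
    public record Foo(int x, int y, int z) {
        // Old-arity constructor kept so that call sites written against
        // the two-component version still resolve; z gets a default.
        public Foo(int x, int y) {
            this(x, y, 0);
        }
    }

    public static void main(String[] args) {
        Foo old = new Foo(1, 2);        // call site written "yesterday"
        Foo current = new Foo(1, 2, 3); // call site written "today"
        System.out.println(old.z());
        System.out.println(current.z());
    }
}
```

Without the two-argument constructor, `new Foo(1, 2)` no longer compiles, and an already compiled caller would fail to link against the new class file.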

From guy.steele at oracle.com Fri Aug 28 22:49:26 2020
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 28 Aug 2020 18:49:26 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
References: <67c1b1ab-3913-ccee-85c5-6cc1d137afbf@oracle.com>
 <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com>
 <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
 <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
Message-ID:

> On Aug 28, 2020, at 5:59 PM, forax at univ-mlv.fr wrote:
>
> . . .
> Again, it should work like a cascade of if ... instanceof, so
>   case Pixel(var x, var y, var color) -> color
> should be equivalent to
>   if x instanceof Pixel p { yield p.color() }

But I do not believe that at all. I do believe that

    case Pixel(var x, var y, var color) -> color

should be equivalent to

    if x instanceof Pixel(var x, var y, var color) p { yield p.color() }

or, if you prefer, to

    if x instanceof Pixel(var x, var y, var color) { yield color }

The point is that the switch label `case Pixel(var x, var y, var color)`
does not merely demand that the selector value be a Pixel; it demands that
it be a Pixel having a specific three-argument destructor. It can be
equivalent only to an instanceof expression that makes those same demands.

If you want a switch clause that is equivalent to

    if x instanceof Pixel p { yield p.color() }

then you should write

    case Pixel p -> p.color()

From daniel.smith at oracle.com Fri Aug 28 23:15:40 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 28 Aug 2020 17:15:40 -0600
Subject: Sealed local interfaces
In-Reply-To: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr>
References: <385055420.151260.1597673243223.JavaMail.zimbra@u-pem.fr>
Message-ID: <20D5BB10-17CD-4621-AE1F-FC357CB5E923@oracle.com>

> On Aug 17, 2020, at 8:07 AM, Remi Forax wrote:
>
> static void foo() {
>   sealed interface I {}
>
>   record Foo implements I {}
> }
>
> but this code does not compile because Foo is a local class and a local
> class can not implement a sealed interface.

This scenario is not directly covered by the specs, because local
interfaces are a feature of the Records JEP, while 'sealed' is a feature of
the Sealed Classes JEP. (If we'd noticed the interaction, we could have
specified Sealed Classes in terms of the "Local Static Interfaces and Enum
Classes" changes. But if those changes are finalized in 16, the point will
be moot.)

However, we *did* make a deliberate choice that a local *class* cannot be
'sealed' at all, and that restriction should naturally apply to local
interfaces, too:

"It is a compile-time error if a local class declaration contains any of
the access modifiers public, protected, or private (6.6), or any of the
modifiers static (8.1.1), sealed or non-sealed (8.1.1.2)."

I kind of think this is overly-restrictive, but it was a safe way to
sidestep all the questions that arise due to classes being "unavailable"
from a certain scope. I think it's worth thinking more carefully about this
in the next iteration.
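
As a rough illustration of what the quoted rule leaves available, here is a sketch of a local interface implemented by a local record, without the `sealed` modifier. It assumes a JDK where records and local interfaces are final (16 or later); the names `LocalTypes`, `describe` and the `name` component are made up for the example and do not come from the thread.

```java
public class LocalTypes {
    public static String describe() {
        // A local interface: permitted alongside local records.
        // Under the rule quoted above, writing 'sealed interface I'
        // here would be a compile-time error.
        interface I {
            String name();
        }

        // A local record implementing the local interface; the record's
        // accessor name() satisfies the interface method.
        record Foo(String name) implements I {}

        I value = new Foo("foo");
        return value.name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The sketch only shows the non-sealed form; whether a later iteration should allow `sealed` on local declarations is exactly the open question raised in the message.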

From forax at univ-mlv.fr Sat Aug 29 00:27:08 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 29 Aug 2020 02:27:08 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To:
References: <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com>
 <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
 <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
Message-ID: <247070559.965810.1598660828263.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Guy Steele", "amber-spec-experts"
> Sent: Saturday, 29 August 2020 00:43:10
> Subject: Re: [pattern-switch] Totality

>>
>> It's not rare to add components to a sum type, by example a MouseEvent
>> is upgraded to add the number of mouse wheel scroll knobs.
>
> (later corrected to "product")
>
> To do that, you can't just blindly add a new component, because existing
> ctor calls will no longer link. If yesterday you had
>
>     record Foo(int x, int y) { }
>
> then if you want to add a z, you have to:
>
>     record Foo(int x, int y, int z) {
>         Foo(int x, int y) { this(x,y, 0); }
>     }
>
> and now old ctor sites will link.
>
> It's exactly the same with dtors; if you add a component, you have to
> explicitly provide the dtor with the old descriptor.

I'm challenging that view. First, the equivalent of a destructor is not a
constructor, at least not a plain one; it's more like the canonical
constructor, the one that asks for the names of the parameters to be in a
certain order, i.e. the one that allows creating an object from
names + values instead of positions + values.

In that model, you don't need several deconstructors; you only need one,
because the runtime of the pattern matcher will only extract the part that
is useful, by names instead of by positions.
When you specify a destructuring pattern, the compiler uses the positional
order to match the components, but at runtime this is not needed.

This is basically the model used to serialize/deserialize a record.

>
>> in my opinion, a cascade of if ... instanceof and a switch on types
>> should have the same constraints
>
> It's a good intuition, but can you be specific about what constraints
> you are talking about?

See the mail of Guy.

Rémi

From daniel.smith at oracle.com Sat Aug 29 00:34:41 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Fri, 28 Aug 2020 18:34:41 -0600
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
 <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
Message-ID: <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com>

> On Aug 20, 2020, at 4:14 PM, Brian Goetz wrote:
>
>
> If a user had:
>
>     case Box(Head)
>     case Box(Tail)
>
> and a Box(null) arrived unexpectedly at the switch, would NPE really be
> what they expect? An NPE happens when you _dereference_ a null. But no
> one is dereferencing anything here; it's just that Box(null) fell into
> that middle space of "well, you didn't really cover it, but it's such a
> silly case that I didn't want to make you cover it either, but here we
> are and we have to do something." So maybe we want some sort of
> SillyCaseException (perhaps with a less silly name) for at least the
> null residue.

So the idea being pursued on this thread is that:

    case Box(Head)
    case Box(Tail)

implies an implicit

    case Box(null): throw [something];

But I want to point out that you also said:

> If we have:
>
>     case Box(Rect r)
>     case Box(Circle c)
>     case Bag(Rect r)
>     case Bag(Circle c)
>     default
>
> then Box(Pentagon|null) and Bag(Pentagon|null) clearly fall into the
> default case, so no special handling is needed there.

So whether to insert 'case Box(null)' immediately after 'case Box(Tail)'
depends on whether there's a downstream handler for 'Box(null)'. That's a
pretty complex and non-local user model.

And for all this complex analysis we get... some different exception types?
Doesn't seem like a worthwhile trade.

Separately, I don't love that we're using ICCE for an unmatched enum -- an
error which typically indicates a binary incompatibility. We don't (and
should not) say in JLS 13.4.26 that adding an enum constant is a binary
incompatible change. Enum constants add new cases all the time.

What I'd like to do instead: switch expressions that are
optimistically/weakly total get an implicit 'default' case that throws
'UnmatchedSwitchException' or something like that for *everything* that
goes unhandled. Exactly what diagnostic information we choose to put in the
exception is a quality of implementation issue. As a special case, if the
unmatched value is 'null' (not 'Box(null)'), we *might* decide to throw an
NPE instead (depending on how your ideas about null hostility in switches
pan out).

This is a behavioral change for enum switch expressions in Java 14+ code,
which makes me feel a bit sheepish, but I don't think anybody will mind a
change now that the design has evolved enough to recognize the need for a
specific exception class.
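
To make the proposal above concrete, here is a hand-written approximation of the catch-all that the compiler would insert implicitly. It is only a sketch: `UnmatchedSwitchException` is a hypothetical class (the message merely floats the name), and `Color`, `describe` and `UnmatchedSwitchSketch` are invented for the example. It assumes a JDK with switch expressions (14 or later).

```java
public class UnmatchedSwitchSketch {
    public enum Color { RED, GREEN }

    // Hypothetical exception type; no such class exists in the JDK.
    public static class UnmatchedSwitchException extends RuntimeException {
        public UnmatchedSwitchException(Object value) {
            super("no case matched: " + value);
        }
    }

    public static String describe(Color c) {
        return switch (c) {
            case RED -> "red";
            case GREEN -> "green";
            // Written by hand here; under the proposal an equivalent
            // catch-all would be implicit for a weakly total switch.
            // It is only reached if a constant is added to Color and
            // this class is not recompiled.
            default -> throw new UnmatchedSwitchException(c);
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(Color.RED));
        System.out.println(describe(Color.GREEN));
    }
}
```

Note that a `null` selector still fails with a NullPointerException before any case is tried, which matches the special case left open in the message.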

From forax at univ-mlv.fr Sat Aug 29 00:50:13 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 29 Aug 2020 02:50:13 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To:
References: <53f5e6b4-11be-0f6e-2a81-fe81e2d3f088@oracle.com>
 <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
 <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
Message-ID: <746511378.966026.1598662213524.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Saturday, 29 August 2020 00:49:26
> Subject: Re: [pattern-switch] Totality

>> On Aug 28, 2020, at 5:59 PM, forax at univ-mlv.fr wrote:
>>
>> . . .
>> Again, it should work like a cascade of if ... instanceof, so
>>   case Pixel(var x, var y, var color) -> color
>> should be equivalent to
>>   if x instanceof Pixel p { yield p.color() }
>
> But I do not believe that at all. I do believe that
>
>     case Pixel(var x, var y, var color) -> color
>
> should be equivalent to
>
>     if x instanceof Pixel(var x, var y, var color) p { yield p.color() }
>
> or, if you prefer, to
>
>     if x instanceof Pixel(var x, var y, var color) { yield color }
>
> The point is that the switch label `case Pixel(var x, var y, var color)` does
> not merely demand that the selector value be a Pixel; it demands that it be a
> Pixel having a specific three-argument destructor. It can be equivalent only
> to an instanceof expression that makes those same demands.
>
> If you want a switch clause that is equivalent to
>
>     if x instanceof Pixel p { yield p.color() }
>
> then you should write
>
>     case Pixel p -> p.color()

It doesn't have to be that way.

Let's say Pixel has a deconstructor, something like
  deconstructor Pixel { return this; }
because Pixel is already a record, it can return itself.

And when a pattern is used, you have something like this
  x instanceof Pixel(var foo, var bar, var baz)

so for the compiler, accessing foo is equivalent to extracting Pixel::x,
accessing bar is equivalent to accessing Pixel::y, and accessing baz is
equivalent to accessing Pixel::color; the compiler matches the positional
parameters of the destructor with the positions of the destructuring
pattern.

So for
  if x instanceof Pixel(var foo, var bar, var baz) { yield baz }
the equivalent code at runtime is
  if x instanceof Pixel p { yield p.color(); }

This is very similar to the way enums work: when you declare an enum, each
constant has a position, but when you match an enum using a switch, at
runtime the constants are matched by name, not by position.
Similarly, the serialization of enums, which is a form of ad hoc matching,
is also done by name, not by position.

Another way of seeing it: it's very similar to the way the VM resolves
fields. It's a two-step process: in the bytecode you have a name, and at
runtime the VM finds the corresponding offset.
Here, the compiler matches the record resulting from the deconstructor with
the destructuring pattern and inserts the corresponding name in the
bytecode; at runtime, we do the opposite: we call the deconstructor (because
the matching is done at runtime, you don't need more than one) and extract
the value from the name.

Rémi

From brian.goetz at oracle.com Sat Aug 29 01:18:35 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 28 Aug 2020 21:18:35 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com>
 <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com>
 <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr>
 <50A45261-109C-4F73-9109-E3174E57D325@oracle.com>
 <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com>
 <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com>
Message-ID:

> And for all this complex analysis we get... some different exception
> types? Doesn't seem like a worthwhile trade.

As I've mentioned already, I think the exception-type thing is mostly a
red herring. We have some existing precedent, which is pretty hard to
extrapolate from:

 - Existing switches on enum/string/boxes throw NPE on a null target;
 - (As of 12) enum expression switches throw ICCE when confronted with a
   novel value.

All the discussion about exception types is trying to extrapolate from
these, but it's pretty hard to actually do so. I would be happy to just
have some sort of SwitchRemainderException.

> What I'd like to do instead: switch expressions that are
> optimistically/weakly total get an implicit 'default' case that throws
> 'UnmatchedSwitchException' or something like that for *everything* that
> goes unhandled. Exactly what diagnostic information we choose to put in
> the exception is a quality of implementation issue. As a special case, if
> the unmatched value is 'null' (not 'Box(null)'), we *might* decide to
> throw an NPE instead (depending on how your ideas about null hostility in
> switches pan out)

That is, essentially, what I have proposed in my "Totality" thread (I
suspect you're still catching up, as it took us a long time to get there.)

> This is a behavioral change for enum switch expressions in Java 14+ code,
> which makes me feel a bit sheepish, but I don't think anybody will mind a
> change now that the design has evolved enough to recognize the need for a
> specific exception class.

I don't think we need an incompatible change; we're already sealing off
some legacy behavior with enum/string/box switches, we can seal this one
off there too.

From forax at univ-mlv.fr Sat Aug 29 23:50:15 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 30 Aug 2020 01:50:15 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To: <746511378.966026.1598662213524.JavaMail.zimbra@u-pem.fr>
References: <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr>
 <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com>
 <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr>
 <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com>
 <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr>
 <746511378.966026.1598662213524.JavaMail.zimbra@u-pem.fr>
Message-ID: <1334884860.1057642.1598745015121.JavaMail.zimbra@u-pem.fr>

I've implemented a proof of concept of the semantics of pattern matching
at runtime.
The semantics of the different patterns is defined here [1].

As Brian said, matching is equivalent to implementing
Optional match(Object o), but here, given that we also want deconstruction,
we need a supplementary argument to carry all the bound values. John has
called it the carrier record in the past, so I've kept that name.

Note that it not only carries all the bound variables, it also carries a
special value __index__ which corresponds to the index of the case of the
switch.

So a pattern is defined like this:
  public sealed interface Pattern {
    Optional match(Record carrier, Object o);
  }

To deal with the deconstruction, I need 3 primitives:
- deconstruct(Object value), which calls the deconstructor if the
  deconstructed object is a class
- extract(Record record, String name), which extracts the value of a record
  component from its name
- with(Record record, String name, Object value), which creates a new record
  with the value of the record component corresponding to the name changed

The implementation of those primitives relies on reflection, so it's slow
and wrong in terms of security, but it's enough for a prototype [2].

I've re-written several builders, really some glorified typed
s-expressions, to help me compose the different patterns [3].

With that, if I have this hierarchy
  sealed interface Shape {}
  record Rect(Point start, Point end) implements Shape {}
  record Circle(Point center, int radius) implements Shape {}
  record Box<T>(T content) {}

I'm able to express a switch like this
  switch(x) {
    case Box(Rect(Point point1, Point point2)) -> 0;
    case Box(Circle(Point point, _ _)) -> 1;
  }

using this tree of patterns
  var pattern = _switch()
      ._case(_instanceof(Box.class, _destruct(b -> b.bind("content", "$content"),
          _with("$content", _or(b -> b
              ._case(_instanceof(Rect.class, _destruct(b2 -> b2.bind("start", "point1").bind("end", "point2"), _index(0))))
              ._case(_instanceof(Circle.class, _destruct(b2 -> b2.bind("center", "point"), _index(1))))))
          )))
      .toPattern();

(_with() means: take a value from the carrier object and use it as the
matching value for the downstream pattern)

and this record carrier
  record Carrier(int __index__, Shape $content, Point point1, Point point2, Point point) {
    Carrier() { this(-1, null, null, null, null); }
  }

I can write
  var box1 = new Box<>(new Rect(new Point(1, 2), new Point(3, 4)));
  pattern.match(new Carrier(), box1).orElseThrow()

the result is a Carrier instance containing 0 as __index__, the content of the box as
$content, the start of the rectangle as point1 and the end of the rectangle
as point2; the field 'point' will still be null.

There are more examples in [4].

The deconstruction is based on the names of the record components and not
their declared positions in the record.
You may think that this way of doing the deconstruction is not typesafe,
but the fact that all the fields of the carrier record are typed means that
all the bound variables have the correct type.

Rémi

[1] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/Pattern.java
[2] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/GoryDetails.java
[3] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/PatternBuilder.java
[4] https://github.com/forax/pattern-matching-runtime/blob/master/src/main/java/com/github/forax/pmr/PatternTest.java

----- Original Message -----
> From: "Remi Forax"
> To: "Guy Steele"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Saturday, 29 August 2020 02:50:13
> Subject: Re: [pattern-switch] Totality

> ----- Original Message -----
>> From: "Guy Steele"
>> To: "Remi Forax"
>> Cc: "Brian Goetz", "amber-spec-experts"
>> Sent: Saturday, 29 August 2020 00:49:26
>> Subject: Re: [pattern-switch] Totality
>
>>> On Aug 28, 2020, at 5:59 PM, forax at univ-mlv.fr wrote:
>>>
>>> . . .
>>> Again, it should work like a cascade of if ... instanceof, so
>>>   case Pixel(var x, var y, var color) -> color
>>> should be equivalent to
>>>   if x instanceof Pixel p { yield p.color() }
>>
>> But I do not believe that at all.
I do believe that
>>
>>     case Pixel(var x, var y, var color) -> color
>>
>> should be equivalent to
>>
>>     if x instanceof Pixel(var x, var y, var color) p { yield p.color() }
>>
>> or, if you prefer, to
>>
>>     if x instanceof Pixel(var x, var y, var color) { yield color }
>>
>> The point is that the switch label `case Pixel(var x, var y, var color)` does
>> not merely demand that the selector value be a Pixel; it demands that it be a
>> Pixel having a specific three-argument destructor. It can be equivalent only
>> to an instanceof expression that makes those same demands.
>>
>> If you want a switch clause that is equivalent to
>>
>>     if x instanceof Pixel p { yield p.color() }
>>
>> then you should write
>>
>>     case Pixel p -> p.color()
>
>
> It doesn't have to be that way.
>
> Let's say Pixel has a deconstructor, something like
>   deconstructor Pixel { return this; }
> because Pixel is already a record, it can return itself.
>
> And when a pattern is used, you have something like this
>   x instanceof Pixel(var foo, var bar, var baz)
>
> so for the compiler, accessing foo is equivalent to extracting Pixel::x,
> accessing bar is equivalent to accessing Pixel::y, and accessing baz is
> equivalent to accessing Pixel::color;
> the compiler matches the positional parameters of the destructor with the
> positions of the destructuring pattern.
>
> So for
>   if x instanceof Pixel(var foo, var bar, var baz) { yield baz }
> the equivalent code at runtime is
>   if x instanceof Pixel p { yield p.color(); }
>
> This is very similar to the way enums work: when you declare an enum, each
> constant has a position, but when you match an enum using a switch, at runtime
> the constants are matched by name, not by position.
> Similarly, the serialization of enums, which is a form of ad hoc matching, is
> also done by name, not by position.
>
> Another way of seeing it: it's very similar to the way the VM resolves fields.
> It's a two-step process: in the bytecode you have a name, and at runtime the VM
> finds the corresponding offset.
> Here, the compiler matches the record resulting from the deconstructor with the
> destructuring pattern and inserts the corresponding name in the bytecode;
> at runtime, we do the opposite: we call the deconstructor (because the matching
> is done at runtime, you don't need more than one) and extract the value from
> the name.
>
> Rémi

From forax at univ-mlv.fr Sun Aug 30 10:09:26 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 30 Aug 2020 12:09:26 +0200 (CEST)
Subject: Finalizing in JDK 16 - Pattern matching for instanceof
In-Reply-To:
References:
Message-ID: <363914172.1124714.1598782166630.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Gavin Bierman", "amber-spec-experts"
> Sent: Wednesday, 26 August 2020 17:00:47
> Subject: Re: Finalizing in JDK 16 - Pattern matching for instanceof

> I have been thinking about this and I have two refinements I would like to
> suggest for Pattern Matching in instanceof. Both have come out of the further
> work on the next phases of pattern matching.

> 1. Instanceof expressions must express conditionality.

> One of the uncomfortable collisions between the independently-derived pattern
> semantics and instanceof semantics is the treatment of total patterns.
> Instanceof always says "no" on null, but the sensible thing on total patterns
> is that _strongly total patterns_ match null. This yields a collision between

> x instanceof Object

> and

> x instanceof Object o

> This is not necessarily a problem for the specification, in that instanceof is
> free to say "when x is null, we don't even test the pattern." But it is not
> good for the users, in that these two things are subtly different.

> While I get why some people would like to bootstrap this into an argument why
> the pattern semantics are wrong, the key observation here is: _both of these
> questions are stupid_. So I think there's an obvious way to fix this so that
> there is no problem here: instanceof must ask a question. So the second form
> would be illegal, with a compiler error saying "pattern always matches the
> target."

> Proposed: An `instanceof` expression must be able to evaluate to both true and
> false, otherwise it is invalid. This rules out strongly total patterns on the
> RHS. If you have a strongly total pattern, use pattern assignment instead.

I agree.

> 2. Mutability of binding variables.

> We did it again; we gave in to our desire to try to "fix mistakes of the past",
> with the obvious results. This time, we did it by making binding variables
> implicitly final.

> This is the same mistake we make over and over again with both nullity and
> finality; when a new context comes up, we try to exclude the "mistakes"
> (nullability and mutability) from those contexts.

> We've seen plenty of examples recently with nullity. Here's a historical example
> with finality. When we did Lambda, some clever fellow said "we could make the
> lambda parameters implicitly final." And there was a round of "ooh, that would
> be nice", because it fed our desire to fix mistakes of the past. But we quickly
> realized it would be a new mistake, because it would be an impediment to
> refactoring between lambdas and inner classes, and undermined the mental model
> of "a lambda is just an anonymous method."

> Further, the asymmetry has a user-model cost. And what would be the benefit?
> Well, it would make us feel better, but ultimately, would not have a
> significant impact on accidental-mutation errors because the context was so
> limited (and most lambdas are small anyway.) In the end, it would have been a
> huge mistake.

> I now think that we have done the same with binding variables.
Here are two motivating examples:

> (a) Pattern assignment. For (weakly) total pattern P, you will be able to say
>     P = e
> Note that `int x` and `var x` are both valid patterns and local variable
> declarations; it would be good if pattern assignment were a strict
> generalization of local variable declaration. The sole asymmetry is that for
> pattern assignment, the variable is final. Ooops.

> (b) Reconstruction. We have analogized that a `with` expression:
>     x with { B }
> is like the block expression:
>     { X(VARS) = x; B /* mutates vars */; yield new X(VARS) }
> except that mutating the variables would not be allowed.

> From a specification perspective, there is nontrivial spec complexity to keep
> pattern variables and locals separate, but some of their difference is
> gratuitous (mutability.) If we reduce the gratuitous differences, we can
> likely bring them closer together, which will reduce friction and technical
> debt in the future.

> Like with lambda parameters, I am now thinking that we gave in to the base
> desire to fix a past mistake, but in a way that doesn't really make the
> language better or safer, just more complicated. Let's back this one out before
> it really bites us.

I agree,
but in that case, does it also mean that "final" as modifier should be allowed ?
  if (foo instanceof final Bar b) { ... }

Rémi

> On 7/27/2020 6:53 AM, Gavin Bierman wrote:
>> In JDK 16 we are planning to finalize two JEPs:
>> - Pattern matching for `instanceof`
>> - Records
>> Whilst we don't have any major open issues for either of these features, I would
>> like us to close them out. So I thought it would be useful to quickly summarize
>> the features and the issues that have arisen over the preview periods so far. In
>> this email I will discuss pattern matching; a following email will cover the
>> Records feature.
>> Pattern matching >> ---------------- >> Adding conditional pattern matching to an expression form is the main technical >> novelty of our design of this feature. There are several advantages that come >> from this targeting of an expression form: First, we get to refactor a very >> common programming pattern: >> if (e instanceof T) { >> T t = (T)e; // grr... >> ... >> } >> to >> if (e instanceof T t) { >> // let the pattern matching do the work! >> ... >> } >> A second, less obvious advantage is that we can combine the pattern matching >> instanceof with other *expressions*. This enables us to compactly express things >> with expressions that are unnecessarily complicated using statements. For >> example, when implementing a class Point, we might write an equals method as >> follows: >> public boolean equals(Object o) { >> if (!(o instanceof Point)) >> return false; >> Point other = (Point) o; >> return x == other.x >> && y == other.y; >> } >> Using pattern matching with instanceof instead, we can combine this into a >> single expression, eliminating the repetition and simplifying the control flow: >> public boolean equals(Object o) { >> return (o instanceof Point other) >> && x == other.x >> && y == other.y; >> } >> The conditionality of pattern matching - if a value does not match a pattern, >> then the pattern variable is not bound - means that we have to consider >> carefully the scope of the pattern variable. We could do something simple and >> say that the scope of the pattern variable is the containing statement and all >> subsequent statements in the enclosing block. But this has unfortunate >> 'poisoning' consequences, e.g. >> if (a instanceof Point p) { >> ... >> } >> if (b instanceof Point p) { // ERROR - p is in scope >> ... >> } >> In other words in the second statement the pattern variable is in a poisoned >> state - it is in scope, but it should not be accessible as it may not be >> instantiated with a value. 
Moreover, as it is in scope, we can't declare it >> again. This means that a pattern variable is 'poisoned' after it is declared, so >> the pattern-loving programmer will have to think of lots of distinct names for >> their pattern variables. >> We have chosen another way: Java already uses flow analysis - both in checking >> the access of local variables and blank final fields, and detecting unreachable >> statements. We lean on this concept to introduce the new notion of flow scoping. >> A pattern variable is only in scope where the compiler can deduce that the >> pattern has matched and the variable will be bound. This analysis is flow >> sensitive and works in a similar way to the existing analyses. Returning to our >> example: >> if (a instanceof Point p) { >> // p is in scope >> ... >> } >> // p not in scope here >> if (b instanceof Point p) { // Sure! >> ... >> } >> The motto is "a pattern variable is in scope where it has definitely matched". >> This is intuitive, allows for the safe reuse of pattern variables, and Java >> developers are already used to flow sensitive analyses. >> As pattern variables are treated in all other respects like normal variables >> -- and this was an important design principle -- they can shadow fields. >> However, their flow scoping nature means that some care must be taken to >> determine whether a name refers to a pattern variable declaration shadowing a >> field declaration or a field declaration. >> // field p is in scope >> if (e instanceof Point p) { >> // p refers to the pattern variable >> } else { >> // p refers to the field >> } >> We call this unfortunate interaction of flow scoping and shadowing the "Swiss >> cheese property". To rule it out would require ad-hoc special cases or more >> features, and our sense is that will not be that common, so we have decided to >> keep the feature simple. We hope that IDEs will quickly come to help programmers >> who have difficulty with flow scoping and shadowing. 
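
A minimal sketch of the flow-scoping rules summarized above, assuming a JDK in which pattern matching for `instanceof` is final (JDK 16 or later). The names `FlowScopingDemo`, `Point` and `describe` are made up for this illustration and do not come from the thread.

```java
public class FlowScopingDemo {
    // Hypothetical value class used only for this illustration.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            // 'other' is in scope to the right of && because the pattern
            // has definitely matched whenever that operand is evaluated.
            return (o instanceof Point other) && x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    static String describe(Object a, Object b) {
        StringBuilder sb = new StringBuilder();
        if (a instanceof Point p) {
            sb.append("a=(").append(p.x).append(',').append(p.y).append(')');
        }
        // p is not in scope here, so the same name can be declared again.
        if (b instanceof Point p) {
            sb.append("b=(").append(p.x).append(',').append(p.y).append(')');
        }
        if (!(b instanceof String s)) {
            return sb.toString();
        }
        // s is in scope here: this point is reached only if the match succeeded.
        return sb.append("s=").append(s).toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2), new Point(3, 4)));
        System.out.println(describe(new Point(1, 2), "hi"));
        System.out.println(new Point(1, 2).equals(new Point(1, 2)));
    }
}
```

The last `if` shows the less obvious half of the motto: a negated test followed by a `return` leaves the binding in scope for the rest of the method, because that code is reachable only when the pattern matched.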
From forax at univ-mlv.fr Sun Aug 30 11:12:28 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 30 Aug 2020 13:12:28 +0200 (CEST)
Subject: When several patterns are total ?
Message-ID: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr>

Hi,
i've hinted that there is an issue with intersection type and totality, but we did not follow up.

Here is the issue
  var value = flag? "foo": 42;
  switch(value) {
    case String s -> ...
    case Integer i -> ...
    case Serializable s ->
    case Comparable c ->
  }

given that the type of value is an intersection type Serializable & Comparable & ...
the last two cases are total with respect to the type of value, which does not go well with the current semantics that can only have one total case.

Rémi

From forax at univ-mlv.fr Sun Aug 30 11:37:35 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 30 Aug 2020 13:37:35 +0200 (CEST)
Subject: switch: using an expicit type as total is dangerous
In-Reply-To: <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com>
References: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr> <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com>
Message-ID: <1989167974.1130932.1598787455898.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax", "amber-spec-experts"
> Sent: Monday, 24 August 2020 20:57:03
> Subject: Re: switch: using an expicit type as total is dangerous

>> 2/ using an explicit type for a total type is a footgun because the semantics
>> will change if the hierarchy or the return type of a method switched upon
>> change.
>
> Sorry, I think this argument is a pure red herring. I get why this is
> one of those "scary the first time you see it" issues, but I think the
> fear has been overblown to near-panic proportions. We've spent a lot of
> time talking about it and, the more we talk, the less worried I am.

good for you,
the more i talk about it, the more i'm worried because you don't seem to understand that having the semantics that change underneath you is bad.

>
> The conditions that have to combine for this to happen are already
> individually rare:
>     - a hierarchy change, combined with
>     - enough use-site type inference that is not obvious what the type
>       dependencies are, combined with
>     - null actually being a member of the domain, combined with
>     - users not realizing null is a member of the domain.

nope, you don't need a hierarchy change, changing the return type (as noticed by Tagir) and null being part of the domain is enough.

>
> Then, for it to actually be a problem, not only do all of the above have
> to happen, but an unhandled null has to actually show up.
>
> Even then, the severity of this case is low -- most likely, the NPE gets
> moved from one place to another.

nope see below

>
> Even then, the remediation cost is trivial.

for having remediation, as a user you have to first see the change of semantics, but you don't.

Ok, let's take an example, i've written a method getLiteral()
  Number getLiteral(String token) {
    if (token.equals("null")) {
      return null;  // null is part of the domain
    }
    try {
      return Integer.parseInt(token);
    } catch(NumberFormatException e) {
      return Double.parseDouble(token);
    }
  }

and a statement switch in another package/module
  switch(getLiteral(token)) {
    case Integer -> System.out.println("Integer");
    case Double -> System.out.println("Double");
    case Number -> System.out.println("null");
  }

but now i change getLiteral() to add string literal
  Object getLiteral(String token) {
    if (token.equals("null")) {
      return null;  // null is part of the domain
    }
    if (token.startsWith("\"")) {
      return token.substring(1, token.length() - 1);
    }
    try {
      return Integer.parseInt(token);
    } catch(NumberFormatException e) {
      return Double.parseDouble(token);
    }
  }

If i only recompile getLiteral(), and run the code containing the switch, i get an ICCE at runtime because the signature of getLiteral() has changed, which is good,
but if i now recompile the switch, the code compiles without any error but with a different semantics, duh ?

Using "case var _" as the last case at least keeps the same semantics, using "default Number" does not compile.

[...]

Rémi

From brian.goetz at oracle.com Sun Aug 30 14:31:23 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 30 Aug 2020 10:31:23 -0400
Subject: Finalizing in JDK 16 - Pattern matching for instanceof
In-Reply-To: <363914172.1124714.1598782166630.JavaMail.zimbra@u-pem.fr>
References: <363914172.1124714.1598782166630.JavaMail.zimbra@u-pem.fr>
Message-ID: <8D0A72FB-245F-4CC1-ABF1-A6CACA0AD89C@oracle.com>

> 2. Mutability of binding variables.
>
> I agree,
> but in that case, does it also mean that "final" as modifier should be allowed ?
> if (foo instanceof final Bar b) { ... }

It means it's a valid question, but a separate one; we could consider allowing final as a modifier on binding variables.
But I see no reason we have to do that now, and would prefer not to.

From brian.goetz at oracle.com Sun Aug 30 14:37:53 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 30 Aug 2020 10:37:53 -0400
Subject: When several patterns are total ?
In-Reply-To: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr>
References: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr>
Message-ID: <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>

> i've hinted that there is an issue with intersection type and totality, but we did not follow up.
>
> Here is the issue
>   var value = flag? "foo": 42;
>   switch(value) {
>     case String s -> ...
>     case Integer i -> ...
>     case Serializable s ->
>     case Comparable c ->
>   }
>
> given that the type of value is an intersection type Serializable & Comparable & ...
> the last two cases are total with respect to the type of value, which does not go well with the current semantics that can only have one total case.

Let's separate the issues here. The type involved is an infinite type, which I think we can agree is a distraction. But let's assume the type of value were Serializable&Comparable (S&C for short.)

Because S&C <: S, the `case S` in your example is already total, so the `case C` should be a dead case and yield a compilation error. According to the rule we have, `case S` is total on any U <: S, so it is total on S&C, so the current model covers this, and the `case C` is identified as dead by the compiler. Which makes sense because there's no value it can match.

I'm not seeing the problem?

From amaembo at gmail.com Sun Aug 30 14:55:14 2020
From: amaembo at gmail.com (Tagir Valeev)
Date: Sun, 30 Aug 2020 21:55:14 +0700
Subject: When several patterns are total ?
In-Reply-To: <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>
References: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr> <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>
Message-ID:

Interesting!

How about

try {...}
catch(Ex1 | Ex2 e) {
  switch (e) {
    case Ex1 -> ...
    case Ex2 -> ...
  }
}

?

With best regards,
Tagir Valeev.

On Sun, 30 Aug 2020 at 21:38, Brian Goetz wrote:

> > i've hinted that there is an issue with intersection type and totality,
> but we did not follow up.
> >
> > Here is the issue
> >   var value = flag? "foo": 42;
> >   switch(value) {
> >     case String s -> ...
> >     case Integer i -> ...
> >     case Serializable s ->
> >     case Comparable c ->
> >   }
> >
> > given that the type of value is an intersection type Serializable &
> Comparable & ...
> > the last two cases are total with respect to the type of value, which
> does not go well with the current semantics that can only have one total
> case.
>
> Let's separate the issues here. The type involved is an infinite type,
> which I think we can agree is a distraction. But let's assume the type of
> value were Serializable&Comparable (S&C for short.)
>
> Because S&C <: S, the `case S` in your example is already total, so the
> `case C` should be a dead case and yield a compilation error. According to
> the rule we have, `case S` is total on any U <: S, so it is total on S&C,
> so the current model covers this, and the `case C` is identified as dead by
> the compiler. Which makes sense because there's no value it can match.
>
> I'm not seeing the problem?

From brian.goetz at oracle.com Sun Aug 30 15:07:03 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 30 Aug 2020 11:07:03 -0400
Subject: switch: using an expicit type as total is dangerous
In-Reply-To: <1989167974.1130932.1598787455898.JavaMail.zimbra@u-pem.fr>
References: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr> <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com> <1989167974.1130932.1598787455898.JavaMail.zimbra@u-pem.fr>
Message-ID: <0088B731-27C7-4817-842D-AF1E3F208040@oracle.com>

Sorry, but I don't find this example that compelling either. And I certainly don't think it comes remotely close to "bad enough that we have to throw the design in the trash."

As I said from the beginning, yes, you can construct puzzlers. But the existence of a puzzler is not necessarily evidence that the language design is broken, and you are lobbying (implicitly) for an extreme solution: to take the entire design that we've worked on for years and toss it in the trash, and replace it with something that is not even yet a proposal, I think the bar is much higher than this. (I don't think that's an exaggeration; totality is what makes the entire design stick together without being an ad-hoc bag of "but on tuesday, do it differently".)

This example feels to me in the same category as combining var with diamond. There's nothing wrong with that, but by leaving so many things implicit in your program, you may get an inference that is not what you expected. That doesn't mean that var is bad, or diamond is bad, or even that we should outlaw their interaction (which some suggested at the time.) It just means that when you combine features, especially features that involve implicitness, your program becomes more brittle.

This example combines a lot of implicitness to get the same kind of brittleness, including switching on a complex expression whose type isn't obvious, and, more importantly, making an incompatible change to a method.
I don't think you get to lay the blame on the language inferring totality here, unless you're advocating that we should never infer anything! Making a change like this could easily change inferred types, which could silently affect overload decisions, and, when we get to Valhalla, even runtime layouts. That's just part of the trade we make when we allow users to leave something unspecified, whether it be a manifest type (var, diamond, generic method witnesses), finality (lambda capture), totality (as here), accessibility (such as when migrating a class to an interface), etc.

So, it's a good example, to call our attention to the consequences of leaving totality implicit. (We're having a separate discussion about whether to let the user opt into making totality explicit, and that's another tool that could be used to make this example less brittle, just as manifest types would make it less brittle than switching on an expression.)

Really, though, I think you're attacking totality because of .01% imperfections without really appreciating how much worse the alternatives are, and how much more often their pain points would come up. (Refactoring switches to instanceof should be expected to happen 1000x more often than making an incompatible change to a method signature and hoping nothing changes.) It's good to identify the warts, but I'd prefer a little less jumping from "wart, ergo mistake" -- it took us three years to converge on this answer precisely because there are no perfect answers.

> On Aug 30, 2020, at 7:37 AM, forax at univ-mlv.fr wrote:
>
> ----- Original Message -----
>> From: "Brian Goetz"
>> To: "Remi Forax", "amber-spec-experts"
>> Sent: Monday, 24 August 2020 20:57:03
>> Subject: Re: switch: using an expicit type as total is dangerous
>
>>> 2/ using an explicit type for a total type is a footgun because the semantics
>>> will change if the hierarchy or the return type of a method switched upon
>>> change.
>>
>> Sorry, I think this argument is a pure red herring.
I get why this is >> one of those "scary the first time you see it" issues, but I think the >> fear has been overblown to near-panic proportions. We've spent a lot of >> time talking about it and, the more we talk, the less worried I am. > > good for you, > the more i talk about it, the more i'm worried because you don't seem to understand that having the semantics that change underneath you is bad. > >> >> The conditions that have to combine for this to happen are already >> individually rare: >> - a hierarchy change, combined with >> - enough use-site type inference that is not obvious what the type >> dependencies are, combined with >> - null actually being a member of the domain, combined with >> - users not realizing null is a member of the domain. > > > nope, you don't need a hierarchy change, changing the return type (as noticed by Tagir) and null being part of the domain is enough. > >> >> Then, for it to actually be a problem, not only do all of the above have >> to happen, but an unhandled null has to actually show up. >> >> Even then, the severity of this case is low -- most likely, the NPE gets >> moved from one place to another. > > nope see below > >> >> Even then, the remediation cost is trivial. > > for having remediation, as a user you have to first see the change of semantics, but you don't. 
>
>
> Ok, let's take an example, i've written a method getLiteral()
>  Number getLiteral(String token) {
>    if (token.equals("null")) {
>      return null;  // null is part of the domain
>    }
>    try {
>      return Integer.parseInt(token);
>    } catch(NumberFormatException e) {
>      return Double.parseDouble(token);
>    }
>  }
>
> and a statement switch in another package/module
>  switch(getLiteral(token)) {
>    case Integer -> System.out.println("Integer");
>    case Double -> System.out.println("Double");
>    case Number -> System.out.println("null");
>  }
>
> but now i change getLiteral() to add string literal
>  Object getLiteral(String token) {
>    if (token.equals("null")) {
>      return null;  // null is part of the domain
>    }
>    if (token.startsWith("\"")) {
>      return token.substring(1, token.length() - 1);
>    }
>    try {
>      return Integer.parseInt(token);
>    } catch(NumberFormatException e) {
>      return Double.parseDouble(token);
>    }
>  }
>
> If i only recompile getLiteral(), and run the code containing the switch, i get an ICCE at runtime because the signature of getLiteral() has changed, which is good,
> but if i now recompile the switch, the code compiles without any error but with a different semantics, duh ?
>
> Using "case var _" as the last case at least keeps the same semantics, using "default Number" does not compile.
>
> [...]
>
> Rémi

From brian.goetz at oracle.com Sun Aug 30 15:11:14 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 30 Aug 2020 11:11:14 -0400
Subject: When several patterns are total ?
In-Reply-To:
References: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr> <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>
Message-ID:

Heh, I considered including this in my previous answer :)

We can add another rule to our totality calculus to cover union types easily.
I think it is:

    If I have a set of patterns P* that is total on A with remainder R, and a set of patterns Q* that is total on B with remainder S, then P* union Q* is total on A|B with remainder R union S

I only thought about this for about 30 seconds, so could have missed a subtlety, but given that a union type is a union of value sets, it seems pretty straightforward.

> On Aug 30, 2020, at 10:55 AM, Tagir Valeev wrote:
>
> Interesting!
>
> How about
>
> try {...}
> catch(Ex1 | Ex2 e) {
>   switch (e) {
>     case Ex1 -> ...
>     case Ex2 -> ...
>   }
> }
>
> ?
>
> With best regards,
> Tagir Valeev.
>
> On Sun, 30 Aug 2020 at 21:38, Brian Goetz wrote:
> > i've hinted that there is an issue with intersection type and totality, but we did not follow up.
> >
> > Here is the issue
> >   var value = flag? "foo": 42;
> >   switch(value) {
> >     case String s -> ...
> >     case Integer i -> ...
> >     case Serializable s ->
> >     case Comparable c ->
> >   }
> >
> > given that the type of value is an intersection type Serializable & Comparable & ...
> > the last two cases are total with respect to the type of value, which does not go well with the current semantics that can only have one total case.
>
> Let's separate the issues here. The type involved is an infinite type, which I think we can agree is a distraction. But let's assume the type of value were Serializable&Comparable (S&C for short.)
>
> Because S&C <: S, the `case S` in your example is already total, so the `case C` should be a dead case and yield a compilation error. According to the rule we have, `case S` is total on any U <: S, so it is total on S&C, so the current model covers this, and the `case C` is identified as dead by the compiler. Which makes sense because there's no value it can match.
>
> I'm not seeing the problem?

From brian.goetz at oracle.com Sun Aug 30 14:23:38 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 30 Aug 2020 10:23:38 -0400
Subject: [pattern-switch] Totality
In-Reply-To: <1334884860.1057642.1598745015121.JavaMail.zimbra@u-pem.fr>
References: <254928371.551362.1598577218956.JavaMail.zimbra@u-pem.fr> <4796383e-bd52-dc50-0218-bc5e36124f51@oracle.com> <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr> <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com> <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr> <746511378.966026.1598662213524.JavaMail.zimbra@u-pem.fr> <1334884860.1057642.1598745015121.JavaMail.zimbra@u-pem.fr>
Message-ID: <6BC5FFF1-EF48-4856-B04F-433D8C2ED4E4@oracle.com>

Yes, this looks very much like a PoC I did once too. The upshot of that experiment is that operators like AND, OR, nesting, and guarding are amenable to implementation as combinators, which is nice.

> To deal with the deconstruction, i need 3 primitives
> - deconstruct(Object value) that calls the deconstructor if the deconstructed object is a class

Here you're making an assumption about the language model, which is that classes may only have one deconstructor. This is a pretty serious limitation, and forces pattern matching to stay very much on the periphery of the object model. (I know why you want to make this assumption, because you want the bindings to be the return type and there is no return type overloading.)

> - extract(Record record, String name) that can extract the value of a record component from its name
> - with(Record record, String name, Object value) that creates a new record with the value of the record component corresponding to the name changed
>
> The implementation of those primitives relies on the reflection so it's slow and wrong in term of security, but it's enough for a prototype [2]

Yes, but. It also makes an assumption about runtime resolution of bindings based on component name.
Again, you are making a big assumption about "how the language should work", but again it has gotten buried in an implementation detail. So whatever we learn from this model comes attached to some big assumptions.

FWIW, my implementation treated a pattern as a bundle of method handles, one was a mapping from the match target to an opaque carrier (null meant no match), and N method handles from the carrier to the binding value. In many cases the target itself was a suitable carrier.

> The deconstruction is based on the names of the record components and not their declared positions in the record.
> You may think that this way of doing the deconstruction is not typesafe but the fact that all the fields of the carrier record are typed means that all the bound variables have the correct type.

Yes, but type safety is one of the lesser of my concerns here.

From brian.goetz at oracle.com Sun Aug 30 15:57:15 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 30 Aug 2020 11:57:15 -0400
Subject: switch: using an expicit type as total is dangerous
In-Reply-To: <0088B731-27C7-4817-842D-AF1E3F208040@oracle.com>
References: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr> <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com> <1989167974.1130932.1598787455898.JavaMail.zimbra@u-pem.fr> <0088B731-27C7-4817-842D-AF1E3F208040@oracle.com>
Message-ID:

One of the things that I think is being missed here is that switch is turning into a much more powerful construct, and as a result, our intuition for what it means needs to change with it. All of the sky-is-falling hyperbole about action-at-a-distance comes, in no small part, from having not fully upgraded our mental models to understand what switch means. Of course, we want to strike a balance between preserving suitable existing intuitions and extending the power of the construct, but let's not fool ourselves that there isn't something new to learn here.
But given tradeoffs where we damage the expressiveness or consistency or clarity of the new construct to prop up old, obsolete intuitions, there's really no decision here.

Legacy switches have only constants as their cases. This means that at most one case could match, and in turn this means that order is independent. They implement the equivalent of a dynamically-computed (possibly partial) map.

New switches are like constrained if-else chains. The constraints are useful to the compiler because they are more easily scrutable; compilers can optimize away redundant tests, and turn some O(n) switches into O(log n) or O(1). And they are useful to the user because they are asking a simpler question which can have a less error-prone expression. But at root, some things have changed, such as non-overlap of cases, and they can have far more interesting structure. And these structures are not single-level, they are recursive (we will see chains of D(P) ... D(Q) just as we will see chains of P .. Q), and our model should embrace that natural structure.

While it may feel unfamiliar today, we should expect to routinely recognize the shape of:

    case Foo(Bar x):
    case Foo(Baz x):
    case Foo f: // or case Foo(Object o):
    case Object o: // or default

where the cases form a tree, because these shapes will be ubiquitous. And developers will develop intuitions about these kinds of shapes too, but right now, they look a little more foreign, and so we're engaging in "bargaining" to try to make their behaviors less strange.

So, rather than trying to constrain the model to be more consistent with what we're familiar with, we should be trying to get familiar with the shapes that will naturally arise (which is not hard, because we can look in code from other languages that have similar features) before we start to make claims about what is "natural" or "confusing." Otherwise, we're just starting the race with bricks tied to our feet.
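
As a rough illustration of the tree-shaped case list described above, here is a sketch written in the record-pattern and pattern-switch syntax that was finalized in Java 21, well after this thread, so its surface syntax and null handling may differ from what was under discussion here. `Foo`, `Bar` and `Baz` follow the names in the message, while the record components, `CaseTreeDemo` and `classify` are invented for the example.

```java
public class CaseTreeDemo {
    // Hypothetical records standing in for Foo, Bar and Baz from the message.
    record Bar(int n) {}
    record Baz(String s) {}
    record Foo(Object payload) {}

    static String classify(Object o) {
        return switch (o) {
            // The two nested patterns refine the broader 'Foo f' case below,
            // so they have to come first.
            case Foo(Bar x) -> "Foo(Bar " + x.n() + ")";
            case Foo(Baz x) -> "Foo(Baz " + x.s() + ")";
            // Catches every remaining Foo, including one with a null payload.
            case Foo f -> "some other Foo";
            // Root of the tree: everything that is not a Foo.
            default -> "not a Foo";
        };
    }

    public static void main(String[] args) {
        System.out.println(classify(new Foo(new Bar(1))));
        System.out.println(classify(new Foo(new Baz("z"))));
        System.out.println(classify(new Foo("plain")));
        System.out.println(classify("text"));
    }
}
```

Reordering the cases so that `case Foo f` comes before the nested patterns is rejected by the compiler as a dominated label, which is one concrete way the tree structure shows up in practice.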

> On Aug 30, 2020, at 11:07 AM, Brian Goetz wrote:
>
> Sorry, but I don't find this example that compelling either. And I certainly don't think it comes remotely close to "bad enough that we have to throw the design in the trash."
>
> As I said from the beginning, yes, you can construct puzzlers. But the existence of a puzzler is not necessarily evidence that the language design is broken, and you are lobbying (implicitly) for an extreme solution: to take the entire design that we've worked on for years and toss it in the trash, and replace it with something that is not even yet a proposal, I think the bar is much higher than this. (I don't think that's an exaggeration; totality is what makes the entire design stick together without being an ad-hoc bag of "but on tuesday, do it differently".)
>
> This example feels to me in the same category as combining var with diamond. There's nothing wrong with that, but by leaving so many things implicit in your program, you may get an inference that is not what you expected. That doesn't mean that var is bad, or diamond is bad, or even that we should outlaw their interaction (which some suggested at the time.) It just means that when you combine features, especially features that involve implicitness, your program becomes more brittle.
>
> This example combines a lot of implicitness to get the same kind of brittleness, including switching on a complex expression whose type isn't obvious, and, more importantly, making an incompatible change to a method. I don't think you get to lay the blame on the language inferring totality here, unless you're advocating that we should never infer anything! Making a change like this could easily change inferred types, which could silently affect overload decisions, and, when we get to Valhalla, even runtime layouts.
That?s just part of the trade we make when we allow users to leave something unspecified, whether it be a manifest type (var, diamond, generic method witnesses), finality (lambda capture), totality (as here), accessibility (such as when migrating a class to an interface), etc. > > So, it?s a good example, to call our attention to the consequences of leaving totality implicit. (We?re having a separate discussion about whether to let the user opt into making totality explicit, and that?s another tool that could be used to make this example less brittle, just as manifest types would make it less brittle than switching on an expression.) > > Really, though, I think you?re attacking totality because of .01% imperfections without really appreciating how much worse the alternatives are, and how much more often their pain points would come up. (Refactoring switches to instanceof should be expected to happen 1000x more often than making an incompatible change to a method signature and hoping nothing changes.) It?s good to identify the warts, but I?d prefer a little less jumping from ?wart, ergo mistake? ? it took us three years to converge on this answer precisely because there are no perfect answers. > > >> On Aug 30, 2020, at 7:37 AM, forax at univ-mlv.fr wrote: >> >> ----- Mail original ----- >>> De: "Brian Goetz" >>> ?: "Remi Forax" , "amber-spec-experts" >>> Envoy?: Lundi 24 Ao?t 2020 20:57:03 >>> Objet: Re: switch: using an expicit type as total is dangerous >> >>>> 2/ using an explicit type for a total type is a footgun because the semantics >>>> will change if the hierarchy or the return type of a method switched upon >>>> change. >>> >>> Sorry, I think this argument is a pure red herring. I get why this is >>> one of those "scary the first time you see it" issues, but I think the >>> fear has been overblown to near-panic proportions. We've spent a lot of >>> time talking about it and, the more we talk, the less worried I am. 
>> >> good for you, >> the more i talk about it, the more i'm worried because you don't seem to understand that having the semantics that change underneath you is bad. >> >>> >>> The conditions that have to combine for this to happen are already >>> individually rare: >>> - a hierarchy change, combined with >>> - enough use-site type inference that is not obvious what the type >>> dependencies are, combined with >>> - null actually being a member of the domain, combined with >>> - users not realizing null is a member of the domain. >> >> >> nope, you don't need a hierarchy change, changing the return type (as noticed by Tagir) and null being part of the domain is enough. >> >>> >>> Then, for it to actually be a problem, not only do all of the above have >>> to happen, but an unhandled null has to actually show up. >>> >>> Even then, the severity of this case is low -- most likely, the NPE gets >>> moved from one place to another. >> >> nope see below >> >>> >>> Even then, the remediation cost is trivial. >> >> for having remediation, as a user you have to first see the change of semantics, but you don't. 
>>
>>
>> Ok, let's take an example, i've written a method getLiteral()
>>   Number getLiteral(String token) {
>>     if (token.equals("null")) {
>>       return null;  // null is part of the domain
>>     }
>>     try {
>>       return Integer.parseInt(token);
>>     } catch(NumberFormatException e) {
>>       return Double.parseDouble(token);
>>     }
>>   }
>>
>> and a statement switch in another package/module
>>   switch(getLiteral(token)) {
>>     case Integer -> System.out.println("Integer");
>>     case Double -> System.out.println("Double");
>>     case Number -> System.out.println("null");
>>   }
>>
>> but now i change getLiteral() to add string literals
>>   Object getLiteral(String token) {
>>     if (token.equals("null")) {
>>       return null;  // null is part of the domain
>>     }
>>     if (token.startsWith("\"")) {
>>       return token.substring(1, token.length() - 1);
>>     }
>>     try {
>>       return Integer.parseInt(token);
>>     } catch(NumberFormatException e) {
>>       return Double.parseDouble(token);
>>     }
>>   }
>>
>> If i only recompile getLiteral(), and run the code containing the switch, i get an ICCE at runtime because the signature of getLiteral() has changed, which is good,
>> but if i now recompile the switch, the code compiles without any error but with a different semantics, duh ?
>>
>> Using "case var _" as the last case at least keeps the same semantics, using "default Number" does not compile.
>>
>> [...]
>>
>> Rémi
>

From forax at univ-mlv.fr Sun Aug 30 20:05:55 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 30 Aug 2020 22:05:55 +0200 (CEST)
Subject: [pattern-switch] Totality
In-Reply-To: <6BC5FFF1-EF48-4856-B04F-433D8C2ED4E4@oracle.com>
References: <1440895501.939704.1598637488246.JavaMail.zimbra@u-pem.fr> <34869b78-9157-25be-357d-7aa5f2b8e8b3@oracle.com> <1381566928.959027.1598651995682.JavaMail.zimbra@u-pem.fr> <746511378.966026.1598662213524.JavaMail.zimbra@u-pem.fr> <1334884860.1057642.1598745015121.JavaMail.zimbra@u-pem.fr> <6BC5FFF1-EF48-4856-B04F-433D8C2ED4E4@oracle.com>
Message-ID: <631843016.1162305.1598817955347.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Guy Steele", "amber-spec-experts"
> Sent: Sunday, August 30, 2020 16:23:38
> Subject: Re: [pattern-switch] Totality

> Yes, this looks very much like a PoC I did once too. The upshot of that experiment is that operators like AND, OR, nesting, and guarding are amenable to implementation as combinators, which is nice.
>
>> To deal with the deconstruction, i need 3 primitives
>> - deconstruct(Object value) that calls the deconstructor if the deconstructed object is a class
>
> Here you're making an assumption about the language model, which is that classes may only have one deconstructor. This is a pretty serious limitation, and forces pattern matching to stay very much on the periphery of the object model. (I know why you want to make this assumption, because you want the bindings to be the return type and there is no return type overloading.)

It's a PoC. Note that adding more fields to the anonymous record (the record returned by the deconstructor) is a binary compatible change.
It can also be a source compatible change if the compiler allows the deconstruction pattern to have fewer parameters than the deconstructor declares. For example, for a record Pixel(int x, int y, String color) {}, instead of the pattern Pixel(var x, var y, var _), one could write the pattern Pixel(var x, var y).

You can also select among several deconstructors at runtime (as a linking pass done once) if they return disjoint sets of values, by first selecting all the applicable deconstructors and then selecting the most specific when you construct the method handle tree. But i'm not sure you need that dynamic linking selection in practice.

>> - extract(Record record, String name) that can extract the value of a record component from its name
>> - with(Record record, String name, Object value) that creates a new record with the value of the record component corresponding to the name changed
>>
>> The implementation of those primitives relies on the reflection so it's slow and wrong in term of security, but it's enough for a prototype [2]
>
> Yes, but. It also makes an assumption about runtime resolution of bindings based on component name. Again, you are making a big assumption about "how the language should work", but again it has gotten buried in an implementation detail. So whatever we learn from this model comes attached to some big assumptions.

The assumption is that
  case Pixel(_ , _, var color) -> System.out.println(color);
is equivalent to
  if (value instanceof Pixel pixel) {
    var record = pixel.deconstructor();  // if Pixel is not a record
    System.out.println(record.color());
  }

From a user POV you don't care, from a binary compatibility POV, there is no explicit reference to a specific deconstructor in the bytecode because the matching is done by name instead of being positional.

>
> FWIW, my implementation treated a pattern as a bundle of method handles, one was a mapping from the match target to an opaque carrier (null meant no match), and N method handles from the carrier to the binding value. In many cases the target itself was a suitable carrier.
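A rough sketch of the "bundle of method handles" shape described in the quoted paragraph, assuming Java 16 or newer for records. PatternHandles, PatternBundleDemo, pixelOrNull and colorOf are hypothetical names invented for illustration; this is not the prototype from either message, only one way the shape could look when the match target itself serves as the carrier and null means "no match".

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

record Pixel(int x, int y, String color) {}

// Hypothetical bundle: one handle from target to carrier, N handles from carrier to bindings.
record PatternHandles(MethodHandle match, List<MethodHandle> bindings) {}

class PatternBundleDemo {
    // Match function: the target is its own carrier when it is a Pixel, null means no match.
    static Object pixelOrNull(Object target) {
        return target instanceof Pixel ? target : null;
    }

    static PatternHandles pixelPattern() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Erase every handle to (Object)Object so callers can treat the carrier as opaque.
        MethodType generic = MethodType.methodType(Object.class, Object.class);
        MethodHandle match = lookup.findStatic(PatternBundleDemo.class, "pixelOrNull", generic);
        MethodHandle x = lookup.findVirtual(Pixel.class, "x", MethodType.methodType(int.class)).asType(generic);
        MethodHandle y = lookup.findVirtual(Pixel.class, "y", MethodType.methodType(int.class)).asType(generic);
        MethodHandle color = lookup.findVirtual(Pixel.class, "color", MethodType.methodType(String.class)).asType(generic);
        return new PatternHandles(match, List.of(x, y, color));
    }

    // Roughly what `case Pixel(_, _, var color)` could expand to with such a bundle.
    static String colorOf(Object target) {
        try {
            PatternHandles pattern = pixelPattern();
            Object carrier = pattern.match().invoke(target);
            if (carrier == null) {
                return "no match";
            }
            return (String) pattern.bindings().get(2).invoke(carrier);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(colorOf(new Pixel(1, 2, "red")));
        System.out.println(colorOf("not a pixel"));
    }
}
```

The bindings here are looked up positionally in the list; a name-based variant like the one proposed in the reply would key the handles by component name instead.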
a real implementation should not use null because we want the carrier object to be an inline, but saying by convention that __index__ == 0 (which is equivalent to saying it's the default value of an inline) represents "no match" should be enough.

>
>> The deconstruction is based on the names of the record components and not their declared positions in the record.
>> You may think that this way of doing the deconstruction is not typesafe but the fact that all the fields of the carrier record are typed means that all the bound variables have the correct type.
>
> Yes, but type safety is one of the lesser of my concerns here.

ok, good.

Rémi

From forax at univ-mlv.fr Sun Aug 30 21:58:04 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 30 Aug 2020 23:58:04 +0200 (CEST)
Subject: When several patterns are total ?
In-Reply-To: <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>
References: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr> <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>
Message-ID: <935290145.7318.1598824684605.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Sunday, August 30, 2020 16:37:53
> Subject: Re: When several patterns are total ?

>> i've hinted that there is an issue with intersection type and totality, but we did not follow up.
>>
>> Here is the issue
>>   var value = flag? "foo": 42;
>>   switch(value) {
>>     case String s -> ...
>>     case Integer i -> ...
>>     case Serializable s ->
>>     case Comparable c ->
>>   }
>>
>> given that the type of value is an intersection type Serializable & Comparable & ...
>> the last two cases are total with respect to the type of value, which does not go well with the current semantics that can only have one total case.
>
> Let's separate the issues here. The type involved is an infinite type, which I think we can agree is a distraction. But let's assume the type of value were Serializable&Comparable (S&C for short.)
>
> Because S&C <: S, the `case S` in your example is already total, so the `case C` should be a dead case and yield a compilation error. According to the rule we have, `case S` is total on any U <: S, so it is total on S&C, so the current model covers this, and the `case C` is identified as dead by the compiler. Which makes sense because there's no value it can match.
>
> I'm not seeing the problem?

Ok, that seems logical, good.

Rémi

From forax at univ-mlv.fr Sun Aug 30 22:05:33 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 31 Aug 2020 00:05:33 +0200 (CEST)
Subject: When several patterns are total ?
In-Reply-To:
References: <1230746443.1129275.1598785948720.JavaMail.zimbra@u-pem.fr> <93DBBA27-70F4-4607-9BB7-D5B435E221FB@oracle.com>
Message-ID: <1810255821.8508.1598825133280.JavaMail.zimbra@u-pem.fr>

> From: "Tagir Valeev"
> To: "Brian Goetz"
> Cc: "Remi Forax", "amber-spec-experts"
> Sent: Sunday, August 30, 2020 16:55:14
> Subject: Re: When several patterns are total ?

> Interesting!
> How about
> try {...}
> catch(Ex1 | Ex2 e) {
>   switch (e) {
>     case Ex1 -> ...
>     case Ex2 -> ...
>   }
> }
> ?

I also wonder if the precise exception trick should work:
  try {
    ...
  } catch(IllegalAccessException | NoSuchMethodException e) {
    throw switch(e) {
      case ReflectiveOperationException ex -> ex;
    };
  }

from the POV of the compiler, it's like throwing IllegalAccessException | NoSuchMethodException and not ReflectiveOperationException,
so the code can be refactored to
  try {
    ...
  } catch(IllegalAccessException | NoSuchMethodException e) {
    throw e;
  }

> With best regards,
> Tagir Valeev.

Rémi

From forax at univ-mlv.fr Sun Aug 30 22:27:08 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 31 Aug 2020 00:27:08 +0200 (CEST)
Subject: switch: using an expicit type as total is dangerous
In-Reply-To: <0088B731-27C7-4817-842D-AF1E3F208040@oracle.com>
References: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr> <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com> <1989167974.1130932.1598787455898.JavaMail.zimbra@u-pem.fr> <0088B731-27C7-4817-842D-AF1E3F208040@oracle.com>
Message-ID: <1246444616.8983.1598826428489.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Sunday, August 30, 2020 17:07:03
> Subject: Re: switch: using an expicit type as total is dangerous

> Sorry, but I don't find this example that compelling either. And I certainly don't think it comes remotely close to "bad enough that we have to throw the design in the trash."

My mails were never about throwing out the current design but about whether it can be improved. As i said earlier, we already agree on the fact that a total case should accept null, on the case null (whether it must be the first case is still in flux), on the fact that exhaustiveness doesn't require catching null and any novel types, etc.

We disagree on whether a total pattern should have a special syntax and on whether a type pattern with an explicit type should be allowed to be total.

>
> As I said from the beginning, yes, you can construct puzzlers.
> But the existence of a puzzler is not necessarily evidence that the language design is broken, and you are lobbying (implicitly) for an extreme solution: to take the entire design that we've worked on for years and toss it in the trash, and replace it with something that is not even yet a proposal, I think the bar is much higher than this. (I don't think that's an exaggeration; totality is what makes the entire design stick together without being an ad-hoc bag of "but on tuesday, do it differently".)
>
> This example feels to me in the same category as combining var with diamond. There's nothing wrong with that, but by leaving so many things implicit in your program, you may get an inference that is not what you expected. That doesn't mean that var is bad, or diamond is bad, or even that we should outlaw their interaction (which some suggested at the time.) It just means that when you combine features, especially features that involve implicitness, your program becomes more brittle.

Not necessarily, specifying a total pattern with a var instead of with an explicit type is more robust to future changes.

  switch(getLiteral(token)) {
    case Integer -> System.out.println("Integer");
    case Double -> System.out.println("Double");
    case var -> throw ...
  }

is more robust than

  switch(getLiteral(token)) {
    case Integer -> System.out.println("Integer");
    case Double -> System.out.println("Double");
    case Number -> throw ...
  }

because it still stays total whatever the return type of getLiteral() is.

[...]

>
> So, it's a good example, to call our attention to the consequences of leaving totality implicit. (We're having a separate discussion about whether to let the user opt into making totality explicit, and that's another tool that could be used to make this example less brittle, just as manifest types would make it less brittle than switching on an expression.)
>
> Really, though, I think you're attacking totality because of .01% imperfections without really appreciating how much worse the alternatives are, and how much more often their pain points would come up. (Refactoring switches to instanceof should be expected to happen 1000x more often than making an incompatible change to a method signature and hoping nothing changes.) It's good to identify the warts, but I'd prefer a little less jumping from "wart, ergo mistake" -- it took us three years to converge on this answer precisely because there are no perfect answers.

again, i'm not attacking totality. If there is no totality, you need it at least in an expression switch to make the whole switch exhaustive.

Rémi

From forax at univ-mlv.fr Sun Aug 30 22:48:25 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 31 Aug 2020 00:48:25 +0200 (CEST)
Subject: switch: using an expicit type as total is dangerous
In-Reply-To:
References: <1809503421.343196.1598278106154.JavaMail.zimbra@u-pem.fr> <9a93449a-6ab6-b9ad-c30c-189b5d1b64ad@oracle.com> <1989167974.1130932.1598787455898.JavaMail.zimbra@u-pem.fr> <0088B731-27C7-4817-842D-AF1E3F208040@oracle.com>
Message-ID: <1649787503.12510.1598827705756.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Sunday, August 30, 2020 17:57:15
> Subject: Re: switch: using an expicit type as total is dangerous

> One of the things that I think is being missed here is that switch is turning into a much more powerful construct, and as a result, our intuition for what it means needs to change with it. All of the sky-is-falling hyperbole about action-at-a-distance comes, in no small part, from having not fully upgraded our mental models to understand what switch means. Of course, we want to strike a balance between preserving suitable existing intuitions and extending the power of the construct, but let's not fool ourselves that there isn't something new to learn here. But given tradeoffs where we damage the expressiveness or consistency or clarity of the new construct to prop up old, obsolete intuitions, there's really no decision here.
>
> Legacy switches have only constants as their cases. This means that at most one case could match, and in turn this means that order is independent.
> They implement the equivalent of a dynamically-computed (possibly partial) map.
>
> New switches are like constrained if-else chains. The constraints are useful to the compiler because they are more easily scrutable; compilers can optimize away redundant tests, and turn some O(n) switches into O(log n) or O(1). And they are useful to the user because they are asking a simpler question which can have a less error-prone expression. But at root, some things have changed, such as non-overlap of cases, and they can have far more interesting structure. And these structures are not single-level, they are recursive (we will see chains of D(P) ... D(Q) just as we will see chains of P .. Q), and our model should embrace that natural structure.
>
> While it may feel unfamiliar today, we should expect to routinely recognize the shape of:
>
>   case Foo(Bar x):
>   case Foo(Baz x):
>   case Foo f:    // or case Foo(Object o):
>   case Object o: // or default
>
> where the cases form a tree, because these shapes will be ubiquitous. And developers will develop intuitions about these kinds of shapes too, but right now, they look a little more foreign, and so we're engaging in "bargaining" to try to make their behaviors less strange.

It does not feel strange or new to me
 - an 'or' pattern on a sealed type is equivalent to a legacy switch where the order is independent
 - an 'or' pattern on random types is equivalent to a series of catches where the order matters

How to interact with null may be a little strange, but once you see that a total case (or default) is like "else", it makes sense.

The less obvious part is just how to know if a pattern is total or not (apart from case Object where it's obvious).
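The two situations in the reply above can be sketched in the syntax that later shipped (assuming Java 21 or newer). Shape, Circle, Square and OrderDemo are hypothetical names used only for illustration. The first switch has disjoint cases over a sealed type, so its cases could be listed in any order; the second has overlapping types, so, like catch clauses, the more specific case has to come first and the final total case behaves like an "else".

```java
sealed interface Shape permits Circle, Square {}
record Circle(double radius) implements Shape {}
record Square(double side) implements Shape {}

class OrderDemo {
    // Disjoint cases over a sealed type: swapping the two cases cannot change the result.
    static String name(Shape s) {
        return switch (s) {
            case Square sq -> "square";
            case Circle c  -> "circle";
        };
    }

    // Overlapping types: Integer must precede Number, and the last case is total on Object.
    static String classify(Object o) {
        return switch (o) {
            case Integer i    -> "integer";
            case Number n     -> "other number";
            case Object other -> "something else";
        };
    }

    public static void main(String[] args) {
        System.out.println(name(new Circle(1.0)));
        System.out.println(classify(42));
        System.out.println(classify(1.5));
        System.out.println(classify("text"));
    }
}
```

In the second switch the compiler rejects `case Number n` placed before `case Integer i`, which is the same ordering discipline as a sequence of catch blocks.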
> > So, rather than trying to constrain the model to be more consistent with what > we?re familiar with, we should be trying to get familiar with the shapes that > will naturally arise (which is not hard, because we can look in code from other > languages that have similar features) before we start to make claims about what > is ?natural? or ?confusing.? Otherwise, we?re just starting the race with > bricks tied to our feet. R?mi > > >> On Aug 30, 2020, at 11:07 AM, Brian Goetz wrote: >> >> Sorry, but I don?t find this example that compelling either. And I certainly >> don?t think it comes remotely close to ?bad enough that we have to throw the >> design in the trash.? >> >> As I said from the beginning, yes, you can construct puzzlers. But the >> existence of a puzzler is not necessarily evidence that the language design is >> broken, and you are lobbying (implicitly) for an extreme solution: to take the >> entire design that we?ve worked on for years and toss it in the trash, and >> replace it with something that is not even yet a proposal, I think the bar is >> much higher than this. (I don?t think that?s an exaggeration; totality is what >> makes the entire design stick together without being an ad-hoc bag of ?but on >> tuesday, do it differently?.) >> >> This example feels to me in the same category as combining var with diamond. >> There?s nothing wrong with that, but by leaving so many things implicit in >> your program, you may get an inference that is not what you expected. That >> doesn?t mean that var is bad, or diamond is bad, or even that we should outlaw >> their interaction (which some suggested at the time.) It just means that when >> you combine features, especially features that involve implicitness, your >> program becomes more brittle. 
>> >> This example combines a lot of implicitness to get the same kind of brittleness, >> including switching on a complex expression whose type isn?t obvious, and, more >> importantly, making an incompatible change to a method. I don?t think you get >> to lay the blame on the language inferring totality here, unless you?re >> advocating that we should never infer anything! Making a change like this >> could easily change inferred types, which could silently affect overload >> decisions, and, when we get to Valhalla, even runtime layouts. That?s just >> part of the trade we make when we allow users to leave something unspecified, >> whether it be a manifest type (var, diamond, generic method witnesses), >> finality (lambda capture), totality (as here), accessibility (such as when >> migrating a class to an interface), etc. >> >> So, it?s a good example, to call our attention to the consequences of leaving >> totality implicit. (We?re having a separate discussion about whether to let >> the user opt into making totality explicit, and that?s another tool that could >> be used to make this example less brittle, just as manifest types would make it >> less brittle than switching on an expression.) >> >> Really, though, I think you?re attacking totality because of .01% imperfections >> without really appreciating how much worse the alternatives are, and how much >> more often their pain points would come up. (Refactoring switches to >> instanceof should be expected to happen 1000x more often than making an >> incompatible change to a method signature and hoping nothing changes.) It?s >> good to identify the warts, but I?d prefer a little less jumping from ?wart, >> ergo mistake? ? it took us three years to converge on this answer precisely >> because there are no perfect answers. 
>>
>>
>>> On Aug 30, 2020, at 7:37 AM, forax at univ-mlv.fr wrote:
>>>
>>> ----- Original message -----
>>>> From: "Brian Goetz"
>>>> To: "Remi Forax", "amber-spec-experts"
>>>> Sent: Monday, 24 August 2020 20:57:03
>>>> Subject: Re: switch: using an expicit type as total is dangerous
>>>
>>>>> 2/ using an explicit type for a total type is a footgun because the semantics
>>>>> will change if the hierarchy or the return type of a method switched upon
>>>>> changes.
>>>>
>>>> Sorry, I think this argument is a pure red herring. I get why this is
>>>> one of those "scary the first time you see it" issues, but I think the
>>>> fear has been overblown to near-panic proportions. We've spent a lot of
>>>> time talking about it and, the more we talk, the less worried I am.
>>>
>>> Good for you.
>>> The more I talk about it, the more I'm worried, because you don't seem to
>>> understand that having the semantics change underneath you is bad.
>>>
>>>>
>>>> The conditions that have to combine for this to happen are already
>>>> individually rare:
>>>> - a hierarchy change, combined with
>>>> - enough use-site type inference that it is not obvious what the type
>>>> dependencies are, combined with
>>>> - null actually being a member of the domain, combined with
>>>> - users not realizing null is a member of the domain.
>>>
>>>
>>> Nope, you don't need a hierarchy change; changing the return type (as noticed by
>>> Tagir) and null being part of the domain are enough.
>>>
>>>>
>>>> Then, for it to actually be a problem, not only do all of the above have
>>>> to happen, but an unhandled null has to actually show up.
>>>>
>>>> Even then, the severity of this case is low -- most likely, the NPE gets
>>>> moved from one place to another.
>>>
>>> Nope, see below.
>>>
>>>>
>>>> Even then, the remediation cost is trivial.
>>>
>>> To remediate, as a user you first have to see the change of semantics,
>>> but you don't.
>>>
>>>
>>> OK, let's take an example. I've written a method getLiteral():
>>>   Number getLiteral(String token) {
>>>     if (token.equals("null")) {
>>>       return null; // null is part of the domain
>>>     }
>>>     try {
>>>       return Integer.parseInt(token);
>>>     } catch(NumberFormatException e) {
>>>       return Double.parseDouble(token);
>>>     }
>>>   }
>>>
>>> and a statement switch in another package/module:
>>>   switch(getLiteral(token)) {
>>>     case Integer -> System.out.println("Integer");
>>>     case Double -> System.out.println("Double");
>>>     case Number -> System.out.println("null");
>>>   }
>>>
>>> but now I change getLiteral() to add string literals:
>>>   Object getLiteral(String token) {
>>>     if (token.equals("null")) {
>>>       return null; // null is part of the domain
>>>     }
>>>     if (token.startsWith("\"")) {
>>>       return token.substring(1, token.length() - 1);
>>>     }
>>>     try {
>>>       return Integer.parseInt(token);
>>>     } catch(NumberFormatException e) {
>>>       return Double.parseDouble(token);
>>>     }
>>>   }
>>>
>>> If I only recompile getLiteral() and run the code containing the switch, I get
>>> an ICCE at runtime because the signature of getLiteral() has changed, which is
>>> good,
>>> but if I now recompile the switch, the code compiles without any error but with
>>> different semantics, duh?
>>>
>>> Using "case var _" as the last case at least keeps the same semantics; using
>>> "default Number" does not compile.
>>>
>>> [...]
>>>
>>> Rémi

From brian.goetz at oracle.com Mon Aug 31 13:25:13 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 09:25:13 -0400
Subject: [pattern-switch] Opting into totality
In-Reply-To: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com>
Message-ID: <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com>

I think this is the main open question at this point.

We now have a deeper understanding of what this means, and the shape of the remainder.
Totality means not only "spot check me that I'm right", but also "I know there might be some remainder, please deal with it." So totality is not merely about type checking, but about affirmative handling of the remainder.

Expression switches automatically get this treatment, and opting _out_ of that makes no sense for expression switches (expressions must be total), but statement switches make sense both ways (just like unbalanced and balanced if-else.) Unfortunately the default has to be partial, so the main question is, how do we indicate the desire for totality in a way that is properly evocative for the user?

We've talked about modifying switch (sealed switch), a hyphenated keyword (total-switch), a trailing modifier (switch case), and synthetic cases ("default: unreachable"). Of course at this point it's "just syntax", but I think our goal should be picking something that makes it obvious to users that what's going on is not merely an assertion of totality, but also a desire to handle the remainder.

> - How does a switch opt into totality, other than by being an expression switch?

From brian.goetz at oracle.com Mon Aug 31 13:35:32 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 09:35:32 -0400
Subject: [pattern-switch] Opting into totality
In-Reply-To: <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com>
Message-ID: <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>

Totality is a term that language designers like, but may not be all that evocative to users. So switch-total might not exactly turn on the light bulb for them. In this manner, "sealed" has a useful connotation that has nothing to do with sealed types: non-leakiness: a sealed switch doesn't leak any unprocessed values!

Test driving ...

    sealed switch (x) { ... }
    sealed-switch (x) { ... }
    switch-sealed (x) { ... }

"A switch may be sealed with the sealed modifier; expression switches are implicitly sealed. The set of case patterns for a sealed switch must be total with some remainder; synthetic throwing cases are inserted for the remainder."

> On Aug 31, 2020, at 9:25 AM, Brian Goetz wrote:
>
> I think this is the main open question at this point.
>
> We now have a deeper understanding of what this means, and the shape of the remainder. Totality means not only "spot check me that I'm right", but also "I know there might be some remainder, please deal with it." So totality is not merely about type checking, but about affirmative handling of the remainder.
>
> Expression switches automatically get this treatment, and opting _out_ of that makes no sense for expression switches (expressions must be total), but statement switches make sense both ways (just like unbalanced and balanced if-else.) Unfortunately the default has to be partial, so the main question is, how do we indicate the desire for totality in a way that is properly evocative for the user?
>
> We've talked about modifying switch (sealed switch), a hyphenated keyword (total-switch), a trailing modifier (switch case), and synthetic cases ("default: unreachable"). Of course at this point it's "just syntax", but I think our goal should be picking something that makes it obvious to users that what's going on is not merely an assertion of totality, but also a desire to handle the remainder.
>
>> - How does a switch opt into totality, other than by being an expression switch?
>
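The difference between a partial statement switch and a "sealed" one can be made concrete with a minimal sketch in plain Java 14+ syntax. It uses none of the candidate keywords above; the explicit throwing `default` merely stands in for the synthetic throwing case that a sealed switch would insert. The names `Color`, `describePartial`, and `describeSealed` are hypothetical and chosen only for illustration, as is the choice of `IllegalStateException`.

```java
// Hypothetical illustration; none of these names come from the thread.
enum Color { RED, GREEN, BLUE }

final class SealedSwitchSketch {

    // Partial statement switch: a value without a case falls out the bottom
    // and the switch does nothing for it.
    static String describePartial(Color c) {
        String result = "unhandled";
        switch (c) {
            case RED -> result = "red";
            case GREEN -> result = "green";
            // BLUE is not handled here.
        }
        return result;
    }

    // Hand-written approximation of a "sealed" statement switch: the explicit
    // throwing default plays the role of the synthetic case for the remainder.
    static String describeSealed(Color c) {
        String result;
        switch (c) {
            case RED -> result = "red";
            case GREEN -> result = "green";
            default -> throw new IllegalStateException("unhandled value: " + c);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describePartial(Color.BLUE));
        try {
            describeSealed(Color.BLUE);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the partial version an unhandled value goes unnoticed; in the second version it is reported, which is the "non-leaky" behavior the sealed modifier is meant to evoke.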
From brian.goetz at oracle.com Mon Aug 31 15:09:35 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 11:09:35 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com>
Message-ID: <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com>

To be clear, I think the sweet spot here is:

 - Legacy enum, string, and box (ESB) switches continue to throw NPE on null;
 - Total switches on enum (including the current expression switches on enum) throw ICCE on a novel value;
 - For new switches with remainder:
    - Continue to throw NPE on unhandled null remainder;
    - Throw UnexpectedFrogException on any other unhandled remainder.

On 8/28/2020 9:18 PM, Brian Goetz wrote:
>
>> And for all this complex analysis we get... some different exception
>> types? Doesn't seem like a worthwhile trade.
>
> As I've mentioned already, I think the exception-type thing is mostly
> a red herring. We have some existing precedent, which is pretty hard
> to extrapolate from:
>
>  - Existing switches on enum/string/boxes throw NPE on a null target;
>  - (As of 12) enum expression switches throw ICCE when confronted with
> a novel value.
>
> All the discussion about exception types is trying to extrapolate
> from these, but it's pretty hard to actually do so. I would be happy
> to just have some sort of SwitchRemainderException.
>
>> What I'd like to do instead: switch expressions that are
>> optimistically/weakly total get an implicit 'default' case that
>> throws 'UnmatchedSwitchException' or something like that for
>> *everything* that goes unhandled.
>> Exactly what diagnostic information
>> we choose to put in the exception is a quality of implementation
>> issue. As a special case, if the unmatched value is 'null' (not
>> 'Box(null)'), we *might* decide to throw an NPE instead (depending on
>> how your ideas about null hostility in switches pan out)
>
> That is, essentially, what I have proposed in my "Totality" thread (I
> suspect you're still catching up, as it took us a long time to get
> there.)
>
>> This is a behavioral change for enum switch expressions in Java 14+
>> code, which makes me feel a bit sheepish, but I don't think anybody
>> will mind a change now that the design has evolved enough to
>> recognize the need for a specific exception class.
>
> I don't think we need an incompatible change; we're already sealing
> off some legacy behavior with enum/string/box switches, we can seal
> this one off there too.
>
>

From forax at univ-mlv.fr Mon Aug 31 15:17:07 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 31 Aug 2020 17:17:07 +0200 (CEST)
Subject: [pattern-switch] Opting into totality
In-Reply-To: <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>
Message-ID: <684244172.398752.1598887027455.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Monday, 31 August 2020 15:35:32
> Subject: Re: [pattern-switch] Opting into totality

> Totality is a term that language designers like, but may not be all that
> evocative to users. So switch-total might not exactly turn on the light bulb
> for them. In this manner, "sealed" has a useful connotation that has nothing to
> do with sealed types: non-leakiness: a sealed switch doesn't leak any
> unprocessed values!

> Test driving ...

> sealed switch (x) { ... }
> sealed-switch (x) { ... }
> switch-sealed (x) { ... }

> "A switch may be sealed with the sealed modifier; expression switches are
> implicitly sealed. The set of case patterns for a sealed switch must be total
> with some remainder; synthetic throwing cases are inserted for the remainder."

Those are all "snitch" moves; let's avoid that, because everything you said about having more than one kind of switch still applies.

Here are some facts that can help us:
- there are not a lot of existing switches in the wild
- as you said, there is a very good chance that the switch on types becomes the dominant switch.

Now, divide and conquer:
1/ a switch on type (statement or expression) should always be non-leaky
2a/ add a warning on all existing leaky statement switches, forcing them to have a default if not exhaustive
2b/ for an exhaustive enum switch, add a warning if the switch has a default,
    and if there is no default, let the compiler add a "default -> throw ICCE"; it's a breaking change but it should be OK because IDEs currently ask for a default in a switch on enums.

Explanations:
for 1/, it's about designing with the future in mind: if most switches are switches on types, let's have the right behavior
for 2a/, ask users to fix leaky statement switches; even if we introduce a sealed-switch, we will need this warning to gradually move to a better world.
for 2b/, ask users to fix exhaustive enum switches so they work like a switch on type.

I may be wrong with the idea of adding a "default -> throw" on enum switches without a default; it may break a lot of code, but I believe it is worth a try.

And BTW, we should also emit a warning if the default is in the middle of the switch, again to drive users to think in terms of switch-on-type constraints.

Rémi

>> On Aug 31, 2020, at 9:25 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

>> I think this is the main open question at this point.
>> We now have a deeper understanding of what this means, and the shape of the
>> remainder. Totality means not only "spot check me that I'm right", but also "I
>> know there might be some remainder, please deal with it." So totality is not
>> merely about type checking, but about affirmative handling of the remainder.

>> Expression switches automatically get this treatment, and opting _out_ of that
>> makes no sense for expression switches (expressions must be total), but
>> statement switches make sense both ways (just like unbalanced and balanced
>> if-else.) Unfortunately the default has to be partial, so the main question is,
>> how do we indicate the desire for totality in a way that is properly evocative
>> for the user?

>> We've talked about modifying switch (sealed switch), a hyphenated keyword
>> (total-switch), a trailing modifier (switch case), and synthetic cases
>> ("default: unreachable"). Of course at this point it's "just syntax", but I
>> think our goal should be picking something that makes it obvious to users that
>> what's going on is not merely an assertion of totality, but also a desire to
>> handle the remainder.

>>> - How does a switch opt into totality, other than by being an expression switch?
From forax at univ-mlv.fr Mon Aug 31 15:18:55 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 31 Aug 2020 17:18:55 +0200 (CEST)
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com>
References: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com> <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com>
Message-ID: <2102375907.399691.1598887135693.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "daniel smith"
> Cc: "Guy Steele", "Remi Forax",
> "Tagir Valeev", "amber-spec-experts"
> Sent: Monday, 31 August 2020 17:09:35
> Subject: Re: [pattern-switch] Exhaustiveness

> To be clear, I think the sweet spot here is:

> - Legacy enum, string, and box (ESB) switches continue to throw NPE on null;
> - Total switches on enum (including the current expression switches on enum)
> throw ICCE on a novel value;
> - For new switches with remainder:
> - Continue to throw NPE on unhandled null remainder;
> - Throw UnexpectedFrogException on any other unhandled remainder.

I read this as ICCE not being good enough compared to UnexpectedFrogException, and I don't understand why.

Rémi

> On 8/28/2020 9:18 PM, Brian Goetz wrote:

>>> And for all this complex analysis we get... some different exception types?
>>> Doesn't seem like a worthwhile trade.

>> As I've mentioned already, I think the exception-type thing is mostly a red
>> herring. We have some existing precedent, which is pretty hard to extrapolate
>> from:

>> - Existing switches on enum/string/boxes throw NPE on a null target;
>> - (As of 12) enum expression switches throw ICCE when confronted with a novel
>> value.

>> All the discussion about exception types is trying to extrapolate from these,
>> but it's pretty hard to actually do so.
>> I would be happy to just have some sort
>> of SwitchRemainderException.

>>> What I'd like to do instead: switch expressions that are optimistically/weakly
>>> total get an implicit 'default' case that throws 'UnmatchedSwitchException' or
>>> something like that for *everything* that goes unhandled. Exactly what
>>> diagnostic information we choose to put in the exception is a quality of
>>> implementation issue. As a special case, if the unmatched value is 'null' (not
>>> 'Box(null)'), we *might* decide to throw an NPE instead (depending on how your
>>> ideas about null hostility in switches pan out)

>> That is, essentially, what I have proposed in my "Totality" thread (I suspect
>> you're still catching up, as it took us a long time to get there.)

>>> This is a behavioral change for enum switch expressions in Java 14+ code, which
>>> makes me feel a bit sheepish, but I don't think anybody will mind a change now
>>> that the design has evolved enough to recognize the need for a specific
>>> exception class.

>> I don't think we need an incompatible change; we're already sealing off some
>> legacy behavior with enum/string/box switches, we can seal this one off there
>> too.

From brian.goetz at oracle.com Mon Aug 31 15:51:54 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 11:51:54 -0400
Subject: [pattern-switch] Opting into totality
In-Reply-To: <684244172.398752.1598887027455.JavaMail.zimbra@u-pem.fr>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com> <684244172.398752.1598887027455.JavaMail.zimbra@u-pem.fr>
Message-ID:

What you're suggesting is that we should treat statement switch partiality as a legacy behavior of existing switches on { primitives, boxes, strings, and enums }, and then say the rest of the switches are total.
(I must observe that the irony that you'd raise the spectre of "snitch" and then in the same breath make a proposal like this is pretty "total".)

Not only is this a far more intrusive change, but it also ignores something fundamental: partiality for statement switches _is a feature, not a bug_. A partial switch is like an `if` without an `else`; no one thinks such things are mistakes, and a rule that required an `else` on every `if` would not be appreciated. I appreciate the attempt at symmetry, and all things being equal that would be nice, but I don't think all things are equal here. I think this asks far too much of users to stretch their mental model in this way -- nor do I think it is worth the benefit, nor am I even convinced we'd actually even achieve the benefit in practice.

> for 1/, it's about designing with the future in mind: if most switches
> are switches on types, let's have the right behavior

I think you lost me already, as I don't think it's the right behavior. Statements are partial.

(I probably shouldn't even mention that this creates a new "action at a distance" problem since the totality semantics depend on the operand type (see, I was on the debate team in high school too), so I won't, because it would be unconstructive.)

But I will mention that the operand type isn't even the right thing to key off of here, because even if we are switching on strings, we might still want to use type patterns with guards:

    switch (int) {
        case 0: println("zero");
        case 1: println("one");
        case int x where x%2 == 0: println("even");
    }

Is this an old switch, or a "type" switch? Well, it can't be expressed as an old switch, since it uses type patterns, but it is a switch on old types. So should it be total? I think the line where you want to cut is fuzzier than you think, and that's going to confuse the heck out of users.
So overall, while it's a fair question to ask "could we get away with defining switch to always be total, carve out an exception for all the existing idioms, and not confuse the users too much", I think that would be taking it too far.

On 8/31/2020 11:17 AM, Remi Forax wrote:
>
>
> ------------------------------------------------------------------------
>
> *From: *"Brian Goetz"
> *To: *"amber-spec-experts"
> *Sent: *Monday, 31 August 2020 15:35:32
> *Subject: *Re: [pattern-switch] Opting into totality
>
> Totality is a term that language designers like, but may not be
> all that evocative to users. So switch-total might not exactly
> turn on the light bulb for them. In this manner, "sealed" has a
> useful connotation that has nothing to do with sealed types:
> non-leakiness: a sealed switch doesn't leak any unprocessed values!
>
> Test driving ...
>
>     sealed switch (x) { ... }
>     sealed-switch (x) { ... }
>     switch-sealed (x) { ... }
>
> "A switch may be sealed with the sealed modifier; expression
> switches are implicitly sealed. The set of case patterns for a
> sealed switch must be total with some remainder; synthetic
> throwing cases are inserted for the remainder."
>
>
> Those are all "snitch" moves; let's avoid that, because everything you said
> about having more than one kind of switch still applies.
>
> Here are some facts that can help us:
> - there are not a lot of existing switches in the wild
> - as you said, there is a very good chance that the switch on types
> becomes the dominant switch.
>
> Now, divide and conquer:
> 1/ a switch on type (statement or expression) should always be non-leaky
> 2a/ add a warning on all existing leaky statement switches, forcing
> them to have a default if not exhaustive
> 2b/ for an exhaustive enum switch, add a warning if the switch has a
> default,
>       and if there is no default, let the compiler add a "default ->
> throw ICCE"; it's a breaking change but it should be OK because IDEs
> currently ask for a default in a switch on enums.
> Explanations:
> for 1/, it's about designing with the future in mind: if most switches
> are switches on types, let's have the right behavior
> for 2a/, ask users to fix leaky statement switches; even if we
> introduce a sealed-switch, we will need this warning to gradually move
> to a better world.
> for 2b/, ask users to fix exhaustive enum switches so they work like a
> switch on type.
>
> I may be wrong with the idea of adding a "default -> throw" on enum
> switches without a default; it may break a lot of code, but I believe
> it is worth a try.
>
> And BTW, we should also emit a warning if the default is in the middle
> of the switch, again to drive users to think in terms of switch-on-type
> constraints.
>
> Rémi
>
>
> On Aug 31, 2020, at 9:25 AM, Brian Goetz
> wrote:
>
> I think this is the main open question at this point.
>
> We now have a deeper understanding of what this means, and the
> shape of the remainder. Totality means not only "spot check
> me that I'm right", but also "I know there might be some
> remainder, please deal with it." So totality is not merely
> about type checking, but about affirmative handling of the
> remainder.
>
> Expression switches automatically get this treatment, and
> opting _out_ of that makes no sense for expression switches
> (expressions must be total), but statement switches make sense
> both ways (just like unbalanced and balanced if-else.)
> Unfortunately the default has to be partial, so the main
> question is, how do we indicate the desire for totality in a
> way that is properly evocative for the user?
>
> We've talked about modifying switch (sealed switch), a
> hyphenated keyword (total-switch), a trailing modifier (switch
> case), and synthetic cases ("default: unreachable").
> Of
> course at this point it's "just syntax", but I think our goal
> should be picking something that makes it obvious to users
> that what's going on is not merely an assertion of totality,
> but also a desire to handle the remainder.
>
>  - How does a switch opt into totality, other than by
> being an expression switch?
>
>
>

From brian.goetz at oracle.com Mon Aug 31 16:35:21 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 12:35:21 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <2102375907.399691.1598887135693.JavaMail.zimbra@u-pem.fr>
References: <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com> <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com> <2102375907.399691.1598887135693.JavaMail.zimbra@u-pem.fr>
Message-ID:

The answer is twofold; one is a correctness argument, and the other is a practical one.

1. Box(null) is part of the remainder of `case Box(Bag(var x))`, and should be thrown on if the switch is total. But ICCE is not an accurate description of what happened here; there has not been an incompatible class change, but instead simply a putatively total switch combined with an "acceptably leaky" set of cases. The semantics here are "I got a value that the user didn't handle, but the user and the compiler made a deal that it's OK to not handle that value, because it's a silly value, so this exception serves as notice of that." That's not ICCE.

You could argue "well, then NPE." Which is also not quite accurate, since no one tried to dereference the null reference. But it might be close enough to get away with.
But then, what about something like `case TwoBox(Bag(var x), Shape s)` when confronted with a `new TwoBox(null, NovelSubtypeOfShape)`? The latter smells like ICCE, the former like NPE. Which should win? We could specify this, but .... this brings me to my second answer.

2. How much effort is it worth spending on coming up with a scheme to perfectly classify what exception should be thrown on remainder? And, do you have any idea how much JCK is then going to have to spend testing all the assertions about the difference of what should be thrown in weird nested cases? And then, what happens when Eclipse implements it differently?

Having seen how expensive it is to adjudicate spec/JCK challenges ("you threw AME when I think the spec says you should have thrown ICCE"), we've learned not to create those situations when there is no value to doing so. Yes, we could solve this, but it isn't worth it. This is not where we want to spend our complexity and conformance budgets.

An UnexpectedFrogException is both accurate (you got a value in the unhandled remainder) and simpler. So it wins, because we have bigger fish to fry. (Yes, I know frogs aren't fish.)

On 8/31/2020 11:18 AM, forax at univ-mlv.fr wrote:
>
> To be clear, I think the sweet spot here is:
>
>  - Legacy enum, string, and box (ESB) switches continue to throw
> NPE on null;
>  - Total switches on enum (including the current expression
> switches on enum) throw ICCE on a novel value;
>  - For new switches with remainder:
>     - Continue to throw NPE on unhandled null remainder;
>     - Throw UnexpectedFrogException on any other unhandled remainder.
>
>
> I read this as ICCE not being good enough compared to
> UnexpectedFrogException, and I don't understand why.
>
> Rémi
>
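The remainder of `case Box(Bag(var x))` discussed in this message can be sketched with a hand-written matcher. This is only an illustration of the intended runtime behavior, not how a compiler would translate the pattern: `Box`, `Bag`, and `unwrap` are hypothetical, `UnexpectedFrogException` is the placeholder name used in the message, and records require Java 16 or later.

```java
// Placeholder exception; the name is the stand-in used in the message.
final class UnexpectedFrogException extends RuntimeException {
    UnexpectedFrogException(String message) { super(message); }
}

// Hypothetical carriers for the nested pattern Box(Bag(var x)).
record Bag(Object contents) { }
record Box(Bag bag) { }

final class RemainderSketch {

    // Hand-written stand-in for a total switch whose only case is
    // `case Box(Bag(var x))`. Following the proposal: a null target throws NPE,
    // any other unhandled remainder (such as Box(null)) throws the placeholder.
    static Object unwrap(Box box) {
        if (box == null) {
            throw new NullPointerException("switch target is null");
        }
        Bag bag = box.bag();
        if (bag == null) {
            // Box(null) is in the remainder: no class changed incompatibly,
            // so ICCE would be a misleading description of what happened.
            throw new UnexpectedFrogException("unhandled remainder: " + box);
        }
        return bag.contents(); // x may itself be null; `var x` matches anything
    }

    public static void main(String[] args) {
        System.out.println(unwrap(new Box(new Bag("frog"))));
        try {
            unwrap(new Box(null));
        } catch (UnexpectedFrogException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Using one exception type for every non-null remainder value keeps the sketch free of any rule for choosing between ICCE and NPE in nested cases, which is the simplification the message argues for.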
From forax at univ-mlv.fr Mon Aug 31 19:57:37 2020
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 31 Aug 2020 21:57:37 +0200 (CEST)
Subject: [pattern-switch] Opting into totality
In-Reply-To:
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com> <684244172.398752.1598887027455.JavaMail.zimbra@u-pem.fr>
Message-ID: <1902801661.480215.1598903857911.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, 31 August 2020 17:51:54
> Subject: Re: [pattern-switch] Opting into totality

> What you're suggesting is that we should treat statement switch partiality as a
> legacy behavior of existing switches on { primitives, boxes, strings, and enums
> }, and then say the rest of the switches are total. (I must observe that the
> irony that you'd raise the spectre of "snitch" and then in the same breath make
> a proposal like this is pretty "total".)

> Not only is this a far more intrusive change, but it also ignores something
> fundamental: partiality for statement switches _is a feature, not a bug_. A
> partial switch is like an `if` without an `else`; no one thinks such things are
> mistakes, and a rule that required an `else` on every `if` would not be
> appreciated. I appreciate the attempt at symmetry, and all things being equal
> that would be nice, but I don't think all things are equal here. I think this
> asks far too much of users to stretch their mental model in this way -- nor do
> I think it is worth the benefit, nor am I even convinced we'd actually even
> achieve the benefit in practice.

'if' and 'switch' are dual: 'if' is oriented toward doing one test on a value and 'switch' is oriented toward doing several tests on the same value.
So a partial switch is not like an 'if', it's like a cascade of 'if/else', so forcing an 'else' onto a cascade of 'if/else' does not seem as bad as you are suggesting.

Yes, it's a more intrusive change, but it follows the playbook on how to grow a language: to avoid adding features on top of features to the point where the language is too hard to understand, when you add a feature, you do it in a way that retrofits an existing feature, so the number of features stays more or less constant.

I don't think that a partial statement is a bug. The rules I propose make it more explicit by adding "default:" or "default -> {}" at the end, but the semantics are still the same.

>> for 1/, it's about designing with the future in mind: if most switches are switches
>> on types, let's have the right behavior

> I think you lost me already, as I don't think it's the right behavior.
> Statements are partial.

OK, here I should have used "right default" instead of "right behavior"; you are right that it's not about the behavior, my bad on that.

> (I probably shouldn't even mention that this creates a new "action at a
> distance" problem since the totality semantics depend on the operand type (see,
> I was on the debate team in high school too), so I won't, because it would be
> unconstructive.)

Good that you did not mention it, because as far as I understand, for null there is already a difference between a switch on types and the existing switches.

> But I will mention that the operand type isn't even the right thing to key off
> of here, because even if we are switching on strings, we might still want to
> use type patterns with guards:

> switch (int) {
> case 0: println("zero");
> case 1: println("one");
> case int x where x%2 == 0: println("even");
> }

> Is this an old switch, or a "type" switch? Well, it can't be expressed as an old
> switch, since it uses type patterns, but it is a switch on old types.
Good question. From the user's POV it's either an error or a warning, so in both cases it's a call for action; for most users, using Alt+Enter or Ctrl+1 will fix the issue (insert a "default:"). For us, the EG or people writing compilers, it's a new switch because you have a case that is using a pattern.

> So should it be total?

If you get an error or a warning, it's because it's not total.

> I think the line where you want to cut is fuzzier than you think, and that's
> going to confuse the heck out of users.

The new switch will confuse a lot of users anyway; it's something I have noticed when doing presentations about pattern matching: you have to explain the syntax because not enough Java devs have been exposed to pattern matching in other languages before.

> So overall, while it's a fair question to ask "could we get away with defining
> switch to always be total, carve out an exception for all the existing idioms,
> and not confuse the users too much", I think that would be taking it too far.

I think that retrofitting the old switch to a common behavior at the same time as you introduce the new construct is not too far; again, as you said during the development of the expression switch, it's far easier to explain one behavior than to explain multiple (statement vs expression switch with respect to totality) or to have to explain when to use which kind of switch (switch vs sealed-switch).

Rémi

> On 8/31/2020 11:17 AM, Remi Forax wrote:

>>> From: "Brian Goetz" <brian.goetz at oracle.com>
>>> To: "amber-spec-experts" <amber-spec-experts at openjdk.java.net>
>>> Sent: Monday, 31 August 2020 15:35:32
>>> Subject: Re: [pattern-switch] Opting into totality

>>> Totality is a term that language designers like, but may not be all that
>>> evocative to users. So switch-total might not exactly turn on the light bulb
>>> for them. In this manner, "sealed"
has a useful connotation that has nothing to >>> do with sealed types: non-leakiness: a sealed switch doesn?t leak any >>> unprocessed values! >>> Test driving ... >>> sealed switch (x) { ? } >>> sealed-switch (x) { ? } >>> switch-sealed (x) { ? } >>> ?A switch may be sealed with the sealed modifier; expression switches are >>> implicitly sealed. The set of case patterns for a sealed switch must be total >>> with some remainder; synthetic throwing cases are inserted for the remainder.? >> Those are all "snitch" moves, let's avoid that because all you said about having >> more than one kind of switch still apply. >> Here are some facts that can help us, >> - there is not a lot of existing switches in the wild >> - as you said, there is a very good chance that the switch on types become the >> dominant switch. >> Now, divide and conquer, >> 1/ a switch on type (statement or expression) should always be non leaky >> 2a/ add a warning on all existing leaky statement switches forcing them to have >> a default if not exhaustive >> 2b/ for an exhaustive enum switch, add a warning if the switch has a default. >> and if there is no default, let the compiler add a "default -> throw ICCE", it's >> a breaking change but it should be ok because IDEs currently ask for a default >> in a switch on enums. >> explanations >> for 1/, it's about designing with the future in mind, if most switch are switch >> on types, let have the right behavior >> for 2a/, ask users to fix leaky statement switches, even if we introduce a >> selaed-switch, we will need this warning to gradually move to a better world. >> for 2b/, ask users to fix exhaustive enum switches so it works like a switch on >> type. >> I may be wrong with the idea of adding a "default -> throw" on enum switches >> without a default, it may break a lot of codes, but i believe it worth the try. 
>> And BTW, we should also emit a warning if the default is in the middle of the >> switch, again to drive user to think in term of switch on type constraints. >> R?mi >>>> On Aug 31, 2020, at 9:25 AM, Brian Goetz < [ mailto:brian.goetz at oracle.com | >>>> brian.goetz at oracle.com ] > wrote: >>>> I think this is the main open question at this point. >>>> We now have a deeper understanding of what this means, and the shape of the >>>> remainder. Totality means not only ?spot check me that I?m right?, but also ?I >>>> know there might be some remainder, please deal with it.? So totality is not >>>> merely about type checking, but about affirmative handling of the remainder. >>>> Expression switches automatically get this treatment, and opting _out_ of that >>>> makes no sense for expression switches (expressions must be total), but >>>> statement switches make sense both ways (just like unbalanced and balanced >>>> if-else.) Unfortunately the default has to be partial, so the main question is, >>>> how do we indicate the desire for totality in a way that is properly evocative >>>> for the user? >>>> We?ve talked about modifying switch (sealed switch), a hyphenated keyword >>>> (total-switch), a trailing modifier (switch case), and synthetic cases >>>> (?default: unreachable?). Of course at this point it?s ?just syntax?, but I >>>> think our goal should be picking something that makes it obvious to users that >>>> what?s going on is not merely an assertion of totality, but also a desire to >>>> handle the remainder. >>>>> - How does a switch opt into totality, other than by being an expression switch? -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From brian.goetz at oracle.com Mon Aug 31 20:05:12 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 16:05:12 -0400
Subject: [pattern-switch] Opting into totality
In-Reply-To: <1902801661.480215.1598903857911.JavaMail.zimbra@u-pem.fr>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com> <684244172.398752.1598887027455.JavaMail.zimbra@u-pem.fr> <1902801661.480215.1598903857911.JavaMail.zimbra@u-pem.fr>
Message-ID: <1e453215-4b1c-60ab-8b8f-868de4046dba@oracle.com>

Everything you say would be a fine way to make a new language from scratch. But, having made the choice to rehabilitate switch, and not do a "let's fix all the switch mistakes of the past" snitch construct, I think this approach would be pushing it too far, and I strongly doubt we'd get the benefit we are looking for.

> Ok, here i should have use "right default" instead of "right
> behavior", you are right that it's not about the behavior, my bad on
> that.

OK, good. One thing we've learned, though, is that trying to fix the defaults after 25 years often makes things worse. The benefit has to be super-huge to justify the next few years of confusion. Here, I don't see it.

On 8/31/2020 3:57 PM, forax at univ-mlv.fr wrote:
>
> ------------------------------------------------------------------------
>
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, August 31, 2020 17:51:54
> Subject: Re: [pattern-switch] Opting into totality
>
> What you're suggesting is that we should treat statement switch
> partiality as a legacy behavior of existing switches on {
> primitives, boxes, strings, and enums }, and then say the rest of
> the switches are total. (I must observe that the irony that you'd
> raise the spectre of "snitch" and then in the same breath make a
> proposal like this is pretty "total".)
> > Not only is this a far more intrusive change, but it also ignores > something fundamental: partiality for statement switches _is a > feature, not a bug_.? A partial switch is like an `if` without an > `else`; no one thinks such things are mistakes, and a rule that > required an `else` on every `if` would not be appreciated.? I > appreciate the attempt at symmetry, and all things being equal > that would be nice, but I don't think all things are equal here.? > I think this asks far too much of users to stretch their mental > model in this way -- nor do I think it is worth the benefit, nor > am I even convinced we'd actually even achieve the benefit in > practice. > > > 'if' and 'switch' are dual, 'if' is oriented toward doing one test on > a value and 'switch' is oriented to doing several tests on the same value. > So a partial switch is not like an 'if', it's like a cascade of > 'if/else' so forcing to have an 'else' when you have a cascade of > 'if/else' seems not as bad as your are suggesting. > > Yes, it's a more intrusive change but it using the playbook on how to > grow a language, to avoid to add features on top of features to the > point the language is too hard to understand, the idea is that when > you add a feature, you do that in a way that retrofit an existing > feature so the number of features stay more or less constant. > > I don't think that a partial statement is a bug. The rules i propose > make it more explicit by adding "default:" or "default -> {}" at the > end, but the semantics is still the same. > > > for 1/, it's about designing with the future in mind, if most > switch are switch on types, let have the right behavior > > > I think you lost me already, as I don't think it's the right > behavior.? Statements are partial. > > > Ok, here i should have use "right default" instead of "right > behavior", you are right that it's not about the behavior, my bad on > that. 
> > > > (I probably shouldn't even mention that this creates a new "action > at a distance" problem since the totality semantics depend on the > operand type (see, I was on the debate team in high school too), > so I won't, because it would be unconstructive.) > > > good you did not mention it because as far as i understand for null, > there is a difference between a switch on types and the already > existing switches. > > > But I will mention that the operand type isn't even the right > thing to key off of here, because even if we are switching on > strings, we might still want to use type patterns with guards: > > ??? switch (int) { > ??????? case 0: println("zero"); > ??????? case 1: println("one"); > ??????? case int x where x%2 == 0: println("even"); > ??? } > > Is this an old switch, or a "type" switch?? Well, it can't be > expressed as an old switch, since it uses type patterns, but it is > a switch on old types. > > > Good question, > from the user POV it's either an error or a warning, so in both cases > it's a call for action, so for most user, using Alt+Enter or Ctrl+1 > will fix the issue (insert a "default:") > for us the EG or people writing compilers, it's a new switch because > you have a case that is using a pattern. > > ? So should it be total? > > > If you get an error or a warning it's because it's not total. > > I think the line where you want to cut is fuzzier than you think, > and that's going to confuse the heck out of users. > > > The new switch will confuse a lot of users anyway, it's something i > have remarked when doing presentations about the pattern matching, you > have to explain the syntax because not enough Java devs have not been > exposed to pattern matching in an another languages before. 
> So > > > So overall, while it's a fair question to ask "could we get away > with defining switch to always be total, carve out an exception > for all the existing idioms, and not confuse the users too much", > I think that would be taking it too far. > > > I think that retrofitting the old switch to a common behavior at the > same time you introduce the new construct is not too far, again as you > said during the development of the expression switch, it's far easier > to explain one behavior that to explain multiple (statement vs > expression switch with respect to totality) or to have to explain when > to use which kind of switch (switch vs sealed-switch). > > R?mi > > > > > > > > > On 8/31/2020 11:17 AM, Remi Forax wrote: > > > > ------------------------------------------------------------------------ > > *De: *"Brian Goetz" > *?: *"amber-spec-experts" > > *Envoy?: *Lundi 31 Ao?t 2020 15:35:32 > *Objet: *Re: [pattern-switch] Opting into totality > > Totality is a term that language designers like, but may > not be all that evocative to users. ?So switch-total might > not exactly turn on the light bulb for them. ?In this > manner, ?sealed? has a useful connotation that has nothing > to do with sealed types: non-leakiness: a sealed switch > doesn?t leak any unprocessed values! > > Test driving ... > > ? ? sealed switch (x) { ? } > ? ? sealed-switch (x) { ? } > ? ? switch-sealed (x) { ? } > > ?A switch may be sealed with the sealed modifier; > expression switches are implicitly sealed. ?The set of > case patterns for a sealed switch must be total with some > remainder; synthetic throwing cases are inserted for the > remainder.? > > > Those are all "snitch" moves, let's avoid that because all you > said about having more than one kind of switch still apply. > > Here are some facts that can help us, > - there is not a lot of existing switches in the wild > - as you said, there is a very good chance that the switch on > types become the dominant switch. 
> > Now, divide and conquer, > 1/ a switch on type (statement or expression) should always be > non leaky > 2a/ add a warning on all existing leaky statement switches > forcing them to have a default if not exhaustive > 2b/ for an exhaustive enum switch, add a warning if the switch > has a default. > ????? and if there is no default, let the compiler add a > "default -> throw ICCE", it's a breaking change but it should > be ok because IDEs currently ask for a default in a switch on > enums. > explanations > for 1/, it's about designing with the future in mind, if most > switch are switch on types, let have the right behavior > for 2a/, ask users to fix leaky statement switches, even if we > introduce a selaed-switch, we will need this warning to > gradually move to a better world. > for 2b/, ask users to fix exhaustive enum switches so it works > like a switch on type. > > I may be wrong with the idea of adding a "default -> throw" on > enum switches without a default, it may break a lot of codes, > but i believe it worth the try. > > And BTW, we should also emit a warning if the default is in > the middle of the switch, again to drive user to think in term > of switch on type constraints. > > R?mi > > > On Aug 31, 2020, at 9:25 AM, Brian Goetz > > wrote: > > I think this is the main open question at this point. > > We now have a deeper understanding of what this means, > and the shape of the remainder. ?Totality means not > only ?spot check me that I?m right?, but also ?I know > there might be some remainder, please deal with it.? ? > So totality is not merely about type checking, but > about affirmative handling of the remainder. > > Expression switches automatically get ?this treatment, > and opting _out_ of that makes no sense for expression > switches (expressions must be total), but statement > switches make sense both ways (just like unbalanced > and balanced if-else.) 
Unfortunately the default has
> to be partial, so the main question is, how do we
> indicate the desire for totality in a way that is
> properly evocative for the user?
>
> We've talked about modifying switch (sealed switch), a
> hyphenated keyword (total-switch), a trailing modifier
> (switch case), and synthetic cases ("default:
> unreachable"). Of course at this point it's "just
> syntax", but I think our goal should be picking
> something that makes it obvious to users that what's
> going on is not merely an assertion of totality, but
> also a desire to handle the remainder.
>
> - How does a switch opt into totality, other than
> by being an expression switch?
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniel.smith at oracle.com Mon Aug 31 23:00:03 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 31 Aug 2020 17:00:03 -0600
Subject: [pattern-switch] Exhaustiveness
In-Reply-To: <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com>
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com> <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com>
Message-ID:

> On Aug 31, 2020, at 9:09 AM, Brian Goetz wrote:
>
> To be clear, I think the sweet spot here is:
>
> - Legacy enum, string, and box (ESB) switches continue to throw NPE on null;
> - Total switches on enum (including the current expression switches on enum) throw ICCE on a novel value;
> - For new switches with remainder:
>   - Continue to throw NPE on unhandled null remainder;
>   - Throw UnexpectedFrogException on any other unhandled remainder.

I'm still thinking it's worthwhile to change the behavior of enum switches to throw the same thing as sealed class switches.
Two reasons:

- ICCE is arguably just wrong. It's not an incompatible change to add an enum constant to an enum declaration.

- The inconsistency is a risk: somebody thinks they're catching the exception, then they discover, oops, enum switches throw a different exception type for historical reasons.

Of course, behavioral changes are a risk, too, but I think getting it in early, before there's been much time for adoption of switch expressions and evolution of enums, minimizes that risk. (We could even make the change in 16 as a spec bug fix.)

From forax at univ-mlv.fr Mon Aug 31 23:04:31 2020
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 1 Sep 2020 01:04:31 +0200 (CEST)
Subject: [pattern-switch] Opting into totality
In-Reply-To: <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>
Message-ID: <355666053.496907.1598915071108.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Monday, August 31, 2020 15:35:32
> Subject: Re: [pattern-switch] Opting into totality

> Totality is a term that language designers like, but may not be all that
> evocative to users. So switch-total might not exactly turn on the light bulb
> for them. In this manner, "sealed" has a useful connotation that has nothing to
> do with sealed types: non-leakiness: a sealed switch doesn't leak any
> unprocessed values!

> Test driving ...

> sealed switch (x) { ... }
> sealed-switch (x) { ... }
> switch-sealed (x) { ... }

> "A switch may be sealed with the sealed modifier; expression switches are
> implicitly sealed. The set of case patterns for a sealed switch must be total
> with some remainder; synthetic throwing cases are inserted for the remainder."
Given that if there is a default it's already a sealed switch, and that I can add a default to make it a sealed switch, I struggle to see where to use a classical statement switch and where to use a sealed switch?

Rémi

>> On Aug 31, 2020, at 9:25 AM, Brian Goetz <brian.goetz at oracle.com> wrote:

>> I think this is the main open question at this point.

>> We now have a deeper understanding of what this means, and the shape of the
>> remainder. Totality means not only "spot check me that I'm right", but also "I
>> know there might be some remainder, please deal with it." So totality is not
>> merely about type checking, but about affirmative handling of the remainder.

>> Expression switches automatically get this treatment, and opting _out_ of that
>> makes no sense for expression switches (expressions must be total), but
>> statement switches make sense both ways (just like unbalanced and balanced
>> if-else.) Unfortunately the default has to be partial, so the main question is,
>> how do we indicate the desire for totality in a way that is properly evocative
>> for the user?

>> We've talked about modifying switch (sealed switch), a hyphenated keyword
>> (total-switch), a trailing modifier (switch case), and synthetic cases
>> ("default: unreachable"). Of course at this point it's "just syntax", but I
>> think our goal should be picking something that makes it obvious to users that
>> what's going on is not merely an assertion of totality, but also a desire to
>> handle the remainder.

>>> - How does a switch opt into totality, other than by being an expression switch?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Mon Aug 31 23:30:15 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 19:30:15 -0400
Subject: [pattern-switch] Exhaustiveness
In-Reply-To:
References: <8920ff53-39d2-4a18-5049-43816f4f4cb2@oracle.com> <887da6da-3264-83ba-4630-ae5c1a2293f6@oracle.com> <977893423.207091.1597955667990.JavaMail.zimbra@u-pem.fr> <50A45261-109C-4F73-9109-E3174E57D325@oracle.com> <2927f36f-3231-14c4-f8d4-f05685efd6e1@oracle.com> <41E50468-648B-4A29-8AB6-5AB0871ED320@oracle.com> <0c73fc74-d965-48ea-68fe-80f480b7e464@oracle.com>
Message-ID:

> I'm still thinking it's worthwhile to change the behavior of enum switches to throw the same thing as sealed class switches.

I don't necessarily think this is terrible, but see inline.

> - ICCE is arguably just wrong. It's not an incompatible change to add an enum constant to an enum declaration.

Kevin made a compelling argument when we did this that ICCE is actually right. The notion is that ICCE is what users are most likely to associate with "broken classpath"; that the classes on the class path have been inconsistently compiled. ICCEs go away on recompilation.

This does, of course, hinge on the meaning of incompatible. Adding an enum constant is not binary-incompatible; all the call sites continue to link. But "I've never seen this value before", because the value wasn't present at compilation time, is not an unreasonable interpretation.

> - The inconsistency is a risk: somebody thinks they're catching the exception, then they discover, oops, enum switches throw a different exception type for historical reasons.

Yeah, don't care much about this. No one catches these anyway.

> Of course, behavioral changes are a risk, too, but I think getting it in early, before there's been much time for adoption of switch expressions and evolution of enums, minimizes that risk. (We could even make the change in 16 as a spec bug fix.)
I don't necessarily disagree, and it would be consistent to throw UnexpectedFrogException for all non-null remainders, but the status quo does not seem wrong to me.

From daniel.smith at oracle.com Mon Aug 31 23:31:16 2020
From: daniel.smith at oracle.com (Dan Smith)
Date: Mon, 31 Aug 2020 17:31:16 -0600
Subject: [pattern-switch] Opting into totality
In-Reply-To: <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com>
Message-ID: <55264041-ACFD-446D-A6DF-CDE9892C157C@oracle.com>

> On Aug 31, 2020, at 7:35 AM, Brian Goetz wrote:
>
> Totality is a term that language designers like, but may not be all that evocative to users. So switch-total might not exactly turn on the light bulb for them. In this manner, "sealed" has a useful connotation that has nothing to do with sealed types: non-leakiness: a sealed switch doesn't leak any unprocessed values!
>
> Test driving ...
>
> sealed switch (x) { ... }
> sealed-switch (x) { ... }
> switch-sealed (x) { ... }
>
> "A switch may be sealed with the sealed modifier; expression switches are implicitly sealed. The set of case patterns for a sealed switch must be total with some remainder; synthetic throwing cases are inserted for the remainder."

+1

I like this being up front. I find tricks embedded in the body like 'default: unreachable' to be too subtle and verbose. And I like the reuse of 'sealed'.

It's unclear whether your "some remainder" is allowed to be empty. (There was some discussion earlier about outlawing 'default' in the equivalent of a sealed switch.) I hope full totality is fine; an expression switch, implicitly 'sealed', of course permits a 'default' clause.

And then note that, given the existence of 'sealed switch', the 'default Object o' feature is redundant. If you want to make sure you have a total case in your switch, just say 'sealed' at the top.
All sealed switches (both statement and expression) guarantee either optimistic totality + NPE or that the last clause is total.

From brian.goetz at oracle.com Mon Aug 31 23:32:35 2020
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 31 Aug 2020 19:32:35 -0400
Subject: [pattern-switch] Opting into totality
In-Reply-To: <355666053.496907.1598915071108.JavaMail.zimbra@u-pem.fr>
References: <1e3058dd-b98e-cffe-371a-b395fb768838@oracle.com> <2B1A968E-C518-496D-984E-76A05B2B545B@oracle.com> <8660C84F-338F-4F08-A5B5-CA0BBA99F970@oracle.com> <355666053.496907.1598915071108.JavaMail.zimbra@u-pem.fr>
Message-ID:

> Given that if there is a default it's already a sealed switch and that
> i can add a default to make it a sealed switch,
> i struggle to see where to use a classical statement switch and where
> to use a sealed switch ?

It feels like we're going in circles :)

One point here is that total switches are generally better _without_ default clauses, if that is semantically practical (e.g., enums, sealed types, total type patterns) because then you don't have a catch-all that sweeps mistakes under the rug. But then there needs to be a way to engage totality checking / handling for statements. Totalizing with default should not be your first move.
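
[Editorial illustration, not part of the original message.] The point about default clauses hiding mistakes can be sketched with the one construct that is already total in the language, a switch expression over an enum. The class `TotalityDemo`, the enum `Color`, and both method names are hypothetical, chosen only for this example; the 'sealed switch' syntax discussed in the thread was a proposal and is deliberately not used here.

```java
public class TotalityDemo {
    // Hypothetical enum, used only for illustration.
    enum Color { RED, GREEN, BLUE }

    // Total without a default: the compiler checks that every constant of
    // Color is covered. Adding a new constant to Color later turns this
    // method into a compile-time error until the new case is handled.
    static String checked(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN, BLUE -> "cool";
        };
    }

    // Totalized with a default: the author forgot BLUE, but the switch still
    // compiles because the catch-all absorbs the missing case.
    static String unchecked(Color c) {
        return switch (c) {
            case RED -> "warm";
            case GREEN -> "cool";
            default -> "unknown"; // BLUE silently lands here
        };
    }

    public static void main(String[] args) {
        for (Color c : Color.values()) {
            System.out.println(c + ": " + checked(c) + " / " + unchecked(c));
        }
    }
}
```

In `checked`, the omission that `unchecked` tolerates would be rejected by the compiler, which is the sense in which a default "sweeps mistakes under the rug". The open question in the thread is how a statement switch could opt into the same checking.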