From brian.goetz at oracle.com Wed Nov 1 18:53:40 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 1 Nov 2017 14:53:40 -0400
Subject: Data classes
Message-ID:

At the following URL, please find a writeup containing our current
thoughts on Data Classes for Java:

    http://cr.openjdk.java.net/~briangoetz/amber/datum.html

Comments welcome!

We'll be making a prototype available soon for folks to play with.

From mark at io7m.com Thu Nov 2 10:46:50 2017
From: mark at io7m.com (Mark Raynsford)
Date: Thu, 2 Nov 2017 10:46:50 +0000
Subject: Data classes
In-Reply-To:
References:
Message-ID: <20171102104651.75e67a86@copperhead.int.arc7.info>

On 2017-11-01T14:53:40 -0400
Brian Goetz wrote:

> At the following URL, please find a writeup containing our current
> thoughts on Data Classes for Java:
>
> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>
> Comments welcome!
>
> We'll be making a prototype available soon for folks to play with.

This looks great!

One thing springs to mind with the accessors: Is it possible that the
generated accessor methods could participate in the implementation of
interfaces? For example:

  interface Vector
  {
    double x();
    double y();
    double z();
  }

  __data class Vector3 (double x, double y, double z)
    implements Vector { }

The Vector3 type would automatically get x(), y(), and z() methods
based on the field declarations. This may imply, of course, that the
fields are public by default.

I don't see any obvious reasons why this couldn't work, but you've
almost certainly thought further than I have on the subject.
-- 
Mark Raynsford | http://www.io7m.com

From maurizio.cimadamore at oracle.com Thu Nov 2 11:31:06 2017
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 2 Nov 2017 11:31:06 +0000
Subject: Data classes
In-Reply-To: <20171102104651.75e67a86@copperhead.int.arc7.info>
References: <20171102104651.75e67a86@copperhead.int.arc7.info>
Message-ID: <2c251cbd-5d71-8c28-9340-1a68a1fee1ce@oracle.com>

Hi Mark,
the code that is generated on a datum is essentially like real code that
you would otherwise have typed yourself. In other words, think of it as
ACC_MANDATED, not as ACC_SYNTHETIC. It is something that the compiler
can fully reason about during type-checking and, as a consequence, it
means that, yes, you can implement an interface via datum _mandated_
methods.

Cheers
Maurizio

On 02/11/17 10:46, Mark Raynsford wrote:
> On 2017-11-01T14:53:40 -0400
> Brian Goetz wrote:
>
>> At the following URL, please find a writeup containing our current
>> thoughts on Data Classes for Java:
>>
>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>>
>> Comments welcome!
>>
>> We'll be making a prototype available soon for folks to play with.
> This looks great!
>
> One thing springs to mind with the accessors: Is it possible that the
> generated accessor methods could participate in the implementation of
> interfaces? For example:
>
>   interface Vector
>   {
>     double x();
>     double y();
>     double z();
>   }
>
>   __data class Vector3 (double x, double y, double z)
>     implements Vector { }
>
> The Vector3 type would automatically get x(), y(), and z() methods
> based on the field declarations. This may imply, of course, that the
> fields are public by default.
>
> I don't see any obvious reasons why this couldn't work, but you've
> almost certainly thought further than I have on the subject.
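The interface question in this exchange can be tried out with a small
sketch. It assumes records as they later shipped in Java 16 as a
stand-in for the prototype's __data syntax, which may well behave
differently in detail; VectorDemo and lengthSquared are hypothetical
names introduced only for the illustration.

```java
// Hypothetical stand-in: a record (Java 16+) replaces the prototype's
// "__data class" declaration from the message.
interface Vector {
    double x();
    double y();
    double z();
}

// No accessor bodies are written here. The compiler-provided x(), y()
// and z() accessors satisfy the three abstract methods of Vector.
record Vector3(double x, double y, double z) implements Vector { }

final class VectorDemo {
    // Works against the interface only, so any Vector implementation will do.
    static double lengthSquared(Vector v) {
        return v.x() * v.x() + v.y() * v.y() + v.z() * v.z();
    }

    public static void main(String[] args) {
        Vector v = new Vector3(1.0, 2.0, 2.0);
        System.out.println(lengthSquared(v));
    }
}
```

The point of the sketch is only that the accessors are ordinary members
for type-checking purposes, which is the ACC_MANDATED reading given in
the reply.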
>

From mark at io7m.com Thu Nov 2 12:32:17 2017
From: mark at io7m.com (Mark Raynsford)
Date: Thu, 2 Nov 2017 12:32:17 +0000
Subject: Data classes
In-Reply-To: <2c251cbd-5d71-8c28-9340-1a68a1fee1ce@oracle.com>
References: <20171102104651.75e67a86@copperhead.int.arc7.info>
 <2c251cbd-5d71-8c28-9340-1a68a1fee1ce@oracle.com>
Message-ID: <20171102123217.7e2eb9a7@copperhead.int.arc7.info>

On 2017-11-02T11:31:06 +0000
Maurizio Cimadamore wrote:

> Hi Mark,
> the code that is generated on a datum is essentially like real code that
> you would otherwise have typed yourself. In other words, think of it as
> ACC_MANDATED, not as ACC_SYNTHETIC. It is something that the compiler
> can fully reason about during type-checking and, as a consequence, it
> means that, yes, you can implement an interface via datum _mandated_
> methods.

Nice. That's what I suspected/hoped.

-- 
Mark Raynsford | http://www.io7m.com

From forax at univ-mlv.fr Thu Nov 2 13:34:58 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 2 Nov 2017 14:34:58 +0100 (CET)
Subject: Data classes
In-Reply-To:
References:
Message-ID: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr>

Hi Brian,
there is an axis which is not mentioned in this document: the
nullability of the fields. It is somewhat like 'final', because fields
should be non-nullable by default, and it asks the compiler to add code
in the constructors (the calls to requireNonNull), so it can have a not
so simple interaction with the call to default.

I will post several other mails on each section later.

cheers,
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Wednesday, 1 November 2017 19:53:40
> Subject: Data classes

> At the following URL, please find a writeup containing our current
> thoughts on Data Classes for Java:
>
> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>
> Comments welcome!
>
> We'll be making a prototype available soon for folks to play with.
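To make the nullability point in the message above concrete, here is a
hand-written sketch of what "null checks in the constructor" changes in
practice. It is ordinary Java, not anything the data class prototype
generates; Person, name and nickname are hypothetical names chosen for
the illustration.

```java
import java.util.Objects;

// Hand-written equivalent of a two-field carrier where one field is
// treated as non-nullable and the other as nullable.
final class Person {
    private final String name;     // non-nullable: checked in the constructor
    private final String nickname; // nullable: no check

    Person(String name, String nickname) {
        this.name = Objects.requireNonNull(name, "name");
        this.nickname = nickname;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Person)) {
            return false;
        }
        Person other = (Person) o;
        // name can be dereferenced directly; nickname needs a null-safe comparison.
        return name.equals(other.name) && Objects.equals(nickname, other.nickname);
    }

    @Override
    public int hashCode() {
        return 31 * name.hashCode() + Objects.hashCode(nickname);
    }
}
```

The constructor, equals and hashCode bodies all differ depending on
which fields may be null, which is the sense in which nullability
affects the code that would have to be generated.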
From brian.goetz at oracle.com Thu Nov 2 13:59:11 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 2 Nov 2017 09:59:11 -0400
Subject: Data classes
In-Reply-To: <7B7410AB-AE95-4486-88F7-BA428BDFC859@d-d.me>
References: <7B7410AB-AE95-4486-88F7-BA428BDFC859@d-d.me>
Message-ID: <5a0b3a84-c770-9329-423e-5c745a7b0585@oracle.com>

> This is a very good and a pleasant read, thank you!
> Especially nice to see a clear definition of design requirements.
> I have several follow up questions:
>
> * Do you envision data classes as a Java language feature or a JVM
>   platform feature?
>   Are additional features such as externalization implemented by
>   runtime magic or by bytecode generated by the compiler ahead of time?
>

We will likely do the same thing we did with Lambda, where we provide
JDK runtime support for implementing the Object methods using indy. In
our prototype, we expose invokedynamic bootstraps that implement equals,
hashCode, and toString, which take as their static arguments method
handles for fetching the fields (and in the case of toString, also
field names), like this:

    public static CallSite makeEquals(MethodHandles.Lookup lookup, String invName, MethodType invType,
                                      Class dataClass, MethodHandle... getters) throws Throwable {
        return new ConstantCallSite(makeEquals(dataClass, List.of(getters)));
    }

The static compiler, in the body of equals(), just generates a single
invokedynamic instruction whose bootstrap is this method, whose static
arguments describe this class, and whose dynamic arguments are those of
equals(). Other language compilers could do the same.

> * The requirements stipulated in section "Towards requirements for data
>   classes" feel like they should not only impose limits on the
>   dataclass, but also on the types of fields the data class could
>   contain.
>   For example:
>   |For any instance c of C, ctor(dtor(c)) equals c, according to the
>   equals() contract for C,
>   and further, that the composition ctor(dtor(x)) is an identity on
>   the codomain of ctor.|
>   This assumes that equals on fields is also well-behaved.
>   While in a perfect world I'd prefer to limit data classes to only
>   have well-behaved fields, I see that this may be infeasible. I think
>   it is still worth spelling out that the requirements & definitions
>   of a transparent carrier are conditional on the well-behavedness of
>   fields, and formally defining this well-behavedness.
>

Yeah, I'm not really sure what more we can do here, but I agree that
this is a place where our requirements could be implicitly undermined.
Any suggestions?

> * While I agree that names are meaningful in Java, it would be nice
>   to be able to generically inspect and/or deconstruct an arbitrary
>   data class as if it was a tuple. This will allow generic handling of
>   data classes when it makes sense. Do you have ideas in this area?
>

We will generate deconstructor patterns for these things, of course,
which can be statically invoked by name (case Point(var x, var y)). And
deconstructor patterns can be invoked reflectively, but I know that's
also not what you mean (though it may be good enough for frameworks). I
think what you mean is something more like "data interfaces", where a
data class can "implement" a more abstract deconstructor?

> * Mutability: I agree that enforcing complete immutability is
>   impractical.
>   But I'm curious whether your current thinking is to limit mutability
>   to the class itself (as if setters were private) or to allow data
>   classes to be mutated externally (as if setters were public).
>   My feeling is that the former route is nice because it allows data
>   class authors to expose mutability only if they want to. What was
>   your thinking?
>

The most restrictive position is: fields are always final and are always
publicly readable (either because the fields themselves are public, or
we provide public read accessors.) This is a principled position, but is
probably impractical.

The likely practical compromise is:

 - fields are final by default, but can be declared "unfinal"
 - fields are private by default, but can be declared public
 - you get public read accessors, no matter what

So if you do nothing, you get final and readable, which is a reasonable
default. If you opt into mutability, fields are still private, and you
have the option to expose mutative methods, or not. If you opt into
mutability and public-ness, people can do whatever they want.

> * The arguments to the extends Base() clause is a list of names
>   of state components of Sub ...
>   must be a prefix of the state description of Sub,
>
>   Curiosity: why do you need it to be a prefix? I didn't understand
>   it from the writeup.
>

Essentially, this is "nominal width subtyping". This isn't essential,
though I think this requirement is useful to nudge people away from
mistaken ideas about extension; the degrees of freedom it removes are
relatively weak in expressiveness, but still make it possible for users
to confuse themselves, such as when Foo(int x, int y) is declared to
extend Bar(int y, int x).

> * I feel like this might be limiting: if you're working on abstract
>   ADTs, you are likely to work on very abstract entities, where a
>   convenient order of arguments is not yet clear. When ADTs become less
>   abstract in different subtypes it will become clear, and can be
>   different in different branches of the inheritance hierarchy.
>

Please share examples!

Cheers,
-Brian
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Thu Nov 2 14:00:38 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 2 Nov 2017 10:00:38 -0400
Subject: Data classes
In-Reply-To: <20171102104651.75e67a86@copperhead.int.arc7.info>
References: <20171102104651.75e67a86@copperhead.int.arc7.info>
Message-ID: <2c95cdb1-a41d-b92e-be00-dcf5323ade50@oracle.com>

>
> This looks great!
>
> One thing springs to mind with the accessors: Is it possible that the
> generated accessor methods could participate in the implementation of
> interfaces? For example:
>
>   interface Vector
>   {
>     double x();
>     double y();
>     double z();
>   }
>
>   __data class Vector3 (double x, double y, double z)
>     implements Vector { }
>
> The Vector3 type would automatically get x(), y(), and z() methods
> based on the field declarations. This may imply, of course, that the
> fields are public by default.

This is exactly how it would work. The Vector3 class acquires x(), y(),
and z() accessors, and since the interface Vector specifies these, and
Vector3 <: Vector, then those accessors are the implementations of the
Vector methods.

From brian.goetz at oracle.com Thu Nov 2 14:05:29 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 2 Nov 2017 10:05:29 -0400
Subject: Data classes
In-Reply-To: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr>
References: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr>
Message-ID: <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com>

Yes, there are many features that a feature like data classes might want
to pull in, like nullability / optionality / named constructor parameter
invocations, to name a few.

However, treating nullability as a sub-feature of data classes would be
a mistake, because, if a user has a class that does not quite meet the
requirements for a data class, users will be badly tempted to wedge this
square peg into the round data class hole just to get, say, nullability
support.
If we're going to do something about nullability, it should
work for all classes, data or otherwise.

You are right that there's a weak coupling, in that if we did that
feature first, it might affect the defaults for data classes, whereas if
we do data classes first, we might be boxed in with respect to defaults.

On 11/2/2017 9:34 AM, Remi Forax wrote:
> Hi Brian,
> there is an axis which is not mentioned in this document: the
> nullability of the fields. It is somewhat like 'final', because fields
> should be non-nullable by default, and it asks the compiler to add code
> in the constructors (the calls to requireNonNull), so it can have a not
> so simple interaction with the call to default.
>
> I will post several other mails on each section later.
>
> cheers,
> Rémi
>
> ----- Original Message -----
>> From: "Brian Goetz"
>> To: "amber-spec-experts"
>> Sent: Wednesday, 1 November 2017 19:53:40
>> Subject: Data classes
>> At the following URL, please find a writeup containing our current
>> thoughts on Data Classes for Java:
>>
>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>>
>> Comments welcome!
>>
>> We'll be making a prototype available soon for folks to play with.

From forax at univ-mlv.fr Thu Nov 2 14:40:29 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 2 Nov 2017 15:40:29 +0100 (CET)
Subject: Data classes
In-Reply-To: <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com>
References: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr>
 <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com>
Message-ID: <1753394655.1345295.1509633629668.JavaMail.zimbra@u-pem.fr>

I agree that there is no way in Java to say whether a field can be null
or not, but we also commonly use Objects.requireNonNull as a
precondition in our code to indicate whether a field can be null (for
example, IDEA adds Objects.requireNonNull by default).
I heard your concern about not trying to solve nullability support in
Java, but at the same time, we do not write the constructor, equals or
hashCode the same way depending on whether a field can be null or not,
so it is the kind of information you want in order to tailor the code
the compiler/JDK will generate, so the coupling is not weak.

In the proposal, you discuss adding keywords like "non-final, unfinal,
mutable", i.e. considering that there can be a set of keywords for data
class declarations which is different from the one we have on fields.
In that spirit, we can have a flag like nullable, maybenull, etc. to
specify that the compiler will not generate a requireNonNull inside the
principal constructor and that equals or hashCode do not need
nullchecks.

Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, 2 November 2017 15:05:29
> Subject: Re: Data classes

> Yes, there are many features that a feature like data classes might want
> to pull in, like nullability / optionality / named constructor parameter
> invocations, to name a few.
>
> However, treating nullability as a sub-feature of data classes would be
> a mistake, because, if a user has a class that does not quite meet the
> requirements for a data class, users will be badly tempted to wedge this
> square peg into the round data class hole just to get, say, nullability
> support. If we're going to do something about nullability, it should
> work for all classes, data or otherwise.
>
> You are right that there's a weak coupling, in that if we did that
> feature first, it might affect the defaults for data classes, whereas if
> we do data classes first, we might be boxed in with respect to defaults.
>
>
>
> On 11/2/2017 9:34 AM, Remi Forax wrote:
>> Hi Brian,
>> there is an axis which is not mentioned in this document: the
>> nullability of the fields. It is somewhat like 'final', because fields
>> should be non-nullable by default, and it asks the compiler to add code
>> in the constructors (the calls to requireNonNull), so it can have a not
>> so simple interaction with the call to default.
>>
>> I will post several other mails on each section later.
>>
>> cheers,
>> Rémi
>>
>> ----- Original Message -----
>>> From: "Brian Goetz"
>>> To: "amber-spec-experts"
>>> Sent: Wednesday, 1 November 2017 19:53:40
>>> Subject: Data classes
>>> At the following URL, please find a writeup containing our current
>>> thoughts on Data Classes for Java:
>>>
>>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>>>
>>> Comments welcome!
>>>
>>> We'll be making a prototype available soon for folks to play with.

From brian.goetz at oracle.com Thu Nov 2 15:05:33 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 2 Nov 2017 11:05:33 -0400
Subject: Data classes
In-Reply-To: <1753394655.1345295.1509633629668.JavaMail.zimbra@u-pem.fr>
References: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr>
 <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com>
 <1753394655.1345295.1509633629668.JavaMail.zimbra@u-pem.fr>
Message-ID: <74a5059d-65e7-5303-bc12-d944bc16ed14@oracle.com>

> In the proposal, you discuss adding keywords like "non-final, unfinal,
> mutable", i.e. considering that there can be a set of keywords for data
> class declarations which is different from the one we have on fields.
> In that spirit, we can have a flag like nullable, maybenull, etc. to
> specify that the compiler will not generate a requireNonNull inside the
> principal constructor and that equals or hashCode do not need
> nullchecks.

Not really a fair comparison.
The keywords suggested like "non-final" are not adding new concepts;
they are merely making explicit something that was previously implicit.
They are more akin to allowing the "package" keyword to describe the
default accessibility of class fields, rather than a new feature.

Non-nullability, on the other hand, is indeed a new feature. And, it is
not a simple property of fields (well, it could be, but I suspect users
would find that to be an unsatisfying interpretation of "support for
non-nullity.")

Ignoring that, what I think you're saying is "You don't need to do
full-blown nullity propagation, you could simply have an 'annotation'
that triggers a null check on data class constructors." And if that's
what you're saying, I agree, we could do that. But I suspect the
limitations of this would very quickly turn opinion negative. (Can't use
it on non-data classes. Can't use it on locals. Using it on non-final
fields starts to get messy fast -- do we have to do this on all field
writes? Etc.)

From mark at io7m.com Thu Nov 2 15:19:56 2017
From: mark at io7m.com (Mark Raynsford)
Date: Thu, 2 Nov 2017 15:19:56 +0000
Subject: Data classes
In-Reply-To: <74a5059d-65e7-5303-bc12-d944bc16ed14@oracle.com>
References: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr>
 <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com>
 <1753394655.1345295.1509633629668.JavaMail.zimbra@u-pem.fr>
 <74a5059d-65e7-5303-bc12-d944bc16ed14@oracle.com>
Message-ID: <20171102151956.0f1516b4@copperhead.int.arc7.info>

On 2017-11-02T11:05:33 -0400
Brian Goetz wrote:

> > In the proposal, you discuss adding keywords like "non-final, unfinal,
> > mutable", i.e. considering that there can be a set of keywords for
> > data class declarations which is different from the one we have on
> > fields.
> > In that spirit, we can have a flag like nullable, maybenull, etc. to
> > specify that the compiler will not generate a requireNonNull inside
> > the principal constructor and that equals or hashCode do not need
> > nullchecks.
>
> Not really a fair comparison. The keywords suggested like "non-final"
> are not adding new concepts; they are merely making explicit something
> that was previously implicit. They are more akin to allowing the
> "package" keyword to describe the default accessibility of class fields,
> rather than a new feature. Non-nullability, on the other hand, is indeed
> a new feature. And, it is not a simple property of fields (well, it
> could be, but I suspect users would find that to be an unsatisfying
> interpretation of "support for non-nullity.")

I think it would be better to get non-nullability language-wide first
as a separate feature.

-- 
Mark Raynsford | http://www.io7m.com

From gavin.bierman at oracle.com Fri Nov 3 10:44:13 2017
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 3 Nov 2017 10:44:13 +0000
Subject: Questions on pattern matching
Message-ID:

Dear Spec Experts,

In our development of the pattern matching feature a number of fairly
fundamental design questions have arisen. In the following series of
emails, I will try to explain each question and describe some of the
solutions that have come to mind. Please do let us know of any thoughts
you might have.

Many thanks,
Gavin

From gavin.bierman at oracle.com Fri Nov 3 10:44:51 2017
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 3 Nov 2017 10:44:51 +0000
Subject: PM design question: Scopes
Message-ID: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>

Scopes

Java has five constructs that introduce fresh variables into scope: the
local variable declaration statement, the for statement, the
try-with-resources statement, the catch block, and lambda expressions.
The first, the local variable declaration statement, introduces
variables that are in scope for the rest of the block in which they are
declared. The others introduce variables that are limited in their
scope.

The addition of pattern matching brings a new expression, matches, and
extends the switch statement. Both these constructs can now introduce
fresh (and, if the pattern match succeeds, definitely assigned (DA))
variables. But the question is: what is the scope of these "pattern"
variables?

Let us consider the pattern matching constructs in turn. First the
switch statement:

    switch (o) {
        case int i: ...
        case ..
    }

What is the scope of the pattern variable i? There is a range of
options.

 - The scope of the pattern variable is from the start of the switch
   statement until the end of the enclosing block.

   In this case the pattern variable is in scope but would be definitely
   unassigned (DU) immediately after the switch statement.

       switch (o) {
           case int i : ... // DA
                        ... // DA
           case T t :   // i is in scope
       }
       ... // i is still in scope and DU

   +ve Simple
   -ve Can't simply reuse a pattern variable in the same switch
       statement (without some form of shadowing)
   -ve Pattern variable poisons the rest of the block

 - The scope of the pattern variable extends only to the end of the
   switch block.

   In this case the pattern variable would be considered DA only for
   the statements between the current case label and the subsequent
   case labeled statement. For example:

       switch (o) {
           case int i : ... // DA
                        ... // DA
           case T t :   // i is in scope but not DA
       }
       ... // i not in scope

   +ve Simple
   +ve Pattern variables not poisoned in subsequent statements in the
       rest of the block
   +ve Similar technique to that used for identifiers declared in a for
       statement (not a new idea)
   -ve Can't simply reuse a pattern variable in the same switch
       statement (without some form of shadowing)

 - The scope of the pattern variable extends only to the next case
   label.

       switch (o) {
           case int i : ... // in scope and DA
                        ... // in scope and DA
           case T i :   // int i not in scope, so can re-use
       }
       ... // i not in scope

   +ve Simple syntactic rule
   +ve Allows reuse of pattern variable in the same switch statement.
   -ve Doesn't make sense for fallthrough

NOTE This final point is important - supporting fallthrough impacts on
what solution we might choose for scoping of pattern variables. (We
could not support fallthrough and instead support OR patterns - a
further design dimension.)

ASIDE Should we support a switch expression, it seems clear that scoping
should be treated in the same way as it is for lambda expressions.

The matches expression is unusual in that it is an expression that
introduces a fresh variable. What is the scope of this variable? We want
it to be more than the expression itself, as we want the following
example code to be correct:

    if (e matches String s) {
        System.out.println("It's a string - " + s);
    }

In other words, the variable introduced by the pattern needs to be in
scope for an enclosing IfThen statement.

However, a match expression could be nested within another expression.
It seems reasonable that the pattern variables are in scope for at
least the rest of the expression. For example:

    (e matches String s || s.length() > 0)

Here the s should be in scope for the subexpression s.length (although
it is not DA). In contrast:

    (e matches String s && s.length() > 0)

Here the s is both in scope and DA for the subexpression s.length.

However, what about the following:

    if (s.length() > 0 && e matches String s) {
        System.out.println(s);
    }

Given the idea that a pattern variable flows from the inside-out to the
enclosing statement, it would appear that s is in scope for the
subexpression s.length, although it is not DA. Unless we want scopes to
be non-contiguous, we will have to accept this rather odd situation
(consider where s shadows a field). [This appears to be what happens in
the current C# compiler.]
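The && case discussed above can be tried with a small sketch. It assumes
the instanceof pattern form that later shipped in Java 16 as a stand-in
for the matches expression of this message; the scoping rules that
shipped are not necessarily any of the options under discussion here.
ScopeDemo and describe are hypothetical names.

```java
final class ScopeDemo {
    static String describe(Object e) {
        // s is usable on the right-hand side of && because that operand is
        // only evaluated when the type test on the left has succeeded.
        if (e instanceof String s && s.length() > 0) {
            return "non-empty string: " + s;
        }
        return "something else";
    }

    public static void main(String[] args) {
        System.out.println(describe("hello"));
        System.out.println(describe(""));
        System.out.println(describe(42));
    }
}
```

Replacing && with || in the condition is rejected by the Java 16
compiler, since s would be used on a path where the match has failed.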
Now let's consider how far a pattern variable flows with respect to its
enclosing statement. We have a range of options:

 (a) The scope is both the statement that the match expression occurs
     in and the rest of the block. In this scenario,

         if (o matches T t) {
             ...
         } else {
             ...
         }

     is treated as equivalent to the following pseudo-code (where
     match-and-bind is a fictional pattern matching construct that
     pattern-matches and binds to a variable that has already been
     declared)

         T t;
         if (o match-and-bind t) {
             // t in scope and DA
         } else {
             // t in scope and DU
         }
         // t in scope and DU

     This is how the current C# compiler works (although the spec
     describes the next option; so perhaps this is a bug).

 (b) The scope is just the statement that the match expression occurs
     in. In this scenario,

         if (o matches T t) {
             ...
         } else {
         }
         ...

     is treated as equivalent to the pseudo-code

         {
             T t;
             if (o match-and-bind t) {
                 // t in scope and DA
             } else {
                 // t in scope and DU
                 // thus declaration int t = 42; is not allowed.
             }
         }
         // t not in scope
         ...

     This restricted scope allows reuse of pattern variables, e.g.

         if (o matches T x) { ... }
         if (o matches S x) { ... }

 (c) The scope of the pattern variable is determined by a flow analysis
     of the enclosing statement. (It could be thought of as a
     refinement of option b.) This is currently implemented in the
     prototype compiler. For example:

         if (!!(o matches T t)) {
             // t in scope
         } else {
             // t not in scope
         }

     +ve Code will work in the presence of most refactorings
     +ve We have this code working already :-)
     -ve This is a break from the existing notion of scope as a
         contiguous program fragment. A scope can now have holes in it.
         Will users ever understand this? (Although they are very
         similar to the flow-based rules for DA/DU.)

ASIDE Regardless of whether we opt for (b) or (c) we may consider a
further extension where we allow the scope to extend beyond the current
statement for the case of an unbalanced if statement. For example

```
if (!(o matches T t)) {
    return;
}
// t in scope
...
return;
```

+ve Supports a common idiom where else blocks are not needed
-ve Yet further complication of the notion of scope.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gavin.bierman at oracle.com Fri Nov 3 10:45:30 2017
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 3 Nov 2017 10:45:30 +0000
Subject: PM design question: Shadowing
Message-ID: <20910035-3C63-4561-BB54-73B8162B0910@oracle.com>

Shadowing

As mentioned in the previous email, Java currently has five constructs
that declare fresh variables. All five declare variables under the same
shadowing regime: (i) no shadowing of formal parameters, (ii) no
shadowing of other locally declared variables, but (iii) shadowing
permitted of fields.

Thus we would expect pattern variables to shadow fields. But such a
decision has some interesting consequences. For example, if we adopt a
flow-like scoping strategy (c or d in the previous email), then the
following code has some subtle behaviour.

    // field i in scope
    switch (o) {
        case Integer i : System.out.print(i); // shadows field i
                         break;
        case T t :       System.out.print(i); // field i
    }

Is this too confusing?

We could also consider allowing variables to shadow other variables when
they are in scope and DU. For example, if we adopt scoping strategy (b)
from the previous email - where the scope of a pattern variable is the
entire enclosing statement - the following code would be allowed.

    if (o matches T t) {
        // t in scope and DA
    } else {
        // t in scope and DU
        if (o1 matches Integer t) {
            // Integer t shadows T t
        }
    }

Should we restrict this new notion of shadowing to pattern variables
only, or change it for all variables in Java? Would this be a step too
far?
-------------- next part --------------
An HTML attachment was scrubbed...
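The field-shadowing subtlety raised in the Shadowing message can be
reproduced with a short sketch. It assumes Java 16 instanceof patterns
in an if/else rather than the switch and matches forms of the prototype,
so it illustrates the concern and is not the prototype's behaviour.
ShadowDemo and show are hypothetical names.

```java
final class ShadowDemo {
    private final int i = -1; // a field that happens to be named i

    String show(Object o) {
        if (o instanceof Integer i) {
            // Here i is the pattern variable, which shadows the field.
            return "pattern variable: " + i;
        } else {
            // The pattern variable is not in scope here, so i is the field.
            return "field: " + i;
        }
    }

    public static void main(String[] args) {
        ShadowDemo demo = new ShadowDemo();
        System.out.println(demo.show(7));
        System.out.println(demo.show("seven"));
    }
}
```

The same identifier names two different variables in adjacent branches,
which is the behaviour the message asks about.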
URL:

From gavin.bierman at oracle.com Fri Nov 3 10:46:10 2017
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 3 Nov 2017 10:46:10 +0000
Subject: Patterns design question: Nulls
Message-ID:

Nulls and pattern matching

The null value is treated somewhat inconsistently in Java; unfortunately
pattern matching places a fresh focus on these inconsistencies. Consider
a local variable, String s = null;. Currently s instanceof String
returns false, whereas (String) s succeeds, and further
switch (s) { case "Hello": ... } throws an NPE. Unfortunately, we need
to maintain these behaviours whilst providing a consistent story for
patterns.

So far, we have essentially two choices on the table. One based on what
might be called a pragmatic navigation of existing choices; and another
more sophisticated one based on static type information.

(In what follows, we assume a declaration T t = null; is in scope.)

Option 1.

  Matches

  t matches Object o. To keep it consistent with instanceof this must
  return false.

  Switch

  switch is retconned to not throw an NPE when given a null. However,
  all switches with reference-typed selector expressions are considered
  to have an implicit case null: throw new NullPointerException();
  clause as the first clause in the switch. If the user supplies a null
  clause (which means it must be a pattern-matching switch) then this
  replaces the implicit clause. [An alternative to this is to introduce
  non-null type tests, which we fear would quickly become unwieldy.]

  Note that this addresses a problem that has been brought up on the
  external mailing list.
Currently:

    static void testSwitchInteger(Integer i) {
        switch(i) {
            case 1: System.out.println("One"); break;
            default: System.out.println("Other"); break;
        }
    }

    static void testSwitchNumber(Number i) {
        switch(i) {
            case 1: System.out.println("One"); break;
            default: System.out.println("Other"); break;
        }
    }

    testSwitchNumber(null);  // prints "Other"
    testSwitchInteger(null); // NPE

The Integer case is an old-style switch, so it throws an NPE. The Number case is a pattern-matching switch, so without the insertion of an implicit null clause, it would actually match the default clause (this is the behaviour of the current prototype).

ASIDE Adding a null clause has an impact on the dominance analysis in pattern matching. A null pattern must appear before a type test pattern.

Nested/Destructuring patterns

As discussed earlier, t matches Object o returns false. But unfortunately new Box(t) matches Box(Object o) really ought to return true. (Both because this is what we feel would be expected, and to be consistent with the expected semantics of extractors.) In other words, the semantics of matching against null is not compositional. Note also that the null value never matches against a nested pattern.

We might expect a translation to proceed something like the following.

    e matches Box(Object o)
    -> e matches Box && (e.contents matches null as o || e.contents matches Object o)
    -> e instanceof Box && (e.contents == null || e.contents instanceof Object)

(Note the rarely seen as pattern in the intermediate pattern.)

Option 2.

We can use the static type information to classify pattern matches, which ultimately determines how the matching is translated. For example:

    if (t matches U u) { // where T <: U
        ...
    }

Notice here that the pattern match is guaranteed to succeed, as T is a subtype of U. We can classify this as a type restatement pattern, and compile it essentially to the following:

    if (true) {
        U u = t;
        ...
    }

In other words, whether the expression (o matches U u) succeeds depends on the static type of o: if the static type of o is a subtype of U then it evaluates to true, even for the value null. If it is not statically a subtype of U then its runtime type is tested as normal, and null would fail.

ASIDE The choice of null matching also impacts our reachability analysis. For example:

    Integer i = ...;
    switch (i) {
        case Integer j: {
            System.out.println(j);
            break;
        }
        default: System.out.println("Something else");
    }

Is the default case reachable? If the type test matches null then it is unreachable; otherwise it is reachable.

From gavin.bierman at oracle.com Fri Nov 3 10:46:40 2017
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 3 Nov 2017 10:46:40 +0000
Subject: Patterns design question: Generics
Message-ID:

Generics

A problem related to the issue of null and pattern matching is dealing with patterns mentioning generic types. Currently, it is forbidden to use instanceof with a non-reifiable type. However, we suspect that Java programmers would expect the following to work:

    ArrayList al = ...
    if (al matches List li) {
        ...
    }

Whereas perhaps it is to be expected that the following is suspect:

    Object o = ...
    if (o matches List li) {
        // How could we perform this test?
    }

The type restatement distinction that we introduced in the previous email for dealing with null provides a way forward. More formally, given an expression e matches U u where e has type T:

 - If T is assignment convertible to U then this is a type restatement pattern match, and is allowed regardless of the type U (even if it is non-reifiable).
 - If T is cast convertible to U, but not assignment convertible, then we emit a warning/error as per the cast conversion rules.

Do we have any other design options here?
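Editorial sketch (not part of the original thread): the two generics cases above can be approximated in today's Java without any pattern syntax. The class and method names are hypothetical, and the Integer type arguments are an assumption, since the archive appears to have dropped the type arguments from the examples. The first method is the "type restatement" case, a plain assignment conversion that needs no runtime test. The second shows that, starting from Object, only the erased type can be tested, so a list of strings is indistinguishable from a list of integers.

```java
import java.util.ArrayList;
import java.util.List;

final class GenericRestatement {
    // Type restatement: ArrayList<Integer> is assignment-convertible to
    // List<Integer>, so no runtime check is needed at all.
    static List<Integer> restate(ArrayList<Integer> al) {
        List<Integer> li = al;
        return li;
    }

    // Starting from Object, only the erased type List can be tested at
    // runtime; the element type is not available to instanceof.
    static boolean looksLikeList(Object o) {
        return o instanceof List<?>;
    }

    public static void main(String[] args) {
        ArrayList<Integer> ints = new ArrayList<>();
        ints.add(42);
        List<String> strings = new ArrayList<>();
        strings.add("not an Integer");

        System.out.println(restate(ints).get(0));
        // Both calls give the same answer, which is why a dynamic test
        // against a parameterized list type is suspect.
        System.out.println(looksLikeList(ints));
        System.out.println(looksLikeList(strings));
    }
}
```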
From gavin.bierman at oracle.com Fri Nov 3 10:47:24 2017
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Fri, 3 Nov 2017 10:47:24 +0000
Subject: Patterns design question: Primitive type tests
Message-ID:

Primitive type-test patterns

Given that patterns include constant expressions, and type tests possibly including generic types, it seems reasonable to consider the possibility of allowing primitive type tests in pattern matching. (This answers a sometimes-requested feature: can instanceof support primitive types?)

However, it is not wholly obvious what this test might mean. One possibility is that a "type-restating" equivalent for primitive type-test patterns is assignment conversion; e.g. if I have

    case int x:

then a target whose static type is byte, short, char, or int (or their boxes) will be statically deemed to match.

A target whose dynamic type can be assigned to the primitive type through a combination of unboxing and widening (again, assignment conversion) matches a primitive type test. So if we have:

    switch (o) {
        case int i: ...

we have to do instanceof tests against {Integer, Short, Character, Byte} to determine a match.

A primitive type test pattern dominates other primitive type patterns according to assignment compatibility; int dominates byte/short/char, long dominates int/byte/short/char, and double dominates float.

A primitive type test pattern is inapplicable (dead) if cast conversion from the static type of the target fails:

    Map m;
    switch (m) {
        case int x: // compile error
    }

The dominance interaction between primitive type-tests and reference type-tests for the wrapper types (and their supertypes) seems messy. Consider the following combinations:

    case int n:
    case Integer n: // dead

    case Integer n:
    case int n: // not dead -- still matches Short, Byte

    case Byte b:
    case byte b: // dead

    case Number n:
    case int n: // dead

Is there some unifying theory that makes sense here?
One possibility is to take a more denotational view: a type is a set of values, so type restatement is really about semantic set inclusion, and dynamic testing is about set membership. Is this adding too much complexity? Do developers really care about this feature?

From forax at univ-mlv.fr Fri Nov 3 12:19:09 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 3 Nov 2017 13:19:09 +0100 (CET)
Subject: Data classes
In-Reply-To: <74a5059d-65e7-5303-bc12-d944bc16ed14@oracle.com>
References: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr> <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com> <1753394655.1345295.1509633629668.JavaMail.zimbra@u-pem.fr> <74a5059d-65e7-5303-bc12-d944bc16ed14@oracle.com>
Message-ID: <1480459040.1979843.1509711549963.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: forax at univ-mlv.fr
> Cc: "amber-spec-experts"
> Sent: Thursday, November 2, 2017 16:05:33
> Subject: Re: Data classes

>> In the proposal, you discuss adding keywords like "non-final, unfinal,
>> mutable", i.e. considering that there can be a set of keywords for
>> data class declarations which is different from the one we have on fields. In
>> that spirit, we can have a flag like nullable, maybenull, etc. to specify that
>> the compiler will not generate a requireNonNull inside the principal
>> constructor and that equals or hashCode do not need null checks.
>
> Not really a fair comparison. The keywords suggested like "non-final"
> are not adding new concepts; they are merely making explicit something
> that was previously implicit. They are more akin to allowing the
> "package" keyword to describe the default accessibility of class fields,
> rather than a new feature. Non-nullability, on the other hand, is indeed
> a new feature. And, it is not a simple property of fields (well, it
> could be, but I suspect users would find that to be an unsatisfying
> interpretation of "support for non-nullity.")

There is no keyword in Java for defining a nullable field, but there is one in SQL, so this is not really a new concept. Anyway, this is not the point here.

>
> Ignoring that, what I think you're saying is "You don't need to do
> full-blown nullity propagation, you could simply have an 'annotation'
> that triggers a null check on data class constructors." And if that's
> what you're saying, I agree, we could do that.

Yes, the point is about encapsulation and boundaries, as you have written. We used to insert Objects.requireNonNull at the start of public methods to validate preconditions. Given that the primary constructor of a data class is now generated, the question is how to tell the compiler to generate those preconditions for me. As I said, it's a little more general than just the constructor, because you want setters to also have that null check, and the compiler can leverage that knowledge to generate better code for equals and hashCode.

> But I suspect the limitations of this would very quickly turn opinion negative. (Can't
> use it on non-data classes. Can't use it on locals. Using it on
> non-final fields starts to get messy fast -- do we have to do this on
> all field writes? Etc.)

Once you generate code, people will want to be able to generate code for non-data classes; it's not something specific to what I'm proposing here. Again, the point is not about fields or local variables; it's about enforcing preconditions at boundaries in generated code.
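Editorial sketch (not part of the original thread): a hand-written class of the kind described above, with the precondition checked at every boundary where a value enters the object. The class and member names are hypothetical; nothing here reflects what a data class would actually generate. Because the field can never be null, equals and hashCode can skip their own null checks.

```java
import java.util.Objects;

final class CheckedPerson {
    private String name; // never null once the constructor has returned

    CheckedPerson(String name) {
        // Precondition enforced at the constructor boundary.
        this.name = Objects.requireNonNull(name, "name");
    }

    String name() {
        return name;
    }

    void setName(String name) {
        // The same precondition is repeated on every write from outside.
        this.name = Objects.requireNonNull(name, "name");
    }

    @Override
    public boolean equals(Object o) {
        // No null check on this.name is needed because of the invariant.
        return o instanceof CheckedPerson && name.equals(((CheckedPerson) o).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    public static void main(String[] args) {
        CheckedPerson p = new CheckedPerson("Ada");
        System.out.println(p.equals(new CheckedPerson("Ada")));
        try {
            p.setName(null);
        } catch (NullPointerException e) {
            System.out.println("rejected null at the boundary");
        }
    }
}
```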
Rémi

From brian.goetz at oracle.com Fri Nov 3 16:53:33 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 3 Nov 2017 12:53:33 -0400
Subject: Survey on primitive type test patterns
Message-ID: <33322e9e-5f7b-e4a4-0fea-e996bf255a38@oracle.com>

You may have seen a survey float by a few days ago, aimed at probing developer intuition about the semantics of primitive type test patterns. The results are here, with over 1000 answers:

    https://www.surveymonkey.com/results/SM-87Y6K3BY8/

BACKGROUND

The real goal of this survey was to probe at how deep the Stockholm Syndrome between Java developers and primitive boxes goes (with an eye towards evaluating the semantic options for how to interpret primitive type-test patterns). And the unfortunate answer was, not surprisingly, pretty deep.

The seam introduced by boxing is deep and terrible; it messes with the most fundamental foundations of computing, numeric equality. Given:

    Long zl = 0L;
    Byte zb = (byte) 0;

We have zl == 0 and zb == 0 but not zl == zb as we would expect from equality being transitive. (The odd bits of floating point, like NaN, mess with this too, but this corner case is more likely to stay in the corner where it belongs.)

Which brings us to question 1: what should a constant pattern 0 match?

    Object anObject = ...
    switch (anObject) {
        case 0: ...
    }

There are basically three options here:

 - It matches (Integer) 0, but not (Long) 0, (Short) 0, etc.
 - It matches all the primitive zeros, because, well, they're all the same zero.
 - Punt; don't allow numeric constant patterns unless we know more about the target type.

The first elevates the primitive boxes (which are really the tail, not the dog) to being the "real" numeric types in this case. This feels like bug bait. The second recognizes that types are merely a convenient way of reasoning about value sets, and the value 0 is a member of the Long value set, the Integer value set, etc. The third hides our head in the sand.
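Editorial sketch (not part of the original thread): the boxing seam described above can be reproduced in plain Java. The class and method names are hypothetical. As far as I can tell, javac rejects a direct zl == zb between a Long and a Byte as incomparable types, so the sketch compares the boxes with equals instead; the value comparison assumes integral boxes only and goes through longValue().

```java
final class BoxedZeros {
    // Box-level comparison: Long.equals only accepts another Long.
    static boolean boxesEqual(Object a, Object b) {
        return a.equals(b);
    }

    // Value-level comparison (assumption: both arguments are integral boxes).
    static boolean valuesEqual(Number a, Number b) {
        return a.longValue() == b.longValue();
    }

    public static void main(String[] args) {
        Long zl = 0L;
        Byte zb = (byte) 0;

        System.out.println(zl == 0);             // true: zl is unboxed, then compared
        System.out.println(zb == 0);             // true: zb is unboxed, then compared
        System.out.println(boxesEqual(zl, zb));  // false: different box classes
        System.out.println(valuesEqual(zl, zb)); // true: same numeric value
    }
}
```

The first two comparisons correspond to treating the boxes as numbers; the third is what falls out of treating the box class as the thing that matters.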
Question 2 is the same, just one level up; how do we interpret "case int" as a type test pattern (or "x matches int", or "x instanceof int")? And again we have the same choices:

 - Pretend "case int" is really "case Integer" (all hail, primitive box types);
 - Interpret "case int" as "are you a member of the value set described by int";
 - Don't allow that question to be asked.

INTERPRETATION OF THE RESULTS

There were three choices that corresponded to "instanceof is about types, full stop": 1/2/4. Each of these choices said, in some manner, that asking "are you an instance of int" was a dumb question. The fourth choice (#3) treated "instanceof int" as "are you an int".

#3 is strictly more expressive than the others; it allows you to ask a sensible question that was previously hard to ask, and get a reasonable answer. (#1 and #2 let you ask a dumb question and get a dumb answer; #4 tells you "that was a dumb question.")

What we were trying to get at was whether, given a Long that holds a number, people think the more important characteristic is that it is a Long (vs Integer, Short, Byte, or Character) or its value. This poll told us that the Stockholm Syndrome is so strong that, when given the option to treat boxed numbers as numbers, rather than instances of accidental boxes, 85% chose the latter.

From brian.goetz at oracle.com Fri Nov 3 19:37:20 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 3 Nov 2017 15:37:20 -0400
Subject: Patterns design question: Primitive type tests
In-Reply-To:
References:
Message-ID: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com>

As I outlined in the mail on the survey, I think there are three possible ways to treat primitive type test patterns and numeric constant patterns (when the target type is a reference type):

1. Treat them as if they were synonyms for their box type.
2. Treat them as matching a set of values; for example, "int x" matches integers in the traditional 32 bit range, unboxing numeric targets and comparing their values.
3. Outlaw them, to avoid confusion or to preserve the opportunity to do either (1) or (2) later.

To my mind, #2 is the "right" answer; I think #1 would be a sad answer. But there are two additional considerations I'd add:

 - As the survey showed, there would be a significant education component of choosing #2, and;
 - There isn't really an overwhelming need for being able to say "Is this Object a numeric zero" or "Is this object a boxed primitive in the range of int."

Taken together, these lead me to #3 -- rather than choose between something sad and something that makes developers' heads explode, just do neither. I don't think this is a bad choice.

Concretely, what I'd propose is:

Only allow primitive type test patterns in type-restating contexts. This means that

    switch (anObject) {
        case int x: ...
    }

is no good -- you'd have to say Integer x or Number x or something more specific. But you could say:

    switch (anObject) {
        case Point(int x, int y): ...
    }

because the types of the extracted components of Point are int, and therefore the type test pattern is type-restating (statically provable to match.)

Similarly, for numeric constant patterns, only allow them in switches where the target type is a primitive or a primitive box.

There are ample workarounds where the user can explicitly say what they want, if they need to -- but I don't think it will actually come up very often. And this choice leaves us the option to pursue either #1 or #2 later, if it turns out that we underestimated how often people want to do this.

This also sidesteps the question of dominance, since the confusing cases below (like Integer vs int) will not come up except in situations where we can prove they are equivalent.
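Editorial sketch (not part of the original thread): one way a user could spell out interpretation #2 by hand as one of the "workarounds" mentioned above. The helper name asInt and the choice to accept Long values that fit in 32 bits are assumptions made for illustration; floating-point boxes are deliberately left out. Interpretation #1 would correspond to the first instanceof test alone.

```java
import java.util.OptionalInt;

final class IntValueSet {
    // Returns the int value of the target if it is a boxed integral number
    // whose value lies in the 32-bit int range, otherwise an empty result.
    static OptionalInt asInt(Object target) {
        if (target instanceof Integer) {
            return OptionalInt.of((Integer) target);
        }
        if (target instanceof Short) {
            return OptionalInt.of((Short) target);     // unbox, then widen
        }
        if (target instanceof Byte) {
            return OptionalInt.of((Byte) target);      // unbox, then widen
        }
        if (target instanceof Character) {
            return OptionalInt.of((Character) target); // unbox, then widen
        }
        if (target instanceof Long) {
            long v = (Long) target;
            if (v >= Integer.MIN_VALUE && v <= Integer.MAX_VALUE) {
                return OptionalInt.of((int) v);        // value fits in an int
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(asInt(7L).isPresent());         // Long within int range
        System.out.println(asInt(1L << 40).isPresent());   // Long outside int range
        System.out.println(asInt("seven").isPresent());    // not a number at all
    }
}
```

Writing the test out this way makes the wrapper types and the range check visible in the source, which is exactly what a bare "case int x" against an Object target would hide.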
From forax at univ-mlv.fr Fri Nov 3 20:30:17 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 3 Nov 2017 21:30:17 +0100 (CET)
Subject: Patterns design question: Primitive type tests
In-Reply-To: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com>
References: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com>
Message-ID: <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr>

I'm happy with choice #3 too.

#2 is a sad choice because its semantics is not explicit: #2 means instanceof + unboxing + widening, but nowhere in the syntax does the wrapper type used for the instanceof and the unboxing appear. Not having the wrapper type mentioned doesn't pass my semantics smell check.

regards,
Rémi
From brian.goetz at oracle.com Fri Nov 3 20:53:50 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 3 Nov 2017 16:53:50 -0400
Subject: Patterns design question: Primitive type tests
In-Reply-To: <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr>
References: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com> <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr>
Message-ID:

Note that this is really just about primitives, as they are the only ones whose value sets have non-trivial intersection. Value types, being non-polymorphic and having no nontrivial overlap, won't have this problem.

Arguably, for strongly typed literals ("case 0.0f"), we could allow them against a target type of Object or Number, since there's only one type they could mean, but I don't see the return-on-spec-complexity here.

From amaembo at gmail.com Fri Nov 3 21:10:52 2017
From: amaembo at gmail.com (Tagir Valeev)
Date: Sat, 4 Nov 2017 00:10:52 +0300
Subject: default branch placement in switch
Message-ID:

Hello!

Currently the default branch can be placed anywhere inside the switch statement, e.g. like this:

    switch(i) {
        case 1: System.out.println("one");break;
        default: System.out.println("other");break;
        case 2: System.out.println("two");break;
    }

In this case, the behavior does not depend on the order of the case blocks. However, in pattern matching the order of cases usually matters: if some pattern matches, this means that the subsequent patterns will not be checked. Does this mean that with pattern matching the default branch makes all the subsequent case blocks unreachable? Or can default still be located anywhere, and is it checked only after every other pattern?

With best regards,
Tagir Valeev

From brian.goetz at oracle.com Fri Nov 3 21:25:18 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 3 Nov 2017 17:25:18 -0400
Subject: default branch placement in switch
In-Reply-To:
References:
Message-ID: <4740da81-8d99-645f-fa22-f587b6433278@oracle.com>

Yeah, this has to change. In existing switches, there are no overlapping case labels other than default, so order is irrelevant. But now that patterns have overlapping match-sets, default should be considered to dominate other cases, so it should go last.
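Editorial sketch (not part of the original thread): the claim that order is irrelevant in existing switches can be checked with two methods that differ only in where default is placed. The class and method names are hypothetical; the switch bodies follow the example in the question above.

```java
final class DefaultPlacement {
    // default sits between the two constant labels, as in the question above.
    static String defaultInMiddle(int i) {
        String result;
        switch (i) {
            case 1: result = "one"; break;
            default: result = "other"; break;
            case 2: result = "two"; break;
        }
        return result;
    }

    // The same switch with default moved to the end.
    static String defaultLast(int i) {
        String result;
        switch (i) {
            case 1: result = "one"; break;
            case 2: result = "two"; break;
            default: result = "other"; break;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 3; i++) {
            // Both methods agree for every input, because constant labels
            // never overlap and default is only taken when none of them match.
            System.out.println(i + ": " + defaultInMiddle(i) + " / " + defaultLast(i));
        }
    }
}
```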
Compatibility-wise, we have two choices for how to get there; carve out a permanent exception for switches where all cases are type-restating constant patterns, or plan to eventually get to a place where default always comes last, even for "int" switches. If we want to get to the latter, we should start warning on this construct now. On 11/3/2017 5:10 PM, Tagir Valeev wrote: > Hello! > > Currently the default branch can be placed in any place inside the > switch operator, e.g. like this: > > switch(i) { > case 1: System.out.println("one");break; > default: System.out.println("other");break; > case 2: System.out.println("two");break; > } > > In this case behavior does not change on the order of case blocks. > However in pattern matching the order of cases usually matters: if > some pattern matches, this means that the subsequent patterns will not > be checked. Does this mean that with pattern matching the default > branch makes all the subsequent case blocks unreachable? Or default > can still be located anywhere and is checked only after any other > pattern? > > With best regards, > Tagir Valeev From forax at univ-mlv.fr Fri Nov 3 23:27:13 2017 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 4 Nov 2017 00:27:13 +0100 (CET) Subject: Patterns design question: Primitive type tests In-Reply-To: References: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com> <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr> Message-ID: <261994695.2272732.1509751633375.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "Gavin Bierman" , "amber-spec-experts" > > Envoy?: Vendredi 3 Novembre 2017 21:53:50 > Objet: Re: Patterns design question: Primitive type tests > Note that is really just about primitives, as they are the only ones whose value > sets have non-trivial intersection. Value types, being non-polymorphic and > having no nontrivial overlap, won't have this problem. 
I can be a mess with value-types too if there is a way to link a value-type to its box (or vice-versa like in the MVT, so we have the operation box and unbox) and boxes and value types can declare different interfaces (you can not do that with the current prototype because a value type can not declare interfaces yet), you have reproduced exactly the same issue as with primitives and their corresponding wrappers. Forcing a value type and its box to have the same set of interfaces fix the issue. > Arguably, for strongly typed literals ("case 0.0f"), we could allow them against > a target type of Object or Number, since there's only one type they could mean, > but I don't see the return-on-spec-complexity here. I agree. R?mi > On 11/3/2017 4:30 PM, Remi Forax wrote: >> I'm happy with choice #3 too. >> #2 is a sad choice because this semantics is not explicit, >> #2 means instanceof + unboxing + widening but nowhere in the syntax the wrapper >> type used for the instanceof and the unboxing appears. Not having the wrapper >> type mentioned doesn't pass my semantics smell check. >> regards, >> R?mi >>> De: "Brian Goetz" [ mailto:brian.goetz at oracle.com | ] >>> ?: "Gavin Bierman" [ mailto:gavin.bierman at oracle.com | >>> ] , "amber-spec-experts" [ >>> mailto:amber-spec-experts at openjdk.java.net | >>> ] >>> Envoy?: Vendredi 3 Novembre 2017 20:37:20 >>> Objet: Re: Patterns design question: Primitive type tests >>> As I outlined in the mail on the survey, I think there are three possible ways >>> to treat primitive type test patterns and numeric constant patterns (when the >>> target type is a reference type): >>> 1. Treat them as if they were synonyms for their box type. >>> 2. Treat them as matching a set of values; for example, "int x" matches integers >>> in the traditional 32 bit range, unboxing numeric targets and comparing their >>> values. >>> 3. Outlaw them, to avoid confusion or to preserve the opportunity to do either >>> (1) or (2) later. 
>>> For my mind, I think #2 is the "right" answer; I think #1 would be a sad answer. >>> But, there are two additional considerations I'd add: >>> - As the survey showed, there would be a significant education component of >>> choosing #2, and; >>> - There isn't really an overwhelming need for being able to say "Is this Object >>> a numeric zero" or "Is this object a boxed primitive in the range of int." >>> Taken together, these lead me to #3 -- rather than choose between something sad >>> and something that makes developers heads explode, just do neither. I don't >>> think this is a bad choice. >>> Concretely, what I'd propose is: >>> Only allow primitive type test patterns in type-restating contexts. This means >>> that >>> switch (anObject) { >>> case int x: ... >>> } >>> is no good -- you'd have to say Integer x or Number x or something more >>> specific. But you could say: >>> switch (anObject) { >>> case Point(int x, int y): ... >>> } >>> because the types of the extracted components of Point are int, and therefore >>> the type test pattern is type-restating (statically provable to match.) >>> Similarly, for numeric constant patterns, only allow them in switches where the >>> target type is a primitive or a primitive box. >>> There are ample workarounds where the user can explicitly say what they want, if >>> they need to -- but I don't think it will actually come up very often. And this >>> choice leaves us the option to pursue either #1 or #2 later, if it turns out >>> that we underestimated how often people want to do this. >>> This also sidesteps the question of dominance, since the confusing cases below >>> (like Integer vs int) will not come up except in situations where we can prove >>> they are equivalent. 
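For readers who want to see what option #2 would cost at run time, the following is a minimal sketch in present-day Java of the "instanceof + unboxing + widening" reading described in this thread. The class `IntValueMatch` and the helper `matchInt` are hypothetical names invented for this illustration; they are not part of any proposal, and the wider reading of #2 (any boxed number whose value fits in the int range, such as a small Long) is deliberately not covered.

```java
import java.util.OptionalInt;

// Hypothetical illustration only: emulates "case int x" against an Object
// target by testing each wrapper whose primitive type widens to int.
final class IntValueMatch {

    // Returns the matched int value, or empty if the target is not a box
    // of byte, short, char, or int.
    static OptionalInt matchInt(Object target) {
        if (target instanceof Integer) {
            return OptionalInt.of(((Integer) target).intValue());
        }
        if (target instanceof Short) {
            // unbox to short, then widen to int
            return OptionalInt.of(((Short) target).shortValue());
        }
        if (target instanceof Byte) {
            // unbox to byte, then widen to int
            return OptionalInt.of(((Byte) target).byteValue());
        }
        if (target instanceof Character) {
            // unbox to char, then widen to int
            return OptionalInt.of(((Character) target).charValue());
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(matchInt((short) 7)); // a boxed Short
        System.out.println(matchInt(7L));        // a boxed Long
    }
}
```

Note that none of the wrapper types tested above appears in the source-level pattern "int x", which is the objection raised against #2 earlier in this message.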
>>> On 11/3/2017 6:47 AM, Gavin Bierman wrote: >>>> Primitive type-test patterns >>>> Given that patterns include constant expressions, and type tests possibly >>>> including generic types; it seems reasonable to consider the possibility of >>>> allowing primitive type tests in pattern matching. (This answers a >>>> sometimes-requested feature: can instanceof support primitive types?) >>>> However, it is not wholly obvious what this test might mean. One possibility is >>>> that a ?type-restating? equivalent for primitive type-test patterns is >>>> assignment conversion; e.g. if I have >>>> case int x: >>>> then a target whose static type is byte , short , char , or int ? or their boxes >>>> ? will be statically deemed to match. >>>> A target whose dynamic type can be assigned to the primitive type through a >>>> combination of unboxing and widening (again, assignment conversion) matches a >>>> primitive type test. So if we have: >>>> switch (o) { >>>> case int i: ... >>>> we have to do instanceof tests against { Integer , Short , Character , Boolean } >>>> to determine a match. >>>> A primitive type test pattern dominates other primitive type patterns according >>>> to assingment compatibility; int dominates byte / short / char , long dominates >>>> int / byte / short / char , and double dominates float . >>>> A primitive type test pattern is inapplicable (dead) if cast conversion from the >>>> static type of the target fails: >>>> Map m; >>>> switch (m) { >>>> case int x: // compile error >>>> } >>>> The dominance interaction between primitive type-tests and reference type-tests >>>> for the wrapper types (and their supertypes) seems messy. Consider the >>>> following combinations: >>>> case int n: >>>> case Integer n: // dead >>>> case Integer n: >>>> case int n: // not dead -- still matches Short, Byte >>>> case Byte b: >>>> case byte b: // dead >>>> case Number n: >>>> case int n: // dead >>>> Is there some unifying theory that makes sense here? 
One possibility is to take
>>>> a more denotational view: a type is a set of values, so type restatement is
>>>> really about semantic set inclusion, and dynamic testing is about set
>>>> membership. Is this adding too much complexity? Do developers really care about
>>>> this feature?

From brian.goetz at oracle.com Sat Nov 4 15:15:37 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 4 Nov 2017 11:15:37 -0400
Subject: PM design question: Scopes
In-Reply-To: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
Message-ID: <8a278aea-3bc8-d41d-0dab-98de8508efab@oracle.com>

This mail is mostly about constraints; suggestions for addressing them will come separately.

An existing roadblock on scopes is that the scope of a local declared in a switch today is the entire switch, even though it is DU in most of the switch:

    switch (x) {
        case 1:
            int a = 3;
            break;
        case 2:
            use(a);  // error, not DA
            break;
        case 3:
            int a = 4; // error, a already in scope
    }

This was an unfortunate decision (that was part and parcel of the overly-literal copying of switch semantics from C), but something we have to at least minimally accommodate.

However, as the number of locals in switches increases (each case may have several binding variables), this status quo will become more and more annoying; users will not want to do this:

    case Foo(var a): ... break;
    case Bar(var aa): ... break;
    case Baz(var aaa): ... break;

when it's "obvious" the various binding variables are disjoint. Users are going to want to be able to reuse binding variables, and reasonably so; they may even want to reuse the same name with a different type in different cases:

    case Integer n:
    case Long n:
    case Float n:

And again, this is reasonable.
So any solution we come up with should accommodate this desire.

Finally, there is the matter of unbalanced ifs. I would really like to accommodate this use case:

    if (!(x matches Foo(var a)))
        throw new NotFooException();
    // use a

To accommodate this, at least for unbalanced ifs, the scope of binding variables declared in the conditional would have to extend to the end of the enclosing block, as if the unbalanced if were desugared into a balanced one:

    if (!(x matches Foo(var a)))
        throw new NotFooException();
    else {
        // rest of block goes here
        // use a
    }

OTOH, if we do this, then for pattern variables that are DU after the if (imagine we hadn't inverted the condition), they will be polluting the scope after the if, since they will be in scope but not DA, much like the existing rules regarding locals declared inside switches.

The flow-scoping rules (alluded to, but not fully written out, in Gavin's note) are beautiful, and result in binding variables being in scope wherever they make sense to be (when they are DA), and not in scope where they are not, but I worry they are a bit too un-Java-ish (even though they are really just DA/DU in a separate guise.) And the fact that they leave "scoping holes" is disturbing to some people. There are a few ways to deal with this.

As an additional constraint, we'd like it to be the case that, if you refactor a switch into an if-else chain, the scopes of binding variables are as consistent as possible between the two different ways to say the same thing.

On 11/3/2017 6:44 AM, Gavin Bierman wrote:
>
>
> Scopes
>
> Java has five constructs that introduce fresh variables into scope:
> the local variable declaration statement, the for statement, the
> try-with-resources statement, the catch block, and lambda expressions.
> The first, local variable declaration statements, introduce variables
> that are in scope for the rest of the block that it is declared in.
> The others introduce variables that are limited in their scope. > > The addition of pattern matching brings a new expression, |matches|, > and extends the |switch|?statement. Both these constructs can now > introduce fresh (and, if the pattern match succeeds, definitely > assigned (DA)) variables. But the question is /what is the scope of > these ?pattern? variables/? > > Let us consider the pattern matching constructs in turn. First the > |switch|?statement: > > |switch (o) { case int i: ... case .. }| > > What is the scope of the pattern variable |i|? There are a range of > options. > > 1. > > The scope of the pattern variable is from the start of the switch > statement until the end of the enclosing block. > > In this case the pattern variable is in scope but would be > definitely unassigned (DU) immediately after the switch statement. > > |switch (o) { case int i : ... // DA ... // DA case T t : // i is > in scope } ... // i in still in scope and DU| > > * *+ve*?Simple > * *-ve*?Can?t simply reuse a pattern variable in the same switch > statement (without some form of shadowing) > * *-ve*?Pattern variable poisons the rest of the block > > 2. > > The scope of the pattern variable extends only to the end of the > switch block. > > In this case the pattern variable would be considered DA only for > the statements between the current case label and the subsequent > case labeled statement. For example: > > |switch (o) { case int i : ... // DA ... // DA case T t : // i is > in scope but not DA } ... // i not in scope| > > * *+ve*?Simple > * *+ve*?Pattern variables not poisoned in subsequent statements in > the rest of the block > * *+ve*?Similar technique to |for|?identifiers (not a new idea) > * *-ve*?Can?t simply reuse a pattern variable in the same switch > statement (without some form of shadowing) > > 3. > > The scope of the pattern variable extends only to the next case label. > > |switch (o) { case int i : ... // in scope and DA ... 
// in scope > and DA case T i : // int i not in scope, so can re-use } ... // i > not in scope| > > * *+ve*?Simple syntactic rule > * *+ve*?Allows reuse of pattern variable in the same switch statement. > * *-ve*?Doesn?t make sense for fallthrough > > *NOTE*?This final point is important - supporting fallthrough impacts > on what solution we might choose for scoping of pattern variables. (We > could not support fallthrough and instead support OR patterns - a > further design dimension.) > > *ASIDE*?Should we support a |switch| /expression/; it seems clear that > scoping should be treated in the same way as it is for lambda expressions. > > The |matches|?expression is unusual in that it is an /expression/?that > introduces a fresh variable. What is the scope of this variable? We > want it to be more than the expression itself, as we want the > following example code to be correct: > > |if (e matches String s) { System.out.println("It's a string - " + s); }| > > In other words, the variable introduced by the pattern needs to be in > scope for an enclosing IfThen statement. > > However, a |match|?expression could be nested within another > expression. It seems reasonable that the patterns variables are in > scope for at least the rest of the expression. For example: > > |(e matches String s || s.length() > 0) | > > Here the |s|?should be in scope for the subexpression > |s.length|?(although it is not DA). In contrast: > > |(e matches String s && s.length() > 0)| > > Here the |s|?is both in scope and DA for the subexpression |s.length|. > > However, what about the following: > > |if (s.length() > 0 && e matches String s) { System.out.println(s); }| > > Given the idea that a pattern variable flows from the inside-out to > the enclosing statement, it would appear that |s|?is in scope for the > subexpression |s.length|; although it is not DA. 
Unless we want scopes > to be non-contiguous, we will have to accept this rather odd situation > (consider where |s|?shadows a field). [This appears to be what happens > in the current C# compiler.] > > Now let?s consider how far a pattern variable flows wrt its enclosing > statement. We have a range of options: > > 1. > > The scope is both the statement that the match expression occurs > in and the rest of the block. In this scenario, > > |if (o matches T t) { ... } else { ... }| > > is treated as equivalent to the following pseudo-code (where > |match-and-bind|?is a fictional pattern matching construct that > pattern-matches and binds to a variable that has already been > declared) > > |T t; if (o match-and-bind t) { // t in scope and DA } else { // t > in scope and DU } // t in scope and DU| > > This is how the current C# compiler works (although the spec > describes the next option; so perhaps this is a bug). > > 2. > > The scope is just the statement that the match expression occurs > in. In this scenario, > > |if (o matches T t) { ... } else { } ...| > > is treated as equivalent to the pseudo-code > > |{ T t; if (o match-and-bind t) { // t in scope and DA } else { // > t in scope and DU // thus declaration int t = 42; is not allowed. > } } // t not in scope ...| > > This restricted scope allows reuse of pattern variables, e.g. > > |if (o matches T x) { ... } if (o matches S x) { ... }| > > 3. > > The scope of the pattern variable is determined by a flow analysis > of the enclosing statement. (It could be thought of as a > refinement of option b.) This is currently implemented in the > prototype compiler. For example: > > |if (!!(o matches T t)) { // t in scope } else { // t not in scope }| > > * *+ve*?Code will work in the presence of most refactorings > * *+ve*?We have this code working already :-) > * *-ve*?This is a break to the existant notion of scope as a > contiguous program fragment. A scope can now have holes in it. > Will users ever understand this? 
(Although they are /very/?similar > to the flow-based rules for DA/DU.) > > *ASIDE*?Regardless of whether we opt for (b) or (c) we may consider a > further extension where we allow the scope to extend beyond the > current statement for the case of an unbalanced |if|?statement. For > example > > |``` if (!(o matches T t)) { return; } // t in scope ... return; ```| > > * *+ve*?Supports a common idiom where else blocks are not needed > * *-ve*?Yet further complication of notion of scope. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at io7m.com Sat Nov 4 20:46:55 2017 From: mark at io7m.com (Mark Raynsford) Date: Sat, 4 Nov 2017 20:46:55 +0000 Subject: PM design question: Scopes In-Reply-To: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com> References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com> Message-ID: <20171104204655.1008a644@copperhead.int.arc7.info> On 2017-11-03T10:44:51 +0000 Gavin Bierman wrote: > > switch (o) { > case int i : ... // in scope and DA > ... // in scope and DA > case T i : // int i not in scope, so can re-use > } > ... // i not in scope > +ve Simple syntactic rule > +ve Allows reuse of pattern variable in the same switch statement. > -ve Doesn?t make sense for fallthrough > > NOTE This final point is important - supporting fallthrough impacts on what solution we might choose for scoping of pattern variables. (We could not support fallthrough and instead support OR patterns - a further design dimension.) I'm strongly in favour of this one. In my experience and opinion, fallthrough was a mistake in C and it was a mistake to copy it in Java. In some 15 years and close to a million lines of code, I have never once felt the need to fall through a switch case. If fallthrough didn't exist, I'm not sure that there'd even be a question that this was the "right" choice... It essentially makes each branch of the switch appear to match the scoping rules for lambdas. 
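The per-case scoping favoured here can already be approximated in today's Java by wrapping each case body in its own block. A minimal sketch follows; the class and method names are made up for illustration, and nothing here uses pattern matching.

```java
// Illustration only: braces give each case its own scope, so the same
// variable name can be reused with a different type in another case.
final class CaseBlockScopes {

    static String describe(int code) {
        switch (code) {
            case 1: {
                String n = "one";   // 'n' is visible only inside this block
                return n;
            }
            case 2: {
                int n = 2;          // legal: the previous 'n' is out of scope
                return "number " + n;
            }
            default:
                return "other";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(1));
        System.out.println(describe(2));
        System.out.println(describe(3));
    }
}
```

Without the braces, the second declaration of `n` would be rejected, because a local declared in one case stays in scope for the rest of the switch block.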
From a pedagogical standpoint, all of the other languages I know of that have implemented pattern matching never had any kind of fallthrough in the first place, so it'd likely benefit Java to match them (no pun intended!). I'm thinking of people who learned pattern matching via something like ML being able to write Java switches/matches without having to make the mental gear change of inserting these annoying "break" statements everywhere. Is there actually a significant amount of code out there that uses switch case fallthrough? I mean, I know you have to assume that there is in the interest of preserving backwards compatibility, but I'm curious if there are actually any metrics on this. I've never even seen fallthrough in any code in the wild... I was under the impression that modern IDEs warn against fallthroughs by default. -- Mark Raynsford | http://www.io7m.com From mark at io7m.com Sat Nov 4 20:55:07 2017 From: mark at io7m.com (Mark Raynsford) Date: Sat, 4 Nov 2017 20:55:07 +0000 Subject: Patterns design question: Primitive type tests In-Reply-To: <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr> References: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com> <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr> Message-ID: <20171104205507.6d1320c2@copperhead.int.arc7.info> On 2017-11-03T21:30:17 +0100 Remi Forax wrote: > I'm happy with choice #3 too. > Seconded. On 2017-11-03T15:37:20 -0400 Brian Goetz wrote: > > ?- There isn't really an overwhelming need for being able to say "Is > this Object a numeric zero" or "Is this object a boxed primitive in the > range of int." I agree. From my experience, despite the best efforts of autoboxing, many programmers treat primitives and references as being from entirely separate worlds. I'd therefore not really expect people to be matching on arbitrary things and relying on quiet primitive conversions behind the scenes to get the behaviour they want. This is obviously purely subjective! 
--
Mark Raynsford | http://www.io7m.com

From brian.goetz at oracle.com Sat Nov 4 21:31:18 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 4 Nov 2017 17:31:18 -0400
Subject: PM design question: Scopes
In-Reply-To: <20171104204655.1008a644@copperhead.int.arc7.info>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com> <20171104204655.1008a644@copperhead.int.arc7.info>
Message-ID: <924EA483-6115-4BC7-9601-0FD7D694831A@oracle.com>

> On Nov 4, 2017, at 4:46 PM, Mark Raynsford wrote:
>
> On 2017-11-03T10:44:51 +0000
> Gavin Bierman wrote:
>>
>> switch (o) {
>>     case int i : ... // in scope and DA
>>         ... // in scope and DA
>>     case T i : // int i not in scope, so can re-use
>> }
>>
>
> I'm strongly in favour of this one. In my experience and opinion,
> fallthrough was a mistake in C and it was a mistake to copy it in Java.

While you will find few fans of fall through around here, let's not forget that (a) we have it, and (b) the scoping of variables in switch is already reflective of this -- the scope of a local declared in a switch extends to the end of the switch. So we're not going to be able to roll back either of these decisions for existing (primitive, string, box, enum) switches.

Further, the more we can define the rules so that we don't divide the language into "old switch" and "new switch", the more pattern switch looks like a natural extension of the existing language rather than a bifurcation.

> In some 15 years and close to a million lines of code, I have never once
> felt the need to fall through a switch case.

That's true for a lot of developers, but not true for all. The JDK, for example, contains hundreds of switch fallthroughs. I went through a lot of them as part of this project, to see what the usage was, and while some of them are dumb, most are not. They show up often in parsers, for example. But, the farther you move "up the stack", the rarer they are, so they will be a lot more appropriate when switching on chars than on objects.
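To make the parser case concrete, here is a small sketch of the kind of fallthrough that is hard to express as repeated case labels, because a statement sits between the two labels. The example is invented for this note (it is not taken from the JDK), and `SignScanner.parse` assumes a non-empty input consisting of an optional sign followed by decimal digits.

```java
// Illustration only: a char switch where '-' does some work and then
// deliberately falls through to share the '+' handling.
final class SignScanner {

    // Parses an optional sign followed by digits, e.g. "-42", "+7", "13".
    // Assumes the input is non-empty and otherwise well-formed.
    static int parse(String s) {
        int i = 0;
        int sign = 1;
        switch (s.charAt(0)) {
            case '-':
                sign = -1;
                // fall through: a '-' is consumed exactly like a '+'
            case '+':
                i++;
                break;
            default:
                break;
        }
        int value = 0;
        for (; i < s.length(); i++) {
            value = value * 10 + (s.charAt(i) - '0');
        }
        return sign * value;
    }

    public static void main(String[] args) {
        System.out.println(parse("-42"));
        System.out.println(parse("+7"));
        System.out.println(parse("13"));
    }
}
```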
I have no problem saying "OR patterns (repeated case labels) are all we'll need for patterns" (and that's really the only thing that makes sense for expression switch anyway.)

What I take from your mail is that you have two real goals here:

 - Discouraging fall through;
 - Allowing re-use of pattern binding labels in a switch.

But this particular scoping rule is not the only way to get to either of these.

> From a pedagogical standpoint, all of the other languages I know of
> that have implemented pattern matching never had any kind of
> fallthrough in the first place, so it'd likely benefit Java to match
> them (no pun intended!). I'm thinking of people who learned pattern
> matching via something like ML being able to write Java
> switches/matches without having to make the mental gear change of
> inserting these annoying "break" statements everywhere.

You'll get this with expression switch, but I don't realistically see how we can do this for statement switch, unless we want to create a new syntactic form for "match." Which is back to "bifurcation."

> Is there actually a significant amount of code out there that uses
> switch case fall through?

Sadly yes.

From brian.goetz at oracle.com Sat Nov 4 22:02:05 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 4 Nov 2017 18:02:05 -0400
Subject: Patterns design question: Primitive type tests
In-Reply-To: <20171104205507.6d1320c2@copperhead.int.arc7.info>
References: <896b6adf-80dc-b289-9da4-545eec264bad@oracle.com> <117160943.2264144.1509741017437.JavaMail.zimbra@u-pem.fr> <20171104205507.6d1320c2@copperhead.int.arc7.info>
Message-ID: <6DBDA4CD-DB18-478D-8E4D-2ED882832F86@oracle.com>

Here's a little more detail on this choice. Assuming that we:

 - Outlaw primitive type-test patterns, except in type-restating contexts;
 - Outlaw numeric constant patterns, except where the switch target type is a primitive or a box type.
We can now define how to validate whether patterns are potentially applicable (and statically reject those that are not). Let's say that the static type of the target is S.

Type test patterns. A type test pattern for type T is potentially applicable as follows:

 - ref(S) && ref(T): if S is cast-convertible to T
 - prim(S) || prim(T): if T == S, modulo boxing/unboxing
 - val(S): if S == T, modulo boxing
 - ref(S) && val(T): if S is cast-convertible to box(T)

If a pattern is not potentially applicable to the static type of the target, a compiler error ensues.

There are two possibly odd things here:

 - Tight restrictions on when you can use primitive type-test patterns;
 - Asymmetry between primitives and values (when we have them).

Both stem from the fact that primitive types have nontrivially overlapping value sets; type tests for "int x" and "long x" overlap, but type tests for named values never do. If you want to ask whether an Object is an int, you can ask for a specific box type (or a shared supertype like Number) instead, either using a type-test pattern or a static destructuring pattern like Integer.valueOf(int x).

Destructuring patterns. Destructuring patterns for a type T (reference or value) have the same applicability restrictions as type-test patterns for T.

Constant patterns. Constant patterns have applicability restrictions as follows:

 - String literals: if S is cast-convertible to String
 - Enum literals (fully qualified): if S is cast-convertible to the enum type
 - Enum literals (abbreviated): if S is the enum type
 - Class literals: if S is cast-convertible to Class (we may wish to not support Class literals as a constant pattern, though)
 - Numeric literals: if S is a primitive type or box, and the constant is a valid literal for that primitive type
 - Boolean literals: if S is cast-convertible to Boolean (though switches on boolean or Boolean should be disallowed; use if. Rationale to be provided separately.)
- Null literal: if S is a reference type Note that all of this is not about "there are multiple kinds of switch", as much as statically eliminating silly combinations of patterns and targets based on typing. > On Nov 4, 2017, at 4:55 PM, Mark Raynsford wrote: > > On 2017-11-03T21:30:17 +0100 > Remi Forax wrote: > >> I'm happy with choice #3 too. >> > > Seconded. > > On 2017-11-03T15:37:20 -0400 > Brian Goetz wrote: >> >> - There isn't really an overwhelming need for being able to say "Is >> this Object a numeric zero" or "Is this object a boxed primitive in the >> range of int." > > I agree. From my experience, despite the best efforts of autoboxing, > many programmers treat primitives and references as being from entirely > separate worlds. I'd therefore not really expect people to be matching > on arbitrary things and relying on quiet primitive conversions behind > the scenes to get the behaviour they want. > > This is obviously purely subjective! > > -- > Mark Raynsford | http://www.io7m.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Sat Nov 4 22:20:44 2017 From: brian.goetz at oracle.com (Brian Goetz) Date: Sat, 4 Nov 2017 18:20:44 -0400 Subject: [patterns] AND patterns, OR patterns, fall though Message-ID: In theory, patterns can be combined with AND and OR to produce new patterns, if their target types and binding lists are compatible. Note also that most fallthroughs (those where the case labels immediately follow other case labels, with no intervening statements) can be expressed as OR patterns. Some form of OR patterns are almost a forced move if we want to have expression switches with patterns: int numLetters = switch(s) { case "one", "two" -> 3; case "three" -> 5; ... } Because, while statement switches can simply repeat the labels: case "one": case "two": this idiom looks pretty stupid if we try to transplant it to expression switches: case "one" -> case "two" -> wtf? 
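For comparison, this is how the repeated-label idiom reads in a statement switch in Java as it exists today. It is only a sketch: the class name and the choice to throw on unknown words are assumptions added for this illustration.

```java
// Illustration only: "case X: case Y:" with no intervening statements,
// the form that a comma-separated OR pattern would replace.
final class NumLetters {

    static int numLetters(String s) {
        switch (s) {
            case "one":
            case "two":
                return 3;
            case "three":
                return 5;
            default:
                throw new IllegalArgumentException("unknown word: " + s);
        }
    }

    public static void main(String[] args) {
        System.out.println(numLetters("one"));
        System.out.println(numLetters("three"));
    }
}
```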
OR patterns give us much of what fallthrough gives us; the only difference is the ability to have intervening statements between the case labels. Given that expression switches push us towards OR patterns, why not double down, using this for statement switches, and prohibit fallthrough for statement switches too? This is simpler and covers what seem like the important cases.

In theory, an OR pattern of P and Q would require that both P and Q are applicable to the static type of the target, and (in the most strict interpretation) have identical binding variable lists.

Note that we have a form of OR patterns now, with multi-catch:

    catch (E1 | E2 identifier)

Though, this might not really be what we want an OR pattern to look like, as this looks like the OR of "E1" (no bindings) and "E2" (with bindings), which would fail our restriction on the binding variable lists being the same. An OR pattern would more correctly be written (E1 e | E2 e). (However, we could interpret "E1|E2 identifier" as a union type-test pattern if we wanted to unify catch with patterns.)

The big question is whether we need OR patterns at all, or whether this is merely an artifact of the switch statement. For the matches expression, we can express ORs clearly enough without it:

    if (x matches P || x matches Q)

(and we need to support this anyway.)

If we used comma to separate patterns:

    case 1, 2, 3:
    case Foo x, Bar x, Baz x:
    case Foo(var x), Bar(var x), Baz(var x):

Is that clear enough? Is that unambiguous enough? If this works, this is nice because it works cleanly with existing constant switches too. I think this is pretty good.

So, concrete proposal:

 - Allow multiple patterns to be separated by commas in a case label;
 - Treat "case X: case Y:" as sugar for "case X, Y:" in statement switches;
 - Impose the "same bindings" rule when multiple patterns are combined in this way;
 - Disallow fall through into patterns with binding variables.
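The multi-catch form mentioned above is the one place where Java already has something like an OR over types with a single shared binding. A small sketch of how it behaves today; the `parseOrDefault` helper is made up for illustration.

```java
// Illustration only: one binding 'e' shared by two alternatives.
final class MultiCatchDemo {

    static int parseOrDefault(String s, int fallback) {
        try {
            // s.trim() throws NullPointerException when s is null;
            // Integer.parseInt throws NumberFormatException on bad input.
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException | NullPointerException e) {
            // 'e' is implicitly final, and its static type is a common
            // supertype of the two alternatives.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault(" 12 ", -1));
        System.out.println(parseOrDefault("abc", -1));
        System.out.println(parseOrDefault(null, -1));
    }
}
```

Because the binding is declared once for the whole union, multi-catch sidesteps the "same bindings" question that a general `P, Q` case label has to answer explicitly.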
Note that we don't have to create a new kind of switch here to prohibit fall through; we just don't allow fall through into non-constant pattern cases.

Note that Scala lets you OR multiple patterns together:

    def matcher(l: Foo): String = {
      l match {
        case A() => "A"
        case B(_) | C(_) => "B"
        case _ => "default"
      }
    }

but I'm not sure whether this is really an OR on patterns, or whether this is a "feature" of match? But, this seems a pretty questionable syntax choice, as:

    scala> 1 match {
         |   case 1 | 2 => "one";
         | }
    res0: String = one

    scala> 1 | 2
    res1: Int = 3

So, even though 1|2 is an integer constant whose value is 3, "case 1|2" is an OR pattern.

Similarly, it's even less clear that we need AND patterns. Though I could imagine wanting intersection type-test patterns, like:

    switch (lambda) {
        case Predicate p && Serializable: ...
        case Predicate p: ...
    }

Are there compelling use cases for AND patterns that I'm missing?

From mark at io7m.com Sat Nov 4 22:42:15 2017
From: mark at io7m.com (Mark Raynsford)
Date: Sat, 4 Nov 2017 22:42:15 +0000
Subject: [patterns] AND patterns, OR patterns, fall though
In-Reply-To:
References:
Message-ID: <20171104224215.28f59839@copperhead.int.arc7.info>

On 2017-11-04T18:20:44 -0400
Brian Goetz wrote:

> In theory, patterns can be combined with AND and OR to produce new patterns,

The OR patterns do indeed look good. I'd have to play with a prototype to see what works and what doesn't though.

--
Mark Raynsford | http://www.io7m.com

From Daniel_Heidinga at ca.ibm.com Sun Nov 5 01:10:00 2017
From: Daniel_Heidinga at ca.ibm.com (Daniel Heidinga)
Date: Sun, 5 Nov 2017 01:10:00 +0000
Subject: Patterns design question: Primitive type tests
In-Reply-To:
References: ,
Message-ID:

An HTML attachment was scrubbed...
URL: From brian.goetz at oracle.com Sun Nov 5 11:02:37 2017 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 5 Nov 2017 11:02:37 +0000 Subject: Patterns design question: Primitive type tests In-Reply-To: References: Message-ID: > How is null being handled in pattern matches? Whether the Integer case is dead depends on whether or not it would match null. (Sorry if I've missed an email thread covering this) Gavin posted some notes on the options here the other day ? look for a message with ?null? in the subject. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mark at io7m.com Sun Nov 5 22:36:42 2017 From: mark at io7m.com (Mark Raynsford) Date: Sun, 5 Nov 2017 22:36:42 +0000 Subject: [patterns] AND patterns, OR patterns, fall though In-Reply-To: References: Message-ID: <20171105223642.39857878@copperhead.int.arc7.info> On 2017-11-04T18:20:44 -0400 Brian Goetz wrote: > > Similarly, its even less clear that we need AND patterns. Though I could imagine wanting intersection type-test patterns, like: > > switch (lambda) { > case Predicate p && Serializable: ... > case Predicate p: ... > } > > Are there compelling use cases for AND patterns that I?m missing? I've given this more thought and I still can't find a reason for AND patterns to exist at all. It seems that conjunctions of patterns that would deconstruct values would either be unmatchable, or could be just as easily expressed as single nested patterns. This is because if the two sides of the AND pattern were contradictory, then the pattern could never match. If the two sides weren't contradictory, then they could obviously be expressed as a single pattern. // Assume an Option type that's Some(x) | None __data class T (Option left, Option right) { } switch (t) { // Obviously reducible to (Option(Some(x), Some(y))) case (Option(Some(x), _)) && (Option(_, Some(y))) -> ... // Obviously impossible case (Option(Some(x), _)) && (Option(None, _)) -> ... 
}

This would leave patterns that don't deconstruct values. Basically: instanceof checks. It seems like those could be just as easily expressed as guards/predicates on patterns (I think pattern guards were discussed months back).

--
Mark Raynsford | http://www.io7m.com

From mark at io7m.com Wed Nov 8 16:29:15 2017
From: mark at io7m.com (Mark Raynsford)
Date: Wed, 8 Nov 2017 16:29:15 +0000
Subject: Data classes
In-Reply-To: <1480459040.1979843.1509711549963.JavaMail.zimbra@u-pem.fr>
References: <325340300.1290169.1509629698541.JavaMail.zimbra@u-pem.fr> <8d46174e-5c88-b88b-6488-f52501a23a9e@oracle.com> <1753394655.1345295.1509633629668.JavaMail.zimbra@u-pem.fr> <74a5059d-65e7-5303-bc12-d944bc16ed14@oracle.com> <1480459040.1979843.1509711549963.JavaMail.zimbra@u-pem.fr>
Message-ID: <20171108162915.6cc5a33a@copperhead.int.arc7.info>

On 2017-11-03T13:19:09 +0100
forax at univ-mlv.fr wrote:
>
> Yes, the point is about encapsulation, boundaries, as you have written.
> We used to insert Objects.requireNonNull at the start of public methods to validate preconditions, given that the primary constructor of a data class is now generated, the question is how to tell the compiler to generate those preconditions for me.

Having played around with the prototype today, I have to say that I think Rémi might be right!

It's what I think of as the Haskell effect: When you have a language with very terse syntax, every single piece of boilerplate you have to write seems a lot worse than it otherwise would. If you take a look at a typical Haskell file, it somewhat ironically feels as though there's quite a bit of boilerplate (typeclass instances, module export lists, matching on record values, signatures to some extent) because the rest of the language requires so few keystrokes.

Having used data classes a little, I can see that I'm now almost never going to use the default constructor because I always want to insert calls to Objects.requireNonNull(...).
Those have become the new mildly annoying boilerplate (and yet I'm used to having to write them over and over by hand in normal classes today).

--
Mark Raynsford | http://www.io7m.com

From forax at univ-mlv.fr Sat Nov 11 12:41:50 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 11 Nov 2017 13:41:50 +0100 (CET)
Subject: Patterns design question: Generics
In-Reply-To:
References:
Message-ID: <1542502066.2570726.1510404110527.JavaMail.zimbra@u-pem.fr>

These rules seem fine to me.

Since you ask, the other solution is to introduce a raw type conversion of U, so the first bullet point can be: T has to be cast convertible to raw(U).

regards,
Rémi

> From: "Gavin Bierman"
> To: "amber-spec-experts"
> Sent: Friday, 3 November 2017 11:46:40
> Subject: Patterns design question: Generics
>
> Generics
>
> A related problem to the issue of null and pattern matching is dealing with patterns mentioning generic types. Currently, it is forbidden to use instanceof with a non-reifiable type. However, we suspect that Java programmers would expect the following to work:
>
>   ArrayList al = ...
>   if (al matches List li) {
>     ...
>   }
>
> Whereas perhaps it is to be expected that the following is suspect
>
>   Object o = ...
>   if (o matches List li) {
>     // How could we perform this test?
>   }
>
> The type restatement distinction that we introduced in the previous email for dealing with null provides a way forward.
>
> More formally, given an expression e matches U u where e has type T:
>
> * If T is assignment convertible to U then this is a type restatement pattern match, and is allowed regardless of the type U (even if it is non-reifiable).
> * If T is cast convertible to U, but not assignment convertible, then we emit a warning/error as per the cast conversion rules.
>
> Do we have any other design options here?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Sat Nov 11 13:20:58 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 11 Nov 2017 14:20:58 +0100 (CET)
Subject: Patterns design question: Nulls
In-Reply-To:
References:
Message-ID: <1947114546.2573346.1510406458456.JavaMail.zimbra@u-pem.fr>

I'm for Option 1, given that Option 2 is not a real Option 2; see below.

It's unfortunate that the behavior inside nested patterns is different from the behavior of the switch. As you said, it's the expected semantics, and I think it's not a big deal if we do not allow destructuring inside a match (new Box(t) matches Box(Object o)).

I believe the old switch and the new switch should work the same way. It would be surprising if one kind of switch allowed declaring a case null and another kind did not. Also, about having to put the case null at the top because of the dominance analysis: the compiler can easily recognize 'case null' and remove it from the dominance analysis, meaning it can be put at any position; or we can decide, for example, that if 'default' has to be the last pattern, case null has to be the one just before it.

As far as I understand, your Option 2 is just 1.b, i.e. the semantics are the same as Option 1 and you introduce a specific behavior if the match is detected as being a type restatement. In my opinion, there is a more general question first: do we want to allow type restatement? If the answer is yes, then we can decide whether we want the null behavior to be the one you propose or not.

I'm not sure it's a good idea to allow type restatement in switch:
- transforming a switch into an 'if true' seems arcane; if we want a type restatement syntax, why use switch and not another construct (a let expression?), which would be clearer.
- it goes in the opposite direction of the introduction of 'var' in the language: 'var' hides the type, but here we introduce a feature to change the type (more or less) for a part of the code. This makes the code harder to read, for a feature which seems too arcane.

regards,
Rémi

> From: "Gavin Bierman"
> To: "amber-spec-experts"
> Sent: Friday, 3 November 2017 11:46:10
> Subject: Patterns design question: Nulls
>
> Nulls and pattern matching
>
> The null value is treated somewhat inconsistently in Java; unfortunately pattern matching places a fresh focus on these inconsistencies. Consider a local variable, String s=null; . Currently s instanceof String returns false; whereas (String)s succeeds; and further switch (s){ case "Hello": ... } throws a NPE. Unfortunately, we need to maintain these behaviours whilst providing a consistent story for patterns.
>
> So far, we have essentially two choices on the table. One based on what might be called a pragmatic navigation of existing choices; and another more sophisticated one based on static type information. (In what follows, we assume a declaration T t = null; is in scope.)
>
> Option 1.
>
> Matches: t matches Object o. To keep it consistent with instanceof this must return false.
>
> Switch: switch is retconned to not throw a NPE when given a null. However, all switches with reference-typed selector expressions are considered to have an implicit case null: throw new NullPointerException(); clause as the first clause in the switch. If the user supplies a null clause (which means it must be a pattern-matching switch) then this replaces the implicit clause. [An alternative to this is to introduce non-null type tests, which we fear would quickly become unwieldy.]
>
> Note that this addresses a problem that has been brought up on the external mailing list.
> Currently:
>
>   static void testSwitchInteger(Integer i) {
>     switch(i) {
>       case 1: System.out.println("One"); break;
>       default: System.out.println("Other"); break;
>     }
>   }
>   static void testSwitchNumber(Number i) {
>     switch(i) {
>       case 1: System.out.println("One"); break;
>       default: System.out.println("Other"); break;
>     }
>   }
>   testSwitchNumber(null);  // prints "Other"
>   testSwitchInteger(null); // NPE
>
> The Integer case is an old-style switch, so throws an NPE. The Number case is a pattern matching case, so without the insertion of an implicit null clause, it would actually match the default clause (this is the behaviour of the current prototype).
>
> ASIDE Adding a null clause has an impact on the dominance analysis in pattern matching. A null pattern must appear before a type test pattern.
>
> Nested/Destructuring patterns: As discussed earlier, t matches Object o returns false. But unfortunately new Box(t) matches Box(Object o) really ought to return true. (Both because this is what we feel would be expected, but also to be consistent with expected semantics of extractors.) In other words, the semantics of matching against null is not compositional. Note also that the null value never matches against a nested pattern.
>
> We might expect a translation to proceed something like the following.
>
>   e matches Box(Object o)
>   -> e matches Box &&
>      (e.contents matches null as o || e.contents matches Object o)
>   -> e instanceof Box &&
>      (e.contents == null || e.contents instanceof Object)
>
> (Note the rarely seen as pattern in the intermediate pattern.)
>
> Option 2.
>
> We can use the static type information to classify pattern matches, which ultimately determines how the matching is translated.
>
> For example:
>
>   if (t matches U u) { // where T <: U
>     ...
>   }
>
> Notice here that the pattern match is guaranteed to succeed as T is a subtype of U.
> We can classify this as a type restatement pattern, and compile it essentially to the following
>
>   if (true) {
>     U u = t;
>     ...
>   }
>
> In other words, the expression (o matches U u) succeeds depending on the static type of o: if the static type of o is a subtype of U then it evaluates to true, even for the value of null. If it is not statically a subtype of U then its runtime type is tested as normally, and null would fail.
>
> ASIDE The choice of null matching also impacts on our reachability analysis. For example:
>
>   Integer i = ...;
>   switch (i) {
>     case Integer j: {
>       System.out.println(j);
>       break;
>     }
>     default: System.out.println("Something else");
>   }
>
> Is the default case reachable? If the type test matches null then it is unreachable, otherwise it is reachable.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Sat Nov 11 13:30:21 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 11 Nov 2017 14:30:21 +0100 (CET)
Subject: PM design question: Shadowing
In-Reply-To: <20910035-3C63-4561-BB54-73B8162B0910@oracle.com>
References: <20910035-3C63-4561-BB54-73B8162B0910@oracle.com>
Message-ID: <1706705010.2573717.1510407021909.JavaMail.zimbra@u-pem.fr>

It's a step too far. As you said, the rules for shadowing are currently the same for all constructs; let's not try to break that unity.

Your first example already exists with the current constructs:

  // field i in scope
  if (o instanceof Integer) {
    Integer i = (Integer)o;
    System.out.println(i); // local variable
  } else {
    System.out.println(i); // field
  }

Furthermore, given that all IDEs color a local variable access differently from a field access, in my experience it's not a real issue except for the very beginners.
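
For reference, the instanceof snippet above can be turned into a self-contained program. This is only a minimal sketch in plain Java: the class name ShadowingDemo, the method describe, and the type and value of the field i are assumptions made for illustration and do not come from the thread.

```java
public class ShadowingDemo {
    // Hypothetical field standing in for "field i in scope".
    static String i = "F";

    static String describe(Object o) {
        if (o instanceof Integer) {
            // The local variable i shadows the field inside this branch.
            Integer i = (Integer) o;
            return "local " + i;
        } else {
            // No local i is declared here, so i refers to the field.
            return "field " + i;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(42));   // prints "local 42"
        System.out.println(describe("x"));  // prints "field F"
    }
}
```

The same name resolves to two different variables depending on the branch, which is the behaviour a flow-scoped pattern variable would reproduce.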

regards,
Rémi

> From: "Gavin Bierman"
> To: "amber-spec-experts"
> Sent: Friday, 3 November 2017 11:45:30
> Subject: PM design question: Shadowing
>
> Shadowing
>
> As mentioned in the previous email, Java currently has five constructs that declare fresh variables. All five declare variables under the same shadowing regime: (i) No shadowing of formal parameters, (ii) No shadowing of other locally declared variables, but (iii) shadowing permitted of fields. Thus we would expect pattern variables to shadow fields.
>
> But such a decision has some interesting consequences. For example, if we adopt flow-like scoping strategy (c or d in the previous email), then the following code has some subtle behaviour.
>
>   // field i in scope
>   switch (o) {
>     case Integer i : System.out.print(i); // shadows field i
>                      break;
>     case T t : System.out.print(i);       // field i
>   }
>
> Is this too confusing?
>
> We could also consider allowing variables to shadow other variables when they are in scope and DU. For example, if we adopt scoping strategy (b) from the previous email - where the scope of a pattern variable is the entire enclosing statement - the following code would be allowed.
>
>   if (o matches T t) {
>     // t in scope and DA
>   } else {
>     // t in scope and DU
>     if (o1 matches Integer t) {
>       // Integer t shadows T t
>     }
>   }
>
> Should we restrict this new notion of shadowing to pattern variables only, or change it for all variables in Java. Would this be a step too far?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Sat Nov 11 23:48:21 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 12 Nov 2017 00:48:21 +0100 (CET)
Subject: [patterns] AND patterns, OR patterns, fall though
In-Reply-To:
References:
Message-ID: <1884798360.2628828.1510444101470.JavaMail.zimbra@u-pem.fr>

I do not know all the answers :)

I would like to emphasize that OR (and AND) are not binary operators but n-ary ones. I think it's important because it's easier to reorganize and reason about a pattern if you are not limited to the binary form, where decisions are more local.

You can find the AND pattern when you deconstruct an object that has several fields. For example,

  case BinOp(Value v1, Value v2) -> accept(v1, v2)

can be decomposed into

  x match BinOp -> let (v1, v2) = x.deconstruct()
  AND(v1 match Value, v2 match Value) -> accept(v1, v2)

Rémi

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Saturday, 4 November 2017 23:20:44
> Subject: [patterns] AND patterns, OR patterns, fall though
>
> In theory, patterns can be combined with AND and OR to produce new patterns, if their target types and binding lists are compatible. Note also that most fallthroughs (those where the case labels immediately follow other case labels, with no intervening statements) can be expressed as OR patterns.
>
> Some form of OR patterns are almost a forced move if we want to have expression switches with patterns:
>
>   int numLetters = switch(s) {
>     case "one", "two" -> 3;
>     case "three" -> 5;
>     ...
>   }
>
> Because, while statement switches can simply repeat the labels:
>
>   case "one":
>   case "two":
>
> this idiom looks pretty stupid if we try to transplant it to expression switches:
>
>   case "one" ->
>   case "two" -> wtf?
>
> OR patterns give us much of what fallthrough gives us; the only difference is the ability to have intervening statements between the case labels.
> Given that expression switches push us towards OR patterns, why not double down, using this for statement switches, and prohibit fallthrough for statement switches too? This is simpler and covers what seem like the important cases.
>
> In theory, an OR pattern of P and Q would require that both P and Q are applicable to the static type of the target, and (in the most strict interpretation) have identical binding variable lists.
>
> Note that we have a form of OR patterns now, with multi-catch:
>
>   catch (E1 | E2 identifier)
>
> Though, this might not really be what we want an OR pattern to look like, as this looks like the OR of "E1" (no bindings) and "E2" (with bindings), which would fail our restriction on the binding variable lists being the same. An OR pattern would more correctly be written (E1 e | E2 e). (However, we could interpret "E1|E2 identifier" as a union type-test pattern if we wanted to unify catch with patterns.)
>
> The big question is whether we need OR patterns at all, or whether this is merely an artifact of the switch statement. For the matches expression, we can express ORs clearly enough without it:
>
>   if (x matches P || x matches Q)
>
> (and we need to support this anyway.) If we used comma to separate patterns:
>
>   case 1, 2, 3:
>   case Foo x, Bar x, Baz x:
>   case Foo(var x), Bar(var x), Baz(var x):
>
> Is that clear enough? Is that unambiguous enough? If this works, this is nice because it works cleanly with existing constant switches too. I think this is pretty good.
>
> So, concrete proposal:
>
> - Allow multiple patterns to be separated by commas in a case label;
> - Treat "case X: case Y:" as sugar for "case X, Y:" in statement switches;
> - Impose the "same bindings" rule when multiple patterns are combined in this way;
> - Disallow fall through into patterns with binding variables.
>
> Note that we don't have to create a new kind of switch here to prohibit fall through; we just don't allow fall through into non-constant pattern cases.
>
> Note that Scala lets you OR multiple patterns together:
>
>   def matcher ( l : Foo ): String = {
>     l match {
>       case A () => "A"
>       case B ( _ ) | C ( _ ) => "B"
>       case _ => "default"
>     }
>   }
>
> but I'm not sure whether this is really an OR on patterns, or whether this is a "feature" of match? But, this seems a pretty questionable syntax choice, as:
>
>> scala> 1 match {
>>      |   case 1 | 2 => "one";
>>      | }
>> res0: String = one
>> scala> 1 | 2
>> res1: Int = 3
>
> So, even though 1|2 is an integer constant whose value is 3, "case 1|2" is an OR pattern.
>
> Similarly, it's even less clear that we need AND patterns. Though I could imagine wanting intersection type-test patterns, like:
>
>   switch (lambda) {
>     case Predicate p && Serializable: ...
>     case Predicate p: ...
>   }
>
> Are there compelling use cases for AND patterns that I'm missing?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Sat Nov 11 23:54:03 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 12 Nov 2017 00:54:03 +0100 (CET)
Subject: default branch placement in switch
In-Reply-To: <4740da81-8d99-645f-fa22-f587b6433278@oracle.com>
References: <4740da81-8d99-645f-fa22-f587b6433278@oracle.com>
Message-ID: <1437353687.2629024.1510444443566.JavaMail.zimbra@u-pem.fr>

I prefer default to be special and to have to be at the end, and thus to start warning about default not being at the end.
I think case null should be special too, for the same reason: case null dominates every case while default dominates none.

Rémi

----- Original message -----
> From: "Brian Goetz"
> To: "Tagir Valeev" , "amber-spec-experts"
> Sent: Friday, 3 November 2017 22:25:18
> Subject: Re: default branch placement in switch

> Yeah, this has to change. In existing switches, there are no case labels other than default, so order is irrelevant.
> But now that patterns have overlapping match-sets, default should be considered to dominate other cases, so it should go last.
>
> Compatibility-wise, we have two choices for how to get there; carve out a permanent exception for switches where all cases are type-restating constant patterns, or plan to eventually get to a place where default always comes last, even for "int" switches. If we want to get to the latter, we should start warning on this construct now.
>
> On 11/3/2017 5:10 PM, Tagir Valeev wrote:
>> Hello!
>>
>> Currently the default branch can be placed in any place inside the switch operator, e.g. like this:
>>
>>   switch(i) {
>>     case 1: System.out.println("one");break;
>>     default: System.out.println("other");break;
>>     case 2: System.out.println("two");break;
>>   }
>>
>> In this case behavior does not change on the order of case blocks. However in pattern matching the order of cases usually matters: if some pattern matches, this means that the subsequent patterns will not be checked. Does this mean that with pattern matching the default branch makes all the subsequent case blocks unreachable? Or default can still be located anywhere and is checked only after any other pattern?
>>
>> With best regards,
>> Tagir Valeev

From forax at univ-mlv.fr Sun Nov 12 12:47:34 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 12 Nov 2017 13:47:34 +0100 (CET)
Subject: inner data class
Message-ID: <883137080.2658309.1510490854808.JavaMail.zimbra@u-pem.fr>

In the current prototype, a data class declared inside a class is considered an inner class, so it's a 'plain' data class.
I propose that a data class declared inside a class should always be static (like enum and interface).

  public class InnerExample {
    __datum Internal(String name);

    public static void main(String[] args) {
      Internal i = new Internal("foo");
    }
  }

so the code above will compile.
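
The difference between the two treatments can be sketched in plain Java. Since __datum is prototype-only syntax, the example below uses hand-written classes as stand-ins: the static nested class Internal plays the role of the proposed implicitly static data class, and the non-static class Inner (a name invented here) shows what happens when the nested declaration is a true inner class.

```java
public class InnerExample {
    // Stand-in for "__datum Internal(String name)" under the proposal:
    // a static nested class needs no enclosing instance.
    static final class Internal {
        private final String name;
        Internal(String name) { this.name = name; }
        String name() { return name; }
    }

    // Hypothetical contrast case: a true inner (non-static) class.
    final class Inner {
        private final String name;
        Inner(String name) { this.name = name; }
        String name() { return name; }
    }

    public static void main(String[] args) {
        // Allowed from a static context because Internal is static.
        Internal i = new Internal("foo");

        // "new Inner("bar")" alone would be rejected here, because a
        // static method has no enclosing InnerExample instance to bind.
        Inner ok = new InnerExample().new Inner("bar");

        System.out.println(i.name() + " " + ok.name()); // prints "foo bar"
    }
}
```

Enums and interfaces declared inside a class are already implicitly static, which is the precedent the proposal leans on.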

Rémi

From brian.goetz at oracle.com Sun Nov 12 17:38:47 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 12 Nov 2017 17:38:47 +0000
Subject: default branch placement in switch
In-Reply-To: <1437353687.2629024.1510444443566.JavaMail.zimbra@u-pem.fr>
References: <4740da81-8d99-645f-fa22-f587b6433278@oracle.com> <1437353687.2629024.1510444443566.JavaMail.zimbra@u-pem.fr>
Message-ID: <1B34742A-2C1F-41EA-B713-588A015BA61C@oracle.com>

Agree except: the terminology of domination is backwards. Default/_ dominates everything; most things dominate null. (Null is unordered w respect to other constant patterns and primitive type test patterns, though).

Sent from my MacBook Wheel

> On Nov 11, 2017, at 11:54 PM, Remi Forax wrote:
>
> I prefer default to be special and has to be at the end thus starts warning about default not being at the end.
> I think case null should be special too, for the same reason, case null dominates every cases while default dominates none.
>
> Rémi
>
> ----- Original message -----
>> From: "Brian Goetz"
>> To: "Tagir Valeev" , "amber-spec-experts"
>> Sent: Friday, 3 November 2017 22:25:18
>> Subject: Re: default branch placement in switch
>
>> Yeah, this has to change. In existing switches, there are no case labels other than default, so order is irrelevant. But now that patterns have overlapping match-sets, default should be considered to dominate other cases, so it should go last.
>>
>> Compatibility-wise, we have two choices for how to get there; carve out a permanent exception for switches where all cases are type-restating constant patterns, or plan to eventually get to a place where default always comes last, even for "int" switches. If we want to get to the latter, we should start warning on this construct now.
>>
>>> On 11/3/2017 5:10 PM, Tagir Valeev wrote:
>>> Hello!
>>>
>>> Currently the default branch can be placed in any place inside the
>>> switch operator, e.g.
like this: >>> >>> switch(i) { >>> case 1: System.out.println("one");break; >>> default: System.out.println("other");break; >>> case 2: System.out.println("two");break; >>> } >>> >>> In this case behavior does not change on the order of case blocks. >>> However in pattern matching the order of cases usually matters: if >>> some pattern matches, this means that the subsequent patterns will not >>> be checked. Does this mean that with pattern matching the default >>> branch makes all the subsequent case blocks unreachable? Or default >>> can still be located anywhere and is checked only after any other >>> pattern? >>> >>> With best regards, >>> Tagir Valeev From brian.goetz at oracle.com Sun Nov 12 17:40:52 2017 From: brian.goetz at oracle.com (Brian Goetz) Date: Sun, 12 Nov 2017 17:40:52 +0000 Subject: inner data class In-Reply-To: <883137080.2658309.1510490854808.JavaMail.zimbra@u-pem.fr> References: <883137080.2658309.1510490854808.JavaMail.zimbra@u-pem.fr> Message-ID: <298324F5-A51E-4ABC-BBD3-AE90DA9FB076@oracle.com> This makes sense to me at least for now, but I might want to revisit later. Sent from my MacBook Wheel > On Nov 12, 2017, at 12:47 PM, Remi Forax wrote: > > In the actual prototype, a data class declared inside a class is considered as an inner class so it's an a 'plain' data class, > i propose that a data class declared inside a class should always be static (like enum and interface). > > public class InnerExample { > __datum Internal(String name); > > public static void main(String[] args) { > Internal i = new Internal("foo"); > } > } > > so the code above will compile. 
>
> Rémi

From forax at univ-mlv.fr Sun Nov 12 18:36:48 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 12 Nov 2017 19:36:48 +0100 (CET)
Subject: default branch placement in switch
In-Reply-To: <1B34742A-2C1F-41EA-B713-588A015BA61C@oracle.com>
References: <4740da81-8d99-645f-fa22-f587b6433278@oracle.com> <1437353687.2629024.1510444443566.JavaMail.zimbra@u-pem.fr> <1B34742A-2C1F-41EA-B713-588A015BA61C@oracle.com>
Message-ID: <2086983844.2700102.1510511808488.JavaMail.zimbra@u-pem.fr>

My bad, I was thinking about the domination of the nodes in a tree of ifs, the domination in the control flow sense.

Rémi

----- Original message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Tagir Valeev" , "amber-spec-experts"
> Sent: Sunday, 12 November 2017 18:38:47
> Subject: Re: default branch placement in switch

> Agree except: the terminology of domination is backwards. Default/_ dominates everything; most things dominate null. (Null is unordered w respect to other constant patterns and primitive type test patterns, though).
>
> Sent from my MacBook Wheel
>
>> On Nov 11, 2017, at 11:54 PM, Remi Forax wrote:
>>
>> I prefer default to be special and has to be at the end thus starts warning about default not being at the end.
>> I think case null should be special too, for the same reason, case null dominates every cases while default dominates none.
>>
>> Rémi
>>
>> ----- Original message -----
>>> From: "Brian Goetz"
>>> To: "Tagir Valeev" , "amber-spec-experts"
>>> Sent: Friday, 3 November 2017 22:25:18
>>> Subject: Re: default branch placement in switch
>>
>>> Yeah, this has to change. In existing switches, there are no case labels other than default, so order is irrelevant. But now that patterns have overlapping match-sets, default should be considered to dominate other cases, so it should go last.
>>>
>>> Compatibility-wise, we have two choices for how to get there; carve out a permanent exception for switches where all cases are type-restating constant patterns, or plan to eventually get to a place where default always comes last, even for "int" switches. If we want to get to the latter, we should start warning on this construct now.
>>>
>>>> On 11/3/2017 5:10 PM, Tagir Valeev wrote:
>>>> Hello!
>>>>
>>>> Currently the default branch can be placed in any place inside the switch operator, e.g. like this:
>>>>
>>>>   switch(i) {
>>>>     case 1: System.out.println("one");break;
>>>>     default: System.out.println("other");break;
>>>>     case 2: System.out.println("two");break;
>>>>   }
>>>>
>>>> In this case behavior does not change on the order of case blocks. However in pattern matching the order of cases usually matters: if some pattern matches, this means that the subsequent patterns will not be checked. Does this mean that with pattern matching the default branch makes all the subsequent case blocks unreachable? Or default can still be located anywhere and is checked only after any other pattern?
>>>>
>>>> With best regards,
>>>> Tagir Valeev

From guy.steele at oracle.com Mon Nov 13 19:16:29 2017
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 13 Nov 2017 14:16:29 -0500
Subject: PM design question: Scopes
In-Reply-To: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
Message-ID: <96A5940F-A8D2-40D4-A549-9221EE930AE8@oracle.com>

I'm late to this discussion because I've been traveling. But I do have a comment about Scopes and pattern matching (see bottom).

> On Nov 3, 2017, at 6:44 AM, Gavin Bierman wrote:
>
> Scopes
>
> Java has five constructs that introduce fresh variables into scope: the local variable declaration statement, the for statement, the try-with-resources statement, the catch block, and lambda expressions.
> The first, local variable declaration statements, introduce variables that are in scope for the rest of the block that it is declared in. The others introduce variables that are limited in their scope.
>
> The addition of pattern matching brings a new expression, matches, and extends the switch statement. Both these constructs can now introduce fresh (and, if the pattern match succeeds, definitely assigned (DA)) variables. But the question is what is the scope of these "pattern" variables?
>
> Let us consider the pattern matching constructs in turn. First the switch statement:
>
>   switch (o) {
>     case int i: ...
>     case ..
>   }
>
> What is the scope of the pattern variable i? There are a range of options.
>
> 1. The scope of the pattern variable is from the start of the switch statement until the end of the enclosing block.
>
>    In this case the pattern variable is in scope but would be definitely unassigned (DU) immediately after the switch statement.
>
>      switch (o) {
>        case int i : ... // DA
>                     ... // DA
>        case T t :       // i is in scope
>      }
>      ... // i is still in scope and DU
>
>    +ve Simple
>    -ve Can't simply reuse a pattern variable in the same switch statement (without some form of shadowing)
>    -ve Pattern variable poisons the rest of the block
>
> 2. The scope of the pattern variable extends only to the end of the switch block.
>
>    In this case the pattern variable would be considered DA only for the statements between the current case label and the subsequent case labeled statement. For example:
>
>      switch (o) {
>        case int i : ... // DA
>                     ... // DA
>        case T t :       // i is in scope but not DA
>      }
>      ... // i not in scope
>
>    +ve Simple
>    +ve Pattern variables not poisoned in subsequent statements in the rest of the block
>    +ve Similar technique to for identifiers (not a new idea)
>    -ve Can't simply reuse a pattern variable in the same switch statement (without some form of shadowing)
>
> 3. The scope of the pattern variable extends only to the next case label.
>
>      switch (o) {
>        case int i : ... // in scope and DA
>                     ... // in scope and DA
>        case T i :       // int i not in scope, so can re-use
>      }
>      ... // i not in scope
>
>    +ve Simple syntactic rule
>    +ve Allows reuse of pattern variable in the same switch statement.
>    -ve Doesn't make sense for fallthrough
>
> NOTE This final point is important - supporting fallthrough impacts on what solution we might choose for scoping of pattern variables. (We could not support fallthrough and instead support OR patterns - a further design dimension.)
>
> ASIDE Should we support a switch expression; it seems clear that scoping should be treated in the same way as it is for lambda expressions.
>
> The matches expression is unusual in that it is an expression that introduces a fresh variable. What is the scope of this variable? We want it to be more than the expression itself, as we want the following example code to be correct:
>
>   if (e matches String s) {
>     System.out.println("It's a string - " + s);
>   }
>
> In other words, the variable introduced by the pattern needs to be in scope for an enclosing IfThen statement.
>
> However, a match expression could be nested within another expression. It seems reasonable that the patterns variables are in scope for at least the rest of the expression. For example:
>
>   (e matches String s || s.length() > 0)
>
> Here the s should be in scope for the subexpression s.length (although it is not DA). In contrast:
>
>   (e matches String s && s.length() > 0)
>
> Here the s is both in scope and DA for the subexpression s.length.
>
> However, what about the following:
>
>   if (s.length() > 0 && e matches String s) {
>     System.out.println(s);
>   }
>
> Given the idea that a pattern variable flows from the inside-out to the enclosing statement, it would appear that s is in scope for the subexpression s.length; although it is not DA. Unless we want scopes to be non-contiguous, we will have to accept this rather odd situation (consider where s shadows a field).
> [This appears to be what happens in the current C# compiler.]
>
> Now let's consider how far a pattern variable flows wrt its enclosing statement. We have a range of options:
>
> a. The scope is both the statement that the match expression occurs in and the rest of the block. In this scenario,
>
>      if (o matches T t) {
>        ...
>      } else {
>        ...
>      }
>
>    is treated as equivalent to the following pseudo-code (where match-and-bind is a fictional pattern matching construct that pattern-matches and binds to a variable that has already been declared)
>
>      T t;
>      if (o match-and-bind t) {
>        // t in scope and DA
>      } else {
>        // t in scope and DU
>      }
>      // t in scope and DU
>
>    This is how the current C# compiler works (although the spec describes the next option; so perhaps this is a bug).
>
> b. The scope is just the statement that the match expression occurs in. In this scenario,
>
>      if (o matches T t) {
>        ...
>      } else {
>
>      }
>      ...
>
>    is treated as equivalent to the pseudo-code
>
>      { T t;
>        if (o match-and-bind t) {
>          // t in scope and DA
>        } else {
>          // t in scope and DU
>          // thus declaration int t = 42; is not allowed.
>        }
>      }
>      // t not in scope
>      ...
>
>    This restricted scope allows reuse of pattern variables, e.g.
>
>      if (o matches T x) { ... }
>      if (o matches S x) { ... }
>
> c. The scope of the pattern variable is determined by a flow analysis of the enclosing statement. (It could be thought of as a refinement of option b.) This is currently implemented in the prototype compiler. For example:
>
>      if (!!(o matches T t)) {
>        // t in scope
>      } else {
>        // t not in scope
>      }
>
>    +ve Code will work in the presence of most refactorings
>    +ve We have this code working already :-)
>    -ve This is a break to the existing notion of scope as a contiguous program fragment. A scope can now have holes in it. Will users ever understand this? (Although they are very similar to the flow-based rules for DA/DU.)
>
> ASIDE Regardless of whether we opt for (b) or (c) we may consider a further extension where we allow the scope to extend beyond the current statement for the case of an unbalanced if statement. For example
>
> ```
> if (!(o matches T t)) {
>   return;
> }
> // t in scope
> ...
> return;
> ```
> +ve Supports a common idiom where else blocks are not needed
> -ve Yet further complication of notion of scope.
>

Here is a fourth possibility for `switch`:

4. The scope of a pattern variable bound by a `case` label extends only to the next case label. Moreover, it is allowed to shadow a local variable declared earlier in the switch block.

Fallthrough has a special treatment: When falling through a case label, an implicit assignment is performed to every variable bound by the case label. Falling through a case label is permitted only if:

(a) Every variable name bound by the case label is also definitely assigned after the statement, local variable declaration, or other case label that precedes it in the switch block.

(b) The type of every variable bound by the case label is a type to which the type of the same variable (when regarded from just after the statement, local variable declaration, or other case label that precedes the case label in the switch block) can be assigned.

The value implicitly assigned by fallthrough to a variable bound by the case label is the value of the variable of the same name as of just before the case label. (This would have to be a piece of magic not expressible by a simple source-code-level rewriting.)

Example:

  switch (o) {
    case Cons(int i, int j, T x):
      // int i, int j, and T x are in scope and DA.
      int k = x.size();   // For backward compatibility, scope of int k extends to end of switch block
      String z = "baz";   // For backward compatibility, scope of String z extends to end of switch block
      // At this point int i, int j, T x, int k, and String z are in scope and DA.
    // As we fall through the next case label (which binds i, j, and k), we get
    // implicit assignments of int i to int i, int j to long j, and int k to long k.
  case Cons(int i, long j, long k):
    // int i, long j, and long k are in scope and DA;
    // previous int i, int j, and T x (bound by first case label) are not in scope;
    // int k is in scope but shadowed by long k;
    // String z is in scope and neither DA nor DU.
    ...
}

+ve Allows reuse of pattern variable in the same switch statement.
+ve Allows useful forms of fallthrough
+ve Compatible with previous treatment of fallthrough for case labels that bind no variables
-ve Introduces a mild (benign?) form of shadowing into the language

An alternative is to require the types to match exactly for the implicit assignments rather than using assignment conversion. This would be less flexible but perhaps also less confusing.

--Guy

From forax at univ-mlv.fr Mon Nov 13 22:17:25 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 13 Nov 2017 23:17:25 +0100 (CET)
Subject: abstract datum ??
In-Reply-To:
References: <98dcfc75-936d-ae46-992f-ee10fffae025@oracle.com> <77C708F7-4C15-47A7-8A93-E8FD7121D441@yahoo.co.uk> <06b61c48-55ed-964d-0192-4fb03d48873b@oracle.com> <1251411247.2303541.1510323358904.JavaMail.zimbra@u-pem.fr>
Message-ID: <1028092349.607258.1510611445552.JavaMail.zimbra@u-pem.fr>

> Hi Remi,
> Hi Vicente,

[...]

>> But more fundamentally, given that we now have default methods and an easy way
>> to declare fields, i do not understand why the spec allow to declare abstract
>> datum.
>>
>> By example,
>> interface Foo {
>>   abstract int m();
>> }
>> abstract datum AbstractFoo(int x) implements Foo {
>>   public int m() { return x * 2; }
>> }
>> datum Bar(int x, int y) extends AbstractFoo(x);
>>
>> can always be simplified to:
>> interface Foo {
>>   default int m() { return x() * 2; }
>>   abstract int x();
>> }
>> datum Bar(int x, int y) implements Foo;
>
> I think that what you are proposing is also a valid approach. I don't
> think that the current one is written in stone so we have to evaluate
> both. The benefit of abstract datum I would say is that you can abstract
> not only behavior but data too in one place.

My example above also abstracts over data.

> Plus that you can create data classes only APIs.

Yes, very true.

> Will this be enough support having abstract datum, well I guess that the uses cases will benefit more one or the other.
> Also having abstract datum give the users the benefit of using one approach or the other which I think it's better than having only one option

Having two ways of doing the same thing is usually not something you want, unless there are clear cases where one approach is better than the other and vice versa.

Now, let's see the arguments against using an abstract class:
- An abstract class contains implementation details. In theory, an abstract class is independent of its subclasses, but in reality, because an abstract class shares part of the subclass implementations, an abstract class and its subclasses are strongly coupled. To avoid that, abstract classes should be non-visible (like AbstractStringBuilder). Given that a lot of people do not do that, using an abstract class is harder than one may think, so not introducing a way to specify an abstract data class is a win.
- Generating code inside an abstract class and its subclass while keeping separate compilation working is hard; for example, generating the generics bridges in a hierarchy is hard.
  As another example, the code below currently throws a VerifyError at runtime.

    public class AbstractExample {
      static abstract __datum A(int v) {
        public final boolean equals(Object o) {
          return o == this;
        }
      }
      static __datum B(int v) extends A(v);

      public static void main(String[] args) {
        B b = new B(42);
      }
    }

So I think we should keep the design simple and avoid abstract data classes.

> Vicente

Rémi

From forax at univ-mlv.fr Wed Nov 15 22:15:48 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 15 Nov 2017 23:15:48 +0100 (CET)
Subject: Local variable inference and anonymous class
Message-ID: <877075994.1702665.1510784148361.JavaMail.zimbra@u-pem.fr>

I had to persuade myself that the fact that a var with an anonymous class will 'leak' the anonymous type [1] is not an issue.

For example, with
  var foo = new Object() { int i; };
the type of foo is the anonymous class and not Object.

In fact, we can already 'leak' the type of an anonymous class using a lambda that creates an anonymous class, to create a kind of strawman tuple:
  List<String> list = List.of("hello", "world!");
  Map<Integer, String> map = list.stream()
      .map(s -> new Object() { int key = s.length(); String value = s; })
      .collect(Collectors.toMap(t -> t.key, t -> t.value));
  System.out.println(map);

So I guess 'leaking' the type of an anonymous class is not an issue.

BTW, Eclipse doesn't compile the code above, but this is reported as a bug.
https://bugs.eclipse.org/bugs/show_bug.cgi?id=477894

regards,
Rémi

[1] http://cr.openjdk.java.net/~dlsmith/local-var-inference.html

From john.r.rose at oracle.com Thu Nov 16 04:10:26 2017
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 15 Nov 2017 23:10:26 -0500
Subject: Local variable inference and anonymous class
In-Reply-To: <877075994.1702665.1510784148361.JavaMail.zimbra@u-pem.fr>
References: <877075994.1702665.1510784148361.JavaMail.zimbra@u-pem.fr>
Message-ID: <85C59F83-B485-4A78-B650-3E32B747504B@oracle.com>

On Nov 15, 2017, at 5:15 PM, Remi Forax wrote:
>
> List<String> list = List.of("hello", "world!");
> Map<Integer, String> map = list.stream()
>     .map(s -> new Object() { int key = s.length(); String value = s; })
>     .collect(Collectors.toMap(t -> t.key, t -> t.value));
> System.out.println(map);
>
> so i guess 'leaking' the type of an anonymous class is not an issue.

That's cool; I didn't know Java had anonymous tuples!!!

From brian.goetz at oracle.com Thu Nov 16 07:45:06 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 16 Nov 2017 07:45:06 +0000
Subject: Local variable inference and anonymous class
In-Reply-To: <85C59F83-B485-4A78-B650-3E32B747504B@oracle.com>
References: <877075994.1702665.1510784148361.JavaMail.zimbra@u-pem.fr> <85C59F83-B485-4A78-B650-3E32B747504B@oracle.com>
Message-ID:

This is like using an Object[], but with names and no second level of boxing?

> On Nov 16, 2017, at 4:10 AM, John Rose wrote:
>
> On Nov 15, 2017, at 5:15 PM, Remi Forax wrote:
>>
>> List<String> list = List.of("hello", "world!");
>> Map<Integer, String> map = list.stream()
>>     .map(s -> new Object() { int key = s.length(); String value = s; })
>>     .collect(Collectors.toMap(t -> t.key, t -> t.value));
>> System.out.println(map);
>>
>> so i guess 'leaking' the type of an anonymous class is not an issue.
>
> That's cool; I didn't know Java had anonymous tuples!!!
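
The anonymous-class "tuple" idiom quoted above can be sketched as a complete program. This is only an illustration, assuming Java 10 or later for `var`; the class name `AnonymousTupleDemo` and the helper `lengthsToWords` are hypothetical names introduced here, not part of the thread.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class AnonymousTupleDemo {
    // Hypothetical helper: maps each word's length to the word itself.
    // The anonymous class acts as an ad-hoc (key, value) pair whose type
    // is inferred through the stream pipeline, so t.key and t.value resolve.
    static Map<Integer, String> lengthsToWords(List<String> words) {
        return words.stream()
            .map(s -> new Object() { int key = s.length(); String value = s; })
            .collect(Collectors.toMap(t -> t.key, t -> t.value));
    }

    public static void main(String[] args) {
        // With `var`, the inferred type of foo is the anonymous class,
        // not Object, so the field i is accessible.
        var foo = new Object() { int i = 42; };
        System.out.println(foo.i);

        System.out.println(lengthsToWords(List.of("hello", "world!")));
    }
}
```

Note that `Collectors.toMap` throws on duplicate keys, so this sketch only works for words of distinct lengths, as in the original example.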
From maurizio.cimadamore at oracle.com Thu Nov 16 17:33:00 2017
From: maurizio.cimadamore at oracle.com (Maurizio Cimadamore)
Date: Thu, 16 Nov 2017 17:33:00 +0000
Subject: Local variable inference and anonymous class
In-Reply-To: <877075994.1702665.1510784148361.JavaMail.zimbra@u-pem.fr>
References: <877075994.1702665.1510784148361.JavaMail.zimbra@u-pem.fr>
Message-ID: <5fa0380d-7a64-c776-8339-b7039c9fe5ea@oracle.com>

FTR, this is not just about lambdas or Java 8 in general; this behavior has been there since inner classes were added:

    String s = new Object() { String s; }.s;

and, after Java 5:

    <Z> Z id(Z z) { return z; }

    String s = id(new Object() { String s; }).s;

And then of course in Java 8 the inference enhancement allows for these types to propagate in a stream method chain.

But yes, the bottom line is that these types are not new, and they have been available for a long time, although the places in which they have been exposed have been relatively limited so far.

Cheers
Maurizio

On 15/11/17 22:15, Remi Forax wrote:
> I had to persuade myself that the fact that a var with an anonymous class will 'leak' the anonymous type [1] is not an issue.
>
> By example, with
>   var foo = new Object() { int i; };
> the type of foo is the anonymous class and not Object.
>
> In fact, we can already 'leak' the type of an anonymous class using a lambda that creates an anonymous class,
> to create a kind of strawman tuple:
>   List<String> list = List.of("hello", "world!");
>   Map<Integer, String> map = list.stream()
>       .map(s -> new Object() { int key = s.length(); String value = s; })
>       .collect(Collectors.toMap(t -> t.key, t -> t.value));
>   System.out.println(map);
>
> so i guess 'leaking' the type of an anonymous class is not an issue.
>
> BTW, Eclipse doesn't compile the code above, but this is reported as a bug.
> https://bugs.eclipse.org/bugs/show_bug.cgi?id=477894
>
> regards,
> Rémi
>
> [1] http://cr.openjdk.java.net/~dlsmith/local-var-inference.html
>

From brian.goetz at oracle.com Mon Nov 20 18:17:56 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 20 Nov 2017 13:17:56 -0500
Subject: PM design question: Scopes
In-Reply-To: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
Message-ID: <97e5c94e-b822-a0a7-5e5c-57efee1bbb72@oracle.com>

We had a long meeting regarding scoping and shadowing of pattern variables. We ended up in a good place, and we were all a bit surprised at where it seems to be pointing.

We started with two use cases that we thought were important:

Re-use of binding variables:

    switch (x) {
        case Foo(var a): ... break;
        case Bar(var a): ...
    }

Short-circuiting tests:

    if (!(x matches Foo(var a)))
        throw new NotFooException();
    // use a here

We had a few nice-to-haves:
 - that binding variables should be ordinary variables, not something new;
 - that bindings, when assigned, be final

Where we expected to land was something like:
 - binding variables are treated as blank finals
 - binding variables are hoisted into a synthetic block, which starts right before the statement containing the expression defining the binding
 - it is permitted for locals to shadow other locals that are DU at the point of shadowing. (This, as a bonus, would rescue the existing unfortunate scoping of local variables defined in switch blocks.)

We thought this was a sensible place to land because it built on the existing notion of scoping and local variables. The remaining question, it seemed, was: "where does this synthetic scope end."

First, a note about where the scope starts. Consider:

    if (e1 && x matches Foo(var a)) {
        ...
    }

Logically, we'd like to start the scope for `a` right where it is first declared; this is how locals work.
But, if we want to maintain the existing concept of local variable scope, it has to start earlier. The latest candidate is right before the if starts; we act as if there is an invisible { ... } containing the entirety of the if statement, and declare `a` there.

This means, though, that the scope of `a` includes `e1`, even though `a` is declared later. This is confusing, but maybe we can ignore this, and provide a clear diagnostic if the user stumbles across it.

So, where does the scope end? The obvious candidate is right after the if statement. This means `a` is in scope for the entire if-else, but, because it is DU in the else-blocks, can be reused if we adopt the "shadowing OK if DU" rule.

FWIW, the "shadowing OK if DU" rule is clever, and gives us the behavior we want for switch / if-else chains with patterns, but has some collateral damage. For example, the following would become valid code:

    int x;  // declared but never used
    float x = 1.0f;  // acceptable shadowing of int x

Again, maybe we can ignore this. But where things really blew up was attempting to handle the short-circuiting if case:

    if (!(x matches Foo(var a)))
        throw new NotFooException();
    // use a here

For this to work, we'd have to extend the scope to the end of the block containing the if statement. Now, given our "shadowing is OK if DU" rule, this is fine, right? Not so fast. In this simpler case:

    if (x matches Foo(var b)) { }
    // try to reuse b here, I dare you

we find that
 - `b` is neither DU nor DA after the if, so we can't shadow it;
 - `b` is final and not DU, so we can't write to it;
 - `b` is not DA, so we can't use it.

In other words, `b` is a permanent toxic waste zone: we can neither use, nor redeclare, nor assign it. Urk.

Note too that our scoping rule is not really about unbalanced ifs; it's about abrupt completion. This is reasonable too:

    if (x matches Foo(var a)) {
        println("Matched!");
    }
    else
        throw new NotFooException();
    // reasonable to use a here too!

Taking stock: our goal here was to try and use normal scopes and blank final semantics to describe binding variables, out of a desire to not introduce new concepts. But it's a bad fit; the scope may be unnaturally large on the beginning side, and wherever we set the end of the scope, we end up in a choice of bad situations (either something we want in scope is not, or something we don't want in scope is). So traditional scopes are just a bad approximation, and what we gain in "reusing familiar concepts", we lose in the mismatch.


STEPPING BACK

What we realized at this point is that the essence of binding variables is their _conditionality_. There is not a single logical old-style scope that describes the right set of places for a binding to be in scope, but there is a well-defined control-flow analysis that tells us exactly where we can use the binding, and where we can't. This is the flow-scoping construct we initially worried was too "new and different." But, after some further thought, and a few tweaks, this seems exactly what we want, and I think can be made understandable.

The basic idea behind flow-scoping is: a binding variable is in scope where it is well-defined, and not in scope when it is not. We'll provide a complete calculus, but the key thing to understand is that the rules of flow scoping are just plain old DA/DU; if a binding is DA, then it is well-defined.

In particular, flow-scoping can handle abrupt termination naturally; for a statement:

    if (x matches Foo(var a)) { A }
    else { B }
    C

the scope of `a` includes A, and also includes C iff B completes abruptly. We can easily explain this as:
 - if x matches Foo(var a), we execute the A block, and in this case `a` is clearly well-defined (as we'd not execute A if the match failed);
 - The only way to reach C, if B completes abruptly, is if the match succeeds, so `a` is well defined during C in this case too.
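
A minimal sketch of the abrupt-completion rule above, assuming the `instanceof` type-pattern syntax that later shipped in Java 16 rather than the `matches` keyword used in this thread. The class name `FlowScopeDemo` and its methods are hypothetical names chosen for illustration.

```java
public class FlowScopeDemo {
    // Negated test: the if-body completes abruptly (throws), so the only way
    // to reach the return statement is for the match to have succeeded, and
    // `s` is in scope after the if statement.
    static int lengthOrThrow(Object o) {
        if (!(o instanceof String s)) {
            throw new IllegalArgumentException("not a String: " + o);
        }
        return s.length();
    }

    // if/else form: the else branch completes abruptly, so `n` is in scope
    // both inside the then-branch and in the code following the statement.
    static int doubled(Object o) {
        if (o instanceof Integer n) {
            System.out.println("Matched!");
        } else {
            throw new IllegalArgumentException("not an Integer: " + o);
        }
        return n * 2;
    }

    public static void main(String[] args) {
        System.out.println(lengthOrThrow("hello")); // prints 5
        System.out.println(doubled(21));            // prints "Matched!" then 42
    }
}
```

If the `throw` in either method were replaced by a statement that completes normally, the binding would no longer be in scope after the `if`, and the `return` line would fail to compile.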
Because the scope of a binding variable is precisely the cases in which it is well defined, there is no need to tinker with shadowing.

Conditional variables can now always be final, because they will never be in scope and not DA.

Similarly, folding reachability into scoping for conditional variables also means that fallthrough has a well-defined meaning. If we have:

    case Foo(int x): ... break;
    case Bar(int x): ....

then the Bar case is not reachable from where x would be initialized, so the first x is not in scope when the second x is declared, and everything is great. On the other hand:

    case Foo(int x): ... no break ...
    case Bar(int x): ... A ...

now x is well-defined in A, no matter how we got there. (The merging of the two xs is the same merging we have to do anyway for "if (x matches Foo(int a) || x matches Bar(int a))".)


People had originally expressed concern that flow-scoping leaves a scope "with holes", and allows puzzlers with shadowing of fields. (This is the "swiss cheese" problem.) For example:

    // Field
    String s;

    if (!(x matches String s)) {
        a(s);
    }
    else {
        b(s);
    }

This would be confusing because the `s` passed to a() is the field, but the `s` passed to b() is the binding. But, there's a really simple way to prevent this: do not allow conditional variables to shadow fields or locals. Now, there is no chance of this confusion, and this is not a big constraint, because the names of conditional variables are strictly local.
(Further, we can disallow shadowing of in-scope conditional variables by locals (or other conditional variables).)


Scorecard:
 - Relatively straightforward to spec, as we have a clean calculus for flow-scoped conditional variables;
 - Relatively straightforward to implement (our prototype already does this);
 - One new concept: conditional variables;
 - Conditional vars are in scope where they make sense, and not in scope where they do not, cannot be assigned to (always DA and final when in scope), and are never in scope when not DA;
 - No changes to shadowing;
 - Meets all the target use cases.


On 11/3/2017 6:44 AM, Gavin Bierman wrote:
>
>
>   Scopes
>
> Java has five constructs that introduce fresh variables into scope:
> the local variable declaration statement, the for statement, the
> try-with-resources statement, the catch block, and lambda expressions.
> The first, local variable declaration statements, introduce variables
> that are in scope for the rest of the block that it is declared in.
> The others introduce variables that are limited in their scope.
>
> The addition of pattern matching brings a new expression, |matches|,
> and extends the |switch| statement. Both these constructs can now
> introduce fresh (and, if the pattern match succeeds, definitely
> assigned (DA)) variables. But the question is /what is the scope of
> these "pattern" variables/?
>
> Let us consider the pattern matching constructs in turn. First the
> |switch| statement:
>
> |switch (o) { case int i: ... case .. }|
>
> What is the scope of the pattern variable |i|? There are a range of
> options.
>
> 1.
>
>     The scope of the pattern variable is from the start of the switch
>     statement until the end of the enclosing block.
>
>     In this case the pattern variable is in scope but would be
>     definitely unassigned (DU) immediately after the switch statement.
>
>     |switch (o) { case int i : ... // DA ... // DA case T t : // i is
>     in scope } ...
// i in still in scope and DU| > > * *+ve*?Simple > * *-ve*?Can?t simply reuse a pattern variable in the same switch > statement (without some form of shadowing) > * *-ve*?Pattern variable poisons the rest of the block > > 2. > > The scope of the pattern variable extends only to the end of the > switch block. > > In this case the pattern variable would be considered DA only for > the statements between the current case label and the subsequent > case labeled statement. For example: > > |switch (o) { case int i : ... // DA ... // DA case T t : // i is > in scope but not DA } ... // i not in scope| > > * *+ve*?Simple > * *+ve*?Pattern variables not poisoned in subsequent statements in > the rest of the block > * *+ve*?Similar technique to |for|?identifiers (not a new idea) > * *-ve*?Can?t simply reuse a pattern variable in the same switch > statement (without some form of shadowing) > > 3. > > The scope of the pattern variable extends only to the next case label. > > |switch (o) { case int i : ... // in scope and DA ... // in scope > and DA case T i : // int i not in scope, so can re-use } ... // i > not in scope| > > * *+ve*?Simple syntactic rule > * *+ve*?Allows reuse of pattern variable in the same switch statement. > * *-ve*?Doesn?t make sense for fallthrough > > *NOTE*?This final point is important - supporting fallthrough impacts > on what solution we might choose for scoping of pattern variables. (We > could not support fallthrough and instead support OR patterns - a > further design dimension.) > > *ASIDE*?Should we support a |switch| /expression/; it seems clear that > scoping should be treated in the same way as it is for lambda expressions. > > The |matches|?expression is unusual in that it is an /expression/?that > introduces a fresh variable. What is the scope of this variable? 
We > want it to be more than the expression itself, as we want the > following example code to be correct: > > |if (e matches String s) { System.out.println("It's a string - " + s); }| > > In other words, the variable introduced by the pattern needs to be in > scope for an enclosing IfThen statement. > > However, a |match|?expression could be nested within another > expression. It seems reasonable that the patterns variables are in > scope for at least the rest of the expression. For example: > > |(e matches String s || s.length() > 0) | > > Here the |s|?should be in scope for the subexpression > |s.length|?(although it is not DA). In contrast: > > |(e matches String s && s.length() > 0)| > > Here the |s|?is both in scope and DA for the subexpression |s.length|. > > However, what about the following: > > |if (s.length() > 0 && e matches String s) { System.out.println(s); }| > > Given the idea that a pattern variable flows from the inside-out to > the enclosing statement, it would appear that |s|?is in scope for the > subexpression |s.length|; although it is not DA. Unless we want scopes > to be non-contiguous, we will have to accept this rather odd situation > (consider where |s|?shadows a field). [This appears to be what happens > in the current C# compiler.] > > Now let?s consider how far a pattern variable flows wrt its enclosing > statement. We have a range of options: > > 1. > > The scope is both the statement that the match expression occurs > in and the rest of the block. In this scenario, > > |if (o matches T t) { ... } else { ... 
}| > > is treated as equivalent to the following pseudo-code (where > |match-and-bind|?is a fictional pattern matching construct that > pattern-matches and binds to a variable that has already been > declared) > > |T t; if (o match-and-bind t) { // t in scope and DA } else { // t > in scope and DU } // t in scope and DU| > > This is how the current C# compiler works (although the spec > describes the next option; so perhaps this is a bug). > > 2. > > The scope is just the statement that the match expression occurs > in. In this scenario, > > |if (o matches T t) { ... } else { } ...| > > is treated as equivalent to the pseudo-code > > |{ T t; if (o match-and-bind t) { // t in scope and DA } else { // > t in scope and DU // thus declaration int t = 42; is not allowed. > } } // t not in scope ...| > > This restricted scope allows reuse of pattern variables, e.g. > > |if (o matches T x) { ... } if (o matches S x) { ... }| > > 3. > > The scope of the pattern variable is determined by a flow analysis > of the enclosing statement. (It could be thought of as a > refinement of option b.) This is currently implemented in the > prototype compiler. For example: > > |if (!!(o matches T t)) { // t in scope } else { // t not in scope }| > > * *+ve*?Code will work in the presence of most refactorings > * *+ve*?We have this code working already :-) > * *-ve*?This is a break to the existant notion of scope as a > contiguous program fragment. A scope can now have holes in it. > Will users ever understand this? (Although they are /very/?similar > to the flow-based rules for DA/DU.) > > *ASIDE*?Regardless of whether we opt for (b) or (c) we may consider a > further extension where we allow the scope to extend beyond the > current statement for the case of an unbalanced |if|?statement. For > example > > |``` if (!(o matches T t)) { return; } // t in scope ... 
return; ```|
>
>   * *+ve* Supports a common idiom where else blocks are not needed
>   * *-ve* Yet further complication of notion of scope.
>
>

From guy.steele at oracle.com Mon Nov 20 18:33:15 2017
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 20 Nov 2017 13:33:15 -0500
Subject: PM design question: Scopes
In-Reply-To: <97e5c94e-b822-a0a7-5e5c-57efee1bbb72@oracle.com>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com> <97e5c94e-b822-a0a7-5e5c-57efee1bbb72@oracle.com>
Message-ID: <3B51BA91-947D-46E2-8004-859EA8AAB81C@oracle.com>

I like this.

One question: what does this new theory have to say about the situation

    switch (x) {
        case Foo(int x):
            int y = x;
            // fall through
        case Bar(int x, int y):
            ...
    }

? Perhaps it is forbidden because the "int y" in the pattern would shadow the "int y" in the earlier declaration? Or can the two be merged?

--Guy

> On Nov 20, 2017, at 1:17 PM, Brian Goetz wrote:
>
>
> We had a long meeting regarding scoping and shadowing of pattern variables. We ended up in a good place, and we were all a bit surprised at where it seems to be pointing.
>
> We started with two use cases that we thought were important:
>
> Re-use of binding variables:
>
>     switch (x) {
>         case Foo(var a): ... break;
>         case Bar(var a): ...
>     }
>
> Short-circuiting tests:
>
>     if (!(x matches Foo(var a)))
>         throw new NotFooException();
>     // use a here
>
> We had a few nice-to-haves:
>  - that binding variables should be ordinary variables, not something new;
>  - that binding, when assigned, be final
>
> Where we expected to land was something like:
>  - binding variables are treated as blank finals
>  - binding variables are hoisted into a synthetic block, which starts right before the statement containing the expression defining the binding
>  - it is permitted for locals to shadow other locals that are DU at the point of shadowing.
(This, as a bonus, would rescue the existing unfortunate scoping of local variables defined in switch blocks.) > > We thought this was a sensible place to land because it built on the existing notion of scoping and local variables. The remaining question, it seemed, was: "where does this synthetic scope end." > > First, a note about where the scope starts. Consider: > > if (e1 && x matches Foo(var a)) { > ... > } > > Logically, we'd like to start the scope for `a` right where it is first declared; this is how locals work. But, if we want to maintain the existing concept of local variable scope, it has to start earlier. The latest candidate is right before the if starts; we act as if there is an invisible { ... } containing the entirety of the if statement, and declare `a` there. > > This means, though, that the scope of `a` includes `e1`, even though `a` is declared later. This is confusing, but maybe we can ignore this, and provide a clear diagnostic if the user stumbles across it. > > So, where does the scope end? The obvious candidate is right after the if statement. This means `a` is in scope for the entire if-else, but, because it is DU in the else-blocks, can be reused if we adopt the "shadowing OK if DU" rule. > > FWIW, the "shadowing ok if DU" rule is clever, and gives us the behavior we want for switch / if-else chains with patterns, but has some collateral damage. For example, the following would become valid code: > > int x; // declared but never used > float x = 1.0f; // acceptable shadowing of int x > > Again, maybe we can ignore this. But where things really blew up was attempting to handle the short-circuiting if case: > > if (!(x matches Foo(var a)) > throw new NotFooException(); > // use a here > > For this to work, we'd have to extend the scope to the end of the block containing the if statement. Now, given our "shadowing is OK if DU rule", this is fine, right? Not so fast. 
In this simpler case: > > if (x matches Foo(var b)) { } > // try to reuse b here, I dare you > > we find that > - B is neither DU nor DA after the if, so we can't shadow it; > - B is final and not DU, so we can't write to it; > - B is not DA, so we can't use it. > > In other words, B is a permanent toxic waste zone, we can neither use, nor redeclare, nor assign it. Urk. > > Note too that our scoping rule is not really about unbalanced ifs; it's about abrupt completion. This is reasonable too: > > if (x matches Foo(var a)) { > println("Matched!"); > } > else > throw new NotFooException(); > // reasonable to use a here too! > > Taking stock: our goal here was to try and use normal scopes and blank final semantics to describe binding variables, out of a desire to not introduce new concepts. But it's a bad fit; the scope may be unnaturally large on the beginning side, and wherever we set the end of the scope, we end up in a choice of bad situations (either something we want in scope is not, or something we don't want in scope is.) So traditional scopes are just a bad approximation, and what we gain in "reusing familiar concepts", we lose in the mismatch. > > > STEPPING BACK > > What we realized at this point is that the essence of binding variables is their _conditionality_. There is not a single logical old-style scope that describes the right set of places for a binding to be in scope, but there is a well-defined control-flow analysis that tells us exactly where we can use the binding, and where we can't. This is the flow-scoping construct we initially worried was too "new and different." But, after some further thought, and a few tweaks, this seems exactly what we want, and I think can be made understandable. > > The basic idea behind flow-scoping is: a binding variable is in scope where it is well-defined, and not in scope when it is not. 
We'll provide a complete calculus, but the key thing to understand is that the rules of flow scoping are just plain old DA/DU; if a binding is DA, then it is well-defined. > > In particular, flow-scoping can handle abrupt termination naturally; for a statement: > > if (x matches Foo(var a)) { A } > else { B } > C > > the scope of `a` includes A, and also includes C iff B completes abruptly. We can easily explain this as: > - if x matches Foo(var a), we execute the A block, and in this case `a` is clearly well-defined (as we'd not execute A if the match failed); > - The only way to reach C, if B completes abruptly, is if the match succeeds, so `a` is well defined during C in this case too. > > Because the scope of a binding variable is precisely the cases in which it is well defined, there is no need to tinker with shadowing. > > Conditional variables can now always be final, because they will never be in scope and not DA. > > Similarly, folding reachability into scoping for conditional variables also means that fallthrough has a well-defined meaning. If we have: > > case Foo(int x): ... break; > case Bar(int x): .... > > then the Bar case is not reachable from where x would be initialized, so the first x is not in scope when the second x is declared, and everything is great. On the other hand: > > case Foo(int x): ... no break ... > case Bar(int x): ... A ... > > now x is well-defined in A, no matter how we got there. (The merging of the two xs is the same merging we have to do anyway for "if (x matches Foo(int a) || x matches Bar(int a)".) > > > People had originally expressed concern that flow-scoping leaves a scope "with holes", and allows puzzlers with shadowing of fields. (This is the "swiss cheese" problem.) For example: > > // Field > String s > > if (!(x matches String s)) { > a(s); > } > else { > b(s); > } > > This would be confusing because the `s` passed to a() is the field, but the `s` passed to b() is the binding. 
But, there's a really simple way to prevent this: do not allow conditional variables to shadow fields or locals. Now, there is no chance of this confusion, and this is not a big constraint, because the names of conditional variables are strictly local. (Further, we can disallow shadowing of in-scope conditional variables by locals (or other conditional variables.)) > > > Scorecard: > - Relatively straightforward to spec, as we have a clean calculus for flow-scoped conditional variables; > - Relatively straightforward to implement (our prototype already does this); > - One new concept: conditional variables; > - Conditional vars are scope where they make sense, and not in scope where they do not, cannot be assigned to (always DA and final when in scope), and are never in scope when not DA; > - No changes to shadowing; > - Meets all the target use cases. > > > > > On 11/3/2017 6:44 AM, Gavin Bierman wrote: >> Scopes >> >> Java has five constructs that introduce fresh variables into scope: the local variable declaration statement, the for statement, the try-with-resources statement, the catch block, and lambda expressions. The first, local variable declaration statements, introduce variables that are in scope for the rest of the block that it is declared in. The others introduce variables that are limited in their scope. >> >> The addition of pattern matching brings a new expression, matches, and extends the switch statement. Both these constructs can now introduce fresh (and, if the pattern match succeeds, definitely assigned (DA)) variables. But the question is what is the scope of these ?pattern? variables? >> >> Let us consider the pattern matching constructs in turn. First the switch statement: >> >> switch (o) { >> case int i: ... >> case .. >> } >> What is the scope of the pattern variable i? There are a range of options. >> >> The scope of the pattern variable is from the start of the switch statement until the end of the enclosing block. 
>>
>> In this case the pattern variable is in scope but would be definitely unassigned (DU) immediately after the switch statement.
>>
>>     switch (o) {
>>         case int i : ... // DA
>>                      ... // DA
>>         case T t :       // i is in scope
>>     }
>>     ... // i is still in scope and DU
>> +ve Simple
>> -ve Can't simply reuse a pattern variable in the same switch statement (without some form of shadowing)
>> -ve Pattern variable poisons the rest of the block
>> The scope of the pattern variable extends only to the end of the switch block.
>>
>> In this case the pattern variable would be considered DA only for the statements between the current case label and the subsequent case labeled statement. For example:
>>
>>     switch (o) {
>>         case int i : ... // DA
>>                      ... // DA
>>         case T t :       // i is in scope but not DA
>>     }
>>     ... // i not in scope
>> +ve Simple
>> +ve Pattern variables not poisoned in subsequent statements in the rest of the block
>> +ve Similar technique to `for` identifiers (not a new idea)
>> -ve Can't simply reuse a pattern variable in the same switch statement (without some form of shadowing)
>> The scope of the pattern variable extends only to the next case label.
>>
>>     switch (o) {
>>         case int i : ... // in scope and DA
>>                      ... // in scope and DA
>>         case T i :       // int i not in scope, so can re-use
>>     }
>>     ... // i not in scope
>> +ve Simple syntactic rule
>> +ve Allows reuse of pattern variable in the same switch statement.
>> -ve Doesn't make sense for fallthrough
>> NOTE This final point is important - supporting fallthrough impacts on what solution we might choose for scoping of pattern variables. (We could not support fallthrough and instead support OR patterns - a further design dimension.)
>>
>> ASIDE Should we support a switch expression; it seems clear that scoping should be treated in the same way as it is for lambda expressions.
>>
>> The matches expression is unusual in that it is an expression that introduces a fresh variable. What is the scope of this variable?
>> We want it to be more than the expression itself, as we want the following example code to be correct:
>>
>>     if (e matches String s) {
>>         System.out.println("It's a string - " + s);
>>     }
>> In other words, the variable introduced by the pattern needs to be in scope for an enclosing IfThen statement.
>>
>> However, a match expression could be nested within another expression. It seems reasonable that the pattern variables are in scope for at least the rest of the expression. For example:
>>
>>     (e matches String s || s.length() > 0)
>> Here the s should be in scope for the subexpression s.length (although it is not DA). In contrast:
>>
>>     (e matches String s && s.length() > 0)
>> Here the s is both in scope and DA for the subexpression s.length.
>>
>> However, what about the following:
>>
>>     if (s.length() > 0 && e matches String s) {
>>         System.out.println(s);
>>     }
>> Given the idea that a pattern variable flows from the inside-out to the enclosing statement, it would appear that s is in scope for the subexpression s.length; although it is not DA. Unless we want scopes to be non-contiguous, we will have to accept this rather odd situation (consider where s shadows a field). [This appears to be what happens in the current C# compiler.]
>>
>> Now let's consider how far a pattern variable flows wrt its enclosing statement. We have a range of options:
>>
>> (a) The scope is both the statement that the match expression occurs in and the rest of the block. In this scenario,
>>
>>     if (o matches T t) {
>>         ...
>>     } else {
>>         ...
>>     }
>> is treated as equivalent to the following pseudo-code (where match-and-bind is a fictional pattern matching construct that pattern-matches and binds to a variable that has already been declared)
>>
>>     T t;
>>     if (o match-and-bind t) {
>>         // t in scope and DA
>>     } else {
>>         // t in scope and DU
>>     }
>>     // t in scope and DU
>> This is how the current C# compiler works (although the spec describes the next option; so perhaps this is a bug).
>>
>> (b) The scope is just the statement that the match expression occurs in. In this scenario,
>>
>>     if (o matches T t) {
>>         ...
>>     } else {
>>
>>     }
>>     ...
>> is treated as equivalent to the pseudo-code
>>
>>     { T t;
>>       if (o match-and-bind t) {
>>           // t in scope and DA
>>       } else {
>>           // t in scope and DU
>>           // thus declaration int t = 42; is not allowed.
>>       }
>>     }
>>     // t not in scope
>>     ...
>> This restricted scope allows reuse of pattern variables, e.g.
>>
>>     if (o matches T x) { ... }
>>     if (o matches S x) { ... }
>> (c) The scope of the pattern variable is determined by a flow analysis of the enclosing statement. (It could be thought of as a refinement of option b.) This is currently implemented in the prototype compiler. For example:
>>
>>     if (!!(o matches T t)) {
>>         // t in scope
>>     } else {
>>         // t not in scope
>>     }
>> +ve Code will work in the presence of most refactorings
>> +ve We have this code working already :-)
>> -ve This is a break with the existing notion of scope as a contiguous program fragment. A scope can now have holes in it. Will users ever understand this? (Although they are very similar to the flow-based rules for DA/DU.)
>> ASIDE Regardless of whether we opt for (b) or (c) we may consider a further extension where we allow the scope to extend beyond the current statement for the case of an unbalanced if statement. For example
>>
>> ```
>> if (!(o matches T t)) {
>>     return;
>> }
>> // t in scope
>> ...
>> return;
>> ```
>> +ve Supports a common idiom where else blocks are not needed
>> -ve Yet further complication of notion of scope.
>>
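The flow-scoping rules debated in this thread can be tried out directly. What follows is a minimal sketch, assuming the `instanceof` type-pattern spelling that Java later shipped (Java 16 and up) in place of the `matches` operator used in these mails; the class and method names are made up for illustration.

```java
// Sketch only: `instanceof` patterns (Java 16+) stand in for the `matches`
// operator discussed in this thread. All names here are hypothetical.
final class FlowScopeSketch {

    // The binding is usable on the right of && because the match has
    // definitely succeeded by the time that operand is evaluated.
    static String describe(Object o) {
        if (o instanceof String s && s.length() > 0) {
            return "non-empty string: " + s;   // s in scope and DA
        }
        return "something else";               // s not in scope here
    }

    // The "unbalanced if" idiom: the then-branch completes abruptly, so the
    // binding stays in scope for the statements that follow the if.
    static int lengthOrMinusOne(Object o) {
        if (!(o instanceof String s)) {
            return -1;
        }
        return s.length();                     // s in scope and DA
    }

    // The same binding name can be reused in sibling statements, because
    // each binding is scoped only to where its match succeeded.
    static String kind(Object o) {
        if (o instanceof Integer x) { return "int " + x; }
        if (o instanceof String x)  { return "string " + x; }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe("hi"));
        System.out.println(lengthOrMinusOne(42));
        System.out.println(kind("abc"));
    }
}
```

Writing `o instanceof String s || s.length() > 0` instead of the `&&` form is rejected by the compiler, which matches the "in scope only where DA" position taken in the proposal above.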
From brian.goetz at oracle.com  Mon Nov 20 18:55:56 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 20 Nov 2017 13:55:56 -0500
Subject: PM design question: Scopes
In-Reply-To: <3B51BA91-947D-46E2-8004-859EA8AAB81C@oracle.com>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
 <97e5c94e-b822-a0a7-5e5c-57efee1bbb72@oracle.com>
 <3B51BA91-947D-46E2-8004-859EA8AAB81C@oracle.com>
Message-ID: <8d6f45f0-8ae3-31c6-bb7e-77669f24c17c@oracle.com>

On 11/20/2017 1:33 PM, Guy Steele wrote:
> I like this. One question: what does this new theory have to say
> about the situation
>
>     switch (x) {
>       case Foo(int x):
>         int y = x;
>         // fall through
>       case Bar(int x, int y):
>         ...
>     }

In this case, I would say that the second y is shadowing the first, and therefore this is an error. Trying to merge the ys seems like a heroic measure. Merging the xs, on the other hand, is clean, because at the point where the second x is bound, the first x is DU (we'd skip over the Bar(int x, int y) binding if we had matched the first case.)

The opposite example is also interesting:

    case Foo(int x, int y):
        // A
        // fall through
    case Bar(int x):
        // B

Here, y is in scope in A, but not in B; x is in scope in both A and B.

From guy.steele at oracle.com  Mon Nov 20 19:02:31 2017
From: guy.steele at oracle.com (Guy Steele)
Date: Mon, 20 Nov 2017 14:02:31 -0500
Subject: PM design question: Scopes
In-Reply-To: <8d6f45f0-8ae3-31c6-bb7e-77669f24c17c@oracle.com>
References: <27805597-0665-41D9-996D-6BEBA77B8ADA@oracle.com>
 <97e5c94e-b822-a0a7-5e5c-57efee1bbb72@oracle.com>
 <3B51BA91-947D-46E2-8004-859EA8AAB81C@oracle.com>
 <8d6f45f0-8ae3-31c6-bb7e-77669f24c17c@oracle.com>
Message-ID: <9C5F5D46-517A-4BA7-A649-0D57FBF2FE04@oracle.com>

Okay, thanks for this clarification. I am not a big fan of fall-through, and I think we could live with this example being an error.
(If it were to work as a natural consequence of whatever theory we finally adopt, I predict that it would get used in exactly this sort of defaulting situation. On the other hand, it is not a completely general solution to the defaulting problem; consider

    switch (x) {
      case Quux(int y):
        int x = y;
        // would also like to get to the point after case Bar, but can't "fall through" into it.
      case Foo(int x):
        int y = x;
        // fall through
      case Bar(int x, int y):
        ...
    }

so perhaps it is just as well that we not encourage it.)

> On Nov 20, 2017, at 1:55 PM, Brian Goetz wrote:
>
>
> On 11/20/2017 1:33 PM, Guy Steele wrote:
>> I like this. One question: what does this new theory have to say about the situation
>>
>>     switch (x) {
>>       case Foo(int x):
>>         int y = x;
>>         // fall through
>>       case Bar(int x, int y):
>>         ...
>>     }
>
> In this case, I would say that the second y is shadowing the first, and therefore this is an error. Trying to merge the ys seems like a heroic measure. Merging the xs, on the other hand, is clean, because at the point where the second x is bound, the first x is DU (we'd skip over the Bar(int x, int y) binding if we had matched the first case.)
>
> The opposite example is also interesting:
>
>     case Foo(int x, int y):
>         // A
>         // fall through
>     case Bar(int x):
>         // B
>
> Here, y is in scope in A, but not in B; x is in scope in both A and B.
>

From vicente.romero at oracle.com  Tue Nov 14 15:52:37 2017
From: vicente.romero at oracle.com (Vicente Romero)
Date: Tue, 14 Nov 2017 10:52:37 -0500
Subject: abstract datum ??
In-Reply-To: <1028092349.607258.1510611445552.JavaMail.zimbra@u-pem.fr>
References: <98dcfc75-936d-ae46-992f-ee10fffae025@oracle.com>
 <77C708F7-4C15-47A7-8A93-E8FD7121D441@yahoo.co.uk>
 <06b61c48-55ed-964d-0192-4fb03d48873b@oracle.com>
 <1251411247.2303541.1510323358904.JavaMail.zimbra@u-pem.fr>
 <1028092349.607258.1510611445552.JavaMail.zimbra@u-pem.fr>
Message-ID: <45366443-5f55-e2e8-deba-796b5ede6fe1@oracle.com>

On 11/13/2017 05:17 PM, forax at univ-mlv.fr wrote:
>> Hi Remi,
>>
> Hi Vicente,
>
> [...]
>
>>> But more fundamentally, given that we now have default methods and an easy way
>>> to declare fields, I do not understand why the spec allows declaring an abstract
>>> datum.
>>>
>>> For example,
>>>     interface Foo {
>>>       abstract int m();
>>>     }
>>>     abstract datum AbstractFoo(int x) implements Foo {
>>>       public int m() { return x * 2; }
>>>     }
>>>     datum Bar(int x, int y) extends AbstractFoo(x);
>>>
>>> can always be simplified to:
>>>     interface Foo {
>>>       default int m() { return x() * 2; }
>>>       abstract int x();
>>>     }
>>>     datum Bar(int x, int y) implements Foo;
>> I think that what you are proposing is also a valid approach. I don't
>> think that the current one is written in stone so we have to evaluate
>> both. The benefit of abstract datum I would say is that you can abstract
>> not only behavior but data too in one place.
> my example above also abstracts over data.
>
>> Plus that you can create data-classes-only APIs.
> yes, very true
>
>> Will this be enough to support having abstract datum? Well, I guess that the use cases will benefit more from one or the other.
>> Also, having abstract datum gives the users the benefit of using one approach or the other, which I think is better than having only one option
> having two ways of doing the same thing is usually not something you want, unless there are clear cases where one approach is better than the other and vice-versa.
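The interface-based rewrite in the example above can be tried with syntax that exists today. This is a minimal sketch, assuming the `record` form that data classes eventually became (Java 16 and up) in place of the `datum` spelling used in this thread; `Foo` and `Bar` follow the quoted example, and the wrapper class name is made up.

```java
// Sketch only: `record` (Java 16+) stands in for `datum`; the wrapper
// class DefaultMethodSketch is hypothetical.
final class DefaultMethodSketch {

    interface Foo {
        int x();                              // satisfied by the record's generated accessor
        default int m() { return x() * 2; }   // shared behavior lives in the interface
    }

    // No abstract base class: the record supplies the data, the interface
    // supplies the behavior that depends on it.
    record Bar(int x, int y) implements Foo { }

    public static void main(String[] args) {
        Foo foo = new Bar(21, 0);
        System.out.println(foo.m());          // prints 42
    }
}
```

The generated accessor `x()` is what implements the interface method, which is the same mechanism Maurizio describes earlier in the archive for mandated members.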
>
> Now, let's see the arguments against using an abstract class:
> - An abstract class contains implementation details. In theory, an abstract class is independent of its subclasses, but in reality, because an abstract class shares part of the subclass implementations, an abstract class and its subclasses are strongly coupled. To avoid that, abstract classes should be non-visible (like AbstractStringBuilder); given that a lot of people do not do that, using an abstract class is harder than one may think, so not introducing a way to specify an abstract data class is a win.
> - generating code inside an abstract class and the subclass, and having separate compilation that works, is hard; for example, generating the generics bridges in a hierarchy is hard.
> As another example, the code below currently throws a VerifyError at runtime.
>
>     public class AbstractExample {
>       static abstract __datum A(int v) {
>         public final boolean equals(Object o) {
>           return o == this;
>         }
>       }
>
>       static __datum B(int v) extends A(v);
>
>       public static void main(String[] args) {
>         B b = new B(42);
>       }
>     }
>
> so I think we should keep the design simple and avoid abstract data classes.

We should have an amber meeting soon and consider all the pros and cons of each option.

>
>> Vicente
> Rémi

Vicente

From brian.goetz at oracle.com  Wed Nov 29 21:54:03 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 29 Nov 2017 16:54:03 -0500
Subject: Guards -- not just for switch!
Message-ID: <84528b93-96c1-a3ea-9cbe-84f3f955e340@oracle.com>

As we've swirled around the design space on pattern matching, it seems the sensible place to land is to not provide explicit syntax for AND and OR patterns, but to support guards on case labels:

    case Foo(String s)
        where (s.length() > 0): ...

In the course of a separate discussion, we realized that guards can profitably go in some other places, too. Like methods/constructors:

    public Range(int lo, int hi)
        where (lo <= hi) {
            this.lo = lo;
            this.hi = hi;
    }

and the compiler will insert a

            if (!(lo <= hi))
                throw new IllegalArgumentException(String.format("Precondition lo <= hi violated; lo=%s, hi=%s", lo, hi));

at the top of the method. We already have throws clauses in this position, so putting a guard clause here isn't too weird.

Hoisting preconditions into "where" clauses has several benefits:
 - Constraints are easier to find when reading the code;
 - Constraints can be automatically hoisted into the Javadoc as preconditions;
 - More compact / pleasant way to write precondition checks, which means users are more likely to actually specify / check input constraints;
 - The compiler is likely to generate a more informative exception message than the user would be.

We can keep pulling on this string, and put these on data classes (new name: records) too:

    record Range(int lo, int hi)
        where (lo <= hi);

and the where clause just gets lowered onto the default ctor. (We should probably balk if it mentions a non-final component, as we can never find all the writes (reflection!), and we don't want to give users a false sense of confidence.)

For those who asked for non-nullity support, this comes for free with guards:

    record Foo(String s)
        where s != null;

From guy.steele at oracle.com  Wed Nov 29 21:49:23 2017
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 29 Nov 2017 16:49:23 -0500
Subject: Guards -- not just for switch!
In-Reply-To: <84528b93-96c1-a3ea-9cbe-84f3f955e340@oracle.com>
References: <84528b93-96c1-a3ea-9cbe-84f3f955e340@oracle.com>
Message-ID:

Um, let's be careful here. See below.

> On Nov 29, 2017, at 4:54 PM, Brian Goetz wrote:
>
> As we've swirled around the design space on pattern matching, it seems the sensible place to land is to not provide explicit syntax for AND and OR patterns, but to support guards on case labels:
>
>     case Foo(String s)
>         where (s.length() > 0): ...
>
> In the course of a separate discussion, we realized that guards can profitably go in some other places, too. Like methods/constructors:
>
>     public Range(int lo, int hi)
>         where (lo <= hi) {
>             this.lo = lo;
>             this.hi = hi;
>     }
>

The two situations are not analogous, because if a case clause fails you just try the next one, but if a constructor fails you don't try another constructor.

If we use the term "where" on a constructor, then Joe Programmer (that's me) will probably be very surprised if

    switch (x) {
      case Foo(String s)
          where (s.length() > 0): ...
      case Foo(String s)
          where (s.length() == 0): ...
    }

works, but

    public Interval(int lo, int hi)
        where (lo <= hi) {
            this.lo = lo;
            this.hi = hi;
            this.exterior = false;
    }

    public Interval(int lo, int hi)
        where (lo > hi) {
            this.lo = lo;
            this.hi = hi;
            this.exterior = true;
    }

does not work.

It might make sense to use a word such as "requires" or "assert" rather than "where" in the constructor case.

--Guy

From forax at univ-mlv.fr  Wed Nov 29 22:20:50 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 29 Nov 2017 23:20:50 +0100 (CET)
Subject: Guards -- not just for switch!
In-Reply-To:
References: <84528b93-96c1-a3ea-9cbe-84f3f955e340@oracle.com>
Message-ID: <1521139510.1966549.1511994050404.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Guy Steele"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Wednesday, November 29, 2017 22:49:23
> Subject: Re: Guards -- not just for switch!

> Um, let's be careful here. See below.
>
>> On Nov 29, 2017, at 4:54 PM, Brian Goetz wrote:
>>
>> As we've swirled around the design space on pattern matching, it seems the
>> sensible place to land is to not provide explicit syntax for AND and OR
>> patterns, but to support guards on case labels:
>>
>>     case Foo(String s)
>>         where (s.length() > 0): ...
>>
>> In the course of a separate discussion, we realized that guards can profitably
>> go in some other places, too.
>> Like methods/constructors:
>>
>>     public Range(int lo, int hi)
>>         where (lo <= hi) {
>>             this.lo = lo;
>>             this.hi = hi;
>>     }
>>
>
> The two situations are not analogous, because if a case clause fails you just
> try the next one, but if a constructor fails you don't try another constructor.
>
> If we use the term "where" on a constructor, then Joe Programmer (that's me)

I was thinking you were more a random Guy :)

> will probably be very surprised if
>
>     switch (x) {
>       case Foo(String s)
>           where (s.length() > 0): ...
>       case Foo(String s)
>           where (s.length() == 0): ...
>     }
>
>
> works, but
>
>     public Interval(int lo, int hi)
>         where (lo <= hi) {
>             this.lo = lo;
>             this.hi = hi;
>             this.exterior = false;
>     }
>
>     public Interval(int lo, int hi)
>         where (lo > hi) {
>             this.lo = lo;
>             this.hi = hi;
>             this.exterior = true;
>     }
>
> does not work.
>
> It might make sense to use a word such as "requires" or "assert" rather than
> "where" in the constructor case.

yes, it's the whole design by contract thingy, I vote for "requires" like in Eiffel,

    record Apple(int seed)
        requires seed >= 0;

>
> --Guy

Rémi

From amaembo at gmail.com  Thu Nov 30 00:24:05 2017
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 30 Nov 2017 07:24:05 +0700
Subject: Guards -- not just for switch!
In-Reply-To: <84528b93-96c1-a3ea-9cbe-84f3f955e340@oracle.com>
References: <84528b93-96c1-a3ea-9cbe-84f3f955e340@oracle.com>
Message-ID:

    public Range(int lo, int hi)
        where (lo <= hi) {
            this.lo = lo;
            this.hi = hi;
    }

Probably it's too early to argue about syntax, but now it's too similar to the 'while' loop. The 'where' keyword really looks like 'while' (same length, prefix and suffix) and just like the 'while' loop it's followed by a parenthesized boolean expression and a code block. I bet this would become a source of confusion when reading the code. The 'requires' keyword, as Remi suggests, sounds much better.

With best regards,
Tagir Valeev
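The lowering described at the start of this thread for a constructor guard can be written out by hand. This is a minimal sketch, assuming the compiler would emit roughly the check quoted in that first message; the class name and the `rejects` helper are illustrative, and no `where` or `requires` syntax appears because neither exists in Java.

```java
// Sketch only: a hand-written stand-in for the proposed guard
//     public Range(int lo, int hi) where (lo <= hi) { ... }
// RangeSketch and rejects(...) are hypothetical names.
final class RangeSketch {
    final int lo;
    final int hi;

    RangeSketch(int lo, int hi) {
        // The check the compiler is described as inserting at the top.
        if (!(lo <= hi)) {
            throw new IllegalArgumentException(
                String.format("Precondition lo <= hi violated; lo=%s, hi=%s", lo, hi));
        }
        this.lo = lo;
        this.hi = hi;
    }

    // Helper for the demo: true if construction fails the precondition.
    static boolean rejects(int lo, int hi) {
        try {
            new RangeSketch(lo, hi);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new RangeSketch(1, 5).hi);   // prints 5
        System.out.println(rejects(5, 1));              // prints true
    }
}
```

A failed check here throws; it does not cause some other constructor to be tried. That is the asymmetry with guarded case labels raised in the replies above.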
From brian.goetz at oracle.com  Thu Nov 30 18:29:41 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 30 Nov 2017 13:29:41 -0500
Subject: Reader mail bag
Message-ID:

We've gotten two submissions to the amber-spec-comments list.

1. "Forcing a pattern match", at:
http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000000.html

2. "Named parameters in data classes", at:
http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000003.html

I think the first asks for an alternate form of the "matches" operator which would fail with a CCE, rather than evaluating to false, if the match fails. This would be essentially saying, "This should match; if it doesn't, that's an error." I think that's what's being asked, but then he talks about an "unnecessary instanceof check", which makes me wonder whether this is really just about optimization.

To be clear, the instanceof check is neither expensive nor unnecessary. (Though we can optimize these away when we know the static type of the target; if x is a Foo, then "x matches Foo f" can be statically strength-reduced to "x != null".)

Note that the "if member.getKind() == VARIABLE" is merely a manual optimization; you could easily leave that out and just match against VariableTree. What we'd rather focus on is how to get to:

    switch (member) {
        case BlockTree bt: ...
        case VariableTree vt: ...
    }

while allowing the pattern to capture the quicker pre-test (kind == BLOCK) and maintain the performance without making the user worry about this. We have some ideas here, but I don't think this "forcing" idea really offers a lot.

The second (named parameters) was a question I was expecting.

I agree that being able to invoke constructors and methods by name can sometimes result in easier-to-read code, especially when some parameters have a sensible default value.
(This is also a more complicated feature than anyone gives it credit for, so it's not the "gimme" it is often assumed to be.)

However, data classes are not the place to cram in this feature; this should be a feature that stands on its own, and applies to all classes, data- or not. One of the design goals for data classes is that a data class should be essentially a "macro" for a class you could write by hand; this allows easy migration from existing classes to data classes (if they meet the requirements) and from data classes to full classes if they expand to no longer fit the data class profile. The more that *uses* of data classes or their members are different from the corresponding use of regular classes, the more difficult this migration becomes. (This is not unlike the design mandate with default methods; from the client perspective, they're just ordinary virtual methods, and you can't tell whether the method was implemented directly or inherited -- they're just methods.)

So, while named parameters are a reasonable feature to explore, trying to staple them onto data classes would be a mistake. They are their own feature. We're open to exploring it, but we've got our plate full for now.

From guy.steele at oracle.com  Thu Nov 30 18:31:49 2017
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 30 Nov 2017 13:31:49 -0500
Subject: Reader mail bag
In-Reply-To:
References:
Message-ID:

> On Nov 30, 2017, at 1:29 PM, Brian Goetz wrote:
> . . .
> So, while named parameters are a reasonable feature to explore, trying to staple them onto data classes would be a mistake. They are their own feature. We're open to exploring it, but we've got our plate full for now.

I strongly concur.
--Guy

From forax at univ-mlv.fr  Thu Nov 30 20:21:01 2017
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 30 Nov 2017 21:21:01 +0100 (CET)
Subject: Reader mail bag
In-Reply-To:
References:
Message-ID: <542972726.2360777.1512073261898.JavaMail.zimbra@u-pem.fr>

My note on named parameters:
Supporting real named parameters that work with overriding and backward compatibility is hard, but I believe there is a sweet spot.

- a method can declare that it supports named parameters by using a new modifier 'named'.
  so
  - all parameters are named or none are; you cannot mix positional and named parameters like in Ruby.
  - a method supports either positional parameters or named parameters, but not both at the same time.
  if there are several overloads, you can have only one named overload with the same number of parameters,
  which means that counting the number of parameters is enough to know which method can be called.
  a varargs method cannot be named.

- when overriding a method, a named method cannot override a non-named method.
  overriding a named method with a non-named method is allowed to ease the transition, but emits a warning.
  a named method that overrides another named method needs to have the same parameters with the same names at the same positions.

- at the call site, a named method has to be called using the syntax "name: argument" for each argument,
  for example:
    ThreadGroup main = ...
    new ThreadGroup(parent: main, name: "my group");
  if a named method is called with positional arguments, the compiler emits a warning.

- at compile time,
  at the declaration site, all parameter names are stored in the MethodParameters attribute.
  at the call site, the compiler inserts an invokedynamic to the NamedParameterMetaFactory with a method handle ref to the declared named method and the names of all arguments in the order of the call.
  at runtime, the NamedParameterMetaFactory verifies that the named arguments are the same in number as, and a permutation of, the declared parameters, which are retrieved by cracking the method handle ref as a MethodHandleInfo and calling getParameters on the corresponding Constructor/Method; once the permutation is calculated, the NamedParameterMetaFactory returns a ConstantCallSite on the constant method handle (the method handle ref) permuted using MethodHandles.permuteArguments.

To summarize, this lets us use named parameters if the API designer wants to allow that;
there is a path for upgrading a method that uses positional parameters to use named parameters, but not the other way around (like varargs);
not supporting permutations between overridden methods makes things far simpler than they would be otherwise.

cheers,
Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Thursday, November 30, 2017 19:29:41
> Subject: Reader mail bag

> We've gotten two submissions to the amber-spec-comments list.
>
> 1. "Forcing a pattern match", at:
> http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000000.html
>
> 2. "Named parameters in data classes", at:
> http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000003.html
>
>
> I think the first asks for an alternate form of the "matches" operator
> which would fail with a CCE, rather than evaluating to false, if the
> match fails. This would be essentially saying, "This should match; if
> it doesn't, that's an error." I think that's what's being asked, but
> then he talks about an "unnecessary instanceof check", which makes me
> wonder whether this is really just about optimization.
>
> To be clear, the instanceof check is neither expensive nor unnecessary.
> (Though we can optimize these away when we know the static type of the
> target; if x is a Foo, then "x matches Foo f" can be statically
> strength-reduced to "x != null".)
>
> Note that the "if member.getKind() == VARIABLE" is merely a manual
> optimization; you could easily leave that out and just match against
> VariableTree. What we'd rather focus on is how to get to:
>
>     switch (member) {
>         case BlockTree bt: ...
>         case VariableTree vt: ...
>     }
>
> while allowing the pattern to capture the quicker pre-test (kind ==
> BLOCK) and maintain the performance without making the user worry about
> this. We have some ideas here, but I don't think this "forcing" idea
> really offers a lot.
>
>
> The second (named parameters) was a question I was expecting.
>
> I agree that being able to invoke constructors and methods by name can
> sometimes result in easier-to-read code, especially when some parameters
> have a sensible default value. (This is also a more complicated feature
> than anyone gives it credit for, so it's not the "gimme" it is often
> assumed to be.)
>
> However, data classes is not the place to cram in this feature; this
> should be a feature that stands on its own, and applies to all classes,
> data- or not. One of the design goals for data classes is that a data
> class should be essentially a "macro" for a class you could write by
> hand; this allows easy migration from existing classes to data classes
> (if they meet the requirements) and from data classes to full classes if
> they expand to no longer fit the data class profile. The more that
> *uses* of data classes or their members are different from the
> corresponding use of regular classes, the more difficult this migration
> becomes. (This is not unlike the design mandate with default methods;
> from the client perspective, they're just ordinary virtual methods, and
> you can't tell whether the method was implemented directly or inherited
> -- they're just methods.)
>
> So, while named parameters are a reasonable feature to explore, trying
> to staple them onto data classes would be a mistake. They are their own
> feature.
> We're open to exploring it, but we've got our plate full for now.

From guy.steele at oracle.com  Thu Nov 30 20:56:27 2017
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 30 Nov 2017 15:56:27 -0500
Subject: Named parameters [was: Re: Reader mail bag]
In-Reply-To: <542972726.2360777.1512073261898.JavaMail.zimbra@u-pem.fr>
References: <542972726.2360777.1512073261898.JavaMail.zimbra@u-pem.fr>
Message-ID: <3D92F35D-0CCB-437B-8CD5-C824219971D1@oracle.com>

Thanks, Remi, this is an excellent start! And it may be where we want to end up.

But when the time comes that we dig into a serious discussion of putting named method parameters into Java, I would like to see a broader exploration of the design space before we settle on a specific design. I've seen a lot of other designs for a lot of other languages, each with pros and cons.

There are at least four more-or-less orthogonal properties a method parameter can have:

(1) May it be specified by name (rather than by position) at the call site?
    If so, can it be specified _either_ by name or by position?

(2) May the corresponding actual argument be omitted at the call site?
    If so, what happens?
        System-specified default value (such as zero or null) is supplied.
        Programmer-specified default value is supplied.
            Value is a compile-time constant.
            Value is recomputed at call time.
                Can this computation depend on other argument values (such as those to the left)?
        No value is supplied.
            A separate mechanism allows inquiry as to whether an actual argument was provided.
            The parameter type is actually an option type (choice; if no value is supplied, you get an empty value).

(3) May the corresponding actual argument be duplicated at the call site? (This may make little sense for Java, but is used extensively in Common Lisp, where name-value lists may be built dynamically and then fed to `apply`; allowing duplications makes it easy to override a default by just sticking a new name-value pair onto the front of a list.)

(4) May the actual arguments be permuted at the call site -- that is, appear in an order other than the order in which they are declared in the method declaration? (Typically the answer is "no" for positional parameters, but may be "yes" or "no" for named parameters.) (If a call contains both positional and named arguments, one can ask whether the named arguments may be mixed in among the positional ones [yech!] or must be kept separate, such as always appearing to the right of the positional arguments.)

For each of the preceding four questions, there are these meta-questions:
    Is the answer to the question the same for all parameters whatsoever?
    Is the answer to the question the same for all parameters of the same kind (such as named or positional)?
    Is the answer to the question the same for all parameters in a single method declaration?
    Is the answer to the question the same for all parameters of the same kind (such as named or positional) in a single method declaration?

In addition, there is the question of exactly what combinations of positional, optional positional, named, and/or optional named parameters may be used within a single method declaration. And there is the question of what combinations of combinations may appear within an overload set.

Many of these questions are in fact answered by specific choices in the proposal below. I'm just looking forward to seeing (eventually) a thorough discussion of the rationale for each choice. I provide this list of questions as one possible starting point for that discussion.

--Guy

> On Nov 30, 2017, at 3:21 PM, Remi Forax wrote:
>
> My note on named parameters:
> Supporting real named parameters that work with overriding and backward compatibility is hard, but I believe there is a sweet spot.
>
> - a method can declare that it supports named parameters by using a new modifier 'named'.
>   so
>   - all parameters are named or none are; you cannot mix positional and named parameters like in Ruby.
> - a method is either support positional parameters or named parameters but not both at the same time.
>   if there are several overloads, you can only have one named overload with the same number of parameters,
>   which means that counting the number of parameters is enough to know which method can be called.
>   a varargs method can not be named.
>
> - when overriding a method, a named method can not override a non named method.
>   overriding a named method with a non named method is allowed to ease the transition but emit a warning.
>   a named method that overrides another named method need to have the same parameters with the same name at the same position.
>
> - at call site, a named method has to be called using the syntax "name: argument" for each arguments
>   by example:
>     ThreadGroup main = ...
>     new ThreadGroup(parent: main, name: "my group");
>   if a named method is called with positional arguments, the compiler emit a warning
>
> - at compile time,
>   a declaration site, all parameter named are stored in the Parameter attribute.
>   a callsite, the compile insert an invokedynamic to the NamedParameterMetaFactory with a method handle ref on the declared named method and the names of all arguments in the order of the call.
>   a runtime, the NamedParameterMetaFactory verifies that the number of named argument is the same and are a permutation of the declared parameters that are retrieved by cracking the method handle ref as a MethodHandleInfo and calling getParameters on the corresponding Constructor/Method, once the permutation is calculated, the NamedParameterMetaFactory returns a ConstantCallSite on the constant method handle (the method handle ref) permuted using MethodHandles.permuteArguments.
>
> To summarize, this let us use named parameters if the API designer want to allow that,
> there is a path for upgrading a method that uses positional parameters to use named parameters, but not in the other way (like varargs),
> not supporting permutations between overridden methods make thing far simpler that they are otherwise.
>
> cheers,
> Rémi
>
> ----- Original Message -----
>> From: "Brian Goetz"
>> To: "amber-spec-experts"
>> Sent: Thursday, November 30, 2017 19:29:41
>> Subject: Reader mail bag
>
>> We've gotten two submissions to the amber-spec-comments list.
>>
>> 1. "Forcing a pattern match", at:
>> http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000000.html
>>
>> 2. "Named parameters in data classes", at:
>> http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000003.html
>>
>>
>> I think the first asks for an alternate form of the "matches" operator
>> which would fail with a CCE, rather than evaluating to false, if the
>> match fails. This would be essentially saying, "This should match; if
>> it doesn't, that's an error." I think that's what's being asked, but
>> then he talks about an "unnecessary instanceof check", which makes me
>> wonder whether this is really just about optimization.
>>
>> To be clear, the instanceof check is neither expensive nor unnecessary.
>> (Though we can optimize these away when we know the static type of the
>> target; if x is a Foo, then "x matches Foo f" can be statically
>> strength-reduced to "x != null".)
>>
>> Note that the "if member.getKind() == VARIABLE" is merely a manual
>> optimization; you could easily leave that out and just match against
>> VariableTree. What we'd rather focus on is how to get to:
>>
>> switch (member) {
>>     case BlockTree bt: ...
>>     case VariableTree vt: ...
>> }
>>
>> while allowing the pattern to capture the quicker pre-test (kind ==
>> BLOCK) and maintain the performance without making the user worry about
>> this.
>> We have some ideas here, but I don't think this "forcing" idea
>> really offers a lot.
>>
>>
>> The second (named parameters) was a question I was expecting.
>>
>> I agree that being able to invoke constructors and methods by name can
>> sometimes result in easier-to-read code, especially when some parameters
>> have a sensible default value. (This is also a more complicated feature
>> than anyone gives it credit for, so it's not the "gimme" it is often
>> assumed to be.)
>>
>> However, data classes is not the place to cram in this feature; this
>> should be a feature that stands on its own, and applies to all classes,
>> data- or not. One of the design goals for data classes is that a data
>> class should be essentially a "macro" for a class you could write by
>> hand; this allows easy migration from existing classes to data classes
>> (if they meet the requirements) and from data classes to full classes if
>> they expand to no longer fit the data class profile. The more that
>> *uses* of data classes or their members are different from the
>> corresponding use of regular classes, the more difficult this migration
>> becomes. (This is not unlike the design mandate with default methods;
>> from the client perspective, they're just ordinary virtual methods, and
>> you can't tell whether the method was implemented directly or inherited
>> -- they're just methods.)
>>
>> So, while named parameters are a reasonable feature to explore, trying
>> to staple them onto data classes would be a mistake. They are their own
>> feature. We're open to exploring it, but we've got our plate full for now.
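
The linkage step in the quoted proposal (match the call-site names against the declared names, then reorder with MethodHandles.permuteArguments) can be sketched with today's java.lang.invoke API. This is only an illustration under stated assumptions: NamedParameterMetaFactory does not exist in the JDK, so the hypothetical helper link() below stands in for it, the declared names are passed in explicitly instead of being read from the class file, and describe() is an invented target method.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

class NamedArgsSketch {
    // Hypothetical stand-in for a method declared with the proposed 'named' modifier.
    public static String describe(String name, int priority) {
        return "name=" + name + ",priority=" + priority;
    }

    // Hypothetical linkage helper: adapts 'target' (parameters in declaration
    // order) into a handle that takes its arguments in call-site order.
    // Assumes the declared names are distinct, as Java parameter names are.
    public static MethodHandle link(MethodHandle target,
                                    List<String> declared,
                                    List<String> callSite) {
        int n = target.type().parameterCount();
        if (declared.size() != n || callSite.size() != n) {
            throw new IllegalArgumentException("wrong number of named arguments");
        }
        int[] reorder = new int[n];            // reorder[i]: call-site index feeding target parameter i
        Class<?>[] callTypes = new Class<?>[n];
        for (int i = 0; i < n; i++) {
            int j = callSite.indexOf(declared.get(i));
            if (j < 0) {
                throw new IllegalArgumentException("missing argument: " + declared.get(i));
            }
            reorder[i] = j;
            callTypes[j] = target.type().parameterType(i);
        }
        MethodType callType = MethodType.methodType(target.type().returnType(), callTypes);
        return MethodHandles.permuteArguments(target, callType, reorder);
    }

    // Simulates the call describe(priority: 7, name: "worker").
    public static String demo() {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    NamedArgsSketch.class, "describe",
                    MethodType.methodType(String.class, String.class, int.class));
            MethodHandle adapted = link(target,
                    List.of("name", "priority"),      // declaration order
                    List.of("priority", "name"));     // call-site order
            return (String) adapted.invoke(7, "worker");
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "name=worker,priority=7"
    }
}
```

In the proposal this work would happen once, in the bootstrap method of an invokedynamic call site, with the adapted handle wrapped in a ConstantCallSite; the sketch builds the handle directly to stay runnable on its own.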
From brian.goetz at oracle.com Thu Nov 30 21:41:53 2017
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 30 Nov 2017 16:41:53 -0500
Subject: Named parameters [was: Re: Reader mail bag]
In-Reply-To: <3D92F35D-0CCB-437B-8CD5-C824219971D1@oracle.com>
References: <542972726.2360777.1512073261898.JavaMail.zimbra@u-pem.fr> <3D92F35D-0CCB-437B-8CD5-C824219971D1@oracle.com>
Message-ID: <0f7b9ee8-335f-3463-2569-f056f3598144@oracle.com>

All of this, and: what happens when a client compiles against a given named method signature, and then, through separate compilation, the method signature changes, say, to add a new parameter? Dealing with this in some form seems essential, as the most common case where you want to invoke by name is when a method has many parameters (especially if many of those are optional), and, if a method has 17 parameters, surely some day it will have 18. And users will expect that adding a new parameter with a default is compatible. (Indy provides a nice mechanism for sewing up this gap, but we need to outline the requirements and the compatibility guarantees.)

From forax at univ-mlv.fr Thu Nov 30 23:14:01 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 1 Dec 2017 00:14:01 +0100 (CET)
Subject: Named parameters [was: Re: Reader mail bag]
In-Reply-To: <0f7b9ee8-335f-3463-2569-f056f3598144@oracle.com>
References: <542972726.2360777.1512073261898.JavaMail.zimbra@u-pem.fr> <3D92F35D-0CCB-437B-8CD5-C824219971D1@oracle.com> <0f7b9ee8-335f-3463-2569-f056f3598144@oracle.com>
Message-ID: <1523536455.2389877.1512083641090.JavaMail.zimbra@u-pem.fr>

What you want is to be able to use named parameters as builders.
In my opinion, this is really hard to combine with the overriding rules. But, as with @SafeVarargs, we can limit named parameters to constructors, static methods, private methods and final methods.

Indy can bridge overriding too, but it means you need to express the inlining cache with indy as well; you can not, as I've proposed, just permute the arguments with indy and let the VM do the method invocation. Using indy to manage the inlining cache will require a lot of tuning work to get the same perf as we have currently.

Rémi

----- Original Message -----
> From: "Brian Goetz"
> To: "Guy Steele", "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Thursday, November 30, 2017 22:41:53
> Subject: Re: Named parameters [was: Re: Reader mail bag]

> All of this, and: what happens when a client compiles against a given
> named method signature, and then, through separate compilation, the
> method signature changes, say, to add a new parameter?
> Dealing with this in some form seems essential, as, the most common
> cases where you want to invoke by name is when a method has many
> parameters (especially if many of those are optional), and, if a method
> has 17 parameters, surely some day it will have 18. And users will
> expect that adding a new parameter with a default is compatible. (Indy
> provides a nice mechanism for sewing up this gap, but, we need to
> outline the requirements and the compatibility guarantees.)

From forax at univ-mlv.fr Thu Nov 30 23:32:40 2017
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 1 Dec 2017 00:32:40 +0100 (CET)
Subject: Named parameters [was: Re: Reader mail bag]
In-Reply-To: <3D92F35D-0CCB-437B-8CD5-C824219971D1@oracle.com>
References: <542972726.2360777.1512073261898.JavaMail.zimbra@u-pem.fr> <3D92F35D-0CCB-437B-8CD5-C824219971D1@oracle.com>
Message-ID: <1728761728.2390596.1512084760387.JavaMail.zimbra@u-pem.fr>

I will just answer on the default argument part.

Java already has a way to specify default arguments: several overloads with different numbers of arguments. It's cumbersome, but it should not be too easy to specify default values.

I believe that default arguments are evil for two main reasons.

It's like playing God: it's not the user of an API who specifies the default value, it's the creator of the API, so it's the creator of the API saying to the world, "I know how you will be using my API; here are the default values you want." I think we should be more humble when developing APIs and not pretend that we know what the good defaults are for all our users.

Defaults are also part of the contract, but people tend to think loosely about that part. Defaults are implicit, so part of the configuration of an object or a method is not visible at the call site; this leads to misunderstandings between what a user of a method thinks the method does and what the method really does.
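
For readers less familiar with the idiom mentioned at the top of this message, here is a minimal sketch of defaults expressed through overloads. The class ServerConfigSketch, its fields and its default values are all hypothetical, chosen only for illustration.

```java
// Hypothetical example: defaults expressed through overloads ("telescoping" constructors).
class ServerConfigSketch {
    private final String host;
    private final int port;
    private final int timeoutMillis;

    // Shortest overload: the API author picks port 80 for the caller.
    public ServerConfigSketch(String host) {
        this(host, 80);
    }

    // Middle overload: the API author picks a 5000 ms timeout for the caller.
    public ServerConfigSketch(String host, int port) {
        this(host, port, 5_000);
    }

    // Full overload: nothing is implicit.
    public ServerConfigSketch(String host, int port, int timeoutMillis) {
        this.host = host;
        this.port = port;
        this.timeoutMillis = timeoutMillis;
    }

    public String describe() {
        return host + ":" + port + " (timeout " + timeoutMillis + " ms)";
    }

    public static void main(String[] args) {
        // The call site shows neither the port nor the timeout that will be used.
        System.out.println(new ServerConfigSketch("example.org").describe());
        // prints "example.org:80 (timeout 5000 ms)"
    }
}
```

The one-argument call illustrates both objections above: the values 80 and 5000 were chosen by the API author, and neither is visible where the constructor is invoked.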
So I'm all for named parameters if they make a method call easier to understand; if they make a method call harder to understand by introducing implicit values, I think I prefer the status quo.

Rémi

----- Original Message -----
> From: "Guy Steele"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Thursday, November 30, 2017 21:56:27
> Subject: Named parameters [was: Re: Reader mail bag]

> Thanks, Remi, this is an excellent start! And it may be where we want to end
> up.
>
> But when the time comes that we dig into a serious discussion of putting named
> method parameters into Java, I would like to see a broader exploration of the
> design space before we settle on a specific design.
>
> I've seen a lot of other designs for a lot of other languages, each with pros
> and cons.
>
> There are at least four more-or-less orthogonal properties a method parameter
> can have:
>
> (1) May it be specified by name (rather than by position) at the call site?
> If so, can it be specified _either_ by name or by position?
>
> (2) May the corresponding actual argument be omitted at the call site?
> If so, what happens?
> System-specified default value (such as zero or null) is supplied.
> Programmer-specified default value is supplied.
> Value is a compile-time constant.
> Value is recomputed at call time.
> Can this computation depend on other argument values (such as those to the
> left)?
> No value is supplied.
> A separate mechanism allows inquiry as to whether an actual argument was
> provided.
> The parameter type is actually an option type (choice; if no value is supplied,
> you get an empty value).
>
> (3) May the corresponding actual argument be duplicated at the call site?
> (This may make little sense for Java, but is used extensively in Common Lisp,
> where name-value lists may be built dynamically and then fed to `apply`;
> allowing duplications makes it easy to override a default by just sticking a
> new name-value pair onto the front of a list.)
>
> (4) May the actual arguments be permuted at the call site -- that is, appear in an
> order other than the order in which they are declared in the method
> declaration?
> (Typically the answer is "no" for positional parameters, but may be "yes" or
> "no" for named parameters.)
> (If a call contains both positional and named arguments, one can ask whether the
> named arguments may be mixed in among the positional ones [yech!] or must be
> kept separate, such as always appearing to the right of the positional
> arguments.)
>
> For each of the preceding four questions, there are these meta-questions:
> Is the answer to the question the same for all parameters whatsoever?
> Is the answer to the question the same for all parameters of the same kind (such
> as named or positional)?
> Is the answer to the question the same for all parameters in a single method
> declaration?
> Is the answer to the question the same for all parameters of the same kind (such
> as named or positional) in a single method declaration?
>
> In addition, there is the question of exactly what combinations of positional,
> optional positional, named, and/or optional named parameters may be used within
> a single method declaration.
> And there is the question of what combinations of combinations may appear within
> an overload set.
>
> Many of these questions are in fact answered by specific choices in the proposal
> below. I'm just looking to seeing (eventually) a thorough discussion of the
> rationale for each choice. I provide this list of questions as one possible
> starting point for that discussion.
>
> --Guy
>
>> On Nov 30, 2017, at 3:21 PM, Remi Forax wrote:
>>
>> My note on named parameters:
>> Supporting real named parameters that works with overriding and backward
>> compatibility is hard, but i believe there is a sweet spot.
>>
>> - a method can declare to support named parameters by using a new modifier
>> 'named'.
>> so - all parameters are named or none are; you cannot mix positional and named >> parameters like in Ruby. >> - a method supports either positional parameters or named parameters, but not >> both at the same time. >> if there are several overloads, you can only have one named overload with the >> same number of parameters, >> which means that counting the number of parameters is enough to know which >> method can be called. >> a varargs method cannot be named. >> >> - when overriding a method, a named method cannot override a non-named method. >> overriding a named method with a non-named method is allowed to ease the >> transition but emits a warning. >> a named method that overrides another named method needs to have the same >> parameters with the same names at the same positions. >> >> - at the call site, a named method has to be called using the syntax "name: >> argument" for each argument, >> for example: >> ThreadGroup main = ... >> new ThreadGroup(parent: main, name: "my group"); >> if a named method is called with positional arguments, the compiler emits a >> warning >> >> - at compile time, >> at the declaration site, all parameter names are stored in the MethodParameters attribute. >> at the call site, the compiler inserts an invokedynamic to the NamedParameterMetaFactory >> with a method handle ref on the declared named method and the names of all >> arguments in the order of the call. >> at runtime, the NamedParameterMetaFactory verifies that the number of named >> arguments is the same and that they are a permutation of the declared parameters, which are >> retrieved by cracking the method handle ref as a MethodHandleInfo and calling >> getParameters on the corresponding Constructor/Method; once the permutation is >> calculated, the NamedParameterMetaFactory returns a ConstantCallSite on the >> constant method handle (the method handle ref) permuted using >> MethodHandles.permuteArguments. 
>> >> To summarize, this lets us use named parameters if the API designer wants to allow >> that, >> there is a path for upgrading a method that uses positional parameters to use >> named parameters, but not the other way around (like varargs), >> not supporting permutations between overridden methods makes things far simpler >> than they would be otherwise. >> >> cheers, >> Rémi >> >> ----- Original Message ----- >>> From: "Brian Goetz" >>> To: "amber-spec-experts" >>> Sent: Thursday, November 30, 2017 19:29:41 >>> Subject: Reader mail bag >> >>> We've gotten two submissions to the amber-spec-comments list. >>> >>> 1. "Forcing a pattern match", at: >>> http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000000.html >>> >>> 2. "Named parameters in data classes", at: >>> http://mail.openjdk.java.net/pipermail/amber-spec-comments/2017-November/000003.html >>> >>> >>> I think the first asks for an alternate form of the "matches" operator >>> which would fail with a CCE, rather than evaluating to false, if the >>> match fails. This would be essentially saying, "This should match; if >>> it doesn't, that's an error." I think that's what's being asked, but >>> then he talks about an "unnecessary instanceof check", which makes me >>> wonder whether this is really just about optimization. >>> >>> To be clear, the instanceof check is neither expensive nor unnecessary. >>> (Though we can optimize these away when we know the static type of the >>> target; if x is a Foo, then "x matches Foo f" can be statically >>> strength-reduced to "x != null".) >>> >>> Note that the "if member.getKind() == VARIABLE" is merely a manual >>> optimization; you could easily leave that out and just match against >>> VariableTree. What we'd rather focus on is how to get to: >>> >>> switch (member) { >>> case BlockTree bt: ... >>> case VariableTree vt: ... 
>>> } >>> >>> while allowing the pattern to capture the quicker pre-test (kind == >>> BLOCK) and maintain the performance without making the user worry about >>> this. We have some ideas here, but I don't think this "forcing" idea >>> really offers a lot. >>> >>> >>> The second (named parameters) was a question I was expecting. >>> >>> I agree that being able to invoke constructors and methods by name can >>> sometimes result in easier-to-read code, especially when some parameters >>> have a sensible default value. (This is also a more complicated feature >>> than anyone gives it credit for, so it's not the "gimme" it is often >>> assumed to be.) >>> >>> However, data classes is not the place to cram in this feature; this >>> should be a feature that stands on its own, and applies to all classes, >>> data- or not. One of the design goals for data classes is that a data >>> class should be essentially a "macro" for a class you could write by >>> hand; this allows easy migration from existing classes to data classes >>> (if they meet the requirements) and from data classes to full classes if >>> they expand to no longer fit the data class profile. The more that >>> *uses* of data classes or their members are different from the >>> corresponding use of regular classes, the more difficult this migration >>> becomes. (This is not unlike the design mandate with default methods; >>> from the client perspective, they're just ordinary virtual methods, and >>> you can't tell whether the method was implemented directly or inherited >>> -- they're just methods.) >>> >>> So, while named parameters are a reasonable feature to explore, trying >>> to staple them onto data classes would be a mistake. They are their own >>> feature. We're open to exploring it, but we've got our plate full for now.
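
For what it's worth, the runtime step of the named-parameter proposal quoted above (compute the permutation, then hand back a permuted method handle) can be sketched roughly as below. This is only an illustration under several assumptions: the class NamedArgsSketch and the methods describe, adapt and demo are made-up names, the declared parameter names are passed in as a plain list rather than read from reflection, and there is no invokedynamic bootstrap or NamedParameterMetaFactory involved. The only real API relied on is MethodHandles.permuteArguments.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.List;

public class NamedArgsSketch {

    // Hypothetical "named" method; its declared parameter order is (parent, name).
    public static String describe(String parent, String name) {
        return "parent=" + parent + ", name=" + name;
    }

    // Returns a handle that accepts arguments in call-site order and forwards
    // them to 'target' in declaration order. 'declared' and 'callOrder' are
    // assumed inputs; the proposal would obtain them from the class file and
    // from the call site respectively.
    public static MethodHandle adapt(MethodHandle target,
                                     List<String> declared,
                                     List<String> callOrder) {
        int n = declared.size();
        if (callOrder.size() != n || target.type().parameterCount() != n) {
            throw new IllegalArgumentException("wrong number of named arguments");
        }
        int[] reorder = new int[n];              // reorder[i] = call-site index feeding parameter i
        Class<?>[] callTypes = new Class<?>[n];  // parameter types in call-site order
        for (int i = 0; i < n; i++) {
            int j = callOrder.indexOf(declared.get(i));
            if (j < 0) {
                throw new IllegalArgumentException("missing argument: " + declared.get(i));
            }
            reorder[i] = j;
            callTypes[j] = target.type().parameterType(i);
        }
        MethodType callType = MethodType.methodType(target.type().returnType(), callTypes);
        return MethodHandles.permuteArguments(target, callType, reorder);
    }

    // Simulates the call describe(name: "my group", parent: "main").
    public static String demo() {
        try {
            MethodHandle target = MethodHandles.lookup().findStatic(
                    NamedArgsSketch.class, "describe",
                    MethodType.methodType(String.class, String.class, String.class));
            MethodHandle adapted = adapt(target,
                    List.of("parent", "name"),   // declaration order
                    List.of("name", "parent"));  // call-site order
            return (String) adapted.invokeExact("my group", "main");
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the permutation is computed once and baked into the returned handle, a bootstrap method along the lines described in the proposal could wrap the result in a ConstantCallSite, so that later calls would not need to look at the names again.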