From brian.goetz at oracle.com Tue Sep 6 21:11:43 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 6 Sep 2022 17:11:43 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
Message-ID:

We dropped this out of the record patterns JEP, but I think it is time to revisit this.

The concept of array patterns was pretty straightforward; they mimic the nesting and exhaustiveness rules of record patterns, they are just a different sort of container for nested patterns. And they have an obvious duality with array creation expressions.

The main open question here was how we distinguish between "match an array of length exactly N" (where there are N nested patterns) and "match an array of length at least N". We toyed with the idea of a "..." indicator to mean "more elements", but this felt a little forced and opened new questions.

It later occurred to me that there is another place to nest a pattern in an array pattern -- to match (and bind) the length. In the following, assume for sake of exposition that "_" is the "any" pattern (matches everything, binds nothing) and that we have some way to denote a constant pattern, which I'll denote here with a constant literal.

There is an obvious place to put this (optional) pattern: in between the brackets. So:

    case String[1] { P }:
                ^ a constant pattern

would match string arrays of length 1 whose sole element matches P. And

    case String[] { P, Q }

would match string arrays of length exactly 2, whose first two elements match P and Q respectively. (If the length pattern is not specified, we infer a constant pattern whose constant is equal to the length of the nested pattern list.)

Matching a target to `String[L] { P0, .., Pn }` means

    x instanceof String[] arr
        && arr.length matches L
        && arr.length >= n
        && arr[0] matches P0
        && arr[1] matches P1
        ...
        && arr[n] matches Pn

More examples:

    case String[int len] { P }

would match string arrays of length >= 1 whose first element matches P, and further binds the array length to `len`.

    case String[_] { P, Q }

would match string arrays of any length whose first two elements match P and Q.

    case String[3] { }
                ^constant pattern

matches all string arrays of length 3.

This is a more principled way to do it, because the length is a part of the array and deserves a chance to match via nested patterns, just as with the elements, and it avoids trying to give "..." a new meaning.

The downside is that it might be confusing at first (though people will learn quickly enough) how to distinguish between an exact match and a prefix match.

On 1/5/2021 1:48 PM, Brian Goetz wrote:
> As we get into the next round of pattern matching, I'd like to
> opportunistically attach another sub-feature: array patterns. (This
> also bears on the question of "how would varargs patterns work", which
> I'll address below, though they might come later.)
>
> ## Array Patterns
>
> If we want to create a new array, we do so with an array construction
> expression:
>
>     new String[] { "a", "b" }
>
> Since each form of aggregation should have its dual in destructuring,
> the natural way to represent an array pattern (h/t to AlanM for
> suggesting this) is:
>
>     if (arr instanceof String[] { var a, var b }) { ... }
>
> Here, the applicability test is: "are you an instanceof of String[],
> with length = 2", and if so, we cast to String[], extract the two
> elements, and match them to the nested patterns `var a` and `var b`.
> This is the natural analogue of deconstruction patterns for arrays,
> complete with nesting.
>
> Since an array can have more elements, we likely need a way to say
> "length >= 2" rather than simply "length == 2". There are multiple
> syntactic ways to get there, for now I'm going to write
>
>     if (arr instanceof String[] { var a, var b, ... })
>
> to indicate "more". The "..." matches zero or more elements and binds
> nothing.
>
> People are immediately going to ask "can I bind something to the
> remainder"; I think this is mostly an "attractive distraction", and
> would prefer to not have this dominate the discussion.
>
> Here's an example from the JDK that could use this effectively:
>
> String[] limits = limitString.split(":");
> try {
>     switch (limits.length) {
>         case 2: {
>             if (!limits[1].equals("*"))
>                 setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>         }
>         case 1: {
>             if (!limits[0].equals("*"))
>                 setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>         }
>     }
> }
> catch(NumberFormatException ex) {
>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>     setMultilineLimit(MultilineLimit.LENGTH, -1);
> }
>
> becomes (eventually)
>
> switch (limitString.split(":")) {
>     case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>     case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>     default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
> }
>
> Note how not only does this become more compact, but the unchecked
> "NumberFormatException" is folded into the match, rather than being a
> separate concern.
>
> ## Varargs patterns
>
> Having array patterns offers us a natural way to interpret
> deconstruction patterns for varargs records. Assume we have:
>
>     void m(X... xs) { }
>
> Then a varargs invocation
>
>     m(a, b, c)
>
> is really sugar for
>
>     m(new X[] { a, b, c })
>
> So the dual of a varargs invocation, a varargs match, is really a
> match to an array pattern. So for a record
>
>     record R(X... xs) { }
>
> a varargs match:
>
>     case R(var a, var b, var c):
>
> is really sugar for an array match:
>
>     case R(X[] { var a, var b, var c }):
>
> And similarly, we can use our "more arity" indicator:
>
>     case R(var a, var b, var c, ...):
>
> to indicate that there are at least three elements.

From amaembo at gmail.com Wed Sep 7 14:10:15 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Wed, 7 Sep 2022 16:10:15 +0200
Subject: Array patterns (and varargs patterns)
In-Reply-To:
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
Message-ID:

Hello!

Honestly, to me this whole feature looks not very important. It's a rare case in modern Java applications that business logic operates with arrays directly. They are mostly used in low-level system code where performance matters more than code elegance. Custom defined named patterns for lists would be much more useful. Moreover, if named patterns are supported, then array deconstruction could be implemented in a library, without complicating the language specification (like `x instanceof Arrays.of(String first, String next, String last)`).

With best regards,
Tagir Valeev.

On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz wrote:
>
> We dropped this out of the record patterns JEP, but I think it is time to revisit this.
>
> The concept of array patterns was pretty straightforward; they mimic the nesting and exhaustiveness rules of record patterns, they are just a different sort of container for nested patterns. And they have an obvious duality with array creation expressions.
>
> The main open question here was how we distinguish between "match an array of length exactly N" (where there are N nested patterns) and "match an array of length at least N". We toyed with the idea of a "..." indicator to mean "more elements", but this felt a little forced and opened new questions.
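The matching rule that Brian gives earlier in this thread for `String[L] { P0, .., Pn }` can be approximated in plain Java without any new syntax. The sketch below is only an illustration under stated assumptions: `ArrayPatternSketch` and `matches` are hypothetical names, not JDK APIs, nested patterns are modelled as `Predicate<String>`, and the length pattern as an `IntPredicate`. Passing `null` for the length pattern models the `String[] { ... }` form, where an exact-length constant pattern is inferred. The bounds check uses the number of nested patterns, so every element access is in range.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

/** Hypothetical helper (not a JDK API) emulating the proposed array-pattern match. */
final class ArrayPatternSketch {

    /**
     * Emulates matching {@code target} against String[L] { P0, .., Pk }.
     * A null lengthPattern stands for the omitted form String[] { ... },
     * which infers "length == number of nested patterns".
     */
    static boolean matches(Object target,
                           IntPredicate lengthPattern,
                           List<Predicate<String>> elementPatterns) {
        // x instanceof String[] arr
        if (!(target instanceof String[])) {
            return false;
        }
        String[] arr = (String[]) target;
        int count = elementPatterns.size();

        // arr.length matches L (inferred as an exact-length constant when omitted)
        IntPredicate length = (lengthPattern != null) ? lengthPattern : len -> len == count;
        if (!length.test(arr.length) || arr.length < count) {
            return false;
        }

        // arr[i] matches Pi for each nested pattern
        for (int i = 0; i < count; i++) {
            if (!elementPatterns.get(i).test(arr[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Predicate<String> any = s -> true;   // stands in for the "_" pattern
        Predicate<String> isA = "a"::equals; // stands in for a constant pattern
        String[] three = { "a", "b", "c" };

        // String[] { "a", _ }: exact length 2 is inferred, so length 3 fails
        System.out.println(matches(three, null, List.of(isA, any)));        // false
        // String[_] { "a", _ }: any length, prefix match
        System.out.println(matches(three, len -> true, List.of(isA, any))); // true
        // String[3] { }: length only
        System.out.println(matches(three, len -> len == 3, List.of()));     // true
    }
}
```

The three calls in `main` correspond to the exact-match, prefix-match, and length-only examples from the proposal.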
From brian.goetz at oracle.com Wed Sep 7 14:32:34 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 10:32:34 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To:
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
Message-ID: <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>

I understand where this sentiment comes from. But the motivation is somewhat more indirect than "people are falling over themselves to deconstruct arrays today".

Because deconstruction is the dual of aggregation, it is desirable for each of the forms of aggregation -- constructors, factories, etc -- to have pattern counterparts. Not doing so creates asymmetries that make the whole thing seem more ad-hoc. Many of the "not as important" pattern features we're working on now, are in the realm of "completing" the feature.

More importantly, array patterns are how we fully support varargs in records. If we have a varargs record:

    record VA(String... strings) { }

we can construct it with a varargs invocation

    new VA("a", "b")

which is sugar for

    new VA(new String[] { "a", "b" })

But we cannot yet deconstruct it with:

    case VA(var a, var b)

and analogously, for a varargs record, the above is sugar for

    case VA(String[] { var a, var b })

So it is not just about arrays.

I agree that named patterns are more useful, and we are working on them too. But they are also a bigger feature (bringing in overload selection, reflection, translation, etc), so they will take longer. Whereas array patterns are really a remix of things we've already worked out -- nested patterns, exhaustiveness, etc. In any case I would like to avoid leaving a trail of unfinished work, so cleaning up the loose ends on basic patterns first seems preferable before adding bigger new pattern features.

> Hello!
>
> Honestly, to me this whole feature looks not very important. It's a
> rare case in modern Java applications that business logic operates
> with arrays directly. They are mostly used in low-level system code
> where performance matters more than code elegance. Custom defined
> named patterns for lists would be much more useful. Moreover, if named
> patterns are supported, then array deconstruction could be implemented
> in a library, without complicating the language specification (like `x
> instanceof Arrays.of(String first, String next, String last)`).

I'm not sure how this Arrays.of pattern is going to work, unless we're willing to have overloads for every arity up to, say, 22? Otherwise, we need varargs, and varargs is sugar for an array pattern.

>
> With best regards,
> Tagir Valeev.
>
> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz wrote:
>> We dropped this out of the record patterns JEP, but I think it is time to revisit this.
>>
>> The concept of array patterns was pretty straightforward; they mimic the nesting and exhaustiveness rules of record patterns, they are just a different sort of container for nested patterns. And they have an obvious duality with array creation expressions.
>>
>> The main open question here was how we distinguish between "match an array of length exactly N" (where there are N nested patterns) and "match an array of length at least N". We toyed with the idea of a "..." indicator to mean "more elements", but this felt a little forced and opened new questions.
>>
>> It later occurred to me that there is another place to nest a pattern in an array pattern -- to match (and bind) the length. In the following, assume for sake of exposition that "_" is the "any" pattern (matches everything, binds nothing) and that we have some way to denote a constant pattern, which I'll denote here with a constant literal.
>>
>> There is an obvious place to put this (optional) pattern: in between the brackets.
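The varargs argument in Brian's reply to Tagir can be made concrete with a record. The sketch below assumes a Java version with records and `instanceof` type patterns (Java 16 or later); `VarargsRecordSketch` and `describe` are hypothetical names. It shows that the varargs construction and the explicit-array construction produce the same component, and it hand-writes what the proposed `case VA(var a, var b)` would test, since that syntax is only a proposal in this thread.

```java
import java.util.Arrays;

/** Hypothetical illustration of the varargs record example from the thread. */
final class VarargsRecordSketch {

    // The varargs record from the message.
    record VA(String... strings) { }

    /**
     * Hand-written stand-in for the proposed `case VA(var a, var b)`,
     * which would desugar to `case VA(String[] { var a, var b })`:
     * match a VA whose array component has exactly two elements.
     */
    static String describe(Object o) {
        if (o instanceof VA va) {
            String[] arr = va.strings();
            if (arr != null && arr.length == 2) {
                String a = arr[0];
                String b = arr[1];
                return "VA(" + a + ", " + b + ")";
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        VA sugared = new VA("a", "b");                    // varargs invocation
        VA explicit = new VA(new String[] { "a", "b" });  // what it is sugar for

        System.out.println(Arrays.equals(sugared.strings(), explicit.strings())); // true
        System.out.println(describe(sugared));                                    // VA(a, b)
        System.out.println(describe(new VA("a", "b", "c")));                      // no match
    }
}
```

The explicit length check in `describe` is the part an array pattern would fold into the match itself.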
From brian.goetz at oracle.com Wed Sep 7 17:41:33 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 13:41:33 -0400
Subject: Unnamed variables and match-all patterns
Message-ID: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>

We've gone around and around a few times on "unnamed variables" (underscore), starting with JEP 302 (Lambda Leftovers). We reclaimed the underscore token in Java 9 with the intention of using it for unnamed variables and "any" patterns. Along the way, we ran into some hiccups, and it has sat on the shelf for a while. Let's take it down, dust it off, and see if we have any more clarity than before.

There are three syntactic productions in which we might want to use underscore as a "don't care" indicator:

 - Unnamed variables. Here, underscore stands in for a variable name. We declare a local variable, catch formal, pattern variable, etc, whose name is `_`, which has the effect of entering no new names in scope. It becomes an "initialize-only" variable.

    try { ... }
    catch (FooException _) { throw new BarException("foo"); }

 - Partial inference. Here, underscore stands in for a type name. Today, we can infer type variables for generic method invocations and constructor invocations, but it is all-or-nothing. Being able to denote "infer this type" would allow us to do partial inference:

    foo.m(...)

 - "Any" patterns. Here, underscore is a pattern, which matches everything, and binds nothing.

    case Foo(var s, _): ...

We don't have to do all of these; right now we're not considering partial inference, but the other two are reasonable options. Unnamed variables have been a long-standing request; any patterns will likely be a common request soon as well.

For a match-all pattern, there is little to say other than "_" is one of the alternatives of the Pattern production, it is applicable to all types, it is unconditional on all types, and it has no bindings.
The specification already has a concept of "any" patterns; this is just making it denotable.

I think there is little controversy about using unnamed local variables (local variable declaration statements, catch formals, foreach induction variables, resources in try-with-resources) and unnamed lambda parameters. What is common to all of these is that these are _pure implementation details_, where the author has elected to not give a name to a variable that is entirely implementation-facing. This seems eminently reasonable. Unnamed parameters can help eliminate errors by capturing design assumptions and make life easier for static analysis tools that like to point out unused variables.

Where we stumble is on method parameters, because method parameter names serve two masters -- the implementation (as the declaration of a variable) and the API (as part of the specification of what the method does.) Among other things, we like to document the semantics of method parameters in Javadoc with the `@param` tag, but doing so requires a name (or inventing a new Javadoc mechanism like `@param #4`, likely a loser.) Secondarily, sometimes parameter names are retained in the MethodParameters attribute, though that attribute (JVMS 4.7.24) already supports parameters without names by using a zero CP index.

With `var`, we drew a clear line of "implementation only" -- you can't infer a method return type, even for a private method, you can only use it for local variables and lambda formals. This has been pretty successful.
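To make the "implementation only" positions concrete, here is a small sketch of three of them: a catch formal, a foreach induction variable, and a lambda parameter. It is written so that it compiles on compilers that do not accept `_` as a variable name: each would-be unnamed variable gets a placeholder name, and a comment shows the form proposed in this thread. The class and method names (`UnnamedVariableSketch`, `parseOrDefault`, `countItems`, `incrementAll`) are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of positions where an unnamed variable would fit. */
final class UnnamedVariableSketch {

    /** Catch formal that is never read. */
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException ignored) { // proposed: catch (NumberFormatException _)
            return fallback;
        }
    }

    /** Foreach induction variable that is never read. */
    static int countItems(Iterable<?> items) {
        int n = 0;
        for (Object unused : items) { // proposed: for (var _ : items)
            n++;
        }
        return n;
    }

    /** Lambda parameter that is never read. */
    static Map<String, Integer> incrementAll(Map<String, Integer> source) {
        Map<String, Integer> copy = new TreeMap<>(source);
        copy.replaceAll((unusedKey, v) -> v + 1); // proposed: (_, v) -> v + 1
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", -1));           // 42
        System.out.println(parseOrDefault("*", -1));            // -1
        System.out.println(countItems(List.of("x", "y", "z"))); // 3
        System.out.println(incrementAll(Map.of("a", 1)));       // {a=2}
    }
}
```

In each case the placeholder name carries no information for callers, which is the sense in which these variables are implementation details.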
We've explored a number of intermediate points on the spectrum with varying degrees of stability:

 A) Implementation only -- local variables, catch formals, for-loop induction variables, TWR resources, pattern variables, lambda formals

 B) "A++", where we add in method parameters of anonymous classes

 C) Adding in method parameters _for non-initial declarations_ -- allow unnamed parameters only for methods that override a method from a supertype, ensuring that there is a real specification of what the parameters mean.

 D) Anything goes, any method parameter can be unnamed, throwing specification to the wind.

A is a stable point, and has the advantage of mostly lining up with where we can use `var`. But users will surely grumble that they can't use it for implementations of methods from supertypes. As this feature request predates lambdas and patterns, giving it to lambdas and patterns but not ordinary methods might feel a bit mean.

The motivation for B is obvious -- to support smooth refactoring between lambdas and inner classes -- but is not a very stable point, as one will immediately ask "what about refactoring to named classes".

C feels attractive, though there would surely be complaints too; it excludes constructors and static methods (which might sometimes want unnamed parameters when a parameter is no longer used, but stays around for binary compatibility), and even some initial declarations. But, these cases are likely to be somewhat more rare, so I don't object to leaving these aside. The main concern is that this might feel arbitrary. There is also the possibility for some confusion; it is not obvious what it means when you override a method that already has an unnamed parameter. Can you give it a name and use it? It is a little weird that the lack of name applies only to the implementation of the method, but somehow bleeds into the specification.
There is also some impact on Javadoc, as well as lingering concerns that there are other shoes to drop other than Javadoc and MethodParameters.

D is also stable, but feels like it makes the language less safe, by making some methods unspecifiable. On the other hand, the people who might use it for initial declarations, static methods, etc, are also the sort of people who probably don't write specification anyway (otherwise they would realize that they are depriving their callers of useful information.)

In (C), Javadoc could insert an `@implNote` that says something like "this implementation ignores the value of parameters and from declaring method Foo::bar". In (D), it could say "ignores its 3rd and 4th parameter", or insert synthetic @param tags for parameters whose name is something like "".

Past discussions seemed to gravitate toward either A or D, which are also the simplest / most stable points. I guess it becomes a question of getting over the "makes the language less safe" concerns.

Regardless, I'd like to see if we can quantify the "lingering concerns about other shoes to drop."

From forax at univ-mlv.fr Wed Sep 7 21:43:42 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 7 Sep 2022 23:43:42 +0200 (CEST)
Subject: Unnamed variables and match-all patterns
In-Reply-To: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
Message-ID: <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Wednesday, September 7, 2022 7:41:33 PM
> Subject: Unnamed variables and match-all patterns

> We've gone around and around a few times on "unnamed variables" (underscore),
> starting with JEP 302 (Lambda Leftovers). We reclaimed the underscore token in
> Java 9 with the intention of using it for unnamed variables and "any" patterns.
> Past discussions seemed to gravitate toward either A or D, which are also the
> simplest / most stable points. I guess it becomes a question of getting over
> the "makes the language less safe" concerns.

> Regardless, I'd like to see if we can quantify the "lingering concerns about
> other shoes to drop."

There is a C-bis, where '_' is allowed for private methods, but that's not important.

As a teacher, I vote for A: APIs should be documented, and giving a good name to a parameter is usually the first step.

Rémi

From forax at univ-mlv.fr Wed Sep 7 22:01:04 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 8 Sep 2022 00:01:04 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com> <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
Message-ID: <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Tagir Valeev"
> Cc: "amber-spec-experts"
> Sent: Wednesday, September 7, 2022 4:32:34 PM
> Subject: Re: Array patterns (and varargs patterns)

> I understand where this sentiment comes from. But the motivation is
> somewhat more indirect than "people are falling over themselves to
> deconstruct arrays today".
>
> Because deconstruction is the dual of aggregation, it is desirable for
> each of the forms of aggregation -- constructors, factories, etc -- to
> have pattern counterparts. Not doing so creates asymmetries that make
> the whole thing seem more ad-hoc. Many of the "not as important"
> pattern features we're working on now, are in the realm of "completing"
> the feature.
>
> More importantly, array patterns are how we fully support varargs in
> records. If we have a varargs record:
>
>     record VA(String... strings) { }
>
> we can construct it with a varargs invocation
>
>     new VA("a", "b")
>
> which is sugar for
>
>     new VA(new String[] { "a", "b" })
>
> But we cannot yet deconstruct it with:
>
>     case VA(var a, var b)
>
> and analogously, for a varargs record, the above is sugar for
>
>     case VA(String[] { var a, var b })
>
> So it is not just about arrays.
>
> I agree that named patterns are more useful, and we are working on them
> too. But they are also a bigger feature (bringing in overload
> selection, reflection, translation, etc), so they will take longer.
> Whereas array patterns are really a remix of things we've already worked
> out -- nested patterns, exhaustiveness, etc. In any case I would like
> to avoid leaving a trail of unfinished work, so cleaning up the loose
> ends on basic patterns first seems preferable before adding bigger new
> pattern features.
>
>> Hello!
>>
>> Honestly, to me this whole feature looks not very important. It's a
>> rare case in modern Java applications that business logic operates
>> with arrays directly. They are mostly used in low-level system code
>> where performance matters more than code elegance. Custom defined
>> named patterns for lists would be much more useful. Moreover, if named
>> patterns are supported, then array deconstruction could be implemented
>> in a library, without complicating the language specification (like `x
>> instanceof Arrays.of(String first, String next, String last)`).
>
> I'm not sure how this Arrays.of pattern is going to work, unless we're
> willing to have overloads for every arity up to, say, 22? Otherwise, we
> need varargs, and varargs is sugar for an array pattern.

For me, Arrays.of() is a named pattern with a varargs list of bindings, no?
So I agree with Tagir: let's figure out how named patterns work first.
I also see other reasons not to specify the array pattern now:
- records with varargs are quite rare, so people are not desperately in need of the corresponding pattern;
- the deconstruction of collections / map patterns is also a dependency of the array pattern. It would be sad if the array pattern and the List pattern did not have the same way to specify the length/size (specifying the length of the array pattern inside the [] seems too ad hoc, but maybe I'm wrong).

Rémi

>
>>
>> With best regards,
>> Tagir Valeev.
>>
>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz wrote:
>>> We dropped this out of the record patterns JEP, but I think it is time to
>>> revisit this.
>>>
>>> The concept of array patterns was pretty straightforward; they mimic the nesting
>>> and exhaustiveness rules of record patterns, they are just a different sort of
>>> container for nested patterns. And they have an obvious duality with array
>>> creation expressions.
>>>
>>> The main open question here was how we distinguish between "match an array of
>>> length exactly N" (where there are N nested patterns) and "match an array of
>>> length at least N". We toyed with the idea of a "..." indicator to mean "more
>>> elements", but this felt a little forced and opened new questions.
>>>
>>> It later occurred to me that there is another place to nest a pattern in an
>>> array pattern -- to match (and bind) the length. In the following, assume for
>>> sake of exposition that "_" is the "any" pattern (matches everything, binds
>>> nothing) and that we have some way to denote a constant pattern, which I'll
>>> denote here with a constant literal.
>>>
>>> There is an obvious place to put this (optional) pattern: in between the
>>> brackets. So:
>>>
>>>     case String[1] { P }:
>>>                 ^ a constant pattern
>>>
>>> would match string arrays of length 1 whose sole element matches P.
And >>> >>> case String[] { P, Q } >>> >>> would match string arrays of length exactly 2, whose first two elements match P >>> and Q respectively. (If the length pattern is not specified, we infer a >>> constant pattern whose constant is equal to the length of the nested pattern >>> list.) >>> >>> Matching a target to `String[L] { P0, .., Pn }` means >>> >>> x instanceof String[] arr >>> && arr.length matches L >>> && arr.length >= n >>> && arr[0] matches P0 >>> && arr[1] matches P1 >>> ... >>> && arr[n] matches Pn >>> >>> More examples: >>> >>> case String[int len] { P } >>> >>> would match string arrays of length >= 1 whose first element matches P, and >>> further binds the array length to `len`. >>> >>> case String[_] { P, Q } >>> >>> would match string arrays of any length whose first two elements match P and Q. >>> >>> case String[3] { } >>> ^constant pattern >>> >>> matches all string arrays of length 3. >>> >>> >>> This is a more principled way to do it, because the length is a part of the >>> array and deserves a chance to match via nested patterns, just as with the >>> elements, and it avoid trying to give "..." a new meaning. >>> >>> The downside is that it might be confusing at first (though people will learn >>> quickly enough) how to distinguish between an exact match and a prefix match. >>> >>> >>> >>> >>> On 1/5/2021 1:48 PM, Brian Goetz wrote: >>> >>> As we get into the next round of pattern matching, I'd like to opportunistically >>> attach another sub-feature: array patterns. (This also bears on the question >>> of "how would varargs patterns work", which I'll address below, though they >>> might come later.) 
>>> >>> ## Array Patterns >>> >>> If we want to create a new array, we do so with an array construction >>> expression: >>> >>> new String[] { "a", "b" } >>> >>> Since each form of aggregation should have its dual in destructuring, the >>> natural way to represent an array pattern (h/t to AlanM for suggesting this) >>> is: >>> >>> if (arr instanceof String[] { var a, var b }) { ... } >>> >>> Here, the applicability test is: "are you an instanceof of String[], with length >>> = 2", and if so, we cast to String[], extract the two elements, and match them >>> to the nested patterns `var a` and `var b`. This is the natural analogue of >>> deconstruction patterns for arrays, complete with nesting. >>> >>> Since an array can have more elements, we likely need a way to say "length >= 2" >>> rather than simply "length == 2". There are multiple syntactic ways to get >>> there, for now I'm going to write >>> >>> if (arr instanceof String[] { var a, var b, ... }) >>> >>> to indicate "more". The "..." matches zero or more elements and binds nothing. >>> >>> >>> People are immediately going to ask "can I bind something to the remainder"; I >>> think this is mostly an "attractive distraction", and would prefer to not have >>> this dominate the discussion. 
>>> >>> >>> Here's an example from the JDK that could use this effectively: >>> >>> String[] limits = limitString.split(":"); >>> try { >>> switch (limits.length) { >>> case 2: { >>> if (!limits[1].equals("*")) >>> setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1])); >>> } >>> case 1: { >>> if (!limits[0].equals("*")) >>> setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0])); >>> } >>> } >>> } >>> catch(NumberFormatException ex) { >>> setMultilineLimit(MultilineLimit.DEPTH, -1); >>> setMultilineLimit(MultilineLimit.LENGTH, -1); >>> } >>> >>> becomes (eventually) >>> >>> switch (limitString.split(":")) { >>> case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i); >>> case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i); >>> default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); } >>> } >>> >>> Note how not only does this become more compact, but the unchecked >>> "NumberFormatException" is folded into the match, rather than being a separate >>> concern. >>> >>> >>> ## Varargs patterns >>> >>> Having array patterns offers us a natural way to interpret deconstruction >>> patterns for varargs records. Assume we have: >>> >>> void m(X... xs) { } >>> >>> Then a varargs invocation >>> >>> m(a, b, c) >>> >>> is really sugar for >>> >>> m(new X[] { a, b, c }) >>> >>> So the dual of a varargs invocation, a varargs match, is really a match to an >>> array pattern. So for a record >>> >>> record R(X... xs) { } >>> >>> a varargs match: >>> >>> case R(var a, var b, var c): >>> >>> is really sugar for an array match: >>> >>> case R(X[] { var a, var b, var c }): >>> >>> And similarly, we can use our "more arity" indicator: >>> >>> case R(var a, var b, var c, ...): >>> >>> to indicate that there are at least three elements. 
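For readers who want to see the quoted expansion of `String[L] { P0, .., Pn }` in terms of today's Java, here is a minimal sketch. It models nested patterns as plain predicates, so nothing is bound, and the names `ArrayPatternSketch`, `matches`, and `matchesExact` are hypothetical rather than part of any proposal. One assumption to flag: the quoted expansion writes `arr.length >= n` while indexing up to `arr[n]`; the sketch requires at least as many elements as there are nested patterns, which is what the indexing needs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

// Hypothetical helper that mimics the proposed matching rules by hand.
final class ArrayPatternSketch {

    // Models `String[L] { P0, P1, ... }`: type test, then the length pattern,
    // then a bounds check, then each nested pattern against its element.
    static boolean matches(Object x, IntPredicate lengthPattern,
                           List<Predicate<String>> elementPatterns) {
        if (!(x instanceof String[])) {
            return false;
        }
        String[] arr = (String[]) x;
        if (!lengthPattern.test(arr.length)) {
            return false;
        }
        if (arr.length < elementPatterns.size()) {
            return false; // not enough elements for the nested patterns
        }
        for (int i = 0; i < elementPatterns.size(); i++) {
            if (!elementPatterns.get(i).test(arr[i])) {
                return false;
            }
        }
        return true;
    }

    // Models `String[] { P0, P1, ... }`: with no length pattern given, a constant
    // pattern equal to the number of nested patterns is inferred (exact match).
    static boolean matchesExact(Object x, List<Predicate<String>> elementPatterns) {
        int count = elementPatterns.size();
        return matches(x, len -> len == count, elementPatterns);
    }

    public static void main(String[] args) {
        Predicate<String> any = s -> true; // stands in for the "_" pattern
        Object three = new String[] { "a", "b", "c" };
        // Prefix match, like `String[_] { P, Q }`.
        System.out.println(matches(three, len -> true, Arrays.asList(any, any)));  // prints true
        // Exact match, like `String[] { P, Q }`.
        System.out.println(matchesExact(three, Arrays.asList(any, any)));          // prints false
    }
}
```

The length pattern is just another predicate here, which is the point of the proposal: exact versus prefix matching falls out of what is placed between the brackets, with no separate "..." marker.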
From brian.goetz at oracle.com Wed Sep 7 22:13:57 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 18:13:57 -0400
Subject: Unnamed variables and match-all patterns
In-Reply-To: <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com> <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
Message-ID:

>
> As a teacher, i vote for A, APIs should be documented, giving a good
> name to a parameter is usually the first step.
>

I'm willing to consider starting with A, though I think we should admit that the most likely reaction if we do that is "you idiots got it wrong again, we waited 25 years for underscore, and you don't even let us do it in the most obvious places." So I don't think "do A and never do anything about method parameters" is going to fly, though it is potentially a reasonable incremental step on the way there to get people used to unnamed things.

From brian.goetz at oracle.com Wed Sep 7 22:15:04 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 7 Sep 2022 18:15:04 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com> <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com> <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
Message-ID:

> For me, Arrays.of() is a named pattern with a vararg list of bindings, no ?

It's a named pattern, but to work, it would need varargs patterns -- and array patterns are the underpinnings of varargs, just as array creation is the underpinning of varargs invocation. We're not going to do varargs patterns differently than we do varargs invocation, just to avoid doing array patterns -- that would be silly.
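To make the "array creation is the underpinning of varargs invocation" point concrete, here is a small sketch using the `VA` record from earlier in the thread (records need Java 16 or later). The class `VarargsSketch` and its `describe` method are hypothetical; `describe` is only a hand-written stand-in for what the proposed `case VA(var a, var b)` would mean if read as sugar for `case VA(String[] { var a, var b })`.

```java
import java.util.Arrays;

// The varargs record from the thread.
record VA(String... strings) { }

// Hypothetical helper, written by hand because varargs patterns do not exist.
final class VarargsSketch {

    // Stand-in for `case VA(var a, var b)`: succeed only when the
    // underlying array has exactly two elements, then "bind" both.
    static String describe(Object o) {
        if (o instanceof VA va) {
            String[] arr = va.strings();
            if (arr != null && arr.length == 2) {
                String a = arr[0];
                String b = arr[1];
                return a + "+" + b;
            }
        }
        return "no match";
    }

    public static void main(String[] args) {
        VA sugared = new VA("a", "b");                      // varargs invocation
        VA explicit = new VA(new String[] { "a", "b" });    // what it is sugar for
        System.out.println(Arrays.equals(sugared.strings(), explicit.strings())); // prints true
        System.out.println(describe(sugared));                                    // prints a+b
        System.out.println(describe(new VA("a", "b", "c")));                      // prints no match
    }
}
```

The component arrays are compared with `Arrays.equals` because a record's generated `equals` compares array components by reference, not by content.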
> I see also other reasons to not specify the array pattern now,
> - record with a varargs are quite rare so people are not desperately in need for the corresponding pattern,
> - the deconstruction of collections / map pattern is also a dependency of the array pattern. It will be sad if the array pattern and the List pattern does not have a same way to specify the length/size (specifying the length of the array pattern inside the [] seems a too ad-hoc, but maybe i'm wrong).

As I've said, the fact that people are not desperate for this yet (though obviously you and Tagir want varargs patterns, so there is some demand for it) is not the primary reason to do this now. The symmetry between aggregation and deconstruction is very, very, very important to people understanding properly how pattern matching fits into the language. I am trying to button up the sources of asymmetry in the patterns we have before moving on to cool new patterns. Otherwise we leave a trail of accidental complexity behind us, where certain things are reversible and others are not, for no apparent reason. (Primitives in type patterns are in this category too, and we'll be returning to them very soon.)

So I'm not going to hold up the discussion of named patterns for array patterns (I'm working on a document for named patterns too, but it's much longer), but I'm also not going to hold up array patterns until we get named patterns done either. I want to close up the holes in what we've already built before laying the next layer.

(As to List and Map patterns, these will have to be co-designed with List and Map literals, which will likely require some additional groundwork. They're a ways away, we're building a tower layer by layer.)

From guy.steele at oracle.com Thu Sep 8 01:35:31 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 8 Sep 2022 01:35:31 +0000
Subject: Unnamed variables and match-all patterns
In-Reply-To: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
Message-ID:

On Sep 7, 2022, at 1:41 PM, Brian Goetz wrote:

> . . .
>
> Where we stumble is on method parameters, because method parameter names serve two masters -- the implementation (as the declaration of a variable) and the API (as part of the specification of what the method does.) Among other things, we like to document the semantics of method parameters in Javadoc with the `@param` tag, but doing so requires a name

And a general language-design pattern is that if you discover a single language feature is serving two masters, consider splitting it into two features, one to serve each master (and then perhaps continue to allow the old feature, explaining it in terms of the new, more general features).

In this case, a single feature (method parameter name) provides both a name for the implementation and a name for the API. So, consider having a way to provide two names. Common Lisp has been doing this for its keyword parameters for almost four decades:

    (defun foo (&key ((:color c) white) ((:angle a) 0))
      ... c ... a ...)

    (foo :color black :angle 45)

So the names :color and :angle are part of the API, and the names c and a are the variable names that are actually bound for use in the body.

When you write

    (defun baz (&key (color white) (angle 0))
      ... color ... angle)

it is by definition an abbreviation for

    (defun baz (&key ((:color color) white) ((:angle angle) 0))
      ... color ... angle)

So you don't have to write out two names in the common case where you actually do want them to be "the same".

----

So in Java we could pick some crazy syntax to allow specifying two names for a method parameter, the API name and the implementation (bound variable) name:

    int colorHack(int red=>r, int green=>g, int blue=>b, int fromIndex=>from, int toIndex=>to) {
        // Here the names `r`, `g`, `b`, `from`, and `to` are in scope.
    }

and then if you really want to ignore a parameter:

    int colorBlindHack(int red=>_, int green=>_, int blue=>_, int fromIndex=>from, int toIndex=>to) {
        // Here the names `from` and `to` are in scope.
    }

Not sure we want to go in that direction, but we should at least consider it.

--Guy

From john.r.rose at oracle.com Thu Sep 8 04:10:02 2022
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 07 Sep 2022 21:10:02 -0700
Subject: Unnamed variables and match-all patterns
In-Reply-To:
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
Message-ID: <89034F8F-F4B2-4F29-AC3F-57D34F9A8B6B@oracle.com>

On 7 Sep 2022, at 18:35, Guy Steele wrote:

> On Sep 7, 2022, at 1:41 PM, Brian Goetz wrote:
> . . .
>
> Where we stumble is on method parameters, because method parameter
> names serve two masters --
> the implementation (as the declaration of a variable) and the API (as
> part of the specification of what the method does.) Among other
> things, we like to document the semantics of method parameters in
> Javadoc with the `@param` tag, but doing so requires a name
>
> And a general language-design pattern is that if you discover a single
> language feature is serving two masters, consider splitting it into
> two features, one to serve each master (and then perhaps continue to
> allow the old feature, explaining it in terms of the new, more general
> features.
>
> In this case, a single feature (method parameter name) provides both a
> name for the implementation and a name for the API. So, consider
> having a way to provide two names. Common Lisp has been doing this for
> its keyword parameters for almost four decades:
>
>     (defun foo (&key ((:color c) white) ((:angle a) 0))
>       ... c ... a ...)
>
>     (foo :color black :angle 45)
>
> So the names :color and :angle are part of the API, and the names c
> and a are the variable names that are actually bound for use in the
> body.
>
> When you write
>
>     (defun baz (&key (color white) (angle 0))
>       ... color ... angle)
>
> it is by definition an abbreviation for
>
>     (defun baz (&key ((:color color) white) ((:angle angle) 0))
>       ... color ... angle)
>
> So you don't have to write out two names in the common case where
> you actually do want them to be "the same".
>
> ----
>
> So in Java we could pick some crazy syntax to allow specifying two
> names for a method parameter, the API name and the implementation
> (bound variable) name:
>
>     int colorHack(int red=>r, int green=>g, int blue=>b, int
> fromIndex=>from, int toIndex=>to) {
>         // Here the names `r`, `g`, `b`, `from`, and `to` are in scope.
>     }
>
> and then if you really want to ignore a parameter:
>
>     int colorBlindHack(int red=>_, int green=>_, int blue=>_, int
> fromIndex=>from, int toIndex=>to) {
>         // Here the names `from` and `to` are in scope.
>     }
>
> Not sure we want to go in that direction, but we should at least
> consider it.
>
> --Guy

As it happens, today I also cited Lisp argument syntax practice to Brian, on the subject of array patterns. (As in, one bit of prior art for sequence matching is Common Lisp req/opt/key args?) This is another bit of prior art from the same rather deep wellspring.

There is a proposed syntax which allows a single value to have two names, one of which is a binding, and that is the pattern-let syntax. Perhaps a variation of pattern-let could make sense in parameter declarations. (As many kinds of patterns might eventually be useful in parameter position.)

In this case, the binding is inside the pattern, such as `String s`, and the let-part is after an equals sign, `let String s = expr`. (Bikeshed still to be painted here.)

Aligning with the need for a double declaration, we could say that the `expr` part is the formal and external name of the parameter, and the `s` part is the local and internal name of the binding. So:

    int colorBlindHack(let int _ = red, let int _ = green, let int b = blue, ...) ...

Huh. Looks too close to optional arguments for comfort. And how would you combine it with optional arguments?

    int colorBlindHack(let int _ = red = 0, let int _ = green = 0, let int b = blue, ...) ...

Oh well, it was a thought.

If we ever do go with keyword-based calling conventions in Java, then there will be significant pressure for such double names, as there was in Common Lisp. Until then, the double naming seems to me to be a corner-case feature.

Adding immutable structs (value classes) into the language does, in fact, increase the need for keyword-based conventions, so that you can *update* an immutable instance by combining a pre-existing instance with one or more field values to update. (I like to call such a factory method a "reconstructor" because of its similarity to a constructor, which disregards any previous state and sets *all* the fields.) If we add reconstructors to the language, there is new pressure for keyword-based calling conventions, and after that, there is pressure for the "double naming" of parameters.

-- John

From john.r.rose at oracle.com Thu Sep 8 04:16:48 2022
From: john.r.rose at oracle.com (John Rose)
Date: Wed, 07 Sep 2022 21:16:48 -0700
Subject: Unnamed variables and match-all patterns
In-Reply-To:
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
Message-ID: <0A448D4F-AB8F-4CEF-BF12-356A816F8CA6@oracle.com>

> Aligning with the need for a double declaration, we could say that the `expr` part is the formal and external name of the parameter, and the `s` part is the local and internal name of the binding. So:
>
>     int colorBlindHack(let int _ = red, let int _ = green, let int b = blue, ...) ...
>
> Huh. Looks too close to optional arguments for comfort. And how would you combine it with optional arguments?

P.S. (That is, Painting Shed.) If we allowed Java the label-like syntax adopted by some languages for externally named keyword arguments it might look like this:

    int colorBlindHack(red: int _, green: int _, blue: int, ...)

The last argument keyworded as "blue" is bound to the name "blue" in the absence of other indication; the other choices being `blue: int _` for ignored argument and `blue: int b` for a different local name.

    int colorHack(red: int, green: int, blue: int, ...) //keyword arguments

(The "L: FOO" syntax is already a thing in Java, see?)

So many bikesheds, so little time...
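As a side note on the "reconstructor" idea from John's earlier message (updating an immutable value by combining an existing instance with one or more new field values): without keyword-based calling conventions, today's Java usually approximates it with one hand-written factory method per field. The sketch below is only an illustration of that workaround; the class `Rgb` and its `withRed`/`withGreen`/`withBlue` methods are hypothetical names, not anything proposed in the thread.

```java
// Hypothetical immutable value class; all names here are illustrative only.
final class Rgb {
    final int red;
    final int green;
    final int blue;

    // Ordinary constructor: disregards any previous state and sets all fields.
    Rgb(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }

    // Hand-written "reconstructors": each copies this instance and
    // replaces exactly one field, leaving the original untouched.
    Rgb withRed(int newRed) {
        return new Rgb(newRed, green, blue);
    }

    Rgb withGreen(int newGreen) {
        return new Rgb(red, newGreen, blue);
    }

    Rgb withBlue(int newBlue) {
        return new Rgb(red, green, newBlue);
    }

    public static void main(String[] args) {
        Rgb base = new Rgb(1, 2, 3);
        Rgb updated = base.withBlue(9);
        System.out.println(updated.red + "," + updated.green + "," + updated.blue); // prints 1,2,9
        System.out.println(base.blue);                                              // prints 3
    }
}
```

The per-field methods are exactly the boilerplate that a keyword-style call (naming only the fields to change) would remove, which is where the pressure John describes comes from.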
From john.r.rose at oracle.com Thu Sep 8 04:41:22 2022 From: john.r.rose at oracle.com (John Rose) Date: Wed, 07 Sep 2022 21:41:22 -0700 Subject: Array patterns (and varargs patterns) In-Reply-To: <3b623438-81c9-f888-07bf-55d231db3240@oracle.com> References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com> <3b623438-81c9-f888-07bf-55d231db3240@oracle.com> Message-ID: <0DAD71CE-C638-4DDC-A6E8-C61EF082C0B9@oracle.com> On 7 Jan 2021, at 6:18, Brian Goetz wrote: > ?Varargs patterns will build on it (as shown in the mail); if and > when Java ever gets collection literals, there will be corresponding > collection patterns too.? I think the path to streamlining this is > not to try and simplify the syntax of the primitive, but move upwards > to higher-level patterns. OTOH if patterns (like `switch ((O)x) case P v:` or `let P v = (O)x`) are the duals of assignment (like `x = v` or `O x = v`), then we are within our moral rights to make a pattern dualization of the venerable Java syntax `T[] x = {a,b,c}`, which is sugar for `T[] x = new T[]{a,b,c}`. The sugar allows you to take the second `T[]` (and the `new`) for a typeful context (`T[] x`). So without the sugar we get something like: T[] a = ?; switch (a) { case new T[]{a,b,c}: } (The `new` from `new T[]{a,b,c}` is dropped because `new` doesn?t appear in patterns.) But with the same sugar, but dualized, we get: T[] a = ?; switch (a) { case {a,b,c}: } In other words, when the pattern target is already an array, there is no need for the ceremony of repeating the array type, as with normal array declarations. Likewise: T[][] a2d = ?; switch (a2d) { case {{a,b},{c,d}}: } I think this is what Tagir expected, and I think it is a reasonable ?penciling out? of the basic moves of the game we are playing here. Moving on to varargs, the context of a method call marked varargs allows elision not only of the `new T[]` in `new T[]{a,b,c}` but also the braces, you you can equally say `f(a,b,c)` or `f(new T[]{a,b,c})`. 
(But not `f({a,b,c})`. So, we don't get `f2d({{a,b},{c,d}})` by analogy with nested array initializers. Whatever.)

If a pattern-method can take a pattern-flavored argument, and perhaps a varargs argument to boot, it's pretty clear that additional moves could follow quickly:

    pattern f(pattern T[] a) { ... }
    pattern f2d(pattern T[][] a) { ... }
    pattern fv(pattern T a...) { ... }
    // extra "pattern" keyword on parameters for emphasis?
    switch (x) {
    case f({a,b,c}): ...             // omit `new T[]` b/c type
    case f2d({{d,e,f},{g}}): ...     // omit `new T[][]` b/c type
    case fv(h,i,j,k): ...            // omit `new T[]` and braces b/c varargs
    }

From amaembo at gmail.com Thu Sep 8 06:47:26 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 8 Sep 2022 08:47:26 +0200
Subject: Unnamed variables and match-all patterns
In-Reply-To:
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com> <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hello!

On Thu, 8 Sep 2022 at 00:14, Brian Goetz wrote:

> > As a teacher, i vote for A, APIs should be documented, giving a good
> > name to a parameter is usually the first step.
>
> I'm willing to consider starting with A, though I think we should admit
> that the most likely reaction if we do that is "you idiots got it wrong
> again, we waited 25 years for underscore, and you don't even let us do
> it in the most obvious places." So I don't think "do A and never do
> anything about method parameters" is going to fly, though it is
> potentially a reasonable incremental step on the way there to get people
> used to unnamed things.

I'm not sure it's so critical. To me, the main source of frustration is the necessity to think up a name that I won't use anyway. The second source is the fact that the code becomes noticeably longer when it includes unused names.
Both problems are not so important for method parameters:

- If you override or implement a method, any IDE just copies names from the super-method for you, so you don't need to think.
- Method declaration is already quite verbose. It contains @Override annotation, modifiers, types of all parameters and return type explicitly spelled, all of them could be quite long. Probably other annotations, throws and Javadoc. Saving a few chars there would not help much. On the other hand, declaration doesn't contain logic, so people rarely stare at it trying to understand what's going on.

Another problem, namely polluting the namespace with an unused name, stays, but I believe it's not so important. It may be confusing if you want to reuse the name of a super-method parameter for another purpose, so occupying it by default has its advantages.

That said, I'm also for A. It's simple and well defined. It's in line with the LVTI philosophy and will already be very helpful without adding confusion and strange corner cases.

With best regards,
Tagir Valeev.

From brian.goetz at oracle.com Thu Sep 8 12:22:22 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 8 Sep 2022 08:22:22 -0400
Subject: Unnamed variables and match-all patterns
In-Reply-To:
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com> <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
Message-ID: <4ae3daa5-af90-6690-730d-5e41ad591547@oracle.com>

> I'm not sure it's so critical. To me, the main source of frustration
> is the necessity to think up a name that I won't use anyway. The
> second source is the fact that the code becomes noticeably longer when
> it includes unused names. Both problems are not so important for
> method parameters:
>
> - If you override or implement method, any IDE just copies names from
> the super-method for you, so you don't need to think.
> - Method declaration is already quite verbose.
> It contains @Override
> annotation, modifiers, types of all parameters and return type
> explicitly spelled, all of them could be quite long. Probably other
> annotations, throws and Javadoc. Saving few chars there would not help
> much. On the other hand, declaration doesn't contain logic, so people
> rarely stare at it trying to understand what's going on.

For the people who complain about this, I don't think it's about saving a few characters in the declaration, as much as satisfying static analysis that complains about unused parameters. But I suspect that many of these have already become lambdas (this happened most commonly with anonymous classes previously). So I'm willing to do the experiment of A first and see if we need to take the next step.

From guy.steele at oracle.com Thu Sep 8 15:13:28 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 8 Sep 2022 15:13:28 +0000
Subject: Unnamed variables and match-all patterns
In-Reply-To: <0A448D4F-AB8F-4CEF-BF12-356A816F8CA6@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com> <0A448D4F-AB8F-4CEF-BF12-356A816F8CA6@oracle.com>
Message-ID: <710344EE-A52C-42FA-BA80-F29C0562F002@oracle.com>

> On Sep 8, 2022, at 12:16 AM, John Rose wrote:
>
>> Aligning with the need for a double declaration, we could say that the `expr` part is the formal and external name of the parameter, and the `s` part is the local and internal name of the binding. So:
>>
>>     int colorBlindHack(let int _ = red, let int _ = green, let int b = blue, ...) ...
>>
>> Huh. Looks too close to optional arguments for comfort. And how would you combine it with optional arguments?
>
> P.S. (That is, Painting Shed.) If we allowed Java the label-like syntax adopted by some languages for externally named keyword arguments it might look like this:
>
>     int colorBlindHack(red: int _, green: int _, blue: int, ...)
>
> The last argument keyworded as "blue" is bound to the name "blue"
> in the absence of other indication; the other choices being `blue: int _` for ignored argument and `blue: int b` for a different local name.
>
>     int colorHack(red: int, green: int, blue: int, ...) //keyword arguments
>
> (The "L: FOO" syntax is already a thing in Java, see?)
>
> So many bikesheds, so little time...

Wowww -- this is one of the bikiest sheds I have seen in a long time. I am very impressed.

Completely consistent to those who know the history, therefore very appealing! But also, alas, with the promise of totally confusing newcomers as to whether Java parameter declaration syntax is C-like or Pascal-like. :-)

--Guy

From brian.goetz at oracle.com Thu Sep 8 16:53:21 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 8 Sep 2022 12:53:21 -0400
Subject: Primitives in instanceof and patterns
Message-ID: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>

Earlier in the year we talked about primitive type patterns. Let me summarize the past discussion, what I think the right direction is, and why this is (yet another) "finishing up the job" task for basic patterns that, if left undone, will be a sharp edge.

Prior to record patterns, we didn't support primitive type patterns at all. With records, we now support primitive type patterns as nested patterns, but they are very limited; they are only applicable to exactly their own type.

The motivation for "finishing" primitive type patterns is the same as discussed earlier this week with array patterns -- if pattern matching is the dual of aggregation, we want to avoid gratuitous asymmetries that let you put things together but not take them apart.

Currently, we can assign a `String` to an `Object`, and recover the `String` with a pattern match:

    Object o = "Bob";
    if (o instanceof String s) { println("Hi Bob"); }

Analogously, we can assign an `int` to a `long`:

    long n = 0;

but we cannot yet recover the int with a pattern match:

    if (n instanceof int i) { ... } // error, pattern `int i` not applicable to `long`

To fill out some more of the asymmetries around records if we don't finish the job: given

    record R(int i) { }

we can construct it with

    new R(anInt)     // no adaptation
    new R(aShort)    // widening
    new R(anInteger) // unboxing

but yet cannot deconstruct it the same way:

    case R(int i)     // OK
    case R(short s)   // nope
    case R(Integer i) // nope

It would be a gratuitous asymmetry that we can use pattern matching to recover from reference widening, but not from primitive widening. While many of the arguments against doing primitive type patterns now were of the form "let's keep things simple", I believe that the simpler solution is actually to _finish the job_, because this minimizes asymmetries and potholes that users would otherwise have to maintain a mental catalog of.

Our earlier explorations started (incorrectly, as it turned out), with assignment context. This direction gave us a good push in the right direction, but turned out to not be the right answer. A more careful reading of JLS Ch5 convinced me that the answer lies not in assignment conversion, but _cast conversion_.

#### Stepping back: instanceof

The right place to start is actually not patterns, but `instanceof`. If we start here, and listen carefully to the specification, it leads us to the correct answer.

Today, `instanceof` works only for reference types. Accordingly, most people view `instanceof` as "the subtyping operator" -- because that's the only question we can currently ask it. We almost never see `instanceof` on its own; it is nearly always followed by a cast to the same type. Similarly, we rarely see a cast on its own; it is nearly always preceded by an `instanceof` for the same type.

There's a reason these two operations travel together: casting is, in general, unsafe; we can try to cast an `Object` reference to a `String`, but if the reference refers to another type, the cast will fail.
So to make casting safe, we precede it with an `instanceof` test. The semantics of `instanceof` and casting align such that `instanceof` is the precondition test for safe casting.

> instanceof is the precondition for safe casting

Asking `instanceof T` means "if I cast this to T, would I like the answer." Obviously CCE is an unlikable answer; `instanceof` further adopts the opinion that casting `null` would also be an unlikable answer, because while the cast would succeed, you can't do anything useful with the result.

Currently, `instanceof` is only defined on reference types, and on this domain coincides with subtyping. On the other hand, casting is defined between primitive types (widening, narrowing), and between primitive and reference types (boxing, unboxing). Some casts involving primitives yield "better" results than others; casting `0` to `byte` results in no loss of information, since `0` is representable as a byte, but casting `500` to `byte` succeeds but loses information because the higher order bits are discarded.

If we characterize some casts as "lossy" and others as "exact" -- where lossy means discarding useful information -- we can extend the "safe casting precondition" meaning of `instanceof` to primitive operands and types in the obvious way -- "would casting this expression to this type succeed without error and without information loss." If the type of the expression is not castable to the type we are asking about, we know the cast cannot succeed and reject the `instanceof` test at compile time.

Defining which casts are lossy and which are exact is fairly straightforward; we can appeal to the concept already in the JLS of "representable in the range of a type." For some pairs of types, casting is always exact (e.g., casting `int` to `long` is always exact); we call these "unconditionally exact". For other pairs of types, some values can be cast exactly and others cannot.
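
The distinction between lossy and exact casts can be sketched in today's Java, without the proposed `instanceof` extension. This is only an illustration under the "representable in the range of a type" reading; the class `ExactCasts` and its method names are hypothetical.

```java
public class ExactCasts {
    // int -> byte is exact only when the value is representable as a byte,
    // i.e. narrowing and then widening back yields the same number.
    static boolean isExactByte(int value) {
        return (byte) value == value;
    }

    // int -> long is unconditionally exact: every int value is a valid long.
    static boolean isExactLong(int value) {
        return (long) value == value;
    }

    public static void main(String[] args) {
        System.out.println((byte) 0);         // 0: no information lost
        System.out.println((byte) 500);       // -12: high-order bits discarded
        System.out.println(isExactByte(0));   // true
        System.out.println(isExactByte(500)); // false
        System.out.println(isExactLong(500)); // true, as for any int
    }
}
```

Under the proposal, `500 instanceof byte` would presumably answer the same question as `isExactByte(500)` here.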
Defining which casts are exact gives us a simple and precise semantics for `x instanceof T`: whether `x` can be cast exactly to `T`. Similarly, if the static type of `x` is not castable to `T`, then the corresponding `instanceof` question is rejected statically. The answers are not surprising:

 - Boxing is always exact;
 - Unboxing is exact for all non-null values;
 - Reference widening is always exact;
 - Reference narrowing is exact if the type of the target expression is a
   subtype of the target type;
 - Primitive widening and narrowing are exact if the target expression can be
   represented in the range of the target type.

#### Primitive type patterns

It is a short hop from `instanceof` to patterns (including primitive type patterns, and reference type patterns applied to primitive types), which can be defined entirely in terms of cast conversion and exactness:

 - A type pattern `T t` is applicable to a target of type `S` if `S` is
   cast-convertible to `T`;
 - A type pattern `T t` matches a target `x` if `x` can be cast exactly to `T`;
 - A type pattern `T t` is unconditional at type `S` if casting from `S` to `T`
   is unconditionally exact;
 - A type pattern `T t` dominates a type pattern `S s` (or a record pattern
   `S(...)`) if `T t` would be unconditional on `S`.

While the rules for casting are complex, primitive patterns add no new complexity; there are no new conversions or conversion contexts. If we see:

    switch (a) {
        case T t: ...
    }

we know the case matches if `a` can be cast exactly to `T`, and the pattern is unconditional if _all_ values of `a`'s type can be cast exactly to `T`. Note that none of this is specific to primitives; we derive the semantics of _all_ type patterns from the enhanced definition of casting.

Now, our record deconstruction examples work symmetrically to construction:

    case R(int i)     // OK
    case R(short s)   // test if `i` is in the range of `short`
    case R(Integer i) // box `i` to `Integer`

From emcmanus at google.com Thu Sep 8 17:53:55 2022
From: emcmanus at google.com (Éamonn McManus)
Date: Thu, 8 Sep 2022 10:53:55 -0700
Subject: Primitives in instanceof and patterns
In-Reply-To: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
Message-ID:

This makes a lot of sense. I'm wondering, though, how it works with float and double. I don't see the answer by looking at JLS Ch5. Are the following expressions legal, and if so which ones are true?

1. 2.0 instanceof int
2. 2.5 instanceof int
3. Double.MAX_VALUE instanceof float
4. Math.PI instanceof float
5. Double.NaN instanceof float
6. Double.NaN instanceof double

Intuitively, I think I would expect that if t is an expression of primitive type T, and U is also a primitive type, then t instanceof U is true iff t == (T) (U) t. (Perhaps with a variant of == that is reflexive even for NaN, or perhaps not.) That would make (1) true, (2,3,4) false, and (5,6) who knows.

From brian.goetz at oracle.com Thu Sep 8 20:03:45 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 8 Sep 2022 16:03:45 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To:
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
Message-ID: <601c9f99-da45-5990-21c0-baf3731f19db@oracle.com>

Sigh, floating point. Yes, this is the most difficult corner of this work.

Bear in mind that you're really asking questions about cast conversion. Additionally, we have to define which of these casts are "lossy" vs which are "information preserving" (exact), which for most numbers is straightforward but for weird floating point values might be harder.
So let's first see what happens when we cast your examples:

    jshell> (int) 2.0f
    $1 ==> 2

    jshell> (int) 2.5f
    $2 ==> 2

    jshell> (float) Double.MAX_VALUE
    $3 ==> Infinity

    jshell> (float) Math.PI
    $4 ==> 3.1415927

    jshell> (float) Double.NaN
    $5 ==> NaN

    jshell> (double) Double.NaN
    $6 ==> NaN

Clearly #2 (casting 2.5 to int) is lossy, and therefore would not be exact, so we can cross that one off the list. Similarly for Double.MAX_VALUE to float. It's also pretty clear that Math.PI -- which is merely an alias for 3.14159265358979323846 -- is also not representable in the range of float.

So that leaves:

    2.0 instanceof int
    Double.NaN instanceof float
    Double.NaN instanceof double

The first one is an exact cast; this can be specified in a number of ways, including "sufficient number of low order bits are zero", or "representable in the range of the target type", or others. The binary representation of 2.0f is 01000000000000000000000000000000, which exactly encodes an integer. Again, this is not a rule about `instanceof`; this is derived from casting (though we do have to define what exactness means.)

For NaN, negative zero, and infinity, whether various conversions are exact or not may require a somewhat ad-hoc decision, but I think the intuitive answer for NaN is that Float.NaN <--> Double.NaN is exact, and similarly for Float.Inf <--> Double.Inf. There will be an argument about -0.0 and 0.0.

> Intuitively, I think I would expect that if t is an expression of
> primitive type T, and U is also a primitive type, then t instanceof
> U is true iff t == (T) (U) t. (Perhaps with a variant of == that is
> reflexive even for NaN, or perhaps not.) That would make (1) true,
> (2,3,4) false, and (5,6) who knows.

This is a good intuition, and is true for some types, and almost true for others, but it falls into some holes. In particular, with some conversions, `(U) t` is lossy but `(T) (U) t` is also lossy in an exactly compensating way.
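
One concrete instance of such a hole, offered as an illustration rather than as the case the message had in mind, is `int` to `float` at `Integer.MAX_VALUE`. The sketch below uses today's Java; the class `RoundTrip` and its method names are hypothetical.

```java
public class RoundTrip {
    // The round-trip intuition t == (T) (U) t, with T = int and U = float.
    static boolean roundTrips(int i) {
        return i == (int) (float) i;
    }

    // A stricter check: read the float back as a long, so that saturation
    // at Integer.MAX_VALUE cannot mask the rounding done by the first cast.
    static boolean isExactFloat(int i) {
        return (long) (float) i == (long) i;
    }

    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;               // 2^31 - 1
        // (float) max rounds up to 2^31; casting back to int saturates at
        // 2^31 - 1, so the two lossy steps cancel and the round trip succeeds.
        System.out.println(roundTrips(max));       // true
        System.out.println(isExactFloat(max));     // false
        // Small integers convert exactly under both checks.
        System.out.println(roundTrips(2));         // true
        System.out.println(isExactFloat(2));       // true
        // 2^24 + 1 is not representable as a float; both checks notice.
        System.out.println(roundTrips(16777217));  // false
    }
}
```

Widening the comparison to `long` is one way to expose the compensation; an actual specification would presumably define exactness directly rather than through a round trip.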
I'm wondering, though, how it works with > float and double. I don't see the answer by looking at JLS Ch5. Are > the following expressions legal, and if so which ones are true? > > 1. 2.0 instanceof int > 2. 2.5 instanceof int > 3. Double.MAX_VALUE instanceof float > 4. Math.PI instanceof float > 5. Double.NaN instanceof float > 6. Double.NaN instanceof double > > Intuitively, I think I would expect that if t is an expression of > primitive type T, and U is also a primitive type, then t instanceof > Uis true iff?t == (T) (U) t. (Perhaps with a variant of == that is > reflexive even for NaN, or perhaps not.) That would make (1) true, > (2,3,4) false, and (5,6) who knows. > > On Thu, 8 Sept 2022 at 09:53, Brian Goetz wrote: > > Earlier in the year we talked about primitive type patterns.? Let > me summarize > the past discussion, what I think the right direction is, and why > this is (yet > another) "finishing up the job" task for basic patterns that, if > left undone, > will be a sharp edge. > > Prior to record patterns, we didn't support primitive type > patterns at all. With > records, we now support primitive type patterns as nested > patterns, but they are > very limited; they are only applicable to exactly their own type. > > The motivation for "finishing" primitive type patterns is the same > as discussed > earlier this week with array patterns -- if pattern matching is > the dual of > aggregation, we want to avoid gratuitous asymmetries that let you > put things > together but not take them apart. > > Currently, we can assign a `String` to an `Object`, and recover > the `String` > with a pattern match: > > ??? Object o = "Bob"; > ??? if (o instanceof String s) { println("Hi Bob"); } > > Analogously, we can assign an `int` to a `long`: > > ??? long n = 0; > > but we cannot yet recover the int with a pattern match: > > ??? if (n instanceof int i) { ... 
} // error, pattern `int i` not > applicable to `long` > > To fill out some more of the asymmetries around records if we > don't finish the job: given > > ??? record R(int i) { } > > we can construct it with > > ??? new R(anInt)???? // no adaptation > ??? new R(aShort)??? // widening > ??? new R(anInteger) // unboxing > > but yet cannot deconstruct it the same way: > > ??? case R(int i)???? // OK > ??? case R(short s)?? // nope > ??? case R(Integer i) // nope > > It would be a gratuitous asymmetry that we can use pattern > matching to recover from > reference widening, but not from primitive widening. While many of the > arguments against doing primitive type patterns now were of the > form "let's keep > things simple", I believe that the simpler solution is actually to > _finish the > job_, because this minimizes asymmetries and potholes that users > would otherwise > have to maintain a mental catalog of. > > Our earlier explorations started (incorrectly, as it turned out), with > assignment context.? This direction gave us a good push in the > right direction, > but turned out to not be the right answer.? A more careful reading > of JLS Ch5 > convinced me that the answer lies not in assignment conversion, > but _cast > conversion_. > > #### Stepping back: instanceof > > The right place to start is actually not patterns, but > `instanceof`.? If we > start here, and listen carefully to the specification, it leads us > to the > correct answer. > > Today, `instanceof` works only for reference types. Accordingly, > most people > view `instanceof` as "the subtyping operator" -- because that's > the only > question we can currently ask it.? We almost never see > `instanceof` on its own; > it is nearly always followed by a cast to the same type.? > Similarly, we rarely > see a cast on its own; it is nearly always preceded by an > `instanceof` for the > same type. 
> > There's a reason these two operations travel together: casting is, > in general, > unsafe; we can try to cast an `Object` reference to a `String`, > but if the > reference refers to another type, the cast will fail. So to make > casting safe, > we precede it with an `instanceof` test.? The semantics of > `instanceof` and > casting align such that `instanceof` is the precondition test for > safe casting. > > > instanceof is the precondition for safe casting > > Asking `instanceof T` means "if I cast this to T, would I like the > answer." > Obviously CCE is an unlikable answer; `instanceof` further adopts > the opinion > that casting `null` would also be an unlikable answer, because > while the cast > would succeed, you can't do anything useful with the result. > > Currently, `instanceof` is only defined on reference types, and on > this domain > coincides with subtyping.? On the other hand, casting is defined > between > primitive types (widening, narrowing), and between primitive and > reference types > (boxing, unboxing).? Some casts involving primitives yield > "better" results than > others; casting `0` to `byte` results in no loss of information, > since `0` is > representable as a byte, but casting `500` to `byte` succeeds but > loses > information because the higher order bits are discarded. > > If we characterize some casts as "lossy" and others as "exact" -- > where lossy > means discarding useful information -- we can extend the "safe casting > precondition" meaning of `instanceof` to primitive operands and > types in the > obvious way -- "would casting this expression to this type succeed > without error > and without information loss."? If the type of the expression is > not castable to > the type we are asking about, we know the cast cannot succeed and > reject the > `instanceof` test at compile time. 
> > Defining which casts are lossy and which are exact is fairly > straightforward; we > can appeal to the concept already in the JLS of "representable in > the range of a > type."? For some pairs of types, casting is always exact (e.g., > casting `int` to > `long` is always exact); we call these "unconditionally exact".? > For other pairs > of types, some values can be cast exactly and others cannot. > > Defining which casts are exact gives us a simple and precise > semantics for `x > instanceof T`: whether `x` can be cast exactly to `T`. Similarly, > if the static > type of `x` is not castable to `T`, then the corresponding > `instanceof` question > is rejected statically.? The answers are not suprising: > > ?- Boxing is always exact; > ?- Unboxing is exact for all non-null values; > ?- Reference widening is always exact; > ?- Reference narrowing is exact if the type of the target > expression is a > ?? subtype of the target type; > ?- Primitive widening and narrowing are exact if the target > expression can be > ?? represented in the range of the target type. > > #### Primitive type patterns > > It is a short hop from `instanceof` to patterns (including > primitive type > patterns, and reference type patterns applied to primitive types), > which can be > defined entirely in terms of cast conversion and exactness: > > ?- A type pattern `T t` is applicable to a target of type `S` if > `S` is > ?? cast-convertible to `T`; > ?- A type pattern `T t` matches a target `x` if `x` can be cast > exactly to `T`; > ?- A type pattern `T t` is unconditional at type `S` if casting > from `T` to `S` > ?? is unconditionally exact; > ?- A type pattern `T t` dominates a type pattern `S s` (or a > record pattern > ?? `S(...)`) if `T t` would be unconditional on `S`. > > While the rules for casting are complex, primitive patterns add no new > complexity; there are no new conversions or conversion contexts.? > If we see: > > ??? switch (a) { > ??????? case T t: ... > ??? 
>     }
>
> we know the case matches if `a` can be cast exactly to `T`, and the
> pattern is unconditional if _all_ values of `a`'s type can be cast exactly
> to `T`. Note that none of this is specific to primitives; we derive the
> semantics of _all_ type patterns from the enhanced definition of casting.
>
> Now, our record deconstruction examples work symmetrically to
> construction:
>
>     case R(int i)     // OK
>     case R(short s)   // test if `i` is in the range of `short`
>     case R(Integer i) // box `i` to `Integer`

From alex.buckley at oracle.com Thu Sep 8 22:32:42 2022
From: alex.buckley at oracle.com (Alex Buckley)
Date: Thu, 8 Sep 2022 15:32:42 -0700
Subject: Unnamed variables and match-all patterns
In-Reply-To: <4ae3daa5-af90-6690-730d-5e41ad591547@oracle.com>
References: <8b234bbf-dc9e-62dd-30f6-c1e2bd0480ec@oracle.com>
 <1010692310.705191.1662587022219.JavaMail.zimbra@u-pem.fr>
 <4ae3daa5-af90-6690-730d-5e41ad591547@oracle.com>
Message-ID:

On 9/8/2022 5:22 AM, Brian Goetz wrote:
> For the people who complain about this, I don't think it's about saving
> a few characters in the declaration, as much as satisfying static
> analysis that complains about unused parameters. But I suspect that
> many of these have already become lambdas (this happened most commonly
> with anonymous classes previously). So I'm willing to do the experiment
> of A first and see if we need to take the next step.

A longstanding request is for method parameters that are implicitly
final, so that static analysis can point out dumb assignments to them in
the method body. I suspect a lot of requestors are actually thinking
about constructor parameters, where useless self-assignment (`firstName
= firstName;`) is a tripwire for Java beginners.
Of course, record classes sidestep the assignment boilerplate completely, but being able to denote a constructor parameter as unusable, and therefore unused, and therefore not contributory to the state of an object, feels like it has some utility. This speaks to keeping an open mind about D, even if A is the first step. Alex From forax at univ-mlv.fr Fri Sep 9 15:35:44 2022 From: forax at univ-mlv.fr (Remi Forax) Date: Fri, 9 Sep 2022 17:35:44 +0200 (CEST) Subject: Primitives in instanceof and patterns In-Reply-To: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> Message-ID: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "amber-spec-experts" > Sent: Thursday, September 8, 2022 6:53:21 PM > Subject: Primitives in instanceof and patterns > Earlier in the year we talked about primitive type patterns. Let me summarize > the past discussion, what I think the right direction is, and why this is (yet > another) "finishing up the job" task for basic patterns that, if left undone, > will be a sharp edge. > Prior to record patterns, we didn't support primitive type patterns at all. With > records, we now support primitive type patterns as nested patterns, but they are > very limited; they are only applicable to exactly their own type. > The motivation for "finishing" primitive type patterns is the same as discussed > earlier this week with array patterns -- if pattern matching is the dual of > aggregation, we want to avoid gratuitous asymmetries that let you put things > together but not take them apart. > Currently, we can assign a `String` to an `Object`, and recover the `String` > with a pattern match: > Object o = "Bob"; > if (o instanceof String s) { println("Hi Bob"); } > Analogously, we can assign an `int` to a `long`: > long n = 0; > but we cannot yet recover the int with a pattern match: > if (n instanceof int i) { ... 
} // error, pattern `int i` not applicable to > `long` > To fill out some more of the asymmetries around records if we don't finish the > job: given > record R(int i) { } > we can construct it with > new R(anInt) // no adaptation > new R(aShort) // widening > new R(anInteger) // unboxing > but yet cannot deconstruct it the same way: > case R(int i) // OK > case R(short s) // nope > case R(Integer i) // nope > It would be a gratuitous asymmetry that we can use pattern matching to recover > from > reference widening, but not from primitive widening. While many of the > arguments against doing primitive type patterns now were of the form "let's keep > things simple", I believe that the simpler solution is actually to _finish the > job_, because this minimizes asymmetries and potholes that users would otherwise > have to maintain a mental catalog of. > Our earlier explorations started (incorrectly, as it turned out), with > assignment context. This direction gave us a good push in the right direction, > but turned out to not be the right answer. A more careful reading of JLS Ch5 > convinced me that the answer lies not in assignment conversion, but _cast > conversion_. > #### Stepping back: instanceof > The right place to start is actually not patterns, but `instanceof`. If we > start here, and listen carefully to the specification, it leads us to the > correct answer. > Today, `instanceof` works only for reference types. Accordingly, most people > view `instanceof` as "the subtyping operator" -- because that's the only > question we can currently ask it. We almost never see `instanceof` on its own; > it is nearly always followed by a cast to the same type. Similarly, we rarely > see a cast on its own; it is nearly always preceded by an `instanceof` for the > same type. 
> There's a reason these two operations travel together: casting is, in general, > unsafe; we can try to cast an `Object` reference to a `String`, but if the > reference refers to another type, the cast will fail. So to make casting safe, > we precede it with an `instanceof` test. The semantics of `instanceof` and > casting align such that `instanceof` is the precondition test for safe casting. > > instanceof is the precondition for safe casting > Asking `instanceof T` means "if I cast this to T, would I like the answer." > Obviously CCE is an unlikable answer; `instanceof` further adopts the opinion > that casting `null` would also be an unlikable answer, because while the cast > would succeed, you can't do anything useful with the result. > Currently, `instanceof` is only defined on reference types, and on this domain > coincides with subtyping. On the other hand, casting is defined between > primitive types (widening, narrowing), and between primitive and reference types > (boxing, unboxing). Some casts involving primitives yield "better" results than > others; casting `0` to `byte` results in no loss of information, since `0` is > representable as a byte, but casting `500` to `byte` succeeds but loses > information because the higher order bits are discarded. > If we characterize some casts as "lossy" and others as "exact" -- where lossy > means discarding useful information -- we can extend the "safe casting > precondition" meaning of `instanceof` to primitive operands and types in the > obvious way -- "would casting this expression to this type succeed without error > and without information loss." If the type of the expression is not castable to > the type we are asking about, we know the cast cannot succeed and reject the > `instanceof` test at compile time. > Defining which casts are lossy and which are exact is fairly straightforward; we > can appeal to the concept already in the JLS of "representable in the range of a > type." 
> For some pairs of types, casting is always exact (e.g., casting `int` to
> `long` is always exact); we call these "unconditionally exact". For other
> pairs of types, some values can be cast exactly and others cannot.
>
> Defining which casts are exact gives us a simple and precise semantics
> for `x instanceof T`: whether `x` can be cast exactly to `T`. Similarly,
> if the static type of `x` is not castable to `T`, then the corresponding
> `instanceof` question is rejected statically. The answers are not
> surprising:
>
>  - Boxing is always exact;
>  - Unboxing is exact for all non-null values;
>  - Reference widening is always exact;
>  - Reference narrowing is exact if the type of the target expression is a
>    subtype of the target type;
>  - Primitive widening and narrowing are exact if the target expression can
>    be represented in the range of the target type.
>
> #### Primitive type patterns
>
> It is a short hop from `instanceof` to patterns (including primitive type
> patterns, and reference type patterns applied to primitive types), which
> can be defined entirely in terms of cast conversion and exactness:
>
>  - A type pattern `T t` is applicable to a target of type `S` if `S` is
>    cast-convertible to `T`;
>  - A type pattern `T t` matches a target `x` if `x` can be cast exactly to
>    `T`;
>  - A type pattern `T t` is unconditional at type `S` if casting from `S`
>    to `T` is unconditionally exact;
>  - A type pattern `T t` dominates a type pattern `S s` (or a record
>    pattern `S(...)`) if `T t` would be unconditional on `S`.
>
> While the rules for casting are complex, primitive patterns add no new
> complexity; there are no new conversions or conversion contexts. If we
> see:
>
>     switch (a) {
>         case T t: ...
>     }
>
> we know the case matches if `a` can be cast exactly to `T`, and the
> pattern is unconditional if _all_ values of `a`'s type can be cast exactly
> to `T`.
> Note that none of this is specific to primitives; we derive the semantics
> of _all_ type patterns from the enhanced definition of casting.
>
> Now, our record deconstruction examples work symmetrically to
> construction:
>
>     case R(int i)     // OK
>     case R(short s)   // test if `i` is in the range of `short`
>     case R(Integer i) // box `i` to `Integer`

I think we have to be careful with your notion of dual here. A record
canonical constructor and a deconstructing pattern are dual, but that is a
special case because the deconstructing pattern always matches; once you
introduce patterns that may or may not match, there is no duality anymore.

The primitive pattern you propose is clearly not the dual of the cast
conversions, because the casting conversions are verified by the compiler
while some of the primitive patterns you propose are checked at runtime.

As an example, if there is a method declared like this

  static void m(int i) { ... }

and this method is called with a short,

  short s = ...
  m(s);

there is an implicit conversion from short to int, and if the first
parameter of m is not compatible, a compiler error occurs.

If you compare with the corresponding pattern

  int i = ...
  switch(i) {
    case short s -> ...
  }

The semantics you propose is not to emit a compile error but to check at
runtime whether the value "i" is between Short.MIN_VALUE and
Short.MAX_VALUE.

So there is perhaps a syntactic duality, but clearly there is no semantic
duality.

Moreover, the semantics you propose is not aligned with the concept of
data-oriented programming, which says that the data are more important
than the code, so that we should try to raise a compile error when the
data changes, to help the developer change the code.

If we take a simple example

  record Point(int x, int y) { }
  Point point = ...
  switch(point) {
    case Point(int i, int j) -> ...
    ...
  }

let's say now that we change Point to use longs

  record Point(long x, long y) { }

With the semantics you propose, the code still compiles, but the pattern
is now transformed into a partial pattern that will not match all Points,
only the ones with x and y between Integer.MIN_VALUE and
Integer.MAX_VALUE.

I believe this is exactly what Stephen Colebourne was complaining about
when we discussed the previous iteration of this spec: the semantics of
the primitive pattern changes depending on the definition of the data.

The remark of Tagir about array patterns also applies here: having a named
pattern like Short.asShort() makes the semantics far clearer, because it
disambiguates between a pattern that requests a conversion and a pattern
that does a conversion because the data definition has changed.

And I'm still worried that we are muddying the water here: instanceof is
about instances and the subtyping relationship (hence the name); extending
it to cover non-instance / primitive values is very confusing.

regards,
Rémi

From forax at univ-mlv.fr Fri Sep 9 16:09:03 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 9 Sep 2022 18:09:03 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To:
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
Message-ID: <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Tagir Valeev", "amber-spec-experts"
> Sent: Thursday, September 8, 2022 12:15:04 AM
> Subject: Re: Array patterns (and varargs patterns)

>> For me, Arrays.of() is a named pattern with a vararg list of bindings, no ?
> > Its a named pattern, but to work, it would need varargs patterns -- and > array patterns are the underpinnings of varargs, just as array creation > is the underpinning of varargs invocation.? We're not going to do > varargs patterns differently than we do varargs invocation, just to > avoid doing array patterns -- that would be silly. Here we want to extract the value into bindings/variables, that is not what the varargs does, the varargs takes a bunch of value on stack and put them into an array. Here we want the opposite operation of a varargs, the spread (or splat) operator that takes the argument from an array (or a collection ?) and put them on the stack. If we have the pattern method Arrays.of() static pattern (T...) of(T[] array) { // here it's a varargs ... } and we call it using a named pattern switch(array) { case Arrays.of(/* insert a syntax here */) -> ... the syntax should extract some/all values of the array into one or several bindings. If we are in Caml, we have the :: operator to separate the first element from the rest switch(array) { case Arrays.of(String first :: String[] rest) -> ... If we are in JavaScript, we have the spread operator (notice that the ... is before the type) switch(array) { case Arrays.of(String first, ... String[] rest) -> ... So the varargs is at the declaration side, at the pattern side we need a new operator spread, so i think that adding an array pattern now is not a good idea. regards, R?mi >> >>>> With best regards, >>>> Tagir Valeev. >>>> >>>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz wrote: >>>>> We dropped this out of the record patterns JEP, but I think it is time to >>>>> revisit this. >>>>> >>>>> The concept of array patterns was pretty straightforward; they mimic the nesting >>>>> and exhaustiveness rules of record patterns, they are just a different sort of >>>>> container for nested patterns. And they have an obvious duality with array >>>>> creation expressions. 
>>>>> >>>>> The main open question here was how we distinguish between "match an array of >>>>> length exactly N" (where there are N nested patterns) and "match an array of >>>>> length at least N". We toyed with the idea of a "..." indicator to mean "more >>>>> elements", but this felt a little forced and opened new questions. >>>>> >>>>> It later occurred to me that there is another place to nest a pattern in an >>>>> array pattern -- to match (and bind) the length. In the following, assume for >>>>> sake of exposition that "_" is the "any" pattern (matches everything, binds >>>>> nothing) and that we have some way to denote a constant pattern, which I'll >>>>> denote here with a constant literal. >>>>> >>>>> There is an obvious place to put this (optional) pattern: in between the >>>>> brackets. So: >>>>> >>>>> case String[1] { P }: >>>>> ^ a constant pattern >>>>> >>>>> would match string arrays of length 1 whose sole element matches P. And >>>>> >>>>> case String[] { P, Q } >>>>> >>>>> would match string arrays of length exactly 2, whose first two elements match P >>>>> and Q respectively. (If the length pattern is not specified, we infer a >>>>> constant pattern whose constant is equal to the length of the nested pattern >>>>> list.) >>>>> >>>>> Matching a target to `String[L] { P0, .., Pn }` means >>>>> >>>>> x instanceof String[] arr >>>>> && arr.length matches L >>>>> && arr.length >= n >>>>> && arr[0] matches P0 >>>>> && arr[1] matches P1 >>>>> ... >>>>> && arr[n] matches Pn >>>>> >>>>> More examples: >>>>> >>>>> case String[int len] { P } >>>>> >>>>> would match string arrays of length >= 1 whose first element matches P, and >>>>> further binds the array length to `len`. >>>>> >>>>> case String[_] { P, Q } >>>>> >>>>> would match string arrays of any length whose first two elements match P and Q. >>>>> >>>>> case String[3] { } >>>>> ^constant pattern >>>>> >>>>> matches all string arrays of length 3. 
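Read as plain Java, the matching rule quoted above for `String[L] { P0, .., Pn }` might look like the sketch below. This is only an illustration under stated assumptions: `ArrayPatternSketch`, `matches`, `exactly` and `anyLength` are hypothetical names, nested patterns are modelled as `Predicate<String>` (so bindings are ignored), and the arity check is read as "at least as many elements as there are nested patterns". It is not the proposed translation, just a way to see the exact-match and prefix-match cases side by side.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.function.Predicate;

final class ArrayPatternSketch {
    // Hypothetical stand-in for matching `String[L] { P0, ..., Pk }`.
    static boolean matches(Object target,
                           IntPredicate lengthPattern,
                           List<Predicate<String>> elementPatterns) {
        if (!(target instanceof String[] arr)) return false;     // x instanceof String[] arr
        if (!lengthPattern.test(arr.length)) return false;       // arr.length matches L
        if (arr.length < elementPatterns.size()) return false;   // enough elements for the prefix
        for (int i = 0; i < elementPatterns.size(); i++) {
            if (!elementPatterns.get(i).test(arr[i])) return false;  // arr[i] matches Pi
        }
        return true;
    }

    // Models a constant length pattern such as `String[3]`.
    static IntPredicate exactly(int n) { return len -> len == n; }

    // Models the "any" length pattern `String[_]`.
    static IntPredicate anyLength() { return len -> true; }

    public static void main(String[] args) {
        Predicate<String> any = s -> true;
        String[] three = { "a", "b", "c" };

        // String[] { P, Q }: the omitted length pattern defaults to the constant 2
        System.out.println(matches(three, exactly(2), List.of(any, any)));
        // String[_] { P, Q }: prefix match on any length
        System.out.println(matches(three, anyLength(), List.of(any, any)));
        // String[3] { }: length only
        System.out.println(matches(three, exactly(3), List.of()));
    }
}
```

The only difference between the exact and the prefix case in this model is which length predicate is supplied, which mirrors the idea of treating the length as one more nested pattern.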
>>>>> >>>>> >>>>> This is a more principled way to do it, because the length is a part of the >>>>> array and deserves a chance to match via nested patterns, just as with the >>>>> elements, and it avoid trying to give "..." a new meaning. >>>>> >>>>> The downside is that it might be confusing at first (though people will learn >>>>> quickly enough) how to distinguish between an exact match and a prefix match. >>>>> >>>>> >>>>> >>>>> >>>>> On 1/5/2021 1:48 PM, Brian Goetz wrote: >>>>> >>>>> As we get into the next round of pattern matching, I'd like to opportunistically >>>>> attach another sub-feature: array patterns. (This also bears on the question >>>>> of "how would varargs patterns work", which I'll address below, though they >>>>> might come later.) >>>>> >>>>> ## Array Patterns >>>>> >>>>> If we want to create a new array, we do so with an array construction >>>>> expression: >>>>> >>>>> new String[] { "a", "b" } >>>>> >>>>> Since each form of aggregation should have its dual in destructuring, the >>>>> natural way to represent an array pattern (h/t to AlanM for suggesting this) >>>>> is: >>>>> >>>>> if (arr instanceof String[] { var a, var b }) { ... } >>>>> >>>>> Here, the applicability test is: "are you an instanceof of String[], with length >>>>> = 2", and if so, we cast to String[], extract the two elements, and match them >>>>> to the nested patterns `var a` and `var b`. This is the natural analogue of >>>>> deconstruction patterns for arrays, complete with nesting. >>>>> >>>>> Since an array can have more elements, we likely need a way to say "length >= 2" >>>>> rather than simply "length == 2". There are multiple syntactic ways to get >>>>> there, for now I'm going to write >>>>> >>>>> if (arr instanceof String[] { var a, var b, ... }) >>>>> >>>>> to indicate "more". The "..." matches zero or more elements and binds nothing. 
>>>>> >>>>> >>>>> People are immediately going to ask "can I bind something to the remainder"; I >>>>> think this is mostly an "attractive distraction", and would prefer to not have >>>>> this dominate the discussion. >>>>> >>>>> >>>>> Here's an example from the JDK that could use this effectively: >>>>> >>>>> String[] limits = limitString.split(":"); >>>>> try { >>>>> switch (limits.length) { >>>>> case 2: { >>>>> if (!limits[1].equals("*")) >>>>> setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1])); >>>>> } >>>>> case 1: { >>>>> if (!limits[0].equals("*")) >>>>> setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0])); >>>>> } >>>>> } >>>>> } >>>>> catch(NumberFormatException ex) { >>>>> setMultilineLimit(MultilineLimit.DEPTH, -1); >>>>> setMultilineLimit(MultilineLimit.LENGTH, -1); >>>>> } >>>>> >>>>> becomes (eventually) >>>>> >>>>> switch (limitString.split(":")) { >>>>> case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i); >>>>> case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i); >>>>> default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); } >>>>> } >>>>> >>>>> Note how not only does this become more compact, but the unchecked >>>>> "NumberFormatException" is folded into the match, rather than being a separate >>>>> concern. >>>>> >>>>> >>>>> ## Varargs patterns >>>>> >>>>> Having array patterns offers us a natural way to interpret deconstruction >>>>> patterns for varargs records. Assume we have: >>>>> >>>>> void m(X... xs) { } >>>>> >>>>> Then a varargs invocation >>>>> >>>>> m(a, b, c) >>>>> >>>>> is really sugar for >>>>> >>>>> m(new X[] { a, b, c }) >>>>> >>>>> So the dual of a varargs invocation, a varargs match, is really a match to an >>>>> array pattern. So for a record >>>>> >>>>> record R(X... 
xs) { }
>>>>>
>>>>> a varargs match:
>>>>>
>>>>>     case R(var a, var b, var c):
>>>>>
>>>>> is really sugar for an array match:
>>>>>
>>>>>     case R(X[] { var a, var b, var c }):
>>>>>
>>>>> And similarly, we can use our "more arity" indicator:
>>>>>
>>>>>     case R(var a, var b, var c, ...):
>>>>>
>>>>> to indicate that there are at least three elements.
>>>>>

From brian.goetz at oracle.com Fri Sep 9 18:07:41 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 14:07:41 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID:

>
> The semantics you propose is not to emit a compile error but at
> runtime to check if the value "i" is between Short.MIN_VALUE and
> Short.MAX_VALUE.
>
> So there is perhaps a syntactic duality but clearly there is no
> semantics duality.

Of course there is a semantic duality here. Specifically, `int` and
`short` are related by an _embedding-projection pair_. Briefly: given two
sets A and B (think "B" for "bigger"), an approximation metric on B (a
complete partial ordering), and a pair of functions `e : A -> B` and
`p : B -> A`, they form an e-p pair if (a) p . e is the identity function
(dot is compose), and (b) e . p produces an approximation of the input
(according to the metric.)

The details are not critical here (though this algebraic structure shows
up everywhere in our work if you look closely), but the point remains:
there is an algebraic duality here. Yes, when going in one direction, no
runtime tests are needed; when going in the other direction, a runtime
test is needed, because that direction may be lossy. Just like with
`instanceof String` / `case String s` today.

Anyway, I don't think you're saying what you really mean.
Let's not get caught up in silly arguments about what "dual" means; that
won't be helpful.

> Moreover, the semantics you propose is not aligned with the concept of
> data oriented programming which says that the data are more important
> than the code so that we should try to raise a compile error when the
> data changed to help the developer to change the code.
>
> If we take a simple example
>   record Point(int x, int y) { }
>   Point point = ...
>   switch(point) {
>    case Point(int i, int j) -> ...
>    ...
>   }
>
> let's say now that we change Point to use longs
>   record Point(long x, long y) { }
>
> With the semantics you propose, the code still compiles but the pattern
> is now transformed to a partial pattern that will not match all Points
> but only the ones with x and y in between Integer.MIN_VALUE and
> Integer.MAX_VALUE.

This is an extraneous argument; if you change the declaration of Point to
take two Strings, of course all the use sites will change their meaning.
Maybe they'll still compile but mean something else, maybe they will be
errors. Patterns are not special here; the semantics of nearly all
language features (assignment, arithmetic, etc) will change when you
change the type of the underlying arguments. That the meaning of patterns
also changes when you change the types involved is just more of the same.

> I believe this is exactly what Stephen Colebourne was complaining about
> when we discussed the previous iteration of this spec, the semantics of
> the primitive pattern change depending on the definition of the data.

I think what Stephen didn't like is that there is no syntactic difference
between a total and partial pattern at the use site. And I get why that
made him uncomfortable; it's a valid concern, and one could imagine
designing the language so that total and partial patterns look different.
This is one of the tradeoffs we have made; I do still think we picked a
good one.
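For readers who want to see what the partial match in the Point example amounts to, here is a hand-written approximation in current Java. It assumes that the proposed meaning of `case Point(int i, int j)` against `record Point(long x, long y)` is "both components survive a round trip through `int`"; `PointNarrowing` and `describe` are hypothetical names, and under the proposal none of this would need to be written out by hand.

```java
final class PointNarrowing {
    // The "after" version of the record from the example: components are now long.
    record Point(long x, long y) { }

    // Hypothetical hand-written equivalent of a switch whose first case is
    // `case Point(int i, int j)` and whose second case matches any Point.
    static String describe(Point p) {
        // Partial case: taken only when both components narrow to int exactly.
        if ((int) p.x() == p.x() && (int) p.y() == p.y()) {
            int i = (int) p.x();
            int j = (int) p.y();
            return "int-sized point (" + i + ", " + j + ")";
        }
        // Total case: every remaining Point lands here.
        return "needs long: (" + p.x() + ", " + p.y() + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(new Point(1, 2)));
        System.out.println(describe(new Point(1L << 40, 0)));
    }
}
```

The use site gives no syntactic hint that the first branch became partial when the record changed, which is the tradeoff discussed above.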

> The remark of Tagir about array pattern also works here, having a
> named pattern like Short.asShort() makes the semantics far clearer
> because it disambiguates between a pattern that requests a conversion
> and a pattern that does a conversion because the data definition has
> changed.

If the language didn't support primitive widening in assignment / method
invocation context (as Golang does not), and instead said "use
Integer::toLong (or Long::fromInteger) to convert int -> long", then yes,
the natural duality would be to also represent these as named patterns;
then conversions in both directions are mediated by API points, total in
one direction, partial in the other. But that's not the language we have!
The language we have allows us to provide an int where a long is needed,
and the language does the needful. Pattern matching allows us to recover
whether a value came from a certain type, even after we've lost the static
type information. Just as we can recover the String-ness here:

    Object o = "Foo";
    if (o instanceof String s) { ... }

because reference type patterns are willing to conditionally reverse
reference widening, all the same arguments apply to

    long n = 3;
    if (n instanceof int i) { ... }

And not allowing this makes the language *more* complicated, because now
some conversions are reversible and some are not, for ad-hoc reasons that
no one will be able to understand. Can you offer any compelling reason why
we should be able to recover the String-ness of `o` after a widening, but
not the int-ness of `n` after a widening?

> And I'm still worried that we are muddying the water here, instanceof is
> about instance and subtyping relationship (hence the name),
> extending it to cover non-instance / primitive value is very confusing.

Sorry, this is a cheap rhetorical trick; declaring words to mean what you
want them to mean, and then pointing to that meaning as a way to close the
argument.
Yes, saying "instanceof T is about subtyping" is a useful mental model
*when the only types you can apply it to are those related by inclusion
polymorphism*. But the restriction of instanceof to reference types is
arbitrary (and we've already decided to allow patterns in instanceof,
which are surely not mere subtyping.)

Regardless, a better way to think about `instanceof` is that it is the
precondition for "would a cast to this type be safe and useful." In the
world where we restrict to reference types, the two notions coincide. But
the safe-cast-precondition is clearly more general (this is like the
difference between defining the function 2^n on Z, vs on R or C; of course
they have to agree at the integers, but the continuous exponential
function is far more useful than the discrete one.) Moreover, the general
mental model is just as simple: how do you know a cast is safe? Ask
instanceof. What does safe mean? No error or material loss of precision.

A more reasonable way to state this objection would be: "most users
believe that `instanceof` is purely about subtyping, and it will take some
work to bring them around to a more general interpretation, how are we
going to do that?"

Jumping up a level, you're throwing a lot of arguments at the wall that
mostly come down to "I don't like this feature, so let me try and
undermine it." That's not a particularly helpful way to go about this, and
none of the arguments so far have been very compelling (nor are they new
from the last time we went around on it.) I get that you would like
pattern matching to have a more "surface" role in the language; that's a
valid opinion. But I would also like you to try harder to understand what
we're trying to achieve and why we're pushing it deeper, and to respond to
the substance of the proposal rather than just saying "YAGNI".

(I strongly encourage everyone to re-read JLS Ch5, and to meditate on
*why* we have the particular conversions in the contexts we have.
They're complex, but not arbitrary; if you listen closely to the
specification, it sometimes whispers to you.)

From brian.goetz at oracle.com Fri Sep 9 18:29:37 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 14:29:37 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
 <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
Message-ID: <7a7f81f9-8b6b-635c-fe54-4605610570c6@oracle.com>

Again, look for the embedding-projection pairs. The sets involved are T^n
and T[]. The array creation operator is an embedding from T^n to T[]; the
missing dual is the projection from T[] to T^k (for specific k.)
Projections are partial (or lossy), so these are patterns rather than
total functions. The dual of packing an array from a list of expressions
is unpacking the elements into a list of variables.

When I pack an array:

    String[] ss = new String[] { "Hi", "Bob" };

this has a similar feel to

    Object o = "Bob";

in that we've thrown away some static typing information (in the former,
that the array has length two.) But this information is retained
dynamically, and we can recover it with a runtime test. Asking

    if (o instanceof String s) { ... }

is asking "was the last assignment to `o` from a String". Asking

    if (ss instanceof String[] { var a, var b }) { ... }

is asking "was the last assignment to ss a String[] with two elements"
(and similar for other configurations of the nested patterns.) In both
cases, we are asking the same generalized question: could this { object,
array } have come from an assignment / creation expression that has a
certain shape.

I get it; you don't find this feature compelling.
You've said that already, and now we're just going in circles. Your mail
reads to me like "it's a bad idea because I think it's a bad idea."

Yes, other languages approach this in different ways; Caml deconstructs
into (head, tail) because its fundamental data structure is a cons list.
That makes sense given how the language works. Java works differently, so
transplanting from Caml or Javascript is not always going to be a good
answer.

Remember the pattern mantra: each aggregation idiom in the language should
have a corresponding form of deconstruction pattern. Constructors have
deconstruction patterns; factory methods will eventually have named static
patterns; if we add collection literals, there will be collection
patterns, etc. If an aggregation form lacks a corresponding dual, this
turns into an asymmetry which in turn means *destructuring cannot compose
the same way aggregation composes*. This is bad! Arrays have their own
special form of aggregation (array creation expression); array patterns
are the corresponding destructuring.

I encourage you to re-read
https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model
, and the "red ball" API examples, to see what I mean. This is about
composability, not about whether any given form of pattern "pays its
weight."

So again, please try harder to engage with _why do we think this is
important_, and the specifics of what has been proposed, rather than just
waving the YAGNI stick. There's a bigger picture here.

>>> For me, Arrays.of() is a named pattern with a vararg list of bindings, no ?
>> It's a named pattern, but to work, it would need varargs patterns -- and
>> array patterns are the underpinnings of varargs, just as array creation
>> is the underpinning of varargs invocation. We're not going to do
>> varargs patterns differently than we do varargs invocation, just to
>> avoid doing array patterns -- that would be silly.
> Here we want to extract the value into bindings/variables, that is not what the varargs does, the varargs takes a bunch of value on stack and put them into an array.
> Here we want the opposite operation of a varargs, the spread (or splat) operator that takes the argument from an array (or a collection ?) and put them on the stack.
>
> If we have the pattern method Arrays.of()
>
>   static pattern (T...) of(T[] array) {  // here it's a varargs
>     ...
>   }
>
> and we call it using a named pattern
>   switch(array) {
>     case Arrays.of(/* insert a syntax here */) -> ...
>
> the syntax should extract some/all values of the array into one or several bindings.
>
> If we are in Caml, we have the :: operator to separate the first element from the rest
>   switch(array) {
>     case Arrays.of(String first :: String[] rest) -> ...
>
> If we are in JavaScript, we have the spread operator (notice that the ... is before the type)
>   switch(array) {
>     case Arrays.of(String first, ... String[] rest) -> ...
>
> So the varargs is at the declaration side, at the pattern side we need a new operator spread, so i think that adding an array pattern now is not a good idea.
>
> regards,
> Rémi
>
>>>>> With best regards,
>>>>> Tagir Valeev.
>>>>>
>>>>> On Tue, Sep 6, 2022 at 11:11 PM Brian Goetz wrote:
>>>>>> We dropped this out of the record patterns JEP, but I think it is time to revisit this.
>>>>>>
>>>>>> The concept of array patterns was pretty straightforward; they mimic the nesting and exhaustiveness rules of record patterns, they are just a different sort of container for nested patterns. And they have an obvious duality with array creation expressions.
>>>>>>
>>>>>> The main open question here was how we distinguish between "match an array of length exactly N" (where there are N nested patterns) and "match an array of length at least N". We toyed with the idea of a "..." indicator to mean "more elements", but this felt a little forced and opened new questions.
>>>>>>
>>>>>> It later occurred to me that there is another place to nest a pattern in an array pattern -- to match (and bind) the length. In the following, assume for sake of exposition that "_" is the "any" pattern (matches everything, binds nothing) and that we have some way to denote a constant pattern, which I'll denote here with a constant literal.
>>>>>>
>>>>>> There is an obvious place to put this (optional) pattern: in between the brackets. So:
>>>>>>
>>>>>>     case String[1] { P }:
>>>>>>                 ^ a constant pattern
>>>>>>
>>>>>> would match string arrays of length 1 whose sole element matches P. And
>>>>>>
>>>>>>     case String[] { P, Q }
>>>>>>
>>>>>> would match string arrays of length exactly 2, whose first two elements match P and Q respectively. (If the length pattern is not specified, we infer a constant pattern whose constant is equal to the length of the nested pattern list.)
>>>>>>
>>>>>> Matching a target to `String[L] { P0, .., Pn }` means
>>>>>>
>>>>>>     x instanceof String[] arr
>>>>>>         && arr.length matches L
>>>>>>         && arr.length >= n
>>>>>>         && arr[0] matches P0
>>>>>>         && arr[1] matches P1
>>>>>>         ...
>>>>>>         && arr[n] matches Pn
>>>>>>
>>>>>> More examples:
>>>>>>
>>>>>>     case String[int len] { P }
>>>>>>
>>>>>> would match string arrays of length >= 1 whose first element matches P, and further binds the array length to `len`.
>>>>>>
>>>>>>     case String[_] { P, Q }
>>>>>>
>>>>>> would match string arrays of any length whose first two elements match P and Q.
>>>>>>
>>>>>>     case String[3] { }
>>>>>>                 ^constant pattern
>>>>>>
>>>>>> matches all string arrays of length 3.
>>>>>>
>>>>>>
>>>>>> This is a more principled way to do it, because the length is a part of the array and deserves a chance to match via nested patterns, just as with the elements, and it avoid trying to give "..." a new meaning.
>>>>>>
>>>>>> The downside is that it might be confusing at first (though people will learn quickly enough) how to distinguish between an exact match and a prefix match.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>>>>>>
>>>>>> As we get into the next round of pattern matching, I'd like to opportunistically attach another sub-feature: array patterns. (This also bears on the question of "how would varargs patterns work", which I'll address below, though they might come later.)
>>>>>>
>>>>>> ## Array Patterns
>>>>>>
>>>>>> If we want to create a new array, we do so with an array construction expression:
>>>>>>
>>>>>>     new String[] { "a", "b" }
>>>>>>
>>>>>> Since each form of aggregation should have its dual in destructuring, the natural way to represent an array pattern (h/t to AlanM for suggesting this) is:
>>>>>>
>>>>>>     if (arr instanceof String[] { var a, var b }) { ... }
>>>>>>
>>>>>> Here, the applicability test is: "are you an instanceof of String[], with length = 2", and if so, we cast to String[], extract the two elements, and match them to the nested patterns `var a` and `var b`. This is the natural analogue of deconstruction patterns for arrays, complete with nesting.
>>>>>>
>>>>>> Since an array can have more elements, we likely need a way to say "length >= 2" rather than simply "length == 2". There are multiple syntactic ways to get there, for now I'm going to write
>>>>>>
>>>>>>     if (arr instanceof String[] { var a, var b, ... })
>>>>>>
>>>>>> to indicate "more". The "..." matches zero or more elements and binds nothing.
>>>>>>
>>>>>>
>>>>>> People are immediately going to ask "can I bind something to the remainder"; I think this is mostly an "attractive distraction", and would prefer to not have this dominate the discussion.
>>>>>>
>>>>>>
>>>>>> Here's an example from the JDK that could use this effectively:
>>>>>>
>>>>>> String[] limits = limitString.split(":");
>>>>>> try {
>>>>>>     switch (limits.length) {
>>>>>>         case 2: {
>>>>>>             if (!limits[1].equals("*"))
>>>>>>                 setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>>>>>         }
>>>>>>         case 1: {
>>>>>>             if (!limits[0].equals("*"))
>>>>>>                 setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>> catch(NumberFormatException ex) {
>>>>>>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>>>>>>     setMultilineLimit(MultilineLimit.LENGTH, -1);
>>>>>> }
>>>>>>
>>>>>> becomes (eventually)
>>>>>>
>>>>>> switch (limitString.split(":")) {
>>>>>>     case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>>>>>     case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>>>>>     default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>>>>>> }
>>>>>>
>>>>>> Note how not only does this become more compact, but the unchecked "NumberFormatException" is folded into the match, rather than being a separate concern.
>>>>>>
>>>>>>
>>>>>> ## Varargs patterns
>>>>>>
>>>>>> Having array patterns offers us a natural way to interpret deconstruction patterns for varargs records. Assume we have:
>>>>>>
>>>>>>     void m(X... xs) { }
>>>>>>
>>>>>> Then a varargs invocation
>>>>>>
>>>>>>     m(a, b, c)
>>>>>>
>>>>>> is really sugar for
>>>>>>
>>>>>>     m(new X[] { a, b, c })
>>>>>>
>>>>>> So the dual of a varargs invocation, a varargs match, is really a match to an array pattern. So for a record
>>>>>>
>>>>>>     record R(X... xs) { }
>>>>>>
>>>>>> a varargs match:
>>>>>>
>>>>>>     case R(var a, var b, var c):
>>>>>>
>>>>>> is really sugar for an array match:
>>>>>>
>>>>>>     case R(X[] { var a, var b, var c }):
>>>>>>
>>>>>> And similarly, we can use our "more arity" indicator:
>>>>>>
>>>>>>     case R(var a, var b, var c, ...):
>>>>>>
>>>>>> to indicate that there are at least three elements.
>>>>>>

From brian.goetz at oracle.com Fri Sep 9 20:06:34 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 16:06:34 -0400
Subject: What does instanceof mean (was: Primitives in instanceof and patterns)
In-Reply-To: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID:

As mentioned, it is a common mental model that "Instanceof is the subtype operator", as Remi claims here:

On 9/9/2022 11:35 AM, Remi Forax wrote:
> And i'm still worry that we are muddying the water here, instanceof is
> about instance and subtypining relationship (hence the name)

We will surely elicit some "who moved my cheese" responses when generalizing from "subtyping" to "safe casting precondition" (though the two coincide given the restrictions on instanceof today.) The question is largely a pedagogical one; how do we help people see that "subtyping operator" is merely a convenient description of what instanceof has done to date?

We made a choice to lump rather than split by having `instanceof` take a pattern on the RHS as well as a type. This choice is not without its challenges (mostly, the confusion around whether patterns match null or not), but it also illustrates that people can get over a narrow view of what `instanceof` "means", since this has generally not been a problem to date. The leap from "reference types only" to "all types" is a smaller one, though it appeals to a broader view of polymorphism.
When restricted to reference types, `instanceof` is a question about subtyping relative to inclusion polymorphism. When we bring in primitive widening/narrowing, we are appealing to coercion polymorphism too -- "can this value be coerced to this type without getting mangled." When we bring in boxing and unboxing, we appeal to another form of coercion. (Both forms are covered under existing conversion rules.) I suspect this direction is not likely to be helpful, since these terms are not particularly widely used in the Java community.

Positioning "instanceof TYPE" as the "precondition for safe casting to TYPE" seems a pretty simple leap to me, since "instanceof TYPE" is basically never seen in the wild when not immediately followed by a cast to the same type. Which makes sense; casting is risky (unless the type pairs involved are known to be related in a certain way), and instanceof is how you avoid casting surprises.

Is there a better way to explain this?

From brian.goetz at oracle.com Fri Sep 9 20:12:21 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 16:12:21 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID: <74b9d5f1-ed98-203c-ca35-a3f8ce67897b@oracle.com>

I have a question about your example. I'm not trying to be clever and play "whatabout", I'm looking for a straight answer why you think the two cases are different.

>
> If we take a simple example
>   record Point(int x, int y) { }
>   Point point = ...
>   switch(point) {
>    case Point(int i, int j) -> ...
>    ...
>   }
>
> let say know that we change Point to use longs
>   record Point(long x, long y) { }
>
> With the semantics you propose, the code still compile but the pattern
> is now transformed to a partial pattern that will not match all Points
> but only the ones with x and y in between Integer.MIN_VALUE and
> Integer.MAX_VALUE.

The same is true when I start with

    record Foo(String s) { ... }

and later change it to

    record Foo(Object s) { ... }

(both are incompatible changes, but we won't dwell on that.)

My question is: why does it not bother you that use-site patterns like `Foo(String s)` are reinterpreted as partial after the String -> Object change, but the analogous change with int -> long bothers you so much that you'd use it to argue against being able to ask whether a long is also an int?

You obviously think that these two examples are radically different. Can you explain why? Is it anything more than "that's the way it's always been"?

From john.r.rose at oracle.com Fri Sep 9 21:32:04 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 14:32:04 -0700
Subject: Primitives in instanceof and patterns
In-Reply-To:
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
Message-ID: <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>

On 9 Sep 2022, at 11:07, Brian Goetz wrote:

> ... Regardless, a better way to think about `instanceof` is that it is
> the precondition for "would a cast to this type be safe and useful."
> In the world where we restrict to reference types, the two notions
> coincide.

And, in the future world where every value (except possibly `null`) is an *instance*, the two notions will coincide again, without the restriction to reference types. We are taking reasonable incremental steps toward that world here, IMO.
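
To make the "safe-cast precondition" reading concrete for primitives, here is a minimal sketch in today's Java. The class and the helper `fitsInInt` are hypothetical names chosen for illustration (they are not from this thread or the JDK); the helper simply hand-writes the round-trip check that asking "is this long also an int" would presumably amount to.

```java
public class SafeCastSketch {
    // Hypothetical helper: true when casting v to int loses no information,
    // i.e. narrowing to int and widening back yields the original value.
    static boolean fitsInInt(long v) {
        return (long) (int) v == v;
    }

    public static void main(String[] args) {
        long small = 42L;
        long big = Integer.MAX_VALUE + 1L;   // one past the int range

        System.out.println(fitsInInt(small)); // true
        System.out.println(fitsInInt(big));   // false

        // "Ask first, then cast": the check guards the narrowing cast.
        if (fitsInInt(small)) {
            int i = (int) small;
            System.out.println(i);            // 42
        }
    }
}
```

The shape mirrors the reference-type idiom of testing with `instanceof` before casting; only the nature of the test (a range check rather than a subtype check) differs.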
> But the safe-cast-precondition is clearly more general (this is like
> the difference between defining the function 2^n on Z, vs on R or C;
> of course they have to agree at the integers, but the continuous
> exponential function is far more useful than the discrete one.)
> Moreover, the general mental model is just as simple: how do you know
> a cast is safe? Ask instanceof. What does safe mean? No error or
> material loss of precision.

And (to pile on a bit here), the casts you are speaking of here, Brian, *are the casts we have in Java*, not some idealized or restricted or cleaned up cast. So we have to deal with the oddities of primitive value conversion.

The payoff from dealing with this is that the meaning of patterns is derived systematically from the meaning of casts (and other conversions). That is hugely desirable, because it means a very complex new feature is firmly anchored to existing features. Getting this kind of thing right preserves and extends Java's role as a world-class programming language.

> A more reasonable way to state this objection would be: "most users
> believe that `instanceof` is purely about subtyping, and it will take
> some work to bring them around to a more general interpretation, how
> are we going to do that?"

This is subjective and esthetic, but I think two thoughts help here (with teaching and rationale): First, everything (except `null`) is an instance, or will eventually be. Second, subtyping in Java includes the murky rules for primitive typing.

Those specific rules more or less systematically determine how casts work. They should also systematically determine (in the same way) how patterns work. After all, casts and patterns are (and very much should be!) mirror image counterparts of each other, or dance partners holding hands.

(I visualize such things as boxes on the whiteboard with reversible arrows between them. You could say "category" if you like. Brian likes to say "dual", and I took linear algebra too, but I doubt most folks took the trouble in that class to be curious about exactly what a "dual space" really is all about.)

Rather than extending the language we wish we had, we are extending the one we *do* have, and that means aligning even the murky parts of casts with pattern behavior.

In the end, I don't think it's very murky at all in practice, except of course for the outraged theoretical purist (who lives in each of us). There is certainly *no new murk*. IMO what Brian is showing works out surprisingly well, so kudos to him for following his nose to a design with liveable details. This success also IMO demonstrates the foresight of the original authors and current maintainers of the spec, even in the "murky" parts of primitive value conversions.

-- John

From guy.steele at oracle.com Fri Sep 9 23:13:40 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 9 Sep 2022 23:13:40 +0000
Subject: Primitives in instanceof and patterns
In-Reply-To: <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
Message-ID: <85335BD6-F037-4EFB-90D1-EB8F0642295C@oracle.com>

On Sep 9, 2022, at 5:32 PM, John Rose wrote:

Well said, but I cannot resist observing:

>
> There is certainly no new murk.

To be precise, if you think there is more murk than before, take comfort that it is merely the dual of the existing murk.
:-)

From john.r.rose at oracle.com Fri Sep 9 23:25:10 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 16:25:10 -0700
Subject: Primitives in instanceof and patterns
In-Reply-To: <85335BD6-F037-4EFB-90D1-EB8F0642295C@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
 <85335BD6-F037-4EFB-90D1-EB8F0642295C@oracle.com>
Message-ID:

On 9 Sep 2022, at 16:13, Guy Steele wrote:

> On Sep 9, 2022, at 5:32 PM, John Rose wrote:
>
> Well said, but I cannot resist observing:
>>
>> There is certainly no new murk.
>
> To be precise, if you think there is more murk than before, take comfort that it is merely the dual of the existing murk. :-)

(How much is that murk in the mirror?)

From brian.goetz at oracle.com Sat Sep 10 00:16:15 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 9 Sep 2022 20:16:15 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To:
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
Message-ID: <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>

John pulled a nice Jedi-mind-trick on me, and pointed out that we actually have two creation expressions for arrays:

    new Foo[n]
    new Foo[] { a0, .., an }

and that if we are dualizing, then we should have these two patterns:

    new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
    new Foo[P]                 // matches arrays whose length matches P

but that neither

    new Foo[] { P, Q, ... }   // previous suggestion
nor
    new Foo[L] { P, Q }       // current suggestion

correspond to either of those, which suggests that we may have prematurely optimized the pattern form. The rational consequence of this observation is to do

    new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N

now (which is also the basis of varargs patterns), and once we have constant patterns (which are kind of required for the second form to be all that useful), come back for `Foo[P]`.

On 9/6/2022 5:11 PM, Brian Goetz wrote:
> We dropped this out of the record patterns JEP, but I think it is time to revisit this.
>
> The concept of array patterns was pretty straightforward; they mimic the nesting and exhaustiveness rules of record patterns, they are just a different sort of container for nested patterns. And they have an obvious duality with array creation expressions.
>
> The main open question here was how we distinguish between "match an array of length exactly N" (where there are N nested patterns) and "match an array of length at least N". We toyed with the idea of a "..." indicator to mean "more elements", but this felt a little forced and opened new questions.
>
> It later occurred to me that there is another place to nest a pattern in an array pattern -- to match (and bind) the length. In the following, assume for sake of exposition that "_" is the "any" pattern (matches everything, binds nothing) and that we have some way to denote a constant pattern, which I'll denote here with a constant literal.
>
> There is an obvious place to put this (optional) pattern: in between the brackets. So:
>
>     case String[1] { P }:
>                 ^ a constant pattern
>
> would match string arrays of length 1 whose sole element matches P. And
>
>     case String[] { P, Q }
>
> would match string arrays of length exactly 2, whose first two elements match P and Q respectively. (If the length pattern is not specified, we infer a constant pattern whose constant is equal to the length of the nested pattern list.)
>
> Matching a target to `String[L] { P0, .., Pn }` means
>
>     x instanceof String[] arr
>         && arr.length matches L
>         && arr.length >= n
>         && arr[0] matches P0
>         && arr[1] matches P1
>         ...
>         && arr[n] matches Pn
>
> More examples:
>
>     case String[int len] { P }
>
> would match string arrays of length >= 1 whose first element matches P, and further binds the array length to `len`.
>
>     case String[_] { P, Q }
>
> would match string arrays of any length whose first two elements match P and Q.
>
>     case String[3] { }
>                 ^constant pattern
>
> matches all string arrays of length 3.
>
>
> This is a more principled way to do it, because the length is a part of the array and deserves a chance to match via nested patterns, just as with the elements, and it avoid trying to give "..." a new meaning.
>
> The downside is that it might be confusing at first (though people will learn quickly enough) how to distinguish between an exact match and a prefix match.
>
>
>
>
> On 1/5/2021 1:48 PM, Brian Goetz wrote:
>> As we get into the next round of pattern matching, I'd like to opportunistically attach another sub-feature: array patterns. (This also bears on the question of "how would varargs patterns work", which I'll address below, though they might come later.)
>>
>> ## Array Patterns
>>
>> If we want to create a new array, we do so with an array construction expression:
>>
>>     new String[] { "a", "b" }
>>
>> Since each form of aggregation should have its dual in destructuring, the natural way to represent an array pattern (h/t to AlanM for suggesting this) is:
>>
>>     if (arr instanceof String[] { var a, var b }) { ... }
>>
>> Here, the applicability test is: "are you an instanceof of String[], with length = 2", and if so, we cast to String[], extract the two elements, and match them to the nested patterns `var a` and `var b`. This is the natural analogue of deconstruction patterns for arrays, complete with nesting.
>>
>> Since an array can have more elements, we likely need a way to say "length >= 2" rather than simply "length == 2". There are multiple syntactic ways to get there, for now I'm going to write
>>
>>     if (arr instanceof String[] { var a, var b, ... })
>>
>> to indicate "more". The "..." matches zero or more elements and binds nothing.
>>
>>
>> People are immediately going to ask "can I bind something to the remainder"; I think this is mostly an "attractive distraction", and would prefer to not have this dominate the discussion.
>>
>>
>> Here's an example from the JDK that could use this effectively:
>>
>> String[] limits = limitString.split(":");
>> try {
>>     switch (limits.length) {
>>         case 2: {
>>             if (!limits[1].equals("*"))
>>                 setMultilineLimit(MultilineLimit.DEPTH, Integer.parseInt(limits[1]));
>>         }
>>         case 1: {
>>             if (!limits[0].equals("*"))
>>                 setMultilineLimit(MultilineLimit.LENGTH, Integer.parseInt(limits[0]));
>>         }
>>     }
>> }
>> catch(NumberFormatException ex) {
>>     setMultilineLimit(MultilineLimit.DEPTH, -1);
>>     setMultilineLimit(MultilineLimit.LENGTH, -1);
>> }
>>
>> becomes (eventually)
>>
>> switch (limitString.split(":")) {
>>     case String[] { var _, Integer.parseInt(var i) } -> setMultilineLimit(DEPTH, i);
>>     case String[] { Integer.parseInt(var i) } -> setMultilineLimit(LENGTH, i);
>>     default -> { setMultilineLimit(DEPTH, -1); setMultilineLimit(LENGTH, -1); }
>> }
>>
>> Note how not only does this become more compact, but the unchecked "NumberFormatException" is folded into the match, rather than being a separate concern.
>>
>>
>> ## Varargs patterns
>>
>> Having array patterns offers us a natural way to interpret deconstruction patterns for varargs records. Assume we have:
>>
>>     void m(X... xs) { }
>>
>> Then a varargs invocation
>>
>>     m(a, b, c)
>>
>> is really sugar for
>>
>>     m(new X[] { a, b, c })
>>
>> So the dual of a varargs invocation, a varargs match, is really a match to an array pattern. So for a record
>>
>>     record R(X... xs) { }
>>
>> a varargs match:
>>
>>     case R(var a, var b, var c):
>>
>> is really sugar for an array match:
>>
>>     case R(X[] { var a, var b, var c }):
>>
>> And similarly, we can use our "more arity" indicator:
>>
>>     case R(var a, var b, var c, ...):
>>
>> to indicate that there are at least three elements.
>>
>>
>

From john.r.rose at oracle.com Sat Sep 10 00:34:40 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 17:34:40 -0700
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <3b76d875-f65e-cbe1-85fe-52278ff05689@oracle.com>
 <1114411168.705812.1662588064126.JavaMail.zimbra@u-pem.fr>
 <1264502963.2184553.1662739743003.JavaMail.zimbra@u-pem.fr>
Message-ID:

On 9 Sep 2022, at 9:09, forax at univ-mlv.fr wrote:

> ----- Original Message -----
> Here we want to extract the value into bindings/variables, that is not
> what the varargs does, the varargs takes a bunch of value on stack
> and put them into an array.
> Here we want the opposite operation of a varargs, the spread (or
> splat) operator that takes the argument from an array (or a collection
> ?) and put them on the stack.

You are right that Brian's proposal is not at its heart varargs, it is *array patterns*, just as *array construction* is not equivalent to varargs, just a precursor to varargs. I think we need to get array patterns right first. Then we can move to whatever a fuller conception of varargs might look like "in the dual mirror".
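
Since array patterns are not part of Java, the following is only a hand-written sketch of what the exact-length form `o instanceof String[] { var a, var b }` would presumably test, using features that exist today. The class and the methods `matchPair` and `viaVarargs` are hypothetical names for illustration; the second method leans on the fact that a varargs call is sugar for array creation.

```java
import java.util.Optional;

public class ArrayPatternSketch {
    // Hand-written stand-in for the proposed pattern
    //     o instanceof String[] { var a, var b }
    // Type test, exact-length test, then element extraction.
    static Optional<String> matchPair(Object o) {
        if (o instanceof String[] arr && arr.length == 2) {
            String a = arr[0];
            String b = arr[1];
            return Optional.of(a + "," + b);
        }
        return Optional.empty();
    }

    // A varargs invocation builds an array, so the same test applies to it.
    static Optional<String> viaVarargs(String... xs) {
        return matchPair(xs);
    }

    public static void main(String[] args) {
        System.out.println(matchPair(new String[] { "Hi", "Bob" })); // Optional[Hi,Bob]
        System.out.println(matchPair(new String[] { "Hi" }));        // Optional.empty
        System.out.println(viaVarargs("a", "b"));                    // Optional[a,b]
        System.out.println(matchPair("not an array"));               // Optional.empty
    }
}
```

The length test is `==` rather than `>=` here because the sketch follows the exact-length reading of the initializer-list form; a prefix match would need a different test.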
In Brian's architecture of patterns, every aggregator is matched as cleanly as possible with its dual pattern (which reverses data flows). There are actually two array construction expressions in Java today. (We could extend them with more varargs-flavored features to do slice/splat/spread/splice/whatever, but we don't have them today!) The older expression takes a length and produces an array filled with default values. The slightly-less-old expression takes an initializer list *and refuses to take a length* and produces an explicitly initialized array, correctly sized.

The most conservative application of Brian's design principles would create, I think, *two distinct array patterns*, one for each kind of expression. Can the two patterns be merged? Yeah, maybe, but at the cost of disturbing the correspondence with array aggregation. And it may be that some of the tricky questions about varargs go away if we restrict ourselves to just the two kinds of basic patterns that derive directly from today's array creation expressions.

Remember, patterns compose. If you have that rare need for both length and contents, use two patterns combined on the same array. There's always a way to do that. If you want *some of the content* of an array to match a pattern, use a don't care pattern. If you want length-polymorphism and element subpatterns (a match of one pattern to many lengths, with elements sprinkled around somehow) then we are beyond the bounds of today's exercise, aren't we?

>
> If we have the pattern method Arrays.of()
>
>   static pattern (T...) of(T[] array) {  // here it's a varargs
>     ...
>   }
>
> and we call it using a named pattern
>   switch(array) {
>     case Arrays.of(/* insert a syntax here */) -> ...
>
> the syntax should extract some/all values of the array into one or
> several bindings.

We'll get there. Just not quite yet. One step at a time. I think it would be really neat to be able to "slice out" multi-element chunks of an array and bind them to pattern variables.
Lisp folks have been enjoying this sort of thing basically forever. And *ignoring a range of elements* in a pattern is exactly equivalent to slicing them out and binding them to a don't-care pattern, right?

Confronted with such a feature, having thought about Brian's principles, I think I would at the same time expect that there would be a dual array *construction* expression which would do the *mirror-opposite*. That is, it would "splice in" multi-element chunks, into a newly created array. The Lisp folks sometimes use the same notation for both splicing and slicing. (I'm thinking of backquote-comma-atsign, with and without some kind of pattern-bind.)

Under Brian's design principles, which I whole-heartedly agree with, I guess a slogan for array patterns might be: No slicing without splicing!

From john.r.rose at oracle.com Sat Sep 10 00:45:16 2022
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 09 Sep 2022 17:45:16 -0700
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
Message-ID: <1B9FFA3E-4FF3-45ED-B1FB-3BD6AD60D325@oracle.com>

I was practicing that trick all morning!

I agree that `Foo[P]` can be saved for later.

In case it wasn't clear in my previous message, I also think that splicey stuff like `new Foo[]{ ...as, b, c, ...ds, e }` and the corresponding slicey patterns can *also* be saved for later.

In fact, the slice/splice stuff seems like it is best situated in a larger design exercise for "collection literals", whatever those are. Basically, that would be where Lisp's backquote-comma gets inherited by Java.

On 9 Sep 2022, at 17:16, Brian Goetz wrote:

> John pulled a nice Jedi-mind-trick on me, and pointed out that we
> actually have two creation expressions for arrays:
>
>     new Foo[n]
>     new Foo[] { a0, .., an }
>
> and that if we are dualizing, then we should have these two patterns:
>
>     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>     new Foo[P]                 // matches arrays whose length matches P
>
> but that neither
>
>     new Foo[] { P, Q, ... }   // previous suggestion
> nor
>     new Foo[L] { P, Q }       // current suggestion
>
> correspond to either of those, which suggests that we may have
> prematurely optimized the pattern form. The rational consequence of
> this observation is to do
>
>     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>
> now (which is also the basis of varargs patterns), and once we have
> constant patterns (which are kind of required for the second form to
> be all that useful), come back for `Foo[P]`.
>

From forax at univ-mlv.fr Sat Sep 10 08:57:50 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 10 Sep 2022 10:57:50 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <74b9d5f1-ed98-203c-ca35-a3f8ce67897b@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <74b9d5f1-ed98-203c-ca35-a3f8ce67897b@oracle.com>
Message-ID: <1698220860.2333623.1662800270577.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Friday, September 9, 2022 10:12:21 PM
> Subject: Re: Primitives in instanceof and patterns

> I have a question about your example. I'm not trying to be clever and play
> "whatabout", I'm looking for a straight answer why you think the two cases are
> different.

>> If we take a simple example
>> record Point(int x, int y) { }
>> Point point = ...
>> switch(point) {
>> case Point(int i, int j) -> ...
>> ...
>> } >> let say know that we change Point to use longs >> record Point(long x, long y) { } >> With the semantics you propose, the code still compile but the pattern is now >> transformed to a partial pattern that will not match all Points but only the >> ones with x and y in between Integer.MIN_VALUE and Integer.MAX_VALUE. > The same is true when I start with > record Foo(String s) { ... } > and later change it to > record Foo(Object s) { ... } > (both are incompatible changes, but we won't dwell on that.) > My question is: why does it not bother you that use-site patterns like > `Foo(String s)` are reinterpreted as partial after the String -> Object change, > but the analogous change with long -> int bothers you so much that you'd use it > to argue against being able to ask whether a long is also an int? > You obviously think that these two examples are radically different. Can you > explain why? Is it anything more than "that's the way its always been"? I think i've been a little over my head with this example. I've forgotten that in both cases, the patterns move from being a total pattern to be a partial pattern so the enclosing switch will not be exhaustive anymore, thus not compile. R?mi -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From forax at univ-mlv.fr Sat Sep 10 08:58:01 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Sat, 10 Sep 2022 10:58:01 +0200 (CEST) Subject: Primitives in instanceof and patterns In-Reply-To: References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr> Message-ID: <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr> > From: "Brian Goetz" > To: "Remi Forax" > Cc: "amber-spec-experts" > Sent: Friday, September 9, 2022 8:07:41 PM > Subject: Re: Primitives in instanceof and patterns >> The semantics you propose is not to emit a compile error but at runtime to check >> if the value "i" is beetween Short.MIN_VALUE and Short.MAX_VALUE. >> So there is perhaps a syntactic duality but clearly there is no semantics >> duality. > Of course there is a semantic duality here. Specifically, `int` and `short` are > related by an _embedding-projection pair_. Briefly: given two sets A and B > (think "B" for "bigger"), an approximation metric on B (a complete partial > ordering), and a pair of functions `e : A -> B` and `p : B -> A`, they form an > e-p pair if (a) p . e is the identity function (dot is compose), and e . p > produces an approximation of the input (according to the metric.) > The details are not critical here (though this algebraic structure shows up > everywhere in our work if you look closely), but the point remains: there is an > algebraic duality here. Yes, when going in one direction, no runtime tests are > needed; when going in the other direction, because it may be lossy in one > direction, a runtime test is needed in that direction. Just like with > `instanceof String` / `case String s` today. > Anyway, I don't think you're saying what you really mean. Let's not get caught > up in silly arguments about what "dual" means; that won't be helpful. I do not disagree with the fact that a dual exist, i disagree that the semantics you propose is a dual (or a good dual if you prefer). 
A cast on a primitive type applies the same operation to all possible values; the semantics you propose for checking whether an integer is a short does not apply the same operation to all values. The Java 19 semantics of the type pattern with a primitive type is a better dual in my opinion. The idea that the semantics of a primitive type pattern has to be "useful" is a trap.

[...]

>> I believe this is exactly what Stephen Colebourne was complaining about when we discussed the previous iteration of this spec: the semantics of the primitive pattern changes depending on the definition of the data.

> I think what Stephen didn't like is that there is no syntactic difference between a total and partial pattern at the use site. And I get why that made him uncomfortable; it's a valid concern, and one could imagine designing the language so that total and partial patterns look different. This is one of the tradeoffs we have made; I do still think we picked a good one.

>> The remark of Tagir about array patterns also works here: having a named pattern like Short.asShort() makes the semantics far clearer because it disambiguates between a pattern that requests a conversion and a pattern that does a conversion because the data definition has changed.

> If the language didn't support primitive widening in assignment / method invocation context (as in Golang), and instead said "use Integer::toLong (or Long::fromInteger) to convert int -> long", then yes, the natural duality would be to also represent these as named patterns; then conversions in both directions are mediated by API points, total in one direction, partial in the other. But that's not the language we have! The language we have allows us to provide an int where a long is needed, and the language does the needful.

> Pattern matching allows us to recover whether a value came from a certain type, even after we've lost the static type information.
> Just as we can recover the String-ness here:
>     Object o = "Foo";
>     if (o instanceof String s) { ... }
> because reference type patterns are willing to conditionally reverse reference widening, all the same arguments apply to
>     long n = 3;
>     if (n instanceof int i) { ... }

> And not allowing this makes the language *more* complicated, because now some conversions are reversible and some are not, for ad-hoc reasons that no one will be able to understand. Can you offer any compelling reason why we should be able to recover the String-ness of `o` after a widening, but not the int-ness of `n` after a widening?

In the case of instanceof, the type is not lost, because every instance keeps a reference to its class at runtime; a long does not keep a secret class saying it's really an integer in disguise.

>> And I'm still worried that we are muddying the water here: instanceof is about instances and subtyping relationships (hence the name); extending it to cover non-instance / primitive values is very confusing.

> Sorry, this is a cheap rhetorical trick; declaring words to mean what you want them to mean, and then pointing to that meaning as a way to close the argument.

> Yes, saying "instanceof T is about subtyping" is a useful mental model *when the only types you can apply it to are those related by inclusion polymorphism*. But the restriction of instanceof to reference types is arbitrary (and we've already decided to allow patterns in instanceof, which are surely not mere subtyping.)

> Regardless, a better way to think about `instanceof` is that it is the precondition for "would a cast to this type be safe and useful." In the world where we restrict to reference types, the two notions coincide.
> But the safe-cast-precondition is clearly more general (this is like the difference between defining the function 2^n on Z, vs on R or C; of course they have to agree at the integers, but the continuous exponential function is far more useful than the discrete one.) Moreover, the general mental model is just as simple: how do you know a cast is safe? Ask instanceof. What does safe mean? No error or material loss of precision.

[...]

"would a cast to this type be safe and useful."

I think you are overstating how useful a pattern that does a range check is. There is no method in the JDK that takes an int and converts it to a short if it is in the right range, or throws an exception otherwise. It seems a better fit for a named pattern than for the "default behavior" of the type pattern. I believe that defining a range check as a primitive type pattern is too clever an idea.

[...]

> (I strongly encourage everyone to re-read JLS Ch5, and to meditate on *why* we have the particular conversions in the contexts we have. They're complex, but not arbitrary; if you listen closely to the specification, it sometimes whispers to you.)

I don't disagree that users may want what you call the dual of a cast to primitive; I disagree that it has to come as a type pattern and not as a named pattern.

Rémi

From forax at univ-mlv.fr  Sat Sep 10 08:58:15 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sat, 10 Sep 2022 10:58:15 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <7EED07F6-64EF-45AF-B9BE-BC31802A0289@oracle.com>
Message-ID: <909015874.2333799.1662800295791.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Brian Goetz"
> Cc: "Remi Forax", "amber-spec-experts"
> Sent: Friday, September 9, 2022 11:32:04 PM
> Subject: Re: Primitives in instanceof and patterns

> On 9 Sep 2022, at 11:07, Brian Goetz wrote:

>> ... Regardless, a better way to think about `instanceof` is that it is the precondition for "would a cast to this type be safe and useful." In the world where we restrict to reference types, the two notions coincide.

> And, in the future world where every value (except possibly null) is an instance, the two notions will coincide again, without the restriction to reference types. We are taking reasonable incremental steps toward that world here, IMO.

>> But the safe-cast-precondition is clearly more general (this is like the difference between defining the function 2^n on Z, vs on R or C; of course they have to agree at the integers, but the continuous exponential function is far more useful than the discrete one.) Moreover, the general mental model is just as simple: how do you know a cast is safe? Ask instanceof. What does safe mean? No error or material loss of precision.

> And (to pile on a bit here), the casts you are speaking of here, Brian, are the casts we have in Java, not some idealized or restricted or cleaned up cast. So we have to deal with the oddities of primitive value conversion.

> The payoff from dealing with this is that the meaning of patterns is derived systematically from the meaning of casts (and other conversions). That is hugely desirable, because it means a very complex new feature is firmly anchored to existing features. Getting this kind of thing right preserves and extends Java's role as a world-class programming language.

>> A more reasonable way to state this objection would be: "most users believe that `instanceof` is purely about subtyping, and it will take some work to bring them around to a more general interpretation, how are we going to do that?"

> This is subjective and esthetic, but I think two thoughts help here (with teaching and rationale): First, everything (except null) is an instance, or will eventually be. Second, subtyping in Java includes the murky rules for primitive typing.

> Those specific rules more or less systematically determine how casts work. They should also systematically determine (in the same way) how patterns work. After all, casts and patterns are (and very much should be!) mirror image counterparts of each other, or dance partners holding hands.

> (I visualize such things as boxes on the whiteboard with reversible arrows between them. You could say "category" if you like. Brian likes to say "dual", and I took linear algebra too, but I doubt most folks took the trouble in that class to be curious about exactly what a "dual space" really is all about.)

> Rather than extending the language we wish we had, we are extending the one we do have, and that means aligning even the murky parts of casts with pattern behavior.

> In the end, I don't think it's very murky at all in practice, except of course for the outraged theoretical purist (who lives in each of us). There is certainly no new murk. IMO what Brian is showing works out surprisingly well, so kudos to him for following his nose to a design with liveable details.
> This success also IMO demonstrates the foresight of the original authors and current maintainers of the spec, even in the "murky" parts of primitive value conversions.

> -- John

At some point in the future, we may want to change what an instanceof means; I think we can all agree with that. I would prefer to be on the safe side when we ask ourselves how exactly to retrofit primitive types to value classes.

I'm not against changing what a type pattern is, but it should be done in concert with changing the other rules (overriding rules especially) and the retrofitting of primitive types to value classes.

Rémi

From forax at univ-mlv.fr  Sat Sep 10 09:48:20 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Sep 2022 11:48:20 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
Message-ID: <2111267205.2346604.1662803300417.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Saturday, September 10, 2022 2:16:15 AM
> Subject: Re: Array patterns (and varargs patterns)

> John pulled a nice Jedi-mind-trick on me, and pointed out that we actually have two creation expressions for arrays:
>     new Foo[n]
>     new Foo[] { a0, .., an }
> and that if we are dualizing, then we should have these two patterns:
>     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
>     new Foo[P]                 // matches arrays whose length matches P
> but that neither
>     new Foo[] { P, Q, ... }    // previous suggestion
> nor
>     new Foo[L] { P, Q }        // current suggestion
> corresponds to either of those, which suggests that we may have prematurely optimized the pattern form.
> The rational consequence of this observation is to do
>     new Foo[] { P0, ..., Pn }  // matches arrays of exactly length N
> now (which is also the basis of varargs patterns), and once we have constant patterns (which are kind of required for the second form to be all that useful), come back for `Foo[P]`.

I like this proposal; it offers a clean separation between the array pattern and a future spread pattern (or whatever we end up calling it).

Rémi

From forax at univ-mlv.fr  Sat Sep 10 09:50:00 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 10 Sep 2022 11:50:00 +0200 (CEST)
Subject: Array patterns (and varargs patterns)
In-Reply-To: <1B9FFA3E-4FF3-45ED-B1FB-3BD6AD60D325@oracle.com>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
 <1B9FFA3E-4FF3-45ED-B1FB-3BD6AD60D325@oracle.com>
Message-ID: <1484184530.2347295.1662803400260.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Saturday, September 10, 2022 2:45:16 AM
> Subject: Re: Array patterns (and varargs patterns)

> I was practicing that trick all morning!

> I agree that Foo[P] can be saved for later.

> In case it wasn't clear in my previous message, I also think that splicey stuff like new Foo[]{ ...as, b, c, ...ds, e } and the corresponding slicey patterns can also be saved for later.

> In fact, the slice/splice stuff seems like it is best situated in a larger design exercise for "collection literals", whatever those are. Basically, that would be where Lisp's backquote-comma gets inherited by Java.
yes,

Rémi

From brian.goetz at oracle.com  Sat Sep 10 14:00:38 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 10 Sep 2022 10:00:38 -0400
Subject: Array patterns (and varargs patterns)
In-Reply-To: <2111267205.2346604.1662803300417.JavaMail.zimbra@u-pem.fr>
References: <5f55e727-8a29-3bdf-57ec-6ec6245ef5c5@oracle.com>
 <1ffc5f55-2764-7250-489e-d60d26f3cf00@oracle.com>
 <2111267205.2346604.1662803300417.JavaMail.zimbra@u-pem.fr>
Message-ID: <242ed758-cbcd-1f06-eaf1-23c2402b42d7@oracle.com>

Obvious correction: the `new` in the pattern examples was a cut and paste error; patterns don't say `new`.
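As a rough illustration of the exact-length form agreed on above, here is what matching `String[] { var a, var b }` would amount to in Java as it exists today. The proposed pattern syntax is not valid Java, so the sketch hand-writes the three steps (type test, length test, element binding); the names `ArrayPatternSketch` and `matchPair` are made up for this example and come from nowhere in the thread.

```java
import java.util.Optional;

final class ArrayPatternSketch {
    // Hypothetical helper: hand-written equivalent of the proposed pattern
    //     target instanceof String[] { var a, var b }
    // Returns the two bound elements joined by a comma, or empty when there is no match.
    static Optional<String> matchPair(Object target) {
        // 1. Type test (the `String[]` part of the pattern); null never matches.
        if (!(target instanceof String[] arr)) {
            return Optional.empty();
        }
        // 2. Exact length test: two nested patterns means length == 2.
        if (arr.length != 2) {
            return Optional.empty();
        }
        // 3. Bind the elements to the nested patterns `var a` and `var b`.
        String a = arr[0];
        String b = arr[1];
        return Optional.of(a + "," + b);
    }

    public static void main(String[] args) {
        System.out.println(matchPair(new String[] { "x", "y" }));      // matches
        System.out.println(matchPair(new String[] { "x", "y", "z" })); // no match: length is 3
        System.out.println(matchPair("not an array"));                 // no match: wrong type
    }
}
```

The "length at least N" variants discussed earlier in the thread would only change step 2, which is part of why they can be deferred without affecting this basic form.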

From brian.goetz at oracle.com  Sat Sep 10 14:01:23 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 10 Sep 2022 10:01:23 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
Message-ID:

> I think you are overstating how useful a pattern that do a range check is.

I think you're falling into the trap of examining each conversion and asking "would I want a pattern to do this." That's a recipe for more complexity because we'll end up with another ad-hoc, not-like-anything-else construct (which is what the Java 19 primitive type pattern semantics is.) It's not about "is range check useful" (though, it is), it's about "is casting to/from primitives safely" useful.

> I'm not against changing what a type pattern is but it should be done in concert with changing the other rules (overriding rules especially) and the retrofitting of primitive types to value classes.

It's not about "changing other rules", it's about aligning to them. We're aligning to cast conversion here. When we have named patterns, we will have to define overload selection for patterns; again, this should just be the existing overload selection with "arrows reversed", which means we want boxing for patterns to also be "boxing with arrows reversed" (otherwise it doesn't compose.) The language we have now is telling us how patterns should work; we should listen.
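To make "is casting safe" concrete for the integral narrowing cases, the check can be sketched as a round trip: narrow, widen back, and compare. This is one reading of the intent described in this message, not text from any spec draft; `ExactCastSketch`, `fitsInt`, and `fitsShort` are hypothetical names. `Math.toIntExact` is shown only as the existing JDK method that throws instead of answering the question.

```java
final class ExactCastSketch {
    // Hypothetical helper: would `n instanceof int i` match under the "safe cast" reading?
    // Assumption: "safe" means that narrowing and widening back reproduces the value.
    static boolean fitsInt(long n) {
        return (long) (int) n == n;
    }

    // The same idea for int -> short.
    static boolean fitsShort(int i) {
        return (int) (short) i == i;
    }

    public static void main(String[] args) {
        long n = 3;
        if (fitsInt(n)) {
            int i = (int) n; // the cast is known to be lossless here
            System.out.println("recovered int " + i);
        }

        long big = 1L << 40;
        System.out.println(fitsInt(big));      // the value needs more than 32 bits
        System.out.println(fitsShort(40_000)); // above Short.MAX_VALUE

        try {
            Math.toIntExact(big); // existing JDK method: throws on overflow
        } catch (ArithmeticException e) {
            System.out.println("toIntExact rejected " + big);
        }
    }
}
```

Floating-point conversions need more care than this round trip (NaN and negative zero in particular), so they are deliberately left out of the sketch.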
From brian.goetz at oracle.com Sat Sep 10 20:04:50 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 10 Sep 2022 16:04:50 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To:
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr> <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
Message-ID: <26d98589-409f-ce80-911d-91c472cade29@oracle.com>

>> I'm not against changing what a type pattern is but it should be done
>> in concert with changing the other rules (overriding rules
>> especially) and the retrofitting of primitive types to value classes.
>
> It's not about "changing other rules"; it's about aligning to them.
> We're aligning to cast conversion here. When we have named patterns,
> we will have to define overload selection for patterns; again, this
> should just be the existing overload selection with "arrows reversed",
> which means we want boxing for patterns to also be "boxing with arrows
> reversed" (otherwise it doesn't compose). The language we have now is
> telling us how patterns should work; we should listen.
>

Just in case it's not clear:

 - instanceof T means "is it safe to cast to T"
 - non-unconditional type patterns (those that do not resolve to any patterns) mean `instanceof T`

And this is true for primitive and reference types, primitive and reference targets. These aren't special new rules for primitive type patterns; we are extending instanceof to mean "is it safe to cast" and then defining type patterns purely in terms of instanceof.

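(For readers following along: the reading of `instanceof T` as "is it safe to cast to T" can be emulated in today's Java with a round-trip check. The following is only a sketch under that assumption; `ExactCastSketch` and its method names are hypothetical helpers, not a JDK API and not the specified semantics.)

```java
// Sketch only: emulates "x instanceof T" for primitives as
// "casting x to T succeeds and loses no information".
// All names here are hypothetical and exist only for illustration.
final class ExactCastSketch {

    // long -> int is exact when narrowing and widening back gives the same value.
    static boolean isExactInt(long value) {
        return (long) (int) value == value;
    }

    // int -> byte is exact when the value is representable as a byte.
    static boolean isExactByte(int value) {
        return (int) (byte) value == value;
    }

    // Unboxing Integer -> int is exact for every non-null reference.
    static boolean isExactUnboxing(Integer boxed) {
        return boxed != null;
    }

    public static void main(String[] args) {
        long n = 0;
        // Emulates: if (n instanceof int i) { ... }
        if (isExactInt(n)) {
            int i = (int) n;
            System.out.println("exact int: " + i);
        }
        System.out.println(isExactByte(500));     // false: high-order bits would be discarded
        System.out.println(isExactInt(1L << 40)); // false: not representable as an int
    }
}
```

The test and the cast travel together here exactly as `instanceof` and a reference cast do today, which is the alignment the message above is arguing for.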
Other useful things follow too:

 - for types S and T, if all values of S are instanceof T, then `T t` is unconditional on S (no distinction between primitive and reference types)
 - with the same condition, `T t` dominates `S s`
 - if S is cast-convertible (modulo unchecked conversions) to T, then `T t` is applicable to S (no distinction between primitive and reference types)

If you read the new spec (Aggelos will post a draft soon), you'll see that we hardly even mention primitive types in the section on type patterns. We just define type patterns in terms of exact casts, and the same for instanceof. Pretty much all the new text is "what is an exact cast".

From forax at univ-mlv.fr Sun Sep 11 10:19:13 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Sun, 11 Sep 2022 12:19:13 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
Message-ID: <583895871.2568221.1662891553338.JavaMail.zimbra@u-pem.fr>

I found a way to explain clearly why a reference type pattern and a primitive type pattern are different.

Let's suppose that the code compiles (to avoid the issues of separate compilation). Unlike a reference type pattern, the code executed for a primitive type pattern is a function of *both* the declared type and the pattern type.

For example, if I have code like this, I have no idea what code is executed for `case Foo(int i)` without going to the declaration of Foo, which is usually not collocated with the switch itself.

  Foo foo = ...
  switch (foo) {
    case Foo(int i) -> {}
    case Foo(double d) -> {}
  }

Here, the semantics is different depending on whether Foo is declared as `record Foo(long l) { }` or as `record Foo(double d) { }`. There is no such problem with a reference type: if the first pattern of the switch is `case Foo(String s) -> {}`, we know that it is always a subtyping check.

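(To make the two readings concrete, here is a rough emulation in current Java of what `case Foo(int i)` would have to test under each declaration. This is a sketch only: `ComponentTypeSketch`, `FooLong`, `FooDouble` and `matchesInt` are hypothetical names, and the round-trip check glosses over floating-point edge cases such as negative zero, which a real specification would have to address.)

```java
// Sketch only: the test behind a hypothetical `case Foo(int i)` depends on
// how the record component is declared. Names are illustrative assumptions.
final class ComponentTypeSketch {

    record FooLong(long l) { }      // stands in for: record Foo(long l) { }
    record FooDouble(double d) { }  // stands in for: record Foo(double d) { }

    // Component is a long: the match amounts to an int range check.
    static boolean matchesInt(FooLong foo) {
        long l = foo.l();
        return (long) (int) l == l;
    }

    // Component is a double: the value must also be integral.
    // NaN and infinities fail the round trip; negative zero is not treated specially here.
    static boolean matchesInt(FooDouble foo) {
        double d = foo.d();
        return (double) (int) d == d;
    }

    public static void main(String[] args) {
        System.out.println(matchesInt(new FooLong(7L)));       // true
        System.out.println(matchesInt(new FooLong(1L << 40))); // false
        System.out.println(matchesInt(new FooDouble(7.0)));    // true
        System.out.println(matchesInt(new FooDouble(7.5)));    // false
    }
}
```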
This is different from a cast because the cast is collocated with the expression it applies to, so the semantics is kind of obvious.

  long l = ...
  int i = (int) l;

Rémi

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Thursday, September 8, 2022 6:53:21 PM
> Subject: Primitives in instanceof and patterns

> Earlier in the year we talked about primitive type patterns. Let me summarize
> the past discussion, what I think the right direction is, and why this is (yet
> another) "finishing up the job" task for basic patterns that, if left undone,
> will be a sharp edge.

> Prior to record patterns, we didn't support primitive type patterns at all. With
> records, we now support primitive type patterns as nested patterns, but they are
> very limited; they are only applicable to exactly their own type.

> The motivation for "finishing" primitive type patterns is the same as discussed
> earlier this week with array patterns -- if pattern matching is the dual of
> aggregation, we want to avoid gratuitous asymmetries that let you put things
> together but not take them apart.

> Currently, we can assign a `String` to an `Object`, and recover the `String`
> with a pattern match:

> Object o = "Bob";
> if (o instanceof String s) { println("Hi Bob"); }

> Analogously, we can assign an `int` to a `long`:

> long n = 0;

> but we cannot yet recover the int with a pattern match:

> if (n instanceof int i) { ... } // error, pattern `int i` not applicable to `long`

> To fill out some more of the asymmetries around records if we don't finish the
> job: given

> record R(int i) { }

> we can construct it with

> new R(anInt) // no adaptation
> new R(aShort) // widening
> new R(anInteger) // unboxing

> but yet cannot deconstruct it the same way:

> case R(int i) // OK
> case R(short s) // nope
> case R(Integer i) // nope

> It would be a gratuitous asymmetry that we can use pattern matching to recover
> from reference widening, but not from primitive widening.
While many of the > arguments against doing primitive type patterns now were of the form "let's keep > things simple", I believe that the simpler solution is actually to _finish the > job_, because this minimizes asymmetries and potholes that users would otherwise > have to maintain a mental catalog of. > Our earlier explorations started (incorrectly, as it turned out), with > assignment context. This direction gave us a good push in the right direction, > but turned out to not be the right answer. A more careful reading of JLS Ch5 > convinced me that the answer lies not in assignment conversion, but _cast > conversion_. > #### Stepping back: instanceof > The right place to start is actually not patterns, but `instanceof`. If we > start here, and listen carefully to the specification, it leads us to the > correct answer. > Today, `instanceof` works only for reference types. Accordingly, most people > view `instanceof` as "the subtyping operator" -- because that's the only > question we can currently ask it. We almost never see `instanceof` on its own; > it is nearly always followed by a cast to the same type. Similarly, we rarely > see a cast on its own; it is nearly always preceded by an `instanceof` for the > same type. > There's a reason these two operations travel together: casting is, in general, > unsafe; we can try to cast an `Object` reference to a `String`, but if the > reference refers to another type, the cast will fail. So to make casting safe, > we precede it with an `instanceof` test. The semantics of `instanceof` and > casting align such that `instanceof` is the precondition test for safe casting. > > instanceof is the precondition for safe casting > Asking `instanceof T` means "if I cast this to T, would I like the answer." > Obviously CCE is an unlikable answer; `instanceof` further adopts the opinion > that casting `null` would also be an unlikable answer, because while the cast > would succeed, you can't do anything useful with the result. 
> Currently, `instanceof` is only defined on reference types, and on this domain > coincides with subtyping. On the other hand, casting is defined between > primitive types (widening, narrowing), and between primitive and reference types > (boxing, unboxing). Some casts involving primitives yield "better" results than > others; casting `0` to `byte` results in no loss of information, since `0` is > representable as a byte, but casting `500` to `byte` succeeds but loses > information because the higher order bits are discarded. > If we characterize some casts as "lossy" and others as "exact" -- where lossy > means discarding useful information -- we can extend the "safe casting > precondition" meaning of `instanceof` to primitive operands and types in the > obvious way -- "would casting this expression to this type succeed without error > and without information loss." If the type of the expression is not castable to > the type we are asking about, we know the cast cannot succeed and reject the > `instanceof` test at compile time. > Defining which casts are lossy and which are exact is fairly straightforward; we > can appeal to the concept already in the JLS of "representable in the range of a > type." For some pairs of types, casting is always exact (e.g., casting `int` to > `long` is always exact); we call these "unconditionally exact". For other pairs > of types, some values can be cast exactly and others cannot. > Defining which casts are exact gives us a simple and precise semantics for `x > instanceof T`: whether `x` can be cast exactly to `T`. Similarly, if the static > type of `x` is not castable to `T`, then the corresponding `instanceof` question > is rejected statically. 
The answers are not surprising:

> - Boxing is always exact;
> - Unboxing is exact for all non-null values;
> - Reference widening is always exact;
> - Reference narrowing is exact if the type of the target expression is a
> subtype of the target type;
> - Primitive widening and narrowing are exact if the target expression can be
> represented in the range of the target type.

> #### Primitive type patterns

> It is a short hop from `instanceof` to patterns (including primitive type
> patterns, and reference type patterns applied to primitive types), which can be
> defined entirely in terms of cast conversion and exactness:

> - A type pattern `T t` is applicable to a target of type `S` if `S` is
> cast-convertible to `T`;
> - A type pattern `T t` matches a target `x` if `x` can be cast exactly to `T`;
> - A type pattern `T t` is unconditional at type `S` if casting from `S` to `T`
> is unconditionally exact;
> - A type pattern `T t` dominates a type pattern `S s` (or a record pattern
> `S(...)`) if `T t` would be unconditional on `S`.

> While the rules for casting are complex, primitive patterns add no new
> complexity; there are no new conversions or conversion contexts. If we see:

> switch (a) {
> case T t: ...
> }

> we know the case matches if `a` can be cast exactly to `T`, and the pattern is
> unconditional if _all_ values of `a`'s type can be cast exactly to `T`. Note
> that none of this is specific to primitives; we derive the semantics of _all_
> type patterns from the enhanced definition of casting.

> Now, our record deconstruction examples work symmetrically to construction:

> case R(int i) // OK
> case R(short s) // test if `i` is in the range of `short`
> case R(Integer i) // box `i` to `Integer`

From forax at univ-mlv.fr Sun Sep 11 10:19:16 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Sun, 11 Sep 2022 12:19:16 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To:
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr> <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
Message-ID: <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Saturday, September 10, 2022 4:01:23 PM
> Subject: Re: Primitives in instanceof and patterns

>> I think you are overstating how useful a pattern that does a range check is.
>
> I think you're falling into the trap of examining each conversion and
> asking "would I want a pattern to do this." That's a recipe for more
> complexity because we'll end up with another ad-hoc,
> not-like-anything-else construct (which is what the Java 19 primitive
> type pattern semantics is.) It's not about "is range check useful"
> (though, it is), it's about "is casting to/from primitives safely" useful.

Given that only primitive widening casts are safe, allowing only primitive widening is another way to answer the question of what a primitive type pattern is.
You are proposing a semantics using range checks; that's the problem.

>
>> I'm not against changing what a type pattern is but it should be done
>> in concert with changing the other rules (overriding rules especially)
>> and the retrofitting of primitive types to value classes.
>
> It's not about "changing other rules", it's about aligning to them. We're
> aligning to cast conversion here.
When we have named patterns, we will
> have to define overload selection for patterns; again, this should just
> be the existing overload selection with "arrows reversed", which means
> we want boxing for patterns to also be "boxing with arrows reversed"
> (otherwise it doesn't compose.) The language we have now is telling us
> how patterns should work; we should listen.

As an example, instanceof rules and the rules about overriding methods are intimately linked: asking if a method overrides another is equivalent to asking if their function types are subtypes.
If `int instanceof double` is allowed, then B::m should override A::m:

  class A {
    int m() { ... }
  }
  class B extends A {
    @Override
    double m() { ... }
  }

This is what I meant by changing other rules.

Rémi

From brian.goetz at oracle.com Sun Sep 11 14:48:04 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 11 Sep 2022 10:48:04 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr> <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr> <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
Message-ID:

>>
>> I think you're falling into the trap of examining each conversion and
>> asking "would I want a pattern to do this."
> Given that only primitive widening casts are safe, allowing only primitive widening is another way to answer to the question what a primitive type pattern is.
> You are proposing a semantics using range checks, that's the problem.

So, substitute "reference" for "primitive" in this argument, and you will see how silly it is: "since only reference widening is safe, allowing only reference widening would be 'another answer to what a reference type pattern is.'" But that would also be a useless semantic. You're caught up on "range checks", but that's not the important thing here.
Casting is the important thing.

> As an example, instanceof rules and the rules about overriding methods
> are intimately linked, asking if a method override another is
> equivalent to asking if their function types are a subtypes.
> if int instanceof double is allowed, then B::m should override A::m
> class A {
> int m() { ... }
> }
> class B extends A {
> @Override
> double m() { ... }
> }
>
> This is what i meant by changing other rules.

Another cute argument, but no. Covariant overriding is linked to *subtyping*. Instanceof *happens to coincide* with subtyping right now (given its ad-hoc restrictions), but the causality goes the other way. (Casting also appeals to subtyping, through reference widening conversions.) But this argument is like starting with "all men are mortal" and "Socrates is a man" and concluding "All men are Socrates."

We can talk about whether it would be wise to align the definition of covariant overrides with conversions other than reference widening (and this will likely come up again in Valhalla anyway), but this is by no means a forced move, and not tied to generalizing the semantics of instanceof.

> I found a way to explain clearly why a reference type pattern and a
> primitive type pattern are different.
>
> Let suppose that the code compiles (to avoid the issues of the
> separate compilation),
> unlike a reference type pattern, the code executed for a primitive
> type pattern is a function of *both* the declared type and the pattern
> type.

So (a) untrue -- what code we execute for a reference type pattern does depend on the static types -- we may or may not generate an `instanceof` instruction, depending on whether the pattern is unconditional. (The same is true for a cast; some casts are no-ops and generate no code.) And (b), so what? We're asking "would it be safe to cast x to T".
Depending on the types X and T, we will have different code for the casting, so why is it unreasonable to have different code for asking whether it is castable?

>
> By example, if i have a code like this, i've no idea what code is
> executed for case Foo(int i) without having to go to the declaration
> of Foo which is usually not collocated with the switch itself.
>
>   Foo foo = ...
>   switch (foo) {
>     case Foo(int i) -> {}
>     case Foo(double d) -> {}
>   }

Sigh, this argument again? We've been through this extensively the first time around, with reference types, where you "had no idea what this code means" without looking at the declaration of the pattern. (Then, it was partiality and totality.) I get that you didn't like that total and partial patterns don't look syntactically different, and that ship has sailed. But this is the same argument warmed over.

"What code will be executed" is irrelevant; what is relevant is the semantics. Assuming a single deconstruction pattern for Foo, the first case asks "can the Foo's component be cast safely to int, and if so, please cast it for me". It doesn't matter what code we use to answer that question or do the cast -- could be a narrowing, could be an unboxing, whatever.

You see the same thing today without patterns:

    var x = foo.getFoo();
    int i = (int) x;

x could be a long, an int, an Integer, etc., but you don't know unless you look at the definition of getFoo(). And you have "no idea what code will be executed." Sure, but so what? You asked for a cast to int. The language validated that x is castable to int, and does what needs to be done, which might be nothing, or a widening, or a narrowing with truncation, or an unboxing, or some combination.

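(A small illustration of the point above, in current Java: the same `(int) x` expression already compiles to different conversions depending on the static type of `x`. `CastToIntSketch` and its method names are hypothetical and exist only for this example.)

```java
// Sketch only: one cast expression, several different conversions,
// selected by the static type of the operand. Names are illustrative.
final class CastToIntSketch {

    static int fromInt(int x)         { return (int) x; } // identity: nothing to do
    static int fromShort(short x)     { return (int) x; } // primitive widening
    static int fromLong(long x)       { return (int) x; } // primitive narrowing, may truncate
    static int fromInteger(Integer x) { return (int) x; } // unboxing (NullPointerException on null)
    static int fromObject(Object x)   { return (int) x; } // checked cast to Integer, then unboxing

    public static void main(String[] args) {
        System.out.println(fromShort((short) -3));              // -3
        System.out.println(fromLong((1L << 32) + 5));           // 5: high-order bits discarded
        System.out.println(fromInteger(Integer.valueOf(9)));    // 9
        System.out.println(fromObject(Integer.valueOf(11)));    // 11
    }
}
```

A reader of any one of these method bodies sees only `(int) x`; which conversion runs is decided by the declaration of `x`, just as it would be for a nested primitive type pattern.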
(When we get to overloading deconstruction patterns, we'll have all the same issues as we have with overloading methods today -- it is not obvious, looking only at the call site, which overload is called, and therefore which conversions are applied to arguments or returns.)

As a reminder, here's what a nested pattern means:

    x matches P(Q) === x matches P(var q) && q matches Q

Understanding what is going to happen involves understanding the type of `q`. I get that you didn't like that choice, and that's your right, but it's not OK to keep bringing it up as if it's a new thing.

I think I actually understand your concern here, which has nothing to do with the dozen or so bogus examples and explanations you've tossed out so far. It is that cast conversion is complicated, and you would like pattern matching to be "simple", and so pulling in the muck of cast conversion into pattern matching feels to you like an unforced error. Right? (And if so, perhaps you could have just said that, instead of throwing random arguments at the wall?)

I also would like to hear from more people in this discussion, and I don't think the style of discourse we've fallen into (again) is conducive to that.

From brian.goetz at oracle.com Mon Sep 12 19:36:09 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Sep 2022 15:36:09 -0400
Subject: Knocking off two more vestiges of legacy switch
Message-ID: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>

The work on primitive patterns continues to yield fruit; it points us to a principled way of dealing with _constant patterns_, both as nested patterns, and to redefining constant case labels as simple patterns. It also points us to a way to bring the missing three types into the realm of switch (since now switch is usable at every type _but_ these): float, double, and boolean.
While I'm not in a hurry to prioritize this immediately, I wanted to connect the dots to how primitive type patterns lay the foundation for addressing these two vestiges of legacy switch. (The remaining vestige, not yet dealt with, is that legacy statement switches are not exhaustive. We'd like a path back to uniformity there as well, but this is likely a longer road.)

**Constant patterns.** In early explorations (e.g., "Pattern Matching Semantics"), we struggled with the meaning of constant patterns, specifically with conversions in the absence of a sharp type for the match target. The exploration of that document treated boxing conversions but not other conversions, which would have created a gratuitously new conversion context. This was one of several reasons we deferred constant patterns.

The current status is that constant case labels (e.g., `case 3`) (a) are permitted only in the presence of a compatible operand type and (b) are not patterns. This has led to some accidental complexity in specifying switch, since we can have a mix of pattern and non-pattern labels, and it means we can't use constants as nested patterns. (We've also not yet integrated enum cases into the exhaustiveness analysis in the presence of a sealed type that permits an enum type.) Ret-conning all case labels as patterns seems attractive if we can make the semantics clear, as not only does it bring more uniformity, but it means we can use them as nested patterns, not just at the top level of the switch. More composition.

The recent work on `instanceof` involving primitives offers a clear and principled meaning to `0` as a pattern; given a constant `c` of type `C`, treat

    x matches c

as meaning

    x matches C alpha && alpha eq c

where `eq` is a suitable comparison predicate for the type C (== for integral types and enums, .equals() for String, and something irritating for floating point).
This gives us a solid basis for interpreting something like `case 3L`; we match if the target would match `long alpha` and `alpha == 3L`. No new rules; all conversions are handled through the type pattern for the static type of the constant in question. Not coincidentally, the rules for primitive type patterns support the implicit conversions allowed in today's switches on `short`, `byte`, and `char`, which are allowed to use `int` labels, preserving the meaning of existing code while we generalize what switch means.

The other attributes of patterns -- applicability, exhaustiveness, and dominance -- are also easy:

 - a constant pattern for `c : C` is applicable to S if a type pattern for `C` is applicable to S.
 - a type pattern for T dominates a constant pattern for `c : C` if the type pattern for T dominates a type pattern for C.
 - constant patterns are never exhaustive.

No new rules; just appeal to type patterns.

**Switch on float, double, and boolean.** Switches on floating point were left out for the obvious reason -- it just isn't that useful, and it would have introduced new complexity into the specification of switch. Similarly, boolean was left out because we have "if" statements. In the original world, where you could switch on only five types, this was a sensible compromise. We later added in String and enum types, which were sensible additions. But now we move into a world where we can switch on every type _except_ float, double, and boolean -- and this no longer seems sensible. It still may not be something people will use often, but a key driver of the redesign of switch has been refactorability, and we currently don't have a story for refactoring

    record R(float f) { }

    switch (r) {
        case R(0f): ...
        case R(1f): ...
    }

to

    switch (r) {
        case R rr:
            switch (rr.f()) {
                case 0f: ...
                case 1f: ...
            }
    }

because we don't have switches on float.
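(As an aside, the desugaring of a constant pattern described above, `x matches C alpha && alpha eq c`, can be written out by hand in today's Java for the `R(float f)` example. This is a sketch only: `FloatConstantSketch`, `eq` and `classify` are hypothetical names, and using `Float.compare` as the `eq` predicate is an assumption made for illustration, not something the proposal has settled.)

```java
// Sketch only: hand-written equivalent of `case R(0f)` / `case R(1f)`
// using the "bind alpha, then compare" reading of a constant pattern.
final class FloatConstantSketch {

    record R(float f) { }

    // Stand-in for the `eq` predicate. Float.compare is an assumed choice;
    // it distinguishes 0f from -0f and treats NaN as equal to itself.
    static boolean eq(float a, float b) {
        return Float.compare(a, b) == 0;
    }

    static String classify(R r) {
        float alpha = r.f();                 // r matches R(float alpha)
        if (eq(alpha, 0f)) return "zero";    // ... && alpha eq 0f
        if (eq(alpha, 1f)) return "one";     // ... && alpha eq 1f
        return "other";                      // constant patterns are never exhaustive
    }

    public static void main(String[] args) {
        System.out.println(classify(new R(0f))); // zero
        System.out.println(classify(new R(1f))); // one
        System.out.println(classify(new R(2f))); // other
    }
}
```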
By retconning constant case labels as patterns, we don't have to define new semantics for switching on these types or for constant labels of these types; we only have to remove the restrictions about what types you can switch on.

**Denoting constant patterns.** One of the remaining questions is how we denote constant patterns. This is a bit of a bikeshed, which we can come back to when we're ready to move forward. For purposes of exposition we'll use the constant literal here.

**Closing a compositional asymmetry.** In the "Patterns in the Java Object Model" document, we called attention to a glaring problem in API design, where it becomes nearly impossible to use the same sort of composition for taking apart objects that we use for putting them together. As an example, suppose we compose an `Optional<Shape>` as follows:

    Optional<Shape> os = Optional.of(Shape.redBall(1));

Here, we have static factories for both Optional and Shape; they don't know about each other, but we can compose them just fine. Today, if we want to reverse that -- ask whether an `Optional<Shape>` contains a red ball of size 1 -- we have to do something awful and error-prone:

    Shape s = os.orElse(null);
    boolean isRedUnitBall = s != null
                            && s.isBall()
                            && (s.color() == RED)
                            && s.size() == 1;
    if (isRedUnitBall) { ... }

These code snippets look nothing alike, making reversal harder and more error-prone, and it gets worse the deeper you compose. With destructuring patterns, this gets much better and more like the creation expression:

    if (os instanceof Optional.of(Shape.redBall(var size))
        && size == 1) { ... }

but that `&& size == 1` was a pesky asymmetry. With constant patterns (modulo syntax), we can complete the transformation:

    if (os instanceof Optional.of(Shape.redBall(1))) { ... }

and destructuring looks just like the aggregation.

**Bonus round: the last (?) vestige.**
Currently, we allow statement switches on legacy switch types (integers, their boxes, strings, and enums) with all constant labels to be partial, and require all other switches to be total. Patching this hole is harder, since there is lots of legacy code today that depends on this partiality. There are a few things we can do to pave the way forward here:

 - Allow `default -> ;` in addition to `default -> { }`, since people seem to have a hard time discovering the latter.
 - Issue a warning when a legacy switch construct is not exhaustive. This can start as a lint warning, move up to a regular warning over time, then a mandatory (unsuppressable) warning. Maybe in a decade it can become an error, but we can start paving the way sooner.

From amalloy at google.com Mon Sep 12 20:20:05 2022
From: amalloy at google.com (Alan Malloy)
Date: Mon, 12 Sep 2022 13:20:05 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
Message-ID:

It's nice to see this: I think it helps some with the previous discussion on the list about "why do we want instanceof for primitives?" The point isn't that we expect anyone to use instanceof for primitives often, but that conversions for primitives in patterns are an important part of fixing up the asymmetries switch is still stuck with.

On Mon, Sep 12, 2022 at 12:36 PM Brian Goetz wrote:

> The work on primitive patterns continues to yield fruit; it points us to a
> principled way of dealing with _constant patterns_, both as nested
> patterns, and to redefining constant case labels as simple patterns. It
> also points us to a way to bring the missing three types into the realm of
> switch (since now switch is usable at every type _but_ these): float,
> double, and boolean.
While I'm not in a hurry to prioritize this > immediately, I wanted to connect the dots to how primitive type patterns > lay the foundation for these two vestiges of legacy switch. (The remaining > vestige, not yet dealt with, is that legacy statement switches are not > exhaustive. We'd like a path back to uniformity there as well, but this is > likely a longer road.) > > **Constant patterns.** In early explorations (e.g., "Pattern Matching > Semantics"), we struggled with the meaning of constant patterns, > specifically with conversions in the absence of a sharp type for the match > target. The exploration of that document treated boxing conversions but > not other conversions, which would have created a gratuitously new > conversion context. This was one of several reasons we deferred constant > patterns. > > The current status is that constant case labels (e.g., `case 3`) are > permitted (a) only in the presence of a compatible operand type and (b) are > not patterns. This has led to some accidental complexity in specifying > switch, since we can have a mix of pattern and non-pattern labels, and it > means we can't use constants as nested patterns. (We've also not yet > integrated enum cases into the exhaustiveness analysis in the presence of a > sealed type that permits an enum type.) Ret-conning all case labels as > patterns seems attractive if we can make the semantics clear, as not only > does it bring more uniformity, but it means we can use them as nested > patterns, not just at the top level of the switch. More composition. > > The recent work on `instanceof` involving primitives offers a clear and > principled meaning to `0` as a pattern; given a constant `c` of type `C`, > treat > > x matches c > > as meaning > > x matches C alpha && alpha eq c > > where `eq` is a suitable comparison predicate for the type C (== for > integral types and enums, .equals() for String, and something irritating > for floating point.) 
This gives us a solid basis for interpreting > something like `case 3L`; we match if the target would match `long alpha` > and `alpha == 3L`. No new rules; all conversions are handled through the > type pattern for the static type of the constant in question. Not > coincidentally, the rules for primitive type patterns support the implicit > conversions allowed in today's switches on `short`, `byte`, and `char`, > which are allowed to use `int` labels, preserving the meaning of existing > code while we generalize what switch means. > > The other attributes of patterns -- applicability, exhaustiveness, and > dominance -- are also easy: > > - a constant pattern for `c : C` is applicable to S if a type pattern for > `C` is applicable to S. > - a type pattern for T dominates a constant pattern for `c : C` if the > type pattern for T dominates a type pattern for C. > - constant patterns are never exhaustive. > > No new rules; just appeal to type patterns. > > **Switch on float, double, and boolean.** Switches on floating point were > left out for the obvious reason -- it just isn't that useful, and it would > have introduced new complexity into the specification of switch. > Similarly, boolean was left out because we have "if" statements. In the > original world, where you could switch on only five types, this was a > sensible compromise. We later added in String and enum types, which were > sensible additions. But now we move into a world where we can switch on > every type _except_ float, double, and boolean -- and this no long seems > sensible. It still may not be something people will use often, but a key > driver of the redesign of switch has been refactorability, and we currently > don't have a story for refactoring > > record R(float f) { } > > switch (r) { > case R(0f): ... > case R(1f): ... > } > > to > > switch (r) { > case R rr: > switch (rr.f()) { > case 0f: ... > case 1f: ... > } > } > > because we don't have switches on float. 
By retconning constant case > labels as patterns, we don't have to define new semantics for switching on > these types or for constant labels of these types, we only have to remove > the restrictions about what types you can switch on. > > **Denoting constant patterns.** One of the remaining questions is how we > denote constant patterns. This is a bit of a bikeshed, which we can come > back to when we're ready to move forward. For purposes of exposition we'll > use the constant literal here. > > **Closing a compositional asymmetry.** In the "Patterns in the Java > Object Model" document, we called attention to a glaring problem in API > design, where it becomes nearly impossible to use the same sort of > composition for taking apart objects that we use for putting them > together. As an example, suppose we compose an `Optional` as > follows: > > Optional os = Optional.of(Shape.redBall(1)); > > Here, we have static factories for both Optional and Shape, they don't > know about each other, but we can compose them just fine. Today, if we > want to reverse that -- ask whether an `Optional` contains a red > ball of size 1, we have to do something awful and error prone: > > Shape s = os.orElse(null); > boolean isRedUnitBall = s != null > && s.isBall() > && (s.color() == RED) > && s.size() == 1; > if (isRedUnitBall) { ... } > > These code snippets look nothing alike, making reversal harder and more > error-prone, and it gets worse the deeper you compose. With destructuring > patterns, this gets much better and more like the creation expression: > > if (os instanceof Optional.of(Shape.redBall(var size)) > && size == 1) { ... } > > but that `&& size == 1` was a pesky asymmetry. With constant patterns > (modulo syntax), we can complete the transformation: > > if (os instanceof Optional.of(Shape.redBall(1)) { ... } > > and destructuring looks just like the aggregation. > > **Bonus round: the last (?) 
> vestige.** Currently, we allow statement
> switches on legacy switch types (integers, their boxes, strings, and enums)
> with all constant labels to be partial, and require all other switches to
> be total. Patching this hole is harder, since there is lots of legacy code
> today that depends on this partiality. There are a few things we can do to
> pave the way forward here:
>
>  - Allow `default -> ;` in addition to `default -> { }`, since people seem
>    to have a hard time discovering the latter.
>  - Issue a warning when a legacy switch construct is not exhaustive. This
>    can start as a lint warning, move up to a regular warning over time,
>    then a mandatory (unsuppressable) warning. Maybe in a decade it can
>    become an error, but we can start paving the way sooner.

From forax at univ-mlv.fr Mon Sep 12 22:28:58 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 00:28:58 +0200 (CEST)
Subject: Primitives in instanceof and patterns
In-Reply-To:
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
Message-ID: <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Sunday, September 11, 2022 4:48:04 PM
> Subject: Re: Primitives in instanceof and patterns

>>>
>>> I think you're falling into the trap of examining each conversion and
>>> asking "would I want a pattern to do this."
>> Given that only primitive widening casts are safe, allowing only primitive
>> widening is another way to answer the question of what a primitive type
>> pattern is.
>> You are proposing a semantics using range checks, that's the problem.
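
For readers following the "range check" point above, here is a minimal sketch of what such a test amounts to for `int` to `byte` in today's Java. The class and method names (`RangeCheckSketch`, `fitsInByte`, `roundTrips`) are made up for illustration and are not part of any proposal in this thread.

```java
// Hypothetical names; a sketch of a checked int-to-byte narrowing.
final class RangeCheckSketch {

    // True when i lies inside the byte range, so narrowing loses nothing.
    static boolean fitsInByte(int i) {
        return i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE;
    }

    // Equivalent formulation: casting to byte and widening back is the identity.
    static boolean roundTrips(int i) {
        return (byte) i == i;
    }

    public static void main(String[] args) {
        System.out.println(fitsInByte(100));  // true
        System.out.println(fitsInByte(200));  // false
        System.out.println((byte) 200);       // -56: the unchecked cast truncates silently
    }
}
```

The two helpers agree on every `int`; the second form shows why the test can be described as "would the cast be lossless" rather than as an explicit pair of bounds.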
>
> So, substitute "reference" for "primitive" in this argument, and you
> will see how silly it is: "since only reference widening is safe,
> allowing only reference widening would be 'another answer to what a
> reference type pattern is.'" But that would also be a useless
> semantic. You're caught up on "range checks", but that's not the
> important thing here. Casting is the important thing.

In fact, primitive widening is not a good idea, see my answer about the
constant pattern.

>
>> As an example, instanceof rules and the rules about overriding methods
>> are intimately linked, asking if a method override another is
>> equivalent to asking if their function types are a subtypes.
>> if int instanceof double is allowed, then B::m should override A::m
>>   class A {
>>     int m() { ... }
>>   }
>>   class B extends A {
>>     @Override
>>     double m() { ... }
>>   }
>>
>> This is what i meant by changing other rules.
>
> Another cute argument, but no. Covariant overriding is linked to
> *subtyping*. Instanceof *happens to coincide* with subtyping right now
> (given its ad-hoc restrictions), but the causality goes the other way.
> (Casting also appeals to subtyping, through reference widening
> conversions.) But this argument is like starting with "all men are
> mortal" and "Socrates is a man" and concluding "All men are Socrates."
>
> We can talk about whether it would be wise to align the definition of
> covariant overrides with conversions other than reference widening (and
> will likely come up again in Valhalla anyway), but this is by no means a
> forced move, and not tied to generalizing the semantics of instanceof.

If we can avoid ten different semantics for casting, pattern and
overriding, etc. I think it's a win.

Valhalla is another can of worms, because you are prematurely assigning a
semantics to instanceof int, so Valhalla can not retcon instanceof int to
instanceof Qjava/lang/Integer; even if unlike the primitive type int,
Qjava/lang/Integer; is an object.
There is another mismatch: int.class.isInstance(o) and o instanceof int are
not aligned anymore.

>
>> I found a way to explain clearly why a reference type pattern and a
>> primitive type pattern are different.
>>
>> Let's suppose that the code compiles (to avoid the issues of the
>> separate compilation),
>> unlike a reference type pattern, the code executed for a primitive
>> type pattern is a function of *both* the declared type and the pattern
>> type.
>
> So (a) untrue -- what code we execute for a reference type pattern does
> depend on the static types -- we may or may not generate an `instanceof`
> instruction, depending on whether the pattern is unconditional. (The
> same is true for a cast; some casts are no-ops and generate no code.)

Please take a look at the examples; in both cases, if the code compiles, it
means that the first pattern is conditional and the second unconditional.

> And (b), so what? We're asking "would it be safe to cast x to T".
> Depending on the types X and T, we will have different code for the
> casting, so why is it unreasonable to have different code for asking
> whether it is castable ?

see below

>
>>
>> By example, if i have a code like this, i've no idea what code is
>> executed for case Foo(int i) without having to go to the declaration
>> of Foo which is usually not collocated with the switch itself.
>>
>>    Foo foo = ...
>>    switch (foo) {
>>      case Foo(int i) -> {}
>>      case Foo(double d) -> {}
>>    }
>
> Sigh, this argument again? We've been through this extensively the
> first time around, with reference types, where you "had no idea what
> this code means" without looking at the declaration of the pattern.
> (Then, it was partiality and totality.) I get that you didn't like that
> total and partial patterns don't look syntactically different, and that
> ship has sailed. But this is the same argument warmed over.
Nope, please take a look at the example; in both cases, if the code
compiles, it means that the first pattern is conditional and the second
unconditional.

>
> "What code will be executed" is irrelevant; what is relevant is the
> semantics. Assuming a single deconstruction pattern for Foo, the first
> case asks "can the Foo's component be cast safely to int, and if so,
> please cast it for me". It doesn't matter what code we use to answer
> that question or do the cast -- could be a narrowing, could be an
> unboxing, whatever.

It matters because the first pattern is conditional, so it's important to
know the condition, at least when you debug.

>
> You see the same thing today without patterns:
>
>     var x = foo.getFoo();
>     int i = (int) x;
>
> x could be a long, an int, an Integer, etc, but you don't know unless
> you look at the definition of getFoo(). And you have "no idea what code
> will be executed." Sure, but so what? You asked for a cast to int.
> The language validated that x is castable to int, and does what needs to
> be done, which might be nothing, or a widening, or a narrowing with
> truncation, or an unboxing, or some combination.

There is a big difference between

  var x = foo.getFoo();
  if (x instanceof int) {
    ...
  }

and the code above when you are reading the code.
The issue with the semantics you propose is that the pattern expresses a
condition but the condition is hidden.
With a cast there is no condition, it will always be executed.

>
> (When we get to overloading deconstruction patterns, we'll have all the
> same issues as we have with overloading methods today -- it is not
> obvious looking only at the call site, which overload is called, and
> therefore which conversions are applied to arguments or returns.)

We do not need overloading of patterns !
I repeat.
We do not need overloading of patterns !
We need overloaded constructors because if the canonical constructor takes
3 arguments and we want a constructor with two, we have to provide a value.
In the case of pattern methods / deconstructors, we can match the three
arguments but with an '_' for the argument we want to drop.

>
> As a reminder, here's what a nested pattern means:
>
>     x matches P(Q) === x matches P(var q) && q matches Q
>
> Understanding what is going to happen involves understanding the type of
> `q`. I get that you didn't like that choice, and that's your right, but
> it's not OK to keep bringing it up as if it's a new thing.

If the switch is exhaustive, the patterns below will usually (it's not
fully true) help.

  switch(x) {
    case P(Q q) ->
    case P(R r) ->
  }

the last one is unconditional so the first pattern does a r instanceof Q q.

>
>
> I think I actually understand your concern here, which has nothing to do
> with the dozen or so bogus examples and explanations you've tossed out
> so far. It is that cast conversion is complicated, and you would like
> pattern matching to be "simple", and so pulling in the muck of cast
> conversion into pattern matching feels to you like an unforced error.
> Right? (And if so, perhaps you could have just said that, instead of
> throwing random arguments at the wall?)

Nope,
let me recapitulate.

1) having a primitive pattern doing a range check is useless because it is
   rare that you want to do a range check + cast in real life.
   How many people have written code like this

     int i = ...
     if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
       byte b = (byte) i;
       ...
     }

   It's useful when you write a bytecode generator without using an
   existing library, ok, but how many write a bytecode generator ?
   It should not be the default behavior for the primitive type pattern.

2) It's also useless because there is no need to have it as a pattern, when
   you can use a cast in the following expression

     Person person = ...
     switch(person) {
       // instead of
       // case Person(double age) -> foo(age);
       // one can write
       case Person(int age) -> foo(age);  // widening cast
     }

3) when you read a conditional primitive pattern, you have no idea what the
   underlying operation is until you go to the declaration (unlike the code
   just above).

4) if we change the type pattern to be not just about subtyping, we should
   revisit the JLS to avoid having too many different semantics.

Rémi

From forax at univ-mlv.fr Mon Sep 12 22:29:03 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Tue, 13 Sep 2022 00:29:03 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
Message-ID: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Monday, September 12, 2022 9:36:09 PM
> Subject: Knocking off two more vestiges of legacy switch

> The work on primitive patterns continues to yield fruit; it points us to a
> principled way of dealing with _constant patterns_, both as nested patterns,
> and to redefining constant case labels as simple patterns. It also points us to
> a way to bring the missing three types into the realm of switch (since now
> switch is usable at every type _but_ these): float, double, and boolean. While
> I'm not in a hurry to prioritize this immediately, I wanted to connect the dots
> to how primitive type patterns lay the foundation for these two vestiges of
> legacy switch. (The remaining vestige, not yet dealt with, is that legacy
> statement switches are not exhaustive. We'd like a path back to uniformity
> there as well, but this is likely a longer road.)

> **Constant patterns.** In early explorations (e.g., "Pattern Matching
> Semantics"), we struggled with the meaning of constant patterns, specifically
> with conversions in the absence of a sharp type for the match target.
> The exploration of that document treated boxing conversions but not other
> conversions, which would have created a gratuitously new conversion context.
> This was one of several reasons we deferred constant patterns.

> The current status is that constant case labels (e.g., `case 3`) are permitted
> (a) only in the presence of a compatible operand type and (b) are not patterns.
> This has led to some accidental complexity in specifying switch, since we can
> have a mix of pattern and non-pattern labels, and it means we can't use
> constants as nested patterns. (We've also not yet integrated enum cases into
> the exhaustiveness analysis in the presence of a sealed type that permits an
> enum type.) Ret-conning all case labels as patterns seems attractive if we can
> make the semantics clear, as not only does it bring more uniformity, but it
> means we can use them as nested patterns, not just at the top level of the
> switch. More composition.

> The recent work on `instanceof` involving primitives offers a clear and
> principled meaning to `0` as a pattern; given a constant `c` of type `C`, treat

>     x matches c

> as meaning

>     x matches C alpha && alpha eq c

> where `eq` is a suitable comparison predicate for the type C (== for integral
> types and enums, .equals() for String, and something irritating for floating
> point.) This gives us a solid basis for interpreting something like `case 3L`;
> we match if the target would match `long alpha` and `alpha == 3L`. No new
> rules; all conversions are handled through the type pattern for the static type
> of the constant in question. Not coincidentally, the rules for primitive type
> patterns support the implicit conversions allowed in today's switches on
> `short`, `byte`, and `char`, which are allowed to use `int` labels, preserving
> the meaning of existing code while we generalize what switch means.
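
The desugaring quoted above can be approximated with features that already exist. What follows is a minimal sketch, assuming Java 16 or later for `instanceof` type patterns and using the boxed type `Long` as a stand-in for the proposed `long alpha`, so it does not model the primitive conversions under discussion. All class and method names are hypothetical. The two `float` helpers only illustrate why choosing `eq` for floating point is awkward; they do not claim which comparison the language would adopt.

```java
// Hypothetical names; approximates "x matches c" as "x matches C alpha && alpha eq c".
final class ConstantPatternSketch {

    // "x matches 3L": a type test for Long, then == on the unboxed value.
    static boolean matchesThreeL(Object x) {
        return x instanceof Long alpha && alpha == 3L;
    }

    // "x matches "foo"": a type test for String, then equals().
    static boolean matchesFoo(Object x) {
        return x instanceof String alpha && alpha.equals("foo");
    }

    // Candidate eq for float: the == operator (NaN never equals NaN, 0.0f equals -0.0f).
    static boolean eqByOperator(float a, float b) {
        return a == b;
    }

    // Candidate eq for float: bitwise comparison (NaN equals NaN, 0.0f differs from -0.0f).
    static boolean eqByBits(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }

    public static void main(String[] args) {
        System.out.println(matchesThreeL(3L));                  // true
        System.out.println(matchesThreeL(3));                   // false: an Integer is not a Long
        System.out.println(matchesFoo("foo"));                  // true
        System.out.println(eqByOperator(Float.NaN, Float.NaN)); // false
        System.out.println(eqByBits(Float.NaN, Float.NaN));     // true
        System.out.println(eqByOperator(0.0f, -0.0f));          // true
        System.out.println(eqByBits(0.0f, -0.0f));              // false
    }
}
```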
> The other attributes of patterns -- applicability, exhaustiveness, and dominance
> -- are also easy:

> - a constant pattern for `c : C` is applicable to S if a type pattern for `C` is
>   applicable to S.
> - a type pattern for T dominates a constant pattern for `c : C` if the type
>   pattern for T dominates a type pattern for C.
> - constant patterns are never exhaustive.

> No new rules; just appeal to type patterns.

It shows that the semantics you propose for the primitive type pattern is
not the right one.

Currently, a code like this does not compile

  byte b = ...
  switch(b) {
    case 200 -> ....
  }

because 200 is not a short which is great because otherwise at runtime it
will never be reached.

But if we apply the rules above + your definition of the primitive pattern,
the code above will happily compile because it is equivalent to

  byte b = ...
  switch(b) {
    case short s when s == 200 -> ....
  }

Moreover, i think R(true) and R(false) should be exhaustive. It's not a big
deal because you can rewrite it as R(true) and R (or R(_)), but i think that
R(true) and R(false) is more readable.

> **Switch on float, double, and boolean.** Switches on floating point were left
> out for the obvious reason -- it just isn't that useful, and it would have
> introduced new complexity into the specification of switch. Similarly, boolean
> was left out because we have "if" statements. In the original world, where you
> could switch on only five types, this was a sensible compromise. We later added
> in String and enum types, which were sensible additions. But now we move into a
> world where we can switch on every type _except_ float, double, and boolean --
> and this no longer seems sensible. It still may not be something people will use
> often, but a key driver of the redesign of switch has been refactorability, and
> we currently don't have a story for refactoring

>     record R(float f) { }

>     switch (r) {
>         case R(0f): ...
>         case R(1f): ...
>     }

> to

>     switch (r) {
>         case R rr:
>             switch (rr.f()) {
>                 case 0f: ...
>                 case 1f: ...
>             }
>     }

> because we don't have switches on float. By retconning constant case labels as
> patterns, we don't have to define new semantics for switching on these types or
> for constant labels of these types, we only have to remove the restrictions
> about what types you can switch on.

> **Denoting constant patterns.** One of the remaining questions is how we denote
> constant patterns. This is a bit of a bikeshed, which we can come back to when
> we're ready to move forward. For purposes of exposition we'll use the constant
> literal here.

This is what Haskell does and what Caml doesn't; at some point we will have
to pick a side.

> **Closing a compositional asymmetry.** In the "Patterns in the Java Object
> Model" document, we called attention to a glaring problem in API design, where
> it becomes nearly impossible to use the same sort of composition for taking
> apart objects that we use for putting them together. As an example, suppose we
> compose an `Optional<Shape>` as follows:

>     Optional<Shape> os = Optional.of(Shape.redBall(1));

> Here, we have static factories for both Optional and Shape, they don't know
> about each other, but we can compose them just fine. Today, if we want to
> reverse that -- ask whether an `Optional<Shape>` contains a red ball of size 1,
> we have to do something awful and error prone:

>     Shape s = os.orElse(null);
>     boolean isRedUnitBall = s != null
>                             && s.isBall()
>                             && (s.color() == RED)
>                             && s.size() == 1;
>     if (isRedUnitBall) { ... }

> These code snippets look nothing alike, making reversal harder and more
> error-prone, and it gets worse the deeper you compose. With destructuring
> patterns, this gets much better and more like the creation expression:

>     if (os instanceof Optional.of(Shape.redBall(var size))
>         && size == 1) { ... }

> but that `&& size == 1` was a pesky asymmetry.
> With constant patterns (modulo
> syntax), we can complete the transformation:

>     if (os instanceof Optional.of(Shape.redBall(1))) { ... }

> and destructuring looks just like the aggregation.

I agree, it's quite sad that we have to support float and double but as you
said composition is more important.

> **Bonus round: the last (?) vestige.** Currently, we allow statement switches on
> legacy switch types (integers, their boxes, strings, and enums) with all
> constant labels to be partial, and require all other switches to be total.
> Patching this hole is harder, since there is lots of legacy code today that
> depends on this partiality. There are a few things we can do to pave the way
> forward here:

> - Allow `default -> ;` in addition to `default -> { }`, since people seem to
>   have a hard time discovering the latter.

we should also fix that for lambdas: the fact that the lambda syntax and the
case arrow syntax are not currently aligned (`() -> throw ...` is not legal
while `case ... -> throw ...` is) is something that troubles a lot of my
students (i also introduce the switch syntax before the lambda, so the
lambda seems less powerful).

> - Issue a warning when a legacy switch construct is not exhaustive. This can
>   start as a lint warning, move up to a regular warning over time, then a
>   mandatory (unsuppressable) warning. Maybe in a decade it can become an error,
>   but we can start paving the way sooner.

I agree with a switch warning if all the IDEs stop fixing the warning by
adding a `default` when the type switched upon is sealed.

Rémi
From brian.goetz at oracle.com Mon Sep 12 22:48:42 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Sep 2022 18:48:42 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com>
 <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr>
 <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr>
 <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr>
 <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr>
Message-ID: <27437ad6-7a87-580f-a593-9866f1ee8af5@oracle.com>

>> (When we get to overloading deconstruction patterns, we'll have all the
>> same issues as we have with overloading methods today -- it is not
>> obvious looking only at the call site, which overload is called, and
>> therefore which conversions are applied to arguments or returns.)
> We do not need overloading of patterns !
> I repeat.
> We do not need overloading of patterns !

You would be incorrect about that.

Deconstruction patterns are the dual of constructors. Pairing a
constructor (or factory) with a deconstruction (or static) pattern forms
an embedding-projection pair, which is what drives, e.g., the `with`
construct. This is a very powerful relationship. Constructors can be
overloaded; saying "but deconstructors can't" is just a gratuitous
restriction, and it undermines the role of patterns as the dual of
ctors/methods.

At the risk of repeating myself, it is clear that you just want pattern
matching to be a much smaller feature. That's a valid opinion, but
please stop with the "not a good idea" / "not needed" / "useless" /
"YAGNI" at every turn. There's a big story here; you can agree with it
or not, but please, please, please stop trying to talk down every part
of the story because it seems "too big" to you. If you don't understand
the big story, ask (constructive) questions.
But please, please, stop trying to YAGNI away everything. It's not helpful.

> Nope,
> let me recapitulate.
>
> 1) having a primitive pattern doing a range check is useless because this is rare that you want to do a range check + cast in real life,
>    How many people have written a code like this
>
>      int i = ...
>      if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
>        byte b = (byte) i;
>        ...
>      }
>
>    It's useful when you write a bytecode generator without using an existing library, ok, but how many write a bytecode generator ?
>    It should not be the default behavior for the primitive type pattern.
>
> 2) It's also useless because there is no need to have it as a pattern, when you can use a cast in the following expression
>      Person person = ...
>      switch(person) {
>        // instead of
>        // case Person(double age) -> foo(age);
>        // one can write
>        case Person(int age) -> foo(age);  // widening cast
>      }
>
> 3) when you read a conditional primitive patterns, you have no idea what is the underlying operation until you go to the declaration (unlike the code just above).
>
>
> 4) if we change the type pattern to be not just about subtyping, we should revisit the JLS to avoid to have too many different semantics.
>

Thanks for stating your concerns succinctly. (Some of this is just
subjective "I want patterns to be a smaller feature"; some is
disagreement with decisions that are already made.)

From brian.goetz at oracle.com Mon Sep 12 22:57:40 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 12 Sep 2022 18:57:40 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
Message-ID:

>
> It shows that the semantics you propose for the primitive type pattern
> is not the right one.
>
> Currently, a code like this does not compile
>   byte b = ...
>   switch(b) {
>     case 200 -> ....
>   }

Thanks, that's a good catch -- we currently do more type checking than a
strict interpretation of this story for constant patterns provides. But
this can be addressed by additional compile-time type checking for
constant patterns.

But this would be a critique of _constant patterns_, not of primitive
type patterns (and easily addressed.)

>
> because 200 is not a short which is great because otherwise at runtime
> it will never be reached.

I think you mean "not a byte"?

>
> But if we apply the rules above + your definition of the primitive
> pattern, the code above will happily compile because it is equivalent to
>
>   byte b = ...
>   switch(b) {
>     case short s when s == 200 -> ....
>   }

I think you mean "case int s when s == 200"?

>
> Moreover, i think R(true) and R(false) should be exhaustive, it's not
> a big deal because you can rewrite it R(true) and R (or R(_)) but i
> think that R(true) and R(false) is more readable.

Agree, that's in the plan. Booleans are like enums, so true/false covers
boolean, and therefore R(true) and R(false) covers R(boolean).

>
> I agree, it's quite sad that we have to support float and double but
> as you said composition is more important.

It would have been unfortunate if we had to add these as special cases
for switch. But with primitive type patterns plus "constants are
patterns" then this falls out trivially without additional
specification; all we have to do is _remove_ the existing restriction.

>
>
>     **Bonus round: the last (?) vestige.** Currently, we allow
>     statement switches on legacy switch types (integers, their boxes,
>     strings, and enums) with all constant labels to be partial, and
>     require all other switches to be total. Patching this hole is
>     harder, since there is lots of legacy code today that depends on
>     this partiality.
>     There are a few things we can do to pave the way
>     forward here:
>
>      - Allow `default -> ;` in addition to `default -> { }`, since
>        people seem to have a hard time discovering the latter.
>
>
> we should also fix that for lambdas, the fact that the lambda syntax
> and the case arrow syntax are not aligned currently ; `() -> throw
> ...`is not legal while `case ... -> throw ...` is, is something that
> trouble a lot of my student (i also introduce the switch syntax before
> the lambda, so the lambda seems less powerful).

Good thought.

>      - Issue a warning when a legacy switch construct is not
>        exhaustive. This can start as a lint warning, move up to a
>        regular warning over time, then a mandatory (unsuppressable)
>        warning. Maybe in a decade it can become an error, but we can
>        start paving the way sooner.
>
>
> I agree with a switch warning if all the IDEs stop fixing the warning
> by adding a `default` when the type switched upon is sealed.

Right, I think over some time, IDEs will fix all the occurrences and
then it is less disruptive to tighten.

From john.r.rose at oracle.com Mon Sep 12 22:58:47 2022
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 12 Sep 2022 15:58:47 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
Message-ID: <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>

It's too harsh to say your example shows the semantics are just wrong.

I think they are right, but possibly incomplete. The exclusion of case
200 is the job of dead code detection logic in the language, the same
kind of logic that also reports an error on `"foo" instanceof List`.
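
The claim that `case 200` can never match a `byte` is easy to check with today's Java. Below is a small sketch with hypothetical names (`ByteCaseSketch`, `anyByteEquals`); it simply enumerates every `byte` value and is not meant to describe how a compiler would detect the dead code.

```java
// Hypothetical names; brute-force check of which int constants a byte can equal.
final class ByteCaseSketch {

    // True if at least one of the 256 byte values compares equal to the constant.
    static boolean anyByteEquals(int constant) {
        for (int v = Byte.MIN_VALUE; v <= Byte.MAX_VALUE; v++) {
            byte b = (byte) v;           // lossless here, since v is within the byte range
            if (b == constant) {         // b is promoted to int for the comparison
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte ok = 100;                   // compiles: the constant 100 fits in a byte
        // byte bad = 200;               // would not compile: 200 is outside the byte range
        System.out.println(ok);                  // 100
        System.out.println(anyByteEquals(100));  // true
        System.out.println(anyByteEquals(200));  // false: a label of 200 would be unreachable
    }
}
```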
Then there are the old murky rules that allow an integral constant like
100 to assign to `byte` only because 100 fits in the byte range while
200 does not. The duals of those rules will surely speak to the
restriction of `case 200:` matching a byte.

On 12 Sep 2022, at 15:29, Remi Forax wrote:

>> No new rules; just appeal to type patterns.
> It shows that the semantics you propose for the primitive type pattern
> is not the right one.
>
> Currently, a code like this does not compile
>   byte b = ...
>   switch(b) {
>     case 200 -> ....
>   }
>
> because 200 is not a short which is great because otherwise at runtime
> it will never be reached.
>
> But if we apply the rules above + your definition of the primitive
> pattern, the code above will happily compile because it is equivalent
> to
>
>   byte b = ...
>   switch(b) {
>     case short s when s == 200 -> ....
>   }

From forax at univ-mlv.fr Tue Sep 13 14:06:17 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Tue, 13 Sep 2022 16:06:17 +0200 (CEST)
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
Message-ID: <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>

> From: "John Rose"
> To: "Remi Forax"
> Cc: "Brian Goetz", "amber-spec-experts"
> Sent: Tuesday, September 13, 2022 12:58:47 AM
> Subject: Re: Knocking off two more vestiges of legacy switch

> It's too harsh to say your example shows the semantics are just wrong.

yes, it's more that there are inconsistencies

> I think they are right, but possibly incomplete. The exclusion of case 200 is
> the job of dead code detection logic in the language, the same kind of logic
> that also reports an error on "foo" instanceof List .
> Then there are the old murky rules that allow an integral constant like 100 to
> assign to byte only because 100 fits in the byte range while 200 does not. The
> duals of those rules will surely speak to the restriction of case 200: matching
> a byte.

The problem with that approach is that the semantics of constant patterns
and the semantics of primitive type patterns will not be aligned,
so if you have both patterns in a switch, users will spot the inconsistency.

something like

  byte b = ...
  switch(b) {
    case 200 -> ...    // does not compile, incompatible types between byte and int
    case int i -> ...  // ok, compiles
  }

So i agree that we should have primitive type patterns but instead of using
the casting rules as model, the actual rules complemented with boolean,
long, float and double seems a better fit.

Compared to what Brian proposed, it means all primitive patterns are
unconditional apart from unboxing if the pattern is not total (the same way
reference type pattern works with null).

Rémi

> On 12 Sep 2022, at 15:29, Remi Forax wrote:

>>> No new rules; just appeal to type patterns.
>> It shows that the semantics you propose for the primitive type pattern is not
>> the right one.

>> Currently, a code like this does not compile
>>   byte b = ...
>>   switch(b) {
>>     case 200 -> ....
>>   }

>> because 200 is not a short which is great because otherwise at runtime it will
>> never be reached.

>> But if we apply the rules above + your definition of the primitive pattern, the
>> code above will happily compile because it is equivalent to

>>   byte b = ...
>>   switch(b) {
>>     case short s when s == 200 -> ....
>>   }
From heidinga at redhat.com Tue Sep 13 14:13:48 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Tue, 13 Sep 2022 10:13:48 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com>
 <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
 <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com>
 <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr>
Message-ID:

On Tue, Sep 13, 2022 at 10:08 AM wrote:

>
>
> ------------------------------
>
> *From: *"John Rose"
> *To: *"Remi Forax"
> *Cc: *"Brian Goetz" , "amber-spec-experts" <
> amber-spec-experts at openjdk.java.net>
> *Sent: *Tuesday, September 13, 2022 12:58:47 AM
> *Subject: *Re: Knocking off two more vestiges of legacy switch
>
> It's too harsh to say your example shows the semantics are just wrong.
>
>
> yes, it's more than there is inconsistencies
>
> I think they are right, but possibly incomplete. The exclusion of case 200
> is the job of dead code detection logic in the language, the same kind of
> logic that also reports an error on "foo" instanceof List.
>
> Then there are the old murky rules that allow an integral constant like
> 100 to assign to byte only because 100 fits in the byte range while 200
> does not. The duals of those rules will surely speak to the restriction of case
> 200: matching a byte.
>
>
> The problem with that approach is that the semantics of constant patterns
> and the semantics of primitive type patterns will be not aligned,
> so if you have both pattern in a switch, users will spot the inconsistency.
>
> something like
>   byte b = ...
>   switch(b) {
>     case 200 -> ...  // does not compile, incompatible types between byte
> and int
>     case int i -> ...  // ok, compiles
>   }
>

I've been following along on this discussion and I'm not sure what the
inconsistency here is. Remi, can you clarify?
As a developer, the semantics here are intuitive - I can't have a (signed) byte that matches 200 so as John said earlier, it's clearly dead code. On the other hand, bytes can always be converted to an int so it makes sense that the `case int i` both compiles and matches to the byte. Can you expand on why users would find that confusing?

--Dan

>
> So i agree that we should have primitive type patterns but instead of
> using the casting rules as model, the actual rules complemented with
> boolean, long, float and double seems a better fit.
>
> Compared to what Brian proposed, it means all primitive patterns are
> unconditional apart unboxing if the pattern is not total (the same way
> reference type pattern works with null).
>
> Rémi
>
> On 12 Sep 2022, at 15:29, Remi Forax wrote:
>
> No new rules; just appeal to type patterns.
>
> It shows that the semantics you propose for the primitive type pattern is
> not the right one.
>
> Currently, a code like this does not compile
>   byte b = ...
>   switch(b) {
>     case 200 -> ....
>   }
>
> because 200 is not a short which is great because otherwise at runtime it
> will never be reached.
>
> But if we apply the rules above + your definition of the primitive
> pattern, the code above will happily compile because it is equivalent to
>
>   byte b = ...
>   switch(b) {
>     case short s when s == 200 -> ....
>   }
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From forax at univ-mlv.fr Tue Sep 13 14:51:47 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 13 Sep 2022 16:51:47 +0200 (CEST) Subject: Knocking off two more vestiges of legacy switch In-Reply-To: References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com> <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr> Message-ID: <1636267230.4107746.1663080707641.JavaMail.zimbra@u-pem.fr> > From: "Dan Heidinga" > To: "Remi Forax" > Cc: "John Rose" , "Brian Goetz" > , "amber-spec-experts" > > Sent: Tuesday, September 13, 2022 4:13:48 PM > Subject: Re: Knocking off two more vestiges of legacy switch > On Tue, Sep 13, 2022 at 10:08 AM < [ mailto:forax at univ-mlv.fr | > forax at univ-mlv.fr ] > wrote: >>> From: "John Rose" < [ mailto:john.r.rose at oracle.com | john.r.rose at oracle.com ] > >>> To: "Remi Forax" < [ mailto:forax at univ-mlv.fr | forax at univ-mlv.fr ] > >>> Cc: "Brian Goetz" < [ mailto:brian.goetz at oracle.com | brian.goetz at oracle.com ] >>> >, "amber-spec-experts" < [ mailto:amber-spec-experts at openjdk.java.net | >>> amber-spec-experts at openjdk.java.net ] > >>> Sent: Tuesday, September 13, 2022 12:58:47 AM >>> Subject: Re: Knocking off two more vestiges of legacy switch >>> It?s too harsh to say your example shows the semantics are just wrong. >> yes, it's more than there is inconsistencies >>> I think they are right, but possibly incomplete. The exclusion of case 200 is >>> the job of dead code detection logic in the language, the same kind of logic >>> that also reports an error on "foo" instanceof List . >>> Then there are the old murky rules that allow an integral constant like 100 to >>> assign to byte only because 100 fits in the byte range while 200 does not. The >>> duals of those rules will surely speak to the restriction of case 200: matching >>> a byte. 
>> The problem with that approach is that the semantics of constant patterns and >> the semantics of primitive type patterns will be not aligned, >> so if you have both pattern in a switch, users will spot the inconsistency. >> something like >> byte b = ... >> switch(b) { >> case 200 -> ... // does not compile, incompatible types between byte and int >> case int i -> ... // ok, compiles >> } > I've been following along on this discussion and I'm not sure what the > inconsistency here is. Remi, can you clarify? > As a developer, the semantics here are intuitive - I can't have a (signed) byte > that matches 200 so as John said earlier, it's clearly dead code. On the other > hand, bytes can always be converted to an int so it makes sense that the `case > int i` both compiles and matches to the byte. Can you expand on why users would > find that confusing? The error messages of javac says the types are incompatible. > --Dan R?mi >> So i agree that we should have primitive type patterns but instead of using the >> casting rules as model, the actual rules complemented with boolean, long, float >> and double seems a better fit. >> Compared to what Brian proposed, it means all primitive patterns are >> unconditional apart unboxing if the pattern is not total (the same way >> reference type pattern works with null). >> R?mi >>> On 12 Sep 2022, at 15:29, Remi Forax wrote: >>>>> No new rules; just appeal to type patterns. >>>> It shows that the semantics you propose for the primitive type pattern is not >>>> the right one. >>>> Currently, a code like this does not compile >>>> byte b = ... >>>> switch(b) { >>>> case 200 -> .... >>>> } >>>> because 200 is not a short which is great because otherwise at runtime it will >>>> never be reached. >>>> But if we apply the rules above + your definition of the primitive pattern, the >>>> code above will happily compile because it is equivalent to >>>> byte b = ... >>>> switch(b) { >>>> case short s when s == 200 -> .... 
>>>> } -------------- next part -------------- An HTML attachment was scrubbed... URL: From heidinga at redhat.com Tue Sep 13 15:31:19 2022 From: heidinga at redhat.com (Dan Heidinga) Date: Tue, 13 Sep 2022 11:31:19 -0400 Subject: Knocking off two more vestiges of legacy switch In-Reply-To: <1636267230.4107746.1663080707641.JavaMail.zimbra@u-pem.fr> References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com> <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr> <1636267230.4107746.1663080707641.JavaMail.zimbra@u-pem.fr> Message-ID: On Tue, Sep 13, 2022 at 11:01 AM wrote: > > > ------------------------------ > > *From: *"Dan Heidinga" > *To: *"Remi Forax" > *Cc: *"John Rose" , "Brian Goetz" < > brian.goetz at oracle.com>, "amber-spec-experts" < > amber-spec-experts at openjdk.java.net> > *Sent: *Tuesday, September 13, 2022 4:13:48 PM > *Subject: *Re: Knocking off two more vestiges of legacy switch > > > > On Tue, Sep 13, 2022 at 10:08 AM wrote: > >> >> >> ------------------------------ >> >> *From: *"John Rose" >> *To: *"Remi Forax" >> *Cc: *"Brian Goetz" , "amber-spec-experts" < >> amber-spec-experts at openjdk.java.net> >> *Sent: *Tuesday, September 13, 2022 12:58:47 AM >> *Subject: *Re: Knocking off two more vestiges of legacy switch >> >> It?s too harsh to say your example shows the semantics are just wrong. >> >> >> yes, it's more than there is inconsistencies >> >> I think they are right, but possibly incomplete. The exclusion of case >> 200 is the job of dead code detection logic in the language, the same kind >> of logic that also reports an error on "foo" instanceof List. >> >> Then there are the old murky rules that allow an integral constant like >> 100 to assign to byte only because 100 fits in the byte range while 200 >> does not. The duals of those rules will surely speak to the restriction of case >> 200: matching a byte. 
>>
>>
>> The problem with that approach is that the semantics of constant patterns
>> and the semantics of primitive type patterns will be not aligned,
>> so if you have both pattern in a switch, users will spot the
>> inconsistency.
>>
>> something like
>>   byte b = ...
>>   switch(b) {
>>     case 200 -> ... // does not compile, incompatible types between byte
>> and int
>>     case int i -> ... // ok, compiles
>>   }
>>
>
> I've been following along on this discussion and I'm not sure what the
> inconsistency here is. Remi, can you clarify?
>
> As a developer, the semantics here are intuitive - I can't have a (signed)
> byte that matches 200 so as John said earlier, it's clearly dead code. On
> the other hand, bytes can always be converted to an int so it makes sense
> that the `case int i` both compiles and matches to the byte. Can you
> expand on why users would find that confusing?
>
>
> The error messages of javac says the types are incompatible.
>
>

Ok. So the concern is with the error messages produced by javac? That seems fixable but also a separate issue from whether the semantics being proposed are a good path forward. And at least jshell is quite clear in the message it produces for similar code today so this may be a non-issue.

jshell> byte b = 200
|  Error:
|  incompatible types: possible lossy conversion from int to byte
|  byte b = 200;
|           ^-^

--Dan

>
> --Dan
>
>
> Rémi
>
>
>
>>
>> So i agree that we should have primitive type patterns but instead of
>> using the casting rules as model, the actual rules complemented with
>> boolean, long, float and double seems a better fit.
>>
>> Compared to what Brian proposed, it means all primitive patterns are
>> unconditional apart unboxing if the pattern is not total (the same way
>> reference type pattern works with null).
>>
>> Rémi
>>
>> On 12 Sep 2022, at 15:29, Remi Forax wrote:
>>
>> No new rules; just appeal to type patterns.
>>
>> It shows that the semantics you propose for the primitive type pattern is
>> not the right one.
>>
>> Currently, a code like this does not compile
>>   byte b = ...
>>   switch(b) {
>>     case 200 -> ....
>>   }
>>
>> because 200 is not a short which is great because otherwise at runtime it
>> will never be reached.
>>
>> But if we apply the rules above + your definition of the primitive
>> pattern, the code above will happily compile because it is equivalent to
>>
>>   byte b = ...
>>   switch(b) {
>>     case short s when s == 200 -> ....
>>   }
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From joe.darcy at oracle.com Tue Sep 13 16:48:15 2022
From: joe.darcy at oracle.com (Joe Darcy)
Date: Tue, 13 Sep 2022 09:48:15 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr>
Message-ID: <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com>

On 9/12/2022 3:29 PM, Remi Forax wrote:
>
>
> ------------------------------------------------------------------------
>
> *From: *"Brian Goetz"
> *To: *"amber-spec-experts"
> *Sent: *Monday, September 12, 2022 9:36:09 PM
> *Subject: *Knocking off two more vestiges of legacy switch
>
> [snip]
>
> I agree, it's quite sad that we have to support float and double but
> as you said composition is more important.
>

It is common for math library methods to have a preamble to screen out special values (infinities, NaN, 0.0, 1.0, etc.).

This would be a reasonable use of a switch on float/double switch.

-Joe
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From brian.goetz at oracle.com Tue Sep 13 16:55:49 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Tue, 13 Sep 2022 12:55:49 -0400 Subject: Knocking off two more vestiges of legacy switch In-Reply-To: <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com> References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com> Message-ID: <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com> > It is common for math library methods to have a preamble to screen out > special values (infinities, NaN, 0.0, 1.0, etc.). > > This would be a reasonable use of a switch on float/double switch. > > Which raises some questions (again) of the semantics of constant patterns for exotic floating point values, especially (again) negative zero. From forax at univ-mlv.fr Tue Sep 13 16:59:40 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 13 Sep 2022 18:59:40 +0200 (CEST) Subject: Knocking off two more vestiges of legacy switch In-Reply-To: <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com> References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com> <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com> Message-ID: <1086150907.4171803.1663088380655.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Brian Goetz" > To: "joe darcy" , "Amber Expert Group Observers" , "Remi Forax" > > Cc: "amber-spec-experts" > Sent: Tuesday, September 13, 2022 6:55:49 PM > Subject: Re: Knocking off two more vestiges of legacy switch >> It is common for math library methods to have a preamble to screen out >> special values (infinities, NaN, 0.0, 1.0, etc.). >> >> This would be a reasonable use of a switch on float/double switch. 
>>
>>
>
> Which raises some questions (again) of the semantics of constant
> patterns for exotic floating point values, especially (again) negative zero.

You mean, do we use == or Float.equals()/Double.equals()?
I will vote for the latter, like with records.

Rémi

From joe.darcy at oracle.com Tue Sep 13 17:07:45 2022
From: joe.darcy at oracle.com (Joe Darcy)
Date: Tue, 13 Sep 2022 10:07:45 -0700
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com> <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
Message-ID:

On 9/13/2022 9:55 AM, Brian Goetz wrote:
>
>> It is common for math library methods to have a preamble to screen
>> out special values (infinities, NaN, 0.0, 1.0, etc.).
>>
>> This would be a reasonable use of a switch on float/double switch.
>>
>>
>
> Which raises some questions (again) of the semantics of constant
> patterns for exotic floating point values, especially (again) negative
> zero.

In a switching context, I think there is a stronger case for distinguishing between +0.0 and -0.0. The operational semantics I'd recommend are to desugar, say a float switch, to an int switch on the Float.floatToIntBits mapping of the float case labels. Float.floatToIntBits, as opposed to Float.floatToRawIntBits, normalizes all NaN representations to a single value.

-Joe
-------------- next part --------------
An HTML attachment was scrubbed...
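To make the suggested desugaring concrete, here is a minimal hand-written sketch of a float switch expressed as an int switch on `Float.floatToIntBits`. The class and method names (`FloatSwitchSketch`, `classify`, `sameConstant`) are hypothetical, and the hex case labels are the IEEE 754 single-precision bit patterns written out by hand, since `Float.floatToIntBits(...)` is not a constant expression. This is only an illustration of the idea, not what a compiler would necessarily emit.

```java
// Hypothetical sketch: names are illustrative and do not come from the thread.
final class FloatSwitchSketch {

    // Switch on the int image of the float rather than on the float itself.
    static String classify(float x) {
        return switch (Float.floatToIntBits(x)) {
            case 0x00000000 -> "+0.0";
            case 0x80000000 -> "-0.0";       // distinct label, although 0.0f == -0.0f
            case 0x7fc00000 -> "NaN";        // floatToIntBits maps every NaN to this value
            case 0x7f800000 -> "+Infinity";
            case 0xff800000 -> "-Infinity";
            case 0x3f800000 -> "1.0";
            default -> "ordinary";
        };
    }

    // The comparison a constant pattern would perform under this model.
    // It agrees with Float.equals rather than with ==.
    static boolean sameConstant(float a, float b) {
        return Float.floatToIntBits(a) == Float.floatToIntBits(b);
    }
}
```

Under this reading, a NaN label matches any NaN and the two zeros get separate labels, which is the same answer `Float.equals` gives and the opposite of what `==` gives in both cases.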
URL: From forax at univ-mlv.fr Tue Sep 13 17:31:03 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 13 Sep 2022 19:31:03 +0200 (CEST) Subject: Knocking off two more vestiges of legacy switch In-Reply-To: References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com> <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr> Message-ID: <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr> > From: "Dan Heidinga" > To: "Remi Forax" > Cc: "John Rose" , "Brian Goetz" > , "amber-spec-experts" > > Sent: Tuesday, September 13, 2022 4:13:48 PM > Subject: Re: Knocking off two more vestiges of legacy switch > On Tue, Sep 13, 2022 at 10:08 AM < [ mailto:forax at univ-mlv.fr | > forax at univ-mlv.fr ] > wrote: >>> From: "John Rose" < [ mailto:john.r.rose at oracle.com | john.r.rose at oracle.com ] > >>> To: "Remi Forax" < [ mailto:forax at univ-mlv.fr | forax at univ-mlv.fr ] > >>> Cc: "Brian Goetz" < [ mailto:brian.goetz at oracle.com | brian.goetz at oracle.com ] >>> >, "amber-spec-experts" < [ mailto:amber-spec-experts at openjdk.java.net | >>> amber-spec-experts at openjdk.java.net ] > >>> Sent: Tuesday, September 13, 2022 12:58:47 AM >>> Subject: Re: Knocking off two more vestiges of legacy switch >>> It?s too harsh to say your example shows the semantics are just wrong. >> yes, it's more than there is inconsistencies >>> I think they are right, but possibly incomplete. The exclusion of case 200 is >>> the job of dead code detection logic in the language, the same kind of logic >>> that also reports an error on "foo" instanceof List . >>> Then there are the old murky rules that allow an integral constant like 100 to >>> assign to byte only because 100 fits in the byte range while 200 does not. The >>> duals of those rules will surely speak to the restriction of case 200: matching >>> a byte. 
>> The problem with that approach is that the semantics of constant patterns and
>> the semantics of primitive type patterns will be not aligned,
>> so if you have both pattern in a switch, users will spot the inconsistency.
>> something like
>>   byte b = ...
>>   switch(b) {
>>     case 200 -> ... // does not compile, incompatible types between byte and int
>>     case int i -> ... // ok, compiles
>>   }
> I've been following along on this discussion and I'm not sure what the
> inconsistency here is. Remi, can you clarify?
> As a developer, the semantics here are intuitive - I can't have a (signed) byte
> that matches 200 so as John said earlier, it's clearly dead code. On the other
> hand, bytes can always be converted to an int so it makes sense that the `case
> int i` both compiles and matches to the byte. Can you expand on why users would
> find that confusing?

So my main concern remains that

  String s = ...
  switch(s) {
    case Comparable c -> ...
    case Object o -> ...
  }

and

  long l = ...
  switch(l) {
    case float f -> ...
    case double d -> ...
  }

behave differently.

> --Dan

Rémi

>> So i agree that we should have primitive type patterns but instead of using the
>> casting rules as model, the actual rules complemented with boolean, long, float
>> and double seems a better fit.
>> Compared to what Brian proposed, it means all primitive patterns are
>> unconditional apart unboxing if the pattern is not total (the same way
>> reference type pattern works with null).
>> Rémi
>>> On 12 Sep 2022, at 15:29, Remi Forax wrote:
>>>>> No new rules; just appeal to type patterns.
>>>> It shows that the semantics you propose for the primitive type pattern is not
>>>> the right one.
>>>> Currently, a code like this does not compile
>>>>   byte b = ...
>>>>   switch(b) {
>>>>     case 200 -> ....
>>>>   }
>>>> because 200 is not a short which is great because otherwise at runtime it will
>>>> never be reached.
>>>> But if we apply the rules above + your definition of the primitive pattern, the
>>>> code above will happily compile because it is equivalent to
>>>>   byte b = ...
>>>>   switch(b) {
>>>>     case short s when s == 200 -> ....
>>>>   }
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Tue Sep 13 17:31:38 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 13 Sep 2022 13:31:38 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To:
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com> <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
Message-ID:

>> Which raises some questions (again) of the semantics of constant
>> patterns for exotic floating point values, especially (again)
>> negative zero.
>
>
> In a switching context, I think there is a stronger case for
> distinguishing between +0.0 and -0.0. The operational semantics I'd
> recommend are to desugar, say a float switch, to an int switch on the
> Float.floatToIntBits mapping of the float case labels.
> Float.floatToIntBits, as opposed to Float.floatToRawIntBits,
> normalized all NaN representations to a single value.
>

This sounds right to me, but it's not just about switch -- this would have to be the case for all constant patterns, such as

    if (x instanceof FloatHolder(Float.NaN)) { ... }

But I think your argument still applies here as well.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Tue Sep 13 18:14:52 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 13 Sep 2022 14:14:52 -0400
Subject: Primitives in instanceof and patterns
In-Reply-To: <27437ad6-7a87-580f-a593-9866f1ee8af5@oracle.com>
References: <94d8d589-fa85-3e86-ccab-7969e66f6855@oracle.com> <1873318389.2133206.1662737744430.JavaMail.zimbra@u-pem.fr> <1258625143.2333630.1662800281638.JavaMail.zimbra@u-pem.fr> <376761828.2568222.1662891556420.JavaMail.zimbra@u-pem.fr> <1080674635.3449120.1663021738258.JavaMail.zimbra@u-pem.fr> <27437ad6-7a87-580f-a593-9866f1ee8af5@oracle.com>
Message-ID: <7fce011b-f87d-07d8-6986-379e9755a2a7@oracle.com>

I'm going to try and address these points *for the benefit of everyone else*. (Note to Remi only: this is not an invitation to continue the back and forth, as doing so would likely be unconstructive unless you have something either (a) radically new that no one has thought of yet and/or (b) something that is so obviously right and compelling that I will immediately weep with embarrassment for how wrong I was. That's the bar at this point. I get that you hate this feature. You've made that manifestly clear. But unless you have some significantly new light to shed on it, it is unconstructive to just keep banging this drum, and you are creating an environment where others feel less comfortable sharing their thoughts, which is unacceptable.)

> 1) having a primitive pattern doing a range check is useless because
> this is rare that you want to do a range check + cast in real life,
>    How many people have written a code like this
>
>     int i = ...
>     if (i >= Byte.MIN_VALUE && i <= Byte.MAX_VALUE) {
>       byte b = (byte) i;
>       ...
>     }
>
>    It's useful when you write a bytecode generator without using an
> existing library, ok, but how many write a bytecode generator ?
>    It should not be the default behavior for the primitive type pattern.
This argument stems from a misunderstanding of what we are trying to accomplish here. Yes, it is correct that `case byte b` is not something everyone will use (I have written this many times, though I admit this is probably unusual.) But that's not the point of this exercise; the point of the exercise is uniformity, in part because the lack of uniformity is complexity, and in part because we want to offer new semantic symmetries that programmers can count on. You are trying to tinker at the margins, asking if each conversion carries its weight; that's a recipe for creating new, ad-hoc complexity surface. Sometimes that's the right move, and sometimes it is unavoidable, but there is such an obviously correct interpretation of primitive instanceof here -- "would a cast to this type be safe" -- that it would be an unforced error to opt for the ad-hoc complexity just because you can't imagine using it that often.

If I have a record:

    record R(int x) { }

I can construct it with

    new R(aShort)

but under the strict semantics of primitive type patterns, I cannot deconstruct it with

    case R(short s) { }

which would ask: "could this record have come from a constructor invocation `new R(s)`". And this is gratuitously different than the corresponding case with reference widening:

    record S(Object o) { }
    S s = new S("foo");
    if (s instanceof S(String ss)) { ... }

Further, I take objection to your continued characterization of this as a "range check", as this is a mischaracterization as well as minimizing what is going on. Casting subsumes boxing and unboxing as well as widening and narrowing, so a more correct characterization would be "could I cast this without loss or error to a short". Which applies not only to wider and narrower types, but to types like Short and Object. Just like `instanceof` for reference types, which asks whether the type could be cast to another type. And without creating a new context for what is allowable.
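The question "could I cast this without loss or error to a short" can be approximated by hand in today's Java. The following is a minimal sketch, assuming the record `R(int x)` from the example above; `ExactCastSketch`, `fitsShort` and `describe` are hypothetical names, and the round-trip test stands in for the check that a pattern such as `case R(short s)` would perform under the proposed semantics. It covers only the int-to-short case, not boxing or the other conversions mentioned.

```java
// Hypothetical sketch: only the record R(int x) comes from the message above.
record R(int x) { }

final class ExactCastSketch {

    // "Would a cast to short be safe?": true when narrowing and widening
    // back yields the original value.
    static boolean fitsShort(int i) {
        return i == (short) i;
    }

    // Hand-written stand-in for matching R(short s) first and R(int i) second.
    static String describe(R r) {
        int x = r.x();
        if (fitsShort(x)) {
            short s = (short) x;   // safe: no information is lost
            return "short " + s;
        }
        return "int " + x;
    }
}
```

With this sketch, `describe(new R((short) 42))` takes the `short` branch, mirroring the idea that a record built from a short can be taken apart as one.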
Not only is the term "useless" unconstructive, but it is not even the right measure. The bar here is not "would people use it a lot." We're making the language simpler by making it more uniform. To say "let's gratuitously knock some of the boxes out of the cast matrix because I can't imagine using them" only makes the language more complicated.

> 2) It's also useless because there is no need to have it as a pattern,
> when you can use a cast in the following expression
>     Person person = ...
>     switch(person) {
>       // instead of
>       // case Person(double age) -> foo(age);
>       // one can write
>       case Person(int age) -> foo(age);  // widening cast
>     }

Same argument (also you got your example backwards). I get that you think it's fine to have to do this, but it is yet another gratuitous asymmetry between aggregation and destructuring that confuses people about how destructuring works. Why can you pass an int or a double to `new Person`, but could only take a `double` out? Whereas with Object/String, you could take either out? Again, this is gratuitous complexity, which I think is rooted in your unwillingness to let go of "instanceof means subtype." Sorry, it doesn't any more (but it means something that generalizes it.)

> 3) when you read a conditional primitive patterns, you have no idea
> what is the underlying operation until you go to the declaration
> (unlike the code just above).

This is the same complaint you had in the past about partial and total nested patterns. As I've said, I understand why you find it uncomfortable ("action at a distance"), but we evaluated the pros and cons extensively already, and we made our decision. There's no reason to reopen it here, nor are the considerations any different in this case.

> 4) if we change the type pattern to be not just about subtyping, we
> should revisit the JLS to avoid to have too many different semantics.
This is FUD, implying that we are going to have to reexamine everything. I don't buy it. Many of the things that lean on subtyping today are just ... subtyping. And the things that have conversions involving primitives already lean on conversions and contexts.

By way of concrete example, you raised the question about covariant overrides. Which was a good example, and which I appreciate, but I wish you would have raised it differently.

A constructive way to raise this would be: "Do we also want to reexamine covariant overrides to use castability (or some other criteria) rather than subtyping?"

An unconstructive way to raise this would be: "This feature is bad, look at the problems you are creating for covariant overrides, everything will have to be reexamined."
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From heidinga at redhat.com Tue Sep 13 18:42:23 2022
From: heidinga at redhat.com (Dan Heidinga)
Date: Tue, 13 Sep 2022 14:42:23 -0400
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To: <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr>
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com> <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr> <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr>
Message-ID:

>
> So my main concern stay that
>   String s = ...
>   switch(s) {
>     case Comparable c -> ... //Dan: matches here as String implements
> Comparable (this case is total on "s" so no further matching)
>     case Object o -> ...
>   }
>
> and
>   long l = ...
>   switch(l) {
>     case float f -> ... //Dan: matches here if l is convertible to a float
>     case double d -> ... //Dan: otherwise matches here
>   }
>
> behave differently.
>
>

In each case, we're finding the switch case that the value is compatible with. Another way to say it is the value is convertible to... or castable to.
Can you expand on what you mean by "behave differently"? I'm still working on reading through the "big picture" presentation in [0] so if there's a particular section there that you think is relevant, I can re-read that first. It might be useful for both of us to re-read it and see how this example fits with the bigger picture being proposed for pattern matching. --Dan [0] https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Tue Sep 13 19:15:38 2022 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Tue, 13 Sep 2022 21:15:38 +0200 (CEST) Subject: Knocking off two more vestiges of legacy switch In-Reply-To: References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <0BEAAF86-ED63-4247-9E29-CBABCF4C091E@oracle.com> <842449722.4053672.1663077977558.JavaMail.zimbra@u-pem.fr> <1054001531.4176794.1663090263206.JavaMail.zimbra@u-pem.fr> Message-ID: <460160711.4208276.1663096538708.JavaMail.zimbra@u-pem.fr> > From: "Dan Heidinga" > To: "Remi Forax" > Cc: "John Rose" , "Brian Goetz" > , "amber-spec-experts" > > Sent: Tuesday, September 13, 2022 8:42:23 PM > Subject: Re: Knocking off two more vestiges of legacy switch > >> So my main concern stay that >> String s = ... >> switch(s) { >> case Comparable c -> ... //Dan: matches here as String implements Comparable >> (this case is total on "s" so no further matching) >> case Object o -> ... >> } >> and >> long l = ... >> switch(l) { >> case float f -> ... //Dan: matches here if l is convertable to a float >> case double d -> ... //Dan: otherwise matches here >> } >> behave differently. > In each case, we're finding the switch case that the value is compatible with. > Another way to say it is the value is convertable to... or castable to. Can you > expand on what you mean by "behave differently"? 
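One way to read the annotated `long` switch quoted above is as a round-trip test followed by a fallback. The sketch below is a hand-written approximation under that reading; `LongToFloatSketch`, `roundTrips` and `match` are hypothetical names, and the fallback to `double` simply follows the "otherwise matches here" annotation. Note that the round trip is only an approximation of "converts without loss": for `Long.MAX_VALUE` the cast back from float saturates, so the test succeeds even though the float value is not equal to the original long.

```java
// Hypothetical sketch of the quoted switch over a long; names are illustrative.
final class LongToFloatSketch {

    // Approximates "l is convertible to float": widen, narrow back, compare.
    static boolean roundTrips(long l) {
        return l == (long) (float) l;
    }

    // First branch is conditional (partial); second branch takes everything else.
    static String match(long l) {
        if (roundTrips(l)) {
            float f = l;            // implicit widening primitive conversion
            return "float " + f;
        }
        double d = l;
        return "double " + d;
    }
}
```

The `String` switch in the same quote has no such conditional first branch, because `case Comparable c` already matches every `String`.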
In the first example, both type patterns are total so it does not compile because both patterns will match all Strings.

In the second example, if we follow the semantics proposed by Brian, the first pattern is partial and is equivalent to

  if (l == (long) (float) l) {
    float f = l;
    ...
  }

and the second pattern is total.

> I'm still working on reading through the "big picture" presentation in [0] so if
> there's a particular section there that you think is relevant, I can re-read
> that first. It might be useful for both of us to re-read it and see how this
> example fits with the bigger picture being proposed for pattern matching.

This document gives you a nice overview of the problems but some parts are outdated; the following spec corresponds to the semantics for Java 19
https://cr.openjdk.java.net/~gbierman/jep427+405/jep427+405-20220601/specs/patterns-switch-record-patterns-jls.html#jls-15.28

The proposed semantics of the primitive pattern is described here
https://mail.openjdk.org/pipermail/amber-spec-experts/2022-September/003497.html
and here
https://mail.openjdk.org/pipermail/amber-spec-experts/2022-September/003499.html

> --Dan
> [0] https://openjdk.org/projects/amber/design-notes/patterns/pattern-match-object-model

Rémi
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guy.steele at oracle.com Wed Sep 14 04:05:45 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 14 Sep 2022 04:05:45 +0000
Subject: Knocking off two more vestiges of legacy switch
In-Reply-To:
References: <2c76f0e7-ab76-5e18-bde4-407cca21d3a8@oracle.com> <2144691515.3449122.1663021743952.JavaMail.zimbra@u-pem.fr> <5a9da89d-a9a2-7fb0-0b64-02c9a0e7f548@oracle.com> <5bf2de36-4e56-cb68-2fe3-972d4cadebb8@oracle.com>
Message-ID:

+1 on this suggestion.
I believe it is the only approach that could make switch on floats at all useful, and it would be very useful, as Joe says, for expressing special cases in math libraries clearly. ?Guy On Sep 13, 2022, at 1:07 PM, Joe Darcy > wrote: On 9/13/2022 9:55 AM, Brian Goetz wrote: It is common for math library methods to have a preamble to screen out special values (infinities, NaN, 0.0, 1.0, etc.). This would be a reasonable use of a switch on float/double switch. Which raises some questions (again) of the semantics of constant patterns for exotic floating point values, especially (again) negative zero. In a switching context, I think there is a stronger case for distinguishing between +0.0 and -0.0. The operational semantics I'd recommend are to desugar, say a float switch, to an int switch on the Float.floatToIntBits mapping of the float case labels. Float.floatToIntBits, as opposed to Float.floatToRawIntBits, normalized all NaN representations to a single value. -Joe -------------- next part -------------- An HTML attachment was scrubbed... URL: From amaembo at gmail.com Sat Sep 17 18:06:09 2022 From: amaembo at gmail.com (Tagir Valeev) Date: Sat, 17 Sep 2022 20:06:09 +0200 Subject: [enhanced-switches] My experience of converting old switches to new ones Message-ID: Hello! Our codebase was updated recently from Java 11 to Java 17 level, and we started gradually using new Java features. Recently, I converted in a semi-automated manner ~1000 of the old switches to new ones (either statements or expressions). Here's my thoughts about this. Probably somebody will find them interesting. 1. Knowing that the switch never falls through is really relieving. So if you see an arrow after the first case, you immediately know that this is a 'simple switch' (not doing fallthrough). If you see colon, you start looking more precisely: probably something fancy is done in this switch (otherwise, it's likely that automated refactoring would be suggested to use an arrow). 
So the arrow basically separates simple switches and complex ones.

2. I really have a desire to use expression switches, even when it requires some code repetition. E.g., a common pattern:

if (cond) {
  switch(val) {
    case A: return "a";
    case B: return "b";
  }
}
return "default";

I tend to convert it to

if (cond) {
  return switch(val) {
    case A -> "a";
    case B -> "b";
    default -> "default";
  };
}
return "default";

There's a cost of repeating the "default" expression. However, we also have a benefit. Now, we know that under the condition we always return no matter what. Unfortunately, sometimes the default expression can be non-trivial (e.g., return super.blahblah(all, my, parameters, passed)). In this case, I'm reluctant to duplicate it many times. It's quite possible that in fact this is an impossible case, and it's written there just because something should be written. However, this requires a deeper understanding of the original code.

3. It's also sad that signaling about impossible cases is quite long. `default -> assert false;` is not accepted for obvious reasons. Writing every time `default -> throw new IllegalStateException("Unexpected value of "+selectorValue);` is very verbose and distracts from the actual code. Probably some syntactic sugar to assert that we covered all possible values in a non-exhaustive switch would be nice (like `default impossible;` or whatever). E.g., I observed the following (arguably strange) pattern:

switch((cond1() ? 0 : 1) + (cond2() ? 0 : 2)) {
  case 0 -> ...
  case 1 -> ...
  case 2 -> ...
  case 3 -> ...
  default -> throw new AssertionError("cannot reach here");
}

4. I really enjoyed exhaustive switch expressions over enums. I removed probably a hundred redundant default branches. In switch statements, people do different things when they are forced to write default, even though they covered all the enum values:
a. throw something (throw new IllegalStateException(), throw new AssertionError(), etc.)
b. return something simple (return "", return null, etc.)
c. assert+return something simple: assert false; return null;
d. questionable: join return branch with the last case (case LAST:default: ...)
e. more dangerous: omit the explicit last case and use default instead of it, even though it's clear that default actually handles the unmentioned case.

Luckily, none of these is necessary anymore if you can use switch expressions. I even forcibly push down switch expressions inside something (e.g., a call), just to be able to use it. E.g.:

switch(MY_ENUM) {
  case A -> setSomething("a");
  case B -> setSomething("b");
  case C -> setSomething("c");
  default -> throw new IllegalStateException("impossible; all values are covered");
}

Can be nicely converted to

setSomething(switch(MY_ENUM) {
  case A -> "a";
  case B -> "b";
  case C -> "c";
});

Unfortunately, this is not always the case. Sometimes, you cannot use a switch expression at all, and in this case, the inability to specify exhaustiveness is really annoying. We need total switch statements.

5. At first, I thought that switch expressions are best for return values, assignment rvalues and variable declaration initializers, but in other contexts they are too verbose and may make things more complex than necessary. However, I started liking using them as the last argument of the call. E.g., before:

switch(x) {
  case "a":return wrap(getA());
  case "b":return wrap(getB());
  case "c":return wrap(getC());
  default:throw new IllegalArgumentException();
}

after:

return wrap(switch(x) {
  case "a" -> getA();
  case "b" -> getB();
  case "c" -> getC();
  default -> throw new IllegalArgumentException();
});

If you don't have tail arguments after switch, then you don't lose the context, and you immediately know that every non-exceptional return value is wrapped. It's also possible to extract such a switch into a separate local variable, but even without extraction it reads nicely.

It's also ok to use switch expressions inside other switch expressions.
Especially useful in double-dispatch enum methods (e.g., some kind of lattice operations):

enum Item {
  BOTTOM, A, B, AB, TOP;

  Item join(Item other) {
    return switch(this) {
      case TOP -> this;
      case BOTTOM -> other;
      case A -> switch(other) {
        case A, AB, TOP -> other;
        case B -> AB;
        case BOTTOM -> this;
      };
      case B -> switch(other) {
        case B, AB, TOP -> other;
        case A -> AB;
        case BOTTOM -> this;
      };
      case AB -> switch(other) {
        case TOP -> other;
        case A, B, AB, BOTTOM -> this;
      };
    };
  }
}

Reads much better than tons of returns before. Also, thanks to exhaustiveness checks, you know that every single case is covered.

6. I started to like yield. In some cases, only a couple of branches of a long switch that returns from every branch have some complex intermediate computations or conditional branches. In this case, it's still better to convert it to a switch expression, and replace some returns with yields. And even if every single branch is complex, using switch expression + yield may make code clearer. E.g., it may clearly show that the purpose of the whole switch is to assign a value to the same variable, though computation of the variable value in every branch could be complex. Also, it can be implicitly assumed that even complex switch expressions with yields don't produce side-effects. Of course, this is not controlled by a compiler but it would be a bad practice to produce them, so there could be an agreement within the team. In this case, reading the code is simplified a lot. If you see `var something = switch(...) {...}`, you immediately know that regardless of the switch complexity, we just calculate the value for `something`, so we can skip the whole thing if we are not interested in details. If you see a switch statement, you are less sure whether every single branch does only this.

7. I really miss `case null`. I saw many switches these days, during my conversion quest.
And it happens quite often in our codebase that the null case is handled separately before the switch (often the same as 'default', but sometimes not). In the IntelliJ codebase, we really use nulls extensively, even though some people may think that it's a bad idea. It's good that we will have `case null` in the future.

8. I also miss `case default`. It's strange, but I often see old switches where `default:` is joined with other cases. Probably more often with strings, less often with enums. Something like:

switch(valueFromConfig) {
  case "increase": increase(); break;
  case "decrease": decrease(); break;
  case "enable": enable(); break;
  case "disable": // "disable" is a documented value and we explicitly process it
  default: // something unknown, but we still want to fallback to default value which is "disable"
    disable(); break;
}

With Java 17 enhanced switches, we should either delete `case "disable"`, or duplicate the branch. If we delete it, it will not be so clear anymore that this value is specifically processed as an "official" value. In the future, I could use `case "disable", default -> disable();` which would solve the issue.

9. Some old switches are actually shorter than new ones, and I'm not sure about conversion. Usually, it's like this:

if (condition) {
  switch(value) {
    case 1: return "a";
    case 2: return "b";
    case 3: return "c";
    // no default case, execution continues
  }
}
... a lot of common code for when `condition` is false or `value` is not listed in cases ...

Here it's hard to use a switch expression, and the enhanced switch statement only becomes longer and cluttered with syntax:

switch(value) {
  case 1 -> { return "a"; }
  case 2 -> { return "b"; }
  case 3 -> { return "c"; }
}

Well, it's possible to refactor to something like

String result = !condition ? null : switch(value) {
  case 1 -> "a";
  case 2 -> "b";
  case 3 -> "c";
  default -> null;
};
if (result != null) return result;
... a lot of common code for when `condition` is false or `value` is not listed in cases ...
But it's questionable whether this makes the code more readable.

10. Sometimes, one or a few enum values are peeled off in advance. In this case, nice conversion becomes problematic. E.g.:

enum Mode {IGNORE, A, B, C}

void updateMode(Mode mode) {
  if (mode == Mode.IGNORE) return;
  System.out.println("Processing...");
  switch(mode) {
    case A -> process("a");
    case B -> process("b");
    case C -> process("c");
  }
}

It's almost convertible to a switch expression. However, the switch is non-exhaustive, and you cannot get exhaustiveness benefits. It's possible to add a throwing branch, though it's also long and verbose:

void updateMode(Mode mode) {
  if (mode == Mode.IGNORE) return;
  System.out.println("Processing...");
  process(switch(mode) {
    case A -> "a";
    case B -> "b";
    case C -> "c";
    case IGNORE -> throw new AssertionError("impossible; handled before"); // hooray, exhaustive now!
  });
}

Of course, it would be too much for javac to analyze code to this extent and allow skipping the IGNORE branch, as it was checked before (the IntelliJ analyzer knows this). However, it's still sad. This somehow corresponds to item 3. Probably a short syntax for impossible branches would be nice.

Thank you for reading my very long email.

With best regards,
Tagir Valeev.

From brian.goetz at oracle.com Sun Sep 18 13:21:56 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sun, 18 Sep 2022 09:21:56 -0400
Subject: [enhanced-switches] My experience of converting old switches to new ones
In-Reply-To:
References:
Message-ID:

Thanks for the extensive feedback!

> 3. It's also sad that signaling about impossible cases is quite long.
> `default -> assert false;` is not accepted for obvious reasons.

Java lacks suitable abstraction over effects, so we cannot use our regular abstraction tools for simplifying a throw -- you have to do it all inline, unfortunately. We have talked about various sugary things here, such as:

    default -> unreachable;

or
    default -> throw;

but could never get all that excited about it; it's not that powerful, and invariably someone will want to customize the exception. You could have a simple library method:

    AssertionError unreachable() {
        return new AssertionError("got lost in the weeds");
    }

    default -> throw unreachable();

which seems better than a language feature, though you end up with some "junk" frames on the stack trace. If that point is really unreachable, that won't matter.

But as you say, really you'll want to provide some context about the data that brought you to this point. Which suggests you want something that is part of switch, so it can at least reproduce the selector. I kind of like your idea about a case that says "impossible", as it is tied to the switch and can carry the selector value, so it can give you a better error. (Ideally, something that the existing synthetic defaults could be shorthand for.)

> We need total switch statements.

Is this different from the "default impossible" above?

> 5. At first, I thought that switch expressions are best for return
> values, assignment rvalues and variable declaration initializers, but
> in other contexts they are too verbose and may make things more
> complex than necessary. However, I started liking using them as the
> last argument of the call.

Not unlike lambdas. A sensible style has emerged in many libraries that encourage a single lambda argument at the end, for this same reason.

> 7. I really miss `case null`. I saw many switches these days, during
> my conversion quest. And it happens quite often in our codebase that
> the null case is handled separately before the switch (often the same
> as 'default', but sometimes not). In the IntelliJ codebase, we really
> use nulls extensively, even though some people may think that it's a
> bad idea. It's good that we will have `case null` in future.

Hopefully near future!

> 9.
> Some old switches are actually shorter than new ones, and I'm not
> sure about conversion.

There's nothing wrong with old switches when you need complex control flow. Even if we made all switches exhaustive, a `default: break` would suffice here. I think there's no need to use the new thing here; you want some weird control flow, old switches are good for that.

> Usually, it's like this:
>
> if (condition) {
>   switch(value) {
>     case 1: return "a";
>     case 2: return "b";
>     case 3: return "c";
>     // no default case, execution continues
>   }

From amaembo at gmail.com Mon Sep 19 07:22:34 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Mon, 19 Sep 2022 09:22:34 +0200
Subject: [enhanced-switches] My experience of converting old switches to new ones
In-Reply-To:
References:
Message-ID:

Hello!

> > We need total switch statements.
>
> Is this different from the "default impossible" above?

Yes. I mean, currently we cannot have exhaustiveness checks on enum switch statements that produce a compilation error when a new enum constant is added. We have this for switch expressions and for sealed classes, but not for switch statements over enums.

With best regards,
Tagir Valeev.

From forax at univ-mlv.fr Mon Sep 19 08:07:04 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 19 Sep 2022 10:07:04 +0200 (CEST)
Subject: [enhanced-switches] My experience of converting old switches to new ones
In-Reply-To:
References:
Message-ID: <391672854.8279091.1663574824680.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Tagir Valeev"
> To: "Brian Goetz"
> Cc: "amber-spec-experts"
> Sent: Monday, September 19, 2022 9:22:34 AM
> Subject: Re: [enhanced-switches] My experience of converting old switches to new ones

> Hello!
>
>> > We need total switch statements.
>>
>> Is this different from the "default impossible" above?
>
> Yes.
> I mean, currently we cannot have exhaustiveness checks on enum
> switch statements having a compilation error when a new enum constant
> is added. We have this for switch expressions and for sealed classes,
> but not for switch statements over enums.

enum Foo { A, B }

switch(foo) {
  case null -> throw null;
  case A -> ...
  case B -> ...
}

is exhaustive.

> With best regards,
> Tagir Valeev.

Rémi

From amaembo at gmail.com Wed Sep 21 12:22:20 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Wed, 21 Sep 2022 14:22:20 +0200
Subject: [string-templates] Processors with side effects
Message-ID:

Hello!

I was thinking about how Java beginners may benefit from string templates. Some teaching materials rely on System.out.printf to produce formatted output, like

System.out.printf("Hello %s!", user);

With the string templates proposal, we can use

System.out.println(FMT."Hello %s\{user}!");

This is not very exciting. But I realised that PrintStream may implement TemplateProcessor by itself (returning Void or whatever), and print directly:

System.out."Hello %s\{user}!";

Will such use cases be encouraged, or should this be considered a misuse of the feature?

Well, for side effects we may still want to specify formatting options, like whether it should be a concatenation or formatting, and with which locale, so probably it would be better to have an intermediate method (or even field!) that returns a TemplateProcessor:

System.out.printstr."Hello \{user}!";
System.out.printfmt."Hello %s\{user}!";
System.out.printfmt(myLocale)."Hello %s\{user}!";

That said, in the "Safely composing and executing database queries" section of JEP 430, it's assumed that the DB object always produces a ResultSet. However, in PreparedStatement there's also executeUpdate() (returning int) and execute() (returning boolean) which might sometimes be more appropriate.
So a level of indirection between connection and template processor is probably necessary:

ResultSet resultSet = conn.query()."SELECT \{col} FROM \{table}";
int count = conn.update()."UPDATE \{table} SET \{col} = \{value}";

With best regards,
Tagir Valeev.

From james.laskey at oracle.com Wed Sep 21 12:54:36 2022
From: james.laskey at oracle.com (Jim Laskey)
Date: Wed, 21 Sep 2022 12:54:36 +0000
Subject: [string-templates] Processors with side effects
In-Reply-To:
References:
Message-ID: <9D921D4D-DA01-4C63-A355-34BE372591A4@oracle.com>

> On Sep 21, 2022, at 9:22 AM, Tagir Valeev wrote:
>
> Hello!
>
> I was thinking about how Java beginners may benefit from string
> templates. Some teaching materials rely on System.out.printf to
> produce formatted output, like
>
> System.out.printf("Hello %s!", user);
>
> With string templates proposal, we can use
>
> System.out.println(FMT."Hello %s\{user}!");
>
> This is not very exciting. But I realised that PrintStream may
> implement TemplateProcessor by itself (returning Void or whatever),
> and print directly:
>
> System.out."Hello %s\{user}!";
>
> Will such use cases be encouraged, or this should be considered as
> misuse of the feature?

Misuse will come whether we like it or not. The plan is to have a "User Guide to String Templates" influence developers toward safe and reasonable usage.

> Well, for side effect we may still want to specify formatting options,
> like whether it should be a concatenation, or formatting, and with
> which locale, so probably it would be better to have an intermediate
> method (or even field!) that returns a TemplateProcessor:
>
> System.out.printstr."Hello \{user}!";
> System.out.printfmt."Hello %s\{user}!";
> System.out.printfmt(myLocale)."Hello %s\{user}!";

One flavour you didn't propose was

OUT."Hello \{user}!";

Or

OUT."Hello %s\{user}!";

Similar to "import static java.lang.System.out" used by some developers.
No doubt, there will be significant spin-off and discussion from the JEP's proposal.

> That said, in "Safely composing and executing database queries"
> section of JEP 430, it's assumed that the DB object always produces a
> ResultSet. However, in PreparedStatement there's also executeUpdate()
> (returning int) and execute() (returning boolean) which might be
> sometimes more appropriate. So a level of indirection between
> connection and template processor is probably necessary:
>
> ResultSet resultSet = conn.query()."SELECT \{col} FROM \{table}";
> int count = conn.update()."UPDATE \{table} SET \{col} = \{value}";

Since the JEP was originally written we've done some more research about what SQL template processors might look like. More expert consultation will take place but the current leaning is toward producing PreparedStatements. So the code will be more like:

PreparedStatement stmt = conn."SELECT \{col} FROM \{table}";
ResultSet rs = stmt.executeQuery();

Or just

ResultSet rs = conn."SELECT \{col} FROM \{table}".executeQuery();

The advantage here, besides the validation, is that the statement will only be compiled and optimized (using meta data) once per callsite/connection and reused with different values per iteration.

As stated, we will be gathering more direction from the DB community (think separate JEP.)

Cheers,

-- Jim

> With best regards,
> Tagir Valeev.

From brian.goetz at oracle.com Wed Sep 21 13:03:02 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 21 Sep 2022 09:03:02 -0400
Subject: [string-templates] Processors with side effects
In-Reply-To:
References:
Message-ID: <7c8f8e37-1c6c-2064-22c2-528d74a7639f@oracle.com>

> Hello!
>
> I was thinking about how Java beginners may benefit from string
> templates.
> Some teaching materials rely on System.out.printf to
> produce formatted output, like
>
> System.out.printf("Hello %s!", user);
>
> With string templates proposal, we can use
>
> System.out.println(FMT."Hello %s\{user}!");
>
> This is not very exciting. But I realised that PrintStream may
> implement TemplateProcessor by itself (returning Void or whatever),
> and print directly:
>
> System.out."Hello %s\{user}!";
>
> Will such use cases be encouraged, or this should be considered as
> misuse of the feature?

We anticipated that some libraries may want to implement TemplateProcessor in order to do this. However, whether we do so for PrintStream will require thought. The former version may not be exciting, but it is clear, and users will have no question what is going on. The latter represents an opinionated "this is the formatter we will use for the next century", and is a choice to be taken carefully. One of the great things about the current release cadence (plus preview mechanism) is that it is not necessary any more to do everything at once; we can let things sit for a while and see how they settle.

From brian.goetz at oracle.com Wed Sep 28 17:57:19 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 13:57:19 -0400
Subject: Paving the on-ramp
Message-ID: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>

At various points, we've explored the question of which program elements are most and least helpful for students first learning Java. After considering a number of alternatives over the years, I have a simple proposal for smoothing the "on ramp" to Java programming, while not creating new things to unlearn.

Markdown source is below, HTML will appear soon at:

https://openjdk.org/projects/amber/design-notes/on-ramp

# Paving the on-ramp

Java is one of the most widely taught programming languages in the world.
Tens of thousands of educators find that the imperative core of the language combined with a straightforward standard library is a foundation that students can comfortably learn on. Choosing Java gives educators many degrees of freedom: they can situate students in `jshell` or Notepad or a full-fledged IDE; they can teach imperative, object-oriented, functional, or hybrid programming styles; and they can easily find libraries to interact with external data and services.

No language is perfect, and one of the most common complaints about Java is that it is "too verbose" or has "too much ceremony." And unfortunately, Java imposes its heaviest ceremony on those first learning the language, who need and appreciate it the least. The declaration of a class and the incantation of `public static void main` is pure mystery to a beginning programmer. While these incantations have principled origins and serve a useful organizing purpose in larger programs, they have the effect of placing obstacles in the path of _becoming_ Java programmers. Educators constantly remind us of the litany of complexity that students have to confront on Day 1 of class -- when they really just want to write their first program.

As an amusing demonstration of this, in her JavaOne keynote appearance in 2019, [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when she learned to program in Java, and how her teacher performed a rap song to help students memorize `"public static void main"`. Our hats are off to creative educators everywhere for this kind of dedication, but teachers shouldn't have to do this.

Of course, advanced programmers complain about ceremony too. We will never be able to satisfy programmers' insatiable appetite for typing fewer keystrokes, and we shouldn't try, because the goal of programming is to write programs that are easy to read and are clearly correct, not programs that were easy to type.
But we can try to make the ceremony commensurate with the value it brings to a program -- and let simple programs be expressed more simply.

## Concept overload

The classic "Hello World" program looks like this in Java:

```
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
```

It may only be five lines, but those lines are packed with concepts that are challenging to absorb without already having some programming experience and familiarity with object orientation. Let's break down the concepts a student confronts when writing their first Java program:

  - **public** (on the class). The `public` accessibility level is relevant only when there is going to be cross-package access; in a simple "Hello World" program, there is only one class, which lives in the unnamed package. They haven't even written a one-line program yet; the notion of access control -- keeping parts of a program from accessing other parts of it -- is still way in their future.
  - **class**. Our student hasn't set out to write a _class_, or model a complex system with objects; they want to write a _program_. In Java, a program is just a `main` method in some class, but at this point our student still has no idea what a class is or why they want one.
  - **Methods**. Methods are of course a key concept in Java, but the mechanics of methods -- parameters, return types, and invocation -- are still unfamiliar, and the `main` method is invoked magically from the `java` launcher rather than from explicit code.
  - **public** (again). Like the class, the `main` method has to be public, but again this is only relevant when programs are large enough to require packages to organize them.
  - **static**. The `main` method has to be static, and at this point, students have no context for understanding what a static method is or why they want one.
    Worse, the early exposure to `static` methods will turn out to be a bad habit that must be later unlearned. Worse still, the fact that the `main` method is `static` creates a seam between `main` and other methods; either they must become `static` too, or the `main` method must trampoline to some sort of "instance main" (more ceremony!) And if we get this wrong, we get the dreaded and mystifying `"cannot be referenced from a static context"` error.
  - **main**. The name `main` has special meaning in a Java program, indicating the starting point of a program, but this specialness hides behind being an ordinary method name. This may contribute to the sense of "so many magic incantations."
  - **String[]**. The parameter to `main` is an array of strings, which are the arguments that the `java` launcher collected from the command line. But our first program -- likely our first dozen -- will not use command-line parameters. Requiring the `String[]` parameter is, at this point, a mistake waiting to happen, and it will be a long time until this parameter makes sense. Worse, educators may be tempted to explain arrays at this point, which further increases the time-to-first-program.
  - **System.out.println**. If you look closely at this incantation, each element in the chain is a different thing -- `System` is a class (what's a class again?), `out` is a static field (what's a field?), and `println` is an instance method. The only part the student cares about right now is `println`; the rest of it is an incantation that they do not yet understand in order to get at the behavior they want.

That's a lot to explain to a student on the first day of class. There's a good chance that by now, class is over and we haven't written any programs yet, or the teacher has said "don't worry what this means, you'll understand it later" six or eight times.
Not only is this a lot of _syntactic_ things to absorb, but each of those things appeals to a different concept (class, method, package, return value, parameter, array, static, public, etc) that the student doesn't have a framework for understanding yet. Each of these will have an important role to play in larger programs, but so far, they only contribute to "wow, programming is complicated."

It won't be practical (or even desirable) to get _all_ of these concepts out of the student's face on day 1, but we can do a lot -- and focus on the ones that do the most to help beginners understand how programs are constructed.

## Goal: a smooth on-ramp

As much as programmers like to rant about ceremony, the real goal here is not mere ceremony reduction, but providing a graceful _on ramp_ to Java programming. This on-ramp should be helpful to beginning programmers by requiring only those concepts that a simple program needs.

Not only should an on-ramp have a gradual slope and offer enough acceleration distance to get onto the highway at the right speed, but its direction must align with that of the highway. When a programmer is ready to learn about more advanced concepts, they should not have to discard what they've already learned, but instead easily see how the simple programs they've already written generalize to more complicated ones, and both the syntactic and conceptual transformation from "simple" to "full blown" program should be straightforward and unintrusive. It is a definite non-goal to create a "simplified dialect of Java for students".
We identify three simplifications that should aid both educators and students in navigating the on-ramp to Java, as well as being generally useful to simple programs beyond the classroom:

 - A more tolerant launch protocol
 - Unnamed classes
 - Predefined static imports for the most critical methods and fields

## A more tolerant launch protocol

The Java Language Specification has relatively little to say about how Java "programs" get launched, other than saying that there is some way to indicate which class is the initial class of a program (JLS 12.1.1) and that a public static method called `main` whose sole argument is of type `String[]` and whose return is `void` constitutes the entry point of the indicated class.

We can eliminate much of the concept overload simply by relaxing the interactions between a Java program and the `java` launcher:

 - Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package.
 - Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments.
 - Make the `static` modifier on `main` optional, by allowing the `java` launcher to invoke an instance `main` method (of either signature) by creating an instance using an accessible no-arg constructor and then invoking the `main` method on it.
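
As a rough illustration of the lookup order in the bullets above, the relaxed protocol could be approximated with reflection. This is only a sketch under stated assumptions, not how the `java` launcher is implemented: `LaunchSketch`, `Traditional`, and `Relaxed` are hypothetical names, searching superclasses (and trying `main(String[])` across the whole hierarchy before `main()`) is an assumed interpretation, and checks such as the `void` return type are omitted.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper that mimics the relaxed lookup; not the real launcher.
final class LaunchSketch {

    // Finds a method named "main" with the given parameter types,
    // walking up the superclass chain (an assumption of this sketch).
    private static Method lookup(Class<?> type, Class<?>... params) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod("main", params);
            } catch (NoSuchMethodException e) {
                // keep looking in the superclass
            }
        }
        return null;
    }

    // Runs the entry point and reports which flavor of main was used.
    static String launch(Class<?> type, String[] args) {
        Method main = lookup(type, String[].class);   // traditional signature first
        if (main == null) {
            main = lookup(type);                      // then the no-argument form
        }
        if (main == null) {
            throw new IllegalStateException("no main method in " + type.getName());
        }
        boolean isStatic = Modifier.isStatic(main.getModifiers());
        boolean takesArgs = main.getParameterCount() == 1;
        try {
            main.setAccessible(true);                 // main need not be public
            Object receiver = null;
            if (!isStatic) {
                // Instance main: create the receiver with the no-arg constructor.
                Constructor<?> ctor = type.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }
            if (takesArgs) {
                main.invoke(receiver, (Object) args);
            } else {
                main.invoke(receiver);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not launch " + type.getName(), e);
        }
        return (isStatic ? "static" : "instance")
                + (takesArgs ? " main(String[])" : " main()");
    }

    public static void main(String[] args) {
        System.out.println(launch(Traditional.class, args));
        System.out.println(launch(Relaxed.class, args));
    }
}

// Today's fully ceremonial entry point.
class Traditional {
    public static void main(String[] args) {
        System.out.println("traditional");
    }
}

// Non-public, non-static, no parameters.
class Relaxed {
    void main() {
        System.out.println("relaxed");
    }
}
```

The sketch returns a short description of the entry point it chose purely so the selection is easy to observe; a real launcher would simply run the program.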
This small set of changes to the launch protocol strikes out five of the bullet points in the above list of concepts: public (twice), static, method parameters, and `String[]`.

At this point, our Hello World program is now:

```
class HelloWorld {
    void main() {
        System.out.println("Hello World");
    }
}
```

It's not any shorter by line count, but we've removed a lot of "horizontal noise" along with a number of concepts. Students and educators will appreciate it, but advanced programmers are unlikely to be in any hurry to make these implicit elements explicit either.

Additionally, the notion of an "instance main" has value well beyond the first day. Because excessive use of `static` is considered a code smell, many educators encourage the pattern of "all the static `main` method does is instantiate an instance and call an instance `main` method" anyway. Formalizing the "instance main" protocol reduces a layer of boilerplate in these cases, and defers the point at which we have to explain what instance creation is -- and what `static` is. (Further, allowing the `main` method to be an instance method means that it could be inherited from a superclass, which is useful for simple frameworks such as test runners or service frameworks.)

## Unnamed classes

In a simple program, the `class` declaration often doesn't help either, because other classes (if there are any) are not going to reference it by name, and we don't extend a superclass or implement any interfaces. If we say an "unnamed class" consists of member declarations without a class header, then our Hello World program becomes:

```
void main() {
    System.out.println("Hello World");
}
```

Such source files can still have fields, methods, and even nested classes, so that as a program evolves from a few statements to needing some ancillary state or helper methods, these can be factored out of the `main` method while still not yet requiring a full class declaration:

```
String greeting() { return "Hello World"; }

void main() {
    System.out.println(greeting());
}
```

This is where treating `main` as an instance method really shines; the user has just declared two methods, and they can freely call each other. Students need not confront the confusing distinction between instance and static methods yet; indeed, if not forced to confront static members on day 1, it might be a while before they do have to learn this distinction. The fact that there is a receiver lurking in the background will come in handy later, but right now is not bothering anybody.

[JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be launched directly without compilation; this streamlined launcher pairs well with unnamed classes.

## Predefined static imports

The most important classes, such as `String` and `Integer`, live in the `java.lang` package, which is automatically on-demand imported into all compilation units; this is why we do not have to `import java.lang.String` in every class. Static imports were not added until Java 5, but no corresponding facility for automatic on-demand import of common behavior was added at that time. Most programs, however, will want to do console IO, and Java forces us to do this in a roundabout way -- through the static `System.out` and `System.in` fields. Basic console input and output is a reasonable candidate for auto-static import, as one or both are needed by most simple programs.
While these are currently instance methods accessed through static fields, we can easily create static methods for `println` and `readln` which are suitable for static import, and automatically import them. At which point our first program is now down to:

```
void main() {
    println("Hello World");
}
```

## Putting this all together

We've discussed several simplifications:

 - Update the launcher protocol to make public, static, and arguments optional for main methods, and for main methods to be instance methods (when a no-argument constructor is available);
 - Make the class wrapper for "main classes" optional (unnamed classes);
 - Automatically static import methods like `println`

which together whittle our long list of day-1 concepts down considerably. While this is still not as minimal as the minimal Python or Ruby program -- statements must still live in a method -- the goal here is not to win at "code golf". The goal is to ensure that concepts not needed by simple programs need not appear in those programs, while at the same time not encouraging habits that have to be unlearned as programs scale up.

Each of these simplifications is individually small and unintrusive, and each is independent of the others. And each embodies a simple transformation that the author can easily manually reverse when it makes sense to do so: elided modifiers and `main` arguments can be added back, the class wrapper can be added back when the affordances of classes are needed (supertypes, constructors), and the full qualifier of static-import can be added back. And these reversals are independent of one another; they can be done in any combination or any order.

This seems to meet the requirements of our on-ramp; we've eliminated most of the day-1 ceremony elements without introducing new concepts that need to be unlearned.
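
As a rough illustration of the manual reversals described above (a sketch, not part of the proposal text): the version below adds the class wrapper and a traditional `public static void main(String[])` back onto the two-method program, leaving the instance methods untouched. It is written to compile on existing Java releases. The class is left non-public only so the snippet compiles from a file of any name, and the explicit `new HelloWorld().main()` call stands in for what the proposed launcher would do implicitly.

```java
class HelloWorld {
    // Unchanged from the simplified program: two instance methods.
    String greeting() { return "Hello World"; }

    void main() {
        System.out.println(greeting());
    }

    // Added back by hand: the traditional entry point, which instantiates
    // the class via its no-arg constructor and calls the instance main.
    public static void main(String[] args) {
        new HelloWorld().main();
    }
}
```

The two `main` methods are ordinary overloads (different parameter lists), so adding the static one does not require touching `greeting()` or the instance `main()`.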
The remaining concepts -- a method is a container for statements, and a program is a Java source file with a `main` method -- are easily understood in relation to their fully specified counterparts.

## Alternatives

Obviously, we've lived with the status quo for 25+ years, so we could continue to do so. There were other alternatives explored as well; ultimately, each of these fell afoul of one of our goals.

### Can't we go further?

Fans of "code golf" -- of which there are many -- are surely right now trying to figure out how to eliminate the last little bit, the `main` method, and allow statements to exist at the top-level of a program. We deliberately stopped short of this because it offers little value beyond the first few minutes, and even that small value quickly becomes something that needs to be unlearned.

The fundamental problem behind allowing such "loose" statements is that variables can be declared inside both classes (fields) and methods (local variables), and they share the same syntactic production but not the same semantics. So it is unclear (to both compilers and humans) whether a "loose" variable would be a local or a field. If we tried to adopt some sort of simple heuristic to collapse this ambiguity (e.g., whether it precedes or follows the first statement), that may satisfy the compiler, but now simple refactorings might subtly change the meaning of the program, and we'd be replacing the explicit syntactic overhead of `void main()` with an invisible "line" in the program that subtly affects semantics, and a new subtle rule about the meaning of variable declarations that applies only to unnamed classes.

This doesn't help students, nor is this particularly helpful for all but the most trivial programs. It quickly becomes a crutch to be discarded and unlearned, which falls afoul of our "on ramp" goals.
Of all the concepts on our list, "methods" and "a program is specified by a main method" seem the ones that are most worth asking students to learn early.

### Why not "just" use `jshell`?

While JShell is a great interactive tool, leaning too heavily on it as an onramp would fall afoul of our goals. A JShell session is not a program, but a sequence of code snippets. When we type declarations into `jshell`, they are viewed as implicitly static members of some unspecified class, with accessibility ignored completely, and statements execute in a context where all previous declarations are in scope. This is convenient for experimentation -- the primary goal of `jshell` -- but not such a great mental model for learning to write Java programs. Transforming a batch of working declarations in `jshell` to a real Java program would not be sufficiently simple or unintrusive, and would lead to a non-idiomatic style of code, because the straightforward translation would have us redeclaring each method, class, and variable declaration as `static`. Further, this is probably not the direction we want to go when we scale up from a handful of statements and declarations to a simple class -- we probably want to start using classes as classes, not just as containers for static members.

JShell is a great tool for exploration and debugging, and we expect many educators will continue to incorporate it into their curriculum, but it is not the on-ramp programming model we are looking for.

### What about "always local"?

One of the main tensions that `main` introduces is that most class members are not `static`, but the `main` method is -- and that forces programmers to confront the seam between static and non-static members. JShell answers this with "make everything static".

Another approach would be to "make everything local" -- treat a simple program as being the "unwrapped" body of an implicit main method. We already allow variables and classes to be declared local to a method.
We could add local methods (a useful feature in its own right) and relax some of the asymmetries around nesting (again, an attractive cleanup), and then treat a mix of declarations and statements without a class wrapper as the body of an invisible `main` method.

This seems an attractive model as well -- at first.

While the syntactic overhead of converting back to full-blown classes -- wrap the whole thing in a `main` method and a `class` declaration -- is far less intrusive than the transformation inherent in `jshell`, this is still not an ideal on-ramp. Local variables interact with local classes (and methods, when we have them) in a very different way than instance fields do with instance methods and inner classes: their scopes are different (no forward references), their initialization rules are different, and captured local variables must be effectively final. This is a subtly different programming model that would then have to be unlearned when scaling up to full classes. Further, the result of this wrapping -- where everything is local to the main method -- is also not "idiomatic Java". So while local methods may be an attractive feature, they are similarly not the on-ramp we are looking for.

From kevinb at google.com Wed Sep 28 19:49:33 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 12:49:33 -0700
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID:

Virtuous. The quips about horses having fled the barn are coming, but whether they did is irrelevant; let's just make Java better now.

On Wed, Sep 28, 2022 at 10:57 AM Brian Goetz wrote:

> ## Concept overload

I like that the focus is not just on boilerplate but on the offense of forcing learners to encounter concepts they *will* need to care about but don't yet.
- Relax the requirement that the class, and `main` method, be public. > Public > accessibility is only relevant when access crosses packages; simple > programs > live in the unnamed package, so cannot be accessed from any other > package > anyway. For a program whose main class is in the unnamed package, we > can > drop the requirement that the class or its `main` method be public, > effectively treating the `java` launcher as if it too resided in the > unnamed > package. > Alternative: drop the requirement altogether. Most main methods have no desire to make themselves publicly callable as `TheClass.main(args)`, but today they are forced to expose that API anyway. I feel like it would still be conceptually clean to say that `public` is really about whether other *code* can access it, not whether a VM can get to it at all. - Make the "args" parameter to `main` optional, by allowing the `java` > launcher to > first look for a main method with the traditional `main(String[])` > signature, and then (if not found) for a main method with no arguments. > This seems to leave users vulnerable to some surprises, where the code they think is being called isn't. Why not make it a compile-time error to provide both forms? - Make the `static` modifier on `main` optional, by allowing the `java` > launcher to > invoke an instance `main` method (of either signature) by instantiating > an > instance using an accessible no-arg constructor and then invoking the > `main` > method on it. > I'll give the problems I see with this, without a judgement on what should be done. What's the whole idea of main? Well, it's the entry point into the program. But now it's *not* really the entry point; finding the entry point is more subtle. (Okay, I concede that static initializers are run first either way; that undercuts *some* of the strength of my argument here.) Even if this is okay when I'm writing my own new program, understanding it as I go, then suppose someone else reads my program. 
That person has the burden of remembering to check whether `main` is static or not, and remembering that some constructor code is happening first if it's not. Classes that have both main and a constructor will be a mixture of some that call them in one order and some in the other. That's just, like, messy. And is it even clear, then, why the VM shouldn't be passing `args` to the *constructor*, only hoarding it until calling `main`? On a deep conceptual level... I'd insist that main() *is static*. It is *the* single entry point into the program; what could be more static than that? But thinking about our learner, who wrote some `main`s before learning about static. The instant they learn `static` is a keyword a method can have, they'll "know" one thing about it already: this is going to be something new that's *not* true of main(). But then they hear an explanation that fits `main` perfectly? Because excessive use of `static` is considered a code smell, many > educators encourage the pattern of "all the static `main` method does is > instantiate an instance and call an instance `main` method" anyway. > Heavy groan. In my opinion, some ideas are too misguided to take seriously. The value in that practice is if instance `main` accepts parameters like `PrintStream` and `Console`, and static main passes in `System.out` and `System.console()`. That makes all your actual program logic unit-testable. Great! This actually strikes directly at the heart of what the entire problem with `static` is! But this isn't the case you're addressing. Static methods are not a code smell! Static methods that ought to be overrideable by one of their argument types (Collections.sort()), sure. Static mutable state is a code smell, definitely -- but a method that touches that state is equally problematic whether it itself is static or not. There are some code smells around `static`, but `static` itself is fresh and flowery. 
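
A minimal sketch of the practice described above, assuming a hypothetical `Greeter` class: the static `main` only wires in `System.out`, while the instance `main` takes a `PrintStream`, so the program logic can be exercised from a test. The names are illustrative and not from the thread; `Console` is left out because `System.console()` can return null when no terminal is attached.

```java
import java.io.PrintStream;

class Greeter {
    // Program logic: writes only to the stream it is handed, so a test
    // can substitute an in-memory stream for System.out.
    void main(PrintStream out) {
        out.println("Hello World");
    }

    // Traditional entry point: supplies the real stream and delegates.
    public static void main(String[] args) {
        new Greeter().main(System.out);
    }
}
```

In a unit test, passing a `PrintStream` that wraps a `ByteArrayOutputStream` lets the output be checked without touching `System.out`.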
> (Further, allowing the `main` method to be an instance method means that it could be inherited from a superclass, which is useful for simple frameworks such as test runners or service frameworks.)

This does not give me a happy feeling. Going into it is a deep discussion though. Rest of the response coming soon, I hope.

Just to mention one additional idea. We could permit `main` to optionally return `int`, becoming the default exit status if `exit` is never called. Seems elegant for the rare cases where you care about exit status, but (a) would this feature get in the way in *any* sense for the vast majority of cases that don't care, or (b) are the cases that care just way too rare for us to worry about?

I'm not sure about (a). But (b) kinda seems like a yes.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Wed Sep 28 20:10:02 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 16:10:02 -0400
Subject: Paving the on-ramp
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>

> > - Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package.
>
> Alternative: drop the requirement altogether. Most main methods have no desire to make themselves publicly callable as `TheClass.main(args)`, but today they are forced to expose that API anyway.
> I feel like it would still be conceptually clean to say that `public` is really about whether other *code* can access it, not whether a VM can get to it at all.

I think we're saying the same thing; main need not be public.

> > - Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments.
>
> This seems to leave users vulnerable to some surprises, where the code they think is being called isn't. Why not make it a compile-time error to provide both forms?

Currently, the treatment of methods called "main" is "and also"; it is a valid method, *and also* (if it has the right shape) can be used as a main entry point. Making this an error would take some valid programs and make them invalid, which seems a shift in the interpretation of the magic name "main". A warning is probably reasonable though.

> > - Make the `static` modifier on `main` optional, by allowing the `java` launcher to invoke an instance `main` method (of either signature) by instantiating an instance using an accessible no-arg constructor and then invoking the `main` method on it.
>
> On a deep conceptual level... I'd insist that main() *is static*. It is *the* single entry point into the program; what could be more static than that? But thinking about our learner, who wrote some `main`s before learning about static. The instant they learn `static` is a keyword a method can have, they'll "know" one thing about it already: this is going to be something new that's *not* true of main(). But then they hear an explanation that fits `main` perfectly?

John likes to say "static has messed up every job we've ever given it", and while that seems an exaggeration at first, it often turns out to be surprisingly accurate.
One subtle thing it messes up here is that one cannot effectively inherit a main() method. But inheriting main() is super useful! Consider a TestCase class in a test framework, or an AbstractService class in a services framework. If the abstract class can provide the main() method, then every test case or service _is also a program_, one which runs that test case or service.

But, there is cheese-moving here. In the old model, "main" is just a disembodied method, which only accidentally lives in a class, and drags the class along for the ride. In this model, main-ness moves up the stack, becoming a property of a class, not just something a class has.

This tension is evident in JLS 12, which defines the interaction with main. It is full of wiggle words, because it is trying to pretend that Java has no concept of "program", just classes, but at the same time, there has to be a way to get the computation started. The JLS tries to pretend that "program" is defined almost extralinguistically (by appeal to an unspecified launcher program that exists outside of the language), but nearly trips over its own feet trying to have it both ways.

The debate among educators about whether main should be allowed to do anything it wants, or should only instantiate an object and call a single method, illustrates this tension. So what is really going on here is bringing the notion of "program" to classes in a less nailed-on-the-side way.

> Just to mention one additional idea. We could permit `main` to optionally return `int`, becoming the default exit status if `exit` is never called. Seems elegant for the rare cases where you care about exit status, but (a) would this feature get in the way in *any* sense for the vast majority of cases that don't care, or (b) are the cases that care just way too rare for us to worry about?
>
> I'm not sure about (a). But (b) kinda seems like a yes.

Considered this (since C lets you do this.)
Since Java doesn't let you overload on return types, we have the option to do this later without making the search order any more complicated, so I left it out.

From kevinb at google.com Wed Sep 28 20:27:59 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 13:27:59 -0700
Subject: Paving the on-ramp
In-Reply-To: <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
Message-ID:

On Wed, Sep 28, 2022 at 1:10 PM Brian Goetz wrote:

>> - Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package.
>
> Alternative: drop the requirement altogether. Most main methods have no desire to make themselves publicly callable as `TheClass.main(args)`, but today they are forced to expose that API anyway. I feel like it would still be conceptually clean to say that `public` is really about whether other *code* can access it, not whether a VM can get to it at all.
>
> I think we're saying the same thing; main need not be public.

You seemed quite clearly to be offering that for classes in the default package only.

>> - Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments.
>> > > This seems to leave users vulnerable to some surprises, where the code > they think is being called isn't. Why not make it a compile-time error to > provide both forms? > > Currently, the treatment of methods called "main" is "and also"; it is a > valid method, *and also* (if it has the right shape) can be used as a main > entry point. Making this an error would take some valid programs and make > them invalid, which seems a shift in the interpretation of the magic name > "main". A warning is probably reasonable though. > Oh, yeah, I have a habit of saying "error" when I am always always perfectly satisfied with a warning. Of course, the warning goes just on the method that isn't gonna get called, and the user should be advised to rename it. - Make the `static` modifier on `main` optional, by allowing the `java` >> launcher to >> invoke an instance `main` method (of either signature) by >> instantiating an >> instance using an accessible no-arg constructor and then invoking the >> `main` >> method on it. >> > > On a deep conceptual level... I'd insist that main() *is static*. It is > *the* single entry point into the program; what could be more static than > that? But thinking about our learner, who wrote some `main`s before > learning about static. The instant they learn `static` is a keyword a > method can have, they'll "know" one thing about it already: this is going > to be something new that's *not* true of main(). But then they hear an > explanation that fits `main` perfectly? > > Sorry, just a quick self-reply of clarification: when I said "main IS static", that was taking Java's model that everything belongs to a class as *given*. It's not commentary against "main is really a free function". John likes to say "static has messed up every job we've ever given it", and > while that seems an exaggeration at first, often turns out to be > surprisingly accurate. One subtle thing it messes up here is that one > cannot effectively inherit a main() method. 
> But inheriting main() is super useful! Consider a TestCase class in a test framework, or an AbstractService class in a services framework. If the abstract class can provide the main() method, then every test case or service _is also a program_, one which runs that test case or service.

I see that that is "a way" to do a thing. But in my view, implementation inheritance has messed up every job we've ever given it. :-) It will at the *least* take me some time and reflection to convince myself that "inheritable main" isn't horrifying.

Most of the specific counter-arguments I laid out to the non-static main have dropped out of the thread without acknowledgement, so I'm a little concerned they'll be forgotten in the discussion.

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From forax at univ-mlv.fr Wed Sep 28 20:49:50 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 28 Sep 2022 22:49:50 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Wednesday, September 28, 2022 7:57:19 PM
> Subject: Paving the on-ramp
>
> At various points, we've explored the question of which program elements are most and least helpful for students first learning Java. After considering a number of alternatives over the years, I have a simple proposal for smoothing the "on ramp" to Java programming, while not creating new things to unlearn.
>
> Markdown source is below, HTML will appear soon at:
>
> https://openjdk.org/projects/amber/design-notes/on-ramp
>
> # Paving the on-ramp
>
> Java is one of the most widely taught programming languages in the world.
> Tens of thousands of educators find that the imperative core of the language combined with a straightforward standard library is a foundation that students can comfortably learn on. Choosing Java gives educators many degrees of freedom: they can situate students in `jshell` or Notepad or a full-fledged IDE; they can teach imperative, object-oriented, functional, or hybrid programming styles; and they can easily find libraries to interact with external data and services.
>
> No language is perfect, and one of the most common complaints about Java is that it is "too verbose" or has "too much ceremony." And unfortunately, Java imposes its heaviest ceremony on those first learning the language, who need and appreciate it the least. The declaration of a class and the incantation of `public static void main` is pure mystery to a beginning programmer. While these incantations have principled origins and serve a useful organizing purpose in larger programs, they have the effect of placing obstacles in the path of _becoming_ Java programmers. Educators constantly remind us of the litany of complexity that students have to confront on Day 1 of class -- when they really just want to write their first program.
>
> As an amusing demonstration of this, in her JavaOne keynote appearance in 2019, [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when she learned to program in Java, and how her teacher performed a rap song to help students memorize `"public static void main"`. Our hats are off to creative educators everywhere for this kind of dedication, but teachers shouldn't have to do this.
>
> Of course, advanced programmers complain about ceremony too.
> We will never be able to satisfy programmers' insatiable appetite for typing fewer keystrokes, and we shouldn't try, because the goal of programming is to write programs that are easy to read and are clearly correct, not programs that were easy to type. But we can try to better align the ceremony commensurate with the value it brings to a program -- and let simple programs be expressed more simply.
>
> ## Concept overload
>
> The classic "Hello World" program looks like this in Java:
>
> ```
> public class HelloWorld {
>     public static void main(String[] args) {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It may only be five lines, but those lines are packed with concepts that are challenging to absorb without already having some programming experience and familiarity with object orientation. Let's break down the concepts a student confronts when writing their first Java program:
>
> - **public** (on the class). The `public` accessibility level is relevant only when there is going to be cross-package access; in a simple "Hello World" program, there is only one class, which lives in the unnamed package. They haven't even written a one-line program yet; the notion of access control -- keeping parts of a program from accessing other parts of it -- is still way in their future.
> - **class**. Our student hasn't set out to write a _class_, or model a complex system with objects; they want to write a _program_. In Java, a program is just a `main` method in some class, but at this point our student still has no idea what a class is or why they want one.
> - **Methods**. Methods are of course a key concept in Java, but the mechanics of methods -- parameters, return types, and invocation -- are still unfamiliar, and the `main` method is invoked magically from the `java` launcher rather than from explicit code.
> - **public** (again).
>   Like the class, the `main` method has to be public, but again this is only relevant when programs are large enough to require packages to organize them.
> - **static**. The `main` method has to be static, and at this point, students have no context for understanding what a static method is or why they want one. Worse, the early exposure to `static` methods will turn out to be a bad habit that must be later unlearned. Worse still, the fact that the `main` method is `static` creates a seam between `main` and other methods; either they must become `static` too, or the `main` method must trampoline to some sort of "instance main" (more ceremony!) And if we get this wrong, we get the dreaded and mystifying `"cannot be referenced from a static context"` error.
> - **main**. The name `main` has special meaning in a Java program, indicating the starting point of a program, but this specialness hides behind being an ordinary method name. This may contribute to the sense of "so many magic incantations."
> - **String[]**. The parameter to `main` is an array of strings, which are the arguments that the `java` launcher collected from the command line. But our first program -- likely our first dozen -- will not use command-line parameters. Requiring the `String[]` parameter is, at this point, a mistake waiting to happen, and it will be a long time until this parameter makes sense. Worse, educators may be tempted to explain arrays at this point, which further increases the time-to-first-program.
> - **System.out.println**. If you look closely at this incantation, each element in the chain is a different thing -- `System` is a class (what's a class again?), `out` is a static field (what's a field?), and `println` is an instance method. The only part the student cares about right now is `println`; the rest of it is an incantation that they do not yet understand in order to get at the behavior they want.
>
> That's a lot to explain to a student on the first day of class. There's a good chance that by now, class is over and we haven't written any programs yet, or the teacher has said "don't worry what this means, you'll understand it later" six or eight times. Not only is this a lot of _syntactic_ things to absorb, but each of those things appeals to a different concept (class, method, package, return value, parameter, array, static, public, etc) that the student doesn't have a framework for understanding yet. Each of these will have an important role to play in larger programs, but so far, they only contribute to "wow, programming is complicated."
>
> It won't be practical (or even desirable) to get _all_ of these concepts out of the student's face on day 1, but we can do a lot -- and focus on the ones that do the most to help beginners understand how programs are constructed.
>
> ## Goal: a smooth on-ramp
>
> As much as programmers like to rant about ceremony, the real goal here is not mere ceremony reduction, but providing a graceful _on ramp_ to Java programming. This on-ramp should be helpful to beginning programmers by requiring only those concepts that a simple program needs.
>
> Not only should an on-ramp have a gradual slope and offer enough acceleration distance to get onto the highway at the right speed, but its direction must align with that of the highway. When a programmer is ready to learn about more advanced concepts, they should not have to discard what they've already learned, but instead easily see how the simple programs they've already written generalize to more complicated ones, and both the syntactic and conceptual transformation from "simple" to "full blown" program should be straightforward and unintrusive. It is a definite non-goal to create a "simplified dialect of Java for students".
>
> We identify three simplifications that should aid both educators and students in navigating the on-ramp to Java, as well as being generally useful to simple programs beyond the classroom as well:
>
> - A more tolerant launch protocol
> - Unnamed classes
> - Predefined static imports for the most critical methods and fields
>
> ## A more tolerant launch protocol
>
> The Java Language Specification has relatively little to say about how Java "programs" get launched, other than saying that there is some way to indicate which class is the initial class of a program (JLS 12.1.1) and that a public static method called `main` whose sole argument is of type `String[]` and whose return is `void` constitutes the entry point of the indicated class.
>
> We can eliminate much of the concept overload simply by relaxing the interactions between a Java program and the `java` launcher:
>
> - Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package.
> - Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments.
> - Make the `static` modifier on `main` optional, by allowing the `java` launcher to invoke an instance `main` method (of either signature) by instantiating an instance using an accessible no-arg constructor and then invoking the `main` method on it.
> This small set of changes to the launch protocol strikes out five of the bullet points in the above list of concepts: public (twice), static, method parameters, and `String[]`.
>
> At this point, our Hello World program is now:
>
> ```
> class HelloWorld {
>     void main() {
>         System.out.println("Hello World");
>     }
> }
> ```
>
> It's not any shorter by line count, but we've removed a lot of "horizontal noise" along with a number of concepts. Students and educators will appreciate it, but advanced programmers are unlikely to be in any hurry to make these implicit elements explicit either.
>
> Additionally, the notion of an "instance main" has value well beyond the first day. Because excessive use of `static` is considered a code smell, many educators encourage the pattern of "all the static `main` method does is instantiate an instance and call an instance `main` method" anyway. Formalizing the "instance main" protocol reduces a layer of boilerplate in these cases, and defers the point at which we have to explain what instance creation is -- and what `static` is. (Further, allowing the `main` method to be an instance method means that it could be inherited from a superclass, which is useful for simple frameworks such as test runners or service frameworks.)
>
> ## Unnamed classes
>
> In a simple program, the `class` declaration often doesn't help either, because other classes (if there are any) are not going to reference it by name, and we don't extend a superclass or implement any interfaces.
> If we say an "unnamed class" consists of member declarations without a class header, then our Hello World program becomes:
>
> ```
> void main() {
>     System.out.println("Hello World");
> }
> ```
>
> Such source files can still have fields, methods, and even nested classes, so that as a program evolves from a few statements to needing some ancillary state or helper methods, these can be factored out of the `main` method while still not yet requiring a full class declaration:
>
> ```
> String greeting() { return "Hello World"; }
>
> void main() {
>     System.out.println(greeting());
> }
> ```
>
> This is where treating `main` as an instance method really shines; the user has just declared two methods, and they can freely call each other. Students need not confront the confusing distinction between instance and static methods yet; indeed, if not forced to confront static members on day 1, it might be a while before they do have to learn this distinction. The fact that there is a receiver lurking in the background will come in handy later, but right now is not bothering anybody.
>
> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be launched directly without compilation; this streamlined launcher pairs well with unnamed classes.
>
> ## Predefined static imports
>
> The most important classes, such as `String` and `Integer`, live in the `java.lang` package, which is automatically on-demand imported into all compilation units; this is why we do not have to `import java.lang.String` in every class. Static imports were not added until Java 5, but no corresponding facility for automatic on-demand import of common behavior was added at that time. Most programs, however, will want to do console IO, and Java forces us to do this in a roundabout way -- through the static `System.out` and `System.in` fields.
> Basic console input and output is a reasonable candidate for auto-static import, as one or both are needed by most simple programs. While these are currently instance methods accessed through static fields, we can easily create static methods for `println` and `readln` which are suitable for static import, and automatically import them. At which point our first program is now down to:
>
> ```
> void main() {
>     println("Hello World");
> }
> ```
>
> ## Putting this all together
>
> We've discussed several simplifications:
>
> - Update the launcher protocol to make public, static, and arguments optional for main methods, and for main methods to be instance methods (when a no-argument constructor is available);
> - Make the class wrapper for "main classes" optional (unnamed classes);
> - Automatically static import methods like `println`
>
> which together whittle our long list of day-1 concepts down considerably. While this is still not as minimal as the minimal Python or Ruby program -- statements must still live in a method -- the goal here is not to win at "code golf". The goal is to ensure that concepts not needed by simple programs need not appear in those programs, while at the same time not encouraging habits that have to be unlearned as programs scale up.
>
> Each of these simplifications is individually small and unintrusive, and each is independent of the others. And each embodies a simple transformation that the author can easily manually reverse when it makes sense to do so: elided modifiers and `main` arguments can be added back, the class wrapper can be added back when the affordances of classes are needed (supertypes, constructors), and the full qualifier of static-import can be added back. And these reversals are independent of one another; they can be done in any combination or any order.
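
As a rough sketch of the static `println` and `readln` methods mentioned above: the class name `SimpleIO` is hypothetical (the proposal does not say where such methods would live), and the bodies are assumed to simply delegate to `System.out` and `System.in`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

// Hypothetical holder class; the name and location are assumptions for illustration.
final class SimpleIO {
    // One shared reader, so buffered input is not lost between calls.
    private static final BufferedReader IN =
            new BufferedReader(new InputStreamReader(System.in));

    private SimpleIO() {}

    // Static wrapper around the instance method System.out.println.
    static void println(Object value) {
        System.out.println(value);
    }

    // Reads one line from standard input; returns null at end of input.
    static String readln() {
        try {
            return IN.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If such a class lived in a named package, a static import of its members would already let a program write `println("Hello World")` today; the proposal would amount to making that import implicit.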
> This seems to meet the requirements of our on-ramp; we've eliminated most of the day-1 ceremony elements without introducing new concepts that need to be unlearned. The remaining concepts -- a method is a container for statements, and a program is a Java source file with a `main` method -- are easily understood in relation to their fully specified counterparts.
>
> ## Alternatives
>
> Obviously, we've lived with the status quo for 25+ years, so we could continue to do so. There were other alternatives explored as well; ultimately, each of these fell afoul of one of our goals.
>
> ### Can't we go further?
>
> Fans of "code golf" -- of which there are many -- are surely right now trying to figure out how to eliminate the last little bit, the `main` method, and allow statements to exist at the top-level of a program. We deliberately stopped short of this because it offers little value beyond the first few minutes, and even that small value quickly becomes something that needs to be unlearned.
>
> The fundamental problem behind allowing such "loose" statements is that variables can be declared inside both classes (fields) and methods (local variables), and they share the same syntactic production but not the same semantics. So it is unclear (to both compilers and humans) whether a "loose" variable would be a local or a field. If we tried to adopt some sort of simple heuristic to collapse this ambiguity (e.g., whether it precedes or follows the first statement), that may satisfy the compiler, but now simple refactorings might subtly change the meaning of the program, and we'd be replacing the explicit syntactic overhead of `void main()` with an invisible "line" in the program that subtly affects semantics, and a new subtle rule about the meaning of variable declarations that applies only to unnamed classes. This doesn't help students, nor is this particularly helpful for all but the most trivial programs.
> It quickly becomes a crutch to be discarded and unlearned, which falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods" and "a program is specified by a main method" seem the ones that are most worth asking students to learn early.
>
> ### Why not "just" use `jshell`?
>
> While JShell is a great interactive tool, leaning too heavily on it as an on-ramp would fall afoul of our goals. A JShell session is not a program, but a sequence of code snippets. When we type declarations into `jshell`, they are viewed as implicitly static members of some unspecified class, with accessibility ignored completely, and statements execute in a context where all previous declarations are in scope. This is convenient for experimentation -- the primary goal of `jshell` -- but not such a great mental model for learning to write Java programs. Transforming a batch of working declarations in `jshell` to a real Java program would not be sufficiently simple or unintrusive, and would lead to a non-idiomatic style of code, because the straightforward translation would have us redeclaring each method, class, and variable declaration as `static`. Further, this is probably not the direction we want to go when we scale up from a handful of statements and declarations to a simple class -- we probably want to start using classes as classes, not just as containers for static members. JShell is a great tool for exploration and debugging, and we expect many educators will continue to incorporate it into their curriculum, but it is not the on-ramp programming model we are looking for.
>
> ### What about "always local"?
>
> One of the main tensions that `main` introduces is that most class members are not `static`, but the `main` method is -- and that forces programmers to confront the seam between static and non-static members. JShell answers this with "make everything static".
> Another approach would be to "make everything local" -- treat a simple program as being the "unwrapped" body of an implicit main method. We already allow variables and classes to be declared local to a method. We could add local methods (a useful feature in its own right) and relax some of the asymmetries around nesting (again, an attractive cleanup), and then treat a mix of declarations and statements without a class wrapper as the body of an invisible `main` method. This seems an attractive model as well -- at first.
>
> While the syntactic overhead of converting back to full-blown classes -- wrap the whole thing in a `main` method and a `class` declaration -- is far less intrusive than the transformation inherent in `jshell`, this is still not an ideal on-ramp. Local variables interact with local classes (and methods, when we have them) in a very different way than instance fields do with instance methods and inner classes: their scopes are different (no forward references), their initialization rules are different, and captured local variables must be effectively final. This is a subtly different programming model that would then have to be unlearned when scaling up to full classes. Further, the result of this wrapping -- where everything is local to the main method -- is also not "idiomatic Java". So while local methods may be an attractive feature, they are similarly not the on-ramp we are looking for.

I agree with the goal; I have several remarks.

- You do not have to declare a class public to run it, so you do not have to explain the first "public". So the situation is a little less awful than the one you describe :)
- I know several teachers that use an interface instead of a class as the default container for methods for the first weeks, because methods are public by default inside an interface (and nested classes are implicitly static).
  The snippet used is something like this
    interface Hello {
      static void main(String[] args) {
        ...
      }
    }
  so technically you do not have to explain "public".
- You can declare a main() on other things than a class: on an interface, on an enum or a record. Being able to declare the "main" without having to declare it static is nice, but the semantics you propose create new issues, because the auto-instantiation does not work if the container is an interface or a record with components. This feels too magical to me, and as a teacher I will have to explain it at some point.
- Currently there is a nice progression in terms of complexity, in 3 steps: first, you have a class with no package and you can only use the classes of the JDK or the classes of the current folder; then, you have the package declaration and you can have multiple folders; and finally, if you want to declare non-visible packages, you need a module and the module-info. I think your idea of a classless compilation unit plays well with the current idea if we consider it as step zero: first you have a classless class, then class, then package + class, and at the end module + package + class.
- We should not be able to declare fields inside a classless class; students struggle at the beginning to tell the difference between a field and a local variable. Every syntax that makes that distinction murkier is a bad idea. So perhaps what we want is a classless container of methods, not a classless class.
- At the beginning, teaching records is easier than teaching classes, because you can do too much with a class while records have a simple syntax and simple semantics. At my university, real classes (not class as container) are only introduced at week 4, when we start to have mutable thingy. In a dream world, we should be able to declare records inside a classless class, but I do not see how the compiler will not see a top level record instead of a classless class containing records.
  I suppose a classless class cannot have nested classes/records/enums/interfaces.
- At my uni, we start by teaching Python and JavaScript, then C, then Java. We do not teach ipython because the semantics is slightly different from python. For the same reason, we do not use jshell for undergraduates because the semantics is slightly different from Java. For the same reason, if the semantics of a classless class is different from the semantics of a regular class, we will not use it either. I don't think your proposal has that problem; it's more a reminder for me that a classless class cannot have different semantics than a class: it can do less, but it cannot do more or, worse, do something differently.
- I don't think we can add an auto static import without causing source backward compatibility issues, because you cannot have several static imports using the same last identifier. For example, if an existing class declares
    import static foo.A.println;
  this class will now fail to compile. That's why no auto static import was added in Java 5. There is also the problem of the comb rule: Java prefers supertype methods (even static methods) to static imports. So adding a method println() (or any method named like an auto imported static method) to a non final class becomes a hazard. You may argue that we already have that problem now, which is true, but any auto static import of methods makes this known problem worse.

Rémi

From brian.goetz at oracle.com Wed Sep 28 20:56:06 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 16:56:06 -0400
Subject: Paving the on-ramp
In-Reply-To: <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
Message-ID:

> - You can declare a main() on other things than a class: on an interface, on an enum or a record.
>   Being able to declare the "main" without having to declare it static is nice, but the semantics you propose create new issues, because the auto-instantiation does not work if the container is an interface or a record with components.
>   This feels too magical to me, and as a teacher I will have to explain it at some point.

Perhaps, but not on the first day.

> - At the beginning, teaching records is easier than teaching classes because you can do too much with a class while records have a simple syntax and simple semantics.

I agree teaching records first is a good teaching strategy; I have a lot to say about curriculum design, but I'd like to keep that a separate discussion. Suffice it to say that an important secondary goal here is unconstraining the order in which things must be taught.

>   In a dream world, we should be able to declare records inside a classless class, but I do not see how the compiler will not see a top level record instead of a classless class containing records.

Hoping to make this dream possible.

> - At my uni, we start by teaching Python and JavaScript, then C, then Java. We do not teach ipython because the semantics is slightly different from python.
>   For the same reason, we do not use jshell for undergraduates because the semantics is slightly different from Java.
>   For the same reason, if the semantics of a classless class is different from the semantics of a regular class, we will not use it either.

Agree, and this was a strong driving motivation. This is why we have avoided trying to create a "safe subset for beginners", and instead focus on allowing unnecessary wrapping to be elided.

> - I don't think we can add an auto static import without causing source backward compatibility issues, because you cannot have several static imports using the same last identifier.
>   For example, if an existing class declares
>       import static foo.A.println;
>
>   this class will now fail to compile.
>
>   That's why no auto static import was added in Java 5.

There is some complexity here, but it does not seem insurmountable.

From forax at univ-mlv.fr Wed Sep 28 21:13:06 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Wed, 28 Sep 2022 23:13:06 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com>
Message-ID: <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Kevin Bourrillion"
> Cc: "amber-spec-experts"
> Sent: Wednesday, September 28, 2022 10:10:02 PM
> Subject: Re: Paving the on-ramp

>>> - Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments.

>> This seems to leave users vulnerable to some surprises, where the code they think is being called isn't. Why not make it a compile-time error to provide both forms?

> Currently, the treatment of methods called "main" is "and also"; it is a valid method, *and also* (if it has the right shape) can be used as a main entry point. Making this an error would take some valid programs and make them invalid, which seems a shift in the interpretation of the magic name "main". A warning is probably reasonable though.

The other solution is to do something similar to the compact constructor of a record, a compact main that has a syntax which is not currently valid in Java.

Rémi

From brian.goetz at oracle.com Wed Sep 28 21:23:47 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 17:23:47 -0400
Subject: Paving the on-ramp
In-Reply-To: <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com> <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr>
Message-ID: <74baeecb-e3e5-f38c-d875-0977db3e96de@oracle.com>

> The other solution is to do something similar to the compact constructor of a record, a compact main that has a syntax which is not currently valid in Java.

An early iteration had something like that. I liked it for about five minutes! Then I started to dislike it, because (a) it was going to quickly become something that needs to be unlearned and (b) it was spending syntax on a very narrow use case, narrow in multiple ways. And fixing (a) by generalizing to "compact methods" didn't feel like a win either; now it was just two ways to say the same thing.

Of all the concepts that it is worth asking users to internalize early, I think "methods as aggregations of statements" is it. (Yes, in this version you still have to confront "void" and "()".)

From forax at univ-mlv.fr Wed Sep 28 21:35:12 2022
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Wed, 28 Sep 2022 23:35:12 +0200 (CEST)
Subject: Paving the on-ramp
In-Reply-To: <74baeecb-e3e5-f38c-d875-0977db3e96de@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <599f9122-68db-cf8d-1cdd-04e01bcf8d70@oracle.com> <2114262561.15300506.1664399586451.JavaMail.zimbra@u-pem.fr> <74baeecb-e3e5-f38c-d875-0977db3e96de@oracle.com>
Message-ID: <1702806108.15315388.1664400912191.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "Kevin Bourrillion", "amber-spec-experts"
> Sent: Wednesday, September 28, 2022 11:23:47 PM
> Subject: Re: Paving the on-ramp

>> The other solution is to do something similar to the compact constructor of a record, a compact main that has a syntax which is not currently valid in Java.
>
> An early iteration had something like that. I liked it for about five minutes! Then I started to dislike it, because (a) it was going to quickly become something that needs to be unlearned and (b) it was spending syntax on a very narrow use case, narrow in multiple ways. And fixing (a) by generalizing to "compact methods" didn't feel like a win either; now it was just two ways to say the same thing.
>
> Of all the concepts that it is worth asking users to internalize early, I think "methods as aggregations of statements" is it. (Yes, in this version you still have to confront "void" and "()".)

That's the main issue with main :)
It's the entry point, so it's a special case, but at the same time you do not want to spend a lot of effort to make it different from a method, so having a special syntax is too much. And not making it a method, by allowing statements to be written without a method like in JavaScript, does not work well because you cannot write statements inside a class.

Rémi

From kevinb at google.com Thu Sep 29 00:37:47 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 17:37:47 -0700
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID:

Again, big fan of getting to a streamlined main() source file.

A major design goal of yours seems clear: to get there without rendering Java source files explicitly bimorphic ("class" source files all look like this, "main" source files all look like that). Instead you have a set of independent features that can compose to get you there in a "smooth ramp". The design looks heavily influenced by that goal.

And it sounds virtuous. But... is it? Really?

Take a language that has this pretty streamlined already (I'll use the one I know):

```
fun main() {
    ...
}
```

As my program grows and gets more complex, I will make changes like

* use more other libraries
* add args to main()
* add helper methods
* add constants
* create new classes and use them from here

But: when and why would I be motivated to change *this* code *itself* to "become" a class, become instantiable, acquire instance state, etc. etc.? I don't imagine ever having that urge. main() is just main()! It's just a way in. Isn't it literally just a way to (a) transfer control back and forth and (b) hand me args?

If I need those other qualities, then I create a class to get them, maybe even right below main(), and I use it. I'm already going to be regularly needing to do that anyway just as my code grows.

A quick clarification:

On Wed, Sep 28, 2022 at 12:49 PM Kevin Bourrillion wrote:

>> Because excessive use of `static` is considered a code smell, many educators encourage the pattern of "all the static `main` method does is instantiate an instance and call an instance `main` method" anyway.
>
> Heavy groan. In my opinion, some ideas are too misguided to take seriously.
>
> The value in that practice is if instance `main` accepts parameters like `PrintStream` and `Console`, and static main passes in `System.out` and `System.console()`. That makes all your actual program logic unit-testable. Great! This actually strikes directly at the heart of what the entire problem with `static` is! But this isn't the case you're addressing.

Note I was only reacting to "static bad!" here. I would be happy if *that* argument were dropped, but you do still have another valid argument: that `static` is another backward default, and the viral burden of putting it not just on main() but every helper method you factor out is pure nuisance. (I'd suggest mentioning the viral nature of this particular burden higher/more prominently in the doc, as it's currently out of place under the "unnamed classes" section.)

(That doesn't mean "so let's do it"; I still hope to see that benefit carefully measured against the drawbacks. Btw, *some* of those drawbacks might be eased by disallowing an explicit constructor... and jeez, please disallow type parameters too... I'm leaving the exact meaning of "disallow" undefined here.)

To resume with the original text...

On Wed, Sep 28, 2022 at 10:57 AM Brian Goetz wrote:

> ## Unnamed classes
>
> In a simple program, the `class` declaration often doesn't help either, because other classes (if there are any) are not going to reference it by name, and we don't extend a superclass or implement any interfaces.

How do I tell `java` which class file to load and call main() on? Class name based on file name, I guess?

Tiny side benefit of dropping all the `static`s: then if you also use an unnamed class you can still make method references to your own helper methods.
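
A minimal sketch of two points from this message, using a hypothetical `Greeter` class (it is named here only so that the example runs on current Java): the program logic takes a `PrintStream` so it can be unit-tested, and an instance helper can be referenced as `this::shout` without needing a class name.

```java
import java.io.PrintStream;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; the names are illustrative only.
class Greeter {
    // Instance helper: with no `static` involved, it can be referenced as this::shout.
    String shout(String word) {
        return word.toUpperCase() + "!";
    }

    // The actual program logic writes to whatever stream it is handed.
    void run(PrintStream out) {
        String line = List.of("hello", "world").stream()
                .map(this::shout)
                .collect(Collectors.joining(" "));
        out.println(line);
    }

    // The static entry point only wires in the real console stream.
    public static void main(String[] args) {
        new Greeter().run(System.out);
    }
}
```

A test can pass a `PrintStream` backed by a `ByteArrayOutputStream` to `run` and inspect what was written, without touching `System.out`.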

> If we say an "unnamed class" consists of member declarations without a class header, then our Hello World program becomes:
>
> ```
> void main() {
>     System.out.println("Hello World");
> }
> ```

One or more class annotations could appear below package/imports?

> Such source files can still have fields, methods, and even nested classes,

Do those get compiled to real nested classes, nested inside an unnamed class? So if I edit a "regular" `Foo.java` file, go down below the last `}` and add a `main` function there, does that cause the whole `Foo` class above to be reinterpreted as "nested inside an unnamed class" instead of top-level?

> Students need not confront the confusing distinction between instance and static methods yet; indeed, if not forced to confront static members on day 1, it might be a while before they do have to learn this distinction.

Well, they'll confront it from the calling side, `str.length()` looks quite different from unqualified calls and classname-qualified calls. They'd ideally get a chance to understand that first before making their own classes.

This is my notion of a natural progression:

1. Write procedural code: calling static methods, using existing data types, soon calling their instance methods
2. Proceed to creating your own types (from simple data types onward) and using them too
3. One day learn that your main() function is actually a method of an instantiable type too... at pub trivia night, then promptly forget it

> The fact that there is a receiver lurking in the background will come in handy later,

(My claim up above is "I don't think it will.")

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From brian.goetz at oracle.com Thu Sep 29 01:36:45 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 28 Sep 2022 21:36:45 -0400
Subject: Paving the on-ramp
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>

> A major design goal of yours seems clear: to get there without rendering Java source files explicitly bimorphic ("class" source files all look like this, "main" source files all look like that). Instead you have a set of independent features that can compose to get you there in a "smooth ramp". The design looks heavily influenced by that goal.

Yes, I sometimes call this "telescoping", because there's a chain of "x is short for y is short for z". For example, with lambdas:

    x -> e
is-short-for
    (x) -> e
is-short-for
    (var x) -> e
is-short-for
    (int x) -> e  // or whatever the arg is

As a design convention, it enables a mental model where there is really just one form, with varying things you could leave out. Early in the Lambda days, we saw articles like "there are N forms of lambda expressions", and that stuff infuriates me, it is as if people go out of their way to find more complex mental models than necessary.

> As my program grows and gets more complex, I will make changes like
>
> * use more other libraries
> * add args to main()
> * add helper methods
> * add constants
> * create new classes and use them from here
>
> But: when and why would I be motivated to change *this* code *itself* to "become" a class, become instantiable, acquire instance state, etc. etc.? I don't imagine ever having that urge. main() is just main()! It's just a way in. Isn't it literally just a way to (a) transfer control back and forth and (b) hand me args?

This doesn't seem like such a leap to me. You might start out hardcoding a file path that will be read. Then you might decide to let that be passed in (so you add the args parameter to main).
Then you might want to treat the filename to be read as a field so it can be shared across methods, so you turn it into a constructor parameter. One could imagine "introduce X" refactorings to do all of these. The process of hardcoding to main() parameter to constructor argument is a natural sedimentation of things finding their right level. (And even if you don't do all of this, knowing that it's an ordinary class (like an enum or a record) just with a concise syntax means you don't have to learn new concepts. I don't want Foo classes and Bar classes.)

> Note I was only reacting to "static bad!" here. I would be happy if *that* argument were dropped, but you do still have another valid argument: that `static` is another backward default, and the viral burden of putting it not just on main() but every helper method you factor out is pure nuisance. (I'd suggest mentioning the viral nature of this particular burden higher/more prominently in the doc, as it's currently out of place under the "unnamed classes" section.)
>
> (That doesn't mean "so let's do it"; I still hope to see that benefit carefully measured against the drawbacks. Btw, *some* of those drawbacks might be eased by disallowing an explicit constructor... and jeez, please disallow type parameters too... I'm leaving the exact meaning of "disallow" undefined here.)

Indeed, I intend that there are no explicit constructors or instance initializers here. (There can't be constructors, because the class is unnamed!) I think I said somewhere "such classes can contain ..." and didn't list constructors, but I should have been more explicit.

>> ## Unnamed classes
>>
>> In a simple program, the `class` declaration often doesn't help either, because other classes (if there are any) are not going to reference it by name, and we don't extend a superclass or implement any interfaces.
>
> How do I tell `java` which class file to load and call main() on?
> Class name based on file name, I guess?

Sadly yes. More sad stories coming on this front, Jim can tell.

>> If we say an "unnamed class" consists of member declarations without a class header, then our Hello World program becomes:
>>
>> ```
>> void main() {
>>     System.out.println("Hello World");
>> }
>> ```
>
> One or more class annotations could appear below package/imports?

No package statement (unnamed classes live in the unnamed package), but imports are OK. No class annotations. No type variables. No superclasses.

>> Such source files can still have fields, methods, and even nested classes,
>
> Do those get compiled to real nested classes, nested inside an unnamed class? So if I edit a "regular" `Foo.java` file, go down below the last `}` and add a `main` function there, does that cause the whole `Foo` class above to be reinterpreted as "nested inside an unnamed class" instead of top-level?

To be discussed!

> This is my notion of a natural progression:
>
> 1. Write procedural code: calling static methods, using existing data types, soon calling their instance methods
> 2. Proceed to creating your own types (from simple data types onward) and using them too
> 3. One day learn that your main() function is actually a method of an instantiable type too... at pub trivia night, then promptly forget it

Right.

From guy.steele at oracle.com Thu Sep 29 03:41:24 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 29 Sep 2022 03:41:24 +0000
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <06823323-6214-438A-80A7-184310F01C55@oracle.com>

This is headed in the right direction, but I worry about the use of dei ex machina that have the property that they are NOT easily explained in terms of something the user could have written.
Perhaps these could all be dispensed with by using an alternate strategy of code rewriting (at least as an explanation, if not also as an implementation mechanism).

(1) Instead of having a magic "unnamed" class, which has bizarre properties such as not having a constructor (or at least not a constructor you can mention in a `new` expression), only to then require a second magic rule about what you put in the command line "java ...", why not simply use the much more obvious rule that if a compilation unit doesn't have a class header, then a class header is _supplied_ by the compiler, and the name of the class is taken from the filename of the compilation unit?

(2) Instead of complicating the Java launch protocol, why not leave it alone, and instead use the existing mechanism of "in situation X, if the user fails to provide method Y, the compiler will provide a definition automatically"? Specifically, in a compilation unit named Foo.java for which a class header has to be provided automatically, if a method with signature "main()" is present but no static method with signature "main(String[])" is present, then a static method with signature "main(String[])" is automatically provided by the compiler.

(2a) If the method with signature "main()" is static, the provided method is

    public static void main(String[] args) { main(); }

(2b) If the method with signature "main()" is not static, the provided method is

    public static void main(String[] args) { new Foo().main(); }

Notice that this mechanism also automatically makes the keyword "public" optional on the declaration of "main()".

(3) Instead of speaking of automatic imports, speak of the compiler automatically providing certain import statements if the compilation unit doesn't have a class header.
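
As an illustration of the rewrite described in (1) and (2b), here is a sketch of what a header-less `Foo.java` would amount to once the class header and the bridging `main(String[])` are written out by hand. The `greeting()` helper is hypothetical and is there only so the example has something to print; the exact output a compiler would generate is not specified anywhere in this thread.

```java
// Foo.java as the programmer might write it under the proposal (no class header):
//
//     String greeting() { return "Hello World"; }
//     void main() { System.out.println(greeting()); }
//
// Below is the same file with the rewrites from (1) and (2b) applied by hand.
class Foo {                          // (1) header supplied; name taken from Foo.java

    // Hypothetical helper, only here to give main() something to call.
    String greeting() {
        return "Hello World";
    }

    // The programmer's own instance main(), unchanged.
    void main() {
        System.out.println(greeting());
    }

    // (2b) main() is not static, so this bridging method is supplied.
    public static void main(String[] args) {
        new Foo().main();
    }
}
```

Every line of the expanded form is ordinary Java that the programmer could have typed, which is the property the rewrite-based explanation is after.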
That way _everything_ (the name of the class when a class header is not provided, the behavior when you write variously abbreviated definitions of method `main`, and the automatic importation of certain libraries) can be explained in terms of source-code rewrites that the programmer can do once the programmer learns enough about more advanced features.

--Guy

On Sep 28, 2022, at 1:57 PM, Brian Goetz wrote:

At various points, we've explored the question of which program elements are most and least helpful for students first learning Java. After considering a number of alternatives over the years, I have a simple proposal for smoothing the "on ramp" to Java programming, while not creating new things to unlearn.

Markdown source is below, HTML will appear soon at:

https://openjdk.org/projects/amber/design-notes/on-ramp

# Paving the on-ramp

Java is one of the most widely taught programming languages in the world. Tens of thousands of educators find that the imperative core of the language combined with a straightforward standard library is a foundation that students can comfortably learn on. Choosing Java gives educators many degrees of freedom: they can situate students in `jshell` or Notepad or a full-fledged IDE; they can teach imperative, object-oriented, functional, or hybrid programming styles; and they can easily find libraries to interact with external data and services.

No language is perfect, and one of the most common complaints about Java is that it is "too verbose" or has "too much ceremony." And unfortunately, Java imposes its heaviest ceremony on those first learning the language, who need and appreciate it the least. The declaration of a class and the incantation of `public static void main` is pure mystery to a beginning programmer. While these incantations have principled origins and serve a useful organizing purpose in larger programs, they have the effect of placing obstacles in the path of _becoming_ Java programmers.
Educators constantly remind us of the litany of complexity that students have to confront on Day 1 of class -- when they really just want to write their first program.

As an amusing demonstration of this, in her JavaOne keynote appearance in 2019, [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when she learned to program in Java, and how her teacher performed a rap song to help students memorize `"public static void main"`. Our hats are off to creative educators everywhere for this kind of dedication, but teachers shouldn't have to do this.

Of course, advanced programmers complain about ceremony too. We will never be able to satisfy programmers' insatiable appetite for typing fewer keystrokes, and we shouldn't try, because the goal of programming is to write programs that are easy to read and are clearly correct, not programs that were easy to type. But we can try to better align the ceremony with the value it brings to a program -- and let simple programs be expressed more simply.

## Concept overload

The classic "Hello World" program looks like this in Java:

```
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello World");
    }
}
```

It may only be five lines, but those lines are packed with concepts that are challenging to absorb without already having some programming experience and familiarity with object orientation. Let's break down the concepts a student confronts when writing their first Java program:

- **public** (on the class). The `public` accessibility level is relevant only when there is going to be cross-package access; in a simple "Hello World" program, there is only one class, which lives in the unnamed package. They haven't even written a one-line program yet; the notion of access control -- keeping parts of a program from accessing other parts of it -- is still way in their future.

- **class**.
Our student hasn't set out to write a _class_, or model a complex system with objects; they want to write a _program_. In Java, a program is just a `main` method in some class, but at this point our student still has no idea what a class is or why they want one.

- **Methods**. Methods are of course a key concept in Java, but the mechanics of methods -- parameters, return types, and invocation -- are still unfamiliar, and the `main` method is invoked magically from the `java` launcher rather than from explicit code.

- **public** (again). Like the class, the `main` method has to be public, but again this is only relevant when programs are large enough to require packages to organize them.

- **static**. The `main` method has to be static, and at this point, students have no context for understanding what a static method is or why they want one. Worse, the early exposure to `static` methods will turn out to be a bad habit that must be later unlearned. Worse still, the fact that the `main` method is `static` creates a seam between `main` and other methods; either they must become `static` too, or the `main` method must trampoline to some sort of "instance main" (more ceremony!) And if we get this wrong, we get the dreaded and mystifying `"cannot be referenced from a static context"` error.

- **main**. The name `main` has special meaning in a Java program, indicating the starting point of a program, but this specialness hides behind being an ordinary method name. This may contribute to the sense of "so many magic incantations."

- **String[]**. The parameter to `main` is an array of strings, which are the arguments that the `java` launcher collected from the command line. But our first program -- likely our first dozen -- will not use command-line parameters. Requiring the `String[]` parameter is, at this point, a mistake waiting to happen, and it will be a long time until this parameter makes sense.
Worse, educators may be tempted to explain arrays at this point, which further increases the time-to-first-program.

- **System.out.println**. If you look closely at this incantation, each element in the chain is a different thing -- `System` is a class (what's a class again?), `out` is a static field (what's a field?), and `println` is an instance method. The only part the student cares about right now is `println`; the rest of it is an incantation that they do not yet understand in order to get at the behavior they want.

That's a lot to explain to a student on the first day of class. There's a good chance that by now, class is over and we haven't written any programs yet, or the teacher has said "don't worry what this means, you'll understand it later" six or eight times. Not only is this a lot of _syntactic_ things to absorb, but each of those things appeals to a different concept (class, method, package, return value, parameter, array, static, public, etc) that the student doesn't have a framework for understanding yet. Each of these will have an important role to play in larger programs, but so far, they only contribute to "wow, programming is complicated."

It won't be practical (or even desirable) to get _all_ of these concepts out of the student's face on day 1, but we can do a lot -- and focus on the ones that do the most to help beginners understand how programs are constructed.

## Goal: a smooth on-ramp

As much as programmers like to rant about ceremony, the real goal here is not mere ceremony reduction, but providing a graceful _on ramp_ to Java programming. This on-ramp should be helpful to beginning programmers by requiring only those concepts that a simple program needs.

Not only should an on-ramp have a gradual slope and offer enough acceleration distance to get onto the highway at the right speed, but its direction must align with that of the highway.
When a programmer is ready to learn about more advanced concepts, they should not have to discard what they've already learned, but instead easily see how the simple programs they've already written generalize to more complicated ones, and both the syntactic and conceptual transformation from "simple" to "full blown" program should be straightforward and unintrusive. It is a definite non-goal to create a "simplified dialect of Java for students".

We identify three simplifications that should aid both educators and students in navigating the on-ramp to Java, as well as being generally useful to simple programs beyond the classroom:

- A more tolerant launch protocol
- Unnamed classes
- Predefined static imports for the most critical methods and fields

## A more tolerant launch protocol

The Java Language Specification has relatively little to say about how Java "programs" get launched, other than saying that there is some way to indicate which class is the initial class of a program (JLS 12.1.1) and that a public static method called `main` whose sole argument is of type `String[]` and whose return is `void` constitutes the entry point of the indicated class.

We can eliminate much of the concept overload simply by relaxing the interactions between a Java program and the `java` launcher:

- Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package.

- Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments.
- Make the `static` modifier on `main` optional, by allowing the `java` launcher to invoke an instance `main` method (of either signature) by instantiating an instance using an accessible no-arg constructor and then invoking the `main` method on it.

This small set of changes to the launch protocol strikes out five of the bullet points in the above list of concepts: public (twice), static, method parameters, and `String[]`.

At this point, our Hello World program is now:

```
class HelloWorld {
    void main() {
        System.out.println("Hello World");
    }
}
```

It's not any shorter by line count, but we've removed a lot of "horizontal noise" along with a number of concepts. Students and educators will appreciate it, but advanced programmers are unlikely to be in any hurry to make these implicit elements explicit either.

Additionally, the notion of an "instance main" has value well beyond the first day. Because excessive use of `static` is considered a code smell, many educators encourage the pattern of "all the static `main` method does is instantiate an instance and call an instance `main` method" anyway. Formalizing the "instance main" protocol reduces a layer of boilerplate in these cases, and defers the point at which we have to explain what instance creation is -- and what `static` is. (Further, allowing the `main` method to be an instance method means that it could be inherited from a superclass, which is useful for simple frameworks such as test runners or service frameworks.)

## Unnamed classes

In a simple program, the `class` declaration often doesn't help either, because other classes (if there are any) are not going to reference it by name, and we don't extend a superclass or implement any interfaces.
If we say an "unnamed class" consists of member declarations without a class header, then our Hello World program becomes:

```
void main() {
    System.out.println("Hello World");
}
```

Such source files can still have fields, methods, and even nested classes, so that as a program evolves from a few statements to needing some ancillary state or helper methods, these can be factored out of the `main` method while still not yet requiring a full class declaration:

```
String greeting() { return "Hello World"; }

void main() {
    System.out.println(greeting());
}
```

This is where treating `main` as an instance method really shines; the user has just declared two methods, and they can freely call each other. Students need not confront the confusing distinction between instance and static methods yet; indeed, if not forced to confront static members on day 1, it might be a while before they do have to learn this distinction. The fact that there is a receiver lurking in the background will come in handy later, but right now is not bothering anybody.

[JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be launched directly without a separate compilation step; this streamlined launcher pairs well with unnamed classes.

## Predefined static imports

The most important classes, such as `String` and `Integer`, live in the `java.lang` package, which is automatically on-demand imported into all compilation units; this is why we do not have to `import java.lang.String` in every class. Static imports were not added until Java 5, but no corresponding facility for automatic on-demand import of common behavior was added at that time. Most programs, however, will want to do console IO, and Java forces us to do this in a roundabout way -- through the static `System.out` and `System.in` fields. Basic console input and output is a reasonable candidate for auto-static import, as one or both are needed by most simple programs.
While these are currently instance methods accessed through static fields, we can easily create static methods for `println` and `readln` which are suitable for static import, and automatically import them. At which point our first program is now down to:

```
void main() {
    println("Hello World");
}
```

## Putting this all together

We've discussed several simplifications:

- Update the launcher protocol to make public, static, and arguments optional for main methods, and for main methods to be instance methods (when a no-argument constructor is available);
- Make the class wrapper for "main classes" optional (unnamed classes);
- Automatically static import methods like `println`

which together whittle our long list of day-1 concepts down considerably. While this is still not as minimal as the minimal Python or Ruby program -- statements must still live in a method -- the goal here is not to win at "code golf". The goal is to ensure that concepts not needed by simple programs need not appear in those programs, while at the same time not encouraging habits that have to be unlearned as programs scale up.

Each of these simplifications is individually small and unintrusive, and each is independent of the others. And each embodies a simple transformation that the author can easily manually reverse when it makes sense to do so: elided modifiers and `main` arguments can be added back, the class wrapper can be added back when the affordances of classes are needed (supertypes, constructors), and the full qualifier of static-import can be added back. And these reversals are independent of one another; they can be done in any combination or any order.

This seems to meet the requirements of our on-ramp; we've eliminated most of the day-1 ceremony elements without introducing new concepts that need to be unlearned.
The remaining concepts -- a method is a container for statements, and a program is a Java source file with a `main` method -- are easily understood in relation to their fully specified counterparts.

## Alternatives

Obviously, we've lived with the status quo for 25+ years, so we could continue to do so. There were other alternatives explored as well; ultimately, each of these fell afoul of one of our goals.

### Can't we go further?

Fans of "code golf" -- of which there are many -- are surely right now trying to figure out how to eliminate the last little bit, the `main` method, and allow statements to exist at the top-level of a program. We deliberately stopped short of this because it offers little value beyond the first few minutes, and even that small value quickly becomes something that needs to be unlearned.

The fundamental problem behind allowing such "loose" statements is that variables can be declared inside both classes (fields) and methods (local variables), and they share the same syntactic production but not the same semantics. So it is unclear (to both compilers and humans) whether a "loose" variable would be a local or a field. If we tried to adopt some sort of simple heuristic to collapse this ambiguity (e.g., whether it precedes or follows the first statement), that may satisfy the compiler, but now simple refactorings might subtly change the meaning of the program, and we'd be replacing the explicit syntactic overhead of `void main()` with an invisible "line" in the program that subtly affects semantics, and a new subtle rule about the meaning of variable declarations that applies only to unnamed classes.

This doesn't help students, nor is this particularly helpful for all but the most trivial programs. It quickly becomes a crutch to be discarded and unlearned, which falls afoul of our "on ramp" goals.
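
The field-versus-local ambiguity is easier to see with both readings side by side. The following is a small sketch in today's Java, with class and method names made up for illustration: the same declaration `int count` behaves differently depending on whether it is a field or a local variable, which is why a "loose" declaration would need some rule to pick one meaning.

```java
public class FieldVersusLocal {
    // Read as a field: implicitly initialized to 0, shared by all instance
    // methods, and it keeps its value between calls.
    int count;

    int bumpField() {
        return ++count;
    }

    int bumpLocal() {
        // Read as a local: it must be assigned before use (a bare
        // "int count;" followed by "++count" would not compile), it shadows
        // the field, and it is recreated on every call.
        int count = 0;
        return ++count;
    }

    public static void main(String[] args) {
        FieldVersusLocal demo = new FieldVersusLocal();
        System.out.println(demo.bumpField()); // 1
        System.out.println(demo.bumpField()); // 2, the field kept its state
        System.out.println(demo.bumpLocal()); // 1
        System.out.println(demo.bumpLocal()); // 1, a fresh local each time
    }
}
```

Moving a declaration across the invisible "line" described above would silently switch a program between these two behaviors.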
Of all the concepts on our list, "methods" and "a program is specified by a main method" seem the ones that are most worth asking students to learn early.

### Why not "just" use `jshell`?

While JShell is a great interactive tool, leaning too heavily on it as an on-ramp would fall afoul of our goals. A JShell session is not a program, but a sequence of code snippets. When we type declarations into `jshell`, they are viewed as implicitly static members of some unspecified class, with accessibility ignored completely, and statements execute in a context where all previous declarations are in scope. This is convenient for experimentation -- the primary goal of `jshell` -- but not such a great mental model for learning to write Java programs.

Transforming a batch of working declarations in `jshell` to a real Java program would not be sufficiently simple or unintrusive, and would lead to a non-idiomatic style of code, because the straightforward translation would have us redeclaring each method, class, and variable declaration as `static`. Further, this is probably not the direction we want to go when we scale up from a handful of statements and declarations to a simple class -- we probably want to start using classes as classes, not just as containers for static members. JShell is a great tool for exploration and debugging, and we expect many educators will continue to incorporate it into their curriculum, but it is not the on-ramp programming model we are looking for.

### What about "always local"?

One of the main tensions that `main` introduces is that most class members are not `static`, but the `main` method is -- and that forces programmers to confront the seam between static and non-static members. JShell answers this with "make everything static". Another approach would be to "make everything local" -- treat a simple program as being the "unwrapped" body of an implicit main method. We already allow variables and classes to be declared local to a method.
We could add local methods (a useful feature in its own right) and relax some of the asymmetries around nesting (again, an attractive cleanup), and then treat a mix of declarations and statements without a class wrapper as the body of an invisible `main` method.

This seems an attractive model as well -- at first. While the syntactic overhead of converting back to full-blown classes -- wrap the whole thing in a `main` method and a `class` declaration -- is far less intrusive than the transformation inherent in `jshell`, this is still not an ideal on-ramp. Local variables interact with local classes (and methods, when we have them) in a very different way than instance fields do with instance methods and inner classes: their scopes are different (no forward references), their initialization rules are different, and captured local variables must be effectively final. This is a subtly different programming model that would then have to be unlearned when scaling up to full classes. Further, the result of this wrapping -- where everything is local to the main method -- is also not "idiomatic Java". So while local methods may be an attractive feature, they are similarly not the on-ramp we are looking for.

From kevinb at google.com Thu Sep 29 04:12:58 2022
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 28 Sep 2022 21:12:58 -0700
Subject: Paving the on-ramp
In-Reply-To: <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>
Message-ID:

Meta-comment: I think you have the right *motivating* use cases (beginners, small/temporary programs), but I expect pretty much *any* main method to want to use this, and I don't see why it shouldn't.
That makes those use cases important and worth reasonable attempts at accommodation, regardless of whether we'd even be doing this for their sake alone. On Wed, Sep 28, 2022 at 6:36 PM Brian Goetz wrote: > > > A major design goal of yours seems clear: to get there without rendering > Java source files explicitly bimorphic ("class" source files all look like > this, "main" source files all look like that). Instead you have a set of > independent features that can compose to get you there in a "smooth ramp". > The design looks heavily influenced by that goal. > > > Yes, I sometimes call this "telescoping", because there's a chain of "x is > short for y is short for z". For example, with lambdas: > > x -> e > > is-short-for > > (x) -> e > > is-short-for > > (var x) -> e > > is-short-for > > (int x) -> e // or whatever the arg is > > As a design convention, it enables a mental model where there is really > just one form, with varying things you could leave out. Early in the > Lambda days, we saw articles like "there are N forms of lambda > expressions", and that stuff infuriates me, it is as if people go out of > their way to find more complex mental models than necessary. > Right, and that design was good inasmuch as there were good use cases for every one of the rungs (and there were). As my program grows and gets more complex, I will make changes like > > * use more other libraries > * add args to main() > * add helper methods > * add constants > * create new classes and use them from here > > But: when and why would I be motivated to change *this* code *itself* to > "become" a class, become instantiable, acquire instance state, etc. etc.? I > don't imagine ever having that urge. main() is just main()! It's just a way > in. Isn't it literally just a way to (a) transfer control back and forth > and (b) hand me args? > > This doesn't seem like such a leap to me. You might start out hardcoding > a file path that will be read. 
Then you might decide to let that be passed > in (so you add the args parameter to main). Then you might want to treat > the filename to be read as a field so it can be shared across methods, so > you turn it into a constructor parameter. > So far so good up to that last part. A constructor parameter? I thought you were going to say you just add the field and all your non-static methods read and write it at will. Getting a bit lost in the twists n' folds. > One could imagine "introduce X" refactorings to do all of these. The > process of hardcoding to main() parameter to constructor argument is a > natural sedimentation of things finding their right level. (And even if > you don't do all of this, knowing that its an ordinary class (like an enum > or a record) just with a concise syntax means you don't have to learn new > concepts. I don't want Foo classes and Bar classes.) > > > Note I was only reacting to "static bad!" here. I would be happy if *that* > argument were dropped, but you do still have another valid argument: that > `static` is another backward default, and the viral burden of putting it > not just on main() but every helper method you factor out is pure nuisance. > (I'd suggest mentioning the viral nature of this particular burden > higher/more prominently in the doc, as it's currently out of place under > the "unnamed classes" section.) > > (That doesn't mean "so let's do it"; I still hope to see that benefit > carefully measured against the drawbacks. Btw, *some* of those drawbacks > might be eased by disallowing an explicit constructor... and jeez, please > disallow type parameters too... I'm leaving the exact meaning of "disallow" > undefined here.) > > > Indeed, I intend that there are no explicit constructors or instance > initializers here. (There can't be constructors, because the class is > unnamed!) > Hmm, I was under the impression I could drop all my `static`s while keeping the class signature if I wanted? 
But, if I can and even then explicit constrs and initers are banned, then indeed, at least one of my drawbacks is invalid. I don't think it undercuts my overall case that much. > I think I said somewhere "such classes can contain ..." and didn't list > constructors, but I should have been more explicit. > > > ## Unnamed classes >> >> In a simple program, the `class` declaration often doesn't help either, >> because >> other classes (if there are any) are not going to reference it by name, >> and we >> don't extend a superclass or implement any interfaces. >> > > How do I tell `java` which class file to load and call main() on? Class > name based on file name, I guess? > > > Sadly yes. More sad stories coming on this front, Jim can tell. > > > If we say an "unnamed >> class" consists of member declarations without a class header, then our >> Hello >> World program becomes: >> >> ``` >> void main() { >> System.out.println("Hello World"); >> } >> ``` >> > > > One or more class annotations could appear below package/imports? > > > No package statement (unnamed classes live in the unnamed package), but > imports are OK. > I'm confused; what does any of this have to do with package location? Isn't that orthogonal to everything we're discussing? I'm also not sure why we're talking about "unnamed" so much; the condition we're talking about is really "signatureless" or "body-only", which as far as I know could be a purely source-level distinction, producing a completely normal-looking, named class in a classfile. (Wrote that before Guy's message came in, even.) No class annotations. No type variables. No superclasses. > Okay, in the motivating use cases (beginner/small/temp), yeah, they're awfully unlikely to want class annotations. But again, to me every main() method out there is a use case too. And plenty of class annotations are used for purposes that aren't about *classes*, just "this whole range of code here". 
It feels like they should be allowed, unless you want to talk about `ElementType.COMPILATION_UNIT` ... :-)

Such source files can still have fields, methods, and even nested classes,
>>
>
> Do those get compiled to real nested classes, nested inside an unnamed class? So if I edit a "regular" `Foo.java` file, go down below the last `}` and add a `main` function there, does that cause the whole `Foo` class above to be reinterpreted as "nested inside an unnamed class" instead of top-level?
>
> To be discussed!
>
> This is my notion of a natural progression:
>
> 1. Write procedural code: calling static methods, using existing data types, soon calling their instance methods
> 2. Proceed to creating your own types (from simple data types onward) and using them too
> 3. One day learn that your main() function is actually a method of an instantiable type too... at pub trivia night, then promptly forget it
>
> Right.
>

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From amaembo at gmail.com Thu Sep 29 07:07:44 2022
From: amaembo at gmail.com (Tagir Valeev)
Date: Thu, 29 Sep 2022 09:07:44 +0200
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID:

Hello!

Very interesting writing, thanks! A couple of notes from me:

> ## Unnamed classes
> ...
> Such source files can still have fields, methods, and even nested classes, so
> that as a program evolves from a few statements to needing some ancillary state
> or helper methods, these can be factored out of the `main` method while still

I wonder how we tell apart unnamed class syntax and normal class syntax.
E.g., consider the source file: // Hello.java public class Hello { // tons of logic } void main() { } Will it be considered as a correct Java file, having Hello class as a nested class of top-level unnamed class? If yes, then, adding a main method after the class declaration, I change the class semantics, making it an inner class. This looks like action at a distance and may cause confusion. E.g., I just wrote a main() method outside of Hello class instead of inside, and boom, now Hello is not resolvable from other classes, for no apparent reason. I assume that the main() method is required for an unnamed class, and if there are only other top-level declarations, then it should be a compilation error, right? > ## Predefined static imports > ``` > void main() { > println("Hello World"); > } > ``` I wonder how it will play with existing static star imports. We already saw problems when updated to Java 9 or Java 14 that star-imported class named Module or Record becomes unresolvable. If existing code already imports static method named println from somewhere, will this code become invalid? With best regards, Tagir Valeev. From forax at univ-mlv.fr Thu Sep 29 07:39:45 2022 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 29 Sep 2022 09:39:45 +0200 (CEST) Subject: Paving the on-ramp In-Reply-To: References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> Message-ID: <910090311.15460599.1664437163375.JavaMail.zimbra@u-pem.fr> ----- Original Message ----- > From: "Tagir Valeev" > To: "Brian Goetz" > Cc: "amber-spec-experts" > Sent: Thursday, September 29, 2022 9:07:44 AM > Subject: Re: Paving the on-ramp > Hello! > > Very interesting writing, thanks! A couple of notes from me: > >> ## Unnamed classes >> ... 
>> Such source files can still have fields, methods, and even nested classes, so >> that as a program evolves from a few statements to needing some ancillary state >> or helper methods, these can be factored out of the `main` method while still > > I wonder how we tell apart unnamed class syntax and normal class > syntax. E.g., consider the source file: > > // Hello.java > public class Hello { > // tons of logic > } > > void main() { > } > > Will it be considered as a correct Java file, having Hello class as a > nested class of top-level unnamed class? > If yes, then, adding a main method after the class declaration, I > change the class semantics, making it an inner class. > This looks like action at a distance and may cause confusion. E.g., I > just wrote a main() method outside of Hello class instead of inside, > and boom, > now Hello is not resolvable from other classes, for no apparent reason. There are several ways to try to tame that issue: - we can restrict unnamed classes to work only when run via java Hello.java, so no Hello.class is generated at compile time and there is no problem with Hello being resolvable. - we can disallow an unnamed class from containing a nested class with the same name as the unnamed class, though the error message will still be hard to decipher for beginners. - we can disallow nested classes in unnamed classes, but that's a bummer because being able to write records inside an unnamed class is a great combo. > > I assume that the main() method is required for an unnamed class, and > if there are only other top-level declarations, > then it should be a compilation error, right ? I do not think you can, because having a file named Foo.java containing only a non-public class Bar is currently legal in Java. > >> ## Predefined static imports >> ``` >> void main() { >> println("Hello World"); >> } >> ``` > > I wonder how it will play with existing static star imports.
We > already saw problems when updated to Java 9 or Java 14 that > star-imported class named Module or Record becomes unresolvable. If > existing code already imports static method named println from > somewhere, will this code become invalid? Yes, I've asked Brian the same question. We need the predefined static imports to be resolved after the classical static imports are resolved. BTW, there is a connection with the templated string spec here, because STR or FMT also need to be predefined static imports. > > With best regards, > Tagir Valeev. regards, Rémi From forax at univ-mlv.fr Thu Sep 29 08:01:58 2022 From: forax at univ-mlv.fr (Remi Forax) Date: Thu, 29 Sep 2022 10:01:58 +0200 (CEST) Subject: Paving the on-ramp In-Reply-To: <06823323-6214-438A-80A7-184310F01C55@oracle.com> References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <06823323-6214-438A-80A7-184310F01C55@oracle.com> Message-ID: <755558451.15481407.1664438518341.JavaMail.zimbra@u-pem.fr> > From: "Guy Steele" > To: "Brian Goetz" > Cc: "amber-spec-experts" > Sent: Thursday, September 29, 2022 5:41:24 AM > Subject: Re: Paving the on-ramp > This is headed in the right direction, but I worry about the use of dei ex > machina that have the property that they are NOT easily explained in terms of > something the user could have written. Perhaps these could all be dispensed > with by using an alternate strategy of code rewriting, (at least as an > explanation, if not also as an implementation mechanism). > (1) Instead of having a magic "unnamed"
class, which has bizarre properties such > as not having a constructor (or at least not a constructor you can mention in a > `new` expression), only to then require a second magic rule about what you put > in the command line "java ...", why not simply use the much more obvious rule > that if a compilation unit doesn't have a class header, then a class header is > _supplied_ by the compiler, and the name of the class is taken from the > filename of the compilation unit? You can have both: an unnamed class is syntactic sugar, but we do not want to allow puzzling combinations. Declaring a constructor in Java uses the class name, but it's not clear to me that we should allow the class name of an unnamed class to be denotable; it seems too magical to me. > (2) Instead of complicating the Java launch protocol, why not leave it alone, > and instead use the existing mechanism of "in situation X, if the user fails to > provide method Y, the compiler will provide a definition automatically"? > Specifically, in a compilation unit named Foo.java for which a class header has > to be provided automatically, if a method with signature "main()" is present > but no static method with signature "main(String[])" is present, then a static > method with signature "main(String[])" is automatically provided by the > compiler. > (2a) If the method with signature "main()" is static, the provided method is > public static void main(String[] args) { main(); } > (2b) If the method with signature "main()" is not static, the provided method is > public static void main(String[] args) { new Foo().main(); } > Notice that this mechanism also automatically makes the keyword "public" > optional on the declaration of "main()". I agree on that; it also goes well with the warning Kevin was mentioning. > (3) Instead of speaking of automatic imports, speak of the compiler > automatically providing certain import statements if the compilation unit > doesn't have a class header.
I disagree about this one: if we can write println() inside an unnamed class, we should be able to write println() inside a classical class. You can disallow things inside an unnamed class, but you cannot have a behavior that works in an unnamed class and does not work in a classical class; otherwise an unnamed class is not really Java, it's a kid sandbox. > That way _everything_ (the name of class when a class header is not provided, > the behavior when you write variously abbreviated definitions of method `main`, > and the automatic importation of certain libraries) can be explained in terms > of source-code rewrites that the programmer can do once the programmer learns > enough about more advanced features. > -- Guy Rémi >> On Sep 28, 2022, at 1:57 PM, Brian Goetz <brian.goetz at oracle.com> wrote: >> At various points, we've explored the question of which program elements are >> most and least helpful for students first learning Java. After considering a >> number of alternatives over the years, I have a simple proposal for smoothing >> the "on ramp" to Java programming, while not creating new things to unlearn. >> Markdown source is below, HTML will appear soon at: >> https://openjdk.org/projects/amber/design-notes/on-ramp >> # Paving the on-ramp >> Java is one of the most widely taught programming languages in the world. Tens >> of thousands of educators find that the imperative core of the language combined >> with a straightforward standard library is a foundation that students can >> comfortably learn on. Choosing Java gives educators many degrees of freedom: >> they can situate students in `jshell` or Notepad or a full-fledged IDE; they can >> teach imperative, object-oriented, functional, or hybrid programming styles; and >> they can easily find libraries to interact with external data and services.
>> No language is perfect, and one of the most common complaints about Java is that >> it is "too verbose" or has "too much ceremony." And unfortunately, Java imposes >> its heaviest ceremony on those first learning the language, who need and >> appreciate it the least. The declaration of a class and the incantation of >> `public static void main` is pure mystery to a beginning programmer. While >> these incantations have principled origins and serve a useful organizing purpose >> in larger programs, they have the effect of placing obstacles in the path of >> _becoming_ Java programmers. Educators constantly remind us of the litany of >> complexity that students have to confront on Day 1 of class -- when they really >> just want to write their first program. >> As an amusing demonstration of this, in her JavaOne keynote appearance in 2019, >> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked about when >> she learned to program in Java, and how her teacher performed a rap song >> to help students memorize `"public static void main"`. Our hats are off to >> creative educators everywhere for this kind of dedication, but teachers >> shouldn't have to do this. >> Of course, advanced programmers complain about ceremony too. We will never be >> able to satisfy programmers' insatiable appetite for typing fewer keystrokes, >> and we shouldn't try, because the goal of programming is to write programs that >> are easy to read and are clearly correct, not programs that were easy to type. >> But we can try to better align the ceremony commensurate with the value it >> brings to a program -- and let simple programs be expressed more simply.
>> ## Concept overload >> The classic "Hello World" program looks like this in Java: >> ``` >> public class HelloWorld { >> public static void main(String[] args) { >> System.out.println("Hello World"); >> } >> } >> ``` >> It may only be five lines, but those lines are packed with concepts that are >> challenging to absorb without already having some programming experience and >> familiarity with object orientation. Let's break down the concepts a student >> confronts when writing their first Java program: >> - **public** (on the class). The `public` accessibility level is relevant >> only when there is going to be cross-package access; in a simple "Hello >> World" program, there is only one class, which lives in the unnamed package. >> They haven't even written a one-line program yet; the notion of access >> control -- keeping parts of a program from accessing other parts of it -- is >> still way in their future. >> - **class**. Our student hasn't set out to write a _class_, or model a >> complex system with objects; they want to write a _program_. In Java, a >> program is just a `main` method in some class, but at this point our student >> still has no idea what a class is or why they want one. >> - **Methods**. Methods are of course a key concept in Java, but the mechanics >> of methods -- parameters, return types, and invocation -- are still >> unfamiliar, and the `main` method is invoked magically from the `java` >> launcher rather than from explicit code. >> - **public** (again). Like the class, the `main` method has to be public, but >> again this is only relevant when programs are large enough to require >> packages to organize them. >> - **static**. The `main` method has to be static, and at this point, students >> have no context for understanding what a static method is or why they want >> one. Worse, the early exposure to `static` methods will turn out to be a >> bad habit that must be later unlearned. 
Worse still, the fact that the >> `main` method is `static` creates a seam between `main` and other methods; >> either they must become `static` too, or the `main` method must trampoline >> to some sort of "instance main" (more ceremony!) And if we get this wrong, >> we get the dreaded and mystifying `"cannot be referenced from a static >> context"` error. >> - **main**. The name `main` has special meaning in a Java program, indicating >> the starting point of a program, but this specialness hides behind being an >> ordinary method name. This may contribute to the sense of "so many magic >> incantations." >> - **String[]**. The parameter to `main` is an array of strings, which are the >> arguments that the `java` launcher collected from the command line. But our >> first program -- likely our first dozen -- will not use command-line >> parameters. Requiring the `String[]` parameter is, at this point, a mistake >> waiting to happen, and it will be a long time until this parameter makes >> sense. Worse, educators may be tempted to explain arrays at this point, >> which further increases the time-to-first-program. >> - **System.out.println**. If you look closely at this incantation, each >> element in the chain is a different thing -- `System` is a class (what's a >> class again?), `out` is a static field (what's a field?), and `println` is >> an instance method. The only part the student cares about right now is >> `println`; the rest of it is an incantation that they do not yet understand >> in order to get at the behavior they want. >> That's a lot to explain to a student on the first day of class. There's a good >> chance that by now, class is over and we haven't written any programs yet, or >> the teacher has said "don't worry what this means, you'll understand it later" >> six or eight times. 
Not only is this a lot of _syntactic_ things to absorb, but >> each of those things appeals to a different concept (class, method, package, >> return value, parameter, array, static, public, etc) that the student doesn't >> have a framework for understanding yet. Each of these will have an important >> role to play in larger programs, but so far, they only contribute to "wow, >> programming is complicated." >> It won't be practical (or even desirable) to get _all_ of these concepts out of >> the student's face on day 1, but we can do a lot -- and focus on the ones that >> do the most to help beginners understand how programs are constructed. >> ## Goal: a smooth on-ramp >> As much as programmers like to rant about ceremony, the real goal here is not >> mere ceremony reduction, but providing a graceful _on ramp_ to Java programming. >> This on-ramp should be helpful to beginning programmers by requiring only those >> concepts that a simple program needs. >> Not only should an on-ramp have a gradual slope and offer enough acceleration >> distance to get onto the highway at the right speed, but its direction must >> align with that of the highway. When a programmer is ready to learn about more >> advanced concepts, they should not have to discard what they've already learned, >> but instead easily see how the simple programs they've already written >> generalize to more complicated ones, and both the syntactic and conceptual >> transformation from "simple" to "full blown" program should be straightforward >> and unintrusive. It is a definite non-goal to create a "simplified dialect of >> Java for students".
>> We identify three simplifications that should aid both educators and students in >> navigating the on-ramp to Java, as well as being generally useful to simple >> programs beyond the classroom as well: >> - A more tolerant launch protocol >> - Unnamed classes >> - Predefined static imports for the most critical methods and fields >> ## A more tolerant launch protocol >> The Java Language Specification has relatively little to say about how Java >> "programs" get launched, other than saying that there is some way to indicate >> which class is the initial class of a program (JLS 12.1.1) and that a public >> static method called `main` whose sole argument is of type `String[]` and whose >> return is `void` constitutes the entry point of the indicated class. >> We can eliminate much of the concept overload simply by relaxing the >> interactions between a Java program and the `java` launcher: >> - Relax the requirement that the class, and `main` method, be public. Public >> accessibility is only relevant when access crosses packages; simple programs >> live in the unnamed package, so cannot be accessed from any other package >> anyway. For a program whose main class is in the unnamed package, we can >> drop the requirement that the class or its `main` method be public, >> effectively treating the `java` launcher as if it too resided in the unnamed >> package. >> - Make the "args" parameter to `main` optional, by allowing the `java` launcher >> to >> first look for a main method with the traditional `main(String[])` >> signature, and then (if not found) for a main method with no arguments. >> - Make the `static` modifier on `main` optional, by allowing the `java` launcher >> to >> invoke an instance `main` method (of either signature) by instantiating an >> instance using an accessible no-arg constructor and then invoking the `main` >> method on it. 
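The lookup order described in the three bullets above can be sketched with plain reflection. This is only an illustration under stated assumptions, not the real `java` launcher: the names `LaunchSketch` and `HelloInstanceMain` are hypothetical, the sketch only inspects methods declared directly on the class (so an inherited `main` would not be found), it does not check the `void` return type, and it uses `setAccessible` rather than requiring the constructor to be accessible.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical sketch of the relaxed launch protocol; NOT the real launcher.
final class LaunchSketch {
    // Prefer the traditional main(String[]); fall back to a no-arg main().
    // Returns null when the class declares neither form.
    static Method findMain(Class<?> c) {
        try {
            return c.getDeclaredMethod("main", String[].class);
        } catch (NoSuchMethodException noArgsForm) {
            try {
                return c.getDeclaredMethod("main");
            } catch (NoSuchMethodException none) {
                return null;
            }
        }
    }

    static void launch(Class<?> c, String... args) {
        Method main = findMain(c);
        if (main == null) {
            throw new IllegalArgumentException("no main method in " + c.getName());
        }
        try {
            // Non-public classes and methods are allowed, as in the proposal.
            main.setAccessible(true);
            Object receiver = null;
            if (!Modifier.isStatic(main.getModifiers())) {
                // Instance main: create the receiver with a no-arg constructor.
                Constructor<?> ctor = c.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }
            if (main.getParameterCount() == 0) {
                main.invoke(receiver);
            } else {
                main.invoke(receiver, (Object) args);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not launch " + c.getName(), e);
        }
    }
}

// A non-public class with a non-public, non-static, no-arg main.
class HelloInstanceMain {
    static String lastGreeting; // records what ran, so the sketch can be checked

    void main() {
        lastGreeting = "Hello World";
        System.out.println(lastGreeting);
    }
}
```

Calling `LaunchSketch.launch(HelloInstanceMain.class)` instantiates the class and runs its instance `main`, which is the same effect as the bridge method `public static void main(String[] args) { new Foo().main(); }` discussed earlier in the thread, just performed by the caller instead of by compiler-injected code.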
>> This small set of changes to the launch protocol strikes out five of the bullet >> points in the above list of concepts: public (twice), static, method parameters, >> and `String[]`. >> At this point, our Hello World program is now: >> ``` >> class HelloWorld { >> void main() { >> System.out.println("Hello World"); >> } >> } >> ``` >> It's not any shorter by line count, but we've removed a lot of "horizontal >> noise" along with a number of concepts. Students and educators will appreciate >> it, but advanced programmers are unlikely to be in any hurry to make these >> implicit elements explicit either. >> Additionally, the notion of an "instance main" has value well beyond the first >> day. Because excessive use of `static` is considered a code smell, many >> educators encourage the pattern of "all the static `main` method does is >> instantiate an instance and call an instance `main` method" anyway. Formalizing >> the "instance main" protocol reduces a layer of boilerplate in these cases, and >> defers the point at which we have to explain what instance creation is -- and >> what `static` is. (Further, allowing the `main` method to be an instance method >> means that it could be inherited from a superclass, which is useful for simple >> frameworks such as test runners or service frameworks.) >> ## Unnamed classes >> In a simple program, the `class` declaration often doesn't help either, because >> other classes (if there are any) are not going to reference it by name, and we >> don't extend a superclass or implement any interfaces. 
If we say an "unnamed >> class" consists of member declarations without a class header, then our Hello >> World program becomes: >> ``` >> void main() { >> System.out.println("Hello World"); >> } >> ``` >> Such source files can still have fields, methods, and even nested classes, so >> that as a program evolves from a few statements to needing some ancillary state >> or helper methods, these can be factored out of the `main` method while still >> not yet requiring a full class declaration: >> ``` >> String greeting() { return "Hello World"; } >> void main() { >> System.out.println(greeting()); >> } >> ``` >> This is where treating `main` as an instance method really shines; the user has >> just declared two methods, and they can freely call each other. Students need >> not confront the confusing distinction between instance and static methods yet; >> indeed, if not forced to confront static members on day 1, it might be a while >> before they do have to learn this distinction. The fact that there is a >> receiver lurking in the background will come in handy later, but right now is >> not bothering anybody. >> [JEP 330](https://openjdk.org/jeps/330) >> allows single-file programs to be >> launched directly without compilation; this streamlined launcher pairs well with >> unnamed classes. >> ## Predefined static imports >> The most important classes, such as `String` and `Integer`, live in the >> `java.lang` package, which is automatically on-demand imported into all >> compilation units; this is why we do not have to `import java.lang.String` in >> every class. Static imports were not added until Java 5, but no corresponding >> facility for automatic on-demand import of common behavior was added at that >> time. Most programs, however, will want to do console IO, and Java forces us to >> do this in a roundabout way -- through the static `System.out` and `System.in` >> fields.
Basic console input and output is a reasonable candidate for >> auto-static import, as one or both are needed by most simple programs. While >> these are currently instance methods accessed through static fields, we can >> easily create static methods for `println` and `readln` which are suitable for >> static import, and automatically import them. At which point our first program >> is now down to: >> ``` >> void main() { >> println("Hello World"); >> } >> ``` >> ## Putting this all together >> We've discussed several simplifications: >> - Update the launcher protocol to make public, static, and arguments optional >> for main methods, and for main methods to be instance methods (when a >> no-argument constructor is available); >> - Make the class wrapper for "main classes" optional (unnamed classes); >> - Automatically static import methods like `println` >> which together whittle our long list of day-1 concepts down considerably. While >> this is still not as minimal as the minimal Python or Ruby program -- statements >> must still live in a method -- the goal here is not to win at "code golf". The >> goal is to ensure that concepts not needed by simple programs need not appear in >> those programs, while at the same time not encouraging habits that have to be >> unlearned as programs scale up. >> Each of these simplifications is individually small and unintrusive, and each is >> independent of the others. And each embodies a simple transformation that the >> author can easily manually reverse when it makes sense to do so: elided >> modifiers and `main` arguments can be added back, the class wrapper can be added >> back when the affordances of classes are needed (supertypes, constructors), and >> the full qualifier of static-import can be added back. And these reversals are >> independent of one another; they can be done in any combination or any order.
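As a rough illustration of the "static methods for `println` and `readln`" mentioned in the predefined static imports section, here is a minimal sketch. The holder class name `SimpleIO` is an assumption made for this example; the proposal does not say where such methods would live or what their exact signatures would be.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;

// Hypothetical holder class; the proposal does not name one.
final class SimpleIO {
    private static final BufferedReader IN =
            new BufferedReader(new InputStreamReader(System.in));

    private SimpleIO() {}

    // Static wrapper around the instance method System.out.println.
    static void println(Object value) {
        System.out.println(value);
    }

    // Reads one line from standard input; returns null at end of input.
    static String readln() {
        try {
            return IN.readLine();
        } catch (IOException e) {
            // Keep beginner code free of checked exceptions.
            throw new UncheckedIOException(e);
        }
    }
}
```

Today a class like this would have to live in a named package and be pulled in with an explicit `import static` declaration; the proposal amounts to having the equivalent import supplied automatically, and reversing the simplification means writing `System.out.println(...)` again.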
>> This seems to meet the requirements of our on-ramp; we've eliminated most of the >> day-1 ceremony elements without introducing new concepts that need to be >> unlearned. The remaining concepts -- a method is a container for statements, and >> a program is a Java source file with a `main` method -- are easily understood in >> relation to their fully specified counterparts. >> ## Alternatives >> Obviously, we've lived with the status quo for 25+ years, so we could continue >> to do so. There were other alternatives explored as well; ultimately, each of >> these fell afoul of one of our goals. >> ### Can't we go further? >> Fans of "code golf" -- of which there are many -- are surely right now trying to >> figure out how to eliminate the last little bit, the `main` method, and allow >> statements to exist at the top-level of a program. We deliberately stopped >> short of this because it offers little value beyond the first few minutes, and >> even that small value quickly becomes something that needs to be unlearned. >> The fundamental problem behind allowing such "loose" statements is that >> variables can be declared inside both classes (fields) and methods (local >> variables), and they share the same syntactic production but not the same >> semantics. So it is unclear (to both compilers and humans) whether a "loose" >> variable would be a local or a field. If we tried to adopt some sort of simple >> heuristic to collapse this ambiguity (e.g., whether it precedes or follows the >> first statement), that may satisfy the compiler, but now simple refactorings >> might subtly change the meaning of the program, and we'd be replacing the >> explicit syntactic overhead of `void main()` with an invisible "line" in the >> program that subtly affects semantics, and a new subtle rule about the meaning >> of variable declarations that applies only to unnamed classes. This doesn't >> help students, nor is this particularly helpful for all but the most trivial >> programs. 
It quickly becomes a crutch to be discarded and unlearned, which >> falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods" >> and "a program is specified by a main method" seem the ones that are most worth >> asking students to learn early. >> ### Why not "just" use `jshell`? >> While JShell is a great interactive tool, leaning too heavily on it as an onramp >> would fall afoul of our goals. A JShell session is not a program, but a >> sequence of code snippets. When we type declarations into `jshell`, they are >> viewed as implicitly static members of some unspecified class, with >> accessibility ignored completely, and statements execute in a context where >> all previous declarations are in scope. This is convenient for experimentation >> -- the primary goal of `jshell` -- but not such a great mental model for >> learning to write Java programs. Transforming a batch of working declarations >> in `jshell` to a real Java program would not be sufficiently simple or >> unintrusive, and would lead to a non-idiomatic style of code, because the >> straightforward translation would have us redeclaring each method, class, and >> variable declaration as `static`. Further, this is probably not the direction >> we want to go when we scale up from a handful of statements and declarations to >> a simple class -- we probably want to start using classes as classes, not just >> as containers for static members. JShell is a great tool for exploration and >> debugging, and we expect many educators will continue to incorporate it into >> their curriculum, but is not the on-ramp programming model we are looking for. >> ### What about "always local"? >> One of the main tensions that `main` introduces is that most class members are >> not `static`, but the `main` method is -- and that forces programmers to >> confront the seam between static and non-static members. JShell answers this >> with "make everything static".
>> Another approach would be to "make everything local" -- treat a simple program >> as being the "unwrapped" body of an implicit main method. We already allow >> variables and classes to be declared local to a method. We could add local >> methods (a useful feature in its own right) and relax some of the asymmetries >> around nesting (again, an attractive cleanup), and then treat a mix of >> declarations and statements without a class wrapper as the body of an invisible >> `main` method. This seems an attractive model as well -- at first. >> While the syntactic overhead of converting back to full-blown classes -- wrap >> the whole thing in a `main` method and a `class` declaration -- is far less >> intrusive than the transformation inherent in `jshell`, this is still not an >> ideal on-ramp. Local variables interact with local classes (and methods, when >> we have them) in a very different way than instance fields do with instance >> methods and inner classes: their scopes are different (no forward references), >> their initialization rules are different, and captured local variables must be >> effectively final. This is a subtly different programming model that would then >> have to be unlearned when scaling up to full classes. Further, the result of >> this wrapping -- where everything is local to the main method -- is also not >> "idiomatic Java". So while local methods may be an attractive feature, they are >> similarly not the on-ramp we are looking for.
From brian.goetz at oracle.com Thu Sep 29 13:54:59 2022 From: brian.goetz at oracle.com (Brian Goetz) Date: Thu, 29 Sep 2022 09:54:59 -0400 Subject: Paving the on-ramp In-Reply-To: <06823323-6214-438A-80A7-184310F01C55@oracle.com> References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <06823323-6214-438A-80A7-184310F01C55@oracle.com> Message-ID: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com> > (1) Instead of having a magic "unnamed" class, which has bizarre > properties such as not having a constructor (or at least not a > constructor you can mention in a `new` expression), only to then > require a second magic rule about what you put in the command line > "java ...", why not simply use the much more obvious rule that if a > compilation unit doesn't have a class header, then a class header is > _supplied_ by the compiler, and the name of the class is taken from > the filename of the compilation unit? The implementation does something like this, which is almost a forced move due to the vagaries of the various extralinguistic rules like "Foo.class should not contain a class other than Foo" (enforced by the class loader.) So indeed, if Foo.java contains an "unnamed" class, Foo.class will contain a class called "Foo". The main difference (if there is one) is the meaning of the name Foo in the body of the class. This relates to another "unnamed" JEP in flight, which is "unnamed variables", such as: var _ = mySideEffects(); Here, _ refers to a variable whose name is not entered into the symbol table, and is therefore write-once, read-none. The proposal herein for unnamed classes treats the class name the same way. (Full disclosure: since there is a Foo.class with a class called Foo in it, it is hard to stop _other_ classes from instantiating it.) What you are suggesting is to instead take that extralinguistically-derived name and make it official.
This reduces some of the restrictions (you can have constructors) but seems like it creates new ghosts from different machines, since now there is a name that has meaning in the language but which didn't come from any Java source code. > (2) Instead of complicating the Java launch protocol, why not leave it > alone, and instead use the existing mechanism of "in situation X, if > the user fails to provide method Y, the compiler will provide a > definition automatically"? Specifically, in a compilation unit named > Foo.java for which a class header has to be provided automatically, if > a method with signature "main()" is present but no static method with > signature "main(String[])" is present, then a static method with > signature "main(String[])" is automatically provided by the compiler. Saying that you can only use these two mechanisms together seems a sharp edge that users will get caught on. The two simplifications are orthogonal; there is "instance main" and there is "low ceremony classes", but coupling the two in this way means you have to give up one if you don't use the other. However, your "if you don't provide..." approach is an entirely valid way to implement "instance main" -- by injecting additional methods into the compiled class rather than modifying the launcher. It would be specified slightly differently (since it is also reflectively visible) but that's OK. > (3) Instead of speaking of automatic imports, speak of the compiler > automatically providing certain import statements if the compilation > unit doesn't have a class header. If we did this, when a class "graduates" from a low-ceremony class to a full class, then they'd have to go back and fix up all the println calls, and similarly it would put users in a position of "you can have ceremony reduction X, but only if you qualify for ceremony reduction Y."
It is surely a weaker argument that `println` needs to be effectively global, but after having programmed without saying "System.out" in front of println for only a few weeks, one already feels like going back is a punishment. (It's small, I know, but in some situations you type it a lot.) We have also seen the need for automatic imports elsewhere, such as in JEP 430, where a feature of the language carries with it a static member (the STR and FMT template processors), and requiring an explicit static import seems burdensome. Taken together, coupling "instance main" and "auto static imports" to "no class header" means that we have created a "beginners dialect" which is different, and which has to be unlearned and undone as soon as a class graduates. I would prefer to have these be orthogonal features to the extent possible. > That way _everything_ (the name of class when a class header is not > provided, the behavior when you write variously abbreviated > definitions of method `main`, and the automatic importation of certain > libraries) can be explained in terms of source-code rewrites that the > programmer can do once the programmer learns enough about more > advanced features. > > -- Guy > > >> On Sep 28, 2022, at 1:57 PM, Brian Goetz wrote: >> >> At various points, we've explored the question of which program >> elements are most and least helpful for students first learning >> Java. After considering a number of alternatives over the years, I >> have a simple proposal for smoothing the "on ramp" to Java >> programming, while not creating new things to unlearn. >> >> Markdown source is below, HTML will appear soon at: >> >> https://openjdk.org/projects/amber/design-notes/on-ramp >> >> >> # Paving the on-ramp >> >> Java is one of the most widely taught programming languages in the >> world.
Tens >> of thousands of educators find that the imperative core of the >> language combined >> with a straightforward standard library is a foundation that students can >> comfortably learn on.? Choosing Java gives educators many degrees of >> freedom: >> they can situate students in `jshell` or Notepad or a full-fledged >> IDE; they can >> teach imperative, object-oriented, functional, or hybrid programming >> styles; and >> they can easily find libraries to interact with external data and >> services. >> >> No language is perfect, and one of the most common complaints about >> Java is that >> it is "too verbose" or has "too much ceremony." And unfortunately, >> Java imposes >> its heaviest ceremony on those first learning the language, who need and >> appreciate it the least.? The declaration of a class and the >> incantation of >> `public static void main` is pure mystery to a beginning programmer.? >> While >> these incantations have principled origins and serve a useful >> organizing purpose >> in larger programs, they have the effect of placing obstacles in the >> path of >> _becoming_ Java programmers. Educators constantly remind us of the >> litany of >> complexity that students have to confront on Day 1 of class -- when >> they really >> just want to write their first program. >> >> As an amusing demonstration of this, in her JavaOne keynote >> appearance in 2019, >> [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked >> about when >> she learned to program in Java, and how her teacher performed a rap song >> to help students memorize `"public static void main"`.? Our hats are >> off to >> creative educators everywhere for this kind of dedication, but teachers >> shouldn't have to do this. >> >> Of course, advanced programmers complain about ceremony too.? 
We will >> never be >> able to satisfy programmers' insatiable appetite for typing fewer >> keystrokes, >> and we shouldn't try, because the goal of programming is to write >> programs that >> are easy to read and are clearly correct, not programs that were easy >> to type. >> But we can try to better align the ceremony commensurate with the >> value it >> brings to a program -- and let simple programs be expressed more simply. >> >> ## Concept overload >> >> The classic "Hello World" program looks like this in Java: >> >> ``` >> public class HelloWorld { >> ??? public static void main(String[] args) { >> ??????? System.out.println("Hello World"); >> ??? } >> } >> ``` >> >> It may only be five lines, but those lines are packed with concepts >> that are >> challenging to absorb without already having some programming >> experience and >> familiarity with object orientation. Let's break down the concepts a >> student >> confronts when writing their first Java program: >> >> ? - **public** (on the class).? The `public` accessibility level is >> relevant >> ??? only when there is going to be cross-package access; in a simple >> "Hello >> ??? World" program, there is only one class, which lives in the >> unnamed package. >> ??? They haven't even written a one-line program yet; the notion of >> access >> ??? control -- keeping parts of a program from accessing other parts >> of it -- is >> ??? still way in their future. >> >> ? - **class**.? Our student hasn't set out to write a _class_, or model a >> ??? complex system with objects; they want to write a _program_.? In >> Java, a >> ??? program is just a `main` method in some class, but at this point >> our student >> ??? still has no idea what a class is or why they want one. >> >> ? - **Methods**.? Methods are of course a key concept in Java, but >> the mechanics >> ??? of methods -- parameters, return types, and invocation -- are still >> ??? 
unfamiliar, and the `main` method is invoked magically from the >> `java` >> ??? launcher rather than from explicit code. >> >> ? - **public** (again).? Like the class, the `main` method has to be >> public, but >> ??? again this is only relevant when programs are large enough to require >> ??? packages to organize them. >> >> ? - **static**.? The `main` method has to be static, and at this >> point, students >> ??? have no context for understanding what a static method is or why >> they want >> ??? one.? Worse, the early exposure to `static` methods will turn out >> to be a >> ??? bad habit that must be later unlearned.? Worse still, the fact >> that the >> ??? `main` method is `static` creates a seam between `main` and other >> methods; >> ??? either they must become `static` too, or the `main` method must >> trampoline >> ??? to some sort of "instance main" (more ceremony!)? And if we get >> this wrong, >> ??? we get the dreaded and mystifying `"cannot be referenced from a >> static >> ??? context"` error. >> >> ? - **main**.? The name `main` has special meaning in a Java program, >> indicating >> ??? the starting point of a program, but this specialness hides >> behind being an >> ??? ordinary method name.? This may contribute to the sense of "so >> many magic >> ??? incantations." >> >> ? - **String[]**.? The parameter to `main` is an array of strings, >> which are the >> ??? arguments that the `java` launcher collected from the command >> line.? But our >> ??? first program -- likely our first dozen -- will not use command-line >> ??? parameters. Requiring the `String[]` parameter is, at this point, >> a mistake >> ??? waiting to happen, and it will be a long time until this >> parameter makes >> ??? sense.? Worse, educators may be tempted to explain arrays at this >> point, >> ??? which further increases the time-to-first-program. >> >> ? - **System.out.println**.? If you look closely at this incantation, >> each >> ??? 
element in the chain is a different thing -- `System` is a class >> (what's a >> ??? class again?), `out` is a static field (what's a field?), and >> `println` is >> ??? an instance method.? The only part the student cares about right >> now is >> ??? `println`; the rest of it is an incantation that they do not yet >> understand >> ??? in order to get at the behavior they want. >> >> That's a lot to explain to a student on the first day of class.? >> There's a good >> chance that by now, class is over and we haven't written any programs >> yet, or >> the teacher has said "don't worry what this means, you'll understand >> it later" >> six or eight times.? Not only is this a lot of _syntactic_ things to >> absorb, but >> each of those things appeals to a different concept (class, method, >> package, >> return value, parameter, array, static, public, etc) that the student >> doesn't >> have a framework for understanding yet.? Each of these will have an >> important >> role to play in larger programs, but so far, they only contribute to >> "wow, >> programming is complicated." >> >> It won't be practical (or even desirable) to get _all_ of these >> concepts out of >> the student's face on day 1, but we can do a lot -- and focus on the >> ones that >> do the most to help beginners understand how programs are constructed. >> >> ## Goal: a smooth on-ramp >> >> As much as programmers like to rant about ceremony, the real goal >> here is not >> mere ceremony reduction, but providing a graceful _on ramp_ to Java >> programming. >> This on-ramp should be helpful to beginning programmers by requiring >> only those >> concepts that a simple program needs. >> >> Not only should an on-ramp have a gradual slope and offer enough >> acceleration >> distance to get onto the highway at the right speed, but its >> direction must >> align with that of the highway.? 
When a programmer is ready to learn >> about more >> advanced concepts, they should not have to discard what they've >> already learned, >> but instead easily see how the simple programs they've already written >> generalize to more complicated ones, and both the syntatic and conceptual >> transformation from "simple" to "full blown" program should be >> straightforward >> and unintrusive.? It is a definite non-goal to create a "simplified >> dialect of >> Java for students". >> >> We identify three simplifications that should aid both educators and >> students in >> navigating the on-ramp to Java, as well as being generally useful to >> simple >> programs beyond the classroom as well: >> >> ?- A more tolerant launch protocol >> ?- Unnamed classes >> ?- Predefined static imports for the most critical methods and fields >> >> ## A more tolerant launch protocol >> >> The Java Language Specification has relatively little to say about >> how Java >> "programs" get launched, other than saying that there is some way to >> indicate >> which class is the initial class of a program (JLS 12.1.1) and that a >> public >> static method called `main` whose sole argument is of type `String[]` >> and whose >> return is `void` constitutes the entry point of the indicated class. >> >> We can eliminate much of the concept overload simply by relaxing the >> interactions between a Java program and the `java` launcher: >> >> ?- Relax the requirement that the class, and `main` method, be >> public.? Public >> ?? accessibility is only relevant when access crosses packages; >> simple programs >> ?? live in the unnamed package, so cannot be accessed from any other >> package >> ?? anyway.? For a program whose main class is in the unnamed package, >> we can >> ?? drop the requirement that the class or its `main` method be public, >> ?? effectively treating the `java` launcher as if it too resided in >> the unnamed >> ?? package. 
>> >> ?- Make the "args" parameter to `main` optional, by allowing the >> `java` launcher to >> ?? first look for a main method with the traditional `main(String[])` >> ?? signature, and then (if not found) for a main method with no >> arguments. >> >> ?- Make the `static` modifier on `main` optional, by allowing the >> `java` launcher to >> ?? invoke an instance `main` method (of either signature) by >> instantiating an >> ?? instance using an accessible no-arg constructor and then invoking >> the `main` >> ?? method on it. >> >> This small set of changes to the launch protocol strikes out five of >> the bullet >> points in the above list of concepts: public (twice), static, method >> parameters, >> and `String[]`. >> >> At this point, our Hello World program is now: >> >> ``` >> class HelloWorld { >> ??? void main() { >> ??????? System.out.println("Hello World"); >> ??? } >> } >> ``` >> >> It's not any shorter by line count, but we've removed a lot of >> "horizontal >> noise" along with a number of concepts.? Students and educators will >> appreciate >> it, but advanced programmers are unlikely to be in any hurry to make >> these >> implicit elements explicit either. >> >> Additionally, the notion of an "instance main" has value well beyond >> the first >> day.? Because excessive use of `static` is considered a code smell, many >> educators encourage the pattern of "all the static `main` method does is >> instantiate an instance and call an instance `main` method" anyway.? >> Formalizing >> the "instance main" protocol reduces a layer of boilerplate in these >> cases, and >> defers the point at which we have to explain what instance creation >> is -- and >> what `static` is.? (Further, allowing the `main` method to be an >> instance method >> means that it could be inherited from a superclass, which is useful >> for simple >> frameworks such as test runners or service frameworks.) 
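The lookup order described in this section can be sketched with ordinary reflection. This is only an illustration under stated assumptions, not the launcher's actual implementation: `LaunchSketch`, `launch`, and the three demo classes are hypothetical names, the sketch looks only at methods declared directly on the class (so it ignores an inherited `main`), and it calls `setAccessible(true)` to stand in for "treat the launcher as if it lived in the unnamed package".

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for the `java` launcher; every name here is illustrative.
final class LaunchSketch {

    // Locates and invokes `main` in the proposed order, and returns a short
    // description of the entry point that was chosen.
    static String launch(Class<?> mainClass, String... args) {
        try {
            Method main;
            boolean takesArgs = true;
            try {
                // Step 1: the traditional main(String[]) wins if it is declared.
                main = mainClass.getDeclaredMethod("main", String[].class);
            } catch (NoSuchMethodException e) {
                // Step 2: otherwise fall back to a main with no arguments.
                main = mainClass.getDeclaredMethod("main");
                takesArgs = false;
            }
            main.setAccessible(true); // `public` is not required in this sketch

            boolean isStatic = Modifier.isStatic(main.getModifiers());
            Object receiver = null;   // a static main needs no receiver
            if (!isStatic) {
                // Instance main: create the receiver with the no-arg constructor.
                Constructor<?> ctor = mainClass.getDeclaredConstructor();
                ctor.setAccessible(true);
                receiver = ctor.newInstance();
            }

            if (takesArgs) {
                main.invoke(receiver, (Object) args);
            } else {
                main.invoke(receiver);
            }
            return (isStatic ? "static " : "instance ")
                    + (takesArgs ? "main(String[])" : "main()");
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable main in " + mainClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(launch(Traditional.class));
        System.out.println(launch(Relaxed.class));
        System.out.println(launch(Both.class));
    }
}

// Today's form: public, static, with a String[] parameter.
class Traditional {
    public static void main(String[] args) {
        System.out.println("Hello from Traditional");
    }
}

// The relaxed form from the note: no public, no static, no parameter.
class Relaxed {
    void main() {
        System.out.println("Hello from Relaxed");
    }
}

// Declares both forms; under the proposed order the String[] form is chosen.
class Both {
    void main(String[] args) {
        System.out.println("Hello from Both.main(String[])");
    }

    void main() {
        System.out.println("Hello from Both.main()");
    }
}
```

The `Both` class shows the one case where the order matters: when a class declares both signatures, the no-argument `main` is never reached.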
>> >> ## Unnamed classes >> >> In a simple program, the `class` declaration often doesn't help >> either, because >> other classes (if there are any) are not going to reference it by >> name, and we >> don't extend a superclass or implement any interfaces.? If we say an >> "unnamed >> class" consists of member declarations without a class header, then >> our Hello >> World program becomes: >> >> ``` >> void main() { >> ??? System.out.println("Hello World"); >> } >> ``` >> >> Such source files can still have fields, methods, and even nested >> classes, so >> that as a program evolves from a few statements to needing some >> ancillary state >> or helper methods, these can be factored out of the `main` method >> while still >> not yet requiring a full class declaration: >> >> ``` >> String greeting() { return "Hello World"; } >> >> void main() { >> ??? System.out.println(greeting()); >> } >> ``` >> >> This is where treating `main` as an instance method really shines; >> the user has >> just declared two methods, and they can freely call each other.? >> Students need >> not confront the confusing distinction between instance and static >> methods yet; >> indeed, if not forced to confront static members on day 1, it might >> be a while >> before they do have to learn this distinction. The fact that there is a >> receiver lurking in the background will come in handy later, but >> right now is >> not bothering anybody. >> >> [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be >> launched directly without compilation; this streamlined launcher >> pairs well with >> unnamed classes. >> >> ## Predefined static imports >> >> The most important classes, such as `String` and `Integer`, live in the >> `java.lang` package, which is automatically on-demand imported into all >> compilation units; this is why we do not have to `import >> java.lang.String` in >> every class.? 
Static imports were not added until Java 5, but no >> corresponding >> facility for automatic on-demand import of common behavior was added >> at that >> time.? Most programs, however, will want to do console IO, and Java >> forces us to >> do this in a roundabout way -- through the static `System.out` and >> `System.in` >> fields.? Basic console input and output is a reasonable candidate for >> auto-static import, as one or both are needed by most simple >> programs.? While >> these are currently instance methods accessed through static fields, >> we can >> easily create static methods for `println` and `readln` which are >> suitable for >> static import, and automatically import them.? At which point our >> first program >> is now down to: >> >> ``` >> void main() { >> ??? println("Hello World"); >> } >> ``` >> >> ## Putting this all together >> >> We've discussed several simplifications: >> >> ?- Update the launcher protocol to make public, static, and arguments >> optional >> ?? for main methods, and for main methods to be instance methods (when a >> ?? no-argument constructor is available); >> ?- Make the class wrapper for "main classes" optional (unnamed classes); >> ?- Automatically static import methods like `println` >> >> which together whittle our long list of day-1 concepts down >> considerably.? While >> this is still not as minimal as the minimal Python or Ruby program -- >> statements >> must still live in a method -- the goal here is not to win at "code >> golf".? The >> goal is to ensure that concepts not needed by simple programs need >> not appear in >> those programs, while at the same time not encouraging habits that >> have to be >> unlearned as programs scale up. >> >> Each of these simplifications is individually small and unintrusive, >> and each is >> independent of the others.? 
And each embodies a simple transformation >> that the >> author can easily manually reverse when it makes sense to do so: elided >> modifiers and `main` arguments can be added back, the class wrapper >> can be added >> back when the affordances of classes are needed (supertypes, >> constructors), and >> the full qualifier of static-import can be added back.? And these >> reversals are >> independent of one another; they can done in any combination or any >> order. >> >> This seems to meet the requirements of our on-ramp; we've eliminated >> most of the >> day-1 ceremony elements without introducing new concepts that need to be >> unlearned. The remaining concepts -- a method is a container for >> statements, and >> a program is a Java source file with a `main` method -- are easily >> understood in >> relation to their fully specified counterparts. >> >> ## Alternatives >> >> Obviously, we've lived with the status quo for 25+ years, so we could >> continue >> to do so.? There were other alternatives explored as well; >> ultimately, each of >> these fell afoul of one of our goals. >> >> ### Can't we go further? >> >> Fans of "code golf" -- of which there are many -- are surely right >> now trying to >> figure out how to eliminate the last little bit, the `main` method, >> and allow >> statements to exist at the top-level of a program.? We deliberately >> stopped >> short of this because it offers little value beyond the first few >> minutes, and >> even that small value quickly becomes something that needs to be >> unlearned. >> >> The fundamental problem behind allowing such "loose" statements is that >> variables can be declared inside both classes (fields) and methods (local >> variables), and they share the same syntactic production but not the same >> semantics.? So it is unclear (to both compilers and humans) whether a >> "loose" >> variable would be a local or a field.? 
If we tried to adopt some sort >> of simple >> heuristic to collapse this ambiguity (e.g., whether it precedes or >> follows the >> first statement), that may satisfy the compiler, but now simple >> refactorings >> might subtly change the meaning of the program, and we'd be replacing the >> explicit syntactic overhead of `void main()` with an invisible "line" >> in the >> program that subtly affects semantics, and a new subtle rule about >> the meaning >> of variable declarations that applies only to unnamed classes.? This >> doesn't >> help students, nor is this particularly helpful for all but the most >> trivial >> programs.? It quickly becomes a crutch to be discarded and unlearned, >> which >> falls afoul of our "on ramp" goals.? Of all the concepts on our list, >> "methods" >> and "a program is specified by a main method" seem the ones that are >> most worth >> asking students to learn early. >> >> ### Why not "just" use `jshell`? >> >> While JShell is a great interactive tool, leaning too heavily on it >> as an onramp >> would fall afoul of our goals.? A JShell session is not a program, but a >> sequence of code snippets.? When we type declarations into `jshell`, >> they are >> viewed as implicitly static members of some unspecified class, with >> accessibility is ignored completely, and statements execute in a >> context where >> all previous declarations are in scope.? This is convenient for >> experimentation >> -- the primary goal of `jshell` -- but not such a great mental model for >> learning to write Java programs.? Transforming a batch of working >> declarations >> in `jshell` to a real Java program would not be sufficiently simple or >> unintrusive, and would lead to a non-idiomatic style of code, because the >> straightforward translation would have us redeclaring each method, >> class, and >> variable declaration as `static`.? 
Further, this is probably not the >> direction >> we want to go when we scale up from a handful of statements and >> declarations to >> a simple class -- we probably want to start using classes as classes, >> not just >> as containers for static members. JShell is a great tool for >> exploration and >> debugging, and we expect many educators will continue to incorporate >> it into >> their curriculum, but is not the on-ramp programming model we are >> looking for. >> >> ### What about "always local"? >> >> One of the main tensions that `main` introduces is that most class >> members are >> not `static`, but the `main` method is -- and that forces programmers to >> confront the seam between static and non-static members.? JShell >> answers this >> with "make everything static". >> >> Another approach would be to "make everything local" -- treat a >> simple program >> as being the "unwrapped" body of an implicit main method.? We already >> allow >> variables and classes to be declared local to a method.? We could add >> local >> methods (a useful feature in its own right) and relax some of the >> asymmetries >> around nesting (again, an attractive cleanup), and then treat a mix of >> declarations and statements without a class wrapper as the body of an >> invisible >> `main` method. This seems an attractive model as well -- at first. >> >> While the syntactic overhead of converting back to full-blown classes >> -- wrap >> the whole thing in a `main` method and a `class` declaration -- is >> far less >> intrusive than the transformation inherent in `jshell`, this is still >> not an >> ideal on-ramp.? Local variables interact with local classes (and >> methods, when >> we have them) in a very different way than instance fields do with >> instance >> methods and inner classes: their scopes are different (no forward >> references), >> their initialization rules are different, and captured local >> variables must be >> effectively final.? 
This is a subtly different programming model that would then
>> have to be unlearned when scaling up to full classes. Further, the result of
>> this wrapping -- where everything is local to the main method -- is also not
>> "idiomatic Java". So while local methods may be an attractive feature, they are
>> similarly not the on-ramp we are looking for.
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Thu Sep 29 14:01:19 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 10:01:19 -0400
Subject: Paving the on-ramp
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <2e779dc3-8a16-8d89-1e5d-6b14b04998d3@oracle.com>
Message-ID:

>     Indeed, I intend that there are no explicit constructors or
>     instance initializers here. (There can't be constructors, because
>     the class is unnamed!)
>
>
> Hmm, I was under the impression I could drop all my `static`s while
> keeping the class signature if I wanted? But, if I can and even then
> explicit constrs and initers are banned, then indeed, at least one of
> my drawbacks is invalid. I don't think it undercuts my overall case
> that much.

Yes you can. Example:

    class InstanceMain implements Serializable {
        public InstanceMain() { }
        public void main() { ... }
    }

and if you `java InstanceMain`, the launcher will do `new InstanceMain().main()`.

The two features -- no class header and instance main -- are orthogonal. If you don't have a class header, you don't get explicit constructors. If you use instance main, you must have a no-arg constructor, which could be supplied explicitly (if there is a class header) or implicitly (whether or not there is a class header).

>>
>> One or more class annotations could appear below package/imports?
>
>     No package statement (unnamed classes live in the unnamed
>     package), but imports are OK.
>
>
> I'm confused; what does any of this have to do with package location?
> Isn't that orthogonal to everything we're discussing?

There's a world where package is relevant here, but it seems pretty esoteric. If you define a class with no name, the thing you want to do with it is launch it directly. Seems like putting it in a package makes little sense here.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Thu Sep 29 14:57:57 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 10:57:57 -0400
Subject: Paving the on-ramp
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID:

On 9/29/2022 3:07 AM, Tagir Valeev wrote:
>> ## Unnamed classes
>> ...
>> Such source files can still have fields, methods, and even nested classes, so
>> that as a program evolves from a few statements to needing some ancillary state
>> or helper methods, these can be factored out of the `main` method while still
> I wonder how we tell apart unnamed class syntax and normal class
> syntax. E.g., consider the source file:
>
> // Hello.java
> public class Hello {
>   // tons of logic
> }
>
> void main() {
> }
>
> Will it be considered as a correct Java file, having Hello class as a
> nested class of top-level unnamed class?
> If yes, then, adding a main method after the class declaration, I
> change the class semantics, making it an inner class.
> This looks like action at a distance and may cause confusion. E.g., I
> just wrote a main() method outside of Hello class instead of inside,
> and boom,
> now Hello is not resolvable from other classes, for no apparent reason.

Yes, this is where the bodies are buried. At some point, the file name is likely to come into play, even though we would prefer it not. (Note that we have a little of that issue with "auxiliary classes" today.)
I think the move here is that for unnamed classes, if there is a "top level" nested class that matches the file name, we call that an error.

> I assume that the main() method is required for an unnamed class, and
> if there are only other top-level declarations,
> then it should be a compilation error, right?

Probably so, yes.

>
>> ## Predefined static imports
>> ```
>> void main() {
>>     println("Hello World");
>> }
>> ```
> I wonder how it will play with existing static star imports. We
> already saw problems when updated to Java 9 or Java 14 that
> star-imported class named Module or Record becomes unresolvable. If
> existing code already imports static method named println from
> somewhere, will this code become invalid?

"Star" is the right word. Currently we have a scheme where single imports beat star imports, so that if someone declares their own `println` method, it wins. Details to be worked out.

From james.laskey at oracle.com Thu Sep 29 16:47:17 2022
From: james.laskey at oracle.com (Jim Laskey)
Date: Thu, 29 Sep 2022 16:47:17 +0000
Subject: Paving the on-ramp
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <0A644B30-2AB1-425C-8DCA-1F706BD586BB@oracle.com>

    // Hello.java
    public class Hello {
        // tons of logic
    }

    void main() {
    }

and

    void main() {
    }

    // Hello.java
    public class Hello {
        // tons of logic
    }

are equivalent, as though the file content is wrapped in an outer class.

    public class {$name} {
        // Hello.java
        public class Hello {
            // tons of logic
        }

        void main() {
        }
    }

The trigger for an unnamed class is a method or field defined at the top level. So the order doesn't matter.

{$name} is derived from the source file name and must be a valid identifier. When running the source launcher, the class name doesn't matter (we could allow only with the source launcher). When compiling with javac we have to stuff the class somewhere and using a name derived from the source makes sense.
So if the source is Hello.java you can access the class Hello.Hello from an external reference.

Cheers,

-- Jim

On Sep 29, 2022, at 4:07 AM, Tagir Valeev wrote:

> Hello!
>
> Very interesting writing, thanks! A couple of notes from me:
>
>> ## Unnamed classes
>> ...
>> Such source files can still have fields, methods, and even nested classes, so
>> that as a program evolves from a few statements to needing some ancillary state
>> or helper methods, these can be factored out of the `main` method while still
>
> I wonder how we tell apart unnamed class syntax and normal class
> syntax. E.g., consider the source file:
>
> // Hello.java
> public class Hello {
>   // tons of logic
> }
>
> void main() {
> }
>
> Will it be considered as a correct Java file, having Hello class as a
> nested class of top-level unnamed class?
> If yes, then, adding a main method after the class declaration, I
> change the class semantics, making it an inner class.
> This looks like action at a distance and may cause confusion. E.g., I
> just wrote a main() method outside of Hello class instead of inside,
> and boom,
> now Hello is not resolvable from other classes, for no apparent reason.
>
> I assume that the main() method is required for an unnamed class, and
> if there are only other top-level declarations,
> then it should be a compilation error, right?
>
>> ## Predefined static imports
>> ```
>> void main() {
>>     println("Hello World");
>> }
>> ```
>
> I wonder how it will play with existing static star imports. We
> already saw problems when updated to Java 9 or Java 14 that
> star-imported class named Module or Record becomes unresolvable. If
> existing code already imports static method named println from
> somewhere, will this code become invalid?
>
> With best regards,
> Tagir Valeev.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From james.laskey at oracle.com Thu Sep 29 16:53:42 2022 From: james.laskey at oracle.com (Jim Laskey) Date: Thu, 29 Sep 2022 16:53:42 +0000 Subject: Paving the on-ramp In-Reply-To: References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> Message-ID: <49A8B9F9-D0D1-4E80-9BAD-E870E7CF6C90@oracle.com> Another safer approach we are playing with is to synthesize a public static void main method in an unnamed class when missing. The contents of that method would then invoke the user's main. Cheers, ? Jim On Sep 28, 2022, at 4:49 PM, Kevin Bourrillion > wrote: Virtuous. The quips about horses having fled the barn are coming, but whether they did is irrelevant; let's just make Java better now. On Wed, Sep 28, 2022 at 10:57 AM Brian Goetz > wrote: ## Concept overload I like that the focus is not just on boilerplate but on the offense of forcing learners to encounter concepts they *will* need to care about but don't yet. - Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package. Alternative: drop the requirement altogether. Most main methods have no desire to make themselves publicly callable as `TheClass.main(args)`, but today they are forced to expose that API anyway. I feel like it would still be conceptually clean to say that `public` is really about whether other *code* can access it, not whether a VM can get to it at all. - Make the "args" parameter to `main` optional, by allowing the `java` launcher to first look for a main method with the traditional `main(String[])` signature, and then (if not found) for a main method with no arguments. 
This seems to leave users vulnerable to some surprises, where the code they think is being called isn't. Why not make it a compile-time error to provide both forms? - Make the `static` modifier on `main` optional, by allowing the `java` launcher to invoke an instance `main` method (of either signature) by instantiating an instance using an accessible no-arg constructor and then invoking the `main` method on it. I'll give the problems I see with this, without a judgement on what should be done. What's the whole idea of main? Well, it's the entry point into the program. But now it's not really the entry point; finding the entry point is more subtle. (Okay, I concede that static initializers are run first either way; that undercuts *some* of the strength of my argument here.) Even if this is okay when I'm writing my own new program, understanding it as I go, then suppose someone else reads my program. That person has the burden of remembering to check whether `main` is static or not, and remembering that some constructor code is happening first if it's not. Classes that have both main and a constructor will be a mixture of some that call them in one order and some in the other. That's just, like, messy. And is it even clear, then, why the VM shouldn't be passing `args` to the constructor, only hoarding it until calling `main`? On a deep conceptual level... I'd insist that main() *is static*. It is *the* single entry point into the program; what could be more static than that? But thinking about our learner, who wrote some `main`s before learning about static. The instant they learn `static` is a keyword a method can have, they'll "know" one thing about it already: this is going to be something new that's *not* true of main(). But then they hear an explanation that fits `main` perfectly? 
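The ordering concern raised here (constructor code runs before an instance `main`, and static initializers run before either) can be made concrete with a small sketch. The names `OrderSketch`, `App`, and `run` are hypothetical, and `run()` merely imitates what the proposed launcher is described as doing, namely `new App().main()`; it is not the launcher itself.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: records the order in which code runs when an
// instance `main` is launched as `new App().main()`.
final class OrderSketch {

    // Shared log that each piece of App appends to as it executes.
    static final List<String> LOG = new ArrayList<>();

    static class App {
        // Runs once, when App is first initialized.
        static { LOG.add("static initializer"); }

        // Runs every time the "launcher" creates the receiver.
        App() { LOG.add("constructor"); }

        // The instance main; it only runs after the constructor has finished.
        void main() { LOG.add("main"); }
    }

    // Imitates the proposed launcher step and returns a snapshot of the log.
    static List<String> run() {
        new App().main();
        return List.copyOf(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call to `run()` the log contains the static initializer, then the constructor, then `main`; a reader who looks only at `main` would miss the first two steps, which is the surprise being described.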
Because excessive use of `static` is considered a code smell, many educators encourage the pattern of "all the static `main` method does is instantiate an instance and call an instance `main` method" anyway. Heavy groan. In my opinion, some ideas are too misguided to take seriously. The value in that practice is if instance `main` accepts parameters like `PrintStream` and `Console`, and static main passes in `System.out` and `System.console()`. That makes all your actual program logic unit-testable. Great! This actually strikes directly at the heart of what the entire problem with `static` is! But this isn't the case you're addressing. Static methods are not a code smell! Static methods that ought to be overrideable by one of their argument types (Collections.sort()), sure. Static mutable state is a code smell, definitely -- but a method that touches that state is equally problematic whether it itself is static or not. There are some code smells around `static`, but `static` itself is fresh and flowery. (Further, allowing the `main` method to be an instance method means that it could be inherited from a superclass, which is useful for simple frameworks such as test runners or service frameworks.) This does not give me a happy feeling. Going into it is a deep discussion though. Rest of the response coming soon, I hope. Just to mention one additional idea. We could permit `main` to optionally return `int`, becoming the default exit status if `exit` is never called. Seems elegant for the rare cases where you care about exit status, but (a) would this feature get in the way in *any* sense for the vast majority of cases that don't care, or (b) are the cases that care just way too rare for us to worry about? I'm not sure about (a). But (b) kinda seems like a yes. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
From angelos.bimpoudis at oracle.com  Thu Sep 29 17:03:33 2022
From: angelos.bimpoudis at oracle.com (Angelos Bimpoudis)
Date: Thu, 29 Sep 2022 17:03:33 +0000
Subject: Draft JEP: Unnamed local variables and patterns
Message-ID:

Dear experts,

The draft JEP for unnamed local variables and patterns, which has been previously discussed on this list, is available at:

https://bugs.openjdk.org/browse/JDK-8294349

Comments welcomed!

Angelos

From brian.goetz at oracle.com  Thu Sep 29 17:06:22 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 13:06:22 -0400
Subject: Paving the on-ramp
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <259fc28b-acc8-ac75-d163-a1d6ef34883f@oracle.com>

This question came up in a few different forms, but there's a reason why I've only relaxed the "must be public" requirement for classes in the unnamed package: security. There may be existing classes with a package-private instance main() method, and they may have reasonably assumed that these are only callable from within the package. If the launcher can barge in and open non-public classes and call non-public methods, that may be surprising. Restricting the "main can be non-public" rule to the unnamed package is justifiable because we can reasonably treat the launcher as being part of the unnamed package (and therefore this rule falls out from ordinary access control) and because distributing libraries that use the unnamed package is discouraged, reserving it instead for local experimentation.

Framing the launcher as "just some Java code in the unnamed package" also demystifies the launcher a bit.

On 9/28/2022 3:49 PM, Kevin Bourrillion wrote:
>
>  - Relax the requirement that the class, and `main` method, be public. Public
>    accessibility is only relevant when access crosses packages; simple programs
>    live in the unnamed package, so cannot be accessed from any other package
>    anyway. For a program whose main class is in the unnamed package, we can
>    drop the requirement that the class or its `main` method be public,
>    effectively treating the `java` launcher as if it too resided in the unnamed
>    package.
>
>
> Alternative: drop the requirement altogether. Most main methods have
> no desire to make themselves publicly callable as
> `TheClass.main(args)`, but today they are forced to expose that API
> anyway. I feel like it would still be conceptually clean to say that
> `public` is really about whether other *code* can access it, not
> whether a VM can get to it at all.

From brian.goetz at oracle.com  Thu Sep 29 18:20:42 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 14:20:42 -0400
Subject: Paving the on-ramp
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <39a19c35-d15a-a05f-ac3e-6aa7fd3adbcb@oracle.com>

One thing that this forces us to confront, in some manner or other, is the set of rules about the relationship between the file name and the class name, which are spread out in a few different places.

The compiler issues an error when the name of a top-level *public* class does not match the name of the file it is in. Order is irrelevant; the following are valid Foo.java files:

--
public class Foo { }
class Bar { }
--
class Foo { }
class Bar { }
--
class Bar { }
public class Foo { }
--

but the following are illegal for Foo.java:

--
public class Foo { }
public class Bar { }
--
public class Bar { }
--

The standard class loader implementation enforces that a class file X.class must contain a class called X (javap will warn about this too.)
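For readers who want to check the compiler half of these rules themselves, here is a minimal sketch using the standard `javax.tools` API. It compiles in-memory sources under a made-up "file name" and reports whether javac accepts them. The class name `FileNameRuleDemo`, the helper `StringSource`, and the `string:///` URI scheme are illustrative assumptions, not anything from the prototype; it needs a JDK (not just a JRE), and it writes class files to a temporary directory that it does not clean up.

```java
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class FileNameRuleDemo {
    /** An in-memory source unit; it still has to claim a "file name". */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String fileName, String code) {
            super(URI.create("string:///" + fileName), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns true if javac accepts `code` as the contents of `fileName`. */
    static boolean compiles(String fileName, String code) throws Exception {
        // getSystemJavaCompiler() returns null when no compiler is available.
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        Path out = Files.createTempDirectory("filename-rule");
        // Collect diagnostics quietly instead of printing them to stderr.
        DiagnosticCollector<JavaFileObject> diags = new DiagnosticCollector<>();
        JavaCompiler.CompilationTask task = javac.getTask(
                null, null, diags,
                List.of("-d", out.toString()),
                null,
                List.of(new StringSource(fileName, code)));
        return task.call();
    }

    public static void main(String[] args) throws Exception {
        // Valid Foo.java contents: prints true three times.
        System.out.println(compiles("Foo.java", "public class Foo { } class Bar { }"));
        System.out.println(compiles("Foo.java", "class Foo { } class Bar { }"));
        System.out.println(compiles("Foo.java", "class Bar { } public class Foo { }"));
        // Illegal Foo.java contents: prints false twice.
        System.out.println(compiles("Foo.java", "public class Foo { } public class Bar { }"));
        System.out.println(compiles("Foo.java", "public class Bar { }"));
    }
}
```

Only the public class is checked against the file name, which is why the non-public `Bar` is accepted in any position.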
When compiling code in-memory, the javac "FileManager" abstraction still requires a "file name" for each source unit being compiled, even if there is no actual file. The prototype implementation we have does infer a class name from the file name in the obvious way; we just don't enter it into the symbol table.? But, if you put this in Foo.java: -- void main() { } -- You'll get a Foo.class with class Foo in it, and if you put that class file on the class path, *other* classes can instantiate it by name.? So the prototype is currently in a half-here, half-there situation.? It is probably overkill to try to have some ACC_UNNAMED marking to prevent this.? So we might accept this odd state, or we might embrace it as Guy suggests, and go ahead and enter Foo in the symbol table, and even let people declare constructors.? That means that the name "unnamed class" would no longer be an accurate name (a shame, since its friends "unnamed module" and "unnamed package" have been saving it a seat.)? This is a workable direction, though not a forced move. The other connection point with the file name is the one Tagir brought up, which is the effect of accidentally putting a method or field outside the braces.? If you have a class -- class Foo { } void x() { } -- today, this is an error; under this proposal, this becomes a valid unnamed class with a *nested* class Foo, which may not be what was meant.? We can reduce the possibility of this by issuing a warning/error if an unnamed class has a "top-level" nested class whose name matches the file.? This seems reasonably consistent with the existing rules constraining file names and class names.? If we combined this with the previous move, this becomes "An unnamed class cannot have a top-level nested class of the same name". On 9/28/2022 1:57 PM, Brian Goetz wrote: > At various points, we've explored the question of which program > elements are most and least helpful for students first learning Java.? 
> After considering a number of alternatives over the years, I have a > simple proposal for smoothing the "on ramp" to Java programming, while > not creating new things to unlearn. > > Markdown source is below, HTML will appear soon at: > > https://openjdk.org/projects/amber/design-notes/on-ramp > > > # Paving the on-ramp > > Java is one of the most widely taught programming languages in the > world.? Tens > of thousands of educators find that the imperative core of the > language combined > with a straightforward standard library is a foundation that students can > comfortably learn on.? Choosing Java gives educators many degrees of > freedom: > they can situate students in `jshell` or Notepad or a full-fledged > IDE; they can > teach imperative, object-oriented, functional, or hybrid programming > styles; and > they can easily find libraries to interact with external data and > services. > > No language is perfect, and one of the most common complaints about > Java is that > it is "too verbose" or has "too much ceremony."? And unfortunately, > Java imposes > its heaviest ceremony on those first learning the language, who need and > appreciate it the least.? The declaration of a class and the > incantation of > `public static void main` is pure mystery to a beginning programmer.? > While > these incantations have principled origins and serve a useful > organizing purpose > in larger programs, they have the effect of placing obstacles in the > path of > _becoming_ Java programmers. Educators constantly remind us of the > litany of > complexity that students have to confront on Day 1 of class -- when > they really > just want to write their first program. > > As an amusing demonstration of this, in her JavaOne keynote appearance > in 2019, > [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked > about when > she learned to program in Java, and how her teacher performed a rap song > to help students memorize `"public static void main"`.? 
Our hats are > off to > creative educators everywhere for this kind of dedication, but teachers > shouldn't have to do this. > > Of course, advanced programmers complain about ceremony too. We will > never be > able to satisfy programmers' insatiable appetite for typing fewer > keystrokes, > and we shouldn't try, because the goal of programming is to write > programs that > are easy to read and are clearly correct, not programs that were easy > to type. > But we can try to better align the ceremony commensurate with the value it > brings to a program -- and let simple programs be expressed more simply. > > ## Concept overload > > The classic "Hello World" program looks like this in Java: > > ``` > public class HelloWorld { > ??? public static void main(String[] args) { > ??????? System.out.println("Hello World"); > ??? } > } > ``` > > It may only be five lines, but those lines are packed with concepts > that are > challenging to absorb without already having some programming > experience and > familiarity with object orientation. Let's break down the concepts a > student > confronts when writing their first Java program: > > ? - **public** (on the class).? The `public` accessibility level is > relevant > ??? only when there is going to be cross-package access; in a simple > "Hello > ??? World" program, there is only one class, which lives in the > unnamed package. > ??? They haven't even written a one-line program yet; the notion of access > ??? control -- keeping parts of a program from accessing other parts > of it -- is > ??? still way in their future. > > ? - **class**.? Our student hasn't set out to write a _class_, or model a > ??? complex system with objects; they want to write a _program_.? In > Java, a > ??? program is just a `main` method in some class, but at this point > our student > ??? still has no idea what a class is or why they want one. > > ? - **Methods**.? Methods are of course a key concept in Java, but the > mechanics > ??? 
of methods -- parameters, return types, and invocation -- are still > ??? unfamiliar, and the `main` method is invoked magically from the `java` > ??? launcher rather than from explicit code. > > ? - **public** (again).? Like the class, the `main` method has to be > public, but > ??? again this is only relevant when programs are large enough to require > ??? packages to organize them. > > ? - **static**.? The `main` method has to be static, and at this > point, students > ??? have no context for understanding what a static method is or why > they want > ??? one.? Worse, the early exposure to `static` methods will turn out > to be a > ??? bad habit that must be later unlearned.? Worse still, the fact > that the > ??? `main` method is `static` creates a seam between `main` and other > methods; > ??? either they must become `static` too, or the `main` method must > trampoline > ??? to some sort of "instance main" (more ceremony!)? And if we get > this wrong, > ??? we get the dreaded and mystifying `"cannot be referenced from a static > ??? context"` error. > > ? - **main**.? The name `main` has special meaning in a Java program, > indicating > ??? the starting point of a program, but this specialness hides behind > being an > ??? ordinary method name.? This may contribute to the sense of "so > many magic > ??? incantations." > > ? - **String[]**.? The parameter to `main` is an array of strings, > which are the > ??? arguments that the `java` launcher collected from the command > line.? But our > ??? first program -- likely our first dozen -- will not use command-line > ??? parameters. Requiring the `String[]` parameter is, at this point, > a mistake > ??? waiting to happen, and it will be a long time until this parameter > makes > ??? sense.? Worse, educators may be tempted to explain arrays at this > point, > ??? which further increases the time-to-first-program. > > ? - **System.out.println**.? If you look closely at this incantation, each > ??? 
element in the chain is a different thing -- `System` is a class > (what's a > ??? class again?), `out` is a static field (what's a field?), and > `println` is > ??? an instance method.? The only part the student cares about right > now is > ??? `println`; the rest of it is an incantation that they do not yet > understand > ??? in order to get at the behavior they want. > > That's a lot to explain to a student on the first day of class.? > There's a good > chance that by now, class is over and we haven't written any programs > yet, or > the teacher has said "don't worry what this means, you'll understand > it later" > six or eight times.? Not only is this a lot of _syntactic_ things to > absorb, but > each of those things appeals to a different concept (class, method, > package, > return value, parameter, array, static, public, etc) that the student > doesn't > have a framework for understanding yet.? Each of these will have an > important > role to play in larger programs, but so far, they only contribute to "wow, > programming is complicated." > > It won't be practical (or even desirable) to get _all_ of these > concepts out of > the student's face on day 1, but we can do a lot -- and focus on the > ones that > do the most to help beginners understand how programs are constructed. > > ## Goal: a smooth on-ramp > > As much as programmers like to rant about ceremony, the real goal here > is not > mere ceremony reduction, but providing a graceful _on ramp_ to Java > programming. > This on-ramp should be helpful to beginning programmers by requiring > only those > concepts that a simple program needs. > > Not only should an on-ramp have a gradual slope and offer enough > acceleration > distance to get onto the highway at the right speed, but its direction > must > align with that of the highway.? 
When a programmer is ready to learn > about more > advanced concepts, they should not have to discard what they've > already learned, > but instead easily see how the simple programs they've already written > generalize to more complicated ones, and both the syntatic and conceptual > transformation from "simple" to "full blown" program should be > straightforward > and unintrusive.? It is a definite non-goal to create a "simplified > dialect of > Java for students". > > We identify three simplifications that should aid both educators and > students in > navigating the on-ramp to Java, as well as being generally useful to > simple > programs beyond the classroom as well: > > ?- A more tolerant launch protocol > ?- Unnamed classes > ?- Predefined static imports for the most critical methods and fields > > ## A more tolerant launch protocol > > The Java Language Specification has relatively little to say about how > Java > "programs" get launched, other than saying that there is some way to > indicate > which class is the initial class of a program (JLS 12.1.1) and that a > public > static method called `main` whose sole argument is of type `String[]` > and whose > return is `void` constitutes the entry point of the indicated class. > > We can eliminate much of the concept overload simply by relaxing the > interactions between a Java program and the `java` launcher: > > ?- Relax the requirement that the class, and `main` method, be > public.? Public > ?? accessibility is only relevant when access crosses packages; simple > programs > ?? live in the unnamed package, so cannot be accessed from any other > package > ?? anyway.? For a program whose main class is in the unnamed package, > we can > ?? drop the requirement that the class or its `main` method be public, > ?? effectively treating the `java` launcher as if it too resided in > the unnamed > ?? package. > > ?- Make the "args" parameter to `main` optional, by allowing the > `java` launcher to > ?? 
first look for a main method with the traditional `main(String[])` > ?? signature, and then (if not found) for a main method with no arguments. > > ?- Make the `static` modifier on `main` optional, by allowing the > `java` launcher to > ?? invoke an instance `main` method (of either signature) by > instantiating an > ?? instance using an accessible no-arg constructor and then invoking > the `main` > ?? method on it. > > This small set of changes to the launch protocol strikes out five of > the bullet > points in the above list of concepts: public (twice), static, method > parameters, > and `String[]`. > > At this point, our Hello World program is now: > > ``` > class HelloWorld { > ??? void main() { > ??????? System.out.println("Hello World"); > ??? } > } > ``` > > It's not any shorter by line count, but we've removed a lot of "horizontal > noise" along with a number of concepts.? Students and educators will > appreciate > it, but advanced programmers are unlikely to be in any hurry to make these > implicit elements explicit either. > > Additionally, the notion of an "instance main" has value well beyond > the first > day.? Because excessive use of `static` is considered a code smell, many > educators encourage the pattern of "all the static `main` method does is > instantiate an instance and call an instance `main` method" anyway.? > Formalizing > the "instance main" protocol reduces a layer of boilerplate in these > cases, and > defers the point at which we have to explain what instance creation is > -- and > what `static` is.? (Further, allowing the `main` method to be an > instance method > means that it could be inherited from a superclass, which is useful > for simple > frameworks such as test runners or service frameworks.) 
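As an illustration only, the relaxed lookup described in this section can be approximated with ordinary reflection. The sketch below is not the launcher's implementation: `LaunchSketch`, `findMain`, and `launch` are hypothetical names, it only inspects methods declared directly on the class (so it ignores the inherited-`main` case), and it does not check the return type or enforce the unnamed-package restriction on non-public access.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class LaunchSketch {
    /** Prefer main(String[]); fall back to main() only if that is absent. */
    static Method findMain(Class<?> c) {
        try {
            return c.getDeclaredMethod("main", String[].class);
        } catch (NoSuchMethodException e) {
            // No traditional signature; try the no-argument form next.
        }
        try {
            return c.getDeclaredMethod("main");
        } catch (NoSuchMethodException e) {
            return null;
        }
    }

    /** Invokes the selected main and returns the Method that was chosen. */
    static Method launch(Class<?> c, String... args) throws Exception {
        Method main = findMain(c);
        if (main == null) {
            throw new NoSuchMethodException(c.getName() + ".main");
        }
        main.setAccessible(true); // the class and method need not be public
        Object receiver = null;
        if (!Modifier.isStatic(main.getModifiers())) {
            // Instance main: create the receiver with the no-arg constructor.
            Constructor<?> ctor = c.getDeclaredConstructor();
            ctor.setAccessible(true);
            receiver = ctor.newInstance();
        }
        if (main.getParameterCount() == 0) {
            main.invoke(receiver);
        } else {
            main.invoke(receiver, (Object) args);
        }
        return main;
    }

    // Hypothetical sample programs.
    static class Hello {
        void main() { System.out.println("Hello World"); }
    }

    static class Both {
        static void main(String[] args) { System.out.println("main(String[])"); }
        void main() { System.out.println("main()"); }
    }

    public static void main(String[] args) throws Exception {
        launch(Hello.class); // prints "Hello World"
        launch(Both.class);  // prints "main(String[])": the traditional form wins
    }
}
```

The `Both` class shows the situation raised elsewhere in this thread: when both forms are declared, the no-argument `main` is silently never called under this lookup order.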
> > ## Unnamed classes > > In a simple program, the `class` declaration often doesn't help > either, because > other classes (if there are any) are not going to reference it by > name, and we > don't extend a superclass or implement any interfaces.? If we say an > "unnamed > class" consists of member declarations without a class header, then > our Hello > World program becomes: > > ``` > void main() { > ??? System.out.println("Hello World"); > } > ``` > > Such source files can still have fields, methods, and even nested > classes, so > that as a program evolves from a few statements to needing some > ancillary state > or helper methods, these can be factored out of the `main` method > while still > not yet requiring a full class declaration: > > ``` > String greeting() { return "Hello World"; } > > void main() { > ??? System.out.println(greeting()); > } > ``` > > This is where treating `main` as an instance method really shines; the > user has > just declared two methods, and they can freely call each other.? > Students need > not confront the confusing distinction between instance and static > methods yet; > indeed, if not forced to confront static members on day 1, it might be > a while > before they do have to learn this distinction.? The fact that there is a > receiver lurking in the background will come in handy later, but right > now is > not bothering anybody. > > [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be > launched directly without compilation; this streamlined launcher pairs > well with > unnamed classes. > > ## Predefined static imports > > The most important classes, such as `String` and `Integer`, live in the > `java.lang` package, which is automatically on-demand imported into all > compilation units; this is why we do not have to `import > java.lang.String` in > every class.? 
Static imports were not added until Java 5, but no > corresponding > facility for automatic on-demand import of common behavior was added > at that > time.? Most programs, however, will want to do console IO, and Java > forces us to > do this in a roundabout way -- through the static `System.out` and > `System.in` > fields.? Basic console input and output is a reasonable candidate for > auto-static import, as one or both are needed by most simple > programs.? While > these are currently instance methods accessed through static fields, > we can > easily create static methods for `println` and `readln` which are > suitable for > static import, and automatically import them.? At which point our > first program > is now down to: > > ``` > void main() { > ??? println("Hello World"); > } > ``` > > ## Putting this all together > > We've discussed several simplifications: > > ?- Update the launcher protocol to make public, static, and arguments > optional > ?? for main methods, and for main methods to be instance methods (when a > ?? no-argument constructor is available); > ?- Make the class wrapper for "main classes" optional (unnamed classes); > ?- Automatically static import methods like `println` > > which together whittle our long list of day-1 concepts down > considerably.? While > this is still not as minimal as the minimal Python or Ruby program -- > statements > must still live in a method -- the goal here is not to win at "code > golf".? The > goal is to ensure that concepts not needed by simple programs need not > appear in > those programs, while at the same time not encouraging habits that > have to be > unlearned as programs scale up. > > Each of these simplifications is individually small and unintrusive, > and each is > independent of the others.? 
And each embodies a simple transformation > that the > author can easily manually reverse when it makes sense to do so: elided > modifiers and `main` arguments can be added back, the class wrapper > can be added > back when the affordances of classes are needed (supertypes, > constructors), and > the full qualifier of static-import can be added back.? And these > reversals are > independent of one another; they can done in any combination or any order. > > This seems to meet the requirements of our on-ramp; we've eliminated > most of the > day-1 ceremony elements without introducing new concepts that need to be > unlearned. The remaining concepts -- a method is a container for > statements, and > a program is a Java source file with a `main` method -- are easily > understood in > relation to their fully specified counterparts. > > ## Alternatives > > Obviously, we've lived with the status quo for 25+ years, so we could > continue > to do so.? There were other alternatives explored as well; ultimately, > each of > these fell afoul of one of our goals. > > ### Can't we go further? > > Fans of "code golf" -- of which there are many -- are surely right now > trying to > figure out how to eliminate the last little bit, the `main` method, > and allow > statements to exist at the top-level of a program.? We deliberately > stopped > short of this because it offers little value beyond the first few > minutes, and > even that small value quickly becomes something that needs to be > unlearned. > > The fundamental problem behind allowing such "loose" statements is that > variables can be declared inside both classes (fields) and methods (local > variables), and they share the same syntactic production but not the same > semantics.? So it is unclear (to both compilers and humans) whether a > "loose" > variable would be a local or a field.? 
If we tried to adopt some sort > of simple > heuristic to collapse this ambiguity (e.g., whether it precedes or > follows the > first statement), that may satisfy the compiler, but now simple > refactorings > might subtly change the meaning of the program, and we'd be replacing the > explicit syntactic overhead of `void main()` with an invisible "line" > in the > program that subtly affects semantics, and a new subtle rule about the > meaning > of variable declarations that applies only to unnamed classes.? This > doesn't > help students, nor is this particularly helpful for all but the most > trivial > programs.? It quickly becomes a crutch to be discarded and unlearned, > which > falls afoul of our "on ramp" goals.? Of all the concepts on our list, > "methods" > and "a program is specified by a main method" seem the ones that are > most worth > asking students to learn early. > > ### Why not "just" use `jshell`? > > While JShell is a great interactive tool, leaning too heavily on it as > an onramp > would fall afoul of our goals.? A JShell session is not a program, but a > sequence of code snippets.? When we type declarations into `jshell`, > they are > viewed as implicitly static members of some unspecified class, with > accessibility is ignored completely, and statements execute in a > context where > all previous declarations are in scope.? This is convenient for > experimentation > -- the primary goal of `jshell` -- but not such a great mental model for > learning to write Java programs.? Transforming a batch of working > declarations > in `jshell` to a real Java program would not be sufficiently simple or > unintrusive, and would lead to a non-idiomatic style of code, because the > straightforward translation would have us redeclaring each method, > class, and > variable declaration as `static`.? 
Further, this is probably not the > direction > we want to go when we scale up from a handful of statements and > declarations to > a simple class -- we probably want to start using classes as classes, > not just > as containers for static members. JShell is a great tool for > exploration and > debugging, and we expect many educators will continue to incorporate > it into > their curriculum, but is not the on-ramp programming model we are > looking for. > > ### What about "always local"? > > One of the main tensions that `main` introduces is that most class > members are > not `static`, but the `main` method is -- and that forces programmers to > confront the seam between static and non-static members. JShell > answers this > with "make everything static". > > Another approach would be to "make everything local" -- treat a simple > program > as being the "unwrapped" body of an implicit main method.? We already > allow > variables and classes to be declared local to a method.? We could add > local > methods (a useful feature in its own right) and relax some of the > asymmetries > around nesting (again, an attractive cleanup), and then treat a mix of > declarations and statements without a class wrapper as the body of an > invisible > `main` method. This seems an attractive model as well -- at first. > > While the syntactic overhead of converting back to full-blown classes > -- wrap > the whole thing in a `main` method and a `class` declaration -- is far > less > intrusive than the transformation inherent in `jshell`, this is still > not an > ideal on-ramp.? Local variables interact with local classes (and > methods, when > we have them) in a very different way than instance fields do with > instance > methods and inner classes: their scopes are different (no forward > references), > their initialization rules are different, and captured local variables > must be > effectively final.? 
This is a subtly different programming model that would then
> have to be unlearned when scaling up to full classes. Further, the result of
> this wrapping -- where everything is local to the main method -- is also not
> "idiomatic Java". So while local methods may be an attractive feature, they are
> similarly not the on-ramp we are looking for.
>
>

From john.r.rose at oracle.com  Thu Sep 29 20:04:15 2022
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 29 Sep 2022 13:04:15 -0700
Subject: Paving the on-ramp
In-Reply-To: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>
References: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>
Message-ID:

On Sep 29, 2022, at 6:55 AM, Brian Goetz wrote:
>
>> (3) Instead of speaking of automatic imports, speak of the compiler automatically providing certain import statements if the compilation unit doesn't have a class header.
>
> If we did this, when a class "graduates" from a low-ceremony class to a full class, then they'd have to go back and fix up all the println calls, and similarly it would put users in a position of "you can have ceremony reduction X, but only if you qualify for ceremony reduction Y."
> ...
>
> Taken together, coupling "instance main" and "auto static imports" to "no class header" means that we have created a "beginners dialect" which is different, and which has to be unlearned and undone as soon as a class graduates. I would prefer to have these be orthogonal features to the extent possible.

I like the principle behind Guy's moves for removing magic, by implicitly adding stuff you could have had explicitly. But adding `public static main` when there is an instance `main` is not a big payoff, since (a) you don't want to apply such a rule to all class files in existence today, and (b) applying it only to the unnamed classes couples the two features in a (probably) confusing way.
So, with my VM hat on, I say, fine, let's add another trick to the launcher's bag of tricks: If (1) a class is mentioned on a command line, then (2) we look for a somewhat wider range of methods (but all named `main`).

Again, having implicit static-imports that one could have written explicitly is very good, and it is a fine de-mystification move to say "one will be written for you if you didn't write it already". I think that's a way to explain our handling of `java.lang.*` today, isn't it?

So there's a small risk to adding more "stuff" to `java.lang.*`. (The problem with `Module` was mentioned in this thread.) And something equivalent to `import static java.lang.StaticImports.*` will further poke the bear, depending on how rich we make the set of imported names.

Here's a suggestion regarding static imports, specifically, that would match the on-ramp goals and mitigate the risk of name pollution from new *static* imports:

1. If there is no `import java.lang.*` the program acts as if it were inserted.

2. If there are *no imports at all*, the program acts as if *two* imports were inserted: `import java.lang.*` and `import static java.lang.StaticImports.*` (or whatever the name is).

The effect of this is that an empty set of imports will get a predictable, useful, and up-to-date set of default names. That makes for good on-ramp conditions. To get control over those imports, the user starts adding explicit imports at the top of the file. We proceed up the on-ramp by a series of one-line changes, not wholesale refactorings.

This is akin to today's mitigation of the problem with `java.lang.Module`: You mitigate by *adding another import*, a by-name import of your chosen class named `Module`. That's how Java has always worked. Removing an intrusive static import from `java.lang` would (under the above rule) be mitigated more simply; just add any import at all, even a redundant `import java.lang.*`. That's a little magic, but the story is clear: You get a certain "menu"
of imports if you don't specify *any*.

(Q: What would break if we also auto-imported `java.util.*` under the null-import condition? How disruptive would that be?)

I agree, in hindsight, with Guy's point about unnamed classes in named packages. I don't see a deep coupling between those two parts of the language, so don't make a shallow one. In general, shallow couplings lead to the problem of the "beginner's dialect" Brian mentioned: If simplifications A and B are coupled, when you graduate from one you have to "complicate" the other as well. In the case of the unnamed package, when you graduate your program to a named package (perhaps because it is now a unit test or utility that needs package API access) you might not want to graduate it, at the same time, from its unnamed format.

With my VM hat on again, I have a tentative suggestion for "fixing" the problem with an *unintentionally* linkable/denotable class. (As pointed out, that could be a class named `Foo` just because it is anonymous in a file that happens to have the "pretty name" `Foo.java`.)

Suggestion: Allow classfiles (in newer classfile versions) to specify `ACC_PRIVATE` in their `access_flags` for the class. With the obvious (!?) meaning: A class marked private (at the VM level) will fail access checks except to itself and its nestmates (if any). Roll it out as a VM feature first, and later as a slightly-incompatible language change for nested classes. Heck, even named classes (that's a compatible extension). (Immediate use cases: All non-denotable classes are compiled `ACC_PRIVATE`. That includes both "on ramp" unnamed classes and also any "inner class" which doesn't have a linkable bytecode name.)

Second suggestion (independent of first): In the example of `Foo.java`, "poison" the name `Foo` in a predictable way (prepend `$` or add `$unnamed` for example), and also mark the class as Synthetic (or with a new attribute).
Then, liberalize the launcher *ever so slightly* so as to allow (1) either the exactly matching name as today, (2) the predictably poisoned name (`Foo$unnamed`) if the class is also marked as synthetic/unnamed/whatever with an attribute. This will put unnamed classes on a common footing with other classes (local & anonymous inner classes) that already have linkable-but-unpredictable names. This is simpler than supporting `ACC_PRIVATE` and probably easier in the resolver (since there are just two names to check instead of one).

Third suggestion, probably not usable: We have properly anonymous classes in the VM (VMACs), which have names that not even the class itself can resolve; they have a special ability to self-resolve `CONSTANT_Class` but it is hardwired and doesn't go through a class-loader. We could try to do something like this for unnamed classes, *but* it would not scale well to unnamed classes *which have named nested classes*. To name those nested classes `Foo.Bar` you need a resolvable name like `Foo$unnamed$Bar`. (But the classes could be marked `ACC_PRIVATE`; see above.)

I don't know a clean way to fix the syntax ambiguity between (a) nested class/interface of unnamed class (new) and (b) non-public top-level (package-member) class/interface (old). Here are two dirty workarounds, both of which make such secondary classes into inherently non-linkable inner classes:

1. Put all your nested classes together in a method body.
2. Put all your nested classes in an instance initializer (magic braces!).

Both have the problem that the class names don't scope to the whole top-level (unnamed) class, so they are non-starters I guess, but might jog someone else's imagination for a better workaround.

Here's another workaround, which I guess Brian already mentioned:

3. If your user is wishing for nested classes or interfaces (or more likely records), then it's time to learn about type definitions, so require them to "graduate" to a top-level class *at that time*.
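For readers who want to see workaround 1 in today's Java, here is a minimal sketch. The names `LocalTypesSketch`, `describe`, `Point`, and `Labeler` are hypothetical and exist only for illustration; local records need Java 16 or later. It shows the scoping problem mentioned above: types declared inside a method body can only be named inside that body.

```java
public class LocalTypesSketch {

    // Workaround 1: declare the helper types inside a method body.
    static String describe() {
        // A local record and a local class; both are visible only
        // within describe(), not in the rest of the enclosing class.
        record Point(int x, int y) {}

        class Labeler {
            String label(Point p) {
                return "(" + p.x() + ", " + p.y() + ")";
            }
        }

        return new Labeler().label(new Point(1, 2));
    }

    // Point and Labeler cannot be named here: a field or method
    // signature mentioning them would not compile, which is the
    // "names don't scope to the whole class" problem.

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Because a helper method elsewhere in the class cannot take a `Point` parameter, this only works while all the code using the type stays in one method, which is presumably why these workarounds are called non-starters.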
Tentative suggestion, again for brainstorming: A way to smooth *that* move might be to provide yet more syntaxes to declare a *class which is not denotable but which has a body*. Something like a truncated class header with a body: `class /*empty header*/ { ...body here... }`. The rule would be: If you are defining classes, it's time to acknowledge you are defining a top-level one to surround them, but you don't have to name it yet; it's "just there".

(On this slippery slope, maybe allow nested unnamed classes as well? And/or unnamed-but-denotable constructors: `class { ... String field; class(String field) { this.field = field; } ... `. This doesn't appeal much to me, at least until we have compelling new use cases for anonymous classes, not already covered by `new Object() { ... }`. Enhanced inference could make such a class into a poly-type-expression, someday, for some contexts where supers would be inferred. I think that's what C# does in this vein.)

OK, that's enough BS (brainstorming, of course) from me.

From guy.steele at oracle.com Thu Sep 29 20:58:45 2022
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 29 Sep 2022 20:58:45 +0000
Subject: Paving the on-ramp
In-Reply-To:
References: <20e10b28-5d05-f381-5cbd-458418377231@oracle.com>
Message-ID: <1F34BC13-4FC8-4061-877C-72DF08EB1EB9@oracle.com>

> On Sep 29, 2022, at 4:04 PM, John Rose wrote:
> . . .
>
> I like the principle behind Guy's moves for removing magic, by implicitly adding stuff you could have had explicitly.

Thanks for the phrasing of those last nine words.

And someone else has pointed out to me that the expression of my ideas would have been clearer, more general, and more accurate if I had spoken in terms of "implicit declaration" under thus-and-so circumstances, rather than assuming that the compiler is necessarily the mechanism by which those implicit declarations are handled.
I'm not insistent on the particular solutions I suggested; I'm just happy to have gotten everyone else thinking in that direction.

--Guy

From john.r.rose at oracle.com Thu Sep 29 22:20:11 2022
From: john.r.rose at oracle.com (John Rose)
Date: Thu, 29 Sep 2022 15:20:11 -0700
Subject: Paving the on-ramp
In-Reply-To: <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com> <1054807958.15296354.1664398190396.JavaMail.zimbra@u-pem.fr>
Message-ID: <81497E7C-98E8-4650-9F18-FC6BF3EBF3F9@oracle.com>

On 28 Sep 2022, at 13:49, Remi Forax wrote:

> - We should not be able to declare fields inside a classless class, students struggle at the beginning to make the difference between a field and a local variable.
> Every syntax that makes that distinction murkier is a bad idea.
> So perhaps what we want is a classless container of methods, not a classless class.

Hmmm? That would be an interface. I'll pull on that thread a little:

An interface has no non-static fields and (bonus) its static fields are always constant. So you can teach interface *as a container* without getting into mutability.

Methods would have to be implicitly decorated with `default` in an anonymous *interface*.

The execution of an instance-main anonymous interface would look almost *exactly* like that for a class: `public static void main(String[] av) { new (){}.main(); }` The only difference is the `{}`.

Abstracts would be forbidden in an anonymous interface: Every method has a body, just as every field has an initializer.

Bonus: No instance initializers, since it's an interface. (No constructors either.) So the headaches about initialization-related syntaxes go away without additional special pleading.

Objection: *That's no interface!* Well, true. Except it is an interface to the system, being a launch point. (Is that just a bad pun?) Also, folks use interfaces today as an idiom for a lightweight container of Java code (at least, I do that).
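As a rough illustration of the "interface as container" idea in today's Java, here is a minimal sketch. It assumes a named interface, since the anonymous form `new (){}` does not exist; the names `InterfaceContainerSketch`, `Program`, `GREETING`, `greeting`, and `run` are hypothetical. The `default` modifiers and the launching `main` that the proposal would make implicit are written out by hand.

```java
public class InterfaceContainerSketch {

    // A "container of methods": no instance fields, no constructors,
    // no instance initializers, and every method has a body.
    interface Program {
        // Interface fields are implicitly public static final (constants).
        String GREETING = "Hello World";

        default String greeting() {
            return GREETING;
        }

        // Methods in the container call each other freely, as instance methods.
        default String run() {
            return greeting() + "!";
        }

        // The "instance main" of the container.
        default void main() {
            System.out.println(run());
        }
    }

    // What the launcher would do on the user's behalf: create an
    // anonymous implementation (hence the `{}`) and call main() on it.
    public static void main(String[] args) {
        new Program() {}.main();
    }
}
```

The empty body `{}` compiles only because every method already has a body, which matches the "abstracts would be forbidden" rule above.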
Bonus: If the "instance main" feature is supported *only for interface containers* then some issues of accidentally creating a main (in existing code) go away, simply because the attack surface (for accidents) gets smaller. Yes, that's a yucky bonus.

From brian.goetz at oracle.com Thu Sep 29 22:20:21 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 29 Sep 2022 18:20:21 -0400
Subject: Paving the on-ramp: couplings
In-Reply-To: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID:

I thought it would be useful to enumerate where the couplings are between the various features here. The goal is to avoid what John is calling "shallow couplings". I'll break the features down into more granularity:

 - predefined static imports
 - public is optional on main
 - args are optional on main
 - main can be instance or static
 - unnamed classes

And they interact with these (currently):

 - unnamed packages
 - constructors

Coupling: unnamed classes must live in the unnamed package.
Coupling: public is only optional on main methods in the unnamed package.
Coupling: instance main requires a no-arg constructor.
Coupling: unnamed classes don't get constructors.
Coupling: unnamed classes must have a main.

Before we set to arguing whether these couplings are OK or not, what others have I missed?

(Bonus naming round: while I like the concept of unnamed classes, it may not be a perfect fit; if we decide the fit is too poor, we could call them "implicit classes".)

On 9/28/2022 1:57 PM, Brian Goetz wrote:
> At various points, we've explored the question of which program elements are most and least helpful for students first learning Java. After considering a number of alternatives over the years, I have a simple proposal for smoothing the "on ramp" to Java programming, while not creating new things to unlearn.
> > Markdown source is below, HTML will appear soon at: > > https://openjdk.org/projects/amber/design-notes/on-ramp > > > # Paving the on-ramp > > Java is one of the most widely taught programming languages in the > world.? Tens > of thousands of educators find that the imperative core of the > language combined > with a straightforward standard library is a foundation that students can > comfortably learn on.? Choosing Java gives educators many degrees of > freedom: > they can situate students in `jshell` or Notepad or a full-fledged > IDE; they can > teach imperative, object-oriented, functional, or hybrid programming > styles; and > they can easily find libraries to interact with external data and > services. > > No language is perfect, and one of the most common complaints about > Java is that > it is "too verbose" or has "too much ceremony."? And unfortunately, > Java imposes > its heaviest ceremony on those first learning the language, who need and > appreciate it the least.? The declaration of a class and the > incantation of > `public static void main` is pure mystery to a beginning programmer.? > While > these incantations have principled origins and serve a useful > organizing purpose > in larger programs, they have the effect of placing obstacles in the > path of > _becoming_ Java programmers. Educators constantly remind us of the > litany of > complexity that students have to confront on Day 1 of class -- when > they really > just want to write their first program. > > As an amusing demonstration of this, in her JavaOne keynote appearance > in 2019, > [Aimee Lucido](https://www.youtube.com/watch?v=BkPPFiXUwYk) talked > about when > she learned to program in Java, and how her teacher performed a rap song > to help students memorize `"public static void main"`.? Our hats are > off to > creative educators everywhere for this kind of dedication, but teachers > shouldn't have to do this. > > Of course, advanced programmers complain about ceremony too. 
We will > never be > able to satisfy programmers' insatiable appetite for typing fewer > keystrokes, > and we shouldn't try, because the goal of programming is to write > programs that > are easy to read and are clearly correct, not programs that were easy > to type. > But we can try to better align the ceremony commensurate with the value it > brings to a program -- and let simple programs be expressed more simply. > > ## Concept overload > > The classic "Hello World" program looks like this in Java: > > ``` > public class HelloWorld { > ??? public static void main(String[] args) { > ??????? System.out.println("Hello World"); > ??? } > } > ``` > > It may only be five lines, but those lines are packed with concepts > that are > challenging to absorb without already having some programming > experience and > familiarity with object orientation. Let's break down the concepts a > student > confronts when writing their first Java program: > > ? - **public** (on the class).? The `public` accessibility level is > relevant > ??? only when there is going to be cross-package access; in a simple > "Hello > ??? World" program, there is only one class, which lives in the > unnamed package. > ??? They haven't even written a one-line program yet; the notion of access > ??? control -- keeping parts of a program from accessing other parts > of it -- is > ??? still way in their future. > > ? - **class**.? Our student hasn't set out to write a _class_, or model a > ??? complex system with objects; they want to write a _program_.? In > Java, a > ??? program is just a `main` method in some class, but at this point > our student > ??? still has no idea what a class is or why they want one. > > ? - **Methods**.? Methods are of course a key concept in Java, but the > mechanics > ??? of methods -- parameters, return types, and invocation -- are still > ??? unfamiliar, and the `main` method is invoked magically from the `java` > ??? launcher rather than from explicit code. > > ? 
- **public** (again).? Like the class, the `main` method has to be > public, but > ??? again this is only relevant when programs are large enough to require > ??? packages to organize them. > > ? - **static**.? The `main` method has to be static, and at this > point, students > ??? have no context for understanding what a static method is or why > they want > ??? one.? Worse, the early exposure to `static` methods will turn out > to be a > ??? bad habit that must be later unlearned.? Worse still, the fact > that the > ??? `main` method is `static` creates a seam between `main` and other > methods; > ??? either they must become `static` too, or the `main` method must > trampoline > ??? to some sort of "instance main" (more ceremony!)? And if we get > this wrong, > ??? we get the dreaded and mystifying `"cannot be referenced from a static > ??? context"` error. > > ? - **main**.? The name `main` has special meaning in a Java program, > indicating > ??? the starting point of a program, but this specialness hides behind > being an > ??? ordinary method name.? This may contribute to the sense of "so > many magic > ??? incantations." > > ? - **String[]**.? The parameter to `main` is an array of strings, > which are the > ??? arguments that the `java` launcher collected from the command > line.? But our > ??? first program -- likely our first dozen -- will not use command-line > ??? parameters. Requiring the `String[]` parameter is, at this point, > a mistake > ??? waiting to happen, and it will be a long time until this parameter > makes > ??? sense.? Worse, educators may be tempted to explain arrays at this > point, > ??? which further increases the time-to-first-program. > > ? - **System.out.println**.? If you look closely at this incantation, each > ??? element in the chain is a different thing -- `System` is a class > (what's a > ??? class again?), `out` is a static field (what's a field?), and > `println` is > ??? an instance method.? 
The only part the student cares about right > now is > ??? `println`; the rest of it is an incantation that they do not yet > understand > ??? in order to get at the behavior they want. > > That's a lot to explain to a student on the first day of class.? > There's a good > chance that by now, class is over and we haven't written any programs > yet, or > the teacher has said "don't worry what this means, you'll understand > it later" > six or eight times.? Not only is this a lot of _syntactic_ things to > absorb, but > each of those things appeals to a different concept (class, method, > package, > return value, parameter, array, static, public, etc) that the student > doesn't > have a framework for understanding yet.? Each of these will have an > important > role to play in larger programs, but so far, they only contribute to "wow, > programming is complicated." > > It won't be practical (or even desirable) to get _all_ of these > concepts out of > the student's face on day 1, but we can do a lot -- and focus on the > ones that > do the most to help beginners understand how programs are constructed. > > ## Goal: a smooth on-ramp > > As much as programmers like to rant about ceremony, the real goal here > is not > mere ceremony reduction, but providing a graceful _on ramp_ to Java > programming. > This on-ramp should be helpful to beginning programmers by requiring > only those > concepts that a simple program needs. > > Not only should an on-ramp have a gradual slope and offer enough > acceleration > distance to get onto the highway at the right speed, but its direction > must > align with that of the highway.? 
When a programmer is ready to learn about more advanced concepts, they should not have to discard what they've already learned, but instead easily see how the simple programs they've already written generalize to more complicated ones, and both the syntactic and conceptual transformation from "simple" to "full blown" program should be straightforward and unintrusive. It is a definite non-goal to create a "simplified dialect of Java for students".
>
> We identify three simplifications that should aid both educators and students in navigating the on-ramp to Java, as well as being generally useful to simple programs beyond the classroom:
>
>  - A more tolerant launch protocol
>  - Unnamed classes
>  - Predefined static imports for the most critical methods and fields
>
> ## A more tolerant launch protocol
>
> The Java Language Specification has relatively little to say about how Java "programs" get launched, other than saying that there is some way to indicate which class is the initial class of a program (JLS 12.1.1) and that a public static method called `main` whose sole argument is of type `String[]` and whose return is `void` constitutes the entry point of the indicated class.
>
> We can eliminate much of the concept overload simply by relaxing the interactions between a Java program and the `java` launcher:
>
>  - Relax the requirement that the class, and `main` method, be public. Public accessibility is only relevant when access crosses packages; simple programs live in the unnamed package, so cannot be accessed from any other package anyway. For a program whose main class is in the unnamed package, we can drop the requirement that the class or its `main` method be public, effectively treating the `java` launcher as if it too resided in the unnamed package.
>
>  - Make the "args" parameter to `main` optional, by allowing the `java` launcher to
first look for a main method with the traditional `main(String[])` > ?? signature, and then (if not found) for a main method with no arguments. > > ?- Make the `static` modifier on `main` optional, by allowing the > `java` launcher to > ?? invoke an instance `main` method (of either signature) by > instantiating an > ?? instance using an accessible no-arg constructor and then invoking > the `main` > ?? method on it. > > This small set of changes to the launch protocol strikes out five of > the bullet > points in the above list of concepts: public (twice), static, method > parameters, > and `String[]`. > > At this point, our Hello World program is now: > > ``` > class HelloWorld { > ??? void main() { > ??????? System.out.println("Hello World"); > ??? } > } > ``` > > It's not any shorter by line count, but we've removed a lot of "horizontal > noise" along with a number of concepts.? Students and educators will > appreciate > it, but advanced programmers are unlikely to be in any hurry to make these > implicit elements explicit either. > > Additionally, the notion of an "instance main" has value well beyond > the first > day.? Because excessive use of `static` is considered a code smell, many > educators encourage the pattern of "all the static `main` method does is > instantiate an instance and call an instance `main` method" anyway.? > Formalizing > the "instance main" protocol reduces a layer of boilerplate in these > cases, and > defers the point at which we have to explain what instance creation is > -- and > what `static` is.? (Further, allowing the `main` method to be an > instance method > means that it could be inherited from a superclass, which is useful > for simple > frameworks such as test runners or service frameworks.) 
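To make the relaxed lookup order concrete, here is a rough sketch of the three relaxations written with plain reflection. This is not how the `java` launcher is implemented; `LaunchSketch`, `findMain`, `launch`, and the sample classes `Hello` and `Classic` are hypothetical names. As a simplification it only looks at methods declared directly in the class, so it ignores the inherited-`main` case mentioned above.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class LaunchSketch {

    // Sample program: non-public class, instance main, no arguments.
    static class Hello {
        String greeting = "unset";
        void main() { greeting = "Hello World"; }
    }

    // Sample program with both forms; the traditional signature should win.
    static class Classic {
        static int argCount = -1;
        public static void main(String[] args) { argCount = args.length; }
        void main() { argCount = -2; }
    }

    // Prefer main(String[]); fall back to main() if it is absent.
    static Method findMain(Class<?> c) throws NoSuchMethodException {
        try {
            return c.getDeclaredMethod("main", String[].class);
        } catch (NoSuchMethodException e) {
            return c.getDeclaredMethod("main");
        }
    }

    // Invokes the chosen main; returns the receiver, or null for a static main.
    static Object launch(Class<?> c, String[] args) throws Exception {
        Method m = findMain(c);
        m.setAccessible(true);            // main need not be public
        Object receiver = null;
        if (!Modifier.isStatic(m.getModifiers())) {
            // Instance main: requires a no-arg constructor.
            Constructor<?> ctor = c.getDeclaredConstructor();
            ctor.setAccessible(true);
            receiver = ctor.newInstance();
        }
        if (m.getParameterCount() == 0) {
            m.invoke(receiver);
        } else {
            m.invoke(receiver, (Object) args);  // pass the array as one argument
        }
        return receiver;
    }

    public static void main(String[] args) throws Exception {
        Hello h = (Hello) launch(Hello.class, args);
        System.out.println(h.greeting);
    }
}
```

The `getDeclaredConstructor()` call is where the "instance main requires a no-arg constructor" coupling shows up: a class without one fails at that step with `NoSuchMethodException`.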
> > ## Unnamed classes > > In a simple program, the `class` declaration often doesn't help > either, because > other classes (if there are any) are not going to reference it by > name, and we > don't extend a superclass or implement any interfaces.? If we say an > "unnamed > class" consists of member declarations without a class header, then > our Hello > World program becomes: > > ``` > void main() { > ??? System.out.println("Hello World"); > } > ``` > > Such source files can still have fields, methods, and even nested > classes, so > that as a program evolves from a few statements to needing some > ancillary state > or helper methods, these can be factored out of the `main` method > while still > not yet requiring a full class declaration: > > ``` > String greeting() { return "Hello World"; } > > void main() { > ??? System.out.println(greeting()); > } > ``` > > This is where treating `main` as an instance method really shines; the > user has > just declared two methods, and they can freely call each other.? > Students need > not confront the confusing distinction between instance and static > methods yet; > indeed, if not forced to confront static members on day 1, it might be > a while > before they do have to learn this distinction.? The fact that there is a > receiver lurking in the background will come in handy later, but right > now is > not bothering anybody. > > [JEP 330](https://openjdk.org/jeps/330) allows single-file programs to be > launched directly without compilation; this streamlined launcher pairs > well with > unnamed classes. > > ## Predefined static imports > > The most important classes, such as `String` and `Integer`, live in the > `java.lang` package, which is automatically on-demand imported into all > compilation units; this is why we do not have to `import > java.lang.String` in > every class.? 
Static imports were not added until Java 5, but no > corresponding > facility for automatic on-demand import of common behavior was added > at that > time.? Most programs, however, will want to do console IO, and Java > forces us to > do this in a roundabout way -- through the static `System.out` and > `System.in` > fields.? Basic console input and output is a reasonable candidate for > auto-static import, as one or both are needed by most simple > programs.? While > these are currently instance methods accessed through static fields, > we can > easily create static methods for `println` and `readln` which are > suitable for > static import, and automatically import them.? At which point our > first program > is now down to: > > ``` > void main() { > ??? println("Hello World"); > } > ``` > > ## Putting this all together > > We've discussed several simplifications: > > ?- Update the launcher protocol to make public, static, and arguments > optional > ?? for main methods, and for main methods to be instance methods (when a > ?? no-argument constructor is available); > ?- Make the class wrapper for "main classes" optional (unnamed classes); > ?- Automatically static import methods like `println` > > which together whittle our long list of day-1 concepts down > considerably.? While > this is still not as minimal as the minimal Python or Ruby program -- > statements > must still live in a method -- the goal here is not to win at "code > golf".? The > goal is to ensure that concepts not needed by simple programs need not > appear in > those programs, while at the same time not encouraging habits that > have to be > unlearned as programs scale up. > > Each of these simplifications is individually small and unintrusive, > and each is > independent of the others.? 
And each embodies a simple transformation that the author can easily manually reverse when it makes sense to do so: elided modifiers and `main` arguments can be added back, the class wrapper can be added back when the affordances of classes are needed (supertypes, constructors), and the full qualifier of static-import can be added back. And these reversals are independent of one another; they can be done in any combination or any order.
>
> This seems to meet the requirements of our on-ramp; we've eliminated most of the day-1 ceremony elements without introducing new concepts that need to be unlearned. The remaining concepts -- a method is a container for statements, and a program is a Java source file with a `main` method -- are easily understood in relation to their fully specified counterparts.
>
> ## Alternatives
>
> Obviously, we've lived with the status quo for 25+ years, so we could continue to do so. There were other alternatives explored as well; ultimately, each of these fell afoul of one of our goals.
>
> ### Can't we go further?
>
> Fans of "code golf" -- of which there are many -- are surely right now trying to figure out how to eliminate the last little bit, the `main` method, and allow statements to exist at the top-level of a program. We deliberately stopped short of this because it offers little value beyond the first few minutes, and even that small value quickly becomes something that needs to be unlearned.
>
> The fundamental problem behind allowing such "loose" statements is that variables can be declared inside both classes (fields) and methods (local variables), and they share the same syntactic production but not the same semantics. So it is unclear (to both compilers and humans) whether a "loose" variable would be a local or a field.
If we tried to adopt some sort of simple heuristic to collapse this ambiguity (e.g., whether it precedes or follows the first statement), that may satisfy the compiler, but now simple refactorings might subtly change the meaning of the program, and we'd be replacing the explicit syntactic overhead of `void main()` with an invisible "line" in the program that subtly affects semantics, and a new subtle rule about the meaning of variable declarations that applies only to unnamed classes. This doesn't help students, nor is this particularly helpful for all but the most trivial programs. It quickly becomes a crutch to be discarded and unlearned, which falls afoul of our "on ramp" goals. Of all the concepts on our list, "methods" and "a program is specified by a main method" seem the ones that are most worth asking students to learn early.
>
> ### Why not "just" use `jshell`?
>
> While JShell is a great interactive tool, leaning too heavily on it as an on-ramp would fall afoul of our goals. A JShell session is not a program, but a sequence of code snippets. When we type declarations into `jshell`, they are viewed as implicitly static members of some unspecified class, with accessibility ignored completely, and statements execute in a context where all previous declarations are in scope. This is convenient for experimentation -- the primary goal of `jshell` -- but not such a great mental model for learning to write Java programs. Transforming a batch of working declarations in `jshell` to a real Java program would not be sufficiently simple or unintrusive, and would lead to a non-idiomatic style of code, because the straightforward translation would have us redeclaring each method, class, and variable declaration as `static`.
Further, this is probably not the > direction > we want to go when we scale up from a handful of statements and > declarations to > a simple class -- we probably want to start using classes as classes, > not just > as containers for static members. JShell is a great tool for > exploration and > debugging, and we expect many educators will continue to incorporate > it into > their curriculum, but is not the on-ramp programming model we are > looking for. > > ### What about "always local"? > > One of the main tensions that `main` introduces is that most class > members are > not `static`, but the `main` method is -- and that forces programmers to > confront the seam between static and non-static members. JShell > answers this > with "make everything static". > > Another approach would be to "make everything local" -- treat a simple > program > as being the "unwrapped" body of an implicit main method.? We already > allow > variables and classes to be declared local to a method.? We could add > local > methods (a useful feature in its own right) and relax some of the > asymmetries > around nesting (again, an attractive cleanup), and then treat a mix of > declarations and statements without a class wrapper as the body of an > invisible > `main` method. This seems an attractive model as well -- at first. > > While the syntactic overhead of converting back to full-blown classes > -- wrap > the whole thing in a `main` method and a `class` declaration -- is far > less > intrusive than the transformation inherent in `jshell`, this is still > not an > ideal on-ramp.? Local variables interact with local classes (and > methods, when > we have them) in a very different way than instance fields do with > instance > methods and inner classes: their scopes are different (no forward > references), > their initialization rules are different, and captured local variables > must be > effectively final.? 
This is a subtly different programming model that would then have to be unlearned when scaling up to full classes. Further, the result of this wrapping -- where everything is local to the main method -- is also not "idiomatic Java". So while local methods may be an attractive feature, they are similarly not the on-ramp we are looking for.

From brian.goetz at oracle.com Fri Sep 30 19:07:09 2022
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 30 Sep 2022 15:07:09 -0400
Subject: Paving the on-ramp: couplings
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID:

> Coupling: unnamed classes must live in the unnamed package.

The rationale for this is that the only thing you can do with an unnamed class is run it from the command line, and it may well be the only class in your program. If you're going to the effort of organizing into packages and distributing a JAR, you're well outside the use case for an unnamed class.

Another way to phrase this coupling is: distribution -> requires named classes.

> Coupling: public is only optional on main methods in the unnamed package.

This is largely a forced move, because giving the launcher additional privileges to open classes in existing packages would allow running of "main" methods that are not allowed today, which seems a compromise to the accessibility model. Situating the launcher in the unnamed package seems an entirely unsurprising thing, and again, people don't (or shouldn't) distribute code in the unnamed package.

Another way to phrase this coupling is: distribution -> requires public entry points.

> Coupling: instance main requires a no-arg constructor.

Pretty hard to imagine getting around this one; seems intrinsic to the "instance main" feature.

> Coupling: unnamed classes don't get constructors.

This one could be decoupled, though I'm not sure it helps.
> Coupling: unnamed classes must have a main.

If we interpret unnamed as really unnamed, the only thing you can do with an unnamed class is run it via the launcher, so not having a main would be silly.

From forax at univ-mlv.fr Fri Sep 30 19:41:31 2022
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 30 Sep 2022 21:41:31 +0200 (CEST)
Subject: Paving the on-ramp: couplings
In-Reply-To:
References: <1b6200d3-a7a6-6479-e8ab-d932eedbceb1@oracle.com>
Message-ID: <446462979.16648698.1664566891769.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Friday, September 30, 2022 9:07:09 PM
> Subject: Re: Paving the on-ramp: couplings

>> Coupling: unnamed classes must live in the unnamed package.

> The rationale for this is that the only thing you can do with an unnamed class is run it from the command line, and it may well be the only class in your program. If you're going to the effort of organizing into packages and distributing a JAR, you're well outside the use case for an unnamed class.

> Another way to phrase this coupling is: distribution -> requires named classes.

>> Coupling: public is only optional on main methods in the unnamed package.

> This is largely a forced move, because giving the launcher additional privileges to open classes in existing packages would allow running of "main" methods that are not allowed today, which seems a compromise to the accessibility model. Situating the launcher in the unnamed package seems an entirely unsurprising thing, and again, people don't (or shouldn't) distribute code in the unnamed package.

> Another way to phrase this coupling is: distribution -> requires public entry points.

>> Coupling: instance main requires a no-arg constructor.

> Pretty hard to imagine getting around this one; seems intrinsic to the "instance main" feature.
Technically you can store the array of arguments in a field, but fields should not be allowed; see below.

>> Coupling: unnamed classes don't get constructors.

> This one could be decoupled, though I'm not sure it helps.

>> Coupling: unnamed classes must have a main.

> If we interpret unnamed as really unnamed, the only thing you can do with an unnamed class is run it via the launcher, so not having a main would be silly.

Coupling: a nested class should not have the same name as the filename (minus ".java") of an unnamed class.

This avoids confusion between a top-level class and a nested class of an unnamed class (as proposed by Tagir).

Coupling: unnamed classes don't get fields.

If there is no constructor, there is no way to properly initialize fields. And field syntax is too close to the local variable syntax when there is no enclosing class.

Rémi
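
To make the "instance main requires a no-arg constructor" coupling from this thread concrete, here is a minimal sketch of why a launcher needs such a constructor. It uses reflection as a stand-in and is not a description of how the real `java` launcher is implemented. The names `LauncherSketch`, `InstanceProgram`, `NeedsArgument`, and `launch` are hypothetical, and the lookup is simplified to a `main(String[])` signature only.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class LauncherSketch {

    // Hypothetical beginner program: a non-public instance main and no explicit
    // constructor, so the compiler supplies the default no-arg constructor.
    public static class InstanceProgram {
        public static String lastRun = "";

        void main(String[] args) {
            lastRun = "instance main, " + args.length + " arg(s)";
        }
    }

    // Hypothetical class whose only constructor takes an argument: a launcher
    // has no value to pass, so it cannot create a receiver for main.
    public static class NeedsArgument {
        NeedsArgument(String name) { }

        void main(String[] args) { }
    }

    // Sketch of a launch step (an assumption, not the JDK's algorithm):
    // a static main is called directly; an instance main first needs an object,
    // which is only obtainable here through a no-arg constructor.
    public static String launch(Class<?> programClass, String[] args) {
        try {
            Method main = programClass.getDeclaredMethod("main", String[].class);
            main.setAccessible(true); // main need not be public in this sketch
            if (Modifier.isStatic(main.getModifiers())) {
                main.invoke(null, (Object) args);
                return "static";
            }
            Constructor<?> ctor;
            try {
                ctor = programClass.getDeclaredConstructor();
            } catch (NoSuchMethodException e) {
                return "rejected: no no-arg constructor";
            }
            ctor.setAccessible(true);
            Object receiver = ctor.newInstance();
            main.invoke(receiver, (Object) args);
            return "instance";
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("launch failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(launch(InstanceProgram.class, new String[] {"a"}));
        System.out.println(InstanceProgram.lastRun);
        System.out.println(launch(NeedsArgument.class, new String[0]));
    }
}
```

In this sketch, `InstanceProgram` launches because its implicit default constructor gives the launcher a way to create a receiver, while `NeedsArgument` is rejected before `main` is ever invoked. That is the sense in which the coupling seems intrinsic to instance main.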