From amaembo at gmail.com Fri Mar 1 05:12:09 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Fri, 1 Mar 2019 12:12:09 +0700
Subject: Switch expressions spec
In-Reply-To: <2A8B1F01-52A7-4688-B4F5-CBB73826500E@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com>
 <5C393807.8000500@oracle.com> <5C3CF3AB.5030602@oracle.com>
 <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>
 <2A8B1F01-52A7-4688-B4F5-CBB73826500E@oracle.com>
Message-ID:

Hello!

15.15 says:

The following production from 15.16 is shown here for convenience:

CastExpression:
  ( PrimitiveType ) UnaryExpression
  ( ReferenceType {AdditionalBound} ) UnaryExpressionNotPlusMinus
  ( ReferenceType {AdditionalBound} ) LambdaExpression

Before, it was clear that CastExpression was shown here for completeness,
as the other operators are covered above. Now it's unclear why the
CastExpression production is shown here for convenience, but the switch
expression production is not. Probably both should be shown or both removed.

With best regards,
Tagir Valeev.

On Wed, Feb 27, 2019 at 7:43 PM Gavin Bierman wrote:
>
> I have uploaded a revised switch expressions spec at:
>
> http://cr.openjdk.java.net/~gbierman/switch-expressions.html
>
> This is functionally equivalent to the spec uploaded last month. The
> change is in how we specify the type checking of switch expressions. We
> have made simplifications to make it more consistent with the
> specification of conditional expressions. The behaviour of type checking
> is unchanged.
>
> Thanks,
> Gavin
>
> PS: I have left the January version at
> http://cr.openjdk.java.net/~gbierman/switch-expressions-old.html for
> reference.
>
> > On 17 Jan 2019, at 10:14, Gavin Bierman wrote:
> >
> > Thank you Alex and Tagir. I have uploaded a new version of the spec at:
> >
> > http://cr.openjdk.java.net/~gbierman/switch-expressions.html
> >
> > This contains all the changes you suggested below. In addition, there
> > is a small bug fix in 5.6.3 concerning widening
> > (https://bugs.openjdk.java.net/browse/JDK-8213180). I have also taken
> > the opportunity to reorder chapter 15 slightly, so switch expressions
> > are now section 15.28 and constant expressions are now section 15.29
> > (the last section in the chapter).
> >
> > Comments welcome!
> > Gavin

From brian.goetz at oracle.com Fri Mar 1 20:14:31 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 1 Mar 2019 15:14:31 -0500
Subject: Updated document on data classes and sealed types
Message-ID:

I've updated the document on data classes here:

http://cr.openjdk.java.net/~briangoetz/amber/datum.html

(older versions of the document are retained in the same directory for
historical comparison.)

While the previous version was mostly about tradeoffs, this version
takes a much more opinionated interpretation of the feature, offering
more examples of use cases of where it is intended to be used (and not
used). Many of the "under consideration" flexibilities (extension,
mutability, additional fields) have collapsed to their more restrictive
form; while some people will be disappointed because it doesn't solve
the worst of their boilerplate problems, our conclusion is: records are
a powerful feature, but they're not necessarily the delivery vehicle for
easing all the (often self-inflicted) pain of JavaBeans. We can
continue to explore relief for these situations too as separate
features, but trying to be all things to all classes has delayed the
records train long enough, and I'm convinced they're separate problems
that want separate solutions. Time to let the records train roll.

I've also combined the information on sealed types in this document, as
the two are so tightly related.

Comments welcome.
From alex.buckley at oracle.com Sat Mar 2 01:05:06 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Fri, 01 Mar 2019 17:05:06 -0800
Subject: Updated document on data classes and sealed types
In-Reply-To:
References:
Message-ID: <5C79D6C2.20603@oracle.com>

On 3/1/2019 12:14 PM, Brian Goetz wrote:
> While the previous version was mostly about tradeoffs, this version
> takes a much more opinionated interpretation of the feature, offering
> more examples of use cases of where it is intended to be used (and not
> used).

(Setting aside value records throughout.)

A record type "codes like a class", not "codes like an int" -- you can
`new` it, and the resulting record object has identity and can refer
directly or indirectly to other record objects of the same type.
(Contrast with LW1, "Value types may not declare fields of its own type
directly or indirectly".) A variable of record type may even be null.

The on-ramp to records looks smooth -- but can I compatibly turn a class
type into a record type? It's potentially a source-compatible change
(make the state description match the class's ctor), but
binary-compatible?

Similarly, if a record type is chafing at the restrictions of "state
only!", then can I (source|binary)-compatibly turn it into a class type?

Alex

From brian.goetz at oracle.com Sat Mar 2 13:30:14 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 2 Mar 2019 08:30:14 -0500
Subject: Updated document on data classes and sealed types
In-Reply-To: <5C79D6C2.20603@oracle.com>
References: <5C79D6C2.20603@oracle.com>
Message-ID: <325861c1-e2f8-201c-24cb-dc825dae80cc@oracle.com>

> but can I compatibly turn a class type into a record type? It's
> potentially a source-compatible change (make the state description
> match the class's ctor), but binary-compatible?

Yes. You can transform

    class Point {
        final int x, y;
        public Point(int x, int y) { this.x = x; this.y = y; }
        public int x() { return x; }
        public int y() { return y; }
        // state-based equals and hashCode
        // more methods
    }

into

    record Point(int x, int y) {
        // more methods
    }

and this will be source- and binary-compatible.

> Similarly, if a record type is chafing at the restrictions of "state
> only!", then can I (source|binary)-compatibly turn it into a class type?

Yes, mostly. Records do currently have one aspect that is not *yet*
denotable by ordinary classes: the pattern match extractor. Until
ordinary classes can declare one, existing records whose clients use
pattern matching can't be migrated.

From forax at univ-mlv.fr Sat Mar 2 17:48:49 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 2 Mar 2019 18:48:49 +0100 (CET)
Subject: Updated document on data classes and sealed types
In-Reply-To:
References:
Message-ID: <510506611.1083197.1551548929550.JavaMail.zimbra@u-pem.fr>

So records are only immutable; it's a bold move and I like that.
For beginners we offer a simple model with immutable named tuples and
mutable List and Map, much like Python.

I still think we should restrict sealed to interfaces only (you can
always retrofit a class or an abstract class to add a super interface,
and this will avoid the nesting issue).

Rémi

From amaembo at gmail.com Mon Mar 4 03:30:09 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Mon, 4 Mar 2019 10:30:09 +0700
Subject: Patterns for arrays of specific length
Message-ID:

Hello!

In IntelliJ IDEA code we often see snippets like this:

final ResolveResult[] resolveResults = multiResolve(false);
return resolveResults.length == 1 && resolveResults[0].isValidResult() ?
    resolveResults[0].getElement() : null;

I wonder if a special kind of pattern to cover such a case could be
invented, like

return multiResolve(false) instanceof ResolveResult[] {var res} &&
    res.isValidResult() ?
    res.getElement() : null;

In essence it should be a deconstruction pattern for arrays. I don't
remember whether it was discussed, but probably I'm missing something.

Alternatively this could be covered by a utility method like

static <T> T getOnlyElement(T[] array) {
    return array.length == 1 ? array[0] : null;
}

return getOnlyElement(multiResolve(false)) instanceof ResolveResult
    res && res.isValidResult() ?
    res.getElement() : null;

But this doesn't scale for arrays of bigger length.

With best regards,
Tagir Valeev.
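
The utility-method alternative in the message above can be tried out
today. The following is only a minimal sketch: `ResolveResult` here is a
hypothetical stand-in class with the two methods the message calls, not
the IntelliJ type, and `OnlyElementDemo` and `resolve` are names made up
for the illustration. It uses a plain null check rather than an
`instanceof` pattern so that it also compiles on older Java releases.

```java
final class OnlyElementDemo {
    /** Hypothetical stand-in for the ResolveResult type used in the message. */
    static final class ResolveResult {
        private final String element;
        private final boolean valid;

        ResolveResult(String element, boolean valid) {
            this.element = element;
            this.valid = valid;
        }

        boolean isValidResult() { return valid; }
        String getElement() { return element; }
    }

    /** Returns the single element of a one-element array, otherwise null. */
    static <T> T getOnlyElement(T[] array) {
        return array.length == 1 ? array[0] : null;
    }

    /** Allocation-free "exactly one valid result" check built on the helper. */
    static String resolve(ResolveResult[] results) {
        ResolveResult res = getOnlyElement(results);
        return res != null && res.isValidResult() ? res.getElement() : null;
    }

    public static void main(String[] args) {
        ResolveResult[] one = { new ResolveResult("foo", true) };
        ResolveResult[] two = { new ResolveResult("foo", true),
                                new ResolveResult("bar", true) };
        System.out.println(resolve(one)); // prints foo
        System.out.println(resolve(two)); // prints null (ambiguous)
    }
}
```

The helper creates no intermediate objects, which matches the
allocation concern raised later in the thread, but as the message says it
only covers the length-1 case: each further length would need its own
helper or a different return shape.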
From brian.goetz at oracle.com Mon Mar 4 07:34:46 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 4 Mar 2019 07:34:46 +0000
Subject: Patterns for arrays of specific length
In-Reply-To:
References:
Message-ID: <96268694-48C3-4D6F-B301-FA7B2C72BFEC@oracle.com>

In general, there's a duality between the ways in which we construct
composites and the way we deconstruct them. There's an obvious duality
between constructor and deconstructor patterns; between static factories
and static patterns, etc.

For structural composites, the obvious place to start is the dual of
array/list/map literals. (We're not ready to do these just yet, but it
makes sense that we consider structural literals and structural patterns
at the same time.) So for example, if an array literal looks like
[ e, f, g ], an array pattern might look like [ p, q, r ].

Given that, your example might look like:

return resolveResults instanceof [ var e ] && e.isValid() ? e : null

Without collection literals, we'd write methods to do what we want, so
these methods have duals too. For example:

return resolveResults instanceof oneElementArray(var e) && e.isValid() ? e : null

From forax at univ-mlv.fr Mon Mar 4 08:08:47 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 4 Mar 2019 09:08:47 +0100 (CET)
Subject: Patterns for arrays of specific length
In-Reply-To:
References:
Message-ID: <2039892142.19913.1551686927986.JavaMail.zimbra@u-pem.fr>

Hi Tagir,

----- Original Message -----
> From: "Tagir Valeev"
> To: "amber-spec-experts"
> Sent: Monday, 4 March 2019 04:30:09
> Subject: Patterns for arrays of specific length

> Hello!
>
> In intellij IDEA code we often see snippets like this:
>
> final ResolveResult[] resolveResults = multiResolve(false);
> return resolveResults.length == 1 && resolveResults[0].isValidResult() ?
> resolveResults[0].getElement() : null;

Arrays.stream(resolveResults).findFirst().filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)

and obviously, the method should return an Optional instead of calling
orElse(null) at the end.

Rémi
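
For readers following the thread, the difference between "first element"
and "exactly one element" can be made concrete. This is an illustrative
sketch only; `ExactlyOneDemo`, `first` and `exactlyOne` are names invented
here, and plain strings stand in for resolve results.

```java
import java.util.Arrays;
import java.util.Optional;

final class ExactlyOneDemo {
    /** The stream form: present whenever the array is non-empty. */
    static <T> Optional<T> first(T[] array) {
        return Arrays.stream(array).findFirst();
    }

    /** Explicit length check: present only for a one-element array. */
    static <T> Optional<T> exactlyOne(T[] array) {
        return array.length == 1 ? Optional.ofNullable(array[0]) : Optional.empty();
    }

    public static void main(String[] args) {
        String[] two = { "a", "b" };
        System.out.println(first(two));      // prints Optional[a]
        System.out.println(exactlyOne(two)); // prints Optional.empty
    }
}
```

On a two-element array the two helpers disagree, so a pipeline built on
`findFirst()` alone does not reproduce the original `length == 1` check.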
From forax at univ-mlv.fr Mon Mar 4 08:13:18 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 4 Mar 2019 09:13:18 +0100 (CET)
Subject: Patterns for arrays of specific length
In-Reply-To: <96268694-48C3-4D6F-B301-FA7B2C72BFEC@oracle.com>
References: <96268694-48C3-4D6F-B301-FA7B2C72BFEC@oracle.com>
Message-ID: <1332292808.20818.1551687198571.JavaMail.zimbra@u-pem.fr>

It's a poster child for a 'let' expression, instead of twisting
instanceof to work with no type:

return let results = multiResolve(false) in
    results.length == 1 && results[0].isValidResult() ?
    results[0].getElement() : null;

Rémi

From amaembo at gmail.com Mon Mar 4 09:13:40 2019
From: amaembo at gmail.com (Tagir Valeev)
Date: Mon, 4 Mar 2019 16:13:40 +0700
Subject: Patterns for arrays of specific length
In-Reply-To: <2039892142.19913.1551686927986.JavaMail.zimbra@u-pem.fr>
References: <2039892142.19913.1551686927986.JavaMail.zimbra@u-pem.fr>
Message-ID:

Hello!

> Arrays.stream(resolveResults).findFirst().filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)

This is obviously wrong: we need to return a non-null result only if
there's exactly one resolve result. As an IDE we need to support
incorrect code, in particular where a reference resolves to several
symbols (e.g. an ambiguous method overload), but in most places we
want to proceed further only if the resolve result points to exactly
one symbol.

We could use a third-party collector, like in my StreamEx lib:

StreamEx.of(resolveResults).collect(MoreCollectors.onlyOne()).filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)

That would be technically correct, but I really worry about the amount
of garbage created in this kind of code (the same concern applies to your
version as well).
For us GC pressure is very important (you may imagine the number of
reports like "IDEA eats insane amount of memory", "IDEA stuck in garbage
collection" and good old "IDEA is slow" we receive), and when an
allocation-free version of the code is not much longer, I would certainly
prefer it.

With best regards,
Tagir Valeev.

From forax at univ-mlv.fr Mon Mar 4 12:18:26 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Mon, 4 Mar 2019 13:18:26 +0100 (CET)
Subject: Patterns for arrays of specific length
In-Reply-To:
References: <2039892142.19913.1551686927986.JavaMail.zimbra@u-pem.fr>
Message-ID: <1372717152.87737.1551701906822.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Tagir Valeev"
> To: "Remi Forax"
> Cc: "amber-spec-experts"
> Sent: Monday, 4 March 2019 10:13:40
> Subject: Re: Patterns for arrays of specific length

> Hello!
>
>> Arrays.stream(resolveResults).findFirst().filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)
>
> This is obviously wrong: we need to return non-null result only if
> there's exactly one resolve result. As IDE we need to support
> incorrect code, in particular where reference resolves to several
> symbols (e.g. ambigous method overload), but in most of the places we
> want to proceed further only if the resolve result points to exactly
> one symbol.

Oops, I forgot a filter:

Arrays.stream(resolveResults).filter(results -> results.size() == 1).findFirst().filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)

or, using only Optional:

Optional.of(resolveResults).filter(results -> results.length == 1).map(results -> results[0]).filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)

> We could use a third-party collector, like in my StreamEx lib:
> StreamEx.of(resolveResults).collect(MoreCollectors.onlyOne()).filter(ResolveResult::isValidResult).map(ResolveResult::getElement).orElse(null)
>
> That would be technically correct, but I really worry about amount of
> garbage created in such kind of code (the same concern applies to your
> version as well). For us GC pressure is very important (you may
> imagine amount of reports "IDEA eats insane amount of memory", "IDEA
> stuck in garbage collection" and good old "IDEA is slow" we receive)
> and when allocation-free version of the code is not much longer, I
> would certainly prefer it.

I believe the version with only Optional should be OK: the VM tends to
do a very good job of not allocating them, and obviously it will be
better when Optional becomes a true value type.

A solution to keep IDEA from eating too much memory is to pass the
Collector to multiResolve, so you can call it with a collector that
reduces to an Optional (your MoreCollectors.onlyOne()) or to a List,
depending on whether you want only one result or all results.

> With best regards,
> Tagir Valeev.

Rémi

From alex.buckley at oracle.com Mon Mar 4 23:20:01 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Mon, 04 Mar 2019 15:20:01 -0800
Subject: Switch expressions spec
In-Reply-To: <2A8B1F01-52A7-4688-B4F5-CBB73826500E@oracle.com>
References: <95AE9FF4-9484-4501-91FE-C1E49123109D@oracle.com>
 <5C393807.8000500@oracle.com> <5C3CF3AB.5030602@oracle.com>
 <7A903926-3E7C-498A-9D57-E5AA0462D568@oracle.com>
 <2A8B1F01-52A7-4688-B4F5-CBB73826500E@oracle.com>
Message-ID: <5C7DB2A1.9030802@oracle.com>

For clarity, we have renamed the January 2019 version from:

http://cr.openjdk.java.net/~gbierman/switch-expressions-old.html

to:

http://cr.openjdk.java.net/~gbierman/switch-expressions-2019-01.html

The CSR for switch expressions (JDK-8207241) has the spec as an
attachment, but also links to an online version for reader convenience.
Originally it linked to `switch-expressions.html`, but that is a living
document and hence unsuitable for a CSR, so now it links to
`switch-expressions-2019-01.html`.
Alex

From alex.buckley at oracle.com Wed Mar 6 20:53:33 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Wed, 06 Mar 2019 12:53:33 -0800
Subject: Switch expressions spec
In-Reply-To:
References:
Message-ID: <5C80334D.90508@oracle.com>

Hi Gavin,

On 3/6/2019 1:51 AM, Manoj Palat wrote:
> 1: In section 14.15, The break Statement
>
> A break statement transfers control out of an enclosing statement, or
> causes an enclosing switch expression to produce a specified value.
>
> BreakStatement:
>   break [~~ Identifier ~~] ;
>   break Expression ;
>   break ;
>
> the identifier is dropped? That looks like a typographical issue (since
> it was mentioned that there was no functional difference). Identifier
> is mentioned in the statements following the above para as well. A
> similar issue appears in the "continue" section also.

The dropping of the `break [Identifier]` alternative looks like an
editing error when the spec document was being reformatted; compare:

old format: http://cr.openjdk.java.net/~gbierman/switch-expressions-2019-01.html#jep325-14.15
new format: http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.15

> 2. A related query, though a bit late, but better late than never :) :
> In the Eclipse Compiler implementation we assume expression encompasses
> identifier (in the syntax context), and then deduce whether this is a
> label or an expression later in the resolution context. From the grammar
> above, it does not look like we can distinguish whether an identifier is
> a label or an expression in the first place? An explicit statement in
> the spec about how to distinguish would be helpful.

This will become moot if the change anticipated by Brian happens (change
"break value" to "break-with value"). Until then, Manoj is asking a
great question. Per 6.2, a label is not a name, but per 14.7, a label
does have scope, and:

"There is no restriction against using the same identifier as a label
and as the name of a package, class, interface, method, field,
parameter, or local variable. Use of an identifier to label a statement
does not obscure (§6.4.2) a package, class, interface, method, field,
parameter, or local variable with the same name. Use of an identifier as
a class, interface, method, field, local variable or as the parameter of
an exception handler (§14.20) does not obscure a statement label with
the same name."

I seem to recall a discussion recognizing and accepting the source
incompatibility of recasting `break X;` from "Jump to label X" to
"Evaluate X and yield the result". Such acceptance would suggest an edit
to the last sentence quoted above.

> 3. In section 5.6: "A unary numeric promotion applies numeric
> promotion to an operand expression and a notional non-constant
> expression of type int."
> It would be nice to explain in the spec a little more what is meant
> by "a notional non-constant expression"?

I believe more polishing is already on the way for the recast definition
of numeric promotion?

Alex

From manoj.palat at in.ibm.com Wed Mar 6 09:51:49 2019
From: manoj.palat at in.ibm.com (Manoj Palat)
Date: Wed, 6 Mar 2019 15:21:49 +0530
Subject: Switch expressions spec
Message-ID:

Hi Alex, Gavin,

A few comments/clarifications:

1: In section 14.15, The break Statement

A break statement transfers control out of an enclosing statement, or
causes an enclosing switch expression to produce a specified value.

BreakStatement:
  break [~~ Identifier ~~] ;
  break Expression ;
  break ;

the identifier is dropped? That looks like a typographical issue (since
it was mentioned that there was no functional difference). Identifier is
mentioned in the statements following the above para as well. A similar
issue appears in the "continue" section also.

2. A related query, though a bit late, but better late than never :) :
In the Eclipse Compiler implementation we assume expression encompasses
identifier (in the syntax context), and then deduce whether this is a
label or an expression later in the resolution context. From the grammar
above, it does not look like we can distinguish whether an identifier is
a label or an expression in the first place? An explicit statement in
the spec about how to distinguish would be helpful.

3. In section 5.6: "A unary numeric promotion applies numeric promotion
to an operand expression and a notional non-constant expression of type
int."
It would be nice to explain in the spec a little more what is meant by
"a notional non-constant expression"?

Regards,
Manoj
Eclipse Java Dev. Tools

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From amalloy at google.com Thu Mar 7 19:12:30 2019 From: amalloy at google.com (Alan Malloy) Date: Thu, 7 Mar 2019 11:12:30 -0800 Subject: Updated document on data classes and sealed types In-Reply-To: References: Message-ID: I have two remarks about this proposal. The first is basically: why allow overriding accessors? If a record is required to have a one-to-one correspondence between its (private final) fields and its public accessors, and is required to ?give up [its] data freely to all requestors? what possible override could be correct? It makes sense to allow overriding the constructor, for validation and normalization, but once the fields are cemented in place, what could an accessor do but return its corresponding field? My second remark is much more long-winded, and inspired by the first. The TL;DR version is: what about normalization and derived fields? In the longer version below, I?ll be using Fraction as an example of a simple class that could be a record instead, where normalization is reducing a fraction to simplest form. However, please generalize from this: it could apply to any record where a derived field can be computed from the provided fields by computing a perhaps-expensive pure function on them. I tried answering my first question by saying, ?ah, we could do normalization in the accessors instead of the constructor?. Besides Point, one classic example of a pair type is Fraction. I can easily imagine Fractional Fran rejoicing over the introduction of records, and saying ?time to implement Fraction as a record!? And of course a Fraction should always be in simplest form (in particular because this is necessary for equals() to behave well). Therefore, Fran adds a GCD call to her constructor, so that (even without overriding them) her numerator() and denominator() field accessors always return values that are relatively prime, and denominator() is always positive. She publishes her Fraction record as a library, and all is well. 

Productive Peter is happy too: he has a use case in mind for Fraction. He has a List<Fraction>, and wants to multiply them all together. He of course wants the final result in simplest form, but doesn't want to waste a bunch of time reducing intermediate results to their simplest form. He'd rather just build the next term as new Fraction(num1*num2, denom1*denom2), and reduce the final result at the end (assume for this example that we are not worried about intermediate terms exceeding the size of an int, or that Fraction#mul(Fraction) is smart enough to reduce when overflowing is the alternative).

Peter is a plausible use case for overriding the accessor methods: he wants a constructor that does no normalization, with accessor functions normalizing instead. But there's still wasted work here: numerator() and denominator() each have to perform a GCD calculation on the same numbers, and will even have to repeat it if someone calls them multiple times (e.g. for equals() and hashCode() when using a Fraction as a map key).

A Fraction library can't satisfy both Fran and Peter. It has to choose a place to do this normalization, or else decline to do it at all - but this is no solution, as now the class has very sharp edges, really no more useful than a Pair.

There are two possible solutions I see to this. The first is to permit some kind of derived-field mechanism, preferably lazy. Then, Fraction's constructor would save a thunk for producing the reduced form, and refer to that thunk in the numerator() and denominator() accessors, but ignore it in the #mul method so that we don't pay the cost of reducing unless we want it (here, imagine reducing a Fraction is more expensive than allocating a thunk).

The second is to simply say that Fraction is a bad candidate for a record, because it wants to decouple its interface from its implementation. I think this is actually the right approach, but it may be unconvincing because of how "obvious"
it is that a Fraction is just a pair with some extra calculations to perform based on its components. If we say that Fraction is a bad record, I worry that many more bad records like it will be built, and their subtle problems discovered only after their APIs have been published and committed to. Further, if this is indeed a bad record, I can't think of any other good use case for overriding an accessor method (my first remark).
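
The first of the two solutions above (a lazily computed reduced form) can be sketched as a plain class rather than a record. This is only an illustration under stated assumptions: the name `LazyFraction`, the use of `long`, and the cached array are all hypothetical, and the sketch is not thread-safe as written.

```java
// Hypothetical sketch: a plain class (not a record) that keeps the raw
// numerator/denominator and reduces them only when an accessor is called.
final class LazyFraction {
    private final long rawNum;
    private final long rawDenom;
    private long[] reduced; // cached {numerator, denominator}; not thread-safe as written

    LazyFraction(long num, long denom) {
        if (denom == 0) throw new ArithmeticException("zero denominator");
        this.rawNum = num;
        this.rawDenom = denom;
    }

    private static long gcd(long a, long b) {
        a = Math.abs(a);
        b = Math.abs(b);
        while (b != 0) { long t = a % b; a = b; b = t; }
        return a;
    }

    // Computes the reduced form on first use and caches it.
    private long[] reduced() {
        long[] r = reduced;
        if (r == null) {
            long g = gcd(rawNum, rawDenom);       // g >= 1 because rawDenom != 0
            long sign = rawDenom < 0 ? -1 : 1;    // keep the denominator positive
            r = new long[] { sign * rawNum / g, sign * rawDenom / g };
            reduced = r;
        }
        return r;
    }

    long numerator()   { return reduced()[0]; }
    long denominator() { return reduced()[1]; }

    // Multiplies the raw components; no GCD work is done here.
    LazyFraction mul(LazyFraction other) {
        return new LazyFraction(rawNum * other.rawNum, rawDenom * other.rawDenom);
    }

    @Override public boolean equals(Object o) {
        return o instanceof LazyFraction f
                && numerator() == f.numerator()
                && denominator() == f.denominator();
    }

    @Override public int hashCode() {
        return 31 * Long.hashCode(numerator()) + Long.hashCode(denominator());
    }
}
```

With this shape, `new LazyFraction(1, 2).mul(new LazyFraction(2, 3))` stores 2/6 internally and only reduces to 1/3 when `numerator()`, `denominator()`, `equals` or `hashCode` is called. The price is exactly the decoupling of interface from representation that the second solution points at.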

From brian.goetz at oracle.com Thu Mar 7 20:45:33 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 7 Mar 2019 20:45:33 +0000
Subject: Updated document on data classes and sealed types
In-Reply-To:
References:
Message-ID: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com>

Thanks for these great comments. These cut to the heart of some uncomfortable tradeoffs.

> I have two remarks about this proposal. The first is basically: why allow overriding accessors? If a record is required to have a one-to-one correspondence between its (private final) fields and its public accessors, and is required to "give up [its] data freely to all requestors", what possible override could be correct? It makes sense to allow overriding the constructor, for validation and normalization, but once the fields are cemented in place, what could an accessor do but return its corresponding field?

Yes, overriding accessors could be abused to avoid giving up the class's data; they could be overridden to throw, for example, which would undermine the "give up their data easily" dictum.

Note that if overriding accessors were not allowed, it would be as if the fields were public and final. (We actually considered that as an option, briefly.) I actually think that public final fields get a bad rap, but the Uniform Access principle encourages accessors, and public final fields freak a lot of people out. So between the two, non-overridable accessors seem better.

But, there's still a reason to allow overriding accessors: mutable types which don't provide unmodifiable views, arrays being the obvious case. Yes, arrays and records are an uncomfortable pairing, but if you can override the accessors, at least you can clone them on the way out. (If you can't override the accessors, people might still use records with arrays, and then expose the mutable state, possibly without realizing it. That seems worse.) So overriding accessors seems like it should be in the "safe, legal, and rare"
category. Note too that the deconstruction pattern is likely to delegate to the accessors, so that you only have to override things in one place to prevent mutable state from leaking.

There's another consideration, too. We considered outlawing overriding the equals/hashCode method. This goes a long way towards enforcing the desired invariants, but again seems pretty restrictive. And, having an irregular set of rules about what can be overridden and what can't (e.g., no to equals, yes to toString), seems likely to (a) make the feature harder to learn/understand and (b) lead to lots more "why can't I, I just want to ..." complaints. Better to have an all-or-nothing treatment of overriding, even though people can undermine the intent by careless overriding. (One thing working in our favor here is that, if you're overriding a bunch of methods, the concision benefit drops a lot, so that helps limit the problem.)

Before I jump into the second, let me talk about intended overriding modes for the constructor. These are primarily: validation and normalization. The validation cases are obvious:

    record Range(int lo, int hi) {
        public Range {
            if (lo > hi)
                throw new LowGreaterThanHighException();
        }
    }

Normalization can happen on single arguments or multiple:

    record Person(String name) {
        public Person {
            name = name.toUpperCase();
        }
    }

(Note that I'm mutating the parameter, which will then get written to the field.)

    record Rational(int num, int denom) {
        public Rational {
            int gcd = gcd(num, denom);
            num /= gcd;
            denom /= gcd;
        }
    }

> My second remark is much more long-winded, and inspired by the first. The TL;DR version is: what about normalization and derived fields?

This is two questions :) Let's start with the first.

> In the longer version below, I'll be using Fraction as an example of a simple class that could be a record instead, where normalization is reducing a fraction to simplest form.
> However, please generalize from this: it could apply to any record where a derived field can be computed from the provided fields by computing a perhaps-expensive pure function on them.

Rational numbers are a great example; Guy raised these earlier as well. Where rationals challenge the model here is: the user provided a state vector of (4, 2), but the final state of the object is (2, 1). This is at odds with the following desirable-seeming invariant:

    record Foo(int x, int y)
    assert new Foo(1, 2).x() == 1
    assert new Foo(1, 2).y() == 2

That is, if we normalize any fields in the ctor, then the relationship of "the constructor argument x and the accessor x() are referring to the same state" appears to be severed.

> A Fraction library can't satisfy both Fran and Peter. It has to choose a place to do this normalization, or else decline to do it at all - but this is no solution, as now the class has very sharp edges, really no more useful than a Pair.

That's true, but what Peter really _wants_ is an IntIntPair class! Because his goals are that it should hold the pair, and do no extra computation (and commit to no additional semantic requirements). And he can easily write one. (Or, he could get over his micro-performance obsession and use Fran's class.)

So, let's wrap up normalization before we get to derived fields. I don't mind the Peter/Fran tension here, but I am mildly uncomfortable at the fact that "new Foo(x, y).x() == x" doesn't always hold, because it complicates an attractive invariant. The actual invariant you get with normalization is slightly more complicated: that there be a projection-embedding pair between the constructor arguments and the representation. Let's write the ctor args and state as a tuple; while ctor \andThen dtor is not an identity, going around the other way (dtor \andThen ctor) is, as long as the normalization is well-defined and consistently applied. This is a tradeoff of simplicity vs usefulness; overall it seems a fair balance.
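
The round-trip property described above can be checked with a small sketch. It uses the record syntax that later shipped in Java 16, which is an assumption relative to the draft syntax in this thread; the `gcd` helper and the demo class are illustrative and not taken from the document.

```java
// Hypothetical sketch of a normalizing record.
record Rational(int num, int denom) {
    Rational {
        if (denom == 0) throw new IllegalArgumentException("zero denominator");
        int g = gcd(Math.abs(num), Math.abs(denom));
        if (denom < 0) g = -g;   // dividing by a negative g makes the denominator positive
        num /= g;                // reassign the parameters; the fields are set from them
        denom /= g;
    }

    private static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}

class RoundTripDemo {
    public static void main(String[] args) {
        Rational r = new Rational(4, 2);

        // ctor then dtor is not the identity: we passed 4 but read back 2.
        System.out.println(r.num() == 4);                                // false

        // dtor then ctor is the identity: rebuilding from the accessors
        // yields an equal record.
        System.out.println(new Rational(r.num(), r.denom()).equals(r));  // true
    }
}
```

The second check is the one that survives normalization, because normalizing an already normalized pair changes nothing.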

> There are two possible solutions I see to this. The first is to permit some kind of derived-field mechanism, preferably lazy. Then, Fraction's constructor would save a thunk for producing the reduced form, and refer to that thunk in the numerator() and denominator() accessors, but ignore it in the #mul method so that we don't pay the cost of reducing unless we want it (here, imagine reducing a Fraction is more expensive than allocating a thunk).

The stricture against derived fields was probably the hardest choice here. On the one hand, strictly derived fields are safe and don't undermine the invariants; on the other, without more help from the language or runtime, we can't enforce that additional fields are actually derived, *and* it will be ultra-super-duper-tempting to make them not so. (I don't see remotely as much temptation to implement maliciously nonconformant accessors or equals methods.) If we allowed additional fields, we would surely have to lock down equals/hashCode.

We're exploring the notion of lazy final fields; I think that would move the balance on allowing additional fields, since the mechanism would push pretty hard to making them truly derived from the record state.

> The second is to simply say that Fraction is a bad candidate for a record, because it wants to decouple its interface from its implementation. I think this is actually the right approach, but it may be unconvincing because of how "obvious" it is that a Fraction is just a pair with some extra calculations to perform based on its components. If we say that Fraction is a bad record, I worry that many more bad records like it will be built, and their subtle problems discovered only after their APIs have been published and committed to. Further, if this is indeed a bad record, I can't think of any other good use case for overriding an accessor method (my first remark).

I'm sympathetic to both sides of this argument.
On the one hand, we want the feature to be useful; on the other, we want it to have a clear, unambiguous user model. A third explanation is that Peter's expectations are either unreasonable or inconsistent with the idea of using someone else's library class.

From kevinb at google.com Thu Mar 7 21:52:14 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Thu, 7 Mar 2019 13:52:14 -0800
Subject: Updated document on data classes and sealed types
In-Reply-To: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com>
References: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com>
Message-ID:

On Thu, Mar 7, 2019 at 12:47 PM Brian Goetz wrote:

> But, there's still a reason to allow overriding accessors: mutable types which don't provide unmodifiable views, arrays being the obvious case.

Okay, mutable field types happen, and I can't think of any other reasonable approach besides allowing accessor overrides.

> There's another consideration, too. We considered outlawing overriding the equals/hashCode method. This goes a long way towards enforcing the desired invariants, but again seems pretty restrictive.

Yes, it's restrictive, and restrictiveness is what's great about records. Is this decision based on compelling use cases or on "seems pretty restrictive"?

> And, having an irregular set of rules about what can be overridden and what can't (e.g., no to equals, yes to toString), seems likely to (a) make the feature harder to learn/understand and (b) lead to lots more "why can't I, I just want to ..." complaints. Better to have an all-or-nothing treatment of overriding,

I like predictable rules, but this doesn't seem like a *major* consideration to me. If one has good use cases and the other doesn't, a different rule for each may be justified.

> Normalization can happen on single arguments or multiple:
>
>     record Person(String name) {
>         public Person {
>             name = name.toUpperCase();
>         }
>     }
>
> (Note that I'm mutating the parameter, which will then get written to the field.)

I'm a bit relieved to hear this. The document seemed to imply that you would assign to the field, then later the remaining fields that weren't DA would be set from the remaining parameters.
I think parameter reassignment is superior because there are never two versions of the data in scope at the same time. (I think the argument *against* parameter reassignment is mainly that it's heresy.)

> Rational numbers are a great example; Guy raised these earlier as well. Where rationals challenge the model here is: the user provided a state vector of (4, 2), but the final state of the object is (2, 1). This is at odds with the following desirable-seeming invariant:
>
>     record Foo(int x, int y)
>     assert new Foo(1, 2).x() == 1
>     assert new Foo(1, 2).y() == 2
>
> That is, if we normalize any fields in the ctor, then the relationship of "the constructor argument x and the accessor x() are referring to the same state" appears to be severed.

Desirable-*seeming*, maybe, but this invariant is *not* actually desirable for a Rational class. I'm not sure whether this is what you are also saying.

(Apart from that Rational class, one could have a "Fraction" class that does behave that way, but *that* is the case that is similar enough to IntIntPair as to be not that interesting for our current discussion. And if it has an equality method that returns true for 1/2 and 2/4, that method should *not* be called equals. I don't think that is what either Peter or Fran is after here; they differ only in what they want the private internal representation to be, for performance reasons.)

> > There are two possible solutions I see to this. The first is to permit some kind of derived-field mechanism, preferably lazy. Then, Fraction's constructor would save a thunk for producing the reduced form, and refer to that thunk in the numerator() and denominator() accessors, but ignore it in the #mul method so that we don't pay the cost of reducing unless we want it (here, imagine reducing a Fraction is more expensive than allocating a thunk).
>
> The stricture against derived fields was probably the hardest choice here.
> On the one hand, strictly derived fields are safe and don't undermine the invariants; on the other, without more help from the language or runtime, we can't enforce that additional fields are actually derived, *and* it will be ultra-super-duper-tempting to make them not so. (I don't see remotely as much temptation to implement maliciously nonconformant accessors or equals methods.) If we allowed additional fields, we would surely have to lock down equals/hashCode.

I still want to understand what the scenario we're worried about here is. Whether the value is computed later using a "lazy fields" feature or eagerly in the constructor, only the record's state is in scope, and sure, people *can* shoot themselves in the foot by calling out to some static method and getting some result not determined by the parameters, but why is this worth worrying about? Do you have an example that's both dangerous and tempting? (Sorry if you've said it before.)

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

From guy.steele at oracle.com Thu Mar 7 22:18:39 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Thu, 7 Mar 2019 17:18:39 -0500
Subject: Updated document on data classes and sealed types
In-Reply-To:
References: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com>
Message-ID: <37C8FE26-00FC-450D-8FA6-3F80FCAE34AE@oracle.com>

> On Mar 7, 2019, at 4:52 PM, Kevin Bourrillion wrote:
>
> On Thu, Mar 7, 2019 at 12:47 PM Brian Goetz wrote:
>
>> Normalization can happen on single arguments or multiple:
>>
>>     record Person(String name) {
>>         public Person {
>>             name = name.toUpperCase();
>>         }
>>     }
>>
>> (Note that I'm mutating the parameter, which will then get written to the field.)
>
> I'm a bit relieved to hear this. The document seemed to imply that you would assign to the field, then later the remaining fields that weren't DA would be set from the remaining parameters. I think parameter reassignment is superior because there are never two versions of the data in scope at the same time. (I think the argument against parameter reassignment is mainly that it's heresy.)

I see no heresy in assigning to a non-final variable.

On the other hand, I'm a great believer in adding `final` to my method parameters, at least when there might be any doubt.

Anyone who complains that "Duh, parameters should *always* be final, why isn't that the default?"
will find me replying "Duh, local variables should *almost always* be final, why isn't that the default?" :-)

--Guy

From forax at univ-mlv.fr Thu Mar 7 22:51:29 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 7 Mar 2019 23:51:29 +0100 (CET)
Subject: Trusted final fields Was: Updated document on data classes and sealed types
In-Reply-To:
References:
Message-ID: <133275984.820503.1551999089052.JavaMail.zimbra@u-pem.fr>

As you may know, there are two kinds of final field in Java: final fields and trusted final fields. The former are classical final fields; the latter are final fields that cannot be changed by reflection and thus are considered "real" final fields by JITs (see https://shipilev.net/jvm/anatomy-quarks/17-trust-nonstatic-final-fields/ for more).

So should record fields always be trusted?

Rémi

From brian.goetz at oracle.com Fri Mar 8 09:04:39 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 8 Mar 2019 09:04:39 +0000
Subject: Updated document on data classes and sealed types
In-Reply-To:
References: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com>
Message-ID: <7BC8C97D-C4F3-4110-9F61-AF5BC7E98772@oracle.com>

> Okay, mutable field types happen, and I can't think of any other reasonable approach besides allowing accessor overrides.

Yes, shaking your fist and shouting "damn you, pervasive mutability" into the void (because that's where the side effects live, of course) is a perfectly rational response.

>> There's another consideration, too. We considered outlawing overriding the equals/hashCode method. This goes a long way towards enforcing the desired invariants, but again seems pretty restrictive.
>
> Yes, it's restrictive, and restrictiveness is what's great about records. Is this decision based on compelling use cases or on "seems pretty restrictive"?

Now that you are caught in the grasp of pervasive mutability, here's another shake. For a well-behaved record R(x,y) and an instance `r`, it seems pretty reasonable to expect that new R(r.x(), r.y()) equals r. If we have

    record KevinHatesLife(int[] ints) { }

and we take the default accessor and equals, this will be true, but we'll be leaking our mutability. There exist cases where that's OK and desired, but there are cases when we want to not be leaky. So we override the ints() accessor to clone on the way out.
But now, when we deconstruct-and-reconstruct, the new KHL is not equal to the old one. To restore the desired equality semantics, we want to override equals as follows:

    boolean equals(Other o) {
        return (o instanceof KevinHatesLife(var is))
            && Arrays.equals(ints, is);
    }

(and of course the same for hashCode.) Seems mean to not let you define equality in this way.

Say it with me: "DAMN YOU, PERVASIVE MUTABILITY!"

> I'm a bit relieved to hear this. The document seemed to imply that you would assign to the field, then later the remaining fields that weren't DA would be set from the remaining parameters. I think parameter reassignment is superior because there are never two versions of the data in scope at the same time. (I think the argument against parameter reassignment is mainly that it's heresy.)

Yes! While it may seem more "efficient" to just write to the field directly, it would make things much more complicated. If our ctor was

    Foo {
        if (x < 0)
            this.x = 0;
    }

now on exit from the ctor, this.x is neither DA nor DU, so the compiler would have to generate some pretty nasty code to replicate the conditions under which it would want to do the assignment. So either the user-written ctor always writes the field (DA), or never does (DU), or it's an error.

>> The stricture against derived fields was probably the hardest choice here. On the one hand, strictly derived fields are safe and don't undermine the invariants; on the other, without more help from the language or runtime, we can't enforce that additional fields are actually derived, *and* it will be ultra-super-duper-tempting to make them not so. (I don't see remotely as much temptation to implement maliciously nonconformant accessors or equals methods.) If we allowed additional fields, we would surely have to lock down equals/hashCode.
>
> I still want to understand what the scenario we're worried about here is.
> Whether the value is computed later using a "lazy fields" feature or eagerly in the constructor, only the record's state is in scope, and sure, people can shoot themselves in the foot by calling out to some static method and getting some result not determined by the parameters, but why is this worth worrying about? Do you have an example that's both dangerous and tempting? (Sorry if you've said it before.)

If we did allow them, the next thing people would ask (Alan already did, and you made this same comment in an earlier round) is whether they can be mutable, so that derived fields can be lazily derived. Now, records are a combination of a "true record" plus an unconstrained bag of mutable state. And what do you think the chances are that this state won't make it into equals/hashCode semantics? Now, we've completely lost our grasp on the semantic constraint that a record is "just" its state.

We could try to put the toothpaste back in the tube by clamping down on the ability to override equals/hashCode, but now the previous example rears its head again.

Worse, it does a lot of damage to the mental model of what records are for. We know they're not about writing a class with fewer lines of code (though that's an advantage), they're about transparent carriers for a defined unit of state. But, if users routinely see records in the wild with lots of extra state stapled to the side, maybe even affecting equals/hashCode, this design center is going to be harder to see (this ultimately leads to a feedback loop, where users who can't understand what the feature is for will demand more features that are out of line with its design center, further obscuring the design center.)

As I said to Alan, uncomfortable tradeoffs indeed.
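
The parameter-reassignment point made earlier in this message can be sketched in the record syntax that later shipped in Java 16 (an assumption relative to the draft discussed here). The record name `Clamped` is hypothetical. In the shipped feature, as far as I know, a compact constructor may not assign `this.x` at all, which matches the "never writes the field" branch of the rule above.

```java
// Hypothetical sketch: conditional normalization by reassigning the
// constructor parameter rather than writing the field directly.
record Clamped(int x) {
    Clamped {
        if (x < 0) {
            x = 0;   // reassigns the parameter only; the field is set from it on exit
        }
    }
}

class ClampedDemo {
    public static void main(String[] args) {
        System.out.println(new Clamped(-5).x());  // 0
        System.out.println(new Clamped(7).x());   // 7
    }
}
```

Because the body only ever touches the parameter, the compiler can unconditionally copy every parameter into its field on exit, with no need to track which fields were conditionally written.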

From brian.goetz at oracle.com Fri Mar 8 09:10:31 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 8 Mar 2019 09:10:31 +0000
Subject: Trusted final fields Was: Updated document on data classes and sealed types
In-Reply-To: <133275984.820503.1551999089052.JavaMail.zimbra@u-pem.fr>
References: <133275984.820503.1551999089052.JavaMail.zimbra@u-pem.fr>
Message-ID:

There needs to be a name for the phenomenon where, when we have a wrong default (e.g., mutable method parameters), every time a new language context comes up where new code will be written, it seems overwhelmingly tempting to say "but surely we can flip the default for _them_." (We saw the same thing with "lambda parameters should be implicitly final.")

It's always good to knock on this door, but most of the time the door is locked by the latch of either migration compatibility, or complexity management, or both. In this case (as with lambda parameters), I think it's both. The more differences there are between a record and the equivalent boilerplate, the more pain users will have migrating between them. And a rule like this is an isolated fact that users have to carry around, rather than a general rule, increasing the complexity of the language semantics.

I would love to get to "trust final fields" everywhere; that's a rule that neither gets in the way of migration nor is an isolated fact you have to keep track of separately. (In fact, it's really just what you would expect.) So I share your goal, but I don't think this is the train that goes there.

From kevinb at google.com Fri Mar 8 20:43:54 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 8 Mar 2019 12:43:54 -0800
Subject: Updated document on data classes and sealed types
In-Reply-To:
References:
Message-ID:

Re: annotations,

Doc says, "Record components constitute a new place to put annotations; we'll likely want to extend the @Target meta-annotation to reflect this."
I'm sure we discussed this before, but I also expect to be able to put any METHOD-, FIELD- or PARAMETER-targeted annotation on a record component, and have that annotation appear to be present on the synthesized accessor/field/constructor-parameter. Is that sensible?

(As for records themselves, I expect they are targeted with TYPE just as enums/interfaces/"plain old classes" (jeesh, is there any term that means the latter?).)

--
Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kevinb at google.com Fri Mar 8 21:45:17 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Fri, 8 Mar 2019 13:45:17 -0800
Subject: Updated document on data classes and sealed types
In-Reply-To: <7BC8C97D-C4F3-4110-9F61-AF5BC7E98772@oracle.com>
References: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com> <7BC8C97D-C4F3-4110-9F61-AF5BC7E98772@oracle.com>
Message-ID:

On Fri, Mar 8, 2019 at 1:04 AM Brian Goetz wrote:

> For a well-behaved record R(x,y) and an instance `r`, it seems pretty reasonable to expect that new R(r.x(), r.y()) equals r. If we have
>
>     record KevinHatesLife(int[] ints) { }
>
> and we take the default accessor and equals, this will be true, but we'll be leaking our mutability. There exist cases where that's OK and desired, but there are cases when we want to not be leaky. So we override the ints() accessor to clone on the way out. But now, when we deconstruct-and-reconstruct, the new KHL is not equal to the old one. To restore the desired equality semantics, we want to override equals as follows:
>
>     boolean equals(Object o) {
>         return (o instanceof KevinHatesLife(var is)) &&
>             Arrays.equals(ints, is);
>     }
>
> (and of course the same for hashCode.) Seems mean to not let you define equality in this way.
>
> Say it with me: "DAMN YOU, PERVASIVE MUTABILITY!"

Let me see if I have this right. If you want to use an array field, and you are conscientious and want to avoid bugs, then

1. You need constructor boilerplate scaling with the number of array fields
2. You need accessor boilerplate scaling with the number of array fields
3. You need to completely implement equals, hashCode, and toString yourself if you have even *one* array.

The pattern here is how tidily it seems to undermine the benefits of records. :-(

And other mutable types may not share #3, but do #1 and #2.

In other words, the records design simply does not play well with mutable types, just as it doesn't with the fields being non-final. They are a mismatch.
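
As a rough illustration of why the three items in the list above go together, here is a sketch that is not part of the original message. It assumes the record syntax that later shipped in Java 16, which differs from the draft syntax quoted in this thread, and the names Leaky, HalfFixed and RoundTripDemo are hypothetical. It shows that cloning in the constructor and accessor (items 1 and 2) without also overriding equals (item 3) breaks the deconstruct-and-reconstruct expectation.

```java
import java.util.Arrays;

// Hypothetical example: relies entirely on the generated members.
record Leaky(int[] ints) { }

// Hypothetical example: copies on the way in and out, but keeps the
// generated equals/hashCode, which compare arrays by reference.
record HalfFixed(int[] ints) {
    HalfFixed {
        ints = ints.clone();   // defensive copy on the way in
    }

    @Override
    public int[] ints() {
        return ints.clone();   // defensive copy on the way out
    }
}

final class RoundTripDemo {
    public static void main(String[] args) {
        Leaky leaky = new Leaky(new int[] {1, 2, 3});
        // The same array reference goes back in, so the generated equals holds.
        System.out.println(new Leaky(leaky.ints()).equals(leaky));
        // The price: callers can change the record's state through the accessor.
        leaky.ints()[0] = 99;
        System.out.println(Arrays.toString(leaky.ints()));

        HalfFixed fixed = new HalfFixed(new int[] {1, 2, 3});
        // Each side now holds a different copy, so the generated equals fails.
        System.out.println(new HalfFixed(fixed.ints()).equals(fixed));
    }
}
```

The third println reports false, which is the reason a conscientious author ends up writing equals and hashCode by hand once the accessor clones.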
But here's my real question. Even if we enable users to do these workarounds - is this even the remedy that we want to recommend in our documentation? There's another remedy that I think is strictly better: *get* yourself an immutable type to use, even if you have to create a wrapper yourself. It's a small investment, it makes accessors as cheap as expected, and the very next time you want to use this field type in a record you'll already be very glad. And sometimes you will discover that a suitable immutable type already exists. Maybe the ones people create for this reason will get shared more. This is a pretty good picture of the world imho.

This is *not* a religious argument about "immutable good mutable bad". And this is not pretending that we can uninvent mutability. Mutability has its places, many of them. This is asking whether records really can be one of those places. Are we *really gaining anything* with this approach to trying to accommodate it? We're already comfortable denying mutable fields.

> Yes! While it may seem more "efficient" to just write to the field directly, it would make things much more complicated. If our ctor was
>
>     Foo {
>         if (x < 0)
>             this.x = 0;
>     }
>
> now on exit from the ctor, this.x is neither DA nor DU, so the compiler would have to generate some pretty nasty code to replicate the conditions under which it would want to do the assignment. So either the user-written ctor always writes the field (DA), or never does (DU), or it's an error.

Well, I don't care about nasty compiler code; it's still the right way to do it. :-) So that's a note for the next version of the doc.

>> The stricture against derived fields was probably the hardest choice here.
>> On the one hand, strictly derived fields are safe and don't undermine the invariants; on the other, without more help from the language or runtime, we can't enforce that additional fields are actually derived, *and* it will be ultra-super-duper-tempting to make them not so. (I don't see remotely as much temptation to implement maliciously nonconformant accessors or equals methods.) If we allowed additional fields, we would surely have to lock down equals/hashCode.
>
> I still want to understand what the scenario we're worried about here is. Whether the value is computed later using a "lazy fields" feature or eagerly in the constructor, only the record's state is in scope, and sure, people *can* shoot themselves in the foot by calling out to some static method and getting some result not determined by the parameters, but why is this worth worrying about? Do you have an example that's both dangerous and tempting? (Sorry if you've said it before.)
>
> If we did allow them, the next thing people (Alan already did, and you made this same comment in an earlier round) would ask is whether they can be mutable, so that derived fields can be lazily derived. Now, records are a combination of a "true record" plus an unconstrained bag of mutable state.

My question did actually exclude this option. We should tell those people no.

> And what do you think the chances are that this state won't make it into equals/hashCode semantics?

If it's derived, it doesn't hurt that much; if it's not, why are they working so hard to not make it a regular record field?

This is part of what I'm talking about when I say "sure, they *can* shoot themselves in the foot". What is a realistic example we are worried about?

> Now, we've completely lost our grasp on the semantic constraint: that a record is "just" its state.

Well, it's still just its *real* state.
> We could try to put the toothpaste back in the tube by clamping down on the ability to override equals/hashCode, but now the previous example rears its head again.
>
> Worse, it does a lot of damage to the mental model of what records are for. We know they're not about writing a class with fewer lines of code (though that's an advantage), they're about transparent carriers for a defined unit of state. But, if users routinely see records in the wild with lots of extra state stapled to the side, maybe even affecting equals/hashCode, this design center is going to be harder to see (this ultimately leads to a feedback loop, where users who can't understand what the feature is for will demand more features that are out of line with its design center, further obscuring the design center.)

Just saying that I don't share this concern. And I get concerned about everything. :-)

Derived state is common and not complicated. I think it's just the fear of this being abused for non-derived state that we're talking about. Isn't it just another case like "people can pass side-effecting functions to map()"?

I'm trying to figure out if there's somewhere I'm being logically inconsistent with myself, because it feels weird to be simultaneously arguing for allowing something you don't want to allow and for disallowing something you do want to allow. :-)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Sat Mar 9 12:47:28 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2019 12:47:28 +0000
Subject: Records and annotations (was: Updated document on data classes and sealed types)
In-Reply-To:
References:
Message-ID: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>

This came up before, but we didn't reach a conclusion.

A record component is more than just the lower-level members (fields, accessors, ctor params) it gets desugared to.
So it seems reasonable that it be considered an annotatable program element, and that reflection expose directly the annotations on record components (separately from any annotations on the class members that may or may not derive from desugaring of records.)

But, that still leaves the question of whether the desugaring should, or should not be, transparent to annotations. My sense is that pushing annotations down to fields, ctor params, and accessors _seems_ friendly, but also opens a number of uncomfortable questions.

 - Should we treat the cases where @A has a target of RECORD_COMPONENT separately from the cases where it does not, such as only pushing the annotation down to members when the target does not include RECORD_COMPONENT? That is, is the desire to push down annotations based on "well, what if we want to apply a 'legacy' annotation"? If so, this causes a migration compatibility issue; if someone adds RC to the targets list for @A, then when the record is recompiled, the location of the annotations will change, possibly changing the behavior of frameworks that encounter the record.

 - What if @A has a target set of { field, parameter }, but for some reason the user does _not_ want the annotation pushed down? Tough luck? Redeclare the member without the annotation?

 - If the user explicitly redeclares the member (ctor, accessor), what happens? Do we still implicitly push down annotations from record components to the explicit member? Will this be confusing when the source says "@B int x() -> x", but reflection yields both @A and @B as annotations on x()?

All of which causes me to back up and say: what is the motivation for pushing these down to implicit members, other than "general friendliness"? Is this a migration strategy for migrating existing code to use records, without having to redeclare annotations on the members? And if so, how useful is it really?
Will users want to throw the union of field/accessor/ctor parameter annotations on the record components just to gain compatibility with their existing code?

My gut sense is that the stable solution is to make record component a new kind of target, and encourage frameworks to learn about these, rather than trying to fake out frameworks by emulating legacy behavior.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Sat Mar 9 13:07:59 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2019 13:07:59 +0000
Subject: Records and mutable components (was: Updated document on data classes and sealed types)
In-Reply-To:
References: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com> <7BC8C97D-C4F3-4110-9F61-AF5BC7E98772@oracle.com>
Message-ID:

Splitting this into two topics.

>     boolean equals(Object o) {
>         return (o instanceof KevinHatesLife(var is)) && Arrays.equals(ints, is);
>     }
>
> (and of course the same for hashCode.) Seems mean to not let you define equality in this way.
>
> Say it with me: "DAMN YOU, PERVASIVE MUTABILITY!"
>
> Let me see if I have this right. If you want to use an array field, and you are conscientious and want to avoid bugs, then
>
> 1. You need constructor boilerplate scaling with the number of array fields
> 2. You need accessor boilerplate scaling with the number of array fields
> 3. You need to completely implement equals, hashCode, and toString yourself if you have even *one* array.
>
> The pattern here is how tidily it seems to undermine the benefits of records. :-(

I get where you're going. (Meta observation: discussions like this just underscore the importance of having a clear, crisp "what are records for" message, which plays into the other question.)

You could call this "undermining", or you could call this "pay to play."
A record with lots of immutable components, and one pesky array, will still benefit from records, just a little bit less.

Here's a valid record:

    record MutableArrayPair(int[] as, int[] bs) { }

Before you recoil in horror ("your mutability is showing, mon dieu!"), let's remember that the requirement to encapsulate mutability is not a law of nature, as much as a statement of which boundaries you care about. Code that trusts its clients is perfectly justified in writing the above record, and surely would not want to be shut down by the rules. (Note this is the semantics you'd get with public final fields, or the equivalent.)

So I think the above is a valid design choice for records, which is responsible in some situations and irresponsible in others. And we get the nice invariant that deconstruction + reconstruction is an identity under equals(), because both the accessors and the equals method work the same way.

Here's another valid record:

    record EncapsulatedArrayPair(int[] as, int[] bs) {
        public EncapsulatedArrayPair {
            as = as.clone();
            bs = bs.clone();
        }

        public int[] as() -> as.clone();
        public int[] bs() -> bs.clone();
        public int hashCode() -> Objects.hash(Arrays.hashCode(as), Arrays.hashCode(bs));
        public boolean equals(Object o) -> o instanceof EncapsulatedArrayPair eap
            && Arrays.equals(as, eap.as) && Arrays.equals(bs, eap.bs);
    }

That's a valid record too, and shares the nice identity with its leaky cousin: deconstruction + reconstruction yields an equals() instance.

Now, you say: "but that's a stupid record, you reimplement almost all the methods!" Which would be true in Billy-World, where records are only for boilerplate reduction. But records are a semantic statement: that their API is coupled to their representation (and hence, we get pattern-friendliness, and that certain API elements have useful invariants.) Plus, we haven't redeclared _all_ the members, and if there were other fields, that would be even less true.
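
For readers on a current JDK, the EncapsulatedArrayPair sketch above can be written roughly as follows. This is an illustration under assumptions rather than part of the original message: it uses the record syntax that later shipped in Java 16 (ordinary method bodies instead of the draft "->" form), a simple 31-multiplier hash in place of Objects.hash, and a hypothetical EncapsulationDemo class to exercise the round trip.

```java
import java.util.Arrays;

// Sketch only: the draft-syntax record from the message, restated in shipped syntax.
record EncapsulatedArrayPair(int[] as, int[] bs) {
    // Compact canonical constructor: defensive copies on the way in.
    EncapsulatedArrayPair {
        as = as.clone();
        bs = bs.clone();
    }

    // Accessors hand out copies, so callers cannot reach the stored arrays.
    @Override
    public int[] as() {
        return as.clone();
    }

    @Override
    public int[] bs() {
        return bs.clone();
    }

    // Content-based equality, so that a reconstructed record equals the original.
    @Override
    public boolean equals(Object o) {
        return o instanceof EncapsulatedArrayPair other
                && Arrays.equals(as, other.as)
                && Arrays.equals(bs, other.bs);
    }

    @Override
    public int hashCode() {
        return 31 * Arrays.hashCode(as) + Arrays.hashCode(bs);
    }
}

// Hypothetical driver class, present only to show the round-trip identity.
final class EncapsulationDemo {
    public static void main(String[] args) {
        EncapsulatedArrayPair p = new EncapsulatedArrayPair(new int[] {1, 2}, new int[] {3});
        EncapsulatedArrayPair q = new EncapsulatedArrayPair(p.as(), p.bs());
        System.out.println(p.equals(q));   // true: equality is by array contents
        p.as()[0] = 42;                    // writes to a copy, not to p's state
        System.out.println(p.as()[0]);     // 1
    }
}
```

The generated toString is left alone here, as in the original sketch; it would print array identities rather than contents, which is the remaining part of the cost being weighed in this discussion.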
> In other words, the records design simply does not play well with mutable types, just as it doesn't with the fields being non-final. They are a mismatch.

I think mismatch is too strong; the interaction between mutable components (especially those that don't admit a nice immutable wrapper like Collections::unmodifiableXxx) and records causes friction, which requires the user to do some extra work to make up the difference. But this isn't "glass 100% empty", it's "glass less full than it would be in a perfect world."

I kind of like the "pay to play" nature; if you want to leak mutability, you can; if you want to plug it, you can, but you might be irritated at the cost (or not!), and if you are, you might seek alternate strategies (like more immutability, immutable wrappers, etc.)

> But here's my real question. Even if we enable users to do these workarounds - is this even the remedy that we want to recommend in our documentation? There's another remedy that I think is strictly better: get yourself an immutable type to use, even if you have to create a wrapper yourself. It's a small investment, it makes accessors as cheap as expected, and the very next time you want to use this field type in a record you'll already be very glad. And sometimes you will discover that a suitable immutable type already exists. Maybe the ones people create for this reason will get shared more. This is a pretty good picture of the world imho.

So, I agree with the world you want to get to. And I think natural laziness will help pull users there too; the opportunity to do some fixup in one place seems more attractive than N places. But I think _prohibiting_ these things (which I was initially attracted to) does not lead us to a stable place. I think this one is better to encourage through carrot (e.g., value array wrapper classes, freezable arrays) than stick (records + arrays = leaked mutability).

> This is not a religious argument about "immutable good mutable bad".
> And this is not pretending that we can uninvent mutability. Mutability has its places, many of them. This is asking whether records really can be one of those places. Are we really gaining anything with this approach to trying to accommodate it? We're already comfortable denying mutable fields.

In part, we're comfortable denying mutable fields because we _can_; the language has a concept of field mutability. It does not, for better or worse, have a concept of deep immutability. So attempts to discourage deep mutability (such as, by making it more dangerous when it comes in contact with records) feels like the wrong end of the lever.

> Well, I don't care about nasty compiler code; it's still the right way to do it. :-) So that's a note for the next version of the doc.

I think we're in agreement; the "wrong" way to do it is also really hard to implement correctly. Looking ahead (please, let's not bikeshed this now), I want to take this idiom ("bound constructors") from records and eventually make it possible to declare a "constructor whose parameters are bound to fields", and yield the same sort of "we'll fill in the error-prone boilerplate for you" result.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Sat Mar 9 13:37:16 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2019 13:37:16 +0000
Subject: Records and derived fields (was: Updated document on data classes and sealed types)
In-Reply-To:
References: <0F1FBB17-0D83-416A-A2D0-04E42C0D60E1@oracle.com> <7BC8C97D-C4F3-4110-9F61-AF5BC7E98772@oracle.com>
Message-ID: <86781A99-A85E-410C-8FC3-0C6FCE8717D6@oracle.com>

The second of the two topics raised by Kevin's note.

The subject of ancillary fields seems to be the hardest question to answer; none of the answers seem great. But let's tease apart some of the use cases.
I think Kevin's comment here is, at root: "but derived fields are super-useful, and seem perfectly safe, isn't there some way to wedge these into the model?" Correct me if I'm wrong, but I would think you would agree that arbitrary mutable fields in records are _not_ a great idea. So the game here is, is it possible to carve out a space where derived fields are possible, but non-derived ones are not? (And, I think there is, if we stop thinking about "fields" and start thinking about semantics. Derived fields are a mechanism, in aid of: is it possible to ensure that a computation on record state is done at-most-once? More on that later.)

>> I still want to understand what the scenario we're worried about here is. Whether the value is computed later using a "lazy fields" feature or eagerly in the constructor, only the record's state is in scope, and sure, people can shoot themselves in the foot by calling out to some static method and getting some result not determined by the parameters, but why is this worth worrying about? Do you have an example that's both dangerous and tempting? (Sorry if you've said it before.)
>
> If we did allow them, the next thing people (Alan already did, and you made this same comment in an earlier round) would ask is whether they can be mutable, so that derived fields can be lazily derived. Now, records are a combination of a "true record" plus an unconstrained bag of mutable state.
>
> My question did actually exclude this option. We should tell those people no.

OK, so you're suggesting: ancillary final fields with initializers are OK. Not a totally silly option (we did talk about it before), but let's look at how it affects the user perception of what the feature is for, and then try and make a cost-benefit comparison between this and the base case (no additional fields.)

Note that the benefit we're aiming for here is purely an optimization; the avoidance of recalculating derived state.
(Nagging question: if this optimization _is_ super-important, is this the best way to ensure it?)

I think you have also noted in the past that there are lots of ways to get around the restriction, at various degrees of obviously-missing-the-point:

    final Foo[] notReallyDerived = new Foo[1]; // effectively, a mutable Foo field
    static final WeakHashMap // same effect, just more absurd

> And what do you think the chances are that this state won't make it into equals/hashCode semantics?
>
> If it's derived, it doesn't hurt that much; if it's not, why are they working so hard to not make it a regular record field?
> This is part of what I'm talking about when I say "sure, they can shoot themselves in the foot". What is a realistic example we are worried about?

The language doesn't have a notion of "derived quantity", so any attempt to restrict the relaxation to derived quantities will be an approximation. But I think the "why are they working so hard" question answers itself: concision! And the more complex the boundary of the records feature is, the more that Billy will be confused into thinking this is just an overly-complicated, sharp-edged, frankly-crappy way to get concision.

If we want to carve out an exception for "derived state", let's go there more directly (in a way that benefits all classes, not just records). One way we discussed was something like this:

    lazy final String fullName = first + last;

and then allowing records to have lazy fields. This is a stronger hint that this is no ordinary field, but ultimately is still too easy to abuse by initializing it with a one-element array.

A more direct way to get there is to introduce a semantic notion that a computation should be done at most once:

    record Name(String first, String last) {
        __at_most_once String fullName() -> first + last;
    }

This has a lot of advantages over the field approach:

 - We have said directly what we mean, in a way that doesn't "leak"
its implementation mechanism (fields);
 - The runtime can probably optimize more directly and flexibly;
 - We haven't distorted the set of class members for a performance concern;
 - Mechanism usable equally by records and non-records.

A downside is that we don't have this mechanism yet, and we don't really want to hold up records to get it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Sat Mar 9 13:57:47 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2019 13:57:47 +0000
Subject: Records and annotations (was: Updated document on data classes and sealed types)
In-Reply-To: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
Message-ID: <81D633A4-B5F9-48BF-A883-AFFEC83E46EE@oracle.com>

This raises a related question, which is: what if the author likes the default implementation, but wants to add more annotations (or Javadoc)?

Currently, the story is decent for constructors; declaring an empty `Foo { }` constructor recreates the default behavior. There is no equivalent for accessors, equals, hashCode, or toString. We toyed with a "default.m()" syntax in the past:

    record R(int x, int y) {
        @MyAnnotation
        public boolean equals(Object o) -> default.equals(o);
    }

which doesn't seem so bad, and is surely better than trying (and maybe failing) to reproduce the default behavior imperatively.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From brian.goetz at oracle.com Sat Mar 9 14:08:11 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Sat, 9 Mar 2019 14:08:11 +0000
Subject: Records and annotations (was: Updated document on data classes and sealed types)
In-Reply-To: <81D633A4-B5F9-48BF-A883-AFFEC83E46EE@oracle.com>
References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com> <81D633A4-B5F9-48BF-A883-AFFEC83E46EE@oracle.com>
Message-ID: <012CB8B0-9EC4-4B1D-929F-E0D69FB3F579@oracle.com>

An alternative is to allow the body to be left off entirely:

    record R(int x, int y) {
        @MyAnnotation
        public boolean equals(Object o);
    }

> On Mar 9, 2019, at 1:57 PM, Brian Goetz wrote:
>
> This raises a related question, which is: what if the author likes the default implementation, but wants to add more annotations (or Javadoc)?
>
> Currently, the story is decent for constructors; declaring an empty `Foo { }` constructor recreates the default behavior. There is no equivalent for accessors, equals, hashCode, or toString. We toyed with a "default.m()" syntax in the past:
>
>     record R(int x, int y) {
>         @MyAnnotation
>         public boolean equals(Object o) -> default.equals(o);
>     }
>
> which doesn't seem so bad, and is surely better than trying (and maybe failing) to reproduce the default behavior imperatively.
>
>> On Mar 9, 2019, at 12:47 PM, Brian Goetz wrote:
>>
>> This came up before, but we didn't reach a conclusion.
>>
>> A record component is more than just the lower-level members (fields, accessors, ctor params) it gets desugared to. So it seems reasonable that it be considered an annotatable program element, and that reflection expose directly the annotations on record components (separately from any annotations on the class members that may or may not derive from desugaring of records.)
>>
>> But, that still leaves the question of whether the desugaring should, or should not be, transparent to annotations. My sense is that pushing annotations down to fields, ctor params, and accessors _seems_ friendly, but also opens a number of uncomfortable questions.
>>
>> - Should we treat the cases where @A has a target of RECORD_COMPONENT separately from the cases where it does not, for example by only pushing the annotation down to members when the target does not include RECORD_COMPONENT? That is, is the desire to push down annotations based on "well, what if we want to apply a 'legacy' annotation?" If so, this causes a migration compatibility issue; if someone adds RC to the targets list for @A, then when the record is recompiled, the location of the annotations will change, possibly changing the behavior of frameworks that encounter the record.
>>
>> - What if @A has a target set of { field, parameter }, but for some reason the user does _not_ want the annotation pushed down? Tough luck? Redeclare the member without the annotation?
>>
>> - If the user explicitly redeclares the member (ctor, accessor), what happens? Do we still implicitly push down annotations from record components to the explicit member? Will this be confusing when the source says "@B int x() -> x", but reflection yields both @A and @B as annotations on x()?
>>
>> All of which causes me to back up and say: what is the motivation for pushing these down to implicit members, other than "general friendliness"?
Is this a migration strategy for migrating existing code to use records, without having to redeclare annotations on the members? And if so, how useful is it really? Will users want to throw the union of field/accessor/ctor parameter annotations on the record components just to gain compatibility with their existing code? >> >> My gut sense is that the stable solution is to make record component a new kind of target, and encourage frameworks to learn about these, rather than trying to fake out frameworks by emulating legacy behavior. >> >> >>> On Mar 8, 2019, at 8:43 PM, Kevin Bourrillion > wrote: >>> >>> Re: annotations, >>> >>> Doc says, "Record components constitute a new place to put annotations; we'll likely want to extend the @Target meta-annotation to reflect this." >>> >>> I'm sure we discussed this before, but I also expect to be able to put any METHOD-, FIELD- or PARAMETER-targeted annotation on a record component, and have that annotation appear to be present on the synthesized accessor/field/constructor-parameter. Is that sensible? >>> >>> (As for records themselves, I expect they are targeted with TYPE just as enums/interfaces/"plain old classes" (jeesh, is there any term that means the latter?).) >>> >>> >>> >>> >>> >>> >>> >>> On Fri, Mar 1, 2019 at 12:16 PM Brian Goetz > wrote: >>> I've updated the document on data classes here: >>> >>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html >>> >>> (older versions of the document are retained in the same directory for >>> historical comparison.) >>> >>> While the previous version was mostly about tradeoffs, this version >>> takes a much more opinionated interpretation of the feature, offering >>> more examples of use cases of where it is intended to be used (and not >>> used). 
Many of the "under consideration" flexibilities (extension, >>> mutability, additional fields) have collapsed to their more restrictive >>> form; while some people will be disappointed because it doesn't solve >>> the worst of their boilerplate problems, our conclusion is: records are >>> a powerful feature, but they're not necessarily the delivery vehicle for >>> easing all the (often self-inflicted) pain of JavaBeans. We can >>> continue to explore relief for these situations too as separate >>> features, but trying to be all things to all classes has delayed the >>> records train long enough, and I'm convince they're separate problems >>> that want separate solutions. Time to let the records train roll. >>> >>> I've also combined the information on sealed types in this document, as >>> the two are so tightly related. >>> >>> Comments welcome. >>> >>> >>> -- >>> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevinb at google.com Sat Mar 9 17:56:51 2019 From: kevinb at google.com (Kevin Bourrillion) Date: Sat, 9 Mar 2019 09:56:51 -0800 Subject: Records and annotations (was: Updated document on data classes and sealed types) In-Reply-To: <012CB8B0-9EC4-4B1D-929F-E0D69FB3F579@oracle.com> References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com> <81D633A4-B5F9-48BF-A883-AFFEC83E46EE@oracle.com> <012CB8B0-9EC4-4B1D-929F-E0D69FB3F579@oracle.com> Message-ID: Only time to respond to the very last bit for now: Interesting! The criticism could be "that makes it look abstract", but I'm not sure that's even a problem. If you had to put this information into an interface implemented by the record, this is what you would write there (and hope the annotation processor implements automatic annotation-inheritance from supermethods), so it's pretty reasonable to write here too. 
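For illustration only, the interface-based workaround Kevin describes can be sketched in plain Java. Records did not exist when this thread was written, so a final class stands in for `record R(int x, int y)`. `MyAnnotation` is the placeholder name used in the thread; `AnnotatedEquals`, `SupermethodAnnotations`, and its helper methods are hypothetical names invented for this sketch. Reflection does not copy method annotations from supermethods, so a tool that wants "annotation inheritance" has to look at the implemented interfaces itself, roughly like this:

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Placeholder annotation named after the example in the thread.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MyAnnotation {}

// Hypothetical interface carrying the annotated, body-less declaration.
interface AnnotatedEquals {
    @MyAnnotation
    boolean equals(Object o);
}

// A plain final class standing in for `record R(int x, int y)`.
final class R implements AnnotatedEquals {
    private final int x;
    private final int y;

    R(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof R && ((R) o).x == x && ((R) o).y == y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}

final class SupermethodAnnotations {
    // True if `type` itself declares the method with @MyAnnotation on it.
    static boolean declaresAnnotated(Class<?> type, String name, Class<?>... params) {
        try {
            return type.getDeclaredMethod(name, params).isAnnotationPresent(MyAnnotation.class);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // True if the method is annotated on the type or on a directly implemented interface.
    static boolean isAnnotated(Class<?> type, String name, Class<?>... params) {
        if (declaresAnnotated(type, name, params)) {
            return true;
        }
        for (Class<?> iface : type.getInterfaces()) {
            if (declaresAnnotated(iface, name, params)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The class's own equals carries no annotation ...
        System.out.println(declaresAnnotated(R.class, "equals", Object.class)); // false
        // ... but the lookup that also consults interfaces finds it.
        System.out.println(isAnnotated(R.class, "equals", Object.class)); // true
    }
}
```

The sketch only walks directly implemented interfaces; a real annotation processor or framework would decide for itself how far up the hierarchy to look.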
On Sat, Mar 9, 2019 at 6:08 AM Brian Goetz wrote: > An alternate is to allow the body to be left off entirely: > > record R(int x, int y) { > @MyAnnotation > public boolean equals(Object o); > } > > > On Mar 9, 2019, at 1:57 PM, Brian Goetz wrote: > > This raises a related question, which is: what if the author likes the > default implementation, but wants to add more annotations (or Javadoc)? > > Currently, the story is decent for constructors; declaring an empty `Foo { > }` constructor recreates the default behavior. There is no equivalent for > accessors, equals, hashCode, or toString. We toyed with a ?default.m()? > syntax in the past: > > record R(int x, int y) { > @MyAnnotation > public boolean equals(Object o) -> default.equals(o); > } > > which doesn?t seem so bad, and is surely better than trying (and maybe > failing) to reproduce the default behavior imperatively. > > > On Mar 9, 2019, at 12:47 PM, Brian Goetz wrote: > > This came up before, but we didn?t reach a conclusion. > > A record component is more than just the lower-level members (fields, > accessors, ctor params) it gets desugared too. So it seems reasonable that > it be considered an annotatable program element, and that reflection expose > directly the annotations on record components (separately from any > annotations on the class members that may or may not derive from desugaring > of records.) > > But, that still leaves the question of whether the desugaring should, or > should not be, transparent to annotations. My sense is that pushing > annotations down to fields, ctor params, and accessors _seems_ friendly, > but also opens a number of uncomfortable questions. > > - Should we treat the cases where @A has a target of RECORD_COMPONENT, > separately from the cases where it does not, such as, only push the > annotation down to members when the target does not include > RECORD_COMPONENT? That is, is the desire to push down annotations based on > ?well, what if we want to apply a ?legacy? 
annotation? If so, this causes > a migration compatibility issue; if someone adds RC to the targets list for > @A, then when the record is recompiled, the location of the annotations > will changed, possibly changing the behavior of frameworks that encounter > the record. > > - What if @A has a target set of { field, parameter }, but for some > reason the user does _not_ want the annotation pushed down? Tough luck? > Redeclare the member without the annotation? > > - If the user explicitly redeclares the member (ctor, accessor), what > happens? Do we still implicitly push down annotations from record > components to the explicit member? Will this be confusing when the source > says ?@B int x() -> x?, but reflection yields both @A and @B as annotations > on x()? > > All of which causes me to back up and say: what is the motivation for > pushing these down to implicit members, other than ?general friendliness?? > Is this a migration strategy for migrating existing code to use records, > without having to redeclare annotations on the members? And if so, how > useful is it really? Will users want to throw the union of > field/accessor/ctor parameter annotations on the record components just to > gain compatibility with their existing code? > > My gut sense is that the stable solution is to make record component a new > kind of target, and encourage frameworks to learn about these, rather than > trying to fake out frameworks by emulating legacy behavior. > > > On Mar 8, 2019, at 8:43 PM, Kevin Bourrillion wrote: > > Re: annotations, > > Doc says, "Record components constitute a new place to put annotations; > we'll likely want to extend the @Target meta-annotation to reflect this." > > I'm sure we discussed this before, but I also expect to be able to put any > METHOD-, FIELD- or PARAMETER-targeted annotation on a record component, and > have that annotation appear to be present on the synthesized > accessor/field/constructor-parameter. Is that sensible? 
> > (As for records themselves, I expect they are targeted with TYPE just as > enums/interfaces/"plain old classes" (jeesh, is there any term that means > the latter?).) > > > > > > > > On Fri, Mar 1, 2019 at 12:16 PM Brian Goetz > wrote: > >> I've updated the document on data classes here: >> >> http://cr.openjdk.java.net/~briangoetz/amber/datum.html >> >> (older versions of the document are retained in the same directory for >> historical comparison.) >> >> While the previous version was mostly about tradeoffs, this version >> takes a much more opinionated interpretation of the feature, offering >> more examples of use cases of where it is intended to be used (and not >> used). Many of the "under consideration" flexibilities (extension, >> mutability, additional fields) have collapsed to their more restrictive >> form; while some people will be disappointed because it doesn't solve >> the worst of their boilerplate problems, our conclusion is: records are >> a powerful feature, but they're not necessarily the delivery vehicle for >> easing all the (often self-inflicted) pain of JavaBeans. We can >> continue to explore relief for these situations too as separate >> features, but trying to be all things to all classes has delayed the >> records train long enough, and I'm convince they're separate problems >> that want separate solutions. Time to let the records train roll. >> >> I've also combined the information on sealed types in this document, as >> the two are so tightly related. >> >> Comments welcome. >> > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
From brian.goetz at oracle.com Tue Mar 12 12:34:15 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 12 Mar 2019 08:34:15 -0400
Subject: Fwd: Record invariants vs constructor/destructor identity
References:
Message-ID: <592B8E72-D806-497C-A158-66800D2CE4A5@oracle.com>

Received on amber-spec-comments.

My comments: This is a nice direction, but in order to make it work, the language would need an actual concept of factory methods, so that the pattern of "private ctor, public factory" could be counted upon. So the fact that they are known to users is necessary but not sufficient; they would also need to be known _to the language_.

> Begin forwarded message:
>
> From: Victor Nazarov
> Subject: Record invariants vs constructor/destructor identity
> Date: March 11, 2019 at 12:08:38 PM EDT
> To: amber-spec-comments at openjdk.java.net
>
> Recent discussion on the amber experts list considers particularities of
> constructor overriding for records.
> There seems to be a tension between two requirements:
>
> 1. A record should be its state, only its state, and nothing but its state
> 2. Records should enforce some requirements concerning state
>
> (There was a third requirement that records should allow derived state that
> facilitates more optimizations, but I won't touch this point in this
> message).
>
> What I'd like to point out is that ML-family languages already implement
> the first maxim, that "a record should be its state, only its state, and nothing but
> its state", and simultaneously deal with enforcement of invariants.
> And I think the way it is implemented in languages like Haskell,
> Standard ML, etc. is quite successful.
>
> In ML languages constructor and destructor ALWAYS form an identity and users
> are quite accustomed to and rely on this. But types like `Rational` need
> to preprocess their data to be useful and you can do it in ML languages. To
> do it, you just need to (TM) use a mechanism that you already have: private
> constructor.
So `Rational` becomes:
>
> private-constructor record Rational(int num, int denom) {
>     public static Rational makeRational(int num, int denom) {
>         int gcd = gcd(num, denom);
>         num /= gcd;
>         denom /= gcd;
>         return new Rational(num, denom);
>     }
> }
>
> The user knows that the constructor/destructor identity always holds in any order,
> the user gets access to the entire internal representation, but the user can't break the class
> invariant because they don't have access to a direct constructor call.
> The user model remains clear and simple.
>
> The following properties always hold
>
> record Foo(int x, int y)
> assert new Foo(1, 2).x() == 1
> assert new Foo(1, 2).y() == 2
>
> but sometimes the user can't create every representable record instance.
> The factory method is here to moderate instance creation. And factory methods
> are an already known mechanism for Java programmers.
>
> --
> Victor Nazarov

From brian.goetz at oracle.com Wed Mar 13 17:52:23 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Mar 2019 13:52:23 -0400
Subject: String reboot (plain text)
In-Reply-To:
References:
Message-ID: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>

Lots of good discussion so far. Let me gather the threads.

- The primary use case is embedding multi-line chunks of foreign code or data in Java, with minimal need to cruft it up with escaping. This says to me that _multi-line strings_ are actually the high-order bit here, and raw strings are the next bit. Let's address these in order.
- Multi-line-ness and raw-ness are orthogonal concepts. Some languages merge them, and we might consider doing that too, but we shouldn't start there.
- For multi-line strings, a stronger delimiter (e.g., """) seems to be preferred on readability grounds, because people don't want to have to squint to see where the embedded code ends and the Java code resumes.
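To make the "orthogonal concepts" point concrete, here is a small sketch using only today's string literals. The JSON values and the class name `EscapeCosts` are illustrative assumptions, not taken from the thread. The JSON snippet suffers from the multi-line problem (one `\n" +` per line, plus `\"` for every embedded quote), while the regular expression is a single line with no quotes at all and suffers only from the raw-ness problem (every backslash doubled).

```java
final class EscapeCosts {
    // Multi-line problem: per-line delimiters, "\n" and "+", and escaped quotes.
    static final String JSON =
        "{\n" +
        "  \"name\": \"Duke\",\n" +
        "  \"year\": 1995\n" +
        "}\n";

    // Raw-ness problem: the regex \d+\.\d+ needs every backslash doubled.
    static final String DECIMAL = "\\d+\\.\\d+";

    // Counts occurrences of a character in a string.
    static long count(String s, char c) {
        return s.chars().filter(ch -> ch == c).count();
    }

    public static void main(String[] args) {
        System.out.print(JSON);
        System.out.println("embedded quotes that needed escaping: " + count(JSON, '"'));
        System.out.println("regex as seen by the regex engine: " + DECIMAL);
        System.out.println("3.14 matches: " + "3.14".matches(DECIMAL));
    }
}
```

Neither snippet needs the other's fix: a stronger multi-line delimiter would clean up the JSON without touching escape processing, and a raw single-line form would clean up the regex without touching multi-line handling.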
To which I'll add the following observations:

- Most multi-line string candidates (JSON, XML, SQL, etc.) do not require characters that have to be escaped, as long as we don't have conflicts with the quote character. This further suggests that ML-ness and raw-ness are solving separate problems.
- Once we separate multi-line from raw, the idea of automatically reflowing indentation starts to become a sensible option on non-raw, multi-line strings.
- Repeating delimiters are slightly more powerful than fixed delimiters, but also have additional cognitive load, and can still lead to anomalies that are easily encountered.

With that said, let's reorder the dishes a bit.

For our first course, we could have multi-line strings, delimited by the fixed delimiter """. These would be escaped strings, just like existing string literals, but because a single quote character is no longer the delimiter, the most common source of escaping (embedded quotes) is removed. Most multi-line strings will require no escaping at all.

Note that if we stopped here _and never ordered anything else_, we would still be in a much better place than we are now (most snippets could just be cut and pasted without mangling), and what we've introduced is dead-simple! So the benefit-to-cost ratio here is high; it's a simple addition that addresses a significant fraction of the pain points. I think we should at least order this.

Now, maybe we're still a little hungry, and the above doesn't help with those strings that are most polluted by escapes, such as regular expressions. So, we might additionally order the ability to layer a way to say "no escape mangling" atop both our " strings and our """ strings. Jim proposes we use a delimiter of \".."\ for such strings (\""" ... """\ for the multi-line version). This has a nice connotation; it is as if the backslash is "distributed over" the whole string.
This does, unfortunately, bring us back into Delimiter Hell; what if we want our string to contain the quote + backslash combination? One way is to dive back into repeating delimiters (e.g., using multiple backslashes in the delimiter). Having a non-homogeneous repeating delimiter leaves us in a slightly better place than the original proposal, as we've eliminated the "empty string" anomaly as well as the "starting with backtick" anomaly. So this seems a workable direction, though the cost-benefit here is worse than with the first course, in both directions (higher cost, lower benefit).

So, in the spirit of "keep ordering until sated, but stop there", here are some reasonable choices.

1. Do multi-line (escaped) strings with a """ fixed delimiter. Large benefit, small cost. Most embedded snippets don't need any escaping. Low cost, big payoff.

1a. Do 1, but automatically reflow multi-line strings using the equivalent of String::align. There have been reasonable proposals on how to do this; where they fell apart is the interaction with raw-ness, but if we separate ML and raw, these become reasonable again. Higher cost, but higher payoff; having separated the interaction with raw strings, this is more defensible.

2. Do (1) or (1a), and add: single-line raw string literals delimited by \"..."\.

2a. Do (1) or (1a), and also support multi-line raw string literals (where we _don't_ automatically apply String::align; this can be done manually). Note that this creates anomalies for multi-line raw string literals starting with quotes (this can be handled with concatenation, and having separated ML and raw, this is less of a problem than before).

3. Do (2) and (2a), and also support a repeating compound delimiter with multiple backslashes and a quote.

Note that we can start with 1 or 1a now, and move on to 2/2a later, and same for 3.
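As a rough illustration of what "automatically reflow" in option 1a could mean, the sketch below strips the common leading indentation from a multi-line string and drops blank first and last lines. This is only one plausible reading; it is not the specified behavior of String::align, and `ReflowSketch`, `reflow`, and `leadingSpaces` are hypothetical names made up for this example. For simplicity it counts spaces only and ignores tabs.

```java
import java.util.ArrayList;
import java.util.List;

final class ReflowSketch {
    // Removes the indentation shared by all non-blank lines, then trims
    // blank lines from the start and end of the text.
    static String reflow(String text) {
        String[] lines = text.split("\n", -1);

        // Smallest indentation found on any non-blank line.
        int indent = Integer.MAX_VALUE;
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                indent = Math.min(indent, leadingSpaces(line));
            }
        }

        // Blank lines become empty; other lines lose the shared indentation.
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.trim().isEmpty() ? "" : line.substring(indent));
        }

        while (!out.isEmpty() && out.get(0).isEmpty()) {
            out.remove(0);
        }
        while (!out.isEmpty() && out.get(out.size() - 1).isEmpty()) {
            out.remove(out.size() - 1);
        }
        return String.join("\n", out);
    }

    // Number of leading space characters on a line.
    static int leadingSpaces(String line) {
        int n = 0;
        while (n < line.length() && line.charAt(n) == ' ') {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // The content is indented to line up with the surrounding Java code.
        String sql = "\n"
                   + "        SELECT id\n"
                   + "          FROM t\n"
                   + "    ";
        System.out.println(reflow(sql));
    }
}
```

Relative indentation inside the snippet survives (the second line keeps its two extra spaces), which is the property that makes reflow attractive for embedded code. Applying the same transformation to a raw string would silently change its content, which is one way to see why the thread treats reflow and raw-ness as interacting badly.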
As we evaluate these options, note that:

- Having separated ML-ness from raw-ness, doing automatic reflow becomes more defensible for the common (ML, non-raw) case.
- The intersection of ML and raw seems pretty small, so doing 1a + 2, while asymmetric, is defensible.
- What we don't order now, we can add later.

On 2/10/2019 1:10 PM, Jim Laskey wrote:
> Focus
> =====
>
> Instead of ordering everything on the menu and immobilizing ourselves with excessive gluttony, let's focus our attention on the appetizer. If we plan correctly, we'll have room for entrees and desserts later.
>
> The appetizer here is simplifying the injection of "foreign" language code into Java source. Think tapas. We may well be sated by the time we're done.
>
>
> Goal
> ====
>
> Repurposing the Java String as a "foreign" code literal seems to be the most natural and least intrusive contrivance for Java support. In fact, this is already the case. Example;
>
> //
> //
> //

Hello World.

> // > // > // > > String html = "\n" + > " \n" + > "

Hello World.

\n" +
> " \n" +
> " \n" +
> "\n";
>
> The primary reason we are having the string literal discussion is that the existing form has a few issues;
>
> - The existing form is difficult to maintain without support from IDEs and is prone to error. The introduction and subsequent editing of foreign code requires additional delimiters, newlines, concatenations and escape sequences (DNCE).
>
> - More to the point, the existing form is difficult to read. The additional DNCE obscure the underlying content of the string.
>
> Our aim is to come up with a DNCE lexicon that improves foreign code literal readability and maintainability without leaving developers in a confused state; with emphasis on reducing the E (escape sequences).
>
>
> 50% solution
> ============
>
> Where we keep running into trouble is that a choice for one part of the lexicon spreads into the other parts. That is, the use of certain characters in the delimiter affects which characters require escaping and which characters can be used for escaping.
>
> So, let's pick off the easy bits of the lexicon first. Newlines, concatenations and in-between delimiters can be implicit if we just allow strings to span multiple lines (see Rust).
>
> String html = "
>
>

Hello World.

>
>
> ";
>
> That's not so bad. If we did nothing else, we still would be better off than we were before.
>
>
> 75% solution, almost
> ====================
>
> What problems are left?
>
> - The foreign delimiters (quotes) have to be escaped.
>
> - The foreign escape sequences also have to be escaped.
>
> - And to a lesser degree, it's difficult to locate the closing delimiter.
>
> Fortunately, we don't have many choices for dealing with escapes;
>
> - Backslash is Java's escape character.
>
> - Either escaping is on or is off (raw), so we need a way to flag a string as being escaped. We could have an option to turn escaping on/off within a string, but it has been hard to come up with examples where this might be required.
>
> - Even with escaping off, we still might have to escape delimiters. Repeated backslashes (or repeated delimiters) is the typical out.
>
> How about trying \ as the flag for escapes off;
>
> String html = \"
>
>

Hello World.

> > > "; > > That doesn't work because it looks like the string ends at the first quote. Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.) > > String html = \" > >

Hello World.

>
>
> "\;
>
> - The only new string rule added is to allow multi-line strings.
>
> - Adding backslash before and after the string indicates escaping off.
>
>
> But wait
> ========
>
> This looks like the 75% solution;
>
> - Builds on our cred with existing strings.
>
> - Escape processing is orthogonal to multi-line.
>
> - Delimiter can easily be understood to mean "string with escapes."
>
> But wait. "\nloaded" looks like it contains the end delimiter. Rats!!! Captain we need more sequences.
>
> And, this is the crux of all the debate around strings. Fixed delimiters imply a requirement for escape sequences, otherwise there is content you cannot express as a string.
>
> The inverse of this implication is that if you have escape sequences you don't need flexible delimiters. This can be reinterpreted as you only need flexible delimiters if you want to always avoid escape sequences.
>
> Wasn't avoiding escape sequences the goal?
>
> All this brings us to the central choice we have to make before we get into the rest of the meal. Do we go with fixed delimiter(s), structured delimiters or nonce delimiters?
>
>
> Fixed delimiter
> ===============
>
> If we go with a fixed delimiter then we limit the content that can be expressed without escape sequences. This is not totally out of left field. There are floating point values we can not express in Java and types we can express but not denote, such as anonymous class types, intersection types or capture types.
>
> Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we are explicit about them.
>
> We could get closer to the 85% mark if we had a way to have " in our content without escaping. Let's introduce a secondary delimiter, """.
>
> String html = """
>
>

Hello World.

> > > """; > > The introduction of """ would allow " with the only restriction that we can not use """ in the content without escaping. We could say that """ also means escaping off, but then we would have no way to escape """ (\"""). Keeping escaping as an orthogonal issue allows the best of both worlds. > > String html = \""" > >

Hello World.

>
>
> """\;
>
> Once you take away conflicts with the delimiter, most strings do not require escaping.
>
> Also at this point we should note that other combinations of quotes (''', ```, "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato.
>
> Summary: All strings can be expressed with a fixed delimiter plus escaping, but strings containing the fixed delimiter (""") can not be expressed with escaping off.
>
> Jumping ahead: I think that stating that traditional " strings must be single-line will be a popular restriction, even if it is not needed. Then developers will think of """ as meaning multi-line.
>
>
> Structured delimiter
> ====================
>
> A structured delimiter contains a repeating pattern that can be expanded to suit a scenario. We attempted to introduce this notion with the original backtick proposal, but that proposal was withdrawn because a) we didn't want to burn the backtick, b) developers weren't comfortable with infinitely repeating delimiters, and c) there were non-expressible anomalies such as content with leading or trailing backticks.
>
> Using " instead of backtick addresses a).
>
> String html = """"""
>
>

Hello World.

>
>
> """""";
>
> For b) is there a limit where developers would be comfortable? That is, what about a range of fixed delimiters; ", """, """", """"", """""". This is slightly different than fixed delimiters in that it increases the combinations of content containing delimiters. Example, """"" could allow ", """, """", ..., Nx" for N != 5.
>
> Structured delimiters also differ from fixed delimiters in the fact that there is pressure to have escaping off when N >= 3. You can always fall back to a single ".
>
> Summary: Can express all strings with and without escaping. If the delimiter length is limited then there is still a (smaller) set of strings that can not be expressed.
>
>
> Nonce delimiter
> ===============
>
> A nonce or custom delimiter allows developers to include a unique character sequence in the delimiter. This provides a flexible delimiter without fear of going too far. There is also the advantage/distraction of providing commentary.
>
> String html = \HTML"
>
>

Hello World.

> > > "HTML\; > > Summary: Can express all strings with and without escaping, but nonce can affect readability. > > > Multi-line formatting > ===================== > > I left this out of the main discussion, but I think we can all agree that formatting rules should separate the delimiters from the content. Other details can be refined after choice of delimiter(s). > > String html = \""" > > >

Hello World.

> > > > """\; > > String html = """""" > > >

Hello World.

> > > > """"""; > > String html = \HTML" > > >

Hello World.

>
>
>
> "HTML\;
>
>
> Entrees and desserts
> ====================
>
> If we make good choices now (stay away from the oysters) we can still move on to other courses later.
>
> For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""".
>
> Point being, we can work with an 85% solution now that we can supplement later when we're not so hangry.
>
>
>> On Feb 10, 2019, at 12:30 PM, James Laskey wrote:
>>
>> I should know better than to format e-mails. Many a backslash eaten. The summary should be;
>>
>>>> For instance; if we got up from the table with the ", """, \", \""" set of delimiters, we could still introduce structured delimiters in the future; either with repeated \ (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like \5" for \\\\\" or \""""".
>>>>
>> Sent from my iPhone
>>
>> On Feb 10, 2019, at 11:43 AM, Jim Laskey wrote:
>>
>>>> Focus
>>>>
>>>> Instead of ordering everything on the menu and immobilizing ourselves with excessive gluttony, let's focus our attention on the appetizer. If we plan correctly, we'll have room for entrees and desserts later.
>>>>
>>>> The appetizer here is simplifying the injection of "foreign" language code into Java source. Think tapas. We may well be sated by the time we're done.
>>>>
>>>> Goal
>>>>
>>>> Repurposing the Java String as a "foreign" code literal seems to be the most natural and least intrusive contrivance for Java support. In fact, this is already the case. Example;
>>>>
>>>> //
>>>> //
>>>> //

Hello World.

>>>> // >>>> // >>>> // >>>> >>>> String html = "\n" + >>>> " \n" + >>>> "

Hello World.

\n" + >>>> " \n" + >>>> " \n" + >>>> "\n"; >>>> >>>> The primary reason we are having the string literal discussion is that the existing form has a few issues; >>>> >>>> ? The existing form is difficult to maintain without support from IDEs and is prone to error. The introduction and subsequent editing of foreign code requires additional delimiters, newlines, concatenations and escape sequences (DNCE). >>>> >>>> ? More to the point, the existing form is difficult to read. The additional DNCE obscure the underlying content of the string. >>>> >>>> Our aim is to come up with a DNCE lexicon that improves foreign code literal readability and maintainability without leaving developers in a confused state; with emphasis on reducing the E (escape sequences.) >>>> >>>> 50% solution >>>> >>>> Where we keep running into trouble is that a choice for one part of the lexicon spreads into the the other parts. That is, use of certain characters in the delimiter affect which characters require escaping and which characters can be used for escaping. >>>> >>>> So, let's pick off the lexicon easy bits first. Newlines, concatenations and in-between delimiters can be implicit if we just allow strings to span multiple lines (see Rust.) >>>> >>>> String html = " >>>> >>>>

Hello World.

>>>> >>>> >>>> "; >>>> >>>> That's not so bad. If we did nothing else, we still would be better off than we were before. >>>> >>>> 75% solution, almost >>>> >>>> What problems are left? >>>> >>>> ? The foreign delimiters (quotes) have to be escaped. >>>> >>>> ? The foreign escape sequences also have to be escaped. >>>> >>>> ? And to a lesser degree, it's difficult to locate the closing delimiter. >>>> >>>> Fortunately, we don't have many choices for dealing with escapes; >>>> >>>> ? Backslash is Java's escape character. >>>> >>>> ? Either escaping is on or is off (raw), so we need a way to flag a string as being escaped. We could have an option to turn escaping on/off within a string, but it has been hard to come up with examples where this might be required. >>>> >>>> ? Even with escaping off, we still might have to escape delimiters. Repeated backslashes (or repeated delimiters) is the typical out. >>>> >>>> How about trying as the flag for escapes off; >>>> >>>> String html = \" >>>> >>>>

Hello World.

>>>> >>>> >>>> "; >>>> >>>> That doesn't work because it looks like the string ends at the first quote. Let's try symmetry, either " or " as the closing delimiter. " is preferable because then it doesn't look like an escape sequence (see Swift.) >>>> >>>> String html = \" >>>> >>>>

Hello World.

>>>> >>>> >>>> "\; >>>> >>>> ? The only new string rule added is to allow multi-line strings. >>>> >>>> ? Adding backslash before and after the string indicates escaping off. >>>> >>>> But wait >>>> >>>> This looks like the 75% solution; >>>> >>>> ? Builds on our cred with existing strings. >>>> >>>> ? Escape processing is orthogonal to multi-line. >>>> >>>> ? Delimiter can easily be understood to mean ?string with escapes." >>>> >>>> But wait. "" looks like it contains the end delimiter. Rats!!! Captain we need more sequences. >>>> >>>> And, this is the crux of all the debate around strings. Fixed delimiters imply a requirement for escape sequences, otherwise there is content you cannot express as a string. >>>> >>>> The inverse of this implication is that if you have escape sequences you don't need flexible delimiters. This can be reinterpreted as you only need flexible delimiters if you want to always avoid escape sequences. >>>> >>>> Wasn't avoiding escape sequences the goal? >>>> >>>> All this brings us to the central choice we have to make before we get into the rest of the meal. Do we go with fixed delimiter(s), structured delimiters or nonce delimiters. >>>> >>>> Fixed delimiter >>>> >>>> If we go with a fixed delimiter then we limit the content that can be expressed without escape sequences. This is not totally left field. There are floating point values we can not express in Java and types we can express but not denote, such as anonymous class types, intersection types or capture types. >>>> >>>> Everything is a degree of tradeoff. And, those tradeoffs are okay as long as we are explicit about it. >>>> >>>> We could get closer to the 85% mark if we had a way to have " in our content without escaping. Let's introduce a secondary delimiter, """. >>>> >>>> String html = """ >>>> >>>>

Hello World.

>>>> >>>> >>>> """; >>>> >>>> The introduction of """ would allow " with the only restriction that we can not use """ in the content without escaping. We could say that """ also means escaping off, but then we would have no way to escape """ (\"""). Keeping escaping as an orthogonal issue allows the best of both worlds. >>>> >>>> String html = \""" >>>> >>>>

Hello World.

>>>> >>>> >>>> """\; >>>> >>>> Once you take away conflicts with the delimiter, most strings do not require escaping. >>>> >>>> Also at this point we should note that other combinations of quotes ('''. ```, "'") don't bring anything new to the table; Tomato/Tomato, Potato/Potato. >>>> >>>> Summary: All strings can be expressed with fixed plus escaping, but can not express strings containing the fixed delimiter (""") with escaping off. >>>> >>>> Jumping ahead: I think that stating that traditional " strings must be single-line will be a popular restriction, even if it not needed. Then they will think of """ as meaning multi-line. >>>> >>>> Structured delimiter >>>> >>>> A structured delimiter contains a repeating pattern that can be expanded to suit a scenario. We attempted to introduce this notion with the original backtick proposal, but that proposal was withdrawn because a) didn't want to burn the backtick, b) developers weren't comfortable with infinitely repeating delimiters, and c) non-expressible anomalies such as content with leading or trailing backticks. >>>> >>>> Using " instead of backtick addresses a). >>>> >>>> String html = """""" >>>> >>>>

Hello World.

>>>> >>>> >>>> """"""; >>>> >>>> For b) is there a limit where developers would be comfortable? That is, what about a range of fixed delimiters; ", """, """", """"", """""". This is slightly different than fixed delimiters in that it increases the combinations of content containing delimiters. Example, """"" could allow ", """, """", ..., Nx" for N != 5. >>>> >>>> Structured delimiters also differ from fixed delimiters in the fact that there is pressure to have escaping off when N >= 3. You can always fall back to a single ". >>>> >>>> Summary: Can express all strings with and without escaping. If the delimiter length is limited the there there is still a (smaller) set of strings that can not be expressed. >>>> >>>> Nonce delimiter >>>> >>>> A nonce or custom delimiter allows developers to include a unique character sequence in the delimiter. This provides a flexible delimiter without fear of going too far. There is also the advantage/distraction of providing commentary. >>>> >>>> String html = \HTML" >>>> >>>>

Hello World.

>>>> >>>> >>>> "HTML\; >>>> >>>> Summary: Can express all strings with and without escaping, but nonce can affect readability. >>>> >>>> Multi-line formatting >>>> >>>> I left this out of the main discussion, but I think we can all agree that formatting rules should separate the delimiters from the content. Other details can be refined after choice of delimiter(s). >>>> >>>> String html = \""" >>>> >>>> >>>>

Hello World.

>>>> >>>> >>>> >>>> """\; >>>> >>>> String html = """""" >>>> >>>> >>>>

Hello World.

>>>> >>>> >>>> >>>> """"""; >>>> >>>> String html = \HTML" >>>> >>>> >>>>

Hello World.

>>>> >>>> >>>> >>>> "HTML\;
>>>>
>>>> Entrees and desserts
>>>>
>>>> If we make good choices now (stay away from the oysters) we can still move on to other courses later.
>>>>
>>>> For instance; if we got up from the table with the ", """, ", """ set of delimiters, we could still introduce structured delimiters in the future; either with repeated (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like " for \\" or """"".
>>>>
>>>> Point being, we can work with an 85% solution now that we can supplement later when we're not so hangry.
>>>>

From guy.steele at oracle.com Wed Mar 13 17:59:23 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2019 13:59:23 -0400
Subject: String reboot (plain text)
In-Reply-To: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
Message-ID:

> On Mar 13, 2019, at 1:52 PM, Brian Goetz wrote:
> . . .
> On 2/10/2019 1:10 PM, Jim Laskey wrote:
>> ... Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.)
>>
>> String html = \"
>>
>>

Hello World.

>> >> >> "\;

I believe there is a small problem with this specific example: doesn't this string literal end just before the word "loaded" in the penultimate line? I see a double quote that is (coincidentally) immediately followed by a backslash.

Sorry I failed to note this back in February.

Of course, using \"""..."""\ avoids this problem.

--Guy

From brian.goetz at oracle.com Wed Mar 13 18:23:18 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Wed, 13 Mar 2019 14:23:18 -0400
Subject: String reboot (plain text)
In-Reply-To:
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
Message-ID: <9C91B351-4A82-46A4-BF88-D596A79450EE@oracle.com>

Jim's example was a little hard to follow; (I think) he was pulling on the string of "what if we just let normal string literals span lines", and then pulled back from this to say "I think we actually do want a separate delimiter", and came to the same conclusion you did.

> On Mar 13, 2019, at 1:59 PM, Guy Steele wrote:
>
>
>> On Mar 13, 2019, at 1:52 PM, Brian Goetz wrote:
>> . . .
>> On 2/10/2019 1:10 PM, Jim Laskey wrote:
>>> ... Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.)
>>>
>>> String html = \"
>>>
>>>

Hello World.

>>> >>> >>> "\;
> I believe there is a small problem with this specific example: doesn't this string literal end just before the word "loaded" in the penultimate line? I see a double quote that is (coincidentally) immediately followed by a backslash.
>
> Sorry I failed to note this back in February.
>
> Of course, using \"""..."""\ avoids this problem.
>
> --Guy
>

From guy.steele at oracle.com Wed Mar 13 18:07:37 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2019 14:07:37 -0400
Subject: String reboot (plain text)
In-Reply-To: <9C91B351-4A82-46A4-BF88-D596A79450EE@oracle.com>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <9C91B351-4A82-46A4-BF88-D596A79450EE@oracle.com>
Message-ID:

I think that's right. I just wanted to point out, for the record, the specific small problem in the one specific example, since no one else had. That's all.

I think

String html = \"""

Hello World.

"""\;

looks pretty good.

> On Mar 13, 2019, at 2:23 PM, Brian Goetz wrote:
>
> Jim's example was a little hard to follow; (I think) he was pulling on the string of "what if we just let normal string literals span lines", and then pulled back from this to say "I think we actually do want a separate delimiter", and came to the same conclusion you did.
>
>> On Mar 13, 2019, at 1:59 PM, Guy Steele wrote:
>>
>>
>>> On Mar 13, 2019, at 1:52 PM, Brian Goetz wrote:
>>> . . .
>>> On 2/10/2019 1:10 PM, Jim Laskey wrote:
>>>> ... Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.)
>>>>
>>>> String html = \"
>>>>
>>>>

Hello World.

>>>> >>>> >>>> "\;
>> I believe there is a small problem with this specific example: doesn't this string literal end just before the word "loaded" in the penultimate line? I see a double quote that is (coincidentally) immediately followed by a backslash.
>>
>> Sorry I failed to note this back in February.
>>
>> Of course, using \"""..."""\ avoids this problem.
>>
>> --Guy
>>
>

From james.laskey at oracle.com Wed Mar 13 18:36:26 2019
From: james.laskey at oracle.com (James Laskey)
Date: Wed, 13 Mar 2019 15:36:26 -0300
Subject: String reboot (plain text)
In-Reply-To:
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
Message-ID: <9DF0850D-2722-4ED1-B11C-743A68405735@oracle.com>

I think I mention that later in the doc.

Sent from my iPhone

> On Mar 13, 2019, at 2:59 PM, Guy Steele wrote:
>
>
>> On Mar 13, 2019, at 1:52 PM, Brian Goetz wrote:
>> . . .
>> On 2/10/2019 1:10 PM, Jim Laskey wrote:
>>> ... Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.)
>>>
>>> String html = \"
>>>
>>>

Hello World.

>>> >>> >>> "\;
> I believe there is a small problem with this specific example: doesn't this string literal end just before the word "loaded" in the penultimate line? I see a double quote that is (coincidentally) immediately followed by a backslash.
>
> Sorry I failed to note this back in February.
>
> Of course, using \"""..."""\ avoids this problem.
>
> --Guy
>

From guy.steele at oracle.com Wed Mar 13 18:22:08 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Wed, 13 Mar 2019 14:22:08 -0400
Subject: String reboot (plain text)
In-Reply-To: <9DF0850D-2722-4ED1-B11C-743A68405735@oracle.com>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <9DF0850D-2722-4ED1-B11C-743A68405735@oracle.com>
Message-ID:

Ah, yes, you do, somewhat obliquely:

But wait. "\nloaded" looks like it contains the end delimiter. Rats!!! Captain we need more sequences.

Unfortunately, this was one of the paragraphs that didn't read correctly in the original (non-plain-text) email. Oh, the irony! :-)

In any case, thank you for helping me to make sure that this dead horse is now _thoroughly_ dead.

> On Mar 13, 2019, at 2:36 PM, James Laskey wrote:
>
> I think I mention that later in the doc.
>
> Sent from my iPhone
>
>> On Mar 13, 2019, at 2:59 PM, Guy Steele wrote:
>>
>>
>>> On Mar 13, 2019, at 1:52 PM, Brian Goetz wrote:
>>> . . .
>>> On 2/10/2019 1:10 PM, Jim Laskey wrote:
>>>> ... Let's try symmetry, either \" or "\ as the closing delimiter. "\ is preferable because then it doesn't look like an escape sequence (see Swift.)
>>>>
>>>> String html = \"
>>>>
>>>>

Hello World.

>>>> >>>> >>>> "\;
>> I believe there is a small problem with this specific example: doesn't this string literal end just before the word "loaded" in the penultimate line? I see a double quote that is (coincidentally) immediately followed by a backslash.
>>
>> Sorry I failed to note this back in February.
>>
>> Of course, using \"""..."""\ avoids this problem.
>>
>> --Guy
>>
>

From kevinb at google.com Wed Mar 13 18:56:32 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Wed, 13 Mar 2019 11:56:32 -0700
Subject: String reboot (plain text)
In-Reply-To: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
Message-ID:

On Wed, Mar 13, 2019 at 10:52 AM Brian Goetz wrote:

> Lots of good discussion so far. Let me gather the threads.
>
> - The primary use case is embedding multi-line chunks of foreign code or
> data in Java, with minimal need to cruft it up with escaping. This says to
> me that _multi-line strings_ are actually the high-order bit here, and raw
> strings are the next bit. Let's address these in order.

We have found this to be true; we have also found that the next few use cases are actually not far behind: console output, long "expected" strings in tests, and long exception messages. (The last case is interesting because you don't really want to keep \n's in the string at runtime, yet it's still *way* nicer to write as a multiline literal that can be easily reflowed without dealing with `" + "`.)

Happily, these cases also all support the claim that multi-line-ness is more often desired than raw-ness.

> - Multi-line-ness and raw-ness are orthogonal concepts.

Is that true, as stated? I would have said that any support for rawness automatically gives you support for multi-line-ness by nature, because a newline character becomes literal. That doesn't seem like orthogonality.
I say this because it's the reason I was always completely fine with the fact we were talking only about a "raw" feature and not two independent features. The proposal as it was published months ago would have done somewhere close to 100% of what our codebase needs... if only we could have settled how to get indentation stripping. We had options for how that could be done in a reasonably learnable way, and of course with the strict requirement that the "I only care about rawness" users are unaffected. I know this opinion is not shared, but it seemed to me that it was only our discomfort with writing the stripping behavior into the language spec, and nothing else, that stopped us from having a great solution.

> Some languages merge them, and we might consider doing that too, but we
> shouldn't start there.
>
> - For multi-line strings, a stronger delimiter (e.g., """) seems to be
> preferred on readability grounds, because people don't want to have to
> squint to see where the embedded code ends and the Java code resumes.
>

Valid point. Today, every line or group of lines in a .java source file *is* Java code, but now there will be sections where that's not at all clearly the case. Making the boundaries clear between the two types of code seems like a good practice. The old proposal *allowed* a single backtick to offset these sections in 99% of cases, but it occurred to me that developers would often be better off using more of them just to delineate better...

> To which I'll add the following observations:
>
> - Most multi-line string candidates (JSON, XML, SQL, etc) do not require
> characters that have to be escaped, as long as we don't have conflicts with
> the quote character.

(We did find this to be true. Quotes, of course, are quite common.)

> For our first course, we could have multi-line strings, delimited by the
> fixed delimiter """.
> These would be escaped strings, just like existing
> string literals, but because the single-quote is no longer the delimiter,
> the most common source of escaping (embedded quotes) is removed. Most
> multi-line strings will require no escaping at all.
>
> Note that if we stopped here _and never ordered anything else_, we would
> still be in a much better place than we are now (most snippets could just
> be cut and pasted without mangling), and what we've introduced is
> dead-simple! So the cost-benefit ratio here is high; it's a simple
> addition that addresses a significant fraction of the pain points. I think
> we should at least order this.
>

This is true. (Call this State A for now.)

> Now, maybe we're still a little hungry, and the above doesn't help with
> those strings that are most polluted by escapes, such as regular
> expressions. So, we might additionally order the ability to layer a way to
> say "no escape mangling" atop both our " strings and our """
> strings. Jim proposes we use a delimiter of \".."\ for such strings (\"""
> ... """\ for the multi-line version). This has a nice connotation; it is
> as if the backslash is "distributed over" the whole string.
>

This is the part that concerns me a lot. I think that adding *two* new string-literal features that can be used separately or together is putting the language in a *much* more complex state. If we reached State A (above) I would feel much better about stopping there than coming here.

And it would be a bummer about regular expressions. I believe State A is inferior to the proposal we looked at several months ago, which did a pretty good job of also handling things like that.

> 1a. Do 1, but automatically reflow multi-line strings using the equivalent
> of String::align. There have been reasonable proposals on how to do this;
> where they fell apart is the interaction with raw-ness,

Did they? I didn't think they did.
The problem that a raw string might get unintentionally stripped seemed to me like one we had easy ways to deal with. As Brian knows well, it really surprises me that we went back to the drawing board with this feature, because the reasons we cited for doing so seemed so very minor. -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Wed Mar 13 19:09:55 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Mar 2019 15:09:55 -0400 Subject: String reboot (plain text) In-Reply-To: References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> Message-ID: <329E7B42-7904-4FC3-ADAD-BAD007F4AD33@oracle.com> > > - Multi-line-ness and raw-ness are orthogonal concepts. > > Is that true, as stated? I would have said that any support for rawness automatically gives you support for multi-line-ness by nature, because a newline character becomes literal. That doesn't seem like orthogonality. My claim is that this is a shortcut that skips over the inherent orthogonality, and in the process, misses out on some useful improvements. It?s a shortcut that works (other languages have gone there), but we lose something by compressing the two features into one. For example, we miss out on the opportunity to do text reflow on non-raw-ML strings and not do it on raw-ML strings (I don?t think we can really justify doing it automatically on raw strings; that?s not what raw means.) So if we compress the two concepts, we have to give up automatically on automatic reflow, even though it is worth considering. > I say this because it's the reason I was always completely fine with the fact we were talking only about a "raw" feature and not two independent features. The proposal as it was published months ago would have done somewhere close to 100% of what our codebase needs... if only we could have settled how to get indentation stripping. 
We had options for how that could be done in a reasonably learnable way, and of course with the strict requirement that the "I only care about rawness" users are unaffected. I know this opinion is not shared, but it seemed to me that it was only our discomfort with writing the stripping behavior into the language spec, and nothing else, that stopped us from having a great solution. I would phrase this as: ?raw is often an acceptable substitute for ML.? > > Now, maybe we're still a little hungry, and the above doesn't help with those strings that are most polluted by escapes, such as regular expressions. So, we might additionally order the ability to layer a way to say "no escape mangling" atop both our " strings and our """ strings. Jim proposes we use a delimiter of \".."\ for such strings (\""" ... """\ for the multi-line version). This has a nice connotation; it is as if the backslash is ?distributed over? the whole string. > > This is the part that concerns me a lot. I think that adding *two* new string-literal features that can be used separately or together is putting the language in a much more complex state. If we reached State A (above) I would feel much better about stopping there than coming here. And that?s a valid choice! State A was placed first because it is simple, effective, and might well be enough; the rest of the discussions is an exercise in incremental costs and incremental benefits. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cushon at google.com Wed Mar 13 19:42:29 2019 From: cushon at google.com (Liam Miller-Cushon) Date: Wed, 13 Mar 2019 12:42:29 -0700 Subject: String reboot (plain text) In-Reply-To: <329E7B42-7904-4FC3-ADAD-BAD007F4AD33@oracle.com> References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <329E7B42-7904-4FC3-ADAD-BAD007F4AD33@oracle.com> Message-ID: On Wed, Mar 13, 2019 at 12:10 PM Brian Goetz wrote: > For example, we miss out on the opportunity to do text reflow on > non-raw-ML strings and not do it on raw-ML strings (I don?t think we can > really justify doing it automatically on raw strings; that?s not what raw > means.) > For my understanding, is this something you could expand on? I appreciate the pedagogical simplicity of "raw means raw", instead of having to understand nuances of automatically adjusting indentation when thinking about raw strings. There are clearly tradeoffs here, but it isn't obvious to me that reflowing indentation for raw strings is fatally flawed. And there are advantages: it decouples formatting and indentation choices from the value of the string literal, and many raw strings (especially foreign code) won't care about the leading whitespace anyway, and for cases where the leading whitespace is needed there are advantages to a library solution (reading `.indent(20)` is easier than counting 20 spaces). One thing I found noteworthy from the swift proposal was that both their multi-line and raw strings automatically manage leading indentation. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brian.goetz at oracle.com Wed Mar 13 19:56:55 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Wed, 13 Mar 2019 15:56:55 -0400 Subject: String reboot (plain text) In-Reply-To: References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <329E7B42-7904-4FC3-ADAD-BAD007F4AD33@oracle.com> Message-ID: In general, being able to break things down into orthogonal primitives leads to a more expressive and simpler outcome. If we squashed the two features together, the outcome would be that we have two non-primitive, non-orthogonal string features ? ?old strings? (which are single-line and have escaping) and ?new strings? (which are multi-line, have automatic reflow, some anomalies surrounding starting with quotes, and no escaping.) You could argue that a menu of two very different items gives you more chances to have something that is close to what you want, but I think the burden is high to justify an ad-hoc alternate way to do something that you could already do. (The previous RSL proposal, the one we almost went out with, was exactly such an ad-hoc way to do things that could already be done.) Would it have been a disaster? No, of course not. But I think _any_ of the options on the menu I outlined in the previous mail are substantially better than the previous proposal. They are all grounded in simple, mostly-orthogonal variations on the basic theme of string literals (tied together with a common syntactic approach ? which the previous RSL proposal also lacked.) ?Raw, except for line terminator normalization, and text flowing, and maybe later interpolation? is not an easy concept to explain or understand, because it couples multiple unrelated things. (To the ?indent is good enough? point: Auto reflow is a disaster when applied to mixed spaces and tabs; while in general one should avoid this, I cannot rule out the possibility that someone might actually want to embed such a snippet; in that case, truly raw strings are an option. 
If we take away truly raw, now they just have two bad approximations.) > On Mar 13, 2019, at 3:42 PM, Liam Miller-Cushon wrote: > > On Wed, Mar 13, 2019 at 12:10 PM Brian Goetz > wrote: > For example, we miss out on the opportunity to do text reflow on non-raw-ML strings and not do it on raw-ML strings (I don?t think we can really justify doing it automatically on raw strings; that?s not what raw means.) > > For my understanding, is this something you could expand on? > > I appreciate the pedagogical simplicity of "raw means raw", instead of having to understand nuances of automatically adjusting indentation when thinking about raw strings. There are clearly tradeoffs here, but it isn't obvious to me that reflowing indentation for raw strings is fatally flawed. And there are advantages: it decouples formatting and indentation choices from the value of the string literal, and many raw strings (especially foreign code) won't care about the leading whitespace anyway, and for cases where the leading whitespace is needed there are advantages to a library solution (reading `.indent(20)` is easier than counting 20 spaces). > > One thing I found noteworthy from the swift proposal was that both their multi-line and raw strings automatically manage leading indentation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex.buckley at oracle.com Fri Mar 15 18:39:49 2019 From: alex.buckley at oracle.com (Alex Buckley) Date: Fri, 15 Mar 2019 11:39:49 -0700 Subject: Switch expressions spec In-Reply-To: References: <5C80334D.90508@oracle.com> Message-ID: <5C8BF175.8050904@oracle.com> // The mail below doesn't appear to have made it to the amber-spec-experts web archive, even though Manoj is a member of the list. Hey Gavin, In a switch expression, I believe it should be legal for every `case`/`default` arm to complete abruptly _for a reason other than a break with value_. 
That is, a legal switch expression may have zero rules like `case a -> 5;` (completes normally) or `case b -> { break 6; }` (completes abruptly for reason of break with value). Instead, it only has rules like `case c -> { throw new Exc(); }` or `case d -> throw new Exc();`, both of which complete abruptly for a reason other than break with value. (Extend to switch labeled statement groups.)

I suspect that the strong rule flagged by Manoj:

  It is a compile-time error if a switch expression has no result expressions.

is trying to require a value `break` statement being present in every switch labeled block, because a rule earlier in 15.28.1 did not quite go that far:

  If the switch block consists of switch labeled rules, then any switch labeled block (14.11.1) must complete abruptly.

Alex

On 3/14/2019 4:14 PM, Manoj Palat wrote:
> Hi Alex, Gavin,
>
> One more clarification of the spec:
>
> Consider the following code:
> public class X {
>     @SuppressWarnings("preview")
>     public static int foo(int i) throws MyException {
>         int v = switch (i) {
>             default -> throw new MyException(); // error or no error?
>         };
>         return v;
>     }
>     public static void main(String argv[]) {
>         try {
>             System.out.println(X.foo(1));
>         } catch (MyException e) {
>             System.out.println("Exception thrown as expected");
>         }
>     }
> }
> class MyException extends Exception {
>     private static final long serialVersionUID = 3461899582505930473L;
> }
>
> As per spec, JLS 15.28.1
>
> It is a compile-time error if a switch expression has no result expressions.
>
> ------------------------
>
> Throw statement is not a result expression and hence as per the spec we
> should be giving this error.
>
> Is this an omission in the spec? Should we be flagging an error?
>
> In Eclipse ECJ we are flagging an error but I observed javac does not -
> Want to get clarity on what does the spec mean.
>
> Regards,
>
> Manoj
>
> Eclipse Java Dev,
>
> IBM.
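
[Editorial sketch, not part of the original thread.] For contrast with Manoj's all-throwing example, here is a minimal variant in which one arm does produce a value, so the switch expression has at least one result expression and an obvious type. The class, method and exception names are hypothetical, and the arrow-form syntax is assumed to be compiled on a JDK where switch expressions are a final feature (the thread itself discusses the earlier preview design with value `break`).

```java
// Sketch only: names are hypothetical; assumes a JDK with final switch expressions.
class SwitchResultSketch {

    // Hypothetical checked exception, mirroring MyException in the quoted mail.
    static class SketchException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    // One arm yields a value and one arm throws. Because at least one arm
    // supplies a result expression, the switch expression has a type (int).
    static int oneResultArm(int i) throws SketchException {
        return switch (i) {
            case 1 -> 5;                             // completes normally with a value
            default -> throw new SketchException();  // completes abruptly, no value
        };
    }

    public static void main(String[] args) {
        try {
            System.out.println(oneResultArm(1));
            oneResultArm(2);
        } catch (SketchException e) {
            System.out.println("Exception thrown as expected");
        }
    }
}
```

Removing the `case 1 -> 5;` arm gives the shape Manoj asks about: every arm completes abruptly and nothing is left from which to derive a type.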
From cushon at google.com Fri Mar 15 18:49:07 2019 From: cushon at google.com (Liam Miller-Cushon) Date: Fri, 15 Mar 2019 11:49:07 -0700 Subject: String reboot (plain text) In-Reply-To: References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <329E7B42-7904-4FC3-ADAD-BAD007F4AD33@oracle.com> Message-ID: On Wed, Mar 13, 2019 at 12:57 PM Brian Goetz wrote: > To the ?indent is good enough? point: Auto reflow is a disaster when > applied to mixed spaces and tabs; while in general one should avoid this, I > cannot rule out the possibility that someone might actually want to embed > such a snippet; in that case, truly raw strings are an option. If we take > away truly raw, now they just have two bad approximations. > What do you think is the best framework for evaluating that trade-off? The issues with leading spaces and tabs may be severe but should be extremely rare. The issues with manually managing leading indentation are less severe, but also very common. If the leading indentation feature considers the closing delimiter position (as some of the earlier proposals did), it's easy to explicitly keep the leading whitespace and avoid collateral damage even with mixed spaces/tabs: void f() { String hello = \""" all leading whitespace is preserved even with a mixture of tabs and spaces """\; // (the closing delimiter is un-indented to the margin, forcing leading whitespace to be kept for the other lines) } -------------- next part -------------- An HTML attachment was scrubbed... 

From john.r.rose at oracle.com Fri Mar 15 19:01:33 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 15 Mar 2019 12:01:33 -0700
Subject: Switch expressions spec
In-Reply-To: <5C8BF175.8050904@oracle.com>
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com>
Message-ID:

On Mar 15, 2019, at 11:39 AM, Alex Buckley wrote:
>
> In a switch expression, I believe it should be legal for every `case`/`default` arm to complete abruptly _for a reason other than a break with value_.

My reading of Gavin's draft is that he is doing something very subtle there, which is to retain an existing feature in the language that an expression always has a defined normal completion.

We also don't have expressions of the form "throw e". Allowing a switch expression to complete without a value on *every* arm raises the same question as "throw e" as an expression. How do you type "f(throw e)"? If you can answer that, then you can also have switch expressions that refuse to break with any values.

BTW, if an expression has a defined normal completion, it also has a possible type. By possible type I mean at least one correct typing (poly-expressions can have many). So one obvious result of Gavin's draft is that you derive possible types from the arms of the switch expression that break with values.

But the root requirement, I think, is to preserve the possible normal completion of every expression.

"What about some form of 1/0?" That's a good question. What about it? It completes normally with a type of int. Dynamically, the normal completion is never taken. Gavin might call that a "notional normal completion" (I like that word) provided to uphold the general principle even where static analysis proves that the Turing machine fails to return normally.

-- John

From john.r.rose at oracle.com Fri Mar 15 19:06:50 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 15 Mar 2019 12:06:50 -0700
Subject: Switch expressions spec
In-Reply-To:
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com>
Message-ID: <005EC255-3679-4694-9BFC-3764DF169FE7@oracle.com>

On Mar 15, 2019, at 12:01 PM, John Rose wrote:
>
> How do you type "f(throw e)"?

P.S. I suppose this EG has already considered that question, but here's one answer that occurred to me after hitting send: "throw e" is sugar for an unconstrained poly expression, approximately:

  static <X, T extends Throwable> X throwMe(T t) throws T { throw t; }
  ... throwMe(e)

From brian.goetz at oracle.com Fri Mar 15 19:09:22 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 15 Mar 2019 15:09:22 -0400
Subject: Switch expressions spec
In-Reply-To:
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com>
Message-ID: <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com>

At the same time, we also reaffirmed our choice to _not_ allow throw from one half of a conditional:

  int x = foo ? 3 : throw new FooException()

But John has this right: the high order bit is that every expression should have a defined normal completion, and a type, even if computing sub-expressions (or in this case, sub-statements) might throw. And without at least one arm yielding a value, it would be impossible to infer the type of the expression.

> On Mar 15, 2019, at 3:01 PM, John Rose wrote:
>
> On Mar 15, 2019, at 11:39 AM, Alex Buckley wrote:
>>
>> In a switch expression, I believe it should be legal for every `case`/`default` arm to complete abruptly _for a reason other than a break with value_.
>
> My reading of Gavin's draft is that he is doing something very
> subtle there, which is to retain an existing feature in the language
> that an expression always has a defined normal completion.
>
> We also don't have expressions of the form "throw e".
> Allowing a switch expression to complete without a value on *every* arm
> raises the same question as "throw e" as an expression. How do
> you type "f(throw e)"? If you can answer that, then you can also
> have switch expressions that refuse to break with any values.
>
> BTW, if an expression has a defined normal completion, it also
> has a possible type. By possible type I mean at least one correct
> typing (poly-expressions can have many). So one obvious
> result of Gavin's draft is that you derive possible types from
> the arms of the switch expression that break with values.
>
> But the root requirement, I think, is to preserve the possible
> normal completion of every expression.
>
> "What about some form of 1/0?" That's a good question.
> What about it? It completes normally with a type of int.
> Dynamically, the normal completion is never taken.
> Gavin might call that a "notional normal completion"
> (I like that word) provided to uphold the general principle
> even where static analysis proves that the Turing machine
> fails to return normally.
>
> -- John

From brian.goetz at oracle.com Fri Mar 15 19:13:02 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 15 Mar 2019 15:13:02 -0400
Subject: String reboot (plain text)
In-Reply-To:
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <329E7B42-7904-4FC3-ADAD-BAD007F4AD33@oracle.com>
Message-ID: <71D4524B-8917-4AAA-BF25-7CAF28D9A842@oracle.com>

I think all of this is fine in the context of a multi-line string feature (option 1a from my mail). I think it's highly questionable if we try to apply this to so-called "raw" string literals, and it becomes a tangled hairball if the _only_ way to get to the multi-line feature is to randomly suppress some forms of input processing (escapes) but not others (text reflow).

> On Mar 15, 2019, at 2:49 PM, Liam Miller-Cushon wrote:
>
> On Wed, Mar 13, 2019 at 12:57 PM Brian Goetz wrote:
> To the "indent is good enough"
point: Auto reflow is a disaster when applied to mixed spaces and tabs; while in general one should avoid this, I cannot rule out the possibility that someone might actually want to embed such a snippet; in that case, truly raw strings are an option. If we take away truly raw, now they just have two bad approximations. > > What do you think is the best framework for evaluating that trade-off? > > The issues with leading spaces and tabs may be severe but should be extremely rare. The issues with manually managing leading indentation are less severe, but also very common. > > If the leading indentation feature considers the closing delimiter position (as some of the earlier proposals did), it's easy to explicitly keep the leading whitespace and avoid collateral damage even with mixed spaces/tabs: > > void f() { > String hello = \""" > all leading whitespace is preserved > even with a mixture of tabs and spaces > """\; // (the closing delimiter is un-indented to the margin, forcing leading whitespace to be kept for the other lines) > } -------------- next part -------------- An HTML attachment was scrubbed... URL: From john.r.rose at oracle.com Fri Mar 15 19:42:00 2019 From: john.r.rose at oracle.com (John Rose) Date: Fri, 15 Mar 2019 12:42:00 -0700 Subject: String reboot (plain text) In-Reply-To: References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> Message-ID: <74ECF5A3-162A-4E53-B387-19675B264C83@oracle.com> On Mar 13, 2019, at 11:56 AM, Kevin Bourrillion wrote: > > - Multi-line-ness and raw-ness are orthogonal concepts. > > Is that true, as stated? I would have said that any support for rawness automatically gives you support for multi-line-ness by nature, because a newline character becomes literal. That doesn't seem like orthogonality. True orthogonality means that the two vectors have a cosine of zero. That's a stronger condition than independence, which is a cosine of less than one. 

There are at least two interesting factors that pull the cosine between "raw" and "multiline" away from zero. One is that raw implies multiline, as you just pointed out, Kevin. A second goes the other way: Multiline asks for raw, because of its scaling properties. Classic escapes are *less appropriate* to multiline strings than to classic single-line strings. This point is touched on here:

>> - For multi-line strings, a stronger delimiter (e.g., """) seems to be preferred on readability grounds, because people don't want to have to squint to see where the embedded code ends and the Java code resumes.
>>
> Valid point. Today, every line or group of lines in a .java source file is Java code, but now there will be sections where that's not at all clearly the case. Making the boundaries clear between the two types of code seems like a good practice. The old proposal allowed a single backtick to offset these sections in 99% of cases, but it occurred to me that developers would often be better off using more of them just to delineate better...

But I think the point is a little stronger. We can expect that normal code has visually limited line lengths, but visually unlimited line counts. Even if we believe that well-behaved multi-line strings will fit in a single screenful, it is the case that the scale of a single-line string is the scale of a single screen line, while the scale of a multi-line string is a *whole screen*. It is a *questionable assumption* that escape sequence notations will work just as well at the larger scale as at the known-good smaller scale. And we question that assumption when we speak of "squinting" as above.

Let's be clear about this: Squinting through a page of code for escapes is at least N times harder than squinting through a line of code, where N is the page size. Raw strings give a clear and plausible answer to this problem posed by multi-line strings, hence my conclusion that they are (for this reason among others) not fully orthogonal features.
The answer is, "we won't put any escape sequences into the bulk, we will only put them at the boundary". Boundaries are *always* (barring fractals) smaller than bulks.

Another part of the answer, which has been derived again in a previous message, is "we'll put a big-enough escape sequence at the boundary so you'll have a fighting chance to see it in the bulk". I think that's the real reason why, after inspecting single-" as a multi-line delimiter, we always discard it in favor of something more distinctive, with multiple characters.

The clever discoveries of payloads which introduce the short closing quote are interesting puzzles, but they are just special cases of the general rule that, if you are going to spray a large bulk of string payload on the screen, you are going to need a larger unit of visual information to make a clearly evident ending fence for it. That more general rule does not appeal to dubious assertions like "this will only be for SQL and five more notations, we promise". Especially if we (later?) allow the ending fence to grow as large and robust as each use case requires. (That's my argument for "strong quotes" in all sizes, of course.)

I guess where this ends for me is that, not buying the orthogonality argument, I more easily see raw as a better first course, because it picks up most of the multi-line use cases, and also the case of single-line regular expressions.

-- John

From brian.goetz at oracle.com Fri Mar 15 19:44:56 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Fri, 15 Mar 2019 15:44:56 -0400
Subject: Updated document on data classes and sealed types
In-Reply-To:
References:
Message-ID: <8C50D2AD-89C9-46E4-AC5E-CCF2AD3420F8@oracle.com>

There is (at least) one area of interaction with other features that I want to nail down for records: serialization (it's like death and taxes, always catches up with you.)

My proposal here is simple: if a record is Serializable, we inject an implementation of readResolve() that runs back through the constructor; for a record Foo with components a, b, and c, we'd get:

    private Object readResolve() {
        return new Foo(a, b, c);
    }

This doesn't interfere with the serialization mechanism (default vs readObject/writeObject), but does defend against malicious streams that forge record contents, by piping them back through the ctor, which will do validation / normalization.

It may seem a little odd to do something here for records, but not for everything else. To that, I have two answers:

 - Records are special in that we _can_ do this, and it's pretty hard to argue this is wrong (though perhaps slightly slower);
 - This is a down payment on a bigger story for serialization, in the same key: leaning on the constructor to validate state where possible.

I'd rather records (and values) be safe out of the gate, rather than having to patch them later, and worry about older classfiles.

> On Mar 1, 2019, at 3:14 PM, Brian Goetz wrote:
>
> I've updated the document on data classes here:
>
> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>
> (older versions of the document are retained in the same directory for historical comparison.)
>
> While the previous version was mostly about tradeoffs, this version takes a much more opinionated interpretation of the feature, offering more examples of use cases of where it is intended to be used (and not used). Many of the "under consideration" flexibilities (extension, mutability, additional fields) have collapsed to their more restrictive form; while some people will be disappointed because it doesn't solve the worst of their boilerplate problems, our conclusion is: records are a powerful feature, but they're not necessarily the delivery vehicle for easing all the (often self-inflicted) pain of JavaBeans.
We can continue to explore relief for these situations too as separate features, but trying to be all things to all classes has delayed the records train long enough, and I'm convince they're separate problems that want separate solutions. Time to let the records train roll. > > I've also combined the information on sealed types in this document, as the two are so tightly related. > > Comments welcome. From alex.buckley at oracle.com Fri Mar 15 20:20:06 2019 From: alex.buckley at oracle.com (Alex Buckley) Date: Fri, 15 Mar 2019 13:20:06 -0700 Subject: Switch expressions spec In-Reply-To: <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> Message-ID: <5C8C08F6.3040304@oracle.com> OK, we intend at least one result expression to be required, so the spec is correct as is. (I should have been clearer that my belief was about the intent of the spec, rather than about how I personally think completion should occur.) Manoj didn't say what javac build he is testing with, but this is a substantial discrepancy between compiler and spec. I hope that Leonid Arbouzov (cc'd) can tell us what conformance tests exist in this area. Alex On 3/15/2019 12:09 PM, Brian Goetz wrote: > At the same time, we also reaffirmed our choice to _not_ allow throw from one half of a conditional: > > int x = foo ? 3 : throw new FooException() > > But John has this right ? the high order bit is that every expression should have a defined normal completion, and a type, even if computing sub-expressions (or in this case, sub-statements) might throw. And without at least one arm yielding a value, it would be impossible to infer the type of the expression. 
> >> On Mar 15, 2019, at 3:01 PM, John Rose wrote: >> >> On Mar 15, 2019, at 11:39 AM, Alex Buckley wrote: >>> >>> In a switch expression, I believe it should be legal for every `case`/`default` arm to complete abruptly _for a reason other than a break with value_. >> >> My reading of Gavin's draft is that he is doing something very >> subtle there, which is to retain an existing feature in the language >> that an expression always has a defined normal completion. >> >> We also don't have expressions of the form "throw e". Allowing >> a switch expression to complete without a value on *every* arm >> raises the same question as "throw e" as an expression. How do >> you type "f(throw e)"? If you can answer that, then you can also >> have switch expressions that refuse to break with any values. >> >> BTW, if an expression has a defined normal completion, it also >> has a possible type. By possible type I mean at least one correct >> typing (poly-expressions can have many). So one obvious >> result of Gavin's draft is that you derive possible types from >> the arms of the switch expression that break with values. >> >> But the root requirement, I think, is to preserve the possible >> normal normal of every expression. >> >> "What about some form of 1/0?" That's a good question. >> What about it? It completes normally with a type of int. >> Dynamically, the normal completion is never taken. >> Gavin might call that a "notional normal completion" >> (I like that word) provided to uphold the general principle >> even where static analysis proves that the Turing machine >> fails to return normally. >> >> ? 
John > From kevinb at google.com Fri Mar 15 21:02:24 2019 From: kevinb at google.com (Kevin Bourrillion) Date: Fri, 15 Mar 2019 14:02:24 -0700 Subject: Updated document on data classes and sealed types In-Reply-To: <8C50D2AD-89C9-46E4-AC5E-CCF2AD3420F8@oracle.com> References: <8C50D2AD-89C9-46E4-AC5E-CCF2AD3420F8@oracle.com> Message-ID: Well, I thought of nothing to dislike about this. 99.9% of users will never know or care that this is happening. Occasionally an exception will just pop up when deserializing invalid data and it would be hard to view that exception as a bad thing. Cool.... On Fri, Mar 15, 2019 at 12:45 PM Brian Goetz wrote: > There is (at least) one area of interaction with other features that I > want to nail down for records: serialization (it?s like death and taxes, > always catches up with you.) > > My proposal here is simple: if a record is Serializable, we inject an > implementation of readResolve() that runs back through the constructor; for > a record Foo with components a, b, and c, we?d get: > > private Object readResolve() { > return new Foo(a, b, c); > } > > This doesn?t interfere with the serialization mechanism (default vs > readObject/writeObject), but does defend against malicious streams that > forge record contents, by piping them back through the ctor which will do > validation / normalization. > > It may seem a little odd to do something here for records, but not for > everything else. To that, I have two answers: > > - Records are special in that we _can_ do this, and its pretty hard to > argue this is wrong (though perhaps slightly slower); > - This is a down payment on a bigger story for serialization, in the same > key: leaning on the constructor to validate state where possible.I?d rather > records (and values) be safe out of the gate, rather than having to patch > them later, and worry about older classfiles. 
> > > > On Mar 1, 2019, at 3:14 PM, Brian Goetz wrote: > > > > I've updated the document on data classes here: > > > > http://cr.openjdk.java.net/~briangoetz/amber/datum.html > > > > (older versions of the document are retained in the same directory for > historical comparison.) > > > > While the previous version was mostly about tradeoffs, this version > takes a much more opinionated interpretation of the feature, offering more > examples of use cases of where it is intended to be used (and not used). > Many of the "under consideration" flexibilities (extension, mutability, > additional fields) have collapsed to their more restrictive form; while > some people will be disappointed because it doesn't solve the worst of > their boilerplate problems, our conclusion is: records are a powerful > feature, but they're not necessarily the delivery vehicle for easing all > the (often self-inflicted) pain of JavaBeans. We can continue to explore > relief for these situations too as separate features, but trying to be all > things to all classes has delayed the records train long enough, and I'm > convince they're separate problems that want separate solutions. Time to > let the records train roll. > > > > I've also combined the information on sealed types in this document, as > the two are so tightly related. > > > > Comments welcome. > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From forax at univ-mlv.fr Fri Mar 15 22:24:57 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Fri, 15 Mar 2019 23:24:57 +0100 (CET)
Subject: Updated document on data classes and sealed types
In-Reply-To:
References: <8C50D2AD-89C9-46E4-AC5E-CCF2AD3420F8@oracle.com>
Message-ID: <594384846.1382629.1552688697823.JavaMail.zimbra@u-pem.fr>

> From: "Kevin Bourrillion"
> To: "Amber Expert Group Observers"
> Cc: "amber-spec-experts"
> Sent: Friday, 15 March 2019 22:02:24
> Subject: Re: Updated document on data classes and sealed types

> Well, I thought of nothing to dislike about this. 99.9% of users will never know or care that this is happening. Occasionally an exception will just pop up when deserializing invalid data and it would be hard to view that exception as a bad thing.

> Cool....

Hi Brian,
I like it too, better than my proposal, which requires special treatment of records in ObjectInputStream/ObjectOutputStream.

I suppose readResolve() can be overridden?

And playing the devil's advocate, you ruled out the automatic implementation of Comparable as being too magic, but you are proposing exactly the same mechanism for serialization (that's why I did not propose using readResolve() in my previous mail).
So I wonder if your position has changed on Comparable?

regards,
Rémi

> On Fri, Mar 15, 2019 at 12:45 PM Brian Goetz <brian.goetz at oracle.com> wrote:

>> There is (at least) one area of interaction with other features that I want to nail down for records: serialization (it's like death and taxes, always catches up with you.)
>> My proposal here is simple: if a record is Serializable, we inject an >> implementation of readResolve() that runs back through the constructor; for a >> record Foo with components a, b, and c, we?d get: >> private Object readResolve() { >> return new Foo(a, b, c); >> } >> This doesn?t interfere with the serialization mechanism (default vs >> readObject/writeObject), but does defend against malicious streams that forge >> record contents, by piping them back through the ctor which will do validation >> / normalization. >> It may seem a little odd to do something here for records, but not for >> everything else. To that, I have two answers: >> - Records are special in that we _can_ do this, and its pretty hard to argue >> this is wrong (though perhaps slightly slower); >> - This is a down payment on a bigger story for serialization, in the same key: >> leaning on the constructor to validate state where possible.I?d rather records >> (and values) be safe out of the gate, rather than having to patch them later, >> and worry about older classfiles. >>> On Mar 1, 2019, at 3:14 PM, Brian Goetz < [ mailto:brian.goetz at oracle.com | >> > brian.goetz at oracle.com ] > wrote: >> > I've updated the document on data classes here: >>> [ http://cr.openjdk.java.net/~briangoetz/amber/datum.html | >> > http://cr.openjdk.java.net/~briangoetz/amber/datum.html ] >>> (older versions of the document are retained in the same directory for >> > historical comparison.) >>> While the previous version was mostly about tradeoffs, this version takes a much >>> more opinionated interpretation of the feature, offering more examples of use >>> cases of where it is intended to be used (and not used). 
Many of the "under >>> consideration" flexibilities (extension, mutability, additional fields) have >>> collapsed to their more restrictive form; while some people will be >>> disappointed because it doesn't solve the worst of their boilerplate problems, >>> our conclusion is: records are a powerful feature, but they're not necessarily >>> the delivery vehicle for easing all the (often self-inflicted) pain of >>> JavaBeans. We can continue to explore relief for these situations too as >>> separate features, but trying to be all things to all classes has delayed the >>> records train long enough, and I'm convince they're separate problems that want >> > separate solutions. Time to let the records train roll. >>> I've also combined the information on sealed types in this document, as the two >> > are so tightly related. >> > Comments welcome. > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | [ mailto:kevinb at google.com | > kevinb at google.com ] -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.goetz at oracle.com Fri Mar 15 22:32:21 2019 From: brian.goetz at oracle.com (Brian Goetz) Date: Fri, 15 Mar 2019 18:32:21 -0400 Subject: Updated document on data classes and sealed types In-Reply-To: <594384846.1382629.1552688697823.JavaMail.zimbra@u-pem.fr> References: <8C50D2AD-89C9-46E4-AC5E-CCF2AD3420F8@oracle.com> <594384846.1382629.1552688697823.JavaMail.zimbra@u-pem.fr> Message-ID: <68dd883f-abf3-9817-adc3-d4e2050d2bfc@oracle.com> Yes, if you specify readResolve explicitly, you'll get what you wrote. As to "why Serialization but not Comparable": because serialization is, in some very real sense, a language feature (even though it dramatically pretends to be "just" a library feature) -- and one that has significant security impact. In comparison (heh), Comparable is just a random library interface. 
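
For readers following the thread, here is a minimal sketch of the readResolve() idea discussed above. It is an illustration under assumptions, not the proposed implementation: records are only a proposal at this point, so an ordinary Serializable class stands in for one, and the class name `Range`, its `lo <= hi` invariant, and the `toBytes`/`fromBytes` helpers are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a record with components lo and hi.
final class Range implements Serializable {
    private static final long serialVersionUID = 1L;

    private final int lo;
    private final int hi;

    Range(int lo, int hi) {
        // The constructor is the single place where the invariant is checked.
        if (lo > hi) {
            throw new IllegalArgumentException("lo > hi: " + lo + " > " + hi);
        }
        this.lo = lo;
        this.hi = hi;
    }

    int lo() { return lo; }
    int hi() { return hi; }

    // Invoked by ObjectInputStream once the fields have been filled in from the
    // stream; re-running the constructor re-validates whatever the stream held.
    private Object readResolve() {
        return new Range(lo, hi);
    }

    // Helper (assumption, for the demo only): serialize to a byte array.
    static byte[] toBytes(Range r) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(r);
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Helper (assumption, for the demo only): deserialize from a byte array.
    static Range fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Range) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A well-formed stream round-trips unchanged. A stream whose field bytes have been tampered with so that lo > hi should fail during deserialization instead of producing an object that the constructor would have rejected.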
On 3/15/2019 6:24 PM, Remi Forax wrote: > > > ------------------------------------------------------------------------ > > *De: *"Kevin Bourrillion" > *?: *"Amber Expert Group Observers" > > *Cc: *"amber-spec-experts" > *Envoy?: *Vendredi 15 Mars 2019 22:02:24 > *Objet: *Re: Updated document on data classes and sealed types > > Well, I thought of nothing to dislike about this. 99.9% of users > will never know or care that this is happening. Occasionally an > exception will just pop up when deserializing invalid data and it > would be hard to view that exception as a bad thing. > Cool.... > > > Hi Brian, > I like it too, better that my proposal that requires a special > treatment of records in ObjectInputStream/ObjectOutputStream. > > I suppose readResolve() can be overriden ?? > > And playing the devil advocate, you rule out the automatic > implementation of Comparable as been too magic but you are proposing > exactly the same mechanism for serialization (that's why i have not > proposed to used readResolve() in my previous mail). > So i wonder if your position has changed on Comparable ? > > regards, > R?mi > > > > On Fri, Mar 15, 2019 at 12:45 PM Brian Goetz > > wrote: > > There is (at least) one area of interaction with other > features that I want to nail down for records: serialization > (it?s like death and taxes, always catches up with you.) > > My proposal here is simple: if a record is Serializable, we > inject an implementation of readResolve() that runs back > through the constructor; for a record Foo with components a, > b, and c, we?d get: > > ? ? private Object readResolve() { > ? ? ? ? return new Foo(a, b, c); > ? ? } > > This doesn?t interfere with the serialization mechanism > (default vs readObject/writeObject), but does defend against > malicious streams that forge record contents, by piping them > back through the ctor which will do validation / normalization. 
> > It may seem a little odd to do something here for records, but > not for everything else.? To that, I have two answers: > > ?- Records are special in that we _can_ do this, and its > pretty hard to argue this is wrong (though perhaps slightly > slower); > ?- This is a down payment on a bigger story for serialization, > in the same key: leaning on the constructor to validate state > where possible.I?d rather records (and values) be safe out of > the gate, rather than having to patch them later, and worry > about older classfiles. > > > > On Mar 1, 2019, at 3:14 PM, Brian Goetz > > wrote: > > > > I've updated the document on data classes here: > > > > http://cr.openjdk.java.net/~briangoetz/amber/datum.html > > > > (older versions of the document are retained in the same > directory for historical comparison.) > > > > While the previous version was mostly about tradeoffs, this > version takes a much more opinionated interpretation of the > feature, offering more examples of use cases of where it is > intended to be used (and not used).? Many of the "under > consideration" flexibilities (extension, mutability, > additional fields) have collapsed to their more restrictive > form; while some people will be disappointed because it > doesn't solve the worst of their boilerplate problems, our > conclusion is: records are a powerful feature, but they're not > necessarily the delivery vehicle for easing all the (often > self-inflicted) pain of JavaBeans.? We can continue to explore > relief for these situations too as separate features, but > trying to be all things to all classes has delayed the records > train long enough, and I'm convince they're separate problems > that want separate solutions.? Time to let the records train roll. > > > > I've also combined the information on sealed types in this > document, as the two are so tightly related. > > > > Comments welcome. 
>
>
> --
> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From forax at univ-mlv.fr Fri Mar 15 22:47:43 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Fri, 15 Mar 2019 23:47:43 +0100 (CET)
Subject: Updated document on data classes and sealed types
In-Reply-To: <68dd883f-abf3-9817-adc3-d4e2050d2bfc@oracle.com>
References: <8C50D2AD-89C9-46E4-AC5E-CCF2AD3420F8@oracle.com> <594384846.1382629.1552688697823.JavaMail.zimbra@u-pem.fr> <68dd883f-abf3-9817-adc3-d4e2050d2bfc@oracle.com>
Message-ID: <2065042069.1383504.1552690063580.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "Remi Forax"
> Cc: "amber-spec-experts", "Kevin Bourrillion"
> Sent: Friday, 15 March 2019 23:32:21
> Subject: Re: Updated document on data classes and sealed types

> Yes, if you specify readResolve explicitly, you'll get what you wrote.

> As to "why Serialization but not Comparable": because serialization is, in some very real sense, a language feature (even though it dramatically pretends to be "just" a library feature) -- and one that has significant security impact.

> In comparison (heh), Comparable is just a random library interface.

Just for the record (heh), I don't mind writing a few more characters to implement Comparable. What I care about is that, because equals() is generated but compareTo() is hand-written, there will be hidden bugs, mostly a.equals(b) == true not being equivalent to a.compareTo(b) == 0 because of overflows.
So Comparable.compareTo is not really a method of a random interface, because it has a dependency on equals.
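
One way the overflow problem mentioned above can show up, as a hedged sketch: the class `Id` and the method `buggyCompare` are hypothetical names invented for this illustration. A hand-written comparison that subtracts two long components and narrows the result to int can return 0 for values that equals() treats as different, whereas Long.compare stays consistent with equals().

```java
// Hypothetical value class with a single long component.
final class Id implements Comparable<Id> {
    private final long value;

    Id(long value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Id && ((Id) o).value == value;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(value);
    }

    // Hand-written shortcut: the narrowing cast keeps only the low 32 bits,
    // so a non-zero difference such as 2^32 collapses to 0.
    static int buggyCompare(Id a, Id b) {
        return (int) (a.value - b.value);
    }

    // Consistent with equals(): returns 0 only when the two values are equal.
    @Override
    public int compareTo(Id other) {
        return Long.compare(value, other.value);
    }
}
```

With `new Id(0)` and `new Id(1L << 32)`, equals() reports the two as different, yet buggyCompare returns 0, which is exactly the kind of hidden inconsistency a sorted collection would trip over.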
R?mi > On 3/15/2019 6:24 PM, Remi Forax wrote: >>> De: "Kevin Bourrillion" [ mailto:kevinb at google.com | ] >>> ?: "Amber Expert Group Observers" [ mailto:amber-spec-observers at openjdk.java.net >>> | ] >>> Cc: "amber-spec-experts" [ mailto:amber-spec-experts at openjdk.java.net | >>> ] >>> Envoy?: Vendredi 15 Mars 2019 22:02:24 >>> Objet: Re: Updated document on data classes and sealed types >>> Well, I thought of nothing to dislike about this. 99.9% of users will never know >>> or care that this is happening. Occasionally an exception will just pop up when >>> deserializing invalid data and it would be hard to view that exception as a bad >>> thing. >>> Cool.... >> Hi Brian, >> I like it too, better that my proposal that requires a special treatment of >> records in ObjectInputStream/ObjectOutputStream. >> I suppose readResolve() can be overriden ?? >> And playing the devil advocate, you rule out the automatic implementation of >> Comparable as been too magic but you are proposing exactly the same mechanism >> for serialization (that's why i have not proposed to used readResolve() in my >> previous mail). >> So i wonder if your position has changed on Comparable ? >> regards, >> R?mi >>> On Fri, Mar 15, 2019 at 12:45 PM Brian Goetz < [ mailto:brian.goetz at oracle.com | >>> brian.goetz at oracle.com ] > wrote: >>>> There is (at least) one area of interaction with other features that I want to >>>> nail down for records: serialization (it?s like death and taxes, always catches >>>> up with you.) 
>>>> My proposal here is simple: if a record is Serializable, we inject an >>>> implementation of readResolve() that runs back through the constructor; for a >>>> record Foo with components a, b, and c, we?d get: >>>> private Object readResolve() { >>>> return new Foo(a, b, c); >>>> } >>>> This doesn?t interfere with the serialization mechanism (default vs >>>> readObject/writeObject), but does defend against malicious streams that forge >>>> record contents, by piping them back through the ctor which will do validation >>>> / normalization. >>>> It may seem a little odd to do something here for records, but not for >>>> everything else. To that, I have two answers: >>>> - Records are special in that we _can_ do this, and its pretty hard to argue >>>> this is wrong (though perhaps slightly slower); >>>> - This is a down payment on a bigger story for serialization, in the same key: >>>> leaning on the constructor to validate state where possible.I?d rather records >>>> (and values) be safe out of the gate, rather than having to patch them later, >>>> and worry about older classfiles. >>>>> On Mar 1, 2019, at 3:14 PM, Brian Goetz < [ mailto:brian.goetz at oracle.com | >>>> > brian.goetz at oracle.com ] > wrote: >>>> > I've updated the document on data classes here: >>>>> [ http://cr.openjdk.java.net/~briangoetz/amber/datum.html | >>>> > http://cr.openjdk.java.net/~briangoetz/amber/datum.html ] >>>>> (older versions of the document are retained in the same directory for >>>> > historical comparison.) >>>>> While the previous version was mostly about tradeoffs, this version takes a much >>>>> more opinionated interpretation of the feature, offering more examples of use >>>>> cases of where it is intended to be used (and not used). 
>>>>> Many of the "under consideration" flexibilities (extension, mutability, additional fields) have collapsed to their more restrictive form; while some people will be disappointed because it doesn't solve the worst of their boilerplate problems, our conclusion is: records are a powerful feature, but they're not necessarily the delivery vehicle for easing all the (often self-inflicted) pain of JavaBeans. We can continue to explore relief for these situations too as separate features, but trying to be all things to all classes has delayed the records train long enough, and I'm convinced they're separate problems that want separate solutions. Time to let the records train roll.
>>>>>
>>>>> I've also combined the information on sealed types in this document, as the two are so tightly related.
>>>>>
>>>>> Comments welcome.
>>>
>>> --
>>> Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guy.steele at oracle.com Sat Mar 16 00:01:21 2019
From: guy.steele at oracle.com (Guy Steele)
Date: Fri, 15 Mar 2019 20:01:21 -0400
Subject: Switch expressions spec
In-Reply-To: <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com>
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com>
Message-ID: <7B775DC8-4907-4B05-913D-0D3FB70293E4@oracle.com>

And here's another way to think about it: if a subexpression never completes normally, then in effect there is unreachable code in the containing expression, and we don't like to have unreachable code.

> On Mar 15, 2019, at 3:09 PM, Brian Goetz wrote:
>
> At the same time, we also reaffirmed our choice to _not_ allow throw from one half of a conditional:
>
>     int x = foo ? 3 : throw new FooException()
>
> But John has this right:
> the high order bit is that every expression should have a defined normal completion, and a type, even if computing sub-expressions (or in this case, sub-statements) might throw. And without at least one arm yielding a value, it would be impossible to infer the type of the expression.

From john.r.rose at oracle.com Sat Mar 16 00:54:30 2019
From: john.r.rose at oracle.com (John Rose)
Date: Fri, 15 Mar 2019 17:54:30 -0700
Subject: String reboot (plain text)
In-Reply-To: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com>
Message-ID: <58E6523D-8951-4927-85A7-0BAB30234EC3@oracle.com>

OK, I responded to one corner by pointing out a principle that tends to align rawness more strongly with multi-line-ness. I guess I should lay all my cards on the table FTR, and will do so by responding to Brian's restacking Email and Jim's reboot Email. (I guess today's String-day.)

TL;DR: I agree substantially with Jim's analysis and Brian's staging, especially the earlier and simpler parts. Our order #1 should keep classic escapes, instead of eliminating them (raw) or strengthening them (strong escapes, like strong delimiters). Later orders should have a place for such things (raw and/or strong escapes/quotes).

(Side note: The term "escape" always makes me think of a two-character sequence, the first of which is probably reverse solidus, like "\x". I'd like to use a neutral term like "interruptor" coupled with "quote" to refer to the more general feature of "a visible notation which interrupts a string rather than terminates it like a quote does". And now I realize that Jim's term "delimiter" does the same thing for "quote". So I'll try to tilt toward "delimiter" and "interruptor" instead of "quote" and "escape".)

Classic escapes and single quotes are both too tiny to see well inside multi-line strings, but they are also familiar and people will get used to "squinting" for them, at least the escapes.
Our take is that we'd all rather "squint" (in the first order) instead of add complexity to the first feature. I'm fine with a two- or three-order stacking, as long as there is a credible story for the final course of the meal, if we are still hungry, which includes strong delimiters and (some sort of) strong escapes that are (a) not easy to collide with and (b) not hard to "squint" for. IMO strong delimiters will often be associated somehow with strong interruptors. In fact (see digression below in context) I think rawness is maybe not exactly the right concept; the concept of "escape strength" may be more fruitful for us. > On Mar 13, 2019, at 10:52 AM, Brian Goetz wrote: > > Lots of good discussion so far. Let me gather the threads. > > - The primary use case is embedding multi-line chunks of foreign code or data in Java, with minimal need to cruft it up with escaping. This says to me that _multi-line strings_ are actually the high-order bit here, and raw strings are the next bit. Let?s address these in order. +1 > - Multi-line-ness and raw-ness are orthogonal concepts. Some languages merge them, and we might consider doing that too, but we shouldn?t start there. +0.6 (As I implied previously, a number less than one is more representative of orthogonality, sine-of-the-angle-between, of the two features. But also, I'm fine with not starting with raw-ness, as long as it's on the menu somewhere. > - For multi-line strings, a stronger delimiter (e.g., """) seems to be preferred on readability grounds, because people don't want to have to squint to see where the embedded code ends and the Java code resumes. Yes. The same point applies to escapes ("string interruptors", not "string delimiters"), but since escapes are clearly less common than string boundaries, I'm content to just note the point, and accept a design which requires users to squint for escapes, on the grounds that they will be both rare, usually safe to disregard on first reading. 
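As a concrete picture of the small escapes one has to squint for, and of Java and a foreign notation both wanting the backslash, here is a minimal sketch using only today's string literals. The class and method names are hypothetical, chosen for illustration; nothing here uses any proposed syntax.

```java
class EscapeOwnershipSketch {
    // Java consumes \" and \\ here, so the resulting text still carries the
    // two-character sequence backslash + n for the JSON parser to interpret.
    static String jsonEscapeKept() {
        return "{\"msg\": \"line1\\nline2\"}";
    }

    // Here Java consumes \n itself, so a real line break ends up inside the
    // JSON string value, which a JSON parser would not accept.
    static String javaEscapeWins() {
        return "{\"msg\": \"line1\nline2\"}";
    }

    public static void main(String[] args) {
        System.out.println(jsonEscapeKept());
        System.out.println(javaEscapeWins());
    }
}
```

The two literals differ by a single backslash in the source, which is exactly the kind of detail that is easy to miss in a long embedded snippet.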
> To which I'll add the following observations: > > - Most multi-line string candidates (JSON, XML, SQL, etc) do not require characters that have to be escaped, as long as we don't have conflicts with the quote character. Which suggests further than ML-ness and raw-ness are solving separate problems. Jim notes this in passing in the "75%" section, but I'll call it out here too: "Characters that have to be escaped" also include Java's escape. A JSON string will have a puzzling problem if it contains a JSON escape sequence that is processed by Java, rather than by the JSON parser. I don't see how to avoid this easily in the first course on the menu, but I want to note the design heuristic that design vectors for delimiters are correlated with interruptors. (The problem with JSON escapes is like the problem with regexp escapes. In both cases we have both Java and the foreign notation competing for ownership of the reverse solidus. I think a proper notion of strong interruptors will allow Java to gracefully give the foreign notation precedence, within certain of Java's envelopes, just as strong delimiters do so with quotes.) If you have to escape foreign delimiters, chances are you'll have to escape foreign interruptors. Another use of the heuristic: If you found yourself tripling the quotes to avoid collisions, there's probably a related use case for strengthening (tripling???) the escapes, to avoid the same (but rarer) sort of collisions. (I'm thinking Python also and JavaScript also, for script fragments, but we choose to place scripting lower on the menu, along with quoted-Java-in-Java nesting.) > - Once we separate multi-line from raw, the idea of automatically reflowing indentation starts to become a sensible option on non-raw, multi-line strings. +100 Yes, this is the nugget of gold that we mine out of the decision to defer rawness. 
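To make the reflow idea concrete, here is a minimal sketch of common-indentation stripping. It is not the proposed String::align; the names `ReflowSketch` and `reflow` are hypothetical, and the sketch assumes indentation is made of spaces only, which sidesteps the mixed tabs-and-spaces problem.

```java
class ReflowSketch {
    // Hypothetical helper: removes the smallest leading-space run shared by
    // all non-blank lines. Blank lines are emitted as empty lines.
    static String reflow(String text) {
        String[] lines = text.split("\n", -1);
        int indent = Integer.MAX_VALUE;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // blank lines do not influence the common indent
            }
            int i = 0;
            while (i < line.length() && line.charAt(i) == ' ') {
                i++;
            }
            indent = Math.min(indent, i);
        }
        StringBuilder out = new StringBuilder();
        for (int k = 0; k < lines.length; k++) {
            if (k > 0) {
                out.append('\n');
            }
            String line = lines[k];
            if (!line.trim().isEmpty()) {
                out.append(line.substring(indent));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String snippet = "        <p>\n            Hello World.\n        </p>";
        System.out.println(reflow(snippet));
    }
}
```

Relative indentation inside the snippet survives, while the indentation that only exists to line the snippet up with the surrounding Java code is dropped.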
> - Repeating delimiters are slightly more powerful than fixed delimiters, but also have additional cognitive load, and can still lead to anomalies that are easily encountered. That said, they pay for themselves as visual cues for multi-line thingies, and we immediately put them back into the shopping cart, with length set at three. This helps us properly size the "cognitive load" argument. Once you learn about jumbo delimiters, you learn to spot them, and you are paid for the effort because you only learned once, but you can spot them quicker every time you look. The same point readily applies to replacing "a count of three" with "a count of three or more", although with sharply diminished returns, since three is almost always enough. (What about quote counting? Well, programmers shouldn't be writing puzzlers in their code. So use extra, enough to make it obvious, and don't trick your reader with one-off counts unless you are writing a puzzler book. Or find another solution instead of quote counting to make the quotes look (a) like the quotes they are, and (b) different enough from competing would-be quotes.) None of these ideas apply to the first course, IMO. I'm realizing how apt it is for Jim to call it an appetizer; it is very thin but tasty, as an appetizer should be. And Brian will say, "wait until you see how filling it is!" We certainly want to avoid unhealthy gorging? > With that said, let's reorder the dishes a bit. > > For our first course, we could have multi-line strings, delimited by the fixed delimiter """. These would be escaped strings, just like existing string literals, but because the single-quote is no longer the delimiter, the most common source of escaping (embedded quotes) is removed. Most multi-line strings will require no escaping at all. 
+1 (for most definitions of "most")

> Note that if we stopped here _and never ordered anything else_, we would still be in a much better place than we are now (most snippets could just be cut and pasted without mangling), and what we've introduced is dead-simple! So the cost-benefit ratio here is high; it's a simple addition that addresses a significant fraction of the pain points. I think we should at least order this.

+100

> Now, maybe we're still a little hungry, and the above doesn't help with those strings that are most polluted by escapes, such as regular expressions. So, we might additionally order the ability to layer a way to say "no escape mangling" atop both our " strings and our """ strings. Jim proposes we use a delimiter of \".."\ for such strings (\""" ... """\ for the multi-line version). This has a nice connotation; it is as if the backslash is "distributed over" the whole string.

+1; it wins the beauty contest. It needs lack of simplicity as well as beauty. By simplicity I mean it resists unintentional creation of puzzlers, and we think intentional puzzlers have a limited effect. The jury is out IMO; puzzle on.

Also, the second course (tweaking escapes) needs IMO to be plausibly followable (if not followed in fact) by a third course, which allows fullest control of syntax (nonces, repeats, whatever). I think Jim's syntax passes that test, since there are ways to increase the number of escapes, or lengthen the token in other ways to achieve strong delimiters. It seems to me there may be a good course #3 design which pins the quotes at three and allows larger and larger numbers of escapes.

(Hmm, idea of the moment: We could allow any *whole* delimiter sequence to be *tripled* in order to strengthen it. Not just little old double-quote " gets the tripling treatment. But now I'm puzzling way outside the box.)

> This does, unfortunately, bring us back into Delimiter Hell; what if we want our string to contain the quote + backslash combination?
> One way is to dive back into repeating delimiters (e.g., using multiple backslashes in the delimiter). Having a non-homogeneous repeating delimiter leaves us in a slightly better place than the original proposal, as we've eliminated the "empty string" anomaly as well as the "starting with backtick" anomaly. So this seems a workable direction, though the cost-benefit here is less than with the first course, in both directions (higher cost, lower benefit.)
>
> So, in the spirit of "keep ordering until sated, but stop there", here are some reasonable choices.
>
> 1. Do multi-line (escaped) strings with a """ fixed delimiter. Large benefit, small cost. Most embedded snippets don't need any escaping. Low cost, big payoff.
>
> 1a. Do 1, but automatically reflow multi-line strings using the equivalent of String::align. There have been reasonable proposals on how to do this; where they fell apart is the interaction with raw-ness, but if we separate ML and raw, these become reasonable again. Higher cost, but higher payoff; having separated the interaction with raw strings, this is more defensible.

I like this; it will make ML-string code more readable, and coders can use indentation to guide the eye. This almost (not quite) removes the need for tripling the quote. (Not quite because it would mandate indentation, and because of JSON quotes. Heuristic comment: Remember JSON escapes also.)

1a'. As part of 1a., add one or two new escape sequences to control string body layout, in straightforward ways, as part of the reflow story. Discussion on request; one way is to allow a "white space gobbler" escape which eats the backslash and all whitespace plus a final newline if any. I'm mentioning that now here because it has several uses.

> 2. Do (1) or (1a), and add: single-line raw string literals delimited by \"..."\.

This course (#2) raises the issue of controlling delimiters and interruptors separately instead of together.
I think it's fine to control them separately, in different courses. If quote and escapes (delimiters and interruptors) were equally common in today's workloads I think we'd choose to control them together, but they are not, so it's more important to tweak the delimiters than tweak the interruptors. This proposal can be understood in either of two ways: The contents of the string are absolutely raw except for the occurrences of end-delimiters, or they are "more strongly raw", in that some stronger interruptor is sufficient to bring in today's rules for escapes, just as some stronger delimiter is sufficient to delimit the end of the string. I think Jim anticipated the idea of stronger interruptors when he said: > Even with escaping off, we still might have to escape delimiters. > Repeated backslashes (or repeated delimiters) is the typical out. The idea of stronger escapes conflicts with absolute "escaping off", which Jim also calls for, so I think order #2 needs a little more simmering. Which is fine; let's eat order #1 first. My overall take is, if a strong-enough (repeated?) escape can escape a strong delimiter, let's also allow such a strong-enough escape to do other chores as well; that leads me to a proper concept of "strong interruptor". This means that if you have a raw string that has a very rare need for an escape sequence, then you just strengthen the escape, rather than cook the whole string or concatenate it. Use the right rawness for the job, certainly, and maybe there's a way to do this on the whole-string level. In any case I think we can improve here on the previous proposals for "regional rawness". More details later; that's enough for now. Rawness is proportional to escape strength. No single string syntax is truly 100.000% raw, because the raw string cannot include a copy of its delimiter. 
Adjust that viewpoint to embrace interruptors as well and you get: A very raw string is one which is difficult, but not impossible, to end with a delimiter token, or to interrupt with an interruptor token. What does "difficult" mean? Simple, it means using more characters, until the subject string gives up and says, "don't have one of those, go fish". So the quest for ever stronger delimiters has a flip side: It is also a quest for ever rawer string notations. There is no such thing as an absolutely raw string, just one that is "raw enough". In those terms, I'd like to reserve, for an optional final course, a scheme for making strings as raw as you please, so that a quoted-and-escaped-five-times-raw string can be quoted inside of quoted-and-escaped-six-times-raw string. A corner case for purists? Yes. A real need for real users? We'll see; let's keep something brewing in the kitchen, just in case. > 2a. Do (1) or (1a), and also support multi-line raw string literals (where we _don?t_ automatically apply String::align; this can be done manually). Note that this creates anomalies for multi-line raw string literals starting with quotes (this can be handled with concatenation, and having separated ML and raw, this is less of a problem than before). +1 If we allow stronger interruptors in rawer strings, we can easily disrupt would-be delimiters by escaping them, so we wouldn't need concatenation. The stronger escapes could be part of 2 (controversially complex) or 3 (slightly inconsistent with absolute rawness of simple 2 syntax). > 3. Do (2) and (2a), and also support a repeating compound delimiter with multiple backslashes and a quote. > > Note that we can start with 1 or 1a now, and move on to 2/2a later, and same for 3. Order #3 is where we would have a full and decisive conversation about not only strong delimiters but also strong interruptors. I bring it up with order #2 above because #2 is where interruptor control first appears as a possibility. 
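The "raw enough" idea can be sketched as a small search: given some content, find the weakest delimiter that the content does not already contain. The sketch below assumes a purely hypothetical family of closing delimiters, three quotes followed by n backslashes, loosely modeled on the Jim-quote shape discussed in this thread; it is not a proposed Java syntax, and the class and method names are invented for illustration.

```java
class DelimiterStrengthSketch {
    // Hypothetical closing delimiter of strength n: """ followed by n backslashes.
    static String closing(int n) {
        StringBuilder sb = new StringBuilder("\"\"\"");
        for (int i = 0; i < n; i++) {
            sb.append('\\');
        }
        return sb.toString();
    }

    // Smallest strength whose closing delimiter does not occur in the content.
    // The loop terminates because the content is finite and each stronger
    // delimiter is one character longer.
    static int minimalStrength(String content) {
        int n = 1;
        while (content.contains(closing(n))) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(minimalStrength("no delimiters here"));
        System.out.println(minimalStrength("mentions \"\"\"\\ once"));
    }
}
```

Content that itself mentions a delimiter of strength n simply pushes the answer to n + 1, which is the sense in which no notation is absolutely raw, only raw enough for the string at hand.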
> As we evaluate these options, note that: > > - Having separated ML-ness from raw-ness, doing automatic reflow becomes more defensible for the common (ML, non-raw) case. This is a very important point. It wasn't apparent when we started, and that's why we go slowly on these things. > - The intersection of ML and raw seems pretty small, so doing 1a + 2, while asymmetric, is defensible. Our experience will bear out how truly small this intersection is; you and I perhaps differ on that call. But after doing 1a (1a' please!) we will certainly know more. > - What we don?t order now, we can add later. Yes, if we are careful not to get ourselves thrown out of the restaurant by making poor choices during the early courses. That's why I'm being all picky and theoretical here. Now for some brief responses to Jim's points, if they are not already noted above: On Feb 10, 2019, at 7:43 AM, Jim Laskey wrote: > >> ...50% solution >> >> Where we keep running into trouble is that a choice for one part of the lexicon spreads into the the other parts. That is, use of certain characters in the delimiter affect which characters require escaping and which characters can be used for escaping. (Good insight; leads to independent control for delimiter.) >> ... >> >> 75% solution, almost >> >> ? >> ? Even with escaping off, we still might have to escape delimiters. Repeated backslashes (or repeated delimiters) is the typical out. (Yes, this got me going, maybe more than you intended, see above.) >> >> String html = \" >> >>

Hello World.

>> >> >> "\; (I'm starting to call these Jim-quotes. They are growing on me.) >> ? Captain we need more sequences. > >> And, this is the crux of all the debate around strings. Fixed delimiters imply a requirement for escape sequences, otherwise there is content you cannot express as a string. (My work is almost done here! Now if we apply that reasoning to interruptors also, we get the idea of adjustable rawness, without losing the benefits of escape sequences.) >> ... >> Fixed delimiter >> >> If we go with a fixed delimiter then we limit the content that can be expressed without escape sequences. This is not totally left field. There are floating point values we can not express in Java and types we can express but not denote, such as anonymous class types, intersection types or capture types. (Sure, but strings are much more "free" mathematically than those other things One character shouldn't have to care (char?) what its neighbors are doing.) >> ... >> Once you take away conflicts with the delimiter, most strings do not require escaping. ?Always excepting strings which have the audacity to mention the New, Improved Delimiter. If Java picks one that nobody else would ever dream of, we'll still have one remaining case of embedding Java inside of Java. For me failure to nest is a smell indicating possible rats, for others it's a trade-off. >> ? >> Summary: All strings can be expressed with fixed plus escaping, but can not express strings containing the fixed delimiter (""") with escaping off. True. Three points related to that: A. If we have to escape the fixed delimiter, then we place an escape before it, and all is well. If we are happy that users can easily spot our delimiter without "squinting", then they can probably spot the escaped copy of the same delimiter. B. But, once we allow delimiters to run through the string, there is another cost: Little sequences like \\ and \n and \0 can be anywhere in the bulk of the ML string, and users *must squint* for those. 
This is a cost, and we wish we could make those more visible also, or just make the rest of the string raw. C. The observations of A and B can be balanced if we use strong interruptors instead of the "little squinty sequences", and maybe also for the escaped delimiter. There are various ways to do this, all of which suppress short escape sequences in favor of longer ones. >> Jumping ahead: I think that stating that traditional " strings must be single-line will be a popular restriction, even if it not needed. Then they will think of """ as meaning multi-line. +1 >> >> Structured delimiter (AKA periodic or partially periodic string.) >> ? >> Summary: Can express all strings with and without escaping. If the delimiter length is limited the there there is still a (smaller) set of strings that can not be expressed. Yep. And put "structured interruptor" in the kitchen also. >> Nonce delimiter >> >> ... >> Summary: Can express all strings with and without escaping, but nonce can affect readability. I agree. There's too much "noise" in a nonce, and it's easy to misuse. Alternative (stated elsewhere): Indexed delimiter. Here, the role of the nonce is played by a small number which is not the length of the delimiter but rather an actual numeral placed in the delimiter. Such things can be made deterministic, so that, if you are going to quote a string S which has apparent delimiters in it, there is a unique smallest non-conflicting index which may be used for the indexed delimiter of the quoted string. (And the indexed interruptor, if you want one.) >> >> Multi-line formatting >> >> I left this out of the main discussion, but I think we can all agree that formatting rules should separate the delimiters from the content. +1 (This is an instance of user control over the form of the source program containing the string. I don't know what is the right mix of mechanism and policy to get it all right, but I agree format control is an important issue.) 
>> Other details can be refined after choice of delimiter(s). >> ... >> Entrees and desserts >> >> If we make good choices now (stay away from the oysters) we can still move on to other courses later. >> >> For instance; if we got up from the table with the ", """, ", """ set of delimiters, we could still introduce structured delimiters in the future; This is often true, but not always, so we have to keep our eyes open. Purely periodic strings don't extend, as structured delimiters, as well as non-periodic or (some) partially-periodic ones. Consider: var s = \"""""? Does that begin today's three-quote-delimited string, which has two more quotes in it, or tomorrow's five-quote-delimited string? (This takes me back to the crazy idea of going with 1, 3, 9, 27 quotes. "I'll have a triple.") If I allow up to N quotes in my delimiter today, then coders will write strings which begin with more quotes in the string body. Either I have to somehow outlaw that, or else I am forbidden from using longer strings of N+1 quotes for future delimiters. Adding more escapes on the front is another matter, and I think that would work fine, especially if the "extra" escapes on the front somehow strengthened the string's interruptor and delimiter in a consistent manner. So we could enumerate ", \", """, \""", \\""", \\\""", \\\\""" etc. Or ", \", """, \""", \1""", \2""", \3""" etc. No need for more than three quotes (or more than one, for that matter, but there are other reasons to like three). >> either with repeated (see Swift) or repeated ". We could also follow a suggestion John made to use a pseudo nonce like " for \\" or """"". Yep, see above. >> Point being, we can work with a 85% solution now that we can supplement later when we're not so hangry. +100 HTH ? 
John

From brian.goetz at oracle.com Mon Mar 18 14:06:30 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Mar 2019 10:06:30 -0400
Subject: Fwd: String reboot (plain text)
References: <437B786C-BDE5-4840-A77B-9697013FA9D7@icloud.com>
Message-ID: <8BD6452D-CB90-49AA-98C0-70749B0B526B@oracle.com>

Received on the -comments list.

Summary: another plea for "just do what Swift does".

My response: The approach of "just do what language X does" is intrinsically irresponsible; nearly every feature of every language is conditioned by other features of that language. Instead, the game is to learn from how other languages do things, assess the tradeoffs they've chosen (explicitly and implicitly), and ask what can be applied to the constraints of the language we have and user expectations within the community we have.

That said, there's lots to learn from what Swift did (note that this aligns strongly with John's "strong interruptor" theory). Similarly aligned with John's point from the other day is that raw/non-raw is not a binary, but a spectrum, and in that, there may be a path to seeing the two in a unified framework, rather than as two alternatives. This is a useful direction to explore, and, if we go this way, we should drop the "raw" terminology by the wayside, because it both pulls the design center in another direction, and is a confusing way to describe the feature. (Kotlin has similarly adopted the "raw" terminology in a distinctly non-raw way, which is similarly unfortunate.)

All that said, purely for purposes of making forward progress, I'm declaring a temporary moratorium on raw-ness and escaping until we work out the 1/1a story for multi-line strings. We're deep in Jim's "hangry" territory, and we need to eat something before we dive down that rathole.
> Begin forwarded message:
>
> From: Fred Curts
> Subject: String reboot (plain text)
> Date: March 18, 2019 at 7:43:56 AM EDT
> To: amber-spec-comments at openjdk.java.net
>
> Let's do a thought experiment: If Java literally adopted Swift's string literals, there would be no old/new or raw/non-raw distinctions. There would just be single line and multiline strings with customizable string delimiters. (Swift's "raw string" terminology is unfortunate.) A truly raw string literal would be easy enough to simulate with a highly customized string delimiter (which, importantly, also gives a highly customized escape sequence) and unindented closing delimiter. String interpolation could be added later (even for single line strings) because it's based on the existing (now customizable) escape sequence. What's not to like here?
>
> In my view, Swift demonstrates that customizable string delimiters (which also affect escape sequences) make a raw/non-raw distinction unnecessary. (Note that this has nothing to do with coupling orthogonal features.) I have yet to hit a case where Swift's string literals prove inadequate in practice.
>
> -Fred
>
> On Wed, Mar 13, 2019 at 12:57 PM Brian Goetz wrote:
>
>> To the "indent is good enough" point: Auto reflow is a disaster when
>> applied to mixed spaces and tabs; while in general one should avoid this, I
>> cannot rule out the possibility that someone might actually want to embed
>> such a snippet; in that case, truly raw strings are an option. If we take
>> away truly raw, now they just have two bad approximations.

From brian.goetz at oracle.com Mon Mar 18 14:19:31 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Mar 2019 10:19:31 -0400
Subject: Fwd: Concise method bodies with delegation and this
References:
Message-ID:

Received on the -comments list.
So, the CMB proposal was a bit of an experiment in letting a half-baked idea out of the lab a little sooner than we usually do. Since it (for expected reasons) raised some concerns both on the feature itself, and of whether this is the right feature to be working on now, I'm letting it lie fallow for a while. I think "almost got a consensus, with only one remaining corner" is a little bit of an optimistic description of the status.

You've noticed that I sometimes use the CMB syntax in examples; this is both out of convenience (it is a convenient notation) and out of marketing (getting people used to the idea). I expect the marketing process will take some time, but that's fine, we're not going anywhere.

You are right that the really interesting part of the feature is the treatment of "this" when the descriptors don't already match exactly, and that there's subtle power in this trick. And you are right that we are building on an approach that we already started with in lambda, though there are more cases here than lambda handles.

The trick of building on top of explicit `this` is definitely one of the tools we have in the toolbox. (While you are correct that the proximate cause for adding it was receiver annotations, we wouldn't have been so quick to do this if we didn't see other uses for the same thing in the future, such as using it to denote type constraints on conditional methods (e.g., `Foo<T extends Comparable<T>> this`).)

> Begin forwarded message:
>
> From: Victor Nazarov
> Subject: Concise method bodies with delegation and this
> Date: March 12, 2019 at 12:26:07 PM EDT
> To: amber-spec-comments at openjdk.java.net
>
> There was not much discussion about concise method bodies recently.
> But record discussions usually touch on concise methods or silently use
> concise method declarations in example code.
>
> As I understand it, concise methods almost got a consensus about basic
> syntax, and the only remaining corner is the details of delegation
> and passing the `this` argument to the delegated method.
>
> What I want to propose is to leave the choice of whether to pass `this` as an
> argument to the user, with the help of additional syntax that is already
> present in Java.
>
> To recall, the basic syntax uses an arrow, like lambda expressions, and allows
> methods to be defined without curly braces and the return keyword:
>
>     record Person(String firstName, String lastName) {
>         String fullName() -> firstName + lastName;
>     }
>
> One of the motivating examples of concise methods is an implementation of
> the Comparable interface with a Comparator:
>
>     record Person(String firstName, String lastName)
>             implements Comparable<Person> {
>
>         private static final Comparator<Person> comparator =
>             Comparator.comparing(Person::lastName)
>                       .thenComparing(Person::firstName);
>
>         String fullName() -> firstName + lastName;
>         int compareTo(Person that) -> comparator.compare(this, that);
>     }
>
> The delegation form allows a method to be declared without explicitly
> writing the method call. The `compareTo` implementation becomes:
>
>     record Person(String firstName, String lastName)
>             implements Comparable<Person> {
>
>         private static final Comparator<Person> comparator = // ...
>
>         String fullName() -> firstName + lastName;
>         int compareTo(Person that) = comparator::compare;
>     }
>
> The problem with the delegation form is that sometimes it is desirable to pass
> `this` to the delegated method, but sometimes it's not:
>
>     record Person(String firstName, String lastName, List children)
>             implements Comparable<Person> {
>
>         private static final Comparator<Person> comparator = // ...
>
>         String fullName() -> firstName + lastName;
>         int compareTo(Person that) = comparator::compare;
>         int numberOfChildren() = children::size;
>     }
>
> We want to pass `this` as the first argument to the comparator::compare call,
> but we don't want to pass `this` as an argument to the children::size call. The
> decision of when to pass `this` can be left to the compiler, or more
> specifically to the type-checker. But an implementation requires two branches
> of type inference to be performed. Furthermore, both branches can
> potentially succeed, and there is no clear criterion for which branch should be
> preferred. Additionally, this type inference is inconsistent with the type
> inference performed for lambda expressions and method references.
>
> What I want to propose is to leave the choice of whether to pass `this` as an
> argument to the user. What if there were some additional syntax to specify
> whether to pass `this` on delegation? But there is already such syntax.
> Java 8 allows `this` to be declared as a method parameter (this was done to
> allow custom annotations on the type of `this`). The example from above becomes:
>
>     record Person(String firstName, String lastName, List children)
>             implements Comparable<Person> {
>
>         private static final Comparator<Person> comparator = // ...
>
>         String fullName() -> firstName + lastName;
>
>         // Here an explicit `this` parameter is declared, so it is passed on to
>         // the delegated method
>         int compareTo(Person this, Person that) = comparator::compare;
>
>         // Here no explicit `this` parameter is declared, so it is not passed
>         int numberOfChildren() = children::size;
>     }
>
> --
> Victor Nazarov

From forax at univ-mlv.fr Mon Mar 18 15:14:06 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Mon, 18 Mar 2019 16:14:06 +0100 (CET)
Subject: Concise method bodies with delegation and this
In-Reply-To:
References:
Message-ID: <1778512544.198209.1552922046486.JavaMail.zimbra@u-pem.fr>

> From: "Brian Goetz"
> To: "amber-spec-experts"
> Sent: Monday, 18 March 2019 15:19:31
> Subject: Fwd: Concise method bodies with delegation and this

> Received on the -comments list.

> So, the CMB proposal was a bit of an experiment in letting a half-baked idea out
> of the lab a little sooner than we usually do. Since it (for expected reasons)
> raised some concerns both on the feature itself, and of whether this is the
> right feature to be working on now, I'm letting it lie fallow for a while. I
> think "almost got a consensus, with only one remaining corner" is a little bit
> of an optimistic description of the status.

> You've noticed that I sometimes use the CMB syntax in examples; this is both out
> of convenience (it is a convenient notation) and out of marketing (getting
> people used to the idea). I expect the marketing process will take some time,
> but that's fine, we're not going anywhere.

There are two syntaxes, the arrow syntax and the colon-colon syntax. I think there is consensus that the arrow syntax (the one Brian uses) is nice and convenient.

The colon-colon syntax is more controversial. The issues are:
- the method reference and the concise method both use the same colon-colon operator, but the semantics are different,
- using '=' to set a concise method makes the syntax easy to confuse with a field initialization (something which is ok in Scala because the Scala syntax also blurs the distinction between a method call and a field access; it's less ok in Java IMO).

> You are right that the really interesting part of the feature is the treatment
> of "this" when the descriptors don't already match exactly, and that there's
> subtle power in this trick.
> And you are right that we are building on an
> approach that we already started with in lambda, though there are more cases
> here than lambda handles.

> The trick of building on top of explicit `this` is definitely one of the tools
> we have in the toolbox. (While you are correct that the proximate cause for
> adding it was receiver annotations, we wouldn't have been so quick to do this
> if we didn't see other uses for the same thing in the future, such as
> using it to denote type constraints on conditional methods (e.g., `Foo<T
> extends Comparable<T>> this`).)

I really dislike the notation:

    void foo(Foo<T extends Comparable<T>> this)

because it moves something that should be in the declaration part of the method (when T extends Comparable<T>) to the parameter part of the method, mixing the declaration part and the use part, making the code hard to comprehend.

And how does it work with generic methods (if you want to restrict one type parameter of the method)?

Rémi

>> Begin forwarded message:

>> From: Victor Nazarov <asviraspossible at gmail.com>
>> Subject: Concise method bodies with delegation and this
>> Date: March 12, 2019 at 12:26:07 PM EDT
>> To: amber-spec-comments at openjdk.java.net

>> There was not much discussion about concise method bodies recently.
>> But record discussions usually touch on concise methods or silently use
>> concise method declarations in example code.

>> As I understand it, concise methods almost got a consensus about basic
>> syntax, and the only remaining corner is the details of delegation
>> and passing the `this` argument to the delegated method.

>> What I want to propose is to leave the choice of whether to pass `this` as an
>> argument to the user, with the help of additional syntax that is already
>> present in Java.
>> To recall, the basic syntax uses an arrow like lambda expressions and allows
>> defining methods without curly braces and the return keyword:

>>     record Person(String firstName, String lastName) {
>>         String fullName() -> firstName + lastName;
>>     }

>> One of the motivating examples of concise methods is an implementation of
>> the Comparable interface with a Comparator:

>>     record Person(String firstName, String lastName)
>>             implements Comparable {
>>         private static final Comparator comparator =
>>             Comparator.comparing(Person::lastName)
>>                 .thenComparing(Person::firstName);
>>         String fullName() -> firstName + lastName;
>>         int compareTo(Person that) -> comparator.compare(this, that);
>>     }

>> The delegation form allows declaring a method without the need to explicitly
>> write the method call. The `compareTo` implementation becomes:

>>     record Person(String firstName, String lastName)
>>             implements Comparable {
>>         private static final Comparator comparator = // ...
>>         String fullName() -> firstName + lastName;
>>         int compareTo(Person that) = comparator::compare;
>>     }

>> The problem with the delegation form is that sometimes it is desirable to pass
>> `this` to the delegated method, but sometimes it's not:

>>     record Person(String firstName, String lastName, List children)
>>             implements Comparable {
>>         private static final Comparator comparator = // ...
>>         String fullName() -> firstName + lastName;
>>         int compareTo(Person that) = comparator::compare;
>>         int numberOfChildren() = children::size;
>>     }

>> We want to pass `this` as a first argument into the comparator::compare call,
>> but we don't want to pass `this` as an argument to the children::size call. The
>> decision when to pass or not to pass `this` can be left to the compiler or, more
>> specifically, to the type-checker. But an implementation requires two branches
>> of type inference to be performed. Furthermore, both branches can
>> potentially succeed, and there is no clear criterion for which branch should be
>> preferred.
>> Additionally, this type inference is inconsistent with the type
>> inference performed for lambda expressions and method references.

>> What I want to propose is to leave the choice whether to pass `this` as an
>> argument or not to the user. What if there is some additional syntax to specify
>> when to pass `this` on delegation or not? But there is already such syntax.
>> Java 8 allows `this` to be declared as a method parameter (it was done to
>> allow custom annotations on the `this` type). The example from above becomes:

>>     record Person(String firstName, String lastName, List children)
>>             implements Comparable {
>>         private static final Comparator comparator = // ...
>>         String fullName() -> firstName + lastName;
>>         // Here an explicit this parameter is declared, so it's passed on to
>>         // the delegated method
>>         int compareTo(Person this, Person that) = comparator::compare;
>>         // Here no explicit this parameter is declared, so it's not passed
>>         int numberOfChildren() = children::size;
>>     }

>> --
>> Victor Nazarov

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From brian.goetz at oracle.com Mon Mar 18 15:19:19 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Mar 2019 11:19:19 -0400
Subject: Concise method bodies with delegation and this
In-Reply-To: <1778512544.198209.1552922046486.JavaMail.zimbra@u-pem.fr>
References: <1778512544.198209.1552922046486.JavaMail.zimbra@u-pem.fr>
Message-ID:

> There are two syntaxes, the arrow syntax and the colon-colon syntax. I think there is consensus that the arrow syntax (the one Brian uses) is nice and convenient.

I don't really even think there's consensus there; it is nice and convenient and mostly unobjectionable, but some felt "meh, what's the point", as it merely eliminates a few characters of typing. The implementation-by-delegation sub-feature is far more substantial; it allows you to implement a class by wiring its declarations to existing reusable behaviors.
This has far more potential benefit, but also more cost.

> - using '=' to set a concise method makes the syntax easy to confuse with a field initialization (something which is OK in Scala, because the Scala syntax also blurs the distinction between a method call and a field access; it's less OK in Java, IMO).

Please, can we not harp on notation unnecessarily?

> I really dislike the notation

Please, can we not harp on notation unnecessarily?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From kevinb at google.com Mon Mar 18 21:53:11 2019
From: kevinb at google.com (Kevin Bourrillion)
Date: Mon, 18 Mar 2019 14:53:11 -0700
Subject: Records and annotations (was: Updated document on data classes and sealed types)
In-Reply-To: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
Message-ID:

On Sat, Mar 9, 2019 at 4:47 AM Brian Goetz wrote:

> This came up before, but we didn't reach a conclusion.
>
> A record component is more than just the lower-level members (fields,
> accessors, ctor params) it gets desugared to. So it seems reasonable that
> it be considered an annotatable program element, and that reflection expose
> directly the annotations on record components (separately from any
> annotations on the class members that may or may not derive from desugaring
> of records.)
>
> But, that still leaves the question of whether the desugaring should, or
> should not be, transparent to annotations. My sense is that pushing
> annotations down to fields, ctor params, and accessors _seems_ friendly,
> but also opens a number of uncomfortable questions.
>
> - Should we treat the cases where @A has a target of RECORD_COMPONENT,
> separately from the cases where it does not, such as, only push the
> annotation down to members when the target does not include
> RECORD_COMPONENT? That is, is the desire to push down annotations based on
> "well, what if we want to apply a 'legacy' annotation?"
> If so, this causes
> a migration compatibility issue; if someone adds RC to the targets list for
> @A, then when the record is recompiled, the location of the annotations
> will change, possibly changing the behavior of frameworks that encounter
> the record.

No, we would certainly not require @RC to also be present. If I have released a method annotation, it is the current reality that *any* method can use it, including ones I meant it to be applicable to and ones I didn't. I would expect the methods that appear on records to be no exception.

> - What if @A has a target set of { field, parameter }, but for some reason
> the user does _not_ want the annotation pushed down? Tough luck?
> Redeclare the member without the annotation?

Sure, that sounds fine.

> - If the user explicitly redeclares the member (ctor, accessor), what
> happens? Do we still implicitly push down annotations from record
> components to the explicit member? Will this be confusing when the source
> says "@B int x() -> x", but reflection yields both @A and @B as annotations
> on x()?

Saying that redeclaring makes you responsible for the annotation choices (i.e., none copied) makes perfect sense to me!

> All of which causes me to back up and say: what is the motivation for
> pushing these down to implicit members, other than "general friendliness"?
> Is this a migration strategy for migrating existing code to use records,
> without having to redeclare annotations on the members? And if so, how
> useful is it really? Will users want to throw the union of
> field/accessor/ctor parameter annotations on the record components just to
> gain compatibility with their existing code?

Say project A has released jars containing method annotations and we're using those annotations on our methods. Under your proposal we are prevented from converting to records. We have to beg project A to upgrade to Java 1X.
And in fact, since they will likely not want to suddenly pull the rug out from all their users on earlier Java versions, what I really need to beg them to do is adopt JEP 238 multi-release jars, which probably also means adopting a preprocessor in their build so they can generate the different versions (as this is a pretty bad case for branching). This can result in a long period where records are unadoptable by the end user.

Btw, this precise issue will not be hypothetical for us. :-)

> My gut sense is that the stable solution is to make record component a new
> kind of target, and encourage frameworks to learn about these, rather than
> trying to fake out frameworks by emulating legacy behavior.
>
>
> On Mar 8, 2019, at 8:43 PM, Kevin Bourrillion wrote:
>
> Re: annotations,
>
> Doc says, "Record components constitute a new place to put annotations;
> we'll likely want to extend the @Target meta-annotation to reflect this."
>
> I'm sure we discussed this before, but I also expect to be able to put any
> METHOD-, FIELD- or PARAMETER-targeted annotation on a record component, and
> have that annotation appear to be present on the synthesized
> accessor/field/constructor-parameter. Is that sensible?
>
> (As for records themselves, I expect they are targeted with TYPE just as
> enums/interfaces/"plain old classes" (jeesh, is there any term that means
> the latter?).)
>
>
> On Fri, Mar 1, 2019 at 12:16 PM Brian Goetz
> wrote:
>
>> I've updated the document on data classes here:
>>
>> http://cr.openjdk.java.net/~briangoetz/amber/datum.html
>>
>> (older versions of the document are retained in the same directory for
>> historical comparison.)
>>
>> While the previous version was mostly about tradeoffs, this version
>> takes a much more opinionated interpretation of the feature, offering
>> more examples of use cases of where it is intended to be used (and not
>> used).
Many of the "under consideration" flexibilities (extension, >> mutability, additional fields) have collapsed to their more restrictive >> form; while some people will be disappointed because it doesn't solve >> the worst of their boilerplate problems, our conclusion is: records are >> a powerful feature, but they're not necessarily the delivery vehicle for >> easing all the (often self-inflicted) pain of JavaBeans. We can >> continue to explore relief for these situations too as separate >> features, but trying to be all things to all classes has delayed the >> records train long enough, and I'm convince they're separate problems >> that want separate solutions. Time to let the records train roll. >> >> I've also combined the information on sealed types in this document, as >> the two are so tightly related. >> >> Comments welcome. >> > > > -- > Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com > > > -- Kevin Bourrillion | Java Librarian | Google, Inc. | kevinb at google.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From forax at univ-mlv.fr Mon Mar 18 22:28:43 2019 From: forax at univ-mlv.fr (forax at univ-mlv.fr) Date: Mon, 18 Mar 2019 23:28:43 +0100 (CET) Subject: Concise method bodies with delegation and this In-Reply-To: References: <1778512544.198209.1552922046486.JavaMail.zimbra@u-pem.fr> Message-ID: <1911049471.275247.1552948123205.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Remi Forax" > Cc: "amber-spec-experts" > Envoy?: Lundi 18 Mars 2019 16:19:19 > Objet: Re: Concise method bodies with delegation and this >> There are two syntaxes, the arrow syntax and the colon-colon syntax, i think >> there is consensus that the arrow syntax (the one Brian uses) is nice and >> convenient. > I don?t really even think there?s consensus there; it is nice and convenient and > mostly unobjectionable, but some felt ?meh, what?s the point?, as it merely > eliminates a few characters of typing. 
Especially if you still have a big javadoc comment, as Kevin said.

I think this feature shines when you override a method, because the semantics is already defined (usually you don't need any javadoc), the poster child being implementing Comparable, as Victor said.

And BTW, I still think we should come up with a shorter syntax for overridden methods by allowing parameter types to be left undeclared, as with a lambda:

  class Person implements Comparable {
    private final int id;
    // more fields
    override equals(o) -> (o instanceof Person p)? p.id == id : false;
    override hashCode() -> id;
    override compareTo(p) -> Integer.compare(id, p.id);
  }

> The implementation-by-delegation sub feature is far more substantial; it allows
> you to implement a class by wiring its declarations to existing reusable
> behaviors. This has far more potential benefit, but also more cost.

Yes, but I believe part of the cost is because the currently proposed syntax reuses the syntactic operators = and :: with slightly different semantics.

>> - using '=' to set a concise method make the syntax easy to confound with a
>> field initialization (something which is ok in Scala because the Scala syntax
>> also blur that distinction between a method call and a field access, it's less
>> ok in Java IMO).

> Please, can we not harp on notation unnecessarily?

I disagree here, because I think that the proposed syntax is part of the issue.

>> I really dislike the notation

> Please, can we not harp on notation unnecessarily?

Fair point for this one.

Rémi

-------------- next part --------------
An HTML attachment was scrubbed...
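For reference, the Comparable case mentioned above can be written without any concise-method syntax. The following is a minimal sketch, assuming a JDK in which records have shipped (Java 16 or later), which postdates this thread; the demo class PersonDemo and the sample names are hypothetical. It spells out by hand the delegation, including the passing of `this`, that the proposed `=` / `::` form would leave implicit.

```java
import java.util.Comparator;

// Hypothetical demo: plain method bodies only, none of the proposed
// "->" or "=" concise forms discussed in this thread.
record Person(String firstName, String lastName) implements Comparable<Person> {
    // Static fields are allowed in records; the comparator is shared.
    private static final Comparator<Person> COMPARATOR =
        Comparator.comparing(Person::lastName)
                  .thenComparing(Person::firstName);

    String fullName() {
        return firstName + lastName;
    }

    // Explicit delegation: `this` is passed by hand as the first argument.
    @Override
    public int compareTo(Person that) {
        return COMPARATOR.compare(this, that);
    }
}

class PersonDemo {
    public static void main(String[] args) {
        Person a = new Person("Ada", "Lovelace");
        Person b = new Person("Alan", "Turing");
        // Ordering is by last name first, then by first name.
        System.out.println(a.compareTo(b) < 0);
        System.out.println(a.fullName());
    }
}
```

Here the body of compareTo is the only place where the choice to pass `this` is visible, which is the decision the delegation proposal has to either infer or let the user state.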
URL:
From brian.goetz at oracle.com Mon Mar 18 23:25:58 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Mon, 18 Mar 2019 19:25:58 -0400
Subject: Records and annotations
In-Reply-To:
References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
Message-ID: <56f5101f-89b8-a3b5-fd69-fed693076554@oracle.com>

>
>     - Should we treat the cases where @A has a target of
>     RECORD_COMPONENT, separately from the cases where it does not,
>     such as, only push the annotation down to members when the target
>     does not include RECORD_COMPONENT? That is, is the desire to push
>     down annotations based on "well, what if we want to apply a
>     'legacy' annotation?" If so, this causes a migration compatibility
>     issue; if someone adds RC to the targets list for @A, then when
>     the record is recompiled, the location of the annotations will
>     changed, possibly changing the behavior of frameworks that
>     encounter the record.
>
>
> No, we would certainly not require @RC to also be present. If I have
> released a method annotation it is the current reality that *any*
> method can use it, including ones I meant it to be applicable to and
> ones I didn't. I would expect the methods that appear on records to be
> no exception.

I'm not sure we're talking about the same thing now? If I have a method annotation @MethodsOnly, and I declare

    record Foo(@MethodsOnly int a);

what am I annotating? It feels a little ad-hoc to say that my intention is to annotate only the desugared accessor, because that's the method that is tied to `a`. But assuming that is what you're talking about, is this really because we want to annotate record components with annotations that don't target record components, or is it because we're worried that it will take some time for frameworks to add Target.R_C to their annos?

>     All of which causes me to back up and say: what is the motivation
>     for pushing these down to implicit members, other than "general
>     friendliness"?
>     Is this a migration strategy for migrating
>     existing code to use records, without having to redeclare
>     annotations on the members? And if so, how useful is it really?
>     Will users want to throw the union of field/accessor/ctor
>     parameter annotations on the record components just to gain
>     compatibility with their existing code?
>
>
> Say project A has released jars containing method annotations and
> we're using those annotations on our methods. Under your proposal we
> are prevented from converting to records. We have to beg project A to
> upgrade to Java 1X. And in fact since they will likely not want to
> suddenly pull the rug out from all their users on earlier Java
> versions, what I really need to beg them to do is adopt jep238
> multirelease jars, which probably also means adopting a preprocessor
> in their build so they can generate the different versions (as this is
> a pretty bad case for branching).

Thought experiment: if you could wave a magic wand and retroactively re-target all the world's annotations to include R_C, and all the frameworks to support it, would you still want this? In other words, is this a feature that you want in its own right, or just as a migration aid? (It has to be the latter, right?)

-------------- next part --------------
An HTML attachment was scrubbed...
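One way to see where a method-only annotation such as the @MethodsOnly above ends up is to ask reflection. The following is a minimal sketch, assuming a JDK in which records have shipped (Java 16 or later), which postdates this thread; the names MethodsOnly, Foo, and RecordAnnotationDemo are hypothetical, and what it reports is the behavior of the shipped feature, not a position taken in this discussion.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation whose only target is METHOD.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface MethodsOnly {}

// Hypothetical record; the component itself carries the annotation.
record Foo(@MethodsOnly int a) {}

class RecordAnnotationDemo {
    // True if the implicitly declared accessor a() carries @MethodsOnly.
    static boolean onAccessor() {
        try {
            return Foo.class.getDeclaredMethod("a")
                            .isAnnotationPresent(MethodsOnly.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if the implicitly declared private field a carries @MethodsOnly.
    static boolean onField() {
        try {
            return Foo.class.getDeclaredField("a")
                            .isAnnotationPresent(MethodsOnly.class);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("accessor: " + onAccessor());
        System.out.println("field:    " + onField());
    }
}
```

On such a JDK the annotation is reported on the implicit accessor and not on the field, because METHOD is the only context in which it is applicable.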
URL: From forax at univ-mlv.fr Mon Mar 18 23:46:04 2019 From: forax at univ-mlv.fr (Remi Forax) Date: Tue, 19 Mar 2019 00:46:04 +0100 (CET) Subject: Records and annotations In-Reply-To: <56f5101f-89b8-a3b5-fd69-fed693076554@oracle.com> References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com> <56f5101f-89b8-a3b5-fd69-fed693076554@oracle.com> Message-ID: <1173221422.278074.1552952764026.JavaMail.zimbra@u-pem.fr> > De: "Brian Goetz" > ?: "Kevin Bourrillion" > Cc: "amber-spec-experts" > Envoy?: Mardi 19 Mars 2019 00:25:58 > Objet: Re: Records and annotations >>> - Should we treat the cases where @A has a target of RECORD_COMPONENT, >>> separately from the cases where it does not, such as, only push the annotation >>> down to members when the target does not include RECORD_COMPONENT? That is, is >>> the desire to push down annotations based on ?well, what if we want to apply a >>> ?legacy? annotation? If so, this causes a migration compatibility issue; if >>> someone adds RC to the targets list for @A, then when the record is recompiled, >>> the location of the annotations will changed, possibly changing the behavior of >>> frameworks that encounter the record. >> No, we would certainly not require @RC to also be present. If I have released a >> method annotation it is the current reality that *any* method can use it, >> including ones I meant it to be applicable to and ones I didn't. I would expect >> the methods that appear on records to be no exception. > I'm not sure we're talking about the same thing now? If I have a method > annotation @MethodsOnly, and I declare > record Foo(@MethodsOnly int a); > what am I annotating? It feels a little ad-hoc to say that my intention is to > annotate only the desugared accessor, because that's the method that is tied to > `a`. 
But assuming that is what you're talking about, is this really because we > want to annotate record components with annotations that don't target record > components, or is it because we're worried that it will take some time for > frameworks to add Target.R_C to their annos? >>> All of which causes me to back up and say: what is the motivation for pushing >>> these down to implicit members, other than ?general friendliness?? Is this a >>> migration strategy for migrating existing code to use records, without having >>> to redeclare annotations on the members? And if so, how useful is it really? >>> Will users want to throw the union of field/accessor/ctor parameter annotations >>> on the record components just to gain compatibility with their existing code? >> Say project A has released jars containing method annotations and we're using >> those annotations on our methods. Under your proposal we are prevented from >> converting to records. We have to beg project A to upgrade to Java 1X. And in >> fact since they will likely not want to suddenly pull the rug out from all >> their users on earlier Java versions, what I really need to beg them to do is >> adopt jep238 multirelease jars, which probably also means adopting a >> preprocessor in their build so they can generate the different versions (as >> this is a pretty bad case for branching). > Thought experiment: if you could wave a magic wand and retroactively re-target > all the worlds annotations to include R_C, and all the frameworks to support > it, would you still want this? In other words, is this a feature that you want > in its own right, or just as a migration aid, right? (It has to be the latter, > right?) I have a question, how javac 8 works when it sees a MODULE as annotation target ? in my opinion, it should ignore it, no ? R?mi -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From john.r.rose at oracle.com Tue Mar 19 04:46:13 2019
From: john.r.rose at oracle.com (John Rose)
Date: Mon, 18 Mar 2019 21:46:13 -0700
Subject: Records and annotations (was: Updated document on data classes and sealed types)
In-Reply-To: <012CB8B0-9EC4-4B1D-929F-E0D69FB3F579@oracle.com>
References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com> <81D633A4-B5F9-48BF-A883-AFFEC83E46EE@oracle.com> <012CB8B0-9EC4-4B1D-929F-E0D69FB3F579@oracle.com>
Message-ID:

The benefit of default.something is that it can be composed with ad hoc code, to add assertions, short-circuits, etc. An empty body notation means you cannot cooperate meaningfully with the default mechanism.

On Mar 9, 2019, at 6:08 AM, Brian Goetz wrote:
>
> An alternate is to allow the body to be left off entirely:
>
> record R(int x, int y) {
>     @MyAnnotation
>     public boolean equals(Object o);
> }

From brian.goetz at oracle.com Tue Mar 19 18:42:11 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Tue, 19 Mar 2019 14:42:11 -0400
Subject: Records and annotations (was: Updated document on data classes and sealed types)
In-Reply-To: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
References: <3F1E3D1E-6132-4DC1-90CE-2A9D3C0FA9A9@oracle.com>
Message-ID: <289B72C8-1DA4-4DCA-B32F-4A28B01407F8@oracle.com>

> But, that still leaves the question of whether the desugaring should, or should not be, transparent to annotations. My sense is that pushing annotations down to fields, ctor params, and accessors _seems_ friendly, but also opens a number of uncomfortable questions.

Also, we've been treating these as if they are the same case. In

    record R(@Foo int a);

I think the case for "pushing down" @Foo to the _field_ for a has a somewhat stronger argument than pushing it down to the _accessor method_ for a. (Not only does the record component look like a field declaration, but record declarations are about "let me declare the state, and then you can derive the API from that state."
So pushing the annotations down to the actual state declaration seems sensible, but pushing annotations from there to the corresponding API elements seems more questionable.)

(Thought experiment: if we had some sort of "auto accessor" mechanism:

    class Foo {
        @Foo with-getter with-setter int a;
    }

Would we similarly be talking about pushing the @Foo annotation down to the synthetic methods, or would we say "Well, obviously @Foo is an annotation on the declaration of the field." It's not a slam-dunk either way, but the latter seems more likely to me.)

Which is to say: there are possibly-credible middle grounds between "push it everywhere we could conceivably do so" and "it's an annotation on the record component, full stop."

From leonid.arbouzov at oracle.com Wed Mar 20 00:24:12 2019
From: leonid.arbouzov at oracle.com (leonid.arbouzov at oracle.com)
Date: Tue, 19 Mar 2019 17:24:12 -0700
Subject: Switch expressions spec
In-Reply-To: <5C8C08F6.3040304@oracle.com>
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> <5C8C08F6.3040304@oracle.com>
Message-ID: <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com>

> I hope that Leonid Arbouzov (cc'd) can tell us what conformance tests exist in this area.

There are tests on switch expressions with cases that throw an exception, cause division by zero, index out of range, etc.
These are all positive tests, i.e., they compile fine.

Thanks,
Leonid

On 3/15/19 1:20 PM, Alex Buckley wrote:
> OK, we intend at least one result expression to be required, so the
> spec is correct as is.
>
> (I should have been clearer that my belief was about the intent of the
> spec, rather than about how I personally think completion should occur.)
>
> Manoj didn't say what javac build he is testing with, but this is a
> substantial discrepancy between compiler and spec. I hope that Leonid
> Arbouzov (cc'd) can tell us what conformance tests exist in this area.
> > Alex > > On 3/15/2019 12:09 PM, Brian Goetz wrote: >> At the same time, we also reaffirmed our choice to _not_ allow throw >> from one half of a conditional: >> >> int x = foo ? 3 : throw new FooException() >> >> But John has this right ? the high order bit is that every expression >> should have a defined normal completion, and a type, even if >> computing sub-expressions (or in this case, sub-statements) might >> throw. And without at least one arm yielding a value, it would be >> impossible to infer the type of the expression. >> >>> On Mar 15, 2019, at 3:01 PM, John Rose wrote: >>> >>> On Mar 15, 2019, at 11:39 AM, Alex Buckley >>> wrote: >>>> >>>> In a switch expression, I believe it should be legal for every >>>> `case`/`default` arm to complete abruptly _for a reason other than >>>> a break with value_. >>> >>> My reading of Gavin's draft is that he is doing something very >>> subtle there, which is to retain an existing feature in the language >>> that an expression always has a defined normal completion. >>> >>> We also don't have expressions of the form "throw e". Allowing >>> a switch expression to complete without a value on *every* arm >>> raises the same question as "throw e" as an expression. How do >>> you type "f(throw e)"? If you can answer that, then you can also >>> have switch expressions that refuse to break with any values. >>> >>> BTW, if an expression has a defined normal completion, it also >>> has a possible type. By possible type I mean at least one correct >>> typing (poly-expressions can have many). So one obvious >>> result of Gavin's draft is that you derive possible types from >>> the arms of the switch expression that break with values. >>> >>> But the root requirement, I think, is to preserve the possible >>> normal normal of every expression. >>> >>> "What about some form of 1/0?" That's a good question. >>> What about it? It completes normally with a type of int. >>> Dynamically, the normal completion is never taken. 
>>> Gavin might call that a "notional normal completion" >>> (I like that word) provided to uphold the general principle >>> even where static analysis proves that the Turing machine >>> fails to return normally. >>> >>> ? John >> From alex.buckley at oracle.com Wed Mar 20 00:28:17 2019 From: alex.buckley at oracle.com (Alex Buckley) Date: Tue, 19 Mar 2019 17:28:17 -0700 Subject: Switch expressions spec In-Reply-To: <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com> References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> <5C8C08F6.3040304@oracle.com> <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com> Message-ID: <5C918921.9040703@oracle.com> Hi Leonid, So there are no negative tests that check what happens if a switch expression has no result expressions? Alex On 3/19/2019 5:24 PM, leonid.arbouzov at oracle.com wrote: > There are tests on switch expression with cases that throwing exception, > causing division by zero, index out of range, etc. > These are all positive tests i.e. compile fine. > > Thanks, > Leonid > > > On 3/15/19 1:20 PM, Alex Buckley wrote: >> OK, we intend at least one result expression to be required, so the >> spec is correct as is. >> >> (I should have been clearer that my belief was about the intent of the >> spec, rather than about how I personally think completion should occur.) >> >> Manoj didn't say what javac build he is testing with, but this is a >> substantial discrepancy between compiler and spec. I hope that Leonid >> Arbouzov (cc'd) can tell us what conformance tests exist in this area. >> >> Alex >> >> On 3/15/2019 12:09 PM, Brian Goetz wrote: >>> At the same time, we also reaffirmed our choice to _not_ allow throw >>> from one half of a conditional: >>> >>> int x = foo ? 3 : throw new FooException() >>> >>> But John has this right ? 
the high order bit is that every expression >>> should have a defined normal completion, and a type, even if >>> computing sub-expressions (or in this case, sub-statements) might >>> throw. And without at least one arm yielding a value, it would be >>> impossible to infer the type of the expression. >>> >>>> On Mar 15, 2019, at 3:01 PM, John Rose wrote: >>>> >>>> On Mar 15, 2019, at 11:39 AM, Alex Buckley >>>> wrote: >>>>> >>>>> In a switch expression, I believe it should be legal for every >>>>> `case`/`default` arm to complete abruptly _for a reason other than >>>>> a break with value_. >>>> >>>> My reading of Gavin's draft is that he is doing something very >>>> subtle there, which is to retain an existing feature in the language >>>> that an expression always has a defined normal completion. >>>> >>>> We also don't have expressions of the form "throw e". Allowing >>>> a switch expression to complete without a value on *every* arm >>>> raises the same question as "throw e" as an expression. How do >>>> you type "f(throw e)"? If you can answer that, then you can also >>>> have switch expressions that refuse to break with any values. >>>> >>>> BTW, if an expression has a defined normal completion, it also >>>> has a possible type. By possible type I mean at least one correct >>>> typing (poly-expressions can have many). So one obvious >>>> result of Gavin's draft is that you derive possible types from >>>> the arms of the switch expression that break with values. >>>> >>>> But the root requirement, I think, is to preserve the possible >>>> normal normal of every expression. >>>> >>>> "What about some form of 1/0?" That's a good question. >>>> What about it? It completes normally with a type of int. >>>> Dynamically, the normal completion is never taken. 
>>>> Gavin might call that a "notional normal completion" >>>> (I like that word) provided to uphold the general principle >>>> even where static analysis proves that the Turing machine >>>> fails to return normally. >>>> >>>> ? John >>> > From leonid.arbouzov at oracle.com Thu Mar 21 01:05:12 2019 From: leonid.arbouzov at oracle.com (Leonid Arbuzov) Date: Wed, 20 Mar 2019 18:05:12 -0700 Subject: Switch expressions spec In-Reply-To: <5C918921.9040703@oracle.com> References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> <5C8C08F6.3040304@oracle.com> <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com> <5C918921.9040703@oracle.com> Message-ID: Hi Alex, There are negative tests with missed result expressions: int a = switch (selectorExpr) { case 0 -> 1; case 1 -> 1; default -> ; }; int a = switch (selectorExpr) { case 0 -> { break 1; } case 1 -> { break 1; } default -> { fun(); } }; If you meant no result expressions at all then I couldn't find such test yet. It can be added in JCK13. Thanks, Leonid On 3/19/2019 5:28 PM, Alex Buckley wrote: > Hi Leonid, > > So there are no negative tests that check what happens if a switch > expression has no result expressions? > > Alex > > On 3/19/2019 5:24 PM, leonid.arbouzov at oracle.com wrote: >> There are tests on switch expression with cases that throwing exception, >> causing division by zero, index out of range, etc. >> These are all positive tests i.e. compile fine. >> >> Thanks, >> Leonid >> >> >> On 3/15/19 1:20 PM, Alex Buckley wrote: >>> OK, we intend at least one result expression to be required, so the >>> spec is correct as is. >>> >>> (I should have been clearer that my belief was about the intent of the >>> spec, rather than about how I personally think completion should >>> occur.) >>> >>> Manoj didn't say what javac build he is testing with, but this is a >>> substantial discrepancy between compiler and spec. 
I hope that Leonid >>> Arbouzov (cc'd) can tell us what conformance tests exist in this area. >>> >>> Alex >>> >>> On 3/15/2019 12:09 PM, Brian Goetz wrote: >>>> At the same time, we also reaffirmed our choice to _not_ allow throw >>>> from one half of a conditional: >>>> >>>> ???? int x = foo ? 3 : throw new FooException() >>>> >>>> But John has this right ? the high order bit is that every expression >>>> should have a defined normal completion, and a type, even if >>>> computing sub-expressions (or in this case, sub-statements) might >>>> throw.? And without at least one arm yielding a value, it would be >>>> impossible to infer the type of the expression. >>>> >>>>> On Mar 15, 2019, at 3:01 PM, John Rose >>>>> wrote: >>>>> >>>>> On Mar 15, 2019, at 11:39 AM, Alex Buckley >>>>> wrote: >>>>>> >>>>>> In a switch expression, I believe it should be legal for every >>>>>> `case`/`default` arm to complete abruptly _for a reason other than >>>>>> a break with value_. >>>>> >>>>> My reading of Gavin's draft is that he is doing something very >>>>> subtle there, which is to retain an existing feature in the language >>>>> that an expression always has a defined normal completion. >>>>> >>>>> We also don't have expressions of the form "throw e". Allowing >>>>> a switch expression to complete without a value on *every* arm >>>>> raises the same question as "throw e" as an expression. How do >>>>> you type "f(throw e)"?? If you can answer that, then you can also >>>>> have switch expressions that refuse to break with any values. >>>>> >>>>> BTW, if an expression has a defined normal completion, it also >>>>> has a possible type.? By possible type I mean at least one correct >>>>> typing (poly-expressions can have many).? So one obvious >>>>> result of Gavin's draft is that you derive possible types from >>>>> the arms of the switch expression that break with values. 
>>>>>
>>>>> But the root requirement, I think, is to preserve the possible
>>>>> normal completion of every expression.
>>>>>
>>>>> "What about some form of 1/0?"  That's a good question.
>>>>> What about it?  It completes normally with a type of int.
>>>>> Dynamically, the normal completion is never taken.
>>>>> Gavin might call that a "notional normal completion"
>>>>> (I like that word) provided to uphold the general principle
>>>>> even where static analysis proves that the Turing machine
>>>>> fails to return normally.
>>>>>
>>>>> -- John
>>>>
>>

From forax at univ-mlv.fr Thu Mar 21 13:47:32 2019
From: forax at univ-mlv.fr (Remi Forax)
Date: Thu, 21 Mar 2019 14:47:32 +0100 (CET)
Subject: String reboot (plain text)
In-Reply-To: <58E6523D-8951-4927-85A7-0BAB30234EC3@oracle.com>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <58E6523D-8951-4927-85A7-0BAB30234EC3@oracle.com>
Message-ID: <1565954477.1595365.1553176052877.JavaMail.zimbra@u-pem.fr>

I really like in the syntax proposed by Jim the fact that the single quote " is retconned to allow several lines,
it seems the easiest thing to do if we just want to introduce a multi-line string literal.

From that, I agree that the more lines you have, the more you need a way to define raw strings, because they are far easier to read. I'm fine with """ meaning raw string; like ", """ will allow both single-line and multi-line strings.

I disagree with Brian that we should try to have an intelligent algorithm to remove the blank spaces, because I see several possible intelligent algorithms, so I think it's better to keep things simple (another interesting question: should this intelligent algorithm be applied only to the escapable strings, or to the raw strings too?).
Obviously, it means users will be frustrated not to have an intelligent algorithm to remove the blank spaces, but I think the code will be more readable using a lazy static final, for example

  private lazy static final String aText = """
    + This is the first sentence of the comment
    + this is the second sentence
    """.alignUsingCharacter("+");

(Note that we don't need the compiler to fold the method call here; it can be done by the condy BSM (implementing the lazy static final), and when we revisit the BSM protocol, we may add a way to read a constant without storing it in the runtime constant pool representation, so the intermediary string doesn't have to be stored at runtime.)

Rémi

----- Original Message -----
> From: "John Rose"
> To: "Brian Goetz" , "Jim Laskey"
> Cc: "amber-spec-experts"
> Sent: Saturday, 16 March 2019 01:54:30
> Subject: Re: String reboot (plain text)

> OK, I responded to one corner by pointing out a principle that tends to
> align rawness more strongly with multi-line-ness. I guess I should lay
> all my cards on the table FTR, and will do so by responding to Brian's
> restacking Email and Jim's reboot Email. (I guess today's String-day.)
>
> TL;DR: I agree substantially with Jim's analysis and Brian's staging,
> especially the earlier and simpler parts.
>
> Our order #1 should keep classic escapes, instead of eliminating them (raw)
> or strengthening them (strong escapes, like strong delimiters). Later orders
> should have a place for such things (raw and/or strong escapes/quotes).
>
> (Side note: The term "escape" always makes me think of a two character
> sequence, the first of which is probably reverse solidus, like "\x".
> I'd like to use a neutral term like "interruptor" coupled with "quote" to refer
> to the more general feature of "a visible notation which interrupts a string
> rather than terminates it like a quote does". And now I realize that Jim's
> term "delimiter" does the same thing for "quote".
So I'll try to tilt toward > "delimiter" and "interruptor" instead of "quote" and "escape".) > > Classic escapes and single quotes are both too tiny to see well inside > multi-line > strings, but they are also familiar and people will get used to "squinting" for > them, > at least the escapes. Our take is that we'd all rather "squint" (in the first > order) > instead of add complexity to the first feature. > > I'm fine with a two- or three-order stacking, as long as there is a credible > story for the final course of the meal, if we are still hungry, which includes > strong delimiters and (some sort of) strong escapes that are (a) not easy to > collide with and (b) not hard to "squint" for. IMO strong delimiters will > often > be associated somehow with strong interruptors. In fact (see digression > below in context) I think rawness is maybe not exactly the right concept; > the concept of "escape strength" may be more fruitful for us. > >> On Mar 13, 2019, at 10:52 AM, Brian Goetz wrote: >> >> Lots of good discussion so far. Let me gather the threads. >> >> - The primary use case is embedding multi-line chunks of foreign code or data in >> Java, with minimal need to cruft it up with escaping. This says to me that >> _multi-line strings_ are actually the high-order bit here, and raw strings are >> the next bit. Let?s address these in order. > > +1 > >> - Multi-line-ness and raw-ness are orthogonal concepts. Some languages merge >> them, and we might consider doing that too, but we shouldn?t start there. > > +0.6 > > (As I implied previously, a number less than one is more representative of > orthogonality, sine-of-the-angle-between, of the two features. > > But also, I'm fine with not starting with raw-ness, as long as it's on the > menu somewhere. 
> >> - For multi-line strings, a stronger delimiter (e.g., """) seems to be preferred >> on readability grounds, because people don't want to have to squint to see >> where the embedded code ends and the Java code resumes. > > Yes. The same point applies to escapes ("string interruptors", not "string > delimiters"), > but since escapes are clearly less common than string boundaries, I'm content to > just note the point, and accept a design which requires users to squint for > escapes, > on the grounds that they will be both rare, usually safe to disregard on first > reading. > >> To which I'll add the following observations: >> >> - Most multi-line string candidates (JSON, XML, SQL, etc) do not require >> characters that have to be escaped, as long as we don't have conflicts with the >> quote character. Which suggests further than ML-ness and raw-ness are solving >> separate problems. > > Jim notes this in passing in the "75%" section, but I'll call it out here too: > > "Characters that have to be escaped" also include Java's escape. A JSON > string will have a puzzling problem if it contains a JSON escape sequence that > is processed by Java, rather than by the JSON parser. I don't see how to avoid > this easily in the first course on the menu, but I want to note the design > heuristic that design vectors for delimiters are correlated with interruptors. > > (The problem with JSON escapes is like the problem with regexp escapes. > In both cases we have both Java and the foreign notation competing for > ownership of the reverse solidus. I think a proper notion of strong > interruptors > will allow Java to gracefully give the foreign notation precedence, within > certain of Java's envelopes, just as strong delimiters do so with quotes.) > > If you have to escape foreign delimiters, chances are you'll have to escape > foreign > interruptors. 
Another use of the heuristic: If you found yourself tripling the > quotes > to avoid collisions, there's probably a related use case for strengthening > (tripling???) the escapes, to avoid the same (but rarer) sort of collisions. > > (I'm thinking Python also and JavaScript also, for script fragments, but we > choose > to place scripting lower on the menu, along with quoted-Java-in-Java nesting.) > >> - Once we separate multi-line from raw, the idea of automatically reflowing >> indentation starts to become a sensible option on non-raw, multi-line strings. > > +100 Yes, this is the nugget of gold that we mine out of the decision to defer > rawness. > >> - Repeating delimiters are slightly more powerful than fixed delimiters, but >> also have additional cognitive load, and can still lead to anomalies that are >> easily encountered. > > That said, they pay for themselves as visual cues for multi-line thingies, and > we > immediately put them back into the shopping cart, with length set at three. > This helps us properly size the "cognitive load" argument. Once you learn about > jumbo delimiters, you learn to spot them, and you are paid for the effort > because > you only learned once, but you can spot them quicker every time you look. > > The same point readily applies to replacing "a count of three" with "a count of > three or more", although with sharply diminished returns, since three is almost > always enough. (What about quote counting? Well, programmers shouldn't be > writing puzzlers in their code. So use extra, enough to make it obvious, and > don't > trick your reader with one-off counts unless you are writing a puzzler book. > Or find another solution instead of quote counting to make the quotes look > (a) like the quotes they are, and (b) different enough from competing would-be > quotes.) > > None of these ideas apply to the first course, IMO. 
I'm realizing how apt it is > for Jim to call it an appetizer; it is very thin but tasty, as an appetizer > should be. > And Brian will say, "wait until you see how filling it is!" We certainly want > to avoid > unhealthy gorging? > >> With that said, let's reorder the dishes a bit. >> >> For our first course, we could have multi-line strings, delimited by the fixed >> delimiter """. These would be escaped strings, just like existing string >> literals, but because the single-quote is no longer the delimiter, the most >> common source of escaping (embedded quotes) is removed. Most multi-line >> strings will require no escaping at all. > > +1 (for most definitions of "most") > >> Note that if we stopped here _and never ordered anything else_, we would still >> be in a much better place than we are now (most snippets could just be cut and >> pasted without mangling), and what we've introduced is dead-simple! So the >> cost-benefit ratio here is high; it?s a simple addition that addresses a >> significant fraction of the pain points. I think we should at least order >> this. > > +100 > >> Now, maybe we're still a little hungry, and the above doesn't help with those >> strings that are most polluted by escapes, such as regular expressions. So, we >> might additionally order the ability to layer a way to say "no escape mangling" >> atop both our " strings and our """ strings. Jim proposes we use a delimiter >> of \".."\ for such strings (\""" ... """\ for the multi-line version). This >> has a nice connotation; it is as if the backslash is ?distributed over? the >> whole string. > > +1; it wins the beauty contest. > > It needs lack of simplicity as well as beauty. By simplicity I mean > it resists unintentional creation of puzzlers, and we think intentional > puzzlers have a limited effect. The jury is out IMO; puzzle on. 
>
> Also, the second course (tweaking escapes) needs IMO to be plausibly
> followable (if not followed in fact) by a third course, which allows fullest
> control of syntax (nonces, repeats, whatever). I think Jim's syntax passes
> that test, since there are ways to increase the number of escapes, or
> lengthen the token in other ways to achieve strong delimiters. It seems
> to me there may be a good course #3 design which pins the quotes
> at three and allows larger and larger numbers of escapes.
>
> (Hmm, idea of the moment: We could allow any *whole* delimiter
> sequence to be *tripled* in order to strengthen it. Not just little old
> double-quote " gets the tripling treatment. But now I'm puzzling way
> outside the box.)
>
>> This does, unfortunately, bring us back into Delimiter Hell; what if we want our
>> string to contain the quote + backslash combination? One way is to dive back
>> into repeating delimiters (e.g., using multiple backslashes in the delimiter).
>> Having a non-homogeneous repeating delimiter leaves us in a slightly better
>> place than the original proposal, as we've eliminated the "empty string"
>> anomaly as well as the "starting with backtick" anomaly. So this seems a
>> workable direction, though the cost-benefit here is less than with the first
>> course -- in both directions (higher cost, lower benefit.)
>>
>>
>> So, in the spirit of "keep ordering until sated, but stop there", here are some
>> reasonable choices.
>>
>> 1. Do multi-line (escaped) strings with a """ fixed delimiter. Large benefit,
>> small cost. Most embedded snippets don't need any escaping. Low cost, big
>> payoff.
>>
>> 1a. Do 1, but automatically reflow multi-line strings using the equivalent of
>> String::align. There have been reasonable proposals on how to do this; where
>> they fell apart is the interaction with raw-ness, but if we separate ML and
>> raw, these become reasonable again.
Higher cost, but higher payoff; having
>> separated the interaction with raw strings, this is more defensible.
>
> I like this; it will make ML-string code more readable, and coders can use
> indentation to guide the eye. This almost (not quite) removes the need for
> tripling the quote. (Not quite because it would mandate indentation, and
> because of JSON quotes. Heuristic comment: Remember JSON escapes also.)
>
> 1a'. As part of 1a., add one or two new escape sequences to control
> string body layout, in straightforward ways, as part of the reflow story.
> Discussion on request; one way is to allow a "white space gobbler" escape
> which eats the backslash and all whitespace plus a final newline if any.
> I'm mentioning that now here because it has several uses.
>
>> 2. Do (1) or (1a), and add: single-line raw string literals delimited by \"..."\.
>
> This course (#2) raises the issue of controlling delimiters and interruptors
> separately instead of together. I think it's fine to control them separately,
> in different courses. If quote and escapes (delimiters and interruptors) were
> equally common in today's workloads I think we'd choose to control them
> together, but they are not, so it's more important to tweak the delimiters
> than tweak the interruptors.
>
> This proposal can be understood in either of two ways: The contents of the
> string are absolutely raw except for the occurrences of end-delimiters, or
> they are "more strongly raw", in that some stronger interruptor is sufficient
> to bring in today's rules for escapes, just as some stronger delimiter is
> sufficient to delimit the end of the string.
>
> I think Jim anticipated the idea of stronger interruptors when he said:
>
>> Even with escaping off, we still might have to escape delimiters.
>> Repeated backslashes (or repeated delimiters) is the typical out.
> > The idea of stronger escapes conflicts with absolute "escaping > off", which Jim also calls for, so I think order #2 needs a little > more simmering. Which is fine; let's eat order #1 first. > > My overall take is, if a strong-enough (repeated?) escape can escape a > strong delimiter, let's also allow such a strong-enough escape to do > other chores as well; that leads me to a proper concept of "strong > interruptor". This means that if you have a raw string that has a very > rare need for an escape sequence, then you just strengthen the escape, > rather than cook the whole string or concatenate it. Use the right rawness > for the job, certainly, and maybe there's a way to do this on the whole-string > level. In any case I think we can improve here on the previous proposals for > "regional rawness". More details later; that's enough for now. > > > Rawness is proportional to escape strength. > > No single string syntax is truly 100.000% raw, because the raw string > cannot include a copy of its delimiter. Adjust that viewpoint to embrace > interruptors as well and you get: A very raw string is one which is difficult, > but not impossible, to end with a delimiter token, or to interrupt with an > interruptor token. What does "difficult" mean? Simple, it means using > more characters, until the subject string gives up and says, "don't have > one of those, go fish". > > So the quest for ever stronger delimiters has a flip side: It is also a quest > for > ever rawer string notations. There is no such thing as an absolutely raw > string, > just one that is "raw enough". In those terms, I'd like to reserve, for an > optional final course, a scheme for making strings as raw as you please, > so that a quoted-and-escaped-five-times-raw string can be quoted inside > of quoted-and-escaped-six-times-raw string. A corner case for purists? > Yes. A real need for real users? We'll see; let's keep something brewing in > the kitchen, just in case. > > > >> 2a. 
Do (1) or (1a), and also support multi-line raw string literals (where we >> _don?t_ automatically apply String::align; this can be done manually). Note >> that this creates anomalies for multi-line raw string literals starting with >> quotes (this can be handled with concatenation, and having separated ML and >> raw, this is less of a problem than before). > > +1 > > If we allow stronger interruptors in rawer strings, we can easily disrupt > would-be > delimiters by escaping them, so we wouldn't need concatenation. The stronger > escapes could be part of 2 (controversially complex) or 3 (slightly inconsistent > with absolute rawness of simple 2 syntax). > >> 3. Do (2) and (2a), and also support a repeating compound delimiter with >> multiple backslashes and a quote. >> >> Note that we can start with 1 or 1a now, and move on to 2/2a later, and same for >> 3. > > Order #3 is where we would have a full and decisive conversation about not > only strong delimiters but also strong interruptors. I bring it up with order > #2 > above because #2 is where interruptor control first appears as a possibility. > >> As we evaluate these options, note that: >> >> - Having separated ML-ness from raw-ness, doing automatic reflow becomes more >> defensible for the common (ML, non-raw) case. > > This is a very important point. It wasn't apparent when we started, and that's > why we go slowly on these things. > >> - The intersection of ML and raw seems pretty small, so doing 1a + 2, while >> asymmetric, is defensible. > > Our experience will bear out how truly small this intersection is; you and I > perhaps > differ on that call. But after doing 1a (1a' please!) we will certainly know > more. > >> - What we don?t order now, we can add later. > > Yes, if we are careful not to get ourselves thrown out of the restaurant > by making poor choices during the early courses. That's why I'm being > all picky and theoretical here. 
> > Now for some brief responses to Jim's points, if they are not already > noted above: > > On Feb 10, 2019, at 7:43 AM, Jim Laskey wrote: >> >>> ...50% solution >>> >>> Where we keep running into trouble is that a choice for one part of the lexicon >>> spreads into the the other parts. That is, use of certain characters in the >>> delimiter affect which characters require escaping and which characters can be >>> used for escaping. > > (Good insight; leads to independent control for delimiter.) > >>> ... >>> >>> 75% solution, almost >>> >>> ? >>> ? Even with escaping off, we still might have to escape delimiters. Repeated >>> backslashes (or repeated delimiters) is the typical out. > > (Yes, this got me going, maybe more than you intended, see above.) > >>> >>> String html = \" >>> >>>

Hello World.

>>> >>> >>> "\; > > (I'm starting to call these Jim-quotes. They are growing on me.) > >>> ? Captain we need more sequences. > >> >>> And, this is the crux of all the debate around strings. Fixed delimiters imply a >>> requirement for escape sequences, otherwise there is content you cannot express >>> as a string. > > (My work is almost done here! Now if we apply that reasoning to > interruptors also, we get the idea of adjustable rawness, without > losing the benefits of escape sequences.) > >>> ... > >>> Fixed delimiter >>> >>> If we go with a fixed delimiter then we limit the content that can be expressed >>> without escape sequences. This is not totally left field. There are floating >>> point values we can not express in Java and types we can express but not >>> denote, such as anonymous class types, intersection types or capture types. > > (Sure, but strings are much more "free" mathematically than those other things > One character shouldn't have to care (char?) what its neighbors are doing.) > >>> ... >>> Once you take away conflicts with the delimiter, most strings do not require >>> escaping. > > ?Always excepting strings which have the audacity to mention > the New, Improved Delimiter. If Java picks one that nobody else > would ever dream of, we'll still have one remaining case of > embedding Java inside of Java. For me failure to nest is a smell > indicating possible rats, for others it's a trade-off. > >>> ? >>> Summary: All strings can be expressed with fixed plus escaping, but can not >>> express strings containing the fixed delimiter (""") with escaping off. > > True. Three points related to that: > > A. If we have to escape the fixed delimiter, then we place an escape > before it, and all is well. If we are happy that users can easily spot > our delimiter without "squinting", then they can probably spot the > escaped copy of the same delimiter. > > B. 
But, once we allow delimiters to run through the string, there is > another cost: Little sequences like \\ and \n and \0 can be anywhere > in the bulk of the ML string, and users *must squint* for those. > This is a cost, and we wish we could make those more visible also, > or just make the rest of the string raw. > > C. The observations of A and B can be balanced if we use strong > interruptors instead of the "little squinty sequences", and maybe > also for the escaped delimiter. There are various ways to do this, > all of which suppress short escape sequences in favor of longer ones. > >>> Jumping ahead: I think that stating that traditional " strings must be >>> single-line will be a popular restriction, even if it not needed. Then they >>> will think of """ as meaning multi-line. > > +1 > >>> >>> Structured delimiter > > (AKA periodic or partially periodic string.) > >>> ? >>> Summary: Can express all strings with and without escaping. If the delimiter >>> length is limited the there there is still a (smaller) set of strings that can >>> not be expressed. > > Yep. And put "structured interruptor" in the kitchen also. > >>> Nonce delimiter >>> >>> ... >>> Summary: Can express all strings with and without escaping, but nonce can affect >>> readability. > > I agree. There's too much "noise" in a nonce, and it's easy to misuse. > > Alternative (stated elsewhere): Indexed delimiter. Here, the role of the nonce > is > played by a small number which is not the length of the delimiter but rather an > actual numeral placed in the delimiter. Such things can be made deterministic, > so that, if you are going to quote a string S which has apparent delimiters in > it, > there is a unique smallest non-conflicting index which may be used for the > indexed delimiter of the quoted string. (And the indexed interruptor, if you > want one.) 
> >>> >>> Multi-line formatting >>> >>> I left this out of the main discussion, but I think we can all agree that >>> formatting rules should separate the delimiters from the content. > > +1 (This is an instance of user control over the form of the source program > containing the string. I don't know what is the right mix of mechanism and > policy to get it all right, but I agree format control is an important issue.) > >>> Other details can be refined after choice of delimiter(s). >>> ... >>> Entrees and desserts >>> >>> If we make good choices now (stay away from the oysters) we can still move on to >>> other courses later. >>> >>> For instance; if we got up from the table with the ", """, ", """ set of >>> delimiters, we could still introduce structured delimiters in the future; > > This is often true, but not always, so we have to keep our eyes open. > Purely periodic strings don't extend, as structured delimiters, as well > as non-periodic or (some) partially-periodic ones. Consider: > > var s = \"""""? > > Does that begin today's three-quote-delimited string, which has two more > quotes in it, or tomorrow's five-quote-delimited string? (This takes me back > to the crazy idea of going with 1, 3, 9, 27 quotes. "I'll have a triple.") > > If I allow up to N quotes in my delimiter today, then coders will write strings > which > begin with more quotes in the string body. Either I have to somehow outlaw > that, > or else I am forbidden from using longer strings of N+1 quotes for future > delimiters. > > Adding more escapes on the front is another matter, and I think that would work > fine, especially if the "extra" escapes on the front somehow strengthened the > string's interruptor and delimiter in a consistent manner. > > So we could enumerate ", \", """, \""", \\""", \\\""", \\\\""" etc. > > Or ", \", """, \""", \1""", \2""", \3""" etc. > > No need for more than three quotes (or more than one, for that matter, > but there are other reasons to like three). 
>
>>> either with repeated (see Swift) or repeated ". We could also follow a
>>> suggestion John made to use a pseudo nonce like " for \\" or """"".
>
> Yep, see above.
>
>>> Point being, we can work with an 85% solution now that we can supplement later
>>> when we're not so hangry.
>
> +100
>
> HTH
>
> -- John

From brian.goetz at oracle.com Thu Mar 21 13:56:58 2019
From: brian.goetz at oracle.com (Brian Goetz)
Date: Thu, 21 Mar 2019 09:56:58 -0400
Subject: String reboot (plain text)
In-Reply-To: <1565954477.1595365.1553176052877.JavaMail.zimbra@u-pem.fr>
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <58E6523D-8951-4927-85A7-0BAB30234EC3@oracle.com> <1565954477.1595365.1553176052877.JavaMail.zimbra@u-pem.fr>
Message-ID:

> I really like in the syntax proposed by Jim the fact that the single quote " is retconned to allow several lines,
> it seems the easiest thing to do if we just want to introduce a multi-line string literal.

This has already been rejected, because it doesn't address the main use cases -- most multi-line snippets still want to have quotes in them (SQL, JSON, XML, etc), and thus would still have to be escaped.

> I disagree with Brian that we should try to have an intelligent algorithm to remove the blank spaces

Though that's not actually what I proposed.  What I've proposed is to start with a choice of "1" or "1a", the latter being "1 with intelligent reflow."  So your preference for 1 over 1a is recorded!
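
To make the difference between "1" and "1a" concrete, here is a minimal sketch of the kind of reflow that "1a" implies: strip the indentation shared by all non-blank lines and keep any extra indentation relative to it. The names ReflowSketch and stripCommonIndent are hypothetical, and this is only an illustration of the idea, not the String::align algorithm referred to in the thread.

```java
import java.util.ArrayList;
import java.util.List;

class ReflowSketch {
    // Hypothetical helper: removes the leading whitespace shared by all
    // non-blank lines. Blank lines are emitted as empty lines.
    static String stripCommonIndent(String text) {
        String[] lines = text.split("\n", -1);

        // First pass: find the smallest indentation among non-blank lines.
        int common = Integer.MAX_VALUE;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            int indent = 0;
            while (indent < line.length()
                    && Character.isWhitespace(line.charAt(indent))) {
                indent++;
            }
            common = Math.min(common, indent);
        }
        if (common == Integer.MAX_VALUE) {
            common = 0; // every line was blank
        }

        // Second pass: drop that many leading characters from each line.
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(line.trim().isEmpty() ? "" : line.substring(common));
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String sql = "    SELECT name\n"
                   + "      FROM users\n"
                   + "    WHERE id = 1";
        System.out.println(stripCommonIndent(sql));
    }
}
```

Under option "1" the literal would keep the four leading spaces on every line and the caller would have to invoke something like this by hand; under "1a" the compiler would apply an equivalent step automatically.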
From forax at univ-mlv.fr Thu Mar 21 14:19:07 2019
From: forax at univ-mlv.fr (forax at univ-mlv.fr)
Date: Thu, 21 Mar 2019 15:19:07 +0100 (CET)
Subject: String reboot (plain text)
In-Reply-To:
References: <7591899A-FB5F-4277-936D-937B7DDBF1E6@oracle.com> <58E6523D-8951-4927-85A7-0BAB30234EC3@oracle.com> <1565954477.1595365.1553176052877.JavaMail.zimbra@u-pem.fr>
Message-ID: <1842314062.1606725.1553177947475.JavaMail.zimbra@u-pem.fr>

----- Original Message -----
> From: "Brian Goetz"
> To: "Remi Forax" , "John Rose"
> Cc: "Jim Laskey" , "amber-spec-experts"
> Sent: Thursday, 21 March 2019 14:56:58
> Subject: Re: String reboot (plain text)

>> I really like in the syntax proposed by Jim the fact that the single quote " is
>> retconned to allow several lines,
>> it seems the easiest thing to do if we just want to introduce a multi-line
>> string literal.
>
> This has already been rejected, because it doesn't address the main use
> cases -- most multi-line snippets still want to have quotes in them
> (SQL, JSON, XML, etc), and thus would still have to be escaped.

OK, I never expected a lot of code to use a multi-line single quote, but it's fairly common for DSLs like JSP, velocity marker, mustache template etc. to be compiled to Java source, and they contain Unicode escapes, so using raw strings is not an option.

>
>> I disagree with Brian that we should try to have an intelligent algorithm to
>> remove the blank spaces
>
> Though that's not actually what I proposed.  What I've proposed is to
> start with a choice of "1" or "1a", the latter being "1 with intelligent
> reflow."  So your preference for 1 over 1a is recorded!
Rémi

From leonid.arbouzov at oracle.com Thu Mar 21 15:25:22 2019
From: leonid.arbouzov at oracle.com (Leonid Arbuzov)
Date: Thu, 21 Mar 2019 08:25:22 -0700
Subject: Switch expressions spec
In-Reply-To:
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> <5C8C08F6.3040304@oracle.com> <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com> <5C918921.9040703@oracle.com>
Message-ID:

Hi Manoj,

Thanks for the example.
The JCK12 doesn't have this particular testcase and can add it in the next release.

Regards,
-leonid

On 3/20/2019 8:07 PM, Manoj Palat wrote:
>
> Hi Leonid,
> The original query was based on this case:
>
> Consider the following code:
>
> public class X {
>     @SuppressWarnings("preview")
>     public static int foo(int i) throws MyException {
>         int v = switch (i) {
>             default -> throw new MyException(); // no error?
>         };
>         return v;
>     }
>     public static void main(String argv[]) {
>         try {
>             System.out.println(X.foo(1));
>         } catch (MyException e) {
>             System.out.println("Exception thrown as expected");
>         }
>     }
> }
> class MyException extends Exception {
>     private static final long serialVersionUID = 3461899582505930473L;
> }
>
> As per spec, JLS 15.28.1:
>
>     It is a compile-time error if a switch expression has no result
>     expressions.
>
> but javac (12) does not flag an error.
>
> Regards,
> Manoj
>
> From: Leonid Arbuzov
> To: Alex Buckley, amber-spec-experts
> Cc: Stephan Herrmann
> Date: 03/21/2019 06:35 AM
> Subject: Re: Switch expressions spec
> Sent by: "amber-spec-experts"
>
> ------------------------------------------------------------------------
>
> Hi Alex,
>
> There are negative tests with missed result expressions:
>
>     int a = switch (selectorExpr) {
>         case 0 -> 1;
>         case 1 -> 1;
>         default -> ;
>     };
>     int a = switch (selectorExpr) {
>         case 0 -> { break 1; }
>         case 1 -> { break 1; }
>         default -> { fun(); }
>     };
>
> If you meant no result expressions at all then I couldn't find such
> a test yet.
> It can be added in JCK13.
>
> Thanks,
> Leonid
>
> On 3/19/2019 5:28 PM, Alex Buckley wrote:
>
>     Hi Leonid,
>
>     So there are no negative tests that check what happens if a
>     switch expression has no result expressions?
>
>     Alex
>
>     On 3/19/2019 5:24 PM, leonid.arbouzov at oracle.com wrote:
>         There are tests on switch expressions with cases throwing
>         exceptions, causing division by zero, index out of range, etc.
>         These are all positive tests, i.e. they compile fine.
>
>         Thanks,
>         Leonid
>
>         On 3/15/19 1:20 PM, Alex Buckley wrote:
>             OK, we intend at least one result expression to be
>             required, so the spec is correct as is.
>
>             (I should have been clearer that my belief was about the
>             intent of the spec, rather than about how I personally
>             think completion should occur.)
>
>             Manoj didn't say what javac build he is testing with, but
>             this is a substantial discrepancy between compiler and
>             spec. I hope that Leonid Arbouzov (cc'd) can tell us what
>             conformance tests exist in this area.
>
>             Alex
>
>             On 3/15/2019 12:09 PM, Brian Goetz wrote:
>                 At the same time, we also reaffirmed our choice to
>                 _not_ allow throw from one half of a conditional:
>
>                     int x = foo ? 3 : throw new FooException()
>
>                 But John has this right - the high order bit is that
>                 every expression should have a defined normal
>                 completion, and a type, even if computing
>                 sub-expressions (or in this case, sub-statements)
>                 might throw. And without at least one arm yielding a
>                 value, it would be impossible to infer the type of
>                 the expression.
>
>                 On Mar 15, 2019, at 3:01 PM, John Rose wrote:
>
>                     On Mar 15, 2019, at 11:39 AM, Alex Buckley wrote:
>
>                         In a switch expression, I believe it should
>                         be legal for every `case`/`default` arm to
>                         complete abruptly _for a reason other than a
>                         break with value_.
>
>                     My reading of Gavin's draft is that he is doing
>                     something very subtle there, which is to retain
>                     an existing feature in the language that an
>                     expression always has a defined normal
>                     completion.
>
>                     We also don't have expressions of the form
>                     "throw e". Allowing a switch expression to
>                     complete without a value on *every* arm raises
>                     the same question as "throw e" as an expression.
>                     How do you type "f(throw e)"? If you can answer
>                     that, then you can also have switch expressions
>                     that refuse to break with any values.
>
>                     BTW, if an expression has a defined normal
>                     completion, it also has a possible type. By
>                     possible type I mean at least one correct typing
>                     (poly-expressions can have many). So one obvious
>                     result of Gavin's draft is that you derive
>                     possible types from the arms of the switch
>                     expression that break with values.
>
>                     But the root requirement, I think, is to
>                     preserve the possible normal completion of every
>                     expression.
>
>                     "What about some form of 1/0?" That's a good
>                     question. What about it? It completes normally
>                     with a type of int. Dynamically, the normal
>                     completion is never taken. Gavin might call that
>                     a "notional normal completion" (I like that
>                     word) provided to uphold the general principle
>                     even where static analysis proves that the
>                     Turing machine fails to return normally.
>
>                     --
John

From alex.buckley at oracle.com Thu Mar 21 19:16:23 2019
From: alex.buckley at oracle.com (Alex Buckley)
Date: Thu, 21 Mar 2019 12:16:23 -0700
Subject: Switch expressions spec
In-Reply-To:
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> <5C8C08F6.3040304@oracle.com> <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com> <5C918921.9040703@oracle.com>
Message-ID: <5C93E307.6070503@oracle.com>

Leonid, thanks for checking JCK12, and for looking ahead to JCK13.

Alex

On 3/21/2019 8:25 AM, Leonid Arbuzov wrote:
> Hi Manoj,
>
> Thanks for the example.
> The JCK12 doesn't have this particular testcase and can add it in the
> next release.
>
> Regards,
> -leonid

From gavin.bierman at oracle.com Thu Mar 28 14:27:27 2019
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 28 Mar 2019 15:27:27 +0100
Subject: Switch expressions spec
In-Reply-To: <5C80334D.90508@oracle.com>
References: <5C80334D.90508@oracle.com>
Message-ID: <05D5DECB-9E5C-4510-8933-DD509E55A064@oracle.com>

Apologies for my slow responses.

> On 6 Mar 2019, at 21:53, Alex Buckley wrote:
>
> Hi Gavin,
>
> On 3/6/2019 1:51 AM, Manoj Palat wrote:
>> 1: In section 14.15, The break Statement
>>
>> A break statement transfers control out of an enclosing statement_, or
>> causes an enclosing switch expression to produce a specified value_.
>>
>> BreakStatement:
>>     break [~~Identifier~~];
>>     _break Expression;_
>>     _break;_
>>
>> the identifier is dropped? That looks like a typographical issue (since
>> it was mentioned that there was no functional difference) - Identifier
>> is mentioned in the statements following the above para as well. Similar
>> issue is displayed in "continue" section also.
>
> The dropping of the `break [Identifier]` alternative looks like an editing error when the spec document was being reformatted; compare:
>
> old format: http://cr.openjdk.java.net/~gbierman/switch-expressions-2019-01.html#jep325-14.15
>
> new format: http://cr.openjdk.java.net/~gbierman/switch-expressions.html#jep325-14.15

Apologies Manoj - looks like our migration to a new markdown style has a couple of glitches. Yes, the grammar is intended to match the text below, i.e.

_BreakStatement_:
    `break` _Identifier_`;`
    `break` _Expression_`;`
    `break` `;`

(and similarly for `continue`).

>
>> 2.
A related query, though a bit late, but better late than never :) -
>> In the Eclipse Compiler implementation we assume expression encompasses
>> identifier (in the syntax context), and then deduce whether this is a
>> label or an expression later in the resolution context. From the grammar
>> above, it does not look like we can distinguish whether an identifier is
>> a label or an expression in the first place? An explicit statement in
>> the spec about how to distinguish would be helpful.
>
> This will become moot if the change anticipated by Brian happens (change "break value" to "break-with value").

Yes, the proposal is that we will move to `break-with` and so side-step this issue :-)

> Until then, Manoj is asking a great question. Per 6.2, a label is not a name, but per 14.7, a label does have scope, and:
>
> "There is no restriction against using the same identifier as a label and as the name of a package, class, interface, method, field, parameter, or local variable. Use of an identifier to label a statement does not obscure (§6.4.2) a package, class, interface, method, field, parameter, or local variable with the same name. Use of an identifier as a class, interface, method, field, local variable or as the parameter of an exception handler (§14.20) does not obscure a statement label with the same name."
>
> I seem to recall a discussion recognizing and accepting the source incompatibility of recasting `break X;` from "Jump to label X" to "Evaluate X and yield the result". Such acceptance would suggest an edit to the last sentence quoted above.

Quite. The following code is erroneous according to the spec:

    int l = 0;
    int i = 42;
    l : System.out.println(switch (i) { default -> { break l; } });

javac correctly reports:

    error: ambiguous reference to 'l'
    l : System.out.println(switch (i) { default -> { break l; } });
                                                     ^
    ('l' is both a label and an expression)

So the text Alex quotes would need tweaking, should we keep the value break statement.

>
>> 3.
In section 5.6: "A _unary numeric promotion_ applies
>> numeric promotion to an operand expression and a notional non-constant
>> expression of type int."
>> It will be nice to explain in the spec a little more as to what is meant
>> by "a notional non-constant expression".
>
> I believe more polishing is already on the way for the recast definition of numeric promotion?

Let me look at that text. Thanks for the pointer.

Gavin

From gavin.bierman at oracle.com Thu Mar 28 14:28:33 2019
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 28 Mar 2019 15:28:33 +0100
Subject: Switch expressions spec
In-Reply-To: <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com>
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com>
Message-ID: <296F3E0D-8DFE-45D8-BABA-4D13FAA91EEB@oracle.com>

> On 15 Mar 2019, at 20:09, Brian Goetz wrote:
>
> At the same time, we also reaffirmed our choice to _not_ allow throw from one half of a conditional:
>
>     int x = foo ? 3 : throw new FooException()
>
> But John has this right - the high order bit is that every expression should have a defined normal completion, and a type, even if computing sub-expressions (or in this case, sub-statements) might throw. And without at least one arm yielding a value, it would be impossible to infer the type of the expression.

That was exactly the thinking behind the spec.
Gavin

From gavin.bierman at oracle.com Thu Mar 28 14:29:13 2019
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 28 Mar 2019 15:29:13 +0100
Subject: Switch expressions spec
In-Reply-To:
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com> <115286FF-64EA-43A0-BE14-058FC1DDC9A2@oracle.com> <5C8C08F6.3040304@oracle.com> <49ac732e-4a3e-5d1c-f668-c20a5a36c0cd@oracle.com> <5C918921.9040703@oracle.com>
Message-ID: <2AFAF6BF-8285-4D2C-AACD-E7C7A4374CD4@oracle.com>

> On 21 Mar 2019, at 16:25, Leonid Arbuzov wrote:
>
> Hi Manoj,
>
> Thanks for the example.
> The JCK12 doesn't have this particular testcase and can add it in the next release.
>
> Regards,
> -leonid

Thanks Leonid.

From gavin.bierman at oracle.com Thu Mar 28 14:35:57 2019
From: gavin.bierman at oracle.com (Gavin Bierman)
Date: Thu, 28 Mar 2019 15:35:57 +0100
Subject: Switch expressions spec
In-Reply-To:
References: <5C80334D.90508@oracle.com> <5C8BF175.8050904@oracle.com>
Message-ID:

> On 15 Mar 2019, at 20:01, John Rose wrote:
> [snip]
>
> "What about some form of 1/0?" That's a good question.
> What about it? It completes normally with a type of int.
> Dynamically, the normal completion is never taken.
> Gavin might call that a "notional normal completion"
> (I like that word) provided to uphold the general principle
> even where static analysis proves that the Turing machine
> fails to return normally.

My basic thought would be that at least 1/0 is an expression for which I can synthesise/infer a type. I don't know what to synthesise/infer for a switch expression with no results. Even if it were safe, I don't want to break the invariant that every expression has a type.

Gavin
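
The typing argument in this thread (every expression keeps a type, and a switch expression takes its type from the arms that produce a value) can be sketched roughly as follows. This is only an illustration, not code from any of the messages: the class and method names are made up, and it is written in the arrow-form syntax of later Java releases (14 and up) rather than the Java 12 preview `break value` form discussed above.

```java
// Illustrative sketch only; SwitchTyping, describe and divide are hypothetical names.
class SwitchTyping {

    // One arm produces a value and the other throws. The switch expression
    // still has a type (int), taken from the arm that produces a value.
    static int describe(int i) {
        return switch (i) {
            case 0 -> 1;
            default -> throw new IllegalArgumentException("unexpected: " + i);
        };
    }

    // A division is statically typed as int even when, as with 1/0, it can
    // never complete normally at run time.
    static int divide(int numerator, int denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        System.out.println(describe(0));
        try {
            describe(7);
        } catch (IllegalArgumentException e) {
            System.out.println("threw: " + e.getMessage());
        }
        try {
            divide(1, 0);
        } catch (ArithmeticException e) {
            System.out.println("threw: " + e.getClass().getSimpleName());
        }
    }
}
```

By contrast, the case Manoj reported has only a throwing `default` arm, so there is no arm from which a type could be derived; that is the situation the quoted rule in 15.28.1 rejects.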